# Build an AI Agent Code Execution Tool
Build a tool that lets LLMs execute code in a sandboxed environment using InstaVM and OpenAI function calling.
## The short version
```python
from instavm import InstaVM
from openai import OpenAI

vm = InstaVM("your_instavm_key")
openai = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute Python code in a sandbox and return stdout",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"]
        }
    }
}]

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Calculate the first 20 Fibonacci numbers"}],
    tools=tools
)
```
When the model calls `execute_code`, run the code on InstaVM:
```python
import json

tool_call = response.choices[0].message.tool_calls[0]
code = json.loads(tool_call.function.arguments)["code"]
result = vm.execute(code)
print(result["output"])
```
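To let the model narrate the result, send the tool output back as a `tool` message and request one more completion. A sketch of the follow-up conversation (`call_123` and the code string are placeholders; in practice reuse `tool_call.id` and the real output, and note the final `create` call is commented out because it needs a live API key):

```python
# Build the follow-up conversation: the original request, the assistant
# message that contained the tool call, then the tool result keyed by id.
followup = [
    {"role": "user", "content": "Calculate the first 20 Fibonacci numbers"},
    {"role": "assistant", "tool_calls": [{
        "id": "call_123",  # placeholder -- use tool_call.id from the real response
        "type": "function",
        "function": {"name": "execute_code", "arguments": '{"code": "..."}'},
    }]},
    {"role": "tool", "tool_call_id": "call_123", "content": "0 1 1 2 3 5 8 ..."},
]

# final = openai.chat.completions.create(model="gpt-4o", messages=followup, tools=tools)
# print(final.choices[0].message.content)
```

The `tool_call_id` on the tool message must match the `id` of the assistant's tool call, or the API rejects the request.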
## Full agent loop
A complete agent that handles tool calls in a loop:
```python
import json
from instavm import InstaVM
from openai import OpenAI

vm = InstaVM("your_instavm_key")
openai_client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute Python code in a secure sandbox. Use print() to show output. You can install packages with pip.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string", "description": "Python code to execute"}},
            "required": ["code"]
        }
    }
}]

def run_agent(prompt):
    messages = [
        {"role": "system", "content": "You are a helpful coding assistant. Use the execute_code tool to run Python code when needed."},
        {"role": "user", "content": prompt}
    ]
    while True:
        response = openai_client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools
        )
        msg = response.choices[0].message
        messages.append(msg)
        # No tool calls means the model has produced its final answer
        if not msg.tool_calls:
            return msg.content
        # Run each requested tool call and feed the output back to the model
        for call in msg.tool_calls:
            code = json.loads(call.function.arguments)["code"]
            result = vm.execute(code)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result.get("output", "No output")
            })

# Run it
answer = run_agent("Analyze the distribution of prime numbers under 10000 and plot a histogram")
print(answer)
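For a prompt like the one above, the code the model ends up running in the sandbox is ordinary Python. As a standalone sketch of the analysis step (the plotting half would use matplotlib inside the sandbox), a sieve-based count bucketed per thousand might look like:

```python
# Sieve of Eratosthenes: find primes under 10000, then bucket per thousand
limit = 10000
sieve = [True] * limit
sieve[0] = sieve[1] = False
for i in range(2, int(limit ** 0.5) + 1):
    if sieve[i]:
        for j in range(i * i, limit, i):
            sieve[j] = False

primes = [n for n in range(limit) if sieve[n]]
buckets = [0] * 10
for p in primes:
    buckets[p // 1000] += 1

print(len(primes))   # 1229 primes below 10000
print(buckets)       # counts per 1000-wide bin: 168 in [0, 1000), thinning out after
```

The falling bucket counts are the prime-number-theorem thinning the histogram would visualize.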
## Sandbox-per-agent
For multi-tenant systems, create one InstaVM session per agent to isolate state:
```python
def create_agent_sandbox(api_key):
    vm = InstaVM(api_key, cpu_count=2, memory_mb=1024)
    # Restrict network access
    vm.egress.set_session(
        allow_package_managers=True,
        allow_https=False,
        allow_http=False,
        allowed_domains=[]
    )
    return vm

# Each user/agent gets its own sandbox
agent_a = create_agent_sandbox("your_key")
agent_b = create_agent_sandbox("your_key")
```
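In a long-lived service you typically create these lazily and reuse one sandbox per user across requests rather than paying the creation cost every time. A minimal registry sketch (the `object` factory is a stand-in so the pattern runs without an API key; with InstaVM you would pass `lambda: create_agent_sandbox("your_key")`):

```python
sandboxes = {}

def get_sandbox(user_id, factory):
    # Create the user's sandbox on first use, then reuse it on every
    # later request so each user's state stays in their own VM.
    if user_id not in sandboxes:
        sandboxes[user_id] = factory()
    return sandboxes[user_id]

# Stand-in factory for illustration; in production:
#   get_sandbox(user_id, lambda: create_agent_sandbox("your_key"))
a1 = get_sandbox("alice", object)
a2 = get_sandbox("alice", object)
b1 = get_sandbox("bob", object)
assert a1 is a2 and a1 is not b1  # same sandbox per user, isolated between users
```

A production version would also evict idle entries so abandoned sandboxes get terminated.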
## With LangChain
See the LangChain integration guide for using InstaVM as a LangChain tool.
## With LlamaIndex
See the LlamaIndex integration guide for RAG pipelines with code execution.
## Security considerations
- Use egress policies to restrict what code can access on the network
- Each sandbox is a full microVM with kernel-level isolation
- Set `vm_lifetime_seconds` to auto-terminate long-running sandboxes
- Do not pass sensitive credentials as code arguments -- use environment variables via the `env` parameter
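On the last point: the goal is that secrets live in the sandbox's environment, never in the code string the model generates or that gets logged with tool calls. A sketch of the two shapes (the `vm.execute(..., env=...)` call shape at the end is an assumption -- check the Python SDK reference for the exact `env` signature):

```python
secret = "sk-demo-not-a-real-key"

# Anti-pattern: interpolating the secret into the code string means the
# model, the conversation history, and any tool-call logs all see it.
leaky_code = f"token = '{secret}'\nprint(len(token))"

# Pattern: the code only reads the secret from the environment at runtime.
safe_code = "import os\ntoken = os.environ['API_TOKEN']\nprint(len(token))"

assert secret in leaky_code       # the secret leaked into the code string
assert secret not in safe_code    # the code string stays clean

# The secret is then supplied separately -- hypothetically something like
#   vm.execute(safe_code, env={"API_TOKEN": secret})
# (verify the env parameter's shape against the SDK reference).
```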
## Next steps
- Build a Code Interpreter -- multi-turn conversational code execution
- Egress Policies -- securing sandbox network access
- Python SDK -- full SDK reference