Build an AI Agent Code Execution Tool

Build a tool that lets LLMs execute code in a sandboxed environment using InstaVM and OpenAI function calling.

The short version

from instavm import InstaVM
from openai import OpenAI

vm = InstaVM("your_instavm_key")
openai = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute Python code in a sandbox and return stdout",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"]
        }
    }
}]

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Calculate the first 20 Fibonacci numbers"}],
    tools=tools
)

When the model calls execute_code, run the code on InstaVM:

import json

# Assumes the model chose to call the tool; check message.tool_calls first in real code
tool_call = response.choices[0].message.tool_calls[0]
code = json.loads(tool_call.function.arguments)["code"]
result = vm.execute(code)
print(result["output"])
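One detail that is easy to get wrong in the dispatch step: `function.arguments` is a JSON *string*, not a dict, and a malformed payload will raise. A defensive parser might look like this (a sketch; the `execute_code` name mirrors the tool definition above, and `parse_execute_code_args` is a hypothetical helper, not part of any SDK):

```python
import json

def parse_execute_code_args(arguments: str) -> str:
    """Extract the 'code' field from a tool call's JSON arguments string.

    Raises ValueError with a readable message instead of a bare
    JSONDecodeError/KeyError, so an agent loop can report the problem
    back to the model as a tool result instead of crashing.
    """
    try:
        payload = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Tool arguments were not valid JSON: {exc}") from exc
    if "code" not in payload:
        raise ValueError("Tool arguments missing required 'code' field")
    return payload["code"]

# Well-formed payload: returns the inner code string unchanged
print(parse_execute_code_args('{"code": "print(1 + 1)"}'))  # → print(1 + 1)
```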

Full agent loop

A complete agent that handles tool calls in a loop:

import json
from instavm import InstaVM
from openai import OpenAI

vm = InstaVM("your_instavm_key")
openai_client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute Python code in a secure sandbox. Use print() to show output. You can install packages with pip.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string", "description": "Python code to execute"}},
            "required": ["code"]
        }
    }
}]

def run_agent(prompt):
    messages = [
        {"role": "system", "content": "You are a helpful coding assistant. Use the execute_code tool to run Python code when needed."},
        {"role": "user", "content": prompt}
    ]

    while True:
        response = openai_client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools
        )

        msg = response.choices[0].message
        messages.append(msg)

        # No tool calls means the model is done and has a final answer
        if not msg.tool_calls:
            return msg.content

        # Execute each requested tool call and feed the output back
        for call in msg.tool_calls:
            code = json.loads(call.function.arguments)["code"]
            result = vm.execute(code)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result.get("output", "No output")
            })

# Run it
answer = run_agent("Analyze the distribution of prime numbers under 10000 and plot a histogram")
print(answer)
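Sandbox output can be arbitrarily large (a stray `print` inside a loop, for instance), and every character of it goes back into the model's context as a tool message. A small truncation helper applied before appending the tool result keeps token usage bounded (a sketch; the helper name and the 4,000-character limit are arbitrary assumptions, not part of InstaVM or OpenAI):

```python
def truncate_output(output: str, limit: int = 4000) -> str:
    """Cap sandbox output before it is sent back to the model as a tool message."""
    if len(output) <= limit:
        return output
    # Tell the model the output was cut so it can re-run with tighter printing
    return output[:limit] + f"\n... [truncated {len(output) - limit} characters]"
```

In the loop above, this would replace the raw value: `"content": truncate_output(result.get("output", "No output"))`.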

Sandbox-per-agent

For multi-tenant systems, create one InstaVM session per agent to isolate state:

def create_agent_sandbox(api_key):
    vm = InstaVM(api_key, cpu_count=2, memory_mb=1024)
    # Restrict network access: allow package installs, block everything else
    vm.egress.set_session(
        allow_package_managers=True,
        allow_https=False,
        allow_http=False,
        allowed_domains=[]
    )
    return vm

# Each user/agent gets its own sandbox
agent_a = create_agent_sandbox("your_key")
agent_b = create_agent_sandbox("your_key")

With LangChain

See the LangChain integration guide for using InstaVM as a LangChain tool.

With LlamaIndex

See the LlamaIndex integration guide for RAG pipelines with code execution.

Security considerations

  • Use egress policies to restrict what code can access on the network
  • Each sandbox is a full microVM with kernel-level isolation
  • Set vm_lifetime_seconds to auto-terminate long-running sandboxes
  • Do not pass sensitive credentials as code arguments; use environment variables via the env parameter
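The last two bullets can be combined at sandbox creation time. A minimal sketch, assuming `vm_lifetime_seconds` and `env` are accepted as `InstaVM(...)` keyword arguments (the bullets above name both parameters, but the exact call shape shown here is an assumption):

```python
import os
from instavm import InstaVM

vm = InstaVM(
    "your_instavm_key",
    vm_lifetime_seconds=600,                    # auto-terminate after 10 minutes
    env={"DB_URL": os.environ["DB_URL"]},       # secrets reach the sandbox via env, never via code
)
```

Because the credential lives in the sandbox's environment, model-generated code can read it with `os.environ` without the secret ever appearing in a tool-call argument or the chat transcript.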

Next steps