Build an AI Agent Code Execution Tool

Build a tool that lets LLMs execute code in a sandboxed environment using InstaVM and OpenAI function calling.

The short version

from instavm import InstaVM
from openai import OpenAI

vm = InstaVM("your_instavm_key")
openai = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute Python code in a sandbox and return stdout",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string"}},
            "required": ["code"]
        }
    }
}]

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Calculate the first 20 Fibonacci numbers"}],
    tools=tools
)

When the model calls execute_code, run the code on InstaVM:

import json

# Assumes the model chose to call the tool; check message.tool_calls first in real code
tool_call = response.choices[0].message.tool_calls[0]
code = json.loads(tool_call.function.arguments)["code"]
result = vm.execute(code)
print(result["output"])
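One detail that is easy to get wrong in the dispatch step: `function.arguments` is a JSON *string*, not a dict, and a malformed payload will raise. A defensive parser might look like this (a sketch; the `execute_code` name mirrors the tool definition above, and `parse_execute_code_args` is a hypothetical helper, not part of any SDK):

```python
import json

def parse_execute_code_args(arguments: str) -> str:
    """Extract the 'code' field from a tool call's JSON arguments string.

    Raises ValueError with a readable message instead of a bare
    JSONDecodeError/KeyError, so an agent loop can report the problem
    back to the model as a tool result instead of crashing.
    """
    try:
        payload = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Tool arguments were not valid JSON: {exc}") from exc
    if "code" not in payload:
        raise ValueError("Tool arguments missing required 'code' field")
    return payload["code"]

# Well-formed payload: returns the inner code string unchanged
print(parse_execute_code_args('{"code": "print(1 + 1)"}'))  # → print(1 + 1)
```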

Full agent loop

A complete agent that handles tool calls in a loop:

import json
from instavm import InstaVM
from openai import OpenAI

vm = InstaVM("your_instavm_key")
openai_client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "execute_code",
        "description": "Execute Python code in a secure sandbox. Use print() to show output. You can install packages with pip.",
        "parameters": {
            "type": "object",
            "properties": {"code": {"type": "string", "description": "Python code to execute"}},
            "required": ["code"]
        }
    }
}]

def run_agent(prompt):
    messages = [
        {"role": "system", "content": "You are a helpful coding assistant. Use the execute_code tool to run Python code when needed."},
        {"role": "user", "content": prompt}
    ]

    while True:
        response = openai_client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools
        )

        msg = response.choices[0].message
        messages.append(msg)

        # No tool calls means the model is done and has a final answer
        if not msg.tool_calls:
            return msg.content

        # Execute each requested tool call and feed the output back
        for call in msg.tool_calls:
            code = json.loads(call.function.arguments)["code"]
            result = vm.execute(code)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result.get("output", "No output")
            })

# Run it
answer = run_agent("Analyze the distribution of prime numbers under 10000 and plot a histogram")
print(answer)
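Sandbox output can be arbitrarily large (a stray `print` inside a loop, for instance), and every character of it goes back into the model's context as a tool message. A small truncation helper applied before appending the tool result keeps token usage bounded (a sketch; the helper name and the 4,000-character limit are arbitrary assumptions, not part of InstaVM or OpenAI):

```python
def truncate_output(output: str, limit: int = 4000) -> str:
    """Cap sandbox output before it is sent back to the model as a tool message."""
    if len(output) <= limit:
        return output
    # Tell the model the output was cut so it can re-run with tighter printing
    return output[:limit] + f"\n... [truncated {len(output) - limit} characters]"
```

In the loop above, this would replace the raw value: `"content": truncate_output(result.get("output", "No output"))`.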

Sandbox-per-agent

For multi-tenant systems, create one InstaVM session per agent to isolate state:

def create_agent_sandbox(api_key):
    vm = InstaVM(api_key, cpu_count=2, memory_mb=1024)
    # Restrict network access: allow package installs, block everything else
    vm.egress.set_session(
        allow_package_managers=True,
        allow_https=False,
        allow_http=False,
        allowed_domains=[]
    )
    return vm

# Each user/agent gets its own sandbox
agent_a = create_agent_sandbox("your_key")
agent_b = create_agent_sandbox("your_key")

With LangChain

See the LangChain integration guide for using InstaVM as a LangChain tool.

With LlamaIndex

See the LlamaIndex integration guide for RAG pipelines with code execution.

Security considerations

  • Use egress policies to restrict what code can access on the network
  • Each sandbox is a full microVM with kernel-level isolation
  • Set vm_lifetime_seconds to auto-terminate long-running sandboxes
  • Do not pass sensitive credentials as code arguments; use environment variables via the env parameter
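The last two bullets can be combined at sandbox creation time. A minimal sketch, assuming `vm_lifetime_seconds` and `env` are accepted as `InstaVM(...)` keyword arguments (the bullets above name both parameters, but the exact call shape shown here is an assumption):

```python
import os
from instavm import InstaVM

vm = InstaVM(
    "your_instavm_key",
    vm_lifetime_seconds=600,                    # auto-terminate after 10 minutes
    env={"DB_URL": os.environ["DB_URL"]},       # secrets reach the sandbox via env, never via code
)
```

Because the credential lives in the sandbox's environment, model-generated code can read it with `os.environ` without the secret ever appearing in a tool-call argument or the chat transcript.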

Next steps