Interpreters give agents a programmable workspace where they can explore data, coordinate tool calls, and keep intermediate work out of the model context. The agent writes code to express its intent, then an in-memory runtime executes that code and returns the relevant results. Where sandboxes are a code-first way to act on an environment (running commands, installing dependencies, editing files), interpreters are a code-first way to act inside the agent loop: composing tools, preserving state, and deciding what information returns to the model.
Interpreters are in beta. APIs and lifecycle behavior may change between releases.
Interpreters require langchain-quickjs>=0.1.0 and Python >=3.11.

Why use interpreters?

Most agent work alternates between model reasoning and tool calls. A model can fire several tool calls in one turn, but that batch is fixed the moment it is emitted. Nothing can loop, branch on a result, retry a failure, or feed one call’s output into the next without another model turn, and every result returns to the model’s context. The model also decides how many calls to issue, so asking it to dispatch work across hundreds of items is unreliable: it tends to cover a sample rather than every item.
Interpreters give the agent a runtime for that work. A loop runs every iteration, tools are called from code, intermediate values stay in variables, and only a compact result returns to the model.

Programmatic tool calling (PTC)

Call selected tools from interpreter code, including loops, retries, branching, and parallel batches.

Programmatic subagents

Dispatch subagents from code for fan-out, verification, and recursive workflows over large inputs.

Stateful work

Keep intermediate values in runtime state without overloading the model context.

Deterministic transforms

Sort, group, parse, validate, score, aggregate, and explore structured data in code.

Choose a pattern

Use interpreters for code inside the agent loop: composing tools, preserving state, and controlling what returns to the model. Use sandboxes for code against an environment: shell commands, package installs, tests, filesystem edits, and OS-level execution.
| Need | Use |
| --- | --- |
| One or two simple external calls | Normal tool calling |
| A small program that loops, branches, retries, or aggregates results | Interpreter |
| Many selected tool calls that should run from code | Interpreter with programmatic tool calling (PTC) |
| Many independent units of work, multiple perspectives, or recursive analysis over large inputs | Interpreter with programmatic subagents |
| Shell commands, package installs, tests, or full OS filesystem access | Sandboxes |

Quickstart

Install the QuickJS middleware package, then pass interpreter middleware using the middleware argument on create_deep_agent.
pip install -U "deepagents[quickjs]"
from deepagents import create_deep_agent
from langchain_quickjs import CodeInterpreterMiddleware

agent = create_deep_agent(
    model="openai:gpt-5.5",
    middleware=[CodeInterpreterMiddleware()],
)

How interpreters work

The middleware adds an eval tool to the agent. When useful, the agent writes JavaScript and calls eval; you do not call the interpreter directly. The tool runs code in a persistent context, captures console.log, and returns the result of the last expression. The agent can write code like this:
const rows = [
  { team: "alpha", score: 8 },
  { team: "beta", score: 13 },
  { team: "alpha", score: 21 },
];

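// Sum scores per team; each console.log line is captured by the eval tool.
const totals = rows.reduce((acc, row) => {
  acc[row.team] = (acc[row.team] ?? 0) + row.score;
  console.log(`${row.team} score: ${acc[row.team]}`);
  return acc;
}, {});

// The last expression is the tool's result: { alpha: 29, beta: 13 }
totals;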
By default, interpreter state also persists across turns in the same thread: the middleware snapshots the working state after each agent run and restores it before the next run.
Code runs against QuickJS, a lightweight JavaScript runtime. By default, interpreter code has no access to the host filesystem, network, shell, package manager, or clock. It can compute, hold state, and write to console.log, and nothing more. Two explicit bridges extend that reach:
  • Tools, through programmatic tool calling (PTC). Expose an allowlist of tools as async functions under the tools namespace. These can be the agent’s own tools or standalone tools you define and pass in.
  • Subagents, through programmatic subagents. Dispatch configured subagents from code and orchestrate them in plain JavaScript.
Programmatic tool calling is off until you enable it. Subagent dispatch is on by default whenever the agent has subagents, and you can turn it off. Nothing else crosses the QuickJS boundary unless you expose it.

Programmatic tool calling (PTC)

Programmatic tool calling (PTC) exposes selected agent tools inside the interpreter under the global tools namespace. Instead of asking the model to issue one tool call, wait for the result, and then decide the next call, the agent can write code that calls tools in loops, branches, retries, or parallel batches. This helps when intermediate results are only inputs to the next step: the interpreter filters or aggregates them before anything returns to the model, keeping multi-step workflows token-efficient. PTC is model-agnostic because it is implemented by middleware rather than a provider-specific tool-calling API.
The middleware exposes each allowlisted tool as an async function under tools. The agent calls it with await, processes the result in code, and the model sees only the final interpreter output, not every intermediate value. Tool names are converted to camel case while the input object still follows the tool’s schema, so a tool named web_search becomes tools.webSearch(...):
// The call resolves to the tool's output (a string for this tool).
const result = await tools.webSearch({
  query: "deepagents interpreters",
});
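The naming rule can be sketched in Python. This is a minimal illustration that assumes a plain snake_case to camelCase conversion; `to_camel_case` is a hypothetical helper, not part of langchain-quickjs, and the middleware's handling of edge cases (leading underscores, digits, existing capitals) may differ.

```python
def to_camel_case(tool_name: str) -> str:
    """Convert a snake_case tool name to camelCase (assumed rule, not the library's code)."""
    head, *rest = tool_name.split("_")
    # Upper-case the first letter of each later segment; leave the rest untouched.
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


# Show how a few tool names would appear under the tools namespace.
for name in ("web_search", "read_file", "eval"):
    print(f"{name} -> tools.{to_camel_case(name)}")
```

Only the function name changes. The argument object keeps the field names from the tool's schema, as in the `query` field above.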

Enable PTC

Enable PTC with an explicit allowlist:
from deepagents import create_deep_agent
from langchain_quickjs import CodeInterpreterMiddleware

agent = create_deep_agent(
    model="openai:gpt-5.5",
    middleware=[CodeInterpreterMiddleware(ptc=["web_search"])],
)
After PTC is enabled, the agent can call the allowlisted tool from interpreter code. This example searches several topics in parallel and combines the results before returning to the model:
const topics = ["retrieval", "memory", "evaluation"];

const results = await Promise.all(
  topics.map((topic) =>
    tools.webSearch({ query: `${topic} best practices 2025` }),
  ),
);

results.join("\n\n");
PTC calls currently execute through the interpreter bridge and do not go through the normal tool calling path. As a result, interrupt_on approval workflows are not enforced per PTC-invoked tool call.

Programmatic subagents

Programmatic subagents let the interpreter dispatch configured subagents from code using the built-in task() global. A task that spans many independent units, such as reviewing every file in a directory or triaging a batch of tickets, becomes a loop that fans work out and synthesizes the results. Use programmatic subagents for:
  • Fan-out and synthesize: Run the same kind of work across many items in parallel, then combine the results.
  • Verification: Send findings to independent verifier subagents and keep only confirmed results.
  • Recursive workflows: Keep a working set in interpreter variables, select slices, call subagents, and refine the result.
For configuration, examples, orchestration patterns, and safety notes, see Programmatic subagents.

Persistence

CodeInterpreterMiddleware snapshots interpreter state after each agent run and restores it before the next run by default. A snapshot is a serialized copy of the interpreter’s in-memory JavaScript state, including globals, variables, functions, and imported modules that exist when the agent finishes running code. Across conversation turns, the lifecycle is:
  1. A turn starts, and CodeInterpreterMiddleware restores the latest interpreter snapshot for the thread.
  2. The agent calls eval, and the code can read or mutate interpreter variables.
  3. The agent run finishes, and the middleware snapshots the updated interpreter state into graph state.
  4. The next turn starts from that restored interpreter state instead of an empty runtime.
Within a single agent run, repeated eval calls use the live interpreter context object. The middleware does not snapshot and restore between those calls; it snapshots the context when the run completes so it can be restored on a later turn or checkpoint replay.
Between conversation turns, snapshots only retain values that can be serialized. Use them for data, not for live runtime objects. Functions, classes, and other unserializable values are restored as inaccessible artifacts. If interpreter code accesses one after restore, the eval tool will throw an error like Value for 'fn' was not restored because it is not serializable (type: function).
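The "data, not live objects" rule can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not how langchain-quickjs implements snapshots: it uses JSON as a stand-in serializer, and `snapshot`, `restore`, `read`, and `NotRestored` are hypothetical names chosen for the example.

```python
import json


class NotRestored:
    """Placeholder for a value that could not be serialized into the snapshot."""

    def __init__(self, type_name: str) -> None:
        self.type_name = type_name


def snapshot(state: dict) -> str:
    """Serialize JSON-compatible values; record only the type of everything else."""
    data, skipped = {}, {}
    for name, value in state.items():
        try:
            json.dumps(value)
        except TypeError:
            skipped[name] = type(value).__name__
        else:
            data[name] = value
    return json.dumps({"data": data, "skipped": skipped})


def restore(blob: str) -> dict:
    """Rebuild state, leaving a placeholder wherever a value was skipped."""
    payload = json.loads(blob)
    state = dict(payload["data"])
    for name, type_name in payload["skipped"].items():
        state[name] = NotRestored(type_name)
    return state


def read(state: dict, name: str):
    """Return a restored value, or raise an error modeled on the message quoted above."""
    value = state[name]
    if isinstance(value, NotRestored):
        raise RuntimeError(
            f"Value for '{name}' was not restored because it is not "
            f"serializable (type: {value.type_name})"
        )
    return value


# One turn ends with plain data and a helper function in interpreter state.
turn_one = {"totals": {"alpha": 29, "beta": 13}, "fn": lambda x: x * 2}
turn_two = restore(snapshot(turn_one))

print(read(turn_two, "totals"))  # plain data survives the round trip
try:
    read(turn_two, "fn")
except RuntimeError as err:
    print(err)  # the function does not
```

In practice this suggests keeping results as plain data and redefining any helper functions in the eval call that needs them.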
Snapshots preserve interpreter memory, not outside-world effects. If interpreter code calls a tool through PTC, restoring a prior interpreter snapshot does not undo side effects from that tool call. It only restores the interpreter variables that recorded or processed the result.
When the graph uses a checkpointer, this pairs with LangGraph time travel. Restoring a graph checkpoint can restore the interpreter snapshot stored in graph state, so you can return to an earlier agent context and interpreter state while debugging or replaying.
from deepagents import create_deep_agent
from langchain_quickjs import CodeInterpreterMiddleware
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()

agent = create_deep_agent(
    model="openai:gpt-5.5",
    checkpointer=checkpointer,
    middleware=[
        CodeInterpreterMiddleware(
            snapshot_between_turns=True,  # Default
        )
    ],
)
You can disable cross-turn snapshots with snapshot_between_turns=False.
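To make the earlier point about side effects concrete, here is a small Python sketch. It is a toy model rather than the middleware's behavior: `send_email` and `outbox` are hypothetical stand-ins for a side-effecting tool and the outside world, and `copy.deepcopy` stands in for taking and restoring a snapshot.

```python
import copy

outbox: list[str] = []  # stand-in for the outside world (hypothetical)


def send_email(to: str) -> str:
    """Hypothetical side-effecting tool, as it might be reached through PTC."""
    outbox.append(to)
    return f"sent:{to}"


state = {"log": []}                # interpreter variables
checkpoint = copy.deepcopy(state)  # snapshot taken before the tool call

state["log"].append(send_email("a@example.com"))

state = copy.deepcopy(checkpoint)  # restore the earlier snapshot

print(state["log"])  # interpreter memory is rolled back
print(outbox)        # the email was still sent
```

After a restore or replay, interpreter variables may no longer record a call that already happened, so tools with external effects are worth designing to tolerate being called again.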

Security

Interpreters use QuickJS to run untrusted JavaScript with strict default isolation. Treat that as a scoped interpreter runtime, not a full production sandbox backend. Every tool you expose through PTC is an outside capability that interpreter code can use. Treat the PTC allowlist as a permission boundary: expose only the tools the agent needs, and avoid bridging broad tools that can access sensitive systems, spend money, mutate data, or call unrestricted networks unless that behavior is intentional.
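One way to keep a bridged capability narrow is to validate inputs inside the tool itself before it touches anything external. The sketch below shows that idea for a network tool, in plain Python with no network call. `ALLOWED_HOSTS` and `check_docs_url` are hypothetical names, and wiring the check into a real tool passed through the PTC allowlist is left out.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts that a narrow network tool may reach.
ALLOWED_HOSTS = frozenset({"docs.example.com"})


def check_docs_url(url: str) -> str:
    """Return the URL unchanged if it targets an allowed HTTPS host, else raise."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"only https URLs are allowed, got {parts.scheme!r}")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host {parts.hostname!r} is not on the allowlist")
    return url


# A tool would call this first and only then perform the request.
print(check_docs_url("https://docs.example.com/interpreters"))
```

Exposing a tool scoped like this gives interpreter code one specific capability instead of unrestricted network access.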
| Capability | Available by default | How to expose it |
| --- | --- | --- |
| JavaScript execution | Yes | Add interpreter middleware |
| Top-level await | Yes | Use promises in interpreter code |
| console.log capture | Yes | Disable with capture_console=False |
| Agent tools | No | Add a PTC allowlist |
| Filesystem access | No | Add the built-in filesystem tools via the PTC allowlist |
| Network access | No | Expose a specific network tool through PTC |
| Wall-clock or datetime access | No | Expose an explicit time tool if needed |
| Shell commands, package installs, tests, OS-level execution | No | Use a sandbox backend |
How code execution works
Interpreter code runs in an embedded QuickJS context, not a separate VM or process. In Python, this runtime is provided by quickjs-rs, which documents the same-process execution boundary in its Security guide.
Treat interpreters as a capability-scoped execution layer, not a host-memory isolation boundary. For untrusted or semi-trusted code, run agents in isolated worker processes or containers and keep the PTC allowlist narrow.

Configuration

CodeInterpreterMiddleware accepts the following options:
| Kwarg | Default | Purpose |
| --- | --- | --- |
| memory_limit | 64 * 1024 * 1024 (64 MB) | QuickJS heap memory limit in bytes. |
| timeout | 5.0 | Per-eval timeout in seconds. |
| max_ptc_calls | 256 | Maximum tools.* calls per eval. Use None only in trusted environments. |
| tool_name | "eval" | Name of the interpreter tool exposed to the model. |
| max_result_chars | 4000 | Maximum characters returned from result and stdout blocks. |
| capture_console | True | Whether console.log, console.warn, and console.error output is captured. |
| subagents | True | Expose the built-in task() global for programmatic subagents. Set to False to require subagent dispatch through the normal task tool path. |
| ptc | None | PTC allowlist: list of tool names or BaseTool instances. |
| snapshot_between_turns | True | Whether interpreter state snapshots persist across agent turns. |
| max_snapshot_bytes | None | Maximum serialized snapshot size. Defaults to memory_limit. |
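Several of the options above can be combined in one constructor call. The fragment below is a sketch of a tighter-than-default configuration using only the kwargs listed in the table. The values are illustrative rather than recommendations, it assumes the agent has a web_search tool available, and since interpreters are in beta the accepted options may change between releases.

```python
from deepagents import create_deep_agent
from langchain_quickjs import CodeInterpreterMiddleware

interpreter = CodeInterpreterMiddleware(
    memory_limit=32 * 1024 * 1024,  # 32 MB heap instead of the 64 MB default
    timeout=2.0,                    # seconds allowed per eval call
    max_ptc_calls=50,               # cap on tools.* calls per eval
    max_result_chars=2000,          # shorter result and stdout blocks
    ptc=["web_search"],             # the only tool bridged into the interpreter
    subagents=False,                # keep subagent dispatch on the normal task tool path
    snapshot_between_turns=False,   # start each turn from an empty runtime
)

agent = create_deep_agent(
    model="openai:gpt-5.5",
    middleware=[interpreter],
)
```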