AI Agent execution in a Control Sandbox

Understand how an AI Agent executes within a Control Sandbox

Updated over 2 months ago

AI agents utilize generative-AI models (LLMs), and all LLMs have some common core capabilities:

  • The ability to have contextual conversational interaction and follow user-provided instructions.

  • "Knowledge" of a broad collection of information from public sources (though usually a few months stale).

  • The ability to understand and generate content (documents, images, videos).

"Agent Environment" vs "LLM"

It is important to distinguish between AI agents (things that do work on behalf of the user) and the large language models (LLMs) that power them. The LLM used in Thunk.AI could be GPT-4o, Claude Sonnet, Google Gemini, or, in the future, other LLMs.

An AI agent is a directed and controlled use of an LLM to achieve a specific task. This is implemented by running the LLM under the control of an "AI agent environment". In Thunk.AI, the platform runtime is responsible for implementing and managing this environment.

The agent environment sets up the appropriate context and messages to send to the LLM. It then interprets the response of the LLM, takes some actions based on that, and repeats this iteratively until it decides that the work is done. The LLM can only respond by asking the environment to invoke one or more "AI tools". The tools are the LLM's way of requesting that the environment fetch extra information or perform an action on its behalf.
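The loop described above can be sketched in a few lines. This is a minimal illustration, not the Thunk.AI implementation: `call_llm` is a stub standing in for a real model, and all names are hypothetical.

```python
# Minimal sketch of an agent-environment loop. The environment sends
# messages, the (stubbed) LLM responds with tool calls, the environment
# executes them, and the loop repeats until the LLM signals it is done.

def call_llm(messages):
    """Stub LLM: asks for a tool on the first turn, then finishes."""
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "name": "fetch_order", "args": {"id": 42}}
    return {"type": "final", "text": "Order 42 is shipped."}

# Tools the environment exposes; the LLM can only name them, never run them.
TOOLS = {"fetch_order": lambda args: {"id": args["id"], "status": "shipped"}}

def run_agent(task):
    messages = [{"role": "system", "content": "You are a helpful agent."},
                {"role": "user", "content": task}]
    while True:
        response = call_llm(messages)
        if response["type"] == "final":      # the environment decides when work is done
            return response["text"]
        # The environment, not the LLM, actually invokes the tool.
        result = TOOLS[response["name"]](response["args"])
        messages.append({"role": "tool", "content": str(result)})

print(run_agent("What is the status of order 42?"))
```

Note that the model never touches the data layer; it can only emit a structured request that the environment chooses whether and how to satisfy.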

The important takeaways are:

  • The LLM never accesses information directly. Any information needed by the LLM is provided by the AI agent, either as part of setting up the environment or by responding to tool call requests.

  • The LLM never updates data directly. Any such changes are made by the AI environment by responding to tool call requests.
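These two takeaways can be made concrete with a small sketch of how an environment mediates a data update. Everything here is illustrative (the tool name, the data store, the call shape are assumptions, not Thunk.AI APIs): the LLM only proposes a structured change, and the environment performs the actual write.

```python
# Sketch: the LLM proposes a structured update; only the environment
# applies it to the data store.

data_store = {"task-1": {"status": "open"}}

def apply_tool_call(call):
    """The environment, not the LLM, performs the actual write."""
    if call.get("name") == "update_status":
        record = data_store[call["args"]["id"]]
        record["status"] = call["args"]["status"]
        return {"ok": True, "record": record}
    return {"ok": False, "error": "unknown tool"}

# A proposed tool call, as it might arrive in the LLM's response:
proposed = {"name": "update_status", "args": {"id": "task-1", "status": "done"}}
print(apply_tool_call(proposed))            # the write happens here, in the environment
print(data_store["task-1"]["status"])       # the store reflects the change
```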

The Thunk.AI Agent Environment

LLMs can only respond to instructions with responses. It is up to the agent environment to provide them meaningful instructions ("Steering"), meaningful data ("Grounding"), and meaningful capabilities via AI tools. However, this is only part of the required behavior: it is equally important to handle the responses appropriately ("Validation"). Collectively, this is where the AI agentic layer of the Thunk.AI platform plays a crucial role.
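A rough sketch of these three responsibilities, under assumed names (none of these functions are Thunk.AI APIs): steering and grounding happen when the environment assembles the messages, and validation happens when it inspects the raw response before acting on it.

```python
# Sketch of "steering" (instructions), "grounding" (data), and
# "validation" (checking the response shape before acting on it).
import json

def build_messages(instructions, grounding_records, user_request):
    return [
        {"role": "system", "content": instructions},                  # steering
        {"role": "system", "content": json.dumps(grounding_records)}, # grounding
        {"role": "user", "content": user_request},
    ]

def validate_response(raw):
    """Validation: accept only well-formed JSON with a known 'type' field."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return parsed if parsed.get("type") in {"tool_call", "final"} else None

msgs = build_messages("Answer from the records only.",
                      [{"order": 42, "status": "shipped"}],
                      "Where is order 42?")
print(validate_response('{"type": "final", "text": "Shipped."}'))  # accepted
print(validate_response("not json"))                               # rejected: None
```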

Every AI agent runs within and is controlled and constrained by a control sandbox.

The Control Sandbox

Control, reliability, and security are enabled by the control sandbox. This sandbox performs several critical functions:

  1. It invokes the AI agent in a loop until the task is complete.

  2. It sets up the appropriate context and instructions to provide the AI agent. This context varies dynamically as the loop progresses, reflecting the AI agent's progress through the task at hand and keeping it focused on the current aspect of the task.

  3. It limits the responses of the AI agent to a specific set of structured AI tool calls.

  4. It vets and validates every proposed tool call, executes it, checks the result, and passes it back to the AI agent for the next iteration of processing.

  5. It implements several mechanisms that mitigate common flaws in LLMs (like hallucination, inconsistency, early termination, etc).
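Step 3 and step 4 above hinge on rejecting anything outside the allowed set of tool calls. A minimal sketch of that vetting step, under assumed tool names and a simplified schema check (illustrative only, not the actual Thunk.AI implementation):

```python
# Sketch of tool-call vetting in a control sandbox: each proposed call is
# checked against an allow-list and its required arguments before the
# sandbox will execute it.

ALLOWED_TOOLS = {
    "search_web": {"required_args": {"query"}},
    "send_email": {"required_args": {"to", "subject", "body"}},
}

def vet_tool_call(call):
    """Reject calls to unknown tools or calls missing required arguments."""
    spec = ALLOWED_TOOLS.get(call.get("name"))
    if spec is None:
        return False, "tool not in allow-list"
    missing = spec["required_args"] - set(call.get("args", {}))
    if missing:
        return False, f"missing arguments: {sorted(missing)}"
    return True, "ok"

print(vet_tool_call({"name": "delete_db", "args": {}}))               # rejected
print(vet_tool_call({"name": "send_email", "args": {"to": "a@b.c"}})) # rejected
print(vet_tool_call({"name": "search_web", "args": {"query": "x"}}))  # accepted
```

Only calls that pass this gate are executed; everything else is refused and reported back to the agent, which is one way the sandbox constrains what the LLM can cause to happen.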

The control sandbox is the most important and novel platform innovation in the Thunk.AI platform. AI reliability stems from the effective use of the control sandbox. Read more about AI reliability in Thunk.AI here.
