Every thunk can define content folders to provide information relevant to the intended work. The documents in these folders become available to the AI agents executing the workflow. For example, a customer support thunk may define a folder of product usage documents that can help answer questions.
A thunk has two content folders by default:
“Compliance Policies”: as the name suggests, this folder should hold documents that contain policies that guide and constrain the execution of work in a business.
“Documents”: this is a catch-all default folder for any relevant documents that the workflow logic might need to look up.
The thunk owner/designer can create additional content folders and populate them with suitable curated collections of documents.
Each content folder can include documents in a variety of formats, images, video, or web pages. They can hold a small set of content, or they can hold thousands of documents. New content can be added via a form, via a bulk load from a file system, via links from a spreadsheet, via instructions to the planning AI agent (eg: “add the top 50 pages on https:support.tesla.com” or “add all Google Drive documents starting with ProductInfo”), or via API calls.
The Thunk.AI platform automatically indexes the content in all these documents and makes it available to the runtime AI agents in the form search “tools”. For example, “search_compliance_policies” and “search_documents” are automatically generated tools that the AI agents can use as appropriate.
Curation
Any specific workflow process may benefit from access to relevant documents. A naive approach might be to collect every document on the corporate internet, index it all, and expect the AI agents to sift through them and make sense of them. The problem with this approach is that many documents are outdated or incorrect or just irrelevant to the specific workflow process. Providing incorrect or irrelevant information can adversely compromise the quality of AI outcomes. This is why Thunk.AI intentionally gives the thunk owner the power to create multiple content folders, and to populate each of them with curated content collections.
Structured Properties
It is quite common for documents to be associated with structured properties. For example, documents that describe features of products at a car company might have properties like the car model and model years. Documents that represent research studies may have a title and authors. Structured properties are very useful in improving the relevance of documents retrieved and used by AI agents.
Every content folder in a thunk has settings that can be modified to add structured properties. In some situations, documents already reside in a specialied document store with structured properties and are added to a thunk via an API. In these cases, the document and its properties can be added at the same time. In other cases, the properties need to be extracted from the document in an intelligent way. Every content folder can define “AI Properties” that define AI Instructions to extract and compute property values from each document. These properties are extracted by an AI agent and indexed before the content folder is ready for use by the workflow. AI properties add significantly to the initial time and cost of setting up a content folder, however they improve the quality and efficiency of subsequent retrieval during workflow execution.
Compliance Policies – special behavior
The pre-defined Compliance Policies folder is treated specially in a few ways:
Depending on the specific AI instructions in any workflow step, its contents are dynamically referenced to find appropriate policies that influence the behavior of the AI agent
When the “Checker” feature is enabled to improve AI agent reliability, policies defined in this folder are retrieved and checked for compliance.
To prevent injection attacks, most data and content is marked as being untrusted and they cannot provide instructions to the AI agents. However, the content in the Compliance Policies folder is intended to act as instructions and it is not treated as untrusted content. Thunk owners should keep this in mind and intentionally curate the content of the Compliance Policies folder appropriately.