How to troubleshoot AI reliability errors
Identify where the AI automation didn’t do what you expect
Identify the location of the problem: in which step, and at which turn of the agent loop, did the error occur?
Wherever an error manifests, look at the “AI Fingerprint”. It contains the AI Instructions, the specific input values, and a trace of the agent-loop messages.
Expand the message loop to find the message that produced the error (see the sketch after this list).
Classify the error (see the classes of errors below) and list hypotheses (e.g., bad instruction prompts, bad input property bindings, an error in a previous step)
Does the AI Fingerprint confirm a hypothesis for the root cause?
With the root cause known, iterate on a fix until successful:
make a change to address the hypothesis
re-run the failing step
If it seems like a platform error, report to [email protected]
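The sketch below shows the trace inspection from the list above in miniature. The real AI Fingerprint format is platform-specific; here the trace is assumed, purely for illustration, to be a list of message dicts with "role", "tool", and optional "error" keys.

```python
# Hypothetical trace shape -- the platform's real AI Fingerprint format may differ.
trace = [
    {"role": "assistant", "tool": "open_url", "args": {"url": "https://example.com/doc"}},
    {"role": "tool", "tool": "open_url", "error": "404 Not Found"},
    {"role": "assistant", "content": "I could not read the document."},
]

# Walk the agent loop and stop at the first message that produced an error.
first_error = next((m for m in trace if m.get("error")), None)
if first_error:
    print(f"Error at tool call {first_error['tool']}: {first_error['error']}")
```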
Classes of AI reliability errors
There are eight classes of reliability errors:
1: It overlooked information you wanted
You gave it an email and expected it to use the Subject
AI Trace: did it read the email and is the email content in the agent message loop?
Obvious: check the AI instructions: did you tell it to record the Subject? Recorded properties get greater attention in subsequent workflow logic.
Not so obvious: do you have a state property called “Subject” with a description saying it should hold the email subject? And do the step's AI Instructions include that property in its Outputs? (See the configuration sketch at the end of this class.)
You gave it a document and expected it would use some data in it
Obvious: did it read the document (look for an “open_url” tool call)
Not so obvious: was it able to read the document format? Was the content too long, so it got cut off? (Configuration settings control the cutoff.)
Did you ask it to record the specific data you want it to use subsequently?
You gave it a document with instructions you wanted it to follow
FYI: for security reasons (to resist injection attacks), the platform tries to avoid following instructions embedded inside content/data. Put instructions in the AI Instructions, and put links to instruction documents there as well.
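To make the “not so obvious” checks above concrete, here is a minimal sketch of how a step could be configured. All field names (ai_instructions, state_properties, outputs) are illustrative, not the platform's real schema; the point is that the property is named, described, and listed in the step's Outputs.

```python
# Hypothetical step configuration (field names are illustrative only).
step = {
    "ai_instructions": (
        "Read the incoming email. "
        "Record the email's subject line in the Subject property."
    ),
    "state_properties": {
        "Subject": {
            "type": "string",
            "description": "The subject line of the incoming email.",
        },
    },
    "outputs": ["Subject"],  # the property must appear in the step's Outputs
}
```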
2: It didn't record specific information correctly
You gave it an email and expected it to record the Subject
AI Trace: did it read the email and is the email content in the agent message loop?
Obvious: check the AI instructions: did you tell it to copy the Subject?
Not so obvious: do you have a state property called “Subject” with a description saying it should hold the email subject?
Not so obvious: do the step's AI Instructions include that property in its Outputs?
Not so obvious: was the UpdateItem tool call enabled?
You gave it a PDF and expected it would extract/record some info from it
Obvious: did it read the document (look for an “open_url” tool call)
Not so obvious: was it able to read the document format? Was the content too long, so it got cut off? (Configuration settings control the cutoff.)
Obvious: Did you ask it to record the specific data you want it to use?
Not so obvious: Are the property name, description, and type meaningful?
Not so obvious: was the UpdateItem tool call enabled? (See the trace-check sketch at the end of this class.)
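A quick way to run the trace checks above, assuming (hypothetically) that you have extracted the list of tool-call names from the AI Trace:

```python
# Tool-call names extracted from a failing run's AI Trace (illustrative).
tool_calls = ["open_url", "web_search"]

if "open_url" not in tool_calls:
    print("The agent never read the document; check the link and instructions.")
if "UpdateItem" not in tool_calls:
    print("No UpdateItem call; check whether the UpdateItem tool is enabled.")
```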
3: It hallucinated values
You asked it to fetch the date from a document. It wrote some arbitrary date
AI Trace: did it read the document?
AI Trace: did it get the contents of the document? Does it include the date?
Obvious: check the AI instructions: did you tell it where to find the date in the document (e.g., on the first page)?
Not so obvious: do you have a state property called “Date” with a description saying it should hold the date from the document?
FYI: hallucination occurs when (a) the model considers some information necessary for the output, and (b) that information is missing from the input
Ask why the information isn't in the input (it could be an upstream error; investigate)
If it is valid for the information to be missing from the input, provide an explicit escape hatch for the output, e.g., give the Date property the description “date from document title, or N/A” (see the sketch below)
Can you use a better AI model? (Modify this in the thunk settings.)
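A minimal sketch of the escape-hatch pattern, using a hypothetical property definition: because the description explicitly allows “N/A”, the model is never forced to invent a date when the input lacks one.

```python
# Hypothetical property definition (field names are illustrative).
date_property = {
    "name": "Date",
    "type": "string",
    "description": "Date from the document title, or 'N/A' if no date is present.",
}
```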
4: It refuses to make a specific tool call
You asked it to do a web_search, but it never does it
Not so obvious: in the AI trace, the first message records a planned sequence of tool calls. Is web_search included in this plan? If not, the tool was probably not enabled (see below), or the instructions do not clearly ask for a web search. (A sketch of this plan check appears at the end of this class.)
Not so obvious: at the level of the thunk tools, is the web library enabled?
Not so obvious: look at the enabled tools in the AI instructions. Is the web library disabled? Is the web_search tool within the web library disabled?
Not so obvious: enable the debug flag and verify that the tool was actually provided to the LLM
You asked it to do a lookup_sheet but it never does it
AI Trace: do you find an open_url call where it “opened” the sheet? Some tools (like lookup_sheet) are only enabled when a sheet is opened
Did the open_url call succeed?
Obvious: Did you instruct it which sheet to use and to lookup something in the sheet?
Not so obvious: is the appropriate tool library (Google Sheets or Microsoft Excel) enabled, and is the appropriate lookup tool enabled?
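A sketch of the plan check described above, assuming (hypothetically) that you have copied the planned tool sequence out of the first message of the AI trace:

```python
# Planned tool calls recorded in the trace's first message (illustrative).
planned_tools = ["open_url", "UpdateItem"]

if "web_search" not in planned_tools:
    print("web_search was never planned: either the tool is disabled "
          "(at the thunk or step level) or the instructions don't ask for a search.")
```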
5: It didn't provide the right parameters to a tool call
You asked it to do a web search, but the results are wrong
AI Trace: you find the web_search tool call and see that the query term is too generic
Not so obvious: in the AI Instructions, provide an example of a query term (e.g., “when using web_search, use a query like ‘clinical trials for pediatric AML’; don't do generic searches like ‘cancer trials’”)
Not so obvious: tools have multiple parameters, some optional. web_search allows a site restriction; if you want one, specify it in the instructions (e.g., restrict to site clinicaltrials.gov). (See the instruction sketch at the end of this class.)
Not so obvious: in the tools configuration, specify a constraint on the tool parameters
You asked it to create a file in a folder, but it creates a file in the wrong folder
AI Trace: you find the add_file tool call and see that before that, it opened the wrong folder
Obvious: Did you instruct it clearly which folder to use?
Not so obvious: where is the erroneous folder coming from? Does the step involve two different folders and work that has to be done in each?
Not so obvious: Consider giving each folder a semantic name (InputsFolder, ResultsFolder)
Not so obvious: break the step into two steps
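A sketch of instructions that constrain tool parameters, combining the query example, the site restriction, and the semantic folder names from above (the wording is illustrative):

```python
# Illustrative AI Instructions that constrain tool parameters up front.
ai_instructions = """
When using web_search, use a specific query like 'clinical trials for
pediatric AML'; do not run generic searches like 'cancer trials'.
Restrict web searches to the site clinicaltrials.gov.
Read inputs only from InputsFolder; write results only to ResultsFolder.
"""
```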
6: It does different things on different runs
Different behavior with exactly the same inputs
Assume: the instructions did not change at all between runs
Hypothesis 1: inadequate detail. Are there meaningful instructions? Are there output property bindings with well-defined, schematized properties?
Hypothesis 2: too much context. Is the instruction huge or complex? Is there a lot of input data? Are there more than 20 tools for the AI agent to choose from?
FYI: reliability comes from two things:
Greater instruction detail increases AI reliability
A simpler context increases AI reliability
Different behavior with marginally different inputs or instructions
AI Trace: open both traces and look at their differences (see the diff sketch at the end of this class)
If one is a rational choice but the other is wrong, troubleshoot the error case
If both are rational (but different) choices, you have a case of inconsistency.
Under-constrained AI choices (many similar options to pick from, or inadequate instruction detail) hamper AI reliability
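For the two-trace comparison, Python's standard difflib works well once each trace is dumped to lines of text (the trace lines below are placeholders):

```python
import difflib

# One line per agent-loop event, dumped from each run's AI Trace (placeholders).
run_a = ["plan: open_url, web_search", "open_url(doc)", "web_search('pediatric AML trials')"]
run_b = ["plan: open_url", "open_url(doc)", "answered without searching"]

# Print a unified diff to see exactly where the two runs diverge.
for line in difflib.unified_diff(run_a, run_b, fromfile="run_a", tofile="run_b", lineterm=""):
    print(line)
```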
7: It didn't call the right sequence of tool calls
You expect it to do A then B then C, but it gets the order wrong
Not so obvious: in the AI trace, the first message records a planned sequence of tool calls. Is A -> B -> C listed in this plan?
Not so obvious: can you write the AI Instructions as a bulleted list with three bullets, one mapping to each of A, B, and C?
You asked it to do A, B, C, and D but it skips C
Not so obvious: in the AI trace, the first message records a planned sequence of tool calls. Is A -> B -> C -> D listed in this plan? (The sketch at the end of this class shows a simple in-order check.)
Is tool C enabled (at the thunk level, and at the step level)?
Look at tool C's name and description: are they clear enough to match the AI instructions?
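Checking whether the plan contains A -> B -> C in order is an in-order subsequence test (other tool calls may be interleaved). A small self-contained sketch, with placeholder tool names:

```python
def plan_contains_sequence(planned: list[str], required: list[str]) -> bool:
    """True if `required` appears in `planned` in order (gaps allowed)."""
    it = iter(planned)
    return all(tool in it for tool in required)  # `in` consumes the iterator

planned = ["open_url", "A", "web_search", "B", "C"]  # from the first message
print(plan_contains_sequence(planned, ["A", "B", "C"]))  # True
print(plan_contains_sequence(planned, ["A", "C", "B"]))  # False: wrong order
```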
8: Seems like a platform error
No platform error shown but the AI results are wrong
Not so obvious: use a different/better AI model
Not so obvious: use debug mode to look at the actual trace of the LLM call
Stuck spinning, taking a long time, or timing out (5 minutes max)
AI Trace: are there validation errors in tool calls? This might be a configuration error; report it to [email protected]
Are you trying to read a website that takes a long time to load? Or a document that is very large?
System errors shown in the AI chat window
“LLM error”: usually a configuration or API-key problem. Try a standard AI model, or contact [email protected]
“Repeated tool calls error”: the AI agent is stuck in a loop and cannot terminate, usually because of a configuration error. Contact [email protected]
