How to troubleshoot AI reliability errors
Identify where the AI automation didn’t do what you expect
Identify the location of the problem: in which step, and at which turn of the agent loop, did the error occur?
Wherever an error manifests, look at the “AI Fingerprint”. It contains the AI Instructions, the specific input values, and a trace of the agent-loop messages.
Expand the message loop to find the message that produced the error (see the sketch after this list).
Classify the error (see the classes of errors below) and list hypotheses (e.g., bad instruction prompts, bad input property bindings, an error in a previous step)
Does the AI Fingerprint confirm a hypothesis for the root cause?
With the root cause known, iterate on a fix until successful:
make a change to address the hypothesis
re-run the failing step
If it seems like a platform error, report to [email protected]
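The sketch below shows the trace inspection from the list above in miniature. The real AI Fingerprint format is platform-specific; here the trace is assumed, purely for illustration, to be a list of message dicts with "role", "tool", and optional "error" keys.

```python
# Hypothetical trace shape -- the platform's real AI Fingerprint format may differ.
trace = [
    {"role": "assistant", "tool": "open_url", "args": {"url": "https://example.com/doc"}},
    {"role": "tool", "tool": "open_url", "error": "404 Not Found"},
    {"role": "assistant", "content": "I could not read the document."},
]

# Walk the agent loop and stop at the first message that produced an error.
first_error = next((m for m in trace if m.get("error")), None)
if first_error:
    print(f"Error at tool call {first_error['tool']}: {first_error['error']}")
```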
Classes of AI reliability errors
There are eight classes of reliability errors:
1: It overlooked information you wanted
You gave it an email and expected it to use the Subject
AI Trace: did it read the email and is the email content in the agent message loop?
Obvious: check the AI instructions: did you tell it to record the Subject? Recorded properties get greater attention in subsequent workflow logic.
Not so obvious: do you have a state property called “Subject” with a description saying it should hold the email subject? And do the step's AI Instructions include that property in its Outputs? (See the configuration sketch at the end of this class.)
You gave it a document and expected it would use some data in it
Obvious: did it read the document (look for an “open_url” tool call)
Not so obvious: was it able to read the document format? Was the content too long, so it got cut off? (Configuration settings control the cutoff.)
Did you ask it to record the specific data you want it to use subsequently?
You gave it a document with instructions you wanted it to follow
FYI: for security reasons (to resist injection attacks), the platform tries to avoid following instructions embedded inside content/data. Put instructions in the AI Instructions, and put links to instruction documents there as well.
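To make the “not so obvious” checks above concrete, here is a minimal sketch of how a step could be configured. All field names (ai_instructions, state_properties, outputs) are illustrative, not the platform's real schema; the point is that the property is named, described, and listed in the step's Outputs.

```python
# Hypothetical step configuration (field names are illustrative only).
step = {
    "ai_instructions": (
        "Read the incoming email. "
        "Record the email's subject line in the Subject property."
    ),
    "state_properties": {
        "Subject": {
            "type": "string",
            "description": "The subject line of the incoming email.",
        },
    },
    "outputs": ["Subject"],  # the property must appear in the step's Outputs
}
```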
2: It didn't record specific information correctly
You gave it an email and expected it to record the Subject
AI Trace: did it read the email and is the email content in the agent message loop?
Obvious: check the AI instructions: did you tell it to copy the Subject?
Not so obvious: do you have a state property called “Subject” with a description saying it should hold the email subject?
Not so obvious: do the step's AI Instructions include that property in its Outputs?
Not so obvious: was the UpdateItem tool call enabled?
You gave it a PDF and expected it would extract/record some info from it
Obvious: did it read the document (look for an “open_url” tool call)
Not so obvious: was it able to read the document format? Was the content too long, so it got cut off? (Configuration settings control the cutoff.)
Obvious: Did you ask it to record the specific data you want it to use?
Not so obvious: Are the property name, description, and type meaningful?
Not so obvious: was the UpdateItem tool call enabled? (See the trace-check sketch at the end of this class.)
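A quick way to run the trace checks above, assuming (hypothetically) that you have extracted the list of tool-call names from the AI Trace:

```python
# Tool-call names extracted from a failing run's AI Trace (illustrative).
tool_calls = ["open_url", "web_search"]

if "open_url" not in tool_calls:
    print("The agent never read the document; check the link and instructions.")
if "UpdateItem" not in tool_calls:
    print("No UpdateItem call; check whether the UpdateItem tool is enabled.")
```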
3: It hallucinated values
You asked it to fetch the date from a document. It wrote some arbitrary date
AI Trace: did it read the document?
AI Trace: did it get the contents of the document? Does it include the date?
Obvious: check the AI instructions: did you tell it where to find the date in the document (e.g., on the first page)?
Not so obvious: do you have a state property called “Date” with a description saying it should hold the date from the document?
FYI: hallucination occurs when (a) the model considers some information necessary for the output, and (b) that information is missing from the input
Ask why the information isn't in the input (it could be an upstream error; investigate)
If it is valid for the information to be missing from the input, provide an explicit escape hatch for the output, e.g., give the Date property the description “date from document title, or N/A” (see the sketch below)
Can you use a better AI model? (Modify this in the thunk settings.)
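A minimal sketch of the escape-hatch pattern, using a hypothetical property definition: because the description explicitly allows “N/A”, the model is never forced to invent a date when the input lacks one.

```python
# Hypothetical property definition (field names are illustrative).
date_property = {
    "name": "Date",
    "type": "string",
    "description": "Date from the document title, or 'N/A' if no date is present.",
}
```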
4: It refuses to make a specific tool call
You asked it to do a web_search, but it never does it
Not so obvious: in the AI trace, the first message records a planned sequence of tool calls. Is web_search included in this plan? If not, the tool was probably not enabled (see below), or the instructions do not clearly ask for a web search. (A sketch of this plan check appears at the end of this class.)
Not so obvious: at the level of the thunk tools, is the web library enabled?
Not so obvious: look at the enabled tools in the AI instructions. Is the web library disabled? Is the web_search tool within the web library disabled?
Not so obvious: enable the debug flag and verify that the tool was actually provided to the LLM
You asked it to do a lookup_sheet but it never does it
AI Trace: do you find an open_url call where it “opened” the sheet? Some tools (like lookup_sheet) are only enabled when a sheet is opened
Did the open_url call succeed?
Obvious: Did you instruct it which sheet to use and to lookup something in the sheet?
Not so obvious: is the appropriate tool library (Google Sheets or Microsoft Excel) enabled, and is the appropriate lookup tool enabled?
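A sketch of the plan check described above, assuming (hypothetically) that you have copied the planned tool sequence out of the first message of the AI trace:

```python
# Planned tool calls recorded in the trace's first message (illustrative).
planned_tools = ["open_url", "UpdateItem"]

if "web_search" not in planned_tools:
    print("web_search was never planned: either the tool is disabled "
          "(at the thunk or step level) or the instructions don't ask for a search.")
```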
5: It didn't provide the right parameters to a tool call
You asked it to do a web search, but the results are wrong
AI Trace: you find the web_search tool call and see that the query term is too generic
Not so obvious: in the AI Instructions, provide an example of a query term (e.g., “when using web_search, use a query like ‘clinical trials for pediatric AML’; don't do generic searches like ‘cancer trials’”)
Not so obvious: tools have multiple parameters, some optional. web_search allows a site restriction; if you want one, specify it in the instructions (e.g., restrict to site clinicaltrials.gov). (See the instruction sketch at the end of this class.)
Not so obvious: in the tools configuration, specify a constraint on the tool parameters
You asked it to create a file in a folder, but it creates a file in the wrong folder
AI Trace: you find the add_file tool call and see that before that, it opened the wrong folder
Obvious: Did you instruct it clearly which folder to use?
Not so obvious: where is the erroneous folder coming from? Does the step involve two different folders and work that has to be done in each?
Not so obvious: Consider giving each folder a semantic name (InputsFolder, ResultsFolder)
Not so obvious: break the step into two steps
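A sketch of instructions that constrain tool parameters, combining the query example, the site restriction, and the semantic folder names from above (the wording is illustrative):

```python
# Illustrative AI Instructions that constrain tool parameters up front.
ai_instructions = """
When using web_search, use a specific query like 'clinical trials for
pediatric AML'; do not run generic searches like 'cancer trials'.
Restrict web searches to the site clinicaltrials.gov.
Read inputs only from InputsFolder; write results only to ResultsFolder.
"""
```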
6: It does different things on different runs
Different behavior with exactly the same inputs
Assume: the instructions did not change at all between runs
Hypothesis 1: inadequate detail. Are there meaningful instructions? Are there output property bindings with well-defined, schematized properties?
Hypothesis 2: too much context. Is the instruction huge or complex? Is there a lot of input data? Are there more than 20 tools for the AI agent to choose from?
FYI: reliability comes from two things:
Greater instruction detail increases AI reliability
A simpler context increases AI reliability
Different behavior with marginally different inputs or instructions
AI Trace: open both traces and look at their differences (see the diff sketch at the end of this class)
If one is a rational choice but the other is wrong, troubleshoot the error case
If both are rational (but different) choices, you have a case of inconsistency.
Under-constrained AI choices (many similar options to pick from, or inadequate instruction detail) hamper AI reliability
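For the two-trace comparison, Python's standard difflib works well once each trace is dumped to lines of text (the trace lines below are placeholders):

```python
import difflib

# One line per agent-loop event, dumped from each run's AI Trace (placeholders).
run_a = ["plan: open_url, web_search", "open_url(doc)", "web_search('pediatric AML trials')"]
run_b = ["plan: open_url", "open_url(doc)", "answered without searching"]

# Print a unified diff to see exactly where the two runs diverge.
for line in difflib.unified_diff(run_a, run_b, fromfile="run_a", tofile="run_b", lineterm=""):
    print(line)
```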
7: It didn't call the right sequence of tool calls
You expect it to do A then B then C, but it gets the order wrong
Not so obvious: in the AI trace, the first message records a planned sequence of tool calls. Is A -> B -> C listed in this plan?
Not so obvious: can you write the AI Instructions as a bulleted list with three bullets, one mapping to each of A, B, and C?
You asked it to do A, B, C, and D but it skips C
Not so obvious: in the AI trace, the first message records a planned sequence of tool calls. Is A -> B -> C -> D listed in this plan? (The sketch at the end of this class shows a simple in-order check.)
Is tool C enabled (at the thunk level, and at the step level)?
Look at tool C's name and description: are they clear enough to match the AI instructions?
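Checking whether the plan contains A -> B -> C in order is an in-order subsequence test (other tool calls may be interleaved). A small self-contained sketch, with placeholder tool names:

```python
def plan_contains_sequence(planned: list[str], required: list[str]) -> bool:
    """True if `required` appears in `planned` in order (gaps allowed)."""
    it = iter(planned)
    return all(tool in it for tool in required)  # `in` consumes the iterator

planned = ["open_url", "A", "web_search", "B", "C"]  # from the first message
print(plan_contains_sequence(planned, ["A", "B", "C"]))  # True
print(plan_contains_sequence(planned, ["A", "C", "B"]))  # False: wrong order
```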
8: Seems like a platform error
No platform error shown but the AI results are wrong
Not so obvious: use a different/better AI model
Not so obvious: use debug mode to look at the actual trace of the LLM call
Stuck spinning, taking a long time, or timing out (5 minutes max)
AI Trace: are there validation errors in tool calls? This might be a configuration error; report it to [email protected]
Are you trying to read a website that takes a long time to load? Or a document that is very large?
System errors shown in the AI chat window
“LLM error”: usually a configuration or API-key problem. Try a standard AI model, or contact [email protected]
“Repeated tool calls error”: the AI agent is stuck in a loop and cannot terminate, usually because of a configuration error. Contact [email protected]
