Layer Four: The Systems Didn't Agree
Behavioural Audit – Layer Four: A practical guide to interpreting comparative stability in AI systems
Have you ever compared two AI systems reviewing the same case and noticed that the conclusions were not quite the same?
The information may be identical. The interpretation may not be.
This guide introduces the fourth observable probe in behavioural audit: comparative stability.
What This Guide Will Help You Do
After reading, you will be able to:
• Recognise when different AI systems interpret the same case differently
• Identify variation in judgement thresholds, caution and operational recommendations
• Apply the Comparative Stability Test
• Understand what disagreement between systems may reveal about behavioural sensitivity
• Treat comparative disagreement as a behavioural signal rather than an automatic failure
Who It’s For
Practitioners, analysts, HR professionals, policy leads and managers using multiple AI tools within operational workflows.
What It Is Not
This is not a technical guide to model benchmarking or vendor comparison.
Instead, it introduces an observational method for assessing how AI systems behave when evaluating the same material under identical conditions.
Also available as part of the Applied AI Guides bundle (Layers 1–4).
~30 pages