Agents at Work: Phase 3 Report

On Sale
£11.99

A behavioural audit of how AI judgement changes under repetition, ambiguity and constraint.



Introduces the behavioural framework used to evaluate AI judgement.


Description


This report presents Phase 3 of the Agents at Work research series, examining how an AI system behaves when asked to evaluate age-related bias in recruitment language under repeated and constrained conditions.


Building on earlier phases, which examined where age-related signals appear and how they are interpreted, Phase 3 focuses on how judgement behaves when the same task is performed multiple times.


The report introduces a behavioural framework for evaluating AI systems beyond single outputs, examining patterns of stability, variation and signal response over repeated evaluation.


What This Report Does


Phase 3 examines how AI judgement behaves under:


  • repeated execution of the same task
  • ambiguous or borderline language
  • partial or degraded input context
  • variation in internal signals such as confidence and agreement


The report applies a structured behavioural audit to analyse:


  • run-to-run judgement stability
  • variation in explanations
  • confidence behaviour under uncertainty
  • consistency of cue identification
  • cross-model agreement
  • responsiveness of internal self-review signals
  • sensitivity to truncated input


The focus is on observable behaviour rather than individual results.
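As a rough illustration of what "run-to-run judgement stability" could mean in practice, the sketch below scores a set of repeated runs by the share that agree with the most common judgement. This is an assumption made for illustration only: the report's own methodology is proprietary and is not reproduced here, and the function name `run_to_run_stability` and the example labels are hypothetical.

```python
from collections import Counter


def run_to_run_stability(labels):
    """Return the most common judgement and the fraction of runs that match it.

    Hypothetical helper for illustration; not the report's actual metric.
    """
    if not labels:
        raise ValueError("need at least one run")
    counts = Counter(labels)
    # most_common(1) returns a one-element list of (label, count) pairs.
    modal_label, modal_count = counts.most_common(1)[0]
    return modal_label, modal_count / len(labels)


# Made-up judgements from five repeated runs on the same job advert.
runs = ["flagged", "flagged", "not flagged", "flagged", "flagged"]
label, stability = run_to_run_stability(runs)
print(f"modal judgement: {label}, stability: {stability:.2f}")
```

A score of 1.0 would mean every run reached the same judgement; lower values indicate that the outcome depends on which run is observed, which a single output cannot reveal.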


What This Report Does Not Do


This report does not:


  • assess real-world discrimination or hiring outcomes
  • determine employer intent
  • provide compliance or legal determinations
  • measure model accuracy against ground truth


The analysis focuses on system behaviour under controlled conditions.


Who This Is For


This report is intended for:


  • researchers examining AI system behaviour
  • audit, risk and assurance professionals
  • policymakers and regulators
  • practitioners working with AI decision-support systems

Research Context


This report forms Phase 3 of the Agents at Work series.


  • Phase 1 examines detection of age-related signals
  • Phase 2 examines interpretation of those signals
  • Phase 3 examines how judgement behaves under repetition and constraint


This phase establishes the behavioural perspective that underpins later evaluation work.


Why This Matters


AI systems are often trusted based on individual outputs and fluent explanations.


Phase 3 shows that these signals do not fully reflect how a system behaves over time. Reliability emerges from patterns of behaviour, not from a single result.


Licence and Usage


© 2026 Imogen Hull – Beyond the Average

Licensed under Creative Commons CC BY-NC-ND 4.0.

The underlying methodology, agent design and analytical framework remain proprietary.

You will get a PDF file (1 MB).