AI Reliability Analysis

Session 6 Deliverable | Author: Chase Petersen

⚖︎ RIGOR RUNE ☍ RISK RUNE

Executive Summary: The Cost of Being Wrong

In readymix dispatch, an AI hallucination is not just a software bug; it can become an irreversible physical failure. Concrete has a strict ~90-minute perishable window. If YARDMASTER issues an unverified reroute and the payload cures in the drum, the result is a $10,000–$100,000 loss per incident (material cost, truck damage, and job-site liability). This document applies the Risk and Rigor Runes to identify exact failure modes, thin data, and boundary violations.

1. Where the System Hallucinates

Vector A: Spatial Extrapolation ("Ghost Trucks")

The Risk: If TruckTrax GPS drops a ping (e.g., in a rural pour zone with bad cell service), a generative model's default behavior is to fill the gap with a plausible continuation. Here, that means extrapolating the truck's location from average speed and route history and presenting the estimate as if it were a live position.

The Harm: Dispatch commits a reroute based on a truck that is actually stuck in traffic or has broken down out of cell range. The concrete clock expires.

Mitigation: Strict timestamp validation. If telemetry is more than 3 minutes old, the AI is forbidden from guessing. The truck's status is forced to STATE: UNVERIFIED, and manual radio contact is required before the truck can be used in any recommendation.
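A minimal sketch of this staleness rule, in Python. The TruckPing record, its field names, and the position_state function are hypothetical and do not come from TruckTrax or YARDMASTER; the only value taken from the text is the 3-minute threshold. Treating a missing ping or a future-dated ping as unverified is an added assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Threshold from the mitigation above: telemetry older than 3 minutes is stale.
MAX_TELEMETRY_AGE = timedelta(minutes=3)


@dataclass
class TruckPing:
    """Hypothetical shape of a GPS ping; field names are illustrative only."""
    truck_id: str
    lat: float
    lon: float
    received_at: datetime  # timezone-aware timestamp of the ping


def position_state(ping: Optional[TruckPing], now: datetime) -> str:
    """Return "VERIFIED" only for a fresh ping; never estimate a position."""
    if ping is None:
        return "UNVERIFIED"  # assumption: no ping at all is treated as stale
    age = now - ping.received_at
    if age < timedelta(0):
        return "UNVERIFIED"  # assumption: a future timestamp means a bad clock
    if age > MAX_TELEMETRY_AGE:
        return "UNVERIFIED"  # stale: require manual radio contact
    return "VERIFIED"
```

The function returns a state label rather than a position on purpose: with no code path that produces an estimated location, a stale truck cannot leak into a reroute calculation as if it were live.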

2. Where the Data is Thin

Vector B: Human-Input Latency (Status Desync)

The Risk: YARDMASTER relies on Jonel and TruckTrax APIs, which in turn rely on drivers tapping a tablet in the cab to change status (e.g., from "Pouring" to "Washing Out").

The Reality: Drivers forget to push the button. The data says the truck is still pouring, but physically, it is empty and heading back to the plant. The data is "thin" because it lacks physical hardware verification (like a drum-weight sensor).

Mitigation: YARDMASTER implements an anomaly timer. If a truck's [POUR_RATE] implies it should be empty, but the status hasn't changed in 15 minutes, the AI flags a "Status Desync Warning" rather than treating the stale data as ground truth.
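One way the anomaly timer could look, sketched in Python. The function name, parameter names, and units (cubic yards, cubic yards per minute) are assumptions made for illustration. The sketch also picks one reading of the rule: the warning fires when the time since the last status change is both past the expected pour duration and at least 15 minutes.

```python
from datetime import datetime, timedelta, timezone

# From the mitigation above: status unchanged for 15 minutes.
DESYNC_WINDOW = timedelta(minutes=15)


def status_desync(
    status: str,
    status_since: datetime,
    load_yd3: float,
    pour_rate_yd3_per_min: float,
    now: datetime,
) -> bool:
    """Return True when a "Pouring" status has outlived the expected pour.

    All parameter names and units are hypothetical; only the 15-minute
    window comes from the text.
    """
    if status != "Pouring":
        return False
    if pour_rate_yd3_per_min <= 0:
        raise ValueError("pour rate must be positive")
    # Time the pour should take if the truck discharges at the stated rate.
    expected_pour = timedelta(minutes=load_yd3 / pour_rate_yd3_per_min)
    elapsed = now - status_since
    # Flag only if the drum should be empty AND the status has sat unchanged
    # for at least the desync window.
    return elapsed >= expected_pour and elapsed >= DESYNC_WINDOW
```

A True result would raise the "Status Desync Warning" for a human to resolve. It does not change the truck's status, since the driver's tablet remains the only source for that field.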

3. Out-of-Bounds Queries (The Boundary Condition)

Vector C: Scope Creep & Unauthorized Synthesis

The Risk: A dispatcher or plant manager types: "We are short on drivers today. Should we discount our 3000 PSI mix to win the Smith job tomorrow to keep the plant busy?"

The Harm: If the Generative AI attempts to answer this, it is operating entirely outside its Knowledge Graph. It does not have access to the ERP, union labor rates, or enterprise pricing strategies. A discount recommendation based purely on dispatch availability has no grounding in cost or pricing data and can be catastrophic to the P&L.

Mitigation (The Boundary Guardrail): YARDMASTER employs strict semantic routing. If a prompt falls outside the ontology of [DISPATCH], [BATCHING], or [ROUTING], it triggers a hard-coded refusal.

System Response: "YARDMASTER is a dispatch intelligence tool. I cannot calculate pricing, bid strategy, or HR policy. Please consult the enterprise ERP."
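The guardrail can be illustrated with a deliberately simple stand-in. A real semantic router would likely use an intent classifier or embeddings; the keyword sets below are a hypothetical substitute chosen so the control flow is easy to read, and none of the names come from YARDMASTER. Two assumptions shape the sketch: blocked topics win over in-scope terms, and anything unrecognized is refused by default.

```python
import re
from typing import Optional, Tuple

REFUSAL = (
    "YARDMASTER is a dispatch intelligence tool. I cannot calculate pricing, "
    "bid strategy, or HR policy. Please consult the enterprise ERP."
)

# Hypothetical keyword sets standing in for the three ontology domains.
DOMAIN_KEYWORDS = {
    "DISPATCH": {"dispatch", "truck", "trucks", "driver", "drivers", "eta"},
    "BATCHING": {"batch", "batching", "mix", "slump", "aggregate"},
    "ROUTING": {"route", "reroute", "routing", "traffic"},
}
# Hypothetical out-of-scope topics; these take precedence over any match above.
BLOCKED_KEYWORDS = {"discount", "price", "pricing", "bid", "margin", "hr", "wage"}


def route_prompt(prompt: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (domain, None) for in-scope prompts, else (None, REFUSAL)."""
    tokens = set(re.findall(r"[a-z]+", prompt.lower()))
    if tokens & BLOCKED_KEYWORDS:
        return None, REFUSAL  # out-of-scope topic present: refuse outright
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if tokens & keywords:
            return domain, None
    return None, REFUSAL  # default deny: nothing recognizable as in scope
```

The precedence order matters for the example prompt above. It mentions "drivers" and "mix", both in-scope terms, so a router that only looked for a domain match would let the pricing question through.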

4. The Ultimate Guardrail: The Triple-Gate Verification

To enforce the ⚖︎ Rigor Rune, YARDMASTER V1 operates exclusively as a Read-Only, Human-In-The-Loop (HITL) system. It cannot autonomously dispatch. Furthermore, before it can even suggest a reroute to the dispatcher, the candidate reroute must pass a Boolean check at each of three specific nodes in the Knowledge Graph:

  1. Material Gate: Does Plant B physically hold the required aggregate for the active mix design?
  2. Washout Gate: Is the target truck's drum physically cleared to take a different mix?
  3. Labor Gate: Does the driver have enough DOT Hours of Service to legally complete the transit and pour?

Kill Criteria: If any of these three gates returns FALSE or NULL (due to thin data), the reroute calculation is killed instantly. The option is never shown to the human.
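The kill criteria reduce to a three-valued check in which NULL is treated exactly like FALSE. A minimal Python sketch follows, with None standing in for NULL; the function and parameter names are hypothetical, and how each gate value is actually looked up in the Knowledge Graph is outside its scope.

```python
from typing import Optional


def reroute_allowed(
    material_ok: Optional[bool],
    washout_ok: Optional[bool],
    labor_ok: Optional[bool],
) -> bool:
    """Return True only if every gate is explicitly True.

    None (unknown, thin data) fails the check the same way False does.
    """
    gates = (material_ok, washout_ok, labor_ok)
    return all(gate is True for gate in gates)


def visible_suggestion(
    suggestion: str,
    material_ok: Optional[bool],
    washout_ok: Optional[bool],
    labor_ok: Optional[bool],
) -> Optional[str]:
    """Return the suggestion for display, or None so it is never shown."""
    if reroute_allowed(material_ok, washout_ok, labor_ok):
        return suggestion
    return None
```

Comparing with "is True" instead of relying on truthiness is deliberate: a gate value has to be the Boolean True, so an unknown or oddly typed value cannot pass by accident.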