โ† yenklabs.com

// Lab Note

Why Authority Existence and Proposition Support are Different Engineering Problems

LLM-Evaluation Legal-AI Semantic-Drift

Jun 2026

Most engineering teams building legal or medical RAG pipelines focus on the wrong telemetry. They track link validity. They check whether an ID, case name, or volume sequence returns a 200 OK from a source API.

This catches the "Mata v. Avianca" rookie error (a citation to a case that does not exist at all). It completely misses the far more toxic production failure mode: The Sophisticated Lie.

The Anatomy of the Sophisticated Lie

  1. The Citation is Real: Smith v. Jones, 412 F.3d 98 exists in the database.
  2. The Output Link is Valid: The user clicks it and it opens a canonical docket entry.
  3. The Semantic Entailment Fails: The LLM asserts that the case stands for a specific proposition (e.g., "Parental rights are overridden by maritime insurance liabilities"). In reality, Smith v. Jones is an administrative law case about filing timelines and supports nothing of the kind.

The Verification Loop Architecture

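Standard cosine similarity over vector embeddings fails here because it rewards shared legal terminology and topical context. A claim can score as a close match to a holding that does not support it, or that contradicts it, so the logical mismatch stays hidden.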
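
To make the failure shape concrete, here is a minimal sketch that uses raw term counts as a stand-in for embeddings. That substitution is an assumption made for the sake of a self-contained example, and the two sentences are invented for illustration. Learned embeddings are more nuanced than term counts, so read this as a picture of the problem rather than a measurement of any particular model.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical claim and holding: they differ by a single "not".
claim = "The filing deadline may be extended for good cause"
holding = "The filing deadline may not be extended for good cause"

score = cosine(term_vector(claim), term_vector(holding))
print(f"{score:.3f}")  # roughly 0.95, despite the opposite meaning
```

The two sentences share every word except the negation, so the score lands near the top of the range even though one contradicts the other. A similarity threshold alone cannot separate "supports" from "contradicts".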

To solve this in Dali, we split verification into a dual-stage execution loop:

  1. Structural Step: Deterministic citation parsing via regex trees + canonical hash lookup against known court registries.
  2. Semantic Entailment Step: Natural Language Inference (NLI) classification. We extract the exact clause/proposition claimed by the generation model, map it against the token stream of the actual judicial holding, and process it as a directed graph logic check (Supports, Contradicts, or Neutral).
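
The two stages above can be sketched roughly as follows. This is not Dali's implementation: the citation pattern, the `citation_key` hashing scheme, the in-memory `registry`, the invented holding text, and the `stub_nli` placeholder are all hypothetical names chosen for illustration. In a real system the `nli` argument would wrap an actual NLI classifier, and the registry would be a court-registry lookup.

```python
import hashlib
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Optional


class Verdict(Enum):
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"
    NEUTRAL = "neutral"


# Hypothetical pattern: only "<volume> <reporter> <page>" citations such as
# "412 F.3d 98". Real citation grammars are far broader than this.
CITATION_RE = re.compile(r"\b(\d{1,4})\s+(F\.(?:2d|3d|4th)?|U\.S\.)\s+(\d{1,5})\b")


def citation_key(volume: str, reporter: str, page: str) -> str:
    """Hash a normalized citation so lookups ignore case and zero padding."""
    normalized = f"{int(volume)}|{reporter.lower()}|{int(page)}"
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class Result:
    citation_found: bool
    verdict: Optional[Verdict]

    @property
    def passed(self) -> bool:
        # Both stages must succeed: the authority exists AND it supports the claim.
        return self.citation_found and self.verdict is Verdict.SUPPORTS


def verify(
    citation_text: str,
    proposition: str,
    registry: Dict[str, str],
    nli: Callable[[str, str], Verdict],
) -> Result:
    # Stage 1 (structural): parse the citation and look up its hash.
    match = CITATION_RE.search(citation_text)
    if match is None:
        return Result(False, None)
    holding = registry.get(citation_key(*match.groups()))
    if holding is None:
        return Result(False, None)
    # Stage 2 (semantic): holding is the premise, the claim is the hypothesis.
    return Result(True, nli(holding, proposition))


def stub_nli(premise: str, hypothesis: str) -> Verdict:
    """Placeholder for a real NLI model; always answers NEUTRAL."""
    return Verdict.NEUTRAL


# Toy registry with an invented holding for the example citation.
registry = {
    citation_key("412", "F.3d", "98"):
        "A petition filed after the statutory deadline is untimely."
}

result = verify(
    "Smith v. Jones, 412 F.3d 98",
    "Parental rights are overridden by maritime insurance liabilities",
    registry,
    stub_nli,
)
print(result.citation_found, result.passed)
```

With the stub in place, the example citation clears stage 1 but does not pass overall, which is the case a link-only check would wave through. Keeping the classifier behind a plain callable is a design choice for the sketch: it lets the structural stage be tested deterministically while the semantic stage is swapped or mocked.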

If your evaluation platform stops at step 1, you aren't ship-ready.

Part of the Dali R&D thread: semantic proposition validation and immutable chain-of-evidence preservation.