Free and open-source for every organisation — any size, any industry. Tidus selects the cheapest capable model for every AI request, automatically. 5-stage intelligence. 70% cost savings. Fully self-hosted. Every dollar saved stays with you.
What is Tidus?
Tidus sits in front of your AI workloads as a self-hosted FastAPI service. You call one endpoint — Tidus picks the right model automatically.
Most teams waste money sending every request to the same premium model. A simple classification task costs $15/1M tokens on Claude Opus when Gemini 2.0 Flash at $0.10/1M returns an equivalent result.
Tidus analyses each request's complexity, privacy requirements, capability needs, and budget constraints — then scores every eligible model using a weighted algorithm (70% cost, 20% tier, 10% latency) to find the optimal route.
Because Tidus is fully self-hosted and open source, no request data ever leaves your environment. It works with any AI vendor and supports local Ollama models for confidential workloads where cloud APIs are prohibited.
The weekly pricing registry keeps costs accurate without manual intervention — syncing prices from multiple sources every Sunday, detecting outliers with MAD-based consensus, and creating versioned, auditable revisions.
Five Cost-Control Pillars
Five complementary controls — each tackling a distinct source of AI cost waste. Together they form a complete governance layer over your AI infrastructure.
Tier access follows task criticality: simple tasks can use any tier; critical tasks are restricted to T1 only. This single rule removes the most expensive models from 80% of typical workloads without any manual configuration.

Usage guardrails are declared in policies.yaml and enforced before any API call is made: max_agent_depth (default 5) prevents infinite recursion loops; max_tokens_per_step (default 8,000) caps per-step cost uniformly across all models; max_retries_per_task (default 3) stops retry storms from multiplying costs; max_parallel_sessions_per_team (default 10) prevents concurrency explosions. Each violated guardrail produces a named rejection reason in the API response — no silent failures.

How It Works
A transparent middleware layer deployed on your own infrastructure. Your existing code needs one URL change — no SDK swaps, no architecture redesign.
```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")

response = client.chat.completions.create(
    model="gpt-4o",  # hardcoded — always expensive
    messages=[...],
)
# Every task hits GPT-4o at $2.50/1M input
# Simple summary? Still $2.50/1M. No choice.
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://tidus:8000/v1",
    api_key="your-team-api-key",
)

response = client.chat.completions.create(
    model="auto",  # Tidus picks the best model
    messages=[...],
)
# Simple task → $0.039/1M · Critical → GPT-4o
```
Selection Algorithm
Every routing decision follows a deterministic pipeline. Each stage either eliminates models or scores them — no randomness, no black-box AI decisions.
Each dimension is min-max normalised across the surviving candidates — scores are relative to the competition, not absolute. A $15/1M model that is the cheapest survivor scores 0.0 on cost. A deprecated model receives a flat +0.15 penalty added after normalisation — it can still win if significantly cheaper than all alternatives. If any stage reduces the eligible set to zero, Tidus raises a structured error naming the stage and every rejection reason: no silent failures, no fallback to a wrong model.
| Model | Tier | Est. Cost | cost_norm | tier_norm | lat_norm | Score |
|---|---|---|---|---|---|---|
| claude-sonnet-4-6 | T2 | $0.0155 | 1.00 | 0.00 | 1.00 | 0.80 |
| gemini-2.5-flash | T3 | $0.00212 | 0.21 | 1.00 | 0.13 | 0.36 |
| gpt-4.1-mini ✓ Winner | T3 | $0.00184 | 0.00 | 1.00 | 0.00 | 0.20 |
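The Score column above is just the weighted sum of the three normalised dimensions. A minimal sketch, using the normalised values from the table as inputs (the min-max normalisation step itself is not repeated here):

```python
def weighted_score(cost_norm: float, tier_norm: float, lat_norm: float) -> float:
    """Stage-5 ranking: 70% cost, 20% tier, 10% latency. Lowest score wins."""
    return 0.70 * cost_norm + 0.20 * tier_norm + 0.10 * lat_norm

# (cost_norm, tier_norm, lat_norm) taken from the worked table above
candidates = {
    "claude-sonnet-4-6": (1.00, 0.00, 1.00),
    "gemini-2.5-flash": (0.21, 1.00, 0.13),
    "gpt-4.1-mini": (0.00, 1.00, 0.00),
}
scores = {name: round(weighted_score(*dims), 2) for name, dims in candidates.items()}
winner = min(scores, key=scores.get)  # "gpt-4.1-mini", score 0.20
```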
Model Discovery & Pricing Sources
There are hundreds of AI models published across Hugging Face, vendor APIs, and inference platforms. Most are research checkpoints, experimental fine-tunes, or deprecated variants with no stable public pricing. Tidus tracks 55 commercially stable, enterprise-accessible models (45 currently enabled) — the ones that actually matter for production routing. The curation is deliberate, and the pricing data comes from two independent sources that cross-check each other every Sunday.
Why 55, not 400+? Most published models are research checkpoints, experimental fine-tunes, deprecated versions, or waitlisted previews with no stable public pricing. Tidus only tracks models that are commercially available today, priced per token with a public API, accessible without a waitlist, and stable enough for production routing. The catalog prioritises quality of routing over quantity of options.
The catalog grows continuously. Adding a new model is 3 lines in hardcoded_source.py — the model ID, input price, and output price. Community pull requests are welcome. If a vendor releases a new stable model, open a PR and it will be tracked in the next weekly sync.
```python
"gemini-4.0-flash": {"input": 0.0002, "output": 0.0008}
```
— one entry in hardcoded_source.py · priced in $/1K tokens
Model counts reflect the active catalog as of April 2026.
Tasks marked privacy=confidential are routed exclusively to Ollama.

Two independent sources are queried every Sunday. A MAD-based consensus algorithm cross-checks them and rejects outliers before any revision is created.
The feed URL is configured via TIDUS_PRICING_FEED_URL. Each sync sends a single GET /prices?schema_version=1 — no customer data, no messages, no team IDs. The feed supports HMAC-SHA256 signature verification to prevent tampering. A circuit breaker opens after 5 consecutive failures and resets after 5 minutes, so a feed outage never blocks routing.

Getting Started
From git clone to first routed request in under 5 minutes. Runs on Docker, works with SQLite in development, PostgreSQL in production. No cloud dependencies.
Configure credentials in .env. Enable vendors and set spending limits for each team in config/models.yaml and config/budgets.yaml. Monitor routing from the dashboard at /dashboard/. Weekly savings reports via API. Prometheus metrics for alerting. Drift detection auto-disables misbehaving models. For production, point DATABASE_URL to a PostgreSQL instance.

Latest Report
Tidus continuously tracks prices across 55 models and 13 vendors — because accurate pricing is what makes intelligent routing possible. Every Sunday we generate this market intelligence report to summarise what changed, which models rose or fell, and where routing teams can capture new savings this week.
Ranked by blended cost — highest first · All prices USD/1M tokens · Updated April 19, 2026
| # | Vendor | Model | Blended $/1M | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|---|---|
Prices from official vendor pages via multi-source consensus · Ranked by blended cost · Updated April 19, 2026
ROI Calculator
Adjust your current spend and task complexity mix. Savings scale with the proportion of requests that can be routed to lower-cost tiers.
Estimates assume unoptimised Tier 1/2 routing today. Actual savings depend on your task mix and model availability. Tidus is free and open-source — no subscription fee, no usage cap, no per-seat pricing. Keep 100% of what you save.
How It Works
Tidus applies a deterministic, five-stage algorithm to every AI request. Each stage eliminates models that fail a hard rule; the surviving candidates are ranked by a weighted score. The model with the lowest score wins. This entire process completes in under one millisecond on the server.
Author: Kenny Wong (lapkei01@gmail.com) · Published: 2026-04-15 · Latest revision: 2026-04-20
Every AI request enters this pipeline. Stages 1–4 are binary filters — each model either passes every check or is eliminated immediately. Stage 5 ranks the survivors by a weighted score and selects the single best model. The entire pipeline runs in under one millisecond on the server.
Every model in the registry is checked against four hard rules. Failing any single rule is enough to eliminate the model — there is no partial credit, no weighting, and no override. These checks happen in a single pass over all 55 registered models.
Enabled flag: every registry entry is marked enabled: true or enabled: false. Models can be disabled manually by an administrator, or automatically by the drift detector when repeated health probes fail. A disabled model is immediately eliminated, regardless of capability or cost.

Capability match: tasks declare a domain — chat, code, reasoning, extraction, classification, summarization, or creative. Each model in the registry lists its supported capabilities. If the task's domain is not in the model's capability set, the model is eliminated. A chat-only model cannot be routed a code generation task, for example.

Complexity range: each model declares min_complexity and max_complexity (e.g., moderate to complex). If the task's complexity falls outside this declared range, the model is eliminated. This prevents sending a trivially simple task to a model built for deep reasoning work (wrong tool), and prevents sending a critical decision task to a model not designed to handle it.

Guardrails enforce system-level safety policies that apply to every team and every task — they cannot be overridden by individual callers. Two types of guardrails apply at this stage: usage limits and privacy enforcement.
Tasks carry a privacy level: public, internal, or confidential. When a task is marked confidential, Tidus enforces a hard rule: only models running on your own infrastructure (is_local: true in the registry) are allowed. All cloud-hosted models — regardless of vendor, price, or capability — are eliminated. This is not a preference; it is an absolute constraint. It ensures that confidential data such as patient records, legal documents, or financial reports is never sent to an external API provider.

Usage limits cap agent recursion depth (max_agent_depth) and tokens per step (max_tokens_per_step). These limits prevent runaway agents from incurring unbounded costs or entering infinite loops. If the task's agent depth or token count exceeds the policy limit, the model is eliminated at this stage.

Models in Tidus's registry are classified into four quality tiers. Tier 1 is premium frontier AI (most capable, most expensive). Tier 4 is local or free models (least capable, zero cost). This stage sets a minimum quality floor based on how complex the task is — ensuring that genuinely complex or critical tasks are always handled by appropriately capable models, and cannot be silently downgraded to cheap models.
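The hard-filter checks described above can be sketched as a single predicate. The registry field names follow the text (enabled, capabilities, min_complexity/max_complexity, is_local); the exact record shape is an assumption for illustration:

```python
COMPLEXITY_ORDER = ["simple", "moderate", "complex", "critical"]

def passes_hard_filters(model: dict, task: dict) -> bool:
    """Filter-stage sketch: any failed check eliminates the model outright."""
    if not model["enabled"]:
        return False  # disabled manually or by the drift detector
    if task["domain"] not in model["capabilities"]:
        return False  # capability mismatch
    c = COMPLEXITY_ORDER.index(task["complexity"])
    lo = COMPLEXITY_ORDER.index(model["min_complexity"])
    hi = COMPLEXITY_ORDER.index(model["max_complexity"])
    if not lo <= c <= hi:
        return False  # outside the model's declared complexity range
    if task["privacy"] == "confidential" and not model["is_local"]:
        return False  # absolute constraint: confidential stays local
    return True
```

A confidential task eliminates every cloud-hosted candidate while an otherwise identical local model survives.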
For each model that survived Stages 1–3, Tidus computes the estimated cost of processing this specific task with that model. The estimate uses the actual token counts and current market prices from the pricing registry. Two separate budget checks then apply.
Callers can attach a per-request cap, max_cost_usd, to the task. This is a hard ceiling on what any single API call is allowed to cost. If the estimated cost for a model exceeds this cap, that model is eliminated. This allows callers to guarantee that no single request exceeds a set dollar amount — useful for customer-facing features where per-query economics matter.

All models that survived the four filter stages are now ranked by a deterministic weighted score. Each model gets a number between 0 and 1 on three dimensions; those numbers are weighted and summed. The model with the lowest total score wins. Lower = better.
If a model is marked deprecated in the registry (still routable, but being phased out), a flat penalty of 0.15 is added to its score after normalisation. This means a deprecated model only wins if it is substantially cheaper or faster than all non-deprecated alternatives — preventing gradual quality drift while still honouring the deprecation grace period rather than hard-removing models immediately.

If a caller attaches a preferred_model_id to the task and that model survived all four filter stages, Tidus selects it directly — skipping the scoring step entirely. This respects explicit caller intent (e.g., "always use GPT-4.1 for this workflow") while still enforcing all hard safety and budget constraints. A preference that would violate budget or privacy rules is overridden by the filter stages regardless.
A RoutingDecision record is written to the audit log, capturing which model was chosen, its score, its estimated cost, and the full list of models that were rejected and why.

After hard filters, all surviving models are scored across three normalised dimensions. Each is expressed as a 0–1 value where 0 is best. The weighted sum determines rank.
Different departments have different cost and capability profiles. Tidus uses task complexity to set a hard tier ceiling and the department domain to enforce capability requirements. Together, these two signals determine which models are even considered.
Tidus cannot route cost-efficiently if it uses stale or incorrect prices. It maintains a continuously updated, multi-source pricing registry with statistical outlier detection to ensure the prices it uses for routing are always accurate.
Three scenarios — each triggers different branches of the five-stage pipeline. Follow each request from arrival to model selection.
Domain: chat. 55 models checked. All chat-capable models pass. Result: 52 models survive (3 multimodal-only eliminated).

| Model | Tier | Blended $/1M | P50 ms | Score |
|---|---|---|---|---|
| gpt-4.1-mini | 3 | $1.00 | 320ms | 0.12 ✓ WINNER |
| gemini-2.5-flash | 2 | $1.40 | 280ms | 0.19 |
| claude-haiku-4-5 | 3 | $2.40 | 290ms | 0.28 |
Domain: extraction. Models without extraction capability eliminated. ~28 models survive.

| Model | Cost | P50 ms | Score |
|---|---|---|---|
| ollama/llama3.3-70b | $0 | 1,200ms | 0.20 ✓ WINNER |
| ollama/mistral-7b | $0 | 2,100ms | 0.21 |
Domain: reasoning. Only models with advanced reasoning capability pass. Many economy-tier models without reasoning tags eliminated. ~12 models survive.

| Model | Blended $/1M | Tier | P50 ms | Score |
|---|---|---|---|---|
| groq-deepseek-r1 | $2.00 | 1 | 800ms | 0.14 |
| o3 | $25.00 | 1 | 4,500ms | 0.48 ✓ WINNER* |
| claude-opus-4-6 | $45.00 | 1 | 3,200ms | 0.62 |
* At the critical complexity level, only Tier 1 models with reasoning capability remain. Among these, o3's cost-latency balance wins over the cheapest option (groq-deepseek-r1 scores well on cost but has less proven medical reasoning capability — capability matching at Stage 1 may have already filtered it if the catalog marks it accordingly).
Tidus is an automated AI model routing system. When an application sends an AI request, Tidus receives metadata about that request — its complexity, the type of task, privacy sensitivity, and cost budget. Tidus then applies a five-stage deterministic algorithm to select the optimal AI model from its registry of 55 tracked models.
The first two stages are safety filters: Stage 1 ensures the selected model is technically capable of performing the task; Stage 2 enforces data privacy law by preventing confidential data from being sent to external cloud providers. Stages 3 and 4 are economic filters: Stage 3 prevents over-provisioning by matching task complexity to model capability tier; Stage 4 enforces spending limits. Stage 5 applies a patent-pending weighted scoring formula — 70% cost, 20% quality tier, 10% response speed — to rank surviving candidates and select the best one.
Separately, Tidus maintains an always-current pricing registry. It ingests prices from multiple independent sources, applies statistical outlier detection (Modified Z-Score / Median Absolute Deviation) to reject anomalous data, and stores every price change as a versioned, audited revision. This ensures routing decisions are always based on current, verified market prices — not stale hardcoded values.
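The Modified Z-Score / MAD check named above fits in a few lines. A minimal sketch; the 3.5 cutoff is the common Iglewicz-Hoaglin default and is an assumption here, since the document does not state Tidus's exact threshold:

```python
from statistics import median

def mad_outliers(prices: list[float], threshold: float = 3.5) -> list[bool]:
    """Flag prices whose Modified Z-Score exceeds the threshold."""
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        return [False] * len(prices)  # all sources agree exactly
    return [abs(0.6745 * (p - med) / mad) > threshold for p in prices]

# One source misreports $9.99 against a ~$2.50 consensus
flags = mad_outliers([2.50, 2.55, 2.48, 9.99])  # [False, False, False, True]
```

Only the anomalous quote is rejected; the agreeing sources proceed into a versioned revision.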
The combination of these two systems — the five-stage routing algorithm and the self-healing pricing registry — constitutes the core patentable invention of the Tidus platform.
Technical Specification
How Tidus converts a raw user prompt into a structured three-axis classification — domain (task type), complexity (cognitive load), and privacy (content sensitivity) — using a five-tier pipeline of local detectors and a language-model fallback, without transmitting the prompt outside the deployment boundary. Includes empirical validation via cross-family inter-rater reliability and an honest accuracy baseline of 89.2% confidential recall at ship. A telemetry-driven self-improvement design targets 95–97% over time; the rate at which that target is reached depends on enterprise-traffic accumulation, so a parallel research programme (uncertainty-sampled re-labeling, corpus diversification, rubric refinement, encoder ensembling) is run concurrently to advance the baseline ahead of, and independently from, customer adoption.
All supporting artifacts are reproducible from the repository (scripts/, tests/classification/, and findings.md). This document is intended both as an enterprise-evaluation technical specification and as prior-art disclosure in support of patent filing.

Every incoming prompt receives one label per axis. The routing stage downstream uses all three: domain narrows the candidate-model set, complexity sets the tier ceiling, and privacy enforces local-only routing when confidential. The examples below are drawn from the labeled corpus and show both the classifier output and which tier resolved it.
| Example prompt (abbreviated) | domain | complexity | privacy | Resolved at |
|---|---|---|---|---|
| "do you know the game arknights" | chat | simple | public | T2a encoder |
| "write a React component that fetches data with useEffect and handles errors" | code | moderate | internal | T2a encoder |
| "debug: bot.send_message(chat_id, '5828712341:AAG5HJa37u32SHLytWm5poFr…')" | code | moderate | confidential | T1 regex (Telegram-token pattern) |
| "I have depression and heightened anxiety, please give me scientific suggestions" | chat | critical | confidential | T5 LLM (topic-based — no entity) |
| "review my letter of explanation for a Canadian open work permit to accompany my wife" | summarization | critical | confidential | T5 LLM (immigration topic) |
| "Kalman filter for YOLO ball tracking, code attached: /Users/surabhi/Documents/kalman/best.pt" | code | complex | confidential | T5 LLM (filesystem user-id leak) |
| "contact me at jennifer.miller@acme.com re: Q3 pricing" | chat | simple | confidential | T2b Presidio (PERSON + EMAIL) |
| "Vue timeline with 张三 as template user and 13845257654 as placeholder phone" | code | moderate | public | T2a encoder (recognizes placeholders) |
Observation: the three axes operate independently. A "code / moderate / confidential" prompt and a "chat / simple / confidential" prompt route to entirely different model sets despite sharing the privacy flag. Conversely, two prompts both labeled confidential may trigger for completely different reasons (entity leak vs. topic sensitivity vs. credential pattern) — which is why a single-signal classifier cannot produce the full three-axis output alone, and why the cascade has multiple tiers.
Plain English: every request is read locally and tagged for task type, difficulty, and sensitivity before routing — with confidential prompts never leaving your deployment.
Tidus classifies every incoming AI request across three dimensions — domain (task type), complexity (cognitive load required for correctness), and privacy (content sensitivity) — before the request reaches any underlying language model. Classification is performed by a five-tier cascade of local detectors, each tier cheaper and faster than the next. Classification output drives downstream routing within the Tidus five-stage model-selection algorithm disclosed elsewhere in this document. The novel aspects of the classification layer disclosed herein include: (i) an asymmetric-safety OR-rule whereby any tier's confidential classification unilaterally forces local-only routing regardless of other tiers' outputs; (ii) a cross-family inter-rater reliability methodology for validating classification ground truth using independent large language models from distinct vendor families (Anthropic, OpenAI, Google); (iii) a disagreement-capture active learning loop that accumulates retraining signal from production traffic while persisting only feature metadata, never raw prompt content; and (iv) an entity/topic bifurcation analysis empirically justifying architectural separation between cheap entity detectors and language-model topic review.
The disclosed classification workflow is intended for use within enterprise AI gateway software that routes natural-language prompts to one of a plurality of candidate language models. Non-exhaustive deployment contexts include: regulated industry verticals (healthcare, finance, legal, defense) subject to data-residency requirements such as HIPAA, GDPR, SOC 2, and equivalent regional standards; organizations with heterogeneous model portfolios spanning both cloud-hosted and on-premises language models; and any system requiring per-request determination of whether prompt content permits transmission to external services.
Existing prompt-classification systems fall broadly into two classes, each with material limitations:
Class A — single-stage language-model classifiers (e.g., Llama Guard, prompt-classification services). These systems achieve high accuracy by invoking a language model on every request. They are unsuitable for privacy-sensitive routing because the act of classifying a confidential prompt requires transmitting that prompt to the classifier, typically outside the deployment boundary. This establishes a privacy paradox: the mechanism intended to determine whether content may leave the system is itself a mechanism that causes content to leave the system.
Class B — static pattern-matching detectors (e.g., Presidio, regex-based secret scanners, DLP systems). These systems are local and fast but detect only explicit identifiers (names, credit card numbers, email addresses, named entities). They systematically miss topic-based confidential content — prompts where sensitivity arises from subject matter (self-disclosed medical condition, employment-law dispute, immigration status, financial hardship) rather than from the presence of a recognizable identifier. Empirical analysis reported in §7 demonstrates that approximately half of enterprise confidential prompts fall into this topic-based class.
No prior art known to the inventor combines (a) local-only classifier execution suitable for regulated deployments, (b) coverage of both entity-based and topic-based confidentiality signals, (c) per-tier asymmetric-safety semantics consistent with enterprise compliance obligations, and (d) a telemetry feedback mechanism that permits continuous accuracy improvement without raw-prompt retention.
| Dimension | Class A — cloud LLM classifier | Class B — regex / NER only | Tidus — tiered asymmetric |
|---|---|---|---|
| Runs inside deployment boundary? | ❌ Usually cloud-hosted | ✅ | ✅ All five tiers local |
| Catches entity confidentials? | ✅ (at cost) | ✅ | ✅ Tier 2b |
| Catches topic confidentials? | ✅ | ❌ ~50% missed (§7.3) | ✅ via Tier 5 LLM |
| Per-request latency | 100–300 ms + network | < 5 ms | 5 ms fast path · 200 ms fallback |
| Privacy paradox? | ⚠️ Yes — classifier itself leaks | ✅ None | ✅ None |
| Self-improves from traffic? | ❌ | ❌ | ✅ Disagreement-capture (§9) |
No prior art combines all six rows. The Tidus column is what §4–§11 of this document disclose in detail.
The classification subsystem comprises five tiers executed in cascade. Each tier operates on the raw prompt text and emits a partial classification across the three axes. Tiers are ordered by ascending cost and descending throughput; the cascade short-circuits when a tier produces a high-confidence classification.
(T5 runs as a local language model under privacy_enforcement=strict; cloud is allowed for disabled.)

Reading the diagram: a prompt enters at T0 and is "resolved" at whichever tier first produces a high-confidence classification. T0 handles the rare back-compat case where the caller already passes the axes. T1 short-circuits roughly a third of traffic on explicit signals. T2a+T2b run in parallel (not in series) and resolve the majority of remaining traffic. T5 is the escape valve for ambiguous cases. Expected tier-resolution distribution in production is shown on the right of each row.
| Tier | Mechanism | Latency (p95) | Purpose |
|---|---|---|---|
| T0 | Caller override — explicit fields in the request API | < 1 µs | Back-compat for callers who already know the classification |
| T1 | Regular-expression and keyword heuristics (Aho–Corasick on MeSH-seeded medical, legal, PCI DSS, and homebrew financial lexica; structural signals including code fences and shebangs; POC secret patterns for SSN, credit card with Luhn validation, AWS access keys, GitHub tokens, generic high-entropy secrets) | 5–10 ms | High-confidence short-circuit for ~30–40% of traffic; first line of privacy defense |
| T2a | Trained encoder — frozen sentence-transformer backbone (all-MiniLM-L6-v2) with a per-axis scikit-learn logistic-regression head trained on a labeled corpus of 2,669 WildChat prompts (see §6) | 3–15 ms (CPU, ONNX int8) | Semantic classification for prompts without explicit identifiers |
| T2b | Presidio-based named-entity recognizer using en_core_web_sm, with a high-trust recognizer allowlist (PERSON, EMAIL_ADDRESS, PHONE_NUMBER, US_SSN, IBAN, CRYPTO, MEDICAL_LICENSE, URL, IP_ADDRESS) | 20–60 ms (runs in parallel with T2a) | Entity-based confidentiality detection — Rule E1 or E2 below |
| T5 | Language-model fallback, invoked only when T1–T2b disagree or report low confidence; implemented as a local language model for privacy_enforcement=strict deployments, or as a cloud language model for privacy_enforcement=disabled deployments (see §5) | 200–2,000 ms | Topic-based confidentiality detection — catches content that Tier 2b structurally cannot see |
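Tier 1's explicit secret patterns are ordinary code. A sketch of two of the detectors named in the table, the SSN pattern and Luhn-validated credit-card digits (patterns simplified for illustration; production lexica are broader):

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # matches e.g. 123-45-6789

def luhn_valid(number: str) -> bool:
    """Luhn checksum over the digits of a candidate card number."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:  # too short to be a card number
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

A regex hit on digit groups that also passes Luhn is a high-confidence credit-card signal, which is why T1 can short-circuit without semantic analysis.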
Detection rules at Tier 2b (configurable per deployment):
Rule E1: if Presidio detects a PERSON entity, classify as confidential. Recall 89.2% (95% CI [80.7%, 94.2%]) on cross-family-validated ground truth (§7). Flags ~49% of traffic. Ship default.

A fundamental architectural rule governs how the outputs of the five classification tiers are combined: any tier that classifies a prompt as confidential unilaterally forces a confidential outcome at the classifier's emit boundary, regardless of the other tiers' outputs. No voting, no majority aggregation, no confidence-weighted blending. This asymmetric semantics is expressed as follows:
```
privacy_emit = confidential
    if any of {T0, T1, T2a, T2b, T5}
    returns confidential
    for the request
```
The rationale is that false negatives on the privacy axis constitute compliance incidents (potential regulatory, contractual, or reputational loss); false positives on the privacy axis merely reduce the candidate model set for a single request. The two error types are not symmetric in cost, and the combining rule reflects that asymmetry.
Prompt: "Help me fix this Python script that reads employee data. Here's the CSV: name,ssn,salary\nJohn Smith,123-45-6789,85000…"
| Tier | Signal | Emit |
|---|---|---|
| T1 regex | SSN pattern \d{3}-\d{2}-\d{4} matches "123-45-6789" | confidential |
| T2a encoder | Semantic vector → probably "code / moderate / internal" | internal |
| T2b Presidio | Detects PERSON ("John Smith"), US_SSN, and numeric context | confidential |
Emit: confidential. Even though T2a said "internal" (correctly identifying the code task), T1's regex hit and T2b's SSN detection each independently trigger the OR-rule. Under a voting or confidence-weighted scheme, a strong non-privacy signal could outweigh a correct detection and leak the prompt to external models; the OR-rule guarantees that any confidential signal wins.
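The emit logic of this example follows the pseudocode above and can be made concrete in a few lines. Only the confidential short-circuit is specified by the document; the ordering of the non-confidential labels (internal stricter than public) is an assumption here:

```python
def privacy_emit(tier_outputs: list[str]) -> str:
    """Asymmetric OR-rule: any single 'confidential' verdict wins outright."""
    if "confidential" in tier_outputs:
        return "confidential"
    # Assumed tie-break for the remaining labels: internal over public
    return "internal" if "internal" in tier_outputs else "public"

# The worked SSN example: T1 and T2b say confidential, T2a says internal
emit = privacy_emit(["confidential", "internal", "confidential"])  # "confidential"
```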
Per-tenant privacy enforcement modes. The effect of the confidential emit on downstream routing is configurable per tenant via a two-valued enumeration, privacy_enforcement:
- strict (default, opt-out required). A confidential classification forces local-only model selection at Stage 1 of the downstream routing algorithm. Candidate models whose inference endpoint resides outside the deployment boundary are removed from the eligible set. No raw-prompt retention. Intended for healthcare, finance, defense, and other regulated verticals.
- disabled (opt-in). Classification still executes for cost-tier routing, complexity ceiling, and telemetry; however the confidential emit does not force local-only routing. All models remain eligible subject to other gating rules. Optional opt-in raw-prompt retention enabled. Intended for unregulated tenants whose data policy permits external model processing and who therefore benefit from faster improvement cycles (§9).

The configuration space is deliberately restricted to two values. A middle "relaxed" mode was considered and rejected on the grounds that its semantics would admit multiple interpretations, creating compliance ambiguity during audit. Vendor-allowlist restrictions ("route confidential only to approved external vendors") are treated as a separate configuration surface, not a privacy-enforcement mode.
Distinction between classifier location and routing enforcement. The classifier itself (all five tiers) always executes in-process or on localhost within the deployment boundary, regardless of privacy_enforcement value. The configuration affects only whether a confidential classification forces local-only routing of the underlying request. The two concepts — classifier location and routing enforcement — are architecturally independent.
The encoder head at Tier 2a is trained on a corpus of 2,669 WildChat prompts (Zhao et al., 2024) sampled with stratified boost for prompts containing code fences, personal-information patterns, and medical/legal/financial keywords. Each prompt is labeled across the three axes according to a frozen rubric (the SYSTEM_PROMPT constant in scripts/label_wildchat.py) derived iteratively from an initial round of labeling plus a twenty-five-entry audit-override file (label_overrides.jsonl) resolving labeler-drift incidents. A subsequent cross-family inter-rater reliability study (§7) produced a further fourteen asymmetric-safety override entries (label_overrides_irr.jsonl). The combined post-adjudication confidential count is 83 within the 2,249 rows joinable to the active prompt pool.
Purple = build · amber = validate · blue = cross-check · green = ship. Full artifacts in findings.md + tests/classification/irr/irr_report.md.
Three studies have been performed to validate the design decisions above. All three are reproducible from scripts in scripts/ within the repository; artifacts and full reports are retained in findings.md, tests/classification/irr/irr_report.md, and audit_all_missed.txt.
Methodology. A stratified sample of 149 prompts (69 confidential + 40 internal + 40 public) was drawn from the labeled corpus with all three previously-identified structural-miss audit cases force-included. Three raters from distinct vendor families labeled the sample independently, blind to one another's outputs: Claude (Anthropic), GPT (OpenAI, accessed via Microsoft Copilot Think Deeper), and Gemini (Google, Gemini 2.5 Pro). All raters operated on the same frozen rubric and were provided no rationale or prior labeling.
Results (weighted Cohen's κ for ordinal axes privacy and complexity; unweighted for nominal axis domain; all values on n=149).
| Axis | Best pair | Fleiss κ (3-rater, unweighted) | Interpretation |
|---|---|---|---|
| domain | 0.801 (Claude-Gemini) | 0.737 | substantial |
| privacy | 0.783 (Claude-Gemini, weighted) | 0.577 | substantial pairwise; moderate three-rater |
| complexity | 0.679 (Claude-GPT, weighted) | 0.517 | substantial pairwise; moderate three-rater |
All three axes cross the "substantial" threshold under the metric appropriate to the class structure (Landis and Koch, 1977). Quadratic weighting for ordinal axes correctly discounts adjacent-class disagreements and penalizes distant disagreements; unweighted κ is retained for transparency.
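For reference, quadratic-weighted Cohen's κ can be computed without external libraries. This is a generic sketch of the standard formula, not code from the Tidus repository; it assumes ordinal labels encoded as comparable values:

```python
def quadratic_weighted_kappa(a: list[int], b: list[int]) -> float:
    """Cohen's kappa with quadratic weights w_ij = ((i - j) / (k - 1))**2."""
    cats = sorted(set(a) | set(b))
    k, n = len(cats), len(a)
    idx = {c: i for i, c in enumerate(cats)}
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[idx[x]][idx[y]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = ((i - j) / (k - 1)) ** 2
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n
    return 1.0 - num / den
```

Perfect agreement yields κ = 1.0, and adjacent-class disagreements are discounted relative to distant ones, which is the behaviour the ordinal axes rely on.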
Audit-case unanimity. Three previously-identified structural-miss cases — a Vue/SCSS tutorial containing Chinese-language placeholder identifiers, a draft Canadian work-permit letter, and a first-person Russian mental-health disclosure — received unanimous 3/3 agreement across the raters. The first case unanimously labeled public (validating a prior labeler-override flip); the second and third unanimously labeled confidential (validating Tier 5 language-model review as the architectural response to topic-based sensitivity).
Asymmetric-safety adjudication. Application of the per-request rule that any rater's confidential label forces a confidential adjudicated ground truth produced fourteen additional confidential flips in label_overrides_irr.jsonl. Of these, twelve appeared in the ensemble's joinable pool; the remaining two fell outside the active pool due to orphan-identifier corner cases. This expands the post-adjudication confidential count from 71 (Claude-only; before de-duplication) to 83.
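The adjudication rule itself is compact enough to sketch directly. In the sketch below, the label names mirror the privacy axis, but the majority-vote fallback and its tie-break are illustrative assumptions — the study's exact fallback behavior is documented in the adjudication scripts:

```python
PRIVACY_ORDER = ["public", "internal", "confidential"]  # ordinal, least to most sensitive

def adjudicate(rater_labels):
    """Asymmetric-safety adjudication: any rater's 'confidential' vote wins.

    A missed confidential is a compliance incident; an over-flag only costs
    routing efficiency, so splits resolve toward the safer label.
    """
    if "confidential" in rater_labels:
        return "confidential"
    # Illustrative fallback: majority vote, ties broken toward the more
    # sensitive label (an assumption, not the documented study tie-break).
    counts = {lbl: rater_labels.count(lbl) for lbl in set(rater_labels)}
    best = max(counts.values())
    tied = [lbl for lbl, c in counts.items() if c == best]
    return max(tied, key=PRIVACY_ORDER.index)
```

The same function, applied per request instead of per training item, is the emit-side OR-rule: the symmetry between the two is deliberate.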
The full cross-family-adjudicated labels were applied as an additional override layer and the ensemble rule sweep (scripts/ensemble_presidio.py) was re-run. Results on n=2,249 rows, gt_conf=83:
| Rule | Recall | 95% CI | Flagged % |
|---|---|---|---|
| E1 — PERSON alone | 89.2% (74/83) | [80.7%, 94.2%] | 49.3% |
| E2 — PERSON + Encoder-non-public | 83.1% (69/83) | [73.7%, 89.7%] | 18.5% |
| E0 — POC regex + encoder only | 71.1% (59/83) | [60.6%, 79.7%] | 10.4% |
The 6.6-percentage-point recall drop between the Claude-only baseline (95.8%, observed on n=71 pre-adjudication) and the cross-family-adjudicated baseline (89.2%, n=83) constitutes the quantified single-labeler bias that the IRR study was designed to surface. The 89.2% figure is the defensible production value.
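The confidence intervals in the table are consistent with 95% Wilson score intervals on the flipped counts, which can be reproduced in a few lines:

```python
import math

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

lo, hi = wilson_ci(74, 83)   # Rule E1: 74 of 83 adjudicated confidentials caught
# roughly (0.807, 0.942), matching the table's [80.7%, 94.2%]
```

The Wilson interval is preferable to the normal approximation here because the proportions sit close to 1 and n is small; the same function reproduces the E2 and E0 rows from 69/83 and 59/83.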
Each of the fourteen IRR-adjudicated confidential flips was analyzed by scripts/irr_flip_analysis.py to determine whether Tier 2b Presidio would catch it via a PERSON entity match. Of the twelve flips in the joinable pool, the result is a precise 50/50 split that empirically justifies the architectural separation between Tier 2b and Tier 5:

- Six flips: Presidio detects a PERSON or other high-trust entity.
- Six flips: Presidio detects nothing; the sensitivity is in the subject matter itself.

Why this matters architecturally: no single detector family catches both halves. Presidio alone would miss 50% of real confidentials; an LLM alone would be unaffordable at scale and would defeat the privacy guarantee (the classifier itself would leak the prompt). The cascade design splits the work: cheap entity detectors at Tier 2b handle the first half, a selectively-invoked LLM at Tier 5 handles the second. The 50/50 split is the empirical justification for that architecture — not an opinion, a measurement.
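A schematic of that split, with stand-in callables for the real tiers (entity_detector, llm_review, and the escalation predicate are illustrative interfaces, not the production implementations):

```python
def classify_privacy(prompt, entity_detector, llm_review, escalate):
    """Bifurcated cascade sketch: cheap entity detection first, selective LLM second.

    entity_detector: returns high-trust entity types (Presidio stand-in, Tier 2b).
    llm_review: local LLM topic reviewer (Tier 5 stand-in), invoked selectively.
    escalate: predicate deciding whether an entity-free prompt is risky enough
              to pay the ~2,000 ms Tier 5 cost (illustrative policy hook).
    """
    entities = entity_detector(prompt)
    if entities:                       # entity-bearing half of the 50/50 split
        return "confidential", "tier2b", entities
    if escalate(prompt):               # topic-bearing half, caught only by the LLM
        return llm_review(prompt), "tier5", []
    return "internal", "default", []
```

The structure makes the economics visible: the expensive reviewer only runs when the cheap detectors come back empty and the escalation policy fires.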
Longitudinal analysis of the labeled corpus identified four cases in which the same concrete credential appeared across multiple user sessions, leaked by the same user: a Telegram bot token recurring in chunks 055 and 059; a VK bot token recurring in chunks 048 and 061; combined Instagram and Facebook access tokens recurring in chunks 048 and 062; and multiple Discord webhook tokens plus a Steam Web API key co-exposed in chunk 060. This observation indicates that credential-leak behavior is a longitudinal property of the user/session, not a per-request property; the implication is that an audit-layer user-scoped leak cache would detect re-leaks missed by stateless per-request classification. This finding is orthogonal to the main classification workflow but is disclosed here because it motivates a complementary architectural element (audit-side user-scoped leak fingerprinting) that may be the subject of additional claims.
Architectural implication: per-request classifiers see each prompt in isolation and cannot recognize "this user has leaked this exact token before." A user-scoped fingerprint cache in the audit layer recovers the signal — a complementary control surface to the five-tier classifier.
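A minimal sketch of such a cache, assuming salted SHA-256 fingerprints and an illustrative two-pattern token library (the production regex set is larger):

```python
import hashlib
import re

# Illustrative token patterns only; not the production regex library.
TOKEN_PATTERNS = [
    re.compile(r"\b\d{8,10}:[A-Za-z0-9_-]{30,}\b"),   # Telegram-bot-style token
    re.compile(r"\b[A-Z0-9]{32}\b"),                   # Steam-Web-API-style key
]

class LeakCache:
    """User-scoped credential fingerprint cache (audit-layer sketch).

    Stores salted SHA-256 fingerprints of matched credential strings, never
    the credentials themselves, so the cache is safe to persist in the audit DB.
    """
    def __init__(self, salt=b"deployment-secret"):
        self.salt = salt
        self.seen = {}                 # user_id -> set of fingerprints

    def _fingerprint(self, token):
        return hashlib.sha256(self.salt + token.encode()).hexdigest()

    def observe(self, user_id, prompt):
        """Record this prompt's credentials; return fingerprints seen before."""
        fps = {
            self._fingerprint(m.group(0))
            for pat in TOKEN_PATTERNS for m in pat.finditer(prompt)
        }
        previous = self.seen.setdefault(user_id, set())
        releaks = fps & previous
        previous |= fps
        return releaks
```

Because only fingerprints persist, the cache recovers the longitudinal signal (chunks 055/059, 048/061, and so on) without retaining a single live credential.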
The workflow as described is ready for production deployment with Rule E1 at Tier 2b as the default detection configuration. Empirical shipping baseline: 89.2% recall (74/83 adjudicated confidentials) at a 49.3% flag rate. Classification runs entirely inside the deployment boundary regardless of the privacy_enforcement value; the cross-family IRR methodology does not require raw-prompt transmission beyond a controlled one-time validation sample.

The shipping baseline of 89.2% is designed to compound upward on real enterprise traffic via four overlapping mechanisms (referred to internally as "levers"), each operating at a different cadence and informed by different data.
Monthly human review of the disagreement queue emits a new label_overrides_production_YYYY_MM.jsonl file. The privacy-safe part: what flows into the review queue is feature metadata only (entity types, reduced embedding, regex pattern IDs, tier decisions) — never the raw prompt text. Even under full active learning, confidential prompts remain inside the deployment boundary.
Prompt: "draft a resignation letter citing mistreatment during performance review cycles"
| Tier | Decision |
|---|---|
| T2b Presidio | No entities detected → internal |
| T5 LLM | Employment-law complaint topic → confidential |
Disagreement logged. Emit follows OR-rule (confidential) but the feature record goes to review. Next month's human adjudication confirms T5's label. A topic-keyword pattern ("resignation letter", "mistreatment", "performance review") is added to the Tier-1 library (Lever 2) so future prompts of this shape are caught in 5 ms instead of needing a 2,000 ms LLM call. The system is now both more accurate and faster on this traffic class — this is how compounding happens.
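The compounding step in the example above can be sketched as follows; Tier1Patterns, promote, and the pattern identifier are hypothetical names for illustration, not the production API:

```python
import json
import re

class Tier1Patterns:
    """Fast topic-keyword library consulted before any slower tier (sketch)."""
    def __init__(self):
        self.patterns = []             # list of (pattern_id, compiled_regex)

    def match(self, prompt):
        return [pid for pid, rx in self.patterns if rx.search(prompt)]

    def promote(self, pattern_id, keywords):
        """Lever 2: turn adjudicated disagreement keywords into a ~5 ms check."""
        rx = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
        self.patterns.append((pattern_id, rx))

def log_disagreement(record_sink, request_id, tier_decisions, features):
    """Queue feature metadata only -- never raw prompt text -- for review."""
    record_sink.append(json.dumps({
        "request_id": request_id,
        "tier_decisions": tier_decisions,  # e.g. {"t2b": "internal", "t5": "confidential"}
        "features": features,              # entity types / pattern IDs only
    }))
```

After next month's adjudication confirms T5's label, promote("resignation-complaint-001", ["resignation letter", "mistreatment", "performance review"]) moves this traffic class from the LLM path to the millisecond path.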
Lever 1 (disagreement-capture feedback loop). Confidential traffic continues to route local-only throughout (privacy_enforcement=strict). Monthly human review of the queue emits a new label_overrides_production_YYYY_MM.jsonl file. Quarterly retraining of the Tier 2a encoder head on the expanded corpus produces approximately 2–4 percentage-point recall gains per quarter in the first year, diminishing thereafter. The review queue receives an expected 5–10% of traffic — precisely the fraction where the system is uncertain and where labels add the most information.

Lever 3 (encoder upgrades). The Tier 2a sentence encoder (all-MiniLM-L6-v2) is a freezable dependency; upstream releases of newer sentence-transformer models (e.g., BGE, GTE, successor MiniLM variants) can be drop-in swapped with a single k-fold retrain of the logistic-regression head. Expected gain: 1–3 percentage points per encoder upgrade, at six-month cadence.

Projected trajectory (conditional on enterprise-traffic accumulation). Ship-day recall is 89.2%. Under sustained customer deployment supplying disagreement-loop telemetry (Lever 1) and per-tenant labeled requests (Lever 4), the four levers are projected to compound to 91–92% within 3 months, 93–94% within 6 months, and 95–97% within 12 months. Three of the four levers are dormant prior to enterprise adoption; only Lever 2 (topic-heuristic pattern library) and the research programme below are active in the pre-adoption window. Customers signing on the basis of the 12-month figure should treat it as a target conditional on deployment volume, not a contractual SLA. Beyond approximately 97–98%, further gains require rubric refinement rather than model improvement — the Fleiss κ of 0.577 on the privacy axis indicates that expert human raters themselves disagree on approximately 37% of boundary cases, establishing an irreducible ceiling that no classifier can exceed without changing the rubric itself.
Parallel research programme (active in the pre-adoption window). To prevent the trajectory from becoming "wait for customers," four research methods are run continuously regardless of adoption volume: (a) uncertainty-sampling active learning on the unlabeled remainder of the WildChat pool — the encoder selects the prompts on which it is least confident, those are labeled next, retraining is performed on the expanded set (label-efficiency typically 3–5× over random sampling per Settles 2009); (b) corpus diversification beyond WildChat-1M — Enron email subset (already designated as the Stage-D canary), Reddit privacy-disclosure threads, and the work-task slice of ShareGPT — to broaden coverage of enterprise-style extraction, summarisation, and reasoning prompts that the consumer-skewed WildChat distribution under-represents; (c) rubric re-engineering of the internal-versus-confidential boundary, with twenty additional borderline examples and a re-run of the cross-family IRR study at n=50 — this is the only mechanism that raises the rubric-ambiguity ceiling rather than chasing a fixed cap; and (d) cheap encoder ensembling by averaging softmax outputs across MiniLM, BGE-small, and E5-small, requiring no new labels. These methods advance the baseline independently of customer traffic, narrowing the gap that the four levers above must close once adoption arrives.
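Method (a) reduces, at its core, to ranking the unlabeled pool by the encoder's top softmax probability and labeling the least-confident prompts first; a minimal sketch (predict_proba is a stand-in for the Tier 2a encoder head):

```python
def least_confident(pool, predict_proba, batch_size=20):
    """Uncertainty-sampling selection for active learning.

    Confidence = max softmax probability; the prompts the model is least
    sure about carry the most information per label.
    """
    scored = [(max(predict_proba(p)), p) for p in pool]
    scored.sort(key=lambda t: t[0])            # lowest confidence first
    return [p for _, p in scored[:batch_size]]
```

This is the simplest uncertainty measure in the active-learning literature; margin- or entropy-based variants drop in by changing one line.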
Privacy-safe telemetry. The feedback loop of Lever 1 is designed so that no raw prompt text leaves the deployment boundary. The logged per-request record contains only: request identifier, tenant identifier (required from first deployment to enable future per-tenant fine-tuning), a dimensionality-reduced embedding (64-dimension reduced from the native 384-dimension sentence-transformer output, preventing embedding-inversion attacks on sensitive content), the set of Presidio entity types (types only, never values), the set of regular-expression pattern identifiers that fired (identifiers only, never matched strings), the emitting tier, the final classification across all three axes, the routed model identifier, and the end-to-end latency. This schema permits retraining of the encoder and downstream classifiers without ever retaining the original prompt text. Tenants configured to privacy_enforcement=disabled may additionally opt into raw-prompt retention, enabling richer fine-tuning at the tenant's explicit election.
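The schema can be sketched as a frozen dataclass; the field names below paraphrase the prose and are not the exact production schema:

```python
from dataclasses import asdict, dataclass
from typing import List

@dataclass(frozen=True)
class TelemetryRecord:
    """Privacy-safe per-request record (sketch of the schema described above).

    Deliberately has no field for prompt text: the schema itself is the
    enforcement mechanism -- there is nowhere to put the raw prompt.
    """
    request_id: str
    tenant_id: str
    reduced_embedding: List[float]   # 64-dim, reduced from the native 384
    entity_types: List[str]          # Presidio types only, never values
    pattern_ids: List[str]           # regex IDs only, never matched strings
    emitting_tier: str
    privacy: str                     # final classification, all three axes
    complexity: str
    domain: str
    routed_model: str
    latency_ms: float
```

Freezing the dataclass and omitting any text field makes the "no raw prompt leaves the boundary" property structural rather than procedural.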
Deploying Tidus with the classification layer active requires three configuration steps, in order:
1. Installation. Follow docs/deployment.md. Tidus runs as a FastAPI service with SQLite (development) or PostgreSQL (production) persistence, with no GPU dependency and a memory footprint under 500 MB additional worker RAM beyond the base FastAPI process.
2. Privacy mode. In config/policies.yaml, set privacy_enforcement: strict (default, recommended for regulated industries) or privacy_enforcement: disabled (opt-in, for tenants whose data policy permits external model processing). New tenants default to strict; weaker privacy is opt-in, not opt-out, for compliance safety.
3. Detection rule. Set classification.presidio_rule: E1 (default; 89.2% recall, ~49% flag rate) or classification.presidio_rule: E2 (83.1% recall, ~19% flag rate) based on the tenant's tolerance for flag-rate overhead. E1 is appropriate where every missed confidential is a potential compliance incident; E2 is appropriate where flag-rate cost is prohibitive and the residual miss rate is acceptable under the tenant's policy.

Telemetry activation. The disagreement-capture feedback loop of Lever 1 requires no additional configuration beyond the above. Per-request telemetry records are written to the audit database, which also drives the cost-reporting and routing-decision history dashboards. Monthly review of the disagreement queue is a human-in-the-loop activity; the expected effort is on the order of a few hours per month for a representative enterprise traffic volume.
Quarterly retraining. Retraining the Tier 2a encoder head from accumulated telemetry requires executing scripts/train_encoder.py with the expanded label_overrides_production_*.jsonl files present. Retraining is a standalone activity of approximately 10–30 minutes CPU time at typical enterprise telemetry volumes; no downtime is required as the new encoder head is published as a new revision in the model registry subsystem and becomes active at the next selector refresh.
Per-tenant fine-tuning (Lever 4). Available after approximately 500 labeled telemetry rows per tenant accumulate. The infrastructure for per-tenant heads is specified but not required at initial deployment; enabling it at time of sufficient telemetry volume is a one-time activity of a few engineering sessions.
Excerpt from config/policies.yaml for a HIPAA-covered healthcare SaaS:
```yaml
tenants:
  acme-healthcare:
    privacy_enforcement: strict        # confidential → local-only routing
    classification:
      presidio_rule: E1                # 89.2% recall; flag cost acceptable
      topic_heuristics_enabled: true   # catches topic-based confidentials cheaply
    vendor_allowlist:                  # independent of privacy; applied at routing stage
      - local-llama-3-70b
      - local-mistral-large
      - azure-openai-east-us           # BAA-covered

  acme-internal-saas:
    privacy_enforcement: disabled      # unregulated; best-model routing
    classification:
      presidio_rule: E2                # lower flag rate; lower Tier-5 volume
      topic_heuristics_enabled: true
    raw_prompt_retention: opt-in       # faster per-tenant fine-tuning
```
Note: a single Tidus deployment can host both tenants. privacy_enforcement is evaluated per request based on the calling tenant's config; the classifier itself always runs in-process regardless.
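A sketch of that per-request evaluation (function and parameter names are illustrative, not the production routing interface):

```python
def resolve_routing(tenant_cfg, privacy_label, candidate_models, local_models):
    """Per-request enforcement sketch.

    The classifier always runs in-process; only the routing consequence
    depends on the calling tenant's configuration.
    """
    enforcement = tenant_cfg.get("privacy_enforcement", "strict")   # strict by default
    allowlist = tenant_cfg.get("vendor_allowlist")
    pool = [m for m in candidate_models if allowlist is None or m in allowlist]
    if enforcement == "strict" and privacy_label == "confidential":
        pool = [m for m in pool if m in local_models]    # local-only routing
    return pool
```

Under this shape, a strict tenant's confidential request can only ever reach models in the local set, while a disabled tenant's identical request sees the full candidate pool.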
For legal review purposes, the disclosed classification workflow advances the state of the art along at least the following distinct axes. Each claim is grouped below by the architectural layer it applies to. Each is supported by empirical evidence as cited. None are believed to be disclosed in combination, or individually, by any known prior-art system.
| Layer | # | Claim | Evidence in this document |
|---|---|---|---|
| Per-request runtime | 1 | Local-only five-tier classification cascade combining deterministic regex heuristics, trained sentence-embedding encoder with per-axis classification heads, Presidio-based named-entity recognizer, and a language-model fallback — all within the deployment boundary — for enterprise AI request routing. | §4 System Architecture |
| | 2 | Asymmetric-safety OR-rule for combining tier outputs: any tier's confidential classification unilaterally forces the emit value, deliberately rejecting majority-vote and confidence-weighted combiners on compliance-asymmetry grounds. | §4 merge rule |
| Training-data & methodology | 3 | Cross-family inter-rater reliability methodology for validating privacy-classification ground truth using independent LLMs from distinct vendor families, applied blind against a frozen rubric, with quadratic-weighted Cohen's κ for ordinal classes and Fleiss κ for multi-rater agreement. | §7.1 IRR study |
| | 4 | Asymmetric-safety adjudication rule for ground-truth construction: any rater's confidential label forces confidential in the adjudicated labels — symmetric to the per-request OR-rule, applied at training-data construction time. | §7.1 adjudication |
| | 5 | Entity/topic bifurcation analysis methodology for empirically justifying classifier-architecture choices by correlating post-adjudication ground-truth gains against per-tier detection capabilities (measured split: 6/6 entity-bearing caught by Tier 2b; 6/6 topic-bearing missed). | §7.3 bifurcation |
| Telemetry & feedback | 6 | Privacy-preserving telemetry schema for post-deployment feedback learning in regulated deployments, retaining only dimensionality-reduced embeddings, entity-type metadata, regex pattern identifiers, and classification outputs — never raw prompt text — permitting encoder retraining without prompt retention. | §9 Lever 1 |
| | 7 | Disagreement-capture active learning loop whereby only inter-tier-disagreement requests are flagged for human review, achieving label-efficiency on the order of a ten-fold reduction compared to random-sample review. | §9 Lever 1 |
| Configuration surface | 8 | Two-valued privacy-enforcement configuration (strict / disabled) with deliberate rejection of intermediate modes on compliance-ambiguity grounds, decoupling the routing-enforcement semantics from the architecturally independent question of classifier location. | §5 privacy_enforcement |
Each of the eight claims is severable — any subset may be pursued independently. Combinations across layers (e.g., claims 2 + 4, or claims 6 + 7) constitute additional dependent-claim surface.
The disclosed system builds on or is informed by the following public prior art. Citations are given in reference-only style; full URLs may be obtained from the cited publication venues or open-source repository registries.
- Sentence-transformer embedding models (all-MiniLM-L6-v2).

Document revision: 2026-04-20. Corresponds to Tidus version 1.3.0 (auto-classification layer, shipping preparation phase). Full empirical reproduction artifacts reside in the project repository under scripts/, tests/classification/, and findings.md. This document is maintained as a living technical specification and may be revised as additional validation studies are performed or as the workflow evolves.