Production Reality

Principle X: Production Reality

Learn continuously from production reality.

Benchmarks are auditions. Production is the performance. The difference between the two is the difference between a résumé and a career — one tells you what someone claims they can do, the other tells you what they actually do under pressure, at scale, when nobody is watching.

The Problem with Assumptions

Every multi-agent system starts with assumptions. This model is fast. That provider is reliable. This routing strategy is optimal. These assumptions are necessary at launch — you have to start somewhere. They are also, without exception, wrong in ways that only production will reveal.

The model that benchmarked at 95% accuracy delivers 87% on your specific workload. The provider with a 99.9% uptime SLA delivers 99.2% uptime for your traffic pattern. The routing strategy that looked optimal in testing creates hot spots under real load.

Obsidian does not treat these discoveries as surprises. It treats them as data.
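The uptime gap above looks small as a percentage and large as elapsed time. A minimal sketch of the arithmetic, assuming a 30-day window; the function name downtime_minutes is illustrative and not part of Obsidian:

```python
def downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Expected minutes of downtime over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


promised = downtime_minutes(99.9)  # what the SLA implies
observed = downtime_minutes(99.2)  # what the traffic pattern actually sees

print(f"promised: {promised:.1f} min per 30 days")
print(f"observed: {observed:.1f} min per 30 days")
print(f"ratio:    {observed / promised:.1f}x")
```

Under these assumptions the SLA implies roughly 43 minutes of downtime per 30 days, while 99.2% works out to roughly 346 minutes, about eight times as much.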

Continuous Learning

Every model call in Obsidian generates performance data: latency, token usage, output quality, error rates. Every routing decision is recorded with its outcome. Every failover event is analyzed. This data feeds back into the routing and quality systems, adjusting behavior based on what is actually happening rather than what was expected to happen.

This is not machine learning in the fashionable sense. It is empiricism — the practice of updating beliefs based on evidence. When an agent discovers that Model A consistently outperforms Model B on code generation tasks despite Model B’s superior benchmark scores, the system adapts. Not eventually. Continuously.
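One way to picture this kind of continuous updating is an exponentially weighted moving average per task type and model, seeded with the launch-time assumption. The sketch below is illustrative only: EmpiricalRouter, CallOutcome, the model names, and the alpha value are all hypothetical and do not describe Obsidian's actual routing code.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class CallOutcome:
    model: str
    task_type: str
    quality: float       # 0.0 to 1.0, from whatever quality assessment is in use
    error: bool = False  # a failed call counts as zero quality in this sketch


class EmpiricalRouter:
    """Tracks a running quality estimate for each (task_type, model) pair."""

    def __init__(self, priors: Dict[str, float], alpha: float = 0.2):
        self.priors = priors  # launch-time beliefs, e.g. benchmark scores
        self.alpha = alpha    # weight given to each new observation
        self.scores: Dict[Tuple[str, str], float] = {}

    def estimate(self, task_type: str, model: str) -> float:
        # Fall back to the prior until production data exists for this pair.
        return self.scores.get((task_type, model), self.priors[model])

    def record(self, outcome: CallOutcome) -> None:
        observed = 0.0 if outcome.error else outcome.quality
        previous = self.estimate(outcome.task_type, outcome.model)
        updated = (1 - self.alpha) * previous + self.alpha * observed
        self.scores[(outcome.task_type, outcome.model)] = updated

    def choose(self, task_type: str) -> str:
        return max(self.priors, key=lambda m: self.estimate(task_type, m))


# Model B has the better benchmark score, so it wins before any evidence arrives.
router = EmpiricalRouter(priors={"model-a": 0.80, "model-b": 0.95})
before = router.choose("codegen")

# Production disagrees with the benchmark on code generation.
for _ in range(10):
    router.record(CallOutcome("model-a", "codegen", quality=0.90))
    router.record(CallOutcome("model-b", "codegen", quality=0.60))
after = router.choose("codegen")

print(before, "->", after)
```

The choice here is purely greedy. A real router would also need some way to keep sampling the models it is not currently choosing, or their estimates would never be updated.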

User Feedback as Signal

Production reality includes user feedback. A task that completes successfully by every metric but produces output the user rejects has not, in fact, succeeded. Obsidian’s quality assessment includes user signal as a first-class input, because the ultimate measure of system performance is whether it produces outcomes that humans find useful.
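A minimal sketch of what treating user signal as a first-class input could look like, assuming a single automated score and an optional accept/reject flag. TaskResult, assess_quality, and the user_weight default are hypothetical names and values chosen for illustration, not Obsidian's quality system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskResult:
    automated_score: float                # 0.0 to 1.0 from automated checks
    user_accepted: Optional[bool] = None  # None means no feedback yet


def assess_quality(result: TaskResult, user_weight: float = 0.5) -> float:
    """Combine automated metrics with user feedback into one quality score."""
    if result.user_accepted is None:
        # No user signal yet: the automated score is the best evidence available.
        return result.automated_score
    if not result.user_accepted:
        # Rejected output has not succeeded, however good the metrics look.
        return 0.0
    # Accepted output: blend the automated score with the positive user signal.
    return (1 - user_weight) * result.automated_score + user_weight * 1.0


print(assess_quality(TaskResult(automated_score=0.98, user_accepted=False)))
print(assess_quality(TaskResult(automated_score=0.90)))
print(assess_quality(TaskResult(automated_score=0.80, user_accepted=True)))
```

The design choice worth noting is that rejection is not averaged in. It overrides the automated score, which mirrors the claim that a task the user rejects has not succeeded.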

The Honesty Requirement

This principle requires a particular form of organizational honesty: the willingness to let data override intuition. When production metrics contradict your architectural assumptions, the metrics win. When a provider you chose for strategic reasons underperforms a provider you overlooked, you switch. The Constitution does not have favorites.

Implications

Every decision system in Obsidian (routing, model selection, retry strategies, timeout values) must be informed by production data. Static configuration is a starting point, not a destination. The Warden uses production metrics when evaluating constitutional compliance, because a system that is technically compliant but empirically failing is not actually compliant.
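As one illustration of static configuration as a starting point, a timeout could begin at its configured value and move to a figure derived from observed latency once enough samples exist. This is a sketch under stated assumptions: tuned_timeout_ms, the 100-sample minimum, the 99th percentile, and the 1.5x headroom are all hypothetical choices, not Obsidian defaults.

```python
import math
from typing import Sequence


def tuned_timeout_ms(
    latencies_ms: Sequence[float],
    static_default_ms: float,
    min_samples: int = 100,
    percentile: int = 99,
    headroom: float = 1.5,
) -> float:
    """Return a timeout based on observed latency, or the static default."""
    if len(latencies_ms) < min_samples:
        # Not enough evidence yet: keep the launch-time assumption.
        return static_default_ms
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest value covering `percentile`% of samples.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1] * headroom


# Too few samples: the configured value stands.
print(tuned_timeout_ms([120.0] * 10, static_default_ms=30000.0))

# Enough samples: the timeout follows what production actually shows.
print(tuned_timeout_ms(list(range(1, 201)), static_default_ms=30000.0))
```

With 200 samples running from 1 ms to 200 ms, the nearest-rank 99th percentile is 198 ms, so the tuned timeout is 297 ms rather than the configured 30 seconds.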

Relationship to Other Principles

Production Reality depends on Observable by Default (Principle V) for its data. It informs No Single Point of Failure (Principle IX) with empirical failover data. It is also the mechanism by which Leverage Over Effort (Principle VI) stays honest, because leverage that does not produce measurable results is just abstraction.