Deterministic vs probabilistic in regulated environments
Here's the tension at the heart of building AI-native software for financial services: regulators want reproducibility, and AI is inherently probabilistic. We solved this with a two-mode architecture, and it's not a compromise — it's genuinely better than either approach alone.
The reproducibility problem
When a regulator asks "why did you publish this NAV for share class IE00B4L5Y983 on 14 February?", they expect a deterministic answer. The source file came from here. This mapping rule was applied. This validation passed. This value was published. Every step traceable, every step reproducible.
Run the same input through the same pipeline tomorrow and you must get the same output. That's not a nice-to-have. It's a regulatory requirement.
Now ask an LLM to map a column called Nettoinventarwert to your canonical schema. It will correctly identify it as a NAV field and map it to OFST050030. Probably. With high confidence. But "probably" and "high confidence" are not the same as "deterministically, every time, with the same result."
Ask it again tomorrow and it might return the same answer with slightly different confidence scores. Or it might phrase the reasoning differently. Or — rarely, but possibly — it might map it to a different field entirely. That's the nature of probabilistic systems.
The wrong solutions
I've seen two bad approaches to this problem:
The "just don't use AI" approach. Stay fully deterministic. Build rigid mapping templates. Require humans to manually configure every source. This works, but it's slow and expensive. Onboarding a new data source takes days of analyst time. At scale, it doesn't keep up.
The "just trust the AI" approach. Let the model map fields, validate data, make decisions. Ship the output. Hope for the best. This is the approach that gets you featured in a regulatory enforcement notice.
Both are wrong because they treat deterministic and probabilistic as mutually exclusive. They're not.
The two-mode architecture
In Kairo, AI and deterministic processing operate in strictly separated modes:
Mode 1: AI builds the configuration. When a new data source arrives, the AI mapper analyses the schema, suggests field mappings, infers data types, and proposes validation rules. It does this probabilistically, with confidence scores on every suggestion. A human reviews the output, approves or corrects each mapping, and the result is saved as a deterministic configuration.
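The Mode 1 handoff can be sketched as a typed suggestion record that only a human can promote to approved configuration. The names here (`MappingSuggestion`, `approve`) are illustrative assumptions, not Kairo's actual API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MappingSuggestion:
    source_column: str       # e.g. "Nettoinventarwert"
    canonical_field: str     # e.g. "OFST050030"
    confidence: float        # 0.0-1.0, attached by the AI mapper
    approved: bool = False   # flipped only by a human reviewer

def approve(suggestion: MappingSuggestion) -> MappingSuggestion:
    """Human approval turns a probabilistic suggestion into a
    deterministic configuration entry."""
    return MappingSuggestion(
        suggestion.source_column,
        suggestion.canonical_field,
        suggestion.confidence,
        approved=True,
    )
```

The frozen dataclass makes the point structurally: nothing downstream can mutate a suggestion into an approval; a human action has to produce a new, approved record.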
Mode 2: Deterministic pipeline executes. Once the configuration is approved, the daily pipeline runs without any AI involvement. Source column A maps to canonical field B. Validation rule C checks the value. Transformation D formats the output. Every step is a pure function — same input, same output, every time.
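A minimal sketch of what "every step is a pure function" means in practice, using an illustrative column-mapping config rather than Kairo's actual code:

```python
def apply_mapping(row: dict, config: dict) -> dict:
    """Map source columns to canonical fields using the approved config.

    config: {source_column: canonical_field}. No AI call anywhere in
    this path -- same input, same output, every time.
    """
    return {config[col]: value for col, value in row.items() if col in config}

config = {"Nettoinventarwert": "OFST050030"}
row = {"Nettoinventarwert": "104.23", "UnmappedColumn": "x"}

result = apply_mapping(row, config)
# Re-running tomorrow with the same input gives the identical output.
assert result == apply_mapping(row, config)
```

Because the function closes over nothing and calls nothing non-deterministic, replaying a historical input against a historical config reproduces the published value exactly, which is the property the regulator is asking for.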
AI is the architect. Deterministic code is the builder. The architect draws the plans once; the builder follows them exactly, every day.
This separation is enforced at the architecture level, not by convention. The pipeline execution engine literally cannot call an LLM. It doesn't have the credentials. The AI mapper cannot write directly to the production data store. The two systems share a configuration layer and nothing else.
Why this is better than pure deterministic
The pure deterministic approach requires a human to manually build every mapping configuration from scratch. For a source with 200 columns, that's half a day of tedious work by someone who knows both the source format and the Openfunds standard. With AI-assisted configuration, that same task takes twenty minutes — five minutes for the AI to generate suggestions, fifteen minutes for the human to review and approve.
Crucially, the quality is the same or better. The AI catches mappings that humans miss. It recognises that Verwaltungsgesellschaft is the management company name even when the analyst doesn't speak German. It flags ambiguous fields that could map to multiple canonical targets and asks the human to resolve the ambiguity.
Why this is better than pure AI
Every AI mapping comes with a confidence score. In our system, we enforce three layers of defence against hallucinated field IDs:
- Schema validation. The AI can only output field IDs that exist in our canonical registry. If it hallucinates OFST999999, the system rejects it before a human ever sees it.
- Confidence thresholds. Mappings below 70% confidence are flagged for mandatory human review. The AI doesn't get to be uncertain and ship anyway.
- Human approval gate. No AI-generated configuration reaches the production pipeline without explicit human approval. Period.
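The three layers above can be sketched as a single triage gate. The registry contents, threshold constant, and status strings are illustrative assumptions:

```python
CANONICAL_REGISTRY = {"OFST050030", "OFST020000"}  # illustrative subset
CONFIDENCE_THRESHOLD = 0.70

def triage(field_id: str, confidence: float) -> str:
    # Layer 1: schema validation -- hallucinated field IDs are rejected
    # before a human ever sees them.
    if field_id not in CANONICAL_REGISTRY:
        return "rejected"
    # Layer 2: confidence threshold -- low-confidence mappings are
    # flagged for mandatory human review.
    if confidence < CONFIDENCE_THRESHOLD:
        return "needs_review"
    # Layer 3: even confident suggestions only reach "pending approval";
    # nothing enters production without an explicit human sign-off.
    return "pending_approval"
```

Note that no path returns anything like "auto-approved": the best outcome a suggestion can earn on its own is a place in the approval queue.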
The result is a system where the AI does the heavy lifting of initial analysis, the human provides judgement and approval, and the deterministic engine provides the reproducibility that regulators require.
The audit trail
Every configuration change records who approved it, when, and what the AI's original suggestion was. If a mapping was AI-suggested and human-approved, the audit trail shows both. If a human overrode the AI's suggestion, the trail shows the original suggestion, the override, and the reason.
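One way such an audit record might look; the field names are assumptions for illustration, not Kairo's actual schema:

```python
# A single configuration-change record: the AI's original suggestion and
# the human decision are both preserved, so the trail answers "what did
# the AI propose?" and "who approved what, and when?" in one place.
audit_entry = {
    "source_column": "Nettoinventarwert",
    "ai_suggestion": "OFST050030",
    "approved_mapping": "OFST050030",
    "approved_by": "analyst@example.com",
    "approved_at": "2025-02-14T09:30:00Z",
    "override_reason": None,  # populated only when the human overrode the AI
}
```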
This matters because regulators are increasingly asking about AI governance. "Do you use AI in your data processing?" is a question every fund data firm will face. Our answer is clear: yes, for configuration. No, for execution. Here's the audit trail proving it.
That's not a compromise. It's the only architecture that makes sense.