Why we chose Openfunds as our canonical standard
Every data platform needs a canonical schema — the internal language everything gets normalised to. We evaluated four serious contenders. Openfunds won, but not for the reasons you might expect.
The standards landscape
Fund data has no shortage of standards. That's actually part of the problem — there are too many, and they overlap in confusing ways.
ISO 20022 is the messaging standard that dominates securities settlement. It defines message types for fund orders, transfers, and reporting. It's comprehensive, XML-based, and deeply embedded in the plumbing of financial infrastructure. But it's a messaging format, not a field-level data dictionary. It tells you how to structure a message, not what "fund currency" means across different contexts.
EFAMA / FinDatEx templates (the EMT, EPT, EET, and related templates) are regulatory-driven. They define specific fields needed for PRIIPs, MiFID II, and SFDR compliance. They're essential for regulatory delivery but narrow in scope. You can't build a canonical data model on these templates alone because each one only covers a slice of what you need.
FundsXML is a solid XML-based format used by several European data providers. Good structure, reasonable coverage. But it's XML-native in a world that's increasingly JSON and tabular, and its adoption curve has plateaued.
Openfunds is a field-level standard. Each field has a unique identifier — OFST010010 for Fund Legal Name, OFST020050 for Fund Currency, OFST050030 for NAV — with defined data types, valid values, and clear documentation. It covers static fund data, share class data, and key operational fields.
Why Openfunds won
Three reasons, in order of importance:
1. Field-level granularity. When you're mapping a source column called Fondswährung to your canonical schema, you need a target that operates at exactly that level of granularity. Not a message structure. Not a template position. A specific field with a specific ID that means exactly one thing. Openfunds gives you OFST020050 (Fund Currency) with a defined ISO 4217 value set. There's no ambiguity.
2. Machine-readable field registry. The Openfunds field list is downloadable, parseable, and structured. Every field has an ID, a name, a data type, valid values where applicable, and a description. This matters enormously when you're building AI-assisted mapping — you can feed the entire field registry to a language model as context and get surprisingly good mapping suggestions.
3. Industry adoption in the right places. Openfunds has traction with European fund platforms, transfer agents, and data providers. It's not universal — nothing is — but it's used by enough of the ecosystem that mappings to and from it are well-understood.
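To make points 1 and 2 concrete, here is a minimal sketch of a field registry held in memory, with a value-set check and a helper that flattens the registry into text a language model could take as context. The three field IDs and names come from the text above. Everything else (the FieldDef class, the data type labels, the truncated currency list, the pipe-separated layout) is an assumption for illustration, not the official Openfunds file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldDef:
    """One registry entry. The attribute names are illustrative."""
    field_id: str
    name: str
    data_type: str
    valid_values: Optional[frozenset] = None  # None means "no fixed value set"


# Hand-written excerpt. A real registry would be parsed from the downloaded
# field list; the currency set here is a small sample of ISO 4217 codes.
REGISTRY = {
    "OFST010010": FieldDef("OFST010010", "Fund Legal Name", "string"),
    "OFST020050": FieldDef(
        "OFST020050", "Fund Currency", "string",
        frozenset({"CHF", "EUR", "GBP", "USD"}),
    ),
    "OFST050030": FieldDef("OFST050030", "NAV", "decimal"),
}


def is_valid(field_id: str, value: str) -> bool:
    """Check a value against the field's value set, if it has one."""
    field = REGISTRY[field_id]
    return field.valid_values is None or value in field.valid_values


def registry_as_context(registry: dict) -> str:
    """Render one line per field, suitable for pasting into a prompt."""
    lines = []
    for field in registry.values():
        values = ", ".join(sorted(field.valid_values)) if field.valid_values else "any"
        lines.append(f"{field.field_id} | {field.name} | {field.data_type} | {values}")
    return "\n".join(lines)


print(is_valid("OFST020050", "EUR"))   # a listed currency code
print(is_valid("OFST020050", "Euro"))  # free text, not a code
print(registry_as_context(REGISTRY))
```

Because each field carries its own ID and value set, the same structure serves both validation and prompt building, which is the practical payoff of a field-level standard.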
What Openfunds doesn't cover
Being honest about the gaps is important. Openfunds has real limitations:
- Time series data. Openfunds is primarily a static data standard. Daily NAVs, performance history, flow data — these need additional modelling on top. We handle this with our own time series schema that references Openfunds field IDs where applicable.
- Regulatory template specifics. EMT field 04010 and Openfunds OFST020050 both reference fund currency, but the regulatory template has specific formatting and validation rules that Openfunds doesn't encode. You need both.
- Custom fields. Every asset manager has proprietary fields that don't map to any standard. Internal fund codes, custom risk ratings, bespoke classification schemes. We handle these with a custom: prefix namespace that sits alongside the Openfunds fields.
- Relationship modelling. The hierarchy from management company to umbrella to fund to share class isn't deeply modelled in Openfunds. We built our own entity graph for this.
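As an illustration of the time series point in the list above, a record can carry an Openfunds field ID alongside a date, so that daily values stay tied to the static standard. This is a rough sketch under assumptions: the Observation class, its attribute names, and the SC-001 style share class keys are hypothetical, and the real schema is not described in this post.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Observation:
    """One dated value for one field of one share class (illustrative shape)."""
    share_class_id: str  # hypothetical internal key
    field_id: str        # an Openfunds ID, or a custom:-prefixed name
    as_of: date
    value: Decimal


def latest(observations, share_class_id, field_id):
    """Return the most recent observation for a share class and field, or None."""
    matches = [
        o for o in observations
        if o.share_class_id == share_class_id and o.field_id == field_id
    ]
    return max(matches, key=lambda o: o.as_of, default=None)


# Made-up sample data: two NAV points for one share class, one for another.
history = [
    Observation("SC-001", "OFST050030", date(2024, 1, 2), Decimal("100.00")),
    Observation("SC-001", "OFST050030", date(2024, 1, 3), Decimal("101.25")),
    Observation("SC-002", "OFST050030", date(2024, 1, 3), Decimal("55.10")),
]

print(latest(history, "SC-001", "OFST050030"))
```

Keying each observation by field ID means a custom: field can have a history in exactly the same way as a standard one.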
The practical implementation
In Kairo, every canonical field is either an Openfunds ID or a namespaced extension. When data arrives from any source, the mapping step converts source columns to these canonical IDs. The entire downstream pipeline — validation, reconciliation, delivery — operates on canonical IDs only.
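The rule that every canonical field is either an Openfunds ID or a namespaced extension can be sketched as a small check plus a row conversion. This is not Kairo's code. The ID pattern is an assumption inferred from the IDs quoted in this post (OF, two letters, six digits), and the function names, column names other than Fondswährung, and the custom field name are made up for the example.

```python
import re

# Assumed ID shape, inferred from examples such as OFST020050.
OPENFUNDS_ID = re.compile(r"^OF[A-Z]{2}\d{6}$")
CUSTOM_PREFIX = "custom:"


def is_canonical(field_id: str) -> bool:
    """True for an Openfunds-style ID or a non-empty custom: name."""
    if OPENFUNDS_ID.match(field_id):
        return True
    return field_id.startswith(CUSTOM_PREFIX) and len(field_id) > len(CUSTOM_PREFIX)


def to_canonical(row: dict, mapping: dict):
    """Convert one source row to canonical IDs.

    Returns the converted row and the list of source columns that had no
    mapping, so they can be sent for review instead of silently dropped.
    """
    canonical, unmapped = {}, []
    for column, value in row.items():
        target = mapping.get(column)
        if target is None:
            unmapped.append(column)
            continue
        if not is_canonical(target):
            raise ValueError(f"{column!r} maps to non-canonical ID {target!r}")
        canonical[target] = value
    return canonical, unmapped


mapping = {
    "Fondswährung": "OFST020050",
    "Interner Code": "custom:internal_fund_code",
}
row = {"Fondswährung": "EUR", "Interner Code": "X1", "Bemerkung": "n/a"}

canonical_row, unmapped_columns = to_canonical(row, mapping)
print(canonical_row)
print(unmapped_columns)
```

Rejecting non-canonical targets at this boundary is what lets validation, reconciliation, and delivery assume they only ever see canonical IDs.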
The mapping is the hard part. Once data is in canonical form, everything downstream becomes dramatically simpler.
This is why the AI-assisted mapper matters so much. Manually mapping 200 source columns to Openfunds fields takes a skilled analyst half a day. Our AI mapper does it in seconds and gets about 85% right on the first pass. The analyst reviews and corrects, the corrections feed back into the model, and the next mapping from a similar source is better.
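For scale, 85% of 200 columns is about 170 correct suggestions, leaving roughly 30 for the analyst to fix. One simple way such corrections could be reused is a store of analyst-confirmed mappings that is consulted before the model is asked. The sketch below shows that idea only. This post does not describe how the feedback loop is actually implemented, and the function names and the stand-in model are hypothetical.

```python
def normalise(column: str) -> str:
    """Normalise a source column name so trivial variants match."""
    return column.strip().lower()


def record_correction(confirmed: dict, column: str, field_id: str) -> None:
    """Remember an analyst-approved mapping for later runs."""
    confirmed[normalise(column)] = field_id


def suggest(columns, confirmed: dict, model_suggest):
    """Prefer confirmed mappings; fall back to the model for the rest.

    Returns {column: (field_id, source)}, where source is "confirmed" or "model".
    """
    suggestions = {}
    for column in columns:
        key = normalise(column)
        if key in confirmed:
            suggestions[column] = (confirmed[key], "confirmed")
        else:
            suggestions[column] = (model_suggest(column), "model")
    return suggestions


def fake_model(column: str) -> str:
    """Stand-in for the AI mapper: always guesses Fund Legal Name."""
    return "OFST010010"


confirmed = {}
first_pass = suggest(["Fondswährung"], confirmed, fake_model)

# The analyst corrects the wrong guess; the correction is stored.
record_correction(confirmed, "Fondswährung", "OFST020050")
second_pass = suggest([" fondswährung "], confirmed, fake_model)

print(first_pass)
print(second_pass)
```

Tagging each suggestion with its source also tells the reviewer which rows were already approved once and which are fresh model guesses.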
No standard is perfect. Openfunds has gaps, its governance moves slowly, and not everyone in the industry uses it. But for field-level normalisation of fund data — the specific problem we need to solve — it's the best option available. And where it falls short, we extend it rather than replace it.
That pragmatism matters more than purity. In fund data, you ship with the standard you have, not the standard you wish existed.