10 March 2026 · Kairos

What we learned running fund data infrastructure at scale

I spent seven years building and operating fund data infrastructure at an enterprise platform that served some of the largest asset managers in Europe. Here's what I wish someone had told me on day one.

Most of these lessons aren't technical. The hardest problems in fund data aren't engineering problems. They're coordination problems wearing a technical disguise.

The data is never as clean as they say it is

Every new provider onboarding starts the same way. "We have clean, structured data. Just connect to our SFTP and pull the daily file." Then you open the file.

The ISIN column contains a mix of ISINs and SEDOLs. The NAV field uses commas as decimal separators in some rows and periods in others. There's a share class with a launch date of 1900-01-01 because someone needed a placeholder and never came back to fix it.

This isn't an exception. This is the baseline. Every single provider, without fail, has data quality issues they don't know about. The question isn't whether you'll find problems. It's whether you have the tooling to surface them before they reach your clients.
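As a minimal sketch of what that tooling can look like, here is a per-row validator for the three issues described above. The field names (`isin`, `nav`, `launch_date`) and the validators themselves are illustrative, not any real provider's schema:

```python
import re

# Illustrative patterns: ISINs are 12 characters with a country prefix;
# SEDOLs are 7 characters and exclude vowels.
ISIN_RE = re.compile(r"^[A-Z]{2}[A-Z0-9]{9}[0-9]$")
SEDOL_RE = re.compile(r"^[B-DF-HJ-NP-TV-Z0-9]{6}[0-9]$")

def check_row(row: dict) -> list[str]:
    """Return a list of data-quality issues found in a single record."""
    issues = []
    ident = row.get("isin", "")
    if not ISIN_RE.match(ident):
        if SEDOL_RE.match(ident):
            issues.append(f"SEDOL found in ISIN column: {ident}")
        else:
            issues.append(f"unrecognised identifier: {ident!r}")
    # NAVs arrive with mixed decimal separators; normalise before parsing.
    # (A naive swap: assumes no thousands separators in the feed.)
    nav = row.get("nav", "").replace(",", ".")
    try:
        float(nav)
    except ValueError:
        issues.append(f"unparseable NAV: {row.get('nav')!r}")
    # 1900-01-01 is a common placeholder, not a real launch date.
    if row.get("launch_date") == "1900-01-01":
        issues.append("placeholder launch date (1900-01-01)")
    return issues
```

Run against every incoming file, a check like this turns silent corruption into a report the operations team sees before clients do.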

Automation fails at the edges

We automated about 85% of data processing at the enterprise platform. File ingestion, parsing, validation, normalisation, delivery - all automated. The remaining 15% consumed 80% of the operations team's time.

That 15% is where fund data gets weird.

These edge cases require domain expertise. You can build decision trees for some of them, but the long tail is infinite. The right answer is a system that handles the common cases automatically and surfaces the weird ones to humans with enough context to decide quickly.
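The triage pattern described here, rules for the common cases plus escalation with context for everything else, can be sketched as follows. The rule shape and queue structure are assumptions for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewItem:
    """A record escalated to a human, with enough context to decide quickly."""
    record: dict
    reason: str
    context: dict = field(default_factory=dict)

def process(record: dict, rules: list, review_queue: list):
    """Apply the first matching (predicate, handler) rule.

    Records no rule matches are appended to the review queue with
    context rather than silently dropped or mis-processed.
    """
    for matches, apply in rules:
        if matches(record):
            return apply(record)
    review_queue.append(ReviewItem(
        record=record,
        reason="no rule matched",
        context={"source": record.get("source"), "fields": sorted(record)},
    ))
    return None
```

The point of the design is the escalation path: the cost of an edge case is not that a human looks at it, but that the human has to reconstruct what happened without context.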

The real bottleneck is onboarding

At the enterprise platform, onboarding a new data provider took an average of 6 weeks. Not because the integration was complex - most of it was SFTP + CSV parsing. The actual engineering work was maybe 2 days. The rest was waiting for people to respond to emails and review spreadsheets.

This is what we're attacking with Kairo. AI-assisted mapping cuts the schema understanding and field mapping from days to minutes. But the fundamental bottleneck - human coordination - needs a different solution: self-service tooling that lets clients see and approve mappings themselves.

Scale changes everything about conflict resolution

When you have 5 data providers, conflicts are manageable. You call someone, figure out which value is correct, update the record. When you have 50 providers covering 100,000 share classes, you need a system.

We built priority hierarchies. The fund administrator's data beats the asset manager's data. The asset manager's data beats the third-party data vendor. Regulatory filings beat everything. These rules resolved 90% of conflicts automatically.

The remaining 10% still needed human review. But the system could tell you exactly which fields conflicted, which sources disagreed, and what the historical pattern was. A conflict that used to take 30 minutes to investigate took 2 minutes with the right tooling.
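A minimal sketch of that kind of priority resolution, with illustrative source labels and numeric ranks rather than the platform's actual hierarchy:

```python
# Lower rank wins. Regulatory filings beat everything; the administrator
# beats the asset manager; the asset manager beats the third-party vendor.
PRIORITY = {
    "regulatory_filing": 0,
    "fund_administrator": 1,
    "asset_manager": 2,
    "data_vendor": 3,
}

def resolve(field_name: str, candidates: list[tuple[str, str]]):
    """Pick the winning value for a field from (source, value) pairs.

    Returns (value, report): report is None when everyone agrees with the
    winner, otherwise a dict naming the field, the winning source, and the
    sources that disagreed - the context a reviewer needs.
    """
    ranked = sorted(candidates, key=lambda c: PRIORITY.get(c[0], 99))
    winner_source, winner_value = ranked[0]
    disagreeing = [(s, v) for s, v in ranked[1:] if v != winner_value]
    report = None
    if disagreeing:
        report = {
            "field": field_name,
            "winner": winner_source,
            "conflicts": disagreeing,
        }
    return winner_value, report
```

The rules resolve the routine conflicts automatically; the report is what turns the remaining ones from a 30-minute investigation into a 2-minute decision.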

What I'd do differently

If I could go back and redesign the enterprise platform from scratch, the biggest change would be building the review and self-service tooling first, not last.

That is basically the thesis behind Kairo. The fund data industry doesn't have a people problem. It has a tooling problem.
