Five domains, not twelve services
We had twelve microservices on the whiteboard. Auth service. Ingestion service. Mapping service. Validation service. It looked impressive. Then we actually tried to deploy it.
A microservices architecture for a fund data platform makes perfect sense on paper. Clean separation of concerns. Independent scaling. Polyglot persistence. All the usual conference-talk points.
Here's what they don't mention at conferences: for a team of three, twelve services means twelve deployment pipelines, twelve sets of health checks, twelve log streams to correlate when something goes wrong at 6am on NAV day. The operational overhead ate us alive before we wrote any business logic.
The twelve we designed
For the record, here's what the whiteboard looked like:
- Auth & identity
- File ingestion
- Schema detection
- Field mapping
- Validation engine
- Entity resolution
- Data store (static)
- Data store (time series)
- Transformation engine
- Delivery/egress
- Notification service
- Audit log
Each one "needed" its own database. Each one had API contracts with at least three others. The dependency graph looked like a plate of spaghetti drawn by an architect who bills by the arrow.
What went wrong
Two weeks into building this, we hit reality. The mapping service needed validation context. The validation engine needed entity resolution results. Entity resolution needed the data store. The data store needed mapping outputs. Everything talked to everything.
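To make the loop concrete, here is a minimal sketch that walks the four dependencies named above and reports the cycle. The service keys and the `find_cycle` helper are illustrative names, not code from our platform.

```python
# Edges read "service -> services it needs", taken from the paragraph above.
# The key names are illustrative, not real service identifiers.
DEPENDS_ON = {
    "mapping": ["validation"],
    "validation": ["entity_resolution"],
    "entity_resolution": ["data_store"],
    "data_store": ["mapping"],
}


def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    visiting = []  # current depth-first path
    done = set()   # nodes fully explored without finding a cycle

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # We walked back onto our own path: that slice is the cycle.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


print(" -> ".join(find_cycle(DEPENDS_ON)))
# prints: mapping -> validation -> entity_resolution -> data_store -> mapping
```

A check like this could run in CI against a declared dependency list, assuming you keep one. Any cycle it reports is a hint that the nodes involved belong in the same unit.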
When every service depends on every other service, you don't have microservices. You have a distributed monolith with network latency.
We also discovered that half our "services" were really just functions. Schema detection doesn't need its own process running 24/7. It runs when a file arrives. That's a function, not a service.
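As a rough illustration of "a function, not a service", schema detection can be a plain function called once per incoming file. The schema names, column sets, and the `on_file_arrived` entry point below are hypothetical and only show the shape of the idea.

```python
import csv
import io

# Hypothetical known layouts: schema name -> columns that must be present.
KNOWN_SCHEMAS = {
    "nav_daily": {"fund_id", "nav_date", "nav"},
    "holdings": {"fund_id", "isin", "weight"},
}


def detect_schema(header):
    """Return the first known schema whose required columns all appear in the header."""
    columns = {name.strip().lower() for name in header}
    for schema_name, required in KNOWN_SCHEMAS.items():
        if required <= columns:
            return schema_name
    return None


def on_file_arrived(text):
    """Called once per incoming file; nothing stays running between files."""
    header = next(csv.reader(io.StringIO(text)), [])
    return detect_schema(header)


sample = "Fund_ID,NAV_Date,NAV,Currency\nF001,2024-01-31,101.25,EUR\n"
print(on_file_arrived(sample))  # prints: nav_daily
```

There is no server, no health check, and no log stream of its own here. The function lives inside whatever process handles the file arrival.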
The five domains
We threw it all away and asked a different question: what are the actual bounded contexts in fund data operations?
The answer was five:
- Ingest: Files come in, get parsed, get mapped to a canonical schema. Schema detection, field mapping, and initial validation all live here. One pipeline, one concern: turn messy external data into clean internal data.
- Resolve: Entity resolution, identifier graphs, cross-source matching. This is the "who is this fund?" domain. It has its own data model (the graph) but it's one logical unit.
- Store: The canonical data layer. Static reference data and time series in one domain with appropriate storage underneath. One API surface, not two services arguing over who owns what.
- Assure: Cross-source comparison, anomaly detection, validation rules, data quality scoring. Everything that answers "is this data correct?"
- Deliver: Outbound pipes, publication matrices, format transformation, scheduling. Everything that answers "where does this data go?"
Same separation, less pain
Each domain has clear boundaries. Ingest doesn't know about delivery. Resolve doesn't know about validation rules. The separation of concerns is the same as in the twelve-service model.
But operationally, five domains can run as modules within a smaller number of deployable units. In our case, three services handle all five domains. Ingest and Resolve are one deployment. Store and Assure are another. Deliver is its own thing because outbound scheduling has genuinely different scaling characteristics.
Three deployments. Three log streams. Three health checks. One person can hold the entire system in their head.
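The grouping can be written down as data and sanity-checked. This is a sketch under assumptions: the deployment names are made up for illustration, and `check_layout` is a hypothetical helper rather than part of our tooling.

```python
# Five domains, three deployable units, as described above.
# Deployment names are illustrative.
DOMAINS = {"ingest", "resolve", "store", "assure", "deliver"}

DEPLOYMENTS = {
    "ingest-resolve": ["ingest", "resolve"],
    "store-assure": ["store", "assure"],
    "deliver": ["deliver"],
}


def check_layout(deployments, domains):
    """Report domains that are placed twice, not placed at all, or not recognised."""
    placed = [d for members in deployments.values() for d in members]
    return {
        "duplicates": {d for d in placed if placed.count(d) > 1},
        "missing": domains - set(placed),
        "unknown": set(placed) - domains,
    }


problems = check_layout(DEPLOYMENTS, DOMAINS)
print(len(DEPLOYMENTS), "deployments;", "layout ok" if not any(problems.values()) else problems)
# prints: 3 deployments; layout ok
```

Keeping the mapping explicit makes regrouping cheap: moving a domain into another deployment is a one-line change to the table, and the domain boundaries stay where they were.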
The decision framework
If you're making this choice, here's the test we used:
- Does it need independent scaling? If two things always scale together, they're one deployment.
- Does it need independent release cycles? If you always deploy them together anyway, stop pretending they're separate.
- Does it have a genuinely separate data model? If two services share a database, they're one service wearing a trench coat.
- Can your team actually operate it? If you have more services than engineers, you've gone too far.
That last one is the real test. Architecture should serve the team you have, not the team you imagine hiring in three years.
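For readers who prefer a checklist they can run, the four questions can be sketched as a small function. Collecting every "no" as a reason to merge is a simplification for the example, and the `Candidate` fields and the sample answers are assumptions, not measurements from our system.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Answers to the first three questions for one would-be service."""
    name: str
    scales_independently: bool
    releases_independently: bool
    owns_data_model: bool


def reasons_to_merge(candidate, services_after_split, engineers):
    """Return the reasons this candidate should NOT be its own service.

    An empty list means it passes all four questions.
    """
    reasons = []
    if not candidate.scales_independently:
        reasons.append("scales together with its neighbours")
    if not candidate.releases_independently:
        reasons.append("always deployed together anyway")
    if not candidate.owns_data_model:
        reasons.append("shares a data model")
    if services_after_split > engineers:
        reasons.append("more services than engineers")
    return reasons


# Illustrative answers for schema detection in the twelve-service design.
schema_detection = Candidate("schema detection", False, False, False)
for reason in reasons_to_merge(schema_detection, services_after_split=12, engineers=3):
    print("-", reason)
```

The team-size check is deliberately part of the same function: a candidate that passes the first three questions can still fail if the split leaves you with more services than people to run them.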
Five domains gave us the intellectual clarity of microservices with the operational sanity of a well-structured monolith. We ship faster, debug faster, and sleep better. The whiteboard is less impressive. The system actually works.