The SFDR data challenge nobody talks about
The Sustainable Finance Disclosure Regulation asks a reasonable question: how sustainable is this fund? The answer requires data that most of the industry is still guessing at. Not because firms don't care, but because much of the data doesn't exist in reliable form.
SFDR classifies funds into Article 6, 8, or 9 based on their sustainability characteristics. Article 8 funds promote environmental or social characteristics. Article 9 funds have sustainable investment as their objective. The classification itself is fairly straightforward. The data needed to support it is anything but.
The data gap
To report under SFDR, you need granular ESG data at the holding level. Not just "is this company ESG-friendly?" but specific metrics: carbon intensity, percentage of revenue from fossil fuels, board gender diversity, involvement in controversial weapons. The European Supervisory Authorities defined 18 mandatory principal adverse impact (PAI) indicators: 14 for investee companies, 2 for sovereigns and supranationals, and 2 for real estate assets. Each one requires specific, quantitative data about the underlying investments.
Here's the problem: most companies don't report this data. Or they report it inconsistently. Or they report it in different formats, using different methodologies, with different reporting periods.
A 2025 study found that coverage for mandatory PAI indicators ranged from 30% to 85% depending on the indicator and the data provider. Carbon emissions data? Reasonably available for large-cap companies. Water usage data for mid-cap Asian companies? Good luck.
The estimation problem
When actual data isn't available, the industry does what it always does: it estimates. ESG data vendors use models to fill gaps. Company A in sector X with revenue Y probably has carbon intensity Z, based on peer companies that do report.
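The peer-based logic can be illustrated with a deliberately simple sketch. It assumes a sector median as the estimate, which is far cruder than what vendors actually run; the company records, the `min_peers` threshold, and the function name are all hypothetical. The point is that each value leaves the function labelled as reported, estimated, or missing.

```python
from statistics import median

# Hypothetical holdings: carbon intensity in tCO2e per EUR million revenue,
# None where the company does not report.
companies = [
    {"name": "A", "sector": "Utilities", "intensity": None},
    {"name": "B", "sector": "Utilities", "intensity": 410.0},
    {"name": "C", "sector": "Utilities", "intensity": 530.0},
    {"name": "D", "sector": "Utilities", "intensity": 470.0},
    {"name": "E", "sector": "Software", "intensity": None},
]


def fill_with_peer_median(rows, min_peers=3):
    """Return copies of rows with gaps filled by the sector median.

    Every output row carries a 'source' of 'reported', 'estimated' or
    'missing', so the distinction survives downstream.
    """
    # Collect reported values per sector.
    peers = {}
    for row in rows:
        if row["intensity"] is not None:
            peers.setdefault(row["sector"], []).append(row["intensity"])

    filled = []
    for row in rows:
        out = dict(row)
        if row["intensity"] is not None:
            out["source"] = "reported"
        elif len(peers.get(row["sector"], [])) >= min_peers:
            # Assumption: sector median stands in for a real vendor model.
            out["intensity"] = median(peers[row["sector"]])
            out["source"] = "estimated"
        else:
            # Too few reporting peers: leave the gap rather than guess.
            out["source"] = "missing"
        filled.append(out)
    return filled


result = fill_with_peer_median(companies)
```

Company A picks up the median of its three reporting peers and is tagged as estimated. Company E has no reporting peers, so it stays empty and is tagged as missing.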
This creates an uncomfortable situation. A significant portion of the ESG data used in SFDR reporting is estimated, not measured. The regulation requires disclosure of actual sustainability metrics, but the underlying data infrastructure can't deliver them for a large part of the investable universe.
We're building regulatory reporting on top of modelled data. Everyone knows this. Nobody wants to say it out loud in a compliance meeting.
The vendors are transparent about it in their methodologies. They distinguish between reported and estimated values. But by the time the data reaches a fund factsheet, that distinction is often lost. The end investor sees a clean number. They don't see the asterisk.
What platforms actually do
In practice, fund data platforms handling SFDR face a few concrete problems:
- Multiple ESG data sources disagree. Provider A says the fund's carbon intensity is 142 tCO2e per EUR million. Provider B says 167. The two use different scopes, different estimation models, and different reference dates. Which one goes in the report?
- Coverage varies by field. You might have 95% coverage for SFDR classification (Article 6/8/9) but only 60% coverage for the "share of investments in fossil fuel companies" PAI indicator. Partial data is harder to handle than no data.
- Taxonomy alignment is harder still. EU Taxonomy alignment requires even more granular data than SFDR: revenue breakdowns by economic activity, assessed against technical screening criteria. Coverage here is often below 40%.
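For the first of these problems, one possible approach is to keep every provider's figure and flag the disagreement for review rather than silently picking one. The sketch below assumes a 10% tolerance and hypothetical provider names. It measures the spread relative to the lowest figure, which is only one of several reasonable conventions.

```python
def compare_sources(values, tolerance=0.10):
    """Compare the same metric as supplied by several providers.

    values: dict of provider name -> figure (same unit assumed).
    Returns all figures, the spread relative to the lowest one, and
    whether that spread exceeds the tolerance. No winner is chosen.
    """
    low, high = min(values.values()), max(values.values())
    if low <= 0:
        raise ValueError("relative spread is undefined for non-positive values")
    relative_spread = (high - low) / low
    return {
        "values": dict(values),
        "relative_spread": round(relative_spread, 3),
        "needs_review": relative_spread > tolerance,
    }


# The figures from the example above: 142 vs 167 tCO2e per EUR million.
check = compare_sources({"provider_a": 142.0, "provider_b": 167.0})
```

With these two figures the spread is roughly 18% of the lower value, so the metric is flagged for review. Which figure ends up in the report remains a policy decision that code should record, not make.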
Handling incompleteness honestly
The temptation is to show whatever data you have and leave blank cells where you don't. This is technically correct but practically useless. A client looking at an SFDR pre-contractual template with 12 of the 18 PAI indicators populated can't tell whether the missing six are zero, unavailable, or not applicable.
What we've built is a completeness layer that sits on top of the raw ESG data. For every fund, it calculates:
- Coverage percentage per indicator (what proportion of holdings have reported data vs estimated vs missing)
- Data vintage (how old the underlying data is: last quarter, last year, or two years ago)
- Source attribution (which data vendor contributed which values)
- Estimation flag (explicitly marking which values are modelled, not reported)
This doesn't solve the data gap. Nothing will, until corporate sustainability reporting catches up with the regulation that depends on it. The Corporate Sustainability Reporting Directive (CSRD) should help, but full rollout won't be complete until 2028 at the earliest.
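As an illustration only, the four completeness measures listed above could be computed for a single indicator along these lines. The `HoldingValue` structure, its field names, and the choice to weight coverage by portfolio weight rather than by count of holdings are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class HoldingValue:
    weight: float                 # portfolio weight, 0..1
    value: Optional[float]        # indicator value, None if missing
    estimated: bool = False       # True if modelled rather than reported
    vendor: Optional[str] = None  # which data vendor supplied the value
    as_of: Optional[date] = None  # reference date of the underlying data


def completeness(holdings, today):
    """Summarise one indicator for one fund, weighted by portfolio weight."""
    total = sum(h.weight for h in holdings)
    present = [h for h in holdings if h.value is not None]
    reported = sum(h.weight for h in present if not h.estimated)
    estimated = sum(h.weight for h in present if h.estimated)
    dated = [h.as_of for h in present if h.as_of is not None]
    return {
        "reported_pct": round(100 * reported / total, 1),
        "estimated_pct": round(100 * estimated / total, 1),
        "missing_pct": round(100 * (total - reported - estimated) / total, 1),
        # Vintage: age in days of the oldest data point in use.
        "oldest_data_days": (today - min(dated)).days if dated else None,
        # Source attribution: vendors that contributed at least one value.
        "vendors": sorted({h.vendor for h in present if h.vendor}),
    }


holdings = [
    HoldingValue(0.5, 120.0, False, "vendor_a", date(2024, 12, 31)),
    HoldingValue(0.3, 200.0, True, "vendor_b", date(2023, 12, 31)),
    HoldingValue(0.2, None),
]
summary = completeness(holdings, today=date(2025, 6, 30))
```

For this toy fund, half the portfolio weight has reported data, 30% is estimated, and 20% is missing. That split is exactly the asterisk that tends to disappear by the time a number reaches a factsheet.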
The practical advice
Don't wait for perfect data. SFDR reporting is required now. Use the best available data, be transparent about its limitations, and build the infrastructure to improve as better data becomes available.
Track data quality over time. Coverage will improve. When it does, you want to know that your PAI indicator coverage went from 62% to 78% in Q3. That's a story worth telling to compliance and to clients.
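Tracking this can be as simple as storing a coverage snapshot per period and comparing consecutive ones. The sketch below assumes coverage is held as a percentage per indicator; the indicator keys and figures are invented for illustration.

```python
def coverage_changes(previous, current):
    """Return the per-indicator change in coverage, in percentage points.

    Indicators that appear in only one of the two snapshots are skipped.
    """
    return {
        indicator: round(current[indicator] - previous[indicator], 1)
        for indicator in current
        if indicator in previous
    }


# Hypothetical quarterly snapshots of coverage percentages.
q2 = {"ghg_emissions": 81.0, "fossil_fuel_share": 62.0}
q3 = {"ghg_emissions": 83.5, "fossil_fuel_share": 78.0}
changes = coverage_changes(q2, q3)
```

Here the fossil fuel indicator gains 16 percentage points between the two quarters, which is the kind of movement worth surfacing in a data quality report.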
Standardise your ESG data intake. The EET (European ESG Template) exists for a reason. It's not perfect, but it provides a common format for fund-level ESG data exchange. If your sources send data in EET format, your life gets measurably easier. If they don't, build the mapping once and reuse it.
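A build-once mapping might look something like the sketch below. The vendor names, source column names, and internal field names are all hypothetical, and none of them are actual EET field codes. The real template defines its own identifiers, which would replace the internal names used here.

```python
# Hypothetical mappings from each vendor's column names to one internal schema.
# These are illustrative names only, not EET field codes.
FIELD_MAPS = {
    "vendor_a": {"sfdr_art": "sfdr_article", "co2_int": "carbon_intensity"},
    "vendor_b": {
        "SFDRClassification": "sfdr_article",
        "CarbonIntensity": "carbon_intensity",
    },
}


def normalise(record, vendor):
    """Rename known fields to the internal schema and list unmapped ones."""
    mapping = FIELD_MAPS[vendor]
    mapped = {mapping[key]: value for key, value in record.items() if key in mapping}
    unmapped = sorted(key for key in record if key not in mapping)
    return mapped, unmapped


row, leftovers = normalise(
    {"sfdr_art": 8, "co2_int": 142.0, "notes": "n/a"}, "vendor_a"
)
```

Returning the unmapped fields alongside the normalised record makes gaps in the mapping visible instead of silently dropping data.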
The SFDR data challenge isn't a technology problem. It's a maturity problem. The regulation arrived before the data infrastructure was ready. Every platform is navigating the gap. The ones that do it honestly, by showing what they know, flagging what they don't, and improving coverage systematically, are the ones that will earn trust when the dust settles.