by Rajdip Ghosh and Katerina Papamihail

As sustainable investment strategies gain momentum among asset owners and managers, the calls for more reliable ESG research and ratings at scale gather pace. How can we address the imperfect data challenge?

In God we trust. All others must bring data.
W. Edwards Deming and others

With the growing global shift towards sustainable investment strategies, ESG data has improved in recent years, allowing investors greater visibility into how companies are performing from a non-financial perspective.

However, ESG data remains inconsistent across companies, sectors and markets; it is often not standardized and sometimes backward looking. As a result, investors are increasingly looking at how they can assess whether a business or strategy is truly sustainable. Differing frameworks across countries, together with the fact that some companies do not report at all, continue to make it difficult to accurately assess the climate risks faced by each firm.

And as sustainability issues can have a major influence on the risk and return potential of an investment, investors simply can’t afford to shy away from addressing this complex data challenge.

This is compounded by the relative youth of the problem set. In the US, the Securities Exchange Act of 1934 first required the publication and monitoring of financial reports 87 years ago. Bloomberg was started as Innovative Market Systems to distribute market and financial data in 1981 – 40 years ago. The traditional financial data ecosystem may have had many challenges, but it has had many decades to work out the kinks.

By comparison, the sustainable investing (SI) data and regulatory ecosystem is still in its infancy – or gestation, depending on where in the world you look. The Sustainable Finance Disclosure Regulation (SFDR) is the first SI data regulation that has the heft of – and is potentially as consequential as – the Securities Exchange Act of 1934, and waiting another 87 years for the ecosystem to mature to a point of ubiquity is not an option.

Missing ESG data

Gaps in ESG data, combined with inconsistent global reporting standards, create a ‘missing’ data problem when you look across the asset class universe. This becomes especially pronounced when you move outside of developed markets. For example, many Chinese companies lag their regional peers in terms of disclosure on company policies to tackle emissions. But amid growing market pressure, we are already seeing sustainability disclosure requirements improve reporting standards and ESG practices across Chinese companies.

In developed markets, however, the lack of ESG data for even some investment grade corporate bonds, as well as across both private and public companies, is under extreme scrutiny and is driving pressure from all corners to report accurate ESG data.

So, given the gaps in data, how can we build an ESG data model for so-called ‘poor reporters’ from other best-practice companies? This challenge is not unusual: investment decisions are typically made using imperfect data, which requires both making inferences and assessing the probabilities of certain outcomes.

3rd Party ESG Coverage by Credit Grade

Source: UBS AM, MSCI, Sustainalytics. As at October 2021

A look at the missing and available ESG data across investment grade and speculative grade.

3rd Party ESG Coverage by Market type

Source: UBS AM, MSCI, Sustainalytics. As at October 2021

A look at the missing and available ESG data across markets.

Can human intelligence add data depth?

Using ‘base rates’ as a mental model, which relies on how often an outcome has occurred in a comparable group rather than on exact calculations when making a future probability judgement, can help to ground financial forecasts and potentially prevent some of the biases that might exist. These base rates can be used alongside a market hierarchy.

For example, by taking the different data points reported by companies in the same sector, hierarchical mental models can be used as a starting point to fill the data gaps for a more effective forecast.
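
As a minimal sketch of this starting point, the Python snippet below fills a non-reporter's score with the average of its reporting sector peers. The company records, the 0-10 score scale and the function names are all hypothetical and chosen only for illustration; they are not taken from any vendor's methodology.

```python
from statistics import mean

# Hypothetical sample: ESG scores (0-10) for sector peers; None marks a non-reporter.
companies = [
    {"name": "A", "sector": "Utilities", "esg": 6.0},
    {"name": "B", "sector": "Utilities", "esg": 7.0},
    {"name": "C", "sector": "Utilities", "esg": None},
    {"name": "D", "sector": "Energy", "esg": 4.0},
    {"name": "E", "sector": "Energy", "esg": None},
]


def sector_base_rates(rows):
    """Average reported score per sector: the 'base rate' for that sector."""
    by_sector = {}
    for row in rows:
        if row["esg"] is not None:
            by_sector.setdefault(row["sector"], []).append(row["esg"])
    return {sector: mean(scores) for sector, scores in by_sector.items()}


def fill_with_base_rate(rows):
    """Return copies of rows where missing scores start from the sector base rate."""
    rates = sector_base_rates(rows)
    filled = []
    for row in rows:
        new_row = dict(row)
        new_row["estimated"] = new_row["esg"] is None
        if new_row["estimated"]:
            # Stays None when no peer in the sector reports at all.
            new_row["esg"] = rates.get(new_row["sector"])
        filled.append(new_row)
    return filled


filled = fill_with_base_rate(companies)
```

Each filled record carries an `estimated` flag so that inferred scores are never confused with reported ones further down the process.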



Hierarchical modeling - identifying patterns from data

We believe a hierarchical modeling approach can be applied to statistically fill the ESG data gaps with a reasonable degree of certainty. When applying this model to fixed income corporate bonds, for example, we can correlate credit ratings with ESG scores to fill those gaps.

This allows us to broaden the investment opportunity universe, particularly when thinking about sustainable investing strategies.
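
To illustrate how a relationship between credit ratings and ESG scores could be used, the sketch below fits a straight line through a handful of rated issuers and reads off an estimate for an unscored one. The rating-to-notch mapping, the sample scores and the function names are assumptions made for this example, and a simple least-squares line stands in for whatever model is actually used in practice.

```python
from math import sqrt

# Hypothetical ordinal mapping from rating band to notch (higher = better credit).
RATING_NOTCH = {"BB": 1, "BBB": 2, "A": 3, "AA": 4, "AAA": 5}

# Hypothetical rated issuers with known ESG scores (0-10 scale).
rated = [("BB", 3.0), ("BBB", 4.5), ("A", 4.5), ("AA", 6.0), ("AAA", 7.0)]


def fit(pairs):
    """Least-squares line of ESG score on rating notch, plus Pearson correlation."""
    xs = [RATING_NOTCH[rating] for rating, _ in pairs]
    ys = [score for _, score in pairs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    correlation = sxy / sqrt(sxx * syy)
    return slope, intercept, correlation


def predict_esg(rating, slope, intercept):
    """Estimated ESG score for an issuer known only by its credit rating."""
    return intercept + slope * RATING_NOTCH[rating]


slope, intercept, correlation = fit(rated)
estimate_for_a = predict_esg("A", slope, intercept)
```

A high correlation on real data would support using the rating as a predictor; a weak one would suggest the gap should be filled from other information instead.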

For example, just as a fundamental analyst fills in gaps using base rates, our intuition tells us that companies in the same credit rating band, sector and country should have similar ESG scores. We believe sustainability performance is sector relevant, as scoring ranks companies within a sector by weighting factors (or the absence of factors) based on emissions data reporting.

Meanwhile, with different regulations across countries, regional biases also exist. One example is US oil companies, which may actually have similar ESG ratings to technology companies in China, and any differences should be reflected in their credit rating and market pricing.

By identifying this varying distribution of ESG scores, we can build statistical models which take into account this intuition in a similar manner to how a fundamental analyst would approach it.
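
One simple way to encode this intuition is partial pooling: start from the average of all reporters, then refine the estimate at each level of the hierarchy (sector, then country, then rating band), trusting a group more the more reporters it contains. The sketch below is an illustrative approximation and not the model described in this article; the records, the `PRIOR_WEIGHT` pseudo-count and the function names are assumptions, and a full hierarchical Bayesian model would estimate the pooling strength from the data rather than fix it.

```python
from collections import defaultdict

# Hierarchy from broadest to most specific; each tuple names fields on a record.
LEVELS = [(), ("sector",), ("sector", "country"), ("sector", "country", "rating")]
PRIOR_WEIGHT = 2.0  # assumed pseudo-count: how strongly a group leans on its parent

# Hypothetical reporting issuers.
reporters = [
    {"sector": "Energy", "country": "US", "rating": "BBB", "esg": 4.0},
    {"sector": "Energy", "country": "US", "rating": "BBB", "esg": 5.0},
    {"sector": "Energy", "country": "US", "rating": "A", "esg": 6.0},
    {"sector": "Tech", "country": "CN", "rating": "A", "esg": 5.0},
]


def group_stats(rows, fields):
    """Sum and count of reported scores for each group defined by `fields`."""
    stats = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = tuple(row[f] for f in fields)
        stats[key][0] += row["esg"]
        stats[key][1] += 1
    return stats


def estimate(rows, target):
    """Shrinkage estimate for `target`, refined level by level down the hierarchy."""
    if not rows:
        raise ValueError("need at least one reporting issuer")
    total, count = group_stats(rows, LEVELS[0])[()]
    est = total / count  # global mean of all reporters
    for fields in LEVELS[1:]:
        key = tuple(target[f] for f in fields)
        total, count = group_stats(rows, fields).get(key, (0.0, 0))
        # Blend the group's own data with the estimate inherited from its parent.
        est = (total + PRIOR_WEIGHT * est) / (count + PRIOR_WEIGHT)
    return est


peer_estimate = estimate(reporters, {"sector": "Energy", "country": "US", "rating": "BBB"})
no_peer_estimate = estimate(reporters, {"sector": "Utilities", "country": "DE", "rating": "AA"})
```

An issuer with close peers is pulled towards their scores, while one with no peers at any level simply inherits the global average.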

Can data influence decision making?

There is also strong behavioral and physiological evidence that the human brain both represents probability distributions and performs probabilistic inference (Fiser, 2001; Pouget, 2013). However, it wasn’t until the 17th century that games of chance started to entice mathematicians like B. Pascal and P. Fermat to create a theory that predicts the odds of a player winning.

Although the result of a game could not be guaranteed, mathematics suggesting that a certain move might give a player an 80% chance of winning was greatly welcomed in gaming circles. This illustrates that an answer accompanied by a stated level of uncertainty can still be acceptable.

What role does Bayesian inference play in ESG ratings?

The Bayesian method of statistical inference is also something investors can use today. Suppose ESG ratings can be either good or bad, and issuers come from either green industries (low carbon-emitting) or brown industries (high carbon-emitting). If investors are given an unrated bond from a brown industry and asked to rate it good or bad, they can use what is known about rated issuers to calculate how ratings are distributed across the two industries, also referred to as the joint probability.
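
The green/brown example can be worked through with a small table of counts. In the sketch below, the counts of rated issuers are invented for illustration; the joint probability of each industry and rating pair is the share of all rated issuers in that cell, and the probability that a brown-industry bond is good follows from dividing the joint probability by the probability of being brown.

```python
# Hypothetical counts of rated issuers by (industry, rating).
counts = {
    ("green", "good"): 30,
    ("green", "bad"): 10,
    ("brown", "good"): 15,
    ("brown", "bad"): 45,
}

total = sum(counts.values())

# Joint probability P(industry, rating): share of all rated issuers in each cell.
joint = {cell: n / total for cell, n in counts.items()}


def marginal_industry(industry):
    """P(industry): sum of the joint probabilities over both ratings."""
    return sum(p for (ind, _), p in joint.items() if ind == industry)


def prob_rating_given_industry(rating, industry):
    """P(rating | industry) = P(industry, rating) / P(industry)."""
    return joint[(industry, rating)] / marginal_industry(industry)


p_good_given_brown = prob_rating_given_industry("good", "brown")
p_good_given_green = prob_rating_given_industry("good", "green")
```

With these made-up counts, an unrated brown-industry bond would be given a 25% chance of a good rating, against 75% for a green-industry bond, before any issuer-specific information is considered.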

Statistical theory can help close the data gap

To make more progress in the sustainability journey of investors, it is clear that companies will need to take steps to increase the robustness of their ESG data. However, until better measurement is available, innovative solutions that use the power of statistics to fill in those gaps will be vitally important for portfolio managers to make sustainable choices and, by doing so, seek to maximize the positive impact of investors’ portfolios.



About the author
  • Katerina Papamihail

    Data Scientist, Quantitative Evidence and Data Science Team
