ESG Ratings: valuable or untrustworthy?


In recent years, sustainability reporting has become increasingly important for companies. The number of companies reporting ESG information rose from 20 in the 1990s to 9,000 in 2016 (Amir & Serafeim, 2017). Investors want this information to be easily accessible so they can use it in their decisions. To meet that demand, several raters have started issuing “ESG scores”. These scores summarize the ESG report in a single number that indicates how well a company is performing on its ESG goals. Examples of raters include KLD, Sustainalytics, Moody’s ESG, S&P Global, Refinitiv and MSCI. All are prominent and reliable raters, so you would expect them to reach roughly the same conclusion when they assign an ESG score to a company. Yet research shows that scores can vary greatly from one rater to the next. In this article, I look at why raters disagree and whether the ESG scores they assign still have value.

Berg et al.

Research by Berg et al. (2022) [1] shows that the pairwise correlations between the six raters mentioned above range from 0.38 to 0.71, meaning that although the ratings tend to move in the same direction, they are far from identical. This makes it difficult for investors to place value on a particular rating, and it is not clear to companies which aspects they need to improve in order to achieve a high score. The researchers identify three factors that lead to these differences. In short, they boil down to what raters measure, how they measure it, and how heavily they weight it. What raters measure is something they can decide for themselves. Where one rater includes corruption in its assessment, another chooses to assess working conditions. Next, how they measure a given topic may also differ. For example, one rater may rate working conditions based on employee satisfaction scores, while another uses the number of lawsuits filed by employees against the employer. Finally, how heavily each factor counts is also up to the rater. If two raters both rate corruption and working conditions, one may give greater weight to corruption while the other considers working conditions more important. As a result, the rating may differ even if the raters include exactly the same variables. The researchers count 709 indicators in 64 categories. From this stack, raters choose for themselves which indicators to use and how much weight to give each one, so it is natural for differences to arise.
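The effect of weighting alone can be illustrated with a small Python sketch. Everything in it is made up for illustration: the company names, the indicator scores and the weights are hypothetical and do not come from Berg et al. or from any real rater. It only shows that two raters who use the same indicators and the same measurements can still rank companies in opposite order.

```python
# Hypothetical indicator scores (0-100) for two made-up companies.
indicators = {
    "CompanyA": {"corruption": 90.0, "working_conditions": 40.0},
    "CompanyB": {"corruption": 50.0, "working_conditions": 80.0},
}

# Hypothetical weights: both raters use the same two indicators,
# but rater_1 stresses corruption and rater_2 stresses working conditions.
weights = {
    "rater_1": {"corruption": 0.8, "working_conditions": 0.2},
    "rater_2": {"corruption": 0.2, "working_conditions": 0.8},
}


def esg_score(company_scores, rater_weights):
    """Weighted average of a company's indicator scores under one rater's weights."""
    total_weight = sum(rater_weights.values())
    weighted_sum = sum(company_scores[name] * w for name, w in rater_weights.items())
    return weighted_sum / total_weight


# scores[rater][company] holds the overall score that rater assigns.
scores = {
    rater: {company: esg_score(values, w) for company, values in indicators.items()}
    for rater, w in weights.items()
}

for rater, by_company in scores.items():
    best = max(by_company, key=by_company.get)
    print(rater, by_company, "-> ranks", best, "first")
```

With these invented numbers, the first rater puts CompanyA ahead and the second puts CompanyB ahead, even though both look at identical underlying data.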

Research Affiliates

The differences that arise can be large. A study by Research Affiliates [2] offers a good example: Wells Fargo. This American bank is placed in the lowest 5% on Governance by one rater, while another rater places it in the highest 25%. The reason is that one rater files a scandal under the Governance pillar, while the other thinks it belongs under the Social pillar. The scandal involved the widespread opening of new accounts and credit cards for customers without their consent. Customers were nevertheless charged the fees that came with these accounts and credit cards.


These results stand in stark contrast to ratings for non-ESG attributes such as long-term debt. Raters S&P, Moody’s and Fitch show correlations between 0.94 and 0.96 in this area, indicating that their ratings are very similar. The reason? Far more requirements and guidelines are attached to ratings of financial information than to ESG ratings. Moreover, the numbers and amounts involved are often easy to measure. It is easier to give an unambiguous answer about the total amount of outstanding debt than about how satisfied employees are.
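For readers who want to see what such a correlation figure means in practice, here is a minimal Python sketch that computes pairwise Pearson correlations between raters. The rater names and scores are invented for illustration, and Pearson correlation is an assumption on my part: the sources cited here may use a different correlation measure or a different normalization of the scores.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores: each list rates the same five companies, in the same order.
ratings = {
    "rater_1": [80, 56, 65, 30, 72],
    "rater_2": [50, 74, 60, 35, 55],
    "rater_3": [78, 60, 70, 28, 69],
}

# One correlation per pair of raters.
pairwise = {
    (a, b): pearson(ratings[a], ratings[b])
    for a, b in combinations(ratings, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: {r:.2f}")
```

A value close to 1 means two raters order and space the companies almost identically, as with the credit ratings above. A value around 0.4 means they agree on the general direction but often disagree on individual companies.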

Investors

The fact that ESG ratings vary so much has implications for investors [3]: they cannot pick a rating at random and decide whether or not to invest based on it. The investor must do additional research into how the rater calculates its scores and whether the factors the rater uses match the factors the investor considers important. If so, the investor can use these scores. The differences themselves can also give the investor information. In fact, research by Gibson and Krueger [4] shows that the greater the difference in ratings, the higher the returns for that company.
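One simple way an investor could quantify how much raters disagree about a company is the standard deviation of its scores across raters. The Python sketch below does this for made-up data. The company names and scores are hypothetical, it assumes all raters score on the same 0-100 scale, and it is not necessarily the disagreement measure used in [4].

```python
from statistics import pstdev

# Hypothetical scores that three raters assign to each company (same 0-100 scale).
ratings_by_company = {
    "CompanyA": [80, 50, 78],
    "CompanyB": [56, 74, 60],
    "CompanyC": [65, 60, 70],
}

# Disagreement per company: population standard deviation across raters.
disagreement = {
    company: pstdev(scores) for company, scores in ratings_by_company.items()
}

# The company the raters disagree about most.
most_disputed = max(disagreement, key=disagreement.get)

for company, spread in sorted(disagreement.items(), key=lambda item: -item[1]):
    print(f"{company}: {spread:.1f}")
print("Most disputed:", most_disputed)
```

In practice raters use different scales, so scores would first have to be converted to something comparable, for example percentile ranks within each rater’s universe.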

Future

To make ratings more comparable, some agreements will have to be made about the what, the how and the how heavily. Accenture [5] lists five recommendations to bring ratings closer together. For example, raters could be more transparent about how they arrive at their ratings. Regulators could also specify precisely which factors fall under E, S and G, so that all raters place the same factors under the same heading. There is also a task for the companies that publish their ESG information: standardization. This means that the data they provide in their ESG report is consistent and meets the global guidelines prescribed by the International Sustainability Standards Board. In this way, all companies provide the same type of information to raters, reducing the chance that raters will interpret it in different ways.
