> "Greenpixie said they have the data (AHA!!) And their data is verified (ISO-14064 & aligned with the Greenhouse Gas Protocol)."
What does this say about accuracy, and I guess ultimately the impact of the emissions?
Whenever I have tried to find a meaningful measurement of the environmental impact of power use, I have gotten into a quagmire of statistics talking past each other, with arbitrary mixing of units and definitions (like energy/power/electricity being defined differently but used interchangeably, or water usage being lumped together regardless of whether it is potable or drawn from an area of scarcity).
The end result has to be what harm is caused, because harmless use of something at any magnitude is still harmless.
How do you figure out what that level is with any degree of accuracy? It's a difficult problem, but it seems that easier answers are not likely to be useful if they are not accurate.
There are thousands of ways to calculate carbon that are all valid, that’s why a similar usage amount in AWS and Azure will give you wildly different numbers. We prioritise consistency, coverage, and transparency. If the users understand where the numbers come from, and we are applying the same data science across all clouds, then you have comparable numbers. We get our numbers audited by third parties regularly to ensure robustness and credibility, but an accurate number for your entire AWS environment isn’t useful if you are just trying to calculate the difference between an AMD instance family and a Graviton instance family. This is where we focus our methodology and why it works inside of Infracost.
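To make the "comparison vs. absolute total" point concrete, here is a minimal sketch. It assumes a simple energy-based estimate (power draw × hours × data-centre overhead × grid carbon intensity); this is not Greenpixie's methodology, and every number and function name below is hypothetical.

```python
def estimate_kg_co2e(avg_watts, hours, grid_kg_per_kwh, pue):
    """Energy-based estimate: kWh drawn, scaled by overhead (PUE) and grid intensity."""
    kwh = avg_watts * hours / 1000.0
    return kwh * pue * grid_kg_per_kwh


def relative_saving(watts_a, watts_b, hours, grid_kg_per_kwh, pue):
    """Fraction of estimated emissions saved by moving from option A to option B."""
    a = estimate_kg_co2e(watts_a, hours, grid_kg_per_kwh, pue)
    b = estimate_kg_co2e(watts_b, hours, grid_kg_per_kwh, pue)
    return (a - b) / a


# Made-up power draws for two instance families over one month of runtime.
HOURS = 720
low = relative_saving(100, 60, HOURS, grid_kg_per_kwh=0.2, pue=1.1)
high = relative_saving(100, 60, HOURS, grid_kg_per_kwh=0.5, pue=1.6)

# The absolute totals differ a lot between the two sets of assumptions,
# but the relative saving is identical because the shared factors cancel.
print(f"saving under low assumptions:  {low:.1%}")
print(f"saving under high assumptions: {high:.1%}")
```

The cancellation only holds for factors that both options share. If the per-family power estimates are off in different directions, the comparison inherits that error.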
A big focus now is applying this same level of rigor to different AI models and their impact. Batching, caching, model size and manufacturer are the choices engineers are making now. We want to ensure that the choices being made are cost and carbon efficient.
Curious to know what decision you're making at the moment that's triggered you looking into your own methodology?
That's a great question - it is a hard thing to build for sure. We started talking to the CTO office of Google about it, and exactly as you say, it gets into the details. The folks at Greenpixie have been doing a lot of research on this, so when we spoke to a few of their big customers (like Mastercard), they told us about the process they went through to evaluate the data, and said they trusted it for their ESG initiatives too.
Let me ask one of the Greenpixie folks to jump in here, maybe they can explain how they do it!
Lerc - they are in the UK, so some of them are offline, but I texted their CEO :)
Check this out: https://greenpixie.com/gpx-data Thoughts?
This reminds me of calorie tracking: you cannot perfectly capture the number of calories or macronutrients, but measuring does seem to help people lose weight. There are probably many loopholes where eating large amounts of certain foods, each with a certain margin of error, can lead to wildly incorrect estimates.
I wonder how much this analogy applies to carbon tracking? Does using a wide variety of foods help make the tracking more accurate because no single bad estimate becomes overrepresented? Can a similar approach be taken via a wide variety of cloud technologies being used?
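One way to poke at this question is a toy simulation. It is a rough sketch under strong assumptions, not a claim about real carbon data: each item's estimate is off by an independent random error plus an optional bias shared by all items, and all the ranges are invented.

```python
import random


def mean_relative_error(n_items, noise=0.3, bias=0.0, trials=1000, seed=0):
    """Average relative error of an estimated total built from n_items estimates."""
    rng = random.Random(seed)
    total_error = 0.0
    for _ in range(trials):
        true_total = 0.0
        est_total = 0.0
        for _ in range(n_items):
            true_value = rng.uniform(50, 500)  # hypothetical true footprint
            # Each estimate is off by a shared bias plus an independent error.
            factor = 1.0 + bias + rng.uniform(-noise, noise)
            true_total += true_value
            est_total += true_value * factor
        total_error += abs(est_total - true_total) / true_total
    return total_error / trials


for n in (2, 10, 50):
    independent = mean_relative_error(n)
    with_bias = mean_relative_error(n, bias=0.2)
    print(f"{n:>3} items: independent-only {independent:.3f}, with shared bias {with_bias:.3f}")
```

In this toy model, variety helps with the independent errors, since they tend to average out as more items are added. It does nothing for an error shared by every estimate, which is roughly the case where the same flawed assumption is applied across all foods or all cloud services.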