The Fallacy of Metrics - From GPA to GDP

Before every semester starts, without fail, you would find me extremely excited with anticipation of more things to learn about. Especially if it’s a difficult class. The more difficult, the more I know I’ll be challenged and be able to grow. Check with me again a month into the semester, and you’ll find me just waiting for the semester to end.

Why? It’s not like my preferences changed in between. I took the class because I was interested. What changed was the goal. In my mind, it went from an opportunity to maximize learning to the requirement of maximizing grades. Along the way, learning became a by-product, not the priority, and with it, my interest dove down.

In the fields of machine learning and statistics, metrics are used everywhere. They are used to evaluate the effectiveness of the model, and more importantly, used when actually training (or adjusting) the model so that it can do better at predicting.

In the real world, we have metrics by which we track all aspects of our own lives, as well as the global economy. We track how many steps we’re taking, we keep track of our budgets and net income. The government might look at metrics for the health of the economy, such as GDP and unemployment rates.

If you look closely, we’re using metrics to achieve an underlying goal, whether it’s financial independence, sustainable economy, a healthier body, etc. Unfortunately, we tend to start using the metric as the goal instead.

When a measure becomes a target, it ceases to be a good measure
‘Improving ratings’: audit in the British University system

Misunderstanding and Misuse of Metrics

We love numbers. Even the people who hate maths can’t get away from measuring things. If you have any productivity or habit-building books, the general consensus is that you need to keep track of your activities and progress.

If you have ever tried to get more into fitness, you’ve undoubtedly heard that nutrition and calorie counting are big. Almost every fitness influencer you see on YouTube and Instagram is probably measuring their calories.

Metrics by themselves aren’t bad. It’s when we fail to use them effectively where things get out of hand. Like social dissent level of bad in certain situations. When they transition from just acting as proxies for our goals to actually becoming our goal.

Take policing in the US as an example, particularly in NYC. Police in general, are supposed to provide a sense of security and uphold the law. Well, what happens when we try to measure this? Great metrics for measuring police effectiveness would be

Community satisfaction
Sense of safety
Trust in the authorities

Unfortunately, it’s extremely hard to measure those things. What is easy to measure is the number of tickets and arrests (the quota system). If we choose to use something like the quota system as a proxy for police effectiveness, police are undoubtedly going to optimize for that.

It doesn’t matter if the underlying goal is to improve trust and safety. Police have a metric. They need to hit that metric to then retain funding.

McNamara Fallacy

Coined by sociologist Daniel Yankelovitch, the term is based on former US Secretary of Defense, Robert McNamara, after the employment of enemy body counts as taken to be a precise and objective measure of success in the Vietnam War. This just goes to show how even the highest echelons of government perform so poorly when using simple metrics to define a complex situation.

The fallacy presents itself in a few stages, each more pressing than the previous.

📏Measure what can easily be measured — Usually what we start with and tends to be okay.
💣Disregard that which cannot easily be measured — Starting to get into dangerous territory.
🙈Presume that which cannot be measured easily is not important — Ignorance.
💀Presume that which cannot be measured easily does not exist — WTF are you actually doing.

So what domains do we need to worry about? Pretty much every domain where metrics are used.

Economic Health 📈

We don’t have to go very far to see how metric fallacy can lull us into a false sense of security. When the Coivd pandemic hit, financial markets around the world tumbled. And then, in a matter of months, they were reaching heights never seen before. People invested in the stock market were growing rich and the housing market (in the US) was booming. But to say that the national or global economic health was also improving is BS, to say the least.

Shops closing even when the market was booming — Photo by the BlowUp on Unsplash

Fitness and the Gym 🤸🏾‍♀️

I frequently find first-time gym-goers to be amongst the worst at recognizing the distinction between optimizing for the metric and optimizing for the goal. I say this because I’ve made this mistake and still make this mistake. True, we all have different goals, and for some people, getting the maximum number of repetitions out of an exercise is the metric they use to track progress. But there is this switch. It happens unbeknownst to the person. They disregard other qualitative and quantitative metrics (e.g. feeling of well-being) and focus on maximizing just the one, even if they end up performing it in terrible form.

Artificial Intelligence and Ethics 💻

I mentioned before how machine learning, a sub-field of AI, uses metrics. Well, it’s no different either when it comes to incorrect metric usage. One common metric is accuracy, i.e. out of 100 different images presented to the model, how many can it correctly determine are images of cats.

Now, let’s consider the case where AI is being used to determine who is at a higher risk of criminal activity. By letting the model-accuracy be the benchmark for accurate criminal activity prediction, it’s very likely that the data will bias the results.

A guy who has molested a small child every day for a year could still come out as a low risk because he probably has a job. Meanwhile, a drunk guy will look high risk because he’s homeless.
Machine Bias Risk Assessment in Criminal Sentencing

It would be extremely unethical to only consider the accuracy metric as a performance indicator for the AI.

Competitive Admission for Medical School and Jobs 💯

At some point or another, you have most likely submitted an application for a job, or for a school. In that application, some of the most prominent features include your GPA and standardized test scores (SAT, GRE, USMLE).

These scores are everywhere and for good reason. People and organizations like to operate efficiently. If you have 20,000 applications for 100 spots, it would be amazing to filter out many of those applicants based on single numerical thresholds. GPA less than 4.0? Trash! And you’re quickly down to 300 applications.

But are those indicators good predictors of future success? Some exams, such as STEP 2 for medical students, are considered to be good indicators, whereas, in my field as a software engineer, I firmly believe GPA doesn’t tell you anything beyond a certain point. And even then, it never ever gives you the whole story.

Determination, interests, communication skills, are all much more difficult to measure but represent holistic indicators for future success.

I just listed a few instances, but you can see how these fallacies would be prevalent without us even realizing it. And then we build systems on top of it, where they go from measurement mechanisms to goals to hit.

Being able to distinguish metrics and goals is crucial, not just for our own happiness, but for the sake of effective improvements in society.

The Fallacy of Metrics – From GPA to GDP

Misunderstanding and Misuse of Metrics

McNamara Fallacy

Economic Health 📈

Fitness and the Gym 🤸🏾‍♀️

Artificial Intelligence and Ethics 💻

Competitive Admission for Medical School and Jobs 💯

Subscribe to learn faster! 📬

Subscribe to learn faster! 📬

The Fallacy of Metrics – From GPA to GDP

Misunderstanding and Misuse of Metrics

McNamara Fallacy

Economic Health 📈

Fitness and the Gym 🤸🏾‍♀️

Artificial Intelligence and Ethics 💻

Competitive Admission for Medical School and Jobs 💯

Share this:

Subscribe to learn faster! 📬

Subscribe to learn faster! 📬