Interactive Essay · Laws & Theorems

When Your Dashboards Deceive

Goodhart's Law

"When a measure becomes a target, it ceases to be a good measure."

The Law

Goodhart's Law

In 1975, British economist Charles Goodhart made an observation that has haunted boardrooms ever since. Working on monetary policy at the Bank of England, he noticed that whenever the government targeted a specific economic indicator, the indicator stopped working as a reliable signal. People changed their behaviour to hit the number, and the number stopped meaning what it used to.

"Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." - Charles Goodhart, 1975

Although this may look like a piece of economic trivia, it is actually a fundamental law of human incentive systems that plays out every day inside companies that live and die by their KPI dashboards.

It sounds almost paradoxical. You want to improve customer satisfaction, so you start tracking NPS scores. You incentivise teams around NPS. Scores go up. But satisfaction itself? It may not have moved at all. Why? Because your front-line staff learned to ask customers for high scores rather than earn them. The metric has become a goal in itself, rather than a measure of progress towards a larger objective.

This is Goodhart's Law in its purest form. And it plays out with stunning regularity across almost every industry, department, and well-intentioned KPI dashboard ever built.



Related Principles

Cousins of the Law

Goodhart's Law has several structural cousins.

Donald Campbell, a psychologist, reached a similar conclusion independently. His adage is known today as Campbell's Law: "The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor."

Robert Lucas Jr., a Nobel-winning economist, framed the same problem in macroeconomics. The Lucas Critique states that relationships that are observed in historical data will not necessarily hold once policies change.

Then there's the Cobra Effect, named for an episode in British colonial India: a bounty on dead cobras led enterprising locals to breed cobras to collect the reward. When the government caught on and scrapped the bounty, the breeders released their now-worthless cobras, leaving the cobra problem worse than before.

All these concepts share the same underlying truth: metrics reshape behavior. Or, in other words, you can't treat observed relationships as fixed if people are aware of them and have incentives to respond.



Case Studies

Times When the Metrics Ate the Mission

History is not short of examples of the law. Each has the following in common: a sensible-sounding metric, a powerful incentive, and a system that learned to game both. Click any card to read the full story.


Banking · 2016

The Wells Fargo Cross-Selling Scandal

Aggressive targets for accounts per customer turned a bank's incentive structure into a fraud machine.

In the early 2010s, Wells Fargo's retail banking division set aggressive sales targets for branch staff. Tied to bonuses and job security, these metrics, focused on the number of products per customer, became everything. Employees opened approximately 3.5 million fake accounts using forged signatures and unauthorised transfers. Regulators' investigations described systematic gaming practices across thousands of branches.

The bank hit its cross-sell KPI. It looked like growth. But it destroyed trust, triggered billions in fines, and cost the CEO his job. The metric was hit. The actual mission, genuine customer loyalty, was obliterated.

Finance · 2008

The 2008 Subprime Mortgage Crisis

Mortgage officers paid per loan closed had every incentive to write risky ones.

In the run-up to the financial crisis, many mortgage originators were compensated based on loan volume, often through commissions tied to originations. Research shows that such incentives increased loan approvals, loan sizes, and default rates, as officers exercised greater discretion and approved marginal borrowers.

While these systems boosted volume, they often failed to align incentives with long-term loan performance, contributing to a deterioration in lending standards.

Public Sector · UK

G4S and the Fake 999 Calls

A policing contract's response-time KPI was met by making hundreds of fake emergency calls.

The incident involving G4S and Lincolnshire Police is a clear example of how performance metrics can distort behaviour when they become targets. Call handlers were measured against a specific KPI: answering a high percentage of emergency calls within 10 seconds. Because financial penalties were tied to missing this threshold, the focus shifted from the real goal, responding effectively to genuine emergencies, to simply achieving the numerical target. To meet the metric, staff made hundreds of fake 999 calls during quiet periods, which were answered instantly and artificially boosted performance figures.

On paper, targets were met; in reality, operational effectiveness was not.

Education · Rankings

College Rankings Gaming

US News & World Report's selectivity metric turned admissions offices into rejection machines.

US News & World Report college rankings use selectivity, typically measured by acceptance rate (the percentage of applicants admitted), as a quality signal. The logic is that competitive schools attract stronger students. Once schools recognised that a lower acceptance rate could improve their perceived standing, they had an incentive to manipulate the metric.

Some colleges increased application numbers through widespread outreach aimed at students unlikely to be admitted, purely to generate more rejections. As a result, the metric became less informative about what it was intended to measure: academic quality.

The Connection

From Economic Law to Your Daily Dashboards

You might be thinking: "This is all interesting, but I'm not running a bank or a police department. I'm just trying to build a dashboard for my team." And that's exactly the point. Dashboards are essentially compression machines. They take the messy, sprawling reality of a business and squeeze it into a handful of numbers on a screen. While this compression is genuinely useful, it's also where the trouble starts.

When you put a metric on a dashboard, you are effectively saying: "This is what we care about. This is how we will measure success." That creates an incentive for people to hit those numbers, even if it means gaming the system: people optimise for the visible metric. Not necessarily dishonest optimisation, just the entirely rational human instinct to look good on the thing you're being judged by.

At this point, the dashboard stops merely reflecting reality and starts shaping it. Teams reorganise their work around what gets tracked. Managers move the number rather than the underlying thing it is meant to measure. The dashboard becomes a target, and the metrics become less and less meaningful as a result.

Dashboards are especially vulnerable to this because numbers feel objective. A sales target showing 98% attainment looks like success. What it doesn't show is whether those deals were discounted into unprofitability, or closed with customers who will churn in 90 days. The green indicator triggers a feeling of safety before any critical thinking kicks in.



Interactive

The Goodhart Simulator

Watch the law in action. This is a fictional example where you are a call centre manager. Press "Apply Volume Bonus" to attach an incentive to the call-volume metric. Watch what happens to everything else.

Phase 1 — Baseline (No Incentive)

Agents balance speed and quality naturally. No bonus tied to volume.

Calls Handled / Day: 48 (per-agent average)
First-Contact Resolution: 74% (issues resolved on first call)
Customer Satisfaction: 82% (post-call survey score)
Repeat Call Rate: 18% (customers who call back)
Status: Balanced baseline state. Agents are productive, customers reasonably satisfied, repeat contacts manageable.
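The dynamic the simulator illustrates can be sketched as a toy model. This is a minimal, invented sketch, not the essay's actual simulator code: all coefficients are made up and chosen only so that the no-bonus case reproduces the baseline figures above. The single assumption is that a volume bonus shifts agents' effort towards raw call count and away from quality.

```python
def simulate(volume_bonus: bool) -> dict:
    """Toy call-centre model: agents split effort between volume and quality.

    All coefficients are invented for illustration; they are tuned so the
    no-bonus case matches the essay's baseline figures, nothing more.
    """
    # Share of effort agents put into raw call volume.
    # The 0.85 "bonus" value is an assumption about how incentives shift effort.
    effort_on_volume = 0.85 if volume_bonus else 0.5

    calls_per_day = round(30 + 36 * effort_on_volume)   # more effort -> more calls
    fcr_pct = round(100 * (0.40 + 0.68 * (1 - effort_on_volume)))   # quality falls
    repeat_call_pct = 100 - fcr_pct - 8                 # callbacks rise as FCR falls
    csat_pct = round(100 * (0.55 + 0.54 * (1 - effort_on_volume)))

    return {
        "calls_per_day": calls_per_day,
        "fcr_pct": fcr_pct,
        "repeat_call_pct": repeat_call_pct,
        "csat_pct": csat_pct,
    }
```

With no bonus this returns the baseline above (48 calls, 74% FCR, 82% CSAT, 18% repeats). Attach the bonus and calls handled climbs while every quality metric sags: the targeted number improves precisely as the mission deteriorates.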


The Playbook

Rules to Keep Metrics Honest

So, what do we do? Abandon metrics altogether? Hope that people won't game the system? Of course not. What we can do is design measurements that are harder to game and that capture actual outcomes rather than proxies. This section covers six principles that make the difference. They won't make your measurement system perfect, but they will at least keep it from actively misleading you.

🎯 Name the Underlying Construct First Before you pick any metric, decide what you actually want to achieve. Not "NPS score". Not "completion rate". The actual underlying objective: customer loyalty, deep product knowledge, genuine satisfaction. If you can't articulate what you really want, any metric you pick will drift, and you won't notice until it's too late. The real target should always be explicit and separate from the proxy.
⚖️ Never Govern by a Single Number One metric is one attack surface. For every target metric, pair it with at least one counter-metric (something that should hold steady or improve alongside it). Call volume rises? First-contact resolution had better hold. NPS goes up? Check the detractor rate. A single number is easy to game. Two that pull in opposite directions are much harder.
🔬 Separate Learning Metrics from Reward Metrics Tie a metric to a bonus and people will optimise it. That's human nature; it's how incentives work. The problem starts when you ask the same metric to do two jobs, rewarding and learning, at once. It can't do both. Once a metric is attached to a performance review, people stop using it to understand reality and start using it to manage perception. Keep some KPIs consequence-free, purely for learning.
🧪 Goodhart-Test Every Metric Before Launch Before any new KPI goes live with an incentive attached, get your team in a room and ask one question: how would a rational person game this? Not a dishonest person, a rational one. Because that's who you're dealing with. If your team can break the metric in ten minutes, you can be certain the people being measured will break it too. Find the flaw before they do.
🗣️ Combine Quantitative with Qualitative Numbers tell you what changed. Yes, important. But qualitative methods (say, interviews, audits, observations, etc.) tell you why and whether it matters. Beware McNamara's Fallacy, treating only the measurable as meaningful. Some of the most important things happening in your organisation right now cannot be counted. Build qualitative checkpoints into your process. And if the numbers look fine but the stories from the floor are alarming, trust the stories. Every time.
📅 Set Expiry Dates on Every KPI Remember that metric you have been tracking since 2018? The one everyone knows by heart, the one that shows up in every board report like clockwork? It is time to question it. User behaviour shifts, competitive landscapes change, technology alters what's possible. A metric that was a perfect proxy five years ago might be actively lying to you now. Schedule mandatory reviews. Make each KPI prove it still correlates with what you actually care about. Retire the ones that can't. Set an expiry date, or the measure will quietly expire on its own.
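The counter-metric rule above can be turned into an automated guardrail. A minimal sketch, with invented thresholds: the function name, the 2% tolerance band, and the idea of expressing movements as fractional period-over-period deltas are all illustrative choices, not a standard.

```python
def goodhart_alert(target_delta: float,
                   counter_deltas: list[float],
                   tolerance: float = 0.02) -> bool:
    """Flag a possible Goodhart effect: the target metric improved while
    at least one paired counter-metric degraded beyond the tolerance band.

    Deltas are fractional period-over-period changes, e.g. +0.05 for +5%.
    The 2% default tolerance is an arbitrary illustrative choice.
    """
    target_improved = target_delta > 0
    counter_degraded = any(d < -tolerance for d in counter_deltas)
    return target_improved and counter_degraded

# Call volume up 12% while first-contact resolution fell 9%: investigate.
# NPS up 4% with the detractor rate steady: no alarm.
```

The design point is the pairing itself: the alert never fires on the target metric alone, only on divergence between the target and the metrics that should move with it.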


Template

Before You Ship Your Metric...

Run every new KPI through this template before it goes live. I have thrown in a few example metrics.

Metric / KPI | Underlying Goal | Gaming Risk | Counter-Metric | Incentive? | Review Cadence
NPS Score | Customer satisfaction | Staff coaching customers to rate high | % detractors; follow-up survey; unstructured feedback | No (informational only) | Quarterly
Average Handle Time | Call centre efficiency | Rushing or dropping customers to cut calls short | First-contact resolution; CSAT; repeat call rate | No (paired target) | Monthly
Training Completion % | Employee knowledge & compliance | Speed-clicking through modules to tick the box | Post-training knowledge quiz; audit score | No (dashboard only) | Per cohort
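If your team keeps its KPI definitions in code or config, the template's columns map naturally onto a small registry that can be linted automatically. A sketch under assumptions: the `Kpi` dataclass, the `lint_registry` helper, and both rule checks are hypothetical names invented here to mirror the playbook's "counter-metric" and "expiry date" rules.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Kpi:
    name: str
    underlying_goal: str
    counter_metrics: tuple[str, ...]
    tied_to_incentive: bool
    next_review: date

def lint_registry(kpis: list[Kpi], today: date) -> list[str]:
    """Flag KPIs that break the playbook's rules: an incentivised metric
    with no paired counter-metric, or a metric past its review date."""
    problems = []
    for k in kpis:
        if k.tied_to_incentive and not k.counter_metrics:
            problems.append(f"{k.name}: incentivised but has no counter-metric")
        if k.next_review < today:
            problems.append(f"{k.name}: review overdue, re-validate or retire")
    return problems
```

Run the lint in CI or as a scheduled job, and a KPI can no longer quietly outlive its review date or pick up an incentive without a counter-metric attached.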


Your dashboards are telling a story. The question is whether it's true.

I work with data-driven teams to design metrics that reflect reality, and build the visualisations to communicate them clearly. If anything in this essay felt familiar, there's a good chance your KPIs have a blind spot or two.

Start with a free audit. We'll look at your current metrics, pressure-test them against Goodhart's Law, and identify where the gaps are.




References
  1. Goodhart, C.A.E. (1975). Problems of Monetary Management: The UK Experience. In Courakis, A.S. (ed.), Inflation, Depression, and Economic Policy in the West. Barnes and Noble Books.
  2. Campbell, D.T. (1979). Assessing the Impact of Planned Social Change. Evaluation and Program Planning, 2(1), 67–90.
  3. Nisslmüller, C.R.A. (2026). Goodhart's Law and the Death of Honest Metrics. Medium.
  4. Consumer Financial Protection Bureau (2016). CFPB Fines Wells Fargo $100 Million for Widespread Illegal Practice of Secretly Opening Unauthorized Accounts.
  5. Turnbull, S. (2008). The Subprime Credit Crisis of 2007. University of Houston.
  6. The Guardian (2016). G4S Police Control Room Staff Suspended Over Claims of Bogus 999 Calls.
  7. Byrne, J. (2014). How Northeastern University Gamed the College Rankings. Boston Magazine.