Evaluating a human being is hard. You might point to the Olympics as proof that we can measure human performance — and sure, we can time a sprint to the hundredth of a second. But what about team performance? Trickier. Yet we still manage to crown FIFA’s Best Player or an MVP in the NBA.

Where it gets genuinely difficult is measuring intellectual performance. In a team. Over time. That’s a different beast entirely.

The theater of measurement

And yet, that’s exactly what we do in the corporate world — at least for as long as I’ve been working. And even though it doesn’t really work, we keep doing it, hiding the system’s flaws behind rigorous-looking processes.

The classic example is the annual performance review. It used to be simpler: the manager, in their wisdom, would render judgment. A judge among judges, deciding who performed and who didn’t. Imprecise at best, unfair at worst.

But why? A few reasons, really. Unlike a 100-meter dash, intellectual performance isn’t measurable — not in any meaningful way. Sure, there are IQ tests, but let’s be honest: they’re one-dimensional and widely criticized. More fundamentally, humans can’t separate observation from emotion. We always have bias. It’s one reason we run double-blind trials in science. And these limitations are intrinsic to being human. They don’t go away.

Enter peer feedback

So what did we do? We invented peer feedback. The thinking, presumably, is that if you multiply subjective opinions, the average will become… somehow… objective? That’s not how any of this works.
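The statistical intuition behind that objection can be sketched in a few lines of Python. This is a toy simulation, not a model of any real review process, and all the numbers in it are made up: averaging many opinions does cancel each rater’s independent noise, but a bias the raters *share* survives the average untouched.

```python
import random

random.seed(0)

TRUE_SCORE = 5.0    # the "objective" performance we wish we could measure
SHARED_BIAS = 1.5   # a bias every rater happens to share
N_RATERS = 10_000

# Each rating = truth + the bias they all share + that rater's own noise.
ratings = [
    TRUE_SCORE + SHARED_BIAS + random.gauss(0, 1.0)
    for _ in range(N_RATERS)
]

average = sum(ratings) / len(ratings)
# The independent noise averages out; the shared bias does not.
print(f"average rating: {average:.2f}  (truth was {TRUE_SCORE})")
```

Even with ten thousand raters, the average lands near 6.5, not 5.0. More opinions buy you precision, not objectivity.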

What we forget is that peers have even less experience evaluating performance than managers do. A manager might have training, coaching, years of practice. A peer? They’re operating almost entirely on gut feeling and personal relationships.

Picture it: there’s someone on your team — super friendly, you get along great, maybe your families hang out on weekends. But honestly? They don’t bring much energy to the work, and they’re not particularly good at their job. Are you really going to write that in their peer feedback? Or are you going to let your affection for this charming friend-of-the-family color your assessment?

And then these peer opinions influence the manager. A strong engineer says something about a teammate? Must be true. The bias compounds.

It gets worse with promotions and reference checks. There, you get to choose who speaks on your behalf. As if anyone would ever name someone they clashed with as a reference. Let’s be serious.

This isn’t unlike the autonomy trap — where we diffuse decision-making to feel more democratic. Here, we diffuse evaluation to feel more objective, but end up with bias wearing the mask of consensus.

So what’s the answer?

I don’t have a neat solution. But I’m fairly certain that peer feedback and reference checks don’t solve the problem — they just dress up something completely subjective in the costume of scientific consensus.

Is it better than the old way, where a single manager held all the power? I’d argue no: it’s even worse. You can train a manager. You can coach them. You can work with HR to identify and counteract their biases over time. But handing evaluation to untrained people who were chosen by the person being evaluated? That seems genuinely misguided.

Maybe the answer is accepting that intellectual performance simply can’t be measured — not with any precision. But if we accept that, what criteria do we use for hiring and promotion?

I don’t know. But pretending we’ve solved the problem by adding more opinions isn’t it.