When Attractive Science Falls Apart Under Scrutiny

What a Flawed Study Taught Me—and How It Shaped My Career

May 01, 2025

In 2014, a respected colleague tweeted a study claiming that Tour de France cyclists judged more attractive also performed better in the race. Even more striking, the author claimed that women, particularly when most fertile, were uniquely attuned to this cue.

The study combined irresistible ingredients: evolution, sexuality, elite sport, subconscious cues. Within days it dominated headlines:

Why Good Athletes Are Good-Looking, Too (NBC News)
Does Attractiveness Predict Endurance Performance? (Runner’s World)
Faster Cyclists Are More Handsome (Discover Magazine)

But as a sports scientist, I couldn’t shake the feeling that something didn’t add up. So I went beyond the abstract and read the study in full, and within minutes it unraveled.

Yes, the paper looked convincing, with its principal-components analysis, tidy p-values, and glossy graphs, yet its conclusions felt implausible. The gap between scientific appearance and scientific meaning quickly became obvious, and it marked a pivot point in my career.

This isn’t just about one questionable study. It’s about a pattern: weak evidence, polished with impressive statistics, gets amplified without scrutiny, and that cycle has consequences. It’s also about what that pattern teaches us about the system that produces, promotes, and rewards scientific claims.

The study: what it claimed

The study, published in Biology Letters, had 800 participants rate headshots of 80 professional cyclists who completed the 2012 Tour de France. The attractiveness ratings were collected via online surveys.

The paper reported three key results:

Performance–attractiveness correlation – Higher general-classification rank correlated weakly with attractiveness (Spearman ρ ≈ 0.17; ≈ 3% of rank variance).
Sex-specific preference – The link was strongest in women not using hormonal contraception; weaker in men and in women on contraception.
Evolutionary interpretation – Facial traits supposedly advertise heritable endurance capacity, advantageous in ancestral hunting contexts.

Hence, attractiveness supposedly “signals” good endurance genes, and women retain an evolved sensitivity to that cue, one that was enhanced when they were most fertile.

Drawing conclusions about human evolution and sexual selection from photos of riders in a three-week cycling race seemed like a huge stretch to me, so I decided to dissect the study in detail.

The problem: why it didn’t sit right

On the surface, it was a compelling narrative. But the methods and results told a very different, less sensational story:

Who were these cyclists? The Tour de France is the physiological 0.001%—not a random sample of men. Drawing evolutionary inferences from such an extreme group ignores training, team role, and yes, occasional performance-enhancing drugs.
Who’s missing? Of 153 finishers, only 80 headshots met the author’s inclusion rules (no beards, sunglasses, smiles). Exclusions swept away the champion Bradley Wiggins and the entire Sky roster—as well as every rider who abandoned the race. When the best and the worst disappear, “best riders are most attractive” becomes difficult to test.
How big was the effect? Attractiveness scores explained only 3–5% of placing differences—“not zero” but barely hypothesis-generating, let alone headline-worthy.
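For readers who want to check the effect-size arithmetic themselves, here is a minimal Python sketch. It assumes the usual rule of thumb that squaring a correlation coefficient gives the share of variance it accounts for (for a Spearman correlation, that means variance in ranks). The function names are my own, chosen for illustration, and the only inputs are the figures quoted above; nothing here is taken from the paper's own analysis code.

```python
import math


def variance_explained(r: float) -> float:
    """Share of variance accounted for by a correlation coefficient (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    return r * r


def correlation_for_share(share: float) -> float:
    """Magnitude of the correlation that would account for a given share of variance."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("a share of variance must lie between 0 and 1")
    return math.sqrt(share)


# Correlation as quoted in this article (assumed input, not recomputed from data).
rho = 0.17
explained = variance_explained(rho)
unexplained = 1.0 - explained
print(f"rho = {rho:.2f}: {explained:.1%} explained, {unexplained:.1%} unexplained")

# The 3-5% range quoted above, translated back into correlation magnitudes.
for share in (0.03, 0.05):
    r = correlation_for_share(share)
    print(f"{share:.0%} of variance corresponds to |rho| of about {r:.2f}")
```

Seen this way, a correlation of 0.17 leaves roughly 97% of the differences in placing unaccounted for, which is the part the headlines left out.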

That’s just a sample of the issues we identified, and that’s just from evaluating the methods and results themselves. When we analyzed the evolutionary interpretation of the data, we found even more issues. The author suggested that lower-placed finishers might be less healthy, vigorous, or strong. That is hard to accept when even the lanterne rouge (the last-place rider) could outperform 99.999% of humanity.

Why does this matter?

You might be thinking: so what? Who really cares if a study about cyclists’ faces and attractiveness makes a few evolutionary leaps? It’s not clinical. It’s not policy. It’s not hurting anyone.

But here’s the thing: this study—like so many others—was widely shared, cited, and praised not because it was rigorous, but because it was clever. It told a story that sounded right. And the moment a study like that gets amplified, it stops being just an academic curiosity. It becomes a building block for other research, a headline for science communication, and a subconscious reinforcement of cultural assumptions about beauty, biology, and worth.

In a world where many clinicians and researchers rely on the abstract (and the press release) to decide what matters, the distinction between “interesting” and “true” is more than academic. It shapes grant funding. It shapes careers. It shapes what students learn is rewarded.

And that’s why I care—not just about this paper, but about what it represents.

My response: a turning point in how I do science

In 2015, I co-authored a reply in Biology Letters that distilled the study’s weaknesses into six themes: 1) limited external validity; 2) sample homogeneity; 3) over-interpretation of a weak correlation; 4) conflating race ranking with general endurance; 5) untested statistical assumptions; and 6) contradictions within the broader attractiveness literature.

For readers who enjoy the details—plus a bit of humor—the full article is worth a look. One example: we note that Hollywood’s “Sexiest Man Alive,” Ryan Reynolds, ran a 3 h 50 min marathon, while seven-time Tour de France winner Lance Armstrong clocked 2 h 46 min. Attractiveness clearly isn’t broadcasting endurance genes there!

Publishing that critique reset my academic compass and showed how easily intriguing data morph into “just-so” stories—then ricochet through the media echo chamber. Ever since, I’ve focused on helping readers separate methodological substance from seductive narratives.

What this taught me (and what I write about now)

I didn’t start Beyond the Abstract to dunk on research.

I started it because this kind of experience, reading something “important” that didn’t hold up, wasn’t rare. And because I realized that even smart, trained scientists and clinicians often lack the time, context, or tools to critically evaluate evidence that’s already been peer-reviewed. If those formally trained to appraise research don’t have the bandwidth to do so, how is the general public going to separate reality from sensationalism?

That’s not a judgment. It’s an opportunity.

So here’s what I do now:

Read between the lines of methods and media releases
Unpack studies that go viral or unquestioned
Examine how academic incentives shape publication and interpretation
Share what I’ve learned from my own research, teaching, and academic experiences

Sometimes I critique a headline. Sometimes I unpack a methodological flaw. Sometimes I write about careers in academia, or cultural assumptions around sport and science. Occasionally, I even write about running, the sport that triggered my love for science.

But always, I try to stay in that space between hype and substance. The part where critical thinking happens.

Why Beyond the Abstract matters

Science doesn’t need defenders. It needs interpreters.

We need people who believe in evidence—but also know that evidence doesn’t always mean what we think it means. People who understand that peer review is the start of a conversation, not the end. People who know how to ask, “What’s the actual effect size?” before tweeting a link. This newsletter is for them. For you.

I know there’s no shortage of content out there. Thanks for choosing to spend some time here. If this resonated, I’d love it if you shared it with someone else who might value it too.