Death by statistics — When one replicate is enough

KamounLab
Oct 14, 2024

This week, I committed statistical blasphemy by arguing that, in some cases, one replicate can be enough for a specific type of molecular biology experiment. Before you rush to conclusions, let me explain. I haven’t lost my mind.

A single experiment would prove me wrong

One of my favorite quotes comes from Albert Einstein: “No amount of experimentation can ever prove me right.” He was likely emphasizing the importance of falsification in science, as he continued, “but one experiment can prove me wrong.”

A single experiment can prove me wrong. Try to fail better.

What I like about this quote

What I appreciate most about this quote isn’t just the emphasis on falsification — although that is a critical and often overlooked part of experimental science. What really resonates with me is the dismissal of the idea that an experiment can provide a providential answer. Yes, occasionally, a single experiment can be incredibly informative, and yes, there are those Aha moments when one experiment yields a powerful clue to a process. But no experiment happens in a vacuum. It can’t be taken out of the context of how we view the world at that particular moment. That’s the crux of the matter — and it’s where we see the difference between two competing schools of statistics: frequentists vs. Bayesians.

The Bayesian view… “Death by Statistics”

To be less and less wrong

I’ve touched on Bayesian thinking in a previous essay, Death by Statistics, so I won’t dwell on it here. But essentially, unlike frequentists, who treat individual experiments as disconnected from any broader context, Bayesians embrace the concept of prior knowledge (priors for short). The focus isn’t on single experiments in isolation, but on the impact they have on our overall understanding of the world. What matters most is whether an experiment, or any new piece of knowledge, raises or lowers the probability we assign to a phenomenon being true.

“I have got information, man. New sh… has come to light.” — Jeffrey “The Dude” Lebowski embracing Bayesian philosophy.

This is especially relevant when new data emerges. Different scientists will start with different levels of belief in a conclusion’s correctness. But the key isn’t where they start; it’s whether, as new evidence comes in, those probabilities inch upward or downward. Ideally, the scientific community moves toward consensus — the famed Bayesian convergence. This is the dream for any field of science: that new knowledge steadily coalesces around a set of robust findings, helping us continually refine our understanding of the natural world.
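To make the convergence idea concrete, here is a minimal Python sketch. It is not drawn from any real dataset: the two starting beliefs (0.05 and 0.80) and the likelihood ratio of 3 per experiment are made-up numbers, and `update` and `belief_trajectory` are hypothetical helper names. It simply applies Bayes' rule in odds form to two scientists who see the same stream of evidence.

```python
def update(prior, likelihood_ratio):
    """One Bayesian update in odds form: posterior odds = prior odds * LR."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def belief_trajectory(prior, likelihood_ratios):
    """Return the belief after each successive piece of evidence."""
    beliefs = [prior]
    for lr in likelihood_ratios:
        beliefs.append(update(beliefs[-1], lr))
    return beliefs


# Assumption: each of 10 experiments is 3x more likely if the hypothesis is true.
evidence = [3.0] * 10

skeptic = belief_trajectory(0.05, evidence)      # starts out doubtful
enthusiast = belief_trajectory(0.80, evidence)   # starts out convinced

for step, (s, e) in enumerate(zip(skeptic, enthusiast)):
    print(f"after {step:2d} experiments: skeptic={s:.4f}  enthusiast={e:.4f}")
```

With these assumed numbers, the two beliefs start 75 percentage points apart and end up within a fraction of a percentage point of each other. A likelihood ratio below 1 would push both beliefs downward instead, which is the falsification case.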

Bayesian convergence as illustrated by Nate Silver in The Signal and the Noise. Adapted from the lecture notes of Danilo Freire.

How many replicates is enough?

Back to my comment about a single replicate. This came up in the context of RNA sequencing (RNA-seq), a method that measures gene expression levels by capturing and sequencing RNA molecules. RNA-seq can be incredibly powerful but also notoriously finicky. Variability can creep in at multiple stages: sample preparation, sequencing, and data processing. It’s a method that demands careful experimental design.

The question that triggered my quip was about the ideal number of replicates for such an experiment. Someone referred to a statistics paper that recommended an ideal number of 6 to 12 replicates.

This is exactly the kind of statistical advice that drives me up the wall. First, from a practical point of view, 6 to 12 replicates is often unrealistic. It’s way too costly in both time and money for most experimental systems. That’s a lot of samples to process, and let’s be honest — it’s expensive. Not to mention the false precision in that number: 6 to 12. Why not 5? Why not 13?

Andy dealing with unrealistic statistical advice. Designed by Grok.

One single replicate can prove me wrong

What’s even more frustrating is how this advice ignores the experimental context. Why are you doing the experiment? What hypothesis are you testing? What led you to conduct the experiment in the first place? What’s the prior knowledge on the system? It treats RNA-seq as a single monolithic experiment, ignoring the wide variety of applications and goals such experiments can have. If you’re interested in more on how to think through experimental design, check out my earlier post on GOHREP.

Here’s my point: if I want to test a hypothesis that Sample A is different from Sample B, a single replicate could give me the answer. In fact, it’s pretty straightforward if you design the experiment to disprove your hypothesis. Two possible outcomes:

1. Sample A behaves like Sample B. Congratulations, you’ve done the hardest and most valuable thing in science — you’ve killed your hypothesis. Time to go back to the drawing board and come up with a new one.

2. Sample A turns out to be different from Sample B. Your hypothesis lives to fight another day, but you’re not out of the woods yet. You’ve gained some confidence, sure, but now you need to go back and design another experiment that aims at disproving your hypothesis again. And the best follow-up experiment? It’s not another RNA-seq. The best replication comes from what we call orthogonal replication — using a different method to challenge the original finding. This kind of replication is far more valuable than endless debates about the ideal number of replicates. Orthogonal replication beats a flashy p-value obtained by repeating the same experiment a thousand times.

Probability distributions of Prior Beliefs, Posterior Beliefs and Current Evidence. Image by NSS via Analytics Vidhya obtained from Bayesian Analysis & The Replication Crisis: A Layperson’s Perspective.

Continuously revising your knowledge is what makes you a scientist

Continuously revising your knowledge is what defines you as a scientist. Yet, there are those who published flawed experiments ages ago and still cling to their findings, despite overwhelming evidence to the contrary. It doesn’t take much to find these examples — just visit the blogs of Elisabeth Bik or Leonid Schneider, or browse the PubPeer website. These individuals have failed the test of good science. Frankly, they don’t deserve to be called scientists.

This is why, instead of endlessly debating the number of replicates, we should be asking why admitting you’re wrong has become so taboo in modern science. That’s the real problem — scientists who resist embracing a Bayesian mindset, where knowledge is continuously refined. Our job is not to be right; it’s to contribute to that ongoing process of revision and improvement.

“I was wrong” shouldn’t be taboo.

A real-world example of a single replicate experiment

Let me give you a real example from our own research where a single replicate RNA-seq experiment proved informative. I’ll simplify it so everyone can follow. The experiment was triggered by a negative result. We had cloned three related genes from three different plant species and assayed them for their ability to trigger the cell death response, a hallmark of plant immunity. The setup was simple: for each gene, we had two conditions — either the gene was “off” or “on.” It was a straightforward pairwise comparison.

Two of the genes behaved as expected: the “on” state triggered a cell death that was markedly stronger than the “off” state. But the third gene didn’t behave: it looked “off” in both states, with no detectable cell death either way. That led to a discussion in the lab. Maybe our cell death assay wasn’t sensitive enough? Perhaps RNA-seq could serve as a more sensitive readout of immune activation. It was a long shot, but worth pursuing just to be sure that this third gene was truly “off.” So we ran the experiment, with only single replicates.

Why just single replicates? This was a pragmatic decision driven by three main factors:

1. Ease of execution: This experiment was bundled with other unrelated RNA extractions, and triplicates would have made the workload impractical.

2. Cost: Running single replicates cost roughly a third of what triplicates would.

3. Balance: It seemed more beneficial to add more genes and controls rather than increase the number of replicates, especially when considering the workload and cost constraints.
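The trade-off in this list is plain arithmetic, sketched below in Python. Two assumptions are baked in: that the design had eight conditions (three genes plus one control, each “off” and “on”, a count read off the figure description rather than stated outright), and that cost scales linearly with the number of libraries. The function names are hypothetical.

```python
def libraries_needed(conditions, replicates):
    """Total RNA-seq libraries for a fully replicated design."""
    return conditions * replicates


def conditions_affordable(library_budget, replicates):
    """How many distinct conditions fit in a fixed library budget."""
    return library_budget // replicates


# Assumed design: 3 genes + 1 control, each "off" and "on" -> 8 conditions.
conditions = 4 * 2

for replicates in (1, 3, 6, 12):
    total = libraries_needed(conditions, replicates)
    breadth = conditions_affordable(conditions, replicates)
    print(
        f"{replicates:2d} replicate(s): {total:3d} libraries for the full design; "
        f"a budget of {conditions} libraries covers {breadth} condition(s)"
    )
```

Under these assumptions, triplicates turn 8 libraries into 24, and the same 8-library budget spent on triplicates would cover only 2 of the 8 conditions.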

Here’s what the RNA-seq data looked like when we examined it using PCA (Principal Component Analysis), which groups treatments by similarity.

A real-world example of a single replicate RNA-seq experiment. The hypothesis was that Gene 3 (purple) induces transcriptional reprogramming in the “on” state. It was more informative to include multiple treatments and an on/off control than to have replicates. Note that genes 1 and 2 have weak activity in their “on” state, which is also what prompted our hypothesis.

The experiment was conclusive. Our hypothesis — that gene 3 induces a weak response detectable by RNA-seq — was falsified. Whether in the “off” or “on” state, gene 3 grouped clearly with the “off” state of the control. Mission accomplished. We had successfully falsified our hypothesis, allowing us to move on to the next one.
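The question the PCA plot answers is which samples sit close together in expression space. The Python sketch below is not PCA and not the analysis behind the figure: it swaps in a plain Euclidean distance to two reference profiles, which asks the same “does gene 3 look on or off?” question in the simplest possible way. All expression values are made up, and `closest_reference` is a hypothetical helper.

```python
import math


def closest_reference(sample, references):
    """Return the name of the reference profile nearest to `sample`."""
    return min(references, key=lambda name: math.dist(sample, references[name]))


# Made-up log-scale expression values for five hypothetical marker genes.
references = {
    "control_off": [1.0, 1.0, 1.0, 1.0, 1.0],
    "control_on": [5.0, 5.0, 4.0, 1.0, 1.0],
}

samples = {
    "gene1_on": [4.0, 4.0, 3.5, 1.0, 1.0],
    "gene3_off": [1.1, 0.9, 1.0, 1.0, 1.1],
    "gene3_on": [1.0, 1.1, 0.9, 1.1, 1.0],
}

for name, profile in samples.items():
    print(f"{name} groups with {closest_reference(profile, references)}")
```

In this toy data, gene 3 groups with the “off” control in both states, which is the pattern that falsified the hypothesis. A real analysis would work on normalized counts across thousands of genes, where a dimensionality reduction such as PCA earns its keep.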

Always account for opportunity cost

Now, I’m not arguing that this was the ideal experimental setup. In an ideal world and from a purely statistical point of view, perhaps 12 replicates would have been even more convincing. But in practice, this was one experiment bundled among a few others, and the costs in terms of research funds and time were not trivial. We must also account for the opportunity cost in experimental research — if you’re not familiar with the concept, you should be.

Opportunity cost refers to everything you don’t do because you chose to do something else. In this case, running the experiment in triplicates would have meant that other samples couldn’t be processed. And this isn’t just a minor inconvenience — it’s a fundamental reality of academic research. Time, funding, and resources are always limiting, no matter the lab or the project. By increasing the number of replicates for this particular experiment, we would have effectively blocked other potentially valuable experiments from moving forward.

That’s the crux of the matter — efficiency. Every decision in research comes with a trade-off, and here, the opportunity cost would have been high. Imagine the cost of leaving other crucial experiments on the back burner, waiting for resources to become available. The time and resources spent on more replicates could have been directed toward exploring new hypotheses, testing additional genes, or gathering more controls. In the fast-paced world of scientific discovery, choosing where to allocate your finite resources can be the difference between moving your research forward or getting stuck in a cycle of diminishing returns. And that’s why, in this case, we prioritized breadth over redundancy — choosing to collect a single, informative replicate rather than sink resources into unnecessary duplication.

Opportunity cost or appreciating that time is your most valuable asset. See also this post.

It didn’t end there

Running this experiment in triplicates — or, heaven forbid, 6 to 12 replicates — would only matter if you’re fixated on single experiments and lose sight of the broader context. Science isn’t about a single providential experiment; it’s a process of becoming less wrong over time. This single replicate experiment was just one small clue in our Bayesian convergence plot. It nudged us in one direction, but it didn’t end there — it must be interpreted in the wider context of other related experiments, just as Bayesian thinking suggests.

Of course, we would never publish this single replicate experiment. If we wanted to publish, we would repeat it with the proper number of replicates to satisfy the standards of statistical rigor and peer review. But for us, it served its purpose: it helped us better understand our experimental system and move forward with our research.

Q: Is it okay to do an RNA-seq experiment with a single replicate?

When a single replicate of an RNA-seq experiment is enough

For those of you who remain puzzled by my single replicate stance, here’s a list of scenarios where a single replicate may be sufficient in the case of RNA-seq. Keep in mind, this doesn’t mean it’s always the best choice — that’s up to you to decide based on your specific situation.

1. Proof of Concept — When you’re testing the waters with a new technique or a new system, a single replicate can help you quickly determine whether it’s worth investing more resources. If the initial result is promising, you can scale up.

2. Exploratory Experiments — Early-stage exploratory experiments often benefit from single replicates. These experiments aim to generate hypotheses rather than rigorously test them. If something interesting shows up, you can follow up with better replicated experiments later.

3. Negative Control Confirmation — If you’re simply confirming that a negative control behaves as expected — showing no significant expression changes — one replicate can often suffice to give you a clear picture without wasting resources.

4. Clear, Robust Phenotypes — In cases where the difference between conditions is dramatic and unambiguous, a single replicate might provide a quick answer. As Harish Kothandaraman pointed out, it might be better to add a spread of samples than replicates, particularly when resources are limited. This is in fact the case in the example above with the three genes and a control.

Can more related samples be better than replicates?

5. Resource Limitations — Let’s face it, research budgets aren’t unlimited. If you’re working under tight constraints — whether financial, time, or both — a single replicate can help you maximize the number of samples or conditions you test without breaking the bank.

RNA-seq for genome annotation.

6. Specific Experiments (e.g., Genome Annotations, Response Curves, Time Courses) — In certain cases, replicates may not add significant value. For example, when using RNA-seq to annotate genes and introns, the focus is on generating a reference, and additional replicates won’t dramatically change the outcome. Similarly, in time-course experiments, where different time points effectively act as pseudo-replicates, the design inherently builds in replication through temporal sampling.

Examples of experiments where single replicates can be acceptable.

7. Rare Samples — When biological material is limited or precious, a single-replicate RNA-seq can be justifiable. An example is diseased tissue collected from the field in approaches like “Field Pathogenomics,” where obtaining more samples may not be feasible. In these cases, the rarity of the sample itself makes single replicates a practical necessity.

A single replicate RNA-seq of healthy vs. diseased plant tissue distinguishing between host plant (gray) and pathogen (blue) transcripts. This method is known as Field Pathogenomics. See: Hubbard et al. 2015. Figure from Islam et al. 2016.

Each of these cases presents a scenario where a single replicate can give you useful, actionable data. But remember, just because you can doesn’t mean you should. Weigh the benefits against your experimental goals, resources, and the broader context of your work.

Acknowledgements

I’m grateful to everyone who inspired this post, whether through Q&A sessions, discussions, or social media comments. Special thanks to Steph Bornemann for providing my photo. This article was written with assistance from ChatGPT. I used Grok to design the image of the frustrated plant biologist.

This article is available under a CC-BY license via Zenodo.

Cite as: Kamoun, S. (2024) Death by statistics — When one replicate is enough. Zenodo. https://doi.org/10.5281/zenodo.13925560
