Same mission, different path: Building the ground truth for biological superintelligence

Authored by
Nima Alidoust
Released on
June 29, 2026
Summary

The world looks different than it did three years ago. It even looks different than it did six months ago.

We have tools we didn't imagine we'd have. We are no longer alone in the highest-leverage things we do: writing, coding, building models, reasoning.

It's tempting to read this as efficiency. Faster, cheaper, more at once. That undersells it. Reasoning at scale is not just better and faster. It is a different kind of power, and it pushes us past an inflection point. It doesn't change our mission. It should change how we think about the path.

Our mission has always been to reason biologically across genes, cells, and tissues. The thesis was straightforward: with enough relevant data, you can learn one corner of biology, then expand it by adding more data. The way to learn was to find the right model to train.

Is that still the only path?

Building a generalizable model of the cell is a worthy ambition. It will be a bigger moment than AlphaFold. But with reasoning models in hand, I think we can do some of what we imagined that model would do, before we ever finish building it.

One way to get there is to couple reasoning models tightly with the data we once used only to train biological models.

That coupling could be simple tool use. But that wastes what these models can actually do: generate, test, evaluate, and create new knowledge. So try the maximal version. Let them think as scientists, not as tools in the hands of scientists. They know a great deal. We can feed them much more. They are getting better every month. Treat them accordingly!

Then it gets hard.

There is an input problem. Much of what these models learned isn't ground truth. A large share of published science doesn't reproduce. The ground they reason from is softer than it looks.

There is also an output problem. Hallucinations get blamed for this, and I expect them to fade; their rate today is a fraction of what it was six months ago. But removing hallucinations doesn't solve it. Even a scientist with deep expertise, real rigor, and total integrity is wrong more often than right. What do they use to fix their priors? Data.

And in our world (genes, cells, tissues, disease), data doesn't come easily. Find the right cells, bank them, thaw them, grow them, perturb them, prepare the samples, read out the molecular changes: months, sometimes years. Lean on manual verification or lab-in-the-loop and you create an impedance mismatch: hypotheses generated in seconds, validated in quarters.

Hypotheses are cheap. Verification against ground truth is the bottleneck.

So attack the bottleneck directly. Generate high-quality, high-precision data across samples, perturbations, and disease models at a scale well beyond what the field aims for today, until we cover vast corners of biological ground truth. Then immerse reasoning models in measurements that have already been made.

What happens to biological models trained purely on biological data? They remain a huge part of the puzzle, but their role changes. Instead of the all-knowing model that reasons across every sample and modality, build them to estimate biological quantities with high accuracy, filling the gaps where we haven't yet run the experiment. Specialized models that do narrow things extremely well and generate data we can trust.

In the last six months, the world changed. Pursuing our mission as if it didn't is a missed opportunity.

We are taking the other path. We are building the ground truth, and putting reasoning models inside it.