30 October 2025

Why we’re exploring AI Scientists: Update #1


Over the coming months, Ant Rowstron, our CTO, will share updates on ARIA’s exploration of AI Scientists: autonomous systems that could transform how breakthrough research happens. This first entry explains why we’re doing this, what we’re testing, and why it matters.

[Image: ARIA CTO Ant Rowstron speaking at an event.]


A few weeks ago, I was at Bletchley Park for an ARIA workshop. Standing in front of EDSAC, the Bombe, Colossus – systems the UK built when the world demanded entirely new capabilities – I found myself thinking: are we at another one of those moments?

The technical shift


We’re seeing AI move from tools that assist human scientists (like AlphaFold predicting protein structures) to systems attempting the full research loop: hypothesis generation, experimental design, execution in automated labs, data interpretation, and iteration. That’s a fundamentally different capability.
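
To make that loop concrete, here’s a deliberately simplified sketch of the control flow such a system would need. Every name in it is hypothetical, standing in for subsystems (an LLM planner, a robotic lab scheduler, an analysis pipeline) that real systems compose in much messier ways:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Finding:
    conclusive: bool   # did this run clearly support or refute the hypothesis?
    summary: str

def research_loop(
    propose: Callable[[list], Any],            # hypothesis generation (e.g. an LLM planner)
    design: Callable[[Any], Any],              # turns a hypothesis into an executable protocol
    execute: Callable[[Any], Any],             # runs the protocol on automated lab equipment
    interpret: Callable[[Any, Any], Finding],  # data interpretation, including failure analysis
    max_iterations: int = 10,
) -> list:
    """The full loop: hypothesis -> design -> execution -> interpretation, iterating."""
    history = []
    for _ in range(max_iterations):
        hypothesis = propose(history)    # history lets failed runs inform the next attempt
        protocol = design(hypothesis)
        data = execute(protocol)         # may be a failed or noisy run; that's still data
        finding = interpret(data, hypothesis)
        history.append((hypothesis, protocol, data, finding))
        if finding.conclusive:
            break                        # hypothesis supported or refuted; stop iterating
    return history
```

All of the hard problems live inside those four stand-ins; the loop itself is the easy part.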

If this works – and that’s still a big if – it could change what’s possible. Testing hypotheses at speeds humans simply can’t match. Spotting patterns we’d miss. Exploring research directions that would otherwise never get tried because there aren’t enough hours in the day.

The technical enablers are converging: frontier LLMs with improved reasoning, high-throughput automated lab infrastructure, access to structured knowledge at scale. But the integration challenges are real. Moving from ‘this works in a demo’ to ‘this reliably produces novel insights’ requires solving problems we don’t fully understand yet.

Can current systems actually handle experimental failure and adaptive redesign? Do they need human intervention at key decision points? What’s the real throughput on automated equipment versus what’s theoretically possible? There’s a lot of hype, but I’m not sure we know where the frontier actually is.

We need a deeper understanding of current capabilities; that’s why we’ve launched this funding call.

Testing both sides of the boundary


We’re looking to fund 5–6 organisations with working AI Scientist systems to tackle real problems over nine months. Each proposal needs two problems: one they’re confident they can solve, and one they expect to struggle with. We’re deliberately testing what these systems can reliably do now, and where they hit their limits.

What I’m watching for


A few questions particularly interest me:

  • Can they iterate when things fail? The most valuable outcome might be seeing a system formulate a hypothesis, run an experiment that doesn’t work, work out why it failed, and then design a better one. Can they actually sustain that kind of iterative learning?

  • Can we solve the reproducibility problem? In computer science, reproducibility is straightforward: the same inputs produce the same outputs. In the life sciences, results often can’t be reproduced even when the known variables are controlled. Is that genuine biological stochasticity, or the effect of untracked parameters? AI Scientists could conduct experiments with far greater precision – logging every variable and deviation – helping to distinguish genuine biological variability from experimental noise (a minimal sketch of that kind of run log follows this list). That alone would be valuable.

  • What about breaking down disciplinary walls? Biology labs, materials labs, chemistry labs operate independently. But what if you created one automated space with all that equipment and gave an AI Scientist access? Most of the exciting things in my career happened when I worked across disciplines. I don’t know what that looks like, but it might enable entirely new approaches.
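
On the reproducibility question, much of the value is in disciplined record-keeping rather than clever modelling. Here’s a minimal sketch of the kind of exhaustive, append-only run log I have in mind – the field names are hypothetical, but the idea is that two runs with identical records and divergent outcomes are evidence of genuine stochasticity, while divergent records point at an untracked variable:

```python
import json
import time
from dataclasses import asdict, dataclass, field

@dataclass
class ExperimentRecord:
    """One run, logged exhaustively so variance can later be attributed."""
    protocol_id: str
    parameters: dict       # every controlled variable, not just the 'interesting' ones
    environment: dict      # temperature, humidity, reagent lots, instrument IDs
    deviations: list = field(default_factory=list)   # anything that departed from protocol
    timestamp: float = field(default_factory=time.time)
    outcome: dict = field(default_factory=dict)

def log_run(record: ExperimentRecord, path: str) -> None:
    # Append-only JSONL: replicate runs can later be diffed field by field.
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")
```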


These are genuinely open questions. The answers will inform not just ARIA’s strategy, but how the broader research community should think about engaging with these systems.

Being open-minded about capability


Scepticism is warranted – both technically and ethically. Current AI systems excel at pattern matching but can struggle with genuine causal reasoning. If they’re just interpolating from existing research to propose incremental variations, that’s useful but not transformational.

But here’s what I’ve learned watching coding tools evolve: the gap between ‘useful assistant’ and ‘can actually architect systems’ closed faster than most expected. Tools like Claude or Cursor today can do things that seemed impossible three years ago. A 20-year-old engineer using them can achieve in hours what took experienced teams days when I was building distributed systems at Microsoft.

I’m seeing similar early signals with AI Scientists – not in the hype, but in actual capability demonstrations. The question is whether that trajectory translates to scientific research, or whether research poses fundamentally different challenges that will slow progress.

That’s what this exploration is about: generating evidence for informed decisions.

We’re calling this broader effort ‘AI for breakthroughs’: ARIA’s approach to understanding and directing AI-driven research toward transformational discoveries. As AI Scientists develop, learning how to interface with them effectively becomes as important as the systems themselves.

I’ll share what we find.

Learn more about the AI Scientist funding call