Anthropic on issues with research data in biology

2026/06/14 data drug discovery ai shorts

Anthropic's recent research post adds further arguments for Arachne.ai's use-case-specific materialisation approach to biomedical data in an agentic context.

A week ago Anthropic published their research on using agents in biology. Setting the article's other claims firmly aside, it highlights how difficult it is to access the heterogeneous, multi-scale data of the biomedical domain in an agentic setting.

As I see it, the following seem to be problematic:

understanding the exact definition, context, and expected use case of each dataset
access methods (filtering, API definitions) are implicit and not necessarily up to par
making the data interoperable is costly
consuming public resources for large-scale ops is asking for a meltdown

Platform-level centralisation is the current standard. It sounds great and might cover what you need at the moment, but in the bigger picture it:

limits your use cases
compromises specificity and generalisability
still requires you to know the exact context in order to leverage particular resources
makes maintenance and change management a considerable cost
leads to considerable effort being spent on rebuilding within and across orgs

These points hold for both external (buy / open source) and internal (build) tools. You can absolutely hybridise, but then you end up with the worst of both worlds: maintaining both ungoverned ad-hoc pieces and a toothless platform. I believe there is a better way forward - a system that orchestrates the creation of use-case-specific, auto-generated knowledge bases.

So, as usual when agents are mentioned, it should be about providing context and tools for a specific task. More specifically:

what datasets exist out there which could be used
has anyone already used them - and if so, how and for what
are there already established tools or access patterns you can leverage
being able to collate the data into your own data infrastructure so you are not limited by external factors
a unified agent-ready access layer
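
To make the list above a bit more concrete, here is a minimal Python sketch of what such an access layer could look like. It is not Arachne.ai's implementation. Every name in it (`DatasetEntry`, `KnowledgeBase`, `discover`, `describe`, `query`) is hypothetical, and the dataset contents are made-up placeholders. The idea is only that each dataset carries its context and known uses alongside a local copy of the data, and that an agent reaches all of it through a few uniform calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetEntry:
    """Hypothetical catalogue record for one locally materialised dataset."""
    name: str
    description: str                                        # definition and context
    known_uses: List[str] = field(default_factory=list)     # who used it, and for what
    rows: List[dict] = field(default_factory=list)          # local copy of the data


class KnowledgeBase:
    """Toy use-case-specific knowledge base with a single access layer."""

    def __init__(self) -> None:
        self._entries: Dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def discover(self, keyword: str) -> List[str]:
        """Return names of datasets whose description mentions the keyword."""
        kw = keyword.lower()
        return sorted(
            name for name, e in self._entries.items()
            if kw in e.description.lower()
        )

    def describe(self, name: str) -> dict:
        """Return the context an agent needs before using a dataset."""
        e = self._entries[name]
        return {
            "name": e.name,
            "description": e.description,
            "known_uses": list(e.known_uses),
        }

    def query(self, name: str, **filters) -> List[dict]:
        """Filter the local copy by exact field match; no external calls."""
        rows = self._entries[name].rows
        return [r for r in rows if all(r.get(k) == v for k, v in filters.items())]


# Placeholder data only: compound and target names are invented.
kb = KnowledgeBase()
kb.register(DatasetEntry(
    name="toy_binding_assays",
    description="Placeholder compound-target binding measurements",
    known_uses=["hypothetical target prioritisation"],
    rows=[
        {"compound": "COMPOUND_X", "target": "GENE_A", "active": True},
        {"compound": "COMPOUND_Y", "target": "GENE_A", "active": False},
    ],
))

matches = kb.discover("binding")
hits = kb.query("toy_binding_assays", target="GENE_A", active=True)
```

In an agentic setup, `discover`, `describe`, and `query` would be the functions exposed as tools. Because `query` reads from the local copy, large-scale runs do not hit public resources.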

If you are already running into these issues, I would be keen to talk. Arachne.ai was conceived to tackle exactly these sorts of problems through rapid, context-aware, use-case-specific materialisations of biomedical knowledge bases.