Strategic grant

Using large language models to make in vivo toxicity endpoints machine learning-ready

At a glance

In progress

Award date

March 2026 - September 2026

Grant amount

£25,000

Principal investigator

Dr Jessica Ewald

Institute

European Bioinformatics Institute

Replacement

Overview

Jessica will assess how accurately in vitro approaches, such as high-throughput cell-based assays, can predict toxicology outcomes traditionally obtained from animal studies. Working with AstraZeneca, she will apply in silico tools to standardise the language used in historical animal toxicity reports, converting them into structured toxicity endpoints suitable for machine-learning analysis. The resulting datasets will enable preliminary comparisons of toxicological outcomes from animal and non-animal studies, providing an initial indication of the reproducibility of animal safety tests and data that could inform performance benchmarks for new approach methodologies in safety assessment.

Application abstract

In this project, our team will develop a machine learning-ready resource of harmonised in vivo toxicity outcomes by applying large language models (LLMs) to curate and standardise descriptions of toxicological effects in historical animal study data. These data, drawn from the US EPA’s ToxValDB and spanning ~40 international sources of guideline testing data, currently use highly inconsistent terminology to describe in vivo toxicological outcomes, which prevents direct comparison or use as training labels in machine learning applications.

Using text processing and locally deployed LLMs, we will map toxicity outcome descriptions to standardised categories while retaining links to study metadata such as species, exposure route, and dose. The resulting dataset will be integrated into our existing computational pipelines, which benchmark how accurately high-throughput cell-based assays, such as image-based profiling and targeted transcriptomics, can predict in vivo outcomes in animal testing. The curated datasets and benchmarking results will be made publicly available to accelerate toxicology research and the development of new approach methodologies (NAMs).
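As a rough illustration of the harmonisation step described above, the sketch below maps free-text effect descriptions to a small set of categories while carrying the study metadata along. It is not the project's method: simple keyword matching stands in for the LLM step, and the category names, keywords, field names, and the "unmapped" fallback are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical category vocabulary. The project's real categories are not
# stated in the abstract; these exist only to make the sketch runnable.
CATEGORY_KEYWORDS = {
    "hepatotoxicity": ("liver", "hepat"),
    "nephrotoxicity": ("kidney", "renal", "nephr"),
    "body_weight": ("body weight", "bodyweight"),
}


@dataclass(frozen=True)
class Record:
    """One reported effect plus the study metadata that must stay linked to it."""
    effect_text: str
    species: str
    route: str
    dose_mg_per_kg: float


def normalise(text: str) -> str:
    """Lower-case, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def categorise(text: str) -> str:
    """Return the first category whose keyword occurs in the cleaned text.

    Keyword matching is a stand-in for the LLM mapping step (an assumption).
    """
    cleaned = normalise(text)
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in cleaned for keyword in keywords):
            return category
    return "unmapped"


def harmonise(records):
    """Attach a category to each record without dropping its metadata."""
    return [
        {
            "category": categorise(r.effect_text),
            "species": r.species,
            "route": r.route,
            "dose_mg_per_kg": r.dose_mg_per_kg,
            "source_text": r.effect_text,  # keep the original wording for audit
        }
        for r in records
    ]
```

Keeping the original wording next to the assigned category means any mapping can be checked by hand later, which matters when an automated step assigns the labels.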

High predictive performance on even a small number of toxicological outcomes would enable pharmaceutical companies to replace some preclinical, non-regulatory animal studies with in vitro high-throughput screening (HTS) assays: if a high-performing model predicts that a potential therapeutic will cause a particular toxic effect, that candidate will be dropped from the development pipeline without animal testing. Low predictive performance on certain endpoints is also informative, because it helps prioritise the development of future NAMs that specifically capture the toxicological endpoints simple cell models miss. Additionally, by harmonising historical animal data, the project will allow the reproducibility of animal tests to be quantified, providing a benchmark for realistic NAM performance expectations.
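One simple way to put a number on the reproducibility of animal tests, once outcomes are harmonised, is pairwise concordance: for each chemical tested in more than one study, count how often two studies agree on whether an endpoint was observed. The sketch below assumes this metric and a hypothetical input layout of (chemical, study, observed) tuples; the abstract does not say which measure the project will use.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_concordance(observations):
    """Fraction of same-chemical study pairs that agree on one endpoint.

    `observations` is an iterable of (chemical, study_id, observed) tuples,
    where `observed` is True if the endpoint was reported in that study.
    This layout is an assumption made for illustration.
    Returns None when no chemical has at least two studies.
    """
    calls_by_chemical = defaultdict(list)
    for chemical, _study_id, observed in observations:
        calls_by_chemical[chemical].append(bool(observed))

    agreeing = 0
    total = 0
    for calls in calls_by_chemical.values():
        # Compare every pair of studies for the same chemical.
        for first, second in combinations(calls, 2):
            total += 1
            agreeing += int(first == second)

    return agreeing / total if total else None
```

A value computed this way can serve as a rough ceiling when judging NAM predictions, since a model is unlikely to match animal outcomes more consistently than repeated animal studies match each other.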