From the voices of pharma data strategists: Festival of Genomics 2025 key takeaways

Achieving drug development success through data strategy and Artificial Intelligence (AI) technologies was arguably the hottest topic at the annual Festival of Genomics & Biodata conference, held on 29-30 January 2025 in London. Biorelate’s CSO Dr Ben Sidders spoke on this topic on Day 1 and chaired a specialist session for pharma data leaders on Day 2 (see the newly published Journal for Clinical Studies article on the topic). The Biorelate team also attended many other talks and had countless conversations with leading experts in this space.
Across these talks and conversations, six critical themes emerged that pharma is focusing on in 2025 and beyond to succeed in data-driven drug discovery:
- Generating meaningful and actionable insights from AI models requires them to be trained on high-quality data. Data quality can refer to many different aspects of data, but ultimately without a comprehensive sample of robust benchmark data, AI models will never be fully reliable or helpful enough to reduce the high failure rate in drug discovery.
- One key aspect of data quality that remains a significant hurdle in drug discovery is historical bias in biomedical data. Krishna Bulusu of AstraZeneca summarised it nicely: “AI models trained on biased benchmark datasets will produce biased outputs.” Obtaining more representative datasets is difficult when so few are available. Helena Andres-Terre of UCB Pharma asked whether synthetic data is enough to bridge the gap, particularly when there is no universal metric to assess its quality, and Anguraj Sadanandam of ICR asked, “can an LLM create diversity?”

- A crucial prerequisite to having high-quality, representative training datasets is the ability to incorporate multi-modal data types into AI models, e.g. text, omics datasets and images. Significant challenges remain not only in accessing all types of datasets, but also in collating the data in such a way that it is “minable and aligned” (as captured nicely by Victor Neduva from Roche).
- Zhihao Ding of Boehringer Ingelheim asked how we can ensure that we identify the right target for the right patient population. Including multi-modal datasets from a broad spectrum of sources is crucial for using AI to identify data specific to individual patient populations, not only for identifying the right target, but also for prescribing the best treatment and anticipating potentially unsafe drug combinations (an important theme also raised by Harriet Dickinson of Gilead Sciences).
- Having the appropriate data available is one large hurdle; another is making this data accessible in a manner that is usable. Ease of use of AI tools across multiple departments and scientific fields is crucial to enable the scientific collaboration required to make the most impactful scientific breakthroughs.
- Once incorporated into everyday workflows, AI tools promise to improve efficiency within pharma companies across “user productivity, process optimisation and business innovation” (Elif Ozkirimli, Roche). There is some anecdotal evidence that this is beginning to happen within pharma companies, but the full impact can be difficult to measure, particularly across the lengthy drug development pipeline.

At Biorelate, we help pharma get maximum value from advanced Artificial Intelligence technologies by curating the highest-quality data from unstructured sources (literature, patents, etc.), providing the critical context needed to train AI models effectively. Our causal models embed explainable, mechanistic biology, ensuring AI delivers real impact in drug discovery programmes. Start a conversation with us about accelerating your drug discovery programmes with higher-quality, more explainable data by contacting us at info@biorelate.com, and explore more at www.biorelate.com.