BLOG POST

AstraZeneca's innovative graph as a service and generative AI models: is this the future of data science?

Mira Nair, Head of Marketing


Antonio Fabregat, Knowledge Graph Lead at AstraZeneca, recently presented on innovations that make data more accessible and insightful for internal biopharma research customers. We had the pleasure of seeing his talk at InnovatePharmaHealthcare in London and are excited to share our takeaways from it.

AstraZeneca was ranked highly in the 2023 Pharmaceutical Innovation and Invention Index, so it is no surprise that they are also innovating in their knowledge graphs.*

What are knowledge graphs and why are they important?


Knowledge graphs are a way of organising data and information in the form of a graph. More specifically, Antonio Fabregat summarised:

‘A knowledge graph is a collection of interlinking concepts, entities and events that represent a network of real-world entities and the relationships between them.’

In an age of ever-growing data, spread across many sources and formats, knowledge graphs are a tool that can help informaticians take control of that data. Creating accessible representations of the data in the form of a knowledge graph not only enables scale in data management, but also paves the way for making the data more insightful and thus more useful for decision making.
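To make the definition above concrete, here is a minimal sketch of a knowledge graph stored as subject-relation-object triples. It is purely illustrative: the entity and relation names are hypothetical placeholders, not real biological data, and it does not represent how AstraZeneca or Biorelate build their graphs.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object).
# All names are placeholders for illustration, not real data.
TRIPLES = [
    ("GeneA", "encodes", "ProteinA"),
    ("ProteinA", "participates_in", "PathwayX"),
    ("DrugD", "targets", "ProteinA"),
]


def build_graph(triples):
    """Index triples by subject so each entity maps to its outgoing relationships."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def outgoing(graph, entity):
    """Return the (relation, object) pairs recorded for an entity, or an empty list."""
    return graph.get(entity, [])


graph = build_graph(TRIPLES)
print(outgoing(graph, "ProteinA"))
```

The point of the sketch is that the relationships are stored as first-class data alongside the entities, so a question such as "what is ProteinA connected to, and how?" becomes a simple lookup.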

Graphs as a service

Data scientists at large pharma companies will be very familiar with the following scenario: they are given questions to ask of the data, they load up the data, and they send the results to their internal customers in a spreadsheet. The spreadsheet then comes back with new follow-on questions or parameters, and the process has to be repeated many times over. Antonio’s talk raised the question ‘Can GenAI help with common questions to automate some of this process and make everyone’s lives easier?’ The answer is yes.

Antonio’s team are using GenAI to extract and refine the right answers more seamlessly, and to formulate those answers so that they are human-readable, as though they were written by a person in a fully manual workflow. The end result is to take knowledge graph development from something highly iterative and manual to a more productised and scalable internal data science service for biological researchers. This essentially puts the power of the data into the hands of those who make decisions every day as part of their research.

“With knowledge graphs, relationships are as or more important than the data points themselves.” -Antonio Fabregat Mundo 

Do you have the right data in your graphs to support evidence-based decision making? 

At Biorelate, we specialise in generating data for knowledge graphs that adds explainability and context, and makes novel connections between the data beyond what can be achieved with traditional curation methods. Specifically, we:

  • Add high-quality edges to knowledge graphs via millions of cause-and-effect interactions that cannot be found in other sources.
  • Enable researchers to hypothesise disease mechanisms and novel targets using this novel cause-and-effect data.
  • Analyse many more entities in the data, well beyond what is achieved by standard tools like PubMed.
  • Provide contextual information around the data to make knowledge graphs more extendable and accessible and help researchers quickly and effectively identify the relevant articles.
  • Ensure the reliability and quality of knowledge graph data. All of Biorelate’s data is FAIR, and we use tools like LLMs to quality-check it to ensure it’s not only the richest and most comprehensive on the market, but also the most trustworthy.
  • Provide knowledge anchors: we help data scientists minimise hallucinations by using their own data, rather than the web, as the source of truth. This way, they control the data inputs that their generative AI tools extract answers from, and thus have more control over the quality of the outputs.
  • Give data scientists the context and links back to the evidence as part of the data in their knowledge graphs, in addition to minimising hallucinations.
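As a rough illustration of the ideas in the list above, the sketch below models directed cause-and-effect edges that each carry a link back to their evidence, and only answers from what is in the graph. Everything in it is an assumption made for illustration: the `CausalEdge` type, the `causal_chain` helper, the entity names and the evidence identifiers are hypothetical and do not describe Biorelate's actual data model or products.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausalEdge:
    cause: str
    effect: str
    direction: str  # e.g. "increases" or "decreases"
    evidence: str   # placeholder for a link back to the supporting article


# Hypothetical edges; names and evidence identifiers are placeholders.
EDGES = [
    CausalEdge("ProteinA", "ProcessB", "increases", "article-1"),
    CausalEdge("ProcessB", "DiseaseC", "increases", "article-2"),
]


def causal_chain(edges, start, goal):
    """Depth-first search for a directed chain of edges from start to goal.

    Returns the list of edges on the chain, or None when the graph holds
    no such chain, so the caller never gets an answer without evidence.
    """
    def walk(node, seen):
        if node == goal:
            return []
        for edge in edges:
            if edge.cause == node and edge.effect not in seen:
                rest = walk(edge.effect, seen | {edge.effect})
                if rest is not None:
                    return [edge] + rest
        return None

    return walk(start, {start})


chain = causal_chain(EDGES, "ProteinA", "DiseaseC")
for edge in chain:
    print(f"{edge.cause} {edge.direction} {edge.effect} (evidence: {edge.evidence})")
```

Because the edges are directed, asking for a chain from DiseaseC back to ProteinA returns nothing, and every step of a chain that is found can be traced to its supporting evidence.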

In summary, the data offered by Biorelate further enables knowledge graphs to be a source of truth within pharmaceutical organisations, by minimising hallucinations and also giving access to cause-and-effect relationship data, with directionality, for more powerful hypothesis generation.

*The work carried out by the Knowledge Graph Service (KGS) at AstraZeneca has not been done in collaboration with Biorelate. Biorelate's roadmap, services and future plans described in this blog post are independent of the work conducted by the AstraZeneca KGS.