Unlocking the Future of Precision Oncology with Knowledge Graphs
Identifying which patients will respond to a drug is a critical step in developing a new therapeutic. Frequently, companion diagnostics measure the presence or absence of the drug's target. This can miss the deeper, interconnected patterns that drive disease progression and drug response. Knowledge Graphs (KGs) are a game-changer in biomedical data integration and can bring together vast amounts of data for each patient. KGs therefore have the potential to inform patient stratification and selection approaches.
A recent study published in the Journal of Translational Medicine demonstrates that incorporating prior knowledge into a biomedical KG significantly enhances survival prediction models for patients with non-small-cell lung cancer (NSCLC) (Fang et al., 2024). By leveraging patients’ genomic data and NSCLC-specific insights from Biorelate’s Galactic AI™ KG, AstraZeneca’s researchers developed machine learning models that achieved superior predictive accuracy. The integration of KG-driven insights into survival prediction and biomarker discovery paves the way for more precise, personalised cancer treatments informed by all available data.
Knowledge Graphs organise vast amounts of biomedical data into structured relationships, connecting genes, proteins, drugs, and diseases in a way that traditional databases cannot. In this study, the authors used KGs to enhance machine learning models for survival prediction by integrating curated, prior biological knowledge. This allowed them to identify key biomarkers that differentiate patient survival outcomes and to use these to improve patient stratification in clinical trials for immuno-oncology (IO) treatments.
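To make "structured relationships" concrete, here is a minimal sketch in Python that stores a knowledge graph as (subject, relation, object) triples and looks up everything connected to an entity. The handful of triples, the relation names, and the helpers `build_index` and `neighbours` are illustrative assumptions; they are not the schema or content of Galactic AI™ or of the study's graph.

```python
from collections import defaultdict

# Toy triples (subject, relation, object). Illustrative only: a real
# biomedical KG holds millions of curated, provenance-tracked edges.
TRIPLES = [
    ("EGFR", "associated_with", "NSCLC"),
    ("KRAS", "associated_with", "NSCLC"),
    ("osimertinib", "inhibits", "EGFR"),
    ("osimertinib", "treats", "NSCLC"),
]


def build_index(triples):
    """Index triples by entity so both directions of an edge can be looked up."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[subject].add((relation, obj))
        index[obj].add((f"inverse_of:{relation}", subject))
    return index


def neighbours(index, entity):
    """Return the sorted names of all entities directly linked to `entity`."""
    return sorted({other for _relation, other in index.get(entity, ())})


index = build_index(TRIPLES)
print(neighbours(index, "EGFR"))  # ['NSCLC', 'osimertinib']
```

A single lookup on a gene returns both the disease it is linked to and the drug that targets it, which is the kind of cross-entity connection a flat gene panel table does not expose.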
Key Takeaways from the Study
✅ Survival prediction models incorporating KGs outperformed standard gene panel-based models. The inclusion of prior knowledge allowed for improved hazard ratios and more accurate risk stratification.
✅ NSCLC causal data, integrated within AstraZeneca’s Biological Insights Knowledge Graph (BIKG), played a crucial role in uncovering hidden connections. By integrating causal relationships specific to NSCLC, the model identified gene-gene interactions that conventional analyses might overlook.
✅ The study demonstrated real-world application through clinical trial data. Using genomic data from patients in the POPLAR and OAK trials, KG-based models provided a clearer understanding of survival probabilities, reinforcing the clinical relevance of this approach.
✅ A new biomarker-driven mutational signature emerged. A model-defined 10-gene signature - including TP53, EGFR, ATM, PRKDC, STAT3, CTNNB1, KRAS, NFE2L2, EPHA7, and SOX9 - proved to be a strong differentiator of overall survival outcomes, further validating the power of KGs in biomarker discovery.
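As a rough illustration of how a mutational signature can separate patients for survival analysis, the sketch below flags patients carrying a mutation in any of the ten signature genes and computes a Kaplan-Meier survival estimate per group. The any-gene flagging rule is an assumption, not the study's published method, and the patient records are invented purely to exercise the code; they carry no clinical meaning. `GENE_X` and `GENE_Y` are hypothetical placeholder genes.

```python
# The ten signature genes named in the takeaways above.
SIGNATURE = frozenset({
    "TP53", "EGFR", "ATM", "PRKDC", "STAT3",
    "CTNNB1", "KRAS", "NFE2L2", "EPHA7", "SOX9",
})

# Invented records: (mutated genes, follow-up in months, death observed?).
# Made-up values for demonstration only; NOT trial data.
PATIENTS = [
    ({"TP53", "KRAS"}, 6, True),
    ({"EGFR"}, 10, True),
    ({"ATM", "GENE_X"}, 12, False),
    ({"GENE_X"}, 9, True),
    ({"GENE_Y"}, 14, False),
    (set(), 20, False),
]


def split_by_signature(patients):
    """Assumed rule: a patient is a carrier if any signature gene is mutated."""
    carriers, others = [], []
    for mutated, months, died in patients:
        group = carriers if mutated & SIGNATURE else others
        group.append((months, died))
    return carriers, others


def kaplan_meier(samples):
    """Return [(time, survival probability)] at each distinct death time."""
    curve = []
    survival = 1.0
    at_risk = len(samples)
    for t in sorted({time for time, _ in samples}):
        deaths = sum(1 for time, died in samples if time == t and died)
        leaving = sum(1 for time, _ in samples if time == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


carriers, others = split_by_signature(PATIENTS)
print("carriers:", kaplan_meier(carriers))
print("others:  ", kaplan_meier(others))
```

In practice a maintained survival library and a formal test (for example a log-rank test or a Cox model for hazard ratios) would replace this hand-rolled estimator.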
What This Means for the Future
One of the most interesting aspects of this research is the use of NSCLC-specific prior knowledge. In particular, the causal relationships enhance the model’s ability to predict patient outcomes and make it possible to construct tailored subgraphs that filter out noise and focus on NSCLC-specific gene interactions. These subgraphs in turn yield biologically meaningful embeddings that improve machine learning model performance.
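The subgraph idea can be sketched in a few lines of Python: keep only edges annotated with the disease of interest whose endpoints are both on the gene panel, then derive a per-gene feature from what remains. Everything here is hypothetical: the edge records, the `GENE_*` placeholders, and the helper names are invented for illustration, and node degree is used only as a crude one-dimensional stand-in for the learned embeddings the study describes.

```python
from collections import Counter

# Hypothetical edge records: (source gene, target gene, relation, disease context).
EDGES = [
    ("GENE_A", "GENE_B", "upregulates", "NSCLC"),
    ("GENE_B", "GENE_C", "downregulates", "NSCLC"),
    ("GENE_A", "GENE_D", "upregulates", "breast cancer"),  # other disease context
    ("GENE_C", "GENE_E", "upregulates", "NSCLC"),          # GENE_E is off-panel
]

# Hypothetical sequencing panel.
PANEL = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}


def disease_subgraph(edges, disease, panel):
    """Keep edges from the given disease context with both endpoints on the panel."""
    return [
        edge for edge in edges
        if edge[3] == disease and edge[0] in panel and edge[1] in panel
    ]


def degree_features(subgraph):
    """Count how many retained edges touch each gene (a stand-in for an embedding)."""
    degree = Counter()
    for source, target, _relation, _context in subgraph:
        degree[source] += 1
        degree[target] += 1
    return dict(degree)


subgraph = disease_subgraph(EDGES, "NSCLC", PANEL)
features = degree_features(subgraph)
print(features)  # {'GENE_A': 1, 'GENE_B': 2, 'GENE_C': 1}
```

Filtering first means the downstream feature reflects only disease-relevant interactions among genes that were actually measured, which is the noise-reduction step the paragraph above describes.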
By incorporating disease-specific insights from Biorelate’s Galactic AI™, the approach ensured that survival models did not rely solely on raw gene panel data but instead allowed a deeper understanding of how genetic alterations impact disease progression.
The application of KGs in patient survival prediction represents a significant leap toward data-driven precision medicine. This is one of the first successful examples of knowledge graphs being applied to clinical research within drug discovery. As oncology moves towards increasingly personalised therapies, leveraging structured biomedical knowledge through KGs could be the key to unlocking new therapeutic strategies.
At Biorelate, we help pharma get maximum value from advanced Artificial Intelligence technologies by curating the highest-quality data from unstructured sources (literature, patents, etc.), providing the most comprehensive knowledge graph of biomedical knowledge in the industry. Start a conversation with us about accelerating your drug discovery programmes with higher-quality, more explainable data by contacting us at info@biorelate.com, and explore more at www.biorelate.com.
References & Further Reading
📄 Fang et al. (2024). Integrating knowledge graphs into machine learning models for survival prediction and biomarker discovery in patients with non–small-cell lung cancer. Journal of Translational Medicine.
🔗 Biorelate’s Galactic AI™️ – Learn More
📖 Some Recommended Readings:
- Himmelstein DS et al. (2017). Systematic integration of biomedical knowledge prioritizes drugs for repurposing. eLife. https://pubmed.ncbi.nlm.nih.gov/28936969/
- Chandak P, Huang K, Zitnik M. (2023). Building a knowledge graph to enable precision medicine. Sci Data. https://www.nature.com/articles/s41597-023-01960-3
- Ramirez R et al. (2021). Prediction and interpretation of cancer survival using graph convolution neural networks. Methods. https://pmc.ncbi.nlm.nih.gov/articles/PMC8808665/
- Geleta D et al. (2021). Biological Insights Knowledge Graph: an integrated knowledge graph to support drug development. bioRxiv. https://www.biorxiv.org/content/10.1101/2021.10.28.466262v1
- Zhang F et al. (2022). Co-occurring genomic alterations and immunotherapy efficacy in NSCLC. NPJ Precision Oncology. https://www.nature.com/articles/s41698-021-00243-7