Summary of first 18 months

15 January, 2021

The treatment of pediatric cancers presents specific challenges distinct from the treatment of adult cancers. The iPC consortium is focused on developing a comprehensive set of computational models and software to address some of the most pressing challenges that clinicians face in their daily treatment of pediatric patients. Towards this aim, iPC is combining a multitude of knowledge-base, machine-learning, and mechanistic models to predict optimal standard and experimental therapies for each child.

Our approach is based on the development of virtual patient models, i.e. in-silico avatars who resemble the molecular and clinical landscape of the child patient and can be used to investigate computationally personalized diagnostics and recommend treatments. To make our results accessible to all users, iPC is developing a computational platform that also allows caregivers to query models and infer benefits and drawbacks for specific treatment combinations for each child.

Work performed from January 2019 to June 2020

WP1: We have already reached some agreements with clinical trials and are putting agreements in place to facilitate the work and distribution of data and other resources. These are often challenging because open clinical trials are protective of their data and generally do not agree to share any data before the trial closes to avoid biasing trial results.

WP2: The Data Management Plan (DMP) was submitted by the end of M06, containing an initial overview of the data expected to be used and/or generated in the context of iPC. An initial version of the iPC data catalogue has been deployed, allowing metadata searching, filtering and visualization. This initial release already includes information from the OpenPBTA project. A cloud-based storage service for data sharing across iPC partners has been made available. It facilitates data exchange across partners and will be the basis for data ingestion into the analytical workflows using cloud-based and HPC systems. A Virtual Research Environment (iPC openVRE) has been set up, which allows the deployment of partners' tools, e.g. DoRoThEA. An initial data model has been proposed following the existing ones for EGA, ICGC and KidsFirst. The data model is associated with ontologies and controlled vocabularies to facilitate interoperability across systems. This constitutes the foundation for the integrated iPC submission systems. A general framework integrating an initial set of components has been deployed. The main portal gives access to the platform components from a single place using a single log-in. Progress has been made towards a federated login system based on Keycloak, which makes use of OpenID Connect for the Single-Sign-On authentication and authorization flow.

WP3: The partners identified common points and are defining a plan to guarantee the first deliverable due in M20. A member of BSC spent 3 months at IBM to work jointly on a project focused on NLP in paediatric oncology. Several packages have been open-sourced. These packages are the basis for upcoming activities in this work package. The initial literature-based networks have been developed by combining assets and tools from BSC and IBM.

WP4: A methodology for deriving gene-gene similarity and patient similarity networks from multi-omics datasets has been established and validated on several large publicly available datasets. A critical mass of molecular data for pediatric tumors has been assembled. An interface with other WPs using multi-level networks (WP3 and WP5) has been established.
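
As an illustration only, the following minimal sketch shows one generic way a patient similarity network could be derived from several omics layers. It is not the WP4 methodology: the function names, the use of Pearson correlation, the averaging across layers, the threshold of 0.5, and the toy data are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def patient_similarity_network(layers, threshold=0.5):
    """Build an edge list from per-layer patient profiles (hypothetical helper).

    layers: dict mapping layer name -> dict mapping patient id -> feature list.
    Returns {(patient_a, patient_b): mean correlation} for every pair whose
    correlation, averaged over all layers, reaches the threshold.
    """
    # Only patients measured in every layer can be compared.
    patients = sorted(set.intersection(*(set(p) for p in layers.values())))
    edges = {}
    for a, b in combinations(patients, 2):
        scores = [pearson(profiles[a], profiles[b]) for profiles in layers.values()]
        mean_score = sum(scores) / len(scores)
        if mean_score >= threshold:
            edges[(a, b)] = mean_score
    return edges


# Toy data, invented for illustration only.
toy_layers = {
    "expression": {"P1": [1.0, 2.0, 3.0], "P2": [2.0, 4.0, 6.1], "P3": [3.0, 1.0, 0.5]},
    "methylation": {"P1": [0.1, 0.5, 0.9], "P2": [0.2, 0.6, 0.8], "P3": [0.9, 0.4, 0.1]},
}
network = patient_similarity_network(toy_layers)
```

In this toy example only patients P1 and P2 are connected, because their profiles rise together in both layers while P3 follows the opposite trend. A real pipeline would additionally need normalization, feature selection and a principled choice of similarity measure for each data type.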

WP5: As relational and graph data have to be combined, we conceived a graphical model that generates a joint embedding of graph and relational data. We performed a first implementation of a document-oriented database for the storage of knowledge graphs, finished the implementation of the database, and submitted D5.1. The ipcrg package is running on a server at BSC. Methods for the scalable simulation of mechanical constraints, such as kinetic laws, and for decision making on their basis were published at AAAI 2020. A method for active relational graph inference employing Pearl’s do-calculus is under review at NeurIPS 2020.

WP6: Model development and expansion started; Alacris’ drug database DrugDB has been updated and extended; Ongoing evaluation of parameter optimization methods.

WP7: Submission of a manuscript to present algorithm for clonal deconvolution; Benchmarking study to evaluate the intrinsic performance of existing deconvolution methods, manuscript under revision; Selection of reference marker genes; Collection of single cell RNA-seq datasets (ongoing).

WP8: Development of: PaccMann (IBM); molecular models to predict outcome and prescribe the intensity of therapies for high-risk hepatoblastomas (BCM); a multi-omics model for drug discovery based on prior knowledge networks (UKL-HD); a tool that uses the transcriptional profile of a patient to predict small molecule inhibitors of cancer cell growth (UNINA). Integration of the tools developed by UNINA and UKL-HD and setting up of a case study around hepatoblastoma expression data provided by IGTP (UNINA, UKL-HD, IGTP).

WP9: A paper reporting sources of intratumoral heterogeneity based on single-cell data analysis has been published (Aynaud et al., Cell Reports, 2020), highlighting the role of the heterogeneous activity of EWSR1-FLI1, the fusion oncogene in Ewing sarcoma, in the regulation of cell proliferation, cell migration and metabolism. PDX datasets have been deposited to GEO (Curie), and a manuscript is under review at Science Advances revealing that neuroblastoma cancer cells resemble fetal sympathoblasts, but no other fetal adrenal cell type.

Expected results until the project end

The integration of the multiple data modalities collected for each patient holds the key to precise personalized models for diagnosis, prognosis and improved treatment for a multitude of diseases, including pediatric cancers. However, the volume and complexity of these data have so far prevented their widespread translation into clinical practice. The iPC project is building innovative clinic-ready software tools to address the real needs of pediatric tumor patients.

Specifically, iPC is:

  • Developing computational models and workflows for the systematic integration of large-scale quantitative data based on the integration of a multitude of orthogonal modeling approaches and powered by HPC implementations.
  • Integrating advanced mechanistic, statistical, and artificial intelligence models into the virtual patient framework to produce treatment recommendations based on deep molecular analyses of tumor and patient.
  • Investigating novel approaches for network reconstruction for large biological networks that can handle disparate data sources, including text sources.
  • Building personalized models of the immune system based on the microenvironment of the tumor and the characterizations of the immune system of the patient.
  • Advancing pipelines for the automatic generation of a Comprehensive Molecular Tumour Analysis report for each patient.
  • Assembling an online platform where models can be universally accessed and where FAIR principles for scientific data management and stewardship can be implemented while tackling cybersecurity challenges.