Public Deliverables
D7.3 Identification of cell subpopulations in each tumour type, their association with response to therapy, and prediction of effective alternative therapies
Tumour decomposition into cells and subtypes and inference about the effects of treatments and perturbations on each tumour component (cell or tumor subclone).
D1.4 Model development data including genetic perturbation screens and gene-drug synergies
This deliverable reports on the generation of CROPseq and drug screening data for two Ewing Sarcoma cell lines, one Hepatoblastoma cell line and one B-cell Acute Lymphoblastic Leukemia cell line.
D7.2 Software to define tumour subclones and association with therapy response
Flow cytometry is an important diagnostic tool in childhood acute lymphoblastic leukaemia (ALL), but flow cytometry data analysis is limited by multiple sources of bias and variation. We present a unified machine learning framework for automated analysis of a standardized diagnostic paediatric leukaemia staining that can overcome these challenges. We applied our framework to a large cohort of ALL flow cytometry samples and demonstrated how it can robustly extract the frequencies of cell lineage populations with minimal expert intervention. This work provides a proof of concept that our method meets the needs of an automated analysis tool for diagnostic flow cytometry data.
D8.3 Metabolic models
Oncogene-driven metabolic rewiring in cancer is key to allowing tumour cells to proliferate in low-nutrient and low-oxygen conditions. To study such phenomena, reconstructing context-specific metabolic models through omics data integration is crucial. Here we report an original pipeline for constructing context-specific metabolic models from scRNA-seq data and its application to scRNA-seq data from Ewing Sarcoma.
D4.3 Topological analysis of multi-omics and multi-cancer molecular networks resulting in the definition of molecular mechanisms
Three types of network-based analysis of gene-gene interaction networks have been suggested and tested on the multi-omics paediatric cancer datasets. A user-friendly computational environment for the joint application of matrix factorization and network analysis has been implemented.
D8.2 Network models for molecular target identification
We focused on the development of patient-specific signalling networks using prior knowledge about the molecular events and CRISPR perturbation datasets, and associated the activity of the nodes of the signalling network with drug response data to find molecular targets.
D4.4 Consensus multi-omics subtypes of paediatric cancers
We report on the implementation of a method for multilayer community trajectory analysis and its applications, including a published study on medulloblastoma, a study on congenital myasthenic syndromes, and a study on the functional characterization of commonalities among a selection of paediatric tumours.
D2.4 DAC Portal prototype, validated analytical workflows, analysis prototype, updated metadata standards and portal prototype
We report on the selection of appropriate data models for handling the data and metadata available to the iPC Central Computational and Data Platform. We also report on the current status of development of the iPC Data portal.
D3.3 Integration of INtERAcT, MelanomaMine and LimTox and application to biomedical publications on paediatric cancers
This deliverable reports on the integration of INtERAcT and the implemented text mining workflow. The workflow was developed to adapt LimTox and MelanomaMine to pediatric tumor abstracts from PubMed and relies on INtERAcT in its downstream component of inferring molecular associations between entities extracted from unstructured text.
D1.3 Synthetic data for testing and training patient, cancer, and drug models
Synthetic data generation is emerging as an important solution for precision medicine. Therefore, an explainable Variational AutoEncoder (VAE) model is developed for synthetic transcriptomics data generation in medulloblastoma. The model can be used to complement and interpolate available data with synthetic instances. It is also transparent as it is able to match the learned latent variables with unique gene expression patterns. The model can also be adapted to other pediatric cancers and the resulting synthetic datasets used to test and train patient, cancer, and drug models in other work packages of the iPC project.
D4.2 An interactive online atlas of interconnected network maps based on the NaviCell platform
The NaviCell 3.0 web server provides a complete and automated web-based infrastructure for hosting molecular maps, patient similarity network maps, and multi-omics datasets for the project. The NaviCell platform supports molecular map navigation and exploration using the Google Maps™ engine, whose navigation logic it adopts. The NaviCell 3.0 web server is freely available, and several step-by-step tutorials are accessible.
D7.1 Application of software enabling computational deconvolution of bulk RNA-sequencing data to immune cell profiles of patient samples
Computational deconvolution of bulk RNA-sequencing data to infer cell type composition of a sample is challenging. Benchmarking of various computational deconvolution tools revealed various data processing parameters that impact deconvolution accuracy and revealed the importance of a complete reference matrix. As a complete reference matrix is often not available, an algorithm was designed that can handle missing cell types. This algorithm can be applied to establish the immune cell repertoire of primary tumor biopsies without prior knowledge of the full spectrum of cell types in the biopsy.
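As an illustration of what reference-based deconvolution means in general, the sketch below estimates cell-type fractions from a bulk profile by non-negative least squares with multiplicative updates. It is not the algorithm developed in D7.1 and does not handle missing cell types; the function name `deconvolve`, the toy reference matrix, and the iteration count are assumptions made for this example only.

```python
def deconvolve(reference, bulk, n_iter=2000, eps=1e-12):
    """Estimate non-negative cell-type fractions f with reference @ f ~ bulk.

    reference: list of gene rows, each holding one expression value per cell type.
    bulk: list of expression values (one per gene) for the mixed sample.
    All inputs are assumed non-negative (hypothetical toy setting).
    """
    n_genes = len(reference)
    n_types = len(reference[0])
    # Precompute R^T b and R^T R once; they do not change between iterations.
    rtb = [sum(reference[g][k] * bulk[g] for g in range(n_genes))
           for k in range(n_types)]
    rtr = [[sum(reference[g][j] * reference[g][k] for g in range(n_genes))
            for k in range(n_types)] for j in range(n_types)]
    # Start from a uniform, strictly positive guess.
    f = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        denom = [sum(rtr[j][k] * f[k] for k in range(n_types))
                 for j in range(n_types)]
        # Multiplicative update: keeps every entry of f non-negative.
        f = [f[j] * rtb[j] / (denom[j] + eps) for j in range(n_types)]
    total = sum(f)
    return [x / total for x in f]  # normalise so the fractions sum to 1


# Toy reference: 4 genes (rows) x 3 cell types (columns), made-up values.
reference = [
    [10.0, 1.0, 0.0],
    [1.0, 8.0, 1.0],
    [0.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_fractions = [0.6, 0.3, 0.1]
# Noise-free bulk profile simulated as a weighted sum of the reference columns.
bulk = [sum(row[k] * true_fractions[k] for k in range(3)) for row in reference]

estimated = deconvolve(reference, bulk)
print([round(x, 3) for x in estimated])
```

In this noise-free toy case the estimate should land close to the simulated fractions. With a real reference matrix that lacks a cell type present in the biopsy, a plain fit of this kind would misattribute that signal, which is the situation the deliverable's algorithm is designed to address.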
D3.1 Identification of important regulatory elements using multi-level matrix factorization approaches
D3.1 describes the techniques for dimensionality reduction used in iPC and their application to a selection of cohorts (at different omics levels) as well as a meta-analysis of the four solid tumor types of interest. The goal of the deliverable is to provide a list of pathways and biological functions having a key role in multiple paediatric cancers.
D3.2 Adaptation of MelanomaMine and LiMTox to the analysis of paediatric cancers and application to biomedical publications on paediatric cancers
This deliverable reports on the implementation of the iPC text mining workflow and three use cases for extracting biomedical information from large volumes of text. The workflow builds on the general framework of two text mining tools, LimTox and MelanomaMine. These tools will be used within the iPC project and beyond, with a clear impact on the research community.
D8.1 Data-driven model for molecular targets and drug repositioning
This deliverable provides a detailed overview of the proposed computational tool for predicting patient-specific drugs with potential therapeutic benefit in paediatric cancer treatment, and presents evidence of how well the model predicts such drugs.
D2.3 Recommended metadata standards and portal prototype
The iPC project aims to ensure interoperability of data between different resources, so the platform must enforce principles and well-defined standards for data accessibility, usability, and registration. This deliverable provides an overview of the different approaches to representing metadata within the iPC Platform, and the efforts to integrate and leverage them within the iPC Catalog and the overall iPC Central Computational and Data Platform to enable meaningful management of research data.
D1.2 Collection of high-quality clinical and molecular paediatric cancer datasets as well as other tumour types
In this deliverable, demographic, clinical, and molecular profiles were collected for several pediatric and adult tumors, with an additional focus on collections of single-cell profiles of high-risk cancers. The datasets will be used to evaluate the effects of treatments and perturbations on cancer cells, build models, and help decipher regulatory interactions. These data will allow characterization of cancer cell types that predict treatment outcome, as well as cell types that are resistant to therapies.
D4.1 Building of cancer type-specific multi-layered molecular and patient similarity networks
iPC applies network inference techniques to a selection of pediatric patient cohorts at different omic levels. The resulting networks, for example molecular patient networks, will be used in downstream project activities that rely on networks.
D2.2 Initial infrastructure framework
An initial demonstrator of the iPC infrastructure is reviewed. The platform’s architecture is based on modules, which allow parallel development and integration of different open-source software components. This allows us to leverage other efforts and contributes towards the platform’s sustainability and maintainability. The release of a minimum viable platform allows us to capture early feedback from researchers at iPC.
D1.1 Collection of public molecular and clinical data
The development of iPC predictive models for paediatric cancer genesis, progression, and response to therapies, as well as patient response to therapy, requires a vast quantity of molecular and clinical training data. In this deliverable, we have assembled a collection of these data to enable model construction and testing.
D10.1 Internal and external IT communication infrastructure and project website
This deliverable constitutes the launch of the internal and external iPC communication infrastructure including the establishment of mailing lists, new IT infrastructure and the iPC website.
D11.1 Project Quality Plan
A handbook covering the project management process, review process, quality checks, and meeting organisation, which is communicated to all partners.