The ICGA Data Visualization Platform

The Indian Cancer Genome Atlas (ICGA) provides a comprehensive, clinically annotated, multi-omics data visualization platform that enables an integrative understanding of cancer in the Indian population.

Powered by the cBioPortal framework, the ICGA Portal allows researchers to explore and analyze genomic, transcriptomic, proteomic, and clinical datasets through an intuitive and interactive interface. The platform currently hosts Indian breast cancer cohorts and will expand to additional cancer types over time.


About the ICGA Portal

The ICGA Portal offers researchers a gateway to processed cancer genomic datasets curated by ICGA. It supports exploratory and hypothesis-driven analyses while ensuring compliance with ethical, legal, and data governance frameworks.

The portal has been developed with philanthropic support from Strand Life Sciences Ltd. and reflects ICGA’s commitment to responsible data sharing in cancer research.

Introduction to the ICGA Data Portal

Data Availability and Ethical Compliance

ICGA adheres to strict ethical and regulatory standards in data sharing.

  • All breast cancer datasets available through the portal are de-identified and limited to somatic mutations only.
  • Germline data is not analyzed or shared. This restriction is a result of ethical approvals that ensure patient privacy and compliance with responsible data-use guidelines.
  • Data is generated and processed using standardized pipelines designed to ensure quality and compliance.

The portal provides secondary-level processed datasets, including:

  • Somatic variant data
  • Gene expression matrices
  • Proteomics summaries
  • Associated de-identified clinical metadata

The portal visualizes highly processed, curated, and harmonized multidimensional cancer genomics data. 

Study Title: ICGA Breast Cancer Cohort

The ICGA Breast Cancer Cohort is a foundational dataset within the ICGA program. It comprises:

  • Tumor and matched adjacent non-tumor control samples
  • Treatment-naïve patients, age range 18–90 years
  • Multi-omics profiling including genome sequencing, RNA sequencing and proteomics
  • Clinically annotated datasets collected across multiple centers in India

Metadata of the ICGA’s cohort on breast cancer patients

Data Access

The ICGA follows a controlled access model governed by the Data Access Committee (DAC) in alignment with ICGA Data Policy and DBT PRIDE guidelines.

The ICGA Foundation is committed to aligning its data governance framework with the Digital Personal Data Protection (DPDP) Act 2023 and the Digital Personal Data Protection Rules 2025, notified by the Ministry of Electronics and Information Technology (MeitY) in November 2025. The Rules provide for a phased implementation, with full substantive compliance required by May 2027. ICGA’s governance policies and data access framework are being updated accordingly during this period. Researchers with queries about data governance or compliance may write to suveera@icga.co.in.

How to Access ICGA Data

ICGA data is accessible through two routes. Both routes require a completed application and approval by the DAC before access is granted.

Route A — ICGA Data Portal (cBioPortal)

Interactive, browser-based access to processed and visualised datasets within the secure ICGA portal environment. Note that no raw data is available through this route. Researchers can explore somatic mutation profiles, gene expression patterns, proteomics summaries, and associated clinical metadata using the portal’s built-in analysis tools. Data remains within ICGA’s secure infrastructure at all times.

Route B — AWS Controlled Access

Programmatic access to processed data files — including VCF/MAF files, expression matrices, and proteomics outputs — for researchers requiring computational analysis beyond what the portal interface supports. Access is provided within ICGA-managed, India-based infrastructure. Applicants are responsible for all associated AWS infrastructure costs. Contact ICGA for details.
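As an illustration of the kind of computational analysis Route B is meant for, the following is a minimal Python sketch that counts how many distinct samples carry at least one somatic variant in each gene of a MAF file. It assumes the file is tab-separated and uses the conventional MAF column names Hugo_Symbol and Tumor_Sample_Barcode; the actual layout of ICGA files is not described on this page, and the sample identifiers and rows below are made up for the example.

```python
import csv
import io
from collections import Counter


def count_mutated_samples_per_gene(maf_handle):
    """Count distinct samples with at least one variant in each gene.

    Assumes a tab-separated MAF-style file with the conventional
    'Hugo_Symbol' and 'Tumor_Sample_Barcode' columns (an assumption,
    not a documented property of ICGA files).
    """
    # MAF files may start with '#' comment lines (e.g. a version header).
    data_lines = (line for line in maf_handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")

    # Collapse multiple variants in the same gene and sample to one pair.
    gene_sample_pairs = set()
    for row in reader:
        gene_sample_pairs.add((row["Hugo_Symbol"], row["Tumor_Sample_Barcode"]))

    return Counter(gene for gene, _sample in gene_sample_pairs)


# Made-up rows standing in for a real processed MAF file.
example_maf = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\tVariant_Classification\n"
    "TP53\tSAMPLE_01\tMissense_Mutation\n"
    "TP53\tSAMPLE_01\tNonsense_Mutation\n"
    "TP53\tSAMPLE_02\tMissense_Mutation\n"
    "PIK3CA\tSAMPLE_02\tMissense_Mutation\n"
)

counts = count_mutated_samples_per_gene(example_maf)
print(counts.most_common())  # [('TP53', 2), ('PIK3CA', 1)]
```

With a real file, the io.StringIO object would be replaced by an open file handle inside the approved compute environment.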

Both routes are subject to a single consolidated application reviewed by the DAC.

Note: Commercial and industry applications are subject to additional review including execution of a Commercial Data Licensing Agreement. Contact suveera@icga.co.in before submitting any data request.

Application Process (Single-Step)

Researchers seeking access must submit a single consolidated application that includes:

  • Research proposal and objectives
  • Details of investigators and institutional affiliation
  • Data requirements (type, scope, and access route requested)
  • Data management and security plan
  • Timelines and expected outcomes
  • Institutional ethics approvals
  • Conflict of interest disclosures
  • Details of any international or inter-institutional collaborations

All applications are reviewed by the ICGA Data Access Committee (DAC). Incomplete applications will not be considered. Final approval is followed by the signing of a Data User Agreement (DUA) with ICGA.

Processed Data Access (Conditional)

Where there is clear scientific justification, and subject to DAC approval, researchers may be granted access to processed data files within an ICGA-approved secure compute environment. Data does not leave ICGA’s governed infrastructure unless the DAC grants an exception. The mode of access will be determined by ICGA on a case-by-case basis following the DAC’s review.

Such requests must demonstrate:

  • A scientific need that cannot be met through portal-based analysis
  • Adequate institutional data security measures, including encryption at rest and in transit
  • Confirmation that all data will be stored and processed on India-based infrastructure
  • Compliance with ICGA’s Data Policy and Data User Agreement

Eligible file types include processed somatic variant data (VCF/MAF), normalised expression matrices, and proteomics outputs.

Data Not Available

The following are not available for external access at this stage:

  • Raw sequencing data (BAM, FASTQ)
  • Raw proteomics data
  • Germline variant data
  • Any data that may increase re-identification risk

Requests for the above will not be considered under the current policy framework.

Conditions of Use

All approved users must comply with the following conditions throughout the approved access period:

  1. Data must be used strictly for the approved research purpose. Any change in research purpose, methodology, or key personnel must be notified to ICGA within 30 days and may require a new application.
  2. No attempt to re-identify any individual from ICGA de-identified datasets is permitted. This prohibition applies to all users and all methods of analysis.
  3. Data must not be shared with parties not named in the approved application and Data User Agreement.
  4. All ICGA data must be stored, accessed, and processed exclusively on India-based infrastructure. Transfer to servers or compute environments outside India is not permitted without explicit written approval from the ICGA DAC.
  5. Significant derived datasets generated from ICGA data — such as integrated multi-omics outputs or novel variant call sets — should be shared back with ICGA to enable cumulative scientific value for the community.
  6. ICGA must be acknowledged in all oral presentations, written disclosures, and publications resulting from analyses of ICGA data.

Suggested citation: “The results [published or shown] here are based, in whole or in part, on data generated by the Indian Cancer Genome Atlas (ICGA) Network: https://icga.in, https://icga.net.in.”

ICGA Data, Resources, and Materials

ICGA is dedicated to advancing cancer research through a rigorous, end-to-end process that involves:

  • Collecting diverse biospecimens and clinical metadata from partnering hospitals and clinical centres across India
  • Generating molecular analytes for detailed multi-omics characterisation
  • Applying standardised sequencing, proteomic, and imaging methods
  • Curating and annotating data to enable responsible, reproducible research
  • Providing accessible data to the research community through a governed access framework

Requests for Biological Samples and Materials

Due to legal and ethical considerations, ICGA is unable to accommodate requests for biological samples, analytes, or tissue materials. All cases within the ICGA programme have been consented exclusively for ICGA use, and the redistribution of materials to outside parties is prohibited. Additionally, the majority of tissue samples have been depleted through the multiple assays performed for ICGA research.


Frequently Asked Questions (FAQ)

The portal provides access to processed, secondary-level data, including:

  • Clinical metadata and clinical annotations
  • Sample-level phenotype data
  • Somatic mutation data
  • Copy Number Alteration (CNA) data
  • Gene-level summarized molecular profiles

Most visualizations allow download of the underlying data.

Yes, provided your approved access request includes download permission. Data can be downloaded from:

  • Study pages
  • Query results
  • Visualization panels

Users can also define custom cohorts (“virtual studies”) using clinical or genomic filters and download the corresponding datasets.
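The logic behind a custom cohort can be sketched offline in Python: select samples that pass a clinical filter and, optionally, carry a mutation in a chosen gene. This is only an illustration of the idea, not the portal's implementation; the field names (sample_id, age, er_status, gene) and all records below are hypothetical.

```python
def select_cohort(clinical_rows, mutation_rows, clinical_filter, mutated_gene=None):
    """Return sorted sample IDs passing a clinical predicate and,
    optionally, carrying at least one mutation in `mutated_gene`."""
    selected = {row["sample_id"] for row in clinical_rows if clinical_filter(row)}
    if mutated_gene is not None:
        mutated = {m["sample_id"] for m in mutation_rows if m["gene"] == mutated_gene}
        selected &= mutated  # keep only samples meeting both criteria
    return sorted(selected)


# Hypothetical de-identified records, for illustration only.
clinical = [
    {"sample_id": "S1", "age": 45, "er_status": "Positive"},
    {"sample_id": "S2", "age": 62, "er_status": "Negative"},
    {"sample_id": "S3", "age": 58, "er_status": "Positive"},
]
mutations = [
    {"sample_id": "S1", "gene": "PIK3CA"},
    {"sample_id": "S2", "gene": "TP53"},
    {"sample_id": "S3", "gene": "PIK3CA"},
]

cohort = select_cohort(
    clinical,
    mutations,
    clinical_filter=lambda r: r["er_status"] == "Positive" and r["age"] >= 50,
    mutated_gene="PIK3CA",
)
print(cohort)  # ['S3']
```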

No. The portal does not host raw sequencing data or raw count-level datasets.

It is designed for access to curated and processed data only.

Access to controlled datasets requires application through the ICGA Data Access Committee (DAC).

This includes:

  • Raw RNA-seq data (including count-level data)
  • Variant Call Format (VCF) files
  • Other detailed datasets not available through the portal

ICGA Data Access Form

For queries, contact:

  • suveera@icga.co.in
  • data-access@icga.co.in

No. Since the portal provides processed, gene-level summarized data rather than raw count matrices, workflows requiring raw count reprocessing are not supported directly from portal downloads.

The portal supports:

  • Gene-level CNA downloads through the query interface
  • Segment-level copy number downloads through study-specific links

Users can query specific genes or define cohorts before exporting results.

No. The ICGA Portal is an independent instance of the cBioPortal platform and hosts ICGA-specific datasets only.

No. The interface is designed to be intuitive. However, familiarity with cBioPortal workflows may be helpful for advanced queries and cohort analyses.

Yes. Users can apply clinical and genomic filters to create custom cohorts (“virtual studies”) for downstream exploration and data export.

Users are required to acknowledge the Indian Cancer Genome Atlas (ICGA) in any publications, presentations, or outputs derived from ICGA data.

Suggested citation:

“The results published or shown here are based, in whole or in part, on data generated by the Indian Cancer Genome Atlas (ICGA) Network: https://icga.in and https://icga.net.in.”

Where applicable, users should also cite associated ICGA publications relevant to the dataset used.

For citation-related queries, contact:

  • suveera@icga.co.in
  • data-access@icga.co.in

Yes. ICGA datasets are periodically updated as new data is generated, processed, and curated.

Breast cancer study data, for example, is routinely updated.

For reproducibility and future reference, users should record:

  • Study identifier
  • Date of data access
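One simple way to keep the study identifier and access date alongside an analysis is to write a small provenance record. The Python sketch below is an assumption about how a user might do this, not an ICGA requirement; the function name, the field names, and the identifier "example_study_id" are hypothetical.

```python
import json
from datetime import date


def make_access_record(study_id, access_date=None, note=""):
    """Return a JSON string recording which study was accessed and when."""
    record = {
        "study_id": study_id,
        # Default to today's date when no access date is supplied.
        "access_date": (access_date or date.today()).isoformat(),
        "note": note,
    }
    return json.dumps(record, indent=2, sort_keys=True)


# Hypothetical identifier and date, for illustration only.
record_json = make_access_record(
    "example_study_id", date(2026, 1, 15), "mutation table export"
)
print(record_json)
```

Saving this record next to the downloaded files makes it easier to state which version of a dataset an analysis used.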

For dataset version queries, contact:

  • suveera@icga.co.in
  • data-access@icga.co.in