Genome 3 T A Brown Pdf116
Defining the chronology of molecular alterations may identify milestones in carcinogenesis. To unravel the temporal evolution of aberrations from clinical tumors, we developed CLONET, which upon estimation of tumor admixture and ploidy infers the clonal hierarchy of genomic aberrations. Comparative analysis across 100 sequenced genomes from prostate, melanoma, and lung cancers established diverse evolutionary hierarchies, demonstrating the early disruption of tumor-specific pathways. The analyses highlight the diversity of clonal evolution within and across tumor types that might be informative for risk stratification and patient selection for targeted therapies. CLONET addresses heterogeneous clinical samples seen in the setting of precision medicine.
These observations prompted us to develop a second generation tool based on local (in contrast to global) optimization where estimates of purity and ploidy are derived from a few clonal events (Figure 1). We noted that the AF values of the informative SNPs within a somatic deletion result from the composition of signal from three cell populations: (i) non-tumor cells (contributing to the DNA admixture); (ii) tumor cells without the deletion; and (iii) tumor cells harboring the deletion (that is, a subclonal deletion given (ii) and (iii)). By modeling the probability distribution of the observed AFs, we compute a local estimate of the DNA admixture (1 - DNA purity) that accounts for both normal cell admixture and subclonal tumor cell population. After estimating local admixture values for all deletions across the genome, only selected lesions (from the most clonal side of the spectrum) contribute to the computation of the tumor sample global admixture. In the presence of homogeneously aberrant genomes (Figure 1, top), global and local approaches result in similar estimates; for heterogeneous genomes (Figure 1, bottom), the local approach focused on selected lesions (blue arrows in Figure 1) leads to more realistic estimates. Here, we present the full implementation of CLONET (CLONality Estimate in Tumors) and study the clonality of somatic aberrations from whole genome sequencing (WGS) data across 3 tumor types comprising 55 individuals with primary prostate cancer [10], 24 metastatic melanomas [11], and 21 lung adenocarcinomas [12].
We reasoned that the reads mapped into a genomic window can be partitioned into two sets: one set includes reads that equally represent parental chromosomes (copy number neutral reads); and the other set contains reads from only one parent chromosome (active reads). There are four main steps that, starting from neutral read counts, allow inference of clonality of any genomic window. First, we estimate the percentage of neutral reads within a genomic segment independently of its Log R value. Second, we use the Log R value to relate the neutral reads percentage with a local estimate of DNA admixture. Local estimates are then aggregated to estimate global admixture and clonality of somatic copy number aberrations (SCNAs). Third, aneuploid genomes are identified and the analysis is corrected accordingly. Finally, we extend the analysis to point mutations (PMs) and structural rearrangements (REARRs) in a coherent manner. In the following we will briefly detail each step.
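The relation between neutral reads and local admixture can be illustrated for the simplest case, a mono-allelic deletion in an otherwise diploid genome. The sketch below is not CLONET code, and Equation 1 is not reproduced in this text; the formulas are derived here from the definitions above under the assumption that cells without the deletion contribute two copies (all neutral reads) and cells with the deletion contribute one copy (all active reads). The function names are hypothetical.

```python
def neutral_read_fraction(adm_local: float) -> float:
    """Expected fraction of neutral reads (beta) in a mono-allelic deletion.

    adm_local is the fraction of cells that still carry two copies of the
    segment (normal cells plus tumor cells without the deletion).
    Assumption: 2-copy cells yield 2 units of DNA, deleted cells yield 1 unit.
    """
    if not 0.0 <= adm_local <= 1.0:
        raise ValueError("adm_local must be in [0, 1]")
    neutral = 2.0 * adm_local      # reads from both parental chromosomes
    active = 1.0 - adm_local       # reads from the single retained chromosome
    return neutral / (neutral + active)


def local_admixture_from_beta(beta: float) -> float:
    """Invert beta = 2G / (1 + G) to recover the local admixture G."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be in [0, 1]")
    return beta / (2.0 - beta)


def expected_af_peaks(beta: float) -> tuple[float, float]:
    """Expected AF modes at informative (heterozygous) SNPs in the deletion.

    Neutral reads split evenly between alleles; active reads all carry the
    retained allele, so the AF sits at beta/2 or at 1 - beta/2.
    """
    return beta / 2.0, 1.0 - beta / 2.0
```

For example, a sample with 30% normal cells, 20% tumor cells without the deletion, and 50% tumor cells with it has a local admixture of 0.5; `neutral_read_fraction(0.5)` gives a beta of two thirds, and `local_admixture_from_beta` maps that beta back to 0.5.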
Reasoning that the signal from normal cells is uniform along the genome, local admixture values are then clustered, and the lowest median value among all the clusters determines the global admixture (Adm.global) of the sample. Reasoning that the more the local admixture value of a segment Seg differs from the global one, the more subclonal Seg is, the clonality of Seg, Cl_Seg, is computed as the percentage of tumor cells in a sample harboring Seg. If Seg is a gain, Equation 1 extends by rescaling the percentage of neutral reads β to recover the percentage of reads sequenced from cells that do not harbor the gain of Seg (Figure 2B; Materials and methods). Bi-allelic deletions are treated separately; if the deletion is clonal, its AF distribution follows a binomial distribution (β = 1) and represents only DNA admixture, but in case of subclonality, the value of β is proportional to the percentage of tumor cells that do not harbor the deletion (Additional file 2).
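The aggregation step can be sketched as follows. This is an illustration under stated assumptions, not the CLONET implementation: the text does not specify the clustering method, so a simple gap-based one-dimensional clustering with a hypothetical `max_gap` parameter stands in for it, and the clonality formula is derived from the definition above (cells harboring Seg divided by tumor cells) for a mono-allelic deletion.

```python
from statistics import median


def cluster_1d(values, max_gap=0.05):
    """Hypothetical stand-in clustering: split sorted values at large gaps."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one local admixture value is required")
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] > max_gap:
            clusters.append([v])        # gap too wide: start a new cluster
        else:
            clusters[-1].append(v)      # close enough: extend current cluster
    return clusters


def global_admixture(local_values, max_gap=0.05):
    """Global admixture = lowest median among clusters of local values."""
    return min(median(c) for c in cluster_1d(local_values, max_gap))


def clonality(adm_local, adm_global):
    """Fraction of tumor cells harboring the segment.

    Cells with the lesion: 1 - adm_local; tumor cells: 1 - adm_global.
    The result is clipped to [0, 1] to absorb estimation noise.
    """
    if adm_global >= 1.0:
        raise ValueError("adm_global must be below 1 (no tumor cells otherwise)")
    return max(0.0, min(1.0, (1.0 - adm_local) / (1.0 - adm_global)))
```

With local admixture values of 0.29, 0.30, 0.31, 0.55, 0.57, and 0.80, the lowest cluster has median 0.30, which becomes the global admixture; a deletion with local admixture 0.65 then has a clonality of about 0.5, meaning roughly half of the tumor cells carry it.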
In terms of ploidy assessment, we noted significant differences in the melanoma dataset, where CLONET tends to undercall polyploidy (Figure S5D in Additional file 7). Although our conservative approach may introduce false negative calls, we identified cases where close inspection of allele-specific data is not necessarily compatible with the original ABSOLUTE calls of polyploidy [11]; for example, ME049T (Figure S5E in Additional file 7), where the relative distances among Log R peaks are more compatible with a diploid genome.
Having assessed the variability in the clonality status of aberrations in individual patients, we then evaluated how it distributes along the whole genome as represented in clonality circos plots (Figure 6A). We can appreciate commonality among the three tumor types in some specific genomic regions. Genes on 8p are found clonally deleted in 96%, 100%, and 100% of the prostate, melanoma, and lung samples, respectively. They include the prostate cancer suppressor NKX3-1 [19], the gene CSMD1, which is recurrently deleted in melanoma [11], and the phosphatase DUSP4, which is involved in negative feedback control of EGFR signaling in lung adenocarcinoma [20].
Clonality distribution along tumor genomes. (A) For each tumor type the circos plot represents the distribution of clonality along the genome. Each circos plot has five data tracks. The two outermost tracks report the proportion of clonal/subclonal losses and gains, respectively. Then, PM APs and the associated clonality status are depicted in the middle track. Finally, the inner tracks show the clonality status of intra- and inter-chromosomal REARRs. (B) Comparison of the clonality status of losses along chromosome 10q across tumor types. For each tumor type, the clonality status of losses was sampled every 100 kb and the proportion of clonal/subclonal losses reported.
It is important to note that to properly construct comprehensive tumor evolution maps, thousands of genomes are required to reach adequate statistical power when considering the range of frequencies of co-occurring aberrations. As sequencing data for multiple tumor samples and tumor types become accessible to the community, maps will be drafted and completed over multiple iterations. In turn, this will allow assigning an evolution time stamp to each new clinically profiled sample based on where the tumor genome fits into the evolution maps. Establishing a 'timeline' of cancer progression is critical for biomarker development in the precision medicine era, allowing clinicians to more accurately gauge prognosis by adding a molecular measure of progression to standard staging and grading systems, which do not associate with molecular heterogeneity of samples (Table S3 in Additional file 5). Such an approach may allow improved clinical decision-making in a variety of cancer types, guiding the choice of management strategies and level of aggressive therapy based on how far the tumor has progressed at the genomic level.
Distinguishing gatekeeper or driver mutations from passenger mutations is a high priority for understanding disease progression. Knowledge of the chronology of molecular alterations can provide important insights into defining the most clinically relevant mutations that characterize important milestones in cancer. Genome sequencing of cancer samples taken during the course of precision medicine might demonstrate a wider range of genomic heterogeneity than previously observed in international genome sequencing studies. These clinical samples demonstrate more heterogeneity and admixture of both tumor and non-tumor components. To aid in unraveling the critical temporal evolution of somatic aberrations in challenging clinical tumors, we developed CLONET, a computational tool that requires only a few clonal events to precisely estimate tumor purity and ploidy and then nominates the hierarchy of genomic aberrations. We demonstrate that CLONET can determine the clonality of different types of somatic aberrations, including SCNAs, PMs, and REARRs, using either WGS or WES datasets. We anticipate that with the emergence of larger genomic datasets, CLONET could help map out the evolution of molecular alterations.
Additional file 2: Figure S1. Pictorial representation of the method CLONET uses to manage bi-allelic deletions. Three types of cells are considered: normal cells (yellow) with gene A (dark brown) and gene B (light brown) present in two copies; tumor cells of type I (light red) harbor a bi-allelic deletion of both genes A and B; tumor cells of type II (dark red) have zero copies of A and one copy of B. The bottom row reports the distribution of the expected AF at informative SNPs within gene A and gene B. In pure diploid cells with two copies of genes A and B, AF is centered at 0.5. In type I tumor cells, there is no signal, as both alleles are deleted. In type II tumor cells, one allele of gene B is present and the AF assumes values 0 or 1. In a hypothetical mixture of normal and tumor cells (right panel), the distribution of AFs along gene A reports only the signal from the DNA admixture, while the distribution of gene B corresponds to a mono-allelic deletion, reflecting the fact that cells with a bi-allelic deletion do not contribute to the AF. (PDF 51 KB)
Additional file 8: Figure S6. (A) Summary of aberrations: genomic events (GE) characterized in three tumor datasets generated through whole genome sequencing. (B) Histogram of the alternative allelic proportion after Adm.global correction of the copy number neutral somatic point mutations detected in a cohort of 264 melanoma samples from TCGA. Pie chart indicates the mean numbers of events classified as clonal (green) or subclonal (blue) across samples. (C) Boxplot of the percentage of clonal genes across GEs and tumor types with respect to the total number of aberrant genes. Superimposed strip-charts represent per sample data: the size of each dot is proportional to the number of genes analyzed. (PDF 60 KB)