DEMETER2 Data Re-release 19Q3

-----------------------------------------------------------
Contents:
-----------------------------------------------------------

***********************************************************
INPUTS
***********************************************************

essential_genes.txt: Intersection of the 291 essential genes found in RNAi screens by Hart et al. (10.15252/msb.20145216) with those targeted by at least one of the combined RNAi libraries.

nonessential_genes.txt: Intersection of the 917 nonessential genes found in RNAi screens by Hart et al. (10.15252/msb.20145216) with those targeted by at least one of the combined RNAi libraries.

CRISPR_essential_genes.txt: The intersection of two studies using gene trap and CRISPR knockout to identify human essential genes, Hart et al. 2014 (10.15252/msb.20145216) and Blomen et al. 2015 (10.1126/science.aac7557), with the genes targeted by at least one of the combined RNAi libraries.

For more input data, please refer to https://doi.org/10.6084/m9.figshare.6025238.v4

shRNA_mapping.csv: Gene target annotation of hairpins.

***********************************************************
OUTPUTS
***********************************************************

Results of the DEMETER2 model applied to the combined (Achilles, DRIVE and Marcotte 2016) datasets.

-----------------------------------------------------------
D2 model results contents:
-----------------------------------------------------------

1) gene_effect: Estimated effect of gene knockdown for each cell line and gene (posterior mean estimates), rescaled so that the median of the positive controls (291 Hart essential genes) is -1 in each cell line. Rows are indexed by cell line (DepMap ID). Each column is a gene in the format HUGOSymbol (Entrez ID). Gene families whose members are all cotargeted are concatenated with ampersands ("&").
2) gene_dependency: For each effect score in the gene_effect matrix, the probability that the score represents a true depletion phenotype. This was calculated by decomposing each cell line's distribution of gene effects into a mixture of a null distribution (given by the effects of unexpressed genes, or by the effects of nonessential genes in lines where Broad RNASeq profiles were not available) and a true depletion distribution (given by the gene effects of CRISPR_essential_genes). For more details, see https://doi.org/10.1101/720243.
3) CL_data: Table of model parameters estimated for each cell line (cell lines are indexed by DepMap IDs). Includes:
   a) gene_slope: "Screen signal" parameter (q_j) for each cell line.
   b) CL_slope: Overall multiplicative scaling term. These are estimated for each cell line j and batch k in the model (a_jk), and are averaged across batches k here.
   c) noise_vars: Average noise variance per cell line. These are estimated for each cell line j and batch k in the model (sigma_jk), and are provided here averaged across batches k (according to sqrt()).
   d) offset_mean: Average posterior mean offset term per cell line. These are estimated for each cell line j and batch k in the model (a_jk), and are averaged across batches k here.
   e) offset_sd: Posterior SD of offset terms per cell line (a_jk), averaged across batches k. Averages are computed according to: sqrt().
4) hp_data: Table of model parameters estimated for each shRNA (shRNAs are indexed by their targeting sequence). Includes:
   a) Geff: Estimated gene knockdown efficacy of each shRNA (alpha_i).
   b) Seff: Estimated off-target efficacy of each shRNA (beta_i).
   c) unpred_offset_mean: Posterior mean of the 'unpredicted' across-cell-line average off-target effect per shRNA (c_i).
   d) unpred_offset_sd: Posterior std dev of the 'unpredicted' across-cell-line average off-target effect per shRNA (c_i).
   e) hairpin_offset_mean: Posterior mean of the additive offset per shRNA and batch (theta_ik), averaged across batches k.
   f) hairpin_offset_sd: Posterior std dev of the additive offset per shRNA and batch (theta_ik), averaged across batches k (as sqrt()).

***********************************************************
ADDITIONAL INFO
***********************************************************

* sample_info.csv *

Table of metadata per cell line. Columns include:

   DepMap_ID: Unique, stable identifier for cell lines.
   CCLE_ID: Readable name assigned to the cell line by CCLE.
   lineage: General cancer lineage category.
   lineage_subtype: Subtype of cancer lineage; specific disease name.
   lineage_sub_subtype: Subgrouping of disease.
   in_DRIVE: Whether the cell line was included in the DEMETER2 DRIVE dataset.
   in_Achilles: Whether the cell line was included in the DEMETER2 Achilles dataset.
   in_Marcotte: Whether the cell line was included in the DEMETER2 Marcotte dataset.
   Novartis_Primary_site: Primary site annotation downloaded from the Novartis web portal (https://oncologynibr.shinyapps.io/drive/).
   Novartis_Pathologist_Annotation: Pathologist annotation downloaded from the Novartis web portal.
   Marcotte_subtype_three_receptor: subtype_three_receptor taken from the Marcotte cell line subtypes file (http://neellab.github.io/bfg/).
   Marcotte_subtype_neve: subtype_neve taken from the Marcotte cell line subtypes file.
   Marcotte_subtype_intrinsic: subtype_intrinsic taken from the Marcotte cell line subtypes file.
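
The gene_effect description above mentions two conventions that are easy to get wrong when reading the matrix: the column label format, HUGOSymbol (Entrez ID) with "&" joining cotargeted family members, and the per-cell-line rescaling that puts the median of the positive controls at -1. The Python sketch below illustrates both. It is not the code used to produce this release. The function names are hypothetical, the label layout "SYM1&SYM2 (ID1&ID2)" for gene families is an assumption based on the description above, and dividing by the negated control median is only one simple way to obtain such a scaling.

```python
import re
from statistics import median

# Assumed label layout: "SYMBOL (ENTREZ)" for a single gene, or
# "SYM1&SYM2 (ID1&ID2)" for a cotargeted gene family.
_LABEL = re.compile(r"^\s*(?P<symbols>[^()]+?)\s*\((?P<ids>[^()]+)\)\s*$")


def parse_gene_label(label):
    """Split a gene_effect column label into (symbol, entrez_id) pairs."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized gene label: {label!r}")
    symbols = [s.strip() for s in match.group("symbols").split("&")]
    ids = [i.strip() for i in match.group("ids").split("&")]
    if len(symbols) != len(ids):
        raise ValueError(f"symbol/ID count mismatch in label: {label!r}")
    return list(zip(symbols, ids))


def rescale_cell_line(effects, positive_controls):
    """Rescale one cell line's effects so the control median becomes -1.

    effects: dict mapping gene label -> effect score for one cell line.
    positive_controls: iterable of gene labels used as positive controls.
    """
    controls = [effects[g] for g in positive_controls if g in effects]
    if not controls:
        raise ValueError("no positive controls found in this cell line")
    scale = -median(controls)
    if scale <= 0:
        raise ValueError("control median must be negative to rescale")
    return {gene: value / scale for gene, value in effects.items()}


if __name__ == "__main__":
    # Toy labels and scores, made up for illustration only.
    print(parse_gene_label("GENEA&GENEB (11&22)"))
    toy = {"GENEA (11)": -2.0, "GENEB (22)": -4.0, "GENEC (33)": 0.5}
    print(rescale_cell_line(toy, ["GENEA (11)", "GENEB (22)"]))
```

Because the released gene_effect matrix is already rescaled, applying rescale_cell_line to it with the same positive controls should leave the values essentially unchanged; the function is mainly useful as a sanity check or for rescaling against a different control set.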