Mass cytometry analysis
EQ Four Element Calibration Beads (Fluidigm) were added to cell suspensions in a 1:10 ratio (v/v). Samples were analyzed on a CyTOF2 (Fluidigm). The manufacturer’s standard operating procedures were used for acquisition at a cell rate of 500 cells per second. After acquisition, all FCS files from the same barcoded sample were concatenated (Bodenmiller et al., 2012). Data were then normalized, and bead events were removed (Finck et al., 2013) before doublet removal and de-barcoding of cells into their corresponding wells using a doublet-filtering scheme and single-cell deconvolution algorithm (Zunder et al., 2015). Subsequently, data were processed using Cytobank (https://www.cytobank.org). Additional gating on the DNA channels (191Ir and 193Ir) and 139La/141Pr was used to remove remaining doublets, debris, and contaminating particulates.
QUANTIFICATION AND STATISTICAL ANALYSIS
Data preprocessing and BP-R2 analysis
Data preprocessing
Raw data were transformed using the inverse hyperbolic sine transform with a cofactor of 5:
$$\mathrm{data} = \operatorname{arcsinh}\left(\mathrm{data}_{\mathrm{raw}} / 5\right)$$
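As an illustration only, the transform can be sketched in Python with the standard library. The function name `arcsinh_transform` and the example values are hypothetical and do not come from the original analysis code.

```python
import math

def arcsinh_transform(raw_values, cofactor=5.0):
    """Apply the inverse hyperbolic sine transform: asinh(raw / cofactor)."""
    return [math.asinh(v / cofactor) for v in raw_values]

# Hypothetical raw ion counts for one channel (not real data).
transformed = arcsinh_transform([0.0, 5.0, 50.0])
```

With a cofactor of 5, values near zero are compressed roughly linearly, while large values are compressed roughly logarithmically.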
Except where use of raw data values is specifically noted, all visualizations and analyses were performed using transformed data.
Data binning
For data binning, the range between the lower and upper 2.5% of observations was divided into ten equal bins, $bin_1, \ldots, bin_{10}$. The observations in the lower and upper 2.5% were assigned to the lowest and highest bins, respectively. To allow comparison of expression levels between samples within a time course experiment, all observations of the time course were pooled to determine the binning.
BP-R2
BP-R2 analysis is described in Lun et al., 2017 (https://github.com/BodenmillerGroup/Adnet). In brief, the median of a measured marker ($\tilde{y}_i$) was calculated for each bin $i$. Additionally, the overall mean of the medians of all 10 bins ($\tilde{m}$) was calculated (bins with fewer than 25 cells were discarded). Then, for each bin, we computed the sum of squared deviations from the bin median and the sum of squared deviations from the overall mean of medians. These values were summed over all bins, and the BP-R2 was defined as one minus the ratio between them:
$$R^2_{BP} = 1 - \frac{\frac{1}{n_{bins}} \sum_{i=1}^{n_{bins}} \frac{1}{n_i} \sum_{k=1}^{n_i} \left(y_{i,k} - \tilde{y}_i\right)^2}{\frac{1}{n_{bins}} \sum_{i=1}^{n_{bins}} \frac{1}{n_i} \sum_{k=1}^{n_i} \left(y_{i,k} - \tilde{m}\right)^2}$$
where $y_{i,k}$ is the $k$-th observation in bin $i$, $n_i$ is the number of observations in bin $i$, and $n_{bins}$ is the number of bins retained.
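The binning and BP-R2 computation described here can be sketched as follows. This is a minimal standard-library illustration, not the Adnet implementation: the function names, the linear-interpolation percentile, and the per-bin normalization by cell count are assumptions, and the binning is derived from the single sample passed in rather than from a pooled time course.

```python
import math
from statistics import median

def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 1]) of a pre-sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def assign_bins(x, n_bins=10, tail=0.025):
    """Split the range between the 2.5th and 97.5th percentiles into equal
    bins; observations in the tails go to the lowest or highest bin."""
    s = sorted(x)
    lower, upper = percentile(s, tail), percentile(s, 1 - tail)
    width = (upper - lower) / n_bins
    bins = []
    for v in x:
        idx = 0 if width == 0 else int((v - lower) // width)
        bins.append(min(max(idx, 0), n_bins - 1))  # clamp the tails
    return bins

def bp_r2(x, y, n_bins=10, min_cells=25):
    """One minus the ratio of within-bin deviation (from bin medians) to
    deviation from the overall mean of bin medians."""
    groups = {}
    for b, value in zip(assign_bins(x, n_bins), y):
        groups.setdefault(b, []).append(value)
    kept = [g for g in groups.values() if len(g) >= min_cells]
    if not kept:
        return float("nan")
    medians = [median(g) for g in kept]
    m = sum(medians) / len(medians)  # overall mean of bin medians
    within = sum(sum((v - med) ** 2 for v in g) / len(g)
                 for g, med in zip(kept, medians))
    total = sum(sum((v - m) ** 2 for v in g) / len(g) for g in kept)
    return float("nan") if total == 0 else 1 - within / total

# Hypothetical example: y depends strongly on x, so BP-R2 is close to 1.
x = [i / 100 for i in range(1000)]
y = [2 * v for v in x]
bins = assign_bins(x)
r2 = bp_r2(x, y)
```

Because the `1/n_bins` factor appears in both numerator and denominator, it cancels and is omitted from the code.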
Following the method described in Lun et al., 2017, we chose the maximum BP-R2 among all 108 control samples (FLAG-GFP overexpression and untransfected cells) as a cutoff. Relationships with a BP-R2 higher than this threshold were considered sufficiently strong to be of interest.
Signed-BP-R2
The relationship strengths calculated as BP-R2 were mostly positive, with a few negative BP-R2 values that arose mostly from the cell cycle marker IdU due to its bimodality. These rare and weak negative BP-R2 values were considered negligible and were therefore set to 0. This allowed the integration of signaling relationship directionalities, determined by Spearman correlation of bin medians ($r_{bin}$), with the relationship strengths ($R^2_{BP}$). The signed-BP-R2 score ($R^2_{signed\,BP}$) was calculated as:
$$R^2_{signed\,BP} = \begin{cases} R^2_{BP}, & r_{bin} \geq 0 \\ -R^2_{BP}, & r_{bin} < 0 \end{cases}$$
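A minimal sketch of the signed score, assuming that $r_{bin}$ is the Spearman correlation between the per-bin medians of the overexpressed protein and the per-bin medians of the measured marker. The function names and example values are hypothetical, and the Spearman coefficient is computed by hand (Pearson correlation of average ranks) to keep the example free of third-party dependencies.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    va = sum((p - ma) ** 2 for p in ra)
    vb = sum((q - mb) ** 2 for q in rb)
    if va == 0 or vb == 0:
        return 0.0  # assumption: treat a constant series as no direction
    return cov / math.sqrt(va * vb)

def signed_bp_r2(bp_r2_value, x_bin_medians, y_bin_medians):
    """Set negative BP-R2 to 0, then attach the sign of r_bin."""
    strength = max(bp_r2_value, 0.0)
    r_bin = spearman(x_bin_medians, y_bin_medians)
    return strength if r_bin >= 0 else -strength

# Hypothetical bin medians for an increasing and a decreasing relationship.
up = signed_bp_r2(0.4, [1, 2, 3, 4], [0.1, 0.5, 0.7, 0.9])
down = signed_bp_r2(0.4, [1, 2, 3, 4], [0.9, 0.7, 0.5, 0.1])
clipped = signed_bp_r2(-0.05, [1, 2, 3, 4], [0.1, 0.5, 0.7, 0.9])
```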
Hierarchical clustering was performed for kinases and phosphatases on their abundance-dependent signaling relationships, expressed as signed-BP-R2, to all measured phosphorylation sites with and without 10-minute EGF stimulation. Ward’s method and Euclidean distances (Ward, 1963) were used for the clustering, and the hierarchical tree was cut at a height of 5 to obtain 10 clusters of kinases and phosphatases, as shown in Figure S2A.
t-SNE analysis was performed with the R package ‘Rtsne’.
Functional enrichment and association analysis using STRING database
The functional enrichment and interaction enrichment analyses were performed using the STRING database v10.5 (Szklarczyk et al., 2017). All kinases and phosphatases tested were mapped to the STRING protein namespace, establishing the background protein set for further analysis. The functional enrichment p values were corrected using the Benjamini and Hochberg method (Benjamini and Hochberg, 1995); a detailed description of the statistical methods can be found in Franceschini et al., 2013. To test whether the functional signal within the clusters arises exclusively from homology between the proteins, homologous proteins were grouped together into one node so that proteins exhibiting high or medium homology did not contribute independently to the functional term enrichment count. To form the grouped representation of the STRING network, single-linkage clustering was applied to the homology relationships between the proteins, in which neighbors were defined as having a self-normalized bit score (BLAST bit score of the alignment between the two proteins divided by the bit score of the self-alignment of the shorter of the two proteins) equal to or higher than 0.2. For each functional term, a grouped node contributed to the enrichment count when one or more of the proteins forming the group were annotated with the term in question. This process was applied to the clusters and the background separately to ensure that, for groups in which proteins were shared between the cluster and the background, the functional term was counted in both sets.
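The two computational steps in this paragraph, multiple-testing correction and homology grouping, can be sketched as follows. This is an illustrative standard-library example, not the STRING implementation: the function names, protein labels, scores, and p values are all hypothetical, and single-linkage clustering at a fixed threshold is expressed as connected components via union-find.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def group_by_homology(proteins, normalized_scores, threshold=0.2):
    """Single-linkage grouping: proteins joined by any chain of pairs whose
    self-normalized bit score is >= threshold end up in the same group."""
    parent = {p: p for p in proteins}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path compression
            p = parent[p]
        return p

    for (a, b), score in normalized_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    return list(groups.values())

def term_count(groups, annotated):
    """Count groups in which at least one member carries the annotation."""
    return sum(1 for g in groups if g & annotated)

# Hypothetical proteins and self-normalized bit scores (not real data).
proteins = ["A", "B", "C", "D"]
scores = {("A", "B"): 0.5, ("B", "C"): 0.25, ("C", "D"): 0.1}
groups = group_by_homology(proteins, scores)
count_one = term_count(groups, {"A", "C"})  # A and C share one group
count_two = term_count(groups, {"A", "D"})  # A and D are in separate groups
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In this example, A, B, and C collapse into one node because they are linked through B, so a term annotated on both A and C is counted once rather than twice.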