New MCP publication: Software-assisted reduction of missing values in phosphoproteomic and proteomic isobaric labeling data using tandem mass spectrum clustering

A novel tool for decreasing missing values in multibatch full proteome and phosphoproteome TMT experiments

Isobaric stable isotope labeling techniques such as tandem mass tags (TMT) have become popular in proteomics because they enable the relative quantification of proteins with high precision from up to 18 samples in a single experiment. While missing values in peptide quantification are rare within a single TMT experiment, their number increases rapidly when multiple TMT experiments are combined. As the field moves towards analyzing ever larger numbers of samples, tools that reduce missing values become increasingly important for analyzing TMT datasets. To this end, we developed SIMSI-Transfer (Similarity-based Isobaric MS2 Identification Transfer), a software tool that extends our previously developed software MaRaCluster by clustering similar tandem mass spectra (MS2) from multiple TMT experiments. SIMSI-Transfer is based on the assumption that similarity-clustered MS2 spectra represent the same peptide. Therefore, peptide identifications made by database searching in one TMT batch can be transferred to another TMT batch in which the same peptide was fragmented but not identified. To assess the validity of this approach, we tested SIMSI-Transfer on masked search engine identification results and recovered >80% of the masked identifications while keeping errors in the transfer procedure below 1% FDR. Applying SIMSI-Transfer to six published full proteome and phosphoproteome data sets from the CPTAC (Clinical Proteomic Tumor Analysis Consortium) increased the number of identified MS2 spectra with TMT quantifications by 26-45%. This significantly decreased the number of missing values across batches and, in turn, increased the number of peptides and proteins identified in all TMT batches by 43-56% and 13-16%, respectively.
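The identification transfer described above can be illustrated with a minimal Python sketch. It is not SIMSI-Transfer's actual code or file format. The `Spectrum` record, the `transfer_identifications` function and the example scan names are hypothetical, and the cluster IDs are assumed to have been assigned beforehand by a spectrum clustering step. The rule used here, transferring only when every identified spectrum in a cluster agrees on one peptide, is a simplifying assumption and does not stand in for the FDR control reported in the publication.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Spectrum(NamedTuple):
    """Hypothetical record for one MS2 spectrum (not SIMSI-Transfer's format)."""
    scan_id: str
    batch: str
    cluster_id: int           # assumed to come from a prior clustering step
    peptide: Optional[str]    # None if the database search gave no identification


def transfer_identifications(spectra: List[Spectrum]) -> Dict[str, str]:
    """Return {scan_id: peptide} for unidentified spectra in unambiguous clusters."""
    # Collect the distinct peptide identifications seen in each cluster.
    peptides_by_cluster = defaultdict(set)
    for s in spectra:
        if s.peptide is not None:
            peptides_by_cluster[s.cluster_id].add(s.peptide)

    transferred = {}
    for s in spectra:
        if s.peptide is not None:
            continue  # already identified by the search engine
        candidates = peptides_by_cluster.get(s.cluster_id, set())
        # Simplifying assumption: transfer only if the cluster has exactly
        # one distinct peptide; skip empty or conflicting clusters.
        if len(candidates) == 1:
            transferred[s.scan_id] = next(iter(candidates))
    return transferred


# Toy example with three TMT batches and three clusters.
spectra = [
    Spectrum("b1_s1", "batch1", 1, "PEPTIDEK"),
    Spectrum("b2_s7", "batch2", 1, None),      # same cluster, unidentified
    Spectrum("b1_s2", "batch1", 2, "AAAK"),
    Spectrum("b2_s8", "batch2", 2, "CCCK"),    # conflicting identification
    Spectrum("b3_s3", "batch3", 2, None),      # ambiguous cluster, no transfer
    Spectrum("b3_s4", "batch3", 3, None),      # cluster without any identification
]

print(transfer_identifications(spectra))
```

In this toy input only `b2_s7` receives a transferred identification. The spectrum in the conflicting cluster and the one in the cluster without any identification stay unassigned, so they would remain missing values.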