Validation of mesenchymal stem cell transcripts assembled from RNA-Seq using proteomics data (#205)
Alternative splicing of mRNA diversifies the
function of human proteins, with tissue- and cell-specific protein isoforms
being the most difficult to validate. While transcriptomic experiments enable
the detection of many alternatively spliced transcripts, it is not known if
these transcripts have protein-coding potential. We recently published the PG
Nexus pipeline [1], which facilitates high-confidence validation of exons and
exon-exon junctions of spliced transcripts by integrating transcriptomics and
proteomics data. In this study, we applied PG Nexus to the analysis of an undifferentiated
human mesenchymal stem cell line and compared the number of protein isoforms
validated using different protein sequence databases, including public online
databases and RNA-seq-derived databases. Using the Ensembl database, we identified
8,011 exons and 3,824 splice junctions from 2,379 genes, with significant overlap
with the results from the other databases. The Ensembl database consistently outperformed
the other data sources, but predicted open reading frames from RNA-seq-derived
transcripts were comparable, with only 6 fewer splice junctions validated. Using
proteotypic and isoform-specific peptides, we validated 462 protein isoforms;
a higher number would be possible if multiple proteotypic peptides were included. Multiplexing
proteotypic peptides in SRM assays or similar experiments will increase the
confidence and coverage of protein isoform validation experiments.
1. Pang, C. N.; Tay, A. P.; Aya, C.; Twine, N. A.; Harkness, L.; Hart-Smith, G.; Chia, S. Z.; Chen, Z.; Deshpande, N. P.; Kaakoush, N. O.; Mitchell, H. M.; Kassem, M.; Wilkins, M. R. Tools to covisualize and coanalyze proteomic data with genomes and transcriptomes: validation of genes and alternative mRNA splicing. J Proteome Res 2014, 13 (1), 84-98.