Comprehensive protein sequence databases and the advent of personalized sequence databases (#128)
“Which sequence database should I use for my proteomics analysis?” is a question that is perhaps not asked often enough. Reference protein sequence databases (such as Ensembl, UniProtKB, NCBInr or LudwigNR) each have their own advantages, disadvantages and quirks. Drawing on three proteomic applications (protein characterisation, pathogen detection, and omics integration of RNASeq and proteomics data), I will illustrate the strengths and weaknesses of these publicly available reference sequence databases and highlight the questions that should be asked when choosing one; the answer ultimately depends on the experiment at hand and the biological questions being addressed. More recently, the integration of datasets sourced from different technologies (such as RNASeq and proteomics) has become a reality. Based on two projects, I will share some insight into the benefits of this integration, the methodology involved in facilitating it, and the pitfalls associated with the task.