Why is consistent protein naming important?
Many proteins are known by several different names across the scientific literature and public biological databases, which makes it difficult to organize and exchange biological information effectively. Consistent protein nomenclature is indispensable for communication, literature searching and retrieval of database records.
New protein nomenclature guidelines
To address this issue and provide some help in protein naming, a set of protein nomenclature guidelines has been produced jointly by the European Bioinformatics Institute (EMBL-EBI), the National Center for Biotechnology Information (NCBI), the Protein Information Resource (PIR) and the Swiss Institute of Bioinformatics (SIB). UniProt has been heavily involved in this work along with other groups from the four institutes. These efforts build on existing guidelines that were already in use by groups such as UniProt and RefSeq, expanding and consolidating them into a single shared document that provides a comprehensive set of recommendations.
What makes a good protein name?
A good protein name is unique and unambiguous, can be attributed to orthologs from other species, and follows official gene nomenclature where applicable. The guidelines help to achieve this goal by covering all aspects of protein naming, from general advice on expert sources of protein names and how to name novel proteins of unknown function to more detailed advice such as terms to avoid in a protein name and acceptable abbreviations.
Who are the guidelines intended for?
The guidelines are intended for use by anyone who wants to name a protein. Groups who will find these guidelines helpful include:
- Biocurators who want to assign a protein name as part of a database record
- Bioinformaticians who intend to assign protein names as part of gene annotation pipelines prior to submission to public archives
- Researchers who isolate a new protein and want to name it prior to publication