We are pleased to introduce annotation scores on the UniProt website! We have recently started providing annotation scores for all UniProtKB entries. An annotation score is a heuristic measure on a five-point scale: a score of 5 is associated with the best-annotated entries, and a score of 1 denotes an entry with rather basic annotation. A 5-point annotation score would look like:
Annotation scores can help you quickly gauge the annotation content of a protein entry. For example, you could see which protein in a family is the best annotated. We hope the scores will help you narrow your results down to the entries of interest.
You can view annotation scores in the ‘Status’ line on all UniProtKB protein entry pages, as shown below.
You can also add annotation scores to your search results table through the ‘Columns’ button.
How are they used?
There are several contexts in which annotation scores can be used:
- UniProtKB
The annotation scores can give you a quick idea of the relative level of annotation of the entries in your search results. Please note that search results are not ranked by annotation score alone. They are ranked by a query score that considers the annotation scores of the entries that match your query, how often (and where) your query terms appear in a matching entry and across the whole database, and the importance of each term according to the total number of terms. For this reason, the best-ranked entries are not necessarily those with the highest annotation scores.
- UniRef
We will be using annotation scores to select the representative member of a UniRef cluster.
- Reference proteomes
We are using annotation scores to assist the selection of reference proteomes.
How are they computed?
- Different UniProtKB annotation types (e.g. protein names, gene names, functional annotations (comments), sequence annotations (features), GO annotations and cross-references) are scored either by presence or by number of occurrences. Annotations with experimental evidence score higher than equivalent predicted/inferred annotations, thereby favoring expert literature-based curation over automatic annotation.
- The score of an individual entry is the sum of the scores of its annotations.
- The score of a proteome is the sum of the scores of the entries that are part of the proteome.
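The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual UniProt implementation: the weights, the experimental-evidence multiplier, the annotation type names and the `Annotation` class are all hypothetical, since this post does not publish the real values. The sketch also stops at the raw sum and does not attempt to reproduce how a sum is mapped onto the five-point scale.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration.
# Types scored by presence count once, however often they occur.
PRESENCE_WEIGHTS = {"protein_name": 1.0, "gene_name": 1.0}
# Types scored by number of occurrences count every time they occur.
COUNT_WEIGHTS = {"comment": 2.0, "feature": 1.0, "go": 0.5}
# Assumed multiplier so that experimental evidence outscores predictions.
EXPERIMENTAL_FACTOR = 2.0


@dataclass(frozen=True)
class Annotation:
    kind: str                   # e.g. "comment", "feature", "go"
    experimental: bool = False  # True if backed by experimental evidence


def entry_score(annotations):
    """Sum the scores of all annotations in one entry."""
    total = 0.0
    best_presence = {}  # kind -> highest factor seen for presence-scored types
    for ann in annotations:
        factor = EXPERIMENTAL_FACTOR if ann.experimental else 1.0
        if ann.kind in PRESENCE_WEIGHTS:
            best_presence[ann.kind] = max(best_presence.get(ann.kind, 0.0), factor)
        elif ann.kind in COUNT_WEIGHTS:
            total += COUNT_WEIGHTS[ann.kind] * factor
    for kind, factor in best_presence.items():
        total += PRESENCE_WEIGHTS[kind] * factor
    return total


def proteome_score(entries):
    """Sum the scores of all entries that belong to a proteome."""
    return sum(entry_score(entry) for entry in entries)


entry_a = [
    Annotation("protein_name"),
    Annotation("gene_name"),
    Annotation("comment", experimental=True),
    Annotation("feature"),
    Annotation("feature"),
]
entry_b = [Annotation("protein_name"), Annotation("go")]

print(entry_score(entry_a))
print(entry_score(entry_b))
print(proteome_score([entry_a, entry_b]))
```

With these made-up weights, the entry carrying an experimentally supported comment and two features scores well above the entry that has only a name and one GO annotation, which is the behaviour the rules describe.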
Next time you’re looking at a UniProt protein, look out for annotation scores. We welcome your feedback. Would you apply these scores in your work? Would you like to see them in your UniProtKB search results by default? Write in and let us know!