We would like to invite the machine learning community to help UniProt by creating
computational methods to predict metal binding sites across the whole of UniProtKB.
which our curators have carefully identified from the literature or known structures
from PDB. UniProt identifies the specific amino acid residues that participate in metal
binding sites and also which metal is bound. For example, the Neurospora crassa
metallothionein protein (shown below) contains 7 cysteine residues involved in binding
6 copper ions.
When we look at the uncurated TrEMBL section which contains the large majority of known
These annotations are created by a variety of automated annotation methods currently
used. The difference in coverage between the reviewed (Swiss-Prot) and unreviewed
(TrEMBL) sections suggests that there are many millions of missing metal binding site annotations
in the 225 million TrEMBL sequences.
We would like to invite interested researchers to take part in a challenge to create new
methods to rapidly predict metal binding site annotations that can be deployed by UniProt
as part of its automatic annotation pipeline. These methods could be completely based on
sequence data or perhaps incorporate information from known and/or predicted structures.
Although we don’t want to prejudge what methodology may work, we are particularly keen
that methods be both accurate and fast enough to scale. All data and software must be
open and not under restrictive licensing terms.
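To make the shape of the task concrete, here is a minimal sketch of a sequence-only baseline and a per-residue score. It is not a UniProt method and not a proposed evaluation protocol: the residue set (Cys, His, Asp, Glu, which commonly coordinate metal ions), the names `predict_sites` and `precision_recall`, and the toy sequence and "curated" positions are all assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Assumption: residues that commonly coordinate metal ions, used as a toy prior.
CANDIDATE_RESIDUES = frozenset("CHDE")


@dataclass(frozen=True)
class SitePrediction:
    position: int  # 1-based position in the protein sequence
    residue: str   # one-letter amino acid code at that position


def predict_sites(sequence, residues=CANDIDATE_RESIDUES):
    """Flag every candidate residue; a placeholder for a real predictor."""
    return [
        SitePrediction(position, amino_acid)
        for position, amino_acid in enumerate(sequence.upper(), start=1)
        if amino_acid in residues
    ]


def precision_recall(predicted, curated):
    """Per-residue precision and recall of predicted vs. curated positions."""
    true_positives = len(predicted & curated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(curated) if curated else 0.0
    return precision, recall


if __name__ == "__main__":
    toy_sequence = "MACHKGDCEC"  # made-up sequence, not a real UniProtKB entry
    toy_curated = {3, 8}          # made-up "curated" binding positions
    predicted = {site.position for site in predict_sites(toy_sequence)}
    print(sorted(predicted), precision_recall(predicted, toy_curated))
```

A scan like this is fast but over-predicts heavily, and it says nothing about which metal is bound, which is exactly the gap that better methods would need to close.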
If you would like to take part in this initiative please register your interest by filling out
the participants to discuss timelines and evaluation of the methods.