Thursday, November 29, 2018

Using UniProtKB to explore the world of protein structure

Protein structures are used to understand the architecture of a protein, to explain how a protein interacts with its ligands or cofactors and to study the composition of protein complexes. They help us to identify the position and nature of post-translational modifications and, as 3D structure is more evolutionarily conserved than primary sequence, can also be used to predict protein function. Identifying proteins that share a conserved fold may also help to ascertain a molecular function common to them all. Understanding how topology affects the active sites of enzymes, or identifying sequence-conserved regions on the surface of a protein, such as binding sites or areas of electrostatic potential, can also give valuable clues to the role a protein plays in a cell.

Annotation of proteins based on structure-based analyses is an integral part of the work of the UniProt Knowledgebase (UniProtKB). UniProt works closely with the Protein Data Bank in Europe (PDBe) to map 3D structural entries (~100,000) to the appropriate UniProtKB entries at the individual residue level [1]. It then becomes possible to use the UniProtKB advanced search functionality to ask questions such as ‘How many proteins in the human proteome have at least a partial 3D structure?’

Searching for structural data in UniProtKB
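For readers who prefer to script this kind of question, the sketch below builds a query string of the sort the advanced search produces. It is only an illustration under stated assumptions: the REST endpoint URL and the field names `organism_id` and `structure_3d` are not given in this post and should be checked against the current UniProt query documentation. Filtering by organism (NCBI taxonomy ID 9606 for human) is also only an approximation of "the human proteome". No request is sent; the code just assembles the URL.

```python
from urllib.parse import urlencode

# Assumption: base URL of the UniProtKB search endpoint (not stated in the post).
UNIPROT_SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_structure_query(taxon_id: int, reviewed_only: bool = False) -> str:
    """Build a query for entries of one organism that have 3D structure data.

    The field names used here (organism_id, structure_3d, reviewed) are
    assumptions about the query syntax, not taken from the post.
    """
    clauses = [f"organism_id:{taxon_id}", "structure_3d:true"]
    if reviewed_only:
        clauses.append("reviewed:true")
    # Wrap each clause in parentheses and combine them with AND.
    return " AND ".join(f"({clause})" for clause in clauses)


def build_search_url(query: str, size: int = 1) -> str:
    """Return a search URL for the query; nothing is fetched here."""
    params = {"query": query, "format": "json", "size": size}
    return f"{UNIPROT_SEARCH_URL}?{urlencode(params)}"


# 9606 is the NCBI taxonomy identifier for Homo sapiens.
human_query = build_structure_query(9606)
print(human_query)
print(build_search_url(human_query))
```

The same query string can be pasted into the search box on the website, which is often the quickest way to check that the field names are still valid before scripting anything.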

Once you have found the protein you are interested in, use the navigation tool in the entry to move to the Structure section, where you can either find more information in the table view or visualise a 3D image. The table view lists all the structures available for that molecule, gives details of the method by which each structure was determined (e.g. X-ray, NMR, Electron Microscopy) and provides an accurate residue-level mapping to the region of the amino acid sequence covered by each structure. Links to a number of external data repositories and resources enable you to access more detailed information. To help our users visualise the structure, we have recently incorporated the LiteMol Viewer, an HTML5 web application that not only provides cartoon, surface and ball-and-stick visualisations but also links you to the PDBe database, allowing you to view and explore validation and annotation data.
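Because each structure in the table view is mapped to a residue range, it is straightforward to work out how much of a sequence is covered by at least one structure, which is what "a partial 3D structure" means in practice. The sketch below merges overlapping ranges and reports the covered fraction. The ranges and the sequence length are made-up illustrative values, not data from any UniProtKB entry, and the function names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue positions (start, end)


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Merge overlapping or directly adjacent residue ranges."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(ranges: List[Range], sequence_length: int) -> float:
    """Fraction of the sequence covered by at least one structure."""
    covered = sum(end - start + 1 for start, end in merge_ranges(ranges))
    return covered / sequence_length


# Hypothetical values for illustration only.
example_ranges = [(10, 60), (40, 120), (200, 250)]
example_length = 300

print(merge_ranges(example_ranges))                      # [(10, 120), (200, 250)]
print(covered_fraction(example_ranges, example_length))  # 0.54
```

Merging before counting matters because several structures often cover the same region, and simply summing the range lengths would count those residues more than once.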


Visualising Bloom's syndrome helicase (P54132) in complex with ADP and duplex DNA.

Hovering over the structure shows the amino-acid residue-level mappings. With a single click you can zoom in to a more detailed view, for example to visualise the details of cofactor binding.

Zooming in on Bloom's syndrome helicase to show ADP binding

Knowing the shape of a protein can give you valuable clues to the function of that molecule. Use UniProtKB to explore the links between sequence, structure and function and to understand how molecular topology can drive cellular phenotype.

Want to learn more?

Go to our pre-recorded webinars to learn more about the annotation of structural data in UniProtKB