The 2019–20 Wuhan coronavirus outbreak is a viral epidemic that started in mainland China and has since spread to several other countries and territories. The virus was first identified in Wuhan, the capital of China's Hubei province. Wuhan 2020 Coronavirus (nCoV) is an enveloped single-stranded RNA virus. Coronavirus particles are decorated with petal-shaped surface projections that are reminiscent of the solar corona. Coronaviruses are found in many vertebrate species and cause respiratory diseases, such as the common cold or SARS. The more recent Wuhan 2020 coronavirus has emerged from a still unknown animal reservoir and can be transmitted from human to human.
Coronaviruses possess the largest genomes among all known RNA viruses. The 30 kilobase genome of the Wuhan seafood market strain has been sequenced (MN908947, NC_045512); it encodes a total of 13-14 proteins. To fast-track scientific research, these proteins have been manually annotated by UniProt biocurators and the entries made available as a pre-release dataset. This file provides pre-release access to the UniProt protein sequences of the 2019-nCoV Wuhan coronavirus in response to the current public health emergency. The data will become part of a future UniProt release and may be subject to further changes. A high-resolution crystal structure of the 2019-nCoV coronavirus 3CL hydrolase (PDB 6lu7) has been determined by Zihe Rao and Haitao Yang's research team at ShanghaiTech University and is cross-referenced from P0DTD1.
Figure: Two copies of the 3C-like hydrolase (P0DTD1-PRO_0000449623) in a catalytically active assembly.
In common with other public domain resources, UniProt has moved rapidly to make these valuable data publicly available at the time when they are most needed, and hopes that this will assist clinical researchers in their efforts to combat the virus. To download the entire dataset of protein sequences, expertly curated for function and fully cross-referenced to additional resources, click here.
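Once downloaded, the protein sequences can be inspected with a few lines of code. The following is a minimal sketch in Python, assuming the dataset has been saved locally in FASTA format with UniProt-style headers of the form `db|ACCESSION|NAME description`; the download format is not stated above, so this is an assumption. The function names, the accessions `ACC001` and `ACC002`, and the sequences in `SAMPLE` are hypothetical placeholders for illustration, not real data from the dataset.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def accession_of(header: str) -> str:
    """Extract the accession from a 'db|ACCESSION|NAME ...' header.

    Falls back to the first whitespace-separated token for other layouts.
    """
    parts = header.split("|")
    return parts[1] if len(parts) >= 3 else header.split()[0]


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map each accession to the length of its protein sequence."""
    return {accession_of(h): len(s) for h, s in parse_fasta(lines)}


# Hypothetical sample records; accessions and sequences are placeholders.
SAMPLE = """\
>sp|ACC001|EXAMPLE_ONE Hypothetical first record
ACDEFGHI
KLMNPQRS
>sp|ACC002|EXAMPLE_TWO Hypothetical second record
TVWYACDE
"""

if __name__ == "__main__":
    print(sequence_lengths(SAMPLE.splitlines()))
```

To apply this to a downloaded file, pass an open file handle instead of `SAMPLE.splitlines()`, for example `sequence_lengths(open(path))`, where `path` is wherever the file was saved.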