UniProt has launched a COVID-19 portal (https://covid-19.uniprot.org/) for the latest pre-release data. The portal will be updated independently of the general UniProt 8-week release cycle. The data are also available by FTP at ftp://ftp.uniprot.org/pub/databases/uniprot/pre_release/ .
The 2019–20 COVID-19 outbreak is a viral epidemic that started in mainland China and has since spread to several other countries and territories. Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) was first identified in Wuhan, the capital of China's Hubei province. It is an enveloped, single-stranded RNA virus. Coronavirus particles are decorated with petal-shaped surface projections reminiscent of the solar corona. Coronaviruses are found in many vertebrate species and cause respiratory diseases such as the common cold or SARS. The more recent SARS-CoV-2 emerged from a still unknown animal reservoir and can be transmitted from human to human.
Coronaviruses possess the largest genomes among all known RNA viruses. The 30 kilobase genome of the Wuhan seafood market strain has been sequenced (MN908947, NC_045512) and encodes a total of 13-14 proteins. To fast-track scientific research, these proteins have been manually annotated by UniProt biocurators and the entries made available as a pre-release dataset. This file provides pre-release access to the UniProt SARS-CoV-2 protein sequences arising from the current public health emergency. The data will become part of a future UniProt release and may be subject to further changes.
A high-resolution crystal structure of the SARS-CoV-2 3CL hydrolase (PDB 6lu7) has been determined by Zihe Rao and Haitao Yang's research team at ShanghaiTech University and is cross-referenced from P0DTD1.
[Figure: Two copies of the 3C-like hydrolase (P0DTD1-PRO_0000449623) in a catalytically active assembly.]
In common with other public domain resources, UniProt has moved rapidly to make these valuable data publicly available at the time when they are most needed, and hopes that this will assist clinical researchers in their efforts to combat the virus. To download the entire dataset of protein sequences, expertly curated for function and fully cross-referenced to additional resources, click here.
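Once the dataset has been downloaded, the sequences can be read with a few lines of Python. The following is only a minimal sketch, and it rests on assumptions that are not stated in this post: that the download is a FASTA file, and that its headers follow the usual UniProt layout `>db|ACCESSION|ENTRY_NAME description`. The names `FastaRecord`, `parse_header` and `parse_fasta` are hypothetical helpers, and the sequences in `SAMPLE` are short placeholder strings, not real SARS-CoV-2 sequences.

```python
from typing import Iterator, List, NamedTuple, Optional, Tuple


class FastaRecord(NamedTuple):
    accession: str
    entry_name: str
    description: str
    sequence: str


def parse_header(header: str) -> Tuple[str, str, str]:
    """Split an assumed UniProt-style header: 'db|ACCESSION|ENTRY_NAME description'."""
    identifier, _, description = header.partition(" ")
    parts = identifier.split("|")
    if len(parts) == 3:
        return parts[1], parts[2], description
    # Unknown layout: keep the whole identifier as the accession.
    return identifier, "", description


def parse_fasta(text: str) -> Iterator[FastaRecord]:
    """Yield one record per '>' header, joining wrapped sequence lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield FastaRecord(*parse_header(header), "".join(chunks))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield FastaRecord(*parse_header(header), "".join(chunks))


# Placeholder input: the accessions are real UniProt identifiers, but the
# sequences are made-up strings used only to exercise the parser.
SAMPLE = """\
>sp|P0DTD1|R1AB_SARS2 Replicase polyprotein 1ab
ACDEFGHIK
LMNPQ
>sp|P0DTC2|SPIKE_SARS2 Spike glycoprotein
RSTVWY
"""

records = list(parse_fasta(SAMPLE))
for record in records:
    print(record.accession, record.entry_name, len(record.sequence))
```

To use this on the real download, read the file's contents into a string and pass it to `parse_fasta` in place of `SAMPLE`. For routine work, an established library such as Biopython is likely a better choice than a hand-written parser.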