The UniProt databases

There are three UniProt databases:

  • The UniProt Knowledgebase (UniProtKB)
  • The UniProt Reference Clusters (UniRef)
  • The UniProt Archive (UniParc)

In this tutorial we will focus only on UniProtKB. For more information on UniRef or UniParc, visit the UniProt Quick Tour.

UniProtKB

UniProtKB is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. It consists of two sections:

  • Reviewed (Swiss-Prot) – contains manually annotated records, with data added by expert biocurators on protein function, structure, subcellular location and molecular interactions. Each UniProtKB/Swiss-Prot entry represents a single, non-redundant gene from a specific organism, and all proteins and peptides encoded by that gene are described within the record.
  • Unreviewed (TrEMBL) – contains computationally analysed records with additional information transferred from related, well-annotated records in UniProtKB/Swiss-Prot (automatic annotation). There may be several separate UniProtKB/TrEMBL entries describing the proteins derived from a specific gene.
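
The split between the two sections can be made concrete with a small sketch that builds one search URL per section for the same gene. The tutorial itself does not describe programmatic access, so everything here is an assumption: the base URL `https://rest.uniprot.org/uniprotkb/search`, the query fields `gene_exact`, `organism_id` and `reviewed`, and the helper name `build_search_url` (hypothetical). The code only constructs the URLs and does not contact any server.

```python
from urllib.parse import urlencode

# Assumed base URL of the UniProt website REST API (not given in this tutorial).
UNIPROTKB_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(gene, organism_id, reviewed):
    """Build a UniProtKB search URL limited to one section.

    reviewed=True  -> Reviewed (Swiss-Prot) entries only
    reviewed=False -> Unreviewed (TrEMBL) entries only
    The query field names used below are assumptions about the search syntax.
    """
    query = " AND ".join([
        f"gene_exact:{gene}",
        f"organism_id:{organism_id}",
        f"reviewed:{'true' if reviewed else 'false'}",
    ])
    # Ask for a compact tab-separated result with accession and entry name.
    params = {"query": query, "format": "tsv", "fields": "accession,id"}
    return f"{UNIPROTKB_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Example: the human insulin gene (NCBI taxonomy ID 9606 is Homo sapiens).
    print(build_search_url("INS", 9606, reviewed=True))
    print(build_search_url("INS", 9606, reviewed=False))
```

If these assumptions hold, the first URL should return the single Swiss-Prot record for the gene, while the second may return several TrEMBL records, mirroring the difference described above.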

A subset of UniProtKB entries also forms the Proteomes dataset. This consists of the set of proteins thought to be expressed by an organism whose genome has been completely sequenced.