The UniProt databases
There are three UniProt databases:
- The UniProt Knowledgebase (UniProtKB)
- The UniProt Reference Clusters (UniRef)
- The UniProt Archive (UniParc)
This tutorial focuses only on UniProtKB. For more information on UniRef or UniParc, visit the UniProt Quick Tour.
UniProtKB
UniProtKB is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. It consists of two sections:
- Reviewed (Swiss-Prot) – contains manually annotated records with data added by expert biocurators giving information on protein function, structure, subcellular location and molecular interactions. Each entry in UniProtKB/Swiss-Prot represents a single, non-redundant gene from a specific organism, and all proteins and peptides encoded by that gene are described within the record.
- Unreviewed (TrEMBL) – contains computationally analysed records with additional information transferred from related, well-annotated records in UniProtKB/Swiss-Prot (automatic annotation). There may be several separate UniProtKB/TrEMBL entries describing the proteins derived from a specific gene.
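The distinction between the two sections can also be used when searching programmatically. The sketch below only builds a search URL and does not send a request. It assumes the UniProt REST search endpoint at `https://rest.uniprot.org/uniprotkb/search`, the query fields `gene`, `organism_id` and `reviewed`, and the return fields `accession`, `reviewed` and `gene_names`. The helper name `build_search_url` and the example gene are illustrative choices, not part of this course.

```python
from urllib.parse import urlencode

# Assumption: base URL of the UniProt REST search endpoint for UniProtKB.
UNIPROTKB_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(gene, organism_id, reviewed):
    """Build a UniProtKB search URL restricted to one section.

    reviewed=True  selects Reviewed (Swiss-Prot) entries.
    reviewed=False selects Unreviewed (TrEMBL) entries.
    """
    section = "true" if reviewed else "false"
    query = f"gene:{gene} AND organism_id:{organism_id} AND reviewed:{section}"
    params = {
        "query": query,
        "format": "tsv",
        # Assumed return-field names; adjust to the columns you need.
        "fields": "accession,reviewed,gene_names",
    }
    return f"{UNIPROTKB_SEARCH}?{urlencode(params)}"


# Hypothetical example: human (taxonomy ID 9606) entries for the gene BRCA1.
swissprot_url = build_search_url("BRCA1", 9606, reviewed=True)
trembl_url = build_search_url("BRCA1", 9606, reviewed=False)
print(swissprot_url)
print(trembl_url)
```

Opening the two URLs in a browser, or fetching them with an HTTP client, would be expected to return the reviewed entry for the gene in the first case and any separate unreviewed entries in the second.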
A subset of UniProtKB entries also forms the Proteomes dataset. This consists of the set of proteins thought to be expressed by an organism whose genome has been completely sequenced.