UniProt, programmatically

UniProt is a comprehensive, expert-led, publicly available database of protein sequence, function and variation information.

This webinar gives an overview of programmatic access to the UniProt database using Python. It covers key aspects of protein entry searches, data filtering and batch downloads, and gives examples of further processing of the downloaded target data.

Following a brief introduction to UniProt services and where to find relevant documentation and help features, the webinar will focus on worked examples. These will include how to programmatically search for protein entries and retrieve the entries and sequences from the results. We will then show how to align orthologous sequences and filter for features of interest, such as disease variant information.
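As a rough illustration of the kind of search described above, the sketch below builds a query against the UniProt REST search endpoint and parses a tab-separated response. It is not taken from the webinar. The query string, the chosen return fields and the helper names (build_search_url, parse_tsv, search) are assumptions made for this example, so check them against the current UniProt API documentation before relying on them.

```python
import csv
import io
import urllib.parse
import urllib.request

# UniProtKB search endpoint of the UniProt REST API.
SEARCH_URL = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(query, fields, size=25):
    """Return a search URL asking for the given fields as TSV."""
    params = {
        "query": query,
        "format": "tsv",
        "fields": ",".join(fields),
        "size": size,
    }
    return f"{SEARCH_URL}?{urllib.parse.urlencode(params)}"


def parse_tsv(text):
    """Turn a TSV response body into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def search(query, fields, size=25):
    """Run the search over the network and return the first page of rows."""
    with urllib.request.urlopen(build_search_url(query, fields, size)) as response:
        return parse_tsv(response.read().decode("utf-8"))


# Example call (needs network access; query and fields are illustrative):
# rows = search("gene:BRCA1 AND organism_id:9606 AND reviewed:true",
#               ["accession", "gene_names", "length", "sequence"])
```

Only the first page of results is returned here. Larger result sets need pagination or a batch download, which this sketch deliberately leaves out.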

The webinar also covers programmatic examples of the UniProt Retrieve/ID mapping service, batch downloads, processing, and filtering data by annotation type.
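The ID mapping service works as an asynchronous job: you submit identifiers, poll for completion, then fetch the results. The sketch below shows one way to drive that flow with the Python standard library. It is a minimal outline under stated assumptions, not the webinar's own code: the helper names are hypothetical, the database names passed as from_db and to_db must match those listed in the UniProt documentation, and the status payload handling reflects how the service is understood to behave at the time of writing.

```python
import json
import time
import urllib.parse
import urllib.request

# Base URL of the UniProt ID mapping service.
IDMAPPING_URL = "https://rest.uniprot.org/idmapping"


def encode_job(from_db, to_db, ids):
    """Encode the form fields for a mapping job as a POST body."""
    form = {"from": from_db, "to": to_db, "ids": ",".join(ids)}
    return urllib.parse.urlencode(form).encode("ascii")


def submit_job(from_db, to_db, ids):
    """Submit a mapping job and return its job ID."""
    request = urllib.request.Request(f"{IDMAPPING_URL}/run", data=encode_job(from_db, to_db, ids))
    with urllib.request.urlopen(request) as response:
        return json.load(response)["jobId"]


def job_is_ready(status):
    """Interpret a status payload (assumed shape: a 'jobStatus' key while pending)."""
    if "jobStatus" in status:
        if status["jobStatus"] in ("NEW", "RUNNING"):
            return False
        if status["jobStatus"] == "FINISHED":
            return True
        raise RuntimeError(f"ID mapping job failed: {status['jobStatus']}")
    # No 'jobStatus' key: the request was redirected to the finished results.
    return True


def wait_for_results(job_id, poll_seconds=3):
    """Poll until the job finishes, then return the first page of results."""
    while True:
        with urllib.request.urlopen(f"{IDMAPPING_URL}/status/{job_id}") as response:
            if job_is_ready(json.load(response)):
                break
        time.sleep(poll_seconds)
    with urllib.request.urlopen(f"{IDMAPPING_URL}/results/{job_id}") as response:
        return json.load(response)


# Example call (needs network access; accessions and databases are illustrative):
# job_id = submit_job("UniProtKB_AC-ID", "Ensembl", ["P05067", "P12345"])
# payload = wait_for_results(job_id)
```

The results endpoint is paginated, so a production script would follow the pagination links or use the streaming variant of the endpoint for large jobs.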

By the end of this video you will be able to:

  • Identify the different routes to access UniProt data and choose the most appropriate one for your workflow
  • List the UniProt services such as UniProtKB, Proteomes, UniParc and UniRef
  • Find documentation and useful help to guide your programmatic access
  • Retrieve full UniProtKB entries or specific fields using Python
  • Filter entries by annotation types and other target characteristics
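The last two outcomes above can be illustrated with a short sketch that fetches one full UniProtKB entry as JSON and keeps only the features of a given annotation type. The helper names are hypothetical, and the JSON layout assumed here (a top-level "features" list whose items carry "type", "location" and "description") should be confirmed against a real entry before use.

```python
import json
import urllib.request

# Single-entry endpoint; the ".json" suffix selects the JSON representation.
ENTRY_URL = "https://rest.uniprot.org/uniprotkb/{accession}.json"


def fetch_entry(accession):
    """Download one full UniProtKB entry as a Python dict."""
    with urllib.request.urlopen(ENTRY_URL.format(accession=accession)) as response:
        return json.load(response)


def features_of_type(entry, feature_type):
    """Return the entry's features whose 'type' equals feature_type."""
    return [f for f in entry.get("features", []) if f.get("type") == feature_type]


def summarise(feature):
    """Reduce a feature to (start, end, description); missing parts become None or ''."""
    location = feature.get("location", {})
    start = location.get("start", {}).get("value")
    end = location.get("end", {}).get("value")
    return (start, end, feature.get("description", ""))


# Example call (needs network access; the accession is illustrative):
# entry = fetch_entry("P05067")
# for variant in features_of_type(entry, "Natural variant"):
#     print(summarise(variant))
```

Filtering after download keeps the example simple. When only a few fields are needed, requesting specific return fields from the search endpoint is usually lighter than fetching whole entries.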

Access the slides.

You can find more information and documentation on how to access UniProt programmatically and the Proteins API on our website.

For help and support visit the UniProt help pages or contact the help desk.