Accessing predicted protein structures in the AlphaFold Database

Users can download predicted protein structures from the AlphaFold Protein Structure Database (AFDB) without running the AlphaFold algorithm themselves. Each predicted structure comes with metadata and confidence metrics, so users can critically assess how reliable it is. The database is freely available and can be accessed via multiple routes.

Contents of the AFDB

Google DeepMind and EMBL-EBI collaboratively developed the AlphaFold Protein Structure Database (AFDB). Their aim was to democratise predictions and make them widely available to the scientific community.

As of 2023, the AFDB hosts over 214 million entries, covering almost the entirety of UniProt. The collection includes protein structures from over 1 million organisms, including model organisms and WHO pathogens of interest.

Users can search by protein name, gene name, UniProt accession number, organism name or amino acid sequence.

Figure 21. You can search the AFDB in several ways, including by protein name, gene name, UniProt ID, organism name and amino acid sequence.

Every predicted protein structure has a dedicated page. The page shows a visualisation of the structure alongside the pLDDT scores and the PAE plot, so users can quickly judge the quality of an individual prediction. It also contains related links and information, including a link to the protein’s UniProt page and details of the protein’s function and structural cluster.
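The pLDDT scores shown on a structure page can also be read from a downloaded coordinate file, because AlphaFold PDB files store the per-residue pLDDT in the B-factor column. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, it reads only standard fixed-width ATOM records, and the band thresholds (90, 70, 50) follow the confidence colour bands commonly shown on AFDB pages, with the handling of exact boundary values chosen arbitrarily here.

```python
from collections import Counter


def plddt_per_residue(pdb_text):
    """Return {(chain_id, residue_number): pLDDT} from C-alpha ATOM records.

    Assumes fixed-width PDB columns, with the pLDDT stored in the
    B-factor field (columns 61-66), as in AlphaFold PDB files.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # One C-alpha atom per residue is enough: pLDDT is a per-residue value.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain_id = line[21]
            residue_number = int(line[22:26])
            scores[(chain_id, residue_number)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Map a pLDDT score to a confidence band (thresholds are an assumption)."""
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "high"
    if plddt >= 50:
        return "low"
    return "very low"


def summarise_bands(scores):
    """Count how many residues fall into each confidence band."""
    return dict(Counter(confidence_band(value) for value in scores.values()))
```

Passing the text of a downloaded PDB file to `plddt_per_residue` and then to `summarise_bands` gives a rough overview of how much of the model falls into each band. This complements, rather than replaces, inspection of the structure page and the PAE plot.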

Users can download data from the AFDB. Atomic coordinates are available in PDB and mmCIF formats, while PAE data are provided in JSON format.
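As an illustration of programmatic download, the sketch below builds file URLs for one entry and reads a PAE file. Several details are assumptions to verify against the download links on the entry page: the `https://alphafold.ebi.ac.uk/files` base URL, the `AF-<accession>-F1-...` file naming, the version number 4, and the `predicted_aligned_error` key in the JSON. The function names are hypothetical, and the accession P69905 is used only as an example.

```python
import json
import urllib.request

# Assumed base URL for AFDB data files; check it against the entry page links.
AFDB_FILES = "https://alphafold.ebi.ac.uk/files"


def afdb_file_urls(accession, version=4, fragment=1):
    """Build download URLs for one AFDB entry (the naming pattern is an assumption)."""
    entry = f"AF-{accession}-F{fragment}"
    return {
        "pdb": f"{AFDB_FILES}/{entry}-model_v{version}.pdb",
        "mmcif": f"{AFDB_FILES}/{entry}-model_v{version}.cif",
        "pae": f"{AFDB_FILES}/{entry}-predicted_aligned_error_v{version}.json",
    }


def download(url, destination):
    """Save the file at `url` to the local path `destination`."""
    urllib.request.urlretrieve(url, destination)
    return destination


def load_pae_matrix(json_text):
    """Return the residue-by-residue PAE matrix from the text of a PAE JSON file.

    Assumes the matrix sits under the key "predicted_aligned_error",
    optionally wrapped in a one-element list.
    """
    data = json.loads(json_text)
    record = data[0] if isinstance(data, list) else data
    return record["predicted_aligned_error"]


if __name__ == "__main__":
    urls = afdb_file_urls("P69905")  # example accession only
    for kind, url in urls.items():
        print(kind, url)
    # Uncomment to fetch the coordinate file (requires network access):
    # download(urls["pdb"], "AF-P69905-F1-model.pdb")
```

Keeping the version number as a parameter makes it explicit which release of a file is being requested, which matters because the data files are versioned.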

The data files in the archive are versioned. Previous versions remain available via FTP, while the web pages always display the latest version.

Figure 22. Timeline of key milestones in the development and use of the AlphaFold Protein Structure Database (AFDB), developed by Google DeepMind and EMBL-EBI in partnership.