What’s the best way to access the database?
Users can access the predicted structures and their associated confidence metrics in AFDB through four channels: the AFDB web page, FTP, Google Cloud Public Data, or programmatic access (API).
![](https://ftp.ebi.ac.uk/pub/training/2024/On-demand/four_ways_to_access.gif)
- If you only need to access AFDB occasionally, using the website may be the best option. The site is easy to use and does not require any coding experience.
- If you need to download large datasets, such as whole proteomes, FTP is likely to be your best option because it scales well to bulk downloads.
- If you need to customise the way you access AFDB, it may be better to use Google BigQuery or the API. These approaches offer more flexibility and scalability.
- If you need to download the entire collection, you can do so using Google Cloud Public Data.
When is the AlphaFold Protein Structure Database not an option?
Despite the scale of AFDB, there are cases where you may need to run the AlphaFold2 algorithm yourself to predict the structure of a protein. These situations include:
- The protein of interest is outside the range of lengths included in the database. The minimum length is 16 amino acids. The maximum is 2,700 for proteomes and Swiss-Prot (reviewed entries) and 1,280 for the rest of UniProt. For the human proteome only, and only via FTP, the download includes longer proteins segmented into fragments.
- You are interested in oligomers or protein-protein complexes. The database only includes structures of monomers, so you would need to run modelling yourself.
- The protein sequence has been added to UniProt, or otherwise modified by UniProt, in a more recent release.
- The protein of interest is known to have multiple conformations. The database holds only one predicted structure per protein, so it cannot provide information about different conformational states.
- The protein of interest comes from a virus. The database does not include viral proteins.
- You need control over the prediction parameters. In particular, the database does not archive MSAs.
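The checklist above can be sketched as a small helper that reports why a protein might need its own AlphaFold2 run. This is only an illustration: the length limits are taken from the list above, while the `ProteinQuery` class, its field names, and the function name are hypothetical and not part of any AFDB tool.

```python
from dataclasses import dataclass
from typing import List

# Length limits (in amino acids) as stated in the list above.
MIN_LENGTH = 16
MAX_LENGTH_PROTEOMES_SWISSPROT = 2700
MAX_LENGTH_OTHER_UNIPROT = 1280


@dataclass
class ProteinQuery:
    """Hypothetical description of what a user is looking for."""
    length: int
    reviewed_or_proteome: bool        # Swiss-Prot entry or part of a covered proteome
    is_complex: bool = False          # oligomer or protein-protein complex
    is_viral: bool = False
    needs_conformations: bool = False  # more than one conformational state needed
    needs_custom_parameters: bool = False


def reasons_to_predict_yourself(query: ProteinQuery) -> List[str]:
    """Return the reasons (possibly none) why AFDB may not cover this query."""
    reasons = []
    upper = (MAX_LENGTH_PROTEOMES_SWISSPROT if query.reviewed_or_proteome
             else MAX_LENGTH_OTHER_UNIPROT)
    if not MIN_LENGTH <= query.length <= upper:
        reasons.append(f"length outside {MIN_LENGTH}-{upper} residues")
    if query.is_complex:
        reasons.append("database contains monomers only")
    if query.is_viral:
        reasons.append("viral proteins are not included")
    if query.needs_conformations:
        reasons.append("only one predicted structure per protein")
    if query.needs_custom_parameters:
        reasons.append("no control over prediction parameters")
    return reasons
```

An empty list means none of the length, complex, viral, conformation, or parameter criteria rules out the database. The sketch does not cover the UniProt release mismatch, which requires comparing the entry against the current UniProt sequence.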
We have created a table summarising the different ways to access the AlphaFold Protein Structure Database (AFDB).
How to access through API, FTP and BigQuery
In this section, we will guide you through accessing the AlphaFold Protein Structure Database via the API and BigQuery.
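As a first orientation, programmatic access might look like the minimal sketch below, which uses only the Python standard library. The endpoint path `https://alphafold.ebi.ac.uk/api/prediction/<accession>` and the list-shaped JSON response are assumptions not stated in this course text, so check them against the current AFDB API documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, List

# Assumed base URL of the AFDB prediction endpoint (verify against the API docs).
API_BASE = "https://alphafold.ebi.ac.uk/api/prediction"


def prediction_url(accession: str) -> str:
    """Build the request URL for a UniProt accession such as 'P69905'."""
    cleaned = accession.strip().upper()
    return f"{API_BASE}/{urllib.parse.quote(cleaned)}"


def fetch_prediction(accession: str, timeout: float = 30.0) -> List[Any]:
    """Request the prediction metadata for one accession and parse the JSON.

    Assumes the service answers with a JSON document; raises
    urllib.error.HTTPError if the accession is not found.
    """
    with urllib.request.urlopen(prediction_url(accession), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_prediction("P69905")` requires network access. Inspect the keys of the returned records to find the links to the coordinate and confidence files rather than relying on hard-coded field names.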