Quality and function at a glance: The role of TED in the AlphaFold Database
The AlphaFold Protein Structure Database (AFDB) now integrates The Encyclopedia of Domains (TED), a resource designed to systematically identify and classify structural domains within AlphaFold-predicted protein structures. TED provides domain boundaries for 365 million domains across more than one million taxa, offering critical insights into protein function and organisation.
What is TED?
The Encyclopedia of Domains (TED) is a collaborative project between the structural bioinformatics groups of Professor David Jones and Professor Christine Orengo – the team behind the CATH (Class, Architecture, Topology, Homologous superfamily) resource – at University College London. TED is a large-scale classification of structural domains derived from AlphaFold predictions. It uses multiple domain boundary prediction methods to identify independent folding units within proteins. This integration of TED into the AFDB provides domain-level annotations, simplifying the interpretation of complex protein structures.
Why is TED important for the AFDB?
TED enhances the interpretability and usability of AlphaFold predictions by:
- Defining functional units – TED annotations delineate independent structural domains, so that a chain made up of several compact folding units is not mistaken for a single functional entity.
- Improving structural analysis – TED domains are integrated with the Predicted Aligned Error (PAE) plot, allowing users to assess how confidently AlphaFold predicts the packing of domains against one another.
- Facilitating comparative studies – TED links AlphaFold-predicted structures to established classifications (e.g., CATH superfamilies), which are derived from experimentally determined structures in the wwPDB, enabling evolutionary and functional insights.
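One way to put a number on domain packing confidence is to average the PAE values in the off-diagonal block that links two domains. The sketch below is a minimal illustration, not code from the AFDB or TED. It assumes that domain boundaries are written as a chopping string (segments joined by `_`, domains separated by `,`, 1-based inclusive residue numbers) and that the PAE matrix has already been loaded as a square list of lists. All function names are hypothetical.

```python
from itertools import product


def parse_chopping(chopping):
    """Parse an assumed chopping string such as '1-100_230-400,110-220'.

    Returns one list of (start, end) segments per domain,
    using 1-based inclusive residue numbers.
    """
    domains = []
    for domain_text in chopping.split(","):
        segments = []
        for segment_text in domain_text.split("_"):
            start, end = segment_text.split("-")
            segments.append((int(start), int(end)))
        domains.append(segments)
    return domains


def domain_residues(segments):
    """Expand (start, end) segments into a flat list of residue numbers."""
    return [res for start, end in segments for res in range(start, end + 1)]


def mean_interdomain_pae(pae, domain_a, domain_b):
    """Average PAE over all residue pairs between two domains.

    PAE is not symmetric, so both pae[i][j] and pae[j][i] are included.
    `pae` is a square list of lists indexed from 0.
    """
    total = 0.0
    count = 0
    for i, j in product(domain_residues(domain_a), domain_residues(domain_b)):
        total += pae[i - 1][j - 1] + pae[j - 1][i - 1]
        count += 2
    return total / count
```

A lower mean inter-domain PAE suggests that AlphaFold is more confident about the relative placement of the two domains; a high value suggests that their relative orientation should be treated with caution.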
TED in action: A case study
Inosine-5′-monophosphate dehydrogenase from the nematode C. elegans (AF-Q9GZH3-F1) illustrates how TED can be used to inspect domain boundaries: here, a CBS domain (CATH ID: 3.10.580.10) is nested inside a split Aldolase class I domain (CATH ID: 3.20.20.70). Used alongside the interactive PAE plot, the TED annotations support further inferences about how these domains may pack and interact with each other in this protein.
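The idea of a domain nested inside a split (discontinuous) domain can be made concrete with a small check on residue ranges. The sketch below is illustrative only: the boundaries in the usage example are invented and are not the TED assignments for AF-Q9GZH3-F1, the `start-end_start-end` notation for a discontinuous domain is an assumption, and the function names are hypothetical.

```python
def parse_domain(text):
    """Parse one domain written as 'start-end' segments joined by '_'."""
    return [tuple(int(x) for x in segment.split("-")) for segment in text.split("_")]


def is_nested(inner, outer):
    """Return True if every segment of `inner` falls in a sequence gap
    between consecutive segments of a discontinuous `outer` domain."""
    outer = sorted(outer)
    # Gaps are the residue ranges skipped between consecutive outer segments.
    gaps = [
        (outer[k][1] + 1, outer[k + 1][0] - 1)
        for k in range(len(outer) - 1)
    ]
    return bool(gaps) and all(
        any(gap_start <= start and end <= gap_end for gap_start, gap_end in gaps)
        for start, end in inner
    )


# Hypothetical boundaries, chosen only to illustrate the layout.
split_domain = parse_domain("1-100_230-400")
inserted_domain = parse_domain("110-220")
print(is_nested(inserted_domain, split_domain))
```

A continuous domain has no gaps, so it can never be reported as the outer partner of a nested pair.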

Explore TED in the AlphaFold Database
To access TED assignments, visit the AlphaFold Database, navigate to a protein of interest, and explore the interactive domain visualisation above the PAE plot. Full datasets, including domain-domain interaction data, globularity and secondary structure predictions, as well as tools and code, are available for download via Zenodo.
Enhanced accessibility and ease of filtering searches in the AlphaFold Database
Alongside the TED integration, the AlphaFold Protein Structure Database now offers bulk file downloads and several other enhancements. These include:
- Bulk Downloads: Download up to 100 files at once from search pages and the Foldseek table.
- Multiple Formats: Support for mmCIF, PDB, CSV, and PAE (JSON) formats.
- Enhanced Search Results: View pLDDT scores and sequence lengths at a glance.
- New pLDDT Slider: Quickly filter for high-confidence structures.
These updates make finding and analysing the most relevant protein structure predictions easier than ever.
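For users who script their own workflows, the same two ideas (a pLDDT cutoff and a cap of 100 files per download) can be mirrored locally. The sketch below is a minimal illustration and does not call any AFDB interface: the record layout, the `mean_plddt` key, and the function names are all assumptions made for the example.

```python
def filter_by_plddt(entries, min_plddt):
    """Keep entries whose mean pLDDT is at or above the chosen cutoff.

    Each entry is assumed to be a dict with a 'mean_plddt' key.
    """
    return [entry for entry in entries if entry["mean_plddt"] >= min_plddt]


def batches(items, size=100):
    """Yield consecutive slices of at most `size` items.

    The default of 100 matches the per-download limit described above.
    """
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical records standing in for a page of search results.
results = [
    {"accession": f"EXAMPLE{i}", "mean_plddt": 50 + (i % 50)}
    for i in range(250)
]
confident = filter_by_plddt(results, min_plddt=70)
for group in batches(confident):
    print(len(group), "entries in this batch")
```

Filtering before batching keeps the number of download requests as small as possible.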
Learn more about these improvements in this ScienceDirect paper.