PDBe REST API

REST calls related to PDBe search service

Search a Solr instance built on the polymeric entities in the PDB.

https://www.ebi.ac.uk/pdbe/search/pdb/select?:query

A document in this Solr instance represents a polymeric entity of type protein, DNA, RNA or sugar.
The output of the call depends on the query sent to Solr. The query parameters are described in the Solr documentation.
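
As a rough illustration of how such a query might be assembled, the sketch below builds a request URL from standard Solr parameters (q, fl, rows, wt). The helper name build_search_url and the chosen fields are assumptions made for this example. Nothing is sent over the network.

```python
from urllib.parse import urlencode

# Endpoint from the section above; the part after "?" carries the Solr query.
SEARCH_URL = "https://www.ebi.ac.uk/pdbe/search/pdb/select"


def build_search_url(query, fields=None, rows=10):
    """Return a GET URL for the search endpoint (hypothetical helper)."""
    params = {"q": query, "rows": rows, "wt": "json"}
    if fields:
        # "fl" is Solr's field list: only these properties are returned.
        params["fl"] = ",".join(fields)
    return SEARCH_URL + "?" + urlencode(params)


url = build_search_url("pdb_id:1cbs", fields=["pdb_id", "entity_id", "title"])
print(url)
```

Restricting the field list keeps responses small, which helps because each document carries a large number of properties.
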
Each document has a wide range of properties, grouped as follows:

Basic information about the entry
- pdb_id : PDB entry id code.
- number_of_polymer_entities : Number of unique polymers in the entry.
- number_of_bound_entities : Number of unique bound molecules (excluding water) in the entry.
- number_of_polymers : Number of polymer chains in the entry.
- number_of_bound_molecules : Number of bound molecules in the entry.
- number_of_polymer_residues : Number of polymer residues in all polymers.
- entry_authors : List of depositors who deposited this entry.
- all_authors : Combined list of depositors and authors of primary citation.
- title : Title of the entry as provided by the depositor.
- revision_date : Date of latest revision of the entry - this is a timestamp in format YYYY-MM-DDThh:mm:ssZ.
- revision_year : Year of latest revision of the entry.
- deposition_date : Date when entry was deposited - this is a timestamp in format YYYY-MM-DDThh:mm:ssZ.
- deposition_year : Year of deposition of the entry.
- status : Release state of the entry, e.g. REL, OBS etc.
- release_date : Date of release of the entry - this is a timestamp in format YYYY-MM-DDThh:mm:ssZ.
- release_year : Year of release of the entry.
- number_of_protein_chains : Number of protein chains in the entry.
- number_of_DNA_chains : Number of DNA chains in the entry.
- number_of_RNA_chains : Number of RNA chains in the entry.
- number_of_D_RNA_hybrid_chains : Number of hybrid DNA/RNA chains in the entry.
- SG_center_name : The Structural Genomics project.
- SG_full_name : The full name of Structural Genomics Project center.
- na_conf_features : Describes secondary structure features in this entry.
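
The date fields above (revision_date, deposition_date, release_date) share the format YYYY-MM-DDThh:mm:ssZ. A minimal sketch of parsing such a value follows. The function name parse_timestamp is hypothetical and the sample value is invented for illustration.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a YYYY-MM-DDThh:mm:ssZ string into a timezone-aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    # The trailing "Z" denotes UTC, so attach that timezone explicitly.
    return parsed.replace(tzinfo=timezone.utc)


# Illustrative value only, not taken from a real entry.
released = parse_timestamp("2001-02-03T00:00:00Z")
print(released.year)
```
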
Related database identifiers
- bmrb_id : BMRB id.
- emdb_id : EMDB id.
- psi_id : PSI reference id.
Basic information about the entity
- entity_id : Entity id (molecule number in mmcif-speak).
- entry_entity : Concatenation of entry and entity ids, e.g. 1cbs_1.
- entity_weight : Formula weight of the entity in daltons.
- number_of_copies : Number of copies of the entity found in the entry.
- struct_asym_id : The struct_asym_ids (equivalent of chain ids in mmcif-speak) that the entity is found in.
- modified_residue_flag : Generally N, but for polypeptide or polynucleotide entries, it can be Y if the entity has modified (non-standard) amino acids or nucleotides.
- molecule_type : Type of the molecule.
- mutation : Entity mutation description.
- max_observed_residues : The maximum number of observed residues in the entity.
- mutation_type : A description of the differences between the sequence of the entity described in the data block and that in the referenced database entry.
- molecule_sequence : Canonical chemical sequence expressed as a string of one-letter amino acid codes. Modifications are coded as the parent amino acid where possible.
- microheterogeneity : Turns on the flag for microheterogeneity in the entity. Possible value = y.
- interacting_ligands : A list of ligands that interact with the entity.
- interacting_uniprot_id : A list of UniProt identifiers of molecules that interact with the entity.
- interacting_entry_id : A list of PDB entry id codes of molecules that interact with the entity.
- experiment_data_available : A flag indicating if experiment data is available.
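
The entry_entity value described above joins the entry id and the entity id with an underscore, e.g. 1cbs_1. One possible way to take it apart again is sketched here. The helper split_entry_entity is an assumption for illustration, not part of the API.

```python
def split_entry_entity(entry_entity):
    """Split a value such as '1cbs_1' into (pdb_id, entity_id)."""
    # Split on the last underscore; the entity id is the trailing integer.
    pdb_id, _, entity_id = entry_entity.rpartition("_")
    if not pdb_id or not entity_id.isdigit():
        raise ValueError(f"not an entry_entity value: {entry_entity!r}")
    return pdb_id, int(entity_id)


print(split_entry_entity("1cbs_1"))
```
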
Taxonomy information about the entity (sometimes an entity can have more than one organism!)
- genus : A list of genera of the source species of the entity.
- superkingdom : A list of superkingdoms of source species of the entity.
- organism_scientific_name : A list of scientific organism names of the source species of the entity.
- organism_synonyms : A list of synonyms of scientific organism names of the source species of the entity.
- tax_id : A list of taxonomy identifiers of source species of the entity.
- tax_query : Taxonomy ids for querying.
- rank : Ordering of taxonomic groupings.
Experimental details for the entry
- sample_preparation_method : Sample preparation method, such as 'Natural source', 'Genetically manipulated', 'Synthetically obtained', etc.
- experimental_method : A list of experimental methods used in structure determination, e.g. 'X-ray diffraction'.
- refinement_software : Software package used for refinement.
- structure_solution_software : Software package used to solve the structure, e.g. CCP4.
- resolution : The higher limit of resolution of crystallographic data as reported by depositors. Null if not available.
- pivot_resolution : Resolution field for pivoting in Solr. This is set to an arbitrarily high number when resolution is null.
- r_factor : The crystallographic R factor - null when not available.
- spacegroup : Spacegroup in the Hermann-Mauguin space-group notation.
- synchrotron_beamline : Synchrotron beamline used in the diffraction experiment.
- synchrotron_site : Synchrotron site of the diffraction experiment.
- detector : The detector used to measure the scattered radiation, including any analyser and post-sample collimation.
- data_reduction_software : Program/package name for data reduction/intensity integration software.
- data_scaling_software : Package name for data reduction/data scaling.
- detector_type : The make, model or name of the detector device used.
- r_free : Value of the Free R-Factor.
- structure_determination_method : Any particular method, such as molecular replacement, used in structure determination.
- crystallisation_ph : The pH at which the crystal was grown.
- crystallisation_reservoir : Chemical content of the crystal solution.
- phasing_method : A listing of the method or methods used to phase this structure.
- diffraction_protocol : Diffraction protocol used in solving an X-ray structure.
- beam_source_name : The general class of the radiation source.
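
Fields such as experimental_method and resolution lend themselves to phrase and range clauses in standard Solr query syntax. The sketch below composes such a query string. The helper names and the cutoff of 2.0 are assumptions chosen for illustration.

```python
def phrase(field, value):
    """Match an exact phrase; embedded quotes and backslashes are escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def value_range(field, low, high):
    """Inclusive Solr range clause; '*' leaves one end open."""
    return f"{field}:[{low} TO {high}]"


query = " AND ".join([
    phrase("experimental_method", "X-ray diffraction"),
    value_range("resolution", "*", 2.0),
])
print(query)
```

The resulting string can be passed as the q parameter of the search call.
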
Crystallographic cell parameters
- cell_alpha : Unit-cell angle alpha of the reported structure in degrees.
- cell_beta : Unit-cell angle beta of the reported structure in degrees.
- cell_gamma : Unit-cell angle gamma of the reported structure in degrees.
- cell_a : Unit-cell length a corresponding to the structure.
- cell_b : Unit-cell length b corresponding to the structure.
- cell_c : Unit-cell length c corresponding to the structure.
Simplified quality information
- model_quality : Harmonic mean of absolute percentiles related to model geometry.
- data_quality : Harmonic mean of absolute percentiles related to experimental data and its fit to the model.
- overall_quality : Harmonic mean of all absolute validation percentile metrics.
- inv_overall_quality : This is (100 - overall_quality).
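
To make the arithmetic behind these quality fields concrete, here is a small worked sketch of a harmonic mean and its inverse score. The percentile values are invented, and which validation metrics PDBe actually feeds into each mean is not specified here.

```python
from statistics import harmonic_mean

# Hypothetical percentile values, invented for illustration.
percentiles = [40.0, 60.0, 80.0]

# The harmonic mean is pulled toward the lowest value more than the
# arithmetic mean, so one poor metric lowers the combined score noticeably.
overall_quality = harmonic_mean(percentiles)

# inv_overall_quality is defined above as (100 - overall_quality).
inv_overall_quality = 100 - overall_quality

print(round(overall_quality, 2), round(inv_overall_quality, 2))
```
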
Primary citation of the entry
- citation_authors : List of authors of the article.
- journal : The journal in which the article was published.
- journal_page : The page numbers of the article - this can be pp-pp, or pp or null.
- journal_volume : The volume in which the article was published, if any - null otherwise.
- pubmed_authors : Authors of the publication from Pubmed.
- pubmed_id : Pubmed identifier.
- citation_doi : The digital object identifier (DOI) of the article.
- citation_title : The title of the article.
- citation_year : The year of publication of the article.
Likely biological assembly
- assembly_id : List of assembly ids in which this entity is found.
- prefered_assembly_id : Id of the most likely assembly containing this entity. The remaining properties in this group describe the preferred assembly.
- assembly_composition : Simple English description of composition.
- assembly_form : Homo or hetero.
- assembly_type : Monomeric, dimeric, etc.
Features of molecules within an entry
- prd_name : A name of the molecule.
- prd_type : The structural classification of the molecule.
- prd_class : The broad function of the molecule.
Data processing and status of the entry
- processing_site : The wwPDB site where the file was created or modified.
- deposition_site : The site where the file was deposited.
- pdb_format_compatible : A value of Y indicates that PDB format data file corresponding to this entry is available in the PDB archive.
Chemical components details
- compound_id : Unique identifier of a chemical component. For protein polymer entities, this is the three-letter amino acid code. For nucleic acid polymer entities, this is the one-letter base code.
- compound_name : The full name of the component.
- compound_synonym : Synonym list for the component.
- compound_weight : Formula mass in daltons of the chemical component.
- compound_systematic_name : IUPAC or Chemical Abstracts full name of the chemical component.
Details of the source from which the entity was obtained
- cell_line : A specific line of cells used as the expression system.
- atcc : American Type Culture Collection tissue culture number.
- entry_organism_scientific_name : Scientific name of the organism.
- expression_host_genus : The genus of the organism that served as host for the production of the entity.
- expression_host_superkingdom : The superkingdom of the organism that served as host for the production of the entity.
- expression_host_sci_name : The scientific name of the organism that served as host for the production of the entity.
- expression_host_synonyms : Other names archived in the NCBI taxonomy database for the organism that served as host for the expression system.
- expression_host_tax_id : The identifier for the NCBI taxonomy node corresponding to the organism that served as host for the expression system.
Enzyme information, for searching molecular entities that are enzymes
- ec_number : Enzyme Commission number.
- ec_hierarchy : Systematic name of the enzyme classification.
- enzyme_name : Enzyme name, equivalent to the accepted name for a given EC number.
- enzyme_systematic_name : Systematic name of the enzyme according to the Enzyme Commission.

UniProt
- uniprot_accession : A list of UniProt accessions.
- uniprot_id : A list of UniProt identifiers.
- uniprot_coverage : Percent coverage of UniProt accessions by entity sequence.
- uniprot_features : Sequence annotations (features) describe regions or sites of interest in the protein sequence.
- gene_name : A list of gene names corresponding to UniProt accessions.
- molecule_name : A list of molecule names for UniProt accessions.
- molecule_synonym : Synonyms for the primary names of UniProt accessions.
Pfam
- pfam_accession : A list of Pfam accessions.
- pfam_name : A list of Pfam domain names.
- pfam_clan_name : A list of Pfam clan names corresponding to the accessions.
- pfam_description : A list of descriptions for the Pfam accessions.
SCOP
- scop_family : A list of SCOP family names for the SCOP domains mapped to the chains of this entity.
- scop_superfamily : A list of SCOP superfamily names for the SCOP domains mapped to the chains of this entity.
- scop_fold : A list of SCOP fold names for the SCOP domains mapped to the chains of this entity.
- scop_class : A list of SCOP class names for the SCOP domains mapped to the chains of this entity.
CATH
- cath_code : A list of CATH codes for the CATH domains mapped to the chains of this entity.
- cath_class : A list of CATH classes for the CATH domains mapped to the chains of this entity.
- cath_architecture : A list of CATH architectures for the CATH domains mapped to the chains of this entity.
- cath_topology : A list of CATH topologies for the CATH domains mapped to the chains of this entity.
- cath_homologous_superfamily : A list of CATH superfamilies for the CATH domains mapped to the chains of this entity.
InterPro
- interpro_accession : A list of InterPro accessions mapped to the chains of this entity.
- interpro_name : A list of names of InterPro accessions mapped to the chains of this entity.
GO
- go_id : A list of GO ids mapped to this entity.
- biological_cell_component : A list of GO cellular component names mapped to this entity.
- biological_function : A list of GO biological functions mapped to this entity.
- biological_process : A list of GO biological processes mapped to this entity.
HomoloGene
- homologus_pdb_entity_id : A list of concatenated pdb and entity ids that are homologous to the entity according to the HomoloGene resource, e.g. ['1brq_1', '2wqa_2'].
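
Because each document describes one entity rather than one entry, results often need regrouping by pdb_id. The sketch below does this on a hand-written sample that follows Solr's standard JSON response layout (response, numFound, docs). All values in the sample are invented, and the function name entities_by_entry is hypothetical.

```python
import json
from collections import defaultdict

# Hand-written sample in Solr's standard JSON layout; the values are invented.
sample = json.loads("""
{"response": {"numFound": 3, "start": 0, "docs": [
  {"pdb_id": "1abc", "entity_id": 1, "molecule_type": "Protein"},
  {"pdb_id": "1abc", "entity_id": 2, "molecule_type": "DNA"},
  {"pdb_id": "2xyz", "entity_id": 1, "molecule_type": "Protein"}
]}}
""")


def entities_by_entry(solr_response):
    """Group entity ids by the entry (pdb_id) they belong to."""
    grouped = defaultdict(list)
    for doc in solr_response["response"]["docs"]:
        grouped[doc["pdb_id"]].append(doc["entity_id"])
    return dict(grouped)


print(entities_by_entry(sample))
```
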

query		String	Option string in Solr query syntax. Details about constructing Solr queries can be found in the Solr documentation.
postdata		String	One or more query options sent as POST data instead of being appended to the URL.
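
As a hedged sketch of the post-data variant, the snippet below places the query options in the request body using only the Python standard library. It assumes the service accepts a URL-encoded form body, which is not confirmed here. The request is constructed but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://www.ebi.ac.uk/pdbe/search/pdb/select"

# Query options go in the request body rather than in the URL.
options = {"q": "pdb_id:1cbs", "fl": "pdb_id,entity_id", "wt": "json"}
body = urlencode(options).encode("utf-8")

# Supplying data makes urllib issue a POST; the request is built but not sent.
request = Request(SEARCH_URL, data=body)
print(request.get_method())
```

Sending options in the body can be useful when a query is too long to fit comfortably in a URL.
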
