Ensembl genes

The Ensembl gene set is built from sequence evidence and, for our most used species, also includes manual annotation (Figure 6).

Figure 6. Sequences in public databases are aligned to the genome to determine the positions of genes and their splice variants.

The GeneBuild

The first step is to obtain sequenced genomes from official centres. These genomes are then annotated by the Ensembl pipeline (also known as the Ensembl genebuild), which uses automatic annotation and, for some species, manual annotation as well. The human, mouse, zebrafish and rat gene sets include manual annotation from the HAVANA project. The Ensembl gene set for human, including the HAVANA transcripts, is the GENCODE set.
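
One way to see the mix of automatic and manual annotation in a gene set is to count transcripts by annotation source. The sketch below is illustrative only: it assumes a GTF-style file in which the second column records the source (values such as ensembl, havana and ensembl_havana), and the sample records, IDs and the function name count_transcript_sources are invented for the example rather than taken from a real release.

```python
from collections import Counter

# Hypothetical GTF-style records, invented for illustration only.
# Assumption: column 2 holds the annotation source of each feature.
SAMPLE_GTF = "\n".join([
    '1\thavana\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    '1\tensembl\ttranscript\t100\t950\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    '1\tensembl_havana\ttranscript\t120\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T3";',
    "# comment lines are skipped",
    '1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
])


def count_transcript_sources(gtf_text):
    """Count transcript features per annotation source (GTF column 2)."""
    counts = Counter()
    for line in gtf_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A GTF record has nine tab-separated columns; keep transcripts only.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        counts[fields[1]] += 1
    return counts


if __name__ == "__main__":
    for source, n in sorted(count_transcript_sources(SAMPLE_GTF).items()):
        print(source, n)
```

Under these assumptions, a source of havana would indicate a manually annotated transcript, ensembl an automatically annotated one, and ensembl_havana a transcript supported by both.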