webPRANK

Submit alignment task

Sequence input and submission
Sequence data (required):
  Paste sequences in Fasta format or choose a file to upload

Alignment title (optional):



You can start the alignment by clicking the button above. The tabs below allow you to change the alignment options and use the advanced features of the PRANK algorithm. More information.

Sequences can be either DNA or amino acids: their type is automatically detected. DNA sequences can be aligned using more complex models; see below for details. Your settings are remembered while you move between the tabs. Once you have finished with the settings, return to this main tab to start the alignment.

Protein-coding DNA sequences can be translated to proteins/codons (see Basic alignment options), aligned as proteins/codons, and then back-translated to DNA. The translation is done in frame 1, starting from the first base; codon alignment also requires that sequence lengths are multiples of 3.
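Before choosing a codon alignment it can help to check the input locally. The following is a minimal sketch, not part of webPRANK: the function names are hypothetical, and it assumes plain Fasta input in which each record starts with a ">" header line. It reports which sequences have a length divisible by 3 and shows how a sequence is read in frame 1.

```python
def split_into_codons(seq: str) -> list[str]:
    """Split a DNA sequence into codons, reading in frame 1 from the first base."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError(f"sequence length {len(seq)} is not a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def codon_ready(fasta_text: str) -> dict[str, bool]:
    """Map each Fasta record name to True if its length is a multiple of 3."""
    lengths: dict[str, int] = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the whole header (without '>') as the record name.
            name = line[1:].strip()
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return {n: length % 3 == 0 for n, length in lengths.items()}
```

For example, codon_ready(">a\nATGAAATAG\n>b\nATGAA\n") flags record "a" as usable for codon alignment and record "b" as not.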

DNA can also be aligned using structure models that describe multiple evolutionary processes (see Basic alignment options). The optimal 'structure' within sequences does not need to be known in advance but is inferred while doing the alignment. The posterior inference of different evolutionary processes across sequence sites can be displayed within the alignment browser.

Basic alignment options
Guide tree (optional)*:
  Paste your tree in Newick format or choose a file to upload


Inference of insertions and deletions:
trust insertions (+F)
Alignment reliability:
compute reliability

Alignment of DNA sequences:
default
align translated codons
use structure model Fast/Slow
align translated proteins
use structure model Genomic
align translated mt proteins

Change the default alignment options. More information.

*The guide tree defines the alignment order, with branch lengths as substitutions per site. Rooted trees with branch lengths are preferred; if no branch lengths are provided, fixed lengths are used (see also Substitution scoring in Advanced alignment options). Unrooted trees with branch lengths are rooted at the mid-point. If no tree is provided, an approximate tree is computed using either ClustalW2 or PRANK.
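To see which of these cases a given guide tree falls into, a rough local check can be done before submission. This is a sketch under assumptions: the function name is hypothetical, a top-level node with two children is taken to mean a rooted tree and three or more an unrooted one, and quoted labels containing parentheses, commas or colons are not handled.

```python
import re


def describe_newick(newick: str) -> dict:
    """Report top-level child count and whether branch lengths are present."""
    tree = newick.strip()
    if not tree.endswith(";"):
        raise ValueError("a Newick tree should end with ';'")
    depth = 0
    top_level_children = 1
    for ch in tree:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parentheses")
        elif ch == "," and depth == 1:
            # A comma at depth 1 separates children of the outermost node.
            top_level_children += 1
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    # A colon followed by a digit or '.' is treated as a branch length.
    has_lengths = re.search(r":\s*[\d.]", tree) is not None
    return {
        "top_level_children": top_level_children,
        "looks_rooted": top_level_children == 2,
        "has_branch_lengths": has_lengths,
    }
```

For "((A:0.1,B:0.2):0.05,C:0.3);" this reports two top-level children with branch lengths, the preferred case; for "(A,B,C);" it reports three children and no branch lengths.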

Trust insertions (+F) is generally beneficial but may cause an excess of gaps if the guide tree is incorrect. Compute reliability is time-consuming and can be disabled if not needed; disabling it, however, makes some alignment post-processing options unavailable.

In addition to translation and alignment as codons or proteins, DNA sequences can be aligned using structure models. Fast/Slow assumes two types of sequence regions; Genomic additionally distinguishes coding exons.


Advanced alignment options
DNA alignment:
gap rate  gap length  Κ
Protein alignment:
gap rate  gap length

Substitution scoring:
stringent relaxed
Guide tree generation:
use ClustalW2 (faster)
DNA alignment anchoring:
allow CHAOS anchors (faster)
Ancestral sequences:
output ancestral sequences
Alignment annotation:
sequences aligned, posterior scores only

Change the default alignment options. More information.

Gaps are modelled as a time-dependent process and are more probable between distantly- than closely-related sequences. The probability of opening gaps is computed from gap rate and the pairwise distance between two sequences (or ancestral nodes) as defined in the guide tree. Gap length sets the mean for the geometrically-distributed expected gap lengths. Κ defines the ts/tv rate ratio for the HKY model that is used to compute the substitution scores for DNA alignments; the base frequencies are empirical, based on the input data. Protein scoring is based on the WAG substitution model.
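The roles of Κ and the empirical base frequencies can be illustrated with the standard HKY rate matrix. The sketch below is not PRANK's code: the function names are hypothetical, the matrix is left unscaled (not normalised to one expected substitution per unit time), and ambiguous characters are simply ignored when counting bases.

```python
from collections import Counter

BASES = "ACGT"
# Transitions are purine<->purine (A, G) and pyrimidine<->pyrimidine (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def empirical_base_frequencies(sequences):
    """Estimate base frequencies by counting A, C, G and T in the input."""
    counts = Counter(b for s in sequences for b in s.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T characters found")
    return {b: counts[b] / total for b in BASES}


def hky_rate_matrix(freqs, kappa):
    """Unscaled HKY rates: q[x][y] = kappa * pi_y for transitions, pi_y otherwise."""
    q = {x: {} for x in BASES}
    for x in BASES:
        for y in BASES:
            if x != y:
                q[x][y] = freqs[y] * (kappa if (x, y) in TRANSITIONS else 1.0)
        # The diagonal makes each row sum to zero.
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    return q
```

With equal base frequencies and Κ = 2, the A to G rate is twice the A to C rate, which is the ts/tv weighting the parameter controls.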

Substitution scoring is based on the branch lengths defined in the guide tree; the stringent option lowers the upper limit for the maximum pairwise distance (or the fixed branch length when none are provided) and thus enforces higher similarity. The alignment can be accelerated by using ClustalW2 to generate the first guide tree (when a structure model is not used, the alignment is iterated using a tree estimated from the first alignment) and by allowing PRANK to use CHAOS-generated anchors for the alignment of long DNA sequences. PRANK always infers the ancestral sequences at the internal nodes of the guide tree. These can be output and displayed in the alignment browser.

PRANK can also compute reliability and match structure models on existing alignments. Alignment annotation works best for PRANK-generated alignments and using the original guide tree; it may work for other alignments too.



Extra options for structure models (DNA)
Fast/Slow

F:  length  gap rate  gap length Κ  rel. rate  1.0
S:  length  gap rate  gap length Κ  rel. rate

Genomic (Fast/Slow/Codon)

F:  length  gap rate  gap length Κ  length F+S
S:  length  gap rate  gap length Κ  rel. rate
C:  length  gap rate  gap length Ω   WAG frequencies

Change the default alignment options. More information.

These options define the structure models for the alignment of DNA sequences. For them to have any effect, you need to select one of the structure models in Basic alignment options.

A structure model describes two or more processes and lets the aligner alternate between these during the alignment. The gap parameters and Κ are explained under Advanced alignment options. Length defines the length (mean of a geometric distribution) of fragments evolving under each process and the relative rate can set one of the processes to evolve either faster or slower.
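The meaning of the length parameter can be made concrete with a small simulation. This is an illustrative sketch only, not how PRANK infers the structure: the function names are hypothetical, and it assumes that a fragment with mean length L is continued at each site with probability 1 - 1/L and that the process otherwise switches to one of the other states chosen uniformly.

```python
import random


def stay_probability(mean_length: float) -> float:
    """Per-site probability of staying in a state whose fragments have this mean length."""
    if mean_length < 1:
        raise ValueError("mean fragment length must be at least 1")
    return 1.0 - 1.0 / mean_length


def simulate_states(n_sites: int, mean_lengths: dict, seed: int = 0) -> list:
    """Generate a state label per site, alternating between the given processes."""
    states = list(mean_lengths)
    if len(states) < 2:
        raise ValueError("need at least two processes to alternate between")
    rng = random.Random(seed)
    current = states[0]
    path = []
    for _ in range(n_sites):
        path.append(current)
        if rng.random() >= stay_probability(mean_lengths[current]):
            # Leave the current process and pick one of the others.
            current = rng.choice([s for s in states if s != current])
    return path
```

Calling simulate_states(200, {"F": 50, "S": 10}) produces long runs of F and shorter runs of S on average, which is the kind of alternation the length settings describe.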

The Genomic model consists of five states: Fast, Slow and three Codon sites. The Codon site processes are defined by the codon model and the selection parameter Ω, with frequencies optionally scaled by amino-acid frequencies. Length F+S defines the length of non-coding regions.

Retrieve finished job

Paste the job ID in the box

The job ID should be something like prank-S20100806-105656-0250-77112366 and is given after submitting a job.

Display existing alignment

Alignments in HSAML and other formats can be displayed with http://wasabiapp.org.


The PRANK alignment program has its own HSAML format that integrates all the information in one file. That format is now supported by the Wasabi analysis environment.

Contact & links

For feedback, questions and comments, contact the developers at webprank@ebi.ac.uk.

The alignment program PRANK is available as a stand-alone program for different operating systems. More information is available at the program home page. The PRANK alignment algorithm is described in these papers: PMCID:PMC1180752 PMID:18566285 PMCID:PMC3009689

We thank the EBI External Services team for the infrastructure, help and support that enable running this service. However, they are not responsible for any problems that may occur, and complaints or requests concerning this service should not be addressed to them.

If you publish analyses based on webPRANK alignments, please cite:
  Löytynoja, A., Goldman, N. (2010)
  webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser
  BMC Bioinformatics 11, 579



Updated 8 October, 2017. See the change log.