What are HMMs?

Hidden Markov models (HMMs) are used by many protein classification databases. Like profiles, they convert multiple sequence alignments into position-specific scoring systems. HMMs are adept at representing amino acid insertions and deletions, so they can model entire alignments, including divergent regions. They are sophisticated and powerful statistical models, very well suited to searching databases for homologous sequences [7].

Figure 14 Representation of a hidden Markov model based on a multiple sequence alignment. Amino acids are given a score at each position in the sequence alignment according to the frequency with which they occur. Transition probabilities (i.e., the likelihood of moving from one state of the model, such as a match, insertion or deletion, to the next) are also modelled, together with the insertion and deletion states themselves.
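To make the idea of position-specific scoring more concrete, the sketch below builds a very simplified scoring model from a toy alignment. It is an illustration only, not the method used by any particular database: the alignment, the function names (match_columns, emission_probabilities, log_odds_score), the 50% gap threshold, the add-one pseudocount and the uniform background frequency are all assumptions made for this example. Transition probabilities and the insertion and deletion states are deliberately left out, so it covers only the per-position amino acid scores.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def match_columns(alignment, max_gap_fraction=0.5):
    """Return indices of columns treated as match positions.

    Assumption for this sketch: a column where more than half of the
    sequences have a gap is treated as an insertion and is skipped.
    """
    n_seqs = len(alignment)
    columns = []
    for i in range(len(alignment[0])):
        gaps = sum(1 for seq in alignment if seq[i] == "-")
        if gaps / n_seqs <= max_gap_fraction:
            columns.append(i)
    return columns


def emission_probabilities(alignment, columns, pseudocount=1.0):
    """Estimate amino acid probabilities for each match column.

    A pseudocount is added to every amino acid so that residues not
    seen in the alignment still receive a small, non-zero probability.
    """
    probs = []
    for i in columns:
        counts = Counter(seq[i] for seq in alignment if seq[i] != "-")
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        probs.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return probs


def log_odds_score(sequence, probs, background=1 / 20):
    """Sum per-position log-odds scores (in bits) for an ungapped sequence.

    The uniform background frequency is a simplifying assumption.
    """
    return sum(
        math.log2(p[aa] / background) for aa, p in zip(sequence, probs)
    )


# Toy alignment (hypothetical data): column 3 is mostly gaps.
alignment = [
    "ACD-EF",
    "ACDGEF",
    "ACD-EY",
    "SCD-EF",
]

cols = match_columns(alignment)
probs = emission_probabilities(alignment, cols)

score_similar = log_odds_score("ACDEF", probs)    # resembles the alignment
score_unrelated = log_odds_score("WWWWW", probs)  # does not resemble it
print(cols, round(score_similar, 2), round(score_unrelated, 2))
```

A sequence that resembles the alignment receives a positive score, while an unrelated one receives a negative score. A full profile HMM would additionally estimate the probabilities of moving between match, insertion and deletion states, which is what allows it to score sequences containing gaps.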

HMMs have wide utility, as is clear from the numerous databases that use this method for protein classification, including Pfam, SMART, NCBIfam (including TIGRFAMs), PIRSF, PANTHER, SFLD, Superfamily and Gene3D.