What effect do variants in coding regions have?

If a variant falls within a coding region, it can be categorised based on how it would affect the codon it falls within (Figure 6).

  • Synonymous/silent – Due to redundancy in the genetic code, many nucleotide changes do not change the amino acid sequence. For example, a GCT to GCC change still encodes alanine.
  • Nonsense – These turn a codon that encodes an amino acid, such as GGA (glycine), into a stop codon, e.g. TGA. This results in a truncated protein, or the transcript may be degraded by nonsense-mediated decay, depending on where in the coding sequence the premature stop codon occurs.
  • Missense – The nucleotide change alters the encoded amino acid, for example ACC (threonine) to AAC (asparagine).
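
The three categories above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the standard genetic code, DNA codons on the coding strand, and a reference codon that encodes an amino acid. The function name classify_codon_change is hypothetical and does not come from any particular tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a substitution by comparing what each codon encodes.

    Hypothetical helper: assumes ref_codon encodes an amino acid, so
    changes to an existing stop codon are out of scope here.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == "*":
        raise ValueError("reference codon is a stop codon; not covered here")
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


# The examples from the list above.
print(classify_codon_change("GCT", "GCC"))  # synonymous (alanine to alanine)
print(classify_codon_change("GGA", "TGA"))  # nonsense (glycine to stop)
print(classify_codon_change("ACC", "AAC"))  # missense (threonine to asparagine)
```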

Algorithms such as SIFT and PolyPhen estimate how likely a missense change is to affect protein function. These estimates are based on how well conserved the protein is, the chemical difference between the amino acids, and the 3D structure of the protein (PolyPhen only). Both provide a score between 0 and 1, along with a qualitative prediction. The scales run in opposite directions: 0 is the most severe for SIFT, whereas 1 is the most severe for PolyPhen. These are predictions only, not experimental validation of the effect.
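
Because the two scores run in opposite directions, it is easy to misread them. The sketch below shows one way the scores might be turned into qualitative labels. The cutoffs (0.05 for SIFT; 0.446 and 0.908 for PolyPhen) are commonly cited defaults and are an assumption here, not something stated in this text. The function names are hypothetical, and the tools' own output should be treated as authoritative.

```python
# Assumed, commonly cited cutoffs; check the documentation of the tool
# and version you are using before relying on them.
SIFT_DELETERIOUS_BELOW = 0.05
POLYPHEN_POSSIBLY_ABOVE = 0.446
POLYPHEN_PROBABLY_ABOVE = 0.908


def _check_score(score: float) -> None:
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")


def sift_prediction(score: float) -> str:
    """SIFT: LOWER scores are more severe (0 is the most severe)."""
    _check_score(score)
    return "deleterious" if score < SIFT_DELETERIOUS_BELOW else "tolerated"


def polyphen_prediction(score: float) -> str:
    """PolyPhen: HIGHER scores are more severe (1 is the most severe)."""
    _check_score(score)
    if score > POLYPHEN_PROBABLY_ABOVE:
        return "probably damaging"
    if score > POLYPHEN_POSSIBLY_ABOVE:
        return "possibly damaging"
    return "benign"


# The same numeric score points in opposite directions for the two tools.
print(sift_prediction(0.01), "/", polyphen_prediction(0.01))
print(sift_prediction(0.99), "/", polyphen_prediction(0.99))
```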

Figure 6. Mutations are classified based on how they affect the codon they are in. Image source: Wikimedia Commons.

Indels in coding regions with a length divisible by three (i.e. whole codon indels) insert or delete whole amino acids in the protein, and are known as in-frame insertions or deletions. Note that an indel with a length divisible by three may also cause a missense or nonsense variant if it falls across two codons. However, if the length is not divisible by three, the indel causes a frameshift: all codons downstream of it are shifted, often resulting in a malformed protein or nonsense-mediated decay.
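
The divisible-by-three rule can be illustrated with a short sketch. It assumes the reference and alternative alleles are given as plain DNA strings and lie entirely within the coding sequence. The helper names classify_indel and split_codons are hypothetical, and the example sequence is made up for illustration.

```python
def classify_indel(ref_allele: str, alt_allele: str) -> str:
    """Classify a coding indel by the change in sequence length."""
    length_change = len(alt_allele) - len(ref_allele)
    if length_change == 0:
        raise ValueError("alleles have the same length, so this is not an indel")
    if length_change % 3 != 0:
        return "frameshift"
    return "in-frame insertion" if length_change > 0 else "in-frame deletion"


def split_codons(coding_sequence: str) -> list[str]:
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    usable_length = len(coding_sequence) - len(coding_sequence) % 3
    return [coding_sequence[i:i + 3] for i in range(0, usable_length, 3)]


# Made-up coding sequence: ATG GCT ACC GGA
reference = "ATGGCTACCGGA"

# Deleting one base (the G at index 3) shifts every downstream codon.
one_base_deleted = reference[:3] + reference[4:]
print(classify_indel("GG", "G"))       # frameshift
print(split_codons(reference))         # ['ATG', 'GCT', 'ACC', 'GGA']
print(split_codons(one_base_deleted))  # ['ATG', 'CTA', 'CCG']

# Deleting a whole codon (GCT) removes one amino acid and keeps the frame.
print(classify_indel("AGCT", "A"))     # in-frame deletion
```

In the one-base deletion, every codon after the first is different from the reference, which is why frameshifts are usually much more disruptive than in-frame indels.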