Biological molecules provide a link between the past and present

The movie Jurassic Park sparked our collective imagination about the power of biological molecules, and Jurassic World continued its legacy. In the first film in the franchise, Jurassic Park, the cartoon character Mr. DNA explains:
“Just one drop of your blood contains billions of strands of DNA, the building blocks of life. A DNA strand like me is a blueprint for building a living thing, and sometimes animals that went extinct millions of years ago, like dinosaurs, left their blueprints behind for us to find.”
(quote from the 1993 movie Jurassic Park).
 

You might be disappointed to learn that the double-stranded helical DNA molecule has a half-life of 521 years (1), making the discovery of dinosaur DNA extremely unlikely. However, DNA isn’t the only clue to biological history: protein can be one too. Compared to DNA, the triple-stranded collagen protein is a much more robust molecule and could provide a ‘protein fingerprint’ for extinct species to help determine whether they are related to other animals. This highly abundant protein is a key component of our bones, so if any protein were to be discovered in bones or fossilized remains from millions of years ago, the most likely candidate would be collagen.
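To get a feel for why such a short half-life all but rules out dinosaur DNA, the decay can be sketched with the standard half-life formula. This is a simplified illustration only: it assumes ideal exponential decay at a constant rate, takes the half-life reported in reference 1, and uses 66 million years as an assumed round figure for the age of the youngest non-avian dinosaur remains. The function name is hypothetical.

```python
import math

HALF_LIFE_YEARS = 521  # half-life of DNA in bone reported in reference 1


def log10_fraction_remaining(age_years, half_life_years=HALF_LIFE_YEARS):
    """Return log10 of the fraction of DNA left after age_years.

    The calculation stays in log space because the fraction itself
    underflows to zero for ages of millions of years.
    """
    half_lives = age_years / half_life_years
    return -half_lives * math.log10(2)


# After one half-life, about half of the DNA remains.
print(10 ** log10_fraction_remaining(521))

# Assumed age of 66 million years: the exponent is in the tens of thousands,
# so under this simple model essentially nothing is left.
print(log10_fraction_remaining(66_000_000))
```

Real preservation depends strongly on temperature and burial conditions, so the model only conveys the order of magnitude of the problem.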

 

The rope-like structure of collagen

Rope is not composed of a single strand, but of multiple strands twisted together. Collagen has a rope-like structure that starts at the molecular level, with three amino acid chains twisting around each other to form a triple helix (diameter 1.5 nm, length 300 nm). Triple helical units associate to form a microfibril (diameter 4 nm; reference 2). Multiple microfibrils associate into fibrils (diameter of approximately 20 nm), and multiple collagen fibrils then associate to form a collagen fiber (diameter of 50 to 200 nm).

 

image

The rope-like structure of collagen starts with three amino acid chains forming a triple helix. Multiple triple helical units then associate to form a microfibril. Microfibrils further associate into larger and longer structures that ultimately result in a collagen fiber.
Links to 3D visualizations of structures determined for the collagen microfibril:
PDB ID 3HR2 and PDB ID 3HQV

 

Collagen fibers can associate to form bundles of fibers in bone with diameters of 10,000 to 80,000 nm (10 to 80 micrometers). For reference, human hair has a diameter of 20 to 70 micrometers and is composed of a different protein called keratin (also found in nails, hooves and feathers).
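The diameters quoted in this section span several orders of magnitude, which is easier to see side by side. The sketch below simply tabulates the figures from the text and converts them from nanometers to micrometers; the dictionary and function names are hypothetical.

```python
NM_PER_MICROMETER = 1_000

# Diameters in nanometers as quoted in the text: (low, high).
DIAMETERS_NM = {
    "triple helix": (1.5, 1.5),
    "microfibril": (4, 4),
    "fibril": (20, 20),
    "fiber": (50, 200),
    "fiber bundle in bone": (10_000, 80_000),
    "human hair (keratin)": (20_000, 70_000),
}


def nm_to_micrometers(nanometers):
    """Convert a length in nanometers to micrometers."""
    return nanometers / NM_PER_MICROMETER


for name, (low, high) in DIAMETERS_NM.items():
    print(f"{name}: {nm_to_micrometers(low)} to {nm_to_micrometers(high)} micrometers")
```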

 

Collagen in bone

Bone is an inorganic-organic composite. Its main organic component is collagen fibers and its main inorganic component is a calcium phosphate mineral called hydroxyapatite. The collagen fibers provide a scaffold on which extremely small hydroxyapatite crystals are deposited. The mineral deposits are key to bone bearing loads under stress, whereas the collagen protein fibers are key to bone recovering its structure after deformation. In this way, the inorganic and organic parts combine to create a material that is resistant to fractures (3) and yet can support the many movements of humans and other animals, such as walking, running, jumping, and dancing.

 

Many roles of collagen

Collagen is estimated to constitute 20 to 40% of all protein in humans and other animals, making it the most abundant protein in humans. This abundance is due to collagen’s many roles in providing structure and strength in our bodies. In addition to being in our bones, collagen is also found in our teeth, corneas, skin, cartilage, muscle, heart vesicles, and many other organs, and it is a key component of the ligaments that connect bone to bone and the tendons that connect muscle to bone. Because collagen provides structure and form in so many organs and tissues, knowledge about collagen is important for understanding many diseases. It can also be important for the development of new medical treatments, such as bioengineered skin graft options for treating burn victims.

 

A different type of helix

The DNA helix is somewhat famous for being composed of two chains of nucleotides, and has many features that make it ideal for storing blueprint information for living things. Collagen, as mentioned previously, involves three chains that are composed of amino acids, not nucleotides. Collagen, unlike DNA, cannot self-assemble. DNA has ‘rungs’ in its double-helix that involve base-pairing and the base-pairing that occurs in DNA can be used as a ‘self-guidance’ mechanism allowing DNA to ‘unzip’ and ‘re-zip’ its structure with relative ease. For collagen, there must be a mechanism to guide how to coil the three amino acid chains together so the collagen triple helix can assemble correctly. 

Around the time scientists discovered that DNA forms a double helix, G. N. Ramachandran (the scientist who also developed the data quality metric known as the ‘Ramachandran plot’) discovered that collagen forms a triple helix structure. Subsequently it was uncovered that there are 43 human genes that code for collagen. These 43 genes correspond to 28 different types of collagen. There are sections of these genes that encode protein regions that are absent from the collagen fibers we observe in tissues and organs. These protein regions were discovered to contain ‘trimerization domains’. Trimerization domains are absent in collagen fibers but are critical for collagen chains to assemble correctly into a triple helix (4). It is at these trimerization domains that the collagen protein chains initially align, with these chains being referred to as the leading, middle, and lagging chains in the triple helix.

 

image

A collagen triple helix and an associated alpha-helical trimerization domain. The triple helix has leading, middle and lagging chains. A key feature in the collagen amino acid sequence is that every third amino acid is a glycine, which is the smallest amino acid and can fit in the center of the helix. The next most common amino acids in the collagen triple helix sequence are proline and hydroxyproline.
Link to the 3D visualization of a structure determined for a collagen trimerization domain and a portion of the triple helix: PDB ID 5CTI

 

Trimerization domains prevent the triple helix from forming a microfibril until it is appropriate for it to do so. Collagen microfibril formation, and the subsequent fibril and fiber formation, require several enzymes to act, including a protease that removes the trimerization domain. Other enzymes add crosslinks between the triple helical units to drive the assembly of collagen into larger and longer fibers.

Diverse cross-linking occurs when these triple helical units associate to form microfibrils and fibrils. An online tool called ColBuilder has been developed so that the resulting microfibrils and fibrils can be visualized in 3D. It generates theoretical models and is available at the following link: https://colbuilder.h-its.org/index.html

 

Amino acid composition of collagen

The amino acid sequence of collagen contains repeating patterns involving three different amino acids: glycine (GLY), proline (PRO), and hydroxyproline (HYP). HYP is not in the standard set of 20 amino acids because it is generated after the amino acids have already been joined into a chain. HYP is generated by an enzyme that acts on PRO in the protein chain, converting it to HYP.

The combination of these three amino acids (GLY, PRO and HYP) in a repeating pattern is fundamental to enabling the formation of the collagen triple helix, as GLY is small and can fit in the middle of the helix, while PRO and HYP create ‘kinks’ in the protein chain causing it to bend back upon itself. The abundance of these three amino acids and their repeating pattern contribute to collagen having the capacity to be a ‘protein fingerprint’ in fossils and bones.
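The repeating pattern described above, with a glycine at every third position, can be checked with a few lines of code. This is a minimal sketch rather than an established analysis method: it assumes sequences written in one-letter codes, uses 'O' for hydroxyproline (a convention assumed here for illustration), and the example peptide is a made-up idealized repeat, not a sequence from a real collagen gene. The function names are hypothetical.

```python
from collections import Counter


def glycine_frame(sequence):
    """Return the offset (0, 1 or 2) at which every third residue is
    glycine ('G'), or None if no such reading frame exists."""
    seq = sequence.upper()
    for offset in range(3):
        every_third = seq[offset::3]
        if every_third and all(residue == "G" for residue in every_third):
            return offset
    return None


def composition(sequence):
    """Return the fraction of each residue in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {residue: n / len(seq) for residue, n in counts.items()}


# Idealized, hypothetical repeat: glycine, proline, hydroxyproline.
example = "GPOGPOGPOGPO"
print(glycine_frame(example))
print(composition(example))
```

A fragment that starts mid-repeat still passes the check at a different offset, which is why all three reading frames are tried.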

 

Collagen in dinosaur bones

The presence of collagen in dinosaur bones was first reported by Dr Mary Schweitzer after certain treatments were performed on sections of a femur from Tyrannosaurus rex (5). This was met with criticism and doubt from other scientists (6). Some scientists proposed that the bones had been contaminated and that the reported collagen fragments were either not collagen or not from the dinosaur. There is evidence that dinosaur bones examined immediately after being removed from the ground contain many micro-organisms, and these micro-organisms could possibly give a false positive for the presence of dinosaur protein (7). In contrast, other scientists have performed different types of studies that also indicated the presence of collagen in dinosaur bones (8). The potential for collagen to be a ‘protein fingerprint’ and to expand our understanding of many extinct species means new research is being pursued and many more discoveries could be just around the corner.

Genevieve Evans

 

About the artwork

Collagen is the single most abundant protein in the animal kingdom, composed of three protein chains, wound together in a rope-like structure to form a tightly-packed triple helix. Mia used this visual feature to portray the link between the dinosaurs who lived millions of years ago and today’s roosters. Mia, aged 16, is a student at the Viewbank College, Melbourne, Australia. She is a passionate arts and science student who wants to pursue a career related to the protection of Australian wildlife. Her all-time favourite movie is Jurassic World and she has a strong interest in paleontology.

View the artwork in the virtual 2021 PDB Art exhibition. 

 

Structures mentioned in this article

PDB ID 3HR2

PDB ID 3HQV

PDB ID 5CTI


 

Sources / Further reading

1. The determination of the half-life of DNA
(a) Kaplan, M. (2012) DNA has a 521-year half-life. Nature
LINK: https://www.nature.com/articles/nature.2012.11555
(b) Allentoft, M. E., Collins, M., Harker, D., Haile, J., Oskam, C. L., Hale, M. L., Campos, P. F., Samaniego, J. A., Gilbert, M. T. P., Willerslev, E., Zhang, G., Scofield, R. P., Holdaway, R. N., and Bunce, M. (2012) The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proceedings of the Royal Society B: Biological Sciences 279, 4724-4733
LINK: https://royalsocietypublishing.org/doi/10.1098/rspb.2012.1745

2. The determination of the microfibril structure of collagen
Orgel, J. P. R. O., Irving, T. C., Miller, A., and Wess, T. J. (2006) Microfibrillar structure of type I collagen in situ. Proceedings of the National Academy of Sciences 103, 9001
LINK: https://www.pnas.org/content/103/24/9001

3. Understanding the molecular mechanics in bone
Nair, A. K., Gautieri, A., Chang, S.-W., and Buehler, M. J. (2013) Molecular mechanics of mineralized collagen fibrils in bone. Nature Communications 4, 1724
LINK: https://www.nature.com/articles/ncomms2720

4. ‘Trimerization domain’ in collagen structures
Boudko, S. P., and Bächinger, H. P. (2016) Structural insight for chain selection and stagger control in collagen. Scientific Reports 6, 37831
LINK: https://www.nature.com/articles/srep37831

5. The first report of collagen being found in dinosaur bones
Schweitzer, M. H., Suo, Z., Avci, R., Asara, J. M., Allen, M. A., Arce, F. T., and Horner, J. R. (2007) Analyses of Soft Tissue from Tyrannosaurus rex Suggest the Presence of Protein. Science 316, 277-280
LINK: https://www.jstor.org/stable/20036013

6. Perspective shared from Dr Mary Schweitzer who first reported collagen being present in dinosaur bones
‘I don't care what they say about me': Paleontologist stares down critics in her hunt for dinosaur proteins
LINK: https://www.science.org/content/article/i-don-t-care-what-they-say-about-me-paleontologist-stares-down-critics-her-hunt

7. Criticisms about the collagen being present in dinosaur bones
(a) Buckley, M., Walker, A., Ho, S. Y. W., Yang, Y., Smith, C., Ashton, P., Oates, J. T., Cappellini, E., Koon, H., Penkman, K., Elsworth, B., Ashford, D., Solazzo, C., Andrews, P., Strahler, J., Shapiro, B., Ostrom, P., Gandhi, H., Miller, W., Raney, B., Zylber, M. I., Gilbert, M. T. P., Prigodich, R. V., Ryan, M., Rijsdijk, K. F., Janoo, A., and Collins, M. J. (2008) Comment on "Protein Sequences from Mastodon and Tyrannosaurus rex Revealed by Mass Spectrometry". Science 319, 33
LINK: https://www.science.org/doi/10.1126/science.1147046
(b) Kaye, T. G., Gaugler, G., and Sawlowicz, Z. (2008) Dinosaurian Soft Tissues Interpreted as Bacterial Biofilms. PLOS ONE 3, e2808
LINK: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0002808
(c) Saitta, E. T., Liang, R., Lau, M. C. Y., Brown, C. M., Longrich, N. R., Kaye, T. G., Novak, B. J., Salzberg, S. L., Norell, M. A., Abbott, G. D., Dickinson, M. R., Vinther, J., Bull, I. D., Brooker, R. A., Martin, P., Donohoe, P., Knowles, T. D. J., Penkman, K. E. H., and Onstott, T. (2019) Cretaceous dinosaur bone contains recent organic material and provides an environment conducive to microbial communities. eLife 8, e46205
LINK: https://elifesciences.org/articles/46205

8. Subsequent reports of collagen being found in dinosaur bones
(a) Bertazzo, S., Maidment, S. C. R., Kallepitis, C., Fearn, S., Stevens, M. M., and Xie, H.-n. (2015) Fibres and cellular structures preserved in 75-million-year-old dinosaur specimens. Nature Communications 6, 7352
LINK: https://www.nature.com/articles/ncomms8352
(b) Lee, Y.-C., Chiang, C.-C., Huang, P.-Y., Chung, C.-Y., Huang, T. D., Wang, C.-C., Chen, C.-I., Chang, R.-S., Liao, C.-H., and Reisz, R. R. (2017) Evidence of preserved collagen in an Early Jurassic sauropodomorph dinosaur revealed by synchrotron FTIR microspectroscopy. Nature Communications 8, 14220
LINK: https://www.nature.com/articles/ncomms14220