The Power of an Idea - The Human Genome Project

The beginnings of the genomics revolution came with a bold proposal to sequence the entire genetic material, the genome, of a human being. It was very much the genetic equivalent of President Kennedy challenging the US to put a man on the moon.
Welcome back. I am Brad Goodner. While genetics as a scientific discipline did not need DNA sequences to get started, it certainly progressed at a much faster clip once one could see what genes actually looked like and determine how genes change due to different mutations. We discussed the Maxam-Gilbert chemical and Sanger enzymatic methods for sequencing DNA strands in our last episode. Their impact was so immediate that Gilbert and Sanger shared half of a Nobel Prize just a few short years later. Amazingly, it was Sanger’s second! The other half of that Nobel went to Paul Berg, who led efforts in the early 1970s to develop the methods that we now call recombinant DNA technology or DNA cloning.

These techniques involved cutting DNA sequences at specific sites using bacterial enzymes called restriction endonucleases. The name restriction comes from the way these enzymes restrict the growth of viruses that infect bacteria by cutting up the viral DNA, and endonucleases are enzymes that cut nucleic acids, in this case double-stranded DNA, within the molecule as opposed to at a free end. For example, the restriction endonuclease BamHI always cuts the DNA sequence 5’-GGATCC-3’. Notice that the complementary strand of DNA is also 5’-GGATCC-3’, just running right to left instead of left to right. Such sequences within a double-stranded DNA molecule are called palindromic sequences.

By cutting DNA molecules with different restriction endonucleases and figuring out the sizes of the resulting DNA fragments, scientists could come up with physical maps of a DNA molecule. While they could not work out the complete sequence of the DNA molecule from this data alone, the map was a starting point built on the little sequence information they did have, namely where each enzyme’s recognition site fell. By analogy, it was like knowing the layout of streets in a town without knowing every house on every street. In addition to enzymes that cut DNA, recombinant DNA cloning also involved enzymes that could sew DNA fragments back together. The medical term for sewing back up is ligation and these enzymes are called DNA ligases.
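For those following along in written form, here is a minimal Python sketch of the two ideas just described: checking that GGATCC reads the same on both strands, and listing the fragment sizes a digest would produce. The function names and the toy DNA sequence are made up for illustration. The cut position used here, between the first two Gs of the site, is a known property of BamHI that was not stated in the episode, and the sketch treats the DNA as linear and only tracks one strand.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site.upper() == reverse_complement(site)


def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases into the site where the cut falls.
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]


BAMHI_SITE = "GGATCC"
BAMHI_CUT_OFFSET = 1  # assumption stated above: cut falls after the first G

# Hypothetical 26-base sequence containing two BamHI sites.
toy_dna = "AAAA" + "GGATCC" + "TTTTTTTT" + "GGATCC" + "CC"

if __name__ == "__main__":
    print(is_palindromic(BAMHI_SITE))
    print(digest(toy_dna, BAMHI_SITE, BAMHI_CUT_OFFSET))
```

Comparing the fragment sizes from several different enzymes, alone and in combination, is what lets one place the cut sites relative to each other on a physical map.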

This growing physical mapping information about DNA molecules and the initial efforts to sequence fairly small pieces of DNA strands were building on a much older history of genetic maps in different model genetic organisms such as fruit flies, baker’s/brewer’s yeast, and maize. By following the inheritance of different mutations through crosses, geneticists could start to arrange mutations and the genes they were in along linear maps of chromosomes. Through these efforts, they figured out that the number of genetic maps in a given organism usually equalled the number of different types of chromosomes in that organism. In bacteria such as E. coli that lack sexual reproduction, geneticists came up with modifications to their genetic mapping strategies. In most bacteria, the genetic map formed a circle which later matched the true circular nature of the chromosome. These genetic maps were cruder than the physical maps in terms of scale – putting towns in spatial reference to each other rather than individual streets and houses, but genetic maps had a real advantage. They were linked to traits, measurable phenotypes seen in an organism.

All of these tools, old and new, were in the hands of geneticists and other scientists interested in DNA by the year 1980. Over the next 15 years, advances in recombinant DNA technology and DNA sequencing along with sociological changes in the way scientists and governments approached scientific challenges brought forth the Human Genome Project. Here are some of those changes.

Scientists came up with ways to use restriction endonucleases to physically map the human genome. By 1995, the physical map had over 15,000 markers on it. The genetic map of the human genome had 400 mapped traits by 1987.

Scientists came up with ways to handle and clone into plasmids bigger and bigger chunks of DNA, moving from a few thousand base pairs up to over 100 thousand base pairs.

Kary Mullis and colleagues at Cetus Corporation developed a strategy for using DNA polymerase to replicate user-defined short stretches of DNA over and over again, amplifying the amount of the user-defined sequence. Their strategy, called the Polymerase Chain Reaction, made it easy to obtain workable amounts of specific DNA sequences from a tiny amount of starting material.

Scientists, both at universities and connected to business interests, made Sanger replication-based DNA sequencing into an automated technology. DNA sequencing became more of a standardized service that universities and research institutions provided to their researchers than an individual lab art form.

Big-name scientists wrote opinion pieces in the top scientific journals making a case for an all-out effort to sequence the human genome. Discussions about such an effort took place at several research conferences.

In the United States, the National Institutes of Health, NIH for short, and the Department of Energy, DoE for short, independently started plans for sequencing the human genome. NIH makes sense given its mandate to promote human health, but DoE had two good reasons as well. Its governmental charge is to safeguard and promote energy supplies of all types in the US. One energy supply, nuclear power, has clear safety concerns when it comes to exposure to nuclear radiation and subsequent DNA damage. DoE wanted to better understand the impact of radiation on the human genome. DoE had a longer-term energy interest as well: bioenergy in the form of organic carbon polymers stored in algae, crops, and trees. Not as scientifically sexy a topic as the human genome, but very important in its own right. In the end, NIH took the lead, but both government agencies were heavily involved in the Human Genome Project. Jim Watson of Watson and Crick fame was picked to head up the new effort. He stayed in the job for 5 years and was replaced by Francis Collins, who led the US government-based efforts until they reached their original goal of a complete human genome sequence.

Likewise, government-based scientific agencies in Europe and in Japan made similar decisions to be part of the Human Genome Project.

Early on, several groups of scientists, government agencies, and business-based efforts realized that the smaller genomes of other model organisms would be good starting points for improving experimental strategies, sequencing technologies, and data analysis tools. There was a lot of basic biology interest in these smaller genomes as well. By the time the first drafts of a human genome sequence were published in 2001, there were already over 50 completed genome sequences of other cellular organisms: many Bacteria, a handful of Archaea, a fungus, a plant, a nematode, and a fruit fly.

Next time, we will consider the two initially competing approaches that were taken to sequence the human genome and why one of those approaches became the standard for all subsequent genomes and metagenomes. Talk to you again soon.