Introduction to a Survey of Genomes - Agrobacterium tumefaciens C58

Now we will tour through a survey of some sequenced genomes. All three domains of life will be represented, but the Bacteria and Archaea will get the lion’s share. For each genome, we will learn why scientists are interested in the organism, some basic data about the genome, its genes and encoded proteins, a few surprises from the genome sequence, and an example of how scientists took the next step past having the genome sequence. Our first sequence is my personal favorite: the soil bacterium, plant pathogen, and biotechnology agent Agrobacterium tumefaciens C58.

This is Brad Goodner.  Welcome back to Genomics Revolution.  In our first 6 episodes, we have introduced the terms genome and genomics, talked about how the field of genomics got its start, and looked at the steps of a genome project using the first ever sequenced cellular genome as an example.

Now we will tour through a survey of some sequenced genomes.  All three domains of life will be represented, but the Bacteria and Archaea will get the lion’s share.  For each genome, we will learn why scientists are interested in the organism, some basic data about the genome, its genes and encoded proteins, a few surprises from the genome sequence, and an example of how scientists took the next step past having the genome sequence.  Each genome will be presented by a different student in the 2019 Hiram College Genetics course and they will put their own unique spins on their assigned subjects.

To get us started, I will talk about my favorite genome, #50 in terms of getting published according to my counting.  This is the genome of Agrobacterium tumefaciens strain C58, an organism I have worked with on and off since 1983.  Agrobacterium, or Agro for short, is a genus from the Bacteria division alpha-Proteobacteria found in soils all over the world and is best known because some strains are plant pathogens.  These pathogenic strains contain a plasmid that allows them to do something no other bacterial pathogens can do – transfer a piece of their own DNA into their eukaryotic host cell where the expression of genes on the transferred DNA causes the cells of the plant host to act very very differently.  The “transformed” plant cells grow out of control because they make their own growth-stimulating hormones and they produce and secrete some strange compounds that Agro can use as C and N sources.  It turns out that Agro has been genetically engineering plants on its own for a long time before any humans thought about the possibility.

I came back to work on Agro in 1996 when I read a paper by Allardet-Servent and coworkers (1) who showed that in strain C58 there were two DNA molecules greater than 1 Mbp that contained rRNA genes.  The presence of rRNA genes is usually indicative of a chromosome, but this would mean that strain C58 has 2 chromosomes and there was no previous evidence of this.  A group of 8 undergraduates at University of Richmond worked with me to generate and map a large collection of transposon insertions in essential genes of Agro C58.  Our paper (2), published in 1999, proved that there are 2 chromosomes in strain C58.  The larger chromosome is a circle of roughly 3 Mbp, but the smaller 2.1 Mbp chromosome is a linear DNA molecule.  Agro’s closest relatives in the genus Rhizobium only show 1 circular chromosome of roughly 3.6 Mbp, so we wondered: where did the smaller linear chromosome come from?  We imagined 3 possibilities.  One, the smaller chromosome originated from a breakage event in the original circular chromosome.  Two, the smaller chromosome came in from the outside such as a viral infection.  Three, some combination of the first two hypotheses.  It was this question that drove my lab to start sequencing the Agro C58 genome in 1999.

We started with $3000 to build a genomic library and start sequencing library clones, but we knew the full cost would be closer to a half million dollars.  In the fall of 1999, we presented some of our initial findings at a small research conference that focuses on the biology of Agrobacterium.  At the end of the conference, a gentleman approached me.  “My name is Steve Slater,” he said, “and I work for a small company called Cereon Genomics.  We need to talk but not here.  I will call you tomorrow.”  On the flight back home, I told my wife Asha that I thought Steve Slater was going to tell me that his group had already sequenced the C58 genome.  However, the next day, Steve told me that Cereon Genomics was just beginning to sequence the genome and that they wanted to collaborate with my research team because of our genome map of transposon insertions.  Steve rightfully saw the value of using our map to orient and join up the sequenced pieces of the genome.  We reached an agreement between Cereon Genomics, its parent company Monsanto Corporation, and my research lab.  The agreement required all partners to agree as to when and how to publish the finished work.  If any one partner didn’t want to publish, the collaboration would stop.

It was a fun but odd collaboration.  My students got to work with a lot more sequence information, but we had to use a dial-in modem connection on one computer to access the company database.  This restriction slowed us down but we made consistent progress and were basically finished with sequencing and assembling the genome sequence by the end of 2000.  Around that time, we became aware of another collaboration between an academic lab at University of Washington and DuPont Corporation that was also sequencing the Agro C58 genome.  It was a race but luckily in the end both collaborations agreed to publish back-to-back articles (3,4) in the journal SCIENCE that came out in December of 2001.

So what did we learn from the Agro C58 genome sequence?  First and foremost, the 2.1 Mbp linear chromosome was evolutionarily derived from a plasmid!  The origin of replication on this chromosome is clearly a member of the repABC plasmid family, very similar to those found on two large plasmids in strain C58.  Second, the linearity of the second chromosome is due to hairpin loops on each end where the top and bottom strands are connected through a stem-loop structure.  In a later paper, we obtained the full sequence of the hairpin loops and showed that the linear chromosome is found only in one subset of Agrobacterium and Rhizobium strains called biovar 1 (5).  During replication, the two “old” strands are still connected at their ends.  Once the hairpin loops are replicated, an enzyme called protelomerase recognizes the double-stranded hairpin sequences and makes staggered cuts to allow the two new daughter double-stranded DNA molecules to separate and reform hairpin loops on each end.  We don’t yet know the evolutionary origin of the hairpin loops and the gene encoding protelomerase.  My personal hypothesis is that they came in as part of a linear bacteriophage.  Third, comparison of the two chromosomes of Agro C58 with the sequenced single chromosome of Sinorhizobium strain 1021 showed clear evidence that several large chunks of the ancestral circular chromosome moved to the plasmid that became the second chromosome.  I had several Hiram College students continue studying this phenomenon and this became part of another follow-up publication (6).

There were a lot more insights gleaned from the Agro C58 genome sequence and we continue to link genes to functions using functional genomics experiments such as the Mariner-type transposon mutagenesis screen going on in the 2019 Hiram College Genetics course.  However, I will leave those details for another time.

The Agro C58 genome shows us how complex genomes can arise in the Bacteria domain and how genomes can rearrange over time.  Now let us see what we can learn from other genomes.  Stay tuned to Genomics Revolution.

For More Information on Agrobacterium strain C58 & its genome:
(1) Allardet-Servent et al., 1993.  Journal of Bacteriology 175:7869-75.
(2) Goodner et al., 1999.  Journal of Bacteriology 181:5160-6.
(3) Goodner et al., 2001.  Science 294:2323-8.
(4) Wood et al., 2001.  Science 294:2317-23.
(5) Slater et al., 2013.  Applied & Environmental Microbiology 79:1414-7.
(6) Slater et al., 2009.  Journal of Bacteriology 191:2501-11.