Illustration 7074323 / Dna © Paul Fleet | Dreamstime.com
The size of a species' genome is described by the number of base pairs in its DNA. (A base pair is a pair of complementary nucleotides, one on each strand of the double helix.) Because the range of sizes is huge, the SI prefixes kilo-, mega-, and giga-, together with their symbols, have been adopted. So a kilobase (kb) is 1000 nucleotide pairs in double-stranded DNA, or 1000 nucleotides in single-stranded DNA. Similarly, a megabase (Mb) represents a million nucleotides or nucleotide pairs, and a gigabase (Gb) 10⁹. There is no need for terabase, because no genome that big has been discovered.
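The prefixes can be applied mechanically. The following is a minimal sketch in Python, not part of any standard; the function name `format_base_count` and the fallback label "bp" for counts under 1000 are assumptions made for illustration.

```python
def format_base_count(n_bases: int) -> str:
    """Express a count of bases (or base pairs) using kb, Mb or Gb.

    The name and the "bp" fallback are illustrative assumptions.
    """
    # Try the largest prefix first: giga (10^9), mega (10^6), kilo (10^3).
    for factor, symbol in ((10**9, "Gb"), (10**6, "Mb"), (10**3, "kb")):
        if n_bases >= factor:
            # ":g" drops trailing zeros, so 491000 becomes "491 kb".
            return f"{n_bases / factor:g} {symbol}"
    return f"{n_bases} bp"


print(format_base_count(491_000))         # 491 kb
print(format_base_count(3_120_000_000))   # 3.12 Gb
```

Applied to figures quoted on this page, 491,000 base pairs comes out as 491 kb and 43 billion base pairs as 43 Gb.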
A genome size expressed in base pairs can be converted to picograms by using the factor 1 picogram = 978 Mb.³
3. J. Doležel, J. Bartoš, H. Voglmayr and J. Greilhuber.
Nuclear DNA content and genome size of trout and human.
Cytometry Part A 51A:127–128 (2003)
doi: 10.1002/cyto.a.10013
As the authors note, the same factor was published by T. Cavalier-Smith in
The Evolution of Genome Size.
New York: John Wiley and Sons, 1985.
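The factor of 978 Mb per picogram quoted above makes the conversion a single division or multiplication. Here is a small sketch in Python; the constant and function names are hypothetical, chosen only for this example.

```python
# Conversion factor quoted in the text: 1 picogram of DNA = 978 megabases.
MB_PER_PICOGRAM = 978


def mb_to_picograms(megabases: float) -> float:
    """Convert a genome size in megabases to a DNA mass in picograms."""
    return megabases / MB_PER_PICOGRAM


def picograms_to_mb(picograms: float) -> float:
    """Convert a DNA mass in picograms to a genome size in megabases."""
    return picograms * MB_PER_PICOGRAM


# A genome of 3120 Mb corresponds to roughly 3.19 pg of DNA.
print(round(mb_to_picograms(3120), 2))  # 3.19
```

Databases that report genome size in picograms can be compared with base-pair figures by running the conversion in the other direction with `picograms_to_mb`.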
"Candidatus Tremblaya princeps"
In 2002, the organism with the smallest known genome (491 kilobase pairs) was Nanoarchaeum equitans,¹ a microbe discovered living on the surface of another microbe, Ignicoccus. Both are members of the domain Archaea. They were found in gravel taken from the ocean floor north of Iceland, in an area heated by volcanic activity. N. equitans is extremely small, roughly a sphere 400 nanometers in diameter. As of this writing, the microbe's genome had not been mapped completely, but was estimated at a few hundred genes, well below the size of the previous record holder.
The previous record holder (from 1995) was the bacterium Mycoplasma genitalium, with about 580 kb.² From this and other studies two researchers³ estimated the smallest possible genome at about 256 genes.
1. Harald Huber, Michael J. Hohn, Reinhard Rachel, Tanja Fuchs,
Verna C. Wimmer and Karl O. Stetter.
A new phylum of Archaea represented by a nanosized hyperthermophilic symbiont.
Nature, volume 417, pages 27-28, 2 May 2002.
A letter to the editor.
2. Claire M. Fraser, Jeannine D. Gocayne, Owen White, Mark D.
Adams, Rebecca A. Clayton, Robert D. Fleischmann, Carol J. Bult, Anthony R.
Kerlavage, Granger Sutton, Jenny M. Kelley, Janice L. Fritchman, Janice F.
Weidman, Keith V. Small, Mina Sandusky, Joyce Fuhrmann, David Nguyen, Teresa R.
Utterback, Deborah M. Saudek, Cheryl A. Phillips, Joseph M. Merrick,
Jean-Francois Tomb, Brian A. Dougherty, Kenneth F. Bott, Ping-Chuan Hu, and
Thomas S. Lucier.
The minimal gene complement of Mycoplasma genitalium.
Science, volume 270, pages 397-404 (20 October
1995).
3. Arcady R. Mushegian and Eugene V. Koonin.
A minimal gene set for cellular life derived by comparison of bacterial genomes.
Proceedings of the National Academy of Sciences, volume 93, no. 19, pages
10268–10273 (September 17 1996).
For the genomes, see
www.ncbi.nlm.nih.gov/Complete_Genomes/.
Jack Maniloff.
The minimal cell genome: 'On being the right size.'
Proceedings of the National Academy of Sciences, volume 93, no. 19, pages
10004–10006 (September 17 1996).
Fugu rubripes, a Japanese puffer fish. Although its genome has about 30,000 genes, it is small because it includes very little "junk". See:
www.lbl.gov/Science-Articles/Archive/fugu-decoded.html
www.lbl.gov/Science-Articles/Archive/fugu-facts.html
Amphiuma means; the Australian lungfish (Neoceratodus forsteri), 43 billion base pairs.
Previous record holder: the Mexican axolotl.
Axel Meyer, Siegfried Schloissnig, Paolo Franchini, et al.
Giant lungfish genome elucidates the conquest of land by vertebrates.
Nature, 590, pages 284-289 (18 January 2021).
doi.org/10.1038/s41586-021-03198-8
| Organism | Number of base pairs in millions | Number of genes | When first sequenced | By whom, notes |
|---|---|---|---|---|
| Haemophilus influenzae (bacterium) | 1.8 | 1,740 | 1995 | Fleischmann R. D., Adams M. D., White O., Clayton R. A., Kirkness E. F., Kerlavage A. R., Bult C. J., Tomb J. F., Dougherty B. A., Merrick J. M., et al. |
| Saccharomyces cerevisiae (yeast) | 12.1 | 6,034 | 1996 | |
| Caenorhabditis elegans (roundworm) | 97 | 19,099 | 1998 | First animal genome to be sequenced. |
| Arabidopsis thaliana (thale cress) | 125 | 25,500 | 2000 | First plant genome to be sequenced. The Arabidopsis Genome Initiative. Other articles in the same issue treat the sequences of chromosomes 1, 3 and 5. |
| Drosophila melanogaster (fruit fly) | 185 | 13,061 | 2000 | Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science, vol. 287, page 2185 (24 March 2000). |
| Mus musculus (laboratory mouse) | 3000 | 50,000 | | |
| Homo sapiens (human) | 3120 | 30,000 | 2001 | Human Genome Project and Celera. Venter, J. C. et al. International Human Genome Sequencing Consortium. |
| | | | 2013 | Genome Reference Consortium |
| | 3055 (3.055 billion bp) | | 2021 | Nurk, S., et al. (the Telomere-to-Telomere Consortium) |
| Agrobacterium tumefaciens (crown gall bacterium) | 5.67 | ? | 2001 | www.agrobacterium.org |
| Oryza sativa (rice) | 389 | 37,544 | 2005 | Second plant genome to be sequenced. |
The National Center for Biotechnology Information maintains a fascinating, though advanced, repository of information at
www.ncbi.nlm.nih.gov/Complete_Genomes/ .
www.genomesize.com . An Animal Genome Size Database, maintained by T. Ryan Gregory at the University of Guelph. It reports genome size in picograms.
https://www.alliancegenome.org/
Copyright © 2000 Sizes, Inc. All rights reserved.
Last revised: 15 November 2001.