IJSEM Journal of Bacteriology
Int J Syst Evol Microbiol 57 (2007), 81-91; DOI  10.1099/ijs.0.64483-0
© 2007 International Union of Microbiological Societies

DNA–DNA hybridization values and their relationship to whole-genome sequence similarities

Johan Goris1,†, Konstantinos T. Konstantinidis1,‡, Joel A. Klappenbach1, Tom Coenye2, Peter Vandamme2 and James M. Tiedje1

1 Center for Microbial Ecology, Michigan State University, East Lansing, MI 48824, USA
2 Laboratory for Microbiology, Gent University, K. L. Ledeganckstraat 35, B-9000 Gent, Belgium

Correspondence
Johan Goris
johan_goris@applied-maths.com


    ABSTRACT
 
DNA–DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA–DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of ‘species’. Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.


Abbreviations: ANI, average nucleotide identity; DDH, DNA–DNA hybridization

†Present address: Applied Maths NV, Keistraat 120, B-9830 Sint-Martens-Latem, Belgium.

‡Present address: 15 Vassar Street, Room 48-336, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.


    INTRODUCTION
 
There is a general consensus among taxonomists that all taxonomic information about a bacterium is incorporated in the complete nucleotide sequence of its genome (Stackebrandt et al., 2002; Wayne et al., 1987). As whole-genome sequencing did not become available until recent years and, even then, only for a limited number of organisms, other parameters by which organisms could be classified into practical categories were needed. Since the 1960s, DNA–DNA hybridization (DDH) experiments have been performed to determine relatedness between bacteria, as this was one of the few universally applicable techniques available that could offer truly genome-wide comparisons between organisms. A value of 70 % DDH was proposed by Wayne et al. (1987) as a recommended standard for delineating species. Although this recommendation is not a strict standard and, in fact, several studies have used more stringent DDH cut-off values or have employed no DDH experiments at all, we focused our analysis on 70 % DDH because it is the best-known standard. Several principally different methods for the measurement of DDH values have been described (Brenner et al., 1969; Crosa et al., 1973; De Ley et al., 1970; Ezaki et al., 1989), and the use of DDH in bacterial taxonomy has recently been reviewed in detail (Rosselló-Mora, 2006). While the technique has the above-mentioned advantages, it also has several important drawbacks. Because relatively large quantities of DNA (in comparison with PCR-based techniques) of a high quality are required, the whole process of performing DDHs often becomes rather time-consuming and labour-intensive. Also, the diverse methods that are available can yield different results, especially for lower reassociation values (Grimont et al., 1980; Huß et al., 1983).
Its main disadvantage, however, is that because of the comparative nature of the technique no incremental databases can be built, in contrast to sequence information, for example (Gevers et al., 2005; Stackebrandt, 2003). Because of these drawbacks, bacterial taxonomists are actively searching for alternative methods that can replace DDH experiments (Cho & Tiedje, 2001; Coenye et al., 2005; Gevers et al., 2005).

We have recently shown that the average nucleotide identity (ANI) of conserved genes present in two sequenced strains represents a robust measure of the genetic and evolutionary distance between them, because it shows a strong correlation with 16S rRNA gene sequence similarity and the mutation rate of the genome, it is not affected by lateral transfer or variable recombination rates of single (or a few) genes and it offers resolution at the subspecies level (Konstantinidis & Tiedje, 2005). Previously, ANI was compared with DDH values using a limited number of published data often obtained with different hybridization methods. When DDH values were not available for the sequenced strains, mean DDH values for other strains of the same species were used in the calculations (Konstantinidis & Tiedje, 2005). However, it is important to account for strain differences within species and to perform all DDH experiments with a single, well-established method under identical experimental conditions.

The goal of the present study was to examine more accurately the relationship between DDH values and (genomic) sequence-derived parameters, such as ANI. For this purpose, we determined a large number of DDH values among related strains for which the whole genome had been sequenced. Furthermore, we evaluated whether genome size differences can explain differences in reciprocal reactions that are often observed with DDH.


    METHODS
 
Strain selection.
Strains in this study were chosen on the basis of the availability of their complete genome sequences (either fully closed or high draft sequences). Hybridization groups were defined as groups of strains sharing more than 94 % 16S rRNA gene sequence identity (Table 1). The following strains were generously provided by the researchers indicated: Shigella sonnei 53G (Dr Ian Henderson, Department of Microbiology and Immunology, Queen's University of Belfast School of Medicine, UK), Pseudomonas fluorescens SBW25 (Dr Andrew J. Spiers, Department of Plant Sciences, University of Oxford, UK), P. fluorescens PfO-1 (Dr Mark W. Silby, Department of Molecular Biology and Microbiology, Tufts University School of Medicine, MA, USA), Pseudomonas syringae pv. syringae B728a (Dr Steve E. Lindow, Department of Plant and Microbial Biology, University of California, CA, USA) and P. syringae pv. tomato DC3000 (Dr Cheng Yang He, Department of Plant Biology, Michigan State University, MI, USA). All six Escherichia coli strains and Shigella flexneri 2a 2457T were generously provided by Dr Thomas S. Whittam (National Food Safety and Toxicology Center, Michigan State University, MI, USA). Dr Margaret F. Romine (Pacific Northwest National Laboratory, Richland, WA, USA) provided the Shewanella strains. Bacillus cereus ATCC 10987, P. fluorescens Pf-5 (=ATCC BAA-477), Pseudomonas aeruginosa PAO1 (=ATCC BAA-47), Streptococcus agalactiae 2603 V/R (=ATCC BAA-611) and Streptococcus agalactiae NEM 316 (=ATCC 12403) were obtained from the American Type Culture Collection (Manassas, VA, USA), and Bacillus cereus ATCC 14579T (=LMG 6923T) and Streptococcus agalactiae A909 (=LMG 15083) were provided by the BCCM/LMG Bacteria Collection (Gent University). All strains were grown aerobically for 24 h (see Table 1 for details of the growth media and incubation temperatures).



 
Table 1. Strains used in this study

 
DNA–DNA hybridizations.
DNA from Gram-negative organisms was prepared as described by Marmur (1961), with reagent volumes adapted to the use of 50 ml centrifuge tubes. For Gram-positive organisms, DNA was prepared according to De Clerck et al. (2004). DDH reactions were performed as described by Ezaki et al. (1989), with slight modifications. Briefly, DNA was non-covalently adsorbed to polystyrene microplates (black MaxiSorp, FluoroNunc; Nunc) by incubating 100 µl portions of a denatured DNA solution [10 ng DNA per µl phosphate-buffered saline (PBS)/MgCl2 (8 mM NaH2PO4, 1.5 mM KH2PO4, pH 7.2, 137 mM NaCl, 2.7 mM KCl, 0.1 M MgCl2)] per well at 30 °C for 4 h in a hybridization oven. Before incubation, plates were sealed with self-adhesive vinyl tape (Nunc). The plates were then washed once with 300 µl PBS per well with the aid of a multichannel pipette, dried at 45 °C for 15 min and stored in a desiccator at 4 °C. Probe DNA was labelled by mixing 10 µl DNA solution [0.5 µg µl⁻¹ in 0.1× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate, pH 7.0±0.2)] plus 10 µl photobiotin solution (Sigma) (0.5 µg µl⁻¹ in water) in a 1.5 ml Eppendorf tube and illuminating the mixture for 30 min under a 400 W mercury-vapour lamp while the open tube was kept upright in a cooling block on ice. The labelled probe DNA was diluted by adding 185 µl 0.1 M Tris/HCl (pH 9.0) and the remaining free photobiotin was removed by extracting twice with 200 µl water-saturated 1-butanol. The probe DNA was then fragmented with 30 ultrasonic pulses at 70 % output (model W-385 sonicator; Heat Systems-Ultrasonics), denatured at 100 °C for 10 min and immediately cooled on ice. A pre-hybridization step was performed by adding 200 µl pre-hybridization solution (2× SSC, 5× Denhardt's solution, 50 % formamide, 100 µg denatured salmon sperm DNA ml⁻¹) per well, sealing the microplate with vinyl tape and incubating it for 30 min at the appropriate hybridization temperature (Table 1) in the hybridization oven.
For the actual hybridization, the pre-hybridization solution was removed and 100 µl hybridization solution (pre-hybridization solution plus 2.5 % dextran sulfate and 1 µg probe DNA ml⁻¹) was added per well. The microplate was sealed again with vinyl tape and incubated for 3 h at the appropriate hybridization temperature. The hybridization temperatures chosen (given in Table 1) were about 5 °C higher (stringent conditions) than the optimal renaturation temperature calculated as [0.51 × (G+C mol%) + 47] – 36 °C, where 36 °C is the correction for the presence of 50 % formamide (De Ley, 1970; McConaughy et al., 1969). The microplate was then washed three times with 300 µl 1× SSC per well. For the enzymic development, 100 µl streptavidin–β-D-galactosidase (Gibco BRL) solution was added per well (0.5 U ml⁻¹ in PBS plus 0.5 % BSA) and the microplate was covered with a preheated empty microplate and incubated for 10 min at 37 °C. Subsequently, the plate was washed three times with 300 µl 1× SSC per well, using the microplate washer. Finally, the substrate for β-D-galactosidase, 4-methylumbelliferyl β-D-galactopyranoside (Sigma), was added (100 µl per well, 0.1 mg ml⁻¹ in PBS plus 1 mM MgCl2) and the plate was incubated at 37 °C. The reaction product, 4-methylumbelliferone (excitation max., 360 nm; emission max., 465 nm) was quantified using a SpectraMax M2 microplate reader (Molecular Devices) at 0, 15, 30 and 45 min and data were immediately transferred to a personal computer. DDH values were calculated using the fluorescence measurements at 30 min; a homologous reaction was regarded as representing 100 % reassociation.
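
As a rough illustration of the two calculations described above (the hybridization temperature rule and the normalisation of a fluorescence reading against the homologous reaction), a minimal Python sketch follows. The function names are hypothetical, the 'about 5 °C' stringency offset is taken as exactly 5 °C, and no background subtraction is applied because none is described in the text.

```python
def optimal_renaturation_temp(gc_mol_percent: float,
                              formamide_correction: float = 36.0) -> float:
    """Optimal renaturation temperature in deg C.

    Implements [0.51 x (G+C mol%) + 47] - 36, where 36 deg C is the
    correction for 50 % formamide given in the text.
    """
    return 0.51 * gc_mol_percent + 47.0 - formamide_correction


def hybridization_temp(gc_mol_percent: float,
                       stringency_offset: float = 5.0) -> float:
    """Stringent hybridization temperature in deg C.

    Assumption: the 'about 5 deg C' offset is treated as exactly 5.
    """
    return optimal_renaturation_temp(gc_mol_percent) + stringency_offset


def ddh_percent(heterologous_signal: float, homologous_signal: float) -> float:
    """DDH value as a percentage of the homologous reaction (defined as 100 %).

    Assumption: both signals are the 30 min fluorescence readings,
    used as-is with no background correction.
    """
    if homologous_signal <= 0:
        raise ValueError("homologous signal must be positive")
    return 100.0 * heterologous_signal / homologous_signal
```

The temperatures actually used are those listed in Table 1; the sketch only restates the rule by which they were chosen.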

Unless otherwise stated, the DDH values reported here are the means of at least two independent experiments (i.e. DNA immobilization and actual hybridizations performed in different batches on different days). In each of these experiments, all hybridization reactions were done in quadruplicate and calculations were based on the mean fluorescence values (clearly aberrant fluorescence values were omitted). All reciprocal hybridizations (different hybridizations using the same DNAs, A and B, but once with A as the immobilized DNA and once with B as the immobilized DNA) were carried out.

Sequence-based comparisons.
All pairwise, whole-genome sequence comparisons were performed as follows. The genomic sequence from one of the genomes in a pair (‘the query’) was cut into consecutive 1020 nt fragments. The 1020 nt cut-off was used to correspond with the fragmentation of the genomic DNA to approximately 1 kb fragments during the DDH experiments. The use of different cut-offs (e.g. smaller fragments) did not notably modify our results (data not shown). The 1020 nt fragments were then used to search against the whole genomic sequence of the other genome in the pair (‘the reference’) by using the BLASTN algorithm (Altschul et al., 1997); the best BLASTN match was saved for further analysis. The BLASTN algorithm was run using the following settings: X=150 (where X is the drop-off value for gapped alignment), q=–1 (where q is the penalty for nucleotide mismatch) and F=F (where F is the filter for repeated sequences); the rest of the parameters were used at the default settings. These settings give better sensitivity than the default settings when more distantly related genomes are being compared, as the latter target sequences that are more similar to each other.
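
The fragmentation step can be sketched as below. This is an illustration in Python, not the Perl scripts used in the study. The function name is hypothetical, and dropping the trailing fragment shorter than 1020 nt is an assumption, since the handling of the remainder is not stated. Genomes with several replicons or contigs would need the function applied to each sequence separately.

```python
from __future__ import annotations


def fragment_genome(sequence: str,
                    fragment_length: int = 1020,
                    keep_partial: bool = False) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping fragments.

    Assumption: a final fragment shorter than `fragment_length` is
    discarded unless `keep_partial` is True.
    """
    if fragment_length <= 0:
        raise ValueError("fragment_length must be positive")
    fragments = [sequence[start:start + fragment_length]
                 for start in range(0, len(sequence), fragment_length)]
    # Drop the short remainder, if any, unless the caller wants it.
    if fragments and not keep_partial and len(fragments[-1]) < fragment_length:
        fragments.pop()
    return fragments
```

Each fragment would then be searched against the reference genome with BLASTN, keeping only the best match per fragment.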

To calculate the percentage of conserved DNA between a query and a reference, only those BLASTN matches reaching values above a cut-off point of 90 % nucleotide sequence identity were considered, regardless of the extent of the alignable region. The lengths of the alignable regions for all such matches were summed and the sum was divided by the total length of the genomic DNA of the query genome to provide a genome size-independent measurement of the percentage of the query's DNA that was conserved in the reference genome.
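
Under the definition above, the percentage of conserved DNA could be computed from the best matches roughly as follows. The `BestHit` record and function name are hypothetical, and treating the 90 % cut-off as a strict 'greater than' is an assumption based on the wording 'above a cut-off point'.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class BestHit(NamedTuple):
    """Best BLASTN match of one query fragment (hypothetical record)."""
    identity: float     # percent identity over the aligned region
    align_length: int   # length of the alignable region in nt


def percent_conserved_dna(best_hits: Iterable[BestHit],
                          query_genome_length: int,
                          identity_cutoff: float = 90.0) -> float:
    """Summed alignable length of matches above the identity cut-off,
    as a percentage of the total query genome length."""
    if query_genome_length <= 0:
        raise ValueError("query_genome_length must be positive")
    conserved_nt = sum(hit.align_length for hit in best_hits
                       if hit.identity > identity_cutoff)
    return 100.0 * conserved_nt / query_genome_length
```

Because the denominator is the query genome length, swapping query and reference generally gives a different value, which is what the reciprocal comparisons rely on.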

The ANI between the query genome and the reference genome was calculated as the mean identity of all BLASTN matches that showed more than 30 % overall sequence identity (recalculated to an identity along the entire sequence) over an alignable region of at least 70 % of their length. This cut-off is above the ‘twilight zone’ of similarity searches in which an inference of homology is error prone because of low levels of similarity between aligned sequences (Rost, 1999; Sander & Schneider, 1991). Therefore we can assume that only homologous DNA fragments were considered in our calculations.
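
One possible reading of this ANI definition is sketched below. The record and function names are hypothetical. Two interpretive assumptions are made: the 'overall' identity is taken as the BLASTN identity scaled by the aligned fraction of the fragment, and the value averaged is the BLASTN identity of each retained match.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class FragmentHit(NamedTuple):
    """Best BLASTN match of one query fragment (hypothetical record)."""
    identity: float        # percent identity over the aligned region
    align_length: int      # length of the alignable region in nt
    fragment_length: int   # length of the query fragment in nt


def average_nucleotide_identity(best_hits: Iterable[FragmentHit],
                                min_overall_identity: float = 30.0,
                                min_aligned_fraction: float = 0.70) -> float:
    """Mean identity of the matches that pass both filters."""
    kept = []
    for hit in best_hits:
        aligned_fraction = hit.align_length / hit.fragment_length
        # Identity recalculated along the entire fragment (assumed reading).
        overall_identity = hit.identity * aligned_fraction
        if (overall_identity > min_overall_identity
                and aligned_fraction >= min_aligned_fraction):
            kept.append(hit.identity)
    if not kept:
        raise ValueError("no matches passed the filters")
    return sum(kept) / len(kept)
```

Fragments with no qualifying match do not enter the mean, so ANI describes the shared portion of the two genomes only.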

Reverse searching, i.e. in which the reference genome is used as the query, was also performed to provide reciprocal values. Perl scripts were used to extract 1020 nt fragments from whole-genome sequence files, to format databases for BLAST searches and to parse BLAST outputs automatically. These scripts are available upon request.


    RESULTS
 
The DDH values measured, the ANI data and the percentages of conserved DNA among strains within the six bacterial groups are presented in Table 2. For the DDH values, the mean values of replicate reactions as well as the standard deviations are given. We evaluated the correlation between DDH values, ANI data and the percentages of conserved DNA using different regression models (Figs 1 and 2). Regression analysis between DDH values and ANI data performed using linear, exponential, power and logarithmic models gave comparably high r² correlation values (0.94, 0.94, 0.95 and 0.94, respectively). The recommended cut-off point of 70 % DDH for species delineation thereby corresponded to an ANI of 95±0.5 %, depending on the specific regression model used (Fig. 1a). When the analysis was restricted to the protein-coding portion of the genome, the 70 % DDH recommendation corresponded to a mean value of 85 % gene conservation for a pair of strains. In an effort to obtain a more conservative estimation of functional similarity, a reciprocal best-match approach was used to determine the orthologous fraction of the conserved genes. In this case, 70 % DDH corresponded to 79 % conserved genes. For the percentage of conserved DNA, we initially evaluated different sequence identity cut-off points to test which of them corresponded best to the DDH values. The 60, 70, 80, 90 and 95 % sequence identity cut-off points evaluated resulted in r² values of 0.85, 0.87, 0.87, 0.96 and 0.78, respectively. As the best correlation was obtained with a 90 % cut-off point, only those values (for percentage of conserved DNA) calculated with this cut-off are given in Table 2 and used in Fig. 1(b). In comparison with the linear model (r²=0.95), significantly lower r² values were calculated with the exponential, power and logarithmic models (0.87, 0.82 and 0.71, respectively). With the linear model, a DDH value of 70 % corresponded to 69 % conserved DNA (Fig. 1b).
The genomic sequence-derived parameters ANI and the percentage of conserved DNA correlated well at cut-off points above about 80 % ANI (r²=0.96 for the linear model; Fig. 2). Given the 90 % cut-off point used for the calculation of the percentage of conserved DNA, no correlation should be expected at ANI values lower than 80 %. As ANI and the percentage of conserved DNA could not be considered as independent parameters, no multiple regression analysis with DDH could be performed.
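
The way a DDH cut-off is translated into a sequence-derived value can be illustrated with an ordinary least-squares line that is then inverted at the cut-off. The sketch below is generic: the function names are hypothetical, the software used for the regressions in this study is not stated, and the numbers in the usage lines are synthetic points chosen to be exactly linear, not values from Table 2.

```python
from __future__ import annotations

from typing import Sequence


def linear_fit(xs: Sequence[float],
               ys: Sequence[float]) -> tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def x_at(y_target: float, slope: float, intercept: float) -> float:
    """Invert the fitted line: the x value at which y equals y_target."""
    return (y_target - intercept) / slope


if __name__ == "__main__":
    # Synthetic, exactly linear points (y = 2x + 1); NOT data from this study.
    slope, intercept, r2 = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope, intercept, r2, x_at(7.0, slope, intercept))
```

With DDH on the y-axis and ANI or the percentage of conserved DNA on the x-axis, `x_at(70.0, ...)` would give the value corresponding to the 70 % DDH recommendation under a linear model.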



 
Table 2. DDH values, ANI and percentage of conserved DNA among the strains studied

Values for DDH are means±SD for independent repetitions of the same hybridization reaction; values without a standard deviation were determined only once. The percentage of conserved DNA was determined at a cut-off point of 90 % nucleotide identity (as calculated by BLASTN).

 


 
Fig. 1. Relationship between DDH values and genomic sequence identity and conservation. Each filled circle represents the value for DDH between two strains (y-axis), plotted against the ANI of the conserved genes between the strains (a) and the percentage of conserved DNA between the strains (b). The standard deviations for the DDH values, omitted from (a) for simplicity, are shown in (b). A linear trend line is shown, but other regression models were evaluated as well (see text). The horizontal broken lines denote the 70 % DDH recommendation for species delineation, while the vertical broken lines denote the corresponding ANI (a) and percentage of conserved DNA (b) values for linear regression.

 


 
Fig. 2. Relationship between genomic sequence identity and conservation. Each filled circle represents the percentage of conserved DNA shared between two strains (determined at 90 % nucleotide identity), plotted against the ANIs of their common genes.

 
Every DDH reaction was run in the reverse direction as well, i.e. in which the immobilized genome of the forward reaction is used as the probe genome in the reverse reaction. We evaluated whether DNA content differences account for differences between values for the reciprocal reactions. For instance, if the amount of DNA conserved in two genomes A and B (where A has a larger genome size than B), is x Mb, then this conserved DNA represents a larger percentage of the genomic DNA of strain B compared with strain A. Therefore, the reassociation reaction with B as the immobilized genome should have a higher value than the reverse reassociation reaction (i.e. proportional to the genome size difference between the two strains). We found a weakly positive (r²=0.19) but significant (P<0.001) linear correlation between the difference in the reciprocal DDH values and the difference in the percentage of conserved DNA between two genomes (Fig. 3). The weakness of the correlation is probably attributable to experimental error, which, although reasonably small (the mean standard deviation was 2.7 %), is relatively large with respect to the typically small differences between the reciprocal DDH values (the mean difference was 4.3 %). The ANI between a pair of strains typically shows <0.1 % difference between the reciprocal searches, i.e. when the query genome of the forward search was used as the reference in the reverse (reciprocal) search. Therefore, ANI should have only a minimal effect on differences between reciprocal DDH values, and therefore we did not investigate this correlation.
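
The genome-size argument above can be made concrete with a small calculation. The function names are hypothetical and the sketch only restates the expectation that the same x Mb of conserved DNA is a larger share of the smaller genome; it does not reproduce the sign convention of Fig. 3.

```python
def conserved_share_percent(conserved_mb: float, genome_size_mb: float) -> float:
    """Conserved DNA as a percentage of one genome."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * conserved_mb / genome_size_mb


def expected_reciprocal_difference(conserved_mb: float,
                                   size_a_mb: float,
                                   size_b_mb: float) -> float:
    """Share of genome B that is conserved minus share of genome A.

    Positive when B is the smaller genome, i.e. when the reaction with B
    immobilized is expected to give the higher reassociation value.
    """
    return (conserved_share_percent(conserved_mb, size_b_mb)
            - conserved_share_percent(conserved_mb, size_a_mb))
```

For a hypothetical pair sharing 4 Mb, with A at 5 Mb and B at 4 Mb, the conserved DNA is 80 % of A but 100 % of B. Real differences are usually far smaller, which is why a mean experimental error of 2.7 % can mask them.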



 
Fig. 3. Reciprocal DDH values versus DNA-content differences. Each filled circle represents the difference between the reciprocal DDH reactions (y-axis) for a pair of strains (e.g. strain A versus strain B, with A being the query and immobilized strain in the sequence comparisons and DDH experiments, respectively), plotted against the difference in the percentage of conserved DNA between the two strains. A negative value on the y-axis indicates that the reciprocal reaction, i.e. using B as the query and immobilized strain, gave a higher DDH value than the forward reaction. A negative value on the x-axis indicates that the conserved DNA between the two strains represents a larger fraction of strain B (or, alternatively, that strain B has a smaller genome size than strain A).

 
Finally, we looked for correlations between the mean G+C content of the genome, the percentage of genomic fragments deviating from the mean genomic G+C content, genomic duplication and the extent of variation between replicates of the same DDH hybridization (i.e. the experimental error). Genomic duplication was defined here as the fraction of the 1020 nt genomic fragments that have another match within their genome with more than 80 % identity. We found no significant correlations between any of these parameters and the experimental error. Therefore, the experimental error associated with DDH values is presumably attributable exclusively to technical issues concerning DDH experiments.


    DISCUSSION
 
We determined 124 mean DDH values between 28 genome sequenced strains belonging to six phylogenetically distinct groups. Strains within a group were closely related, i.e. showing >70 % ANI and >94 % 16S rRNA gene sequence identity. The six bacterial groups were Bacillus cereus, Burkholderia species, E. coli/Shigella species, Pseudomonas species, Shewanella species and Streptococcus agalactiae (Table 2). These groups represent opportunistic pathogens as well as environmental species, they include both Gram-positive and Gram-negative bacteria and they comprise a wide range of genome sizes (Table 1), from 2 Mb (Streptococcus agalactiae) to mean genome sizes of 4–6 Mb (Escherichia, Pseudomonas and Shewanella) and up to 8 Mb (Burkholderia), which allowed for robust interpretations. The microplate hybridization method of Ezaki et al. (1989) is a well-established and frequently used method in bacterial taxonomy (Rosselló-Mora, 2006). A mean standard deviation of 2.7 % was calculated between replicate DDH experiments. This deviation is comparable to or smaller than those reported in previous studies (Christensen et al., 2000; Goris et al., 1998; Huß et al., 1983; Johnson, 1991), indicating that the experimental error was consistently small.

Several reviews (Rosselló-Mora, 2006; Rosselló-Mora & Amann, 2001; Stackebrandt & Goebel, 1994; Stackebrandt & Liesack, 1993) mention that DNA fragments must share at least 80 % identity in order to hybridize during DDH experiments. However, this statement is based on early studies of the hybridization kinetics of unmodified and alkali-deaminated DNA (Ullmann & McCarthy, 1973) or synthetic polyribonucleotides (Bautz & Bautz, 1964). We found that a cut-off point of 90 % nucleotide identity gave a significantly better correlation between the percentages of conserved DNA and the DDH values than the 80 % cut-off point (r²=0.96 versus 0.87, respectively). Our analysis does not preclude the possibility that some genomic fragments of 80 % (or less) identity cross-hybridize, but it shows that fragments of greater identity are more important during the genome-scale hybridizations; consequently, a 90 % cut-off point was used in the remaining analysis to determine the percentage of conserved DNA between two strains.

Our results revealed a close relationship between DDH values and ANI (Fig. 1a) and between DDH values and the percentage of conserved DNA (Fig. 1b) for each pair of strains. Because of the very small differences between different models (linear, exponential, power and logarithmic) in terms of their ability to describe the relationship between DDH and ANI, no assumptions can be made about the mechanisms underlying this relationship based on these comparisons. The relationship between DDH values and percentage of conserved DNA was best described by a linear model. The model's divergence from the ideal situation (y=x) at lower values for the percentage of conserved DNA could be explained by a small contribution from DNA fragments with less than 90 % identity but that still hybridize (Fig. 1b). The finding that ANI and the percentage of conserved DNA are strongly correlated is consistent with previous results (Konstantinidis & Tiedje, 2005). Therefore, only one of these two genome-derived parameters is necessary for a fairly accurate prediction of the expected DDH values between two strains. According to our dataset, the classical cut-off point of 70 % DDH similarity for species delineation corresponds to 95 % ANI and 69 % conserved DNA. With the analysis restricted to the protein-coding portion of the genome, 70 % DDH corresponds to 85 or 79 % conserved genes between a pair of strains when, respectively, a one-way or a reciprocal best-match approach was used to determine the orthologous fraction of the conserved genes. These results reveal that the 70 % DDH recommendation encompasses relatively homogeneous strains at the genomic level, which is consistent with previous studies on the phenotypic similarity of the strains (Stackebrandt & Goebel, 1994; Wayne et al., 1987). Nonetheless, a difference of up to 21 % in gene content between strains showing ≥70 % DDH represents a large genetic and (presumably) phenotypic difference, e.g.
up to 1000 genes may differ between two strains with a 5 Mb genome (approximately the mean genome size). Such a large difference in gene content would probably be responsible for a suite of important phenotypes, which could justify the description of such strains as separate species or, at least, ecotypes. It is possible, however, that these phenotypes would only be important under natural conditions, i.e. they might not be apparent under laboratory conditions because of technological limitations. If the results based on the six bacterial groups considered here are more universally applicable in the prokaryotic world, then our results suggest that the 70 % DDH criterion can only serve as a first (coarse) level of screening for species. Higher resolution should then be adopted, as necessary, for particular groups of organism.
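
The figure of up to 1000 genes can be checked with simple arithmetic. The sketch below assumes a gene density of roughly one gene per kb (about 1000 genes per Mb), which is a common rule of thumb for bacterial genomes and is not a value stated in this article; the function name is hypothetical.

```python
def genes_differing(genome_size_mb: float,
                    conserved_gene_percent: float,
                    genes_per_mb: float = 1000.0) -> float:
    """Approximate number of genes not conserved between two strains.

    Assumption: gene density of ~1000 genes per Mb (about one gene per kb).
    """
    total_genes = genome_size_mb * genes_per_mb
    return total_genes * (100.0 - conserved_gene_percent) / 100.0
```

With a 5 Mb genome and 79 % conserved genes (the reciprocal best-match value at 70 % DDH), this gives about 5000 × 0.21 = 1050 genes, in line with the order of magnitude quoted in the text.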

Theoretically, differences in DNA content (or genome size) could be expected to lead to differences in reciprocal DDH values. However, the low level of correlation found between differences in reciprocal DDH values and the difference in the percentage of conserved DNA between two genomes (Fig. 3) indicates that DDH is too coarse a method (i.e. the experimental error is too high) to reveal subtle differences in genome size between strains. Experimental error could not be explained by (deviations of) the mean genomic G+C content or genomic duplications and is therefore probably solely attributable to technical errors such as DNA impurities and fragmentation affecting the efficiencies of the immobilization, hybridization and enzymic reactions.

Our results (Fig. 1), together with data from other studies (Rademaker et al., 2000; Vauterin et al., 1995), suggest that DDH values are continuous, i.e. theoretically, every value between 0 and 100 % could be obtained in DDH experiments. These data are supportive of a continuous gradient of genetic relatedness rather than discrete species boundaries. Although relatively few of the 28 strains studied appeared to be moderately related (i.e. showing 80–90 % ANI or 30–60 % DDH) (Fig. 1), this result is probably attributable to a bias in the collection of sequenced strains rather than to species boundaries. We recently reported on the occurrence of species-specific diagnostic genetic signatures among sequenced representatives of E. coli/Shigella and Salmonella (Konstantinidis & Tiedje, 2005). The recent description of Escherichia albertii (Huys et al., 2003), a novel species that probably spans the genetic gap between E. coli and Salmonella (Hyma et al., 2005), as well as the analysis of environmental E. coli isolates (Byappanahalli et al., 2006; Ishii et al., 2006), indicates, however, that a genetic continuum may indeed be present for this group of bacteria as well. Besides, because of the pronounced decrease in the percentage of conserved DNA shown with increasing evolutionary distance (Fig. 2), discontinuities in the DDH values should be expected every time distantly related groups are compared (e.g. <80–85 % ANI) such as E. coli versus Salmonella (~80 % ANI). Shorter evolutionary scales, e.g. corresponding to 85–100 % ANI, are the most important – and at the same time the most underinvestigated – areas for investigation with respect to species boundaries. The current dataset is simply too small for either validation or rejection of the existence of ‘condensed nodes in a cloudy and confluent taxonomic space’ (Vandamme et al., 1996). As has been stated previously by other authors (e.g.
Rosselló-Mora, 2003), species delineation through the rigid application of any standard for DNA–DNA relatedness (such as the 70 % cut-off point) is purely arbitrary. Our data further validate this statement, as the 70 % cut-off point does not necessarily correlate with clear genomic clusters within the set of strains investigated.

A note should be made here regarding DDH values and species designation. Despite the fact that they are classified within separate genera, it is known that in the context of population genetics the four Shigella species belong to the diverse species E. coli (Lan & Reeves, 2002). This is clearly reflected in our hybridization results (Table 2): reassociation values between 61.3 and 83.3 % were found between Shigella sonnei 53G or Shigella flexneri 2a 2457T and the six E. coli strains. These values are comparable with those found among the E. coli strains (71.4–100 %, highest level of similarity being between the two O157 serotypes).

Remarkably low DDH values were found between Pseudomonas strains that are reported to belong to the same species: P. fluorescens strains Pf-5, SBW25 and PfO-1 yielded DDH values between 25 and 32 %, whereas the two P. syringae strains, B728a and DC3000, yielded DDH values of 38–39 % (Table 2). These low reassociation values demonstrate that these strains cannot belong to the same species. Consistent with this, the ANI values among these genomes are much lower than the 95 % ANI value corresponding to the 70 % DDH recommendation.

While the first example might be common knowledge among microbiologists, the second illustrates that caution should be exercised when drawing conclusions in genome-comparison studies based on the reported species name for some sequenced strains.
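The two examples above can be restated as a simple screening step. The following Python fragment is a minimal sketch and not part of the original analysis: it compares the DDH ranges quoted in the text (Table 2) with the recommended 70 % cut-off and reports whether the assigned species names agree with it. The names DdhRange and screen, and the labels 'consistent', 'conflict' and 'borderline', are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Recommended species cut-off quoted in the text: 70 % DDH (about 95 % ANI).
DDH_CUTOFF = 70.0


@dataclass(frozen=True)
class DdhRange:
    """Range of DDH values reported for one group of strain pairs."""
    group: str
    same_species_name: bool  # do the compared strains carry the same species name?
    ddh_min: float
    ddh_max: float


def screen(r: DdhRange, cutoff: float = DDH_CUTOFF) -> str:
    """Return 'consistent', 'conflict' or 'borderline' (hypothetical labels)."""
    if r.ddh_min < cutoff <= r.ddh_max:
        return "borderline"  # the reported range straddles the cut-off
    above = r.ddh_min >= cutoff
    # Names agree with the cut-off when "same name" coincides with "above cut-off".
    return "consistent" if above == r.same_species_name else "conflict"


# DDH ranges as quoted in the Discussion (values in %).
REPORTED = [
    DdhRange("Shigella vs E. coli", False, 61.3, 83.3),
    DdhRange("E. coli vs E. coli", True, 71.4, 100.0),
    DdhRange("P. fluorescens Pf-5/SBW25/PfO-1", True, 25.0, 32.0),
    DdhRange("P. syringae B728a/DC3000", True, 38.0, 39.0),
]

results = {r.group: screen(r) for r in REPORTED}

for group, verdict in results.items():
    print(f"{group}: {verdict}")
```

Because rigid application of any DDH cut-off is arbitrary, as noted earlier in this Discussion, a 'conflict' or 'borderline' result from such a screen is best read as a prompt to inspect the strains more closely, not as a taxonomic decision.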

In conclusion, we have shown that DDH values correlate well with the genome sequence-derived parameters ANI and the percentage of conserved DNA. A value of 70 % DDH thereby corresponds to about 95 % ANI and 69 % conserved DNA. Previously published DDH values could be used to give a rough approximation of the ANI values and gene-content differences between the strains evaluated, using the equations described here (Fig. 1). For more accurate measurements, however, alternative methods are needed. At present, only a relatively small fraction of all available strains can be fully sequenced, but multilocus sequencing analysis using appropriate genetic markers might have potential application in this area. Further investigation is required, however, to determine whether multilocus sequencing analysis correlates as well as DDH with ANI. Despite its drawbacks, DDH remains valuable in bacterial taxonomy. It is the accepted standard, and, to date, no other universally applicable and cost-effective technique offers genome-wide comparison. However, the steadily decreasing cost of DNA sequencing means that DDH is likely to be replaced by sequence-based techniques in the not-too-distant future.


    ACKNOWLEDGEMENTS
 
J. G. acknowledges the support he received from the Belgian American Educational Foundation, in the form of a BAEF post-doctoral fellowship. The authors thank the researchers (named in the text) who generously provided strains.


    REFERENCES
 
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J. H., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25, 3389–3402.

Bautz, E. K. F. & Bautz, F. A. (1964). The influence of noncomplementary bases on the stability of ordered polynucleotides. Proc Natl Acad Sci U S A 52, 1476–1481.

Brenner, D. J., Fanning, G. R., Rake, A. V. & Johnson, K. E. (1969). Batch procedure for thermal elution of DNA from hydroxyapatite. Anal Biochem 28, 447–459.

Byappanahalli, M. N., Whitman, R. L., Shively, D. A., Sadowsky, M. J. & Ishii, S. (2006). Population structure, persistence, and seasonality of autochthonous Escherichia coli in temperate, coastal forest soil from a Great Lakes watershed. Environ Microbiol 8, 504–513.

Cho, J. C. & Tiedje, J. M. (2001). Bacterial species determination from DNA-DNA hybridization by using genome fragments and DNA microarrays. Appl Environ Microbiol 67, 3677–3682.

Christensen, H., Angen, Ø., Mutters, R., Olsen, J. E. & Bisgaard, M. (2000). DNA–DNA hybridization determined in micro-wells using covalent attachment of DNA. Int J Syst Evol Microbiol 50, 1095–1102.

Coenye, T., Gevers, D., Van de Peer, Y., Vandamme, P. & Swings, J. (2005). Towards a prokaryotic genomic taxonomy. FEMS Microbiol Rev 29, 147–167.

Crosa, J. H., Brenner, D. J. & Falkow, S. (1973). Use of a single-strand specific nuclease for analysis of bacterial and plasmid deoxyribonucleic acid homo- and heteroduplexes. J Bacteriol 115, 904–911.

De Clerck, E., Rodriguez-Diaz, M., Vanhoutte, T., Heyrman, J., Logan, N. A. & De Vos, P. (2004). Anoxybacillus contaminans sp. nov. and Bacillus gelatini sp. nov., isolated from contaminated gelatin batches. Int J Syst Evol Microbiol 54, 941–946.

De Ley, J. (1970). Re-examination of the association between melting point, buoyant density, and chemical base composition of deoxyribonucleic acid. J Bacteriol 101, 738–754.

De Ley, J., Cattoir, H. & Reynaerts, A. (1970). The quantitative measurement of DNA hybridization from renaturation rates. Eur J Biochem 12, 133–142.

Ezaki, T., Hashimoto, Y. & Yabuuchi, E. (1989). Fluorometric deoxyribonucleic acid-deoxyribonucleic acid hybridization in microdilution wells as an alternative to membrane-filter hybridization in which radioisotopes are used to determine genetic relatedness among bacterial strains. Int J Syst Bacteriol 39, 224–229.

Gevers, D., Cohan, F. M., Lawrence, J. G., Spratt, B. G., Coenye, T., Feil, E. J., Stackebrandt, E., Van de Peer, Y., Vandamme, P. & other authors (2005). Re-evaluating prokaryotic species. Nat Rev Microbiol 3, 733–739.

Goris, J., Suzuki, K., De Vos, P., Nakase, T. & Kersters, K. (1998). Evaluation of a microplate DNA-DNA hybridization method compared with the initial renaturation method. Can J Microbiol 44, 1148–1153.

Grimont, P. A. D., Popoff, M. Y., Grimont, F., Coynault, C. & Lemelin, M. (1980). Reproducibility and correlation study of three deoxyribonucleic acid hybridization procedures. Curr Microbiol 4, 325–330.

Huß, V. A. R., Festl, H. & Schleifer, K. H. (1983). Studies on the spectrophotometric determination of DNA hybridization from renaturation rates. Syst Appl Microbiol 4, 184–192.

Huys, G., Cnockaert, M., Janda, J. M. & Swings, J. (2003). Escherichia albertii sp. nov., a diarrhoeagenic species isolated from stool specimens of Bangladeshi children. Int J Syst Evol Microbiol 53, 807–810.

Hyma, K. E., Lacher, D. W., Nelson, A. M., Bumbaugh, A. C., Janda, J. M., Strockbine, N. A., Young, V. B. & Whittam, T. S. (2005). Evolutionary genetics of a new pathogenic Escherichia species: Escherichia albertii and related Shigella boydii strains. J Bacteriol 187, 619–628.

Ishii, S., Ksoll, W. B., Hicks, R. E. & Sadowsky, M. J. (2006). Presence and growth of naturalized Escherichia coli in temperate soils from Lake Superior watersheds. Appl Environ Microbiol 72, 612–621.

Johnson, J. L. (1991). DNA reassociation experiments. In Nucleic Acid Techniques in Bacterial Systematics, pp. 21–44. Edited by E. Stackebrandt & M. Goodfellow. Chichester: Wiley.

Konstantinidis, K. T. & Tiedje, J. M. (2005). Genomic insights that advance the species definition for prokaryotes. Proc Natl Acad Sci U S A 102, 2567–2572.

Lan, R. & Reeves, P. R. (2002). Escherichia coli in disguise: molecular origins of Shigella. Microbes Infect 4, 1125–1132.

Marmur, J. (1961). A procedure for the isolation of deoxyribonucleic acid from micro-organisms. J Mol Biol 3, 208–218.

McConaughy, B. L., Laird, C. D. & McCarthy, B. J. (1969). Nucleic acid reassociation in formamide. Biochemistry 8, 3289–3295.

Rademaker, J. L., Hoste, B., Louws, F. J., Kersters, K., Swings, J., Vauterin, L., Vauterin, P. & de Bruijn, F. J. (2000). Comparison of AFLP and rep-PCR genomic fingerprinting with DNA–DNA homology studies: Xanthomonas as a model system. Int J Syst Evol Microbiol 50, 665–677.

Rosselló-Mora, R. (2003). Opinion: the species problem, can we achieve a universal concept? Syst Appl Microbiol 26, 323–326.

Rosselló-Mora, R. (2006). DNA-DNA reassociation methods applied to microbial taxonomy and their critical evaluation. In Molecular Identification, Systematics and Population Structure of Prokaryotes, pp. 23–50. Edited by E. Stackebrandt. Berlin: Springer.

Rosselló-Mora, R. & Amann, R. (2001). The species concept for prokaryotes. FEMS Microbiol Rev 25, 39–67.

Rost, B. (1999). Twilight zone of protein sequence alignments. Protein Eng 12, 85–94.

Sander, C. & Schneider, R. (1991). Database of homology-derived protein structures and the structural meaning of sequence alignment. Proteins 9, 56–68.

Stackebrandt, E. (2003). The richness of prokaryotic diversity: there must be a species somewhere. Food Technol Biotechnol 41, 17–22.

Stackebrandt, E. & Goebel, B. M. (1994). A place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. Int J Syst Bacteriol 44, 846–849.

Stackebrandt, E. & Liesack, W. (1993). Nucleic acids and classification. In Handbook of New Bacterial Systematics, pp. 151–194. Edited by M. Goodfellow & A. G. O'Donnell. London: Academic Press.

Stackebrandt, E., Frederiksen, W., Garrity, G. M., Grimont, P. A. D., Kämpfer, P., Maiden, M. C. J., Nesme, X., Rosselló-Mora, R., Swings, J. & other authors (2002). Report of the ad hoc committee for the re-evaluation of the species definition in bacteriology. Int J Syst Evol Microbiol 52, 1043–1047.

Ullmann, J. S. & McCarthy, B. J. (1973). The relationship between mismatched base pairs and the thermal stability of DNA duplexes. Biochim Biophys Acta 294, 416–424.

Vandamme, P., Pot, B., Gillis, M., De Vos, P., Kersters, K. & Swings, J. (1996). Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol Rev 60, 407–438.

Vauterin, L., Hoste, B., Kersters, K. & Swings, J. (1995). Reclassification of Xanthomonas. Int J Syst Bacteriol 45, 472–489.

Wayne, L. G., Brenner, D. J., Colwell, R. R., Grimont, P. A. D., Kandler, O., Krichevsky, M. I., Moore, L. H., Moore, W. E. C., Murray, R. G. E. & other authors (1987). Report of the ad hoc committee on reconciliation of approaches to bacterial systematics. Int J Syst Bacteriol 37, 463–464.



