©1995 by The American Society for Biochemistry and Molecular Biology, Inc.
Optical Melting of 128 Octamer DNA Duplexes
EFFECTS OF BASE PAIR LOCATION AND NEAREST NEIGHBORS ON THERMAL STABILITY (*)

Mitchel J. Doktycz (§) , Max D. Morris (1), Shelly J. Dormady (¶) , Kenneth L. Beattie (2), K. Bruce Jacobson

From the (1) Health Sciences Research Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6123, the Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6367, and the (2) DNA Technology Laboratory, Houston Advanced Research Center, The Woodlands, Texas 77831

ABSTRACT

Short oligonucleotide probes are finding increased application in DNA sequencing and genome characterization techniques, but a lack of knowledge of the hybridization properties of short duplexes hinders their use. Melting data were acquired on 128 DNA duplexes based on the length proposed in sequencing by hybridization procedures and formed from the general sequences 5'-XYZTGGAC-3', 5'-GTCCAXYZ-3', 5'-GCXYZGAC-3', and 5'-GTCXYZGC-3', where X, Y, and Z are each A, T, G, or C. These molecules were designed to elucidate the effects of location and nearest-neighbor stacking on the stability of base pairing in short DNA duplexes. The type of base pairs present had a major effect on stability, but was insufficient to predict stability without the inclusion of nearest-neighbor terms. Furthermore, the addition of information on position, or distance from the end, of the nearest-neighbor doublets led to statistically better fitting of the melting data. However, the positionally dependent stabilization differences are small compared with the contributions of base pairing and stacking.


INTRODUCTION

DNA analysis at the level of entire genomes and on the scale of entire populations is being considered. Such a task requires the advancement of technologies through an understanding of the properties of nucleic acids and the optimization of procedures. Frequently, these technologies exploit the specific base pairing properties of DNA for sequence information and genome localization, with shorter probes finding increased application. Shorter oligonucleotide probes are attractive since complete probe sets are feasible to obtain and implement. For example, only 4096 sequences compose the set of all hexamer oligonucleotides, while >10^10 make up the complete set of 18-mers. The synthesis of thousands to hundreds of thousands of oligonucleotides is currently possible using commercial DNA synthesizers or the new parallel synthesis approaches (1), respectively. The use of shorter probes takes advantage of the fact that shorter sequences are a subset of longer sequences. For instance, a particular 18-mer can be obtained by combination of three hexamers. This strategy is being exploited by using either enzymatic ligation (2, 3) or continuous stacking (4, 5, 6, 7) of the shorter oligomers to produce sequencing primers that may make primer walking strategies practical.
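The probe-set sizes quoted here follow from simple counting: with four possible bases at each position, there are 4^n distinct n-mers. The short Python sketch below illustrates this; the function name is ours and is not part of the original work.

```python
def probe_set_size(length: int) -> int:
    """Number of distinct DNA oligonucleotides of a given length (4 bases per position)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


# Hexamers, octamers (the length studied here), and 18-mers.
for n in (6, 8, 18):
    print(f"{n}-mers: {probe_set_size(n)}")
```

The hexamer count reproduces the 4096 sequences cited in the text, and the 18-mer count (4^18, about 6.9 × 10^10) shows why a complete set of longer probes is impractical.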

Component sequences can be used in a reverse fashion to decipher unknown sequences, as suggested in the technique of sequencing by hybridization (8, 9, 10, 11, 12). In this application, either an unknown DNA sequence is immobilized and interrogated by hybridization of short known sequences, or a complete or partial array of these sequences is immobilized and allowed to hybridize with the target DNA. The unknown sequence is then deduced from the overlapping component sequences. Applications in DNA sequencing, mapping, and diagnostic testing are being demonstrated with promises of greater efficiency over present methods (8, 10, 12, 13, 14, 15, 16).

Although the technology to synthesize the necessary library of probes is available, the use of these libraries is still hindered by a lack of information on the hybridization properties of short oligonucleotides. General rules, which account for hybrid stability based solely on base pair type and solution environment, may not be sufficient since local sequence dependences are no longer averaged out as in longer sequences. Many data sets have been collected to interpret sequence-dependent stability, but disagreement among these sets, due to the different types of molecules studied and conditions employed, warns against extrapolating the data for use on shorter probes (17, 18, 19, 20, 21, 22, 23) . There is poor understanding of the limitations on implementation of short probes, and models for interpretation of resultant data are imperfect.

For optimal use of short probes, a characterization of the various interactions that contribute to the hybrid stability is necessary. These interactions may include Watson-Crick base pair stability, mismatched base pair stability, nearest-neighbor stacking, and dangling-end stability, all of which may further depend on solution conditions as well as the relative position of these interactions in a short duplex. The positional dependence results from the preference of melting to initiate from the ends of the duplex. This is commonly referred to as end fraying or end effects and can propagate several base pairs into the duplex. Data from nuclear magnetic resonance experiments show a perturbation of up to 3 base pairs from the end of a duplex (24) . For a long DNA duplex (>100 base pairs), this may be inconsequential to the total thermal stability, but for shorter duplexes, a significant fraction of the total number of base pairs may be perturbed. For 18- to 20-base oligonucleotides, frequently used in priming polymerase reactions, 30% of the base pairs may possess different interaction energies and perhaps even reduced sequence specificity. For octadeoxyribonucleotide duplexes, more than half the base pairs are affected.
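The percentages in this paragraph can be checked with a small calculation. The sketch below assumes, as a simplification of the NMR observation cited above, that exactly 3 base pairs at each end of a duplex are perturbed; the function name and the fixed depth are illustrative assumptions.

```python
def fraction_near_ends(n_bp: int, depth: int = 3) -> float:
    """Fraction of base pairs lying within `depth` positions of either duplex end."""
    if n_bp <= 0:
        raise ValueError("n_bp must be positive")
    affected = min(n_bp, 2 * depth)  # both ends, capped at the duplex length
    return affected / n_bp


for n_bp in (8, 18, 20, 100):
    print(f"{n_bp} bp: {fraction_near_ends(n_bp):.0%} of base pairs near an end")
```

Under this assumption an octamer has 6 of its 8 base pairs near an end, an 18- to 20-mer roughly 30%, and a 100-base pair duplex only a few percent.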

The characterization of the position and sequence-dependent stability of the Watson-Crick base pairs and the nearest-neighbor stacking interactions will be useful for the proper design, implementation, and interpretation of sequencing by hybridization chips. Furthermore, an understanding of the energetics of end base pairs will aid in understanding polymerase priming and probe specificity as well as other applications that utilize short DNA probes. To this end, we have undertaken optical melting experiments involving a set of 256 octadeoxyribonucleotides to evaluate the positional- and sequence-dependent stability of short duplexes in solution. This molecule set is subdivided into two groups referred to as the "end set" and "middle set." The end set molecules are of the general sequences 5'-XYZTGGAC-3' and 5'-GTCCAXYZ-3', where X, Y, and Z may be A, T, G, or C. The end set molecules allow the formation of 64 perfectly matching duplexes with every base pair type in every nearest-neighbor environment occurring in the end and penultimate positions. Similarly, the middle set is composed of the sequences 5'-GCXYZGAC-3' and 5'-GTCXYZGC-3' to form 64 perfectly matching duplexes to evaluate the sequence-dependent stability of internal base pairs.
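The construction of the two molecule sets can be sketched in a few lines of Python. This is an illustration of the sequence design described above, not the authors' own procedure; the helper names are hypothetical. Each duplex is represented as a pair of strands, both written 5' to 3'.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_set(prefix: str, suffix: str) -> list[tuple[str, str]]:
    """Enumerate all 64 XYZ triplets between a fixed prefix and suffix."""
    duplexes = []
    for xyz in product("ATGC", repeat=3):
        strand = prefix + "".join(xyz) + suffix
        duplexes.append((strand, reverse_complement(strand)))
    return duplexes


end_set = build_set("", "TGGAC")     # 5'-XYZTGGAC-3' paired with 5'-GTCCAXYZ-3'
middle_set = build_set("GC", "GAC")  # 5'-GCXYZGAC-3' paired with 5'-GTCXYZGC-3'

shared = {s for s, _ in end_set} & {s for s, _ in middle_set}
print(len(end_set), len(middle_set), sorted(shared))
```

Each set contains 64 duplexes (128 strands), giving the 256 octamers in total. Under this construction four sequences of the form GCNTGGAC fall in both sets, which matches the four common sequences that the article later uses to gauge reproducibility.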


MATERIALS AND METHODS

Sample Preparation

Oligonucleotides were synthesized by Genosys Biotechnologies, Inc. (The Woodlands, TX) on a 0.2 µM scale and were desalted. All oligonucleotides were assessed for failure sequences by electrophoresis on denaturing 15% polyacrylamide gels followed by staining with Stains-all. The sequences were determined to be >90% pure and were used with no further purification.

Samples were prepared for melting by combining aliquots of the appropriate single strands from concentrated stock solutions (1 mM) of the oligonucleotides with 1 ml of melting buffer (10 mM sodium phosphate, pH 7.0, 1.0 M sodium chloride) for a final concentration of 2 µM each strand. The oligonucleotide concentrations were determined from the absorbance at 260 nm assuming an extinction coefficient of 1 × 10^5 M^-1 cm^-1. Samples were then heated in a boiling water bath and allowed to cool slowly to room temperature. They were then transferred to cuvettes, covered with mineral oil, capped, and then cooled to 1 °C in the spectrophotometer.

Optical Melting Experiments

Melts were performed using a Varian Cary 1E spectrophotometer fitted with a 12-position thermoelectrically controlled sample holder and motorized sample stage, which allowed the simultaneous analysis of six sample and reference pairs. Temperature ramps were performed from 1 to 70 °C at a rate of 0.5 °C/min. Data were collected at 0.2 °C intervals while monitoring the temperature by a probe inserted into one of the cuvettes. Temperature differences among the 12 cuvettes were within 0.1 °C. Denaturation and renaturation experiments were performed consecutively and repeated for a total of four ramps/experiment. A minimum of three samples of a particular duplex was examined.

The absorbance versus temperature curves were normalized by fitting base lines to the nearly linear regions before and after the steeply sloped transition region and, at every data point, taking the ratio of the difference between the absorbance and the lower base line to the absorbance difference between the two base lines. The resulting normalized curve was smoothed once with a sliding boxcar filter, and the first derivative was taken to identify the melting temperature and peak height.
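The curve-processing steps described above can be sketched with NumPy as follows. This is a minimal illustration run on synthetic data, not the authors' software. The baseline windows, the boxcar width of five points, and all function names are assumptions, since the text does not specify them.

```python
import numpy as np


def normalize_melt(temp, absorbance, lower_window, upper_window):
    """Fit linear base lines in two temperature windows; return the normalized curve."""
    temp = np.asarray(temp, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    lo = (temp >= lower_window[0]) & (temp <= lower_window[1])
    hi = (temp >= upper_window[0]) & (temp <= upper_window[1])
    lower = np.polyval(np.polyfit(temp[lo], absorbance[lo], 1), temp)
    upper = np.polyval(np.polyfit(temp[hi], absorbance[hi], 1), temp)
    # Fraction of the way from the lower to the upper base line at each point.
    return (absorbance - lower) / (upper - lower)


def boxcar(y, width=5):
    """Moving-average smoothing; edge values are repeated to keep the length."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    padded = np.pad(np.asarray(y, dtype=float), width // 2, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")


def melting_point(temp, theta):
    """Return (Tm, peak height) at the maximum of the first derivative."""
    dtheta = np.gradient(theta, temp)
    i = int(np.argmax(dtheta))
    return float(temp[i]), float(dtheta[i])


# Synthetic two-state-like curve with sloping base lines (illustrative values only).
temp = np.arange(1.0, 70.01, 0.2)
theta_true = 1.0 / (1.0 + np.exp(-(temp - 33.0) / 3.0))
lower_line = 0.50 + 0.0005 * temp
upper_line = 0.60 + 0.0008 * temp
absorbance = lower_line + theta_true * (upper_line - lower_line)

theta = boxcar(normalize_melt(temp, absorbance, (1.0, 10.0), (60.0, 70.0)))
tm, peak = melting_point(temp, theta)
print(f"Tm = {tm:.1f} C, peak height = {peak:.3f} per degree")
```

In practice the baseline windows must be chosen per curve. As the Results note, duplexes with low melting temperatures may not leave enough pre-transition data to establish the lower base line.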

Data Analysis

Melting temperatures and peak heights resulting from the individual ramps of experiments on a particular sample were collected and used to assess the variability within experimental sessions. This within-session variability was then compared with the variability among sessions using a nested analysis of variance: T_m(i, j, k) = DNA(i) + session(i, j) + measurement(i, j, k), where T_m(i, j, k) is the observed melting temperature (or, alternatively, the measured peak height used for the calculation of ΔG) of the ith molecule during the kth measurement of the jth session. The variance was determined from the 128 duplex molecules, analyzed over 383 sessions, and included 1084 measurements. This analysis of variance yielded conservative estimates of 0.4 °C and 0.0015 for the standard error of the mean of all T_m and peak height measurements, respectively, for a given molecule.

The melting temperatures and peak heights derived from the melting curves were used to calculate free energies using the van't Hoff method as described by Marky and Breslauer (25). The standard error associated with the determination of ΔG is 0.12 kcal/mol.
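For orientation, the two-state (all-or-none) relations that underlie a van't Hoff analysis of a non-self-complementary duplex can be sketched as below. This is our reconstruction from the standard two-state model, not a transcription of the authors' calculation, which the text does not spell out. It assumes the peak height is the slope of the normalized single-strand fraction at T_m. It also assumes a total strand concentration C_T of 4 µM (2 µM of each strand), so that K = 4/C_T at T_m and ΔH = -6 R T_m^2 × (peak height) for duplex formation.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def vant_hoff_enthalpy(tm_kelvin: float, peak_height: float) -> float:
    """Two-state enthalpy of duplex formation (kcal/mol).

    peak_height is d(theta)/dT at Tm in K^-1, where theta is the single-strand
    fraction; the duplex fraction therefore has slope -peak_height.
    """
    return -6.0 * R * tm_kelvin ** 2 * peak_height


def free_energy(delta_h: float, tm_kelvin: float, total_strand_conc: float,
                temp_kelvin: float = 298.15) -> float:
    """Two-state free energy of duplex formation (kcal/mol) at temp_kelvin."""
    # At Tm half the strands are paired, so K(Tm) = 4 / C_T for two distinct strands.
    delta_s = delta_h / tm_kelvin + R * math.log(4.0 / total_strand_conc)
    return delta_h - temp_kelvin * delta_s


# Hypothetical inputs for illustration only (not values from the article's tables).
tm = 33.2 + 273.15
dh = vant_hoff_enthalpy(tm, peak_height=0.05)
dg25 = free_energy(dh, tm, total_strand_conc=4e-6)
print(f"dH = {dh:.1f} kcal/mol, dG(25 C) = {dg25:.1f} kcal/mol")
```

Because ΔH scales linearly with the peak height, any error in baseline fitting carries directly into ΔH and ΔG. This is consistent with the caution the authors express about the free energy values.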

The preliminary analysis of T_m and ΔG values, using the measurement model stated above, was based on a nested-effects analysis of variance as described (26). Calculations for this analysis were accomplished using a FORTRAN program written specifically for the data set. Subsequent analyses were performed on average values (per octamer pair) of T_m and ΔG, rather than individual measurements, and are based on general regression and analysis of variance procedures accomplished through the use of the GLM procedure of the Statistical Analysis System (release 6.07, SAS Institute, Cary, NC).


RESULTS AND DISCUSSION

Melting Data

Fig. 1A displays some representative normalized melting curves resulting from duplexes constructed from the end set molecules. These duplexes contain a common 7-base sequence and differ only in the identity of the end base pair. Each of the four curves shows a distinct melting profile and resultant melting temperature. Likewise, Fig. 1B displays melting curves of duplexes constructed from the middle set molecules that contain a single base change at position 4 from the 5'-end. The melting profiles indicate that base pair type and orientation (e.g. AT versus TA) both lead to distinct changes in thermostability. The different base pair orientations give rise to different dinucleotide stacking interactions, which may account for the different thermostabilities. For the molecules in Fig. 1B, the spread in melting temperatures is nearly 8 °C; the difference between the molecules containing AT and TA base pairs is over 3 °C.


Figure 1: Normalized melting profiles of duplexes contained in the end molecule sequence set (A) and middle sequence set (B). The solution conditions were 10 mM sodium phosphate, pH 7.0, 1.0 M sodium chloride and a strand concentration of 2 µM. The duplex sequences in A are AACTGGAC/GTCCAGTT, TACTGGAC/GTCCAGTA, GACTGGAC/GTCCAGTC, and CACTGGAC/GTCCAGTG with melting temperatures, determined from the first derivative of the curves, of 33.2, 30.4, 33.8, and 35.7 °C, respectively. The duplex sequences in B are GCAAAGAC/GTCTTTGC, GCATAGAC/GTCTATGC, GCAGAGAC/GTCTCTGC, and GCACAGAC/GTCTGTGC with melting temperatures of 33.7, 30.6, 35.7, and 38.5 °C, respectively.



The melting temperatures for most of the perfectly matched duplex molecules of the end and middle sets differ with the sequence variations. A tabulation of the unique sequence, numeric identifier, and average melting temperature of each of the perfect duplex molecules formed from the end and middle set sequences is displayed in Table I. The experimental conditions were 10 mM sodium phosphate, pH 7.0, 1.0 M sodium chloride and a strand concentration of 2 µM. The differences in melting temperatures due to sequence variation are as high as 18.9 and 23.3 °C for the end and middle sets, respectively, and as high as 8.2 °C for molecules of similar base pair content. These differences are due to changes in only 3 base pairs, which lead to differences in up to three and four nearest-neighbor stacking interactions for the end and middle set duplexes, respectively.

The melting temperature values in Table I are the result of a total of over 1000 melting curves collected over a period of several months. Samples of a particular duplex sequence were prepared and melted on different days and in random spectrophotometer cuvette positions to reduce any procedural and instrumental bias as well as changes in buffer preparation and temperature probes. Errors due to slight concentration differences, derived from the assumption of a constant extinction coefficient, as well as minor variabilities in the purity of the different oligonucleotides are also recognized. However, concentration differences of a few percent, for a particular sample, did not lead to any consistent trends in the melting temperature. A nested-effects analysis of variance procedure (26) was used to assess the error associated with measurements collected on a particular day as well as variability from day to day. The measurement to measurement variability (i.e. within a particular day or experimental session) was observed to fluctuate with time such that the calculation of a standard error for each duplex was less meaningful. This fluctuation was presumably due to instrumental problems related to temperature control or accuracy of measurement. We adopted the relatively conservative approach of using the greatest (over time) estimate of the standard deviation associated with individual measurements. Using this, the (again, conservative) estimate of standard error for mean T_m was 0.4 °C for each duplex. This value was consistent enough for each sequence analyzed that weighted regression was not used in subsequent analyses.

An example of the error associated with the data set and the necessity for careful statistical analysis can be seen upon the respective comparison of the melting temperatures of the end molecule set sequences 45, 46, 47, and 48 with the middle set molecule sequences 7, 23, 39, and 55. These sequences are common to the two sets, with the samples derived from separate syntheses and experimental data from separate experiments. The differences in experimental melting temperatures are 0.7, 1.4, 1.7, and 0.3 °C for the respective comparisons with a root mean square error (RMSE) of 0.9 °C. This exemplifies the variability in precision within the data set and serves as a still more conservative estimate of the error potentially contained in the assignment of the melting temperatures.

Also shown in Table I is the calculated change in free energy at 25 °C based on the all-or-none model as described by Marky and Breslauer (25). The values are shown for comparison purposes, but it is recognized that they are prone to the shortcomings of the two-state model as well as the calculational methods from which they are derived. The calculational method involves the fitting of base lines for curve normalization, which strongly influences the assessment of cooperativity and apparent transition energy. The transition midpoint is less affected and does reflect more directly the relative differences among the various duplex stabilities. The low melting temperature of some of the duplexes prevents establishment of the lower base line. The low melting temperatures, however, do allow ample data collection after the transition, where evidence for melting that deviates from the two-state model was apparent. The high temperature regions of the molecules with lower melting temperatures showed an increased leveling off of the absorbance at temperatures above 60 °C. This feature is probably associated with the melting out of the single strands and is contrary to a two-state model. Any stability associated with structure of the single-strand species must be considered when assigning thermodynamic parameters to the transition of the duplex to single-strand configuration.

Analysis Models

Thermostability of nucleic acid duplexes is frequently modeled by assigning stabilities to singlet or doublet interactions that account for stability due to base pairing or nearest-neighbor stacking. A singlet interaction is defined as a single base pair ( i.e. an AT or GC base pair independent of orientation), while a doublet interaction is defined as 2 adjacent base pairs, with defined base pair orientations, which give rise to the unique nearest-neighbor stacking interactions. Longer range interactions are difficult to assign due to their relatively lower energies and require a large data set for estimating the greater number of variables. The present data set, however, does allow the estimation of an increased number of variables due to the large number of sequences analyzed. The variables, which are of primary interest for this study, are the possible positional dependence of singlet and doublet interactions. Many of these interactions, however, are linearly dependent upon each other and therefore cannot be uniquely assigned (27) . What can be determined are the types of interactions that are important and the non-unique stability assignments. These assignments can be used to predict thermostabilities or arranged in linearly independent combinations for comparison purposes.

Different thermostability models of increasing complexity were assumed and applied by tallying the counts of the individual terms and evaluating the components as described by Doktycz et al. (17). The adequacy of each model was then estimated by the goodness of fit to the data set. The simplest models applied to the data in Table I involved the stability of singlet interactions. The initial model (BP1) assumes that stability is dependent solely on the number of the 8 base pairs that are either AT or GC. This model can be extended (BP2) to accommodate positional dependence of the singlet interactions by assuming that different stabilities can be assigned to an AT or GC base pair positioned in the extreme 5'- or 3'-position of a strand as compared with an internal position. The third extension of the base pair model (BP3) assumes that base pair stability varies as a function of all positions in the duplex. Here, there would be four assigned stabilities for both AT and GC base pairs in an 8-base pair duplex due to symmetry (e.g. position 1 at the 5'-end is assumed to be equivalent to position 8 at the 3'-end in an 8-base pair duplex).

These singlet interaction models are summarized in Table II, along with the number of fitting parameters (i.e. the number of degrees of freedom associated with the fitted model in the analysis of variance), the number of uniquely estimable fitting parameters, and the root mean square error resulting from fits of the various models to the combined end and middle set data. Also shown are two sets of significance values (p-values) for F-tests of goodness of fit. The p-value is the probability of obtaining the observed arrangement of data, or an arrangement even more extreme, under the assumption that the given model is correct. Hence, small values indicate lack of fit, while larger ones do not. The first p-value assesses the model fit considering a standard error of 0.4 °C for the T_m values, while the second p-value considers the variability associated with the T_m determination for the sequences common to both the end and middle sets. These p-values indicate different levels of error, which reflect the different components of variability mentioned earlier. As judged by the RMSE values, the more complex base pair models show no improvement in fit over the simpler models. The fit for all models, as judged by the p-values, is poor and implies that a model based solely on singlet interactions, even with the inclusion of positional dependence, does not accurately characterize the differences in thermostability. This does not imply that there is no positional-dependent stability that can be associated with the singlet interactions. Rather, the singlet interactions alone are not sufficient for describing the data.

Models that extend beyond the base pair model and that account for sequence dependence by considering the nearest-neighbor stacking interactions were also considered. These nearest-neighbor models include the terms described above for BP1, but also account for the 10 double-stranded base stacking interactions by assuming no positional dependence (NN1), an end versus internal positional dependence (NN2), or complete positional dependence (NN3). The general model used for characterizing these doublet interactions is similar to those described by Doktycz et al. (17) and Goldstein and Benight (27), but different from the nearest-neighbor model described by Breslauer et al. (22) in that the singlet and doublet information is separated rather than combined. The number of interactions and resultant parameters are greatly increased with the nearest-neighbor models. However, the linear dependences between the nearest-neighbor stacking interactions reduce the number of unique interactions that are assessable. The number of interactions and the number of assessable interactions, along with the resultant RMSE values and associated p-values for each of the nearest-neighbor models, are summarized in Table III.
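The term tallies that feed these models can be illustrated with a short Python sketch. It counts singlet (AT or GC) base pairs and the 10 distinct double-stranded doublets, treating a dinucleotide and its reverse complement as the same stack. With the positional option it also labels each stack by its distance from the nearer duplex end, in the spirit of NN3. The function names and the choice of canonical label are our own assumptions; the original analysis was carried out with FORTRAN and SAS.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def canonical_doublet(dinuc: str) -> str:
    """Label a stack by the alphabetically smaller of its two strand readings."""
    other_strand = COMPLEMENT[dinuc[1]] + COMPLEMENT[dinuc[0]]
    return min(dinuc, other_strand)


def count_singlets(strand: str) -> Counter:
    """Count AT and GC base pairs, independent of orientation."""
    return Counter("AT" if base in "AT" else "GC" for base in strand)


def count_doublets(strand: str, positional: bool = False) -> Counter:
    """Count nearest-neighbor stacks; optionally key them by distance from the end."""
    n_stacks = len(strand) - 1
    counts = Counter()
    for i in range(n_stacks):
        key = canonical_doublet(strand[i:i + 2])
        if positional:
            key = (key, min(i, n_stacks - 1 - i))
        counts[key] += 1
    return counts


all_doublets = {canonical_doublet(a + b) for a in "ACGT" for b in "ACGT"}
print(len(all_doublets), sorted(all_doublets))
print(count_singlets("GCAAAGAC"), count_doublets("GCAAAGAC"))
```

An octamer duplex has seven stacks and four distinct distances from the end. A fully positional doublet model therefore carries 40 doublet terms, against 10 for a position-independent model, before linear dependences are taken into account.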

The additional interactions show clear improvements in fitting the data set. The incorporation of just nearest-neighbor information reduces the RMSE nearly by half compared with the base pair models. The addition of nearest-neighbor terms is more important than the addition of base pair positional information. Furthermore, positional information on the nearest-neighbor terms is also significant in reducing the RMSE. The NN3 model fits the melting data significantly better than the NN2 model, which was found to fit only marginally better than the NN1 model as judged by F-tests, which compare the size of residual for the two models. The smaller improvement on going from the NN1 model to the NN2 model may be due to positional information on the singlet interactions that arise from the linear dependences among the fitting parameters. Each non-end singlet interaction is related to doublet information resulting from base pairs on either side, while the end base pairs are related to only one doublet interaction. This singlet and doublet information, along with the linear dependences between these terms, leads to inherent information regarding the end singlet interactions.

Despite the reduced RMSE values of the doublet models over the singlet models, there is statistically significant lack of fit in these models when residuals are compared with the standard errors constructed from variation in the data as indicated by the first set of p-values given in Table III. These p-values may be reflective of poor fits due to information that is not explicit in the models, or they may be due to errors associated with the data. When the errors apparent from the independent T_m assignments of like sequences are considered, the p-values indicate that the nearest-neighbor models should not be rejected due to lack of fit. Further credence to the nearest-neighbor models appears when those molecules with the largest residuals resulting from the fit are excluded, and the data refit show no meaningful improvement in the RMSE. Any real lack of fit that may exist is apparently not due to one or a few molecules that are somehow different from the others.

Fitting the models to the data sets separately shows better fits to the end set molecules than to the middle and combined sequence sets. The RMSE for the base pair models fit to the end set data alone is 1.4 °C compared with 2.1 °C for the middle set data and 1.9 °C for the combined set data. Likewise, the RMSE resulting from fits of the NN3 model is 0.5 °C for the end set data versus 1.0 °C for the middle set data alone and 0.8 °C for the combined set data. As noted above, measurement error (within a session) clearly changed with time during the 4 months during which experiments were conducted, but overall differences in precision, which can be attributed to the end and middle sets, are not apparent. Still, there are clear differences in the performance of the singlet and doublet interaction models in describing the melting data. The origin of these differences in the model fits is not clear, but may be attributable to the positional dependence of the nearest-neighbor stabilities and longer range interactions. The molecules of the middle set focus on internal nucleotide positions and result in altering the nearest-neighbor environment in all positions but the extreme end stacks. The end set molecules focus on the end positions and alter only the 3 end base pairs and do not affect the central stack. The internal nucleotide positions and resultant sequence-dependent stability seem to be more poorly modeled by singlet and doublet interactions, implying the importance of longer range interactions for these short sequences. However, these interactions are too small in magnitude to extract from the data set under the current level of error associated with the measurements.

Validation Set

The most definitive confirmation of the various models is to observe their performance when predicting T_m values of sequences not contained in the original molecule set. To this end, randomly generated duplex sequences were prepared and melted. The resulting "validation set" sequences along with the experimentally and theoretically derived melting temperatures and changes in free energy are compared in Table IV. These validation sequences are the summary of all randomly generated duplex sequences examined. The collection of data from other randomly generated sequences will certainly be useful. The non-unique parameters resulting from the NN1 model were used in predicting the T_m and ΔG values and are listed in Table V. The use of the positionally dependent terms resulting from the NN2 and NN3 models shows statistically improved fits. However, their practical utilization is not justified since the additional 30 terms of the NN3 model lead to only slightly reduced RMSE values over the NN1 model. Although the NN1 model does not include any positionally explicit doublet information, positionally implicit singlet information is present.

The RMSE of 1.8 °C, resulting from fitting the NN1 model to the validation set sequences, is somewhat higher than the RMSE of the model fit to the combined end and middle set data. This higher value may be due in part to the lower number of sequences from which the RMSE is obtained. However, the differences between the predicted and observed values are generally within the range of residuals resulting from model prediction of the end and middle set molecules. This can be taken as justification for use of the NN1 model as fitted from the end and middle sets in that no substantial inadequacies are revealed by the validation set. This observation is reinforced by the fact that inclusion of the validation set data for model fitting did not materially change the RMSE.

The values in Table V were used to predict the T_m and ΔG values for the validation set sequences and could be used to predict the T_m and ΔG of any 8-base pair sequence melted under the same conditions as used in the present experiments. The values are not unique due to linear dependencies noted above, and physical meaning cannot be attributed to them; however, predictions based on them are unique and would be identical to those obtained using other model parameterizations. To calculate the melting temperature or change in free energy of duplex formation for any 8-base pair sequence, the model coefficients in Table V are multiplied by the number of corresponding singlet and doublet interactions present and summed. The resultant value is only meaningful under salt conditions of 1 M NaCl and a DNA strand concentration of 2 µM, but may be useful as a guide for relative sequence stabilities under other conditions.
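The multiply-and-sum rule described above amounts to a dot product between term counts and model coefficients. The sketch below shows only the arithmetic. The coefficient values are arbitrary placeholders chosen for easy checking and are NOT the fitted NN1 coefficients, which are given in the article's coefficient table and are not reproduced here. The key naming ("bp:" for singlets, "nn:" for doublets) is likewise our own convention.

```python
def predict(counts: dict, coefficients: dict) -> float:
    """Sum of (coefficient x count) over every singlet and doublet term present."""
    return sum(coefficients[term] * n for term, n in counts.items())


# Term counts for the duplex GCAAAGAC/GTCTTTGC: 8 base pairs and 7 stacks.
counts = {
    "bp:AT": 4, "bp:GC": 4,
    "nn:GC": 1, "nn:CA": 1, "nn:AA": 2, "nn:AG": 1, "nn:GA": 1, "nn:AC": 1,
}

# PLACEHOLDER coefficients for illustration only (not values from the article).
placeholder_coefficients = {
    "bp:AT": 1.0, "bp:GC": 2.0,
    "nn:GC": 0.5, "nn:CA": 0.5, "nn:AA": 0.5, "nn:AG": 0.5, "nn:GA": 0.5, "nn:AC": 0.5,
}

print(predict(counts, placeholder_coefficients))
```

Substituting the published coefficients for the placeholders would give a prediction valid only for the stated conditions (1 M NaCl, 2 µM strand concentration). Because the coefficients are non-unique, only the summed prediction carries meaning, not the individual terms.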

Conclusions

Of the many stabilizing and destabilizing interactions that influence DNA hybrid stability, the base pairing and base stacking interactions are considered the most important in describing the sequence-dependent stability. For short oligonucleotide duplexes, allowance for both types of interactions is necessary for the accurate prediction of melting temperatures. Models based solely on base pairing do not fit the data accurately, while models that incorporate base pair doublet information show improved performance. Furthermore, consideration of end fraying, or the preference of melting to initiate from the ends of a short duplex, shows statistically improved fits. This positional dependence is a minor component compared with the effects of the singlet and doublet interactions and is not necessary to include for prediction within the current level of error.

The values listed in Table V can be used to predict the melting temperature of any 8-base pair duplex under the limitations of experimental conditions and errors associated with the present experiments and should be useful in assessing the relative stability of octadeoxyribonucleotide probes as used, for example, in sequencing by hybridization procedures. Altered salt conditions and DNA strand concentrations could be adjusted for (see Ref. 28) provided there is no sequence dependence associated with the nucleation energy or counterion stabilization. Caution should be exercised, however, when applying the values to longer DNA sequences or structures that are not part of the model set. Furthermore, restrictions resulting from the relatively lower melting temperatures of the molecules studied, such as the stability associated with the single-strand state as well as other deviations from two-state behavior, should be considered. The thorough characterization of other sequence-dependent interactions such as mispairing and unpaired dangling ends will further aid in the understanding and use of short DNA probes.

  
Table I: Summary of melting temperatures and free energies for end and middle set duplexes

Table II: Summary of singlet interaction models

Table III: Summary of doublet interaction models

Table IV: Validation set

Table V: Non-unique coefficients from the NN1 model



FOOTNOTES

*
This work was supported in part by National Institutes of Health Grant 1 P20 HG00666 (to K. L. B.), the Oak Ridge National Laboratory Directed Research and Development Fund, and the Office of Health and Environmental Research of the United States Department of Energy under Contract DE-AC05-84OR21400 (to Martin Marietta Energy Systems). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§
Partially supported by the Oak Ridge National Laboratory Postdoctoral Research Program administered by the Oak Ridge Institute for Science and Education. To whom correspondence should be addressed. Tel.: 615-574-6215; Fax: 615-574-6210.

¶
Supported by the Great Lakes College Association/Associated Colleges of the Midwest Program administered by the Oak Ridge Institute for Science and Education.

The abbreviation used is: RMSE, root mean square error.

M. J. Doktycz, R. S. Timm, K. L. Beattie, K. B. Jacobson, and R. S. Foote, manuscript in preparation.


ACKNOWLEDGEMENTS

Valuable contributions from Michael T. Lipcan III (Vassar College; a participant in Student Research Participation administered through the Oak Ridge Institute for Science and Education) are gratefully acknowledged.


REFERENCES
  1. Pease, A. C., Solas, D., Sullivan, E. J., Cronin, M. T., Holmes, C. P., and Fodor, S. P. A. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 5022-5026
  2. Szybalski, W. (1990) Gene (Amst.) 90, 177-178
  3. Kaczorowski, T., and Szybalski, W. (1994) Anal. Biochem. 221, 127-135
  4. Studier, F. W. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 6917-6921
  5. Kieleczawa, J., Dunn, J. J., and Studier, F. W. (1992) Science 258, 1787-1791
  6. Azhikina, T., Veselovskaya, S., Myasnikov, V., Potapov, V., Ermolayeva, O., and Sverdlov, E. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 11460-11462
  7. Kotler, L. E., Zevin-Sonkin, D., Sobolev, I. A., Beskin, A. D., and Ulanovsky, L. E. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 4241-4245
  8. Southern, E. M., Maskos, U., and Elder, J. K. (1992) Genomics 13, 1008-1017
  9. Bains, W., and Smith, G. C. (1988) J. Theor. Biol. 135, 303-307
  10. Fodor, S. P. A., Rava, R. P., Huang, X. C., Pease, A. C., Holmes, C. P., and Adams, C. L. (1993) Nature 364, 555-556
  11. Lysov, Y. P., Florent'ev, V. L., Khorlin, A. A., Khrapko, K. R., Shik, V. V., and Mirzabekov, A. D. (1988) Dokl. Akad. Nauk. SSSR 303, 1508-1511
  12. Drmanac, R., Labat, I., Brukner, I., and Crkvenjakov, R. (1989) Genomics 4, 114-128
  13. Drmanac, R., Drmanac, S., Strezoska, Z., Paunesku, T., Labat, I., Zeremski, M., Snoddy, J., Funkhouser, W. K., Koop, B., Hood, L., and Crkvenjakov, R. (1993) Science 260, 1649-1652
  14. Mirzabekov, A. D. (1994) TIBTECH 12, 27-32
  15. Strezoska, Z., Paunesku, T., Radosavljevic, D., Labat, I., Drmanac, R., and Crkvenjakov, R. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 10089-10093
  16. Drmanac, R., Drmanac, S., Labat, I., Crkvenjakov, R., Vicentic, A., and Gemmell, A. (1992) Electrophoresis 13, 566-573
  17. Doktycz, M. J., Goldstein, R. F., Paner, T. M., Gallo, F. J., and Benight, A. S. (1992) Biopolymers 32, 849-864
  18. Gotoh, O., and Tagashira, Y. (1981) Biopolymers 20, 1033-1042
  19. Ornstein, R. L., and Fresco, J. R. (1983) Biopolymers 22, 1979-2000
  20. Vologodskii, A. V., Amirikyan, B. R., Lyubchenko, Y. L., and Frank-Kamenetskii, M. D. (1984) J. Biomol. Struct. & Dyn. 2, 131-148
  21. Wartell, R. M., and Benight, A. S. (1985) Physics Rep. 126, 67-107
  22. Breslauer, K. J., Frank, R., Blöcker, H., and Marky, L. A. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 3746-3750
  23. Delcourt, S. G., and Blake, R. D. (1991) J. Biol. Chem. 266, 15160-15169
  24. Leijon, M., and Gräslund, A. (1992) Nucleic Acids Res. 20, 5339-5343
  25. Marky, L. A., and Breslauer, K. J. (1987) Biopolymers 26, 1601-1620
  26. Snedecor, G. W., and Cochran, W. G. (1967) Statistical Methods, 6th Ed., pp. 285-288 and 291-294, Iowa State University Press, Ames, IA
  27. Goldstein, R. F., and Benight, A. S. (1992) Biopolymers 32, 1679-1693
  28. Wetmur, J. G. (1991) Crit. Rev. Biochem. Mol. Biol. 26, 227-259
