(Received for publication, June 12, 1995; and in revised form, August 23, 1995)
From the
We describe the use of a phylogenetic approach to analyze the modular organization of the single-chained (898 amino acids) and multifunctional DNA polymerase of phage T4. We have identified, cloned in expression vectors, and sequenced the DNA polymerase gene (gene 43) of phage RB69, a distant relative of T4. The deduced primary structure of the RB69 protein (RB69 gp43) differs from that of T4 gp43 in discrete clusters of short sequence that are interspersed with clusters of high similarity between the two proteins. Despite these differences, the two enzymes can substitute for each other in phage DNA replication, although T4 gp43 does exhibit a preference for its own genome. A 55-amino acid internal gp43 segment of high sequence divergence between T4 and RB69 could be replaced in RB69 gp43 with the corresponding segment from T4 without loss of replication function. The reciprocal chimera and a deletion mutant of the T4 gp43 segment were both inactive for replication and specifically inhibitory (``dominant lethal'') to the T4 wild-type allele. The results show that phylogenetic markers can be used to construct chimeric and truncated forms of gp43 that, although inactive for replication, can still exhibit biological specificity.
In DNA replication, DNA polymerases bear the major responsibility for copying genomes with high accuracy. As a group, these enzymes display a variety of molecular types, but most are unified by exhibiting two catalytic functions that control fidelity: primer/template-dependent nucleotidyl transferase (polymerase) and DNA 3' exonuclease (proofreading function) (Kornberg and Baker, 1992). In bacteriophage T4, the two functions are part of the same polypeptide chain, the product of phage gene 43 (gp43), whereas in some biological systems the polymerase and DNA 3' exonuclease activities are specified by separate protein subunits, e.g. Escherichia coli DNA polymerase III holoenzyme (Kelman and O'Donnell, 1995). Another E. coli enzyme, DNA polymerase I, resembles T4 gp43 in size and in possessing polymerase and DNA 3' exonuclease functions in the same polypeptide chain; however, unlike T4 gp43, polymerase I also has an N-terminal 5' to 3' exonuclease function. A third E. coli DNA polymerase, polymerase II, resembles T4 gp43 in biochemical properties and amino acid sequence motifs but is a little smaller in size than the phage enzyme (Cai et al., 1995). One group of DNA polymerases, the reverse transcriptases, lacks editing function altogether (Skalka and Goff, 1993). T4 gp43 also bears a sequence-specific RNA-binding autogenous translational repressor function (Andrake et al., 1988) that only partially overlaps the DNA binding function of the enzyme (Pavlov and Karam, 1994).
Typically, replication DNA polymerases work in complex with other proteins, which provide accessory functions that help meet a number of requirements and overcome a variety of constraints inherent to the semiconservative duplication of long supercoiled and condensed double-helical DNA genomes. In the case of T4, the interfacing of replication with other DNA metabolic processes in the phage-infected cell complicates definition of what constitutes a replication complex; however, it is clear that T4 gp43 works in partnership with several other phage-induced proteins, including the products of genes 32 (a single-strand-binding, Ssb, protein), 45 (sliding clamp), 44/62 (clamp loader, DNA-dependent ATPase), 41 (helicase), 61 (primase), and others (for recent reviews, see Kreuzer and Morrical (1994) and Nossal (1994)). Some of the analogous proteins of the E. coli replicase are subunits of the polymerase III holoenzyme. Many studies have suggested that the single-chained 898-amino acid T4 DNA polymerase is organized into modules that specify its various activities (Lin et al., 1994; Nossal, 1969; Reha-Krantz, 1994; Spicer et al., 1988), but boundaries between modules remain largely undefined because of the interdependence of functions within the gp43 molecule and the lack of direct structural information defining modules and relating functions to one another. The most unambiguous identification of a T4 gp43 domain has been that of the DNA 3' exonuclease site, which can be differentially inactivated by a single amino acid substitution (D219A) (Frey et al., 1993) and demonstrated to exist in defined gp43 fragments (Lin et al., 1994). We report here results of phylogenetic studies that provide an expanded view of the modular organization of this multifunctional replication enzyme.
We
identified, cloned, sequenced, and expressed the structural gene for
DNA polymerase (gene 43) of phage RB69, whose genetic map is
similar to that of the canonical T-even phages, although it excludes
these other phages in mixed infections and does not recombine or
exhibit phenotypic mixing with them (Russell, 1967; Russell and Huskey,
1974). We show here that the T4 and RB69 polymerase genes are related
in primary structure and biological functions, but they are identical
only at about 65% of nucleotide positions, and neither yields viable
phage recombinants when propagated in cells carrying the cloned
heterologous gene (frequencies lower than 10). The
deduced amino acid sequence of RB69 DNA polymerase also diverges from
that of T4 gp43 (62% of residues identical plus 14% chemically similar).
The divergence between the two gp43
phylogenetic relatives occurs in clusters rather than being
distributive, and their amino acid similarity patterns suggest that
domains essential to replication functions are highly conserved. Also,
despite large differences in primary structure, plasmid encoded gp43
from either T4 or RB69 can complement the other protein for phage DNA
replication; however, quantitatively, the T4 enzyme shows a preference for
its own genome. An internal 55-residue segment of very high divergence
between the T4 and RB69 enzymes (37 dissimilar positions) could be
replaced in RB69 gp43 with its counterpart from T4 to yield a
biologically functional chimeric enzyme. The reciprocal domain exchange
yielded a nonfunctional gp43 that was partially inhibitory to
replication by wild-type T4 gp43. An internally deleted T4 gp43 was
also specifically inhibitory to T4 replication by wild-type enzyme.
These inhibitory proteins may retain activities that counteract
wild-type gp43. The results demonstrate the use of phylogenetic markers
to define exchangeable segments in the modular T4 DNA polymerase. The
construction of chimeric and specifically deleted derivatives of this
enzyme may ultimately help assign specific gp43 functions to specific
modules.
The T4 gene 43 double amber mutant 43amE4322-B22 has been described previously (O'Donnell
and Karam, 1972); it bears UAG codons for positions 386 and 731 of the
gene product (Reha-Krantz, 1994). The RB69 gene 43 mutant 43sacd carries a small out-of-frame internal deletion that
inactivates the phage DNA polymerase; the deletion was produced by
cleaving cloned RB69 gene 43 DNA at a unique SacI
site (see Fig. 4) and then treating the DNA with mung bean
nuclease before religation and transformation of host cells. The sacd mutation was subsequently transferred to phage by marker
rescue. E. coli BL21(DE3), which contains a T7 RNA polymerase
gene under lac UV5 promoter control (Studier and Moffatt,
1986), was used as host for recombinant plasmids expressing the T4 and
RB69 DNA polymerase genes under control of the T7 φ10 promoter in
the pSP72 and pSP73 vectors sold by Promega. This E. coli strain was also used as host in plasmid-phage complementation
assays. E. coli CR63 (Sup D, ser) was used
for platings of T4 amber mutants, and E. coli strains CAJ70 (a
UGA suppressor; Sambrook et al., 1967) and S/6str, both nonpermissive for phage amber
mutants, were used to score for wild-type phage. Growth conditions and
complementation assays were as described previously (Hughes et al., 1987).
Figure 4:
Graphical representation of clustered
similarities (identical and chemically similar residues) between the
primary structures of the T4 and RB69 DNA polymerases. The Gene 43 panel shows partial restriction maps for the two structural genes,
and the cross-shaded bars represent polymerase
chain reaction-generated DNA fragments from both RB69 and T4 gene 43 that were used in constructing chimeric and internally
truncated gp43 species (see Fig. 5 and Fig. 6).
Restriction site abbreviations were as follows: Bg, BglII; Bm, BamHI; Bx, BstXI; Dr, DraI; Pv, PvuII; Sc, SacI; Xh, XhoI.
The BglII site (AgatcT), when introduced into the gene,
created a 4-base pair (gatc) insertion; the bracketing A and T
nucleotides of the site are part of the wild-type gene 43 sequence. The insertion was removed by mung bean nuclease digestion
and religation following BglII treatment. The gp43
similarities panel highlights segmental differences between the T4 and
RB69 gp43 molecules. The different shadings represent similarities
ranging from 33% to completely identical. The asterisk marks regions of less than 50%
similarity.
Figure 5:
Biological activities of T4-RB69 gp43
chimeras. For qualitative spot tests, 5 µl of phage solution (about
10 particles) were deposited on lawns of E. coli BL21(DE3) carrying the desired plasmid, and plates were incubated
overnight at 30 °C. For quantitative tests, plasmid-bearing cells
(at 3
10
/ml) were infected and analyzed for phage
production and DNA synthesis as described by Hughes et al. (1987). The numbers shown below the spots are
relative values, comparing growth of the specified phage on the
wild-type clone of the homologous gene. A value of 1.0 refers to a
phage yield of 300-500/cell with T4 and 150-200 with RB69
infections. Measurements of DNA synthesis ([3H]thymidine incorporation) were also carried
out, and the results (data not shown) were consistent with phage
yields. The infecting phage strains were: WT, wild-type; T4 43, T4 double-amber mutant 43amE4322-B22; RB69 43, RB69 deletion
mutant 43sacd. In the plasmid-bearing cells used for spot
testing and liquid culture experiments, WT, wild-type gene 43; delC1, deletion of the C-terminal 99 RB69 gp43
residues; delM1, deletion of the 55-residue internal gp43
segment of either T4 or RB69; and FS801, 4-base deletion
affecting reading frame starting after amino acid residue 801 (a
histidine) in T4 gp43; this construct yields wild-type T4 gene 43 recombinants when infected with the T4 43
mutant used. CH-1 and CH-2 are reciprocal T4-RB69 gp43 chimeric
forms (see Fig. 6 for constructions).
Figure 6:
Electrophoretic analysis of proteins made
by plasmid clones of wild-type, partially deleted, and chimeric T4 and
RB69 gene 43. The gene 43 constructs diagrammed in
the figure were expressed in pSP72-bearing E. coli BL21(DE3)
under T7 promoter control, and 35S-labeled proteins were
analyzed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) as described in Andrake and Karam (1991).
Positions of the gene 43 products from these clones are marked
in the autoradiogram panel, where the
direction of protein migration is presented in left-to-right
orientation. The delC1 and FS801 constructs
originated from WT clones of the corresponding genes that were digested
with BstXI (see Fig. 4 for site location). With T4 FS801, BstXI digestion was followed by mung bean
nuclease treatment and religation. With RB69 delC1, the
sequence downstream of the BstXI site was removed by digesting
the linearized WT plasmid with XhoI (site distal to the cloned
gene boundary), and the truncated DNA was treated with mung bean
nuclease and then religated. The CH-1, CH-2, and delM1 constructs were made by fusing different combinations of the
polymerase chain reaction-generated fragments diagrammed in Fig. 4; the gp43 segment exchanged or deleted in these
constructs spans from residue 498(T4)/501(RB69) to residue
552(T4)/555(RB69), i.e. the darkly shaded segment in Fig. 2. All constructs were confirmed by DNA
sequencing.
Figure 2:
Primary structure alignment for the T4 and
RB69 DNA polymerases. The sequence of RB69 gp43 was deduced from DNA
sequence determinations as described under ``Materials and
Methods.'' The sequence of T4 gp43 was determined in the studies
by Spicer et al. (1988). The chart shows several landmarks on
the T4 protein: EXO, conserved exonuclease motifs in DNA
polymerases; POL, conserved sequence motifs in Family B DNA
polymerases (Braithwaite and Ito, 1993; also referred to as the Pol α
family; Joyce and Steitz, 1994; Wang et al., 1989). The POL I, POL II, and POL III motifs (overlap
with motifs C, A, and B, respectively, of Delarue et al., 1990) have been implicated as ``polymerase sites'' by
mutational studies in T4 (see Reha-Krantz (1994) for a review). Amino
acid residues underscored with a dot are suspected to be
active site (POL or EXO) residues (Spacciapoli and
Nossal, 1994). Some amino acid differences from T2, T6, and RB18 are
also marked (see text for gene segments sequenced). The central
shading marks the segment of least homology between the T4 and
RB69 proteins.
Figure 1:
Southern blot analysis
of genomic DNA preparations from T4 related phages. DNA samples were
digested with either DraI (panel A) or DraI
plus SspI (panel B), and the resulting digests were
separated by electrophoresis on a 1% agarose gel, transferred to
a nitrocellulose filter, and hybridized (at 50 °C) with a 32P-labeled riboprobe prepared by antisense transcription of
a complete T4 gene 43. Methods were as described by Hsu and
Karam (1990). Autoradiogram lanes depict analyses of the following
genomic DNA samples. Panel A, lane 1, T2; lane
2, T4; lane 3, T6; lane 4, RB6; lane 5,
RB18; lane 6, RB19; lane 7, RB51; lane 8,
RB69; lane 9, RB70. Panel B, lane 1, T4; lane 2, RB69. The band marked by the horizontal arrow (panel A) probably consists of two comigrating DraI DNA fragments in at least T2, T4, and T6 (see Fig. 4).
Fig. 2 shows an alignment between the primary
structure deduced for RB69 gp43 from the nucleotide sequence and the
known 898-amino acid sequence for T4 DNA polymerase (Spicer et al., 1988). There is a total of 348 single amino acid differences
between the two proteins, including the five additional residues in
RB69 gp43 (903 amino acids) as compared with the T4 enzyme. This
represents almost 40% of all amino acid positions. Overall the two
proteins are either identical or chemically similar at 74% of all
positions; however, most of the differences occur in clusters rather
than at dispersed locations. One conspicuous difference between the two
proteins involves an internal 55-residue segment, which exhibits only
about 33% similarity (8 identical plus 10 chemically similar positions)
between the two proteins, i.e. residues 498(T4)/501(RB69) to
552(T4)/555(RB69); this segment may extend further N-terminally to
residue 482(T4)/485(RB69). Several shorter clusters with somewhat
higher degrees of similarity (40-50%) were also observed,
particularly in comparisons between the N-terminal segments of the two
proteins. Fig. 2 also shows the few sites of difference we have
detected between T4 gp43 and its counterparts from other T4-like
phages, which were only partially analyzed for their gene 43 sequences. When compared for the internal segment of high
divergence between the T4 and RB69 proteins, only T2, T6, and RB18
showed differences from their T4 counterpart (one amino acid difference
in each case; Fig. 2). As shown in Fig. 3, the amino acid
similarity profiles for the internal gp43 segment of these phages are
reflected in the corresponding nucleotide sequences of the structural
gene. T2 and T4 gp43 also exhibited three amino acid differences
between them within the first 106 residues of the protein, while T6 and
RB70 gp43 were identical to the T4 protein in this N-terminal segment;
these observations are based on sequencing only the first 318 base
pairs of the genes from T2, T6, and RB70. Although no additional gp43
segments of the T4-related phages were examined, the data collected so
far suggest that T4 gp43 is very closely similar to its T2, T6, RB6,
RB18, RB19, RB51, and RB70 counterparts but clearly distinct from RB69
gp43. A schematic of the clustered differences between the T4 and RB69
gp43 species is given in Fig. 4 and is consistent with a
segmental structure for this class of single-chained DNA polymerases
(Reha-Krantz, 1994; Lin et al., 1994), where conserved amino
acid clusters may mark segments critical to replication or other vital
functions of these enzymes.
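The identity and similarity figures quoted above reduce to simple counts over aligned positions. The following is a minimal Python sketch of that bookkeeping, not the procedure used in this study. The grouping of chemically similar residues (`SIMILAR_GROUPS`), the function names, and the toy sequences are all assumptions introduced for illustration, since the criteria used for chemical similarity are not listed in the text.

```python
# Assumed grouping of "chemically similar" residues (illustrative only;
# the paper does not state which substitutions it counted as similar).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def classify(a, b):
    """Classify one aligned column as identical, similar, or different."""
    if a == "-" or b == "-":
        return "different"  # a gap in either sequence counts as a difference
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def alignment_counts(seq1, seq2):
    """Count identical, similar, and different columns of a pairwise alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    counts = {"identical": 0, "similar": 0, "different": 0}
    for a, b in zip(seq1, seq2):
        counts[classify(a, b)] += 1
    return counts


def percent(part, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


def windowed_similarity(seq1, seq2, window):
    """Fraction of identical-or-similar columns in consecutive windows.

    Low-scoring windows correspond to clusters of divergence.
    """
    profile = []
    for start in range(0, len(seq1), window):
        c = alignment_counts(seq1[start:start + window],
                             seq2[start:start + window])
        profile.append((c["identical"] + c["similar"]) / sum(c.values()))
    return profile


if __name__ == "__main__":
    # Toy aligned sequences, not taken from gp43.
    a = "MKDEILVSTA"
    b = "MKEDLLQTS-"
    print(alignment_counts(a, b))
    print(windowed_similarity(a, b, 5))
```

Applied to the counts reported in the text, `percent(8 + 10, 55)` gives roughly 33% similarity for the divergent internal segment, and `percent(348, 903)` gives roughly 38.5% for the overall fraction of differing positions, in line with the "almost 40%" stated above.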
Figure 3: The gene 43 segment of high divergence between T4 and RB69; comparisons with the corresponding segments of other T4-like phages. This segment encodes the 55-amino acid sequences highlighted by the darkest shading in Fig. 2. Note that, except for RB69, the differences between the other phages and T4 ranged between two and five nucleotides within the 165-base span (97-99% identity between these phages and T4), and most of the differences involve the same positions of the homologous genes. The RB69 sequence differed from the T4 counterpart at 105 of the 165 nucleotide positions (i.e. the two sequences are 37% identical), with about 70% of the differences being related by transversion. A dash means the residue is identical to that in T4.
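The nucleotide comparison in the Fig. 3 legend (identity over the 165-base span and the share of differences that are transversions) can be tallied as sketched below. This is an illustrative Python sketch under stated assumptions: the function names and the toy sequences are hypothetical, the inputs are taken to be gap-free, equal-length, uppercase DNA strings, and the code is not the analysis used to produce the figure.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(a, b):
    """Classify an aligned nucleotide pair.

    A transition exchanges purine for purine or pyrimidine for pyrimidine;
    a transversion exchanges a purine for a pyrimidine or vice versa.
    """
    if a == b:
        return "match"
    pair = {a, b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def compare_nucleotides(seq1, seq2):
    """Count matches, transitions, and transversions between two sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    counts = {"match": 0, "transition": 0, "transversion": 0}
    for a, b in zip(seq1, seq2):
        counts[substitution_type(a, b)] += 1
    return counts


def percent_identity(counts):
    """Percentage of matching positions, rounded to one decimal place."""
    total = sum(counts.values())
    return round(100.0 * counts["match"] / total, 1)


if __name__ == "__main__":
    # Toy sequences, not the gene 43 segment.
    counts = compare_nucleotides("ATGCATGC", "ACGTTTGA")
    print(counts, percent_identity(counts))
```

For the numbers in the legend, 105 differences in 165 positions leave 60 matches, and 100 × 60/165 is about 36.4%, close to the identity quoted for the T4 and RB69 segments.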
Fig. 6 compares protein sizes for a number of gp43 constructs
that were used in the biological experiments summarized in Fig. 5. Clearly, as observed with the in-frame internal delM deletions, removal of the 165-bp sequence for the 55-residue
segment of gp43 results in synthesis of gp43 species that are shorter
than wild-type protein (Fig. 6) as well as in loss of
replication function (Fig. 5). We next asked if the divergent
internal segments of the two gp43 species could substitute for each
other. We constructed the chimeras CH-1 and CH-2 (Fig. 6) and
assayed their biological activities by the plasmid-phage
complementation assay (Fig. 5). One of the two alternate swaps,
CH-2, did indeed exhibit a product that supported phage replication,
whereas replacement of the T4 segment with its counterpart from RB69
yielded a chimera (CH-1) that was inactive for phage replication (Fig. 5). Interestingly, the CH-1 chimera exhibited a partial
inhibitory effect on replication of an infecting T4 wild-type phage and
must therefore actively counter the function of wild-type gp43, or
perhaps inhibit its synthesis (Andrake and Karam, 1991). This
replicon-specific ``dominant lethal'' effect was more evident
in liquid culture than in plating assays (Fig. 5). The delM constructs from RB69 and T4 also exhibited replicon-specific
trans-dominant inhibitory effects, whereas the C-terminal RB69 deletion (delC clone) and T4 frameshift (FS801) mutants used
neither inhibited replication of infecting wild-type phage nor
complemented infecting gene 43 phage mutants (Fig. 5)
and are therefore probably devoid of biologically relevant activity.
Preliminary studies indicate that the delM1 mutants retain
RNA-binding (repressor) function, but it is still unclear whether they
bind DNA as well. It may also be important to note here
that RB69 gp43 migrates more slowly than T4 gp43 in SDS-PAGE and that
this property is associated with the 55-residue segment exchanged in
CH-1 and CH-2 rather than with the additional five amino acids of the
RB69 enzyme. The reason for the difference in electrophoretic behavior
is not known.
The results described here underscore the utility of
comparing functionally analogous proteins from organisms that belong in
the same phylogeny but that are not very closely related. We can safely
conclude that T4 and RB69 share common ancestry, although some segments
of their genomes may have unrelated origins. Opportunities for
horizontal transfer of genetic elements make questions about origins
and diversification of genomes difficult to resolve in any biological
system (Campbell and Botstein, 1983), but especially so with virulent
phages where the host(s) that contributed to their natural selection
cannot be ascertained (Shub, 1994). T4 and RB69 are only slightly
similar in serological properties, but they resemble each other more
closely in appearance and in physical and genetic properties of their
nucleic acids (Russell, 1967; Russell and Huskey, 1974). Similarity by
such criteria could relate any two phages that may have acquired
functionally analogous genetic cassettes from already highly diverged,
or even unrelated, origins, e.g. phages λ and 21 (Campbell
and Botstein, 1983). Ultimately, we expect to encounter two levels of
segmental divergence between the RB69 and T4 genomes: (i) intragenic
divergence of the type we report here for the DNA polymerase genes and
(ii) intergenic divergence, whereby entire segments of the two genomes
will prove to be dissimilar in sequence (although perhaps still similar
in function), as has been observed among the mix of immunity,
replication, and assembly gene clusters of lambdoid phages (Susskind
and Youderian, 1983). There are several examples of horizontal
acquisition of genetic information in the evolution of T-even phages,
including the capture of introns (Clyman and Belfort, 1994) and
insertion elements (Miller and Jozwik, 1990) by some T-even genomes but
not by others that are otherwise very closely related to one another (Fig. 1 and Fig. 2) and the sharing of tail fiber antigen
determinants between some T-even and other phages like Mu and λ
(Henning and Hashemolhosseini, 1994). The clustered differences between
the primary structures of the DNA polymerases of T4 and RB69 (Fig. 2), and other evidence suggesting that T4 gp43 is a
modular enzyme (for review, see Reha-Krantz (1994)), lead us to consider
that intracistronic evolution of gene 43 may have also
occurred by both horizontal and vertical change. In particular, the
55-amino acid internal segment of least similarity between them could
have either originated by horizontal transmission of two unrelated DNA
segments that converged to a similarity level consistent with modern
gp43 function, or diverged from one DNA origin by an unusually
permissive acceptance of amino acid substitutions. In this regard, it
is interesting to mention that there are no known conditional lethal
missense mutations that map in this region of T4 gene 43 (Reha-Krantz, 1994) and that one of the highly conserved positions
between the T4 and RB69 gp43 segments (i.e. Ser523 in T4 gp43)
is changed to a chemically similar residue in T2 (to Thr) and a
dissimilar one in T6 (to Asn) (Fig. 2), two phages that are very
closely related to T4 in their gene 43 sequences (Fig. 3). Interestingly, however, we do note that the T4 and
RB69 segments are both particularly rich in acidic amino acid content
(pI 4.7) and there may have been a selection for this chemical
property during their evolution. We also note that the nucleotide
sequence encoding this gp43 segment in RB69 possesses the high AT/GC
ratio characteristic of T-even phages but with a different base
distribution from T4 (Fig. 3). It will be interesting to find
out if this segment will be less tolerant to mutational alterations if
these caused a drift away from acidity.
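The acidity argument above can be made concrete by estimating the isoelectric point of a peptide from its composition. The sketch below is a rough Python illustration, not the calculation used in this study: the pKa table is an assumed set of commonly used approximate values, the function names are hypothetical, and the example peptides are toy sequences rather than the gp43 segment. The estimate sums Henderson-Hasselbalch charges for ionizable groups and bisects on pH until the net charge is zero.

```python
# Assumed approximate pKa values (not from the paper); different published
# tables give somewhat different estimates.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(seq, ph):
    """Approximate net charge of a peptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))    # N-terminal amino group
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))   # C-terminal carboxyl group
    for residue in seq:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(seq, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH, so bisection converges.
    """
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def acidic_fraction(seq):
    """Fraction of residues that are Asp or Glu."""
    return sum(1 for residue in seq if residue in "DE") / len(seq)


if __name__ == "__main__":
    # Toy peptides: one acidic, one basic.
    for peptide in ("DEDEAGSDE", "KRKRAGSKR"):
        print(peptide, round(isoelectric_point(peptide), 2),
              round(acidic_fraction(peptide), 2))
```

A mutational drift away from acidity, as considered above, would show up in such an estimate as a rising isoelectric point when acidic residues are replaced by neutral or basic ones.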
The T4 and RB69 DNA polymerases can substitute for each other in phage replication, although the T4 enzyme appears to show a strong preference for its own genome (Fig. 5). Qualitatively, the broader specificity of the RB69 enzyme is a remarkable property in view of the presumption that several components in each of the T4 and RB69 DNA replicase assemblies must have co-evolved to maintain mutual recognition. In this context, the segmental distribution of conserved sequences between T4 and RB69 gp43 may prove to be mirrored by similarly clustered patterns of sequence conservation in other protein components of their respective replicases. Gp43 segments at which divergence in primary structure was permitted during evolution may serve any of a number of important roles, such as providing appropriate spacing between interacting activity domains of a folded enzyme or providing determinants that distinguish biological specificities of the two replication systems from each other. We have been testing for such roles in the 55-residue internal gp43 sector of highest divergence between T4 and RB69. In the ``domain swap'' experiments described here, we observed that reciprocal exchange of this segment between the two phages did not alter replicon specificity of the gp43 recipients (Fig. 5 and Fig. 6). In the one case, an RB69 gp43 with a T4-derived insert (CH-2) replicated T4 and RB69 equally well, i.e. exhibited RB69 gp43 characteristics. In the reciprocal exchange (CH-1), the chimeric gp43 was inactive for replication but had the added interesting property of inhibiting function from T4 (but not RB69) wild-type gp43. That is, it exhibited a T4-specific ``trans-dominant'' phenotype. A similar, but more inhibitory, phenotype was exhibited by a T4 gp43 that was deleted for this internal segment (T4 delM1). The deleted RB69 gp43 counterpart (RB69 delM1) was inhibitory toward both wild-type phages, which is a phenotype that mirrors the broader specificity of wild-type RB69 gp43.
We conclude that this internal segment is essential for replication function but does not determine replicon specificity of gp43. Also, the trans-dominant phenotypes exhibited by internally deleted gp43 from T4 and RB69 and the inhibitory effect of CH-1 suggest that these proteins retain some of the activities of the wild-type enzymes and may compete with these if present in the same cell. It should be possible to localize the target for inhibition both by genetic and in vitro assay.
T4 DNA polymerase exhibits several amino acid sequence similarities with a number of eukaryotic DNA polymerases (Spicer et al., 1988; Braithwaite and Ito, 1993). Interestingly, the segment of divergence from RB69 gp43 appears as a ``gap'' in sequence alignments with these other enzymes (Wang et al., 1989) and is positioned to the immediate N-terminal side of a highly basic sequence motif (pI = 10-10.3; designated POL III in Fig. 2) that is conserved among DNA-dependent DNA polymerases from a variety of biological sources (Blanco et al., 1991; overlaps motif B of Delarue et al., 1990). Since the segment from T4 can substitute for the one in RB69, the amino acid sequence contained therein may have no interactions with other parts of the intact gp43 molecule. The segment may be an innocuous linker or spacer whose divergence during evolution was limited only by amino acid changes that altered its length or interfered with other segments of the enzyme. Such explanations can be tested by site-directed and randomized mutagenesis. It is still possible that divergence of the gp43 internal segment is related to the evolution of different biological specificities in the two gp43 species examined here. Considering the incompatibility of T4 and RB69 in coinfected E. coli hosts (Russell and Huskey, 1974), the two phages must have experienced much of their natural selection in separate cellular environments, and their gene 43 products may have evolved different signatures that are not functionally distinguishable in the bacterial hosts that, by experimental design, were used for their initial detection.