(Received for publication, August 24, 1994)
The Bacillus subtilis bacteriophage PBS2 uracil-DNA
glycosylase inhibitor (Ugi) is an acidic protein of 84 amino acids that
inactivates uracil-DNA glycosylase from diverse organisms (Wang, Z.,
and Mosbaugh, D. W. (1989) J. Biol. Chem. 264,
1163-1171). The secondary structure of Ugi has been determined by
solution state multidimensional nuclear magnetic resonance. The protein
adopts a single well defined structure consisting of five anti-parallel
β-strands and two α-helices. Six loop or turn regions were
identified that contain approximately one half of the acidic amino acid
residues and connect the β-strands sequentially to one another. The
secondary structure suggests which regions of Ugi may be involved in
interactions with uracil-DNA glycosylase.
Uracil residues may be introduced into DNA by the incorporation
of dUMP during DNA synthesis or by the deamination of cytosine in DNA. In vivo, cytosine deamination occurs at a genetically
significant rate to produce a premutagenic U·G mispair (1, 2, 3). If unrepaired, the
uracil-DNA lesion will promote a C to T transition mutation after the
next cycle of replication (3).
Most organisms eliminate
uracil residues from DNA by means of the uracil-excision DNA repair
pathway. Uracil-DNA glycosylase initiates repair by cleaving the N-glycosylic bond between uracil and deoxyribose in DNA to
yield free uracil and an abasic site(4) . The abasic sites
induced by the action of uracil-DNA glycosylase cause changes in the
structure, dynamics, and chemical stability of DNA (5, 6, 7, 8) . The chemistry of
abasic sites has also been investigated (9, 10, 11) . The ubiquitous uracil-DNA
glycosylase enzyme has been highly conserved throughout evolution as
evidenced by the significant amino acid similarity, 55.7% identical
residues, between human and bacterial forms of uracil-DNA
glycosylase (12, 13, 14). The functional
similarity between Escherichia coli uracil-DNA glycosylase
(Ung) and its human counterpart is underscored by the
observation that human uracil-DNA glycosylase complements E. coli
ung mutants (15). Further, extracts of both E. coli and human cell lines have been shown to generate one-nucleotide
repair patches following excision of uracil from DNA, suggesting that
the entire excision repair pathway itself may be highly
conserved (16).
Interestingly, the genomes of several
mammalian viruses have been reported to encode uracil-DNA
glycosylase(12, 17, 18, 19, 20) .
The function of these viral uracil-DNA glycosylases remains to be fully
understood. However, recent evidence suggests that the viral enzyme
plays a key role in viral replication and reactivation since viral
uracil-DNA glycosylase activity is rapidly induced following both
herpes simplex virus and pox virus
infection(21, 22, 23) . In adult neurons
mammalian uracil-DNA glycosylase activity was reportedly undetectable;
thus, uracil residues may accumulate either by the oxidative
deamination of cytosine or by the misincorporation of dUMP in place of
dTMP during DNA repair synthesis(24) . When a nerve cell is
attacked by herpes simplex, pox, vaccinia, or pseudorabies virus, there
is often a long latency period. During this period the genome of the
virus can acquire uracil residues if host uracil-DNA repair is
absent(24) . Further, uracil residues located in the herpes
simplex 1 origin of replication (Ori) hamper specific
recognition by the origin binding protein(24) , and
inactivation of the vaccinia and pox virus-encoded uracil-DNA
glycosylase gene eliminates viral viability (20, 25).
Thus, uracil-DNA glycosylase plays an important role in the DNA
metabolism of most biological systems, including viruses.
Bacillus subtilis bacteriophages PBS 1 and 2 are unusual biological systems in that their double-stranded DNA genome contains uracil in place of thymine(26) . Upon infection of the host, phage-induced activities cause depletion of the dTTP pool concomitant with dramatic elevation of the dUTP pool, thereby facilitating the incorporation of dUMP into phage DNA during replication(27, 28, 29) . However, uracil-containing DNA synthesized under conditions of PBS infection must be protected from the B. subtilis uracil-DNA glycosylase to avoid uracil-DNA degradation. This is accomplished by expression of the PBS 2 ugi gene product which directly inactivates the host uracil-DNA glycosylase(29, 30) .
The ugi gene from bacteriophage PBS 2 has been cloned, sequenced, and
overexpressed in E. coli, and the purified Ugi protein has
been characterized(31, 32) . Ugi is an acidic protein
with a pI of 4.2(33) . It is a heat stable, nonglobular
monomeric protein of 84 amino acid residues that exhibits anomalous
migration during SDS-polyacrylamide
electrophoresis (32, 33, 34). The Ugi protein
has been shown to inactivate Ung by forming an extremely stable
Ung·Ugi complex with 1:1 stoichiometry (33). In addition,
Ugi inactivates other uracil-DNA glycosylases, including those isolated
from Micrococcus luteus, Saccharomyces cerevisiae,
rat liver, and human cells (30, 32, 34).
Inhibition of Ung occurs in a noncompetitive manner with respect to the
uracil-DNA substrate (35).
Kinetic studies have shown that
the interaction of Ugi with Ung occurs by way of a two-stage mechanism
involving a rapid pre-equilibrium "docking" step, followed
by a rearrangement or "locking" step that leads
irreversibly to the final complex (36). Several lines of
evidence suggest that Ugi binds Ung at or near the DNA binding site of
the enzyme. Ugi binding to Ung prevents enzyme association with DNA (33). Ung that has been UV-cross-linked to the DNA
oligonucleotide dT at the enzyme DNA binding site fails to
form a complex with Ugi, and Ung complexed with Ugi does not
UV-cross-link to dT (37). The high percentage of
negatively charged amino acids, 12 Glu + 6 Asp out of 84 residues,
further suggests that Ugi may act as a DNA mimic in binding to Ung.
Such mimicry may provide an explanation as to how the Ugi protein from
PBS is capable of inactivating uracil-DNA glycosylase from diverse
biological systems that are under no selective pressure to maintain a
Ugi binding site.
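The acidic content quoted above can be checked with a few lines of arithmetic. The sketch below uses only the residue counts stated in the text (12 Glu, 6 Asp, 84 residues); the variable names are illustrative and not part of the original work.

```python
# Residue counts quoted in the text: 12 Glu + 6 Asp in an 84-residue protein.
N_GLU = 12
N_ASP = 6
N_RESIDUES = 84

# Total number of acidic side chains and their share of the sequence.
n_acidic = N_GLU + N_ASP
acidic_fraction = n_acidic / N_RESIDUES

print(f"{n_acidic} acidic residues of {N_RESIDUES} = {acidic_fraction:.1%}")
```

Roughly one residue in five carries a carboxylate side chain, which is the basis for the DNA-mimic suggestion.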
Additional understanding of the Ung·Ugi
complex requires elucidation of the individual protein structures as
well as that of the complex. In this article we report the
determination of the secondary structure of Ugi by multidimensional
nuclear magnetic resonance (NMR) utilizing isotopically labeled
protein. The Ugi secondary structure contains five antiparallel
β-strands, which form a contiguous structure with one another, and two
α-helices. The information gained from the secondary structure allows
assessment of the regions of Ugi likely to be involved in binding to
Ung.
Figure 1: The HSQC spectrum was obtained on a ¹⁵N-labeled sample of Ugi. The cross-peaks are at the chemical shift of the nitrogen along one frequency axis and the chemical shift of the proton along the other. The cross-peaks of the amide nitrogen-amide proton pairs are labeled by residue number and residue type, and the cross-peak of the ring nitrogen of the tryptophan is also labeled. The chemical shifts of the amide proton and nitrogen resonances are given in Table 1. The unlabeled cross-peaks arise from glutamine and asparagine NH₂ groups, and some of the assignments of these cross-peaks are given in Table 1.
Figure 2: The spectra shown are taken from a three-dimensional TOCSY-HSQC data set obtained on Ugi. The two spectra show the connectivities made from amide proton to side chain protons. The bottom spectrum is from the most congested region of the three-dimensional data.
Figure 3: The spectra shown are taken from a three-dimensional NOESY-HMQC data set obtained on Ugi. The spectra at the top are arranged as "strips" and show the sequential NOE connectivities used to assign the resonances of the α-helix from residue 27 to 33. All of the strips have the same range of chemical shifts along one frequency axis, and the width of the chemical shift range along a second frequency axis is indicated. The spectra at the bottom are also arranged as "strips" and show the sequential NOEs used to assign the resonances of the β-strand residues from 68 to 77.
Figure 4: Sequential NOEs between the residues of Ugi are depicted, together with those between residues i and i + 2 and between residues i and i + 3; the NOEs are characterized as small, medium, or large. The residues with slow amide exchange are indicated by a filled circle. The chemical shift of the CαH proton of each assigned residue relative to the chemical shift of the random coil, unshifted position is also given.
Three-dimensional NOESY-HMQC and TOCSY-HSQC spectra were obtained at
15, 30, and 37 °C so as to resolve the residual water signal from
some of the CαH protons. The NMR properties of the
protein did not change significantly over this temperature range or over the pH
range of 5.5 to 8. The amide protons exchange sufficiently slowly with
water at pH 7 to allow the NMR experiments to be carried out at this pH.
The exchange rates of the amide protons were determined at pH 7.0
and 20 °C by observing the rate of loss of intensity of cross-peaks
in the two-dimensional HMQC spectrum as a function of time after the
Ugi had been transferred from an H₂O solution to a
²H₂O solution. The amide protons with the slowest
exchange rates are indicated in Fig. 4.
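As an illustration of how an exchange rate can be extracted from the loss of cross-peak intensity, the following minimal sketch assumes simple first-order decay, I(t) = I0·exp(-kt), and fits it by linear least squares on ln(I). The decay model, the function name fit_exchange_rate, the time points, and the rate constant are all assumptions made for the example; the intensities are synthetic and are not data from this work.

```python
import math

def fit_exchange_rate(times, intensities):
    """Fit ln(I) = ln(I0) - k*t by ordinary least squares; return (k, I0)."""
    ys = [math.log(i) for i in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)

# Synthetic, noise-free intensities generated from a hypothetical rate constant.
true_k = 0.02                      # per minute (hypothetical)
times = [0, 10, 20, 40, 80, 160]   # minutes after transfer (hypothetical)
intensities = [100.0 * math.exp(-true_k * t) for t in times]

k, i0 = fit_exchange_rate(times, intensities)
half_life = math.log(2) / k
print(f"k = {k:.4f} per min, I0 = {i0:.1f}, half-life = {half_life:.1f} min")
```

With real spectra the intensities would carry noise, and a weighted or nonlinear fit might be preferable; the log-linear fit is used here only because it needs no external libraries.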
The characterization of Ugi began with the determination that
this protein adopts a single well defined structure in solution. The
HSQC spectrum in Fig. 1 contains cross-peaks whose coordinates
are the chemical shifts of the amide nitrogen and amide proton of the
non-proline residues. (Chemical shifts of all proton and nitrogen
resonances are given in Table 1.) The cross-peaks arising from
the NH₂ groups of glutamine and asparagine also appear in this
region but are much broader than the cross-peaks from backbone amides. The
examination of this spectrum shows that the protein adopts a defined
structure since there is a wide dispersion of the chemical shifts along
both the nitrogen and proton dimensions. The presence of a number of
resonances with proton chemical shifts between 8 and 10 ppm suggests
that the protein has a significant percentage of the residues in
β-strands.
Many proteins are investigated by NMR at low pH since the crucial amide protons undergo rapid base-catalyzed exchange with the bulk water at neutral pH. It was found that the amide protons of Ugi exchange sufficiently slowly with water at pH 7.0 to allow the NMR experiments to be carried out successfully. This slow rate of amide exchange is also indicative of a highly structured protein. Examination of the spectrum indicated that only one structural form of the protein was present in solution, as there is one cross-peak for the amide of each non-proline residue in the protein. These NMR results indicated that Ugi had a well defined, single structure in solution and that it was well suited for NMR structural studies based on multidimensional NMR and sequential assignment strategies.
The next step in the
structural characterization was the determination of the chemical
shifts of the CαH protons. The resonances of these
protons were ascertained from the information present in a DQCOSY
spectrum of the protein. The DQCOSY data, not shown, contains
cross-peaks between the amide and α protons, and makes it possible
to distinguish resonances in the HSQC spectrum of amide protons from
those of the amino protons of glutamine and asparagine residues.
Once determined, the resonances of the amide and α protons were
combined with the results of three-dimensional TOCSY-HSQC data to
identify the spin systems (39, 40, 41). These
experiments allow the connection of the α protons to the other
protons of the same residue via scalar (spin-spin) couplings, and
hence allow the classification of groups of resonances to a residue of
a particular spin system type. Typical TOCSY-HSQC data is shown in Fig. 2.
The sequential assignments of the individual spin
systems were based primarily on the information present in the
three-dimensional NOESY-HMQC data. From this information the sequential
connectivities from the amide proton of residue i to the amide protons
of residues i - 1 and i + 1, as well as from the amide proton of residue i to the
CαH proton of residue i - 1, were
determined (39, 40, 41). The spectra
containing the sequential connectivities used to assign the resonances
of residues in the long helix are shown in Fig. 3. Fig. 3 also contains the set of spectra used to make the
sequential assignments in one of the β-strand regions. The
sequential NOE connectivities of all of the residues are indicated in Fig. 4. All of the non-proline residues were assigned except for
the terminal residues 1 through 4.
Characterization of the secondary
structural elements of Ugi was based on several pieces of information.
The helices were identified primarily on the basis of strong NOE
connectivities between the NH of residue i and the NH of
residues i + 1 and i - 1. The
helices were also characterized by the presence of weak to medium
connectivities from the NH proton of residue i to the NH
proton of residue i + 3 and weak NOE connectivities
between the CαH of residue i and the NH proton
of residues i + 2 and i + 3. These NOE
connectivities are depicted in Fig. 4 and allowed the
identification of two α-helices in the secondary
structure. One of these helices is from residue 5 to residue 14 and the
other helix is from residue 27 to residue 35. In general, downfield
shifts (positive differences) correlate with β-sheets and upfield
shifts (negative differences) correlate with α-helices (42). The chemical shift differences of the
CαH protons relative to the random coil, unshifted
positions are also given in Fig. 4. In the two regions of Ugi
assigned to α-helices the expected negative differences between the
observed chemical shifts and the random coil chemical shifts are found.
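The sign rule described above can be written as a short sketch. The difference is taken as observed minus random coil, so a downfield shift is positive. The 0.1 ppm cutoff, the function names, and the example shift values are assumptions chosen for illustration; they are not values reported in this work.

```python
def shift_difference(observed_ppm, random_coil_ppm):
    """CαH chemical shift difference: observed minus random coil (ppm)."""
    return observed_ppm - random_coil_ppm

def classify(diff_ppm, threshold=0.1):
    """Map the sign of the difference to a secondary-structure tendency.

    The threshold is a hypothetical cutoff for treating small differences
    as uninformative.
    """
    if diff_ppm > threshold:
        return "strand-like"   # downfield, positive difference
    if diff_ppm < -threshold:
        return "helix-like"    # upfield, negative difference
    return "undetermined"

# Hypothetical (observed, random coil) CαH shifts in ppm.
examples = [(4.90, 4.35), (4.00, 4.35), (4.40, 4.35)]
labels = [classify(shift_difference(obs, rc)) for obs, rc in examples]
print(labels)
```

In practice a single residue is rarely diagnostic; runs of several consecutive residues with the same sign are what support a helix or strand assignment.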
The presence of five β-strand regions in the secondary structure
of Ugi was determined primarily on the basis of the presence of strong
NOE connectivities between the CαH proton of residue i and the amide proton of residue i + 1, and by the
fact that the amide protons of residues in β-strands tend to
exchange more slowly with water. The β-strands are from residues
20-24, 41-48, 53-60, 69-74, and 79-84.
The characteristic NOE connectivities of the β-strand regions
suggested that they are all
anti-parallel (40, 41, 43). This information
is summarized in Fig. 4. The chemical shift differences of the
CαH protons of the β-strands are expected to be
positive relative to the random coil, unshifted positions.
The experimental results, summarized in Fig. 4, are in agreement
with this expectation. Indeed, the grouping of resonances by spin system type,
sequential assignments, NOE connectivities, amide exchange rates, and
differences in chemical shift relative to the random coil, unshifted
positions are all consistent with each other and with the secondary
structure assignments.
The determination of the arrangement of the
β-strands relative to one another was based on the information
present in NOE connectivities between CαH and
CαH protons on separate β-strands. These NOEs appear
at every other site along a β-strand, and are indicated in the
depiction of the β-strands in Fig. 5 and in the depiction of
the entire secondary structure in Fig. 6. These connections
allowed the arrangement of the five β-strands in a connected
anti-parallel manner that is shown in Fig. 5 and Fig. 6.
Figure 5: A depiction of the topology of the β-strands of Ugi. The protons connected by NOEs are indicated by the double-headed arrows.
Figure 6: A depiction of the overall secondary structure of Ugi. The β-strand regions are indicated by the large arrows and the α-helices by the helices.
The presence of five β-strands in Ugi is consistent with prior
data concerning the high stability of the
protein (30, 31, 32, 33). Other
proteins of similar size with five β-strands include ubiquitin (44) and the RAS binding domain of human RAF-1 (45).
Both of these proteins contain five anti-parallel β-strands and a
single helix, and they are very stable proteins, presumably because of
the high β-content of the structures (44, 45). In
ubiquitin and the RAS binding domain, the helix is between the first
and fourth β-strands, whereas in Ugi the intervening helix is
between the first and second β-strands. The somewhat unusual
feature of the β-strands in Ugi is that they are connected
sequentially.
The secondary structure has six loop or turn regions in which one half of the 18 acidic amino acid residues of the protein are found. The regions from residues 36-40, 49-52, and 61-68 contain six acidic groups, two each, whereas the region 75-78 contains one acidic group; the other regions constituting loops or turns are residues 15-19 and 25-26. The anomalous electrophoretic mobility of Ugi may arise, in part, from the presence of these acidic groups in regions of the protein not involved in the secondary structure.
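The loop and turn regions follow directly from the helix and strand boundaries given earlier. The sketch below takes those boundaries from the text (helices 5-14 and 27-35; strands 20-24, 41-48, 53-60, 69-74, and 79-84) and lists the residue ranges they leave uncovered. The function name gaps_between is illustrative, and treating the unassigned N-terminal residues 1-4 as a terminus rather than a loop is an assumption of the example.

```python
N_RESIDUES = 84
HELICES = [(5, 14), (27, 35)]
STRANDS = [(20, 24), (41, 48), (53, 60), (69, 74), (79, 84)]

def gaps_between(elements, n_residues):
    """Return inclusive residue ranges not covered by any element."""
    gaps = []
    next_free = 1
    for start, end in sorted(elements):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= n_residues:
        gaps.append((next_free, n_residues))
    return gaps

uncovered = gaps_between(HELICES + STRANDS, N_RESIDUES)

# Residues 1-4 were not assigned; they are excluded here as a terminus.
loops = [gap for gap in uncovered if gap[0] > 1]
print(loops)
```

The six ranges obtained this way are the regions between consecutive secondary structural elements, in agreement with the count of six loop or turn regions.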
The biological function of Ugi is to inhibit uracil-DNA glycosylase, a DNA repair enzyme that binds single- and double-stranded DNA, recognizes uracil residues, and then catalyzes the hydrolysis of the N-glycosylic bond between the uracil base and the deoxyribose-phosphate "backbone" of DNA. Ugi and Ung react to form a stable protein-protein complex that does not interact with DNA. Thus, Ung may recognize Ugi because the inhibitor protein mimics in some fashion the general properties of its DNA substrate. The secondary structure of Ugi can reveal which regions and residues may be critical to the interaction with Ung.
If Ugi is to
mimic DNA, then it should have a number of acidic groups which can
mimic the electrostatic potential of the phosphate groups in DNA. The
secondary structure has eight acidic groups that may be available for
this role. The regions 36-40, 49-52, and 61-68 are
prime candidates for this activity since these regions are not involved
in the secondary structure and contain six of the 18 acidic residues
present. While the tertiary structure of the β-strands is not yet
known, it may be the case that a barrel type structure is formed in
which these negative charges are distributed in a manner similar to the
arrangement of the negative charges of DNA, and thus may act as the
primary binding site of Ugi to uracil-DNA glycosylase. It is noted that
amino acids 49-52 contain the sequence Glu49-Ser50-Thr51-Asp52 and
that the serine and threonine residues may serve as hydrogen bond
donors in complex formation. It is also possible that residues
75-78 are involved in complex formation as this region contains
the sequence Ser75-Gln76-Gly77-Glu78.
The serine and glutamine residues offer hydrogen bonding opportunities
and the glutamic acid a negative charge.
The helix from residues 27 to 35 has a distinctly charged side and a hydrophobic side; however, the helix from 5 to 13 does not appear to have as significant a difference between its two sides. While either or both of these helices could be involved in complex formation, the secondary structure does not make a clear prediction as to what that role(s) might be.
The β-strand regions of proteins are typically not involved directly in
the activity of proteins. Hence, the β-strand regions of Ugi are
likely to be important structurally but are not expected to play
significant roles in the interaction with uracil-DNA glycosylase. Based
on the secondary structure of the β-strands shown in Fig. 5,
no specific pattern in the arrangement of the acidic or other types of
residues can be inferred.
Ugi contains only three aromatic residues,
and of these Tyr and Trp are not present in a
secondary structural feature. If Ugi binds to the uracil binding site
of uracil-DNA glycosylase, then it is possible that an aromatic residue
may occupy that site. The Tyr and Trp are the
only residues apparently available for this role.
The tertiary structure of Ugi in solution is now being determined in order to gain information concerning the arrangement of negative charges and other features of Ugi which may be important to the interaction with uracil-DNA glycosylase. Additional studies are planned which will examine the changes in the amide exchange rates and chemical shifts of the amide nitrogen and protons, as well as changes in the mobility of the amide nitrogens which may occur when Ugi is complexed with uracil-DNA glycosylase. These studies should offer information about the regions of Ugi which directly interact with uracil-DNA glycosylase. Preliminary results on the complex indicate that the structure of Ugi undergoes considerable change upon complex formation. This is consistent with kinetic results which indicated the presence of a slow, locking step in complex formation (36).