From the Department of Biochemistry, Queen's University, Kingston, Ontario K7L 3N6, Canada
ABSTRACT
Unlike mammals, birds, and most other fishes, winter flounder completes spermatogenesis without replacing its germ cell histones with protamines. Instead, during spermiogenesis, these fish produce a family of high molecular weight (80,000-200,000) basic nuclear proteins (HMrBNPs) that bind to sperm chromatin containing the normal complement of histones. These large, basic proteins are built up of tandem iterations of oligopeptide repeats that contain phosphorylatable DNA-binding motifs. Although the HMrBNPs have no obvious homology to histones, protamines, or other sperm-specific chromatin proteins, we report here the isolation of a clone (2B) from a winter flounder genomic DNA library that establishes a link between the HMrBNPs and histone H1. The 2B sequence contains an open reading frame, which, when conceptually translated, encodes a 265-residue protein. At its N terminus the translation product contains numerous simple repeats that match the oligopeptides contained within the HMrBNPs. Unexpectedly, the C terminus of the putative protein shows 66% identity and 76% conservation to the histone H1 globular domain. This connection suggests that the HMrBNPs may have originated from the extended N-terminal tail region of a testis-specific, H1-like linker histone.
INTRODUCTION
In almost all eukaryotic cells, histones have a fundamental role in organizing and condensing DNA (1, 2). It is therefore not surprising that the sequences of the core histones (H2A, H2B, H3, and H4) are extremely well conserved and contain many basic residues. The fifth histone (H1), which may or may not sit outside the nucleosomal core (3), is the longest, most variable, and most lysine-rich member of the histone family. The structure of histone H1 can be subdivided into three domains: a variable N-terminal region of 35-40 residues with a net positive charge followed by a well conserved globular domain of 80 residues (4), which is thought to interact with both core histones and nucleosomal DNA, and a very basic C-terminal tail of ~90 residues, 90% of which is lysine, alanine, and proline.
Most organisms possess more than one tissue- or stage-specific histone H1 variant (5, 6). For example, sperm-specific histone H1 variants (H1T) are commonly found in mammals (6), amphibians (7), and invertebrates (8, 9). H1Ts typically have a shorter C-terminal domain and tails that contain a higher proportion of positively charged residues (usually Arg) than their somatic counterparts, as well as a greater number of phosphorylation sites (6). However, the sperm-specific H1 of sea urchin (SpH1) is longer than its somatic counterpart at both ends due to N- and C-terminal extensions composed of tetrapeptide repeats (SPXB, where X is usually basic, and B is K or R) (10).
This trend to increased basicity and a higher arginine content in sperm-specific histones may facilitate condensation of the DNA into the sperm nucleus. In fact, the switch from somatic to sperm chromatin can be accomplished using a variety of proteins. One strategy used by some vertebrates and many invertebrates is to retain histones in a nucleosomal arrangement but to incorporate sperm-specific histone variants and/or other specialized basic proteins into the condensing chromatin (11, 12). In mammals, birds, and most fishes, DNA condensation is ultimately accomplished using protamines. These small arginine-rich proteins replace the histones and by doing so eradicate the nucleosomal organization established by the histones.
The winter flounder is one of the minority of bony fishes that retains its histones throughout spermatogenesis and does not replace them with protamines. Moreover, it does not synthesize significant quantities of sperm-specific histone variants (13). The winter flounder does, however, produce a group of high molecular weight basic nuclear proteins (HMrBNPs)1 in mid- to late spermatids. These unique proteins are retained in the mature sperm, where they comprise >25% of the total acid-soluble proteins. As judged by SDS-polyacrylamide gel electrophoresis, there are at least 15 HMrBNPs that range in apparent molecular weight from 80,000 to 150,000, with a major band at ~110,000 and trace quantities of larger proteins up to 200,000. Amino acid analysis revealed that this group of proteins is constructed primarily from four amino acids: Arg (24%), Ser (23%), Lys (15%), and Pro (14%). This composition reflects their underlying simple repetitive sequences of dodecapeptides, with the consensus sequence SPMRSRSPSRSK, and heptapeptides, with the sequence RRVXXPK (where XX is QT or PS) (14). The simple composition and intermediate basicity suggest that the HMrBNPs might best fit in the class of chromatin proteins intermediate between histones and protamines (15).
Although the extreme repetitiveness of the HMrBNPs has precluded us from directly sequencing the proteins and from isolating full-length cDNA and genomic clones, we have obtained partial nucleotide sequences including the proximal promoter, 5' and 3' UTRs and about 1.5 kb of the coding region (manuscript in preparation, see GenBankTM accession numbers U39735, U39845, and U39932). Using these tools, we have investigated the genomic structure of the HMrBNPs. Here we report the isolation of a winter flounder HMrBNP genomic clone, designated 2B. The ORF of this clone encodes a 30-kDa protein whose N-terminal sequence shows homology to the HMrBNPs. Most interestingly, the C-terminal region of the putative protein encodes a histone H1-like globular core domain. This finding sheds new light on the origin of the HMrBNPs and their function in the developing sperm of the winter flounder.
EXPERIMENTAL PROCEDURES
Isolation and Purification of Genomic Clones-- A partial Sau3A-digested winter flounder genomic DNA library was custom-made by Stratagene in λFixII. The amplified library was plated on Escherichia coli NM522, and plaques were transferred to nitrocellulose filters (Schleicher & Schuell). The membranes were screened with 32P-labeled HMrBNP cDNA clone 3'-5 (GenBankTM accession number U39735). Selected positive plaques were purified through two further rounds of screening.
Restriction Enzyme and Southern Hybridization Mapping of Genomic Clones-- DNA from phage that had been banded twice on CsCl was digested with restriction enzymes and electrophoresed through a 0.8% agarose gel. The DNA was stained with ethidium bromide and photographed. A restriction enzyme map was constructed based on single and double digests. To identify regions of homology to the HMrBNP cDNA, the restriction enzyme fragments were also analyzed by Southern blotting using 32P-labeled HMrBNP cDNA clone 3'-5 (GenBankTM accession number U39735). The membrane was hybridized overnight at 68 °C in 25 mM sodium phosphate (pH 7.2) containing 7% SDS. After hybridization, the membrane was washed twice for 20 min at 68 °C with 0.1% SDS containing 0.5× SSC and then autoradiographed (XAR, Kodak). Regions of hybridization were subcloned into pBluescript (Stratagene) and sequenced (Sequenase 2.0, U. S. Biochemical Corp.).
Isolation and Southern Blot Analysis of Genomic DNA-- Genomic DNA from an individual fish was isolated from frozen tissue as described (16). Genomic DNA (10 µg) and purified phage DNA (100 ng combined with 10 µg of calf thymus DNA) were digested with restriction enzymes, and the fragments were separated by agarose gel electrophoresis and transferred to a nylon membrane (Zeta-Probe GT, Bio-Rad). HMrBNP cDNA clone 3'-5 (GenBankTM accession number U39735) was radiolabeled and used to probe the bound DNA (see above).
Comparison of Genomic Clone 2B with HMrBNP Sequence-- Dot matrix comparisons were performed using the Caltech DNA sequence analysis program, version 2.4. The sequences (2B from this paper and HMrBNP compiled from GenBankTM accession numbers U39845 and U39932) were compared using a stringency of 16 matches in a window of 20 bp.
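The windowed comparison described above can be illustrated with a minimal sketch. This is not the Caltech program, whose internals are not described here; it is a plain sliding-window identity count written for illustration. The function names and the toy repeat unit are hypothetical and are not taken from the 2B or HMrBNP sequences.

```python
def dot_matrix(seq_a, seq_b, window=20, min_matches=16):
    """Return (i, j) start positions of windows with >= min_matches identities."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical positions between the two aligned windows.
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= min_matches:
                hits.append((i, j))
    return hits


def diagonals(hits):
    """Collapse hits to their diagonal offsets (j - i)."""
    return sorted({j - i for i, j in hits})


# Toy input, invented for illustration: three tandem copies of a 21-bp unit.
UNIT = "AGCCCCATGAGGAGCAGGAGC"
toy = UNIT * 3
toy_diagonals = diagonals(dot_matrix(toy, toy))
```

When a tandemly repeated sequence is compared in this way, hits fall on the main diagonal and on diagonals displaced by multiples of the repeat length, which is the pattern of parallel lines expected for repetitive sequences. The nested loops are quadratic in sequence length, which is acceptable for fragments of a few kilobases but not for genome-scale input.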
RESULTS
Isolation of Genomic Clones-- The HMrBNPs are a family of abundant proteins produced only in the testis during spermatogenesis. Their abundance was reflected in the frequency with which HMrBNP clones were isolated from a mid-spermatid stage testis cDNA library (17). However, numerous attempts to recover a full-length cDNA have been hampered by the extreme repetitiveness of the sequence. As a result, we have had to piece together information about these proteins and their sequence using a number of different strategies. In an attempt to obtain information about HMrBNP gene structure and regulation, we screened a partial Sau3A-digested winter flounder genomic DNA library using an incomplete cDNA clone. Approximately 200,000 plaque forming units (>3 genome equivalents) from the amplified library were screened using radiolabeled HMrBNP cDNA clone 3'-5. Over 50 hybridization signals of varying intensities were detected. 23 of the phage producing strong signals were plaque-purified, and restriction enzyme maps for 10 of these clones were constructed.
The clones were grouped into two main classes (Fig. 1) based on the similarity and overlap of their restriction maps. Phage in class A formed a group of five overlapping clones (2B, 2D, 2E, 4B, and 1D) that spanned more than 20 kb. Each clone possessed a discrete region that hybridized strongly to the HMrBNP cDNA probe. With the exception of clone 4B (which was truncated in the region of hybridization), the match to the HMrBNP cDNA probe was restricted to a 1.6-kb SacI/HindIII fragment. Class B comprised three clones (3B, 3C, and 3F) representing ~18 kb of genomic DNA. These phage also contained a single region of hybridization that was localized around a common SacI site. The remaining two clones (1C and 2A) appear to have unique restriction maps and showed hybridization only at one end of the phage insert.
The Region of Clone 2B That Hybridized to HMrBNP cDNA Contains a Putative Gene-- The 1.6-kb HindIII/SacI fragment (which hybridized to the HMrBNP cDNA) of genomic clone 2B and its flanking regions were subcloned and sequenced. The nucleotide sequence obtained contains a long ORF from bp 498-1324 (Fig. 2), which when conceptually translated (beginning at the first in-frame methionine codon) would encode a 265-amino acid-long, 30-kDa protein. This ORF is followed by two potential polyadenylation signals (AATAAA), 107 (bp 1431) and 225 bp (bp 1549) downstream from the predicted translation termination codon. The sequence reported here also extends over 500 bp on the 5'-side of the coding region.
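The coordinates quoted for the 2B ORF can be cross-checked with simple arithmetic. The sketch below assumes that bp 1324 is the last base of the termination codon; the position of the first in-frame methionine codon is not stated in the text and is inferred here from the reported protein length. The mass estimate uses the common rule of thumb of about 110 Da per residue, which is also an assumption.

```python
ORF_END = 1324                 # last base of the reported ORF
N_RESIDUES = 265               # reported length of the conceptual translation
POLYA_SIGNALS = (1431, 1549)   # reported positions of the AATAAA signals
SECOND_ATG = 596               # first base of the second in-frame ATG

# Assumption: bp 1324 is the final base of the termination codon.
polya_offsets = tuple(pos - ORF_END for pos in POLYA_SIGNALS)

# Work backwards from the stop codon to the inferred first base of the
# initiator codon (not stated in the text).
stop_codon_start = ORF_END - 2
inferred_met_start = stop_codon_start - 3 * N_RESIDUES


def codon_number(bp, first_base=inferred_met_start):
    """1-based codon index of a nucleotide position in the inferred frame."""
    return (bp - first_base) // 3 + 1


second_atg_in_frame = (SECOND_ATG - inferred_met_start) % 3 == 0

# Rough mass estimate, assuming an average residue mass of 110 Da.
approx_mass_kda = N_RESIDUES * 110 / 1000
```

Under these assumptions the two polyadenylation signals fall 107 and 225 bp beyond bp 1324, the second ATG is in frame with the inferred start, and the estimated mass is close to the 30 kDa quoted. Nucleotide 1075 maps to within one codon of codon 184.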
The 2B Genomic Sequence Is a HMrBNP Homolog-- Portions of the nucleotide sequence of 2B showed a high degree of identity to a representative HMrBNP gene sequence. Dot matrix analysis of the two sequences (Fig. 3) using a stringency of >80% identity (16 out of 20) showed extensive areas of homology. These regions included portions of the proximal promoter, the 5'-UTR, and the coding sequence. Curiously, the putative CCAAT and TATA boxes identified in the proximal promoter region of the HMrBNP sequence were not conserved in 2B. DNA coding for the predicted N-terminal region of 2B showed extensive identity to the 5'-UTR and coding region of the HMrBNP gene. Although the region in 2B (bp 557-626) matching the HMrBNP 5'-UTR has been translated in Fig. 2, it is not known at this time if this sequence is 5'-UTR or coding sequence because it is only after the second in-frame ATG in 2B (bp 596-598) that HMrBNP-like sequences begin (see below). Dot matrix analysis comparing the coding regions of 2B and HMrBNP showed a high degree of identity and tandem repetition between these two sequences as indicated by the multiple lines parallel to the diagonal (Fig. 3). These regions of identity occurred over discontinuous stretches of about 100 bp in length, suggesting a closely related but distinct protein sequence. The similarity to the HMrBNP sequence ended abruptly around nucleotide 1075 of clone 2B (codon 184).
Genomic Clone 2B Encodes a HMrBNP/Histone H1 Hybrid-- The predicted amino acid sequence of the N-terminal region of 2B (residues 14-188) had an amino acid composition similar to that of the HMrBNPs. The same four amino acids, Arg (21%), Ser (22%), Lys (17%), and Pro (11%), were by far the most abundant and together made up 71% of the composition. In addition, 2B contained numerous sequences (Figs. 2 and 4) similar to the heptapeptide and dodecapeptide repeats obtained by endoproteinase Lys-C digestion of the HMrBNPs (14). Interestingly, these repeats tend to occur in a defined order, interspersed with the sequences SPK and MRAKSPRRSK, such that they form a 32-amino acid sequence (Fig. 4). These larger repeats, with the consensus sequence KSPMRSRSPSRSKSPKRRVKTPKMRAKSPRRS, occurred four times (three linked in tandem) within the first 170 residues of the ORF and showed 71-90% identity to each other. Such repeats were also detected in the HMrBNP coding region (GenBankTM accession number U39735). The three HMrBNP repeats selected for comparison (x, y, and z) showed 80-90% similarity to the repeats in 2B. In addition, the predicted amino acid sequence of 2B contained four tandemly arrayed 7-amino acid repeats between residues 156-183 that overlap with the most C-terminal 32-amino acid repeat. These repeats (consensus sequence SPKMRAK) were distinct from both the heptapeptide repeats determined by peptide sequencing (RRVQTPK) and those present in the 32-amino acid repeats (RRVKTPK). They are more like the sequences that separate the dodecapeptide and heptapeptide repeats in the 32-amino acid repeats (SPK and MRAK). Therefore, residues 28-183 of clone 2B closely resemble the abundant HMrBNPs but with subtle differences in the repeat pattern.
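The organization of the 32-amino acid repeat can be made explicit with a short sketch that splits the consensus sequence into the dodecapeptide, the heptapeptide, and the sequences that separate them. Only the consensus sequences come from the text; the function names are hypothetical, and the pattern `RRV..PK` is a deliberately loose stand-in for the RRVXXPK heptapeptide (it accepts any two residues at the XX positions).

```python
import re
from collections import Counter

CONSENSUS_32 = "KSPMRSRSPSRSKSPKRRVKTPKMRAKSPRRS"  # 32-residue consensus
DODECA = "SPMRSRSPSRSK"                            # dodecapeptide consensus
HEPTA_PATTERN = re.compile(r"RRV..PK")             # loose RRVXXPK pattern


def dissect(repeat):
    """Split a repeat into dodecapeptide, linker, heptapeptide, and tail."""
    d = repeat.find(DODECA)
    h = HEPTA_PATTERN.search(repeat)
    if d < 0 or h is None:
        return None
    return {
        "dodecapeptide": repeat[d:d + len(DODECA)],
        "linker": repeat[d + len(DODECA):h.start()],
        "heptapeptide": h.group(),
        "tail": repeat[h.end():],
    }


def composition(seq):
    """Percentage of each residue in seq, most abundant first."""
    return {aa: 100.0 * n / len(seq) for aa, n in Counter(seq).most_common()}


def percent_identity(a, b):
    """Percent identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


parts = dissect(CONSENSUS_32)
consensus_composition = composition(CONSENSUS_32)
```

Applied to the consensus, the dissection places SPK between the dodecapeptide and the heptapeptide and leaves MRAKSPRRS after the heptapeptide. The same `percent_identity` helper could be used to compare individual repeats once aligned, although the identities reported above were derived from the actual 2B and HMrBNP sequences, not from the consensus.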
The Isolated Genomic Clones Are Not the Product of in Vitro Recombination-- Because the λFixII library was propagated in a Rec+ host, it was possible that the clones obtained from this library had been subject to internal recombination/deletion. To address this question, representative clones 1D (the longest of the class A clones) and 3B (a class B clone) were examined by Southern blot hybridization (Fig. 6). The hybridizing fragments from the clones (digested with three different restriction enzymes) all aligned perfectly with bands in the genomic DNA lanes. Clone 1D gave rise to single bands of hybridization at 10, 3, and 10 kb with EcoRI, HindIII, and SacI, respectively, whereas clone 3B produced 6-, 6-, and 5-kb bands of hybridization after digestion with the same three enzymes. This suggested that the clones were not the result of internal recombination but were continuous genomic fragments. However, as the corresponding genomic bands were far less intense on the autoradiograph than other HMrBNP gene signals, they are presumed to encode minor variants of the HMrBNP gene family or possibly pseudogenes.
DISCUSSION
The main purpose of spermiogenesis is to produce a streamlined, motile cell that can efficiently transfer the male's genetic material to the egg. One of the requirements of this streamlining is the condensation of the nucleus. Different organisms have used various approaches to meet this challenge but have generally accomplished it by increasing the positive charges present on the sperm chromatin proteins, either through the introduction of sperm-specific histone variants or the replacement of histones by protamines (12). Among fish, both approaches are used. For example, rainbow trout (19) and yellow perch (20) replace histones with protamines, whereas grass carp retain their histones but produce a number of sperm-specific variants (21).
Prior to this study, it was thought that winter flounder used an extreme variation of the second approach. In addition to retaining their histones, they synthesize a novel group of sperm chromatin proteins, the HMrBNPs, that appear to be involved in binding and condensing DNA and are thus functionally related to protamines and linker histones (13, 22). However, characterization of the HMrBNPs in isolation failed to identify a link to either group or indeed to any other proteins. Their amino acid composition is intermediate between those of histones and protamines, but their size and amino acid sequence resemble neither. Recent data base searches of GenBankTM and SWISS-PROT with the HMrBNP peptide repeats and partial nucleotide sequences did not reveal any similar proteins.
The isolation and sequencing of genomic clone 2B has at long last shed light on the origin of the HMrBNPs. The HMrBNPs are clearly related to linker histones because the putative coding region of 2B is homologous to the HMrBNPs at its N terminus and to the globular region of histone H1 at its C terminus. Also, the 5'-flanking region of the 2B ORF has very high sequence identity to the proximal promoter of the HMrBNP genes.
It is important to emphasize that this linkage is not a recombination artifact. When the λFixII genomic library was screened with HMrBNP cDNA, many hybridization signals were detected. Restriction enzyme analysis of 10 of these clones picked at random did not identify a full-length HMrBNP gene, despite their strong hybridization signal on phage DNA blots (see Fig. 5). Although one of the 10 clones isolated (1C) contained 1.5 kb from the 5'-end of a HMrBNP gene, this identity occurred at one end of the phage insert where the bulk of the repeat had been removed by the original Sau3A digestion (data not shown). The failure to isolate a full-length HMrBNP gene is entirely consistent with the extraordinary difficulties experienced in trying to clone these highly repetitive sequences at the cDNA level. However, because the host cells used to screen the genomic library (E. coli NM522) were Rec+, the possibility that the isolated clones had undergone recombination/deletion was investigated. Southern blot analysis of these clones alongside winter flounder genomic DNA indicates that they contain bona fide DNA fragments that have not been rearranged. It also suggests that gene 2B represents a minor constituent, perhaps a single copy gene, among the HMrBNP multi-gene family. Indeed, it is notable that the intransigence of the HMrBNP genes to cloning facilitated the isolation of clone 2B.
It is quite likely that the progenitor of 2B and the HMrBNPs was a testis-specific variant of histone H1, which would account for the tissue and stage specificity of HMrBNP expression. Some testis-specific histones are notably more basic (and arginine-rich) than their somatic counterparts and have N- or C-terminal regions that contain short simple repeats of the SPKK motif (6, 10, 23, 24) that are so abundant in the HMrBNPs. Amplification of similar repeats could have produced an extreme H1 variant like 2B to assist in regulated and reversible sperm chromatin condensation. Gene duplication, rearrangement, and loss of the globular H1-like domain in the HMrBNP progenitor have apparently given rise to this unique auxiliary sperm chromatin protein. Moreover, the demand for this protein seems to have led to the extensive expansion and amplification of its gene to the point where there are now at the very least 15 HMrBNP isoforms (13).
This recognition of HMrBNP origins is particularly interesting in relation to recent work on the "protamine-like" sperm proteins of the mollusk, Mytilus californianus. Carlos et al. (8, 25) have demonstrated that two of the three major protamine-like proteins in this invertebrate are in fact post-translational cleavage products of an H1-like protein. Specifically, PL-IV was the C-terminal peptide of this protein, and PL-II* (or 2B) encompassed the N-terminal peptide linked to an 84-amino acid trypsin-resistant globular region that shows ~40% similarity to the globular core of many histone H1s (11). The third major protamine-like protein (PL-III) was homologous to the N-terminal region of PL-II* (26). PL-II*, PL-III, and related proteins are enriched in SPKK motifs and also contain stretches of alternating S(R/K) (27), both of which are prevalent and phosphorylatable in the flounder HMrBNPs (28). In these two instances, nonclassical, protamine-like sperm proteins have turned out to be histone H1 derivatives. Because fishes that do not produce protamines often have additional quantities of linker histones (29), a common theme is beginning to emerge, that of extra linker histones and their derivatives compensating for the lack of protamine in the developing sperm nucleus. Indeed, it is quite likely that some of the additional sperm-specific proteins and other protamine-like proteins in the recent classification of Saperas et al. (29) will turn out to be H1 derivatives.
It is beginning to appear that the distinction among protamines, protamine-like proteins, and linker histones may simply be their stage of evolution. The idea that sperm basic chromatin proteins originated from histones was originally proposed by Subirana et al. (30). This hypothesis has since been refined by Ausio and co-workers to suggest that such proteins arose from a primitive histone H1, and it is supported by their extensive biochemical analysis of sperm nuclear proteins from a wide variety of lower eukaryotes (31-33). The isolation and characterization of winter flounder clone 2B provides evidence in vertebrates that specialized sperm chromatin proteins have evolved from the N-terminal tail of a progenitor linker histone.
ACKNOWLEDGEMENTS
We thank Dr. Garth Fletcher and co-workers for supplying tissue samples and Sherry Gauthier for expert technical assistance.
FOOTNOTES
* This work was supported by a grant from the Medical Research Council of Canada and by an NSERC studentship and Queen's University Dean's award (to C. E. W.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Peter L. Davies would like to dedicate this paper to his former mentor, Dr. Gordon H. Dixon.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBankTM/EMBL Data Bank with accession number(s) U45877.
Present address: Div. of Endocrinology, Diabetes & Metabolic Disease, Thomas Jefferson University, 1020 Locust St., Suite 348, Philadelphia, PA 19107-6799.
§ To whom correspondence should be addressed. Tel.: 613-545-2983; Fax: 613-545-2497; E-mail: DaviesP{at}post.queensu.ca.
1 The abbreviations used are: HMrBNP, high molecular weight basic nuclear protein; ORF, open reading frame; UTR, untranslated region; kb, kilobase pair(s); bp, base pair(s).
REFERENCES