(Received for publication, July 12, 1994; and in revised form, November 4, 1994)
Miniparamyosin, a distinct Drosophila melanogaster paramyosin isoform of 60 kDa, is shown here to be encoded by the same gene as paramyosin. The gene, located at 66D14, spans over 12.8 kilobases (kb) and is organized into 10 exons, 9 of which code for the paramyosin transcripts. An exon located between exons 7 and 8 codes for the 5'-end of miniparamyosin, and the two proteins share the last two exons of the gene. Mapping of the 5'-ends of these transcripts indicates that the paramyosin and miniparamyosin mRNAs arise from two overlapping transcriptional units; the miniparamyosin transcription initiation site is located inside a paramyosin intron, 8 kb downstream of the one used for paramyosin transcription. The existence of two different promoters and the conserved and nonconserved features of their sequences suggest a very complex regulation of these two muscle proteins. In fact, while paramyosin is expressed at two distinct stages of development, like most other Drosophila muscle proteins, miniparamyosin appears late in development, being present only in the adult musculature. The absence of exon 1B, the specific exon of miniparamyosin, in the nematode Caenorhabditis elegans, as well as additional lines of evidence, supports the lack of miniparamyosin in this particular organism. However, it is present in most invertebrate species examined, including different arthropod, annelid, mollusc, and echinoderm species.
The regulated expression of the muscle proteins is the main characteristic of myogenesis. Differentiation of a specific muscle type is achieved by a combination of mechanisms including the expression of distinct muscle proteins (1) and the expression of type-specific isoforms of the muscle proteins produced by differentially spliced transcripts or posttranscriptional modification (2, 3). In Drosophila melanogaster, the majority of the muscle proteins seem to be encoded by a single gene, and the generation of the diversity is mainly produced by alternative splicing (4). This diversity is often increased by the choice of alternative polyadenylation sites or posttranslational modifications (4, 5, 6, 7). The occurrence of muscle type- and stage-specific isoforms has been maintained throughout evolution, suggesting specific roles for the different isoforms (3, 8). However, the functional significance of expressing specific isoforms in different muscles remains an unsolved issue.
The identification of the different
components of Drosophila thick filaments is a necessary step
in exploiting the advantages of Drosophila for studies of
muscle structure and function(4, 6, 9) . Our
laboratory has been involved in the last few years in the study of the
thick filament organization in Drosophila, characterizing some
of the regulatory properties of muscle proteins (10, 11, 12) . The understanding of how
assembly of thick filaments in the muscle fibers takes place, the
molecular mechanisms involved, and the specific roles of each thick
filament component remain to be clarified. Although myosin is the major
component of the thick filament, paramyosin and a few additional minor
proteins have also been biochemically identified as components of the
thick filament core in invertebrates(13, 14) .
Paramyosin, a coiled-coil α-helical fibrillar dimer, is a
structural component of invertebrate thick filaments (13, 14, 15, 16, 17, 18, 19, 20) .
There is no vertebrate homolog of
paramyosin(15, 16, 17, 18, 19, 20) .
In Caenorhabditis, paramyosin is required for proper assembly
of the body wall musculature(19) . Mutant analysis indicates
that interaction of myosin with paramyosin and maintenance of the
proper stoichiometry of both proteins are also necessary for assembly
and determination of the thick filament
length(21, 22, 23, 24) . In insects,
paramyosin has been studied mainly in flight
muscles(15, 16, 25) . The flight musculature,
a distinct insect muscle type, has been one of the experimental systems
where, at least historically, the structure and organization of the
contractile tissues have been analyzed (26). D. melanogaster paramyosin was identified and cloned in our
laboratory (10, 11), where it was shown to have a
sequence and molecular weight similar to those of other
invertebrate paramyosins. Drosophila paramyosin was especially
abundant in non-fibrillar musculature and relatively less abundant in
the fibrillar indirect flight muscles(11) . Several isoforms of
approximately the same molecular mass of about 107 kDa were identified
in Drosophila(11) . More recently, through the
identification of a cDNA, miniparamyosin, a distinct paramyosin isoform
of lower molecular weight was described in Drosophila.
Miniparamyosin, an isoform not previously identified in any
invertebrate system, was described as exclusively present in certain
types of muscles in the adult fly(27) .
The role of the different forms of paramyosin in regulating the properties and complex phenomenology of the thick filaments in invertebrate muscles made them interesting to study in more detail. In this article, we present the genomic clones that code for the complete sequence of paramyosin and miniparamyosin together with their intron/exon structure. The evidence supports the idea that all isoforms are encoded in D. melanogaster by a single gene. Nevertheless, paramyosin and miniparamyosin are regulated from different promoter sequences located several kilobases apart. Their transcription initiation sites and the nucleotide sequences of the promoters have been determined. The analysis of these sequences has shown the existence of several putative regulatory sites, some of which are consensus binding sites for muscle-specific transcription factors. Furthermore, by producing antibodies specific for the paramyosin and miniparamyosin isoforms, we have been able to show that miniparamyosin is widely distributed among phylogenetically distant invertebrates.
Figure 1: Genomic structure and organization of the paramyosin/miniparamyosin gene in the 66D14 locus. a, in the upper part the three analyzed genomic clones and the restriction map of the clones (H, HindIII; E, EcoRI) are shown. Black squares indicate cDNA probes (K, X300, and HP) used and their positions with respect to the genomic clones. In the lower part a schematic representation of the paramyosin/miniparamyosin gene is shown with the numbered exons in black boxes. A portion of this organization appears in a figure of a recent review(6) . In the lower part the complete restriction map of the DNA region is presented (B, BamHI; H, HindIII; E, EcoRI; X, XhoI; Xb, XbaI; Hc, HincII; P, PstI). b, the nucleotide sequences of the intron-exon boundaries of the genomic clones. The donor and acceptor splice sites correlate with the splice junction consensus for Drosophila(33) . Numbers in brackets specify the nucleotides in each exon. The approximate size of the introns is at the right of the figure in bp or kb (indicated). The positions of the boundaries are indicated by the number(s) appearing above each sequence. Numbering starts at the initial nucleotide of the complete paramyosin and miniparamyosin (italics) cDNAs.
Figure 2: Identification of the transcription initiation sites of the paramyosin and miniparamyosin mRNAs. a, primer extension analysis of the paramyosin and miniparamyosin transcription units (see "Materials and Methods"). D. melanogaster late pupae total RNAs were annealed with oligonucleotides complementary to paramyosin (PM, upper panel) and miniparamyosin (mPM, lower panel) and extended with reverse transcriptase. The same primers were used with the appropriate genomic clones to generate the dideoxy sequencing ladders shown in the insets. The transcription initiation sites are indicated by arrows. Numbering starts at nucleotide +1 corresponding, respectively, to the major initiation sites of the paramyosin and miniparamyosin transcripts as detected in these assays. b, nuclease S1 protection assay with genomic fragments amplified by PCR (see "Materials and Methods"). Two different fragments corresponding to the 5'-regions of exon 1A (for paramyosin) and exon 1B (for miniparamyosin) were used. c, the sequences of the genomic regions corresponding to the paramyosin (PM) and miniparamyosin (mPM) transcription initiation sites are presented. Arrows indicate the transcription initiation sites. The starting nucleotides of the published paramyosin and miniparamyosin cDNAs (10, 27) are indicated by bent arrows. As can be seen, 45 nucleotides were missing in the miniparamyosin cDNA.
In an effort to identify conserved sequences that may also serve a regulatory function, we have cloned and sequenced the 5`-flanking regions of the paramyosin and miniparamyosin transcriptional initiation sites in D. melanogaster and Drosophila virilis. These two Drosophilidae species diverged more than 50 million years ago and therefore are useful to detect evolutionarily conserved regulatory features. In Fig. 3, the nucleotide sequences of the regions extending about 350 and 450 nucleotides for paramyosin and miniparamyosin, respectively, upstream of the main transcription initiation sites are presented. The nucleotide sequences reveal a number of potential regulatory elements (boxed or underlined in Fig. 3). The comparison with the homologous D. virilis sequences allows the preliminary identification of the functionally significant features in the sequence (boxed in Fig. 3). The pentanucleotide TCAGT, a consensus sequence for the eucaryotic promoter cap site (34) and for the regulatory initiator element(35) , is present in the sequenced paramyosin and miniparamyosin transcriptional initiation sites. It is highly conserved between the two Drosophilidae and provides support for the likelihood of a functional role of these motifs. Furthermore, paramyosin and miniparamyosin proximal promoters in D. melanogaster show AT-rich regions between -25 and -40 from the main transcriptional initiation sites. Interestingly, the comparison with the D. virilis sequences suggests distinct roles for these putative TATA boxes. Whereas the AT-rich region of the paramyosin promoter is exactly conserved in D. virilis, validating the likelihood of it serving as a functional element, the corresponding region of the miniparamyosin promoter is not conserved in D. virilis. In addition, the CG and CCAAT elements, found to be critical in numerous eucaryotic promoters, were also detected. 
The paramyosin proximal promoter has a putative CCAAT element at position -45 that could be recognized by CCAAT transcriptional factors (36) and is conserved in the two Drosophilidae species. An equivalent CCAAT element is not found in the miniparamyosin promoter. Instead, several GC boxes that could bind the general transcriptional factor Sp1 (37) are present at positions -45 and -52 in both Drosophilidae miniparamyosin promoters. Even though two GC boxes at -121 and -136 are present in the paramyosin promoter, they are not conserved in the D. virilis sequence, which suggests they have no functional role. Similarly, the GC element present at -113 in the miniparamyosin promoter is also absent in the D. virilis sequence. Therefore, in spite of being present in both D. melanogaster promoters, it is doubtful that these GC boxes contribute to the regulation of these proteins.
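The conservation argument used above (an element present in both species is more likely to be functional) can be illustrated with a minimal Python sketch. It is not the procedure used in this work, which relied on direct comparison of the homologous sequences. The function names, the position tolerance, and the two toy sequences are hypothetical; `GGGCGG` is the commonly cited core of the GC box and is used here only as an assumed pattern. Requiring a match at a similar distance from +1 is a crude stand-in for a proper sequence alignment.

```python
import re

def motif_positions(upstream, pattern):
    """Start positions of each match, relative to +1.

    `upstream` ends immediately 5' of the +1 nucleotide, so its last
    base is position -1 and its first base is -len(upstream).
    """
    seq = upstream.upper()
    # A lookahead reports overlapping occurrences as well.
    return [m.start() - len(seq) for m in re.finditer(f"(?={pattern})", seq)]

def classify_conservation(seq_a, seq_b, pattern, tolerance=10):
    """Map each hit in seq_a to True if seq_b has a hit within `tolerance` nt.

    Assumption: similar distance from +1 approximates homology; a real
    analysis would align the two promoters first.
    """
    hits_b = motif_positions(seq_b, pattern)
    return {
        pos: any(abs(pos - other) <= tolerance for other in hits_b)
        for pos in motif_positions(seq_a, pattern)
    }

# Hypothetical toy sequences, NOT taken from either Drosophila promoter.
species_a = "GGGCGGAAAAACCAATAAAA"
species_b = "TTTTTTAAAAACCAATAAAT"

print(classify_conservation(species_a, species_b, "CCAAT"))   # CCAAT box
print(classify_conservation(species_a, species_b, "GGGCGG"))  # GC box core
```

In this toy case the CCAAT element is found at the same position in both sequences and is labeled conserved, whereas the GC box occurs in only one of them and is not.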
Figure 3: Comparison of the paramyosin and miniparamyosin promoters of D. melanogaster and D. virilis. The upper panel (a) illustrates the regions extending 340 and 357 nucleotides upstream of the transcriptional initiation sites of the D. melanogaster and D. virilis paramyosin promoters, respectively. The lower panel (b) illustrates regions extending 440 and 453 nucleotides upstream of the initiation starts of the D. melanogaster and D. virilis miniparamyosin promoters, respectively. The initiation starts are indicated by black circles, and +1 corresponds to the main ones. Putative regulatory elements are boxed when conserved and underlined when unconserved in the corresponding D. virilis sequence. Underneath are the names of the corresponding putative DNA-binding elements, namely, E-box, the consensus binding site for the MyoD family; CCAATT, the consensus sequence for the CCAATT binding elements; A-T, the A-T-rich regions; and G-C, the putative GC boxes.
The possibility of identifying cis-acting DNA elements responsible for muscle-specific transcription of the paramyosin/miniparamyosin gene was also examined. The analysis of the currently available sequence in the two Drosophilidae species reveals the conservation of several E-boxes in both promoters. The sequence CANNTG, or E-box, has been proposed as the consensus binding site of muscle-specific transcription factors of the MyoD family(38) . The paramyosin promoter has only one conserved E-box at -263, and the miniparamyosin sequence has two E-boxes, at -59 and -152, all numbered from the transcription initiation sites. On the other hand, two E-boxes at -61 and at -296 in the D. melanogaster paramyosin promoter are not conserved in D. virilis; neither is the E-box at -417 in the miniparamyosin promoter.
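As an illustration of how a degenerate consensus such as CANNTG can be located in a promoter sequence, the following minimal Python sketch scans a stretch of 5'-flanking sequence with regular expressions and reports each hit relative to the initiation site. It is a sketch under stated assumptions, not the method used in this work: the function name, the convention of reporting the position of the first base of the motif, and the toy sequence are all hypothetical. Because CANNTG is its own reverse complement as a pattern, scanning one strand is sufficient for the E-box.

```python
import re

# Consensus elements named in the text, written as regular expressions.
MOTIFS = {
    "E-box": "CA[ACGT]{2}TG",  # CANNTG, N = any nucleotide
    "CCAAT": "CCAAT",
}

def scan_motifs(upstream, motifs=MOTIFS):
    """Return (name, position, matched site) for every motif occurrence.

    `upstream` is the sequence ending immediately 5' of the +1 nucleotide,
    so its last base is position -1 and its first base is -len(upstream).
    The reported position is that of the first base of the match
    (an assumed convention).
    """
    seq = upstream.upper()
    hits = []
    for name, pattern in motifs.items():
        # Lookahead with a capture group so overlapping sites are all found.
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, m.start() - len(seq), m.group(1)))
    return sorted(hits, key=lambda h: h[1])

# Hypothetical toy sequence, NOT taken from the paramyosin/miniparamyosin gene.
toy_upstream = "GGCAGCTGAACCAATGGTT"
for name, pos, site in scan_motifs(toy_upstream):
    print(name, pos, site)
```

A scan of this kind only lists candidate sites; as the comparison with D. virilis shows, many matches to a short consensus are not conserved and may have no regulatory role.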
Figure 4: Exon 1B of the D. melanogaster miniparamyosin is not included in the C. elegans paramyosin gene. A comparison of D. melanogaster and C. elegans (39) genomic organizations is shown. Boxes represent exons, and lines represent introns. Interrupted lines mark the unidentified C. elegans 5'-genomic region (according to data bank information). Genomic regions coding for homologous peptides are indicated by lines joining the two genomic regions.
To explore this alternative, we have used polyclonal antibodies against specific regions of the paramyosin/miniparamyosin gene (see "Materials and Methods"). The presence of a similar cross-reactive protein was investigated by immunoblotting analysis of extracts from several representative members of different insect groups (namely, Diptera (D. melanogaster, D. virilis, and Calliphora), Hymenoptera (Formica), Coleoptera (Coccinella), Orthoptera (Locusta and Blatta), and a member of the Apterygota, the silverfish (Lepisma)) and from representative members of other arthropod classes (Arachnida (Araneus), Crustacea (Astacus), and Chilopoda (Scolopendra)). In parallel, muscle extracts from additional representative protostome invertebrates were also processed and tested (namely, Mollusca (Mytilus and Helix), Annelida (Lumbricus), and Nematoda (Caenorhabditis)). In addition, muscles from a deuterostome invertebrate, the sea urchin Sphaerechinus, and a mammal, Rattus, are included. In Fig. 5, Western blots made with three different antibodies (namely, anti-miniparamyosin (anti-exon 1B), anti-paramyosin (an antibody made against the purified protein (11)), and anti-exon 5 antiserum (specific to paramyosin in D. melanogaster)) are presented (see "Materials and Methods"). The miniparamyosin antiserum recognizes a protein of lower molecular mass than paramyosin (50-80 kDa) in all preparations except those from Caenorhabditis and Rattus. Interestingly, the miniparamyosin polyclonal antibody also cross-reacts with paramyosin in all the analyzed species except in Diptera (the two Drosophila species and Calliphora) and Caenorhabditis (Fig. 5b). In Fig. 5c, the anti-paramyosin antibody made against the purified protein (11) recognizes paramyosin in all the invertebrate species. The reaction in the case of Caenorhabditis is very weak and is not visible in the figure. The absence of paramyosin in mammalian muscles is well known. In Fig. 5d, the anti-exon 5 antiserum recognizes paramyosin in all invertebrate species tested, except in Coccinella, where it recognizes myosin instead. It is worth noting that in more evolutionarily distant invertebrates, the anti-exon 5 antiserum increasingly recognizes myosin in addition to paramyosin.
Figure 5: Differential expression of miniparamyosin in species representative of a wide range of protostome phyla. Whole extracts were prepared from representative members of the following insect groups: Diptera, flies (D. melanogaster and D. virilis) and blowflies (Calliphora); Hymenoptera, ants (Formica); Coleoptera, ladybugs (Coccinella); Orthoptera, locusts (Locusta) and cockroaches (Blatta); and Apterygota, silverfish (Lepisma). In parallel, muscle extracts from other arthropod classes (Chilopoda, centipede (Scolopendra); Arachnida, spider (Araneus); and Crustacea, crayfish (Astacus)) as well as from other representative protostome invertebrate phyla (namely, Annelida, earthworm (Lumbricus); and Mollusca, snail (Helix) and mussel (Mytilus)) were tested. About 20 µg of protein from each sample was run in 10% SDS-polyacrylamide gels (a), and Western blots were reacted with the following antibodies: D. melanogaster anti-miniparamyosin (b), anti-paramyosin (c), and anti-exon 5 of paramyosin (d). On the right of the figure, the molecular weights of the markers and the positions of myosin (M), paramyosin (PM), and miniparamyosin (mPM) are indicated.
By using a probe against its 3'-end, it was previously shown that D. melanogaster paramyosin is encoded by a single gene (10, 27). Since this region is shared with miniparamyosin, it was thought that both proteins were encoded by the same gene. Our studies on the genomic organization of the paramyosin/miniparamyosin gene support this conclusion. The gene located at 66D14 (10) spans 12.8 kb and is organized into at least 10 exons and 9 introns. The available nucleotide sequence of the genomic DNA has allowed the identification of the exact positions of the splice junctions. All exons identified could be ascribed to the paramyosin sequence except for exon 1B (located between exons 7 and 8 of paramyosin). This exon, together with exons 8 and 9, codes for miniparamyosin.
The comparative analysis of the paramyosin and miniparamyosin promoters in D. melanogaster and D. virilis suggests interesting differences. Only the paramyosin promoter shows a putative TATA box, the conserved AT-rich region, but both promoters show the same initiator/cap site sequence. The main initiation start sites for miniparamyosin and paramyosin mRNAs show an identity of 8-11 nucleotides and share the initiator consensus sequence, TCAGT (34, 40). This initiator and related sequences are present in several Drosophila muscle genes studied to date (41, 42, 43, 44). Recent work has shown that in genes lacking TATA boxes the initiator element plays a role functionally analogous to the TATA element. Through its interaction with the TFIID complex, it is capable of directing basal transcription by RNA polymerase II and of determining the precise site of transcription initiation (35, 45). In promoters that contain a TATA box, the initiator element greatly enhances promoter strength (45). Therefore, the initiator elements present in the two promoters driving the expression of paramyosin and miniparamyosin may play different roles. The decision as to whether the conservation of the AT-rich region at -39 in the paramyosin promoter is attributable to its serving as a putative TATA box or as an alternative transcription initiation site (see above) must wait until further experimental evidence is obtained. In addition, the paramyosin promoter has a conserved CCAAT box within an appropriate distance of the transcriptional start. The CCAAT element can be recognized by different transcriptional factors in different cells; in some cases, they produce transcription stimulation, whereas in others, they result in repression (46). The role of this element in this promoter may be interesting to study, since most D. melanogaster muscle promoters do not contain CCAAT boxes in spite of the fact that they drive patterns of expression very similar to that of the paramyosin promoter. With regard to the differences between these two promoters, an additional evolutionarily conserved element, the GC motif, is found in the miniparamyosin promoter of both Drosophilidae but not in the D. virilis paramyosin promoter. Complete clarification of the relative importance of the different features of these initiation sequences in different types of muscle and developmental stages must await the results of in vivo and in vitro expression studies. Nevertheless, the results presented in this and earlier work (10, 27) fully support the concept that the D. melanogaster paramyosin/miniparamyosin gene is expressed from two overlapping transcriptional units that encode two proteins, paramyosin and miniparamyosin, using two promoters and a combination of different origins of transcription and alternative polyadenylation sites.
A
number of eucaryotic genes have now been characterized containing two
promoters (for example, see (47) ). Among D. melanogaster muscle genes, a similar situation to that of the
paramyosin/miniparamyosin gene has been detected in the tropomyosin II
gene (42). This gene encodes both a muscle-specific isoform
and a cytoplasmic isoform. The first exon of the cytoplasmic form is
located between exons 3 and 4 of the muscle-expressed isoform. Both
genes have an internal promoter that drives the expression of a smaller
protein with a different expression pattern; miniparamyosin is only
expressed in the adult musculature, whereas the smaller form of
tropomyosin II is a cytoplasmic isoform. In contrast with the
tropomyosin II gene, both paramyosin and miniparamyosin promoters are
muscle-specific, whereas the promoter used by the cytoplasmic
tropomyosin II isoform is a more generalized housekeeping type
promoter. In addition, the paramyosin/miniparamyosin gene promoters are
regulated differentially. The paramyosin promoter is active throughout
development in developing muscle cells of the embryo, larva, and adult.
This pattern is equivalent to that of most muscle proteins in D.
melanogaster including myosin, the main component of the thick
filament(6) . In contrast, the miniparamyosin promoter is
active mainly in the adult musculature. The protein has been detected
transiently in third instar larvae, becomes fully expressed in adults,
and maintains its expression in older adults. It is interesting to
point out that the expression of the two proteins is switched on both
at the mRNA and polypeptide levels (27, 11, this paper). Different
myogenic programs are known to exist, controlling the expression of
muscle-specific genes. The activation of most of the skeletal
muscle-specific genes characterized to date requires a functional
binding site for one of the known HLH myogenic factors. The
sequence CANNTG, or E-box, constitutes an integral component of the
positive regulatory elements, mainly enhancers, of muscle-specific
genes. Since not all the muscle types express MyoD, it is likely that
additional factors, like MEF-2, M-CAT, or CarG factors, are also
involved in the differential muscle
regulation(48, 49, 50) . Moreover, recent
data suggest that the interaction of MyoD and MEF-2 plays a central
role in the regulation of the muscle expression. In Drosophila, MyoD and MEF-2 homologs have been
isolated(51, 52, 53) . The study of the
cis-elements involved in the regulation of the expression of two muscle
proteins encoded by the same gene, such as paramyosin and
miniparamyosin, could provide an interesting paradigm of the myogenic
programs such as those that control the expression level of different
proteins in different types of Drosophila muscles at distinct
stages of development.
The organization of the C. elegans gene, the only well characterized paramyosin gene, was compared
with that of the D. melanogaster gene. In C. elegans,
the gene is organized into 11 exons, which encode a protein with
high homology to the D. melanogaster paramyosin(10) .
When the respective splice junctions were analyzed, a greater degree of
conservation in the splicing sites was found in the amino-terminal
region of the genes. In the rest of the sequence, the arrangement of
exons differs (Fig. 4). In particular, exon 5 in D.
melanogaster (the one that was chosen to raise the specific
paramyosin antibody) has its homologous peptide encoded by exons 7, 8,
and part of 9 of C. elegans. Furthermore, the sequence of exon
1B, the specific exon of the miniparamyosin, was not found in any known
region (exons or introns) of the C. elegans paramyosin gene.
No space exists for an exon 1B homolog in C. elegans. The
exons that flank 1B, exons 7 and 8, as well as part of exon 5 and exon
6 in D. melanogaster, correspond in C. elegans to a
single exon (exon 9). This lack of conservation of the exon/intron
organization of paramyosin contrasts with the situation in many of the
muscle genes, for example in the tropomyosin II gene, in which the exon
organization has been conserved in all species examined so far
including vertebrates and invertebrates(42) . The high
variation in the genomic organization of this gene is substantiated by
the available sequence of Drosophila virilis, which indicates
that there are even variations between the paramyosin splicing sites in
these two Drosophilidae. Several lines of
evidence indicate that miniparamyosin is absent in the pseudocoelomate C. elegans. First, there is a lack of a similar sequence in
the gene, insofar as the genetic organization is presently known.
Second, Northern analysis failed to identify additional mRNAs to the
paramyosin ones in C. elegans(54) . Our antibodies
and, more interestingly, the C. elegans anti-paramyosin
polyclonal antibody also failed to detect a second isoform of different
molecular weight in this organism (13, this paper).
The evolutionary diversification of paramyosin in invertebrate muscle, including the presence of miniparamyosin, has been approached through the use of antibodies specific for the two isoforms. Both types of proteins are present in all invertebrate species investigated except C. elegans. Whereas slight variations in the molecular mass of paramyosin were known (10, 11, 39; see Fig. 5, c and d), the molecular mass of miniparamyosin shows a much wider range of variation (from 50 to 80 kDa; see Fig. 5b). The miniparamyosin antiserum recognizes a single protein with a mobility similar to that of miniparamyosin only in Diptera. In all other species analyzed, the antiserum recognizes two proteins, paramyosin and the putative miniparamyosin. This result suggests that the D. melanogaster miniparamyosin-specific protein domain, encoded by exon 1B, may be expressed not only in the miniparamyosin but also in the paramyosin of the other species. In Annelida and Mollusca, the cross-reaction with paramyosin was stronger with the miniparamyosin antibody than with a paramyosin-specific antibody (prepared with a peptide encoded by exon 5), suggesting that the specific function of the protein domain coded by exon 1B is in some species included in the paramyosin gene. This is in accordance with the much greater variability in the exon organization of the paramyosin/miniparamyosin genes mentioned in the preceding paragraph. In light of this, it is interesting that the anti-exon 5 antibody, specific to paramyosin, is able to cross-react not only with paramyosin but also with myosins of several invertebrate species. Coccinella provides an extreme example because only the myosin cross-reaction occurs. This suggests that a similar exchange of functions may be occurring between parts of myosin and paramyosin in these species. 
Furthermore, in the case of the deuterostome invertebrates, such as the echinoderms, the results suggest that a miniparamyosin isoform may be present in addition to the already known paramyosin. In conclusion, we suggest that the patterns of antibody cross-reactivity obtained reflect different specializations of the paramyosin gene complex such that, in Diptera (possibly through the evolution of a separate regulatory control), miniparamyosin has acquired a structure and function distinct from those of paramyosin. In any case, the widespread occurrence of miniparamyosin in invertebrate muscles and the complexity of the transcriptional regulation of the two distinct isoforms encoded by a single gene in Drosophila species point to the importance of the functional role of paramyosin and miniparamyosin in producing the structural and functional diversity of invertebrate muscles. Future work, currently under way in our laboratory, will contribute to the clarification of these important questions.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers X79485 and X79484.