(Received for publication, November 21, 1995; and in revised form, January 17, 1996)
M-CAT motifs mediate muscle-specific transcriptional activity via interaction with binding factors that are antigenically and biochemically related to vertebrate transcription enhancer factor-1 (TEF-1), a member of the TEA/ATTS domain family of transcription factors. M-CAT binding activities present in cardiac and skeletal muscle tissues cannot be fully accounted for by existing cloned isoforms of TEF-1. TEF-1-related cDNAs isolated from heart libraries indicate that at least three classes of TEF-1-related cDNAs are expressed in these and other tissues. One class comprises homologues of the human TEF-1 originally cloned from HeLa cells (Xiao, J. H., Davidson, I., Matthes, H., Garnier, J. M., and Chambon, P. (1991) Cell 65, 551-568). A second class represents homologues of the avian TEF-1-related gene previously isolated (Stewart, A. F., Larkin, S. B., Farrance, I. K., Mar, J. H., Hall, D. E., and Ordahl, C. P. (1994) J. Biol. Chem. 269, 3147-3150). The third class consists of a novel, divergent TEF-1 cDNA, named DTEF-1, and its preliminary characterization is described here. Two isoforms of DTEF-1 (DTEF-1A and DTEF-1B) were isolated as 1.9-kilobase pair clones with putative open reading frames of 433 and 432 amino acids whose differences are attributable to alternative splicing at the C terminus of the TEA DNA binding domain. Cardiac muscle contains high levels of DTEF-1 transcripts, but unexpectedly low levels are detected in skeletal muscle. DTEF-1 transcripts are present at intermediate levels in gizzard and lung, and at low levels in kidney. DTEF-1A is a sequence-specific M-CAT-binding factor. The distinct spatial pattern of expression, and unusual amino acid sequence in its DNA binding domain, may indicate a particular role for DTEF-1 in cell-specific gene regulation. Recent work also suggests that at least one more TEF-1-related gene exists in vertebrates.
We propose a naming system for the four TEF-1 gene family members identified to date that preserves existing nomenclature and provides a means for extending that nomenclature as additional family members may be identified.
Upon differentiation, the cytoplasm of myogenic progenitor cells is converted to a highly organized sarcomeric array. This process is dependent upon the activation and expression of genes in a specific temporal sequence that is coordinated by transcription factors. In developing skeletal muscle, for example, the expression of most skeletal muscle-specific genes is dependent upon the myoD family of transcription factors (also known as myogenic determination factors or MDFs; for review see (1, 2, 3)). MDFs are muscle-specific proteins of the basic helix-loop-helix (bHLH) superfamily that form heterodimers with ubiquitous bHLH nuclear proteins, and thereby bind E boxes in muscle gene promoters. MDFs can also activate the skeletal myogenic program in many non-muscle cells, converting them to the skeletal muscle phenotype (4, 5, 6). MDFs are not present in cardiac muscle (7).
Many cardiac gene promoters, on the other hand, are regulated through DNA sequences known as M-CAT motifs (5'-CATTCCT-3'). M-CAT-dependent promoter activity was first described in the chicken cardiac troponin T gene (8, 9, 31). M-CAT sites have been shown to be involved in the regulation of the α-myosin heavy chain, β-myosin heavy chain, cardiac troponin C, skeletal α-actin, and β-acetylcholine receptor genes (8, 9, 10, 11, 12, 13). The M-CAT motif is bound by M-CAT binding factor(s), or MCBF, which is enriched in the nuclei of striated muscle, but is also present in non-muscle tissues (9, 14). All muscle MCBFs have been shown to be antigenically and biochemically related to TEF-1, transcription enhancer factor-1 (14, 32).
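For readers who wish to locate candidate M-CAT sites in a promoter sequence, the following is a minimal sketch in Python (an illustration only, not part of the original study). It looks for exact matches to the canonical 5'-CATTCCT-3' motif on both strands. The function names and the test fragment are hypothetical, and variant M-CAT-like elements such as GTIIC and Sph are deliberately not matched.

```python
from typing import List, Tuple

# Canonical M-CAT motif as given in the text (5'-CATTCCT-3').
MCAT = "CATTCCT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_mcat_sites(seq: str, motif: str = MCAT) -> List[Tuple[int, str]]:
    """List (0-based start, strand) for every exact motif match on either strand."""
    seq = seq.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))  # motif read 5'->3' on the given strand
        if window == rc_motif:
            hits.append((i, "-"))  # motif lies on the complementary strand
    return hits


# Hypothetical test fragment, not a real promoter sequence.
example = "GGCATTCCTAAAGGAATGCC"
print(find_mcat_sites(example))  # [(2, '+'), (11, '-')]
```

Exact matching is the simplest possible criterion; scoring degenerate variants would need a position weight matrix or a similar model, which is beyond this sketch.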
TEF-1 was first cloned from HeLa cells, where it binds the M-CAT-related GTIIC and Sph elements in the SV40 enhancer (15, 16, 17), whose sequences are variations of the canonical M-CAT motif (8, 14). TEF-1 is a member of the TEA/ATTS domain family of transcription factors that are characterized by a structurally conserved DNA binding domain (18, 19). The TEA domain has been found in the amino-terminal regions of a class of regulatory proteins and is conserved across many species. The yeast regulatory protein TEC1 is involved in the activation of the Ty1 retrotransposon (20), ABAA regulates conidiation in Aspergillus nidulans and terminates vegetative growth (21), and the Drosophila gene scalloped plays an important role in neurodifferentiation (22). Functional mammalian TEF-1 is found in the 2-8-cell stage in early mouse development (23), and insertional knockout of the murine homologue of the human TEF-1 gene results in embryonic lethality accompanied by hypoplasia and/or degeneration of the ventricular myocardium (24).
Previously, we cloned a class of TEF-1-related cDNAs from heart and skeletal muscle that includes at least four alternatively spliced cDNA products (designated TEF-1A, -1B, -1C, and -1D) of the same gene. TEF-1A mRNAs are expressed in many tissues but are enriched in both cardiac and skeletal muscle. In addition, isoproteins encoded by TEF-1A and -1B cDNAs are bona fide M-CAT binding factors. TEF-1B, at least, can activate transcription when linked to a heterologous DNA binding domain (25). Muscle tissues contain at least three protein·M-CAT complexes on electrophoretic mobility shift assays, one of which appears to be muscle-specific and up-regulated upon differentiation (32). All muscle protein·M-CAT complexes contain TEF-1 proteins, but the latter muscle-enriched complex contains TEF-1A and not other TEF-1 related proteins (32).
Additional TEF-1-related cDNAs might, therefore, account for the multiple protein·M-CAT complexes found in muscle and non-muscle tissues. We have identified a new, divergent class of TEA domain gene that we name DTEF-1, for divergent TEF-1. DTEF-1 is a sequence-specific M-CAT binding factor whose mRNA is highly enriched in cardiac muscle. Thus, DTEF-1 may be involved in the cardiac-specific regulation of M-CAT-dependent promoters.
To identify further TEF-1-related genes, and in particular those that might be preferentially expressed during cardiac development, a partial avian NTEF-1 cDNA was used to screen a seven-week chicken heart cDNA library under reduced stringency conditions. Of nine cDNA clones isolated, four had nucleotide sequences corresponding to RTEF-1. The nucleotide sequences of the five other clones were divergent from those of both NTEF-1 and RTEF-1 (Fig. 1 and data not shown), indicating that a third TEA-domain gene is present in the avian genome. We designate this new TEF-1-related gene DTEF-1 (Divergent TEF-1) to indicate its higher degree of divergence from avian NTEF-1 (see "Discussion").
Figure 1: Structure of DTEF-1 cDNAs and derived amino acid sequence. A, diagram of DTEF-1A and -1B isoforms. Small open boxes indicate noncoding sequences; large boxes indicate coding sequences; alternatively spliced 1A and 1B exons are indicated. Black boxes indicate the TEA domain. B, nucleotide and deduced amino acid sequence of DTEF-1A. AUA and AUG initiators in the context of their putative respective Kozak sequences are underlined. The asterisk indicates a stop codon; the TEA domain is doubly underlined. The 3' polyadenylation signal is in bold type. C, comparisons of TEA domains of chick/human NTEF-1, chick RTEF-1, Drosophila scalloped (sd), and DTEF-1A and -1B. Amino acid sequences of TEA domains (central separated portion) and immediate flanking residues are shown. Dots indicate identity. D, nucleotide and derived amino acid sequence for alternatively spliced segments of DTEF-1A and DTEF-1B. Bold type indicates residues that constitute the 3' end and carboxyl terminus of the TEA domain. Underlined regions represent nucleotide and amino acid differences between DTEF-1A and DTEF-1B. The DTEF-1B alternative splicing domain results in a predicted polypeptide sequence one amino acid shorter than that predicted for DTEF-1A. E, alignment of derived amino acid sequences of avian NTEF-1, RTEF-1, and DTEF-1, identified in the figure as chickn, chickr, and chickd. RTEF-1 is identical to NTEF-1 in the TEA domain, while DTEF-1A is 97% identical. Amino-terminal to the TEA domain RTEF-1 is 43% identical to NTEF-1, and DTEF-1 is 45% identical to NTEF-1. Carboxyl-terminal to the TEA domain RTEF-1 has 72% identity with NTEF-1 and DTEF-1 has 70% identity. Black background indicates identity, gray background indicates similarity, and white background indicates difference.
All five apparently full-length (1.9-kb) DTEF-1 cDNAs were identical except for a specific segment that suggests alternative splicing of the primary transcript of the DTEF-1 gene (Fig. 1, A and D) generating predicted isoforms designated DTEF-1A and DTEF-1B. The deduced amino acid sequence of DTEF-1A (Fig. 1B) yields a 433-amino acid polypeptide that is 72% identical to both avian NTEF-1 and RTEF-1A.
The TEA domain is highly conserved throughout evolution (18, 19), and there is only a single amino acid change within this domain between the Drosophila TEA domain gene scalloped and the human-, mouse-, or chick-derived NTEF-1s (22) (Fig. 1C). NTEF-1 and RTEF-1 are identical throughout their TEA domains. However, DTEF-1A, which represented four of the five cDNA clones, has two amino acid differences within the TEA domain as compared to NTEF-1 (Fig. 1C). Amino acid residue 87 of DTEF-1A contains a lysine instead of an arginine, and residue 94 contains a leucine in place of an isoleucine, preserving the basic and aliphatic natures, respectively, of the residues at these two positions.
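The notion of a substitution that preserves side-chain character can be made concrete with a small lookup, sketched here in Python purely as an illustration. The coarse residue grouping and the function name are assumptions introduced for this example and are not a classification used in the study.

```python
# Coarse side-chain classes (one-letter codes); this grouping is an
# illustrative assumption, not a scheme taken from the paper.
RESIDUE_CLASS = {
    **dict.fromkeys("KRH", "basic"),
    **dict.fromkeys("DE", "acidic"),
    **dict.fromkeys("AVLI", "aliphatic"),
    **dict.fromkeys("FWY", "aromatic"),
    **dict.fromkeys("STNQ", "polar"),
    **dict.fromkeys("CGPM", "other"),
}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-residue change as identical, conservative, or non-conservative."""
    if ref == alt:
        return "identical"
    ref_class, alt_class = RESIDUE_CLASS[ref], RESIDUE_CLASS[alt]
    # Residues lumped into the catch-all "other" group are not treated as alike.
    if ref_class == alt_class and ref_class != "other":
        return "conservative (" + ref_class + ")"
    return "non-conservative"


# The two DTEF-1A differences described in the text.
print(classify_substitution("R", "K"))  # position 87: conservative (basic)
print(classify_substitution("I", "L"))  # position 94: conservative (aliphatic)
```

Under this grouping both DTEF-1A changes fall within a single class, which is the sense in which the text describes the basic and aliphatic natures of the two positions as preserved.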
The DTEF-1B isoform is identical to DTEF-1A at the nucleotide and amino acid levels except for an alternatively spliced 23-amino acid exon, extending from Arg to Gln (Fig. 1D), which replaces the corresponding Lys to Lys segment of DTEF-1A and includes the last 10 amino acids of the TEA domain. The resultant TEA domain of DTEF-1B is identical to that of NTEF-1 (Fig. 1C).
Figure 2: Northern blot analysis of DTEF-1 transcripts in embryonic day 12 chick tissues: heart (H), skeletal (Sk), lung (Lu), liver (Liv), kidney (K), brain (Br), and gizzard (G). Predominant transcripts of 1.9 and 3.6 kb are indicated. The migration positions of low abundance transcripts at 6.8 and 7.2 kb are indicated but are visible only upon extended exposure (data not shown). 18 S and 27 S ribosomal RNA mobilities are indicated.
Figure 3: DTEF-1 encodes an M-CAT binding protein. DTEF-1A cDNA containing codons from Ser to the C terminus was used to produce in vitro transcribed mRNA that was translated in rabbit reticulocyte lysate. The in vitro translation of DTEF-1 fusion protein yielded a minority of full-length protein. DTEF-1 fusion protein was incubated with radiolabeled M-CAT DNA and analyzed by electrophoretic mobility shift assay. Bands shown are the gel shift complexes formed with full-length DTEF-1. Lane 1 and lane 4, no competitor (-); lane 2, wild type competitor DNA (WT); lane 3, mutant competitor DNA (MT).
In this study we report evidence that vertebrate TEF-1 comprises a multigene family. Three classes of TEF-1-related genes, NTEF-1, RTEF-1, and DTEF-1, have been tentatively implicated in muscle gene regulation. NTEF-1 is expressed in various tissues including heart, skeletal muscle, lung, kidney, brain, and gizzard, and produces a major 1.6-1.8-kb transcript (15, 27) (data not shown). RTEF-1 is also present in muscle tissue and one isoform can potentially transactivate muscle promoters through binding to M-CAT motifs (25). DTEF-1 is the newest and most divergent class of TEF-1-related cDNAs. Unlike NTEF-1 and RTEF-1, the expression of DTEF-1 is restricted to a few tissues, and is most highly expressed in heart muscle.
Recently, another vertebrate TEF-1 family member was reported (ETF-1) (30). ETF-1 is expressed specifically in a subset of embryonic tissues, including the cerebellum, testis and distal portions of the limb and tail buds, but is essentially absent from adult tissues. Sequence comparison of ETF-1 to avian TEF-1 family members shows that ETF-1 is not any more closely related to any one TEF-1 family member than another, indicating that it is not a true homologue of N-, R-, or DTEF-1. Although ETF-1 shares sequence features with both RTEF-1 and DTEF-1, and is 100% identical to NTEF-1 in the TEA domain, ETF-1 probably represents a fourth class of TEF-1-related genes. We suggest ETF-1 should be renamed ETEF-1 in order to reflect its membership of the TEF-1 family (see below and Table 1).
We propose a system of nomenclature for the known TEF-1 related genes in vertebrates (Table 1), taking into account the cloning of ETF-1 in the mouse (30). This nomenclature is based on the first cloned vertebrate TEF-1 family member, human TEF-1 (15). The avian, rat, and murine homologues of human TEF-1 are grouped as NTEF-1 type (Nominal TEF-1, from which the name is derived) (15, 27) based on their high degree of homology (97% at the amino acid level, see Table 1). Another closely related class of TEF-1 cDNAs/genes, RTEF-1 (Related to NTEF-1), has been identified in chick (25) and mouse, sharing higher amino acid sequence homology to each other (89%) than to the NTEF-1 class (74%). The DTEF-1 (Divergent TEF-1) class of TEF-1 cDNAs reported here constitutes a third class that is 72% identical to human NTEF-1 with novel changes in the TEA domain (see below). By contrast, all other vertebrate family members cloned to date show 100% identity in the TEA domain.
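Percent identity values such as those quoted above depend on how the alignment is made and how gaps are counted, neither of which is specified in the text. As a minimal sketch under one common convention (identical residues divided by aligned columns in which neither sequence has a gap), the calculation can be written in Python as follows. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must come from the same pairwise alignment, with '-' marking
    gaps. Skipping gapped columns is an assumption; other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped column: excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned peptides, not real TEF-1 sequences: 5 of 6 ungapped columns match.
print(round(percent_identity("MKRS-LT", "MKKSALT"), 1))  # 83.3
```

Counting gapped columns in the denominator instead would lower the value, so figures computed under different conventions are not directly comparable.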
Vertebrate TEF-1-related gene family members characterized to date are believed to initiate translation at an isoleucine codon that lies upstream of the first methionine codon (15, 25, 27). The size of endogenous and in vivo produced human NTEF-1 corresponds to initiation at an AUU codon, and shorter NTEF-1 polypeptides that might initiate at the downstream AUG codon are not detected by anti-TEF-1 antisera (15). Similarly, the major product of in vitro translated NTEF-1 mRNA also corresponds to an isoleucine-initiated protein. Engineering of a perfect Kozak consensus around the first methionine codon does not affect that translation pattern. On the other hand, introduction of a perfect Kozak sequence around the isoleucine codon results in more efficient initiation at that site, while mutation of the isoleucine codon abolishes it (15).
The first potential AUG initiator codon in the DTEF-1 cDNA (nucleotide 428) is surrounded by a poor Kozak consensus sequence (7/13 nucleotides, underlined in Fig. 1B), but a reasonably favorable Kozak consensus (10/13 nucleotide homology with the Kozak consensus sequence; underlined in Fig. 1B) surrounds the isoleucine codon (AUA) at nucleotide 352. That isoleucine codon corresponds to the one identified as the initiator for human NTEF-1 as outlined above (15) and initiates a conceptual open reading frame of 433 amino acids whose sequence is closely related to that of other vertebrate TEF-1-related genes (see Table 1) and consists of a serine-rich N terminus, followed by a highly conserved, basic TEA domain, and then a proline-rich region. We tentatively conclude, therefore, that DTEF-1 translation probably initiates at an isoleucine codon and that the DNA binding and transactivation motif pattern is grossly conserved through all classes of vertebrate TEF-1 cDNAs, consistent with general conservation of function for this protein.
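A match count of the kind quoted above (n of 13 nucleotides) can be sketched in Python as shown below. This is an illustration under stated assumptions, not the procedure used in the study: the 13-nucleotide consensus GCCGCC(A/G)CCAUGG is assumed, the candidate initiator is placed at consensus positions 10-12, and a non-AUG initiator is simply scored as a mismatch wherever it differs from AUG. The function name and the example sequences are hypothetical.

```python
# Assumed 13-nt Kozak consensus; R stands for a purine (A or G).
KOZAK = "GCCGCCRCCAUGG"
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "AG"}
START_OFFSET = 9  # 0-based index of the start codon's first base in KOZAK


def kozak_score(mrna: str, codon_start: int) -> int:
    """Count matches (0-13) between the consensus and the window around a codon.

    codon_start is the 0-based index of the first base of the candidate
    initiator codon in mrna. DNA input (T) is converted to RNA (U).
    """
    mrna = mrna.upper().replace("T", "U")
    lo = codon_start - START_OFFSET
    window = mrna[lo:lo + len(KOZAK)] if lo >= 0 else ""
    if len(window) < len(KOZAK):
        raise ValueError("not enough flanking sequence around the codon")
    return sum(base in IUPAC[sym] for base, sym in zip(window, KOZAK))


# Hypothetical sequences, not taken from the DTEF-1 cDNA.
print(kozak_score("GCCGCCACCAUGG", 9))       # perfect context: 13
print(kozak_score("UUGCCGCCGCCAUAGCC", 11))  # AUA initiator, good context: 12
```

Because the source does not state how the initiator codon itself was counted, scores from this sketch should not be expected to reproduce the 7/13 and 10/13 values exactly.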
Two isoforms of the DTEF-1 cDNA were isolated. The nucleotide sequence of the DTEF-1A cDNA isoform differs from that of DTEF-1B over a 69-nucleotide segment that includes the C-terminal portion of the TEA domain (Fig. 1D). In all other regions these isoforms are identical in nucleotide sequence, indicating that they are products of a single alternatively spliced gene. The DTEF-1B TEA domain is identical to all other published vertebrate TEA domains except that of DTEF-1A (Fig. 1C). DTEF-1A, on the other hand, represents the first divergent vertebrate TEA domain protein due to the presence of a lysine at position 87, where an arginine is found in human-N, mouse-N, chick-N, chick-R, Drosophila, and Aspergillus TEA domains. Similarly, all known TEA domain genes encode an isoleucine at position 94 except for chick DTEF-1A and Aspergillus abaA, which code for a similarly aliphatic leucine at that position.
The divergent region of DTEF-1A encompasses the predicted third helix of the TEA DNA binding domain. This helix was shown to be important in sequence-specific recognition of SV40 M-CAT-like elements by human NTEF-1, but complete disruption of its tertiary structure by proline substitutions, where the rest of the protein was intact, did not completely abolish binding capacity (16). The first helix (where Drosophila scalloped differs from N, R, and DTEF-1) appeared to be the most important for sequence-specific DNA recognition (16). Further deletion analysis of human NTEF-1 established that portions of the C terminus of the protein are also involved in modulation of DNA binding (16). Thus, it is impossible to predict, at present, how the amino acid substitutions and the sequence differences in the C terminus of the DTEF-1 TEA domain might affect the sequence-specificity of DNA binding. We show here that DTEF-1A is capable of sequence-specific binding to a canonical M-CAT motif, but its relative affinity for the wide range of M-CAT element variants found in cardiac and other promoters, compared to other TEF-1 family members, remains to be tested.
NTEF-1 transcripts are widely expressed in human, mouse and chick tissues. RTEF-1 transcripts are also widely expressed, being enriched in skeletal and cardiac muscle and less abundant in gizzard, brain, and liver. By contrast, DTEF-1 transcripts are most abundant in heart muscle, present at intermediate levels in lung and gizzard, and present at minimal or undetectable levels in skeletal muscle, kidney, liver and brain. The selective enrichment of DTEF-1 mRNA in heart and the demonstration that DTEF-1 is a sequence-specific M-CAT binding factor suggest a potential role for DTEF-1 in cardiac-specific muscle gene regulation that may differ from that played by other TEF-1 family members.
M-CAT elements were first described in the cardiac-specific cTNT promoter (8) and since then most characterized cardiac promoters have been found to contain one or more functional M-CAT elements (9, 10, 11, 12, 13). M-CAT elements in the promoters of the β-myosin heavy chain and α-skeletal actin genes have been shown to be required for induction by α1-adrenergic agonists and protein kinase C (28, 29). Several candidate protein kinase C sites can be identified in the derived amino acid sequence of DTEF-1, including residues 81, 86, 219, 329, and 398. These potential protein kinase C sites are conserved in avian NTEF-1 and RTEF-1 and mammalian NTEF-1, consistent with evolutionary conservation of potentially important regulatory sites. Since DTEF-1 is abundantly expressed in heart, it may be a target for this protein kinase C regulation and a potential mediator of the α1-adrenergic hypertrophy response.
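Candidate protein kinase C sites of the kind listed above are often found by a simple pattern search. The criteria used to pick residues 81, 86, 219, 329, and 398 are not stated in the text, so the sketch below assumes the PROSITE-style pattern [ST]-x-[RK] (a serine or threonine followed two positions later by arginine or lysine). The function name and the example peptide are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed consensus: S or T, any residue, then R or K (PROSITE-style [ST]-x-[RK]).
# The lookahead lets overlapping candidate sites be reported.
_PKC_PATTERN = re.compile(r"(?=([ST].[RK]))")


def find_pkc_sites(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position, residue) for each candidate phospho-acceptor."""
    protein = protein.upper()
    return [(m.start() + 1, protein[m.start()]) for m in _PKC_PATTERN.finditer(protein)]


# Hypothetical peptide, not the DTEF-1 sequence.
print(find_pkc_sites("MSAKTPRGSLE"))  # [(2, 'S'), (5, 'T')]
```

A pattern this short matches frequently by chance, so hits are only candidates; conservation across family members, as noted in the text, is one way to prioritize them.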
The fact that this multigene family consists of at least four genes, with each gene encoding multiple, alternatively spliced isoforms, implies complex variability in the regulatory functions associated with the TEF-1 family. This is reminiscent of the MEF-2 and bHLH families of transcription factors, whose members exhibit distinct spatiotemporal patterns of expression and functional redundancy, and contribute differentially to specific gene regulatory mechanisms. Members of the TEF-1 gene family may also show functional redundancy. Disruption of the murine NTEF-1 gene, for example, results in embryonic lethality by day 11.5 (24), by which time the forming heart tube is grossly normal. Histological analysis of mutant embryo hearts shows hypoplastic ventricles with reduced trabeculation and evidence of myocyte degeneration subjacent to the endocardium. These findings suggest a possible role for NTEF-1 in the growth and maintenance of the cardiac phenotype, but are less consistent with a role in the initiation of cardiogenesis. However, the expression of at least two other TEF-1 gene family members in the heart (RTEF-1 and DTEF-1) may be sufficient to compensate for the absence of the NTEF-1 gene products. Furthermore, the expression of other members of the TEF-1 multigene family may explain why M-CAT-dependent transcription of genes for cardiac proteins (cTNT, cTNC, myosins) is maintained in the TEF-1 knockout mutant embryos (24). It will be interesting to see what effects transgenic knockout of the mouse DTEF-1 and RTEF-1 homologues have upon cardiac and skeletal muscle development.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers U46127 for DTEF-1A and U46128 for DTEF-1B.