Division of Molecular Biology and Biochemistry, School of Biological Science, University of Missouri-Kansas City
Introduction
The Sp family has four members in humans: Sp1, Sp2, Sp3, and Sp4 (Suske 1999). Of these, Sp1 was identified first and is the best characterized (Dynan and Tjian 1983). Sp1 and Sp3 are ubiquitous transcription factors that have been implicated in the control of a wide variety of genes (Hagen et al. 1994). Sp1 and Sp3 are thought to compete for similar binding sites, and the relative rate of transcription is affected by the outcome of this competition (Hagen et al. 1992; Hata et al. 1998). Sp2 is the least similar to the other Sp family members, with 20% to 27% protein sequence similarity, and very little is known about its function (Kingsley and Winoto 1992). Lastly, Sp4 has tissue-specific expression restricted to the brain and nervous tissue (Hagen et al. 1992).
The structure of the polypeptides Sp1, Sp3, and Sp4 consists of five domains (fig. 1). In Sp1, domains A, B, C, and D form both homotypic and heterotypic protein-protein interactions. Higher-order structures, or multiple tetramers, form through protein-protein interactions in the A, B, or D domains (Mastrangelo et al. 1991). These higher-order structures result in superactivation, in which multiple binding sites have much more transcriptional activity than the sum of single sites. Domains A and B are in the N-terminal half of the protein, which can be further divided into subdomains containing a serine- and threonine-rich area followed by a glutamine-rich area. Domain C contains a highly charged region, which is responsible for the inhibition of transcription in Sp3 (Dennig, Beato, and Suske 1996). The DNA-binding domain contains three zinc fingers that recognize the DNA-binding sites GGGCGG or CACCC with similar binding affinities (Hagen et al. 1992, 1994). Each zinc finger motif is composed of an alpha helix and a beta sheet structure that tetrahedrally binds a zinc atom. Finally, domain D is located at the C-terminus beyond the zinc fingers and forms protein-protein interactions with other transcription factors (Ding et al. 1999).
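As an illustration of the two recognition sequences named above, the following minimal sketch scans a DNA string for GGGCGG and CACCC on both strands. The function names and the example promoter sequence are hypothetical and are not taken from this study.

```python
import re

# Recognition sequences named in the text. Reverse complements are searched
# as well so that sites on either strand are reported.
SP_MOTIFS = {"GC box": "GGGCGG", "CACCC box": "CACCC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sp_sites(dna: str):
    """Return sorted (position, strand, motif name) tuples for every match."""
    dna = dna.upper()
    hits = []
    for name, motif in SP_MOTIFS.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            # A lookahead is used so that overlapping sites are all reported.
            for match in re.finditer(f"(?={pattern})", dna):
                hits.append((match.start(), strand, name))
    return sorted(hits)


# Made-up sequence for illustration only.
promoter = "TTGGGCGGAACCGCCCTTCACCCAA"
for hit in find_sp_sites(promoter):
    print(hit)
```

Positions are 0-based offsets on the strand as given. Exact string matching is a simplification: it says nothing about binding affinity or flanking sequence.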
Materials and Methods
Alignment and Phylogenetic Construction
Fundulus heteroclitus Sp nucleotide and protein sequences were used to search the GenBank database at the National Center for Biotechnology Information. Sequences matching more than one domain (fig. 1) among Sp family proteins were aligned using a Clustal alignment program (MacVector). The zinc finger regions of non-Sp genes with significant similarity (P < 10^-14) were aligned with Sp zinc finger sequences. For the B domains, non-Sp proteins with significant similarity (P < 10^-4) were aligned with Sp B domains. Aligned sequences were subjected to a heuristic search for the best possible trees using the maximum parsimony program in PAUP (Swofford 2000). Bootstrap values (1,000 replicates) were obtained from the best tree.
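The resampling step behind a bootstrap analysis can be sketched as follows. This is not the PAUP implementation. It only shows how each replicate data set is drawn, by sampling alignment columns with replacement. The toy alignment and function name are hypothetical, and the tree search that would be run on every replicate is omitted.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Build one bootstrap replicate by resampling columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Toy alignment (made up); real input would be the aligned Sp proteins.
toy = {"taxonA": "ACDEFG", "taxonB": "ACDQFG", "taxonC": "TCDQFW"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

The bootstrap value for a clade would then be the percentage of replicate trees that contain it.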
Proteins Used in Analyses
Sp Family Phylogeny
Proteins used for the Sp phylogenetic analysis included: human Sp3 (Q02447), mouse Sp3 (AF062567), human Sp1 (AF252284), mouse Sp1 (S79832), rat Sp1 (S25287), pig Sp1 (U57347), human Sp4 (NP_003103), mouse Sp4 (NP_0033265), and human Sp2 (Q02086).
B Domain Phylogeny
Proteins used for the phylogenetic analysis of the B domain included the Sp proteins (listed above), human OCT-1 (A47001), nuclear transcription factor Y (AAH05003), POU-domain class 2 transcription factor (NP_035267), histone H1 transcription factor (A56365), transcription factor NFY-C (AF191744), and CCAAT-binding transcription factor (I59348).
Zinc Finger Phylogeny
Zinc fingers used for the phylogenetic analysis included the Sp proteins, Drosophila buttonhead (Q24266), human BTEB1 transcription factor (Q13886), rat BTEB1 (Q01713), human BTEB2 (Q13887), human erythroid KLF1 (Q13351), mouse EKLF1 (P46099), human EKLF4 (O43474), mouse EKLF4 (Q60793), human ubiquitous KLF7 (O75840), human WILMS Tumor protein (P19544), alligator WT1 (P50902), rat WT1 (P49952), mouse WT1 (P22561), human ZNF74 (Q16587), human ZF9 (Q99612), mouse ZF9 (O35819), mouse AP-2 rep transcription factor (Y14295), human TIEG1 (NP_005646), rat TIEG1 (U78875), human TIEG2 (NP_003588), Drosophila Sp (AAF46519), and mouse Sp5 (NP_071880).
Results
Only a single Sp, Sp3, was found in Fundulus liver and cardiac tissue. All six PCR products amplified with primers to regions conserved among Sp1, Sp2, Sp3, and Sp4 were fSp3. The sequence variation among these six PCR products is within the range of variation found among 14 different Fundulus Sp alleles (unpublished data). Amplification with Sp1-specific primers did not yield any product. No other Sp cDNAs were isolated among more than 6,000 cardiac cDNAs from F. heteroclitus (Oleksiak, Kolell, and Crawford 2001). Northern analysis of liver and cardiac mRNA revealed a single product with an Sp3 probe. In Southern analysis, a human Sp1 probe did not hybridize to Fundulus genomic DNA. In Western analyses and gel shift analyses, Sp3-specific, but not Sp1-specific, antibodies cross-reacted with proteins in Fundulus nuclear extract (data not shown). Thus, unlike mammals, Fundulus has only one ubiquitously expressed Sp.
The fSp3 protein sequence was aligned with Sp1, Sp2, Sp3, and Sp4 protein sequences from human, mouse, and rat. These are the only genes in GenBank that are similar to more than one Sp domain. The Sp proteins were subjected to maximum parsimony analysis (fig. 2). Each of the three Sp subtypes (Sp1, Sp3, Sp4) is supported by 100% bootstrap values (1,000 replicates). The F. heteroclitus Sp protein groups with the mammalian Sp3 proteins, supporting the hypothesis that this transcription factor is an Sp3 gene.
The three contiguous fSp3 zinc fingers were used in a BLAST similarity search of the GenBank database. Proteins with significant similarity (P < 10^-14) to the zinc fingers were aligned using CLUSTAL. Maximum parsimony analysis (PAUP, fig. 3) was applied to these zinc finger proteins using the invertebrate Drosophila Sp buttonhead protein as the outgroup. There were 208 minimum trees. The consensus tree (fig. 3) of these 208 trees has a significant phylogenetic signal (P < 0.001, permutation tail probability [PTP] test [Faith and Cranston 1991]). That is, maximum parsimony analysis was applied to each of 1,000 random permutations of the zinc finger proteins, and these analyses of the randomized data sets never produced as short a tree (280 characters). The shortest tree from the randomly permuted data had 615 characters. The zinc fingers from F. heteroclitus group with the mammalian Sp3 proteins, supporting the hypothesis that this transcription factor is an Sp3 gene. Similarly, Sp1 and Sp4 each form separate clades within the Sp zinc finger family. Among zinc finger regions, Sp1, Sp3, and Sp4 form a monophyletic group (figs. 3 and 4). This group never includes Sp2 among the 208 minimum trees. Sp2 shares only 8 of the 16 unique, derived amino acids (fig. 4). Although the bootstrap value for monophyletic Sp1, Sp3, and Sp4 is not >50%, fewer than 10% of bootstrap trees included Sp2 with the other Sp proteins. The T-PTP test (topology-dependent permutation tail probability [Faith 1991]), with the constraints that Sp1, Sp3, and Sp4 are monophyletic and Sp2 is monophyletic with the other zinc fingers, is significant (P < 0.01, T-PTP). Finally, the trees that are constrained so that Sp2 groups with the other Sp proteins are four steps longer than the tree with Sp2 excluded. This analysis supports the observation (fig. 3) that Sp2 groups outside the Sp family of transcription factors. Sp2 and the newly identified Sp5 (Harrison et al. 2000) have considerably less sequence similarity with other Sp family members (i.e., 27% and 20%, respectively). The zinc finger data suggest that Sp2 and Sp5 represent the most divergent proteins of the Sp family and may not be members of the Sp family.
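The logic of the PTP test can be sketched in two steps: permute the states of each character among taxa, then ask how often a permuted data set yields a tree as short as the observed one. The sketch below assumes the common convention of counting the original data set together with the permuted ones. The function names are hypothetical, and the parsimony search that produces each tree length is not shown.

```python
import random


def permute_characters(alignment, rng):
    """Shuffle the states of each column among taxa, independently per column."""
    names = list(alignment)
    columns = [list(col) for col in zip(*(alignment[n] for n in names))]
    for col in columns:
        rng.shuffle(col)
    rows = zip(*columns)
    return {name: "".join(row) for name, row in zip(names, rows)}


def ptp_p_value(observed_length, permuted_lengths):
    """Fraction of data sets (permuted plus the original) with a tree at least as short."""
    as_short = sum(1 for length in permuted_lengths if length <= observed_length)
    return (as_short + 1) / (len(permuted_lengths) + 1)


# Values from the text: 280 for the real data and 1,000 permutations, the
# shortest of which gave 615. The individual permuted lengths are not
# reported, so 615 stands in for all of them as a lower bound.
p = ptp_p_value(280, [615] * 1000)
print(p, p < 0.001)
```

With no permuted tree as short as the observed one, the estimate is 1/1001, which is consistent with the reported P < 0.001.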
The phylogenetic analysis of the B domain included four non-zinc finger proteins (fig. 5). The POU domain transcription factor was used as an outgroup because it had the lowest similarity score to fSp3. This analysis produced three minimum trees. The strict consensus had two major clades: one for the Sp family and the other for Sp2 and the CCAAT-binding/NF-Y transcription factor families. This tree is well supported (P < 0.01, PTP), and the bootstrap values for these two clades are >75%. Sp1, Sp3, and Sp4 form a monophyletic group that does not include Sp2. A forced topology in which Sp2 is grouped with the other Sps is six steps longer than the topology of Sps without Sp2.
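Comparing the length of a constrained topology with that of an unconstrained one amounts to counting the minimum number of character changes on each tree. The following minimal sketch does this with Fitch's algorithm for unordered characters on fixed, rooted binary trees. The six-taxon alignment and both topologies are made up for illustration, and no tree search is performed.

```python
def fitch_length(tree, states):
    """Minimum number of changes for one character on a rooted binary tree.

    `tree` is a nested 2-tuple of leaf names; `states` maps leaf name to residue.
    """
    def walk(node):
        if isinstance(node, str):
            return {states[node]}, 0
        (left_set, left_cost), (right_set, right_cost) = walk(node[0]), walk(node[1])
        shared = left_set & right_set
        if shared:
            return shared, left_cost + right_cost
        # No state in common: one extra change is needed at this node.
        return left_set | right_set, left_cost + right_cost + 1

    return walk(tree)[1]


def tree_length(tree, alignment):
    """Sum the Fitch length over all alignment columns."""
    n_cols = len(next(iter(alignment.values())))
    return sum(
        fitch_length(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_cols)
    )


# Made-up four-column alignment; not data from this study.
toy = {"Sp1": "AACT", "Sp3": "AACT", "Sp4": "AACT",
       "Sp2": "GGCT", "NFY": "GGCS", "POU": "AACS"}
sp2_outside = (((("Sp1", "Sp3"), "Sp4"), ("Sp2", "NFY")), "POU")
sp2_forced_in = ((((("Sp1", "Sp3"), "Sp4"), "Sp2"), "NFY"), "POU")
print(tree_length(sp2_forced_in, toy) - tree_length(sp2_outside, toy))
```

In this toy case the forced topology is one step longer. The same comparison on the real B domain alignment is what yields the six-step difference reported above.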
Discussion
In the B domain, unlike all other domains, fSp3 does not group with the mammalian Sp3s (fig. 5). Instead, fSp3 groups with the more ancestral Sp4. The grouping of fSp3 with Sp4 may reflect a difference between mammals and teleosts in the pattern of Sp expression. The B domain interacts with the essential transcription factor TFIID (TATA-binding factor) via hTAF130. Among TATA-less promoters, Sp-binding sites (consensus sequence: GGGCGG [Dynan and Tjian 1983]) are required for significant activity (Azizkhan et al. 1993), and the degree of activation from Sp tends to be stronger in the context of TATA-less promoters than TATA-containing promoters (Colgan and Manley 1995). In mammals, Sp1 and Sp3 are most often expressed together and are often antagonistic (Hagen et al. 1992; Hata et al. 1998). Thus, among mammals, because both Sp1 and Sp3 affect transcription via the interaction of the B domain with TFIID, selection may favor divergence among B domains. However, in Fundulus there is only one Sp, fSp3, which is expressed ubiquitously. This expression pattern could cause selection to maintain a more general or ancestral function, and thus the fSp3 B domain would lack the derived characters found in the rest of the protein. This idea is supported by the lack of polymorphism in the B domains among Fundulus populations (unpublished data). If the difference in the phylogeny of the B domains versus the other domains is a response to the pattern of expression for Sp1, it suggests that this domain plays an important role in defining the different roles of Sp3 and Sp1.
Sp Family
Among Sp proteins, only Sp1, Sp3, and Sp4 have all five Sp domains (fig. 1). Sp2 has four of the five domains. Other proteins with Sp nomenclature (e.g., dSp, mSp5) have only one of these domains: the three consecutive zinc fingers. The functions of Sp proteins are to bind GC-rich DNA and to interact with TAFs as well as other proteins. The four protein-protein interactive domains as well as the zinc fingers define these functions. If one defines Sp proteins by their functions, then the other proteins, which do not have all five domains, will not be functionally similar. These functional relationships are a reflection of their evolutionary history.
In the evolutionary analyses in which other transcription factors can be aligned with Sp proteins (figs. 3 and 5), Sp2 does not group with other Sp transcription factors. Using the DNA-binding zinc fingers (Nardelli et al. 1991; Pavletich and Pabo 1991), Sp2 does not form a monophyletic group with other Sp proteins (fig. 3). In the B domain, Sp2 groups with CCAAT/NF-Y transcription factors, not with the other Sp proteins (fig. 5). For Sp1, Sp3, and Sp4 proteins, the B domains form a monophyletic group. These data suggest that Sp2 is not a member of the Sp family.
The zinc finger proteins used in this analysis (fig. 3) have three consecutive zinc fingers (C-x(2-4)-C-x(3)-F-x(5)-L-x(2)-H-x(3)-H-x(5), where the C and H residues are the metal-binding residues [Narayan, Kriwacki, and Caradonna 1997]). Most of the amino acids (50 of 87) are invariant or have only one alternative amino acid (ALL, fig. 4). Of the remaining 37 amino acids, 16 derived amino acids distinguish all Sp1, Sp3, and Sp4 proteins from the other major clades of similar zinc finger proteins (Sps, fig. 4). Sp2 shares only 8 of the 16 derived amino acids. Thirteen amino acids vary among Sp1, Sp3, and Sp4 (fig. 4). The most interesting variable site is amino acid 81, one of the amino acids that interact with DNA (Narayan, Kriwacki, and Caradonna 1997). At this interactive site, all the Sps have a serine or an alanine: Sp3s have an alanine, whereas Sp4 and Sp1 have a serine. The threonine found in Sp2 is also found in several other zinc fingers. It seems likely that these amino acids could affect Sp binding to DNA.
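The zinc finger spacing pattern described above can be written as a regular expression, which gives a quick way to count candidate fingers in a protein sequence. This is a minimal sketch. The pattern assumes two to four residues between the two cysteines, and the test sequence is synthetic filler built to fit the pattern, not a real Sp sequence.

```python
import re

# C-x(2-4)-C-x(3)-F-x(5)-L-x(2)-H-x(3)-H written as a regular expression.
# The trailing x(5) spacer is left out because it carries no fixed residue.
C2H2_FINGER = re.compile(r"C.{2,4}C.{3}F.{5}L.{2}H.{3}H")


def count_fingers(protein: str) -> int:
    """Count non-overlapping matches of the finger pattern."""
    return len(C2H2_FINGER.findall(protein.upper()))


# Synthetic finger: alanine filler between the conserved positions.
finger = "C" + "AA" + "C" + "AAA" + "F" + "AAAAA" + "L" + "AA" + "H" + "AAA" + "H"
protein = "TGEKP".join([finger] * 3)  # three fingers joined by short linkers
print(count_fingers(protein))
```

A pattern match only flags the conserved spacing. It does not test the derived amino acids that distinguish Sp1, Sp3, and Sp4 from other zinc finger proteins.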
These data (the presence of all five domains, the evolutionary relationship of the whole protein or the Sp domains, and the derived amino acids in the zinc fingers) all indicate that Sp1, Sp3, and Sp4 form a monophyletic group and thus define the members of the Sp family. The observation that other proteins have similar zinc fingers (e.g., dSp or mSp5) or B domains (e.g., CCAAT/NFY-binding proteins) should not necessarily allow them to be classified as Sp proteins.
Modular Evolution
The Sp family is often grouped with other proteins that contain zinc fingers. The zinc finger tree (fig. 3) shows that Sp transcription factors can be grouped with a myriad of transcription factors that (1) contain three zinc fingers and (2) bind GC-rich templates. The zinc fingers share homology with those of several proteins, including Krüppel-like factors. Although these proteins share an ability to bind to GC-rich sequences, this does not adequately describe the evolution or homology of the Sp transcription factors. Instead, it may represent the evolution of that single domain. In contrast to the zinc fingers, the B domain suggests a different phylogeny, one that groups Sp with CCAAT/NFY-binding proteins and POU/OCT1 transcription factors, which bind to different promoter sequences. The significant similarity (P < 10^-4) of the B domains and the apparent homology among the B domains (fig. 5) for these proteins suggest that these proteins may be grouped based on protein-protein interactions, much like grouping proteins based on the zinc finger DNA-binding region.
The diverse B domains (fig. 5) have a conserved function: they bind to the same TAFs (TFIID-associated factors [Coustry et al. 1998; Wolstein et al. 2000]). Additionally, this function is phylogenetically conserved. The B domain of hSp1 binds both the human TAF130 (Tanese et al. 1996; Rojo-Niersbach, Furukawa, and Tanese 1999) and its Drosophila homologue dTAF110 (Gill et al. 1994). This occurs even though Drosophila lacks an Sp protein (Pugh and Tjian 1990). Similarly, OCT1 and CCAAT-binding factors have been shown to bind both hTAF130 and dTAF110 (Coustry et al. 1998; Wolstein et al. 2000). The observation that the homologous B domain proteins interact with TFIID subunits in phylogenetically distant species appears to reflect a more ancestral state than the homology among Sp family members. Either this functional similarity among different families of transcription factors has caused the similarity in the B domain's amino acid sequence (convergent evolution), or these transcription factors have evolved from different domains or modules, so that the B domain has an evolutionary history separate from that of the zinc fingers.
Modular evolution occurs when protein domains are created by gene duplication followed by domain shuffling to evolve novel transcription factors. Modular evolution is thought to contribute to the evolution of the basic helix-loop-helix transcription factor family (Morgenstern and Atchley 1999). For modular evolution to occur, the expectation would be that these individual domains would have different ancestry. This appears to be true for the Sp family: the zinc fingers and the B domains are related to different factors (compare figs. 3 and 5). Consistent with a modular theory of evolution, the B domain is encoded by a different exon from the DNA-binding domain (Suske 1999). These data suggest that these domains had different evolutionary histories prior to the creation of the Sp family and that the Sp family of transcription factors is a mosaic of different protein modules.
In summary, in Fundulus, the only member of the Sp family to be widely expressed is homologous to the Sp3 subtype. Among the Sp transcription factors, only Sp1, Sp3, and Sp4 form a monophyletic group. Yet the domains of these Sp proteins have different evolutionary histories: the zinc fingers are related to GC-binding proteins and the B domain to TAF-binding proteins. These data suggest that for these three members, the ancestral form arose by mosaic evolution: the assortment of different domains to form a new functional protein.
Footnotes
Keywords: modular evolution, protein polymorphism, clinal variation
Address for correspondence and reprints: Douglas L. Crawford, Division of Molecular Biology and Biochemistry, School of Biological Science, 5007 Rockhill Road, University of Missouri-Kansas City, Kansas City, Missouri 64110. crawforddo{at}umkc.edu.
References
Arkhipova I. R., 1995 Promoter elements in Drosophila melanogaster revealed by sequence analysis Genetics 139:1359-1369
Azizkhan J. C., D. E. Jensen, A. J. Pierce, M. Wade, 1993 Transcription from TATA-less promoters: dihydrofolate reductase as a model Crit. Rev. Eukaryot. Gene Expr 3:229-254
Boam D. S., I. Davidson, P. Chambon, 1995 A TATA-less promoter containing binding sites for ubiquitous transcription factors mediates cell type-specific regulation of the gene for transcription enhancer factor-1 (TEF-1) J. Biol. Chem 270:19487-19494
Colgan J., J. L. Manley, 1995 Cooperation between core promoter elements influences transcriptional activity in vivo Proc. Natl. Acad. Sci. USA 92:1955-1959
Coustry F., S. Sinha, S. N. Maity, B. Crombrugghe, 1998 The two activation domains of the CCAAT-binding factor CBF interact with the dTAFII110 component of the Drosophila TFIID complex Biochem. J 331:291-297
Dennig J., M. Beato, G. Suske, 1996 An inhibitor domain in Sp3 regulates its glutamine-rich activation domains EMBO J 15:5659-5667
Ding H., A. M. Benotmane, G. Suske, D. Collen, A. Belayew, 1999 Functional interactions between Sp1 or Sp3 and the helicase-like transcription factor mediate basal expression from the human plasminogen activator inhibitor-1 gene J. Biol. Chem 274:19573-19580
Dynan W. S., R. Tjian, 1983 The promoter-specific transcription factor Sp1 binds to upstream sequences in the SV40 early promoter Cell 35:79-87
Faith D. P., 1991 Cladistic permutation test for monophyly and nonmonophyly Syst. Zool 40:366-375
Faith D. P., P. S. Cranston, 1991 Could a cladogram this short have arisen by chance alone? On the permutation tests for cladistic structure. Cladistics 7:1-28
Gill G., E. Pascal, Z. H. Tseng, R. Tjian, 1994 A glutamine-rich hydrophobic patch in transcription factor Sp1 contacts the dTAFII110 component of the Drosophila TFIID complex and mediates transcriptional activation Proc. Natl. Acad. Sci. USA 91:192-196
Hagen G., S. Muller, M. Beato, G. Suske, 1992 Cloning by recognition site screening of two novel GT box binding proteins: a family of Sp1 related genes Nucleic Acids Res 20:5519-5525
Hagen G., S. Muller, M. Beato, G. Suske, 1994 Sp1-mediated transcriptional activation is repressed by Sp3 EMBO J 13:3843-3851
Harrison S., D. Houzelstein, S. Dunwoodie, R. Beddington, 2000 Sp5, a new member of the Sp1 family, is dynamically expressed during development and genetically interacts with Brachyury Dev. Biol 227:358-372
Hata Y., E. Duh, K. Zhang, G. S. Robinson, L. P. Aiello, 1998 Transcription factors Sp1 and Sp3 alter vascular endothelial growth factor receptor expression through a novel recognition sequence J. Biol. Chem 273:19294-19303
Jolliff K., Y. Li, L. F. Johnson, 1991 Multiple protein-DNA interactions in the TATAA-less mouse thymidylate synthase promoter Nucleic Acids Res 19:2267-2274
Karchner S. I., W. H. Powell, M. E. Hahn, 1999 Identification and functional characterization of two highly divergent aryl hydrocarbon receptors (AHR1 and AHR2) in the teleost Fundulus heteroclitus. Evidence for a novel subfamily of ligand-binding basic helix loop helix-Per-ARNT-Sim (bHLH-PAS) factors J. Biol. Chem 274:33814-33824
Kennett S. B., A. J. Udvadia, J. M. Horowitz, 1997 Sp3 encodes multiple proteins that differ in their capacity to stimulate or repress transcription Nucleic Acids Res 25:3110-3117
Kingsley C., A. Winoto, 1992 Cloning of GT box-binding proteins: a novel Sp1 multigene family regulating T-cell receptor gene expression Mol. Cell. Biol 12:4251-4261
Kollmar R., K. A. Sukow, S. K. Sponagle, P. J. Farnham, 1994 Start site selection at the TATA-less carbamoyl-phosphate synthase (glutamine-hydrolyzing)/aspartate carbamoyltransferase/dihydroorotase promoter J. Biol. Chem 269:2252-2257
Lu J., W. Lee, C. Jiang, E. B. Keller, 1994 Start site selection by Sp1 in the TATA-less human Ha-ras promoter J. Biol. Chem 269:5391-5402
Mastrangelo I. A., A. J. Courey, J. S. Wall, S. P. Jackson, P. V. Hough, 1991 DNA looping and Sp1 multimer links: a mechanism for transcriptional synergism and enhancement Proc. Natl. Acad. Sci. USA 88:5670-5674
Morgenstern B., W. R. Atchley, 1999 Evolution of bHLH transcription factors: modular evolution by domain shuffling? Mol. Biol. Evol 16:1654-1663
Narayan V., R. Kriwacki, J. Caradonna, 1997 Structures of zinc finger domains from transcription factor Sp1. Insights into sequence-specific protein-DNA recognition J. Biol. Chem 272:7801-7809
Nardelli J., T. J. Gibson, C. Vesque, P. Charnay, 1991 Base sequence discrimination by zinc-finger DNA-binding domains Nature 349:175-178
Oleksiak M. F., K. Kolell, D. L. Crawford, 2001 The utility of natural populations for microarray analyses: isolation of genes necessary for functional genomic studies Mar. Biotechnol. 3:S203-S211
Pascal E., R. Tjian, 1991 Different activation domains of Sp1 govern formation of multimers and mediate transcriptional synergism Genes Dev 5:1646-1656
Pavletich N. P., C. O. Pabo, 1991 Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å Science 252:809-817
Pugh B. F., R. Tjian, 1990 Mechanism of transcriptional activation by Sp1: evidence for coactivators Cell 61:1187-1197
Pugh B. F., R. Tjian, 1991 Transcription from a TATA-less promoter requires a multisubunit TFIID complex Genes Dev 5:1935-1945
Roeder R. G., 1991 The complexities of eukaryotic transcription initiation: regulation of preinitiation complex assembly Trends Biochem. Sci 16:402-408
Rojo-Niersbach E., T. Furukawa, N. Tanese, 1999 Genetic dissection of hTAF(II)130 defines a hydrophobic surface required for interaction with glutamine-rich activators J. Biol. Chem 274:33778-33784
Suske G., 1999 The Sp-family of transcription factors Gene 238:291-300
Swofford D. L., 2000 PAUP* Phylogenetic analysis using parsimony (*and other methods). Version 4.0b8. Sinauer Associates, Sunderland, Mass
Tanese N., D. Saluja, M. F. Vassallo, J. L. Chen, A. Admon, 1996 Molecular cloning and analysis of two subunits of the human TFIID complex: hTAFII130 and hTAFII100 Proc. Natl. Acad. Sci. USA 93:13611-13616
Tjian R., T. Maniatis, 1994 Transcriptional activation: a complex puzzle with few easy pieces Cell 77:5-8
Wiley S. R., R. J. Kraus, J. E. Mertz, 1992 Functional binding of the "TATA" box binding component of transcription factor TFIID to the -30 region of TATA-less promoters Proc. Natl. Acad. Sci. USA 89:5814-5818
Wolstein O., A. Silkov, M. Revach, R. Dikstein, 2000 Specific interaction of TAFII105 with OCA-B is involved in activation of octamer-dependent transcription J. Biol. Chem 275:16459-16465