KRAB Zinc Finger Proteins: An Analysis of the Molecular Mechanisms Governing Their Increase in Numbers and Complexity During Evolution

Camilla Looman*, Magnus Åbrink†, Charlotta Mark‡ and Lars Hellman*

*The Department of Cell and Molecular Biology, Uppsala University;
†The Biomedical Center, Department of Veterinary Medical Chemistry, Swedish University of Agricultural Sciences;
‡Medical Products Agency, Uppsala


    Abstract
Krüppel-related zinc finger proteins, with 564 members in the human genome, probably constitute the largest individual family of transcription factors in mammals. Approximately 30% of these proteins carry a potent repressor domain called the Krüppel-associated box (KRAB). Depending on the structure of the KRAB domain, these proteins have been further divided into three subfamilies (A + B, A + b, and A only). In addition, some KRAB zinc finger proteins contain another conserved motif called SCAN. To study their molecular evolution, an extensive comparative analysis of a large panel of KRAB zinc finger genes was performed. The results show that both the KRAB A + b and the KRAB A subfamilies have their origin in a single member or a few closely related members of the KRAB A + B family. The KRAB A + B family is also the most prevalent among the KRAB zinc finger genes. Furthermore, we show that internal duplications of individual zinc finger motifs or blocks of several zinc finger motifs have occurred quite frequently within this gene family. However, zinc finger motifs are also frequently lost from the open reading frame, either by functional inactivation by point mutations or by the introduction of a stop codon. The introduction of a stop codon causes the exclusion of part of the zinc finger region from the coding region and the formation of graveyards of degenerate zinc finger motifs in the 3'-untranslated region of these genes. Earlier reports have shown that duplications of zinc finger genes commonly occur throughout evolution. We show that there is a relatively low degree of sequence conservation of the zinc finger motifs after these duplications. In many cases this may cause altered binding specificities of the transcription factors encoded by these genes. The repetitive nature of the zinc finger region and the structural flexibility within the zinc finger motif make these proteins highly adaptable. These factors may have been of major importance for their massive expansion in both number and complexity during metazoan evolution.


    Introduction
To a large extent, cellular differentiation is regulated by transcription factors. Several larger families of transcription factors have been identified, and in mammals, zinc finger proteins probably constitute the largest individual family of such nucleic acid–binding proteins. Zinc finger motifs of the Krüppel (Cys2His2)-type were first identified in the TFIIIA of Xenopus laevis (Miller, McLachlan, and Klug 1985) and have subsequently been found in many different species including yeast, nematodes, plants, fruit fly, spider, crab, tunicates, frog, mouse, and human beings (Schuh et al. 1986; Venter et al. 2001). The basic structural unit, the so-called zinc finger, is a conserved motif of 28 amino acids, which is often repeated within a protein (fig. 1; reviewed in Klug and Rhodes 1987). Zinc finger proteins with up to 36 such zinc finger motifs have been identified (Ruiz i Altaba, Perry-O'Keefe, and Melton 1987). Four amino acids, two cysteines and two histidines, interact directly with a zinc ion, an interaction that is essential for proper folding of the Krüppel-type zinc finger motif (fig. 1; Parraga et al. 1988; Lee et al. 1989). In addition to a zinc finger region, most of these proteins also contain a regulatory domain. In rodents and human beings, about one-third of the zinc finger genes carry the Krüppel-associated box (KRAB), a potent repressor of transcription. The KRAB domain has been identified in sequences from frog, chicken, rodents, and human beings but not in plants, fungi, or insects, and not yet in fish (Bellefroid et al. 1991; Benn et al. 1991; Margolin et al. 1994; Witzgall et al. 1994; Venter et al. 2001). The KRAB domain consists of an A box and a B box (Bellefroid et al. 1991). The KRAB A and B boxes are encoded by two separate exons, and alternative splicing of these exons has been reported (Bellefroid et al. 1991; Lovering and Trowsdale 1991; Rosati et al. 1991; Takashima et al. 2001). A recent study based on the nonconserved amino acids of the zinc finger region of Krüppel-type zinc finger genes from Drosophila melanogaster, Caenorhabditis elegans, and human beings identified 39 subfamilies of Krüppel-type zinc finger genes (Knight and Shimeld 2001). However, none of these seem to include the KRAB zinc finger genes. The human and mouse genomes contain three closely related subfamilies of KRAB zinc finger genes, one carrying the classical KRAB A box together with the classical KRAB B box, another having the classical KRAB A box and a highly divergent KRAB B box, named b, and a third carrying the classical A box only (fig. 1; Mark, Abrink, and Hellman 1999). All three subfamilies effectively repress transcription through interaction with TIF1-β, a transcriptional co-repressor involved in gene silencing through heterochromatin formation (Margolin et al. 1994; Witzgall et al. 1994; Friedman et al. 1996; Kim et al. 1996; Moosmann et al. 1996; Nielsen et al. 1999; Ryan et al. 1999; Underhill et al. 2000; Abrink et al. 2001; Lorenz, Koczan, and Thiesen 2001). Another conserved domain, called SCAN or leucine-rich domain (LeR), has been identified in a few KRAB zinc finger proteins. This domain, which is rich in leucine and glutamic acid residues, has been shown to mediate homo- and hetero-oligomerization (fig. 1; Williams, Blacklow, and Collins 1999; Sander et al. 2000; Schumacher et al. 2000). In addition, one example of a gene containing two KRAB domains has been identified (ZNF333, GenBank accession number AF372702).



Fig. 1.—A schematic representation of the overall structure of the five main subfamilies of KRAB zinc finger proteins. In panel (A) the general structures of the KRAB A + B, KRAB A + b, KRAB A, double-KRAB, and SCAN-KRAB zinc finger proteins are depicted. In panel (B) the consensus amino acid sequences of the KRAB A (pfam01352, NCBI domains), KRAB B (Mark, Abrink, and Hellman 1999), KRAB b (Mark, Abrink, and Hellman 1999), and SCAN (pfam02023, NCBI domains) domains are presented. The residues of KRAB A that are essential for interaction with TIF1-β and repression of transcription are underlined (Friedman et al. 1996), and the residues of the KRAB and SCAN domains that are thought to form helical structures are indicated with a dotted line above the corresponding sequence (Collins, Stone, and Williams 2001). In panel (C) a schematic diagram of a Krüppel-type zinc finger motif is presented. Conserved amino acid residues are indicated with their single-letter code, and the residues involved in DNA binding are shown in gray (Wolfe, Nekludova, and Pabo 2000). A filled, black circle represents the zinc ion that holds the structure together.

 
Krüppel-type zinc finger genes often show a clustered organization, probably reflecting an evolutionary history of duplication events (Dehal et al. 2001 and reviewed in Shannon et al. 1998). Furthermore, genome-sequencing projects of different species including yeast, nematodes, insects, plants, and human beings indicate that both the number of genes containing Krüppel-type zinc finger motifs and the number of these motifs within a single gene have increased throughout evolution. For example, the genome of baker's yeast, Saccharomyces cerevisiae, contains 34 Krüppel-related zinc finger genes, whereas the human genome contains 564 such genes (Venter et al. 2001). However, it is not only the number of zinc finger genes that appears to have increased throughout evolution; so has the number of zinc finger motifs within each zinc finger gene. On average, a zinc finger gene of the plant Arabidopsis thaliana contains one zinc finger motif, whereas the corresponding numbers for the zinc finger genes of S. cerevisiae, C. elegans, D. melanogaster, and Homo sapiens are 1.5, 2.5, 3.5, and 8, respectively (Venter et al. 2001). The apparent expansion of the Krüppel-related zinc finger protein family during metazoan evolution raises several important questions: Are particular characteristics of the zinc finger proteins responsible for their massive increase in number and complexity, and why are KRAB zinc finger genes only found in vertebrates and possibly only in tetrapods? Do chromatin-remodeling KRAB zinc finger proteins participate in the coordination of development and cell differentiation in organisms with larger genomes? Does the increase in numbers and complexity of the zinc finger genes correlate with increasing body size, genome size, or body complexity, or has the number of zinc finger genes fluctuated during evolution and maybe also gone through secondary reductions in number and complexity in certain evolutionary lineages? To unravel the mechanisms involved in the apparent expansion of the zinc finger gene family throughout metazoan evolution and to obtain clues about the functional consequences or advantages of this expansion, we present here a detailed comparative analysis of a large panel of KRAB zinc finger genes.


    Materials and Methods
The program CLUSTAL W (Thompson, Higgins, and Gibson 1994), which is based on the neighbor-joining algorithm, was used for all nucleotide sequence alignments and for the construction of the distance trees. Nodal support was estimated using bootstrap analyses based on 1,000 replicates. The numbers of synonymous and nonsynonymous substitutions (Sd and Nd, respectively) were computed using a modified version of the Nei-Gojobori method. The results from this analysis were then used to calculate the percentages of synonymous and nonsynonymous substitutions in relation to the total number of nucleotide substitutions (%S and %N, respectively). The numbers of synonymous and nonsynonymous nucleotide differences per site (KS and KA, respectively) were also determined using the same method, with a Jukes-Cantor correction. All nucleotide substitution analyses were performed using MEGA version 2.1 (Kumar et al. 2001). The following sequences were used for the analyses presented in this work: cKr1 (X15538), DNABPZ (L20450), FPM315 (D88827), HKr18 (AF277623 and genomic sequence), HKr19 (AF27762), HPF2 (M27878), HZF2 (X78925), HZF4 (X78927 and genomic sequence), HZF6 (also referred to as ZNF93, NM_004234), HZF12 (unpublished data), KID1 (mouse, L77247), Kid2 (AF184112), Kid3 (AF192804), HKL1 (human ortholog to KID1, also referred to as TCF17, D89928), KOX1 (X52332), KRAZ1 (AB024004), KRAZ2 (AB024005), KS1 (U56732), KZF1 (U67082), KZF2 (U67083), MZF13 (AF242376), MZF22 (AF242377), MZF31 (AF242378), NK10 (X79828), NRIF (AJ242914), pMLZ8 (U07861), pMTZ1 (L28167), RbaK (NM_021163), RITA (AF272148), rKr2 (U27186), SKAT2 (AF281141), Skz1 (AF291722), Xfin (X06021), ZBRK1 (NM_021632), ZF5128 (NM_014347), Zfp30 (Z30174), hZFP-37 (human, AF022158), mZfp-37 (mouse, X89264), Zfp93 (U46186), Zfp94 (U46187), ZFP95 (human, NM_014569), Zfp95 (mouse, U62907), Zfp354c (also referred to as AJ18, NM_023988), ZIM3 (AF365931), ZK1 (AB011414), ZNF7 (M29580), ZNF41 (X60155), ZNF43 (X59244), ZNF45 (L75847), ZNF85 (U35376), ZNF91 (NM_003430), ZNF133 (U09366), ZNF136 (U09367), ZNF140 (U09368), ZNF141 (L15309), ZNF155 (NM_003445), ZNF157 (U28687), ZNF184 (U66561), ZNF189 (U95992), ZNF195 (NM_007152), ZNF197 (NM_006991), ZNF202 (human, AF027219), Znf202 (mouse, AF292648), ZNF221 (AF187987), ZNF222 (AF187988), ZNF224 (AF187990), ZNF225 (NM_013362), ZNF226 (NM_016444), ZNF230 (NM_006300), ZNF264 (NM_003417), ZNF274 (NM_133502), ZNF304 (NM_020657), ZNF317 (AF275255), ZNF333 (AF372702), and ZNF347 (NM_032584).
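The arithmetic behind %S, %N, KS, and KA can be illustrated with a minimal Python sketch. It assumes that the counts of synonymous and nonsynonymous differences (Sd, Nd) and the numbers of synonymous and nonsynonymous sites have already been obtained; the counting step of the modified Nei-Gojobori method as implemented in MEGA is not reproduced here. The function names and the input values are hypothetical and serve only to show the calculation.

```python
import math


def percent_syn_nonsyn(sd, nd):
    """Return (%S, %N): each class as a percentage of all substitutions."""
    total = sd + nd
    if total == 0:
        raise ValueError("no substitutions observed")
    return 100.0 * sd / total, 100.0 * nd / total


def jukes_cantor(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ks_ka(sd, syn_sites, nd, nonsyn_sites):
    """Return (KS, KA) from difference counts and site counts.

    The proportions pS = Sd/S and pN = Nd/N are corrected for multiple
    hits with the Jukes-Cantor formula.
    """
    return jukes_cantor(sd / syn_sites), jukes_cantor(nd / nonsyn_sites)


# Hypothetical counts, not taken from table 1.
pct_s, pct_n = percent_syn_nonsyn(30, 10)
ks, ka = ks_ka(30, 100.0, 10, 300.0)
print(f"%S={pct_s:.1f} %N={pct_n:.1f} KS={ks:.3f} KA={ka:.3f}")
```

The correction only matters when the proportion of differing sites is large; for small proportions the corrected and uncorrected values are nearly identical.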


    Results and Discussion
An Analysis of the Evolutionary Origin of KRAB Zinc Finger Gene Subfamilies
To study the evolutionary origin of the different subfamilies of KRAB zinc finger genes, a comparative analysis of the nucleotide sequence of the zinc finger regions of 70 human, mouse, and rat KRAB zinc finger genes was performed. The zinc finger regions of the only KRAB zinc finger genes identified in species other than mammals, i.e., Xfin (frog) and cKr1 (chicken), were also included, and the zinc finger region of a non-KRAB zinc finger gene isolated from D. melanogaster (rgr) was used as an outgroup. The distance tree derived from this analysis is shown in figure 2. Interestingly, the majority of the zinc finger genes carrying only the KRAB A box and the genes carrying KRAB A + b form two independent monophyletic groups. In contrast, the KRAB A + B zinc finger genes form a coherent but paraphyletic group, which also includes the SCAN-KRAB zinc finger genes. The alignment of whole zinc finger domains is problematic due to the varying number and repetitive nature of the zinc finger motifs within the zinc finger region. The results shown in figure 2 are, however, supported by the results from a similar analysis of the entire open reading frames of these genes and a comparative analysis including over 600 individual zinc finger motifs originating from these genes (data not shown). Similar results were also obtained in a comparative analysis based on the KRAB A domain of 40 KRAB zinc finger genes (Mark, Abrink, and Hellman 1999). The relationship among these genes is thus the same, independent of which region we analyze (i.e., the KRAB A domain, the zinc finger domain, individual zinc finger motifs, or the entire open reading frame).



Fig. 2.—A distance tree showing the phylogenetic relationship among 72 KRAB zinc finger genes. The analysis is based on the nucleotide sequence of their zinc finger regions. The different KRAB zinc finger subfamilies are indicated as the A family, the A + b family, the A + B family, and the SCAN family. Members of the KRAB A + b and KRAB A + B subfamily of KRAB zinc finger genes, which lack the KRAB b, or B, exon, are indicated by an asterisk. Bootstrap support values greater than 90% are shown (1,000 replicates)

 
Based on these collected results, we conclude that the KRAB A + B family is the origin of both the KRAB A + b and the KRAB A families. At some point during evolution the exon encoding the B box diverged into b in one ancestral gene (A + b), and in another, the B exon was lost completely (A only). Subsequent duplication events resulted in the formation of families of closely related genes. This theory is further substantiated by the fact that all human members of the A + b family are located in one single cluster on chromosome 19 (ZNF45 family; Dehal et al. 2001) and the fact that the genes encoding the KRAB A zinc finger proteins have been shown to lack the exon encoding the B, or b, box completely (Vissing et al. 1995; Mark, Abrink, and Hellman 1999). Members of all three families of KRAB zinc finger genes are present in both rodents and human beings, indicating that these three families emerged prior to the major radiation of placental mammals, 60–100 MYA. The SCAN-containing KRAB zinc finger genes do not form a separate group, indicating that these genes have evolved independently of each other. The SCAN motif is most probably an independent functional domain and is found in KRAB zinc finger proteins as well as in other zinc finger proteins (Honer et al. 2001). Because both the KRAB and the SCAN domain are encoded by separate exons, the SCAN-containing KRAB zinc finger proteins might have evolved by exon shuffling between the KRAB zinc finger genes and the SCAN zinc finger genes.

Notably, a few KRAB zinc finger genes, which appear in the KRAB A + b and the KRAB A + B families in the distance tree, seem to lack a KRAB b (ZNF222, Zfp93, Zfp94, rKr2) or B (KRAZ1, KZF1, RITA, ZNF189, ZNF304) exon (all genes marked with an asterisk in fig. 2). This might reflect the loss of the exon encoding the b, or the B, box from these genes, or it might be the result of an alternative splicing event. There is strong evidence indicating that these events occurred after the formation of the different families shown in figure 2 (i.e., the KRAB A + b and A zinc finger gene families). The gene encoding ZNF222, one of the genes that appear in the A + b family but seem to lack the exon encoding KRAB b, is located within the cluster of KRAB A + b zinc finger genes on human chromosome 19. This indicates that ZNF222 is a true member of the KRAB A + b zinc finger gene family. In addition, upon careful examination of the genomic sequence located between the exons encoding KRAB A and the zinc finger region of ZNF222, a putative KRAB b exon was detected (data not shown). Although the splicing sites bordering this putative exon seem to be intact, it remains to be established whether they are used. We were unable to perform a similar analysis for the mouse genes Zfp93 and Zfp94 and the rat gene rKr2. However, the human orthologs of both Zfp93 and Zfp94 (HZF6 and ZNF45, respectively) carry a KRAB b box, which shows that these genes do belong to the A + b subfamily. Based on these results, we propose a model in which all the members of the KRAB A + b subfamily originate from one ancestral gene in which the B box diverged into b. Subsequent duplication events resulted in the formation of the KRAB A + b zinc finger gene subfamily. Similarly, the KRAB A zinc finger gene subfamily originates from an ancestral gene in which the exon encoding B was lost completely. However, as mentioned earlier, some of the genes that appear in the KRAB A + B family in the distance tree also seem to lack a KRAB B exon. To determine whether this is the result of an alternative splicing event or is due to the loss of the exon encoding the B box, we analyzed the genomic sequences located between the exons encoding KRAB A and the zinc finger region of the human KRAB zinc finger genes KRAZ1, RITA, ZNF189, and ZNF304 for putative KRAB B exons (unfortunately, there is no genomic sequence available for the rat KRAB zinc finger gene KZF1). The genomic sequences for KRAZ1 and RITA were both found to contain putative KRAB B exons. However, the splicing sites are not canonical, indicating that they may be less efficient. ZNF189 and ZNF304 were both found to contain nucleotide sequences homologous to the KRAB B box, but no open reading frame corresponding to the KRAB B domain could be identified (data not shown). The KRAB B exon has thus been lost in other genes, separate from those belonging to the KRAB A zinc finger gene family.

The KRAB B domain cannot act as a transcriptional repressor by itself. However, it has been shown to increase the repression activity of the A box (Vissing et al. 1995). On the other hand, the highly divergent b box does not seem to contribute to the repression activity of the KRAB domain (Abrink et al. 2001). The repression activity of the KRAB domain is thus only modulated by, and not dependent on, the presence of an intact B box. This enables the loss of the KRAB B box or the divergence of B into b without any significant change of function.

Mechanisms Involved in the Increase in the Total Number of Zinc Finger Genes During Evolution
The total number of zinc finger genes appears to have increased dramatically during metazoan evolution. For example, the genome of the plant A. thaliana contains 21 zinc finger genes, as compared with 34 in baker's yeast (S. cerevisiae), 68 in a nematode (C. elegans), 234 in an insect (D. melanogaster) and 564 in human beings (Goffeau 1997; The C. elegans Sequencing Consortium 1998; Adams et al. 2000; The Arabidopsis Genome Initiative 2000; McPherson et al. 2001; Venter et al. 2001). In addition, zinc finger genes are often clustered. The clusters consist of evolutionarily related zinc finger genes and are probably a result of a number of consecutive gene duplication events. The KRAB A + b zinc finger genes and the members of the ZNF91 family of KRAB A + B zinc finger genes are both good illustrations of such events. Duplications of zinc finger genes are most often duplications of single genes (Shannon et al. 1998 and references therein). However, duplications of all or part of a cluster have also been reported (e.g., human chromosome 10; Tunnacliffe et al. 1993; Jackson et al. 1996). Evolutionary analysis of zinc finger genes from mouse and human zinc finger clusters shows that different founder genes have been duplicated, lost, and selected independently in each conserved cluster since the divergence of primate and rodent lineages (Dehal et al. 2001). This indicates that duplication of zinc finger genes is an ongoing process. Following duplication of a gene there are three possible outcomes. The copy can become a nonfunctional pseudogene, retain its function (resulting in increased production of RNA or protein or both), or accumulate molecular changes that may, in time, give rise to a new function. Duplication-derived paralogous genes, therefore, allow nucleotide substitutions that cause changes in the amino acid sequence. Speciation-derived orthologous genes, on the other hand, are under selection to preserve their original function. For these genes, nucleotide substitutions that do not affect the amino acid sequence and the function of the gene are predominant. This is also true for KRAB zinc finger genes. Four orthologous pairs and seven paralogous pairs of KRAB zinc finger genes were selected for analysis of nucleotide substitution in their zinc finger regions (table 1). The analysis shows that the fraction of synonymous substitutions (%S) in the zinc finger region of orthologous genes is significantly higher than the fraction of nonsynonymous substitutions (%N; P < 0.0001; t-test; table 1). In the zinc finger region of paralogous genes the situation is reversed, and nonsynonymous nucleotide substitutions are more common than synonymous substitutions (P < 0.0001; t-test; table 1).


Table 1 Synonymous and Nonsynonymous Substitutions Between Different Orthologous and Paralogous KRAB Zinc Finger Genes from Mouse and Human Beings

 
The numbers of nucleotide substitutions at nonsynonymous (KA) and synonymous (KS) sites were also calculated for these gene pairs, and to further evaluate the sequence constraint, the ratio of nonsynonymous to synonymous substitutions was determined (table 1). A KA/KS ratio greater than 1 indicates an acceleration of protein evolution since the divergence of the two genes, whereas a KA/KS ratio less than 1 indicates selective constraint on the two genes. Complete relaxation of selection will result in a KA/KS of around 1. According to our results, the zinc finger regions of orthologous KRAB zinc finger genes are well conserved. The zinc finger regions of paralogous genes, on the other hand, are less well conserved, and a mutation within this region often leads to a change in the amino acid sequence. The KA/KS ratio is indeed significantly higher in paralogous genes compared with orthologous genes (P = 0.002; t-test), but it is not greater than or close to 1, as might be expected under relaxed selection. This is partly explained by the fact that the two cysteines and the two histidines, which are essential for correct folding of the nucleic acid–binding Krüppel-type zinc finger motif, are highly conserved. It could also reflect the fact that, following an early phase of relaxed selection or even accelerated evolution, duplicate genes gradually increase their selective constraint. In fact, the vast majority of gene duplicates with KS > 0.1 exhibit a KA/KS ratio of less than 1 (Lynch and Conery 2000).
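The interpretation of the KA/KS ratio described above can be written as a short Python sketch. The tolerance used to decide what counts as "around 1" is an assumption introduced here for illustration; the text does not specify a numerical cutoff, and the function names and example values are hypothetical.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous to synonymous substitutions per site."""
    if ks <= 0:
        raise ValueError("KS must be positive to form a ratio")
    return ka / ks


def interpret_ka_ks(ratio, tolerance=0.1):
    """Classify a KA/KS ratio following the three cases in the text.

    `tolerance` (assumed, not from the paper) defines the band around 1
    that is treated as complete relaxation of selection.
    """
    if ratio > 1.0 + tolerance:
        return "accelerated protein evolution"
    if ratio < 1.0 - tolerance:
        return "selective constraint"
    return "relaxed selection"


# Hypothetical values, not taken from table 1.
for ka, ks in [(0.05, 0.50), (0.40, 0.42), (0.90, 0.50)]:
    ratio = ka_ks_ratio(ka, ks)
    print(f"KA/KS={ratio:.2f}: {interpret_ka_ks(ratio)}")
```

A ratio well below 1 in a paralogous pair is therefore compatible with the explanation given above: conserved structural residues keep KA low even when the DNA-contacting residues change.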

In conclusion, we show that the zinc finger regions of paralogous KRAB zinc finger genes accumulate more changes in their amino acid sequence than do orthologous genes. It should be noted that the amino acids of the KRAB domain, which are involved in the interaction with TIF1-β and the transcriptional repression exerted by the KRAB domain, are very well conserved in all KRAB zinc finger proteins (Mark, Abrink, and Hellman 1999; Abrink et al. 2001). Duplication of KRAB zinc finger genes, therefore, is most likely a first step in the evolution of new transcriptional repressors. The new transcription factor will contain a well-conserved KRAB domain, responsible for transcriptional repression. However, changes in the amino acid sequence of the zinc finger region will slowly lead to altered binding specificity so that new transcription factors will appear which will regulate genes other than those regulated by the ancestral gene. The fact that the residues responsible for the specificity in sequence recognition are separated from those responsible for structural integrity of the domain (Wolfe, Nekludova, and Pabo 2000) probably allows considerable flexibility in the evolution of novel binding specificities and increases the potential for these proteins to evolve relatively freely.

Expansion of the Zinc Finger Region
Not only has the number of zinc finger genes increased throughout evolution, so has the number of zinc finger motifs within each individual gene. On average, a zinc finger gene of A. thaliana contains one zinc finger motif, whereas the corresponding numbers for S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens are 1.5, 2.5, 3.5, and 8, respectively (Venter et al. 2001). We have analyzed the phylogenetic relationship among the individual zinc finger motifs of 40 KRAB zinc finger genes. Each gene was analyzed separately, and the results clearly indicate the occurrence of internal duplications (fig. 3). The sizes of these duplications range from a single zinc finger motif to as many as six consecutive zinc finger motifs (e.g., DNABPZ in fig. 3A). In addition, these duplicated regions can be repeated at least up to four times within the protein (KZF2 in fig. 3A). Internal duplications, therefore, seem to be a common mechanism involved in the expansion of the number of zinc finger motifs carried in each zinc finger gene and thereby contribute to the evolution of new transcriptional regulators.
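As a rough illustration of how candidate internal duplications might be flagged, the following Python sketch compares all motifs of one gene pairwise and reports pairs whose sequence identity exceeds a threshold. This is a simplified stand-in and not the method used here, which relied on phylogenetic analysis of the individual motifs. The identity threshold, the function names, and the toy sequences are all assumptions.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions between two equal-length motifs."""
    if len(a) != len(b):
        raise ValueError("motifs must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def candidate_duplicates(motifs, threshold=0.8):
    """Return (i, j, identity) for motif pairs at or above the threshold.

    `threshold` is an arbitrary illustrative cutoff, not a value from
    the paper.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(motifs), 2):
        score = identity(a, b)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Toy, hypothetical motifs (shortened to 10 residues for readability).
motifs = ["ACDEFGHIKL", "ACDEFGHIKV", "WWWWWWWWWW"]
print(candidate_duplicates(motifs))
```

Runs of consecutive motifs that pair up with another run at a fixed offset would correspond to the block duplications described above; detecting such runs is left out of this sketch.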



Fig. 3.—Schematic representation of the phylogenetic relationship between the separate zinc finger motifs of 40 different KRAB zinc finger genes. Each zinc finger gene was analyzed separately, as shown for DNABPZ and HKr18 in (A). Based on their phylogenetic relationship, the individual zinc finger motifs were divided into several distinct groups, also shown in (A). Bootstrap values greater than 50% are shown (1,000 replicates). (B) and (C) show the zinc finger region of mouse and human KRAB zinc finger genes, respectively. Each circle represents a single zinc finger motif, and the stop codon is indicated by an asterisk. The numbers within the circles indicate which phylogenetic group these particular zinc finger motifs belong to. Note that the numbers indicating the phylogenetic groups are only applicable to the gene indicated. Potential internal duplications are underlined

 



 
Condensation of the Zinc Finger Region
Not only are zinc finger motifs added to the zinc finger region, they are also frequently lost or functionally inactivated. For example, degenerate zinc finger motifs, which lack one or several of the cysteines or histidines necessary for interaction with the zinc ion and proper folding, are observed in the open reading frames of approximately 80% of the genes included in this study (fig. 4). These nonfunctional zinc finger motifs can be found before, within, or after the region of functional zinc finger motifs (e.g., MZF13, MZF31, and HZF12, respectively; fig. 4). A few examples are found where the entire linker region consists of degenerate fingers (i.e., DNABPZ, MZF22, and KZF2; fig. 4). In some cases these remnants of zinc finger motifs show only a very weak resemblance to the classical zinc finger motif. These are hard to detect, becoming part of the linker region or generating a linker sequence of exactly 28 amino acids within the zinc finger region (e.g., the gap between fingers 7 and 8 of NK10; fig. 4). Furthermore, 25% of the zinc finger genes included in this study carry additional degenerate zinc finger motifs in their 3'-untranslated region (e.g., ZNF136 and DNABPZ; fig. 4). These "graveyards" of degenerate zinc finger motifs are composed of anything from 1 to 15 zinc finger motifs (figs. 4 and 5) and have most likely been part of the open reading frames of these genes. The formation of these graveyards is probably initiated by the introduction of a stop codon within the coding region. For example, this is the situation for the mouse zinc finger gene Zfp93. Alignments and phylogenetic analysis of Zfp93 and its human ortholog, HZF6 (also referred to as ZNF93), show that the last zinc finger motif in HZF6 corresponds to a zinc finger motif located in the 3'-untranslated region of Zfp93. Careful examination of the nucleotide sequence shows that this zinc finger motif has been excluded from the open reading frame of Zfp93 by the introduction of a stop codon (a change from TAC [tyrosine] to TAA [stop]; data not shown). Interestingly, mutations, including those causing a change in amino acid sequence, are more frequent in this zinc finger motif than in those within the open reading frame (data not shown). In time, this region of zinc finger motifs outside the open reading frame seems to accumulate mutations, and not even the zinc-coordinating cysteines and histidines are conserved (fig. 5). Moreover, this region seems to be much more susceptible to small deletions and insertions, causing reading frame shifts.
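The two inactivation routes described above, loss of a zinc-coordinating residue and truncation by a stop codon, can be illustrated with a minimal Python sketch. The spacing pattern C-X(2-4)-C-X(12)-H-X(3-5)-H is the commonly cited Cys2His2 consensus and is an assumption here, not a criterion stated in the text. The function names and the example sequences are hypothetical.

```python
import re

# Assumed Cys2His2 spacing: C-X(2,4)-C-X(12)-H-X(3,5)-H.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_intact_cchh(motif):
    """True if the motif keeps both cysteines and both histidines
    at the assumed consensus spacing."""
    return C2H2_PATTERN.search(motif) is not None


def first_stop_codon(cds):
    """Index (in codons) of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical 28-residue finger and a variant with the second Cys lost.
finger = "YKCEECGKAFSQSSNLTKHKRIHTGEKP"
degenerate = "YKCEESGKAFSQSSNLTKHKRIHTGEKP"
print(has_intact_cchh(finger), has_intact_cchh(degenerate))

# A TAC (tyrosine) to TAA (stop) change, as described for Zfp93,
# shown on a toy reading frame.
print(first_stop_codon("ATGTACGGT"), first_stop_codon("ATGTAAGGT"))
```

Everything downstream of the new stop codon becomes 3'-untranslated sequence, which is how a formerly coding motif could end up in a graveyard.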



Fig. 4.—A schematic representation of the structure of a panel of 72 KRAB zinc finger mRNAs. The open reading frames are depicted as gray bars, and the 3'-untranslated regions are shown by lines with numbers indicating the approximate number of nucleotides from the stop codon to the poly-A tail. An asterisk indicates that the sequence was obtained from a genomic sequence. The KRAB A and B motifs are indicated by A and B in boxes in different shades of gray. The b motif is indicated by b in an open box, and the SCAN box is marked by SCAN in an open box. Open circles and gray ovals indicate degenerate and functional zinc finger motifs, respectively. Remnant zinc finger motifs present in the 3'-untranslated sequence are indicated by open ovals. The species from which the mRNA was isolated is indicated within brackets: D (D. melanogaster), G (Gallus gallus), H (H. sapiens), M (Mus musculus), R (Rattus norvegicus) and X (X. laevis).

 


Fig. 5. A schematic representation of the graveyard-like region of 10 KRAB zinc finger genes. The open reading frames are depicted as gray bars, and the 3'-untranslated regions are shown as open bars. The KRAB A and B motifs are indicated by A and B in boxes of different shades of gray. The b motif is indicated by b in an open box. Open circles and gray ovals indicate degenerate and functional zinc finger motifs, respectively. Remnant zinc finger motifs in the 3'-untranslated sequence are indicated by open ovals. In addition, the three reading frames of the 3'-untranslated region are shown in boxes. A zinc finger motif is indicated by CCHH. Mutations in any of these four essential amino acids are indicated by a black dot.
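As a rough illustration of the CCHH criterion used in the figure, the following Python sketch labels a motif as functional or degenerate depending on whether the two cysteines and two histidines are present with typical spacing. The spacing pattern (C-X2-4-C-X12-H-X3-5-H), the function name `classify_motif`, and the 28-residue sequences are assumptions chosen for illustration; they are not taken from the genes analyzed here, and weakly conserved remnants would need a more tolerant search.

```python
import re

# Assumed spacing of a classical C2H2 finger: C-X(2-4)-C-X(12)-H-X(3-5)-H.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def classify_motif(motif):
    """Return 'functional' if all four zinc-coordinating residues are in place."""
    return "functional" if C2H2_PATTERN.search(motif) else "degenerate"


# Hypothetical consensus-like 28-residue motif (not from a specific gene).
intact = "YKCEECGKAFSQSSNLTRHQRIHTGEKP"
# The same motif with the first cysteine replaced by arginine.
lost_cysteine = "YKREECGKAFSQSSNLTRHQRIHTGEKP"
# The same motif with the last histidine replaced by asparagine.
lost_histidine = "YKCEECGKAFSQSSNLTRHQRINTGEKP"

for name, seq in [("intact", intact),
                  ("lost_cysteine", lost_cysteine),
                  ("lost_histidine", lost_histidine)]:
    print(name, classify_motif(seq))
```

A single substitution at any of the four coordinating positions is enough to move a motif from the functional to the degenerate class in this sketch, corresponding to one black dot in the figure.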

 

    Conclusions
 
The increasing number of fully characterized genomes has given strong indications that KRAB zinc finger genes have undergone a massive evolutionary expansion, both in the total number of genes and in the number of zinc finger motifs within each gene (Venter et al. 2001). Some features of zinc finger proteins may have been especially important for the apparent expansion of the zinc finger gene family in some species: zinc finger proteins in general, and KRAB zinc finger proteins in particular, are built from independent structural modules, the KRAB domain and the zinc finger domain. The zinc finger domain, in turn, is built from consecutive zinc finger motifs, which fold independently (Lee et al. 1989; Pavletich and Pabo 1991). The expansion of the KRAB zinc finger gene family is most probably a result of successive duplication events involving either individual genes or clusters of genes (Shannon et al. 1998 and references therein; Dehal et al. 2001). These duplications have probably been favored by the repetitive nature of the KRAB zinc finger genes and possibly by other repetitive sequences located in the close vicinity of these genes. In fact, the members of the ZNF91 family of KRAB A + B zinc finger genes are bracketed by repetitive sequences that might have been important for the expansion of this gene family (Eichler et al. 1998). Unequal crossing over, guided by repetitive sequences, is one potential mechanism that may cause an increase both in the number of genes and in the number of zinc finger motifs within a single gene. Slippage during replication, caused by the repetitive nature of the genes, may also contribute to the expansion of the zinc finger region.

The duplication of KRAB zinc finger genes is probably the first step in the evolution of new transcriptional repressors. The new gene, through its well-conserved KRAB domain, is able to interact with TIF1-β and repress transcription. However, as changes accumulate in the amino acid sequence of the zinc finger region, its binding specificity will slowly change. The binding specificity of the KRAB zinc finger proteins is further modulated by the addition and inactivation of entire zinc finger motifs within the zinc finger region. Together, these mechanisms generate a framework for the evolution of transcription factors with new binding specificities, which might have been essential for the expansion of the KRAB zinc finger gene family during metazoan evolution.


    Acknowledgements
 
We thank Dr. Robert Fredriksson for sharing his knowledge of MEGA and Dr. David Ardell for sharing his knowledge of phylogenetics. This work was supported by a grant from the European Commission (BIOMED2).


    Footnotes
 
William Jeffery, Reviewing Editor

Keywords: KRAB, Krüppel, zinc finger, evolution

Address for correspondence and reprints: Lars Hellman, Department of Cell and Molecular Biology, Uppsala University, BMC, Box 596, SE-751 24 Uppsala, Sweden. E-mail: Lars.Hellman@icm.uu.se


    References
 

    Abrink M., J. A. Ortiz, C. Mark, C. Sanchez, C. Looman, L. Hellman, P. Chambon, R. Losson, 2001 Conserved interaction between distinct Kruppel-associated box domains and the transcriptional intermediary factor 1 beta Proc. Natl. Acad. Sci. USA 98:1422-1426

    Adams M. D., S. E. Celniker, R. A. Holt, et al. (195 co-authors) 2000 The genome sequence of Drosophila melanogaster Science 287:2185-2195

    Bellefroid E. J., D. A. Poncelet, P. J. Lecocq, O. Revelant, J. A. Martial, 1991 The evolutionarily conserved Kruppel-associated box domain defines a subfamily of eukaryotic multifingered proteins Proc. Natl. Acad. Sci. USA 88:3608-3612

    Benn A., M. Antoine, H. Beug, J. Niessing, 1991 Primary structure and expression of a chicken cDNA encoding a protein with zinc-finger motifs Gene 106:207-212

    Collins T., J. R. Stone, A. J. Williams, 2001 All in the family: the BTB/POZ, KRAB and SCAN domains Mol. Cell. Biol 21:3609-3615

    Dehal P., P. Predki, A. S. Olsen, et al. (21 co-authors) 2001 Human chromosome 19 and related regions in mouse: conservative and lineage-specific evolution Science 293:104-111

    Eichler E. E., S. M. Hoffman, A. A. Adamson, L. A. Gordon, P. McCready, J. E. Lamerdin, H. W. Mohrenweiser, 1998 Complex beta-satellite repeat structures and the expansion of the zinc finger gene cluster in 19p12 Genome Res 8:791-808

    Friedman J. R., W. J. Fredericks, D. E. Jensen, D. W. Speicher, X. P. Huang, E. G. Neilson, F. J. R. Rauscher, 1996 KAP-1, a novel corepressor for the highly conserved KRAB repression domain Genes Dev 10:2067-2078

    Goffeau A., 1997 The yeast genome directory Nature 387:1-105

    Honer C., P. Chen, M. J. Toth, C. Schumacher, 2001 Identification of SCAN domains in four gene families Biochim. Biophys. Acta 16:441-448

    Jackson M. S., C. G. See, L. M. Mulligan, B. F. Lauffart, 1996 A 9.75-Mb map across the centromere of human chromosome 10 Genomics 33:258-270

    Kim S. S., Y. M. Chen, E. O'Leary, R. Witzgall, M. Vidal, J. V. Bonventre, 1996 A novel member of the RING finger family, KRIP-1, associates with the KRAB-A transcriptional repressor domain of zinc finger proteins Proc. Natl. Acad. Sci. USA 93:15299-15304

    Klug A., D. Rhodes, 1987 Zinc fingers: a novel protein motif for nucleic acid recognition Trends Biochem. Sci 12:464-469

    Knight R. D., S. M. Shimeld, 2001 Identification of conserved C2H2 zinc-finger gene families in the Bilateria Genome Biol 2: research 1-8

    Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Bioinformatics 17:1244-1245

    Lee M. S., G. P. Gippert, K. V. Soman, D. A. Case, P. E. Wright, 1989 Three-dimensional solution structure of a single zinc finger DNA-binding domain Science 245:635-637

    Lorenz P., D. Koczan, H. J. Thiesen, 2001 Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation Biol. Chem 382:637-644

    Lovering R., J. Trowsdale, 1991 A gene encoding 22 highly related zinc fingers is expressed in lymphoid cell lines Nucleic Acids Res 19:2921-2928

    Lynch M., J. S. Conery, 2000 The evolutionary fate and consequences of duplicate genes Science 290:1151-1155

    Margolin J. F., J. R. Friedman, W. K. Meyer, H. Vissing, H. J. Thiesen, F. J. R. Rauscher, 1994 Kruppel-associated boxes are potent transcriptional repression domains Proc. Natl. Acad. Sci. USA 91:4509-4513

    Mark C., M. Abrink, L. Hellman, 1999 Comparative analysis of KRAB zinc finger proteins in rodents and man: evidence for several evolutionarily distinct subfamilies of KRAB zinc finger genes DNA Cell Biol 18:381-396

    McPherson J. D., M. Marra, L. Hillier, et al. (104 co-authors) 2001 A physical map of the human genome Nature 409:934-941

    Miller J., A. D. McLachlan, A. Klug, 1985 Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes EMBO J 4:1609-1614

    Moosmann P., O. Georgiev, B. Le Douarin, J. P. Bourquin, W. Schaffner, 1996 Transcriptional repression by RING finger protein TIF1 beta that interacts with the KRAB repressor domain of KOX1 Nucleic Acids Res 24:4859-4867

    Nielsen A. L., J. A. Ortiz, J. You, A. M. Oulad, R. Khechumian, A. Gansmuller, P. Chambon, R. Losson, 1999 Interaction with members of the heterochromatin protein 1 (HP1) family and histone deacetylation are differentially involved in transcriptional silencing by members of the TIF1 family EMBO J 18:6385-6395

    Parraga G., S. J. Horvath, A. Eisen, W. E. Taylor, L. Hood, E. T. Young, R. E. Klevit, 1988 Zinc-dependent structure of a single-finger domain of yeast ADR1 Science 241:1489-1492

    Pavletich N. P., C. O. Pabo, 1991 Zinc finger–DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å Science 252:809-817

    Rosati M., M. Marino, A. Franze, A. Tramontano, G. Grimaldi, 1991 Members of the zinc finger protein gene family sharing a conserved N-terminal module Nucleic Acids Res 19:5661-5667

    Ruiz i Altaba A., H. Perry-O'Keefe, D. A. Melton, 1987 Xfin: an embryonic gene encoding a multifingered protein in Xenopus EMBO J 6:3065-3070

    Ryan R. F., D. C. Schultz, K. Ayyanathan, P. B. Singh, J. R. Friedman, W. J. Fredericks, F. R. Rauscher, 1999 KAP-1 corepressor protein interacts and colocalizes with heterochromatic and euchromatic HP1 proteins: a potential role for Kruppel-associated box–zinc finger proteins in heterochromatin-mediated gene silencing Mol. Cell. Biol 19:4366-4378

    Sander T. L., A. L. Haas, M. J. Peterson, J. F. Morris, 2000 Identification of a novel SCAN box-related protein that interacts with MZF1B. The leucine-rich SCAN box mediates hetero- and homoprotein associations J. Biol. Chem 28:12857-12867

    Schuh R., W. Aicher, U. Gaul, et al. (11 co-authors) 1986 A conserved family of nuclear proteins containing structural elements of the finger protein encoded by Kruppel, a Drosophila segmentation gene Cell 47:1025-1032

    Schumacher C., H. Wang, C. Honer, et al. (11 co-authors) 2000 The SCAN domain mediates selective oligomerization J. Biol. Chem 275:17173-17179

    Shannon M., J. Kim, L. Ashworth, E. Branscomb, L. Stubbs, 1998 Tandem zinc-finger gene families in mammals: insights and unanswered questions DNA Seq 8:303-315

    Takashima H., H. Nishio, H. Wakao, M. Nishio, K. Koizumi, A. Oda, T. Koike, K. Sawada, 2001 Molecular cloning and characterization of a KRAB-containing zinc finger protein, ZNF317, and its isoforms Biochem. Biophys. Res. Commun 288:771-779

    The Arabidopsis Genome Initiative. 2000 Analysis of the genome sequence of the flowering plant Arabidopsis thaliana Nature 408:796-815

    The C. elegans Sequencing Consortium. 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology Science 282:2012-2018

    Thompson J. D., D. G. Higgins, T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice Nucleic Acids Res 22:4673-4680

    Tunnacliffe A., L. Liu, J. K. Moore, M. A. Leversha, M. S. Jackson, L. Papi, M. A. Ferguson-Smith, H. J. Thiesen, B. A. Ponder, 1993 Duplicated KOX zinc finger gene clusters flank the centromere of human chromosome 10: evidence for a pericentric inversion during primate evolution Nucleic Acids Res 21:1409-1417

    Underhill C., M. S. Qutob, S. P. Yee, J. Torchia, 2000 A novel nuclear receptor corepressor complex, N-CoR, contains components of the mammalian SWI/SNF complex and the corepressor KAP-1 J. Biol. Chem 275:40463-40470

    Venter J. C., M. D. Adams, E. W. Myers, et al. (274 co-authors) 2001 The sequence of the human genome Science 291:1304-1351

    Vissing H., W. K. Meyer, L. Aagaard, N. Tommerup, H. J. Thiesen, 1995 Repression of transcriptional activity by heterologous KRAB domains present in zinc finger proteins FEBS Lett 369:153-157

    Williams A. J., S. C. Blacklow, T. Collins, 1999 The zinc finger–associated SCAN box is a conserved oligomerization domain Mol. Cell. Biol 19:8526-8535

    Witzgall R., E. O'Leary, A. Leaf, D. Onaldi, J. V. Bonventre, 1994 The Kruppel-associated box-A (KRAB-A) domain of zinc finger proteins mediates transcriptional repression Proc. Natl. Acad. Sci. USA 91:4514-4518

    Wolfe S. A., L. Nekludova, C. O. Pabo, 2000 DNA recognition by Cys2His2 zinc finger proteins Annu. Rev. Biophys. Biomol. Struct 29:183-212

Accepted for publication July 17, 2002.