(Received for publication, April 29, 1995; and in revised form, July 6, 1995)
The syntrophin family of dystrophin-associated proteins consists
of three isoforms, α1, β1, and β2, each encoded by a
distinct gene. We have cloned and characterized the mouse α1- and
β2-syntrophin genes. The mouse α1-syntrophin gene (>24
kilobases) is comprised of eight exons. The mouse β2-syntrophin
gene (>33 kilobases) contains seven exons, all of which have
homologues at the corresponding position in the α1-syntrophin gene.
Primer extension analysis reveals two transcription initiation sites in
the α1-syntrophin gene and a single site in the β2-syntrophin
gene. The sequence immediately 5′ of the transcription start sites of
both genes lacks a TATA box but is GC-rich and has multiple putative
SP1 binding sites. The α1-syntrophin gene is located on human
chromosome 20 and mouse chromosome 2, while the β2-syntrophin gene
is on human chromosome 16 and mouse chromosome 8. Analysis of the amino
acid sequence of the syntrophins reveals the presence of four conserved
domains. The carboxyl-terminal 56 amino acids are highly conserved and
constitute a syntrophin unique domain. Two pleckstrin homology domains
are located at the amino-terminal end of the protein. The first
pleckstrin homology domain is interrupted by a domain homologous to
repeated sequences originally found in the Drosophila discs-large protein.
Syntrophin is a peripheral membrane protein of Mr 58,000 that was first identified in the postsynaptic
membrane of Torpedo electric organ and subsequently shown to
be present in many mammalian tissues (1). Interest in
syntrophin came first from its location at the neuromuscular junction
and more recently from the demonstration that it is directly associated
with dystrophin, the product of the Duchenne/Becker muscular dystrophy
gene. Although the precise function of syntrophin is unknown, a
potential role for the dystrophin-associated proteins in
agrin-stimulated nicotinic acetylcholine receptor clustering has
implicated syntrophin in the process of synaptogenesis(2) .
Three different but highly conserved syntrophin isoforms encoded by
distinct genes have been identified and cloned. Each syntrophin has
approximately 50% amino acid identity with the other two(3) .
The three syntrophins can be separated into two classes based on
isoelectric point (4). The acidic isoform, α1-syntrophin
(pI 6.7), has been cloned from Torpedo, mouse, rabbit, and
human (5, 6, 7). There are two basic forms, β1- and β2-syntrophin (pI 9.0). A full-length human cDNA
encoding β1-syntrophin has been cloned, and the gene has been
localized to human chromosome 8q23-24 (8). Partial
clones encoding mouse and human β2-syntrophin have been reported
previously (5, 8).
The function of syntrophin is
likely to be related to its association with dystrophin and other
members of the dystrophin protein family (9, 10, 11, 12, 13) .
Proteins of the dystrophin family are derived from a combination of
three genes, the use of alternative promoters within these genes, and
alternative splicing. Dystrophin, the major product in skeletal muscle,
is a 427-kDa protein with an actin-binding amino terminus, 24
spectrin-like coiled-coil repeats, a cysteine-rich (CR) region, and a unique carboxyl terminus
(CT) (14, 15). Shorter forms of dystrophin that
contain CRCT and either no (Dp 71) or only a few spectrin repeats (e.g. Dp 116) are produced from the same gene via internal
promoters (for review, see (54)). Utrophin is highly related
to dystrophin but is encoded by a separate
gene (16, 17). A third gene encodes a protein of Mr
87,000 from Torpedo that has modest but
significant homology with the CRCT of dystrophin (18). The
association of syntrophin with each of these members of the dystrophin
family suggests that it has a general role in their functions at the
membrane(9) .
Most members of the dystrophin protein family exhibit highly restricted tissue distributions. The exception is utrophin, which, as its name implies, is nearly ubiquitous. Dystrophin is expressed primarily in skeletal muscle but can also be detected in cardiac muscle and brain. Dp 71 is found primarily in brain glial cells, liver, and stomach (19, 20, 21) . Dp 116 is found only in glial cells, particularly the Schwann cells(22) . Other recently described proteins derived from alternate promoters within the dystrophin gene also appear to have restricted tissue distributions(23, 24) .
The three
syntrophins also show unique expression patterns. Northern blot
analysis has shown that α1-syntrophin, like dystrophin, is
expressed primarily in skeletal muscle but is also present in cardiac
muscle, kidney, and brain (5). In skeletal muscle, the
subcellular distribution of α1-syntrophin is virtually identical to
that of dystrophin. Both proteins are associated with the sarcolemma
and concentrated at the neuromuscular junction (25).
β2-Syntrophin message is expressed at highest levels in testis,
brain, cardiac muscle, kidney, and lung (5), but only at low
levels in skeletal muscle, where the protein is restricted to the
neuromuscular junction (25). β1-Syntrophin message is found
primarily in liver, with moderate levels in kidney, skeletal muscle, and
lung but very low levels in brain and cardiac muscle (8).
These unique patterns of syntrophin are most likely a result of
tissue-specific transcription. The predominant interactions between
syntrophins and dystrophin family members may reflect these
differential tissue expressions and thus subserve the specific
functional requirements of that tissue.
As a prelude to
understanding the transcriptional control of syntrophin, we report the
cloning and characterization of the genes encoding two of the three
syntrophins. We have mapped the loci for α1-syntrophin (gene symbol Snta1) and
β2-syntrophin (gene symbol Sntb2) to
the mouse and human chromosomes in order to identify potential links
between mutations in these genes and known genetic disorders. Finally,
analysis of the full-length amino acid sequence of the syntrophins has
revealed the presence of two pleckstrin homology (PH)
domains (26), a PDZ domain (so named to indicate its presence
in postsynaptic density protein-95 (PSD-95), the Drosophila discs large tumor suppressor protein, and
the zonula occludens-1 protein (ZO-1)), and a highly conserved carboxyl-terminal syntrophin unique
(SU) domain.
To identify the
human chromosomes encoding syntrophin, a hamster-human somatic cell
hybrid panel (BIOS, New Haven, CT) was screened with the α1- and
β2-syntrophin cDNAs according to a Southern blot protocol described
previously (5).
Figure 1:
Exon/intron structure
of the α1- and β2-syntrophin genes. The lengths of the exons (open boxes) are given in number of base pairs, and the intron
lengths are given in kilobase pairs. Introns are not drawn to scale.
The lengths of all exons and α1-syntrophin introns IV and VII were determined by sequencing. Two sizes for
α1-syntrophin exon 1 are possible, depending on the transcription
start site. Alternative splicing potentially produces two sizes of
α1-syntrophin exon 6 (see text). The lengths of the remaining
α1-syntrophin introns and of β2-syntrophin intron VI were
estimated by restriction mapping of genomic clones, and the lengths of
β2-syntrophin introns I-V were determined
by PCR of mouse genomic DNA. The positions of the start (ATG)
and stop (TGA) codons are shown.
The α1-syntrophin gene is
over 24 kb in length and contains eight exons. The smallest is exon 5
(131 bp) and the longest is exon 8 (580 bp). The length of exon 1 is
either 367 or 408 bp depending on the site used for transcription
initiation (see below). The length of exon 6 is dependent on the 3′
splice site used by intron 5. One of the cDNA clones previously
isolated, BC10, encoded an additional 4 amino acids (SSAH) not present
in three other independently isolated clones (5). The 12
nucleotides encoding this sequence could be included in exon 6 by using
an alternative 3′ splice site (Fig. 2). This splice occurs at a
3′ TG instead of the highly conserved AG dinucleotide and is therefore
likely to be a rare splicing event. The additional 12 nucleotides were
observed in only one of four mouse cDNA clones and are not present in
the Torpedo (5), rabbit (6), or human (7)
α1-syntrophin cDNAs, further suggesting that this is a
rare splicing event. Exon 8 contains the TAG stop codon followed by 490
bp of 3′-untranslated sequence, which includes the polyadenylation
signal sequence.
Figure 2:
Intron border sequences of the α1- and
β2-syntrophin genes. The nucleotide and corresponding amino acid
exon sequences immediately before and after each intron are shown in upper case. Intronic sequence is shown in lower case.
The conserved GT and AG dinucleotides that begin and end each intron
are in boldface. Intron 5 of the α1-syntrophin gene
contains a second 3′ splice site 12 nucleotides prior to the first
splice site that can result in the incorporation of 4 additional amino
acids (see text).
The β2-syntrophin gene is over 33 kb long and
contains seven exons. The smallest is exon 4 with 143 bp, and the
largest, exon 7, is over 1690 bp. The exact size of exon 7 is unknown
because no polyadenylation signal sequence was found in either the
genomic clone or the 3′ end of the cDNA (5). The positions of
the introns relative to the amino acid sequence and the exon/intron
border sequence of both syntrophin genes are shown in Fig. 2.
All introns have the conserved GT and AG dinucleotides present at the
donor and acceptor sites, respectively.
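The GT/AG rule described above, and the way an alternative 3′ splice site ending in TG departs from it, can be illustrated with a minimal Python sketch. The intron sequences below are hypothetical and were invented only for illustration; they are not the sequences shown in Fig. 2, and the function names are assumptions of this sketch.

```python
def splice_dinucleotides(intron: str) -> tuple[str, str]:
    """Return the first two (donor) and last two (acceptor) bases of an intron."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have distinct donor and acceptor sites")
    return seq[:2], seq[-2:]


def follows_gt_ag_rule(intron: str) -> bool:
    """True if the intron begins with GT and ends with AG."""
    donor, acceptor = splice_dinucleotides(intron)
    return donor == "GT" and acceptor == "AG"


# Hypothetical sequences (assumptions, not data from the paper).
upstream_part = "gtaagtcctttctg"   # ends in tg, the alternative acceptor
last_12_nt = "tcctctgcccag"        # 12 nt ending in the conserved ag

# Usual splice: the intron runs through the conserved AG.
normal_intron = upstream_part + last_12_nt
# Alternative splice: the intron ends 12 nt earlier, so those 12 nt join the exon.
alternative_intron = upstream_part

print(follows_gt_ag_rule(normal_intron))         # conforms to the GT/AG rule
print(splice_dinucleotides(alternative_intron))  # acceptor is TG, not AG
```

In this toy case the two introns differ by exactly 12 nucleotides, which is the length needed to encode 4 additional amino acids.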
The exon sequences of the
α1-syntrophin and β2-syntrophin genes were identical to the
sequences of the previously characterized cDNAs (5). The cDNA
for mouse β2-syntrophin previously reported (5) was missing
a substantial portion of the 5′ sequence. Genomic clones containing
β2-syntrophin exon 1 extended the cDNA sequence by 327 bp,
including 264 bp of coding region (88 amino acids). We have also
isolated a β2-syntrophin cDNA clone with sequence corresponding to
that of exon 1. The cDNA sequence contains the exon 1 coding sequence,
but its 5′ untranslated region is 15 bp shorter than that of the
genomic sequence (data not shown). The full-length amino acid sequence
of β2-syntrophin derived from the exon 1 sequence coupled with the
previously reported cDNA sequence (5) is shown in Fig. 3. The additional amino acids derived from the genomic
sequence are underlined. The first 5′ methionine codon in
frame with the previously reported β2-syntrophin sequence is in a
context favorable for translation initiation (32). The
β2-syntrophin encoded by this full-length sequence is 520 amino
acids long with a calculated molecular mass of 56,388.5 Da. The
full-length mouse β2-syntrophin amino acid sequence shares 46%
identity with mouse α1-syntrophin and 55% identity with human
β1-syntrophin. β2-Syntrophin was classified as a basic (β)
form because the partial sequence had indicated a high isoelectric
point. The calculated pI of 8.7 for the full-length protein indicates
that it is correctly placed among the β-syntrophins.
Figure 3:
Complete amino acid sequence of mouse
β2-syntrophin. The complete sequence of mouse β2-syntrophin was
derived from the novel amino-terminal 88 amino acids translated from
the genomic sequence (underlined) linked to the previously
published partial cDNA
sequence (5).
Figure 4:
Primer extension analysis of the
transcription initiation sites of the α1- and β2-syntrophin
genes. Primer extension of mouse skeletal muscle RNA using a primer to
α1-syntrophin gives two bands (arrows) separated by 41
nucleotides (lanes 1 and 2). Lanes 1 and 2 represent primer extension products obtained with two
different protocols (see "Experimental Procedures"). A
potential third band seen in lane 1 was not confirmed in
either the second primer extension (lane 2) or in ribonuclease
protection assays (data not shown). Extension of mouse testis RNA with
a primer to β2-syntrophin results in a single major band (lane
3). The marker sequence (M) was derived from M13mp18 and
is loaded in the order GATC.
Figure 5:
Promoter region sequence of the α1-
and β2-syntrophin genes. The nucleotide sequence 5′ of the
translation initiation codon is shown for the
α1-syntrophin gene (A) and the β2-syntrophin gene (B). The location
of the transcription start sites (bent arrows) and selected
putative regulatory elements (underlined) are shown. The
α1-syntrophin promoter contains an 8-nucleotide inverted repeat (underlined by arrows). The nucleotides are numbered
relative to the translation initiation
codon.
Figure 6:
Mapping of Snta1 to mouse
chromosome 2. A, HincII restriction enzyme pattern of M. spretus (S) and C57BL/6J (B) genomic DNAs
probed with α1-syntrophin cDNA. The molecular sizes of the fragments in
kb are indicated. B, haplotype analysis of chromosome 2
genetic markers in (C57BL/6J × M. spretus)F1 × M. spretus back-cross
mice showing linkage and relative position of Snta1. Closed boxes indicate inheritance of the C57BL/6J (B)
allele, and open boxes indicate the inheritance of the M.
spretus (S) allele from the (C57BL/6J × M. spretus)F1
parent. Gene names and references to these
loci can be found in GBASE (48). The first two columns indicate
the number of back-cross progeny with no recombinations. The following
columns indicate recombinational events between adjacent loci
(signified by a change from open box to closed box).
The number of recombinants is listed below each column, and the
recombination frequency (REC %) between adjacent loci is
indicated.
A HincII RFLP was also identified for Sntb2 (encoding
β2-syntrophin) by the presence of a 2.2-kb genomic DNA fragment in
C57BL/6J or its absence in M. spretus (Fig. 7A). This allele was characterized in the 88
DNAs from the (C57BL/6J × M. spretus)F1 × M. spretus back-cross panel. Haplotype analysis
of these mapping data was performed and is indicated in Fig. 7B. The Sntb2 locus is closely linked to D8Bir25 (DNA fragment BIR 25) on mouse chromosome 8. The
calculated map distances between Sntb2 and adjacent loci D8Bir25 and D8Bir26 (DNA fragment BIR 26), including
95% confidence limits, were determined: D8Bir25 - 1.1 ±
1.1 cM - Sntb2 - 5.7 ± 2.5 cM - D8Bir26.
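As a hedged illustration of how a back-cross map distance relates to recombinant counts, the Python sketch below converts a count of recombinants among the progeny into a recombination percentage (taken as cM for short intervals) with a binomial standard error. The recombinant counts of 1 and 5 among 88 animals are assumptions chosen for illustration; they are not stated in the text, and the text does not specify which estimator was used for its reported limits.

```python
import math


def map_distance_cm(recombinants: int, progeny: int) -> tuple[float, float]:
    """Return (recombination %, binomial standard error %) for a back-cross interval."""
    if progeny <= 0 or not 0 <= recombinants <= progeny:
        raise ValueError("need 0 <= recombinants <= progeny and progeny > 0")
    r = recombinants / progeny
    se = math.sqrt(r * (1.0 - r) / progeny)
    return 100.0 * r, 100.0 * se


PROGENY = 88  # size of the back-cross panel given in the text

# Assumed recombinant counts (hypothetical, not taken from the paper).
d8bir25_sntb2 = map_distance_cm(1, PROGENY)
sntb2_d8bir26 = map_distance_cm(5, PROGENY)

for label, (dist, err) in (("D8Bir25 to Sntb2", d8bir25_sntb2),
                           ("Sntb2 to D8Bir26", sntb2_d8bir26)):
    print(f"{label}: {dist:.1f} +/- {err:.1f} cM")
```

Under these assumed counts the sketch gives intervals of about 1.1 ± 1.1 and 5.7 ± 2.5, which matches the figures quoted above, although that agreement alone does not establish how the published values were derived.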
Figure 7:
Mapping of Sntb2 to mouse
chromosome 8. This figure is similar to Fig. 6. A, HincII restriction enzyme pattern for M. spretus (S) and C57BL/6J (B) genomic DNAs probed with
β2-syntrophin cDNA. The molecular sizes of the fragments in kb are
indicated. B, haplotype analysis of chromosome 8 genetic
markers in (C57BL/6J × M. spretus)F1 × M. spretus back-cross mice showing linkage and relative
position of Sntb2.
The
human chromosomes containing the α1- and β2-syntrophin genes
were identified using hamster-human and mouse-human hybrid cell lines.
Southern blot and PCR analyses indicated that SNTA1 is located
on human chromosome 20, and SNTB2 is on human chromosome 16
(data not shown).
Figure 8:
Alignment of the syntrophin PDZ domain
with the PDZ domain of other proteins. A, identical and
structurally conserved amino acids are shaded. Proteins
included are mouse α1- and β2-syntrophin (5); human
β1-syntrophin (8); Torpedo α1-syntrophin (5); Drosophila discs-large
protein (49); human discs-large protein (36); rat
postsynaptic density protein (psd95) (46); rat
nitric-oxide synthase (50); human lymphocyte chemoattractant
factor (GenBank number M90391); human tyrosine
phosphatase (51); mouse zonula occludens-1 protein (ZO-1) (52); and human erythrocyte
p55 (53). B, structural diagram of syntrophin showing
the relative locations of the PH domains, the PDZ domain, and the SU
domain.
The interaction between the syntrophins and dystrophin (and
other members of the dystrophin family) is
direct (7, 10, 11, 12, 13).
One site of interaction occurs at a region of the cysteine-rich domain
encoded by exon 74 of the dystrophin
gene (11, 12, 13), an exon that is subject to
alternative splicing (37). Furthermore, the syntrophins can
interact with more than one member of the dystrophin family. For
example, α1-, β1-, and β2-syntrophins each bind homologous
sequences of dystrophin, utrophin, and 87-kDa protein (7).
Thus, the specificity of the interaction between the syntrophin
isoforms and the dystrophin family members that is apparent from the
differential localization of α1- and β2-syntrophins in skeletal
muscle (25) must depend on other factors. These may include
posttranslational modifications, associations with other proteins, and
transcriptional regulation of expression. The tissue distribution
patterns of the syntrophin isoforms and members of the dystrophin
protein family implicate tissue-specific transcription as a potential
regulator of the dystrophin/syntrophin isoform association. As a first
step in investigating syntrophin transcriptional regulation, we have
isolated and characterized genomic clones encoding
α1- and β2-syntrophin.
The α1-syntrophin gene is over 24 kb in
length and contains seven introns. The β2-syntrophin gene is over
33 kb long and contains at least six introns. Comparison of the
positions where the introns interrupt the coding sequence shows that
all introns in the β2-syntrophin gene occur at the corresponding
position in the α1-syntrophin gene (Fig. 1 and Fig. 2). The
α1-syntrophin gene contains an additional
intron dividing the sequence that in β2-syntrophin is a continuous
first exon (Fig. 1). This positioning of introns at similar
locations is frequently observed in genes derived from a common
ancestor.
The α1-syntrophin gene has two major transcription
initiation sites, 41 nucleotides apart. cDNAs with 5′ ends near each
start site were obtained during cDNA cloning (5). The site at
position -76 is 35 nucleotides 5′ of the start of cDNA clone BC5,
and the site at -117 extends cDNA clone BC10 by 7 nucleotides.
Because the two transcripts differ by only 41 nucleotides, Northern
blots did not resolve two bands; rather, a single broad band was
observed at 2.4 kb (5). A single transcription start site
was identified in the β2-syntrophin gene. This suggests that the
three sizes of message observed on Northern blots (2, 5, and 10 kb) (5) must result from differently sized 3′-untranslated regions
and/or incomplete removal of introns.
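The arithmetic linking the two α1-syntrophin start sites, the two possible exon 1 lengths, and the unresolved Northern band can be checked with a short Python sketch. The start-site positions, exon lengths, and message size are taken from the text; the cDNA 5′ end positions and the percentage are derived here and are not stated in the paper, and the variable names are assumptions of this sketch.

```python
def upstream_distance(pos_a: int, pos_b: int) -> int:
    """Nucleotides separating two positions numbered on the same upstream (negative) scale."""
    return abs(pos_a - pos_b)


# Values quoted in the text (positions relative to the translation initiation codon).
TSS_PROXIMAL = -76
TSS_DISTAL = -117
EXON1_LENGTHS_BP = (367, 408)
MESSAGE_NT = 2400  # the single broad band at 2.4 kb

tss_spacing = upstream_distance(TSS_PROXIMAL, TSS_DISTAL)
exon1_difference = EXON1_LENGTHS_BP[1] - EXON1_LENGTHS_BP[0]

# Derived, not stated in the text: inferred 5' ends of the two cDNA clones.
bc5_start = TSS_PROXIMAL + 35   # site at -76 lies 35 nt 5' of the BC5 start
bc10_start = TSS_DISTAL + 7     # site at -117 extends BC10 by 7 nt

# Size difference between the two transcripts as a fraction of the message.
relative_difference = tss_spacing / MESSAGE_NT

print(tss_spacing, exon1_difference, bc5_start, bc10_start)
print(f"{relative_difference:.1%} of the message length")
```

The spacing between the start sites equals the difference between the two exon 1 lengths, and it amounts to under 2% of the message length, which is consistent with the two transcripts running as a single broad band.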
The promoter region of both
syntrophin genes is very GC rich, contains no identifiable TATA box,
and has multiple putative SP1 binding sites. This type of promoter is
often present in housekeeping genes, although the Dp 71 gene has a
similar promoter (38). Expression of syntrophin mRNAs varies
greatly among different tissues, suggesting that tissue-specific
regulatory elements are present within the gene. A candidate for such
an element is the E-box motif present in both genes. Surprisingly, the
β2-syntrophin gene, which is expressed in many tissues but only at
low levels in muscle, contains a putative CArG box element that is
often present in genes encoding muscle-specific proteins. Functional
analysis of each promoter will enable identification of the elements
responsible for the unique patterns of expression of
α1- and β2-syntrophin.
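A scan for the promoter features named above can be sketched in Python as follows. The consensus patterns used here (TATA box, GC box for SP1, E-box CANNTG, and CArG box CC(A/T)6GG) are common textbook forms and are an assumption of this sketch; the text does not state which criteria were applied to the syntrophin promoters. The test sequence is hypothetical and is not taken from either gene.

```python
import re

# Textbook consensus patterns (assumed; not necessarily the criteria used in the paper).
MOTIFS = {
    "TATA box": r"TATA[AT]A[AT]",
    "SP1 (GC box)": r"GGGCGG|CCGCCC",  # consensus on either strand
    "E-box": r"CA[ACGT]{2}TG",
    "CArG box": r"CC[AT]{6}GG",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def find_motifs(seq: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each motif, allowing overlapping matches."""
    seq = seq.upper()
    return {
        name: [m.start() for m in re.finditer(f"(?=(?:{pattern}))", seq)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical GC-rich, TATA-less sequence invented for illustration.
promoter = "GCGGGCGGCACAGCTGCGCCTTATATGGCGCCCGCCCG"

hits = find_motifs(promoter)
print(f"GC fraction: {gc_fraction(promoter):.2f}")
for name, positions in hits.items():
    print(name, positions)
```

A pattern scan of this kind only flags candidate elements; as the text notes, functional analysis of each promoter is needed to identify the elements that actually control expression.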
To identify potential associations between known genetic
disorders and mutations in syntrophin genes, we have mapped the
location of the α1- and β2-syntrophin genes (Snta1 and Sntb2) to both human and mouse chromosomes. Human chromosome
mapping using somatic cell hybrids indicated that the SNTA1 and SNTB2 loci are located on human chromosomes 20 and 16,
respectively. Our data are in agreement with results from the Kunkel
laboratory (7, 8), which show that the human genes for
α1-, β1-, and β2-syntrophins are located at 20q11,
8q23-24, and 16q23, respectively.
The mouse syntrophin genes
were mapped to loci that are part of long conserved linkage groups
between human chromosome 20 and mouse chromosome 2 (for
α1-syntrophin) or human chromosome 16 and mouse chromosome 8 (for
β2-syntrophin). This information enabled us to compare these
locations with those of known mouse neuromuscular genetic disorders. No
likely candidates were found within a reasonable distance of the
α1-syntrophin gene. A potential candidate for a β2-syntrophin
mutation may be the myodystrophic (Myd) mouse (39). Myd harbors the mutation responsible for the myodystrophic
phenotype and has been mapped to chromosome 8 at a site 3 cM from
the β2-syntrophin gene. Given the error limits of genetic mapping,
the Myd locus and the β2-syntrophin gene cannot be
resolved as separate genes. Experiments are currently underway to
determine if β2-syntrophin is the gene product altered by the Myd mutation.
Analysis of the amino acid sequence of all
three syntrophins shows that syntrophin is comprised of four protein
domains. Each syntrophin contains two PH domains, a PDZ domain, and a
highly conserved, carboxyl-terminal SU domain. The PH domain is a
sequence of approximately 100 residues found in many proteins involved
in intracellular signaling (26). Nuclear magnetic resonance and
crystal structure analyses have shown that this domain in
β-spectrin and dynamin consists of seven strands of β-sheet and
one α-helix (40, 41). In all three syntrophins,
the amino-terminal PH domain is interrupted by the insertion of the PDZ
domain in the loop connecting the c and d β-sheet
strands. Thus, the structure of this PH domain is unlikely to be
disrupted by the presence of intervening sequence at this site. Only a
few other proteins with PH domains contain insertions at this
site (26). The largest of these is the 340-amino acid insertion
in the PH domain of phospholipase Cγ, which contains two SH2
domains and an SH3 domain. The syntrophin PDZ domain is highly conserved among
the three isoforms and across species from Torpedo to human (Fig. 8). The SU region is also highly conserved among the
isoforms and across species (5).
The roles these domains
play in the function of syntrophin are not understood, but their elucidation will likely be
aided by functional studies of these motifs in other proteins. The PH
domain has been implicated in β-adrenergic receptor kinase
association with G proteins (Gβγ) (42),
phospholipase C binding to phosphatidylinositol
4,5-bisphosphate (43),
spectrin binding to
membranes (44), and protein kinase C association with
Bruton's tyrosine kinase (45). Thus, the presence of two
PH domains in syntrophin raises the possibility that this protein, and
therefore the dystrophin complex, may mediate intracellular signaling
pathways.
The PDZ domain may be involved in targeting syntrophin to the membrane in a manner similar to that proposed for PSD-95(46) . Recently, two proteins containing PDZ domains have been shown to bind erythroid protein 4.1(36, 47) . Thus, syntrophin could potentially associate with members of the protein 4.1 family. Since syntrophin is the only protein containing a PH domain or a PDZ domain that is known to bind dystrophin, it is unlikely that either of these domains alone is responsible for the syntrophin-dystrophin association. Rather, it appears that the PH and PDZ domains must work in harmony to bind dystrophin or that the dystrophin binding site is located in the SU domain. The immunoprecipitation of dystrophin family proteins by the carboxyl-terminal two thirds of syntrophin (7) further supports the SU region as the dystrophin binding domain. Further studies of the syntrophin domains will allow us to identify the binding sites for dystrophin and potentially for other proteins of the dystrophin complex.
The nucleotide sequences reported in this paper have been submitted to the GenBank(TM)/EMBL Data Bank with accession numbers U00678, U30900, and U30901.