Genomic Organization and Functional Characterization of the Chemokine Receptor CXCR4, a Major Entry Co-receptor for Human Immunodeficiency Virus Type 1*

Scott A. Wegner‡, Philip K. Ehrenberg§, George Chang§, Deborah E. Dayhoff§, Alex L. Sleeker§, and Nelson L. Michael‡

From the ‡Division of Retrovirology, Walter Reed Army Institute of Research and §Henry M. Jackson Foundation, Rockville, Maryland 20850

    ABSTRACT

CXCR4 is both a chemokine receptor and entry co-receptor for T-cell line-adapted human immunodeficiency virus type 1. The genomic organization and promoter function for the entire transcription unit of CXCR4 were determined. The gene contains 2 exons of 103 and 1563 base pairs (bp) interrupted by a 2132-bp intron precisely between codons 5 and 6 of the coding sequences. A transcription start site was identified 88 bp upstream of the initiation codon, and a polyadenylate addition site was identified 22 bp 3' to a polyadenylation signal. Transient expression assays defined a minimal promoter at positions -114 to +43 relative to the transcription start site. This region contains a TATA box, a nuclear respiratory factor-1 (NRF-1) site, and two GC boxes. Specific factor binding to the NRF-1 site and GC boxes was demonstrated by gel mobility shifts and DNase I footprinting. Site-directed mutagenesis showed that the NRF-1 site is crucial for promoter activity, providing the first evidence for the regulation of a signal transduction gene by NRF-1. Sequences between -689 and -191 repress CXCR4 promoter activity. Further study of these regulatory elements will be important to understanding how CXCR4 functions as both a chemokine receptor and human immunodeficiency virus type 1 entry co-receptor.

    INTRODUCTION

Cellular entry of human immunodeficiency virus type 1 (HIV-1)1 requires binding to both CD4 and one of the seven transmembrane G-protein-coupled chemokine receptors recently discovered to act as co-receptors (1-7). Viruses that infect T-cell lines (T-tropic) form syncytia (syncytium-inducing, SI), are frequently found in late-stage HIV disease, and utilize the chemokine receptor CXCR4; macrophage-tropic (M-tropic) viruses are non-syncytium-inducing (NSI), found throughout disease, and utilize CCR5 (1-9). The CC chemokines RANTES, MIP-1α, and MIP-1β are natural ligands for CCR5 (10, 11), and the CXC chemokine stromal cell-derived factor-1 (SDF-1) is the only known natural ligand for CXCR4 (1, 7, 10, 11). Ligand binding to both receptors is associated with G-protein-coupled signal transduction and leukocyte chemoattraction (1, 7), as well as partial viral entry antagonism (12). Viral entry and signal transduction are separable functions for CCR5 (13-15), but the two may be biologically related to viral pathogenesis.

CCR5 and CXCR4 are both expressed in peripheral blood T-lymphocytes and monocytes (3, 4, 10, 16), but in T-cells, CCR5 expression is largely confined to the activated, memory subset (16, 17) whereas CXCR4 expression is largely confined to the naive T-cell population (16). CXCR4 has also been reported to be expressed on peripheral blood B-cells (16). CXCR4 cell surface density on peripheral blood T-cells is rapidly up-regulated in response to phytohemagglutinin stimulation or interleukin-2 priming, whereas CCR5 shows up-regulation only in response to interleukin-2 (16, 17) similar to other CC chemokine receptors (18). Although both CCR5 and CXCR4 are expressed on monocytes, CXCR4 expression is very low in mature macrophages.2 CXCR4, but not CCR5, is expressed in typical CD4+ human neoplastic T-cell lines such as Sup-T1, H9, HUT 78, Jurkat, and A3.01, which explains the differential ability of such cells to support productive infection with T-tropic and not M-tropic virus (3, 4, 6). An exception is the PM-1 line, which produces both co-receptors and can be infected with some M-tropic as well as T-tropic HIV-1 strains (5, 20).

Evolution of co-receptor use from CCR5 to CXCR4 is coincident with progression to AIDS in approximately half of all HIV-1-seropositive subjects (21-23). The host and viral factors that govern this change in HIV-1 co-receptor use are poorly understood. We have determined the genomic organization of the CXCR4 transcription unit and characterized the basic cis-acting elements in its promoter to provide a conceptual framework for addressing these questions.

    EXPERIMENTAL PROCEDURES

Cloning and Sequencing-- A total of 6 × 10⁵ plaques from a human peripheral blood mononuclear cell (PBMC) genomic library prepared in the bacteriophage vector Lambda-DASH II (Stratagene, Inc., San Diego, CA) were screened with a 32P-random prime labeled cDNA probe (pcDNA-fusin) representing the entire coding sequences of CXCR4 (cDNA sequence position numbers 68-1134, according to GenBank accession number X71635; see Ref. 24). Three highly plaque-purified clones were then screened with a 5' 32P-end-labeled oligonucleotide probe from the 5' end of the coding sequence (cDNA sequence position numbers 17-49) resulting in a single, strongly hybridizing clone, λCXCR4-7a. Purified bacteriophage DNA from λCXCR4-7a was digested with the restriction endonuclease NotI, the 14-kilobase insert was subcloned into the pBluescriptIIKS- vector (Stratagene, Inc.) to generate pCXCR4-7a, and a 5160-bp portion of the insert containing the CXCR4 transcription unit was completely sequenced on both strands using an ABI model 373A DNA sequencer and dye-terminator reactions (Applied Biosystems, Inc., Foster City, CA). Contigs were compiled using Sequencher version 3.0 software (Gene Codes Corp., Ann Arbor, MI). The final sequence, given in Fig. 1, was deposited in GenBank (accession number AF005058). Sequence analysis was performed using the Neural Network approach for prediction of splice donor/acceptors (25) and promoters (26) using interactive software.3 Transcription factor binding site analysis was performed using the TESS and MatInspector interactive software packages.4

Genomic Southern Analysis-- High molecular weight DNA was extracted from the PBMCs of two HIV-1-seronegative human donors using the IsoQuick method (ORCA Research, Inc., Boswell, WA). Aliquots (10 µg) of these genomic DNAs and 1-µg amounts of purified, NotI-digested insert from pCXCR4-7a were separately digested to completion with restriction endonucleases BamHI and PvuII. The products were electrophoresed through a 0.8% agarose gel, transferred to a nylon membrane (Hybond-N, Amersham Corp.), probed with the 32P-random prime labeled pcDNA-fusin insert sequences, washed, and imaged using a model 850 PhosphorImager (Molecular Dynamics, Inc., Sunnyvale, CA).

5' RNA End-mapping-- Total cellular RNA was purified from the human neoplastic T-cell lines HUT 78 and PM-1 using the RNAzol B technique (Tel-Test, Inc., Friendswood, TX). RNA (50-µg amounts) was dissolved in 5 µl of water and hybridized with 100 fmol of the 5' 32P-end-labeled oligonucleotide probe PE1, 5'-CCCTCGGCGTCACTTTGCTACCTGCTGC (sequence positions 991-964 according to Fig. 1), dissolved in 1 µl of water at 58 °C for 20 min in 50 mM Tris-HCl, pH 8.3, 50 mM KCl, 10 mM MgCl2, 10 mM dithiothreitol, 1 mM each deoxyribonucleotide triphosphate (dNTP), and 0.5 mM spermidine. Parallel reactions were performed with [32P]PE1 in the absence of RNA. The reactions were then allowed to cool for 10 min at room temperature prior to incubation with 10 units of avian myeloblastosis virus reverse transcriptase (Promega, Inc., Madison, WI) in the presence of 50 mM Tris-HCl, pH 8.3, 50 mM KCl, 10 mM MgCl2, 10 mM dithiothreitol, 0.5 mM spermidine, 1.0 mM dNTP, and 2.8 mM sodium pyrophosphate at 42 °C for 30 min. The reactions were electrophoresed through a denaturing 7% polyacrylamide sequencing gel along with a sequence ladder generated with unlabeled PE1 and pCXCR4-7a template using [35S]dATP and Sequenase version 2.0 (U. S. Biochemical Corp.), dried down, and visualized by autoradiography.
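The descending coordinates given for PE1 (991-964) indicate that the primer is complementary to the sense strand. As a minimal sketch, the following Python snippet derives the sense-strand sequence the primer anneals to and checks that the quoted coordinate spans match the oligonucleotide lengths. The helper names `reverse_complement` and `span_length` are illustrative and are not taken from any software used in this work.

```python
# Illustrative helpers (hypothetical names); sequences and positions are from the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def span_length(start: int, end: int) -> int:
    """Inclusive length of a coordinate span, in either orientation."""
    return abs(end - start) + 1

PE1 = "CCCTCGGCGTCACTTTGCTACCTGCTGC"     # antisense primer, positions 991-964
RACE1 = "CCCAGCTGTTTATGCATAGA"           # sense primer 3'RACE1, positions 4537-4556

if __name__ == "__main__":
    # Sense-strand sequence expected at positions 964-991
    print(reverse_complement(PE1))
    # Coordinate spans should equal the primer lengths
    print(span_length(991, 964), len(PE1))
    print(span_length(4537, 4556), len(RACE1))
```

The same check applies to any other oligonucleotide for which both a sequence and a coordinate range are stated.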

3' RNA End-mapping-- A CXCR4-specific oligonucleotide 3'RACE1 5'-CCCAGCTGTTTATGCATAGA (sequence position 4537-4556) was used to perform 3'-rapid amplification of cDNA ends (3'-RACE) using 2 µg of RNase-free, DNase I-treated total cellular RNA from HUT 78 cells with a kit purchased from Boehringer Mannheim. Control reactions performed in the absence of RNA produced no amplified material. The specific 240-bp 3'-RACE product was identified by hybridization with primer 3'RACE2, 5'-CAGTTTTCAGGAGTGGGTTG (sequence position 4666-4685), agarose gel purified using the QIAEX II method (Qiagen, Inc., Chatsworth, CA), ligated into the vector pCR2.1 (Invitrogen, Carlsbad, CA), and the nucleotide sequence of multiple molecular clones was determined.

Promoter Mapping Experiments-- A nested series of 5' deletions anchored to a common 3' sequence and a nested series of 3' deletions anchored to a common 5' sequence were generated from sequences representing the putative CXCR4 promoter elements and directionally cloned 5' to the chloramphenicol acetyltransferase (CAT) gene in the vector pKSCAT, a homologue of pSKCAT (27). The inclusive sequence positions of each deletion clone are 5'Δ1 (261-1037), 5'Δ3 (759-1037), 5'Δ5 (836-1037), 5'Δ7 (885-1037), 5'Δ9 (909-1037), 3'Δ1 (759-992), 3'Δ2 (759-929), 3'Δ3 (759-905). A modification of 5'Δ7, NRF-1Δ, was generated in which the nucleotides GCG at sequence positions 896-898 were changed to TTT. All constructs were confirmed by DNA sequencing. Complete panels of these constructs were transfected in triplicate into A3.01, HUT 78, and Sup-T1 cells by electroporation using a method described previously (28) with the following modifications. For each electroporation, 17 µg of CAT construct was electroporated into 1.5 × 10⁷ cells with 3 µg of the β-galactosidase reporter construct pCMVβ-gal (28), 25 µg of lysate was used for each CAT assay, and between 1 and 15% of the lysate was used to determine β-galactosidase activity. Positive control transfections were performed with the HIV-1 long terminal repeat reporter plasmid pU3R-III (29) and negative control transfections were performed with parental pKSCAT. Assays were performed within the linear range of the assay (1-25%). Raw percent conversions were corrected for background by subtraction of pKSCAT activity, normalized to β-galactosidase activity to control for transfection efficiency, and expressed as a relative percentage of the pU3R-III-positive control to normalize between experiments.
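The normalization described above can be sketched in a few lines of Python. This is one plausible reading of the procedure, not the authors' actual calculation: it assumes the positive control is background-corrected and β-galactosidase-normalized in the same way as each test construct, and the function name and the numbers in the usage example are hypothetical.

```python
def relative_cat_activity(cat_pct: float, bgal: float, background_pct: float,
                          control_cat_pct: float, control_bgal: float) -> float:
    """Express CAT activity as a percentage of the positive control.

    cat_pct, control_cat_pct: raw percent conversions for the test construct
        and the pU3R-III positive control.
    background_pct: raw percent conversion for the parental pKSCAT vector.
    bgal, control_bgal: beta-galactosidase activities from the same lysates.
    Assumption: the control is corrected and normalized like the test sample.
    """
    if bgal <= 0 or control_bgal <= 0:
        raise ValueError("beta-galactosidase activities must be positive")
    corrected = (cat_pct - background_pct) / bgal
    control = (control_cat_pct - background_pct) / control_bgal
    if control <= 0:
        raise ValueError("positive control must exceed background")
    return 100.0 * corrected / control

if __name__ == "__main__":
    # Made-up values, for illustration only
    print(relative_cat_activity(cat_pct=12.0, bgal=2.0, background_pct=1.0,
                                control_cat_pct=23.0, control_bgal=2.0))
```

Dividing by the β-galactosidase signal before comparing with the control keeps differences in transfection efficiency from being read as differences in promoter strength.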

DNase I Footprinting Assays-- A plasmid containing the 5'-flanking region of CXCR4 (5'Δ3) was digested with the restriction endonucleases KpnI and XmaIII to obtain a fragment (sequence position numbers 758-933) with unique 3'- and 5'-overhanging termini, respectively. This restriction fragment was gel-purified and labeled using [α-32P]dCTP and the Klenow fragment of DNA polymerase holoenzyme (New England Biolabs, Inc.). Footprinting was performed using 16 µg of A3.01 nuclear extracts prepared as described previously (30) and DNase I (Promega, Inc.) per the protocol specified in the technical bulletin of the manufacturer. The same unlabeled restriction fragment was used to generate a Maxam-Gilbert G+A chemical sequencing ladder as described previously (31). Bovine serum albumin (16 µg, New England Biolabs, Inc.) was used in place of nuclear extracts in the control reactions. The reaction products were deproteinized, electrophoresed through a 6% denaturing polyacrylamide gel, dried down, and visualized by autoradiography.

Electrophoretic Mobility Shift Assays-- For NRF-1 studies, a 26-bp portion of the CXCR4 promoter (sequence positions 885-910) containing the consensus NRF-1 site was synthesized as two complementary oligonucleotides. A mutant form of these oligonucleotides was also synthesized (Fig. 5A). For standard binding experiments, 0.5-ng amounts of these 32P-end-labeled probes were incubated with 8 µg of A3.01 nuclear extracts at 10 °C for 20 min in a reaction buffer containing 50 µg/ml poly(dI-dC) (Pharmacia Biotech Inc.), 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol, 5% (v/v) glycerol in a total reaction volume of 20 µl. For supershift experiments, 1 µl of NRF-1 antiserum (kindly provided by Dr. R. Scarpulla, Northwestern University) was subsequently added and incubated at room temperature for an additional 30 min. As a nonspecific control, 1 µl of Sp4 antibody (Santa Cruz Biotechnologies, Inc.) was used for both NRF-1 and Sp1 EMSAs. For competition experiments, 100-fold molar excess (50 ng) of unlabeled wild-type or mutant oligonucleotide was added and allowed to incubate with the nuclear extract at room temperature for 20 min prior to the addition of labeled oligonucleotide. For GC box studies, a 46-bp portion of the CXCR4 promoter (sequence positions 842-887) containing GC boxes I and II (Fig. 1) was synthesized as two complementary oligonucleotides. An oligonucleotide with mutations in both of these GC boxes was also synthesized (Fig. 5B). The reaction conditions differed from the NRF-1 experiments in that the poly(dI-dC) concentration was increased to 250 µg/ml, 1 footprinting unit of purified Sp1 protein (Promega, Inc.) was used in place of nuclear extracts, 1 µg of bovine serum albumin was added to each binding reaction, and 1 µl of Sp1 antiserum (Santa Cruz Biotechnologies, Inc.) was used for supershifts. The DNA-Sp1 binding reactions were carried out at room temperature for 10 min. 
Binding complexes were resolved on native polyacrylamide gels and visualized by autoradiography essentially as described previously (32).

    RESULTS

Isolation and Structural Organization of the Human CXCR4 Gene-- Screening of a human PBMC genomic DNA library with a probe consisting of the entire CXCR4 coding sequence yielded three bacteriophage clones of which a single clone, λCXCR4-7a, hybridized with oligodeoxynucleotide probes representing the extreme ends of the published coding sequences (24). As this raised the possibility that the entire transcription unit was contained on the 14-kilobase pair insert of this clone, it was chosen for nucleotide sequencing. Comparison of the 5160-base pair genomic sequence obtained (Fig. 1) with the cDNA sequence suggested that this was so. The 67-base pair 5'-untranslated region of the cDNA sequence was contiguously identified in the genomic sequence between positions 971 and 1037 with the initiating methionine codon beginning at position 1038. Following 15 base pairs of coding sequence corresponding to amino acid residues M-E-G-I-S, a consensus splice donor was encountered followed by a 2132-base pair intron (sequence position 1053-3184) precisely after the third base of codon 5 at position 1052. A consensus splice acceptor was identified at the 3' terminus of the intron followed by 1044 base pairs of coding sequences (positions 3185-4228) whose open reading frame started with the first base of codon 6 (isoleucine) and ended with a TAA codon. Taken together, these two exons comprised all 352 amino acids predicted from the cDNA sequence (24). The putative polyadenylation signal, AATAAA, was identified 498 bp 3' to the TAA codon starting at position 4726 of the genomic sequence, and the remaining 21 bp of the 3'-untranslated region predicted from the cDNA sequence was identified downstream contiguous with the published cDNA.


Fig. 1.   Sequence and genomic organization of the human CXCR4 gene. Numbering for the CXCR4 gene is according to GenBank accession number AF005058. Regions matching transcriptional control consensus sequences are underlined and labeled. The consensus polyadenylate addition signal sequence AATAAA is also underlined. Deduced amino acid sequences are listed below the coding region using the single-letter code centered on the second base of each codon. Exon sequences are denoted by uppercase letters, and intron and flanking sequences are denoted by lowercase letters.

Identification of the CXCR4 Transcription Initiation and Termination Sites-- To better define the 5' and 3' boundaries of the transcription unit, RNA end-mapping studies were performed. Primer extension studies were performed with a 32P-end-labeled primer (PE1) from the 5'-untranslated region of CXCR4 using either no RNA template (lane (-)) or total cellular RNA from HUT 78 cells (lane (+)) (Fig. 2). PE1 was also used to generate a dideoxy sequencing ladder from clone 5'Δ3 (Fig. 2, lanes GATC), which was co-electrophoresed with the primer extension reactions. The mock primer extension lane revealed multiple, nonspecific extension products. However, a family of three specific extension products was identified centered on the dominant, central C residue (Fig. 2, arrow) of the nucleotide triad 5'-ACT corresponding to the sense triad of 5'-AGT at positions 949-951. Thus, the G at position 950 was designated the transcription start site/5' transcription unit border and assigned the identifier +1. Similar 5' ends were identified using total cellular RNA from A3.01 and PM-1 cells (data not shown). The CXCR4 transcription start site is 21 bp upstream of the 5' cDNA terminus identified previously (24) and 32 bp downstream of a canonical TATA box, TATAA, at sequence positions 919-923 (Fig. 1). The 3' end of the CXCR4 transcription unit was mapped using 3'-RACE to sequence position 4747 (data not shown) 22 bp downstream of a canonical polyadenylation addition signal, AATAAA, at positions 4726-4731 in agreement with the published 3' cDNA terminus (24).


Fig. 2.   Mapping of the CXCR4 transcription start site in HUT 78 cells. Primer extension was performed using oligonucleotide PE1 in the absence (-) and presence (+) of HUT 78 total cellular RNA, and the products were deproteinized and electrophoresed next to a sequencing ladder primed by PE1. The major specific primer extension product is depicted by an arrow, adjacent to the sense and complementary nucleotide sequence of the three visible product bands.

Based on these data, the CXCR4 gene comprised 2 exons of 103 and 1563 base pairs (Fig. 1, uppercase characters) interrupted by a single 2132-base pair intron (Fig. 1, lowercase characters). Exon 1 contained 88 bp of 5'-untranslated sequences followed by 15 bp of coding sequences ending precisely after codon 5. Exon 2 contained 1044 bp of coding sequences from codon 6 to the termination codon TAA followed by 519 bp of 3'-untranslated sequences.
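The segment lengths quoted above follow directly from the inclusive sequence positions in Fig. 1. The short Python sketch below recomputes them as a consistency check; the `Segment` type and the variable names are illustrative and are not part of any annotation software used in this study.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    """A named span with inclusive start and end sequence positions."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1

# Positions taken from the text (numbering per GenBank accession AF005058)
EXON1 = Segment("exon 1", 950, 1052)
INTRON = Segment("intron", 1053, 3184)
EXON2 = Segment("exon 2", 3185, 4747)
UTR5 = Segment("5'-untranslated", 950, 1037)
CDS1 = Segment("coding, exon 1", 1038, 1052)
CDS2 = Segment("coding, exon 2", 3185, 4228)
UTR3 = Segment("3'-untranslated", 4229, 4747)

coding_bp = CDS1.length + CDS2.length
codons = coding_bp // 3          # includes the TAA termination codon
amino_acids = codons - 1

if __name__ == "__main__":
    for seg in (EXON1, INTRON, EXON2, UTR5, CDS1, CDS2, UTR3):
        print(f"{seg.name}: {seg.length} bp")
    print(f"coding: {coding_bp} bp, {amino_acids} amino acids")
```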

Potential recombination in the library was ruled out by digesting both λCXCR4-7a and PBMC genomic DNAs with restriction endonucleases BamHI and PvuII and performing Southern blotting using full-length CXCR4 cDNA as a probe. The λCXCR4-7a sequence predicted a BamHI fragment of 3602 base pairs and a PvuII fragment of 2269 base pairs, which were identified by hybridization in both the clone and genomic DNA digests (data not shown).
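Predicting restriction fragment sizes from a sequence, as was done here for the Southern analysis, amounts to locating each recognition site and measuring the distances between cut positions. The Python sketch below illustrates the idea for a linear molecule using the known cleavage positions of BamHI (G^GATCC) and PvuII (CAG^CTG). The function name and the toy sequence are hypothetical; the CXCR4 sequence itself is not reproduced here, so the sketch does not recompute the 3602- and 2269-bp fragments.

```python
# Recognition site and cut offset (bases from site start on the top strand)
ENZYMES = {
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "PvuII": ("CAGCTG", 3),   # CAG^CTG (blunt)
}

def fragment_lengths(seq: str, enzyme: str) -> list[int]:
    """Return top-strand fragment lengths after complete digestion of a linear DNA."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]

if __name__ == "__main__":
    # Toy 26-bp sequence with two BamHI sites and no PvuII site
    toy = "AAAA" + "GGATCC" + "TTTTTTTT" + "GGATCC" + "CC"
    print(fragment_lengths(toy, "BamHI"))
    print(fragment_lengths(toy, "PvuII"))
```

Only fragments that overlap the probe would be detected on a Southern blot, so in practice the predicted list would be filtered by probe coordinates.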

Positional Mapping of the CXCR4 Promoter-- The CXCR4 5'-flanking region sequences were searched for known promoter elements. The sequences immediately upstream of exon 1 and the +1 site were identified as promoter and transcription start sites, respectively, by the Neural Net Promoter prediction algorithm (26). Sequences 5' to the transcription start site included a TATA box, a potential NRF-1 site, and four potential antisense GC boxes, of which the most distal pair, GC boxes III and IV, overlap (Fig. 1).

A nested series of 5' deletions was generated starting from position -689 relative to the transcription start site with the 3' anchor at position +88, and multiple replicates were transfected into A3.01 cells (Fig. 3). Although moderate CAT activity was obtained with clone 5'Δ1, a 5-fold gain in signal intensity was consistently noted with clone 5'Δ3, which lacked sequences between -689 and -191, indicative of repressor sequences within this domain. Deletion of sequences to position -114 (clone 5'Δ5), which included the distal GC boxes III and IV at positions -138 to -126, had no effect on CAT activity. Further deletion of sequences to position -65 (clone 5'Δ7), which excised the proximal GC boxes I and II at positions -88 to -66, modestly decreased CAT activity to 40% of that seen with clones 5'Δ3 and 5'Δ5, whereas deletion of sequences to position -41 (clone 5'Δ9), which excised the NRF-1 site, essentially abolished CXCR4 promoter activity.


Fig. 3.   Deletion and mutagenesis mapping of promoter elements in the 5'-flanking region of the CXCR4 gene. A3.01 cells were transiently transfected with CAT expression vectors containing variable regions of the CXCR4 5'-flanking region sequences. Construct names are on the left. Sequence positions of constructs are numbered relative to the transcription start site denoted by the arrow at +1. Results are expressed as mean + standard deviation relative to the activity of the control construct pU3R-III containing the HIV-1 promoter. The construct containing a mutated NRF-1 site is labeled NRF-1Δ. Differences in CAT activity between constructs were analyzed with the Wilcoxon matched-pairs test. Constructs 5'Δ3 and 5'Δ5 produced significantly more CAT activity than 5'Δ7 and 5'Δ9 (p < 0.05). Activity of 5'Δ7 was also significantly greater than that of 5'Δ9 (p < 0.05). The activity of 5'Δ1 was significantly less than that of 5'Δ3 (p < 0.005). The activity of NRF-1Δ was significantly lower than that of 5'Δ7 (p < 0.04). The 3' deletion constructs were all significantly different in their activity relative to each other (p < 0.01).

A nested series of 3' deletions was generated with the 5' anchor at position -191 (Fig. 3). Relative to clone 5'Δ3 (positions -191 to +88), deletion of sequences to +43 in clone 3'Δ1 had little effect on CAT activity, whereas deletion of sequences surrounding the transcription start site in clone 3'Δ2 (positions -191 to -21) and deletion of the TATA box in clone 3'Δ3 (positions -191 to -45) progressively reduced CAT activity. Taken together, the CXCR4 minimal promoter domain was mapped to positions -114 to +43 (sequence positions 836-992), which included GC boxes I and II, the NRF-1 site, the TATA box, and the transcription start site. Given that removal of the NRF-1 site in clone 5'Δ9 effectively ablated the CXCR4 promoter, a homologue to clone 5'Δ7 was generated (clone NRF-1Δ) by site-directed mutagenesis in which the core NRF-1 binding site nucleotides GCG (33) at positions 896-898 (Fig. 1) were changed to TTT. This clone, similar to clone 5'Δ9, showed essentially no CAT activity, demonstrating the critical role of the NRF-1 binding site sequences in the CXCR4 promoter. Similar results were obtained by transfection of all of the constructs shown in Fig. 3 into both Sup-T1 and HUT 78 cells.
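Two numbering systems are used throughout: absolute sequence positions (per Fig. 1) and positions relative to the transcription start site, where position 950 is +1 and there is no position 0. The Python sketch below shows the conversion under that convention and applies it to the deletion clone boundaries listed under "Experimental Procedures." The function names are illustrative only.

```python
TSS = 950  # sequence position designated +1 in the text

def to_relative(position: int, tss: int = TSS) -> int:
    """Convert an absolute sequence position to a TSS-relative position (no zero)."""
    offset = position - tss
    return offset + 1 if offset >= 0 else offset

def to_absolute(relative: int, tss: int = TSS) -> int:
    """Convert a TSS-relative position back to an absolute sequence position."""
    if relative == 0:
        raise ValueError("there is no position 0 in this numbering convention")
    return tss + relative - 1 if relative > 0 else tss + relative

# Inclusive absolute boundaries of each deletion clone, as listed in the text
CLONES = {
    "5'Δ1": (261, 1037), "5'Δ3": (759, 1037), "5'Δ5": (836, 1037),
    "5'Δ7": (885, 1037), "5'Δ9": (909, 1037),
    "3'Δ1": (759, 992), "3'Δ2": (759, 929), "3'Δ3": (759, 905),
}

if __name__ == "__main__":
    for name, (start, end) in CLONES.items():
        print(f"{name}: {to_relative(start):+d} to {to_relative(end):+d}")
```

Under this convention the minimal promoter boundaries, sequence positions 836 and 992, correspond to -114 and +43.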

Identification of Potential cis-Acting Elements by Footprint Analysis-- A restriction fragment obtained by digesting a plasmid containing sequence positions 758-933 with KpnI and XmaIII was used for DNase I footprinting analysis with A3.01 cell nuclear extracts (Fig. 4). Compared with the control lane using bovine serum albumin (lane 2), there was a general decrease in band intensity from positions -88 to -49 relative to the transcription start site to include the two proximal GC boxes and the NRF-1 site which was heavily protected (lane 1). Taken together with the reporter gene analyses, the importance of these promoter elements was further substantiated.


Fig. 4.   Binding of A3.01 cell nuclear extracts to the promoter region of CXCR4 with DNase I digestion. Lane 1 includes A3.01 nuclear extract and labeled probe; lane 2 includes bovine serum albumin and labeled probe alone. Sequence positions were identified by a Maxam-Gilbert sequence ladder (not shown). Regions of protection are given to the right of the figure with sequence positions according to GenBank accession number AF005058 and relative to the transcription start site (parentheses). Regions corresponding to GC boxes I and II and NRF-1 are labeled.

Transcription Factors NRF-1 and Sp1 Bind to the CXCR4 Promoter-- To determine if specific transcription factors corresponding to the NRF-1 site and GC boxes bound to their respective binding sequences in the CXCR4 promoter, electrophoretic mobility shift (EMSA) experiments were performed. Oligonucleotides representing CXCR4 promoter sequences containing either the wild-type or mutant NRF-1 binding sites were used as EMSA probes with A3.01 cell nuclear extracts (Fig. 5A). No complex formation was seen in the absence of extract for either the wild-type or the mutant probe (Fig. 5A, lanes 1 and 7, respectively). Wild-type probe generated one specific (C1) and two nonspecific (NS) complexes with A3.01 extract (lane 2); the specific complex was readily competed by unlabeled wild-type probe (lane 3) but not by an excess of mutant probe (lane 4). Addition of NRF-1 antiserum (lane 5) but not nonspecific Sp4 antibody (lane 6) resulted in the supershift of C1 to C2. Only nonspecific complexes were observed with the mutant probe (lane 8). These data strongly suggest that NRF-1 specifically binds to the NRF-1 binding site in the CXCR4 promoter.


Fig. 5.   Electrophoretic mobility shift assays. A, NRF-1 probe with A3.01 nuclear extracts. The positions of specific DNA-protein complexes (C1), DNA-protein-antibody complexes (C2), nonspecific complexes (NS), and free probe (P) are indicated by arrows. WT and M indicate wild-type and mutant probes, respectively, the sequences of which are listed below the figure. Putative protein binding sites are underlined, and mutations are in boldface. B, Sp1 probe with purified human Sp1 protein. Labels are as given for A.

Oligonucleotides representing CXCR4 promoter sequences containing either wild type or mutant GC box sites were also used as EMSA probes with purified Sp1 protein (Fig. 5B). Purified protein was used in lieu of nuclear extracts, as Sp1 has been previously shown to be expressed in T-cells (34, 35). A mutant oligonucleotide, containing mutations in both GC boxes, was also used (Fig. 5B). No complex formation was seen in the absence of Sp1 for both the wild type and mutant probes (Fig. 5B, lanes 1 and 6, respectively). Wild-type probe generated a strong specific (C1) complex with Sp1 protein (lane 2), which was readily competed by an excess of unlabeled wild-type probe (lane 3). Addition of Sp1 antiserum resulted in the supershift of C1 to complex C2 (lane 4). Supershift was not seen with the nonspecific Sp4 antibody. Only a very weak C1 complex was seen with the mutant probe (lane 7). Taken together, these data strongly suggest that Sp1 specifically binds to the two proximal GC boxes in the CXCR4 promoter.

    DISCUSSION

We have characterized the entire transcription unit of the human CXCR4 gene. The sequence of the gene has a very high GC content; the coding sequences are 50.3% GC, and the intron and 0.95 kilobase pairs 5' to the transcription start site are 56.9 and 53.1% GC, respectively, which is higher than 99% of human genes (36). The biologic significance of this observation is uncertain. However, the practical implication of this finding is that large portions of the gene are quite difficult to amplify by the polymerase chain reaction, which will complicate high-throughput studies that look for genotypic polymorphisms in cohorts of subjects.
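GC content figures such as those above are simple base-composition percentages. As a minimal sketch, the Python function below computes percent GC for any DNA string; the function name is illustrative, and the example uses the PE1 primer sequence from "Experimental Procedures" because the full gene sequence is not reproduced here. It therefore does not recompute the values reported for the gene.

```python
def gc_percent(seq: str) -> float:
    """Percent of unambiguous bases (A, C, G, T) that are G or C."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)

PE1 = "CCCTCGGCGTCACTTTGCTACCTGCTGC"  # primer from the 5'-untranslated region

if __name__ == "__main__":
    print(f"PE1: {gc_percent(PE1):.1f}% GC")
```

Applied to the coding, intron, and 5'-flanking segments of the deposited sequence, the same function would yield the per-region values needed for a comparison of this kind.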

Polymorphisms that result in mutant chemokine receptor proteins have been identified for CCR5 (37, 38) and CCR2B (39). CCR5 polymorphisms correlate with delayed progression to AIDS (21, 40) and decreased susceptibility to HIV-1 infection (21, 38, 40), whereas the implications for CCR2B polymorphisms are less clear (19, 39). The finding of substantial polymorphisms in the coding sequence for CXCR4 may be less likely than for the other chemokine receptors. CXCR4 mRNA is widely expressed in a variety of hematopoietic and non-hematopoietic tissues, including brain, heart, kidney, lung, and liver (41, 42), which is a markedly larger distribution than the other identified chemokine receptors. There is so far only one identified ligand, SDF-1, which produces a lethal mutation when knocked out in mice (43), and both CXCR4 and SDF-1 genes show high interspecies conservation (9, 41). All of these findings argue for the criticality of a functional CXCR4 gene product, and against the likelihood of finding significant polymorphisms in human cohorts.

The validity of our proposed genomic organization is supported by multiple lines of evidence. Primer extension analysis was performed in multiple cell lines and yielded a strong single nucleotide band with weaker bands for the adjacent nucleotides on either side, but no other significant start sites. This start site was predicted before the performance of these experiments by the Neural Network algorithm (26) and is further supported by the presence of nearby upstream elements including TATA and GC box elements. The intron-exon boundaries contain consensus splice donor and acceptor sequences, were predicted by Neural Network (25) and maintain the open reading frame as predicted by the cDNA sequences (24). The polyadenylate addition site was quite precise when sequenced in multiple clones, occurs immediately downstream of a consensus polyadenylation signal, and agrees with previously published work (24). The size of mRNA species predicted by sum of the two exons and a 100-200-base pair poly(A) tail agrees with that observed by Northern analysis (6, 24).

Analysis of the CXCR4 5'-flanking sequence showed the presence of several potential binding sites for known transcription factors. The importance of the two most proximal Sp1 sites and, to a much greater degree, the NRF-1 site for the basal promoter activity of the CXCR4 gene was shown by multiple, complementary experiments, including DNase I footprinting, transient transfection with promoter/CAT constructs, site-directed mutagenesis of the NRF-1 site, and mobility shift assays. The transient transfection experiments, deleting from both 5' and 3' ends of the 5'-flanking region, localized the critical basal promoter sequences between positions -114 and +43 relative to the transcription start site (sequence positions 836 and 992). Deletion or mutation of the sequences containing the NRF-1 site at positions -61 to -49 (sequence positions 889-901) essentially abolished CXCR4 promoter activity.

NRF-1 is a nuclear encoded gene product that has been shown to be important for the transcriptional regulation of multiple mitochondrial genes involved in organelle biogenesis and cellular respiration (33). Potential NRF-1 binding sites have also been identified in several genes important for cell maintenance, growth, and proliferation such as the genes for ornithine decarboxylase, 5-aminolevulinate synthase, bcl-2, and DNA polymerase-α, and signal transduction genes such as those for cyclophilin, calmodulin, and murine GM-CSF (33). However, the functional significance of these observations is unclear. Following submission of this manuscript, Moriuchi and colleagues (45) described the recovery of the proximal portion of the CXCR4 promoter reported here and also demonstrated the importance of NRF-1 for CXCR4 transcriptional regulation. These two reports represent the first demonstration of the role of NRF-1 in the expression of a signal transduction gene. Given the evidence for the importance of CXCR4 in cellular development and its presence on immune effector cells, it is easy to postulate that NRF-1 serves to coordinate an increase in a cell's metabolic capacity in response to inflammatory or proliferative signals, preparing the cell to migrate or divide. This increased level of cellular activation has been previously shown to be important for high level HIV-1 replication in target cells (46).

Transient transfections done in several cell lines consistently showed a reduction of promoter activity when sequences between -691 and -191 were present in the construct, suggesting the presence of negative regulatory elements in this portion of the sequence. Precise mapping of this region to delineate these elements is now in progress. An inducible repressor element could conceivably be a target for the development of therapeutic agents; if CXCR4 transcription could be blocked, entry of SI strains of HIV-1 into cells might be inhibited.

Identification of two new co-receptors that also mediate HIV-1 and SIV entry, Bonzo (47, 48) and BOB (47), as well as reports of others (49), underscores the growing complexity of HIV-1 entry into target cells. Elucidation of the regulatory mechanisms that govern the interplay of these co-receptors will be important to a fuller understanding of HIV disease.

    ACKNOWLEDGEMENTS

We thank C. Bailey and C. Wang for technical assistance, C. Drew and D. Joynes for graphics, R. Scarpulla for NRF-1 antiserum, and M. Robb and D. Birx for support and helpful discussions. We are indebted to H. Moriuchi and A. Fauci for sharing unpublished data and insights.

    FOOTNOTES

* This work was supported in part by Cooperative Agreement DAMD17-93-V-3004 between the United States Army Medical Research and Materiel Command and the Henry M. Jackson Foundation for the Advancement of Military Medicine. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) AF005058.

To whom correspondence and reprint requests should be addressed: Div. of Retrovirology, Walter Reed Army Institute of Research, 1600 E. Gude Dr., Rockville, MD 20850. Tel.: 301-762-0089; Fax: 301-762-7460; E-mail: nmichael@pasteur.hjf.org.

1 The abbreviations used are: HIV-1, human immunodeficiency virus type 1; bp, base pair(s); RACE, rapid amplification of cDNA ends; CAT, chloramphenicol acetyltransferase; PBMC, peripheral blood mononuclear cell; EMSA, electrophoretic mobility shift assay; M- and T-tropic, macrophage- and T-cell-tropic, respectively; contig, group of overlapping clones; SI, syncytium-inducing; NSI, non-syncytium-inducing; NRF-1, nuclear respiratory factor-1.

2 R. G. Collman, personal communication.

3 Software package can be obtained via the World Wide Web (http://www-hgc.lbl.gov/projects).

4 The TESS and MatInspector interactive software packages can be obtained via the World Wide Web (http://dot.imgen.bcm.tmc.edu:9331/seq-search/gene-search.html).

    REFERENCES

  1. Bleul, C. C., Farzan, M., Choe, H., Parolin, C., Clark-Lewis, I., Sodroski, J., and Springer, T. A. (1996) Nature 382, 829-832
  2. Berson, J. F., Long, D., Doranz, B. J., Rucker, J., Jirik, F. R., and Doms, R. W. (1996) J. Virol. 70, 6288-6295
  3. Alkhatib, G., Combadiere, C., Broder, C. C., Feng, Y., Kennedy, P. E., Murphy, P. M., and Berger, E. A. (1996) Science 272, 1955-1958
  4. Deng, H., Liu, R., Ellmeier, W., Choe, S., Unutmaz, D., Burkhart, M., Di Marzio, P., Marmon, S., Sutton, R. E., Hill, C. M., Davis, C. B., Peiper, S. C., Schall, T. J., Littman, D. R., and Landau, N. R. (1996) Nature 381, 661-666
  5. Dragic, T., Litwin, V., Allaway, G. P., Martin, S. R., Huang, Y., Nagashima, K. A., Cayanan, C., Maddon, P. J., Koup, R. A., Moore, J. P., and Paxton, W. A. (1996) Nature 381, 667-673
  6. Feng, Y., Broder, C., Kennedy, P. E., and Berger, E. A. (1996) Science 272, 872-877
  7. Oberlin, E., Amara, A., Bachelerie, F., Bessia, C., Virelizier, J.-L., Arenzana-Seisdedos, F., Schwartz, O., Heard, J.-M., Clark-Lewis, I., Legler, D. F., Loetscher, M., Baggiolini, M., and Moser, B. (1996) Nature 382, 833-835
  8. Maddon, P. J., Dalgleish, A. G., McDougal, J. S., Clapham, P. R., Weiss, R. A., and Axel, R. (1986) Cell 47, 333-348
  9. Doranz, B. J., Rucker, J., Yi, Y., Smyth, R. J., Samson, M., Peiper, S. C., Parmentier, M., Collman, R. G., and Doms, R. W. (1996) Cell 85, 1149-1158
  10. Raport, C. J., Gosling, J., Schweickart, V. L., Gray, P. W., and Charo, I. F. (1996) J. Biol. Chem. 271, 17161-17166
  11. Samson, M., Labbe, O., Mollereau, C., Vassart, G., and Parmentier, M. (1996) Biochemistry 35, 3362-3367
  12. Cocchi, F., De Vico, A. L., Garzino-Demo, A., Arya, S. K., Gallo, R. C., and Lusso, P. (1995) Science 270, 1811-1815
  13. Farzan, M., Choe, H., Martin, K. A., Sun, Y., Sidelko, M., Mackay, C. R., Gerard, N. P., Sodroski, J., and Gerard, C. (1997) J. Biol. Chem. 272, 6854-6857
  14. Gosling, J., Monteclaro, F. S., Atchison, R. E., Arai, H., Tsou, C.-L., Goldsmith, M. A., and Charo, I. F. (1997) Proc. Natl. Acad. Sci. U. S. A. 94, 5061-5066
  15. Atchison, R. E., Gosling, J., Monteclaro, F. S., Franci, C., Digilio, L., Charo, I. F., and Goldsmith, M. A. (1996) Science 274, 1924-1926
  16. Bleul, C. C., Wu, L., Hoxie, J. A., Springer, T. A., and Mackay, C. R. (1997) Proc. Natl. Acad. Sci. U. S. A. 94, 1925-1930
  17. Wu, L., Paxton, W. A., Kassam, N., Ruffing, N., Rottman, J. B., Sullivan, N., Choe, H., Sodroski, J., Newman, W., Koup, R. A., and Mackay, C. R. (1997) J. Exp. Med. 185, 1681-1691
  18. Loetscher, P., Seitz, M., Baggiolini, M., and Moser, B. (1996) J. Exp. Med. 184, 569-577
  19. Michael, N. L., Louie, L. G., Rohrbaugh, A. L., Schultz, K. A., Dayhoff, D. E., Wang, C. E., and Sheppard, H. W. (1997) Nat. Med. 3, 1160-1162
  20. Lusso, P., Cocchi, F., Balotta, C., Markham, P. D., Louie, A., Farci, P., Pal, R., Gallo, R. C., and Reitz, M. S., Jr. (1995) J. Virol. 69, 3712-3720
  21. Michael, N. L., Chang, G., Louie, L. G., Mascola, J. R., Dondero, D., Birx, D. L., and Sheppard, H. W. (1997) Nat. Med. 3, 338-340
  22. Sheppard, H. W., Lang, W., Ascher, M. S., Vittinghoff, E., and Winkelstein, W. (1993) AIDS 7, 1159-1166
  23. Jurriaans, S., Van Gemen, B., Weverling, G. J., Van Strijp, D., Nara, P., Coutinho, R., Koot, M., Schuitemaker, H., and Goudsmit, J. (1994) Virology 204, 223-233
  24. Loetscher, M., Geiser, T., O'Reilly, T., Zwahlen, R., Baggiolini, M., and Moser, B. (1994) J. Biol. Chem. 269, 232-237
  25. Brunak, S., Engelbrecht, J., and Knudsen, S. (1991) J. Mol. Biol. 220, 49-65
  26. Reese, M. G., Harris, N. L., and Eeckman, F. H. (1996) in Proceedings of the Pacific Symposium on Biocomputing, Kona, Hawaii (Hunter, L., and Klein, T., eds), pp. 74-75, World Scientific Publishing Co. Pte. Ltd., Singapore
  27. Michael, N. L., D'Arcy, L., Ehrenberg, P. K., and Redfield, R. R. (1994) J. Virol. 68, 3163-3174
  28. Michael, N. L., Chang, G., d'Arcy, L. A., Tseng, C. J., Birx, D. L., and Sheppard, H. W. (1995) J. Virol. 69, 6758-6769
  29. Sodroski, J., Patarca, R., Rosen, C., Wong-Staal, F., and Haseltine, W. (1985) Science 229, 74-77
  30. Dignam, J. D., Lebovitz, R. M., and Roeder, R. G. (1983) Nucleic Acids Res. 11, 1475-1489
  31. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
  32. Michael, N. L., Vahey, M. T., d'Arcy, L., Ehrenberg, P. K., Mosca, J. D., Rappaport, J., and Redfield, R. R. (1994) J. Virol. 68, 979-987
  33. Virbasius, C. A., Virbasius, J. V., and Scarpulla, R. C. (1993) Genes Dev. 7, 2431-2445
  34. Briggs, M. R., Kadonaga, J. T., Bell, S. P., and Tjian, R. (1986) Science 234, 47-52
  35. Granelli-Piperno, A., Pope, M., Inaba, K., and Steinman, R. M. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 10944-10948
  36. Zoubak, S., Clay, O., and Bernardi, G. (1996) Gene (Amst.) 174, 95-102
  37. Liu, R., Paxton, W. A., Choe, S., Ceradini, D., Martin, S. R., Horuk, R., MacDonald, M. E., Stuhlmann, H., Koup, R. A., and Landau, N. R. (1996) Cell 86, 367-377
  38. Samson, M., Libert, F., Doranz, B. J., Rucker, J., Liesnard, C., Farber, C.-M., Saragosti, S., Lapoumeroulie, C., Cognaux, J., Forceille, C., Muyldermans, G., Verhofstede, C., Burtonboy, G., George, M., Imai, T., Rana, S., Yi, Y., Smyth, R. J., Collman, R. G., Doms, R. W., Vassart, G., and Parmentier, M. (1996) Nature 382, 722-725
  39. Smith, M. W., Dean, M., Carrington, M., Winkler, C., Huttley, G. A., Lomb, D. A., Goedert, J. J., O'Brien, T. R., Jacobsen, L. P., Kaslow, R., Buchbinder, S., Vittinghoff, E., Vlahov, D., Hoots, K., Hilgartner, M. W., and O'Brien, S. J. (1997) Science 277, 959-965
  40. Dean, M., Carrington, M., Winkler, C., Huttley, G. A., Smith, M. W., Allikmets, R., Goedert, J. J., Buchbinder, S. P., Vittinghoff, E., Gomperts, E., Donfield, S., Vlahov, D., Kaslow, R., Saah, A., Rinaldo, C., Detels, R., and O'Brien, S. J. (1996) Science 273, 1856-1862
  41. Shirozu, M., Nakano, T., Inazawa, J., Tashiro, K., Tada, H., Shinohara, T., and Honjo, T. (1995) Genomics 28, 495-500
  42. Tashiro, K., Tada, H., Heilker, R., Shirozu, M., Nakano, T., and Honjo, T. (1993) Science 261, 600-603
  43. Nagasawa, T., Hirota, S., Tachibana, K., Takakura, N., Nishikawa, S., Kitamura, Y., Yoshida, N., Kikutani, H., and Kishimoto, T. (1996) Nature 382, 635-638
  44. Deleted in proof
  45. Moriuchi, M., Moriuchi, H., Turner, W., and Fauci, A. S. (1997) J. Immunol. 159, 4322-4329
  46. Zack, J. A., Arrigo, S. J., Weitsman, S. R., Go, A. S., Haislip, A., and Chen, I. S. (1990) Cell 61, 213-222
  47. Deng, H., Unutmaz, D., KewalRamani, V. N., and Littman, D. R. (1997) Nature 388, 296-300
  48. Alkhatib, G., Liao, F., Berger, E. A., Farber, J. M., and Peden, K. W. C. (1997) Nature 388, 238
  49. Clapham, P. R., and Weiss, R. A. (1997) Nature 388, 230-231


Copyright © 1998 by The American Society for Biochemistry and Molecular Biology, Inc.