The Extent of Nucleotide Polymorphism is Highly Variable Across a 3-kb Region on Plasmodium falciparum Chromosome 2

Somchai Jongwutiwes*, Chaturong Putaporntip*, Robert Friedman† and Austin L. Hughes†

*Department of Parasitology, Faculty of Medicine, Chulalongkorn University, Bangkok;
†Department of Biological Sciences, University of South Carolina


    Abstract
 
Genomic nucleotide polymorphism in the virulent human malarial parasite Plasmodium falciparum was surveyed by sequencing a 3-kb region of chromosome 2, including the MSP4 and MSP5 genes, from 21 isolates. Extensive sequence polymorphism was observed in the coding regions of these genes and in the region downstream of MSP5, and the average pairwise divergence time of haplotypes in this region was estimated to be at least about 200,000 years. However, no nucleotide polymorphism was found in the introns, and polymorphism was much reduced in the intergenic region. Over the entire region, nucleotide diversity was negatively correlated with a nucleotide content skewed toward thymine. Together with the previous evidence of limited nucleotide polymorphism in introns of P. falciparum, these data suggest the existence of a mechanism suppressing single-nucleotide polymorphism in regions of the P. falciparum genome with highly skewed nucleotide content.


    Introduction
 
The population history of Plasmodium falciparum, the most virulent of human malarial parasites, has been controversial (Hughes and Verra 1998; Rich et al. 1998; Hughes and Verra 2001; Volkman et al. 2001). There is evidence that polymorphisms at certain antigen-encoding loci have been selectively maintained for millions of years, suggesting that no extreme population bottleneck has recently occurred in this species (Hughes 1991, 1992; Hughes M. K. and Hughes A. L. 1995; Verra and Hughes 2000). A survey of polymorphism at neutrally evolving loci yielded the estimate that the effective population size of this species has been of the order of 10^5 for at least the past few hundred thousand years (Hughes and Verra 2001). Furthermore, a recent study of single-nucleotide polymorphisms (SNPs) at 204 loci on chromosome 3 (Mu et al. 2002) revealed extensive polymorphism and yielded estimates of the age and effective population size of this species that were very similar to those of Hughes and Verra (2001). In contrast, sequencing of 25 introns from eight isolates of P. falciparum revealed almost no polymorphism, suggesting a very recent common ancestor for the species (Volkman et al. 2001). To understand the causes behind these apparently contradictory results, we sequenced, from 21 isolates, an approximately 3-kb region of chromosome 2 that includes two closely linked protein-coding genes, MSP4 and MSP5, each of which contains one intron. We examined the pattern of nucleotide substitution in exons, introns, and intergenic regions, both to compare the extent of nucleotide diversity in these different genomic regions and to identify the factors associated with differences among these regions.

Merozoite surface protein 4 (MSP4) and MSP5 are both surface proteins of the merozoite, the stage of the parasite that infects red blood cells (Marshall, Tieqiao, and Coppel 1998; Wu et al. 1999). In addition, there is evidence that MSP4 is expressed on the surface of both the sporozoite, the stage of the parasite transmitted from mosquito to vertebrate hosts, and the liver stage (Bottius et al. 1996). The loci encoding these molecules lie within an approximately 3,000-bp region on chromosome 2, and each locus includes a single short intron (Marshall, Tieqiao, and Coppel 1998) (fig. 1a). We sequenced the coding region and intron of each gene, along with the intergenic region and a portion of the region downstream of the MSP5 gene, from 21 isolates of P. falciparum, which were mostly derived from field populations in Thailand.



 
Fig. 1.—a, Schematic map of the MSP4-MSP5 region of chromosome 2 of P. falciparum and plot of TA skew in a sliding window of 50 aligned nucleotide sites in the MSP4-MSP5 region. b, Numbers of synonymous (dotted line) and nonsynonymous (solid line) nucleotide substitutions per site in a sliding window of 15 codons along the MSP4 coding region. The x-axis represents the center codon of the 15-codon window

 

    Materials and Methods
 
DNA Sequencing
Plasmodium falciparum DNA templates from 15 Thai isolates collected from symptomatic malaria patients in Tak province, northern Thailand (isolates 806, 807, 814, 815, 827, 828, 837, 838, 841, 842, 843, 844, 946, 947, and G23), three field isolates from Rondonia, Brazil (059, 098, and M290), and three culture-adapted parasites (K1, T9/94, and MAD20) were prepared as described (Putaporntip et al. 2001). A 3.56-kb DNA fragment encompassing both MSP4 and MSP5 of each isolate was amplified by polymerase chain reaction (PCR) with primers MSP4-F, 5'-CATATTTTATATTTTCAGATGTATTATAAG-3' (nucleotides 1628–1657; positions follow GenBank accession number AF033037; Marshall, Tieqiao, and Coppel 1998), and MSP5-R, 5'-GTGTGCATTTGATTTATTATATGAAAAG-3' (nucleotides 5163–5190). The 30-µl reaction mixture contained P. falciparum DNA, 2.5 mM MgCl2, 300 µM of each deoxynucleoside triphosphate (dNTP), 3 µl of 10× LA PCR buffer, 0.3 µM of each primer, and 1.25 units of LA Taq™ DNA polymerase (Takara, Japan). The thermal cycler profile consisted of a preamplification denaturation at 94°C for 1 min, 35 cycles of 96°C for 20 s and 62°C for 5 min, and a final extension at 72°C for 10 min. MSP4 was amplified in a 100-µl reaction mixture containing P. falciparum DNA, 200 µM of each dNTP, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 2.0 mM MgCl2, 2.5 units of Taq DNA polymerase (Pharmacia Biotech, Piscataway, NJ), and 1.0 µM of each primer. The forward PCR primer was MSP4-FM, 5'-CATATTTTATATTTTCAGATCTATTATAAG-3' (nucleotides 1628–1657), and the reverse primer was MSP4-RM, 5'-TAAAATATATTATTATGGTACCATTTAAAC-3' (nucleotides 2691–2720). Mismatches (G to C in MSP4-FM, giving AGATCT; AG to CC in MSP4-RM, giving GGTACC) were artificially introduced to create BglII and KpnI restriction sites in these primers. The thermal cycler profile consisted of 35 cycles of 94, 50, and 72°C for 1, 1, and 2 min, respectively. The complete MSP5 gene fragment was PCR amplified under the same conditions as for MSP4, except for the use of the primers MSP5-F, 5'-GCTAGTTTCTTCGATTAATTGTTCC-3' (nucleotides 3374–3398), and MSP5-R. All amplification reactions were performed in a Perkin-Elmer 2400 thermal cycler (Perkin-Elmer, Norwalk, Conn.).
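The primer design described above can be checked with a short script. The following Python sketch is illustrative only and was not part of the original protocol; the function names are hypothetical, the primer sequences are copied from the text, and the recognition sequences used (AGATCT for BglII, GGTACC for KpnI) are the standard ones for these enzymes.

```python
# Primer sequences copied from the Methods text (5' to 3').
PRIMERS = {
    "MSP4-F":  "CATATTTTATATTTTCAGATGTATTATAAG",
    "MSP4-FM": "CATATTTTATATTTTCAGATCTATTATAAG",
    "MSP4-R":  "TAAAATATATTATTATGGTAAGATTTAAAC",
    "MSP4-RM": "TAAAATATATTATTATGGTACCATTTAAAC",
}
# Standard recognition sequences of the two enzymes named in the text.
SITES = {"BglII": "AGATCT", "KpnI": "GGTACC"}


def mismatches(a, b):
    """Return 1-based positions at which two equal-length primers differ."""
    if len(a) != len(b):
        raise ValueError("primers must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def sites_in(seq):
    """Return {enzyme: 1-based start} for each recognition site present in seq."""
    return {name: seq.find(site) + 1 for name, site in SITES.items() if site in seq}


def amplicon_length(first, last):
    """Length of a fragment spanning inclusive coordinates first..last."""
    return last - first + 1


if __name__ == "__main__":
    print(mismatches(PRIMERS["MSP4-F"], PRIMERS["MSP4-FM"]))
    print(mismatches(PRIMERS["MSP4-R"], PRIMERS["MSP4-RM"]))
    for name, seq in PRIMERS.items():
        print(name, sites_in(seq))
    print(amplicon_length(1628, 5190))
```

Comparing each modified primer with its unmodified counterpart locates the introduced mismatches, and the coordinate arithmetic confirms that nucleotides 1628 through 5190 span the stated 3.56 kb.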

After digestion with the respective enzymes, the MSP4 PCR-amplified product from each isolate was purified using the GFX PCR DNA and gel band purification kit (Pharmacia) and ligated into pUC119 vector. Transformation was performed by electroporation using Escherichia coli strain JM109 as a host. Plasmid DNA was purified using the Wizard DNA Miniprep kit (Promega, Madison, Wis.). DNA sequencing was performed using either the dRhodamine Dye Terminator Cycle Sequencing Ready Reaction kit or the BigDye™ Terminator V3.0 Cycle Sequencing kit in an ABI PRISM™ 310 DNA sequencer. Sequences of both MSP4 and MSP5, including their 5' and 3' UTRs, were determined from the 3.56-kb fragment of each isolate using primers MSP4-F; MSP4-INTF, 5'-CAATTTATCTGACGCAGCAG-3' (nucleotides 1877–1896); MSP4-INTR, 5'-ACGATGGGGTATGCAATAGG-3' (nucleotides 2244–2263); MSP4-F2, 5'-GAAGGTATTGAATGTGTTG-3' (nucleotides 2526–2544); MSP4-R, 5'-TAAAATATATTATTATGGTAAGATTTAAAC-3' (nucleotides 2691–2720); MSP4-F3, 5'-TACAATATTATATTGTATTG-3' (nucleotides 3001–3020); MSP4-R2, 5'-CCCTTTAAGTTTTCGAACAT-3' (nucleotides 3039–3058); MSP4-R3, 5'-CCAATCAGATGCATGTTAC-3' (nucleotides 3336–3354); MSP5-F; MSP5-5R, 5'-TCATTAATCTTCTGACAACC-3' (nucleotides 3738–3757); MSP5-INTF, 5'-GAACCTCCAAATAGATTACA-3' (nucleotides 4141–4160); MSP5-INTR, 5'-GCACCTCATCATCTACATTG-3' (nucleotides 4301–4320); MSP5-3F, 5'-AATTCTTATTCTTGCCATCC-3' (nucleotides 4533–4552); and MSP5-R. Both gene sequences of each isolate were verified using five subclones for MSP4 and the reamplified PCR product for MSP5, with the respective sequencing primers. Sequences have been deposited in GenBank under accession numbers AF447553–AF447573.

Statistical Analysis
Sequences were aligned using the CLUSTALX program (Thompson et al. 1997). In pairwise comparisons among sequences, all sites at which the alignment postulated a gap in any sequence were excluded so that a comparable data set was used for each comparison. A phylogenetic tree was constructed by the minimum evolution method (Rzhetsky and Nei 1992) using the Tamura and Nei (1993) nucleotide distance. The reliability of clustering patterns in the tree was tested by bootstrapping (Felsenstein 1985); 1,000 bootstrap pseudosamples were used.
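The exclusion of gapped sites can be illustrated with a minimal Python sketch. It assumes that the aligned sequences are plain strings of equal length with "-" marking gaps; the function name is hypothetical and this is not the code used in the study.

```python
def drop_gapped_columns(alignment, gap_chars="-"):
    """Remove every alignment column in which any sequence has a gap."""
    seqs = list(alignment)
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    # Keep a column only if none of its characters is a gap symbol.
    keep = [col for col in zip(*seqs) if not any(c in gap_chars for c in col)]
    if not keep:
        return ["" for _ in seqs]
    # Transpose the retained columns back into sequences.
    return ["".join(chars) for chars in zip(*keep)]


if __name__ == "__main__":
    print(drop_gapped_columns(["AC-GT", "ACTGT", "A-TGA"]))
```

Because a column is dropped whenever any one sequence has a gap there, every pairwise comparison is made over the same set of sites.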

The numbers of synonymous substitutions per synonymous site (dS) and nonsynonymous substitutions per nonsynonymous site (dN) were estimated by the method of Nei and Gojobori (1986); the number of nucleotide substitutions per site (d) was estimated by the method of Jukes and Cantor (1969). The means of dS, dN, and d over all pairwise comparisons are designated πS, πN, and π, respectively; their standard errors were estimated by the bootstrap method implemented in the MEGA2 program (Kumar et al. 2001). The average age of the nearest common ancestor for pairs of haplotypes (T) was estimated from the mean nucleotide diversity (π) at both synonymous sites and noncoding sites. It is expected that π = 2rT, where r is the rate of nucleotide substitution per site per year (Hughes and Verra 2001). We used the estimate of r from Hughes and Verra (2001).
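As a rough illustration of how mean pairwise diversity follows from the Jukes and Cantor (1969) correction, the Python sketch below computes d = -(3/4) ln(1 - 4p/3) for each pair of gap-free aligned sequences and averages over all pairs. The function names are hypothetical, and the bootstrap standard errors and the Nei and Gojobori (1986) counts of synonymous and nonsynonymous sites obtained with MEGA2 are not reproduced here.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two gap-free aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3); undefined for p >= 0.75."""
    if p >= 0.75:
        raise ValueError("p too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def nucleotide_diversity(seqs):
    """Mean Jukes-Cantor distance over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(jukes_cantor(p_distance(a, b)) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy = ["ATATATATAT", "ATATATATAA", "ATATATATAT"]  # toy data, not real isolates
    print(nucleotide_diversity(toy))
```

With 21 sequences the average is taken over 210 pairwise comparisons.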

TA skew in nucleotide content was measured by the quantity (T - A)/(T + A), where T and A are the numbers of thymine and adenine residues (Lobry 1996). When the correlation between π and TA skew was computed in sliding windows along the sequence, the significance of the correlation was assessed by a randomization test because windows are not independent. In this test the Pearson correlation coefficient (r) between a set of n paired observations (Xi and Yi for i = 1, ..., n) is computed and then compared with the r values obtained for 10,000 random data sets. The random data sets are obtained from the actual data by randomly pairing the Xi and Yi values.
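The skew measure and the randomization test can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, the 50-site window is taken from the legend of figure 1a, and the use of a two-sided comparison of |r| is an assumption, since the text does not specify how the observed and random r values were compared.

```python
import random


def ta_skew(seq):
    """(T - A)/(T + A) for a sequence; None if it contains no T or A."""
    s = seq.upper()
    t, a = s.count("T"), s.count("A")
    return (t - a) / (t + a) if (t + a) else None


def sliding_ta_skew(seq, window=50, step=1):
    """TA skew in each window of `window` sites, advancing by `step` sites."""
    return [ta_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient (fails if either list has zero variance)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def randomization_test(xs, ys, n_random=10000, seed=0):
    """Observed r and the share of random pairings with |r| at least as large."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    extreme = 0
    for _ in range(n_random):
        rng.shuffle(shuffled)  # randomly re-pair the Y values with the X values
        if abs(pearson_r(xs, shuffled)) >= abs(observed):
            extreme += 1  # assumption: two-sided comparison
    return observed, extreme / n_random
```

In use, `xs` would hold the TA skew of each window and `ys` the nucleotide diversity of the same window; shuffling one list breaks the pairing while leaving both marginal distributions unchanged.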

Peptides bound by class-I and class-II major histocompatibility complex (MHC) molecules were predicted by the methods of Parker, Bednarek, and Coligan (1994) and Rammensee et al. (1999).


    Results
 
In the phylogenetic tree of MSP4-MSP5 haplotype sequences, there were two major clusters separated by an internal branch that received 88% bootstrap support (fig. 2). Each of these clusters included sequences from Thailand and Brazil (fig. 2). Thus, there was no evidence of geographic clustering of sequences. A similar absence of geographic clustering of sequences was observed in the case of the circumsporozoite protein (Jongwutiwes et al. 1994). The absence of geographic clustering is characteristic of an ancient polymorphism that predates recent population movements (Jongwutiwes et al. 1994).



 
Fig. 2.—Minimum evolution tree of MSP4-MSP5 region haplotypes based on the Tamura-Nei nucleotide distance at 3,430 aligned nucleotide sites. Numbers on the branches represent percentages of 1,000 bootstrap pseudosamples supporting that branch; only values >=50% are shown

 
Nucleotide polymorphism was observed in both exons of the MSP4 and MSP5 genes, in the intergenic region, and in the downstream region (table 1). The mean of nucleotide diversity at silent sites over the entire region, including both synonymous sites in exons and sites in noncoding regions, was 0.0021 ± 0.0009 (standard error). The nucleotide diversity at synonymous sites in coding regions (0.0023 ± 0.0011) was nearly identical (table 1). Both these values were very similar to the mean πS for 23 protein-coding loci of P. falciparum obtained by Hughes and Verra (2001). Assuming a mutation rate of 5.1 × 10^-9 substitutions per site per year (Hughes and Verra 2001), the average age of the nearest common ancestor for pairs of haplotypes was estimated at 206,000 ± 88,000 years. The mean distance between the two major clusters of haplotypes from the phylogenetic tree (fig. 2) was 0.0040 ± 0.0017. Using the same estimate of the mutation rate, the common ancestor of these two clusters was estimated at 392,000 ± 167,000 years ago.
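The arithmetic behind these age estimates can be retraced with a few lines of Python. The sketch assumes the neutral expectation that mean pairwise diversity equals twice the substitution rate multiplied by the divergence time, so that the time is the diversity divided by twice the rate, and it assumes that the standard error of the time is obtained by scaling the standard error of the diversity by the same constant. The function name is hypothetical.

```python
def coalescence_time(pi, rate):
    """Average pairwise divergence time (years), assuming pi = 2 * rate * T."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return pi / (2.0 * rate)


RATE = 5.1e-9  # substitutions per site per year (Hughes and Verra 2001)

for label, pi, se in [("silent sites, all pairs", 0.0021, 0.0009),
                      ("between the two major clusters", 0.0040, 0.0017)]:
    t = coalescence_time(pi, RATE)
    t_se = coalescence_time(se, RATE)  # assumption: SE scales by the same constant
    print(f"{label}: {t:,.0f} +/- {t_se:,.0f} years")
```

Rounded to the nearest thousand years, the results agree with the 206,000 ± 88,000 and 392,000 ± 167,000 years reported above.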


 
Table 1 Nucleotide Content and Diversity (±SE) at Synonymous Sites (πS), Nonsynonymous Sites (πN), and All Sites (π) in Pairwise Comparisons Among 21 Plasmodium falciparum Isolates

 
In both MSP4 and MSP5, no SNPs were seen in the introns (table 1). Polymorphism was very low in the intergenic region, where there was a single SNP (table 1). On the other hand, extensive nucleotide polymorphism was observed in the region downstream of MSP5 (table 1). The introns of both genes, the intergenic region, and the downstream region were all characterized by stretches of AT repeats and single-nucleotide repeats of T and A. In spite of the absence of SNPs in the introns, there were differences among alleles with respect to the numbers of repeat units in the introns of both genes. Likewise, the intergenic and downstream regions showed sequence length differences in repeat arrays.

The genomes of members of the genus Plasmodium are known to be extraordinarily AT-rich (Weber 1988). In the present data the percent AT ranged from 69%–74% in the exons to 79%–91% in the introns and other noncoding regions (table 1). Even more striking was the difference among regions with respect to TA skew (Lobry 1996), which ranged from a value of -0.285 in exon 1 of MSP4 to 0.198 in the intergenic region (table 1). When TA skew was plotted in a sliding window across the region, peaks were found corresponding to the introns and the intergenic region, whereas troughs corresponded to the exons and portions of the downstream region (fig. 1a). For all windows, nucleotide diversity was negatively correlated with TA skew (r = -0.197; P < 0.0001; randomization test).

There was a striking contrast between the intergenic and downstream regions; the former had a high positive TA skew and low polymorphism, whereas the latter had an overall negative TA skew and high polymorphism (table 1). Note that, although there were two sharp peaks in TA skew at the beginning and end of the downstream region (fig. 1a), the very low troughs in TA skew in the same region were sufficient to create a net negative TA skew for the entire region (table 1). The introns, the intergenic region, and the downstream region all had high percent AT, in spite of striking differences among them with respect to nucleotide diversity (table 1).

When mean dS and dN were computed in a sliding window of 15 codons across the coding region of MSP4, a region in which dN greatly exceeded dS was observed in the 5' portion of the gene (fig. 1b). A pattern of dN exceeding dS is indicative of positive selection favoring amino acid replacements (Hughes and Nei 1988). Computer prediction (Parker, Bednarek, and Coligan 1994; Rammensee et al. 1999) of peptides bound by human class-I and class-II MHC molecules indicated a number of potential T-cell epitopes in this region (table 2). Additionally, several amino acid replacements observed in the population of sequences analyzed were predicted to decrease peptide binding by MHC molecules (table 2). Mean dN (0.020 ± 0.008) was significantly greater than mean dS (0.004 ± 0.004) in the codons (codons 30–79) encoding these putative epitopes (P < 0.05). In contrast, in the remainder of MSP4 there was no significant difference between mean dS (0.002 ± 0.002) and mean dN (0.002 ± 0.001).


 
Table 2 Predicted HLA-Bound Peptides Encoded in the Hypervariable Region of MSP4 of Plasmodium falciparum

 

    Discussion
 
Analysis of sequences of the MSP4-MSP5 region of chromosome 2 of P. falciparum from 21 isolates revealed dramatic differences among genomic regions with respect to the extent of nucleotide diversity. No nucleotide polymorphism was observed in the introns of the two genes, and very limited polymorphism was seen in the region between the two genes (table 1). In contrast, extensive nucleotide polymorphism was observed in the coding regions of the two genes, particularly MSP4, and in the region downstream of the MSP5 gene (table 1).

On the basis of silent nucleotide divergence across the region, we estimated the average pairwise divergence time at about 200,000 years and the age of the common ancestor of the two major haplotype clusters at nearly 400,000 years. Note that these are both conservative estimates. As discussed subsequently, our results suggested that there is a reduced rate of nucleotide substitution in regions with high TA skew, such as the introns and the intergenic region. By including these regions in the analysis, we obtained minimum estimates of divergence times. The results, thus, are not consistent with the hypothesis of a recent extreme population bottleneck in P. falciparum (Rich et al. 1998; Volkman et al. 2001). Rather, they are consistent with previous results, indicating that the effective population size of this species has been large for at least the past few hundred thousand years (Hughes and Verra 2001; Mu et al. 2002).

Comparison of the numbers of synonymous and nonsynonymous nucleotide substitutions across coding regions suggested that the MSP4 gene is subject to a form of positive selection, favoring amino acid replacements in an approximately 40-residue region of the MSP4 protein (table 2 and fig. 1b). Consistent with the experimental evidence of T-cell recognition of MSP4 (Bottius et al. 1996), these results suggest that polymorphism in this region of MSP4 is selectively maintained and that the selection is driven by host T-cell recognition. Because the sequence divergence among MSP4 alleles (table 1) is quite low in comparison with that among alleles at certain other polymorphic loci in P. falciparum (Hughes 1991, 1992; Hughes M. K. and Hughes A. L. 1995; Verra and Hughes 2000), it appears that the balancing selection acting at MSP4 is of comparatively recent origin. The existence of a relatively recent balanced polymorphism at the MSP4 locus is consistent with our estimates of the age of haplotypes in the MSP4-MSP5 region.

The absence of nucleotide polymorphism in the intron, even of a locus apparently subject to balancing selection, suggests the existence of a mechanism suppressing nucleotide polymorphism in this intron. If the pattern of nucleotide substitution in introns of P. falciparum differs from that in exons, then such a difference might explain the fact that Volkman et al. (2001) observed little polymorphism in introns, whereas extensive polymorphism was reported from exons (Hughes and Verra 2001; Mu et al. 2002). Like the introns in MSP4 and MSP5, the introns examined by Volkman et al. (2001) were also very short and AT-rich. But our results suggest that AT richness itself is not sufficient to account for a reduction in the substitution rate in P. falciparum genomic regions. For example, the downstream region showed extensive nucleotide polymorphism in spite of a very high AT content (90.1%; table 1). In contrast to their similar AT richness, the two extensive noncoding regions that we sequenced, the intergenic region and the downstream region, differed strikingly with respect to both the TA skew and the extent of nucleotide diversity (table 1). This result, along with a negative correlation between nucleotide diversity and TA skew in sliding windows across the entire MSP4-MSP5 region, suggests the hypothesis that TA skew is a major factor affecting the rate of nucleotide substitution in this species.

If an inverse relationship between TA skew and nucleotide polymorphism is found throughout the genome of P. falciparum, knowledge of TA skew will be useful in the search for polymorphic markers, which in turn can be used to search for loci contributing to traits such as drug resistance. In addition, because introns and exons differed strikingly with respect to TA skew (fig. 1a), TA skew provides a potential additional factor that can be incorporated into gene-finding algorithms for Plasmodium species, in which prediction of short exons has been problematic (Van Lin et al. 2001).

Both the class-I and class-II MHC loci of humans are characterized by substantially reduced polymorphism in introns in comparison with exons (Cereb, Hughes, and Yang 1997; Hughes 2000). In this case, statistical analyses suggest that the introns are homogenized relative to exons by recombination (perhaps involving a gene conversion–like mechanism) and by subsequent genetic drift (Cereb, Hughes, and Yang 1997; Hughes 2000). In the case of the MHC, it is well established that balancing selection maintains polymorphism in the exons encoding the peptide-binding regions of the molecule (Hughes and Nei 1988, 1989). The absence of such selection on introns and the occurrence of recombination ensure that, over long evolutionary time, polymorphisms in introns do not hitchhike along with the selected polymorphisms in exons.

It might be suggested that a similar phenomenon can explain the absence of SNPs in the introns of MSP4 and MSP5, but there are a number of reasons for doubting such an interpretation. First, the contrast between introns and exons with regard to polymorphism is seen in the case of MSP5, even though there is no evidence of balancing selection at that locus. Second, the introns of both MSP4 and MSP5 show no evidence of homogenization by gene conversion–like mechanisms. These introns reveal numerous interallelic differences, but these differences involve the numbers of repeat units in repeat arrays and not point mutations. Third, the introns show low levels of nucleotide polymorphism even in comparison with coding regions not under balancing selection, including portions of MSP4 outside the predicted epitope region and the entire MSP5 coding region. It is also worth mentioning that, given our estimate of an average allelic divergence time of about 200,000 years, the polymorphism at the MSP4 locus is very recent in comparison with balanced polymorphisms at MHC loci (Hughes and Yeager 1998) or even at such P. falciparum loci as those encoding the circumsporozoite protein (Hughes and Verra 1998) and apical membrane antigen-1 (Verra and Hughes 2000).


    Acknowledgements
 
We thank Prof. K. Tanabe for advice on sequencing, Prof. H. Kanbara for providing the DNA sequencer, and Dr. M. U. Ferreira for the Brazilian isolates. This research was supported by grants to S.J. from the Hitachi Scholarship Foundation and the Molecular Biology Project of the Faculty of Medicine, Chulalongkorn University, and grant GM43940 to A.L.H. from the National Institutes of Health.


    Footnotes
 
William Jeffery, Reviewing Editor

Keywords: malaria parasite, nucleotide content, polymorphism, Plasmodium falciparum, positive Darwinian selection

Address for correspondence and reprints: Austin L. Hughes, Department of Biological Sciences, University of South Carolina, Coker Life Sciences Building, 700 Sumter Street, Columbia, South Carolina 29208. E-mail: austin@biol.sc.edu.


    References
 

    Bottius E., L. BenMohamed, K. Brahimi, et al. (12 co-authors) 1996 A novel Plasmodium falciparum sporozoite and liver stage antigen (SALSA) defines major B, T helper, and CTL epitopes J. Immunol 156:2874-2884

    Cereb N., A. L. Hughes, S. Y. Yang, 1997 Locus-specific conservation of the HLA class I introns by intra-locus homogenization Immunogenetics 47:30-36

    Felsenstein J., 1985 Confidence limits on phylogenies: an approach using the bootstrap Evolution 39:783-791

    Hughes A. L., 1991 Circumsporozoite proteins of malaria parasites (Plasmodium spp.): evidence for positive selection on immunogenic regions Genetics 127:345-353

    ———. 1992 Positive selection and interallelic recombination at the merozoite surface antigen-1 (MSA-1) locus of Plasmodium falciparum Mol. Biol. Evol 9:381-393

    ———. 2000 Evolution of introns and exons of class II MHC genes of vertebrates Immunogenetics 51:473-486

    Hughes M. K., A. L. Hughes, 1995 Natural selection on Plasmodium surface proteins Mol. Biochem. Parasitol 71:99-113

    Hughes A. L., M. Nei, 1988 Pattern of nucleotide substitution at MHC class I loci reveals overdominant selection Nature 335:167-170

    ———. 1989 Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection Proc. Natl. Acad. Sci. USA 86:958-962

    Hughes A. L., F. Verra, 1998 Ancient polymorphism and the hypothesis of a recent bottleneck in the malaria parasite Plasmodium falciparum Genetics 150:511-513

    ———. 2001 Very large long-term effective population size in the virulent human malaria parasite Plasmodium falciparum Proc. R. Soc. Lond. B 268:1855-1860

    Hughes A. L., M. Yeager, 1998 Natural selection at major histocompatibility complex loci of vertebrates Annu. Rev. Genet 32:415-435

    Jongwutiwes S., K. Tanabe, M. K. Hughes, H. Kanbara, A. L. Hughes, 1994 Allelic variation in the circumsporozoite protein of Plasmodium falciparum from Thai field isolates Am. J. Trop. Med. Hyg 51:659-667

    Jukes T. H., C. R. Cantor, 1969 Evolution of protein molecules Pp. 21–132 in H. N. Munro, ed. Mammalian protein metabolism. Academic Press, New York

    Kumar S., K. Tamura, I. B. Jakobsen, M. Nei, 2001 MEGA2: molecular evolutionary genetics analysis software Bioinformatics 17:1244-1245

    Lobry J. R., 1996 Asymmetric substitution patterns in two DNA strands of bacteria Mol. Biol. Evol 13:660-665

    Marshall W. M., W. Tieqiao, R. W. Coppel, 1998 Close linkage of three merozoite surface protein genes on chromosome 2 of Plasmodium falciparum Mol. Biochem. Parasitol 94:13-25

    Mu J., J. Duan, K. Makova, D. A. Joy, C. Q. Huynh, O. H. Branch, W.-H. Li, X. Su, 2002 Chromosome-wide SNPs reveal an ancient origin for Plasmodium falciparum Nature (in press)

    Nei M., T. Gojobori, 1986 Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions Mol. Biol. Evol 3:418-426

    Parker K. C., M. A. Bednarek, J. E. Coligan, 1994 Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side chains J. Immunol 152:163-175

    Putaporntip C., S. Jongwutiwes, T. Tia, M. U. Ferreira, H. Kanbara, K. Tanabe, 2001 Diversity in the thrombospondin-related adhesive protein gene (TRAP) of Plasmodium vivax Gene 268:97-104

    Rammensee H.-G., J. Bachmann, N. N. Emmerich, O. A. Bachor, S. Stevanovic, 1999 SYFPEITHI: database for MHC ligands and peptide motifs Immunogenetics 50:213-219

    Rich S. M., M. C. Licht, R. R. Hudson, F. J. Ayala, 1998 Malaria's eve: evidence of a recent population bottleneck throughout the world population of Plasmodium falciparum Proc. Natl. Acad. Sci. USA 95:4425-4430

    Rzhetsky A., M. Nei, 1992 A simple method for estimating and testing minimum-evolution trees Mol. Biol. Evol 9:945-967

    Tamura K., M. Nei, 1993 Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees Mol. Biol. Evol 10:512-526

    Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882

    Van Lin L. H. M., T. Pace, C. J. Janse, C. Birago, J. Ramesar, L. Picci, M. Ponzi, A. P. Waters, 2001 Interspecies conservation of gene order and intron–exon structure in a genomic locus of high gene density and complexity in Plasmodium Nucleic Acids Res 29:2059-2068

    Verra F., A. L. Hughes, 2000 Evidence for ancient balanced polymorphism at the apical membrane antigen-1 (AMA-1) locus of Plasmodium falciparum Mol. Biochem. Parasitol 105:149-153

    Volkman S. K., A. E. Barry, E. J. Lyons, K. M. Nielsen, S. M. Thomas, M. Chol, S. S. Thakore, K. P. Day, D. F. Wirth, D. L. Hartl, 2001 Recent origin of Plasmodium falciparum from a single progenitor Science 293:482-484

    Weber J. L., 1988 Molecular biology of malaria parasites Exp. Parasitol 66:143-170

    Wu T., C. G. Black, L. Wang, A. R. Hibbs, R. L. Coppel, 1999 Lack of sequence diversity in the gene encoding merozoite surface protein 5 of Plasmodium falciparum Mol. Biochem. Parasitol 103:243-250

Accepted for publication May 10, 2002.