NEWS

For Proteomics Research, A New Race Has Begun

Tom Reynolds

When the human genome sequence was completed ahead of schedule in 2000, news reports declared the start of a new race to decode the human proteome—the full set of proteins encoded by those genes.

As impressive as biology’s first Big Science milestone was, the challenges of proteomics make genomics look almost elementary. In fact, researchers say, the comparison itself is misleading, because no definitive proteome actually exists for humans or any other organism.

But while the concept may be slippery, proteomics—a term coined in 1994 by Mark Wilkins and colleagues at Macquarie University in Sydney, Australia—has undeniably become biology’s favorite buzzword. Start-up companies and established pharmaceutical firms are lofting its banner to signal their position on the leading edge of biotech. Proteomics centers and institutes are proliferating at universities around the world. And the Human Proteome Organization will hold its first annual meeting in Versailles, France, in November.

So, what exactly is proteomics? The National Cancer Institute’s Frederick Cancer Research Facility in Frederick, Md., offers up this definition: "the study of all the protein forms expressed within an organism as a function of time, age, state, external factors, etc." Each cell in an organism presents a constantly changing kaleidoscope of proteins that wax and wane depending on what job the cell is doing—and whether its function is compromised by disease.

Although their genomes might be identical, the proteome of a neuron carrying visual information from the retina, for example, would be quite different from the proteome of a lymphocyte engaged in fending off a microbial invasion, or that of a breast cancer cell proliferating out of control.

Another index of proteomic complexity is the relatively paltry number of human genes. The current estimate, 30,000 to 40,000, is one-third to one-tenth of what most genome researchers predicted just a few years ago. Aside from the humbling fact that humans have not many more genes than worms, the major lesson from this surprisingly low count is a greater appreciation of the role played by processes that are not coded in the DNA. These include alternative splicing of RNA to produce different forms of a protein and post-translational modifications—such as phosphorylation and glycosylation—that act as on-off switches for proteins.

In one sense, the central questions asked in proteomics are the same ones that scientists in biochemistry, cell biology, and related fields have been wrestling with for decades: What proteins are present in which cells? How do they work together in signaling pathways? What are the changes in protein expression and activity that drive the development, repair, breakdown, and death of an organism?

What proteomics adds is the vision of these research areas transformed and empowered by "high-throughput" technological breakthroughs—such as the invention of polymerase chain reaction and the exponential gains in sequencing speed—that drove the genomics revolution of the late 20th century. Advances in 2-dimensional gel electrophoresis, liquid chromatography, mass spectrometry, and bioinformatics have moved the field forward, but so far, no single technology has emerged to revolutionize proteomics the way PCR did for genomics.

"Proteomics is very much a jellyfish right now—very amorphous and hard to get a clear sense of," said Joshua LaBaer, M.D., Ph.D., director of the Harvard Institute of Proteomics at Harvard Medical School in Boston. But two major areas of activity can be distinguished, he said. In what he terms "abundance-based" proteomics, researchers look at which proteins are present in a given tissue or body fluid.



Dr. Joshua LaBaer (Credit: Graham Ramsay)

 
"If the abundance-based approach worked perfectly, the end result would be to tell you, for any given specimen, which proteins are present, how much of them are present, and how different that is from some other tissue or disease state," LaBaer explained. "So you could say ‘protein A is elevated in cancer cells compared with noncancerous cells.’" This approach holds promise for identifying disease markers that could be important in diagnosis, and eventually could lead the way to the design of new treatments.

The potential power of abundance-based proteomics came to light with a February publication in the British medical journal the Lancet. A team of researchers led by Emanuel Petricoin, M.D., of the Division of Therapeutic Products at the U.S. Food and Drug Administration, and Lance Liotta, M.D., Ph.D., of NCI’s Laboratory of Pathology, used protein patterns in blood serum to distinguish ovarian cancer patients from healthy women or those with noncancerous ovarian pathology. Their method correctly identified 50 of 50 ovarian cancer specimens and 63 of 66 noncancer specimens—far more accurate than any ovarian cancer detection tests used now, such as CA125. Most importantly, the method was just as good at identifying patients with early-stage cancer as those with advanced disease.
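
For readers who want the arithmetic behind those counts, the short Python sketch below converts them into the two rates usually quoted for a diagnostic test: sensitivity (the share of cancer specimens correctly flagged) and specificity (the share of noncancer specimens correctly cleared). Only the four counts come from the study as reported here; the variable names and the helper function are illustrative assumptions, not anything used by the researchers.

```python
# Counts as reported in the article for the serum protein-pattern test.
cancer_total = 50        # ovarian cancer specimens tested
cancer_correct = 50      # of those, correctly identified as cancer
noncancer_total = 66     # noncancer specimens tested
noncancer_correct = 63   # of those, correctly identified as noncancer


def fraction(correct: int, total: int) -> float:
    """Return correct/total as a fraction (hypothetical helper for this sketch)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Sensitivity: fraction of true cancer cases the test caught.
sensitivity = fraction(cancer_correct, cancer_total)

# Specificity: fraction of noncancer cases the test correctly cleared.
specificity = fraction(noncancer_correct, noncancer_total)

# Noncancer specimens wrongly flagged as cancer.
false_positives = noncancer_total - noncancer_correct

print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"false positives: {false_positives}")
```

On these figures the test missed no cancers and wrongly flagged 3 of the 66 noncancer specimens, which is part of why validation in much larger groups matters before clinical use.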

The technique will have to be validated on much larger numbers of patients and in prospective studies before it can be used in clinical practice. But J. Carl Barrett, Ph.D., director of NCI’s Center for Cancer Research, said the study has already broken new ground. "The idea that rather than a single biomarker, an entire pattern of proteins contains important diagnostic information, is an exciting new paradigm," he said.

The method employs a number of technologies commonly used in proteomics. But its particular power lies in two techniques: Liotta’s laser capture microdissection, which gives a snapshot of expressed proteins in a small number of cells (see News, Nov. 1, 2000, p. 1710), and the pattern-recognition algorithms developed by the bioinformatics company Correlogic of Bethesda, Md. The functions of the five proteins that the test uses to distinguish between ovarian cancer patients and unaffected women are completely unknown. Although the Lancet authors suggest that exploring their functions may elucidate disease processes, it is the sheer number crunching, not any insight about what the proteins do, that holds the key to the test’s utility.

The second major area of proteomics—and the one that will ultimately have a far greater impact, according to LaBaer—is functional proteomics: figuring out what each protein’s role is and how proteins interact. "The variety of techniques involved is much broader, but also at a much earlier state of development," he said. These include the yeast two-hybrid system, immunoprecipitation-based methods, protein microarrays and the Harvard Institute’s work to refine high-throughput methods for protein purification.

Different proteins typically require different conditions for their expression and purification. A big challenge for proteomics is to find the best ways to express and purify thousands of proteins in parallel. In the March 5 Proceedings of the National Academy of Sciences, LaBaer and colleagues report on their efforts to do this. They took a test set of 32 human genes of varying size, expressed them in the bacterium E. coli, and attached each product to four different chemical "purification tags." Applying their methods to 336 randomly selected human genes, they successfully purified 60% of them, a figure they believe represents a reasonable lower estimate for the overall success rate of this strategy.

"The availability of such methods will alleviate a key bottleneck in the application of new proteomic techniques such as protein arrays to human biology," the authors wrote.

Among private-sector groups that have begun to stake out territory in proteomics, the most ambitious appears to be the alliance of Myriad Genetics, Inc., of Salt Lake City; Hitachi, Ltd., of Tokyo; and Oracle of Redwood Shores, Calif., which announced in April 2001 that it plans to commit $185 million to "map the human proteome in less than 3 years."

William S. Hancock, vice president and general manager of proteomics at Thermo Finnigan, San Jose, Calif., is the editor of the American Chemical Society’s new Journal of Proteome Research.

He said the commercial claim to "map the proteome" contains a degree of hype, because the effort, comprehensive though it may be on its own terms, relies solely on one type of protein analysis, the yeast two-hybrid method, and will yield only a fraction of all potential information about the proteins.

"In this case, what you’re mapping is simple protein interactions. But you’re still not characterizing the full set of a proteome, not doing quantitation, not measuring interactions of complexity, not measuring activity, not measuring localization, not measuring post-translational modification—the list goes on and on," said Hancock. "We’re all going to scratch our heads and try to understand the significance of it."

LaBaer agreed, calling the commercial claim to map the proteome "a clever turn of phrase to capitalize on the Human Genome Project press."

LaBaer and Hancock both emphasized that in the long run, the most substantial progress in proteomics is likely to come from publicly shared research efforts, not proprietary commercial programs.

One model that is already up and running, Hancock said, is Canada’s Biomolecular Interaction Network Database, which stores full descriptions of interactions, molecular complexes and pathways involving proteins, nucleic acids and small molecules.

Another freely available resource will be the Harvard Institute of Proteomics’ FLEX (Full Length Expression) repository, containing all known genes from humans and other organisms in a form that can be expressed as proteins in a variety of experimental systems.

The repository is intended to "create a gold standard" for proteomics researchers worldwide, LaBaer said.

Hancock suggested that one way to organize the workload would be for programs to take on the proteomes of different organs or tissues. Germany has announced that it plans to characterize the proteome of the human brain, he said.



             
Copyright © 2002 Oxford University Press (unless otherwise stated)