The economics of ‘more research is needed’

Carl V Phillips

Minnesota Center for Philosophy of Science, University of Minnesota, 746 Heller Hall, Minneapolis, MN 55455, USA. E-mail: phill047@tc.umn.edu


    Abstract
 
Background Results from epidemiology and other health research affect millions of life-years and billions of dollars, and the research directly consumes millions of dollars. Yet we do little to assess the value of research projects for future policy, even amid the ubiquitous assertions that ‘more research is necessary’ on a given topic. This methodological proposal outlines the arguments for why and how ex ante assessments can inform us about the value of a particular piece of further research on a topic.

Methods Economics and decision theory concepts—cost-benefit analysis and probability-weighted predictions of outcomes—allow us to calculate the payoff from applied health research based on the resulting decisions. Starting with our probability distribution for the parameters of interest, a Monte Carlo simulation generates the distribution of outcomes from a particular new study. Each true value and study outcome is associated with a policy decision, and improved decisions are valued to give us the study's contribution as applied research.

Results The analysis demonstrates how to calculate the expected value of further research, for a simplified case, and assess whether it is really warranted. Perhaps more important, it points out what the measure of the value of a further study ought to be.

Conclusions It is quite possible to improve our technology for assessing the value of particular pieces of further research on a topic. However, this will only happen if the need and possibility are recognized by methodologists and applied researchers.

Keywords Epidemiology methods, cost-benefit analysis (CBA), economic valuation, uncertainty, Monte Carlo simulation

Accepted 26 June 2000


    Introduction
 
Epidemiology is not an abstract academic exercise. The results of epidemiological studies determine medical treatment options, influence lifestyle choices, and change the course of expensive regulations and production methods. Billions of dollars and millions of life-years hinge on the results of the research. Furthermore, the research itself is costly, as measured in either dollars or the opportunity cost of the limited pool of skilled, funded researchers foregoing other research.

Business and public policy analysts recognize that any endeavour should continuously invest in assessing current and future activities, spending at least a few per cent—often a lot more—of the total budget on figuring out whether the rest of the budget is being spent well. This is widely recognized for health interventions1 (though typically highly confounded by politics and popular rhetoric), but it is generally overlooked for health research. Before embarking on an expensive study whose results are intended to inform policy (which for present purposes is defined to include formal public policy as well as public health officials' recommendations about lifestyle choices, standards of practice in medicine, and the like), it is natural to ask how the outcome is likely to inform policy. Such pre-research analysis is practised in pharmaceutical research,2 where research funders have profit incentives to make sure the research is worthwhile. Similar analysis has been proposed for other clinical trials, but the underlying principles have been almost entirely ignored.3 There has been little or no formal analysis of when a particular epidemiological study is likely to be worthwhile, let alone applications of such analysis in study design. Such analysis for observational studies (including epidemiology and econometrics) is more complicated than calculating optimal sample size for two-arm trials and may resist closed-form optimization in practice, but it is not intractable and we can clearly do more than we currently do.

In rare cases, such as research on novel exposures we know absolutely nothing about, it may not be possible to do any useful ex ante assessment about the value of a study. But in most cases, where something is known and we are trying to refine that knowledge, it is possible to assess what the new research might add to the existing base of knowledge. (We can, in fact, perform such an analysis for the first study of a particular exposure, though it requires that we use methods other than frequentist probability since we will have to estimate probabilities before collecting any data about the particular relationship in question.)

If we do not take full advantage of past research when designing future research, then there is little point in having done the past research. Contrary to the impression the public gets from health news headlines, the research process is one of building upon existing knowledge rather than stumbling around until the definitive result is found and displaces all previous findings. Yet further research is often done as if researchers believe the headline version of scientific progress. Typically, research simply repeats existing studies, possibly correcting existing errors and possibly not, as if the next study will be definitive.


    ‘More research is needed’
 
Studies in epidemiology, as well as many other fields, conclude with the mantra ‘more research is needed’ so frequently that the phrase is commonly used as a humorous catch-phrase. Researchers are justifiably concerned about their results being over-interpreted, and may want to hedge their conclusions. However, the statement ‘more research is needed’ says more than the neutral hedge, ‘we do not know enough to draw definitive conclusions’.

If ‘more research is needed’ is to be interpreted as more than the vacuous statement ‘we do not know everything yet’, the best interpretation is the economic statement, ‘the expected benefit that would come from more research, due to the resulting improvement in our estimates of parameters of interest, justifies the cost of that further research’. This suggests that some assessment should be made about the trade-off, but the economic statement is virtually never accompanied by economic analysis. It is difficult to understand how applied research can be justified at all, let alone be declared to be ‘needed’, without such analysis.

Basic economics tells us that decisions to do more research (and the choice of research projects among many options) should be based on an assessment of the expected net value of the research (i.e. the probability-weighted average of the benefits minus costs). Improving our knowledge is generally a good thing and so if further research were free, it would always be warranted. Since this is not the case, economic analysis is the first piece of further research that is ‘needed’.


    Calculating the value of further research
 
If we can describe how a subsequent study might change our current estimates of the true value of the parameter(s) of interest, estimate the probabilities of those changes, assess how we would change our behaviour under the new information, and put a dollar figure on the behavioural changes, then we can compare that figure to the cost of gathering the further information and determine whether the research is worthwhile. It is possible to create the tools needed for such analysis. The following presents the framework and advantages of such analysis. Specifics of the quantitative assessment are beyond the present scope and are left for future papers.

We can determine how to analyse the question of worthwhile research by working backward from the information we would like to have.

  1. Ultimately, we want to know what policy decision new information would generate. For purposes of applied research, new information pays off if and only if it changes our assessment of the optimal policy. (The question of science for its own sake is a different matter, and is beyond the present scope. The justification for funding further applied research in epidemiology and similar fields is almost always presented in terms of specific practical outcomes, and so should be judged on that basis.) The value of such a change can be calculated based on the net improvement in outcomes under the new policy, given our (updated) assessment of the parameters of interest.
  2. Naturally, we do not know the answer from 1. ex ante. However, we can identify the possible results of the new research and how those results would change our assessment of the parameters of interest, and thus what new policy decisions would result.
  3. To quantify the implications of 2., we must assess the probability distribution of the outcomes based on our existing knowledge. This is a quantitative and philosophical challenge, but it is clearly possible to generate better information than we do now.

Reversing this, the probability distribution of the true value of the parameter of interest is calculated based on current knowledge and broken down into portions of the density that correspond to various findings that could result from the next study, each of which has a particular impact on our understanding of the world and thus policy decisions. The probability distribution of those results can be generated, and the expected value of the information compared to its cost. This process is summarized in Figure 1 and formalized below.



[Figure 1]

 
Consider a case where a policy decision, P, is made based on all presently available information, denoted by X. We want to choose P to maximize the expected net benefits from the policy, B(P, X). Define

$$NB(X, X_T) \equiv B\bigl(P^*(X),\, X_T\bigr), \qquad P^*(X) = \arg\max_P B(P, X), \tag{1}$$

the realized net benefit from choosing the optimal policy based on the belief that X is the state of the world when really X_T is the true state of the world. Then if further research, study s, allows us to update our belief about the world to X_s, the true net benefit of that research is

$$NB_s = NB(X_s, X_T) - NB(X, X_T) - c_s, \tag{2}$$

where c_s is the cost of carrying out the study. We are never going to know the true value, X_T. Nor will we know X_s before doing s. But we can always make the best possible prediction of their distribution based on existing knowledge.

To estimate the expected value of doing s, we calculate the expected benefit based on our current beliefs about the distributions of X_T and X_s given X. For the simplest case, assume that the states of the world are continuous scalar values (such as a single relative risk). (Other cases follow by analogy, requiring that we take the appropriate n-dimensional integrals and/or sums of probabilities.) Then we want to know

$$E(NB_s) = \int\!\!\int \bigl[ NB(X_s, X_T) - NB(X, X_T) \bigr]\, g(X_s \mid X_T)\, f(X_T \mid X)\, dX_s\, dX_T \;-\; c_s, \tag{3}$$

where f(X_T | X) is the density function for the true value given our existing knowledge, and g(X_s | X_T) is the density of the results of s given the true value.

With E(NB_s), we can compare the expected payoff of doing a study to not doing it, and compare the expected payoffs of alternative studies. The expected benefit from doing multiple studies simultaneously, the order in which to conduct multiple possible future studies, and irreversible decisions4 should be part of a complete analysis, and can be calculated, but are set aside for present purposes.
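To make Equation (3) concrete, here is a minimal numerical sketch. Everything in it is an illustrative assumption: the benefit function, both densities, the action point, and the integration grid are invented for demonstration (loosely echoing the example below), not taken from any actual analysis.

```python
import numpy as np

ACTION_POINT = 1.2        # intervene if the believed RR is at least this
CURRENT_BELIEF = 1.25     # X: best current estimate of the RR (assumed)

def nb(x_believed, x_true):
    """NB(X, X_T): net benefit ($ millions) of the policy chosen under
    belief x_believed, evaluated at the true state of the world x_true."""
    if x_believed >= ACTION_POINT:
        return -200.0                    # intervention cost; risk eliminated
    return -1000.0 * (x_true - 1.0)      # unregulated cost of the exposure

def f(x_true):
    """f(X_T | X): assumed normal density of the true RR given current data."""
    return np.exp(-0.5 * ((x_true - CURRENT_BELIEF) / 0.15) ** 2) \
        / (0.15 * np.sqrt(2.0 * np.pi))

def g(x_s, x_true):
    """g(X_s | X_T): assumed density of the new study's result given truth."""
    return np.exp(-0.5 * ((x_s - x_true) / 0.08) ** 2) \
        / (0.08 * np.sqrt(2.0 * np.pi))

# Grid approximation of the double integral in Equation (3), before c_s.
xt_grid = np.linspace(0.7, 1.9, 300)     # candidate true values X_T
xs_grid = np.linspace(0.7, 1.9, 300)     # candidate updated beliefs X_s
dxt, dxs = xt_grid[1] - xt_grid[0], xs_grid[1] - xs_grid[0]

e_nb = sum((nb(s, t) - nb(CURRENT_BELIEF, t)) * g(s, t) * f(t) * dxs * dxt
           for t in xt_grid for s in xs_grid)
print(f"E(NB_s) = ${e_nb:.1f} million, before subtracting the study cost c_s")
```

One simplification to note: the sketch treats the study result itself as the updated belief; a fuller treatment would combine it with the prior data X.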

Notice from Equation (3) and Figure 1 that the prediction about the outcome of the next study, X_s, follows directly from our distribution of the true value, X_T, which is based, in turn, on our current data. This line of reasoning resembles Bayesian updating, though the economic analysis is agnostic with respect to statistical methods. There is no requirement to use Bayesian or any other particular method. (The formulation presented here uses no prior probabilities, basing predictions about true values on observed data and predictions about future observations on the predicted true values.)

Determining f is difficult and even g is non-trivial. Indeed, determining these densities is widely regarded as impossible, and thus seldom even considered. However, just as some policy can (and will) always be made given the available data, some best estimate of f and g can (and should) always be made given X. For purposes of demonstrating the value of cost-benefit analysis of further research (and of creating new tools to calculate f and g), it is sufficient to recognize that we are not completely ignorant of f and g.

A stylized example illustrates the theory.


    Example
 
Consider a binary exposure to an environmental chemical that may increase the probability of a disease. We have a choice of a single regulatory intervention, changing production processes to eliminate emission of the chemical, costing $200 million in present value terms and eliminating the risk. Each one per cent increase in relative risk (RR) of disease for the exposed population costs society $10 million in the form of increased morbidity and mortality.

Clearly the intervention is warranted if the RR is greater than or equal to 1.20, the action point. Assume our existing belief about the true RR from the exposure is distributed according to Figure 2, based on the data from an existing study. Intervention is warranted, as determined by taking the probability-weighted average of the net costs of the exposure without regulation and finding that it is higher than $200 million.
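In symbols, with costs in millions of dollars, the numbers above reduce to a simple decision rule (a restatement of the example's figures, not an addition to them):

$$\text{intervene} \iff E[\,1000\,(RR - 1)\,] \ge 200 \iff E[RR] \ge 1.20.$$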



[Figure 2]

 
Notice that we are not concerned with P-values or confidence intervals. As with other true optimization calculations, our policy decision is based on our best assessment of the current data, regardless of how confident we are that this is the right decision. Despite the fact that almost a third of the probability mass is on the ‘wrong’ side of the action point of 1.2, there is still a best possible choice—to intervene—and it should be made.

We are considering a new study, s, which repeats the existing study for a larger population (specifically, increasing the total sample from 1000 to 3000). It turns out that the process that generated the distribution in Figure 2 is a combination of possible bias from disease misclassification (in particular false-positive diagnoses, as discussed below) and random sampling error. (We ignore the other inevitable sources of error to simplify the example, but they would be included in an actual analysis, as we discuss elsewhere.5) Assume that s will not affect the uncertainty about the level of disease misclassification because it is basically the same study.

There are three possible results from s. We could confirm our existing belief that we should intervene, thus not changing our behaviour and generating a benefit of zero minus the cost of s. We could discover, correctly, that intervention is not a good idea, with the benefit of the resulting change in behaviour depending on the true value of the RR. Or we could ‘learn’ that intervention is not a good idea when it really is, with the net cost of the resulting unfortunate change depending on the true value of the RR.

A Monte Carlo simulation produces the distribution of the value of the resulting change in policy (without the cost of s) in Figure 3, which shows a probability atom at zero and a density for other values. The Monte Carlo approach allows the modelling of multiple sources of uncertainty that are intractable in closed form, particularly when the entire density function, rather than just a mean or other summary statistic, is needed.5–7
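The following is a minimal sketch of such a simulation. The mixture weights, centres, spreads, and bias mechanics are invented stand-ins for the distributions behind Figures 2 and 3, chosen only to show the structure of the calculation:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000                       # Monte Carlo replications

# Illustrative prior over the true RR (stand-in for Figure 2): with assumed
# probability 0.5 the existing result reflects disease misclassification,
# in which case the true effect is smaller than it appears.
biased = rng.random(N) < 0.5
true_rr = np.where(biased,
                   rng.normal(1.13, 0.10, N),   # bias present: smaller effect
                   rng.normal(1.30, 0.10, N))   # no bias: truth near observed

# Result of study s: tripling the sample (1000 -> 3000) shrinks sampling
# error by sqrt(3), but the same misclassification would recur, so the
# excess risk is inflated again (assumed factor 1.5) whenever bias exists.
expected_obs = np.where(biased, 1.0 + 1.5 * (true_rr - 1.0), true_rr)
observed_rr = expected_obs + rng.normal(0.0, 0.10 / np.sqrt(3), N)

# Policy value of s, before its cost c_s: we currently intervene, so the
# study changes welfare only when it reverses that decision.
reverse = observed_rr < 1.20                    # new decision: do not act
value = np.where(reverse, 200.0 - 1000.0 * (true_rr - 1.0), 0.0)

print(f"P(no policy change) = {np.mean(~reverse):.2f}  (the atom at zero)")
print(f"E(value of s) = ${value.mean():.1f} million, before subtracting c_s")
```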



[Figure 3]

 
Figure 3 shows that the welfare change has a strong mode at zero, representing the probability of not changing our behaviour. The new study has a non-zero payoff only if it tells us that our old decision was not optimal. If the study merely reinforces our previous decision, we still get the same actual benefit. This illustrates a flaw in the standard vision of when to do more research. Under the typical status-quo-biased, P-value-centric model of decision making, the first study, represented in Figure 2, would not have been enough to warrant intervention, and a further study would be needed to provide the excuse to take the policy action we already thought was right. (It turns out that s is also unlikely to pass the usual P-value test without either ignoring the misclassification or additionally carrying out validation study v, below.) Under the standard paradigm, researchers are searching for a study that will tell them again what they already think they know. This makes poor use of available information by failing to act on information in advance of some arbitrarily set level of confirmation. Additionally, it departs from the hypothetico-deductive scientific method in that it sets out to confirm existing beliefs rather than subjecting them to severe tests to see if they hold up.8

So, is s warranted? Applying Equation (3) and numerically integrating the values from Figure 3 yields E(NB_s) = $18 million − c_s. Doing s would likely be worthwhile, though not if it were extremely expensive to carry out. When there is $200 million worth of productivity at stake, it is worth a lot to make sure there really is a big health hazard.
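Equivalently, the criterion for commissioning s reduces to comparing its cost against that expected payoff:

$$\text{do } s \iff E(NB_s) > 0 \iff c_s < \$18 \text{ million}.$$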

This sounds similar to the justification for demanding a high level of statistical certainty (a low P-value) before acting. But statistical certainty rules are a poor substitute for the cost-benefit approach. In particular, the standard tests completely ignore the key values in the optimization calculation: the costs of the intervention and disease. If the cost of the disease is low enough, the cost of intervention is either very low or very high, or the cost of the new study is high enough, then further research will not be worthwhile, whatever the P-value. If the intervention is cheap enough we should be less concerned about unnecessary intervention, and regardless of the unimpressive P-value it would be best (from the economic and scientific perspective) to just change our industrial practices based on our strong suspicion and move on.

Returning to Figure 2, consider an alternative further study. The distribution represented in Figure 2 is the average of the two distributions illustrated in Figure 4. Distribution X_R represents the distribution from sampling around the actual observed data, uncorrected for misclassification. Distribution X_L represents the researchers' concern that there was disease misclassification, wherein exposed individuals were inaccurately judged to have the disease, creating a bias factor of about 1.5. There is still some apparent effect of exposure, but X_L would not justify intervention.



[Figure 4]

 
Figure 4 illustrates how the probability of suspected bias can be included in the reported uncertainty. Only by quantifying such uncertainty can we hope to assess the value of further research that reduces a particular source of uncertainty (not to mention being the only way that we can honestly report our results).

Validation study v will resolve the question of misclassification bias, and give us a revised distribution X_v that is either X_L or X_R. Is v warranted? The study pays off if we decide to act differently under X_v than we were doing under X. If X_v = X_L we should not intervene, while if X_v = X_R, we should continue our intervention. Once again this produces a result that is contrary to the usual attitude, which puts greater value on finding X_R, disclosing a more certain and serious problem. Although it is reassuring to confirm the wisdom of the apparent best action, it lacks the practical payoff of finding out we were wrong. The value of the study lies in the 0.5 probability of finding out that the $200 million expenditure generates an expected benefit of only $166 million (determined by numerically integrating X_L); halting the intervention would then save a net $34 million, giving v an expected benefit of $17 million.
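Spelled out, with c_v the cost of the validation study:

$$E(NB_v) = \underbrace{0.5}_{\Pr(X_v = X_L)} \times \underbrace{(\$200\text{m} - \$166\text{m})}_{\text{spending avoided net of benefit forgone}} - \; c_v \;=\; \$17\text{m} - c_v.$$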


    Conclusions
 
It may take only a few person-days (and almost certainly less than a few person-months) to assess whether an expensive, multi-year study is really worthwhile. Just as researchers are fond of telling policy makers that they should spend at least a few per cent of the cost of interventions to assess whether a policy is effective, it is worth a few per cent of researchers' efforts to see if their research is likely to be effective. Medical researchers whose profits depend on well-chosen research have come to recognize the value of such meta-research,2 and funders of other health research should demand nothing less. Yet not even rough-cut analyses are normally done. The epidemiology curriculum does not recommend them or even suggest that they are possible. Journal articles do not report such analysis having been done, even when high-stakes policy relevance—not to mention explicit calls for further research—cries out for it. Simply calling attention to this way of thinking could be quite beneficial, even if few formal calculations are done. It may be infrequent that a validation study has a 0.5 chance of saving $200 million, but when the possibility presents itself, everyone should have the basic tools to notice.

Are these analyses actually possible? How can we possibly calculate the densities f(X_T | X) and g(X_s | X_T)? The first step is to escape the ubiquitous implicit assumption that all quantifiable uncertainty is due to random sampling error. Obviously no one believes this. Researchers are aware that measurement error, selection bias, confounding, and model specification contribute to total uncertainty, but the test statistics reported in epidemiology lock in a way of thinking that leaves other sources of uncertainty unquantified. We need quantification of the multiple sources of uncertainty. Without that, it is impossible to think effectively about what the next study might actually accomplish. Various methods for quantifying uncertainty are available from both Bayesian9,10 and frequentist perspectives,5 and the value of such methods in health analysis is high.5,6 The emergence of these methods will require a combination of applied researchers understanding their value and methodological researchers making them easier to use.
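As a gesture toward what such quantification might look like, here is a sketch with invented numbers throughout, in the spirit of the Monte Carlo sensitivity analyses cited above:5–7 several sources of uncertainty are composed on the log scale, instead of reporting sampling error alone.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

log_rr_hat = np.log(1.25)                         # point estimate (assumed)
sampling = rng.normal(0.0, 0.08, N)               # conventional random error
misclass = np.where(rng.random(N) < 0.5,          # possible bias factor,
                    np.log(1.0 / 1.3), 0.0)       # present half the time
confounding = rng.normal(0.0, 0.05, N)            # residual confounding

total_rr = np.exp(log_rr_hat + sampling + misclass + confounding)
lo, hi = np.percentile(total_rr, [2.5, 97.5])

print(f"Sampling error only: {np.exp(log_rr_hat - 1.96 * 0.08):.2f}"
      f"-{np.exp(log_rr_hat + 1.96 * 0.08):.2f}")
print(f"All sources:         {lo:.2f}-{hi:.2f}")
```

The wider second interval is the kind of honestly reported uncertainty that makes the densities f and g estimable at all.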

It is difficult to attract research attention to a problem when there is no demand for the results. Doing cost-benefit analysis to figure out when more research is warranted is a tractable problem and a worthwhile endeavour. If researchers, policy makers, and funding agents recognize the value of such analysis, the demand will induce the necessary methodology research. The result could be a huge boost in the efficiency of the field of epidemiology at a relatively modest cost. More research is ... likely to be worth its cost.


    Acknowledgments
 
The author thanks George Maldonado and participants in several University of Minnesota School of Public Health seminars for helpful suggestions.


    References
 
1 Gold MR, Siegel JE, Russell LB, Weinstein MC (eds). Cost-Effectiveness in Health and Medicine. New York: Oxford University Press, 1996.

2 Backhouse ME. An investment appraisal approach to clinical trial design. Health Econ 1998;7:605–19.

3 Claxton K, Posnett J. An economic approach to clinical trial design and research priority-setting. Health Econ 1996;5:513–24.

4 Arrow KJ, Fisher AC. Environmental preservation, uncertainty, and irreversibility. Quart J Econ 1974;88:312–19.

5 Phillips CV, Maldonado G. Using Monte Carlo methods to quantify the multiple sources of error in studies. Am J Epidemiol 1999; 149:S17.

6 Manning WG, Fryback DG, Weinstein MC. Reflecting uncertainty in cost-effectiveness analysis. In: Gold MR, Siegel JE, Russell LB, Weinstein MC (eds). Cost-Effectiveness in Health and Medicine. New York: Oxford University Press, 1996.

7 Doubilet P, Begg CB, Weinstein MC, Braun P, McNeil BJ. Probabilistic sensitivity analysis using Monte Carlo simulation. Med Decis Making 1985;5:157–77.

8 Salmon WC. The Foundations of Scientific Inference. Pittsburgh: University of Pittsburgh Press, 1966.

9 Berger JO, Wolpert RL. The Likelihood Principle. Hayward, California: Institute of Mathematical Statistics, 1984.

10 Carlin BP, Louis TA. Bayes and Empirical Bayes Methods for Data Analysis. London: Chapman & Hall, 1996.