A comparison of cigarette smokers recruited through the Internet or by mail

Jean-François Etter and Thomas V Perneger

Institute of Social and Preventive Medicine, University of Geneva, Switzerland.

Jean-François Etter, Institute of Social and Preventive Medicine, University of Geneva. CMU, Case Postale, CH-1211 Geneva 4, Switzerland. E-mail: etter@cmu.unige.ch. Internet: www.stop-tabac.ch

Abstract

Objectives To compare smokers recruited by mail or through the Internet.

Methods A questionnaire was mailed to 19 352 inhabitants of Switzerland in 1998, in an effort to enrol them in a smoking cessation trial. The same questionnaire was also available on the Internet. Furthermore, we mailed a survey to a representative sample (n = 1000) of the population of Geneva, Switzerland, in 1996. In this study, we compare three groups: 1027 smokers recruited through the Internet, 2961 volunteer trial participants recruited by mail (response rate 16%), and 211 smokers in the representative sample also recruited by mail (response rate 75%).

Results Smokers self-recruited through the Internet were younger, more educated, more motivated to quit smoking and smoked more cigarettes per day than smokers in the other samples. Compared to trial participants, Internet participants had more negative attitudes towards smoking, higher self-efficacy scores, and were more addicted to tobacco. The strength of associations between smoking-related variables was similar in Internet and trial participants.

Conclusion As expected, the three groups of smokers differed on several characteristics. However, bias in distributions of variables did not imply bias in associations between variables. Thus, Internet recruitment is a potentially useful method for analytical studies that focus on associations between variables.

Keywords Smoking prevention and control, internet, mail surveys, bias

Accepted 17 October 2000

The Internet is a fast and cost-effective tool for data collection in epidemiological and medical research.1–6 Internet questionnaires are easy to design and to answer, and computer programs can provide real-time evaluation of answers and ensure that the data are complete and accurate before they are accepted.2 Internet surveys are also cheap, and they eliminate data entry errors by research assistants.

Despite this potential, few published studies have used health-related data collected on the Internet,3–6 or compared data collected on the Internet with similar data collected by mail.4 Concerns about selection bias, data quality, eligibility of participants, representativeness of samples or about the possibility of the same person being registered several times could explain the limited use of the Internet in health research. However, these hypotheses are largely untested.

We compared smokers who volunteered for a smoking cessation trial and were recruited by mail to smokers self-recruited through the Internet to receive smoking cessation counselling. Since both Internet participants and trial participants may differ from unselected smokers in the general population, we compared these two samples to a representative sample of smokers also surveyed by mail. We examined bias in distributions of smoking-related variables and in associations between variables.

Methods

Setting
Internet participants
Between June 1998 and February 1999, Internet participants visited a French-language Internet site on smoking cessation (www.stop-tabac.ch). They found the site either through search engines or from links on health-related web sites. Participants were invited to answer a 61-item questionnaire in order to obtain, a few seconds later, a smoking-cessation counselling letter tailored to their answers. Participants could choose whether or not they wanted their data to be archived, and they were informed that archived data would be used for statistical analyses.

Trial volunteers recruited by mail
A random list of 20 000 addresses was drawn from the official file of residents of the French-speaking part of Switzerland. These people (aged 18–60) received by mail a questionnaire and an invitation to participate in a randomized trial aimed at evaluating a smoking cessation counselling programme. This programme included a series of individually tailored counselling letters and stage-matched booklets (available on www.stop-tabac.ch). The questionnaire used to produce the counselling letters was the same in the Internet and trial samples. Only daily smokers were eligible for the trial. Those who were not eligible and smokers who declined participation were asked to transmit the questionnaire to any smoker they knew. Participants who received the questionnaire directly from us and those who received it indirectly, from an addressee, were similar on most demographic and smoking-related variables.7 Therefore, these respondents were grouped together. This questionnaire was mailed only once and no reminder mailings were sent out. Participants in the trial were informed that they would have to answer a follow-up questionnaire 6 months later.

Representative population sample
A questionnaire on smoking and alcohol prevention was mailed in 1996 to a representative (random) sample of 1000 Geneva residents aged 18–70 years, identified through the official resident registry. This questionnaire did not contain psychometric scales on smoking-related attitudes, self-efficacy or self-change strategies. Non-respondents received up to five reminder mailings and transmission of the questionnaire was not allowed.8

Group comparisons
Because we knew of no published comparison between smokers recruited by mail or through the Internet, we did not specify a priori hypotheses on between-group differences. Rather, we compared groups according to variables that are relevant to smoking prevention.

We compared the Internet, trial and representative samples on age, sex, number of school years, having children, number of cigarettes smoked per day, having made a 24-hour attempt to quit smoking in the past year, and stages of change. We classified current smokers in three stages of change: precontemplation (no intention to quit in the next 6 months), contemplation (seriously considering quitting in the next 6 months) or preparation (decided to quit in the next 30 days and has made an attempt to quit in the past 12 months).9
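The three-stage classification above can be sketched as a small function. This is only an illustration: the function name and its three boolean parameters are hypothetical and do not correspond to the wording of the questionnaire items. It also assumes that a smoker who has decided to quit in the next 30 days but has made no quit attempt in the past 12 months falls back to contemplation, a case the text does not spell out.

```python
def classify_stage(considering_6_months: bool,
                   decided_30_days: bool,
                   quit_attempt_12_months: bool) -> str:
    """Assign a current smoker to one of three stages of change.

    Parameter names are hypothetical; they stand for the smoker's answers
    about quitting intentions and recent quit attempts.
    """
    # Preparation requires BOTH a decision to quit within 30 days
    # and a quit attempt in the past 12 months.
    if decided_30_days and quit_attempt_12_months:
        return "preparation"
    # Assumption: an intention to quit without a recent attempt
    # is treated as contemplation.
    if considering_6_months or decided_30_days:
        return "contemplation"
    # No intention to quit in the next 6 months.
    return "precontemplation"


if __name__ == "__main__":
    print(classify_stage(True, True, True))
    print(classify_stage(True, False, False))
    print(classify_stage(False, False, False))
```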

The following variables were available for Internet and trial participants only:

Data quality
Multiple registration among Internet participants (in case of duplicate record, we accepted only the first record), proportion of obviously unreliable questionnaires (i.e. questionnaires that contained a majority of missing answers or a majority of contradictory answers) and, among valid questionnaires, proportion of missing answers.

Components of the Transtheoretical Model of Behaviour Change
Smoking-related self-efficacy, evaluation of the adverse effects of smoking, and the frequency of use of self-change strategies.9 Self-efficacy was assessed with a 12-item scale measuring two dimensions: confidence in one's ability to refrain from smoking when facing internal stimuli (e.g. feeling depressed), and external stimuli (e.g. having a drink with friends).10 Evaluation of the negative effects of smoking was assessed with a 10-item scale.11 Involvement in behaviour change was assessed with a 19-item scale measuring the frequency of use of five self-change strategies labelled ‘Risk assessment’, ‘Commitment to quit smoking’, ‘Taking control over the smoking habit’, ‘Coping with the temptation to smoke’ and ‘Helping relationships’.12 These multi-item scales were previously submitted to comprehensive validation tests.10–12 For a better interpretation of these psychometric scores, readers are referred to published data on the association between these scores and stages of change, level of dependence, and smoking status.10–12

Level of addiction to tobacco was measured by the number of minutes between waking up and smoking the first cigarette of the day.13

Associations between variables
Bias in the distributions of variables does not necessarily imply bias in associations between variables. We compared the strength of associations between smoking-related variables in Internet and trial participants. The Transtheoretical Model of Change was used as a framework for these comparisons.9 First, we compared the size of differences between smokers in the precontemplation stage and smokers in the contemplation or preparation stages, on scores of attitudes towards smoking, self-efficacy and self-change strategies. Second, we compared differences on these scores between light smokers (<15 cig./day) and heavy smokers (>20 cig./day).
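One way to picture this comparison is as a difference between two mean differences: the gap in a score between two subgroups (for example precontemplation versus contemplation or preparation) is computed in each sample, and the two gaps are then compared. The sketch below is an illustration under stated assumptions, not the analysis actually run in the study: the function names are hypothetical, the groups are treated as independent, and the 95% confidence interval uses a normal approximation.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference(group_a, group_b):
    """Return (difference in means, standard error) for two independent groups."""
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return diff, se


def compare_associations(internet_a, internet_b, trial_a, trial_b):
    """Compare the size of a subgroup difference across two samples.

    Returns the gap between the Internet and trial mean differences,
    with an approximate 95% confidence interval (normal approximation,
    an assumption of this sketch).
    """
    d_internet, se_internet = mean_difference(internet_a, internet_b)
    d_trial, se_trial = mean_difference(trial_a, trial_b)
    gap = d_internet - d_trial
    se_gap = sqrt(se_internet ** 2 + se_trial ** 2)
    return gap, (gap - 1.96 * se_gap, gap + 1.96 * se_gap)


if __name__ == "__main__":
    # Illustrative numbers only, not study data.
    gap, ci = compare_associations(
        [52, 54, 56, 58], [42, 44, 46, 48],
        [50, 52, 54, 56], [40, 42, 44, 46],
    )
    print(gap, ci)
```

A confidence interval for the gap that includes zero would be compatible with associations of similar strength in the two samples.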

Statistical procedures
Psychometric scales were expressed in standardized scores (mean = 50, SD = 10).9 We used χ² tests to compare categorical variables, t-tests or ANOVA to compare continuous variables and to compute confidence intervals, and multivariate logistic regression to identify variables independently associated with Internet versus mail recruitment.
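A standardized score with mean 50 and SD 10 is a linear rescaling of the raw score: 50 + 10 × (raw − mean) / SD. The following minimal sketch shows the arithmetic. The function name is hypothetical, and the choice of reference group and of the sample SD (rather than the population SD) are assumptions, since the text does not specify them.

```python
from statistics import mean, stdev


def standardize(raw_scores, target_mean=50.0, target_sd=10.0):
    """Rescale raw scale scores to a given mean and standard deviation."""
    m = mean(raw_scores)
    s = stdev(raw_scores)  # sample SD; an assumption of this sketch
    if s == 0:
        raise ValueError("all scores are identical; SD is zero")
    return [target_mean + target_sd * (x - m) / s for x in raw_scores]


if __name__ == "__main__":
    # Illustrative raw scores only, not study data.
    print(standardize([12, 15, 18, 21, 24]))
```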

Results

Participation
Internet
About 15 000 people visited the Internet site between June 1998 and February 1999. The questionnaire was answered 1975 times, and 394 people (20%) asked us not to store their data, thus 1581 records were stored. We deleted 15 records that we estimated to be obviously unreliable (0.9%), 235 records of people registered twice (15%), and 62 follow-up assessments (4%). Of 1269 first assessment participants, 1027 daily smokers (81%) were included in subsequent analyses; occasional smokers were excluded.
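The record counts in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the variable names are hypothetical, and the choice of denominators (answered questionnaires for refusals, stored records for deletions, first assessments for daily smokers) is an assumption inferred from the reported percentages.

```python
answered = 1975          # times the questionnaire was answered
declined_storage = 394   # people who asked not to store their data
stored = answered - declined_storage

unreliable = 15          # obviously unreliable records
duplicates = 235         # records of people registered twice
follow_ups = 62          # follow-up assessments
first_assessments = stored - unreliable - duplicates - follow_ups

daily_smokers = 1027     # included in subsequent analyses


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("stored records:", stored)
    print("first assessments:", first_assessments)
    print("declined storage (%):", pct(declined_storage, answered))
    print("unreliable (%):", pct(unreliable, stored))
    print("duplicates (%):", pct(duplicates, stored))
    print("follow-ups (%):", pct(follow_ups, stored))
    print("daily smokers (%):", pct(daily_smokers, first_assessments))
```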

Trial survey
Of 19 352 mailings sent to valid addresses, 3124 questionnaires were returned (16%), including 2961 by daily smokers (96%).

Representative survey
We received 751 questionnaires (75% of 1000 questionnaires sent out), from 211 smokers (28% of 751) and 540 non-smokers. Only smokers were included in this study.

Differences in distributions of variables
All participants in the representative sample and all but six participants (0.4%) in the trial lived in Switzerland. Internet participants lived in France (n = 337, 33%), Switzerland (n = 296, 29%), Canada (n = 149, 15%), Belgium (n = 31, 3%), and other countries (n = 214, 21%). Differences between Internet participants who lived in Switzerland or in other countries were small (data not shown). Compared to participants in the trial and to the representative sample, Internet participants were younger, more educated, less likely to have children; they smoked more cigarettes per day; more had attempted to quit smoking in the past year, and more were in the contemplation or preparation stages of change (Table 1).


Table 1 Characteristics of daily smokers recruited through the Internet, daily smokers recruited by mail for inclusion in a smoking cessation trial and of a representative population sample of smokers in Geneva, Switzerland, 1996–1998
 
Compared to trial participants, Internet participants were more frequently men; they were more addicted to tobacco (they smoked their first cigarette of the day earlier), and had higher scores on the ‘Adverse effects of smoking’ and ‘External stimuli’ self-efficacy scales. They were also more actively involved in changing their smoking behaviour, as shown by their higher scores on four of the five self-change strategies (Table 1).

In multivariate analysis, the following variables were independently associated with Internet recruitment: younger age, male sex, more education, contemplation or preparation stages (versus precontemplation), a more severe addiction to tobacco (i.e. smoking more cigarettes and smoking one's first cigarette earlier in the morning), more frequent use of the strategies labelled ‘Commitment to quit smoking’ and ‘Helping relationships’, and a higher ‘External stimuli’ self-efficacy score (Table 2).


Table 2 Odds ratios of belonging to the Internet sample versus trial sample, from a multivariate logistic regression model. Geneva, Switzerland, 1998
 
Differences in associations between variables
Between-stage differences in attitudes, self-efficacy and the use of self-change strategies were similar in Internet and trial participants (Table 3). Differences between light and heavy smokers were also similar in both groups, except for small differences in the ‘Internal stimuli’ self-efficacy score and in the ‘Helping relationships’ score (Table 3). Thus, only 2 of 16 between-group comparisons in the strength of associations between variables, compared to 14 of 17 comparisons in the distributions of variables, differed significantly between the Internet and trial samples.


Table 3 Differences in psychometric scores between smokers in precontemplation versus contemplation/preparation stages of change, and between light and heavy smokers, for participants self-recruited through the Internet or recruited by mail for inclusion in a smoking cessation trial. Geneva, Switzerland, 1998
 
Discussion

A selection bias towards young educated males was observed in smokers self-recruited through the Internet for receiving smoking cessation advice, compared to a representative sample of smokers, and compared to smokers recruited by mail for inclusion in a smoking cessation trial. Participation in the Internet survey required computer literacy and access to a computer connected to the Internet. As use of the Internet is not yet widespread, the Internet sample may be biased towards relatively privileged ‘innovators’. This type of bias may decrease over time.

Because they were recruited for a smoking cessation intervention, both Internet and trial participants were more motivated to quit smoking than smokers in the representative sample. Internet participants were more motivated to quit and were more addicted to tobacco than trial participants. There are several possible reasons for this difference. The first is information bias: the medium (screen or paper) may influence the answers. Then, there are several types of selection bias. One stems from having access to Internet technology, as stated above. Another is due to the active versus passive mode of recruitment: Internet participants actively sought our web site, whereas trial participants received the questionnaire without having requested it. A further possible source of selection bias is that trial participants had to accept the requirements of a randomized clinical trial, including the risk of being allocated to the control arm, and the necessity of providing follow-up information 6 months later. These possible reasons for differences between Internet and trial participants are confounded in our study and their respective contributions cannot be distinguished.

The trial sample also differed from the representative sample on several variables, in part because the number of mailings, and hence the participation rate, were not similar in the two surveys. More importantly, these differences confirm that smokers who volunteer for smoking cessation studies differ from smokers in the general population.14

In contrast with the numerous differences in descriptive statistics, the strength of associations between smoking-related variables was similar in the Internet and trial samples. This is an important result, showing that bias in distributions of variables does not imply bias in associations between variables. These results suggest that Internet recruitment is potentially useful for analytical studies that are focused on associations between variables. However, this finding needs to be replicated, as it may not apply to variables other than those we measured, or to other populations.

The quality of data collected through the Internet was comparable to the quality of data collected by mail. Few unreliable records (<1%) were found in the Internet database and participants who were registered twice were easily identified. All data collected through the Internet are time-stamped and identified with the code of the participant's computer. This information can be used to delete duplicate responses.2
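The duplicate-removal rule described here (keep only the first record for each participant's computer) can be sketched as follows. The field names `timestamp` and `computer_code` are hypothetical, and the sketch assumes that one computer code corresponds to one participant, which would not hold for shared computers.

```python
from datetime import datetime


def keep_first_records(records):
    """Keep only the earliest record for each computer code.

    `records` is a list of dicts with hypothetical keys
    "timestamp" (datetime) and "computer_code" (str).
    """
    first_by_code = {}
    # Process records in chronological order so the first one wins.
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        first_by_code.setdefault(rec["computer_code"], rec)
    return list(first_by_code.values())


if __name__ == "__main__":
    # Illustrative records only, not study data.
    sample = [
        {"timestamp": datetime(1998, 7, 2, 10, 0), "computer_code": "A"},
        {"timestamp": datetime(1998, 6, 15, 9, 30), "computer_code": "A"},
        {"timestamp": datetime(1998, 8, 1, 14, 0), "computer_code": "B"},
    ]
    for rec in keep_first_records(sample):
        print(rec["computer_code"], rec["timestamp"].isoformat())
```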

The possibility of identifying the respondents raises the issue of confidentiality of data collected through the Internet. Researchers using the Internet should make sure that study procedures respect regulations on data protection, and that survey participants provide informed consent. In particular, participants should be informed that their answers are stored, they should be told what will be done with their data, and they should be given the possibility of refusing storage. Failure to respect principles of informed consent and data protection may generate mistrust, which in turn may negatively affect future epidemiological research on the Internet.

Collecting data on the Internet is feasible and can provide data of good quality and large samples (our database currently includes 30 000 people and is still growing). Such large databases allow for innovative analyses of narrowly defined subgroups of participants. If participants are given an identification code, follow-up studies can be conducted on the Internet.

Smokers recruited through the Internet differ from smokers in the general population, and so Internet recruitment cannot be used to describe the characteristics of populations other than Internet users. Nevertheless, our results suggest that the Internet may be a cost-effective method of data collection for analytical studies that assess associations between variables.

Acknowledgments

This research was supported by grants from the Swiss National Science Foundation to Dr Etter (32–47122–96, 3233–054994.98 and 3200–055141.98), by the Swiss Cancer League, the Swiss Federal Office of Public Health, Health Authorities of the Cantons of Geneva and Jura, the Geneva Cancer League and the Swiss Foundation for Health Promotion.

References

1 Rothman KJ, Cann CI, Walker AM. Epidemiology and the internet. Epidemiology 1997;8:123–25.

2 Houston JD, Fiore DC. Online medical surveys: using the Internet as a research tool. MD Comput 1998;15:116–20.

3 Soetikno RM, Mrad R, Pao V, Lenert LA. Quality-of-life research on the Internet: feasibility and potential biases in patients with ulcerative colitis. J Am Med Inform Assoc 1997;4:426–35.

4 Ross MW, Tikkanen R, Mansson SA. Differences between Internet samples and conventional samples of men who have sex with men: implications for research and HIV interventions. Soc Sci Med 2000;51:749–58.

5 Jones R, Pitt N. Health surveys in the workplace: comparison of postal, email and World Wide Web methods. Occup Med (Lond) 1999;49:556–58.

6 Pettit FA. Exploring the use of the World Wide Web as a psychology data collection tool. Comp Hum Behav 1999;15:67–71.

7 Etter JF, Perneger TV. Snowball sampling by mail: application to a survey of smokers in the general population. Int J Epidemiol 2000;29:43–48.

8 Etter JF, Perneger TV, Ronchi A. Distributions of smokers by stage: international comparison and association with smoking prevalence. Prev Med 1997;26:580–85.

9 Prochaska JO, DiClemente CC, Norcross JC. In search of how people change. Applications to addictive behaviors. Am Psychol 1992;47:1102–14.

10 Etter JF, Bergman MM, Humair JP, Perneger TV. Development and validation of a scale measuring self-efficacy of current and former smokers. Addiction 2000;95:901–13.

11 Etter JF, Humair JP, Bergman MM, Perneger TV. Development and validation of the Attitudes Towards Smoking scale (ATS-18). Addiction 2000;95:613–25.

12 Etter JF, Bergman MM, Perneger TV. On quitting smoking: development of two scales measuring the use of self-change strategies in current and former smokers. Addictive Behaviors 2000;25:523–38.

13 Heatherton TF, Kozlowski LT, Frecker RC, Rickert WS, Robinson J. Measuring the heaviness of smoking using self-reported time to first cigarette of the day and number of cigarettes smoked per day. Br J Addict 1989;84:791–800.

14 Hughes JR, Giovino GA, Klevens RM, Fiore MC. Assessing the generalizability of smoking studies. Addiction 1997;92:469–72.