
Re: Severe limitations of ecologic data, not overcome by STRATIFICATION



Howard,   



I am posting this response to the list because my direct emails to you are 

returned as undeliverable.  Perhaps you need to correct the email address in 

your header (hlong@pacbell.net)?



You asked - Professor Field,



Do you believe that STRATIFICATION of potential confounders like smoking S

(Cohen used deciles, 10 levels) can test effect (or non-effect) of S on a

disease (like lung cancer LCa)?  I believe Cohen's novel technique could do 

for epidemiology what the calculus did for mechanics.

--

The short answer to your question is no.

 

Howard, as a professed epidemiologist, you likely know that stratification is 

not a novel technique in epidemiology; quite the opposite.  Please read this 

portion of a paper by Dr. Greenland that addresses the epidemiologic problems 

with ecologic studies as Dr. Cohen uses them. 



KEY MESSAGES - Quotes from Dr. Greenland's paper -



"Though it is commonly recognized that ecological studies can suffer from 

special biases in estimating individual effects, it is rarely acknowledged 

that the same biases affect ecologic estimates of contextual effects."



"Individual-level data are required to address these problems without resorting 

to controversial assumptions."



To my knowledge, Dr. Cohen never addressed Dr. Greenland's assertions below.



---------------------------------

    

Ecologic versus individual-level sources of bias in ecologic estimates of 

contextual health effects.    





Sander Greenland 



Department of Epidemiology, UCLA School of Public Health, and Department of 

Statistics, UCLA College of Letters and Science, 22333 Swenson Drive, Topanga, 

CA 90290, USA. 



Abstract



A number of authors have attempted to defend ecologic (aggregate) studies by 

claiming that the goal of those studies is estimation of ecologic (contextual 

or group-level) effects rather than individual-level effects. Critics of these 

attempts point out that ecologic effect estimates are inevitably used as 

estimates of individual effects, despite disclaimers. A more subtle problem is 

that ecologic variation in the distribution of individual effects can bias 

ecologic estimates of contextual effects. The conditions leading to this bias 

are plausible and perhaps even common in studies of ecosocial factors and 

health outcomes because social context is not randomized across typical 

analysis units (administrative regions). By definition, ecologic data contain 

only marginal observations on the joint distribution of individually defined 

confounders and outcomes, and so identify neither contextual nor 

individual-level effects. While ecologic studies can still be useful given appropriate 

caveats, their problems are better addressed by multilevel study designs, 

which obtain and use individual as well as group-level data. Nonetheless, such 

studies often share certain special problems with ecologic studies, including 

problems due to inappropriate aggregation and problems due to temporal changes 

in covariate distributions. 



Studies limited to characteristics of aggregates (groups) of individuals are 

usually termed ecologic studies, a usage that will be adopted here.1–5 This 

usage is perhaps unfortunate, for the word ‘ecologic’ suggests that such 

studies are especially appropriate for studying the impact of environmental 

factors, including societal characteristics. I will here review some 

criticisms of this notion, arguing that it arises from confusion of an 

ecologic perspective (addressing relations at the environmental or social 

level) with ecologic studies. As a number of authors have pointed out,6–12 

overcoming this confusion requires adoption of a multilevel perspective, which 

allows integration of theory and observations on all available levels: 

physiological (which examines exposures and responses of systems within 

individuals), individual (which examines exposures and responses of 

individuals), and aggregate or contextual (which examines exposures and 

responses of aggregates or clusters of individuals, such as locales or 

societies). 



Defences of ecologic studies argue (correctly) that many critics have presumed 

individual-level relations are the ultimate target of inference of all 

ecologic studies, when this is not always so,9,13,14 and that contagious 

outcomes necessitate group-level considerations in modelling regardless of the 

target level.15 They also point out that an ecologic summary may have its own 

direct effects on individual risk beyond that conferred by the contributing 

individual values; for example, average economic status of an area can have 

effects on an individual over and above the effects of the individual's 

economic status.16,17 Unfortunately, some defences go on to make implicit 

assumptions to ‘prove’ that entire classes of ecologic studies are valid, or 

at least no less valid than individual-level analyses; see Greenland and 

Robins,18,19 Morgenstern,5 and Naylor20 for critical commentaries against such 

arguments in the health sciences. Some ecologic researchers are well aware of 

these problems and explicate the assumptions they use,21,22 but still draw 

criticism because of the sensitivity of inferences to those assumptions.23–25 

Thus I will review some controversial assumptions that appear common in 

ecologic analyses of epidemiological data. Finally, I will briefly discuss 

multilevel methods that represent both individual-level and ecologic data 

within a single model. 



The present paper relies on simple illustrations designed to make the points 

transparent to non-mathematical readers, and focuses on problems of 

confounding and specification bias; a companion paper12 provides an overview 

of the underlying mathematical theory. Many other issues have been raised in 

the ongoing ecologic-study controversy; see the references for details, 

especially those in the Discussion section. 



How Ecologic Confounding Depends on Joint Individual-level Distributions



There are two major types of measurements on aggregates: Summaries of 

distributions of individuals within aggregates, such as mean age and per cent 

female; and purely ecologic (contextual) variables that are defined directly 

on the aggregate, such as whether there is a needle-exchange programme in an 

area. The causal effects of the latter purely contextual variables are the 

focus of much social research and ecosocial epidemiology.9,10,13,26,27 

Nonetheless, most outcome variables of public-health importance are summaries 

of individual-level distributions, such as prevalence, incidence, mortality, 

and life expectancy, all of which can be expressed in terms of average 

individual outcomes.28 Furthermore, many contextual variables are measured by 

surrogates that are summaries over individuals; for example, neighbourhood 

social class is often measured by average income and average education. 



The presence of summary measures in an ecologic analysis introduces a major 

source of uncertainty in ecologic inference: Effects on summaries depend on 

the joint individual-level distributions within aggregates, but distributional 

summaries do not fully determine (and sometimes do not even seriously 

constrain) those joint distributions. This problem corresponds to 

the ‘information lost due to aggregation’, and is a key source of controversy 

about ecologic studies.29 
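
To illustrate the point (this sketch is not from Dr. Greenland's paper, and the proportions are hypothetical): if an ecologic data set reports only the fraction of an area with a risk factor X and the fraction who smoke, the fraction with both can lie anywhere between the Fréchet bounds computed below. The function name joint_bounds and the inputs 0.4 and 0.5 are assumptions chosen for the example.

```python
def joint_bounds(p_x, p_s):
    """Range of P(X=1 and S=1) compatible with the margins P(X=1)=p_x, P(S=1)=p_s."""
    lower = max(0.0, p_x + p_s - 1.0)  # overlap is forced only when the margins sum past 1
    upper = min(p_x, p_s)              # overlap cannot exceed the smaller margin
    return lower, upper


# Hypothetical area: 40% have risk factor X, 50% smoke.
low, high = joint_bounds(0.4, 0.5)
print(f"P(X and smoker) can be anywhere from {low:.2f} to {high:.2f}")
```

With these margins, anywhere from none to all of the people with X could be smokers, so the summaries alone say almost nothing about how X and smoking combine within individuals.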



Panel 1 of Table 1 illustrates this problem. For simplicity, just two areas 

are used here, but examples with many areas have also been given.18 Suppose we 

wish to assess a contextual effect, i.e. the impact of an ecologic difference 

between areas A and B (such as a difference in laws or social programmes) on 

the rate of a health outcome, and we measure this effect by the amount RR_A 

that this difference multiplies the rate (the true effect of being in A versus 

being in B). One potential risk factor X differs in distribution between the 

areas; an effect of X (measured by the rate ratio RR_X comparing X = 1 to X = 0 

within areas) may be present, but we observe no difference in rates between 

the areas. 
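
A minimal numerical sketch of this situation follows, assuming a simple multiplicative model with no interaction. It is not a reproduction of Table 1; the prevalences, baseline rates, and rate ratios are all hypothetical. It shows two different combinations of the area effect and the X effect that produce identical ecologic data (equal rates in A and B, same prevalences of X), so the ecologic data alone cannot distinguish them.

```python
def area_rate(baseline, rr_area, rr_x, prev_x):
    """Area rate when X multiplies individual rates by rr_x and the area by rr_area."""
    return baseline * rr_area * ((1 - prev_x) + prev_x * rr_x)


PREV_X_A, PREV_X_B = 0.6, 0.4  # hypothetical prevalence of X in areas A and B

# Hypothetical scenarios:
# (rate per 100,000 among X = 0 persons in B, rate ratio for A vs B, rate ratio for X)
scenarios = {
    "no effect of area or X": (100.0, 1.0, 1.0),
    "X harmful, area A protective": (100.0 / 1.4, 0.875, 2.0),
}

results = {}
for name, (baseline, rr_a, rr_x) in scenarios.items():
    rate_a = area_rate(baseline, rr_a, rr_x, PREV_X_A)
    rate_b = area_rate(baseline, 1.0, rr_x, PREV_X_B)  # B is the reference area
    results[name] = (rate_a, rate_b)
    print(f"{name}: rate A = {rate_a:.1f}, rate B = {rate_b:.1f}")
```

Both scenarios give a rate of 100.0 per 100,000 in each area. In the second, a protective area effect (rate ratio 0.875) exactly offsets the excess produced by the higher prevalence of a harmful X (rate ratio 2) in area A.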



Etc........................

References



1 Langbein LI, Lichtman AJ. Ecological Inference. Series/No. 07–010, Thousand 

Oaks, CA: Sage, 1978.



2 Piantadosi S, Byar DP, Green SB. The ecological fallacy. Am J Epidemiol 

1988;127:893–904.



3 Cleave N, Brown PJ, Payne CD. Evaluation of methods for ecological 

inference. J Roy Stat Soc Ser A 1995;158:55–72.



4 Plummer M, Clayton D. Estimation of population exposure in ecological 

studies. J Roy Stat Soc Ser B 1996;58:113–26.





5 Morgenstern H. Ecologic studies. In: Rothman KJ, Greenland S (eds). Modern 

Epidemiology. 2nd Edn. Philadelphia: Lippincott, 1998, pp.459–80.



6 Firebaugh G. Assessing group effects. In: Borgatta EF, Jackson DJ (eds). 

Aggregate Data: Analysis and Interpretation. Beverly Hills: Sage, 1980, 

pp.13–24.



7 Von Korff M, Koepsell T, Curry S, Diehr P. Multi-level analysis in 

epidemiologic research on health behaviors and outcomes. Am J Epidemiol 

1992;135:1077–82.



8 Navidi W, Thomas D, Stram D, Peters J. Design and analysis of multilevel 

analytic studies with applications to a study of air pollution. Environ Health 

Persp 1994;102(Suppl.8):25–32.



9 Susser M, Susser E. Choosing a future for epidemiology: II. From black box 

to Chinese boxes and eco-epidemiology. Am J Public Health 1996;86:674–77.



10 Duncan C, Jones K, Moon G. Health-related behaviour in context: a 

multilevel modeling approach. Soc Sci Med 1996;42:817–30.





11 Duncan C, Jones K, Moon G. Context, composition and heterogeneity: using 

multilevel models in health research. Soc Sci Med 1998;46:97–117.



12 Greenland S. A review of multilevel theory for ecologic analysis. Stat Med 

2001;20:to appear.



13 Schwartz S. The fallacy of the ecological fallacy: the potential misuse of 

a concept and the consequences. Am J Public Health 1994;84:819–24.



14 Pearce N. Traditional epidemiology, modern epidemiology, and public health. 

Am J Public Health 1996;86:678–83.



15 Koopman JS, Longini IM Jr. The ecological effects of individual exposures 

and nonlinear disease dynamics in populations. Am J Public Health 

1994;84:836–42.



16 Firebaugh G. A rule for inferring individual-level relationships from 

aggregate data. Am Soc Rev 1978;43:557–72.



17 Hakama M, Hakulinen T, Pukkala E, Saxen F, Teppo L. Risk indicators of 

breast and cervical cancer on ecologic and individual levels. Am J Epidemiol 

1982;116:990–1000.



18 Greenland S, Robins J. Ecologic studies—biases, misconceptions, and 

counterexamples. Am J Epidemiol 1994;139:747–60.



19 Greenland S, Robins JM. Accepting the limits of ecologic studies. Am J 

Epidemiol 1994;139:769–71.



20 Naylor CD. Ecological analysis of intended treatment effects: caveat 

emptor. J Clin Epidemiol 1999;52:1–5.



21 King G. A Solution to the Ecological Inference Problem. Princeton: 

Princeton University Press, 1997.



22 King G. The future of ecological inference (letter). J Am Stat Assoc 

1999;94:352–54.



23 Rivers D. Review of ‘A solution to the ecological inference problem.’ Am 

Pol Sci Rev 1998;92:442–43.



24 Freedman DA, Klein SP, Ostland M, Roberts MR. Review of ‘A solution to the 

ecological inference problem.’ J Am Stat Assoc 1998;93:1518–22.



25 Freedman DA, Ostland M, Roberts MR, Klein SP. Reply to King (letter). J Am 

Stat Assoc 1999;94:355–57.



26 Borgatta EF, Jackson DJ (eds.). Aggregate Data: Analysis and 

Interpretation. Beverly Hills: Sage, 1980.



27 Iversen GR. Contextual Analysis. Thousand Oaks, CA: Sage, 1991.



28 Rothman KJ, Greenland S. Modern Epidemiology. 2nd Edn. Philadelphia: 

Lippincott, 2000.



29 Achen CH, Shively WP. Cross-Level Inference. Chicago: University of Chicago 

Press, 1995.



30 Duncan OD, Davis B. An alternative to ecological correlation. Am Soc Rev 

1953;18:665–66.



31 Cohen BL. Ecological versus case-control studies for testing a 

linear-no-threshold dose-response relationship. Int J Epidemiol 1990;19:680–84.



32 Cohen BL. In defense of ecological studies for testing a linear 

no-threshold theory. Am J Epidemiol 1994;139:765–68.



33 Cohen BL. Re: Parallel analyses of individual and ecologic data on residential 

radon, cofactors, and lung cancer in Sweden (letter). Am J Epidemiol 

2000;152:194–95.



34 Susser M. The logic in ecological. Am J Public Health 1994;84:825–35.



35 Openshaw S, Taylor PH. The modifiable area unit problem. In: Wrigley N, 

Bennett RJ (eds). Quantitative Geography. London: Routledge, 1981, Ch. 9.



36 Sheppard L. Insights on bias and information in group-level studies. 

Biostatistics 2002;to appear.



37 Freedman DA, Klein SP, Sacks J, Smyth CA, Everett CG. Ecological regression 

and voting rights (with discussion). Eval Rev 1998;15: 673–816.



38 Greenland S, Morgenstern H. Ecological bias, confounding, and effect 

modification. Int J Epidemiol 1989;18:269–74.



39 Greenland S, Morgenstern H. Neither within-region nor cross-regional 

independence of covariates prevents ecological bias (letter). Int J Epidemiol 

1991;20:816–18.



40 Richardson S, Hémon D. Ecological bias and confounding (letter). Int J 

Epidemiol 1990;19:764–66.



41 Piantadosi S. Ecologic biases. Am J Epidemiol 1994;139:761–64.



42 Stidley C, Samet JM. Assessment of ecologic regression in the study of lung 

cancer and indoor radon. Am J Epidemiol 1994;139:312–22.



43 Lagarde F, Pershagen G. Parallel analyses of individual and ecologic data 

on residential radon, cofactors, and lung cancer in Sweden. Am J Epidemiol 

1999;149:268–74.



44 Lagarde F, Pershagen G. The authors reply (letter). Am J Epidemiol 

2000;152:195.



45 Cho WTK. If the assumption fits: a comment on the King ecologic inference 

solution. Pol Anal 1998;7:143–63.



46 Stoto MA. Review of ‘Ecological inference in public health.’ Pub Health Rep 

1998;113:182–83.



47 Wen SW, Kramer MS. Uses of ecologic studies in the assessment of intended 

treatment effects. J Clin Epidemiol 1999;52:7–12.



48 Greenland S. Randomization, statistics, and causal inference. Epidemiology 

1990;1:421–29.



49 Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal 

inference. Stat Sci 1999;14:29–46.





50 Goodman LA. Some alternatives to ecological correlation. Am J Sociol 

1959;64:610–25.



51 Robins JM, Murphy S, Greenland S. Towards a formal theory of causation in 

ecologic and multilevel studies. J Roy Stat Soc Ser A, In Press.



52 Vaupel JW, Manton KG, Stallard E. The impact of heterogeneity in 

individual frailty on the dynamics of mortality. Demography 1979;16:439–54.



53 Richardson S, Stücker I, Hémon D. Comparison of relative risks obtained in 

ecological and individual studies: some methodological considerations. Int J 

Epidemiol 1987;16:111–20.



54 Dobson AJ. Proportional hazards models for average data for groups. Stat 

Med 1988;7:613–18.



55 Prentice RL, Sheppard L. Aggregate data studies of disease risk factors. 

Biometrika 1995;82:113–25.



56 Lasserre V, Guihenneuc-Jouyaux C, Richardson S. Biases in ecological 

studies: utility of including within-area distribution of confounders. Stat 

Med 2000;19:45–59.





57 Guthrie KA, Sheppard L. Overcoming biases and misconceptions in ecologic 

studies. J Roy Stat Soc Ser A 2001;164:141–54.



58 Prentice RL, Sheppard L. Validity of international, time trend, and migrant 

studies of dietary factors and disease risk. Prev Med 1989;18:167–79.



59 Kleinman JC, DeGruttola VG, Cohen BB, Madans JH. Regional and 

urban-suburban differentials in coronary heart disease mortality and risk factor 

prevalence. J Chron Dis 1981;34:11–19.



60 Sheppard L, Prentice RL. On the reliability and precision of within- and 

between-population estimates of relative rate parameters. Biometrics 

1995;51:853–63.



61 Wakefield J. Ecological inference for 2 x 2 tables. J Roy Stat Soc 2002; to 

appear.



62 Goldstein H. Multilevel Statistical Models. New York: Edward Arnold, 1995.



63 Stavraky KM. The role of ecologic analysis in studies of the etiology of 

disease: a discussion with reference to large bowel cancer. J Chron Dis 

1976;29:435–44.





64 Polissar L. The effect of migration on comparison of disease rates in 

geographic studies in the United States. Am J Epidemiol 1980;111:175–82.



65 Greenland S. When should epidemiologic regressions use random coefficients? 

Biometrics 2000;56:915–21.



66 Rosenbaum PR, Rubin DB. Difficulties with regression analyses of 

age-adjusted rates. Biometrics 1984;40:437–43.



67 Brenner H, Savitz DA, Jöckel K-H, Greenland S. Effects of nondifferential 

exposure misclassification in ecologic studies. Am J Epidemiol 1992;135:85–95.



68 Carroll RJ. Some surprising effects of measurement error in an aggregate 

data estimator. Biometrika 1997;84:231–34.



69 Brenner H, Greenland S, Savitz DA. The effects of nondifferential 

confounder misclassification in ecologic studies. Epidemiology 1992;3:456–59.



70 Wakefield J, Salway R. A statistical framework for ecological and aggregate 

studies. J Roy Stat Soc Ser A 2001;164:119–37.





71 Morgenstern H. Ecologic study. In: Armitage P, Colton T (eds). Encyclopedia 

of Biostatistics. Vol. 2. Chichester: Wiley, 1998, pp.1255–76.




