Epidemiology for Journalists
Some simple, basic principles and definitions to cope with a science that keeps cropping up in stories about health, safety and the environment.
Edited excerpts from the FACS publication, "Epidemiology for Journalists," copyright 1994 by the Foundation for American Communications (FACS)
By Daniel Wartenberg
Posted to FACSnet April 23, 1996
Epidemiology is the study of patterns of disease: who has disease, how much disease they have and why they have it. The object is to find out who gets sick and why, and in turn to help us avoid exposure to whatever makes us sick.
Most epidemiological studies are done in the field rather than the lab. They study groups of people, looking for what they have in common and searching for principles that will apply to most (if not all) study subjects. The science has developed carefully designed protocols to tease out the important risk factors from these complex relationships. Along the way researchers have to screen out factors that could lead to false conclusions, and, to further complicate matters, they can study the same population under the same conditions only once. It is extremely difficult, and often impossible, to replicate the specific history and experiences of any study group.
Epidemiologists look for differences in disease rates in different groups. If the rate differences are substantial enough, then they ask if the differences are due to some particular exposure the groups may have experienced, or whether the differences should be chalked up to random statistical fluctuations.
Starting a Study
To determine whether a specific exposure (or risk factor) causes disease, there are three main criteria that epidemiologists use:
- Temporality. It seems obvious, but exposure must precede disease. If a person develops lung cancer and then begins smoking cigarettes, the cigarettes cannot be the cause of the disease. (In the late 1950s, before the link between cigarettes and lung cancer had been clearly established, eminent scientists actually suggested disease led to smoking.)
- Consistency. The same type of effect must show up in a variety of studies. Different populations all exposed to the same risk factor should see the same health effects. There have been a large number of studies looking at cigarette smokers and lung cancer, conducted in the U.S. and elsewhere, among men and women, and among different ethnic groups, from the 1950s through the 1990s. In general, they all give the same results, finding that people who smoke cigarettes are more likely to develop lung cancer than those who do not.
- Dose-response. The greater the exposure, the greater the health effect. In studies of smoking, the more a person smoked, the greater the risk of lung cancer. Those smoking two packs of cigarettes a day for 40 years had a substantially higher risk of lung cancer than those who smoked fewer than five cigarettes a day for only 10 years.
Types of Studies
The four most common types of epidemiological studies are:
- Cohort Study follows a group of healthy people, measures their exposure to risk factors and assesses what happens to their health over time. The design is less subject to bias because it measures exposure before scientists learn the health outcome. A cohort study is expensive, time-consuming and logistically difficult, making it most useful for relatively common diseases, where sample sizes don't have to be too large.
- Case-Control Study investigates the prior exposure of individuals with a particular health condition and those without it to infer why certain subjects (the "cases") became ill and others (the "controls") did not. The main advantage is that it enables study of rare diseases without having to follow thousands of people, making it generally quicker, cheaper and easier than the cohort study. Primary disadvantages: There's a greater potential for bias, since we know the health status before the exposure is determined, and it doesn't allow for broader-based health assessments because we select only one type of disease for study.
- Occupational Epidemiological Study can use any standard epidemiologic design, and simply selects working people with particular jobs or exposures as subjects. Workers often have substantially higher exposures to risk factors than the typical population, which increases our chances of detecting an effect if one truly exists. The main disadvantages are that workers with different jobs differ substantially from one another in terms of risks, and that the working population is substantially different from those who don't work, making it difficult to generalize the results to the overall population.
- Cross-Sectional Study is conducted by inquiring about the health and risk exposure of groups at a single moment in time and assessing differences. The cross-sectional study implicitly assumes that the study population has been exposed for a long time and will continue to be exposed if nothing intervenes. It's a particularly easy study to conduct, and it identifies possible associations and worthwhile case-control or cohort studies for follow-up, but by itself it may not confirm causes.
Attributable Risk
Epidemiologists compare the frequency of disease in groups with and without the risk factor or exposure. They estimate the proportion of each population that has the disease, usually over a period of time such as a year. The resulting number of new cases diagnosed per number of people observed is called the rate of disease. For example, let's say the rate of childhood leukemia in the United States is approximately 0.0001 per year, shorthand for one case per 10,000 children per year.
By subtracting the rate of disease in the population without the risk factor from the rate of disease in the exposed population, we get the rate difference, or attributable risk. We attribute this excess of disease to the risk factor we're studying.
Let's say in a small U.S. town where drinking water was contaminated, investigators identified 10 cases of leukemia in 2,000 children over a 10-year period of observation. That converts to a rate of five cases per 10,000 children per year.
Since the U.S. rate is one per 10,000, we calculate that this town had an excess of four cases per 10,000 children per year (five cases in the town minus one case expected based on the national data, per 10,000 children per year). We attribute that excess to the contaminated drinking water.
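The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration and not part of the original FACS text; the function and variable names are hypothetical, and the figures are the ones used in the small-town example.

```python
def rate_per_10000(cases, population, years):
    """New cases per 10,000 people per year (hypothetical helper)."""
    return cases * 10_000 / (population * years)

# Small town from the example: 10 cases among 2,000 children over 10 years.
town_rate = rate_per_10000(cases=10, population=2_000, years=10)

# National rate used in the example: 1 case per 10,000 children per year.
us_rate = 1.0

# Rate difference (attributable risk): exposed rate minus unexposed rate.
attributable_risk = town_rate - us_rate

print(town_rate)          # 5.0 cases per 10,000 children per year
print(attributable_risk)  # 4.0 excess cases per 10,000 children per year
```

Expressing both rates in the same units (cases per 10,000 children per year) before subtracting is what makes the comparison meaningful.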
Relative risk
To describe how serious the risk is for those exposed compared to those who are not exposed, we take the ratio of the rates of disease in the two groups to derive a disease rate ratio. In the case of the small town with the contaminated drinking water, this rate ratio would be five divided by one, or 5.0. The rate ratio is sometimes called relative risk.
In this example, children living in the small town were at a risk of developing leukemia that was five times greater than that of children in the U.S. as a whole. The ratio is the most commonly reported result in epidemiology.
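As a rough sketch, the same ratio can be computed in Python. The function name is hypothetical and the two rates are the ones from the small-town example, both expressed as cases per 10,000 children per year.

```python
def rate_ratio(exposed_rate, unexposed_rate):
    """Relative risk: disease rate in the exposed group divided by the
    rate in the unexposed group (both in the same units)."""
    if unexposed_rate == 0:
        raise ValueError("rate ratio is undefined when the unexposed rate is zero")
    return exposed_rate / unexposed_rate

town_rate = 5.0  # cases per 10,000 children per year in the small town
us_rate = 1.0    # cases per 10,000 children per year nationwide

relative_risk = rate_ratio(town_rate, us_rate)
print(relative_risk)  # 5.0
```

A ratio of 1.0 would mean the two groups have the same rate of disease; values above 1.0 mean a higher rate in the exposed group.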
Population attributable risk
Another way of thinking about the effect of a risk factor is to determine what proportion of disease in a population would be prevented if the risk factor weren't there. For example, if we removed all contaminated water in the small town, what proportion of childhood leukemia cases nationwide would be prevented?
The index is called the population attributable risk. It is calculated by subtracting the expected relative risk (1.0, meaning no effect) from the observed relative risk, dividing by the observed relative risk, then multiplying by the proportion of the general population that is exposed to the risk factor.
Even though the relative risk (5.0) is fairly high in this case (in environmental epidemiology, a relative risk greater than 2 or 3 is considered high), its effect on the entire U.S. population is fairly small, since few children nationwide are exposed to contaminated wells, and population attributable risk is low.
Imagine the converse situation, where exposure brings only a small risk, but it's so common in the population that it causes a substantial portion of disease nationwide.
For instance, let's assume that sun tanning confers a fairly small relative risk for skin cancer, about 1.5 compared to those who don't tan. However, 50% of people are tanners, making the population attributable risk just under 17%. (1.5 minus 1.0, divided by 1.5, multiplied by 0.5.) In other words, keeping everyone out of the sun would prevent about 17% of all skin cancers, even though people who tan are only 1.5 times more likely to get such a cancer than those who don't.
Because the population attributable risk puts the relative risk in the context of the whole population, generally it's a more useful index for assessing public health effect, whereas relative risk is useful for assessing the risk to an individual.
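The tanning example can be reproduced with a short Python sketch. It follows the calculation exactly as this article describes it; the function and parameter names are hypothetical, and other references define population attributable risk with somewhat different formulas that may give a slightly different figure.

```python
def population_attributable_risk(relative_risk, proportion_exposed):
    """Population attributable risk as described in this article:
    (observed relative risk - 1.0) / observed relative risk,
    multiplied by the proportion of the population that is exposed."""
    return (relative_risk - 1.0) / relative_risk * proportion_exposed

# Tanning example: relative risk of 1.5, with 50% of people exposed.
par = population_attributable_risk(relative_risk=1.5, proportion_exposed=0.5)
print(f"{par:.1%}")  # 16.7%
```

The same function shows why a high relative risk can still matter little nationally: with a relative risk of 5.0 but only a tiny exposed fraction, the result stays close to zero.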
Odds ratio
Since case-control studies select subjects because they have or don't have a particular disease, we have to use a different kind of index. For statistical reasons, rather than compare rates we compare odds, and call this the odds ratio.
To calculate the exposure odds ratio, we calculate the odds of exposure among cases and divide it by the odds of exposure among controls. The odds used in epidemiology are like betting odds. If you have 1 chance in 3 of winning, we say your odds are 1 to 2 (written as 1:2). If 4 out of 5 people meet your criteria, we say the odds of finding one such person are 4:1.
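A minimal Python sketch of these calculations follows. The function names are hypothetical, and the case and control counts at the end are invented purely for illustration; they do not come from any study.

```python
from fractions import Fraction

def odds(probability):
    """Convert a probability into odds in favor: p / (1 - p)."""
    return probability / (1 - probability)

print(odds(Fraction(1, 3)))  # 1/2, i.e. odds of 1:2
print(odds(Fraction(4, 5)))  # 4, i.e. odds of 4:1

def exposure_odds_ratio(exposed_cases, unexposed_cases,
                        exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = Fraction(exposed_cases, unexposed_cases)
    odds_controls = Fraction(exposed_controls, unexposed_controls)
    return odds_cases / odds_controls

# Made-up counts: 30 of 100 cases were exposed, 10 of 100 controls were exposed.
ratio = exposure_odds_ratio(30, 70, 10, 90)
print(ratio, float(ratio))  # 27/7, roughly 3.86
```

Exact fractions are used here only so that the 1:2 and 4:1 figures print cleanly; ordinary floating-point numbers would work just as well.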
Glossary of epidemiological terms
attributable risk: The proportion of exposed cases that would not have gotten the disease if they had not been exposed.
bias: A technical term for playing favorites in choosing study subjects or in assessing exposure.
case-control study: Investigates the prior exposure of individuals with a particular health condition and those without it to infer why certain subjects got the disease and others didn't. Also known as the "Why me?" study.
cohort study: Follows a group of healthy individuals who have different levels of exposure, and characterizes their health outcomes over time with respect to their level of exposure. Also known as the "What will happen to me?" study.
confounding: Finding an association for the wrong reason.
cross-sectional study: Assesses a group's health status and exposure status simultaneously. Also called the "Am I like my neighbors?" study.
disease rate: The number of events (disease occurrences) per number of people in the population per unit of time.
false negative: Obtaining a statistically non-significant result when an effect truly exists.
false positive: Obtaining a statistically significant result when there is no effect.
logistic regression: A statistical method for calculating odds ratios for individual risk factors where a variety of risk factors may be contributing to the occurrence of disease.
occupational study: Studies in which subjects are chosen from the workplace.
odds ratio: The comparison between the odds of exposure among cases to the odds of exposure among controls.
p-value (probability value): The probability that an index of effect is as extreme as or more extreme than that observed even if no effect exists (i.e., if the null hypothesis is true).
proportional mortality ratio: The proportion of total deaths represented by a particular cause of death in the occupational cohort measured against the same cause in the reference population.
relative risk: The measure of risk for those exposed compared to those who are not exposed.
standardized incidence ratio: The rate of incidence of disease in the worker group compared to that rate in the reference group.
standardized mortality ratio: The rate of mortality in the worker group due to a specified cause compared to the rate for the same in the reference group.
statistical power: The probability that one can detect an effect if there really is one.
statistical significance: The probability of obtaining a result as extreme as or more extreme than that observed even if the null hypothesis is true.
survival analysis: Charts the survival of the group with the risk factor and the group without it to determine if the survivals differ.