Ecological fallacy

From a statistical point of view, these ideas can be unified by specifying proper statistical models for formal inference, in which aggregate data are used to infer unobserved relationships in individual-level data.

[2] An example of ecological fallacy is the assumption that a population mean has a simple interpretation when considering likelihoods for an individual.

Similarly, if a particular group of people is measured to have a lower mean IQ than the general population, it is an error to conclude that a randomly selected member of the group is more likely than not to have an IQ below the mean IQ of the general population. Nor is it necessarily the case that a randomly selected member of the group is more likely than not to have a lower IQ than a randomly selected member of the general population.

Mathematically, this comes from the fact that a distribution can have a positive mean but a negative median.
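The point about means and medians can be checked with a few made-up numbers. The sketch below is only an illustration: the scores are hypothetical, expressed as deviations from a general-population mean assumed to be 0, and the variable names are not taken from any study.

```python
from statistics import mean, median

# Hypothetical scores for one group, as deviations from an assumed
# general-population mean of 0. One extreme value drags the mean down.
group = [-30, 2, 3, 4, 5]

group_mean = mean(group)      # negative: the group mean is below the population mean
group_median = median(group)  # positive: the typical member is above it
share_below = sum(score < 0 for score in group) / len(group)

# Mirror image: flipping the signs gives a positive mean with a negative median.
mirrored = [-score for score in group]
mirrored_mean = mean(mirrored)
mirrored_median = median(mirrored)

print(group_mean, group_median, share_below)
print(mirrored_mean, mirrored_median)
```

In this toy group the mean is below 0, yet only one member in five scores below 0, so a randomly selected member is more likely than not to be above the population mean.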

Research dating back to Émile Durkheim suggests that predominantly Protestant localities have higher suicide rates than predominantly Catholic localities.

[3] According to Freedman,[4] the idea that Durkheim's findings link, at an individual level, a person's religion to their suicide risk is an example of the ecological fallacy.

Similarly, even if wealth is positively correlated with a tendency to vote Republican at the individual level in the United States, wealthier states are observed to vote Democratic more often.

A state may be wealthier not because it contains more wealthy people (i.e., more people with annual incomes over $200,000), but because it contains a small number of super-rich individuals; the ecological fallacy then results from incorrectly assuming that individuals in wealthier states are more likely to be wealthy.
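A small arithmetic sketch makes the distinction between a state's mean income and its share of wealthy residents concrete. The two "states" and all incomes below are invented for illustration; only the $200,000 threshold comes from the text.

```python
from statistics import mean

THRESHOLD = 200_000  # annual income above which a person counts as wealthy

# Hypothetical incomes: state A has one super-rich resident,
# state B has many moderately wealthy residents.
state_a = [50_000] * 9 + [10_000_000]
state_b = [60_000] * 4 + [250_000] * 6


def share_wealthy(incomes, threshold=THRESHOLD):
    """Fraction of residents whose income exceeds the threshold."""
    return sum(income > threshold for income in incomes) / len(incomes)


mean_a, mean_b = mean(state_a), mean(state_b)
share_a, share_b = share_wealthy(state_a), share_wealthy(state_b)

print(f"State A: mean income {mean_a:,.0f}, share wealthy {share_a:.0%}")
print(f"State B: mean income {mean_b:,.0f}, share wealthy {share_b:.0%}")
```

Under these assumed figures, state A has the higher mean income, yet a randomly chosen resident of state B is far more likely to be wealthy.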

Many examples of ecological fallacies can be found in studies of social networks, which often combine analysis and implications from different levels.

[6] A 1950 paper by William S. Robinson computed the illiteracy rate and the proportion of the population born outside the US for each state and for the District of Columbia, as of the 1930 census.

He cautioned against deducing conclusions about individuals on the basis of population-level, or "ecological" data.

In 2011, it was found that Robinson's calculations of the ecological correlations were based on the wrong state-level data.

[8] Robinson's paper was seminal, but the term 'ecological fallacy' was not coined until 1958 by Selvin.

In other words, correlations of aggregate variables take into account cross-sectional effects that are not relevant at the individual level.
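The gap between individual-level and aggregate-level correlation can be illustrated with synthetic data. The sketch below does not use Robinson's census figures: the three groups, their centres, and the helper `pearson` are assumptions chosen so that the two correlations have opposite signs.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical group centres (x, y): across groups, higher x goes with lower y.
centres = [(0, 2), (1, 1), (2, 0)]
# Within every group, x and y move together by the same deviation.
deviations = range(-5, 6)

groups = [[(cx + d, cy + d) for d in deviations] for cx, cy in centres]

# Individual-level correlation: pool every person from every group.
individuals = [point for group in groups for point in group]
individual_r = pearson([x for x, _ in individuals], [y for _, y in individuals])

# Ecological correlation: correlate the group means only.
mean_xs = [mean(x for x, _ in group) for group in groups]
mean_ys = [mean(y for _, y in group) for group in groups]
ecological_r = pearson(mean_xs, mean_ys)

print(f"individual-level r = {individual_r:.3f}")
print(f"ecological r       = {ecological_r:.3f}")
```

With these invented numbers the pooled individual correlation is strongly positive while the correlation of group means is negative, so neither can stand in for the other.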

The problem for correlations naturally entails a problem for regressions on aggregate variables: the correlation fallacy is therefore an important issue for a researcher who wants to measure causal impacts.

For instance, a state governor who is interested in the policy implications of a rise in police force can correctly regress the crime rate on police force at the state level.

However, an ecological fallacy would occur if a city council deduced the impact of an increase in police force on the crime rate at the city level from the correlation at the state level.

Simpson's paradox is exactly the omitted-variable bias in the regression of Y on X when a grouping variable is left out of the regression.

The application is striking because the bias is high enough that parameters have opposite signs.
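The sign reversal can be reproduced with a minimal sketch of omitted-variable bias. Everything here is an assumption made for illustration: the grouping variable is called Z, X is a 0/1 regressor, the outcome is generated as Y = X + 5Z with no noise, and `ols_slope` is a hypothetical helper for a one-regressor least-squares slope.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope from regressing ys on xs (with an intercept)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sum((x - mx) ** 2 for x in xs)


BETA, GAMMA = 1, 5  # assumed true effects of X and of the omitted Z on Y

# (x, z, count): X = 1 is concentrated in the group with Z = 0.
cells = [(1, 0, 8), (0, 0, 2), (1, 1, 2), (0, 1, 8)]
rows = [(x, z) for x, z, count in cells for _ in range(count)]

xs = [x for x, _ in rows]
zs = [z for _, z in rows]
ys = [BETA * x + GAMMA * z for x, z in rows]

# Regression of Y on X inside each group recovers the positive effect.
within = {
    g: ols_slope([x for x, z in rows if z == g],
                 [BETA * x + GAMMA * z for x, z in rows if z == g])
    for g in (0, 1)
}

# Regression of Y on X alone, omitting Z, has the opposite sign.
short_slope = ols_slope(xs, ys)

# Omitted-variable-bias identity: short slope = BETA + GAMMA * (slope of Z on X).
delta = ols_slope(xs, zs)
predicted_short_slope = BETA + GAMMA * delta

print(within, short_slope, predicted_short_slope)
```

In this constructed data set the slope is 1 within each group but about -2 when the grouping variable is omitted, which matches the bias term 5 × (-0.6) added to the true effect of 1.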

The ecological fallacy was discussed in a court challenge to the 2004 Washington gubernatorial election, in which a number of illegal voters were identified after the election; their votes were unknown because the vote was by secret ballot.

[10] An expert witness said this approach was like trying to figure out Ichiro Suzuki's batting average by looking at the batting average of the entire Seattle Mariners team, since the illegal votes were cast by an unrepresentative sample of each precinct's voters, and might be as different from the average voter in the precinct as Ichiro was from the rest of his team.

[11] The judge determined that the challengers' argument was an ecological fallacy and rejected it.