Copyright © John Lindsay, 2015


GIS and Spatial Analysis

Data Quality Part 2

John Lindsay
Fall 2015

Error Propagation

  • When a spatial data layer contains error and we combine it with other layers in a GIS analysis workflow, that error carries through to the result, which is at best as accurate as the lowest-quality input.
  • Error propagates through a workflow in complex ways that can be difficult to model analytically.
  • The Monte Carlo method (stochastic simulation) is commonly used to study error propagation: the inputs are repeatedly perturbed with simulated error and the analysis is re-run to build up a distribution of possible outputs.
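
The Monte Carlo idea can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the lecture: it assumes two elevation values 30 m apart, an elevation RMSE of 1 m, and independent, normally distributed errors (real DEM error is usually spatially autocorrelated). The function names and all numbers are hypothetical.

```python
import random
import statistics

def slope_percent(z_upper, z_lower, distance):
    """Slope between two elevation points, in percent."""
    return 100.0 * (z_upper - z_lower) / distance

def monte_carlo_slope(z_upper, z_lower, distance, rmse, n_runs=5000, seed=42):
    """Perturb both elevations with Gaussian error and summarise the slopes."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        # Assumption: errors are independent and normally distributed.
        noisy_upper = z_upper + rng.gauss(0.0, rmse)
        noisy_lower = z_lower + rng.gauss(0.0, rmse)
        results.append(slope_percent(noisy_upper, noisy_lower, distance))
    return statistics.mean(results), statistics.stdev(results)

mean_slope, sd_slope = monte_carlo_slope(105.0, 100.0, 30.0, rmse=1.0)
print(f"slope = {mean_slope:.2f}% +/- {sd_slope:.2f}%")
```

The standard deviation of the simulated slopes is the propagated uncertainty. For this simple case it can also be derived analytically (100 × √2 × RMSE / distance, about 4.7%), so the simulation can be checked; for longer workflows the simulation is often the only practical option.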

Stochastic Simulation

Figure: error propagation simulation.

Error Propagation

Figure: Lindsay and Evans (2010), Monte Carlo simulation of how DEM error propagates into uncertainty in mapped watersheds.

Spatially Aggregated Data

  • Many geographic data are aggregates of data at a more detailed level:
    • National census: collected at the household level but reported for practical and privacy reasons at various levels of aggregation (block, neighbourhood, postcode, county, province, etc.)
    • Traffic Analysis Zones (TAZs)
    • School districts
    • Watersheds
    • Pixels

Ecological Fallacy

  • The ecological fallacy occurs when a researcher or analyst makes an inference about an individual based on aggregate data for a group.
  • The conclusions we draw from a GIS analysis of spatially aggregated data only apply at the scale of the aggregation.

Ecological Fallacy

  • Example: we might observe a strong relationship between income and crime at the county level, with lower-income areas being associated with higher crime rates.
  • Which conclusion can we draw?
    • Lower-income persons are more likely to commit crime (not supported: this is an inference about individuals)
    • Lower-income areas are associated with higher crime rates (supported at the county scale)
    • Lower-income counties tend to experience higher crime rates (supported)
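
A small simulation can make the fallacy concrete. The sketch below uses entirely synthetic data, not real census or crime figures: 30 hypothetical counties, an invented income index, and an invented "offence score" that is deliberately tied to the county a person lives in and not to that person's own income. All names and numbers are assumptions for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)
incomes, scores = [], []              # individual-level records
county_income, county_score = [], []  # county-level averages

for county in range(30):
    mean_income = 30 + 2 * county     # hypothetical county mean income index
    inc = [mean_income + rng.gauss(0, 25) for _ in range(200)]
    # Assumption: the score depends on the county, NOT on the person's own income.
    sc = [10 - 0.1 * mean_income + rng.gauss(0, 5) for _ in range(200)]
    incomes += inc
    scores += sc
    county_income.append(sum(inc) / len(inc))
    county_score.append(sum(sc) / len(sc))

individual_r = pearson(incomes, scores)
county_r = pearson(county_income, county_score)
print(f"individual-level r = {individual_r:.2f}")
print(f"county-level r     = {county_r:.2f}")
```

Under these assumptions the county-level correlation should be strongly negative while the individual-level correlation stays weak, so a statement about counties does not transfer to the people living in them.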

The Modifiable Areal Unit Problem (MAUP)

  • "A problem arising from the imposition of artificial units of spatial reporting on continuous geographic phenomena resulting in the generation of artificial spatial patterns."
  • That is, it involves data that are spatially aggregated using arbitrary boundaries and affects our ability to quantify apparent relations, statistical or otherwise.

The Modifiable Areal Unit Problem (MAUP)

  • In a series of publications in the 1970s, Stan Openshaw demonstrated that, by changing how the same spatial data are aggregated, it is possible to obtain correlations anywhere between -1.0 and 1.0!
  • The MAUP has two manifestations:
    1. the aggregation effect
    2. the scale effect
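
Both effects can be reproduced with synthetic data. The following sketch is an illustration under stated assumptions, not Openshaw's procedure: 1200 hypothetical locations along a line, two variables that share a weak trend but have independent local noise, and zones formed by averaging consecutive locations. The helper names `pearson` and `zone_means` are made up for this example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def zone_means(values, zone_size, offset=0):
    """Average consecutive values into zones of zone_size, starting at offset."""
    usable = values[offset:]
    n_zones = len(usable) // zone_size
    return [sum(usable[k * zone_size:(k + 1) * zone_size]) / zone_size
            for k in range(n_zones)]

rng = random.Random(7)
n = 1200
# Synthetic variables: a shared weak trend plus independent local noise.
x = [i / 100 + rng.gauss(0, 6) for i in range(n)]
y = [i / 100 + rng.gauss(0, 6) for i in range(n)]

# Scale effect: the same data reported in larger and larger zones.
scale_r = {size: pearson(zone_means(x, size), zone_means(y, size))
           for size in (1, 10, 50)}

# Aggregation effect: same zone size, boundaries shifted by half a zone.
shifted_r = pearson(zone_means(x, 50, offset=25), zone_means(y, 50, offset=25))

for size, r in scale_r.items():
    print(f"zone size {size:>2}: r = {r:.2f}")
print(f"zone size 50, shifted boundaries: r = {shifted_r:.2f}")
```

With these assumptions the correlation should rise as the zones grow, because averaging suppresses the local noise and leaves the shared trend, and shifting the boundaries at a fixed zone size changes the value again. The underlying data never change; only the reporting units do.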

The Modifiable Areal Unit Problem (MAUP)

The aggregation and scale effects are potential problems in almost every field that uses spatial data.

MAUP Example: Electoral Districts & Voting

  • It's not just who you vote for but also where you vote that counts.
  • In the 2000 U.S. presidential election, Al Gore won more of the popular vote than George W. Bush but failed to become president.
  • A different aggregation of U.S. counties into states could have produced a different outcome; reassigning just one northern Florida county to Georgia or Alabama would have been enough.
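
The dependence of the result on the aggregation can be shown with a toy example. The vote counts, the six precincts, and the two districting plans below are all invented for illustration; they are not real election data. Each district is awarded to whichever candidate has more votes in it (the example contains no ties).

```python
from collections import Counter

# Hypothetical precinct results as (votes for A, votes for B).
precincts = [(90, 10), (45, 55), (45, 55), (45, 55), (45, 55), (90, 10)]

def district_winners(precincts, plan):
    """Count districts won by each candidate under a given grouping of precincts."""
    seats = Counter()
    for members in plan:
        votes_a = sum(precincts[i][0] for i in members)
        votes_b = sum(precincts[i][1] for i in members)
        seats["A" if votes_a > votes_b else "B"] += 1
    return seats

total_a = sum(a for a, _ in precincts)
total_b = sum(b for _, b in precincts)

# Two ways of grouping the same six precincts into three districts.
plan_1 = [(0, 5), (1, 2), (3, 4)]
plan_2 = [(0, 1), (2, 3), (4, 5)]

seats_1 = district_winners(precincts, plan_1)
seats_2 = district_winners(precincts, plan_2)

print(f"popular vote: A = {total_a}, B = {total_b}")   # A = 360, B = 240
print(f"plan 1 districts: A = {seats_1['A']}, B = {seats_1['B']}")  # A = 1, B = 2
print(f"plan 2 districts: A = {seats_2['A']}, B = {seats_2['B']}")  # A = 2, B = 1
```

Candidate A wins the popular vote in both cases, yet loses under plan 1 because A's supporters are concentrated in a single district. Only the boundaries differ between the two plans.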


  • Gerrymandering is the practice of setting electoral district boundaries to favour a particular outcome.
  • We have to be very careful whenever we modify the geographic boundaries of voting aggregation areas.
  • Gerrymandering is commonplace in many countries, and is sometimes even legal!