Copyright © John Lindsay, 2015

GEOG*3480

GIS and Spatial Analysis


Statistical Analysis of
Spatial Data



John Lindsay
Fall 2015

Readings

  • Jensen and Jensen Chapter 8

Topics

  • Over the next two lectures, we'll discuss:
    • Descriptive Statistics
    • Descriptive Spatial Statistics
    • Spatial Autocorrelation
    • Point Pattern Analysis
      • Quadrat Analysis
      • Nearest-Neighbour Analysis
    • Directional Analysis

Descriptive Statistics

  • Measures of central tendency
    • Mode, median, and mean (\(\overset{-}x\))

    • \(\overset{-}x = \frac {\underset{i=1}{\overset{N}{\Sigma}} x_i} {N}\)

Descriptive Statistics

  • Measures of dispersion
    • Variance (\(s^2\))
    • Standard deviation (\(s\))

    • \(s^2 = \frac {\underset{i=1}{\overset{N}{\Sigma}} (x_i - \overset{-}x)^2} {N - 1}\)

    • \(s = \sqrt {\frac {\underset{i=1}{\overset{N}{\Sigma}} (x_i - \overset{-}x)^2} {N - 1}}\)
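
As a rough illustration of the formulas on these slides, the following Python sketch computes the mean, sample variance, and standard deviation directly from their definitions. The function names and the sample values are made up for this example and are not part of the course material.

```python
import math

def mean(values):
    """Arithmetic mean: the sum of the x_i divided by N."""
    return sum(values) / len(values)

def sample_variance(values):
    """Sample variance s^2, using the N - 1 denominator."""
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

def sample_std(values):
    """Standard deviation s: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))

# Hypothetical observations (not from the course material).
observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(mean(observations))             # 5.0
print(sample_variance(observations))  # 32 / 7, about 4.57
print(sample_std(observations))       # about 2.14
```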

Descriptive Statistics

  • Skewness
    • Measure of the asymmetry of a distribution

  • Kurtosis
    • Measure of the peakedness of a distribution
Distributions (From: Jensen & Jensen 2013)
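
The slides do not give formulas for skewness and kurtosis, so the sketch below assumes one common moment-based definition: skewness as the third central moment divided by the 1.5 power of the second, and excess kurtosis as the fourth central moment divided by the square of the second, minus 3. Other textbooks and software packages use slightly different estimators, and all names and data here are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment, using the N denominator."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** k for x in values) / n

def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical samples chosen to show the sign of the skewness.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]

print(skewness(symmetric))         # 0.0
print(skewness(right_skewed) > 0)  # True
print(excess_kurtosis(symmetric))  # about -1.3 (flatter than a normal curve)
```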

Descriptive Spatial Statistics

  • Mean Centre
    • Measure of central tendency that can be used to determine the centre of a distribution plotted in geographic coordinates.

  • Standard Distance
    • Measure of dispersion of geographically distributed data.
Spatial distributions
(From: Jensen & Jensen 2013)
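
A minimal sketch of the two descriptive spatial statistics, assuming the usual unweighted definitions: the mean centre is the mean of the x coordinates paired with the mean of the y coordinates, and the standard distance is the root-mean-square distance of the points from the mean centre. The slides do not state the formulas, so the N denominator and all names below are assumptions.

```python
import math

def mean_centre(points):
    """Mean centre: (mean of x coordinates, mean of y coordinates)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    """Root-mean-square distance of the points from their mean centre."""
    cx, cy = mean_centre(points)
    n = len(points)
    total = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(total / n)

# Hypothetical sites at the corners of a 2 x 2 square.
sites = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

print(mean_centre(sites))        # (1.0, 1.0)
print(standard_distance(sites))  # sqrt(2), about 1.414
```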

Tobler’s first law

  • The first law of geography: “everything is related to everything else, but near things are more related than distant things.” (Tobler, 1970)

  • This simple statement forms the basis for a great deal of geographical analysis and is the concept underlying the idea of spatial autocorrelation.

  • Synonymous with the concept of spatial dependence in geostatistics

Spatial autocorrelation

  • Correlation of a variable with itself through space.
    • Correlation versus spatial autocorrelation

  • Actually both bad news and good news
    • Bad for statistical reasons: nearby observations are not independent, which violates an assumption of many classical statistical tests
    • Good because, “if geography is worth studying at all, it must be because phenomena do not vary randomly through space” (O'Sullivan and Unwin, 2003, pg. 28)
    • Essential for spatial modelling through interpolation

Spatial autocorrelation

  • Three possibilities:
    • Clustered (positive autocorrelation): nearby locations are likely to be similar to one another.

    • Random (autocorrelation near zero): no spatial effect is discernible, and observations seem to vary randomly through space

    • Dispersed (negative autocorrelation): observations from nearby locations are likely to be different from one another.

Moran's \(I\)

  • Moran's \(I\) measures the interdependence in spatial distributions.
    • Used with interval/ratio level data
    • Used to detect spatial trends
    • -1 ≤ \(I\) ≤ 1
    • \(I\) near -1: dispersed
    • \(I\) near 0: random
    • \(I\) near +1: clustered

Moran's \(I\)


\(I = \frac {N}{\underset{i=1}{\overset{N} \Sigma} \underset{j=1}{\overset{N} \Sigma} w_{ij}} \frac {\underset{i=1}{\overset{N} \Sigma} \underset{j=1}{\overset{N} \Sigma} w_{ij} (x_i - \overset{-} x) (x_j - \overset{-} x)}{\underset{i=1}{\overset{N} \Sigma} {(x_i - \overset{-} x)^2}}\)

Where \(\overset{-} x\) is the mean of variable \(x\); \( x_i \) is the value at \(i\); \(j\) is a neighbour of \(i\); \( w_{ij} \) is the weight between neighbours \(i\) and \(j\).
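
The formula can be evaluated term by term, as in the following sketch. It assumes the weights are supplied as a full N × N matrix (a list of lists) with zeros on the diagonal; the function name, the binary adjacency weights, and the four-cell transect are hypothetical choices for illustration.

```python
def morans_i(values, weights):
    """Moran's I for values x_i and an N x N spatial weights matrix w_ij."""
    n = len(values)
    x_bar = sum(values) / n
    dev = [x - x_bar for x in values]
    # Sum of all weights (the denominator of the first fraction).
    w_sum = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations for every pair (i, j).
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    # Sum of squared deviations.
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)

# Four cells in a row; w_ij = 1 for adjacent cells, 0 otherwise.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

print(morans_i([1.0, 1.0, 5.0, 5.0], adjacency))  # about 0.33: similar values adjoin
print(morans_i([1.0, 5.0, 1.0, 5.0], adjacency))  # about -1.0: neighbours always differ
```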
Spatial Autocorrelation (From: Jensen & Jensen 2013)

Point Pattern Analysis

  • Mapped point data often exhibit distinct patterning.

  • Patterns result from the spatial component of a control on the phenomenon.

  • Understanding the pattern can help with understanding the controlling forces on the phenomenon.

Point Pattern Analysis

  • The patterns that we're interested in with Point Pattern Analysis (PPA) depend on the locations of individual points and not on their attributes, for which spatial autocorrelation is more relevant.

  • Quadrat Analysis and Nearest-Neighbour Analysis are the two most common methods for PPA.

Quadrat Analysis

  • A quadrat is a user-defined geographic area, usually a square or rectangle, used to measure the distribution of a spatial phenomenon.

  • Quadrat analysis can be used to test whether the phenomenon is uniformly distributed.

  • The Chi Square test is used with quadrats.
Quadrat analysis (From: Jensen & Jensen 2013)

Quadrat Analysis

  • The value of Chi Square is compared with a table of critical values to determine whether the point pattern differs significantly from a uniform distribution.

  • You should be thinking about the modifiable areal unit problem (MAUP) about now!

  • The size, shape, and number of quadrats will impact the results of the quadrat analysis.
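
A minimal sketch of a quadrat count followed by a Chi Square statistic, assuming a regular grid of rectangular quadrats and an expected count of N / k points in each of the k quadrats under a uniform distribution. The function names, the grid parameters, and the point coordinates are hypothetical.

```python
def quadrat_counts(points, xmin, ymin, xmax, ymax, nx, ny):
    """Count the points falling in each cell of an nx-by-ny grid of quadrats."""
    counts = [0] * (nx * ny)
    width = (xmax - xmin) / nx
    height = (ymax - ymin) / ny
    for x, y in points:
        # Clamp so points on the upper or right boundary fall in the last cell.
        col = min(int((x - xmin) / width), nx - 1)
        row = min(int((y - ymin) / height), ny - 1)
        counts[row * nx + col] += 1
    return counts

def chi_square_uniform(counts):
    """Chi Square statistic against equal expected counts in every quadrat."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical points in a unit-square study area, split into 2 x 2 quadrats.
clustered = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (0.4, 0.4)]
even = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]

print(quadrat_counts(clustered, 0, 0, 1, 1, 2, 2))  # [4, 0, 0, 0]
print(chi_square_uniform(quadrat_counts(clustered, 0, 0, 1, 1, 2, 2)))  # 12.0
print(chi_square_uniform(quadrat_counts(even, 0, 0, 1, 1, 2, 2)))       # 0.0
```

With four quadrats there are three degrees of freedom, for which the 0.05 critical value is 7.815, so the clustered set would be judged non-uniform. A sample this small is far too sparse for a real test, and changing `nx` and `ny` changes the counts and therefore the statistic.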

Nearest-neighbour Analysis

  • NNA is used in GIS to determine whether point sets are random or non-random.
  • If a point set is found to be non-random then we are left with the task of determining what controls the distribution.
  • For each point in the set, find the distance to the closest neighbour.
Nearest-neighbour analysis (From: Jensen & Jensen 2013)

Nearest-neighbour Analysis

\(d_e = \frac 1 {2 \sqrt{N/A}} = \frac 1 {2 \sqrt{p}} \)

  • where \(d_e\) is the expected mean nearest-neighbour distance (assuming a random distribution); \(N\) is the number of points; \(A\) is the area of the study region; \(p = N/A\) is the point density.

\(NNR = \frac {Dist_{Obs}} {Dist_{Ran}} = \frac {d_a} {d_e} \)

  • where \(NNR\) is the nearest-neighbour ratio; \(Dist_{Obs}\) (\(d_a\)) is the observed mean NN distance; \(Dist_{Ran}\) (\(d_e\)) is the expected mean NN distance for a random distribution.
Nearest-neighbour analysis (From: Jensen & Jensen 2013)
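
The two formulas can be combined in a short sketch. It uses a brute-force search for each point's nearest neighbour, which is fine for small point sets; the function names and the four-point example are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def mean_nn_distance(points):
    """Observed mean nearest-neighbour distance d_a (brute-force search)."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

def expected_nn_distance(n_points, area):
    """Expected mean NN distance d_e = 1 / (2 * sqrt(N / A)) for a random pattern."""
    return 1.0 / (2.0 * math.sqrt(n_points / area))

def nearest_neighbour_ratio(points, area):
    """NNR = d_a / d_e."""
    return mean_nn_distance(points) / expected_nn_distance(len(points), area)

# Hypothetical points at the corners of a unit square.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(nearest_neighbour_ratio(corners, area=4.0))   # 2.0
print(nearest_neighbour_ratio(corners, area=16.0))  # 1.0
```

Under the usual reading, a ratio below 1 suggests clustering, a ratio near 1 a random pattern, and a ratio above 1 dispersion. The same four points give different ratios for the two areas because \(A\) enters the expected distance directly.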

Nearest-neighbour Analysis

  • Warning: Our estimate of the point density depends on our definition of the study area.
  • If we change the extent of the study area, we change the results.
Nearest-neighbour analysis panels: Not so clustered; Very clustered

Nearest-neighbour Analysis

  • NNA is also sensitive to the non-uniformity of underlying space.
  • NNA assumes that points are free to locate anywhere.
  • Consider the gap in stream channel heads below. It's the result of Lake Ontario.
Nearest-neighbour analysis

Circular Data

  • Geographers distinguish between directional (0°-360°) and axial (a.k.a. oriented; 0°-180°) data.
    • Wind is directional; a road is axial.

  • Directional and axial data can be plotted using Rose Diagrams, which are like circular histograms.

Rose Diagrams

circular data

Circular Data

\(\overset{-}\theta = \tan^{-1}\left(\frac{\overset{N}{\underset{i=1}{\Sigma}}{\sin \theta_i}} {\overset{N}{\underset{i=1}{\Sigma}}{\cos \theta_i}}\right) \)

  • where \(\overset{-}\theta\) is the mean direction, derived from the vector resultant.

Circular Data

\(\overset{-}R = \frac 1 N \sqrt{\left(\overset{N}{\underset{i=1}{\Sigma}}{\cos \theta_i}\right)^2 + \left({\overset{N}{\underset{i=1}{\Sigma}}{\sin \theta_i}}\right)^2} \)

  • where \(\overset{-}R\) is the standardized length of the vector resultant, called the mean resultant length, and is a measure of dispersion.
  • 0 ≤ \(\overset{-}R\) ≤ 1, where values near 1 indicate small angular dispersion and vice versa.
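
A minimal sketch of the mean direction and mean resultant length for directional data, with angles in degrees. It swaps the plain inverse tangent for `math.atan2`, which takes the sine and cosine sums separately and so places the result in the correct quadrant; that substitution, the function name, and the sample angles are assumptions made for this example.

```python
import math

def circular_stats(angles_deg):
    """Return (mean direction in degrees, mean resultant length)."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 resolves the quadrant; the modulo maps negative angles to 0-360.
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    # Length of the vector resultant, standardized by N.
    r_bar = math.hypot(c, s) / n
    return mean_dir, r_bar

# Hypothetical wind directions either side of north.
wind = [350.0, 10.0]
mean_dir, r_bar = circular_stats(wind)
print(mean_dir, r_bar)
```

For these two angles the mean direction comes out within rounding error of 0° (equivalently 360°), whereas the arithmetic mean of 350 and 10 would give 180°, pointing the opposite way. The mean resultant length is cos 10°, about 0.985, reflecting the small angular dispersion.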

Circular Data

  • Axial (oriented) data cannot easily be treated as vectors because there is nothing to distinguish one end of the line from the other.

  • An angle of 179° is very close to one of 1°.

  • To solve this, double all the angles, calculate the statistics with the doubled data, and then halve the resulting mean angle to get the mean direction. The mean resultant length is taken from the doubled data.
    • 45° × 2 = 90°
    • 225° × 2 = 450°, and 450° - 360° = 90°, so both ends of the same axis map to the same doubled angle
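
The angle-doubling procedure can be sketched as follows. The function name and sample orientations are hypothetical, and the sketch assumes that only the mean angle is halved while the mean resultant length is reported as computed from the doubled angles.

```python
import math

def axial_stats(angles_deg):
    """Mean orientation (0-180 degrees) and mean resultant length for axial data."""
    # Step 1: double every angle, wrapping back into 0-360.
    doubled = [(2.0 * a) % 360.0 for a in angles_deg]
    n = len(doubled)
    # Step 2: ordinary circular statistics on the doubled angles.
    s = sum(math.sin(math.radians(a)) for a in doubled)
    c = sum(math.cos(math.radians(a)) for a in doubled)
    mean_doubled = math.degrees(math.atan2(s, c)) % 360.0
    r_bar = math.hypot(c, s) / n
    # Step 3: halve the mean angle; r_bar is left as is (an assumption).
    return mean_doubled / 2.0, r_bar

# 45 and 225 degrees describe the same axis, so the mean orientation is about 45.
print(axial_stats([45.0, 225.0]))

# 1 and 179 degrees are nearly parallel; the mean is near 0 (equivalently 180).
print(axial_stats([1.0, 179.0]))
```

A plain arithmetic mean of 1° and 179° would give 90°, which is perpendicular to both orientations.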