GEOG*3480 Lecture 5

Copyright © John Lindsay, 2015

GEOG*3480

GIS and Spatial Analysis

Statistical Analysis of
Spatial Data

John Lindsay
Fall 2015

Readings

Jensen and Jensen Chapter 8

Topics

Over the next two lectures, we'll discuss:
- Descriptive Statistics
- Descriptive Spatial Statistics
- Spatial Autocorrelation
- Point Pattern Analysis
  - Quadrat Analysis
  - Nearest-Neighbour Analysis
- Directional Analysis

Descriptive Statistics

Measures of central tendency
- Mode, median, and mean (\(\overset{-}x\))
- \(\overset{-}x = \frac {\underset{i=1}{\overset{N}{\Sigma}} x_i} {N}\)

Descriptive Statistics

Measures of dispersion
- Variance (\(s^2\))
- Standard deviation (\(s\))
- \(s^2 = \frac {\underset{i=1}{\overset{N}{\Sigma}} (x_i - \overset{-}x)^2} {N - 1}\)
- \(s = \sqrt{\frac {\underset{i=1}{\overset{N}{\Sigma}} (x_i - \overset{-}x)^2} {N - 1}}\)
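
As a rough illustration, the mean, variance, and standard deviation formulas above can be sketched in plain Python. The function names and the sample values are made up for this example.

```python
import math

def mean(values):
    # x-bar: sum of the observations divided by N
    return sum(values) / len(values)

def sample_variance(values):
    # s^2: sum of squared deviations from the mean, divided by N - 1
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

def sample_std(values):
    # s: square root of the sample variance
    return math.sqrt(sample_variance(values))

# Hypothetical sample (e.g. eight measurements of some attribute)
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(mean(sample), sample_variance(sample), sample_std(sample))
```

Note the \(N - 1\) denominator: this is the sample variance, not the population variance.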

Descriptive Statistics

Skewness
- Measure of the asymmetry of a distribution

Kurtosis
- Measure of the peakedness of a distribution

(From: Jensen & Jensen 2013)
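
The slides do not give formulas for skewness and kurtosis. The sketch below assumes the common moment-based definitions (third and fourth central moments scaled by the variance), with kurtosis reported as excess kurtosis so that a normal distribution scores 0. Other texts and software use slightly different corrections.

```python
def central_moment(values, k):
    # k-th central moment: mean of (x - x_bar)^k
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** k for x in values) / len(values)

def skewness(values):
    # Assumed definition: m3 / m2^(3/2); 0 for a symmetric sample
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    # Assumed definition: m4 / m2^2 - 3
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]       # made-up data
right_skewed = [1, 1, 1, 2, 10]   # made-up data with a long right tail
print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```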

Descriptive Spatial Statistics

Mean Centre
- Measure of central tendency that can be used to determine the centre of a distribution plotted in geographic coordinates.

Standard Distance
- Measure of dispersion of geographically distributed data.

(From: Jensen & Jensen 2013)
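
A minimal sketch of both measures, assuming the usual unweighted definitions: the mean centre is the mean of the x and y coordinates, and the standard distance is the root mean squared distance of the points from the mean centre. The coordinates are invented for the example.

```python
import math

def mean_centre(points):
    # Mean of the x coordinates and mean of the y coordinates
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_distance(points):
    # Root mean squared distance from the mean centre (assumed definition)
    cx, cy = mean_centre(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

# Hypothetical points at the corners of a 4 x 4 square
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(mean_centre(points))        # centre of the square
print(standard_distance(points))  # distance from the centre to a corner
```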

Tobler’s first law

The first law of geography: “everything is related to everything else, but near things are more related than distant things.” (Tobler, 1970)

This simple statement forms the basis for a great deal of geographical analysis and is the concept underlying the idea of spatial autocorrelation.

Synonymous with the concept of spatial dependence in geostatistics

Spatial autocorrelation

Correlation of a variable with itself through space.
- Correlation versus spatial autocorrelation

Actually bad news and good news
- Bad for statistical reasons
- Good because, “if geography is worth studying at all, it must be because phenomena do not vary randomly through space” (O'Sullivan and Unwin, 2003, pg. 28)
- Essential for spatial modelling through interpolation

Spatial autocorrelation

Three possibilities:
- Clustered (positive autocorrelation): nearby locations are likely to be similar to one another.
- Random (autocorrelation near zero): no spatial effect is discernible, and observations seem to vary randomly through space
- Dispersed (negative autocorrelation): observations from nearby locations are likely to be different from one another.

Spatial autocorrelation

Moran's \(I\)

Moran's \(I\) measures the interdependence in spatial distributions.
- Used with interval/ratio level data
- Used to detect spatial trends
- -1 ≤ \(I\) ≤ 1
- \(I\) near -1: dispersed
- \(I\) near 0: random
- \(I\) near +1: clustered

Moran's \(I\)

\(I = \frac {N}{\underset{i=1}{\overset{N} \Sigma} \underset{j=1}{\overset{N} \Sigma} w_{ij}} \frac {\underset{i=1}{\overset{N} \Sigma} \underset{j=1}{\overset{N} \Sigma} w_{ij} (x_i - \overset{-} x) (x_j - \overset{-} x)}{\underset{i=1}{\overset{N} \Sigma} {(x_i - \overset{-} x)^2}}\)

Where \(\overset{-} x\) is the mean of variable \(x\); \( x_i \) is the value at \(i\); \(j\) is a neighbour of \(i\); \( w_{ij} \) is the weight between neighbours \(i\) and \(j\).

(From: Jensen & Jensen 2013)
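
The formula can be translated almost term by term into Python. This is only a sketch: the weights here are assumed to be binary contiguity weights for cells arranged in a single row (1 for adjacent cells, 0 otherwise), and the helper `chain_weights` and the data values are hypothetical.

```python
def morans_i(values, weights):
    n = len(values)
    x_bar = sum(values) / n
    dev = [x - x_bar for x in values]           # (x_i - x_bar)
    w_sum = sum(sum(row) for row in weights)    # sum of all w_ij
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)                # sum of squared deviations
    return (n / w_sum) * (cross / ss)

def chain_weights(n):
    # Assumed weighting scheme: w_ij = 1 if cells i and j are adjacent in a row
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]

w = chain_weights(6)
clustered = [1, 1, 1, 5, 5, 5]    # similar values sit next to each other
alternating = [1, 5, 1, 5, 1, 5]  # neighbours always differ
print(morans_i(clustered, w))     # positive (about 0.6)
print(morans_i(alternating, w))   # negative (about -1)
```

The choice of \(w_{ij}\) matters: a different neighbour definition gives a different value of \(I\) for the same data.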

Point Pattern Analysis

Mapped point data often exhibit distinct patterning.

Patterns result from the spatial component of a control on the phenomenon.

Understanding the pattern can help with understanding the controlling forces on the phenomenon.

Point Pattern Analysis

The patterns that we're interested in with Point Pattern Analysis (PPA) result from the locations of individual points and not from their attributes, for which spatial autocorrelation is more relevant.

Quadrat Analysis and Nearest-Neighbour Analysis are the two most common methods for PPA.

Quadrat Analysis

A quadrat is a user-defined geographic area, usually a square or rectangle, used to measure the distribution of a spatial phenomenon.

Quadrat analysis can be used to test whether the phenomenon is uniformly distributed.

The Chi Square test is used with quadrats.

(From: Jensen & Jensen 2013)

Quadrat Analysis

The value of Chi Square is compared with a table of critical values to determine whether the point pattern differs significantly from a uniform distribution.

You should be thinking about the modifiable areal unit problem (MAUP) about now!

The size, shape, and number of quadrats will impact the results of the quadrat analysis.
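
A small sketch of the procedure, assuming equal-sized rectangular quadrats laid over the study extent and an expected count of \(N/k\) points in each of the \(k\) quadrats. The function names, the grid parameters, and the point sets are invented for illustration, and every point is assumed to lie inside the extent.

```python
def quadrat_counts(points, xmin, ymin, xmax, ymax, nx, ny):
    # Count the points falling in each cell of an nx-by-ny grid of quadrats
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        col = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        row = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[row][col] += 1
    return counts

def chi_square(counts):
    # Sum of (observed - expected)^2 / expected over all quadrats
    flat = [c for row in counts for c in row]
    expected = sum(flat) / len(flat)
    return sum((c - expected) ** 2 / expected for c in flat)

# Two hypothetical patterns of 8 points in the unit square
even = [(0.25, 0.25), (0.3, 0.2), (0.75, 0.25), (0.8, 0.3),
        (0.25, 0.75), (0.2, 0.8), (0.75, 0.75), (0.7, 0.8)]
clustered = [(0.1 + 0.02 * i, 0.1 + 0.03 * i) for i in range(8)]

print(chi_square(quadrat_counts(even, 0, 0, 1, 1, 2, 2)))       # 0.0
print(chi_square(quadrat_counts(clustered, 0, 0, 1, 1, 2, 2)))  # 24.0
```

With four quadrats there are 3 degrees of freedom, for which the critical value at the 0.05 level is 7.815. Re-running with different `nx` and `ny` shows how the quadrat layout changes the result.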

Nearest-neighbour Analysis

NNA is used in GIS to determine whether point sets are random or non-random.
If a point set is found to be non-random then we are left with the task of determining what controls the distribution.
For each point in the set, find the distance to the closest neighbour.

(From: Jensen & Jensen 2013)

Nearest-neighbour Analysis

\(d_e = \frac 1 {2 \sqrt{N/A}} = \frac 1 {2 \sqrt{p}} \)

where \(d_e\) is the expected mean nearest-neighbour distance (assuming a random distribution); \(N\) is the number of points; \(A\) is the area of the study region; \(p\) is the point density (\(N/A\)).

\(NNR = \frac {Dist_{Obs}} {Dist_{Ran}} = \frac {d_a} {d_e} \)

where \(NNR\) is the nearest-neighbour ratio; \(Dist_{Obs}\) (\(d_a\)) is the observed mean NN distance; \(Dist_{Ran}\) (\(d_e\)) is the expected mean NN distance for a random distribution.

(From: Jensen & Jensen 2013)
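
The two equations can be sketched as follows. The brute-force neighbour search is an assumption made for brevity (it compares every pair of points), and the regular grid of points is invented test data.

```python
import math

def mean_nn_distance(points):
    # d_a: average distance from each point to its closest neighbour
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
    return total / len(points)

def expected_nn_distance(n, area):
    # d_e = 1 / (2 * sqrt(N / A))
    return 1.0 / (2.0 * math.sqrt(n / area))

def nearest_neighbour_ratio(points, area):
    # NNR = d_a / d_e
    return mean_nn_distance(points) / expected_nn_distance(len(points), area)

# Hypothetical pattern: 16 points on a regular grid inside a 4 x 4 study area
grid = [(0.5 + i, 0.5 + j) for i in range(4) for j in range(4)]
print(nearest_neighbour_ratio(grid, area=16.0))  # about 2: dispersed
```

Ratios below 1 suggest clustering, ratios near 1 a random pattern, and ratios above 1 dispersion.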

Nearest-neighbour Analysis

Warning: Our estimate of the point density depends on our definition of the study area.
If we change the extent of the study area, we change the results.
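
To see the effect, the sketch below computes the nearest-neighbour ratio for one fixed set of points while only the assumed study area changes. The point set and the three area values are made up for the demonstration.

```python
import math

def nearest_neighbour_ratio(points, area):
    n = len(points)
    # Observed mean nearest-neighbour distance (brute-force search)
    d_a = sum(min(math.hypot(xi - xj, yi - yj)
                  for j, (xj, yj) in enumerate(points) if j != i)
              for i, (xi, yi) in enumerate(points)) / n
    # Expected mean distance for a random pattern with density n / area
    d_e = 1.0 / (2.0 * math.sqrt(n / area))
    return d_a / d_e

# The same 16 hypothetical points every time
points = [(0.5 + i, 0.5 + j) for i in range(4) for j in range(4)]

for area in (16.0, 64.0, 256.0):
    print(area, nearest_neighbour_ratio(points, area))
# Ratios of about 2, 1, and 0.5: the pattern looks dispersed,
# random, or clustered depending only on the chosen extent.
```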

[Figure panels: "Not so clustered" and "Very clustered"]

Nearest-neighbour Analysis

NNA is also sensitive to the non-uniformity of underlying space.
NNA assumes that points are free to locate anywhere.
Consider the gap in stream channel heads below. It's the result of Lake Ontario.

Circular Data

Geographers distinguish between directional (0°-360°) and axial (a.k.a. oriented; 0°-180°) data.
- Wind is directional; a road is axial.

Directional and axial data can be plotted using Rose Diagrams, which are like circular histograms.

Rose Diagrams

Circular Data

\(\overset{-}\theta = \tan^{-1}\left(\frac{\overset{N}{\underset{i=1}{\Sigma}}{\sin \theta_i}} {\overset{N}{\underset{i=1}{\Sigma}}{\cos \theta_i}}\right) \)

where \(\overset{-}\theta\) is the mean direction, derived from the vector resultant.
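
A minimal sketch of the mean direction in Python. One assumption differs from the formula as written: `math.atan2` is used in place of a plain inverse tangent so that the result lands in the correct quadrant. Angles are taken in degrees and the sample values are invented.

```python
import math

def mean_direction(angles_deg):
    # Sum the sine and cosine components of each unit vector
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 resolves the quadrant; wrap the result into the 0-360 range
    return math.degrees(math.atan2(s, c)) % 360.0

# Two hypothetical wind directions either side of north
winds = [350.0, 10.0]
print(sum(winds) / len(winds))  # arithmetic mean: 180.0, pointing south
print(mean_direction(winds))    # circular mean: effectively north
```

Because of floating-point rounding, a mean of due north may print as a value very close to either 0 or 360.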

Circular Data

\(\overset{-}R = \frac 1 N \sqrt{\left(\overset{N}{\underset{i=1}{\Sigma}}{\cos \theta_i}\right)^2 + \left({\overset{N}{\underset{i=1}{\Sigma}}{\sin \theta_i}}\right)^2} \)

where \(\overset{-}R\) is the standardized length of the vector resultant, called the mean resultant length, and is a measure of dispersion.
0 ≤ \(\overset{-}R\) ≤ 1, where values near 1 indicate small angular dispersion and vice versa.
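
The mean resultant length can be sketched the same way. Angles are assumed to be in degrees, and the two sample sets are made up to show the extremes.

```python
import math

def mean_resultant_length(angles_deg):
    # Length of the summed unit vectors, divided by N
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.hypot(c, s) / len(angles_deg)

tight = [88.0, 90.0, 92.0]           # nearly identical directions
spread = [0.0, 90.0, 180.0, 270.0]   # directions cancel each other out
print(mean_resultant_length(tight))   # close to 1
print(mean_resultant_length(spread))  # close to 0
```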

Circular Data

Axial (oriented) data cannot easily be treated as vectors because there is nothing to distinguish one end of the line from the other.

An angle of 179° is very close to one of 1°.

To solve this, double all the angles, calculate the statistics with the doubled data, and then halve the resulting mean angle to get the mean direction. The mean resultant length of the doubled data is used as is.
- 45° × 2 = 90°
- 225° × 2 = 450°, and 450° - 360° = 90°
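
The angle-doubling trick might be sketched like this. The function name is hypothetical, `math.atan2` is assumed for the inverse tangent, and the mean resultant length is returned as computed from the doubled angles rather than halved.

```python
import math

def axial_mean(angles_deg):
    # Step 1: double every angle so both ends of a line map to one direction
    doubled = [2.0 * a for a in angles_deg]
    s = sum(math.sin(math.radians(d)) for d in doubled)
    c = sum(math.cos(math.radians(d)) for d in doubled)
    # Step 2: circular statistics on the doubled angles
    mean_doubled = math.degrees(math.atan2(s, c)) % 360.0
    r_bar = math.hypot(c, s) / len(angles_deg)
    # Step 3: halve the mean angle to return to the 0-180 axial range
    return (mean_doubled / 2.0) % 180.0, r_bar

# 45 and 225 degrees describe the same axis
print(axial_mean([45.0, 225.0]))  # mean axis near 45, R-bar near 1

# 1 and 179 degrees are nearly the same axis
print(axial_mean([1.0, 179.0]))   # mean axis near 0 (equivalently 180)
```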
