# Stochastic depression analysis

This tool maps topographic depressions in a digital elevation model (DEM) while taking into account the uncertainty in depression shape that results from DEM error. It uses a stochastic simulation approach (i.e. a Monte Carlo method) to map depressions. It is a re-implementation of the algorithm described by Lindsay and Creed (2006), which originally appeared in the Terrain Analysis System (TAS) software. Whitebox's stochastic depression analysis tool contains several improvements over the earlier TAS algorithm.

The Whitebox algorithm differs from the TAS version in the way that each realization, and in particular each error field, is created. It uses a turning bands technique to create the error fields. The turning bands method ensures that the error fields have the desired spatial characteristics (e.g. autocorrelation and range), and Histogram Matching ensures that they have the appropriate statistical distribution.
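The role of histogram matching can be illustrated with a minimal rank-based sketch. This is not the Whitebox implementation: the function name `histogram_match`, the flat list used for the field, and the `(error value, frequency)` tuples are all assumptions made for illustration. The idea is that each simulated value keeps its rank (and so its spatial pattern) but is replaced by the error value found at the same cumulative probability in the target histogram.

```python
import random


def histogram_match(values, hist):
    """Remap `values` so their distribution follows `hist`, preserving rank order.

    `hist` is a list of (error_value, frequency) pairs sorted by error value.
    This is a simple rank-based scheme, used here only as an illustration.
    """
    total = float(sum(freq for _, freq in hist))

    # Build the cumulative distribution of the target histogram.
    cdf = []
    running = 0.0
    for error, freq in hist:
        running += freq / total
        cdf.append((running, error))

    # Visit the field values from smallest to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    matched = [0.0] * len(values)
    k = 0
    for rank, idx in enumerate(order):
        p = (rank + 0.5) / len(values)  # cumulative probability of this rank
        while k < len(cdf) - 1 and cdf[k][0] < p:
            k += 1
        matched[idx] = cdf[k][1]
    return matched


# Hypothetical target histogram and an arbitrary simulated field.
target = [(-1.0, 1), (0.0, 2), (1.0, 1)]
rng = random.Random(1)
field = [rng.gauss(0.0, 1.0) for _ in range(8)]
matched_field = histogram_match(field, target)
```

Because only the values change and not their ordering, the spatial structure produced by the field generator is retained while the frequency distribution is forced to follow the target.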

By using histogram matching, the modelled DEM error can follow any desired statistical distribution. For example, error within a DEM is commonly more peaked than a Gaussian (normal) distribution would describe. The user specifies the shape of the error distribution by inputting a histogram. This is simply a text file with two columns: the first is the error value (in the same units as the z-units of the DEM) and the second is the frequency of occurrence (for additional information see Histogram Matching). For theoretical distributions, the histogram can be generated in a spreadsheet. For example, if the only error information available is an RMSE value taken from the DEM's metadata, it is reasonable to assume a Gaussian distribution and to create the appropriate histogram using the Gaussian probability function in a spreadsheet. The preferred method for generating the input error histogram, however, is to use ground control points (GCPs). If several GCPs are available, an empirical error distribution can be derived from the data, again using a spreadsheet to plot the frequency distribution.
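As an alternative to a spreadsheet, the Gaussian case can be sketched in a few lines of Python. The example below assumes a zero-mean error whose standard deviation equals the RMSE, a tab-separated layout, and an output file written to the system temporary directory. The function names, bin width, and delimiter are illustrative assumptions, so check the Histogram Matching help for the exact file format the tool expects.

```python
import math
import os
import tempfile


def gaussian_histogram(rmse, bin_width=0.1, num_sd=4.0):
    """Return (error_value, frequency) rows for a zero-mean Gaussian with SD = rmse."""
    half_bins = int(math.ceil(num_sd * rmse / bin_width))
    rows = []
    for i in range(-half_bins, half_bins + 1):
        error = i * bin_width
        density = math.exp(-0.5 * (error / rmse) ** 2) / (rmse * math.sqrt(2.0 * math.pi))
        # Approximate probability mass falling in this bin.
        rows.append((error, density * bin_width))
    return rows


def write_histogram(rows, path):
    """Write the rows as two columns: error value, then frequency (tab-separated, assumed)."""
    with open(path, "w") as f:
        for error, freq in rows:
            f.write("%.4f\t%.6f\n" % (error, freq))


# Hypothetical RMSE of 1.5 (z-units of the DEM); replace the path with your working directory.
rows = gaussian_histogram(rmse=1.5)
histogram_path = os.path.join(tempfile.gettempdir(), "histo.txt")
write_histogram(rows, histogram_path)
```

The same two-column layout applies to an empirical histogram derived from GCPs; in that case the rows would come from binning the observed errors rather than from the Gaussian density.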

An added benefit of having GCPs is that the same data can also be used to calculate the range, which is a measure of the correlation length within the error fields. This value must be specified in the same units as the x-y units of the DEM. If GCPs are available, the range can be derived from the variogram. In this way, it is possible to ensure that each DEM realization in the simulation has the appropriate spatial and statistical error properties.
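To make the variogram step concrete, the following is a minimal sketch of an empirical semivariogram computed from GCP errors. It assumes each GCP is given as an `(x, y, error)` tuple, where the error is the DEM elevation minus the surveyed elevation; the function name, lag width, and sample points are hypothetical. The range is then read off as the lag distance at which the semivariance levels off, a judgement that is usually made by plotting the result or fitting a model to it.

```python
import math


def empirical_semivariogram(points, lag_width, num_lags):
    """Return (lag_centre, semivariance, pair_count) for each non-empty lag bin.

    `points` is a list of (x, y, error) tuples, one per ground control point.
    """
    sums = [0.0] * num_lags
    counts = [0] * num_lags
    for i in range(len(points)):
        xi, yi, ei = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, ej = points[j]
            dist = math.hypot(xi - xj, yi - yj)
            b = int(dist // lag_width)  # index of the lag bin for this pair
            if b < num_lags:
                sums[b] += (ei - ej) ** 2
                counts[b] += 1

    result = []
    for b in range(num_lags):
        if counts[b] > 0:
            # Semivariance is half the mean squared difference within the bin.
            result.append(((b + 0.5) * lag_width, sums[b] / (2.0 * counts[b]), counts[b]))
    return result


# Hypothetical GCP errors along a transect (x, y in DEM x-y units; error in z-units).
gcps = [(0.0, 0.0, 1.0), (10.0, 0.0, 2.0), (20.0, 0.0, 4.0)]
variogram = empirical_semivariogram(gcps, lag_width=10.0, num_lags=3)
```

With real data, many more GCPs are needed than in this toy example for the semivariance estimates in each lag bin to be stable.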

The tool effectively creates a user-specified number of DEM realizations, i.e. multiple equiprobable models of the topography given the known error characteristics. Each realization is created by adding a unique error field to the original DEM. The topographic depressions within each realization are then mapped by differencing the depression-filled version of the realization and the unfilled realization. Depressions are filled using the highly efficient Wang and Liu (2006) depression filling algorithm. The use of this algorithm does limit the application of this tool to DEMs that fit in memory. With each iteration, grid cells that are affected by the depression filling process are flagged and added to a cumulative grid. In this way, the probability of a grid cell belonging to a depression feature (pdep), given the uncertainty in the DEM, can be mapped. The simulation continues until the user-specified number of iterations has been reached. This stochastic analysis has been found to provide accurate maps of depressions (or wetlands, in some landscapes) even with moderate resolution DEM data. The accuracy of the results will depend, however, on the specific DEM data and the landscape characteristics.
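The simulation loop described above can be sketched in plain Python. This is a simplified illustration, not the tool's code. It assumes a small DEM held as a list of lists, and it uses uncorrelated Gaussian noise in place of the turning bands error field with histogram matching. The depression filling is a basic priority-queue flood written in the spirit of Wang and Liu (2006). All function and parameter names are hypothetical.

```python
import heapq
import random


def fill_depressions(dem):
    """Fill depressions by flooding inward from the grid edges with a priority queue."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed the queue with every edge cell; these can always drain off the grid.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True
    # Always expand from the lowest cell on the current flood front.
    while heap:
        z, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    visited[nr][nc] = True
                    if filled[nr][nc] < z:
                        filled[nr][nc] = z  # raise the cell to its spill elevation
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


def depression_probability(dem, error_sd, num_iterations, seed=0):
    """Return a grid of pdep: the fraction of realizations in which each cell was filled."""
    rng = random.Random(seed)
    rows, cols = len(dem), len(dem[0])
    counts = [[0] * cols for _ in range(rows)]
    for _ in range(num_iterations):
        # Simplification: white noise stands in for a spatially correlated error field.
        realization = [[z + rng.gauss(0.0, error_sd) for z in row] for row in dem]
        filled = fill_depressions(realization)
        for r in range(rows):
            for c in range(cols):
                if filled[r][c] > realization[r][c]:
                    counts[r][c] += 1  # cell was affected by depression filling
    return [[n / float(num_iterations) for n in row] for row in counts]


# Hypothetical 5 x 5 DEM: a flat surface at 10.0 with a deep pit in the centre.
dem = [[10.0] * 5 for _ in range(5)]
dem[2][2] = 0.0
pdep = depression_probability(dem, error_sd=0.1, num_iterations=20)
```

In this toy grid the central pit is far deeper than the simulated error, so it is filled in every realization, while shallow noise-induced pits in the surrounding flat cells are filled only some of the time. That contrast is what allows pdep to separate persistent depressions from likely artefacts.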

## Credits:

• John Lindsay and Beau Ahrens (2012) email: jlindsay@uoguelph.ca

## Scripting:

The following is an example of a Python script that uses this tool:

```
wd = pluginHost.getWorkingDirectory()
inputFile = wd + "input.dep"
outputFile = wd + "output.dep"
histogramFile = wd + "histo.txt"
range = "265.0"
numIterations = "100"
numBands = "1000"
args = [inputFile, outputFile, histogramFile, range, numIterations, numBands]
pluginHost.runPlugin("StochasticDepressionAnalysis", args, False)
```

This is a Groovy script also using this tool:

```
def wd = pluginHost.getWorkingDirectory()
def inputFile = wd + "input.dep"
def outputFile = wd + "output.dep"
def histogramFile = wd + "histo.txt"
def range = "265.0"
def numIterations = "100"
def numBands = "1000"
String[] args = [inputFile, outputFile, histogramFile, range, numIterations, numBands]
pluginHost.runPlugin("StochasticDepressionAnalysis", args, false)
```

## References:

• Carr JR. 2002. Data visualization in the geosciences. Prentice Hall, 267 pp.
• Lindsay JB, Creed IF. 2006. Distinguishing between artefact and real depressions in digital elevation data. Computers and Geosciences, 32(8): 1192-1204. doi: 10.1016/j.cageo.2005.11.002.