Whitebox Workflows for Python (WbW) Tutorial 1: Hydrological Analysis¶
Introduction¶
This tutorial is part of a series that includes:
- Tutorial 1: Hydrological Analysis
- Tutorial 2: Geomorphometric Analysis
- Tutorial 3: Mapping Building Footprints from LiDAR
This tutorial is intended to demonstrate how to use the WbW geospatial analysis library to perform basic hydrological analysis. It will not cover all of the functionality related to hydrology contained within WbW. For more information, you may refer to the user manual. You may download a copy of the raw Jupyter Notebooks file (*.ipynb
) from here.
Setting up WbW¶
If you don't already have WbW installed on your machine, do so now.
pip install whitebox-workflows
Or if you have it installed already but need to update to the latest version, you may do so with:
pip install whitebox-workflows -U
Each WbW script must begin by importing the whitebox_workflows
library.
import whitebox_workflows
We need to set up our floating license ID, which will be used by WbW. Once you register your WbW license, you will be emailed your unique floating license ID, which will likely be a randomly generated three-word phrase, involving an animal. The license below was used for the Geomorphometry 2023 conference in Iasi, Romania and will be valid until August 2023. After that point, you will need to purchase a license for eithter WbW (about 10USD) or WbW-Pro (about 350USD) to use the scripts below. Licenses can be purchased from Whitebox Geospatial Inc.
license_id = 'geomorphometry-2023'
Let's begin by testing that WbW is set up correctly and our floating license ID works.
The following shows the basic structure of a WbW script. Notice that we checkout our license by specifying the license_id
as a function parameter when we create the WbEnvironment
instance, wbe
, and check the license back in at the end of the script. We must always check-in our license or else it won't be available for future use (or at least not until it is recycled one hour after checkout). We therefore perform the check-in within a finally
block, so that even if the script throws an error, our license will be checked in.
While in this script, all we are doing is printing the versioning information for Whitebox, the WbEnvironment
contains all of the functions related to geospatial analysis, as well as the functions for reading and writing data. Later we'll explore some of this more advanced functionality.
# Test your license by setting up the WbW environment
wbe = whitebox_workflows.WbEnvironment(license_id)
try:
print(wbe.version())
except Exception as e:
print(e)
finally:
print(wbe.check_in_license(license_id))
Creating a DEM from a lidar point cloud¶
Let's create a new script to download some sample data. Here we'll grab the 'mill_brook' lidar dataset. The script below will download the data for us, assign the directory to which these data are downloaded to the WbEnvironment
working directory and lastly print this location so we can know where the data are being stored. Notice that it may take a few minutes to download the data. In the event that the download takes more than a few minutes, the connection may timeout and you will receive an error. If this should happen, you may download the dataset directly from here but you will need to update the wbe.working_directory
to your download folder. Also note that we are setting the wbe.verbose
variable to True
, which will allow the various wbe
functions to output to the terminal. That way we can receive updates as things are happening.
wbe = whitebox_workflows.WbEnvironment(license_id)
wbe.verbose = True
wbe.working_directory = whitebox_workflows.download_sample_data('mill_brook')
print(f'Data have been stored in: {wbe.working_directory}')
Let's read in the lidar (LAZ) file contained in that download directory.
lidar = wbe.read_lidar('mill_brook.laz')
print(f"There are {lidar.header.number_of_points} points in the lidar dataset.")
We can interpolate this lidar data to a raster DEM using several interpolation methods, but here, we'll triangulate it.
# Create a DEM
dem = wbe.lidar_tin_gridding(lidar, returns_included='all', cell_size=1.0, excluded_classes=[1], max_triangle_edge_length=100.0)
# Fill missing data
dem = wbe.fill_missing_data(dem, filter_size=35) # Notice that we overwrite `dem` here.
# You can save the smoothed DEM if you like.
# wbe.write_raster(dem, 'dem.tif', compress=False)
# Let's visualize the DEM with a hillshade
hs = wbe.multidirectional_hillshade(dem)
wbe.write_raster(hs, 'hillshade.tif', compress=False) # Compression is good, but it is a bit slower so here we won't use it.
Go ahead and open the hillshade.tif
file in QGIS and see what these data look like. When you're done, come on back and carry on with the analysis below.
# Smooth the DEM. This step normally takes some experimentation to get the parameters right, which is why
# why I save the raw DEM/hillshade. Comparison on the hillshade images allows me to tweak the parameters
# until I find that the output DEM has the appropriate level of smoothing that I need for my application.
dem_smoothed = wbe.feature_preserving_smoothing(dem, filter_size=11, normal_diff_threshold=25.0, iterations=3)
# You can save the smoothed DEM if you like...
# wbe.write_raster(dem_smoothed, 'dem_smoothed.tif', compress=False)
# ...but we'll certainly need to save the hillshade image to compare with the hillshade from the raw DEM to
# evaluate whether the smoothing was sufficient.
hs = wbe.multidirectional_hillshade(dem_smoothed)
wbe.write_raster(hs, 'hillshade_smoothed.tif', compress=False)
Go ahead and open the smoothed hillshade and compare it with the hillshade image derived from the original DEM to evaluate the degree to which we were successful in removing small-scale topographic variation without significantly affecting the edges of important features, e.g. stream channels. You can go ahead and change the parameters of the feature_preserving_smoothing
function to adjust the level of smoothing. The main parameters affecting the degree of smoothing are the normal_diff_threshold
and the iterations
, but increasing filter_size
can also impact it.
Once you're satified with the amount of smoothing, we can derive contours from the DEM for visualization purposes.
contours = wbe.contours_from_raster(dem_smoothed, contour_interval=10.0)
wbe.write_vector(contours, 'contours.shp')
How about extracting breaklines? This function requires the WbW-Pro license, however, so be sure your floating license ID is for this product before running the code below.
breaklines = wbe.breakline_mapping(dem_smoothed, threshold=3.0, min_length=3)
wbe.write_vector(breaklines, 'breaklines.shp')
You may overlay your contours (and optionally the breaklines) on your hillshade image in QGIS if you would like to see the result. You can decrease the threshold
parameter to create more extensive breakline coverage, and increase it to have the coverage focused on only major breaklines. Experiment with this parameter until you are satisfied with the breakline coverage.
Performing hydrological analyses on the DEM¶
Now let's do a bit of hydrological processing of the data, including extracting a stream network.
import math # We'll use the log function below
# Remove the depressions, first by breaching the depressions using a max dist so that it doesn't
# carve excessively long trenches for very deep pits, and then filling the remaining depressions
dem_no_deps = wbe.breach_depressions_least_cost(dem_smoothed, flat_increment=0.001, max_dist=100) # Change the max dist parameter as appropriate for your DEM
dem_no_deps = wbe.fill_depressions(dem_no_deps, flat_increment=0.001)
# Perform a flow-accumulation operation. Here I'm using the Qin (2007) multiple flow direction algorithm
# but there are many other options available, including D-infinity.
#
# Stream channels are usually identified as areas of relatively high flow accumulation and are mapped by thresholding
# flow accumulation values. Let's choose a threshold value.
channel_threshold = 25000.0
flow_accum = wbe.qin_flow_accumulation(dem_no_deps, out_type='cells', convergence_threshold=channel_threshold, log_transform=True)
wbe.write_raster(flow_accum, 'qin_flow_accum.tif')
# Map the streams by thresholding the flow accum raster, using the same convergence threshold used above. This way
# we can be assured that the streams are single-cell wide D8 representation, which is needed for any stream
# network analysis operations.
streams = flow_accum > math.log(channel_threshold)
Decreasing the value of channel_threshold
will result in a more extensive network of stream channels and increasing it will result in a less extensive network. The channel threshold of 25000 (in grid cells) has been selected simply by examining the values of flow accumulation within the qin_flow_accum.tif
file near the headwaters of the visible stream channels in the hillshade image. There will, of course, be variation in this value and it may require some refining to get a reasonable value that performs well throughout. In fact, geomorphologists often use more sophisticated methods, usually involving slope and sometimes other factors, to select a channelization threshold. Experiment with the value of channel_threshold
to see how the stream network is impacted by this value.
Now let's map the areas draining to an outlet point and to various parts of the stream network...
# Let's extract the watershed for a specific outlet point
outlet = wbe.read_vector('outlet.shp') # This is a vector point that was included when we downloaded the `mill_brook` dataset.
# Make sure that the outlet is positioned along the stream
outlet = wbe.jenson_snap_pour_points(outlet, streams, 5.0)
# We need a d8-pointer raster to be able to route flow through the network
d8_pntr = wbe.d8_pointer(dem_no_deps)
# Extract the outlet's watershed
watershed = wbe.watershed(d8_pointer=d8_pntr, pour_points=outlet)
# Vectorize the watershed polygon for visualization
watershed_vec = wbe.raster_to_vector_polygons(watershed)
# Smooth the watershed map for visualization
watershed_vec = wbe.smooth_vectors(watershed_vec, filter_size=5)
wbe.write_vector(watershed_vec, 'watershed.shp')
# Now, we only want the streams inside the watershed
streams = streams * watershed # Notice that we can treat the rasters like any other Python variable in a math equation.
# Perform a stream network analysis on the stream vector
streams_vec = wbe.raster_streams_to_vector(streams, d8_pntr)
streams_vec, tmp1, tmp2, tmp3 = wbe.vector_stream_network_analysis(streams_vec, dem_no_deps) # We only want the streams output
wbe.write_vector(streams_vec, 'streams.shp')
# Extract all of the watersheds, draining to each outlet on the edge of the DEM using the 'basins' function.
basins = wbe.basins(d8_pntr)
wbe.write_raster(basins, 'basins.tif')
# How about extracting subcatchments, i.e. the areas draining directly to each link in the stream network?
subcatchments = wbe.subbasins(d8_pntr, streams)
wbe.write_raster(subcatchments, 'subcatchments.tif')
# Or perhaps map Strahler basins, i.e. the areas draining to Strahler order 1, 2, 3, etc. streams...
strahler_basins = wbe.strahler_order_basins(d8_pointer=d8_pntr, streams=streams)
wbe.write_raster(strahler_basins, 'strahler_basins.tif')
Wrapping things up¶
Don't forget to check your license in after you're done using it. If you skip this step, your checked-out license won't be returned to the license pool for others to use for one hour. Be kind to others!
print(wbe.check_in_license(license_id))