Focal statistics for python
This module aims to provide focal statistics for python, that runs without the installation of extensive GIS packages.
The only dependency is numpy. Additionally, the package provides sliding window functionality to implement your own
focal statistics functions (see Tutorials). This is implemented in line with
numpy.lib.stride_tricks.sliding_window_view()
, but extends its functionality.
Documentation
Installation
Required dependencies
Instructions
Install with
conda install --channel conda-forge focal-stats
Focal statistics
Focal statistics are functions calculated on the neighborhood of the data, often referred to as sliding, or rolling window operations on 2D data. Functions are statistical and range from mean and standard deviation to quantiles and mode (majority).
Conceptually, the algorithm iterates over each pixel of a raster, and looks in all four directions to calculate the focal statistic. The neighborhoods can overlap, but might also be stacked next to each other, so a sparse output is generated.
An example from the ArcGIS documentation, considering the focal sum of a raster with a specified neighborhood of 3x3 pixels. The values in this window are summed and placed in the output array at the location of the most central pixel in the window:

The second example shows what the same example looks like when a complete raster is considered:

The statistical operations implemented in this package are:
mean:
focal_mean()
minimum:
focal_min()
maximum:
focal_max()
standard deviation:
focal_std()
summation:
focal_sum()
majority/mode:
focal_majority()
correlation:
focal_correlation()
None
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
Usage example
import focal_stats
import rasterio as rio
import matplotlib.pyplot as plt
import numpy as np
import os
Matplotlib is building the font cache; this may take a moment.
os.chdir("../../../")
Loading raster (containing water table depth (Fan et al., 2017)).
with rio.open("data/wtd.tif") as f:
a = f.read(1).astype(np.float64)
a[a == -999.9] = np.nan
Inspecting the data
plt.imshow(a, cmap='Blues', vmax=100)
plt.title("Water table depth")
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7ff3c4aec550>

Focal statistics
Calculation of the focal mean:
plt.imshow(focal_stats.focal_mean(a, window_size=15), vmax=100, cmap="Blues")
<matplotlib.image.AxesImage at 0x7ff3c4184b50>

This looks quite similar to the input raster, but with smoothing applied. Let’s try a higher window_size, which should increase the smoothing
plt.imshow(focal_stats.focal_mean(a, window_size=101), vmax=100, cmap="Blues")
<matplotlib.image.AxesImage at 0x7ff3c4112c90>

This same functionality can be used to reduce the shape of the raster based on this window_size.
x = focal_stats.focal_mean(a, window_size=108, reduce=True)
plt.imshow(x, vmax=100, cmap="Blues")
<matplotlib.image.AxesImage at 0x7ff3c4084b90>

The shape of this new raster is exactly 108 times smaller than the input raster. Note that for this to work both x and y-axes need to be divisible by the window size.
None
Note
This tutorial was generated from an IPython notebook that can be downloaded here.
Creating custom focal statistic function
import focal_stats
import rasterio as rio
import matplotlib.pyplot as plt
import numpy as np
import os
os.chdir("../../../")
Loading raster (containing water table depth (Fan et al., 2017)).
with rio.open("data/wtd.tif") as f:
a = f.read(1).astype(np.float64)
a[a == -999.9] = np.nan
Inspecting the data
plt.imshow(a, cmap='Blues', vmax=100)
plt.title("Water table depth")
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7f3f63ec3e90>

Creating custom focal mean
Firstly, a windowed version of the input raster needs to be defined.
a_windowed = focal_stats.rolling_window(a, window_size=5)
print(a.shape, a_windowed.shape)
(4320, 5400) (4316, 5396, 5, 5)
This windowed version has a slightly different shape on the first two axes. This is because there is no window behaviour defined on the edges. If this is undesired the original array can be padded with the missing number of columns and rows with numpy.pad. Through this function many different edge value assumptions can be made. Here I use the example of continuing with the closest values.
a_padded = np.pad(a, pad_width=2, mode='edge')
a_windowed_padded = focal_stats.rolling_window(a_padded, window_size=5)
print(a.shape, a_windowed_padded.shape)
(4320, 5400) (4320, 5400, 5, 5)
This has the result that the input and output raster share their first two axes.
Now the only thing that needs to happen is a mean operation on the third and fourth axes:
a_mean = a_windowed.mean(axis=(2, 3))
Plotting this shows that the operation generates an image that is very close to the original raster, with some limited smoothing
plt.imshow(a_mean, cmap="Blues", vmax=100)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7f3f62503510>

This can be captured in a custom focal_mean function as follows:
def focal_mean(a, window_size):
a_windowed = focal_stats.rolling_window(a, window_size=window_size)
return a_windowed.mean(axis=(2, 3))
Resulting in the same image:
plt.imshow(focal_mean(a, window_size=5), cmap="Blues", vmax=100)
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7f3f623f4550>

Note that if a single NaN-value was present in the window, it results in
a NaN-value. I dealt with this by inserting 0 in the pixels with
NaN-values and using the sum of this array divided by the number of
valid values per window (e.g.
rolling_sum(~np.isnan(a), window_size=5)
).
What’s new
API reference
Focal statistics
Focal statistics functions operating on array-like input data. They share functionality of window_size
(determining the size of the sliding window), mask
(superseding window_size
and allowing for non-rectangular
windows), fraction_accepted
(nan-behaviour based on the fraction of available data in the input window) and
reduce
(switch between returning same shape as input data and patching the sliding window without overlapping,
leading to a much smaller output array). focal_correlation()
calculates the correlation between two arrays in
contrast to the other functions that operate on a single array.
Focal minimum |
|
Focal maximum |
|
Focal mean |
|
Focal standard deviation |
|
Focal summation |
|
Focal majority |
|
Focal correlation |
Rolling functions
This module additionally implements standalone rolling functions, accepting ND arrays and accepting window_size
,
mask
and reduce
similarly to the focal statistics functionality. It does however not guard against NaN
occurrences specifically, staying with the raw numpy behaviour. The functions are intended to be used to define custom
focal statistics functionality, potentially in higher than 2D dimensionality.
|
Takes an array and returns a windowed version |
|
Takes an array and returns the rolling sum. |
|
Takes an array and returns the rolling mean. |
License
focal_stats is published under a MIT license.