Bug Description
When computing the 'nodata' count as one of the statistics for zonal_stats, if a geometry's rasterization does not include any pixels at all then:
a UserWarning will be raised by numpy and
the 'nodata' value will be NaN.
Of course, it is fine to interpret NaN as zero, but for consistency this should probably be:
None, which is the value returned for 'max', 'mean', etc., or
0, which is the value returned for 'count', and probably makes conceptual sense in this case because 'nodata' functions as a count of pixels similarly to 'count'.
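If the second option (returning 0) is preferred, one possible approach is to count over the filled data instead of summing the masked array directly. The sketch below is not taken from rasterstats; the helper name masked_count is hypothetical and only illustrates the idea with plain numpy.

```python
import numpy as np


def masked_count(condition):
    """Count entries that are True and unmasked.

    Hypothetical helper (not part of rasterstats): masked entries are
    filled with False before summing, so a fully masked input gives 0
    rather than numpy.ma.masked.
    """
    return int(np.ma.filled(condition, False).sum())


# Fully masked input, as happens when no pixel is rasterized.
empty = np.ma.MaskedArray([0, 5, 0], mask=True)
empty_count = masked_count(empty == 0)

# Partially masked input: only the first zero is unmasked.
partial = np.ma.MaskedArray([0, 5, 0], mask=[False, False, True])
partial_count = masked_count(partial == 0)

print(empty_count, partial_count)
```

A helper like this always returns a plain int, which would keep 'nodata' consistent with 'count'.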
Example
I have created a small .tif and .shp file illustrating this: example.zip
In this example, the polygon in the shapefile is too thin to contain any pixel centroid, so the default centroid rasterization strategy produces an "empty" rasterization, which triggers the warning and the NaN value described above.
Explanation for what is happening
The issue lies in the masking performed here:
python-rasterstats/src/rasterstats/main.py
Lines 261 to 264 in aa4130e
When trying to take .sum() over a fully masked array, numpy Masked Arrays will return numpy.ma.masked instead of zero.
If rv_array is all False (i.e. the rasterization doesn't include any pixels at all), then featmasked will be entirely masked, and so (featmasked == 0).sum() will be equal to numpy.ma.masked. Then, trying to convert to a float using float(numpy.ma.masked) raises the warning and returns np.nan, instead of zero.
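The behaviour can be reproduced with numpy alone, without any raster or shapefile. This is a minimal sketch: the array contents are made up, and rv_array and featmasked are only stand-ins that reuse the names from the explanation above, not the actual rasterstats code.

```python
import math
import warnings

import numpy as np

# Made-up raster values; rv_array is all False, i.e. no pixel was rasterized.
raster = np.array([[0, 1], [2, 0]])
rv_array = np.zeros(raster.shape, dtype=bool)

# Mask everything outside the (empty) rasterized geometry.
featmasked = np.ma.MaskedArray(raster, mask=~rv_array)

# Summing a fully masked comparison yields the masked constant, not 0.
total = (featmasked == 0).sum()

# Converting the masked constant to float warns and gives NaN.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    nodata_count = float(total)

print(total is np.ma.masked)
print(math.isnan(nodata_count))
print([w.category.__name__ for w in caught])
```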
Install info
I'm using rasterstats 0.14.0 with numpy 1.19.2 on Anaconda.