mask_and_scale=True silently ignores malformed GDAL_METADATA XML #2998

@brendancol

Describe the bug

When open_geotiff(..., mask_and_scale=True) reads a file whose GDAL_METADATA XML tag is not well-formed XML, the reader drops the metadata silently and returns raw, unscaled pixels.

_parse_gdal_metadata() in xrspatial/geotiff/_geotags.py catches ET.ParseError and returns {}. Downstream, _extract_scale_offset() in xrspatial/geotiff/_attrs.py treats a missing scale/offset as identity scaling (scale 1.0, offset 0.0). A file that declared a scale/offset inside a corrupt XML payload therefore reads as if no scaling were declared, instead of failing.
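A minimal standalone sketch of that failure mode, for illustration only. It does not import xrspatial: `parse_gdal_metadata_lenient` and `extract_scale_offset` are hypothetical stand-ins that mimic the behavior described above, and the XML payload is a simplified example of a GDAL_METADATA tag, not taken from a real file.

```python
import xml.etree.ElementTree as ET


def parse_gdal_metadata_lenient(xml_text):
    """Hypothetical stand-in: swallow XML parse errors and return {}."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return {}
    return {item.get("name"): (item.text or "") for item in root.iter("Item")}


def extract_scale_offset(meta):
    """Hypothetical stand-in: missing keys fall back to identity scaling."""
    return float(meta.get("SCALE", 1.0)), float(meta.get("OFFSET", 0.0))


good = (
    '<GDALMetadata>'
    '<Item name="SCALE" sample="0" role="scale">0.01</Item>'
    '<Item name="OFFSET" sample="0" role="offset">100</Item>'
    '</GDALMetadata>'
)
# Drop the closing root tag to simulate a truncated payload.
bad = good.replace("</GDALMetadata>", "")

print(extract_scale_offset(parse_gdal_metadata_lenient(good)))  # (0.01, 100.0)
print(extract_scale_offset(parse_gdal_metadata_lenient(bad)))   # (1.0, 0.0)
```

The second call is indistinguishable from a file that never declared a scale/offset, which is the silent wrong-pixels case.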

PRs #2988 and #2992 already reject malformed numeric SCALE/OFFSET values under mask_and_scale=True via MalformedScaleOffsetError. Malformed XML itself slips through that guard.

Expected behavior

Under mask_and_scale=True, a malformed GDAL_METADATA XML payload should fail closed with a clear MalformedScaleOffsetError, like the existing malformed-numeric case. Reads that don't request mask_and_scale should keep working unchanged, since they never read the scale/offset metadata.

For this module, returning wrong pixels silently is worse than a hard error.

Additional context

The DOCTYPE / billion-laughs security drop should stay silent; that payload is refused on purpose. Existing tests in xrspatial/geotiff/tests/read/test_rioxarray_compat_2961.py cover malformed numeric scale/offset values but not malformed XML itself.
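One possible shape for the fix, sketched standalone under stated assumptions. The `strict` keyword is hypothetical (something the reader would set when mask_and_scale=True was requested), the `MalformedScaleOffsetError` class here is a local stand-in whose base class is a guess, and the substring DOCTYPE check is only a placeholder for however the real code detects that payload.

```python
import xml.etree.ElementTree as ET


class MalformedScaleOffsetError(ValueError):
    """Local stand-in; the real exception's base class is not shown in this issue."""


def parse_gdal_metadata(xml_text, *, strict=False):
    """Parse a GDAL_METADATA payload into a {name: text} dict.

    `strict` is a hypothetical flag for reads that requested mask_and_scale=True.
    """
    # Placeholder DOCTYPE check: refused on purpose, so it stays silent
    # even in strict mode.
    if "<!DOCTYPE" in xml_text.upper():
        return {}
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        if strict:
            # Fail closed instead of falling back to identity scaling.
            raise MalformedScaleOffsetError(
                f"GDAL_METADATA is not well-formed XML: {exc}"
            ) from exc
        return {}  # reads without mask_and_scale keep working unchanged
    return {item.get("name"): (item.text or "") for item in root.iter("Item")}


truncated = '<GDALMetadata><Item name="SCALE">0.01</Item>'
doctype = '<!DOCTYPE lolz [<!ENTITY lol "lol">]><GDALMetadata/>'

assert parse_gdal_metadata(truncated) == {}             # lenient path unchanged
assert parse_gdal_metadata(doctype, strict=True) == {}  # security drop stays silent
try:
    parse_gdal_metadata(truncated, strict=True)
except MalformedScaleOffsetError as exc:
    print("rejected:", exc)
```

The three checks at the bottom map onto the cases a new test would likely need: malformed XML without mask_and_scale, the DOCTYPE payload with it, and malformed XML with it.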

Labels: bug, geotiff
