Speakers
Description
Hypothesis-awkward is a collection of Hypothesis strategies for Awkward Array. Awkward Array can represent a wide variety of layouts of nested, variable-length, mixed-type data that are common in high-energy physics (HEP) and other fields. Many tools that process Awkward Arrays are widely used and actively developed. Unit tests for these tools often list many input samples explicitly in an attempt to cover edge cases. In practice, however, such manually enumerated samples can cover only a small portion of the vast combinatorial space of valid Awkward Array instances. Hypothesis, a Python property-based testing library, generates test data aimed at making tests fail and automatically explores edge cases, so developers do not need to craft test data manually. Early versions of hypothesis-awkward include strategies that generate Awkward Arrays converted from NumPy arrays and nested Python lists. The collection is being extended toward comprehensive strategies that generate fully general Awkward Arrays, with multiple options to control the layout, data types, missing values, masks, and other array attributes. These strategies can generate thousands of test samples or more per test case, automatically searching for rare bugs. They help close in on edge cases both in tools that use Awkward Array and in Awkward Array itself.