`validatetools` is a utility package for managing validation rule sets that are defined with `validate`. In production systems, validation rule sets tend to grow organically and accumulate redundant or (partially) contradictory rules. `validatetools` helps to identify problems with large rule sets and includes simplification methods for resolving issues.

The following methods allow for problem detection:

- `is_infeasible`: checks a rule set for feasibility. An infeasible rule set must be corrected before it is useful.
- `detect_boundary_num`: shows the allowed range of values for each numerical variable.
- `detect_boundary_cat`: shows the allowed categories for each categorical variable.
- `detect_fixed_variables`: shows variables whose value is fixed by the rule set.
- `detect_redundancy`: shows which rules are already implied by other rules.

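
The detection functions can be tried on small toy rule sets. The following is a minimal sketch, assuming the `validate` and `validatetools` packages are installed; the rules and the variable names (`x`, `job`, `income`) are invented for illustration and do not come from a real rule set.

```r
library(validate)
library(validatetools)

# Two rules that cannot hold at the same time: the rule set is infeasible.
contradictory <- validator(rule1 = x > 0, rule2 = x < 0)
is_infeasible(contradictory)

# A feasible rule set that bounds x from both sides.
bounded <- validator(x >= -1, x <= 1)
is_infeasible(bounded)
detect_boundary_num(bounded)   # lower and upper bound per numerical variable

# A mixed rule set with a categorical variable (hypothetical example).
mixed <- validator(
  job %in% c("yes", "no"),
  if (job == "no") income == 0,
  income > 0
)
detect_boundary_cat(mixed)     # which categories are still allowed

# x >= 0 together with x <= 0 leaves a single possible value for x.
pinned <- validator(x >= 0, x <= 0)
detect_fixed_variables(pinned)

# x > 2 already implies x > 1, so one of the two rules adds nothing.
overlapping <- validator(rule1 = x > 1, rule2 = x > 2)
detect_redundancy(overlapping)
```

Running the checks on deliberately small rule sets like these makes it easier to interpret the output before applying them to a large production rule set.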
The following methods detect possible simplifications and apply them to a rule set:

- `substitute_values`: replace variables with constants.
- `simplify_fixed_variables`: substitute the fixed variables with their values in a rule set.
- `simplify_conditional`: remove redundant (parts of) conditional rules.
- `remove_redundancy`: remove redundant rules.

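
The simplification functions each take a rule set and return a (possibly smaller) rule set. The sketch below assumes `validate` and `validatetools` are installed; all rules and object names are illustrative only.

```r
library(validate)
library(validatetools)

# substitute_values: plug a known value for z into every rule.
with_z <- validator(rule1 = z > 1, rule2 = y > z)
substituted <- substitute_values(with_z, z = 2)
substituted

# simplify_fixed_variables: these rules force x3 to be 0
# (x3 >= 0 and x3 == -(x1 + x2) <= 0), so x3 can be replaced by its value.
sums <- validator(x1 + x2 + x3 == 0, x1 + x2 >= 0, x3 >= 0)
simplify_fixed_variables(sums)

# simplify_conditional: r2 makes the conclusion y > 3 impossible,
# so the conditional rule r1 can be reduced.
conditional <- validator(r1 = if (x > 1) y > 3, r2 = y < 2)
simplify_conditional(conditional)

# remove_redundancy: drop rules that are implied by the others.
overlapping <- validator(rule1 = x > 1, rule2 = x > 2)
reduced <- remove_redundancy(overlapping)
reduced
```

Because each function returns a `validator` object, the steps can be chained, for example by substituting known values first and then removing the rules that have become redundant.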
Statistical Data Cleaning with Applications in R, Mark van der Loo and Edwin de Jonge, ISBN: 978-1-118-89715-7