Expands a weight specification into a weight matrix to be used
by locate_errors() and replace_errors(). Weights "guide" the
error localization process, so that less reliable values/variables with lower
weight are selected first. See Details for the supported specifications.
Details
If the weights need fine-tuning,
one approach is to generate a weight data.frame with expand_weights() (using as.data.frame = TRUE),
adjust it, and then pass it to locate_errors() or replace_errors().
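The fine-tuning scenario described above might look like the following sketch. The rule age >= 18 and the choice of which cell to adjust are illustrative assumptions, not part of the original documentation; validator() comes from the validate package.

```r
# Sketch only: the rule and the adjusted cell are illustrative assumptions.
library(validate)
library(errorlocate)

dat <- data.frame(age = c(49, 23), country = c("NL", "DE"))

# Start from a column-level specification and expand it to one weight per cell.
w <- expand_weights(dat, c(age = 2, country = 1), as.data.frame = TRUE)

# Fine-tune a single cell: treat the age in row 2 as more reliable.
w$age[2] <- 5

# Hypothetical rule, used only to have something to check against.
rules <- validator(age >= 18)

# Pass the adjusted weights to the error localization step.
le <- locate_errors(dat, rules, weight = w)
values(le)  # logical matrix marking the values selected as erroneous
```

Working on the data.frame form keeps the adjustment readable, because cells can be addressed by column name and row index.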
The following specifications for weight are supported:
- NULL: generates a weight matrix with 1's
- a named numeric: unmentioned columns will have weight 1
- an unnamed numeric with a length equal to ncol(dat)
- a data.frame with the same number of rows as dat
- a matrix with the same number of rows as dat
- Inf, NA: these weights are interpreted as marking variables that must not be changed. Inf weights perform much better than setting a weight to a large number.
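The NULL and matrix specifications are not covered by the examples further down, so here is a minimal sketch of both. The weight values are arbitrary, and the matrix is given column names matching dat as a precaution; whether unnamed matrix columns are accepted is not stated in this documentation.

```r
library(errorlocate)

dat <- data.frame(age = c(49, 23), country = c("NL", "DE"))

# NULL: every cell gets the default weight of 1.
w_default <- expand_weights(dat, NULL)

# A matrix with one row per record; column names are set to match dat.
w_cell <- matrix(
  c(2, 2,   # weights for age, rows 1 and 2
    1, 5),  # weights for country, rows 1 and 2
  nrow = nrow(dat),
  dimnames = list(NULL, names(dat))
)
w_matrix <- expand_weights(dat, w_cell)
```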
See also
Other error finding:
errorlocation-class,
errors_removed(),
locate_errors(),
replace_errors()
Examples
dat <- read.csv(text=
"age,country
49, NL
23, DE
", strip.white=TRUE)
weight <- c(age = 2, country = 1)
expand_weights(dat, weight)
#>      age country
#> [1,]   2       1
#> [2,]   2       1
weight <- c(2, 1)
expand_weights(dat, weight, as.data.frame = TRUE)
#>   age country
#> 1   2       1
#> 2   2       1
# works too
weight <- c(country=5)
expand_weights(dat, weight)
#>      age country
#> [1,]   1       5
#> [2,]   1       5
# specify a per row weight for country
weight <- data.frame(country=c(1,5))
expand_weights(dat, weight)
#>      age country
#> [1,]   1       1
#> [2,]   1       5
# country should not be changed!
weight <- c(country = Inf)
expand_weights(dat, weight)
#>      age country
#> [1,]   1     Inf
#> [2,]   1     Inf