Expands a weight specification into a weight matrix to be used
by locate_errors() and replace_errors(). Weights allow for "guiding" the
error localization process, so that less reliable values/variables with less
weight are selected first. See Details for the supported specifications.
Details
If the weights need fine-tuning, one option is to generate a weight matrix
(or data.frame) with expand_weights(), adjust it, and then pass it to
locate_errors() or replace_errors().
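The fine-tuning scenario above could look roughly like the sketch below. The data, the single rule `age >= 0`, and the adjusted value 0.5 are made up for illustration; the sketch assumes the errorlocate and validate packages are installed and that locate_errors() accepts the expanded matrix through its weight argument.

```r
library(errorlocate)

# Illustrative data: the age in row 2 is deliberately invalid
dat <- data.frame(age = c(49, -23), country = c("NL", "DE"))

# Expand a named specification into one weight per cell
w <- expand_weights(dat, c(age = 2, country = 1))

# Fine-tune a single cell: treat the age in row 2 as less reliable
w[2, "age"] <- 0.5

# Hypothetical rule set, only there to have something to check against
rules <- validate::validator(age >= 0)

# Pass the adjusted weights on to the error localization step
el <- locate_errors(dat, rules, weight = w)
print(el)
```

Only the cell that was overwritten differs from the expanded specification; all other cells keep the weights from the named vector.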
The following specifications for weight are supported:
NULL: generates a weight matrix with 1's
a named numeric: unmentioned columns will have weight 1
an unnamed numeric with a length equal to ncol(dat)
a data.frame with the same number of rows as dat
a matrix with the same number of rows as dat
Inf and NA weights are interpreted to mean that those variables must not be changed and are kept fixed. Inf weights perform much better than setting a weight to a large number.
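Two of the specifications listed above, NULL and a matrix, can be tried out as follows. This is a small sketch with made-up data; the matrix here carries the column names of dat and has the same dimensions, which is an assumption chosen to keep the example unambiguous.

```r
library(errorlocate)

dat <- data.frame(age = c(49, 23), country = c("NL", "DE"))

# NULL: every cell gets weight 1
w_default <- expand_weights(dat, NULL)

# A matrix with one row per record; column names match those of dat
w_spec <- matrix(
  c(2, 2,   # age weights for rows 1 and 2
    1, 5),  # country weights for rows 1 and 2
  nrow = 2,
  dimnames = list(NULL, c("age", "country"))
)
w_matrix <- expand_weights(dat, w_spec)
```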
See also
Other error finding:
errorlocation-class,
errors_removed(),
locate_errors(),
replace_errors()
Examples
dat <- read.csv(text=
"age,country
49, NL
23, DE
", strip.white=TRUE)
weight <- c(age = 2, country = 1)
expand_weights(dat, weight)
#>      age country
#> [1,]   2       1
#> [2,]   2       1
weight <- c(2, 1)
expand_weights(dat, weight, as.data.frame = TRUE)
#>   age country
#> 1   2       1
#> 2   2       1
# works too
weight <- c(country=5)
expand_weights(dat, weight)
#>      age country
#> [1,]   1       5
#> [2,]   1       5
# specify a per row weight for country
weight <- data.frame(country=c(1,5))
expand_weights(dat, weight)
#>      age country
#> [1,]   1       1
#> [2,]   1       5
# country should not be changed!
weight <- c(country = Inf)
expand_weights(dat, weight)
#>      age country
#> [1,]   1     Inf
#> [2,]   1     Inf