Bibliographical notes

More background on the validate package can be found in the following paper for the Journal of Statistical Software.

MPJ van der Loo and E de Jonge (2020). Data Validation Infrastructure for R. Journal of Statistical Software, accepted for publication.

The theory of data validation is described in the following paper.

MPJ van der Loo and E de Jonge (2020). Data Validation. In Wiley StatsRef: Statistics Reference Online (eds N. Balakrishnan, T. Colton, B. Everitt, W. Piegorsch, F. Ruggeri and J.L. Teugels).

Data validation is described in the wider context of data cleaning in Chapter 6 of the following book.

MPJ van der Loo and E de Jonge (2018). Statistical Data Cleaning With Applications in R. John Wiley & Sons, NY.

The following document describes data validation in the context of European Official Statistics. It includes issues such as lifecycle management, complexity analyses and examples from practice.

M. Di Zio, N. Fursova, T. Gelsema, S. Giessing, U. Guarnera, J. Petrauskienė, L. Quensel-von Kalben, M. Scanu, K. ten Bosch, M. van der Loo, and K. Walsdorfer (2015). Methodology for data validation. Deliverable of the ESSNet on validation.

The lumberjack package discussed in Chapter 9 is described in the following paper.

MPJ van der Loo (2020). Monitoring Data in R with the lumberjack package. Journal of Statistical Software, accepted for publication.