Bibliographical notes
More background on the validate package can be found in the following paper in the Journal of Statistical Software.
MPJ van der Loo and E de Jonge (2021). Data Validation Infrastructure for R. Journal of Statistical Software, 97(10).
The theory of data validation is described in the following paper.
MPJ van der Loo and E de Jonge (2020). Data Validation. In Wiley StatsRef: Statistics Reference Online (eds N. Balakrishnan, T. Colton, B. Everitt, W. Piegorsch, F. Ruggeri and J.L. Teugels).
Data validation is described in the wider context of data cleaning in Chapter 6 of the following book.
MPJ van der Loo and E de Jonge (2018). Statistical Data Cleaning with Applications in R. John Wiley & Sons, NY.
The following document describes data validation in the context of European Official Statistics. It includes issues such as lifecycle management, complexity analyses and examples from practice.
M. Di Zio, N. Fursova, T. Gelsema, S. Giessing, U. Guarnera, J. Petrauskiene, L. Quensel-von Kalben, M. Scanu, K. ten Bosch, M. van der Loo, and K. Walsdorfer (2015). Methodology for data validation. Deliverable of the ESSNet on validation.
The lumberjack package discussed in Chapter 10 is described in the following paper.
MPJ van der Loo (2021). Monitoring Data in R with the lumberjack Package. Journal of Statistical Software, 98(1).