Materials for the short course on Statistical Data Cleaning for Business Statistics at the European Establishment Statistics Workshop 2019
Slot 1
Topic                  Time (min)
Introduction           20
Reading dirty data     30
Approximate matching   50
Data validation        50

Slot 2
Topic                  Time (min)
Error localization     20
Imputation             50
Adjusting              20
Monitoring             30
Wrap-up                10

The course is highly hands-on. Each topic starts with a session of approximately 10-15 minutes in which you run and adapt some R code. Next, I will provide background and details on what you just did. After that there is a more in-depth assignment. Depending on time and topic, we will discuss the topic in more depth afterwards.
Bring a laptop
Participants are expected to have a basic knowledge of R/RStudio, explicitly:
Execute the following R code to install the necessary packages.
install.packages(c("validate", "errorlocate", "simputation", "rspa", "daff",
                   "jsonlite", "XML", "readr", "stringr", "lumberjack"),
                 dependencies = TRUE)
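To confirm that the installation worked before the course starts, a check along the following lines may help. This is only a minimal sketch and not part of the original course materials: the variable names `pkgs`, `installed`, and `missing_pkgs` are illustrative, and the package list is simply copied from the install command above.

```r
# Packages listed in the install command above.
pkgs <- c("validate", "errorlocate", "simputation", "rspa", "daff",
          "jsonlite", "XML", "readr", "stringr", "lumberjack")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise,
# without attaching it to the search path.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Collect the names of any packages that could not be loaded.
missing_pkgs <- names(installed)[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All course packages are installed.")
}
```

If any packages are reported as not installed, rerunning `install.packages()` for just those names is usually enough.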
This work is licensed under a Creative Commons Attribution 4.0 International License.