The imputeTS package specializes in (univariate) time series imputation. It offers implementations of several imputation algorithms. Beyond the imputation algorithms, the package also provides functions for plotting and printing missing data statistics of time series. Additionally, three time series datasets for imputation experiments are included.
Installation
The imputeTS package can be found on CRAN. To install it, execute in R:
install.packages("imputeTS")
If you want to install the latest version from GitHub (can be unstable) run:
library(devtools)
install_github("SteffenMoritz/imputeTS")
Usage
To impute a time series x (that is, fill in all of its missing values), run the following command:
na_interpolation(x)
The output is the time series x with all NAs replaced by reasonable values.
This is just one example of an imputation algorithm. In this case, interpolation was the algorithm of choice for calculating the NA replacements. There are several other algorithms (see also under caption "Imputation Algorithms"). All imputation functions are named alike, starting with na_ followed by an algorithm label, e.g. na_mean, na_kalman, ...
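As a minimal sketch of this naming scheme, the following applies three of the imputation functions to the tsAirgap dataset shipped with the package. The variable names are illustrative only, and all functions are called with their default settings.

```r
library(imputeTS)

# tsAirgap is one of the example series shipped with imputeTS (contains NAs)
x <- tsAirgap

# Same call pattern for every algorithm: na_<label>(x)
x_interp <- na_interpolation(x)  # interpolation
x_mean   <- na_mean(x)           # mean value
x_locf   <- na_locf(x)           # last observation carried forward

# Check whether any missing values remain in the results
anyNA(x_interp)
anyNA(x_mean)
anyNA(x_locf)
```

Each call returns a series of the same length as x, so the results can be compared position by position.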
To plot missing data statistics for a time series x, run the following command:
ggplot_na_distribution(x)
This is also just one example of a plot. Overall, there are four different types of missing data plots (see also under caption "Missing Data Plots").
To print statistics about the missing data in a time series x, run the following command:
statsNA(x)
To load the "heating" time series (with missing values) into a variable y and the "heating" time series (without missing values) into a variable z, run:
y <- tsHeating
z <- tsHeatingComplete
There are three datasets provided with the package: the "tsHeating", the "tsAirgap" and the "tsNH4" time series (see also under caption "Datasets").
Imputation Algorithms
Here is a table with the available algorithms to choose from:
na_interpolation   Missing Value Imputation by Interpolation
na_kalman          Missing Value Imputation by Kalman Smoothing
na_locf            Missing Value Imputation by Last Observation Carried Forward
na_ma              Missing Value Imputation by Weighted Moving Average
na_mean            Missing Value Imputation by Mean Value
na_random          Missing Value Imputation by Random Sample
na_remove          Remove Missing Values
na_replace         Replace Missing Values by a Defined Value
na_seadec          Seasonally Decomposed Missing Value Imputation
na_seasplit        Seasonally Splitted Missing Value Imputation
This is a rather broad overview. Most of the functions themselves offer more than one algorithm. For example, na_interpolation can be set to linear or spline interpolation.
More detailed information about the algorithms and their options can be found in the imputeTS reference manual.
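To illustrate how one function can cover several algorithms, here is a small sketch that switches na_interpolation between linear and spline interpolation through its option argument. The tsAirgap series and the variable names are chosen only for illustration.

```r
library(imputeTS)

x <- tsAirgap

# option selects the interpolation method within the same function
x_linear <- na_interpolation(x, option = "linear")
x_spline <- na_interpolation(x, option = "spline")

# Positions that were missing in the original series
gaps <- is.na(x)

# Observed values are left untouched; the two results can differ only at the gaps
all(as.numeric(x_linear)[!gaps] == as.numeric(x_spline)[!gaps])
```

Which option fits best depends on the series at hand, so trying more than one is usually worthwhile.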
Missing Data Plots
Here is a table with the available plots to choose from:
ggplot_na_distribution    Visualize Distribution of Missing Values
ggplot_na_distribution2   Missing Values Summarized in Intervals
ggplot_na_gapsize         Visualize Distribution of NA Gapsizes
ggplot_na_imputations     Visualize Imputed Values
More detailed information about the plots can be found in the imputeTS reference manual.
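A short sketch of how the four plot functions might be called on the tsAirgap series follows. The variable names are illustrative, and the choice of na_interpolation for the imputed series is just an assumption for the example.

```r
library(imputeTS)

x <- tsAirgap

# The first three plots only need the series with missing values
p_dist  <- ggplot_na_distribution(x)
p_dist2 <- ggplot_na_distribution2(x)
p_gaps  <- ggplot_na_gapsize(x)

# ggplot_na_imputations compares the series with NAs against an imputed version
x_imp <- na_interpolation(x)
p_imp <- ggplot_na_imputations(x_with_na = x, x_with_imps = x_imp)

# Printing a plot object draws the figure
p_imp
```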
Datasets
There are three datasets (each in two versions) available:
tsAirgap            Time series of monthly airline passengers (with NAs)
tsAirgapComplete    Time series of monthly airline passengers (complete)
tsHeating           Time series of a heating system's supply temperature (with NAs)
tsHeatingComplete   Time series of a heating system's supply temperature (complete)
tsNH4               Time series of NH4 concentration in a wastewater system (with NAs)
tsNH4Complete       Time series of NH4 concentration in a wastewater system (complete)
The tsAirgap, tsHeating and tsNH4 time series contain NAs. Their complete versions do not. Apart from the missing values, the two versions of each dataset are identical. The NAs were artificially inserted by simulating the missing data pattern observed in similar incomplete time series from the same domain. Having both a complete and an incomplete version of the same dataset is useful for conducting experiments with imputation functions.
More detailed information about the datasets can be found in the imputeTS reference manual.
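One way such an experiment could look is sketched below: impute tsAirgap with several algorithms and compare the imputed values against tsAirgapComplete. The helper rmse_at_gaps is hypothetical (it is not part of imputeTS), and RMSE is just one possible error measure.

```r
library(imputeTS)

x_na   <- tsAirgap          # series with artificially inserted NAs
x_true <- tsAirgapComplete  # same series without NAs

# Positions at which values had to be imputed
gaps <- is.na(x_na)

# Hypothetical helper: RMSE restricted to the positions that were missing
rmse_at_gaps <- function(imputed) {
  sqrt(mean((as.numeric(imputed)[gaps] - as.numeric(x_true)[gaps])^2))
}

rmse_interp <- rmse_at_gaps(na_interpolation(x_na))
rmse_mean   <- rmse_at_gaps(na_mean(x_na))
rmse_ma     <- rmse_at_gaps(na_ma(x_na))

# Lower values mean the imputations are closer to the true values
c(interpolation = rmse_interp, mean = rmse_mean, ma = rmse_ma)
```

Because the error is computed only at the gap positions, the comparison reflects imputation quality rather than the unchanged observed values.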
Reference
You can cite imputeTS as follows:
Moritz, Steffen, and Bartz-Beielstein, Thomas. "imputeTS: Time Series Missing Value Imputation in R." R Journal 9.1 (2017). doi: 10.32614/RJ-2017-009.
Need Help?
If you have general programming problems or need help using the package, please ask your question on StackOverflow. That way, all users will be able to benefit from your question in the future.
Don't forget to mark your question with the imputets tag on StackOverflow so that I get notified.
Support
If you found a bug or have suggestions, feel free to get in contact via steffen.moritz10 at gmail.com.
All feedback is welcome.
Version
3.3
License
GPL-3