Showing content from https://github.com/waico/skab below:

GitHub - waico/SKAB: SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.

🛠🛠🛠 The testbed is under repair right now. Unfortunately, we can't tell exactly when it will be ready and when we will be able to continue data collection. Updates will be posted in the repository. Sorry for the delay.

❗️❗️❗️ The current version of SKAB (v0.9) contains 34 datasets with collective anomalies. The update to v1.0 will add 300+ files with point and collective anomalies, which will make SKAB one of the largest changepoint-containing benchmarks, especially in the technical field.

We propose the Skoltech Anomaly Benchmark (SKAB) designed for evaluating the anomaly detection core. SKAB allows working with two main problems (there are two markups for anomalies):

  1. Outlier detection (anomalies considered and marked up as single-point anomalies)
  2. Changepoint detection (anomalies considered and marked up as collective anomalies)

SKAB consists of the following artifacts:

  1. Datasets
  2. Proposed Leaderboard for outlier detection and changepoint detection problems
  3. Python modules for algorithms’ evaluation (evaluation modules are currently imported from the TSAD framework; details of the evaluation process are presented here)
  4. Python core with algorithms’ implementation
  5. Python notebooks with anomaly detection pipeline implementation for various algorithms

All the details about SKAB are presented in the following artifacts:

The SKAB v0.9 corpus contains 35 individual data files in .csv format (datasets). The data folder contains the benchmark datasets, and its layout is described in the structure file. Each dataset represents a single experiment and contains a single anomaly. The datasets are multivariate time series collected from the sensors installed on the testbed. The columns in each data file are as follows:

Exploratory Data Analysis (EDA) for SKAB is presented [here (tbd)]. A Russian version of the EDA is available on Kaggle.
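As a starting point for exploring a single experiment, the sketch below reads one data file using only the standard library. The inline sample is synthetic, and the `;` separator and the column names (`datetime`, `Pressure`, `anomaly`, `changepoint`) are assumptions made for illustration; check them against the actual files before relying on this.

```python
import csv
import io
from datetime import datetime

# Synthetic stand-in for one data file. The separator and the column
# names used here are assumptions, not taken from the benchmark itself.
SAMPLE = """datetime;Pressure;anomaly;changepoint
2020-03-09 10:14:33;0.05;0.0;0.0
2020-03-09 10:14:34;0.38;0.0;0.0
2020-03-09 10:14:35;0.71;1.0;1.0
2020-03-09 10:14:36;0.72;1.0;0.0
"""


def load_experiment(handle, sep=";"):
    """Read one experiment into a list of dicts with parsed types."""
    rows = []
    for raw in csv.DictReader(handle, delimiter=sep):
        # Parse the timestamp; every other column is treated as numeric.
        row = {"datetime": datetime.strptime(raw["datetime"], "%Y-%m-%d %H:%M:%S")}
        for name, value in raw.items():
            if name != "datetime":
                row[name] = float(value)
        rows.append(row)
    return rows


rows = load_experiment(io.StringIO(SAMPLE))
n_anomalous = sum(1 for r in rows if r["anomaly"] == 1.0)
changepoints = [r["datetime"] for r in rows if r["changepoint"] == 1.0]
print(len(rows), n_anomalous, changepoints)
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper.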

ℹ️We have also made a SKAB teaser that is a small dataset collected separately but from the same testbed. SKAB teaser is made just for learning/teaching purposes and contains only 4 collective anomalies. All the information is available on kaggle.

This leaderboard shows the performance of algorithms on the test set only, unlike the leaderboard for SKAB v0.9, which evaluates training and test data together. Moreover, the evaluation window for each change point lies to the right of the actual change point occurrence, since it should be impossible to detect an event before it occurs. Lastly, the window size for the NAB detection algorithm is set to 60 seconds to reflect the dynamics of the transition, as presented in the slides. This enables detection of the start of the transition phase, which is also marked as a change point.
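The effect of a right-sided window can be illustrated with a small sketch. It is not the NAB scoring function: it only counts missed change points and false positives for windows that open at each true change point and extend 60 seconds to the right. The function name `count_missed_and_false` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def count_missed_and_false(true_cps, detections, window=timedelta(seconds=60)):
    """Match detections against windows [cp, cp + window].

    Returns (missed_cps, false_positives). A detection that fires before a
    change point never counts for it, because the window lies entirely to
    the right of the change point.
    """
    windows = [(cp, cp + window) for cp in true_cps]
    hit = [False] * len(windows)
    false_positives = 0
    for det in detections:
        matched = False
        for i, (start, end) in enumerate(windows):
            if start <= det <= end:
                hit[i] = True
                matched = True
                break
        if not matched:
            false_positives += 1
    return hit.count(False), false_positives


t0 = datetime(2020, 1, 1, 12, 0, 0)
true_cps = [t0, t0 + timedelta(minutes=10)]
detections = [
    t0 - timedelta(seconds=5),                # too early: false positive
    t0 + timedelta(seconds=30),               # inside the first window: hit
    t0 + timedelta(minutes=10, seconds=90),   # too late: false positive
]
missed, false_pos = count_missed_and_false(true_cps, detections)
print(missed, false_pos)
```

Here the early detection is counted as a false positive rather than a hit, which is the behaviour the right-sided window is meant to enforce.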

You can present and evaluate your algorithm using SKAB on kaggle. Leaderboards are also available at paperswithcode.com: CPD problem.

Information about the metrics for anomaly detection and intuition behind the metrics selection can be found in this medium article.

Outlier detection problem

Sorted by F1. For F1, bigger is better; for both FAR (False Alarm Rate) and MAR (Missing Alarm Rate), less is better. Evaluated as a binary classification problem.
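For orientation, the three metrics can be computed from binary labels as in the sketch below. It assumes the usual definitions, FAR = FP / (FP + TN) and MAR = FN / (FN + TP), both reported in percent; the benchmark's own evaluation code may differ in details. The function name `binary_metrics` and the sample labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return (F1, FAR %, MAR %) for binary labels, with 1 meaning anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    far = 100.0 * fp / (fp + tn) if fp + tn else 0.0  # share of normal points flagged
    mar = 100.0 * fn / (fn + tp) if fn + tp else 0.0  # share of anomalies missed
    return f1, far, mar


y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
print(binary_metrics(y_true, y_pred))

# A detector that never fires misses every anomaly but raises no alarms.
print(binary_metrics(y_true, [0] * len(y_true)))
```

Under these definitions, a detector that never fires scores F1 = 0, FAR = 0 % and MAR = 100 %, and a detector that reproduces the labels exactly scores F1 = 1, FAR = 0 % and MAR = 0 %.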

| Algorithm | F1 | FAR, % | MAR, % |
|---|---|---|---|
| Perfect detector | 1 | 0 | 0 |
| Conv-AE | 0.78 | 13.55 | 28.02 |
| MSET | 0.78 | 39.73 | 14.13 |
| T-squared+Q (PCA-based) | 0.76 | 26.62 | 24.92 |
| LSTM-AE | 0.74 | 29.96 | 25.92 |
| T-squared | 0.66 | 19.21 | 42.6 |
| LSTM-VAE | 0.56 | 9.13 | 55.03 |
| Vanilla LSTM | 0.54 | 12.54 | 59.53 |
| MSCRED | 0.36 | 49.94 | 69.88 |
| Vanilla AE | 0.39 | 2.59 | 75.15 |
| Isolation forest | 0.29 | 2.56 | 82.89 |
| Null detector | 0 | 0 | 100 |

Changepoint detection problem

Sorted by NAB (standard). For NAB (standard), NAB (LowFP), and NAB (LowFN), bigger is better; for Number of Missed CPs and Number of FPs, lower is better. The current leaderboard is obtained with the window size for the NAB detection algorithm equal to 60 seconds and placed to the right of the true change point.

| Algorithm | NAB (standard) | NAB (LowFP) | NAB (LowFN) | Number of Missed CPs | Number of FPs |
|---|---|---|---|---|---|
| Perfect detector | 100 | 100 | 100 | 0 | 0 |
| MSCRED | 32.42 | 16.53 | 40.28 | 55 | 342 |
| Isolation forest | 26.16 | 19.5 | 30.82 | 76 | 135 |
| T-squared+Q (PCA-based) | 25.35 | 14.51 | 31.33 | 72 | 232 |
| Conv-AE | 23.61 | 21.54 | 27.55 | 82 | 23 |
| LSTM-AE | 23.51 | 20.11 | 25.91 | 88 | 69 |
| T-squared | 19.54 | 10.2 | 24.31 | 70 | 106 |
| MSET | 13.84 | 10.22 | 17.37 | 96 | 66 |
| Vanilla AE | 11.41 | 6.53 | 13.91 | 103 | 106 |
| Vanilla LSTM | 11.31 | -3.8 | 17.25 | 90 | 342 |
| ArimaFD | -0.09 | -0.17 | -0.06 | 127 | 2 |
| Null detector | 0 | 0 | 0 | - | - |

The notebooks folder contains Jupyter notebooks with the code for reproducing the proposed leaderboard results. We have calculated the results for the following well-known anomaly detection algorithms:

The leaderboard additionally shows externally calculated results for the following algorithms:

Details regarding the algorithms, including short descriptions, references to scientific papers, and the code of the initial implementations, are available in this readme.

  1. install Python 3.10+ (tested on 3.10.13)

  2. install poetry package manager

    Poetry installs dependencies and locks versions for deterministic installs. Poetry uses Python's built-in venv module to create virtual environments. It also uses PEP 517 & 518 specifications to build packages without requiring setup.py or requirements.txt files.

  3. LightGBM base install

  4. install SKAB dependencies, see pyproject.toml for details

  5. confirm installation
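One way to sanity-check the environment after these steps is a short script such as the sketch below. It is not the project's own confirmation command: the helper `check_environment` is hypothetical, and the package list passed to it is an assumption (only `lightgbm` is named in the steps above; the full dependency list lives in pyproject.toml).

```python
import importlib.util
import sys


def check_environment(required=("lightgbm",), min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for name in required:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("environment looks fine" if not issues else "\n".join(issues))
```

Run it with the interpreter of the virtual environment that Poetry created, so that the check sees the same packages the notebooks will use.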

Please cite our project in your publications if it helps your research.

@misc{skab,
  author = {Katser, Iurii D. and Kozitsin, Vyacheslav O.},
  title = {Skoltech Anomaly Benchmark (SKAB)},
  year = {2020},
  publisher = {Kaggle},
  howpublished = {\url{https://www.kaggle.com/dsv/1693952}},
  DOI = {10.34740/KAGGLE/DSV/1693952}
}

SKAB is acknowledged by some ML resources.

