Loading...
Searching...
No Matches
An implementation of a median calculation of a sliding window along a data stream. More...
#include <cassert>
#include <cstdlib>
#include <ctime>
#include <list>
#include <set>
#include <vector>
Go to the source code of this file.
namespace probability Probability algorithms.An implementation of a median calculation of a sliding window along a data stream.
Given a stream of integers, the algorithm calculates the median of a fixed size window at the back of the stream. The leading time complexity of this algorithm is O(log(N), and it is inspired by the known algorithm to [find median from (infinite) data stream](https://www.tutorialcup.com/interview/algorithm/find-median-from-data-stream.htm), with the proper modifications to account for the finite window size for which the median is requested
AlgorithmThe sliding window is managed by a list, which guarantees O(1) for both pushing and popping. Each new value is pushed to the window back, while a value from the front of the window is popped. In addition, the algorithm manages a multi-value binary search tree (BST), implemented by std::multiset. For each new value that is inserted into the window, it is also inserted to the BST. When a value is popped from the window, it is also erased from the BST. Both insertion and erasion to/from the BST are O(logN) in time, with N the size of the window. Finally, the algorithm keeps a pointer to the root of the BST, and updates its position whenever values are inserted or erased to/from BST. The root of the tree is the median! Hence, median retrieval is always O(1)
Time complexity: O(logN). Space complexity: O(N). N - size of window
Definition in file windowed_median.cpp.
◆ size_type using probability::windowed_median::size_type = Window::size_typeDefinition at line 49 of file windowed_median.cpp.
◆ Window using probability::windowed_median::Window = std::list<int>Definition at line 48 of file windowed_median.cpp.
◆ main()Main function.
A few fixed test cases
Array of sorted values; odd window size
Array of sorted values - decreasing; odd window size
Even window size
Array with repeating values
Array with same values except one
Array that includes repeating values including negatives
Array with large values - sum of few pairs exceeds MAX_INT. Window size is even - testing calculation of average median between two middle values
Random test cases
Array size in the range [5, 20]
Window size in the range [3, 10]
Random array values (positive/negative)
Testing randomized test
Definition at line 196 of file windowed_median.cpp.
196 {
198 test({1, 2, 3, 4, 5, 6, 7, 8, 9},
199 3);
200 test({9, 8, 7, 6, 5, 4, 3, 2, 1},
201 3);
202 test({9, 8, 7, 6, 5, 4, 5, 6}, 4);
203 test({3, 3, 3, 3, 3, 3, 3, 3, 3}, 3);
204 test({3, 3, 3, 3, 7, 3, 3, 3, 3}, 3);
205 test({4, 3, 3, -5, -5, 1, 3, 4, 5},
206 5);
207
211 test({470211272, 101027544, 1457850878, 1458777923, 2007237709, 823564440,
212 1115438165, 1784484492, 74243042, 114807987},
213 6);
214
216 std::srand(static_cast<unsigned int>(std::time(nullptr)));
217 std::vector<int> vals;
218 for (int i = 8; i < 100; i++) {
219 const auto n =
220 1 + std::rand() /
221 ((RAND_MAX + 5u) / 20);
222 auto windowSize =
223 1 + std::rand() / ((RAND_MAX + 3u) /
224 10);
225 vals.clear();
226 vals.reserve(n);
227 for (int i = 0; i < n; i++) {
228 vals.push_back(
229 rand() - RAND_MAX);
230 }
231 test(vals, windowSize);
232 }
233 return 0;
234}
static void test()
Self-test implementations.
◆ test() void test ( const std::vector< int > & vals, int windowSize ) staticSelf-test implementations.
Comparing medians: efficient function vs. Naive one
Definition at line 182 of file windowed_median.cpp.
182 {
184 for (const auto val : vals) {
185 windowedMedian.insert(val);
186
188 assert(windowedMedian.getMedian() == windowedMedian.getMedianNaive());
189 }
190}
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4