Showing content from https://github.com/OHDSI/WebAPI/issues/1353 below:

Make cohort caching backwards compatible with 2.7.4 · Issue #1353 · OHDSI/WebAPI · GitHub

When I started pulling in latest changes related to caching from master (and identifying some problems in #1351, PR #1352), I had serious difficulty switching back and forth between branches for 2.7.4 and 2.8.0. The primary reason was the switch from 'cohort_definition_id' to 'generation_id' across many of the results tables. While we can handle this in a versioned update in WebAPI, many other people who depend on these results schema structures will be completely taken by surprise by these changes.

Therefore, let's apply the improvements that were made in the cohort caching changes, but with some consideration for backwards compatibility.

To this end, I suggest the following results schema:

Tables:
cohort_cache: same structure as cohort, but with generation_id as the key. I think this should actually be a design hashId.
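To make the proposed shape concrete, here is a minimal sketch using SQLite through Python's standard library. The cohort columns follow the usual OMOP cohort table; the `design_hash` column name, the column types, and the helper names are assumptions for illustration, not the actual WebAPI DDL.

```python
import sqlite3

# Assumed DDL: cohort keeps its 2.7.4 key (cohort_definition_id);
# cohort_cache has the same payload columns but is keyed on a design hash.
SCHEMA = """
CREATE TABLE cohort (
    cohort_definition_id INTEGER NOT NULL,
    subject_id           INTEGER NOT NULL,
    cohort_start_date    TEXT    NOT NULL,
    cohort_end_date      TEXT    NOT NULL
);
CREATE TABLE cohort_cache (
    design_hash          TEXT    NOT NULL,  -- hypothetical column name
    subject_id           INTEGER NOT NULL,
    cohort_start_date    TEXT    NOT NULL,
    cohort_end_date      TEXT    NOT NULL
);
"""

def create_schema(conn):
    """Create both tables in the given connection."""
    conn.executescript(SCHEMA)

def column_names(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = sqlite3.connect(":memory:")
create_schema(conn)
print(column_names(conn, "cohort"))
print(column_names(conn, "cohort_cache"))
```

The only difference between the two tables is the key column, so anything that reads the cohort table in 2.7.4 is unaffected by the extra cache table.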

The following tables would follow the same pattern:

Then, in the 2.8.0 branch, when we want to generate a cohort that has already been cached, we read from cohort_cache/cohort_inclusion_* and write it all back into the cohort tables. Cohort characterization would work similarly, but it would write into the local_cohort tables from the contents of cohort_cache.
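A rough sketch of the cache-hit path described above, again in Python with SQLite. The simplified table shapes, the `design_hash` column, and the `restore_from_cache` helper are all hypothetical; the inclusion tables would need the same treatment and are left out here.

```python
import sqlite3

COLUMNS = "subject_id, cohort_start_date, cohort_end_date"

def make_tables(conn):
    # Simplified, assumed table shapes; the real results schema has more tables.
    conn.executescript("""
        CREATE TABLE cohort (cohort_definition_id INTEGER, subject_id INTEGER,
                             cohort_start_date TEXT, cohort_end_date TEXT);
        CREATE TABLE cohort_cache (design_hash TEXT, subject_id INTEGER,
                                   cohort_start_date TEXT, cohort_end_date TEXT);
    """)

def restore_from_cache(conn, design_hash, cohort_definition_id):
    """Replace the cohort rows for one definition with the cached rows."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM cohort WHERE cohort_definition_id = ?",
                     (cohort_definition_id,))
        conn.execute(
            f"INSERT INTO cohort (cohort_definition_id, {COLUMNS}) "
            f"SELECT ?, {COLUMNS} FROM cohort_cache WHERE design_hash = ?",
            (cohort_definition_id, design_hash),
        )
    count = conn.execute(
        "SELECT COUNT(*) FROM cohort WHERE cohort_definition_id = ?",
        (cohort_definition_id,),
    ).fetchone()[0]
    return count

conn = sqlite3.connect(":memory:")
make_tables(conn)
conn.executemany(
    "INSERT INTO cohort_cache VALUES (?, ?, ?, ?)",
    [("abc123", 1, "2020-01-01", "2020-06-30"),
     ("abc123", 2, "2020-02-15", "2020-03-01")],
)
# A stale row from an earlier generation of cohort definition 42.
conn.execute("INSERT INTO cohort VALUES (42, 99, '2019-01-01', '2019-01-02')")
restored = restore_from_cache(conn, "abc123", 42)
```

After the call, the cohort table holds only the cached rows for definition 42, keyed on cohort_definition_id as 2.7.4 expects.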

Switching back to 2.7.4, the 2.7.4 codebase doesn't even know about the cache tables. It sees the same cohort tables (keyed on cohort_definition_id) and writes records to them directly, instead of deferring to the cache (if one exists).

In 2.8.0, when a cohort is newly generated (no cache exists), we can continue to generate records into the results.cohort table, but afterwards write those records into cohort_cache keyed on the design ID that generated those results. The benefit here is that people don't have to look up a generation_id cross-referenced to a cohort_definition_id table; they can just look for the results in the results table. The cache is still leveraged on generation, but users don't need to know about it.
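The cache-miss path could look roughly like the following sketch. The `run_generation` callback stands in for whatever actually builds the cohort and is purely hypothetical, as are the simplified tables, the `design_hash` column, and the helper names.

```python
import sqlite3

COLUMNS = "subject_id, cohort_start_date, cohort_end_date"

def make_tables(conn):
    # Simplified, assumed table shapes for illustration only.
    conn.executescript("""
        CREATE TABLE cohort (cohort_definition_id INTEGER, subject_id INTEGER,
                             cohort_start_date TEXT, cohort_end_date TEXT);
        CREATE TABLE cohort_cache (design_hash TEXT, subject_id INTEGER,
                                   cohort_start_date TEXT, cohort_end_date TEXT);
    """)

def cache_exists(conn, design_hash):
    # Note: a cohort that legitimately has zero rows would look uncached here;
    # a real implementation would need a separate marker for that case.
    row = conn.execute(
        "SELECT 1 FROM cohort_cache WHERE design_hash = ? LIMIT 1",
        (design_hash,),
    ).fetchone()
    return row is not None

def generate_and_cache(conn, cohort_definition_id, design_hash, run_generation):
    """Generate into cohort as before, then copy the rows into cohort_cache."""
    with conn:
        conn.execute("DELETE FROM cohort WHERE cohort_definition_id = ?",
                     (cohort_definition_id,))
        run_generation(conn, cohort_definition_id)  # writes results.cohort rows
        conn.execute("DELETE FROM cohort_cache WHERE design_hash = ?",
                     (design_hash,))
        conn.execute(
            f"INSERT INTO cohort_cache (design_hash, {COLUMNS}) "
            f"SELECT ?, {COLUMNS} FROM cohort WHERE cohort_definition_id = ?",
            (design_hash, cohort_definition_id),
        )

def fake_generation(conn, cohort_definition_id):
    # Placeholder for the real cohort generation SQL.
    conn.execute("INSERT INTO cohort VALUES (?, 7, '2021-01-01', '2021-12-31')",
                 (cohort_definition_id,))

conn = sqlite3.connect(":memory:")
make_tables(conn)
if not cache_exists(conn, "abc123"):
    generate_and_cache(conn, 42, "abc123", fake_generation)
```

Readers of the results schema still find the rows under cohort_definition_id 42; the copy into cohort_cache is a side effect they never have to look at.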

Hopefully this is clear. Note: there is some complexity in handling a cohort generated with inclusion stats vs. one generated without them (those records are written into the cohort_inclusion_* tables), but I'm fairly certain we could handle this if we encoded the 'generate stats' option into the design hash.
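One way to fold the 'generate stats' option into the hash, sketched in Python. This is not how WebAPI computes its hash; the function name, the payload keys, and the choice of SHA-256 over canonical JSON are assumptions made for illustration.

```python
import hashlib
import json

def design_hash(expression: dict, generate_stats: bool) -> str:
    """Hash the cohort design together with the generate-stats option.

    Sorting keys and fixing separators makes the JSON canonical, so two
    equal designs hash the same regardless of key order.
    """
    payload = {"expression": expression, "generateStats": generate_stats}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# A made-up design, for illustration only.
design = {"PrimaryCriteria": {"ObservationWindow": {"PriorDays": 0}}}
with_stats = design_hash(design, generate_stats=True)
without_stats = design_hash(design, generate_stats=False)
```

Because the flag is part of the hashed payload, a run with inclusion stats and a run without them land under different cache keys, so a stats-less cache entry is never mistaken for one that also populated the cohort_inclusion_* tables.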

