Showing content from https://github.com/zarr-developers/zarr-specs/issues/82 below:

Content-addressable storage transformer (v3 protocol extension) · Issue #82 · zarr-developers/zarr-specs · GitHub

This issue describes a concept for a Zarr v3 protocol extension that enables content-addressable storage to be layered on top of any underlying store. It is a thought experiment only, not a concrete proposal. It elaborates on the suggestions made in #76.

Goals:

The protocol extension introduces a layer of indirection to the storage protocol. This can be thought of as a transformation layer which sits above the store and modifies the key/value store operations.

When attempting to store a given (key, value) pair, the storage transformer hashes the value, and the hash is used to obtain a content-addressable key.

E.g., when storing an encoded chunk value under the key 'data/foo/bar/0.0', if the hash of the value is 'abcdefghijklmnopqrstuvwxyz', then the content-addressable key would be something like 'content/a/b/c/d/efghijklmnopqrstuvwxyz'. The way the transformer generates content-addressable keys could be configured with a depth, a width and a hash algorithm.
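
A minimal sketch of how such a key could be derived, assuming that "depth" means the number of directory levels and "width" the number of hash characters per level. The function names `split_digest` and `content_key` and the `prefix` parameter are hypothetical and not part of any Zarr API:

```python
import hashlib


def split_digest(digest, depth=4, width=1, prefix="content"):
    """Turn a hash digest string into a nested content-addressable key.

    Assumption: the first `depth * width` characters become `depth`
    directory levels of `width` characters each; the rest is the leaf name.
    """
    levels = [digest[i * width:(i + 1) * width] for i in range(depth)]
    remainder = digest[depth * width:]
    return "/".join([prefix, *levels, remainder])


def content_key(value, algorithm="sha256", depth=4, width=1):
    """Hash the value bytes and derive the content-addressable key."""
    digest = hashlib.new(algorithm, value).hexdigest()
    return split_digest(digest, depth, width)


# Reproduces the example from the text.
print(split_digest("abcdefghijklmnopqrstuvwxyz"))
# → content/a/b/c/d/efghijklmnopqrstuvwxyz
```

Because the key depends only on the value bytes, two chunks with identical content map to the same content-addressable key.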

The transformer would then issue a request to the underlying store to set (content-addressable-key, value).

To keep track of the location of the content, a content metadata document would be created, which records the content-addressable key together with any other metadata, such as a creation timestamp. This could be a JSON document, and it could record multiple versions of the content. E.g.:

[
    {
        "address": "content/a/b/c/d/efghijklmnopqrstuvwxyz",
        "timestamp": 1593171939
    },
    {
        "address": "content/z/y/x/w/vutsrqponmlkjihgfedcba",
        "timestamp": 32503680000
    },
    ...
]

This JSON document could be stored under the original key, in this case 'data/foo/bar/0.0'.
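
The write path described so far can be sketched as follows. This is only an illustration under stated assumptions: the class `ContentAddressableTransformer`, its method names, and the use of a plain `dict` as the underlying store are hypothetical, and appending to the version list is one possible way to record multiple versions:

```python
import hashlib
import json
import time


class ContentAddressableTransformer:
    """Hypothetical transformer wrapping a dict-like key/value store."""

    def __init__(self, store, algorithm="sha256", depth=4, width=1):
        self.store = store
        self.algorithm = algorithm
        self.depth = depth
        self.width = width

    def _content_key(self, value):
        # Assumed layout: `depth` levels of `width` hash characters each.
        digest = hashlib.new(self.algorithm, value).hexdigest()
        cut = self.depth * self.width
        levels = [digest[i:i + self.width] for i in range(0, cut, self.width)]
        return "/".join(["content", *levels, digest[cut:]])

    def set(self, key, value, timestamp=None):
        # 1. Store the value bytes under the content-addressable key.
        address = self._content_key(value)
        self.store[address] = value
        # 2. Load the content metadata document stored under the original
        #    key, if there is one, and append a new version entry.
        raw = self.store.get(key)
        versions = json.loads(raw) if raw is not None else []
        if timestamp is None:
            timestamp = int(time.time())
        versions.append({"address": address, "timestamp": timestamp})
        # 3. Write the updated metadata document back under the original key.
        self.store[key] = json.dumps(versions).encode("utf-8")


store = {}
transformer = ContentAddressableTransformer(store)
transformer.set("data/foo/bar/0.0", b"first version", timestamp=1593171939)
transformer.set("data/foo/bar/0.0", b"second version", timestamp=1593172000)
print(json.loads(store["data/foo/bar/0.0"]))
```

Note that the original key never holds chunk bytes in this sketch, only the JSON document pointing at them.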

In other words, when the transformer receives a request set(key, value), it does the following:

When the transformation layer receives a request to get(key), it does the following:

The transformation layer could also expose an API to set the time state for reading. E.g., if the time state is set to time T, then when the transformation layer receives a request to get(key), it does the following:
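
One possible reading of the read path, with and without a time state, is sketched below. It assumes that a read at time T should return the most recent version whose timestamp is not later than T, and that a read without a time state returns the most recent version overall. The function name `resolve` and the `time_state` parameter are hypothetical:

```python
import json


def resolve(store, key, time_state=None):
    """Return the value bytes for `key`, optionally as of `time_state`.

    Assumption: the metadata document under `key` is a JSON list of
    {"address": ..., "timestamp": ...} entries, as in the example above.
    """
    versions = json.loads(store[key])
    if time_state is not None:
        # Ignore versions created after the requested time state.
        versions = [v for v in versions if v["timestamp"] <= time_state]
    if not versions:
        raise KeyError(key)
    latest = max(versions, key=lambda v: v["timestamp"])
    # Follow the indirection to the content-addressable key.
    return store[latest["address"]]


# A tiny in-memory store with two versions of the same chunk.
store = {
    "content/a/b/c/d/old": b"old bytes",
    "content/e/f/g/h/new": b"new bytes",
    "data/foo/bar/0.0": json.dumps([
        {"address": "content/a/b/c/d/old", "timestamp": 100},
        {"address": "content/e/f/g/h/new", "timestamp": 200},
    ]).encode("utf-8"),
}
print(resolve(store, "data/foo/bar/0.0"))                  # latest version
print(resolve(store, "data/foo/bar/0.0", time_state=150))  # as of T=150
```

Raising `KeyError` when no version exists at or before T mirrors how a missing key behaves in a dict-like store; other behaviours are equally plausible.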

So that implementations can discover that the content-addressable storage transformer extension is in use, it could be declared in the zarr entry point metadata document, e.g., a zarr.json like:

{
    "zarr_format": "https://purl.org/zarr/spec/protocol/core/3.0",
    "metadata_encoding": "application/json",
    "extensions": [
        {
            "extension": "http://example.org/zarr/extension/content-addressable-storage-transformer",
            "must_understand": true,
            "configuration": {
                "algorithm": "sha256",
                "depth": 4,
                "width": 1
            }
        }
    ]
}

When a zarr v3 implementation opens a hierarchy using this extension, it could recognise the extension when parsing the entry point metadata document and insert the appropriate store transformer, if supported. I.e., a user opening such a hierarchy would not need to know that content-addressable storage was used; the implementation would discover that for itself.
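
A rough sketch of that discovery step, assuming the entry point document has the shape shown above. The function name `find_cas_configuration` is hypothetical, and treating a missing `must_understand` as true and raising on unsupported extensions are assumptions of this sketch, not something the issue specifies:

```python
import json

# Placeholder extension URI taken from the example document above.
CAS_EXTENSION = (
    "http://example.org/zarr/extension/"
    "content-addressable-storage-transformer"
)


def find_cas_configuration(entry_point_bytes, supported=(CAS_EXTENSION,)):
    """Return the transformer configuration, or None if it is not declared.

    Raises ValueError for an unsupported extension that is marked
    must_understand (assumed to default to True when absent).
    """
    meta = json.loads(entry_point_bytes)
    configuration = None
    for ext in meta.get("extensions", []):
        uri = ext.get("extension")
        if uri not in supported and ext.get("must_understand", True):
            raise ValueError(f"unsupported extension: {uri}")
        if uri == CAS_EXTENSION:
            configuration = ext.get("configuration", {})
    return configuration


entry_point = json.dumps({
    "zarr_format": "https://purl.org/zarr/spec/protocol/core/3.0",
    "metadata_encoding": "application/json",
    "extensions": [{
        "extension": CAS_EXTENSION,
        "must_understand": True,
        "configuration": {"algorithm": "sha256", "depth": 4, "width": 1},
    }],
}).encode("utf-8")

print(find_cas_configuration(entry_point))
```

An implementation could pass the returned configuration to whatever transformer it wraps around the underlying store before handing the store to the rest of the library.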

There are several potential advantages of this scheme:

Notes:

