
Showing content from https://github.com/jlevy/sidematter-format below:

jlevy/sidematter-format: A convention for metadata and asset files alongside any document

Sidematter format is a simple, universal convention for keeping metadata and assets alongside a primary document. It is a useful complement to frontmatter format.

Many tools and formats need structured data that is associated with a document but not stored inside it.

Sidematter format defines a minimal set of conventions for naming and resolving such “sidecar files” in a consistent way.

Sidecar patterns are often used in data pipelines, in exports from web browsers, and in other applications. Unfortunately, there’s no consistent convention for naming and organizing such external files, leading to varied, ad-hoc approaches that don’t interoperate well.

This repository contains a description of the format and a reference implementation. The implementation is in Python, but the format is simple and can be adopted by any tool or language.

Sidematter format does not specify a way to bundle the outputs, but a file plus its sidematter files can easily be bundled together in a zip or tarball.
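Since bundling is left open, here is one possible sketch of zipping a document with whatever sidematter sits next to it. It is not part of the reference implementation: the helper name `bundle_with_sidematter` is hypothetical, and it assumes the naming used throughout this document (`report.meta.json`, `report.meta.yml`, `report.assets/`).

```python
import zipfile
from pathlib import Path


def bundle_with_sidematter(doc: Path, bundle: Path) -> list[str]:
    """Zip a document plus any sidematter files found beside it.

    Hypothetical helper, not part of sidematter_format.
    Returns the archive member names that were written.
    """
    base = doc.with_name(doc.stem)  # e.g. dir/report.md -> dir/report
    candidates = [doc, Path(f"{base}.meta.json"), Path(f"{base}.meta.yml")]

    assets_dir = Path(f"{base}.assets")
    if assets_dir.is_dir():
        # Include every file under the assets directory, in a stable order.
        candidates += sorted(p for p in assets_dir.rglob("*") if p.is_file())

    written: list[str] = []
    with zipfile.ZipFile(bundle, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in candidates:
            if path.is_file():
                # Store paths relative to the document's directory so that
                # relative links like report.assets/figure1.png still resolve.
                arcname = path.relative_to(doc.parent).as_posix()
                zf.write(path, arcname)
                written.append(arcname)
    return written
```

Storing members relative to the document's directory keeps the relative asset links valid after the archive is unpacked anywhere.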

Tip

Sidematter format complements frontmatter format, which allows placing metadata within any text file. A good practice is to use frontmatter format for small metadata attached at the front of text files, and sidematter format for larger metadata, on binary files, or for additional file assets.

Sidematter format is easiest to illustrate by an example. Given a primary document report.md, some possible sidematter files would be:

report.md              # Primary document
report.meta.json       # JSON metadata
report.meta.yml        # YAML metadata (can use in addition to or instead of JSON)
report.assets/         # Asset directory
    figure1.png
    diagram.svg
    styles.css

The document and metadata can reference assets with relative paths:

# My Report

![Key findings](report.assets/figure1.png)

See the [full diagram](report.assets/diagram.svg) for details.
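Because these links are plain relative paths, a tool can check them without any special support. The following is a rough sketch, not part of the reference implementation: `asset_links` and `missing_assets` are hypothetical helpers, and the regular expression only handles simple inline Markdown links and images.

```python
import re
from pathlib import Path

# Captures the target of inline Markdown links and images:
# [text](target) or ![alt](target). Deliberately simple; it does not
# handle titles, angle-bracket targets, or reference-style links.
LINK_TARGET = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)\)")


def asset_links(markdown: str, doc: Path) -> list[str]:
    """Return link targets that point into the document's assets directory."""
    prefix = f"{doc.stem}.assets/"
    return [t for t in LINK_TARGET.findall(markdown) if t.startswith(prefix)]


def missing_assets(markdown: str, doc: Path) -> list[str]:
    """Return asset links whose files do not exist next to the document."""
    return [t for t in asset_links(markdown, doc) if not (doc.parent / t).is_file()]
```

A check like this could run before publishing or bundling a document, so that broken asset references are caught early.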

Example metadata content:

# report.meta.yml
title: Q3 Financial Analysis
author: Jane Doe
created_at: 2024-01-15
tags:
  - finance
  - quarterly
  - analysis
processing_history:
  - step: data_extraction
    timestamp: 2024-01-15T10:30:00Z
    tool: custom_extractor_v2.1
  - step: analysis
    timestamp: 2024-01-15T11:45:00Z
    tool: pandas_analyzer
image_files:
  - report.assets/figure1.png
  - report.assets/diagram.svg

Metadata must be in JSON or YAML, and the choice is flexible. JSON is often better when the metadata is read by machines, such as a frontend serving system. YAML is preferable for manual editing. An implementation should look for both formats so that it reads the metadata seamlessly in either layout. If both are present, the convention is to prefer the JSON file.

If desired, sidecar metadata can also be omitted. Another good pattern is to put the metadata in frontmatter format (simple YAML metadata inserted as frontmatter on the file itself) and omit it from the sidematter:

report.md              # Main file with frontmatter format metadata in YAML
report.assets/         # Asset directory with extra files
    figure1.png
    diagram.svg
    styles.css

The sidematter format defines naming conventions for files and directories related to a base document, which can be any file, with any name.

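Given a base document with filename basename.extension, the sidematter files are:

  - Metadata JSON: basename.meta.json

  - Metadata YAML: basename.meta.yml

  - Assets directory: basename.assets/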

The sidematter names are formed by dropping the final extension from the base document name, then appending the sidematter suffix. For example, report.md yields report.meta.json, report.meta.yml, and report.assets/.
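The naming rule can be sketched in a few lines of Python. This is an illustration based on a literal reading of the rule, not the reference implementation; the function name `sidematter_paths` and the dictionary keys are hypothetical.

```python
from pathlib import Path


def sidematter_paths(doc: Path) -> dict[str, Path]:
    """Derive sidematter names by dropping only the final extension.

    Hypothetical helper, not part of sidematter_format.
    """
    stem = doc.stem  # "report.md" -> "report", "archive.tar.gz" -> "archive.tar"
    return {
        "meta_json": doc.with_name(f"{stem}.meta.json"),
        "meta_yaml": doc.with_name(f"{stem}.meta.yml"),
        "assets_dir": doc.with_name(f"{stem}.assets"),
    }
```

Under this reading, a multi-extension name such as archive.tar.gz loses only .gz, giving archive.tar.meta.json; whether a given tool treats such names the same way is worth checking against its own behavior.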

In most cases, metadata should only reside in one place, typically basename.meta.yml. Implementations should observe precedence and pick metadata from the first location found in this order:

  1. Metadata JSON: basename.meta.json

  2. Metadata YAML: basename.meta.yml

  3. Optionally, implementations can look for frontmatter on the file itself (if it is a text file)
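The first two steps of this precedence order can be sketched with the standard library alone. This is a minimal illustration, not the reference implementation: `find_meta_path` is a hypothetical name, and the optional frontmatter fallback (step 3) is left out.

```python
from pathlib import Path
from typing import Optional


def find_meta_path(doc: Path) -> Optional[Path]:
    """Return the first metadata sidecar that exists, JSON before YAML.

    Hypothetical helper; the frontmatter fallback is intentionally omitted.
    """
    for suffix in (".meta.json", ".meta.yml"):
        candidate = doc.with_name(doc.stem + suffix)
        if candidate.is_file():
            return candidate
    return None
```

Returning the path rather than parsed data keeps the sketch free of a YAML dependency; a caller would then parse the file according to its extension.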

The Python package provides a simple reference implementation for reading and writing sidematter.

Reading Sidematter Metadata and Assets

from pathlib import Path

from sidematter_format import Sidematter

# Read all sidematter for a document by checking the filesystem.
# Returns an immutable ResolvedSidematter.
paths = Sidematter(Path("report.md")).resolve()
print(paths.primary)  # Path('report.md')
print(paths.meta)  # {'title': 'Q3 Report', 'author': 'Jane Doe', ...}
print(paths.meta_path)  # Path('report.meta.yml') or None
print(paths.assets_path)  # Path('report.assets') or None

Writing Sidematter Metadata and Assets

from pathlib import Path

from sidematter_format import Sidematter

# Create a Sidematter object for read/write operations
sm = Sidematter(Path("report.md"))

# Write metadata as YAML (default)
metadata = {
    "title": "Q3 Financial Analysis",
    "author": "Jane Doe",
    "tags": ["finance", "quarterly"]
}
sm.write_meta(metadata)

# Write metadata as JSON
sm.write_meta(metadata, fmt="json")

# Write pre-formatted YAML/JSON string
sm.write_meta("title: My Report\nauthor: Jane Doe\n")

# Remove all metadata files
sm.write_meta(None)

# Get the path for an asset (creates .assets/ directory)
chart_path = sm.asset_path("chart.png")
# Returns: Path('report.assets/chart.png')

# Copy a file into the assets directory
sm.add_asset("~/Downloads/chart.png")
# or with a custom name:
sm.add_asset("~/Downloads/fig1.png", dest_name="chart.png")

# Check if assets directory exists
if sm.resolve_assets():
    print(f"Assets found at: {sm.assets_dir}")

For how to install uv and Python, see installation.md.

For development workflows, see development.md.

For instructions on publishing to PyPI, see publishing.md.

This project was built from simple-modern-uv.

