Showing content from https://github.com/repology/repology-rules below:

repology/repology-rules: Package normalization ruleset for Repology

There can be a huge discrepancy in how packages for a single project are named and versioned in different repositories, so Repology needs a flexible ruleset in order to overcome the differences, match packages, and make versions comparable.

You are welcome to submit pull requests with the rules you need. Here are some quick pointers on how to add specific rules:

- You want to merge differently named packages into a single entry?
- You want to mark incorrect versions of a specific package?
- You want to split different projects with the same name?

Things to know if you're submitting a pull request or have push access to this repository.

Rules are stored in a set of files in YAML format, a flexible human-friendly markup format for structured data. Each rule is a single item of a big array, and may be written in a single or multiple lines (depending on what's more convenient for the particular case). For example, the following rule renames etracer into extreme-tuxracer:

- { name: etracer, setname: extreme-tuxracer }

which is the same as:

- name: etracer
  setname: extreme-tuxracer

Each rule has a set of keywords which specify how a package is matched (by name, version, repository, category etc.) and how it is modified (package is renamed, version scheme is changed, flags are applied, etc.).

Rule order matters, as multiple rules may match a single package, and they are applied in order. Furthermore, changes applied by earlier rules affect further matches: for instance, if a package is renamed, the new name will be matched for the following rules.
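The effect of rule order can be illustrated with a minimal Python sketch. This is not Repology's actual implementation: the `apply_rules` helper, the dict representation of rules, and the second rule (which marks the renamed package as devel) are hypothetical and only model exact `name` matching.

```python
def apply_rules(package, rules):
    """Apply rules in order; each rule sees changes made by earlier rules."""
    pkg = dict(package)  # work on a copy so the input is left untouched
    for rule in rules:
        # Only exact-name matching is modelled in this sketch.
        if "name" in rule and pkg["name"] != rule["name"]:
            continue
        if "setname" in rule:
            pkg["name"] = rule["setname"]
        if "devel" in rule:
            pkg["devel"] = rule["devel"]
    return pkg


rules = [
    {"name": "etracer", "setname": "extreme-tuxracer"},
    # Hypothetical rule: matches only because the rule above already renamed the package.
    {"name": "extreme-tuxracer", "devel": True},
]

result = apply_rules({"name": "etracer", "version": "0.8"}, rules)
print(result)
```

If the two rules are swapped, the second one no longer matches, because the package is still called etracer when that rule is evaluated.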

While rules are basically arbitrary, it is practical to attribute each rule to a specific class of action, the most distinctive of which are:

The ruleset is split into several distinctive parts, mostly based on the functional class of rules described above. They are arranged in such a way that when adding a rule into a specific part you don't need to be aware of the rest of the ruleset.

This may seem complex, but in practice the most commonly used rulesets are 800, 850 and 900, which cleanly correspond to the three functional classes of rules described in the previous section.

Other parts of the ruleset may need attention when new repositories are introduced.

As already mentioned, the keywords that comprise rules are related to either matching packages, or modifying them. Below are detailed descriptions for all of them.

Each repository that Repology supports has a set of rulesets associated with it. For instance, all Debian-based distros have the ruleset debuntu. This may be used to only match packages in specific repositories, but without the need to chase a specific repository version. You may look up repositories and their details in the repos.d directory of the main Repology repository.

You may specify a list of rulesets to match any of them.

- { ruleset: freebsd, ... }

- { ruleset: [ arch, openbsd ], ... }

Disable rule matching for specified ruleset(s).

# applies to all Debian derivatives, but not Deepin
- { ruleset: debuntu, noruleset: deepin, ... }
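The interplay of the two keywords can be sketched in Python as follows. This is an illustrative model, not Repology's code: the helper names are hypothetical, and it assumes that a Deepin package carries both the debuntu and deepin rulesets, as the comment in the example suggests.

```python
def as_set(value):
    """Accept a single string or a list of strings, as the rule keywords do."""
    if value is None:
        return set()
    return {value} if isinstance(value, str) else set(value)


def ruleset_matches(rule, package_rulesets):
    """True if the package has any of `ruleset` and none of `noruleset`."""
    pkg = set(package_rulesets)
    wanted = as_set(rule.get("ruleset"))
    excluded = as_set(rule.get("noruleset"))
    if wanted and not (wanted & pkg):
        return False  # none of the requested rulesets is present
    return not (excluded & pkg)  # reject if any excluded ruleset is present


rule = {"ruleset": "debuntu", "noruleset": "deepin"}
print(ruleset_matches(rule, ["debuntu"]))            # a generic Debian derivative
print(ruleset_matches(rule, ["debuntu", "deepin"]))  # Deepin (assumed rulesets)
```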

Deprecated. Same as ruleset, and may simply be replaced with it.

Matches package category(ies). Note that category information is not available for all repositories, and each repository may have its own set of categories.

- { category: games, ... }

- { category: [ mail-client, mail-filter, mail-mta ], ... }

Matches package category(ies) against a regular expression. The whole category is matched; matching is case-insensitive.

- { categorypat: "emacs[0-9]+Packages" }

Matches package maintainer(s). The matching is case-insensitive.

- { maintainer: "nobody@nowhere.com" }

Match exact package name(s).

- { name: firefox, ... }

- { name: [postgresql-client, postgresql-server, postgresql-contrib], ... }

Matches package name against a regular expression. The whole name is matched. May contain captures.

- { namepat: "swig[0-9]+", ... }
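"The whole name is matched" means the pattern must cover the name from start to end. A rough Python equivalent, assuming Python-style regular expressions and a hypothetical `namepat_matches` helper, uses `re.fullmatch` rather than `re.search`:

```python
import re


def namepat_matches(pattern, name):
    """Whole-name match: the pattern must cover the entire package name."""
    return re.fullmatch(pattern, name) is not None


print(namepat_matches("swig[0-9]+", "swig30"))      # whole name fits the pattern
print(namepat_matches("swig[0-9]+", "swig30-doc"))  # trailing "-doc" is not covered

# Captures are available from the match object for later substitution.
match = re.fullmatch("aspell-dict-(.*)", "aspell-dict-en")
print(match.group(1))
```

A partial match such as `re.search("swig[0-9]+", "swig30-doc")` would succeed, which is why a name pattern does not need explicit `^` and `$` anchors.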

Matches exact package version(s).

- { name: firefox, ver: "50.0.1", ... }

The opposite of ver: matches if the package version is none of the specified version(s).

- { name: firefox, notver: ["50.0.1", "50.0.2"] }

Matches a package version against a regular expression. The whole version is matched. Note that you need to escape periods, which mean "any character" in regular expressions. Matching is case-insensitive.

- { name: firefox, verpat: "50\\.[0-9]+", ... }

- { name: firefox, verpat: "50\\..*", ... }
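The need for escaping can be checked with a short Python sketch. The `verpat_matches` helper is hypothetical and assumes Python-style regular expressions. Note that the YAML double-quoted string `"50\\.[0-9]+"` denotes the regular expression `50\.[0-9]+`, written here as a Python raw string.

```python
import re


def verpat_matches(pattern, version):
    """Whole-version, case-insensitive match."""
    return re.fullmatch(pattern, version, re.IGNORECASE) is not None


# Escaped period: only a literal "." is accepted after "50".
print(verpat_matches(r"50\.[0-9]+", "50.1"))
print(verpat_matches(r"50\.[0-9]+", "5001"))

# Unescaped period: "." matches any character, so "5001" slips through.
print(verpat_matches(r"50.[0-9]+", "5001"))
```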

Matches the number of components (dot-separated parts) of a version.

- { name: gimp, vercomps: 3, ...} # matches 1.2.3, but not 1.2 or 1.2.3.4
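Counting components can be approximated by splitting the version on dots. This is a simplified sketch with a hypothetical helper; the real implementation may treat unusual versions differently.

```python
def version_components(version):
    """Number of dot-separated parts in a version string."""
    return len(version.split("."))


for version in ("1.2", "1.2.3", "1.2.3.4"):
    # Only the three-component version satisfies `vercomps: 3`.
    print(version, version_components(version) == 3)
```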

Matches versions longer than a given number of components (dot-separated parts).

Mostly useful to match broken version schemes that add extra version components.

- { name: gimp, verlonger: 3, ...} # 2.9.8.12345 is something unofficial

vergt, verge, verlt, verle, vereq, verne

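Compares the package version to a given one and matches if it is greater than (vergt), greater than or equal to (verge), less than (verlt), less than or equal to (verle), equal to (vereq), or not equal to (verne) the given version.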

# match git >= 2.16
- { name: git, verge: "2.16", ...}

Be careful when using this with regard to pre-release versions: 1.0beta1 is less than 1.0, so it won't match verge: 1.0. You may use verpat instead.

relgt, relge, rellt, relle, releq, relne

Similar to the verXX family, but checks how a package version relates to a specified release. A release includes all pre-releases and post-releases with a given prefix; e.g. releq: "1.0" would match 1.0alpha1, 1.0, 1.0patch, 1.0.1, but not 0.99 and 1.1.
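The notion of a release can be approximated with a prefix check. The sketch below is a deliberately naive stand-in, not the comparison Repology actually performs: the `in_release` helper is hypothetical, and it only reproduces the examples listed above (it does not, for instance, treat 1 and 1.0 as the same release).

```python
def in_release(version, release):
    """Naive `releq` check: version starts with the release prefix."""
    if not version.startswith(release):
        return False
    rest = version[len(release):]
    # A digit right after the prefix means a different number, e.g. "1.01" vs "1.0".
    return not rest[:1].isdigit()


for version in ("1.0alpha1", "1.0", "1.0patch", "1.0.1", "0.99", "1.1"):
    print(version, in_release(version, "1.0"))
```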

Matches the package homepage against a regular expression. Note that unlike namepat and verpat, a partial match is allowed here. Also note that dots should be escaped with a double backslash, as . means "any character" in regular expressions.

- { name: firefox, wwwpat: "mozilla\\.org", ... }

Matches when a package homepage contains the given substring. This is usually more practical than wwwpat, as in most cases you just need to match a part of a URL and don't need complex patterns, and you don't need to worry about escaping here. Matching is case-insensitive.

- { name: firefox, wwwpart: "mozilla.org", ... }
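The difference between the two homepage matchers can be sketched in Python. Both helpers are hypothetical. Case handling for wwwpat is not stated above, so the regex flags are left at their defaults, and the second URL is made up for illustration.

```python
import re


def wwwpat_matches(pattern, homepage):
    """Partial regular expression match anywhere in the homepage URL."""
    return re.search(pattern, homepage) is not None


def wwwpart_matches(part, homepage):
    """Case-insensitive plain substring match; no escaping needed."""
    return part.lower() in homepage.lower()


real = "https://www.mozilla.org/firefox/"
lookalike = "https://mozillaxorg.example/"  # made-up URL

print(wwwpat_matches(r"mozilla\.org", real))
print(wwwpat_matches("mozilla.org", lookalike))   # unescaped "." matches "x"
print(wwwpart_matches("mozilla.org", lookalike))  # substring: "." is literal
```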

Matches when a package homepage is a sourceforge page for a given project name (https://<project>.sourceforge.net, https://sourceforge.net/project/<project> etc.):

- { name: aterm, sourceforge: aterm, ... }

Matches when a package summary contains a given substring. Useful as an alternative to wwwpart for cases where the package homepage is not available. Matching is case-insensitive.

- { name: firefox, summpart: "browser", ... }

Matches when a package has the p_is_patch flag set (see the p_is_patch action below).

Effectively renames the package. You may use the $0 placeholder to substitute the original name, or $1, $2 etc. to substitute the contents of the corresponding captures of the regular expression used in namepat. Note that you don't need to use either name or namepat for $0 to work, but you must have namepat with corresponding captures to use $1 and so on.

# etracer→extreme-tuxracer
- { name: etracer, setname: extreme-tuxracer }

# aspell-dict-en→aspell-en, aspell-dict-ru→aspell-ru etc.
- { namepat: "aspell-dict-(.*)", setname: "aspell-$1" }

# all packages in dev-perl Gentoo category are prepended `perl:`
# Locale-Msgfmt→perl:Locale-Msgfmt
- { ruleset: gentoo, category: dev-perl, setname: "perl:$0" }
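Placeholder expansion can be sketched as a small substitution function. This is an illustration under assumptions, not the actual implementation: `expand_placeholders` is a hypothetical helper, Python-style regular expressions are assumed, and only single-digit placeholders are handled.

```python
import re


def expand_placeholders(template, name, match=None):
    """Replace $0 with the name and $1..$9 with namepat captures."""
    def substitute(found):
        index = int(found.group(1))
        if index == 0:
            return name
        if match is None:
            raise ValueError("$%d requires a namepat with captures" % index)
        return match.group(index)

    return re.sub(r"\$([0-9])", substitute, template)


# $1 comes from the capture in namepat.
name = "aspell-dict-ru"
match = re.fullmatch("aspell-dict-(.*)", name)
print(expand_placeholders("aspell-$1", name, match))

# $0 works without any namepat.
print(expand_placeholders("perl:$0", "Locale-Msgfmt"))
```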

Changes the version of the package. As with setname, you may use the placeholders $0, $1, etc.

# remove bogus leading version component
- { verpat: "0\\.(.*)", setver: $1 }

Set to true to completely remove a package. It will not appear anywhere in Repology. Set to false to undo.

# a metapackage which does not refer to any real project, we don't need it
- { name: "x11-fonts", remove: true }

Set to true to mark the version of a matched package as a development or unstable version, so it does not make the latest stable version be marked as outdated. Set to false to undo.

# mark versions with odd second component as devel
- { name: gnome-terminal, verpat: "[0-9]+\\.[0-9]*[13579]\\..*", devel: true }
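The pattern from the example can be tried out directly in Python. The version numbers below are made up for illustration, and `is_devel` is a hypothetical helper that assumes whole-version matching as described for verpat.

```python
import re

# Same regular expression as in the rule above, written as a raw string.
DEVEL_PATTERN = r"[0-9]+\.[0-9]*[13579]\..*"


def is_devel(version):
    """True if the second version component ends in an odd digit."""
    return re.fullmatch(DEVEL_PATTERN, version) is not None


for version in ("3.27.1", "3.28.1", "3.27"):
    print(version, is_devel(version))
```

Note that the pattern requires a third component, so a bare two-component version such as 3.27 is not matched by this rule.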

A project may use two parallel versioning schemes, one of which contains additional version components, such as a build number:

0.17, 0.17.13509, 0.17.13541, 0.18, 0.18.16131

Normally, 0.18.16131 would be considered more recent than 0.18, but if these refer to the same version, this is not the desired behavior. In such a case, the version scheme containing extra components (i.e. the one which compares greater) may be marked as altver, which allows both 0.18 and 0.18.16131 to be considered the latest, and both to be marked as outdated by the presence of either 0.19 or 0.19.x.

- { name: freecad, verlonger: 3, altver: true }

Similar to altver, but for the case where versioning schemes do not have a common prefix and are totally incompatible:

3.2.1, 3207, 3.2.2, 3211

Marking either of the schemes with this flag results in completely independent processing, which would allow both 3.2.2 and 3211 to be treated as the newest version.

- { name: sublime-text, verpat: "[0-9]+", altscheme: true }

ignore, incorrect, untrusted, noscheme, snapshot, successor, debianism, rolling

Set to true to ignore specific package versions. This is meant for the cases where comparison is not possible - ignored versions are excluded from comparison and do not affect the status of other versions. There are multiple ignore flavors:

# Fedora was known to use "6.0.0" version before it was actually released
# mark as incorrect and prevent future problems
- { name: llvm, ver: "6.0.0", ruleset: fedora, incorrect: true }
- { name: llvm, ruleset: fedora, untrusted: true }

Set to true to indicate that this project uses the letter p in the version to indicate post- or patch releases. This fixes version comparison, as by default p is treated as a pre-release marker.

# sudo 1.8.21p2 > 1.8.21
- { name: sudo, p_is_patch: true }

Set to true to indicate that this project uses any letter in the version to indicate post-releases.

# rb here denotes a patchset, treat it as such
- { name: webalizer, verpat: ".*rb.*", any_is_patch: true }

Set to true to force the package version to compare lower than any other package version. Useful for handling an upstream versioning scheme change where new versions compare lower than legacy ones. Set to false to undo.

# when 0.20 follows 0.193:
- { ver: "0.193", sink: true }

Result: 0.20 (newest) > 0.193 (outdated)

Set to true to force the package to be outdated, even if it classifies as the most recent. Note that this does not lead to another version being selected as newest. Useful to convey that a version is outdated even when there are no newer versions (for instance, when a project is superseded by another project). Set to false to undo.

# when 0.20 follows 0.193:
- { ver: "0.193", outdated: true }

Result: 0.193 (outdated) > 0.20 (outdated)

Set to true to force the package to be legacy instead of outdated. Set to false to undo. Useful when a specific repository purposely contains an outdated version of a specific project for compatibility purposes.

- { name: ruby-slack-notifier-1, ruleset: aur, legacy: true }

Set to true to prevent the package from ever having legacy status. This is useful for marking packages which claim to be development versions, but are nevertheless outdated.

- { name: ffmpeg-git, nolegacy: true }

Output a given warning when matched.

# will catch unexpected versions
- { name: gtk, verpat: "1\\..*", setname: gtk1 }
- { name: gtk, verpat: "2\\..*", setname: gtk2 }
- { name: gtk, verpat: "3\\..*", setname: gtk3 }
- { name: gtk, verpat: "4\\..*", setname: gtk4 }
- { name: gtk, warning: "Neither of gtk1,2,3,4 - need a new rule or some weirdness is going on" }

# will trigger a warning if new project called "tesseract" appears
# ...or website changes, or just a package without website defined appears,
# so it'll require another condition
- { name: tesseract, setname: tesseract-game, wwwpart: tesseract.gg }
- { name: tesseract, setname: tesseract-ocr, wwwpart: tesseract-ocr }
- { name: tesseract, warning: "Please add rule for tesseract" }

Flavors are used to distinguish a set of packages denoting multiple versions of a project from a set of packages denoting multiple parts or variants of a project. Consider an example:

Flavors are plain strings and may be arbitrary, for example client and server in the last example. You may specify a flavor explicitly, or use the true value to make the flavor be taken from the package name.

- { name: postgresql-client, setname: postgresql, addflavor: client }
- { name: postgresql-server, setname: postgresql, addflavor: server }

# This works too
- { name: [postgresql-client, postgresql-server], setname: postgresql, addflavor: true }

Same as addflavor, but replaces the existing flavors instead of appending to the flavor list.

Set to true to remove all previously added flavors.

Set to true to stop ruleset processing right after the current rule.

Consider this a legacy feature; it should not be needed.

Takes a pattern and replacement strings, and applies them to the package name. Used for low-level normalization.

# slashes in package names are not allowed
- { replaceinname: { "/": "-" } }

# also useful for some repositories
- { replaceinname: { " ": "-" } }
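Combined, the two replacements behave roughly like the following Python sketch. It assumes the patterns are literal substrings rather than regular expressions, and `replace_in_name` is a hypothetical helper, as is the sample package name.

```python
def replace_in_name(name, replacements):
    """Apply each pattern -> replacement pair to the package name in turn."""
    for old, new in replacements.items():
        name = name.replace(old, new)
    return name


# Hypothetical raw package name containing both a slash and a space.
print(replace_in_name("games/foo bar", {"/": "-", " ": "-"}))
```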

Converts a package name to lowercase. This is called once at the very beginning of the ruleset. The purpose of having this as a rule action is to allow exceptions, e.g. packages which should be distinguished solely by the case of their names.

Changes the subrepo property of the package. As with setname, you may use the placeholders $0, $1, etc.

# split subrepo name from package name
- { namepat: "([^-]+)-(.*)", setsubrepo: $1, setname: $2 }

For additional flexibility, a mechanism exists to toggle some rules based on the previous rules.

Sets a virtual flag (arbitrary string) which only exists for the duration of rule processing, and may be checked in the following rules.

- { name: python, addflag: not_python_module }

Only matches if the specified flag is (or is not) set.

- { name: python, addflag: not_python_module }
...
# will add "python:" prefix to all packages in category "python",
# but not for "python" package
- { category: python, noflag: not_python_module, setname: "python:$0" }
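The flag mechanism can be modelled with a per-package set that lives only while the rules are being applied. The sketch below is illustrative only: the helper name and the dict representation are hypothetical, and just the name, category, addflag, noflag and setname keywords are handled.

```python
def apply_rules_with_flags(package, rules):
    """Apply rules in order, tracking virtual flags for this package only."""
    pkg = dict(package)
    flags = set()  # discarded once processing of this package ends
    for rule in rules:
        if "name" in rule and pkg["name"] != rule["name"]:
            continue
        if "category" in rule and pkg.get("category") != rule["category"]:
            continue
        if "noflag" in rule and rule["noflag"] in flags:
            continue  # flag was set by an earlier rule, so skip this one
        if "addflag" in rule:
            flags.add(rule["addflag"])
        if "setname" in rule:
            pkg["name"] = rule["setname"].replace("$0", pkg["name"])
    return pkg


rules = [
    {"name": "python", "addflag": "not_python_module"},
    {"category": "python", "noflag": "not_python_module", "setname": "python:$0"},
]

print(apply_rules_with_flags({"name": "python", "category": "python"}, rules))
print(apply_rules_with_flags({"name": "requests", "category": "python"}, rules))
```

Here the interpreter package keeps its name, while the hypothetical requests package in the same category receives the python: prefix.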

These annotations do not affect package processing, but are related to ruleset maintenance.

Indicates that a rule needs manual maintenance. For example, when a development version cannot be determined from the version schema, one would need to revisit and update the version occasionally.

- { name: tor, verge: "0.3.4", devel: true, maintenance: true }

Indicates that a rule should not be removed even if it doesn't match any packages. That is, a rule is likely to be useful sometime in the future.

Indicates that a rule may be removed if it doesn't match any packages.

GPLv3 or later, see COPYING.

