hashtag-regex offers a regular expression to match hashtag identifiers as per the Unicode Standard.
This repository contains a script that generates this regular expression based on the Unicode data. Because of this, the regular expression can easily be updated whenever the Unicode Standard changes.
Via npm:
npm install hashtag-regex
In Node.js:
const hashtagRegex = require('hashtag-regex'); // Note: because the regular expression has the global flag set, this module // exports a function that returns the regex rather than exporting the regular // expression itself, to make it impossible to (accidentally) mutate the // original regular expression. const text = ` #hashtag #© #🤷🏿♀️ (\uFF03\u{1F937}\u{1F3FF}\u200D\u2640\uFE0F) `; const regex = hashtagRegex(); let match; while (match = regex.exec(text)) { const hashtag = match[0]; console.log(`Matched sequence ${ hashtag } — code points: ${ [...hashtag].length }`); }
Console output:
Matched sequence #hashtag — code points: 8
Matched sequence #© — code points: 2
Matched sequence #🤷🏿♀️ — code points: 6
Matched sequence #🤷🏿♀️ — code points: 6
hashtag-regex is available under the MIT license.
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4