Natural language parser, for the English language, that produces nlcst.
This package exposes a parser that takes English natural language and produces a syntax tree.
When should I use this?
If you want to handle English natural language as syntax trees manually, use this.
Alternatively, you can use the retext plugin retext-english, which wraps this project to also parse natural language at a higher-level (easier) abstraction.
For Dutch or most Latin-script languages, you can instead use parse-dutch or parse-latin.
This package is ESM only. In Node.js (version 16+), install with npm:
npm install parse-english
In Deno with esm.sh:
import {ParseEnglish} from 'https://esm.sh/parse-english@7'
In browsers with esm.sh:
<script type="module">
  import {ParseEnglish} from 'https://esm.sh/parse-english@7?bundle'
</script>
Use
import {ParseEnglish} from 'parse-english'
import {inspect} from 'unist-util-inspect'

const tree = new ParseEnglish().parse(
  'Mr. Henry Brown: A hapless but friendly City of London worker.'
)

console.log(inspect(tree))
Yields:
RootNode[1] (1:1-1:63, 0-62)
└─0 ParagraphNode[1] (1:1-1:63, 0-62)
    └─0 SentenceNode[23] (1:1-1:63, 0-62)
        ├─0 WordNode[2] (1:1-1:4, 0-3)
        │   ├─0 TextNode "Mr" (1:1-1:3, 0-2)
        │   └─1 PunctuationNode "." (1:3-1:4, 2-3)
        ├─1 WhiteSpaceNode " " (1:4-1:5, 3-4)
        ├─2 WordNode[1] (1:5-1:10, 4-9)
        │   └─0 TextNode "Henry" (1:5-1:10, 4-9)
        ├─3 WhiteSpaceNode " " (1:10-1:11, 9-10)
        ├─4 WordNode[1] (1:11-1:16, 10-15)
        │   └─0 TextNode "Brown" (1:11-1:16, 10-15)
        ├─5 PunctuationNode ":" (1:16-1:17, 15-16)
        ├─6 WhiteSpaceNode " " (1:17-1:18, 16-17)
        ├─7 WordNode[1] (1:18-1:19, 17-18)
        │   └─0 TextNode "A" (1:18-1:19, 17-18)
        ├─8 WhiteSpaceNode " " (1:19-1:20, 18-19)
        ├─9 WordNode[1] (1:20-1:27, 19-26)
        │   └─0 TextNode "hapless" (1:20-1:27, 19-26)
        ├─10 WhiteSpaceNode " " (1:27-1:28, 26-27)
        ├─11 WordNode[1] (1:28-1:31, 27-30)
        │   └─0 TextNode "but" (1:28-1:31, 27-30)
        ├─12 WhiteSpaceNode " " (1:31-1:32, 30-31)
        ├─13 WordNode[1] (1:32-1:40, 31-39)
        │   └─0 TextNode "friendly" (1:32-1:40, 31-39)
        ├─14 WhiteSpaceNode " " (1:40-1:41, 39-40)
        ├─15 WordNode[1] (1:41-1:45, 40-44)
        │   └─0 TextNode "City" (1:41-1:45, 40-44)
        ├─16 WhiteSpaceNode " " (1:45-1:46, 44-45)
        ├─17 WordNode[1] (1:46-1:48, 45-47)
        │   └─0 TextNode "of" (1:46-1:48, 45-47)
        ├─18 WhiteSpaceNode " " (1:48-1:49, 47-48)
        ├─19 WordNode[1] (1:49-1:55, 48-54)
        │   └─0 TextNode "London" (1:49-1:55, 48-54)
        ├─20 WhiteSpaceNode " " (1:55-1:56, 54-55)
        ├─21 WordNode[1] (1:56-1:62, 55-61)
        │   └─0 TextNode "worker" (1:56-1:62, 55-61)
        └─22 PunctuationNode "." (1:62-1:63, 61-62)
API
This package exports the identifier ParseEnglish. There is no default export.
ParseEnglish()
Create a new parser.
ParseEnglish extends ParseLatin. See parse-latin for API docs.
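Because the parser returns a plain nlcst tree, it can be processed with ordinary recursion. The following is a minimal sketch, not part of this package: `sampleTree` is a hand-written, shortened copy of the first nodes of the tree shown under Use (positional info omitted), and `toText` and `collectWords` are hypothetical helper names. It assumes, as in nlcst, that literal nodes carry a `value` and parent nodes carry `children`.

```javascript
// Hand-written stand-in for the start of the example tree ("Mr. Henry Brown:").
// In real use this object would come from `new ParseEnglish().parse(...)`.
const sampleTree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {
              type: 'WordNode',
              children: [
                {type: 'TextNode', value: 'Mr'},
                {type: 'PunctuationNode', value: '.'}
              ]
            },
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Henry'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Brown'}]},
            {type: 'PunctuationNode', value: ':'}
          ]
        }
      ]
    }
  ]
}

// Serialize a node: literals return their `value`,
// parents concatenate the text of their children.
function toText(node) {
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(toText).join('')
}

// Collect the text of every WordNode, in document order.
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    words.push(toText(node))
  } else if (node.children) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

const words = collectWords(sampleTree)
console.log(words)
console.log(toText(sampleTree))
```

The same two helpers should work unchanged on a tree produced by the parser, since they only rely on `type`, `value`, and `children`.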
All of parse-latin is included, plus the following support specific to the English natural language:
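Unit abbreviations (tsp., tbsp., oz., ft., and more)
Time references (sec., min., tues., thu., feb., and more)
Business abbreviations (Inc. and Ltd.)
Social titles (Mr., Mmes., Sr., and more)
Rank and academic titles (Dr., Rep., Gen., Prof., Pres., and more)
Geographical abbreviations (Ave., Blvd., Ft., Hwy., and more)
American state abbreviations (Ala., Minn., La., Tex., and more)
Canadian province abbreviations (Alta., Qué., Yuk., and more)
English county abbreviations (Beds., Leics., Shrops., and more)
Common elision, the omission of letters (’n’, ’o, ’em, ’twas, ’80s, and more)
This package is fully typed with TypeScript. It exports no additional types.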
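In the tree shown under Use, the full stop of "Mr." is kept inside its WordNode instead of ending the sentence. As a rough sketch of how such words could be located in a parsed tree: the `sentence` literal below is hand-written from that output (positional info omitted) and `findDottedWords` is a hypothetical helper, not something this package exports. Whether every abbreviation listed above is represented the same way is an assumption that the example output does not confirm.

```javascript
// Hand-written from the start of the example output; not produced by the parser here.
const sentence = {
  type: 'SentenceNode',
  children: [
    {
      type: 'WordNode',
      children: [
        {type: 'TextNode', value: 'Mr'},
        {type: 'PunctuationNode', value: '.'}
      ]
    },
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'WordNode', children: [{type: 'TextNode', value: 'Henry'}]},
    {type: 'WhiteSpaceNode', value: ' '},
    {type: 'WordNode', children: [{type: 'TextNode', value: 'Brown'}]},
    {type: 'PunctuationNode', value: ':'}
  ]
}

// Return the text of each direct WordNode child whose last child is a "." PunctuationNode.
function findDottedWords(parent) {
  const found = []
  for (const node of parent.children) {
    if (node.type !== 'WordNode') continue
    const last = node.children[node.children.length - 1]
    if (last && last.type === 'PunctuationNode' && last.value === '.') {
      found.push(node.children.map((child) => child.value).join(''))
    }
  }
  return found
}

const dotted = findDottedWords(sentence)
console.log(dotted)
```

The check only looks at the shape of the tree, so it does not need its own list of abbreviations.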
Compatibility
Projects maintained by me are compatible with maintained versions of Node.js.
When I cut a new major release, I drop support for unmaintained versions of Node.js. This means I try to keep the current release line, parse-english@^7, compatible with Node.js 16.
This package is safe.
Related
parse-latin: Latin-script natural language parser
parse-dutch: Dutch natural language parser
Contribute
Yes please! See How to Contribute to Open Source.
License
MIT © Titus Wormer