Natural language parser, for the English language, that produces nlcst.
This package exposes a parser that takes English natural language and produces a syntax tree.
If you want to handle English natural language as syntax trees manually, use this.
Alternatively, you can use the retext plugin retext-english, which wraps this project to also parse natural language at a higher-level (easier) abstraction.
For Dutch or most Latin-script languages, you can instead use parse-dutch or parse-latin.
This package is ESM only. In Node.js (version 16+), install with npm:
npm install parse-english
In Deno with esm.sh:
import {ParseEnglish} from 'https://esm.sh/parse-english@7'
In browsers with esm.sh:
<script type="module">
  import {ParseEnglish} from 'https://esm.sh/parse-english@7?bundle'
</script>
import {ParseEnglish} from 'parse-english'
import {inspect} from 'unist-util-inspect'

const tree = new ParseEnglish().parse(
  'Mr. Henry Brown: A hapless but friendly City of London worker.'
)

console.log(inspect(tree))
Yields:
RootNode[1] (1:1-1:63, 0-62)
└─0 ParagraphNode[1] (1:1-1:63, 0-62)
    └─0 SentenceNode[23] (1:1-1:63, 0-62)
        ├─0 WordNode[2] (1:1-1:4, 0-3)
        │   ├─0 TextNode "Mr" (1:1-1:3, 0-2)
        │   └─1 PunctuationNode "." (1:3-1:4, 2-3)
        ├─1 WhiteSpaceNode " " (1:4-1:5, 3-4)
        ├─2 WordNode[1] (1:5-1:10, 4-9)
        │   └─0 TextNode "Henry" (1:5-1:10, 4-9)
        ├─3 WhiteSpaceNode " " (1:10-1:11, 9-10)
        ├─4 WordNode[1] (1:11-1:16, 10-15)
        │   └─0 TextNode "Brown" (1:11-1:16, 10-15)
        ├─5 PunctuationNode ":" (1:16-1:17, 15-16)
        ├─6 WhiteSpaceNode " " (1:17-1:18, 16-17)
        ├─7 WordNode[1] (1:18-1:19, 17-18)
        │   └─0 TextNode "A" (1:18-1:19, 17-18)
        ├─8 WhiteSpaceNode " " (1:19-1:20, 18-19)
        ├─9 WordNode[1] (1:20-1:27, 19-26)
        │   └─0 TextNode "hapless" (1:20-1:27, 19-26)
        ├─10 WhiteSpaceNode " " (1:27-1:28, 26-27)
        ├─11 WordNode[1] (1:28-1:31, 27-30)
        │   └─0 TextNode "but" (1:28-1:31, 27-30)
        ├─12 WhiteSpaceNode " " (1:31-1:32, 30-31)
        ├─13 WordNode[1] (1:32-1:40, 31-39)
        │   └─0 TextNode "friendly" (1:32-1:40, 31-39)
        ├─14 WhiteSpaceNode " " (1:40-1:41, 39-40)
        ├─15 WordNode[1] (1:41-1:45, 40-44)
        │   └─0 TextNode "City" (1:41-1:45, 40-44)
        ├─16 WhiteSpaceNode " " (1:45-1:46, 44-45)
        ├─17 WordNode[1] (1:46-1:48, 45-47)
        │   └─0 TextNode "of" (1:46-1:48, 45-47)
        ├─18 WhiteSpaceNode " " (1:48-1:49, 47-48)
        ├─19 WordNode[1] (1:49-1:55, 48-54)
        │   └─0 TextNode "London" (1:49-1:55, 48-54)
        ├─20 WhiteSpaceNode " " (1:55-1:56, 54-55)
        ├─21 WordNode[1] (1:56-1:62, 55-61)
        │   └─0 TextNode "worker" (1:56-1:62, 55-61)
        └─22 PunctuationNode "." (1:62-1:63, 61-62)
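Once you have such a tree, handling it "manually" usually means walking it. The following is a minimal sketch of that idea, not part of this package: `toText` and `collectWords` are hypothetical helpers, and the tree is a hand-written fragment that mirrors the first few nodes of the output above (positions omitted), so the snippet runs without any dependencies. In a real project, utilities such as unist-util-visit and nlcst-to-string cover the same ground.

```javascript
// Hand-written nlcst fragment mirroring the start of the tree above.
// Assumption: positional info is left out because the walk does not need it.
const tree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {
              type: 'WordNode',
              children: [
                {type: 'TextNode', value: 'Mr'},
                {type: 'PunctuationNode', value: '.'}
              ]
            },
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Henry'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Brown'}]},
            {type: 'PunctuationNode', value: ':'}
          ]
        }
      ]
    }
  ]
}

// Concatenate the `value` of every literal node at or below `node`.
function toText(node) {
  return 'children' in node ? node.children.map(toText).join('') : node.value
}

// Collect the text of every WordNode, depth first (hypothetical helper).
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    words.push(toText(node))
  } else if ('children' in node) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

console.log(collectWords(tree)) // [ 'Mr.', 'Henry', 'Brown' ]
```

Note that the full stop of `Mr.` is a child of the word node, so it stays attached to the word instead of ending the sentence.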
This package exports the identifier ParseEnglish. There is no default export.
Create a new parser.
ParseEnglish extends ParseLatin. See parse-latin for API docs.
All of parse-latin is included, and the following support for the English natural language:
unit abbreviations (tsp., tbsp., oz., ft., and more)
time references (sec., min., tues., thu., feb., and more)
business abbreviations (Inc. and Ltd.)
social titles (Mr., Mmes., Sr., and more)
rank and academic titles (Dr., Rep., Gen., Prof., Pres., and more)
geographical abbreviations (Ave., Blvd., Ft., Hwy., and more)
American state abbreviations (Ala., Minn., La., Tex., and more)
Canadian province abbreviations (Alta., Qué., Yuk., and more)
English county abbreviations (Beds., Leics., Shrops., and more)
common elisions, where letters are omitted (’n’, ’o, ’em, ’twas, ’80s, and more)
This package is fully typed with TypeScript. It exports no additional types.
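To see why the abbreviation support listed above matters, consider sentence splitting: a full stop after `Mr.` or `Inc.` should not end a sentence. The sketch below is a deliberately simplified, dependency-free illustration of that idea and does not reflect how parse-english is implemented. The `abbreviations` set and the `splitSentences` function are hypothetical, and the sample list is far smaller than what the package ships.

```javascript
// Tiny illustrative sample (assumption); the real lists are much larger.
const abbreviations = new Set(['mr', 'dr', 'inc', 'ltd', 'ave'])

// Split `text` at a full stop followed by whitespace, unless the word
// right before the full stop is a known abbreviation.
function splitSentences(text) {
  const sentences = []
  const boundary = /\.\s+/g
  let start = 0
  let match

  while ((match = boundary.exec(text))) {
    const before = text.slice(start, match.index)
    const lastWord = before.split(/\s+/).pop().toLowerCase()

    // `Mr.`, `Inc.`, and so on: keep scanning, this is not a boundary.
    if (abbreviations.has(lastWord)) continue

    sentences.push(text.slice(start, match.index + 1))
    start = match.index + match[0].length
  }

  // Whatever remains after the last boundary is the final sentence.
  if (start < text.length) sentences.push(text.slice(start))

  return sentences
}

console.log(
  splitSentences('Mr. Henry Brown works at Acme Inc. in London. He is friendly.')
)
// [ 'Mr. Henry Brown works at Acme Inc. in London.', 'He is friendly.' ]
```

A naive split on every full stop would instead cut this input after `Mr.` and `Inc.`, which is the kind of error the abbreviation lists help avoid.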
Projects maintained by me are compatible with maintained versions of Node.js.
When I cut a new major release, I drop support for unmaintained versions of Node. This means I try to keep the current release line, parse-english@^7, compatible with Node.js 16.
This package is safe.
parse-latin: Latin-script natural language parser
parse-dutch: Dutch natural language parser
Yes please! See How to Contribute to Open Source.