Natural language parser, for the Dutch language, that produces nlcst.
This package exposes a parser that takes Dutch natural language and produces a syntax tree.
If you want to handle Dutch natural language as syntax trees manually, use this.
Alternatively, you can use the retext plugin retext-dutch, which wraps this project to also parse natural language at a higher-level (easier) abstraction.
For English or most Latin-script languages, you can instead use parse-english or parse-latin.
This package is ESM only. In Node.js (version 16.0+), install with npm:
npm install parse-dutch
In Deno with esm.sh:
import {ParseDutch} from 'https://esm.sh/parse-dutch@7'
In browsers with esm.sh:
<script type="module">
  import {ParseDutch} from 'https://esm.sh/parse-dutch@7?bundle'
</script>
import {inspect} from 'unist-util-inspect'
import {ParseDutch} from 'parse-dutch'
const tree = new ParseDutch().parse(
  'Kunt U zich ’s morgens melden bij het afd. hoofd dhr. Venema?'
)
console.log(inspect(tree))
Yields:
RootNode[1] (1:1-1:62, 0-61)
└─0 ParagraphNode[1] (1:1-1:62, 0-61)
    └─0 SentenceNode[24] (1:1-1:62, 0-61)
        ├─0  WordNode[1] (1:1-1:5, 0-4)
        │    └─0 TextNode "Kunt" (1:1-1:5, 0-4)
        ├─1  WhiteSpaceNode " " (1:5-1:6, 4-5)
        ├─2  WordNode[1] (1:6-1:7, 5-6)
        │    └─0 TextNode "U" (1:6-1:7, 5-6)
        ├─3  WhiteSpaceNode " " (1:7-1:8, 6-7)
        ├─4  WordNode[1] (1:8-1:12, 7-11)
        │    └─0 TextNode "zich" (1:8-1:12, 7-11)
        ├─5  WhiteSpaceNode " " (1:12-1:13, 11-12)
        ├─6  WordNode[2] (1:13-1:15, 12-14)
        │    ├─0 PunctuationNode "’" (1:13-1:14, 12-13)
        │    └─1 TextNode "s" (1:14-1:15, 13-14)
        ├─7  WhiteSpaceNode " " (1:15-1:16, 14-15)
        ├─8  WordNode[1] (1:16-1:23, 15-22)
        │    └─0 TextNode "morgens" (1:16-1:23, 15-22)
        ├─9  WhiteSpaceNode " " (1:23-1:24, 22-23)
        ├─10 WordNode[1] (1:24-1:30, 23-29)
        │    └─0 TextNode "melden" (1:24-1:30, 23-29)
        ├─11 WhiteSpaceNode " " (1:30-1:31, 29-30)
        ├─12 WordNode[1] (1:31-1:34, 30-33)
        │    └─0 TextNode "bij" (1:31-1:34, 30-33)
        ├─13 WhiteSpaceNode " " (1:34-1:35, 33-34)
        ├─14 WordNode[1] (1:35-1:38, 34-37)
        │    └─0 TextNode "het" (1:35-1:38, 34-37)
        ├─15 WhiteSpaceNode " " (1:38-1:39, 37-38)
        ├─16 WordNode[2] (1:39-1:43, 38-42)
        │    ├─0 TextNode "afd" (1:39-1:42, 38-41)
        │    └─1 PunctuationNode "." (1:42-1:43, 41-42)
        ├─17 WhiteSpaceNode " " (1:43-1:44, 42-43)
        ├─18 WordNode[1] (1:44-1:49, 43-48)
        │    └─0 TextNode "hoofd" (1:44-1:49, 43-48)
        ├─19 WhiteSpaceNode " " (1:49-1:50, 48-49)
        ├─20 WordNode[2] (1:50-1:54, 49-53)
        │    ├─0 TextNode "dhr" (1:50-1:53, 49-52)
        │    └─1 PunctuationNode "." (1:53-1:54, 52-53)
        ├─21 WhiteSpaceNode " " (1:54-1:55, 53-54)
        ├─22 WordNode[1] (1:55-1:61, 54-60)
        │    └─0 TextNode "Venema" (1:55-1:61, 54-60)
        └─23 PunctuationNode "?" (1:61-1:62, 60-61)
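Once you have a tree of this shape, a common next step is to walk it and pull out the words. The following is a minimal, dependency-free sketch of that idea. The tree literal is hand-written and heavily trimmed (it is not real parser output), and `toText` and `collectWords` are hypothetical helper names, not part of parse-dutch. In a real project the tree would come from `new ParseDutch().parse(...)`, and existing utilities such as unist-util-visit or nlcst-to-string would likely be a better fit than hand-rolled recursion.

```javascript
// Hand-written, trimmed stand-in for parser output (an assumption for
// illustration). Real trees also carry `position` fields, omitted here.
const tree = {
  type: 'RootNode',
  children: [
    {
      type: 'ParagraphNode',
      children: [
        {
          type: 'SentenceNode',
          children: [
            {type: 'WordNode', children: [{type: 'TextNode', value: 'Kunt'}]},
            {type: 'WhiteSpaceNode', value: ' '},
            {
              // Elision: the apostrophe stays inside the word.
              type: 'WordNode',
              children: [
                {type: 'PunctuationNode', value: '’'},
                {type: 'TextNode', value: 's'}
              ]
            },
            {type: 'WhiteSpaceNode', value: ' '},
            {
              // Abbreviation: the full stop stays inside the word.
              type: 'WordNode',
              children: [
                {type: 'TextNode', value: 'afd'},
                {type: 'PunctuationNode', value: '.'}
              ]
            },
            {type: 'PunctuationNode', value: '?'}
          ]
        }
      ]
    }
  ]
}

// Hypothetical helper: concatenate the `value` of every leaf under `node`.
function toText(node) {
  if (typeof node.value === 'string') return node.value
  return (node.children || []).map(toText).join('')
}

// Hypothetical helper: collect the text of every WordNode, depth first.
function collectWords(node, words = []) {
  if (node.type === 'WordNode') {
    words.push(toText(node))
  } else if (node.children) {
    for (const child of node.children) collectWords(child, words)
  }
  return words
}

const words = collectWords(tree)
console.log(words) // [ 'Kunt', '’s', 'afd.' ]
```

Because the abbreviation's full stop and the elision's apostrophe are children of their WordNode, collecting words this way keeps `afd.` and `’s` intact instead of splitting them at the punctuation.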
This package exports the identifier ParseDutch. There is no default export.
Create a new parser.
ParseDutch extends ParseLatin. See parse-latin for API docs.
All of parse-latin is included, and the following support for the Dutch natural language:
- unit and time abbreviations (gr., sec., min., ma., vr., vrij., febr., mrt., and more)
- titles and other common abbreviations (Mr., Mv., Sr., Em., bijv., zgn., amb., and more)
- common elisions, where letters are omitted (d’, ’n, ’ns, ’t, ’s, ’er, ’em, ’ie, and more)
This package is fully typed with TypeScript. It exports no additional types.
Projects maintained by me are compatible with maintained versions of Node.js.
When I cut a new major release, I drop support for unmaintained versions of Node.js. This means I try to keep the current release line, parse-dutch@^7, compatible with Node.js 16.
This package is safe.
- parse-latin: Latin-script natural language parser
- parse-english: English natural language parser
Yes please! See How to Contribute to Open Source.