A RetroSearch Logo

Home - News ( United States | United Kingdom | Italy | Germany ) - Football scores

Search Query:

Showing content from https://github.com/carrotsearch/langid-java below:

carrotsearch/langid-java: Java port of langid.py (language identifier)

langid-java
-----------

A somewhat changed Java port of langid.py (https://github.com/saffsd/langid.py)

Usage
-----

Fetch a JAR from Maven repositories (or compile it yourself). Then check out
the javadocs of ILangIdClassifier and LangIdV3.

Memory and Speed
----------------

Don't get fooled by the size of the JAR archive. The data model is LZMA 
compressed and will take about ~10MB of RAM. Speed wise this implementation
should be faster than anything else out there; if you have very large texts
you can sub-sample, append those fragments and classify without processing
the entire content.

Quality
-------

At the moment the detection/ performance quality is identical to langid.py
(the model and the math code is identical, even if written in a slightly
different way to speed up computations in Java).


RetroSearch is an open source project built by @garambo | Open a GitHub Issue

Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo

HTML: 3.2 | Encoding: UTF-8 | Version: 0.7.4