flashlight/text: Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.

Flashlight Text: Fast, Lightweight Utilities for Text


Flashlight Text is a fast, minimal library for text-based operations. It features:

The Flashlight Text Python package, which contains the beam search decoder and Dictionary components, is available on PyPI:

pip install flashlight-text

To enable optional KenLM support for the decoder in Python, install KenLM via pip:

pip install git+https://github.com/kpu/kenlm.git

See the full Python binding documentation for examples and more.


At minimum, C++ compilation requires:

KenLM Support: If building with KenLM support, KenLM is required. To toggle KenLM support use the FL_TEXT_USE_KENLM CMake option or the USE_KENLM environment variable when building the Python bindings.

Tests: If building tests, Google Test >= 1.10 is required. The FL_TEXT_BUILD_TESTS CMake option toggles building tests.

Instructions for building and installing the Python bindings from source are provided in the Python bindings documentation.

Building the C++ project from source is simple:

git clone https://github.com/flashlight/text && cd text
cmake -S . -B build
cmake --build build --parallel
cd build && ctest && cd .. # run tests
cmake --install build # install at the CMAKE_INSTALL_PREFIX

To disable KenLM while building, pass -DFL_TEXT_USE_KENLM=OFF to CMake. To disable building tests, pass -DFL_TEXT_BUILD_TESTS=OFF.

KenLM can be downloaded and installed automatically if it is not found on the local system. The FL_TEXT_BUILD_STANDALONE option controls this behavior: if it is disabled, missing dependencies are not downloaded and built as part of the build.
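As a rough sketch, the options described above can be combined in a single configure step. This assumes the current directory is a checkout of the repository, as in the build steps earlier, and that KenLM support is left enabled, so KenLM must already be installed locally once automatic downloads are turned off.

```shell
# Skip the tests and do not download missing dependencies.
# KenLM support is not disabled here, so KenLM has to be discoverable on the system.
cmake -S . -B build \
  -DFL_TEXT_BUILD_TESTS=OFF \
  -DFL_TEXT_BUILD_STANDALONE=OFF
cmake --build build --parallel
```

Adding -DFL_TEXT_USE_KENLM=OFF to the same command removes the KenLM requirement entirely.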

Flashlight Text can also be installed and used downstream with the vcpkg package manager. The port contains an optional feature with which to build and install with KenLM support:

vcpkg install flashlight-text # no dependencies, or:
vcpkg install "flashlight-text[kenlm]" # install with KenLM

Adding Flashlight Text to a C++ Project

Given a simple project.cpp file that includes and links to Flashlight Text:

#include <iostream>

#include <flashlight/lib/text/dictionary/Dictionary.h>

int main() {
  fl::lib::text::Dictionary myDict("someFile.dict");
  std::cout << "Dictionary has " << myDict.entrySize()
            << " entries." << std::endl;
  return 0;
}

The following CMake configuration links Flashlight Text and sets include directories:

cmake_minimum_required(VERSION 3.10)
project(myProject CXX)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(myProject project.cpp)

find_package(flashlight-text CONFIG REQUIRED)
target_link_libraries(myProject PRIVATE flashlight::flashlight-text)

To link against the library providing KenLM support, use the flashlight::flashlight-text-kenlm imported target:

target_link_libraries(myProject
  PRIVATE
  flashlight::flashlight-text
  # transitively links KenLM
  flashlight::flashlight-text-kenlm
)
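For illustration, the downstream project above might be configured and built as follows. This is a sketch under assumptions: the install prefix is a hypothetical path, and the executable location shown applies to single-configuration generators such as Makefiles or Ninja.

```shell
# Hypothetical prefix: replace with the CMAKE_INSTALL_PREFIX used when installing Flashlight Text.
cmake -S . -B build -DCMAKE_PREFIX_PATH="$HOME/flashlight-text-install"
# If the library was installed through vcpkg, pass the vcpkg toolchain file instead:
#   cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE="$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake"
cmake --build build
# project.cpp opens someFile.dict by relative path, so run from the directory containing it.
./build/myProject
```

CMAKE_PREFIX_PATH tells find_package where to look for the installed package configuration, so it is only needed when the library is installed outside the default system search paths.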

Contact: jacobkahn@meta.com

Flashlight Text is actively developed. See CONTRIBUTING for more on how to help out.

You can cite Flashlight using:

@misc{kahn2022flashlight,
      title={Flashlight: Enabling Innovation in Tools for Machine Learning},
      author={Jacob Kahn and Vineel Pratap and Tatiana Likhomanenko and Qiantong Xu and Awni Hannun and Jeff Cai and Paden Tomasello and Ann Lee and Edouard Grave and Gilad Avidov and Benoit Steiner and Vitaliy Liptchinsky and Gabriel Synnaeve and Ronan Collobert},
      year={2022},
      eprint={2201.12465},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Flashlight Text is under an MIT license. See LICENSE for more information.

