CRAN: Package orderanalyzer

orderanalyzer: Extracting Order Position Tables from PDF-Based Order Documents

Functions for extracting text and tables from PDF-based order documents. The package provides an n-gram-based approach for identifying the language of an order document. It uses the R package 'pdftools' to extract the text from an order document. If the PDF contains only an image (because it is a scanned document), the R package 'tesseract' is used for OCR. The package also provides functionality for identifying and extracting order position tables in order documents based on a clustering approach.
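To illustrate the general idea behind n-gram-based language identification, here is a minimal Python sketch. It is not the package's implementation: the function names (`char_ngrams`, `cosine`, `identify_language`), the tiny sample sentences, and the choice of character trigrams with cosine similarity are all assumptions made for illustration. A real profile would be built from a much larger corpus per language.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Count character n-grams of a lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    padded = f" {cleaned} "  # pad so that word boundaries form n-grams too
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, deliberately tiny training samples (assumption, not package data).
SAMPLES = {
    "en": "please find our order below the delivery date and the quantity "
          "of each item with the unit price",
    "de": "bitte liefern sie die folgende bestellung mit der menge und dem "
          "preis pro stück zum liefertermin",
}

# One n-gram profile per language, built once from the samples.
PROFILES = {lang: char_ngrams(text) for lang, text in SAMPLES.items()}


def identify_language(text):
    """Return the language whose profile is most similar to the text."""
    query = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))


if __name__ == "__main__":
    print(identify_language("the order quantity and the price"))
    print(identify_language("die bestellung und der preis"))
```

Character n-grams are a common choice for this task because they capture language-specific letter sequences without requiring a dictionary, which helps when the extracted or OCR-derived text contains noise.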

Version: 1.0.0
Depends: R (≥ 4.3.0), tidyselect
Imports: data.table, dplyr, matrixcalc, quanteda, rlist, stringr, tibble, tidyr, utils, purrr, digest, lubridate
Suggests: pdftools, tesseract, xml2
Published: 2024-12-12
DOI: 10.32614/CRAN.package.orderanalyzer
Author: Michael Scholz [cre, aut], Joerg Bauer [aut]
Maintainer: Michael Scholz <michael.scholz at th-deg.de>
License: GPL-3
NeedsCompilation: no
SystemRequirements: Tesseract >= 5.0.0, libtesseract-dev (deb), tesseract-devel (rpm), libleptonica-dev (deb), leptonica-devel (rpm), tesseract-ocr-eng (deb), libpoppler-cpp-dev (deb), poppler-cpp-devel (rpm), poppler-data (rpm/deb), libxml2-dev (deb), libxml2-devel (rpm)
CRAN checks: orderanalyzer results

Please use the canonical form https://CRAN.R-project.org/package=orderanalyzer to link to this page.

