README

topics Overview

An R package for analyzing natural language that implements Differential Language Analysis using words, phrases, and topics. The topics package is part of the R Language Analysis Suite, which comprises talk, text and topics.


When using the topics package, please cite:

Ackermann L., Zhuojun G. & Kjell O.N.E. (2024). An R-package for visualizing text in topics. https://github.com/theharmonylab/topics. DOI:zenodo.org/records/11165378.

Installation

The topics package depends on Java, a separate programming language that must be installed on your system. Please start by downloading and installing it from www.java.com/en/download/. Then open R and run:

install.packages("devtools")
devtools::install_github("theharmonylab/topics")

# If you run into any installation problems, try installing rJava first.

# Before loading the library, consider setting this option (the value 5000 can be increased); without it, the code may run out of memory.
options(java.parameters = "-Xmx5000m")
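
As a rough pre-flight check, the sketch below sets the memory option first, looks for a Java executable, and only then loads the package. It assumes Java is exposed on the PATH, which is not guaranteed on every system, and the variable name `java_path` is introduced here for illustration only.

```r
# Set the JVM heap size before any Java-backed package is loaded;
# the option only takes effect if it is set before the JVM starts.
options(java.parameters = "-Xmx5000m")

# Look for a Java executable on the PATH (assumption: Java is exposed there).
java_path <- unname(Sys.which("java"))
if (nzchar(java_path)) {
  message("Java found at: ", java_path)
} else {
  message("No 'java' executable on the PATH; install Java or check your PATH.")
}

# Load topics only if it is already installed.
if (requireNamespace("topics", quietly = TRUE)) {
  library(topics)
}
```

If the option is set after the library has been loaded, restarting the R session and setting it first is the safest way to make it take effect.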
Table of Contents
  1. Overview
  2. Installation
  3. Usage
Overview

The pipeline is composed of the following steps:

1. Data Preprocessing
The data preprocessing step removes stopwords, punctuation, etc. and converts the data into a document-term matrix (DTM), which is the data format needed for the LDA model.
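
To make the idea of a DTM concrete, here is a minimal base R sketch. It does not use the topics package; the example documents, the tiny stopword list, and the helper `tokenize()` are all hypothetical and stand in for whatever the package does internally.

```r
# Three toy documents (hypothetical example data).
docs <- c(
  doc1 = "I feel happy, and the sun is shining!",
  doc2 = "The rain makes me feel sad.",
  doc3 = "Happy days: sun, sun and more sun."
)

# A tiny illustrative stopword list; real lists are much longer.
stopwords <- c("i", "and", "the", "is", "me", "more")

# Lowercase, replace punctuation with spaces, split on whitespace, drop stopwords.
tokenize <- function(text) {
  text <- tolower(text)
  text <- gsub("[[:punct:]]+", " ", text)
  tokens <- strsplit(trimws(text), "\\s+")[[1]]
  tokens[nzchar(tokens) & !tokens %in% stopwords]
}

tokens <- lapply(docs, tokenize)
vocab <- sort(unique(unlist(tokens)))

# One row per document, one column per term, cells hold term counts.
dtm <- t(vapply(
  tokens,
  function(tok) tabulate(match(tok, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(dtm) <- vocab

print(dtm)
```

Each row of `dtm` is a document and each column a term, so `dtm["doc3", "sun"]` holds the number of times "sun" occurs in the third document.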

2. Model Training
The model training step trains the LDA model on the DTM for a specified number of iterations and with a predefined number of topics.
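
The role of the two settings (number of topics, number of iterations) can be illustrated with a toy collapsed Gibbs sampler in base R. This is a teaching sketch under stated assumptions, not the trainer the topics package uses: the function `train_lda()`, its argument names, and the priors `alpha` and `beta` are hypothetical.

```r
# Toy collapsed Gibbs sampler for LDA (illustrative only).
train_lda <- function(dtm, num_topics, num_iterations = 100,
                      alpha = 0.1, beta = 0.01, seed = 1) {
  set.seed(seed)
  n_docs <- nrow(dtm)
  n_terms <- ncol(dtm)

  # Expand the count matrix into one (document, term) pair per token.
  cells <- which(dtm > 0, arr.ind = TRUE)
  counts <- dtm[cells]
  doc_id <- unname(rep(cells[, "row"], counts))
  term_id <- unname(rep(cells[, "col"], counts))
  n_tokens <- length(doc_id)

  # Random initial topic assignment and the matching count tables.
  z <- sample.int(num_topics, n_tokens, replace = TRUE)
  doc_topic <- matrix(0, n_docs, num_topics)
  topic_term <- matrix(0, num_topics, n_terms)
  for (i in seq_len(n_tokens)) {
    doc_topic[doc_id[i], z[i]] <- doc_topic[doc_id[i], z[i]] + 1
    topic_term[z[i], term_id[i]] <- topic_term[z[i], term_id[i]] + 1
  }
  topic_total <- rowSums(topic_term)

  for (iteration in seq_len(num_iterations)) {
    for (i in seq_len(n_tokens)) {
      d <- doc_id[i]
      w <- term_id[i]
      old <- z[i]
      # Remove the token from the counts, then resample its topic.
      doc_topic[d, old] <- doc_topic[d, old] - 1
      topic_term[old, w] <- topic_term[old, w] - 1
      topic_total[old] <- topic_total[old] - 1
      p <- (doc_topic[d, ] + alpha) * (topic_term[, w] + beta) /
        (topic_total + n_terms * beta)
      new <- sample.int(num_topics, 1, prob = p)
      z[i] <- new
      doc_topic[d, new] <- doc_topic[d, new] + 1
      topic_term[new, w] <- topic_term[new, w] + 1
      topic_total[new] <- topic_total[new] + 1
    }
  }

  # Smoothed estimates: phi = topic-term, theta = document-topic.
  phi <- (topic_term + beta) / (topic_total + n_terms * beta)
  theta <- (doc_topic + alpha) / (rowSums(doc_topic) + num_topics * alpha)
  colnames(phi) <- colnames(dtm)
  rownames(theta) <- rownames(dtm)
  list(phi = phi, theta = theta)
}

# Toy DTM (hypothetical counts): two "sunny" and two "rainy" documents.
dtm <- rbind(
  doc1 = c(4, 3, 0, 0),
  doc2 = c(3, 5, 1, 0),
  doc3 = c(0, 0, 4, 4),
  doc4 = c(0, 1, 3, 5)
)
colnames(dtm) <- c("happy", "sun", "sad", "rain")

model <- train_lda(dtm, num_topics = 2, num_iterations = 50)
print(round(model$phi, 2))
```

More iterations give the sampler more time to settle, and the number of topics fixes how many rows `phi` has; both have to be chosen before training.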

3. Model Inference
The model inference step uses the trained LDA model to infer the topic term distribution of the documents.
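
One way to picture inference is as estimating topic proportions for a document while keeping the trained topic-term matrix fixed. The base R sketch below reads the step that way, which is an interpretation rather than something the README states; the function `infer_topics()`, the matrix `phi`, and the prior `alpha` are hypothetical and not part of the topics package.

```r
# Estimate a document's topic proportions with the topic-term matrix held fixed
# (a simple EM-style "fold-in"; illustrative only).
infer_topics <- function(counts, phi, alpha = 0.1, iterations = 50) {
  k <- nrow(phi)
  theta <- rep(1 / k, k)
  for (iteration in seq_len(iterations)) {
    # Responsibility of each topic for each term, given the current theta.
    resp <- theta * phi
    resp <- sweep(resp, 2, colSums(resp), "/")
    # Re-estimate theta from the expected topic counts plus the prior.
    theta <- as.vector(resp %*% counts) + alpha
    theta <- theta / sum(theta)
  }
  theta
}

# Hypothetical trained topic-term matrix: rows are topics, columns are terms.
phi <- rbind(
  topic_1 = c(happy = 0.45, sun = 0.45, sad = 0.05, rain = 0.05),
  topic_2 = c(happy = 0.05, sun = 0.05, sad = 0.45, rain = 0.45)
)

# Term counts of a new document, in the same column order as phi.
new_doc <- c(happy = 5, sun = 4, sad = 0, rain = 1)

theta_new <- infer_topics(new_doc, phi)
print(round(theta_new, 2))
```

Because the new document mostly contains terms that the first topic favours, most of its estimated weight falls on that topic.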

4. Statistical Analysis
The statistical analysis step offers methods such as linear regression, binary regression, ridge regression, and correlation to analyze the relationship between the topics and the prediction variable. It is possible to control for a number of variables and to adjust the p-values for multiple comparisons.
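
The logic of this step can be sketched with base R alone: one linear regression per topic, a control variable in each model, and a correction of the resulting p-values. The data are simulated, the variable names (`theta`, `age`, `outcome`) are hypothetical, and the Holm correction is just one possible choice; none of this is the package's own interface.

```r
set.seed(42)
n <- 200

# Simulated document-topic proportions for three topics (rows sum to 1).
raw <- matrix(runif(n * 3), nrow = n)
theta <- raw / rowSums(raw)
colnames(theta) <- paste0("topic_", 1:3)

# Simulated control variable and an outcome driven mainly by topic_1.
age <- rnorm(n, mean = 40, sd = 10)
outcome <- 3 * theta[, "topic_1"] + 0.02 * age + rnorm(n, sd = 0.3)

# One regression per topic, controlling for age; keep the topic's p-value.
p_raw <- sapply(colnames(theta), function(topic) {
  fit <- lm(outcome ~ theta[, topic] + age)
  summary(fit)$coefficients[2, "Pr(>|t|)"]
})

# Adjust for testing several topics at once (Holm chosen as an example).
p_adjusted <- p.adjust(p_raw, method = "holm")

print(data.frame(p_raw = p_raw, p_adjusted = p_adjusted))
```

Because topic proportions sum to one within a document, topics are not independent of each other, which is worth keeping in mind when interpreting several significant topics at once.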

5. Visualization
The visualization step creates wordclouds of the significant topics found by the statistical analysis.

