A toolkit for generating & analyzing "slop" (over-represented lexical patterns) in LLM outputs.
Generate a standardised set of outputs from several models for downstream analysis.
Analyze a model's outputs for repetitive words, bigrams, trigrams, vocabulary complexity, and slop scores.
Aggregate findings across models to build canonical slop lists of over-represented words and phrases.
🌳 Phylogenetic Tree Building: Cluster models based on slop profile similarity using parsimony (PHYLIP) or hierarchical clustering.
https://colab.research.google.com/drive/1SQfnHs4wh87yR8FZQpsCOBL5h5MMs8E6?usp=sharing
Python 3.7+
The required Python dependencies are listed in `requirements.txt`. Install them via:

```bash
pip install -r requirements.txt
```
PHYLIP (optional): install with

```bash
sudo apt-get install phylip
```

Make sure the `pars` and `consense` executables are in your `PATH`, or specify `PHYLIP_PATH` in `.env`.

NLTK data (recommended): we use `punkt`, `punkt_tab`, `stopwords`, and `cmudict` for parts of the analysis. Download via:
```python
import nltk
nltk.download('punkt')
nltk.download('punkt_tab')
nltk.download('stopwords')
nltk.download('cmudict')
```
```
slop-forensics/
├── scripts/
│   ├── generate_dataset.py
│   ├── slop_profile.py
│   ├── create_slop_lists.py
│   ├── generate_phylo_trees.py
│   └── ...
├── slop_forensics/
│   ├── config.py
│   ├── dataset_generator.py
│   ├── analysis.py
│   ├── metrics.py
│   ├── phylogeny.py
│   ├── slop_lists.py
│   ├── utils.py
│   └── ...
├── data/
│   └── (internal data files for slop lists, e.g. slop_list.json, etc.)
├── results/
│   ├── datasets/
│   ├── analysis/
│   ├── slop_lists/
│   ├── phylogeny/
│   └── ...
├── .env.example
├── requirements.txt
├── README.md  ← You are here!
└── ...
```
Copy `.env.example` to `.env` and update the variables. In `.env`, set `OPENAI_API_KEY` to an OpenRouter or OpenAI-compatible key. Set `PHYLIP_PATH` if the `pars`/`consense` binaries are not in your `PATH`.

Example `.env` contents:

```bash
# .env
OPENAI_API_KEY=sk-or-v1-xxxxxx
OPENAI_BASE_URL="https://openrouter.ai/api/v1"
PHYLIP_PATH="/usr/local/bin"
```
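As a quick sanity check, the variables above can be parsed with a few lines of stdlib Python. This is only an illustrative sketch; the project itself may use a dedicated loader such as python-dotenv.

```python
# Minimal stdlib sketch of reading .env-style variables (illustrative only;
# the project may use a dedicated loader like python-dotenv).
def load_env(text):
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # skip blanks, comments, and malformed lines
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env

env = load_env(
    '# .env\n'
    'OPENAI_API_KEY=sk-or-v1-xxxxxx\n'
    'OPENAI_BASE_URL="https://openrouter.ai/api/v1"\n'
    'PHYLIP_PATH="/usr/local/bin"\n'
)
print(env["OPENAI_BASE_URL"])
```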
Note: If you are not using OpenRouter, you can point to another OpenAI-compatible service by changing `OPENAI_BASE_URL`.
Below is a typical workflow, using mostly defaults. Adjust paths/arguments as desired.
Note: several default parameters are pre-configured in `slop_forensics/config.py`.
Use `generate_dataset.py` to prompt the specified LLMs for story outputs:

```bash
python3 scripts/generate_dataset.py \
  --model-ids x-ai/grok-3-mini-beta,meta-llama/llama-4-maverick,meta-llama/llama-4-scout,google/gemma-3-4b-it \
  --generate-n 100
```
This writes `.jsonl` files to `results/datasets`, named like `generated_x-ai__grok-3-mini-beta.jsonl`, etc. Defaults can be adjusted in `slop_forensics/config.py`.

Once data is generated, run `slop_profile.py` to calculate word/bigram/trigram usage, repetition scores, slop scores, etc.
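Conceptually, the heart of a slop profile is n-gram counting over a model's outputs. Here is a minimal, hypothetical sketch; the real `slop_profile.py` computes many more metrics, and the stopword list and sample texts below are invented for illustration.

```python
# Toy "slop profile": count words and bigrams across a model's outputs and
# report the most frequent ones after stopword filtering. Illustrative only.
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "it"}

def slop_profile(texts, top_n=5):
    words, bigrams = Counter(), Counter()
    for text in texts:
        tokens = [t for t in re.findall(r"[a-z']+", text.lower())
                  if t not in STOPWORDS]
        words.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return {"top_words": words.most_common(top_n),
            "top_bigrams": bigrams.most_common(top_n)}

profile = slop_profile([
    "Her eyes were a testament to the tapestry of emotions.",
    "The tapestry of stars was a testament to the night.",
])
print(profile["top_words"])
```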
```bash
python3 scripts/slop_profile.py
```

This reads the `generated_*.jsonl` files in `results/datasets`, analyzes each, and writes results to:

- `results/analysis/slop_profile__{model}.json` (per-model detailed analysis)
- `results/slop_profile_results.json` (combined data for all models)

Pass `--input-dir`, `--analysis-output-dir`, and so on if you want to override defaults.

Use `create_slop_lists.py` to combine analysis results from multiple models to create a master "slop list":
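The aggregation step can be pictured as a vote across models: a term enters the canonical list if enough models over-use it. The threshold, model names, and words below are invented for illustration; `create_slop_lists.py` applies its own criteria.

```python
# Toy slop-list aggregation: keep words over-represented in at least half
# of the analyzed models. Names and threshold are illustrative only.
from collections import Counter

per_model_overused = {
    "model-a": {"tapestry", "testament", "delve"},
    "model-b": {"tapestry", "delve", "nuanced"},
    "model-c": {"delve", "whimsical"},
}

# count how many models over-use each word
counts = Counter(word for words in per_model_overused.values() for word in words)
threshold = len(per_model_overused) / 2
slop_list = sorted(w for w, c in counts.items() if c >= threshold)
print(slop_list)
```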
```bash
python3 scripts/create_slop_lists.py
```

This reads the `.json` files from `results/analysis/`, re-reads the corresponding model `.jsonl` files, and creates aggregated slop lists:

- `results/slop_lists/slop_list.json`: top over-represented single words
- `results/slop_lists/slop_list_bigrams.json`: over-represented bigrams
- `results/slop_lists/slop_list_trigrams.json`: over-represented trigrams
- `results/slop_lists/slop_list_phrases.jsonl`: top multi-word substrings actually extracted from text

Combining stylometric analysis with bioinformatics, we use our generated slop profiles to infer relationships between models purely from their outputs. With the `generate_phylo_trees.py` script, we create a pseudo-phylogenetic tree (via PHYLIP parsimony, with hierarchical clustering as a fallback).
The parsimony algorithm differs from hierarchical clustering in that it tries to infer lineage from the fewest number of mutations. Here, a mutation is the presence or absence of a given word/phrase in a model's over-represented list. For more info, see the next section.
```bash
python3 scripts/generate_phylo_trees.py
```

This reads the combined analysis file (`results/slop_profile_results.json`), builds a presence/absence matrix of top slop features (`PHYLO_TOP_N_FEATURES` total), runs `pars` (parsimony) and optionally `consense`, and writes outputs to `results/phylogeny/`, including `.png` images (both circular & rectangular) per model highlighting that model on the tree.

Purpose:
We analyze each model's outputs to find words, phrases, and patterns that are frequently overused: what we call "slop."
How we do it:
Result: We produce detailed profiles (saved as JSON files) showing which words and phrases each model repeats most, along with these metrics.
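As a rough illustration of one common vocabulary-complexity measure, the type-token ratio (unique words divided by total words) can be computed as below. The actual metrics in `metrics.py` may differ; this is only a sketch.

```python
# Type-token ratio: a simple vocabulary-complexity metric (illustrative only;
# the toolkit's metrics.py defines its own measures).
import re

def type_token_ratio(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

print(round(type_token_ratio("the cat sat on the mat"), 3))
```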
2. Slop List Creation: Making a Reference of Frequently Over-Used Words

Purpose:
We create comprehensive lists of commonly overused words and phrases (slop lists), which help identify repetitive patterns across multiple models.
How we do it:
Result:
We produce several canonical lists:
- `slop_list.json`: Commonly overused single words.
- `slop_list_bigrams.json` and `slop_list_trigrams.json`: Commonly repeated phrases of two or three words.
- `slop_list_phrases.jsonl`: Actual multi-word phrases frequently repeated across models.

Purpose:
We infer a lineage tree based on the similarity of each model's slop profile.
How we do it:
The presence (`1`) or absence (`0`) of a term across models is analogous to genetic mutations. For example:

```
M0001 010000010001100010000...
M0002 000100110000010100001...
```
Result:
We produce visual tree diagrams (both circular and rectangular), as well as data files (`.nwk` and `.nex`) showing relationships among models based on their repetitive language patterns.
This pipeline allows you to clearly see which words and phrases each language model tends to overuse, combines these insights into helpful reference lists, and visually clusters models by their linguistic habits.
This project is licensed under the MIT License.
For questions or feedback:
If you use Slop Forensics in your research or work, please cite it as:
```bibtex
@software{paech2025slopforensics,
  author = {Paech, Samuel J},
  title  = {Slop Forensics: A Toolkit for Generating \& Analyzing Lexical Patterns in LLM Outputs},
  url    = {https://github.com/sam-paech/slop-forensics},
  year   = {2025}
}
```