An LLM simulator that mimics OpenAI and Anthropic API formats. Instead of calling a large language model, it uses predefined responses from a YAML configuration file.
It is intended for situations where you want deterministic responses for testing, demos, or development.
Responses are configured in `responses.yml`. The file has three main sections:

- `responses`: Maps input prompts to predefined responses
- `defaults`: Contains default configurations, such as the unknown-response message
- `settings`: Contains server behavior settings, such as network lag simulation

Example `responses.yml`:
responses: "write a python function to calculate factorial": "def factorial(n):\n if n == 0:\n return 1\n return n * factorial(n - 1)" "what colour is the sky?": "The sky is purple except on Tuesday when it is hue green." "what is 2+2?": "2+2 equals 9." defaults: unknown_response: "I don't know the answer to that. This is a mock response." settings: lag_enabled: true lag_factor: 10 # Higher values = faster responses (10 = fast, 1 = slow)
The server can simulate network latency for more realistic testing scenarios. This is controlled by two settings:
- `lag_enabled`: When true, enables artificial network lag
- `lag_factor`: Controls the speed of responses (higher values mean faster responses)
For streaming responses, the lag is applied per-character with slight random variations to simulate realistic network conditions.
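A minimal sketch of what per-character lag with random variation could look like (the delay formula, names, and constants here are assumptions for illustration, not MockLLM's implementation):

```python
import random
import time

def stream_with_lag(text, lag_enabled=True, lag_factor=10):
    """Yield a response one character at a time with simulated network lag.

    Assumes the delay is inversely proportional to lag_factor (higher = faster),
    with slight random jitter per character. Illustrative sketch only.
    """
    for ch in text:
        if lag_enabled and lag_factor > 0:
            base_delay = 0.1 / lag_factor              # e.g. 10 ms at lag_factor=10
            time.sleep(base_delay * random.uniform(0.5, 1.5))
        yield ch

for chunk in stream_with_lag("The sky is purple except on Tuesday..."):
    print(chunk, end="", flush=True)
```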
The server automatically detects changes to `responses.yml` and reloads the configuration without restarting.
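A common way to get this behaviour is to check the file's modification time before serving each request; the sketch below is a generic illustration under that assumption, not MockLLM's actual mechanism:

```python
import os
import yaml

class ResponseConfig:
    """Reload responses.yml whenever its modification time changes (sketch)."""

    def __init__(self, path="responses.yml"):
        self.path = path
        self._mtime = None
        self._data = {}

    def get(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:
            with open(self.path) as f:
                self._data = yaml.safe_load(f)
            self._mtime = mtime
        return self._data
```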
```bash
git clone https://github.com/stacklok/mockllm.git
cd mockllm
```
```bash
curl -sSL https://install.python-poetry.org | python3 -
```
```bash
poetry install                  # Install with all dependencies
# or
poetry install --without dev    # Install without development dependencies
```
MockLLM provides a command-line interface for managing the server and validating configurations:
```bash
# Show available commands and options
mockllm --help

# Show version
mockllm --version

# Start the server with default settings
mockllm start

# Start with custom responses file
mockllm start --responses custom_responses.yml

# Start with custom host and port
mockllm start --host localhost --port 3000

# Validate a responses file
mockllm validate responses.yml
```
```bash
cp example.responses.yml responses.yml
mockllm validate responses.yml
mockllm start --responses responses.yml
```
The server will start on `http://localhost:8000` by default.
OpenAI chat completions endpoint, regular request:
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mock-llm",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ]
  }'
```
Streaming request:
```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mock-llm",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ],
    "stream": true
  }'
```
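Because this endpoint mimics the OpenAI chat completions format, an OpenAI-compatible client pointed at the mock server should also work; a sketch using the official `openai` Python package (the base URL override and dummy API key are assumptions, on the premise that the mock server does not validate authentication):

```python
from openai import OpenAI

# Point the OpenAI client at the mock server; the API key is a dummy value.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

completion = client.chat.completions.create(
    model="mock-llm",
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(completion.choices[0].message.content)
```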
Anthropic messages endpoint, regular request:
```bash
curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ]
  }'
```
Streaming request:
```bash
curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [
      {"role": "user", "content": "what colour is the sky?"}
    ],
    "stream": true
  }'
```
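Likewise, since `/v1/messages` mirrors the Anthropic Messages API, an Anthropic-style client pointed at the mock server should work; a sketch using the official `anthropic` Python package (dummy API key assumed to be ignored; `max_tokens` is required by the client even though the responses are predefined):

```python
from anthropic import Anthropic

# Point the Anthropic client at the mock server; the API key is a dummy value.
client = Anthropic(base_url="http://localhost:8000", api_key="not-needed")

message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=256,
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(message.content[0].text)
```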
To run the tests:
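Assuming the project uses pytest under the Poetry setup described above (an assumption, since the exact command is not shown here):

```bash
poetry run pytest
```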
Contributions are welcome! Please open an issue or submit a PR.
Check out the CodeGate project when you're done here!
This project is licensed under the Apache 2.0 License.