Showing content from https://resonance.distantmagic.com/tutorials/how-to-create-llm-websocket-chat-with-llama-cpp/ below:


How to Create LLM WebSocket Chat with llama.cpp?

Preparations

Before starting this tutorial, it might be useful to familiarize yourself with the other tutorials and documentation pages first.

In this tutorial, we will also use Stimulus for the front-end code.

Once you have both Resonance and llama.cpp running, we can continue.

Backend Security

First, we must create a security gate and decide which users can use our chat. Let's say all authenticated users can do that:

app/SiteActionGate/StartWebSocketJsonRPCConnectionGate.php

```php
<?php

namespace App\SiteActionGate;

use App\Role;
use Distantmagic\Resonance\Attribute\DecidesSiteAction;
use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\AuthenticatedUser;
use Distantmagic\Resonance\SingletonCollection;
use Distantmagic\Resonance\SiteAction;
use Distantmagic\Resonance\SiteActionGate;

#[DecidesSiteAction(SiteAction::StartWebSocketJsonRPCConnection)]
#[Singleton(collection: SingletonCollection::SiteActionGate)]
final readonly class StartWebSocketJsonRPCConnectionGate extends SiteActionGate
{
    public function can(?AuthenticatedUser $authenticatedUser): bool
    {
        return true === $authenticatedUser?->user->getRole()->isAtLeast(Role::User);
    }
}
```
HTTP Responder

Next, we need to create a route in our application that will render the HTML chat.

We are using a few attributes:

  1. Can - an Authorization attribute; it checks whether the user is allowed to connect over WebSockets. If they are not, it returns a 403 error page instead. This is optional: without the Can attribute all users will be able to visit the page, but they still won't be able to establish a WebSocket connection.
  2. RespondsToHttp - registers the route.
  3. Singleton - adds your HttpResponder to the Dependency Injection container. We also added the #[WantsFeature(Feature::WebSocket)] attribute to enable the WebSocket server.

app/HttpResponder/LlmChat.php

```php
<?php

namespace App\HttpResponder;

use App\HttpRouteSymbol;
use Distantmagic\Resonance\Attribute\Can;
use Distantmagic\Resonance\Attribute\RespondsToHttp;
use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\Attribute\WantsFeature;
use Distantmagic\Resonance\Feature;
use Distantmagic\Resonance\HttpInterceptableInterface;
use Distantmagic\Resonance\HttpResponder;
use Distantmagic\Resonance\RequestMethod;
use Distantmagic\Resonance\SingletonCollection;
use Distantmagic\Resonance\SiteAction;
use Distantmagic\Resonance\TwigTemplate;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;

#[Can(SiteAction::StartWebSocketJsonRPCConnection)]
#[RespondsToHttp(
    method: RequestMethod::GET,
    pattern: '/chat',
    routeSymbol: HttpRouteSymbol::LlmChat,
)]
#[Singleton(collection: SingletonCollection::HttpResponder)]
#[WantsFeature(Feature::WebSocket)]
final readonly class LlmChat extends HttpResponder
{
    public function respond(ServerRequestInterface $request, ResponseInterface $response): HttpInterceptableInterface
    {
        return new TwigTemplate('turbo/llmchat/index.twig');
    }
}
```
WebSocket Message Validation

Next, we need to register our WebSocket RPC methods. The default WebSocket implementation follows a simple RPC protocol: all messages need to be a tuple:

[method, payload, null|request id]

Method, the first field, is validated against JsonRPCMethodInterface, through a JsonRPCMethodValidatorInterface implementation (you need to provide both):
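To make the tuple format concrete, here is a minimal TypeScript sketch of that validation on the client side. The `isRpcTuple` helper is hypothetical (it is not part of Resonance or Stimulus); only the method names mirror the enum defined below:

```typescript
// Allowed RPC method names; these must match the values of the
// backend JsonRPCMethod enum.
const KNOWN_METHODS = ["llm_chat_prompt", "llm_token"] as const;

type KnownMethod = (typeof KNOWN_METHODS)[number];
type RpcTuple = [method: KnownMethod, payload: unknown, requestId: string | null];

// Hypothetical helper: checks that a decoded message has the expected
// [method, payload, null|request id] shape before dispatching it.
function isRpcTuple(message: unknown): message is RpcTuple {
  return (
    Array.isArray(message) &&
    message.length === 3 &&
    typeof message[0] === "string" &&
    (KNOWN_METHODS as readonly string[]).includes(message[0]) &&
    (message[2] === null || typeof message[2] === "string")
  );
}
```

For example, `isRpcTuple(["llm_chat_prompt", { prompt: "Hi" }, null])` passes, while a message with an unregistered method name is rejected, which is the same decision the backend validator makes with `JsonRPCMethod::from()`.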

app/JsonRPCMethod.php

```php
<?php

namespace App;

use Distantmagic\Resonance\EnumValuesTrait;
use Distantmagic\Resonance\JsonRPCMethodInterface;

enum JsonRPCMethod: string implements JsonRPCMethodInterface
{
    use EnumValuesTrait;

    case LlmChatPrompt = 'llm_chat_prompt';
    case LlmToken = 'llm_token';

    public function getValue(): string
    {
        return $this->value;
    }
}
```

app/JsonRPCMethodValidator.php

```php
<?php

namespace App;

use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\Attribute\WantsFeature;
use Distantmagic\Resonance\Feature;
use Distantmagic\Resonance\JsonRPCMethodInterface;
use Distantmagic\Resonance\JsonRPCMethodValidatorInterface;

#[Singleton(provides: JsonRPCMethodValidatorInterface::class)]
#[WantsFeature(Feature::WebSocket)]
readonly class JsonRPCMethodValidator implements JsonRPCMethodValidatorInterface
{
    public function cases(): array
    {
        return JsonRPCMethod::cases();
    }

    public function castToRPCMethod(string $methodName): JsonRPCMethodInterface
    {
        return JsonRPCMethod::from($methodName);
    }

    public function values(): array
    {
        return JsonRPCMethod::values();
    }
}
```

The above means your application will support llm_chat_prompt and llm_token messages.

WebSocket Message Responder

Now, we need to register a WebSocket responder that will run every time someone sends an llm_chat_prompt message to the server.

It will then use LlamaCppClient to fetch tokens from the llama.cpp server and forward them to the WebSocket connection:

app/WebSocketJsonRPCResponder/LlmChatPromptResponder.php

```php
<?php

namespace App\WebSocketJsonRPCResponder;

use App\JsonRPCMethod;
use Distantmagic\Resonance\Attribute\RespondsToWebSocketJsonRPC;
use Distantmagic\Resonance\Attribute\Singleton;
use Distantmagic\Resonance\Attribute\WantsFeature;
use Distantmagic\Resonance\Constraint;
use Distantmagic\Resonance\Constraint\ObjectConstraint;
use Distantmagic\Resonance\Constraint\StringConstraint;
use Distantmagic\Resonance\Feature;
use Distantmagic\Resonance\JsonRPCNotification;
use Distantmagic\Resonance\LlamaCppClientInterface;
use Distantmagic\Resonance\LlamaCppCompletionRequest;
use Distantmagic\Resonance\SingletonCollection;
use Distantmagic\Resonance\WebSocketAuthResolution;
use Distantmagic\Resonance\WebSocketConnection;
use Distantmagic\Resonance\WebSocketJsonRPCResponder;

#[RespondsToWebSocketJsonRPC(JsonRPCMethod::LlmChatPrompt)]
#[Singleton(collection: SingletonCollection::WebSocketJsonRPCResponder)]
#[WantsFeature(Feature::WebSocket)]
final readonly class LlmChatPromptResponder extends WebSocketJsonRPCResponder
{
    public function __construct(
        private LlamaCppClientInterface $llamaCppClient,
    ) {}

    public function getConstraint(): Constraint
    {
        return new ObjectConstraint(
            properties: [
                'prompt' => new StringConstraint(),
            ],
        );
    }

    public function onNotification(
        WebSocketAuthResolution $webSocketAuthResolution,
        WebSocketConnection $webSocketConnection,
        JsonRPCNotification $rpcNotification,
    ): void {
        $request = new LlamaCppCompletionRequest($rpcNotification->payload->prompt);

        $completion = $this->llamaCppClient->generateCompletion($request);

        foreach ($completion as $token) {
            if ($webSocketConnection->status->isOpen()) {
                $webSocketConnection->push(new JsonRPCNotification(
                    JsonRPCMethod::LlmToken,
                    $token->content,
                ));
            } else {
                $completion->stop();
            }
        }
    }
}
```
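Putting the two methods together, one chat round-trip over the socket looks roughly like this. This is an illustrative TypeScript sketch, not the server's exact wire format: the token texts are made up, and the message shapes follow what the front-end controller in the next section sends and reads:

```typescript
// The client sends a single llm_chat_prompt notification...
const clientMessage = {
  jsonrpc: "2.0",
  method: "llm_chat_prompt",
  params: { prompt: "Hello!" },
};

// ...and the server streams back one llm_token notification per token.
const serverMessages = [
  { jsonrpc: "2.0", method: "llm_token", result: "Hi" },
  { jsonrpc: "2.0", method: "llm_token", result: " there" },
  { jsonrpc: "2.0", method: "llm_token", result: "!" },
];

// The front-end appends each token's result to the chat log, so the
// visible reply is simply the concatenation of the streamed tokens.
const reply = serverMessages.map((m) => m.result).join("");
```

Because tokens arrive as individual notifications, the chat log fills in incrementally while llama.cpp is still generating, instead of waiting for the full completion.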
Frontend

The Stimulus controller starts the WebSocket connection and sends user prompts to the server. Note that the WebSocket connection uses both CSRF Protection and the jsonrpc subprotocol:

resources/ts/controller_llmchat.ts

```typescript
import { Controller } from "@hotwired/stimulus";

import { stimulus } from "../stimulus";

@stimulus("llmchat")
export class controller_llmchat extends Controller<HTMLElement> {
  public static targets = ["userInputField", "userInputForm", "chatLog"];
  public static values = {
    csrfToken: String,
  };

  private declare readonly chatLogTarget: HTMLElement;
  private declare readonly csrfTokenValue: string;
  private declare readonly userInputFieldTarget: HTMLTextAreaElement;

  private webSocket: null | WebSocket = null;

  public connect(): void {
    const serverUrl = new URL(__WEBSOCKET_URL);

    serverUrl.searchParams.append("csrf", this.csrfTokenValue);

    const webSocket = new WebSocket(serverUrl, ["jsonrpc"]);

    webSocket.addEventListener("close", () => {
      this.webSocket = null;
    });
    webSocket.addEventListener("open", () => {
      this.webSocket = webSocket;
    });
    webSocket.addEventListener("message", (evt: MessageEvent) => {
      if ("string" !== typeof evt.data) {
        return;
      }

      const parsed: unknown = JSON.parse(evt.data);

      // Append each streamed token to the chat log.
      if (parsed && "object" === typeof parsed && "result" in parsed) {
        this.chatLogTarget.append(String((parsed as { result: unknown }).result));
      }
    });
  }

  public disconnect(): void {
    this.webSocket?.close();
  }

  public onFormSubmit(evt: Event): void {
    evt.preventDefault();

    this.chatLogTarget.innerHTML = "";
    this.webSocket?.send(
      JSON.stringify({
        jsonrpc: "2.0",
        method: "llm_chat_prompt",
        params: {
          prompt: this.userInputFieldTarget.value,
        },
      }),
    );
    this.userInputFieldTarget.value = "";
  }
}
```

We will use Asset Bundling (esbuild) to bundle front-end TypeScript code.

```shell
$ ./node_modules/.bin/esbuild \
    --bundle \
    --define:global=globalThis \
    --entry-names="./[name]_$(BUILD_ID)" \
    --format=esm \
    --log-limit=0 \
    --metafile=esbuild-meta-app.json \
    --minify \
    --outdir=./$(BUILD_TARGET_DIRECTORY) \
    --platform=browser \
    --sourcemap \
    --target=es2022,safari16 \
    --tree-shaking=true \
    --tsconfig=tsconfig.json \
    resources/ts/controller_llmchat.ts \
;
```

Finally, the HTML:

app/views/turbo/llmchat/index.twig

```twig
<script
    defer
    type="module"
    src="{{ esbuild(request, 'controller_llmchat.ts') }}"
></script>
<div
    data-controller="llmchat"
    data-llmchat-csrf-token-value="{{ csrf_token(request, 'llm_chat') }}"
>
    <div data-llmchat-target="chatLog"></div>
    <form data-action="submit->llmchat#onFormSubmit">
        <input
            autofocus
            data-llmchat-target="userInputField"
            type="text"
        >
        <button type="submit">Send</button>
    </form>
</div>
```
Summary

Now, you can serve LLM chat in your application. Enjoy!

