Usage Pattern#

Get Started#

Build a chat engine from index:

chat_engine = index.as_chat_engine()

Tip

To learn how to build an index, see Indexing.
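
For reference, here is a minimal sketch of building an index from local files, so the snippet above has something to run against (the "data" directory is an illustrative assumption):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load documents from a local directory (hypothetical path "data")
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)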

Have a conversation with your data:

response = chat_engine.chat("Tell me a joke.")
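
Chat engines are stateful, so a follow-up message is interpreted against the accumulated history (the follow-up text below is illustrative):

print(response)

# History is retained across calls, so this follow-up has context
response = chat_engine.chat("Tell me another one.")
print(response)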

Reset chat history to start a new conversation:

chat_engine.reset()

Enter an interactive chat REPL:

chat_engine.chat_repl()

Configuring a Chat Engine#

Configuring a chat engine is very similar to configuring a query engine.

High-Level API#

You can directly build and configure a chat engine from an index in 1 line of code:

chat_engine = index.as_chat_engine(chat_mode="condense_question", verbose=True)

Note: you can access different chat engines by specifying the chat_mode as a kwarg: condense_question corresponds to CondenseQuestionChatEngine, react to ReActChatEngine, and context to ContextChatEngine.
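
For instance, a sketch of selecting the context engine; the system_prompt value is an illustrative assumption (it is forwarded to engines that accept it, such as ContextChatEngine):

chat_engine = index.as_chat_engine(
    chat_mode="context",
    # Illustrative system prompt; adjust to your use case
    system_prompt="You answer questions about the indexed documents.",
)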

Note: While the high-level API optimizes for ease of use, it does NOT expose the full range of configurability.

Available Chat Modes#

Low-Level Composition API#

You can use the low-level composition API if you need more granular control. Concretely, you would explicitly construct a ChatEngine object instead of calling index.as_chat_engine(...).

Note: You may need to look at API references or example notebooks.

Here's an example where we configure the following: a custom condense-question prompt, an initial chat history to seed the conversation, and verbose debug output.

from llama_index.core import PromptTemplate
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.chat_engine import CondenseQuestionChatEngine

custom_prompt = PromptTemplate(
    """\
Given a conversation (between Human and Assistant) and a follow up message from Human, \
rewrite the message to be a standalone question that captures all relevant context \
from the conversation.

<Chat History>
{chat_history}

<Follow Up Message>
{question}

<Standalone question>
"""
)

# list of `ChatMessage` objects
custom_chat_history = [
    ChatMessage(
        role=MessageRole.USER,
        content="Hello assistant, we are having a insightful discussion about Paul Graham today.",
    ),
    ChatMessage(role=MessageRole.ASSISTANT, content="Okay, sounds good."),
]

query_engine = index.as_query_engine()
chat_engine = CondenseQuestionChatEngine.from_defaults(
    query_engine=query_engine,
    condense_question_prompt=custom_prompt,
    chat_history=custom_chat_history,
    verbose=True,
)
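
Once composed, the engine is used just like one built via the high-level API (the question below is illustrative):

response = chat_engine.chat("What are we discussing today?")
print(response)
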
Streaming#

To enable streaming, you simply need to call the stream_chat endpoint instead of the chat endpoint.

Warning

This is somewhat inconsistent with the query engine (where you pass in a streaming=True flag). We are working on making the behavior more consistent!

chat_engine = index.as_chat_engine()
streaming_response = chat_engine.stream_chat("Tell me a joke.")
for token in streaming_response.response_gen:
    print(token, end="")
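
In an async context there is a corresponding astream_chat endpoint; a minimal sketch, assuming the async_response_gen accessor on the streaming response:

import asyncio


async def main():
    streaming_response = await chat_engine.astream_chat("Tell me a joke.")
    async for token in streaming_response.async_response_gen():
        print(token, end="")


asyncio.run(main())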

See an end-to-end tutorial.

