This will help you get started with OCIModelDeployment chat models. For detailed documentation of all ChatOCIModelDeployment features and configurations, head to the API reference.
OCI Data Science is a fully managed and serverless platform for data science teams to build, train, and manage machine learning models in the Oracle Cloud Infrastructure. You can use AI Quick Actions to easily deploy LLMs on OCI Data Science Model Deployment Service. You may choose to deploy the model with popular inference frameworks such as vLLM or TGI. By default, the model deployment endpoint mimics the OpenAI API protocol.
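Because the endpoint follows the OpenAI chat completions protocol by default, the request body has the familiar `model` / `messages` shape. The sketch below only builds such a body as a plain dictionary and sends nothing. The helper name `build_chat_payload` is hypothetical, and the default model name `odsc-llm` is an assumption taken from the response metadata shown later on this page.

```python
import json


def build_chat_payload(messages, model="odsc-llm", **params):
    """Build an OpenAI-style chat completions request body.

    `messages` is a list of (role, content) tuples. The default model
    name is an assumption, not a documented requirement.
    """
    return {
        "model": model,
        "messages": [{"role": role, "content": content} for role, content in messages],
        **params,  # e.g. temperature, max_tokens
    }


payload = build_chat_payload(
    [
        ("system", "You are a helpful assistant."),
        ("user", "I love programming."),
    ],
    temperature=0.2,
    max_tokens=512,
)
print(json.dumps(payload, indent=2))
```

ChatOCIModelDeployment assembles and signs the request for you, so a payload like this is mainly useful for understanding or debugging what reaches the deployment.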
Overview
Integration details
Model features
For the latest updates, examples, and experimental features, please see ADS LangChain Integration.
Some model features, including tool calling, structured output, JSON mode, and multi-modal inputs, depend on the deployed model.
Setup
To use ChatOCIModelDeployment you'll need to deploy a chat model with a chat completion endpoint and install the langchain-community, langchain-openai, and oracle-ads integration packages.
You can easily deploy foundation models using the AI Quick Actions on OCI Data Science Model Deployment. For additional deployment examples, please visit the Oracle GitHub samples repository.
Policies
Make sure you have the required policies to access the OCI Data Science Model Deployment endpoint.
Credentials
You can set authentication through Oracle ADS. When working in an OCI Data Science Notebook Session, you can use a resource principal to access other OCI resources.
import ads
ads.set_auth("resource_principal")
Alternatively, you can configure the credentials using the following environment variables. For example, to use an API key with a specific profile:
import os
os.environ["OCI_IAM_TYPE"] = "api_key"
os.environ["OCI_CONFIG_PROFILE"] = "default"
os.environ["OCI_CONFIG_LOCATION"] = "~/.oci"
Check out Oracle ADS docs to see more options.
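If the same code runs both inside a notebook session and on a local machine, you may want to pick the auth type at runtime. The following is a minimal sketch, not part of ADS: the helper `choose_auth_type` is hypothetical, and it assumes that the environment variable `OCI_RESOURCE_PRINCIPAL_VERSION` is set where a resource principal is available. Verify that assumption for your environment before relying on it.

```python
import os


def choose_auth_type(environ=None):
    """Return an auth type string to pass to ads.set_auth.

    Assumption: OCI_RESOURCE_PRINCIPAL_VERSION is present when a
    resource principal is available (e.g. in a notebook session).
    Otherwise fall back to OCI_IAM_TYPE, defaulting to "api_key".
    """
    env = os.environ if environ is None else environ
    if env.get("OCI_RESOURCE_PRINCIPAL_VERSION"):
        return "resource_principal"
    return env.get("OCI_IAM_TYPE", "api_key")


auth_type = choose_auth_type()
print(auth_type)
# In a real session you would then call: ads.set_auth(auth_type)
```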
Installation
The LangChain OCIModelDeployment integration lives in the langchain-community package. The following command will install langchain-community and the required dependencies.
%pip install -qU langchain-community langchain-openai oracle-ads
Instantiation
You may instantiate the model with the generic ChatOCIModelDeployment or a framework-specific class like ChatOCIModelDeploymentVLLM.
ChatOCIModelDeployment: Use this class when you need a generic entry point to a deployed model. You can pass model parameters through model_kwargs when instantiating it, which keeps configuration flexible without relying on framework-specific details.
from langchain_community.chat_models import ChatOCIModelDeployment
chat = ChatOCIModelDeployment(
    endpoint="https://modeldeployment.<region>.oci.customer-oci.com/<ocid>/predict",
    streaming=True,
    max_retries=1,
    model_kwargs={
        "temperature": 0.2,
        "max_tokens": 512,
    },
    default_headers={
        "route": "/v1/chat/completions",
    },
)
ChatOCIModelDeploymentVLLM: This is suitable when you are working with a specific framework (e.g. vLLM) and need to pass model parameters directly through the constructor, which streamlines setup.
from langchain_community.chat_models import ChatOCIModelDeploymentVLLM
chat = ChatOCIModelDeploymentVLLM(
    endpoint="https://modeldeployment.<region>.oci.customer-oci.com/<md_ocid>/predict",
)
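The endpoint strings in the examples above all follow the same pattern: a region, a model deployment OCID, and a `/predict` suffix. As a small convenience, the sketch below builds that URL from its parts. The helper `build_predict_endpoint` is hypothetical, and the URL pattern is simply the one used in the examples on this page.

```python
def build_predict_endpoint(region: str, deployment_ocid: str) -> str:
    """Assemble a predict URL following the pattern used in the examples."""
    if not region or not deployment_ocid:
        raise ValueError("Both region and deployment_ocid are required.")
    return (
        f"https://modeldeployment.{region}.oci.customer-oci.com/"
        f"{deployment_ocid}/predict"
    )


# "<ocid>" is a placeholder; substitute your own model deployment OCID.
endpoint = build_predict_endpoint("us-ashburn-1", "<ocid>")
print(endpoint)
```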
Invocation
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = chat.invoke(messages)
ai_msg
AIMessage(content="J'adore programmer.", response_metadata={'token_usage': {'prompt_tokens': 44, 'total_tokens': 52, 'completion_tokens': 8}, 'model_name': 'odsc-llm', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='run-ca145168-efa9-414c-9dd1-21d10766fdd3-0')
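The `response_metadata` on the returned message carries token counts and the finish reason. Here is a minimal sketch of pulling those out for logging or cost tracking. It works on a plain dictionary copied from the output above, and `summarize_usage` is a hypothetical helper name.

```python
# Copied from the response_metadata shown in the output above.
response_metadata = {
    "token_usage": {"prompt_tokens": 44, "total_tokens": 52, "completion_tokens": 8},
    "model_name": "odsc-llm",
    "system_fingerprint": "",
    "finish_reason": "stop",
}


def summarize_usage(metadata):
    """Extract token counts and whether generation stopped normally."""
    usage = metadata.get("token_usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    return {
        "prompt": prompt,
        "completion": completion,
        # Fall back to the sum if the server omits total_tokens.
        "total": usage.get("total_tokens", prompt + completion),
        "finished": metadata.get("finish_reason") == "stop",
    }


summary = summarize_usage(response_metadata)
print(summary)
```

With a live model you would pass `ai_msg.response_metadata` instead of the hard-coded dictionary.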
Chaining
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)
chain = prompt | chat
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
AIMessage(content='Ich liebe Programmierung.', response_metadata={'token_usage': {'prompt_tokens': 38, 'total_tokens': 48, 'completion_tokens': 10}, 'model_name': 'odsc-llm', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='run-5dd936b0-b97e-490e-9869-2ad3dd524234-0')
Asynchronous calls
from langchain_community.chat_models import ChatOCIModelDeployment
from langchain_core.prompts import ChatPromptTemplate
system = "You are a helpful translator that translates {input_language} to {output_language}."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])
chat = ChatOCIModelDeployment(
    endpoint="https://modeldeployment.us-ashburn-1.oci.customer-oci.com/<ocid>/predict"
)
chain = prompt | chat
# Top-level await works in a notebook; in a script, run this inside asyncio.run().
await chain.ainvoke(
    {
        "input_language": "English",
        "output_language": "Chinese",
        "text": "I love programming",
    }
)
AIMessage(content='我喜欢编程', response_metadata={'token_usage': {'prompt_tokens': 37, 'total_tokens': 50, 'completion_tokens': 13}, 'model_name': 'odsc-llm', 'system_fingerprint': '', 'finish_reason': 'stop'}, id='run-a2dc9393-f269-41a4-b908-b1d8a92cf827-0')
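The main benefit of the async interface is that several requests can be in flight at once. The sketch below shows that pattern with `asyncio.gather`. To stay runnable without a deployment, it uses `fake_ainvoke`, a hypothetical stand-in for `chain.ainvoke`; with a real chain you would call `chain.ainvoke(...)` in its place.

```python
import asyncio


async def fake_ainvoke(inputs):
    """Hypothetical stand-in for chain.ainvoke; simulates network latency."""
    await asyncio.sleep(0.01)
    return f"[{inputs['output_language']}] {inputs['text']}"


async def translate_many(languages, text):
    """Issue one request per target language and wait for all of them."""
    tasks = [
        fake_ainvoke(
            {"input_language": "English", "output_language": lang, "text": text}
        )
        for lang in languages
    ]
    # gather returns results in the same order as the tasks were passed in.
    return await asyncio.gather(*tasks)


results = asyncio.run(translate_many(["Chinese", "German"], "I love programming"))
print(results)
```

In a notebook, where an event loop is already running, use `await translate_many(...)` rather than `asyncio.run(...)`.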
Streaming calls
import sys
from langchain_community.chat_models import ChatOCIModelDeployment
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages(
    [("human", "List 5 states in the United States.")]
)
chat = ChatOCIModelDeployment(
    endpoint="https://modeldeployment.us-ashburn-1.oci.customer-oci.com/<ocid>/predict"
)
chain = prompt | chat
# Print each chunk as soon as it arrives.
for chunk in chain.stream({}):
    sys.stdout.write(chunk.content)
    sys.stdout.flush()
1. California
2. Texas
3. Florida
4. New York
5. Illinois
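When streaming, you often want to both display chunks as they arrive and keep the full text afterwards. Below is a minimal sketch of that accumulation. `FakeChunk` and `fake_stream` are hypothetical stand-ins for the chunks yielded by `chain.stream`; the only property assumed is that each chunk exposes a `content` string, as in the loop above.

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str


def fake_stream():
    """Yield pieces of text, split at arbitrary points like a real stream."""
    for piece in ["1. Cali", "fornia\n", "2. Texas\n"]:
        yield FakeChunk(piece)


def collect_stream(chunks):
    """Concatenate the content of every chunk into one string."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.content)
    return "".join(parts)


full_text = collect_stream(fake_stream())
print(full_text)
```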
Structured output
from langchain_community.chat_models import ChatOCIModelDeployment
from pydantic import BaseModel
class Joke(BaseModel):
    """A setup to a joke and the punchline."""
    setup: str
    punchline: str
chat = ChatOCIModelDeployment(
    endpoint="https://modeldeployment.us-ashburn-1.oci.customer-oci.com/<ocid>/predict",
)
structured_llm = chat.with_structured_output(Joke, method="json_mode")
output = structured_llm.invoke(
    "Tell me a joke about cats, respond in JSON with `setup` and `punchline` keys"
)
output.model_dump()  # Pydantic v2; on Pydantic v1 use output.dict()
{'setup': 'Why did the cat get stuck in the tree?',
'punchline': 'Because it was chasing its tail!'}
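In JSON mode the model is only asked to emit JSON; it is the schema that checks whether the expected keys are present. To illustrate that idea without a deployment, the sketch below parses a raw JSON string into a typed object using only the standard library. It is a conceptual stand-in, not how `with_structured_output` is implemented, and `parse_joke` is a hypothetical helper.

```python
import json
from dataclasses import dataclass


@dataclass
class Joke:
    """A setup to a joke and the punchline."""
    setup: str
    punchline: str


def parse_joke(raw: str) -> Joke:
    """Parse a JSON string and check that the required keys are present."""
    data = json.loads(raw)
    missing = {"setup", "punchline"} - data.keys()
    if missing:
        raise ValueError(f"Missing keys in model output: {sorted(missing)}")
    return Joke(setup=str(data["setup"]), punchline=str(data["punchline"]))


raw = (
    '{"setup": "Why did the cat get stuck in the tree?", '
    '"punchline": "Because it was chasing its tail!"}'
)
joke = parse_joke(raw)
print(joke)
```

A model that ignores the JSON instruction or omits a key would fail at this step, which is why the prompt above names the expected keys explicitly.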
API reference
For comprehensive details on all features and configurations, please refer to the API reference documentation for each class, such as ChatOCIModelDeployment and ChatOCIModelDeploymentVLLM.