Load quantized BGE embedding models generated by IntelĀ® Extension for Transformers (ITREX) and use ITREX Neural Engine, a high-performance NLP backend, to accelerate the inference of models without compromising accuracy.
from langchain_community.embeddings import QuantizedBgeEmbeddings
model_name = "Intel/bge-small-en-v1.5-sts-int8-static-inc"
encode_kwargs = {"normalize_embeddings": True}
model = QuantizedBgeEmbeddings(
model_name=model_name,
encode_kwargs=encode_kwargs,
query_instruction="Represent this sentence for searching relevant passages: ",
)
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4