Bright Data's Web Unlocker API lets you access websites protected by anti-bot measures, geo-restrictions, or other access limitations, which makes it useful for AI agents that need reliable web content extraction.
Overview
Integration details
Tool features
Native async | Returns artifact | Return data | Pricing
❌ | ❌ | HTML, Markdown, or screenshot of web pages | Requires Bright Data account
Setup
The integration lives in the langchain-brightdata package.
pip install langchain-brightdata
You'll need a Bright Data API key to use this tool. You can set it as an environment variable:
import os
os.environ["BRIGHT_DATA_API_KEY"] = "your-api-key"
Or pass it directly when initializing the tool:
from langchain_brightdata import BrightDataUnlocker
unlocker_tool = BrightDataUnlocker(bright_data_api_key="your-api-key")
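If you would rather not hard-code the key, one option is to resolve it once at startup and fail early when it is missing. The following is a minimal sketch: resolve_api_key is a hypothetical helper, not part of langchain-brightdata, and only the BRIGHT_DATA_API_KEY variable name comes from the text above.

```python
import os


def resolve_api_key(explicit_key=None, env_var="BRIGHT_DATA_API_KEY"):
    """Return the explicit key if given, otherwise the environment variable.

    Hypothetical helper for illustration; not part of langchain-brightdata.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of at the first request.
        raise RuntimeError(
            f"No Bright Data API key found: pass one explicitly or set {env_var}."
        )
    return key
```

The returned value could then be passed as bright_data_api_key when constructing the tool.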
Instantiation
The BrightDataUnlocker tool uses Bright Data's Web Unlocker service to access websites that may be protected by anti-bot measures, geo-restrictions, or other access limitations.
The tool accepts various parameters during instantiation:
bright_data_api_key (required, str): Your Bright Data API key for authentication.
format (optional, Literal["raw"]): Format of the response content. Default is "raw".
country (optional, str): Two-letter country code for geo-specific access (e.g., "us", "gb", "de", "jp"). Set this when you need to view the website as if accessing it from a specific country. Default is None.
zone (optional, str): Bright Data zone to use for the request. The "unlocker" zone is optimized for accessing websites that might block regular requests. Default is "unlocker".
data_format (optional, Literal["html", "markdown", "screenshot"]): Output format for the retrieved content. Options are "html", "markdown", and "screenshot".
Basic Usage
from langchain_brightdata import BrightDataUnlocker
unlocker_tool = BrightDataUnlocker(
    bright_data_api_key="your-api-key"
)
result = unlocker_tool.invoke("https://example.com")
print(result)
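Instead of printing the result, you may want to keep it on disk. This page does not state the Python type that invoke returns, so the sketch below assumes it is either a str (HTML or Markdown) or a bytes-like object (for example, a screenshot) and handles both. save_result is a hypothetical helper, not part of langchain-brightdata.

```python
from pathlib import Path


def save_result(result, path):
    """Write a tool result to disk, as text or as raw bytes.

    Hypothetical helper; assumes `result` is a str or a bytes-like object.
    """
    target = Path(path)
    if isinstance(result, (bytes, bytearray)):
        # Binary content (assumed for screenshots) is written unchanged.
        target.write_bytes(result)
    else:
        # Anything else is treated as text and stored as UTF-8.
        target.write_text(str(result), encoding="utf-8")
    return target
```

For example, save_result(result, "page.html") would store the output of the call above.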
Advanced Usage with Parameters
from langchain_brightdata import BrightDataUnlocker
unlocker_tool = BrightDataUnlocker(
    bright_data_api_key="your-api-key",
)
result = unlocker_tool.invoke(
    {
        "url": "https://example.com/region-restricted-content",
        "country": "gb",
        "data_format": "html",
        "zone": "unlocker",
    }
)
print(result)
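When the input dict is assembled from user or agent input, it can help to validate it before spending a request. The sketch below builds the same kind of dict passed to invoke above. build_unlocker_input is a hypothetical helper, not part of langchain-brightdata; the allowed data_format values are taken from this page, while the URL-scheme check and the lowercasing of the country code are assumptions made for illustration.

```python
from typing import Optional

# Values listed on this page for data_format.
ALLOWED_DATA_FORMATS = {"html", "markdown", "screenshot"}


def build_unlocker_input(
    url: str,
    country: Optional[str] = None,
    data_format: Optional[str] = None,
    zone: str = "unlocker",
) -> dict:
    """Build the dict passed to unlocker_tool.invoke(), validating it first.

    Hypothetical helper for illustration; not part of langchain-brightdata.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    payload = {"url": url, "zone": zone}
    if country is not None:
        if len(country) != 2 or not (country.isascii() and country.isalpha()):
            raise ValueError(f"country must be a two-letter code: {country!r}")
        # Assumption: lowercase codes, matching the examples on this page.
        payload["country"] = country.lower()
    if data_format is not None:
        if data_format not in ALLOWED_DATA_FORMATS:
            raise ValueError(f"unsupported data_format: {data_format!r}")
        payload["data_format"] = data_format
    return payload
```

The result could be passed straight to the tool, for example unlocker_tool.invoke(build_unlocker_input("https://example.com", country="gb")).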
Customization Options
The BrightDataUnlocker tool accepts several parameters for customization:
Parameter | Type | Description
url | str | The URL to access
format | str | Format of the response content (default: "raw")
country | str | Two-letter country code for geo-specific access (e.g., "us", "gb")
zone | str | Bright Data zone to use (default: "unlocker")
data_format | str | Output format: None (HTML), "markdown", or "screenshot"
Data Format Options
The data_format parameter allows you to specify how the content should be returned:
None or "html" (default): Returns the standard HTML content of the page
"markdown": Returns the content converted to markdown format, which is useful for feeding directly to LLMs
"screenshot": Returns a PNG screenshot of the rendered page, useful for visual analysis
from langchain_brightdata import BrightDataUnlocker
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", google_api_key="your-api-key")
bright_data_tool = BrightDataUnlocker(bright_data_api_key="your-api-key")
agent = create_react_agent(llm, [bright_data_tool])
user_input = "Get the content from https://example.com/region-restricted-page - access it from GB"
for step in agent.stream(
    {"messages": user_input},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()
API reference