Showing content from https://python.langchain.com/docs/integrations/tools/brightdata-webscraperapi below:

BrightDataWebScraperAPI | 🦜️🔗 LangChain

BrightDataWebScraperAPI

Bright Data provides a Web Scraper API that extracts structured data from 100+ popular domains, including Amazon product details and LinkedIn profiles. This makes it particularly useful for AI agents that need reliable, structured web data feeds.

Overview

Integration details

Tool features

Native async | Returns artifact | Return data | Pricing
❌ | ❌ | Structured data from websites (Amazon products, LinkedIn profiles, etc.) | Requires Bright Data account

Setup

The integration lives in the langchain-brightdata package.

pip install langchain-brightdata

You'll need a Bright Data API key to use this tool. You can set it as an environment variable:

import os

os.environ["BRIGHT_DATA_API_KEY"] = "your-api-key"

Or pass it directly when initializing the tool:

from langchain_brightdata import BrightDataWebScraperAPI

scraper_tool = BrightDataWebScraperAPI(bright_data_api_key="your-api-key")
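To keep the key out of source code, one option is to read it from the environment and fail early with a clear message when it is missing. The sketch below is only an illustration: `get_bright_data_api_key` is a hypothetical helper, not part of langchain-brightdata, and the only name taken from the text above is the `BRIGHT_DATA_API_KEY` variable.

```python
import os


def get_bright_data_api_key(env_var: str = "BRIGHT_DATA_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of langchain-brightdata.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the tool."
        )
    return api_key
```

The returned value can then be passed as bright_data_api_key when constructing BrightDataWebScraperAPI.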

Instantiation

Here we show how to create an instance of the BrightDataWebScraperAPI tool. The tool extracts structured data from various websites, including Amazon product details and LinkedIn profiles, using Bright Data's Dataset API.

The tool accepts the following parameters during instantiation:

bright_data_api_key (required, str): Your Bright Data API key for authentication.
dataset_mapping (optional, Dict[str, str]): A dictionary mapping dataset types to their corresponding Bright Data dataset IDs. The default mapping includes:
- "amazon_product": "gd_l7q7dkf244hwjntr0"
- "amazon_product_reviews": "gd_le8e811kzy4ggddlq"
- "linkedin_person_profile": "gd_l1viktl72bvl7bjuj0"
- "linkedin_company_profile": "gd_l1vikfnt1wgvvqz95w"
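A custom dataset_mapping can be assembled as a plain dictionary before the tool is created. The sketch below copies the default IDs listed above and adds one extra entry. Whether a custom mapping replaces or extends the defaults is not stated here, so the sketch includes the defaults explicitly. The `extend_dataset_mapping` helper, the `my_custom_dataset` key, and the `gd_your_dataset_id` value are hypothetical placeholders.

```python
from typing import Dict

# Default mapping as listed in the documentation above.
DEFAULT_DATASET_MAPPING: Dict[str, str] = {
    "amazon_product": "gd_l7q7dkf244hwjntr0",
    "amazon_product_reviews": "gd_le8e811kzy4ggddlq",
    "linkedin_person_profile": "gd_l1viktl72bvl7bjuj0",
    "linkedin_company_profile": "gd_l1vikfnt1wgvvqz95w",
}


def extend_dataset_mapping(extra: Dict[str, str]) -> Dict[str, str]:
    """Return a new mapping: the defaults plus (or overridden by) `extra`."""
    merged = dict(DEFAULT_DATASET_MAPPING)  # copy so the defaults stay untouched
    merged.update(extra)
    return merged


# Placeholder dataset type and ID; substitute values from your own account.
custom_mapping = extend_dataset_mapping({"my_custom_dataset": "gd_your_dataset_id"})
```

The resulting dictionary would then be passed as dataset_mapping=custom_mapping alongside bright_data_api_key when instantiating the tool.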

Invocation

Basic Usage

from langchain_brightdata import BrightDataWebScraperAPI

# Initialize the tool
scraper_tool = BrightDataWebScraperAPI(
    bright_data_api_key="your-api-key"
)

# Extract Amazon product data
results = scraper_tool.invoke(
    {"url": "https://www.amazon.com/dp/B08L5TNJHG", "dataset_type": "amazon_product"}
)

print(results)
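Because each call goes out over the network, it can be useful to retry on transient failures. The following is a minimal sketch under stated assumptions: `invoke_with_retry` is a hypothetical wrapper, it works with any object that exposes an invoke method, and it catches the broad Exception class because the specific exceptions the tool may raise are not documented on this page.

```python
import time
from typing import Any, Dict


def invoke_with_retry(
    tool: Any,
    payload: Dict[str, Any],
    attempts: int = 3,
    delay_seconds: float = 1.0,
) -> Any:
    """Call tool.invoke(payload), retrying with a linearly growing delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return tool.invoke(payload)
        except Exception as exc:  # assumption: exact exception types are unknown
            last_error = exc
            if attempt < attempts:
                # Wait longer after each failed attempt.
                time.sleep(delay_seconds * attempt)
    raise RuntimeError(f"Tool call failed after {attempts} attempts") from last_error
```

With the tool from the example above, this would be called as invoke_with_retry(scraper_tool, {"url": ..., "dataset_type": "amazon_product"}).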

Advanced Usage with Parameters

from langchain_brightdata import BrightDataWebScraperAPI

# Initialize the tool
scraper_tool = BrightDataWebScraperAPI(bright_data_api_key="your-api-key")

# Extract Amazon product data for a specific location
results = scraper_tool.invoke(
    {
        "url": "https://www.amazon.com/dp/B08L5TNJHG",
        "dataset_type": "amazon_product",
        "zipcode": "10001",  # Optional: zipcode for location-specific data
    }
)

print(results)

# Extract LinkedIn person profile data
linkedin_results = scraper_tool.invoke(
    {
        "url": "https://www.linkedin.com/in/satyanadella/",
        "dataset_type": "linkedin_person_profile",
    }
)

print(linkedin_results)
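The input dictionaries in the examples above can be built and checked before they are sent, which catches a mistyped dataset type without spending an API call. This is a sketch, not part of the package: `build_scraper_input` and `KNOWN_DATASET_TYPES` are hypothetical names, and the set only covers the four default dataset types, so it would need extending if you supply a custom dataset_mapping.

```python
from typing import Dict, Optional

# The four default dataset types named in this document.
KNOWN_DATASET_TYPES = {
    "amazon_product",
    "amazon_product_reviews",
    "linkedin_person_profile",
    "linkedin_company_profile",
}


def build_scraper_input(
    url: str, dataset_type: str, zipcode: Optional[str] = None
) -> Dict[str, str]:
    """Validate the arguments and return the dictionary to pass to invoke()."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"Expected an http(s) URL, got: {url!r}")
    if dataset_type not in KNOWN_DATASET_TYPES:
        raise ValueError(f"Unknown dataset_type: {dataset_type!r}")
    payload = {"url": url, "dataset_type": dataset_type}
    if zipcode is not None:
        # zipcode is optional, so it is only included when provided.
        payload["zipcode"] = zipcode
    return payload
```

For example, build_scraper_input("https://www.amazon.com/dp/B08L5TNJHG", "amazon_product", zipcode="10001") produces the same dictionary as the first advanced example.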

Customization Options

When invoked, the BrightDataWebScraperAPI tool accepts the following input parameters:

Parameter | Type | Description
url | str | The URL to extract data from
dataset_type | str | Type of dataset to use (e.g., "amazon_product")
zipcode | str | Optional zipcode for location-specific data

Available Dataset Types

The tool supports the following dataset types for structured data extraction:

Dataset Type | Description
amazon_product | Extract detailed Amazon product data
amazon_product_reviews | Extract Amazon product reviews
linkedin_person_profile | Extract LinkedIn person profile data
linkedin_company_profile | Extract LinkedIn company profile data

Use within an agent

from langchain_brightdata import BrightDataWebScraperAPI
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent

# Initialize the LLM (this key is for Google, not Bright Data)
llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", google_api_key="your-google-api-key")

# Initialize the Bright Data Web Scraper API tool
scraper_tool = BrightDataWebScraperAPI(bright_data_api_key="your-api-key")

# Create the agent with the tool
agent = create_react_agent(llm, [scraper_tool])

# Provide a user query
user_input = "Scrape Amazon product data for https://www.amazon.com/dp/B0D2Q9397Y?th=1 in New York (zipcode 10001)."

# Stream the agent's step-by-step output
for step in agent.stream(
    {"messages": user_input},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

API reference

Bright Data API Documentation

Tool conceptual guide
Tool how-to guides
