AgentQL's REST API lets you query web pages and documents, such as PDFs and image files, and retrieve the results over HTTP from any language.
Query data
Queries structured data as JSON from a web page, given a URL or raw HTML, using either an AgentQL query or a natural-language prompt.
POST https://api.agentql.com/v1/query-data
curl -X POST https://api.agentql.com/v1/query-data \
  -H "X-API-Key: $AGENTQL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "{ products[] { product_name product_price(integer) } }",
    "url": "https://scrapeme.live/?s=fish&post_type=product",
    "params": {
      "wait_for": 0,
      "is_scroll_to_bottom_enabled": false,
      "mode": "fast",
      "is_screenshot_enabled": false
    }
  }'
Note: Make sure to replace $AGENTQL_API_KEY with your actual API key.
json
{
  "data": {
    "products": [
      {
        "product_name": "Qwilfish",
        "product_price": 77
      },
      {
        "product_name": "Huntail",
        "product_price": 52
      },
      ...
    ]
  },
  "metadata": {
    "request_id": "ecab9d2c-0212-4b70-a5bc-0c821fb30ae3"
  }
}
Authentication
All requests to the AgentQL API must include an X-API-Key header with your API key. You can generate an API key through the Dev Portal.
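The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_query_data_request` and `query_data` and the 60-second timeout are illustrative assumptions, not part of the AgentQL API.

```python
import json
import urllib.request

API_URL = "https://api.agentql.com/v1/query-data"


def build_query_data_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the query-data endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,  # required on every request
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query_data(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_query_data_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_data(os.environ["AGENTQL_API_KEY"], payload)` with the same JSON body as the curl example should return a dictionary with `data` and `metadata` keys, as in the response shown above.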
query
string (alternative to prompt)
The AgentQL query to execute. Learn more about how to write an AgentQL query in the docs. Note: You must define either a query or a prompt to use AgentQL.
prompt
string (alternative to query)
A natural-language description of the data to query the page for. AgentQL infers the data structure from your prompt. Note: You must define either a query or a prompt to use AgentQL.
url
string (alternative to html)
The URL of the public web page you want to query. Note: You must define either a url or html to use AgentQL.
html
string (alternative to url)
The raw HTML to query data from. Useful if you have a private or locally generated copy of a web page. Note: You must define either a url or html to use AgentQL.
params
object (optional)
  wait_for
  number
  The number of seconds to wait for the page to load before querying. Defaults to 0.
  is_scroll_to_bottom_enabled
  boolean
  Whether to scroll to the bottom of the page before querying. Defaults to false.
  mode
  string
  standard uses deep data analysis, while fast trades some depth of analysis for speed and is adequate for most use cases. Learn more about the modes in this guide. Defaults to fast.
  is_screenshot_enabled
  boolean
  Whether to take a screenshot before extracting data. The screenshot is returned in metadata as a Base64 string. Defaults to false.
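The request fields and their documented defaults can be assembled with a small helper. This is a sketch, not an official client: `build_query_payload` is a hypothetical name, and rejecting requests that set both `query` and `prompt` (or both `url` and `html`) is an assumption based on the "alternative to" wording; the API itself may behave differently.

```python
from typing import Optional

VALID_MODES = ("fast", "standard")


def build_query_payload(
    *,
    query: Optional[str] = None,
    prompt: Optional[str] = None,
    url: Optional[str] = None,
    html: Optional[str] = None,
    wait_for: float = 0,
    is_scroll_to_bottom_enabled: bool = False,
    mode: str = "fast",
    is_screenshot_enabled: bool = False,
) -> dict:
    """Return a query-data request body using the documented defaults."""
    # Assumption: exactly one of each alternative pair should be set.
    if (query is None) == (prompt is None):
        raise ValueError("Provide exactly one of 'query' or 'prompt'.")
    if (url is None) == (html is None):
        raise ValueError("Provide exactly one of 'url' or 'html'.")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")

    payload = {
        "params": {
            "wait_for": wait_for,
            "is_scroll_to_bottom_enabled": is_scroll_to_bottom_enabled,
            "mode": mode,
            "is_screenshot_enabled": is_screenshot_enabled,
        }
    }
    if query is not None:
        payload["query"] = query
    else:
        payload["prompt"] = prompt
    if url is not None:
        payload["url"] = url
    else:
        payload["html"] = html
    return payload
```

Validating on the client side surfaces a missing or conflicting field before a request is sent, which avoids spending an API call on a request that would likely be rejected.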
data
object
Data that matches the query.
metadata
object
  request_id
  string
  A Universally Unique Identifier (UUID) for the request.
  screenshot
  string | null
  Base64-encoded screenshot if enabled, null otherwise. You can convert the Base64 string returned in the screenshot field to an image and view it using free online tools like Base64.guru.
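The screenshot can also be decoded locally instead of through an online tool. The sketch below assumes the response has already been parsed into a dictionary; `decode_screenshot` and `save_screenshot` are hypothetical helper names, and the image format of the decoded bytes is not stated in this reference, so the output file extension is left to the caller.

```python
import base64
from pathlib import Path
from typing import Optional


def decode_screenshot(response: dict) -> Optional[bytes]:
    """Return the raw screenshot bytes, or None if no screenshot was returned."""
    encoded = response.get("metadata", {}).get("screenshot")
    if encoded is None:
        return None
    return base64.b64decode(encoded)


def save_screenshot(response: dict, path: str) -> Optional[Path]:
    """Write the decoded screenshot to `path`; return the path, or None if absent."""
    raw = decode_screenshot(response)
    if raw is None:
        return None
    target = Path(path)
    target.write_bytes(raw)
    return target
```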
Create a remote browser session that provides a Chrome DevTools Protocol (CDP) URL for connecting to a remote browser instance. This allows you to run browser automation on remote infrastructure.
POST https://api.agentql.com/v1/tetra/sessions
curl -X POST https://api.agentql.com/v1/tetra/sessions \
  -H "X-API-Key: $AGENTQL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "browser_ua_preset": "windows"
  }'
Note: Make sure to replace $AGENTQL_API_KEY with your actual API key.
json
{
  "session_id": "ca7947a1-a188-4391-be82-fb968ce4df4a",
  "cdp_url": "wss://ca7947a1-a188-4391-be82-fb968ce4df4a.tetra.agentql.com",
  "base_url": "https://ca7947a1-a188-4391-be82-fb968ce4df4a.tetra.agentql.com"
}
Authentication
All requests to the AgentQL API must include an X-API-Key header with your API key. You can generate an API key through the Dev Portal.
browser_ua_preset
string (optional)
User agent preset to simulate different operating systems. Supported values: windows, macos, linux.
cdp_url
string
Chrome DevTools Protocol URL for connecting to the remote browser using Playwright or similar automation tools. Pass this URL to Playwright's connect_over_cdp method to connect to the remote browser.
base_url
string
Base URL for accessing browser session resources and streaming endpoints.
session_id
string
Unique identifier for the browser session.
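Creating a session from Python might look like the sketch below, which uses only the standard library. The helper names are illustrative, and sending an empty JSON object when `browser_ua_preset` is omitted is an assumption; only the three preset values listed above are taken from this reference.

```python
import json
import urllib.request
from typing import Optional

SESSIONS_URL = "https://api.agentql.com/v1/tetra/sessions"
UA_PRESETS = ("windows", "macos", "linux")


def build_session_request(
    api_key: str, browser_ua_preset: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a browser session."""
    if browser_ua_preset is not None and browser_ua_preset not in UA_PRESETS:
        raise ValueError(f"browser_ua_preset must be one of {UA_PRESETS}")
    # Assumption: an empty JSON object is acceptable when no preset is chosen.
    body = {} if browser_ua_preset is None else {"browser_ua_preset": browser_ua_preset}
    return urllib.request.Request(
        SESSIONS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def create_session(
    api_key: str, browser_ua_preset: Optional[str] = None, timeout: float = 60.0
) -> dict:
    """Send the request; the result should contain session_id, cdp_url and base_url."""
    request = build_session_request(api_key, browser_ua_preset)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The dictionary returned by `create_session` has the same shape as the JSON response shown above, so its `cdp_url` and `base_url` values can be handed to a browser automation tool.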
Once you have created a browser session, you can connect to it using Playwright:
from playwright.async_api import async_playwright

async def use_remote_browser(session_data):
    # session_data is the parsed JSON response from the create-session call
    cdp_url = session_data['cdp_url']
    async with async_playwright() as p:
        # Attach to the remote browser over the Chrome DevTools Protocol
        browser = await p.chromium.connect_over_cdp(cdp_url)
        # Use the browser normally
        page = await browser.new_page()
        await page.goto('https://example.com')
        # View the page in real time (optional)
        streaming_url = f"{session_data['base_url']}/stream/0"
        print(f"View at: {streaming_url}")
Tip: For easier integration, use the AgentQL SDK, which provides convenient wrapper functions.
Python:
from agentql.tools.sync_api import create_browser_session
JavaScript:
import { createBrowserSession } from 'agentql/tools'
See the Remote Browser Guide for complete examples.
Query document
Extract data from a document by sending a PDF or image (JPEG, JPG, PNG) file and an AgentQL query. Learn about the consumption logic for querying documents here. For this example, use the following example file.
Note: The query_document function consumes 1 API call per image (JPG, JPEG, PNG) and 1 API call for each page within a PDF. For example, querying a 10-page PDF takes 10 AgentQL API calls.
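The consumption rule can be expressed as a short calculation, which is handy for estimating usage before submitting a batch of files. This is an illustrative sketch: `api_calls_for_document` is a hypothetical helper, and the PDF page count must be supplied by the caller because this reference does not describe a way to obtain it.

```python
from pathlib import PurePath

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def api_calls_for_document(file_name: str, pdf_page_count: int = 1) -> int:
    """Estimate API calls: 1 per image, 1 per page for a PDF."""
    suffix = PurePath(file_name).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return 1
    if suffix == ".pdf":
        if pdf_page_count < 1:
            raise ValueError("pdf_page_count must be at least 1")
        return pdf_page_count
    raise ValueError(f"Unsupported file type: {suffix or file_name!r}")
```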
POST https://api.agentql.com/v1/query-document
curl -X POST https://api.agentql.com/v1/query-document \
  -H "X-API-Key: $AGENTQL_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/file.pdf" \
  -F 'body="{\"query\": \"{ project { id lowest_bidder lowest_bid } } \", \"params\": { \"mode\": \"fast\" } }" '
Note: Make sure to replace $AGENTQL_API_KEY with your actual API key.
json
{
  "data": {
    "project": {
      "id": "CPM 81031-200202",
      "lowest_bidder": "Toebe Construction LLC",
      "lowest_bid": 13309641.63
    }
  },
  "metadata": {
    "request_id": "ecab9d2c-0212-4b70-a5bc-0c821fb30ae3"
  }
}
Authentication
All requests to the AgentQL API must include an X-API-Key header with your API key. You can generate an API key through the Dev Portal.
The request body for querying documents is a multipart/form-data object that contains a file and a body.
file
string
Path of the file to run the query on.
body
string
A stringified JSON object holding the parameters for the query. It is sent as a string because multipart/form-data fields only accept strings.
  query
  string (alternative to prompt)
  The AgentQL query to execute. Learn more about how to write an AgentQL query in the docs. Note: You must define either a query or a prompt to use AgentQL.
  prompt
  string (alternative to query)
  A natural-language description of the data to query the document for. AgentQL infers the data structure from your prompt. Note: You must define either a query or a prompt to use AgentQL.
  params
  object (optional)
  Parameters for the query.
    mode
    string
    Specifies the extraction mode: standard for complex or high-volume data, or fast for typical use cases. Defaults to fast.
data
object
Data that matches the query.
metadata
object
  request_id
  string
  A UUID for the request.
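To show how the file and the stringified body fit into one multipart/form-data request, here is a minimal standard-library sketch. The helper names are hypothetical, and the per-file content type guessed by `mimetypes` is an assumption about what the server accepts; the field names `file` and `body` come from the curl example above.

```python
import json
import mimetypes
import urllib.request
import uuid
from typing import Tuple

DOCUMENT_URL = "https://api.agentql.com/v1/query-document"


def encode_multipart(file_name: str, file_bytes: bytes, body: dict) -> Tuple[str, bytes]:
    """Return (Content-Type header value, encoded payload) for the two form fields."""
    boundary = uuid.uuid4().hex
    # Assumption: guess the file's content type from its extension.
    file_type = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    # The body field carries the query parameters as a JSON string.
    body_part = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="body"\r\n\r\n'
        f"{json.dumps(body)}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", file_header + file_bytes + body_part


def build_document_request(
    api_key: str, file_name: str, file_bytes: bytes, body: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request for the query-document endpoint."""
    content_type, payload = encode_multipart(file_name, file_bytes, body)
    return urllib.request.Request(
        DOCUMENT_URL,
        data=payload,
        headers={"X-API-Key": api_key, "Content-Type": content_type},
        method="POST",
    )
```

Passing the resulting request to `urllib.request.urlopen` would send it; the response should then have the `data` and `metadata` shape documented above.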
The query_document function is supported in the Python SDK. Learn how to use it here.