Server path: /brightdata | Type: Application | PCID required: Yes
| Tool | Description |
|---|---|
| brightdata_scrape_website | Scrape a website and return the HTML content |
| brightdata_scrape_serp | Scrape search engine results pages (SERP) |
| brightdata_unlock_website | Unlock and access geo-restricted or blocked websites |
| brightdata_get_datasets | List available Bright Data datasets |
| brightdata_get_scraping_job | Get status and results of a scraping job |
| brightdata_batch_scrape | Submit a batch of URLs for scraping |
brightdata_scrape_website
Scrape a website and return the HTML content
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | — | The URL of the website to scrape |
| datasetId | string | Yes | — | Bright Data dataset ID for the scraping configuration |
| waitForSelector | string | No | — | CSS selector to wait for before scraping |
| screenshot | boolean | No | false | Whether to take a screenshot of the page |
| format | string | No | "html" | Output format for the scraped content: html, text, or json |
{
"type": "object",
"properties": {
"PCID": {
"type": "string",
"description": "Pink Connect ID"
},
"url": {
"type": "string",
"description": "The URL of the website to scrape"
},
"datasetId": {
"type": "string",
"description": "Bright Data dataset ID for the scraping configuration"
},
"waitForSelector": {
"type": "string",
"description": "CSS selector to wait for before scraping"
},
"screenshot": {
"type": "boolean",
"default": false,
"description": "Whether to take a screenshot of the page"
},
"format": {
"type": "string",
"enum": [
"html",
"text",
"json"
],
"default": "html",
"description": "Output format for the scraped content"
}
},
"required": [
"PCID",
"url",
"datasetId"
]
}
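As a minimal sketch of how the schema above maps to a request payload, the helper below assembles the arguments for brightdata_scrape_website and checks `format` against the schema's enum before anything is sent. The function name `build_scrape_website_args` and the placeholder values are assumptions made for illustration; they are not part of the Pinkfish or Bright Data APIs.

```python
from typing import Any, Optional

# Allowed values copied from the "format" enum in the schema above.
ALLOWED_FORMATS = ("html", "text", "json")


def build_scrape_website_args(
    pcid: str,
    url: str,
    dataset_id: str,
    wait_for_selector: Optional[str] = None,
    screenshot: bool = False,
    fmt: str = "html",
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments object for brightdata_scrape_website."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    args: dict[str, Any] = {
        "PCID": pcid,
        "url": url,
        "datasetId": dataset_id,
        "screenshot": screenshot,
        "format": fmt,
    }
    # waitForSelector has no default, so it is only sent when provided.
    if wait_for_selector is not None:
        args["waitForSelector"] = wait_for_selector
    return args


example_args = build_scrape_website_args(
    pcid="YOUR_PCID",              # placeholder, not a real ID
    url="https://example.com",
    dataset_id="YOUR_DATASET_ID",  # placeholder, not a real ID
    wait_for_selector="#main",
    fmt="text",
)
```

Validating the enum locally surfaces a bad `format` value as a clear error on the caller's side rather than as a failed tool call.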
brightdata_scrape_serp
Scrape search engine results pages (SERP)
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | Yes | — | Search query to scrape results for |
| searchEngine | string | No | "google" | Search engine to scrape from: google, bing, yahoo, or duckduckgo |
| country | string | No | "US" | Country code for localized results |
| language | string | No | "en" | Language code for results |
| numResults | number | No | 10 | Number of search results to return (1-100) |
| datasetId | string | Yes | — | Bright Data dataset ID for SERP scraping |
{
"type": "object",
"properties": {
"PCID": {
"type": "string",
"description": "Pink Connect ID"
},
"query": {
"type": "string",
"description": "Search query to scrape results for"
},
"searchEngine": {
"type": "string",
"enum": [
"google",
"bing",
"yahoo",
"duckduckgo"
],
"default": "google",
"description": "Search engine to scrape from"
},
"country": {
"type": "string",
"default": "US",
"description": "Country code for localized results"
},
"language": {
"type": "string",
"default": "en",
"description": "Language code for results"
},
"numResults": {
"type": "number",
"default": 10,
"description": "Number of search results to return (1-100)"
},
"datasetId": {
"type": "string",
"description": "Bright Data dataset ID for SERP scraping"
}
},
"required": [
"PCID",
"query",
"datasetId"
]
}
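The SERP schema carries two constraints that are easy to get wrong: the `searchEngine` enum and the 1-100 range documented for `numResults`. The sketch below checks both while filling in the documented defaults. The helper name `build_scrape_serp_args` and the placeholder IDs are hypothetical.

```python
from typing import Any

# Allowed values copied from the "searchEngine" enum in the schema above.
SEARCH_ENGINES = ("google", "bing", "yahoo", "duckduckgo")


def build_scrape_serp_args(
    pcid: str,
    query: str,
    dataset_id: str,
    search_engine: str = "google",
    country: str = "US",
    language: str = "en",
    num_results: int = 10,
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments object for brightdata_scrape_serp."""
    if search_engine not in SEARCH_ENGINES:
        raise ValueError(f"searchEngine must be one of {SEARCH_ENGINES}")
    # The description documents a 1-100 range for numResults.
    if not 1 <= num_results <= 100:
        raise ValueError("numResults must be between 1 and 100")
    return {
        "PCID": pcid,
        "query": query,
        "datasetId": dataset_id,
        "searchEngine": search_engine,
        "country": country,
        "language": language,
        "numResults": num_results,
    }


serp_args = build_scrape_serp_args(
    pcid="YOUR_PCID",              # placeholder, not a real ID
    query="open source web scrapers",
    dataset_id="YOUR_DATASET_ID",  # placeholder, not a real ID
    search_engine="bing",
    num_results=25,
)
```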
brightdata_unlock_website
Unlock and access geo-restricted or blocked websites
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | — | The URL to unlock and access |
| country | string | No | — | Country code to proxy through |
| userAgent | string | No | — | Custom user agent string |
| datasetId | string | Yes | — | Bright Data dataset ID for the unlock configuration |
{
"type": "object",
"properties": {
"PCID": {
"type": "string",
"description": "Pink Connect ID"
},
"url": {
"type": "string",
"description": "The URL to unlock and access"
},
"country": {
"type": "string",
"description": "Country code to proxy through"
},
"userAgent": {
"type": "string",
"description": "Custom user agent string"
},
"datasetId": {
"type": "string",
"description": "Bright Data dataset ID for the unlock configuration"
}
},
"required": [
"PCID",
"url",
"datasetId"
]
}
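Neither optional parameter of brightdata_unlock_website has a default, so one reasonable approach is to leave unset options out of the payload entirely instead of sending nulls. The sketch below does that; `build_unlock_website_args` and the sample values are hypothetical.

```python
from typing import Any, Optional


def build_unlock_website_args(
    pcid: str,
    url: str,
    dataset_id: str,
    country: Optional[str] = None,
    user_agent: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments object for brightdata_unlock_website."""
    args: dict[str, Any] = {"PCID": pcid, "url": url, "datasetId": dataset_id}
    optional = {"country": country, "userAgent": user_agent}
    # Only include optional fields the caller actually set.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


unlock_args = build_unlock_website_args(
    pcid="YOUR_PCID",              # placeholder, not a real ID
    url="https://example.com/region-locked",
    dataset_id="YOUR_DATASET_ID",  # placeholder, not a real ID
    country="DE",
)
```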
brightdata_get_datasets
List available Bright Data datasets
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| type | string | No | — | Filter datasets by type: web_scraper, serp_scraper, or social_scraper |
{
"type": "object",
"properties": {
"PCID": {
"type": "string",
"description": "Pink Connect ID"
},
"type": {
"type": "string",
"enum": [
"web_scraper",
"serp_scraper",
"social_scraper"
],
"description": "Filter datasets by type"
}
},
"required": [
"PCID"
]
}
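Because `type` is the only filter and it is optional, a call to brightdata_get_datasets needs nothing beyond the PCID. A small sketch, with the hypothetical helper `build_get_datasets_args`, that validates the filter only when one is given:

```python
from typing import Any, Optional

# Allowed values copied from the "type" enum in the schema above.
DATASET_TYPES = ("web_scraper", "serp_scraper", "social_scraper")


def build_get_datasets_args(pcid: str, dataset_type: Optional[str] = None) -> dict[str, Any]:
    """Hypothetical helper: build the arguments object for brightdata_get_datasets."""
    args: dict[str, Any] = {"PCID": pcid}
    if dataset_type is not None:
        if dataset_type not in DATASET_TYPES:
            raise ValueError(f"type must be one of {DATASET_TYPES}")
        args["type"] = dataset_type
    return args


all_datasets_args = build_get_datasets_args("YOUR_PCID")                   # no filter
serp_datasets_args = build_get_datasets_args("YOUR_PCID", "serp_scraper")  # filtered
```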
brightdata_get_scraping_job
Get status and results of a scraping job
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| jobId | string | Yes | — | Scraping job ID to check |
| includeResults | boolean | No | true | Whether to include scraping results |
{
"type": "object",
"properties": {
"PCID": {
"type": "string",
"description": "Pink Connect ID"
},
"jobId": {
"type": "string",
"description": "Scraping job ID to check"
},
"includeResults": {
"type": "boolean",
"default": true,
"description": "Whether to include scraping results"
}
},
"required": [
"PCID",
"jobId"
]
}
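A status tool like this one is typically called in a loop until the job finishes. The sketch below shows one way to do that under explicit assumptions: `call_tool` is a caller-supplied function that invokes a tool by name (how you reach the server is not covered here), and `is_done` is a caller-supplied predicate, because the response shape is not documented in this section. Both names, and the interval and attempt limits, are hypothetical.

```python
import time
from typing import Any, Callable

# Assumption: the caller provides a function that invokes a tool by name.
ToolCaller = Callable[[str, dict[str, Any]], Any]


def poll_scraping_job(
    call_tool: ToolCaller,
    pcid: str,
    job_id: str,
    is_done: Callable[[Any], bool],
    interval_s: float = 5.0,
    max_attempts: int = 20,
    include_results: bool = True,
) -> Any:
    """Call brightdata_get_scraping_job until is_done(response) is true."""
    args = {"PCID": pcid, "jobId": job_id, "includeResults": include_results}
    for attempt in range(max_attempts):
        response = call_tool("brightdata_get_scraping_job", args)
        if is_done(response):
            return response
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"job {job_id!r} not finished after {max_attempts} attempts")
```

Passing `is_done` in keeps the loop independent of any particular response format; supply a predicate that matches whatever the tool actually returns in your environment.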
brightdata_batch_scrape
Submit a batch of URLs for scraping
Parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | string[] | Yes | — | Array of URLs to scrape |
| datasetId | string | Yes | — | Bright Data dataset ID for batch scraping |
| webhook | string | No | — | Webhook URL to receive results when complete |
{
"type": "object",
"properties": {
"PCID": {
"type": "string",
"description": "Pink Connect ID"
},
"urls": {
"type": "array",
"items": {
"type": "string"
},
"description": "Array of URLs to scrape"
},
"datasetId": {
"type": "string",
"description": "Bright Data dataset ID for batch scraping"
},
"webhook": {
"type": "string",
"description": "Webhook URL to receive results when complete"
}
},
"required": [
"PCID",
"urls",
"datasetId"
]
}
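For batch submissions, a little client-side cleanup of the URL list can be worthwhile. The sketch below removes duplicate URLs while preserving order and rejects an empty list. Both behaviors are design choices assumed here, not requirements stated by the schema, and the helper name `build_batch_scrape_args` is hypothetical.

```python
from typing import Any, Iterable, Optional


def build_batch_scrape_args(
    pcid: str,
    urls: Iterable[str],
    dataset_id: str,
    webhook: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments object for brightdata_batch_scrape."""
    # dict.fromkeys drops duplicates and keeps first-seen order.
    unique_urls = list(dict.fromkeys(urls))
    if not unique_urls:
        raise ValueError("urls must contain at least one URL")
    args: dict[str, Any] = {"PCID": pcid, "urls": unique_urls, "datasetId": dataset_id}
    # webhook has no default, so it is only sent when provided.
    if webhook is not None:
        args["webhook"] = webhook
    return args


batch_args = build_batch_scrape_args(
    pcid="YOUR_PCID",              # placeholder, not a real ID
    urls=[
        "https://example.com/a",
        "https://example.com/b",
        "https://example.com/a",   # duplicate, removed below
    ],
    dataset_id="YOUR_DATASET_ID",  # placeholder, not a real ID
    webhook="https://example.com/hooks/scrape-done",
)
```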