What can you do with it?

Crawl Map provides the easiest way to go from a single URL to a map of the entire website. This is useful when you need to prompt the end user to choose which links to scrape, need a quick list of the links on a website, want to scrape only the pages related to a specific topic (using the search parameter), or only need to scrape specific pages of a website.

How to use it?

Basic Command Structure

/crawl-map [urls] [includeSubdomains] [ignoreSitemap] [search]

Parameters

Required:

  • urls - Array of starting URLs (must include at least one fully qualified http/https URL)

Optional:

  • includeSubdomains - Boolean to include subdomains in the map (default: true)

  • ignoreSitemap - Boolean to skip using the sitemap for discovery (default: false)

  • search - Search term to filter results for specific content (e.g., “Foscarini”)

Response Format

The command returns:

{
  "https://lights.com/": {
    "success": true,
    "links": [
      "https://lights.com",
      "https://lights.com/pages/contact",
      "https://lights.com/pages/about-us",
      "https://lights.com/pages/help",
      "https://lights.com/pages/return-policy",
      "https://lights.com/pages/warranty-policy"
    ]
  }
}
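
For downstream processing, here is a minimal Python sketch that flattens the per-URL response into a single list of links (it assumes the response has already been parsed into a dict shaped like the example above; all names are illustrative):

# Flatten the per-start-URL response into one list of discovered links.
response = {
    "https://lights.com/": {
        "success": True,
        "links": [
            "https://lights.com",
            "https://lights.com/pages/contact",
        ],
    }
}

all_links = []
for start_url, result in response.items():
    if result.get("success"):
        all_links.extend(result.get("links", []))

print(f"Discovered {len(all_links)} links")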

Examples

Basic Usage

/crawl-map
urls: ["https://lights.com/"]

Maps the entire website starting from the homepage with default settings.

Advanced Usage

/crawl-map
urls: ["https://lights.com/"]
includeSubdomains: true
ignoreSitemap: false
search: "Foscarini"

Maps the website including subdomains, uses the sitemap, and filters results to pages containing “Foscarini”.

Specific Use Case

/crawl-map
urls: ["https://example.com/"]
includeSubdomains: false
search: "products"

Maps only the main domain without subdomains and filters for pages containing “products”.

Notes

The tool sends requests to an internal endpoint, SCRAPER_FC_URL + 'scraperfc/map'. The response includes a success flag and an array of discovered links for each starting URL.
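
As a rough illustration only, a request to that endpoint might look like the Python sketch below. The HTTP method, payload shape, and exact URL joining are assumptions inferred from the documented parameters, not confirmed endpoint behavior:

import os
import requests

# Assumption: SCRAPER_FC_URL ends with a trailing slash.
base_url = os.environ["SCRAPER_FC_URL"]
endpoint = base_url + "scraperfc/map"

payload = {
    "urls": ["https://lights.com/"],
    "includeSubdomains": True,   # default: true
    "ignoreSitemap": False,      # default: false
    "search": "Foscarini",       # optional content filter
}

resp = requests.post(endpoint, json=payload, timeout=120)  # POST is an assumption
resp.raise_for_status()
site_map = resp.json()  # {start_url: {"success": bool, "links": [...]}}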

URL Discovery

  • Map entire websites
  • Follow internal links
  • Discover hidden pages
  • Generate comprehensive lists
  • Identify site structure

Filtering Options

  • Include or exclude subdomains (includeSubdomains)
  • Use or ignore the sitemap (ignoreSitemap)
  • Search for pages containing specific keywords (search): brand mentions, topic-based discovery, and product pages

Example Commands

Basic Website Map

/crawl-map create full map of "https://example.com/"

Include Subdomains

/crawl-map map "https://company.com/" with all subdomains included

Search-Filtered Map

/crawl-map find all pages mentioning "products" on "https://store.com/"

Ignore Sitemap

/crawl-map map "https://site.com/" without using sitemap data

Multi-URL Mapping

/crawl-map generate maps for multiple starting URLs

Parameters

Required Parameters

  • urls: Array of starting URLs
    • Must include at least one valid URL
    • URLs should be fully qualified (http/https); see the validation sketch after this list

Optional Parameters

  • includeSubdomains: Include subdomain pages (default: true)
  • ignoreSitemap: Skip sitemap.xml parsing (default: false)
  • search: Filter pages containing specific keywords
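
A small pre-flight check implied by these rules, as a Python sketch (the helper name is illustrative):

from urllib.parse import urlparse

def validate_start_urls(urls):
    # Require at least one URL, each fully qualified with http/https.
    if not urls:
        raise ValueError("urls must include at least one URL")
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a fully qualified http/https URL: {url}")

validate_start_urls(["https://lights.com/"])  # passes
# validate_start_urls(["lights.com"])         # would raise ValueError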

Response Structure

Success Response

{
  "https://lights.com/": {
    "success": true,
    "links": [
      "https://lights.com",
      "https://lights.com/pages/contact",
      "https://lights.com/pages/about-us",
      "https://lights.com/pages/help"
    ]
  }
}

Error Handling

  • success: Boolean indicating operation status
  • links: Array of discovered URLs
  • Error messages for failed operations
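
One way these fields might be consumed, as a sketch (the "error" field name is an assumption; check the actual payload of a failed entry):

# Split a mapping response into successes and failures.
site_map = {
    "https://lights.com/": {"success": True, "links": ["https://lights.com"]},
    "https://unreachable.example/": {"success": False, "error": "timeout"},  # illustrative
}

succeeded, failed = {}, []
for start_url, result in site_map.items():
    if result.get("success"):
        succeeded[start_url] = result.get("links", [])
    else:
        failed.append((start_url, result.get("error", "unknown error")))

print("ok:", list(succeeded))
print("failed:", failed)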

Use Cases

Web Scraping Preparation

/crawl-map map target site before scraping specific pages

Content Discovery

/crawl-map find all product pages on e-commerce site

Site Auditing

/crawl-map generate complete site structure for analysis

Competitive Research

/crawl-map discover competitor website structure and pages

SEO Analysis

/crawl-map map site to identify all indexable pages

Subdomain Handling

Include Subdomains (true)

  • Maps blog.example.com
  • Maps shop.example.com
  • Maps support.example.com
  • Comprehensive coverage

Exclude Subdomains (false)

  • Only main domain
  • Faster mapping
  • Focused results
  • Reduced scope

Sitemap Integration

Use Sitemap (ignoreSitemap: false)

  • Leverages sitemap.xml
  • Faster discovery
  • Official page list
  • May miss pages not listed in the sitemap

Ignore Sitemap (ignoreSitemap: true)

  • Manual link following
  • Discovers unlisted pages
  • More thorough crawling
  • Finds hidden content

Search Filtering

  • Filter by page content
  • Brand mentions
  • Product names
  • Topic relevance

Search Examples

/crawl-map find "contact" pages on company website
/crawl-map discover "pricing" related pages
/crawl-map locate "support" documentation

Best Practices

  1. Start Small

    • Test with single URLs first
    • Verify results before scaling
    • Check the site's robots.txt
    • Respect rate limits

  2. Use Filters Wisely

    • Apply search terms for focus
    • Include subdomains when needed
    • Consider sitemap usage
    • Balance speed vs completeness

  3. Plan Your Scraping

    • Map before scraping
    • Identify target pages
    • Prioritize important content
    • Avoid unnecessary pages

  4. Monitor Performance

    • Large sites take time
    • Check for timeouts
    • Handle failed URLs
    • Validate results
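
For the performance points in item 4, here is a sketch of timing a mapping request and surfacing timeouts explicitly (endpoint and payload as in the Notes sketch; the 300-second budget is arbitrary):

import time
import requests

endpoint = "https://internal.example/scraperfc/map"  # placeholder, see Notes
payload = {"urls": ["https://lights.com/"]}

start = time.monotonic()
try:
    resp = requests.post(endpoint, json=payload, timeout=300)
    resp.raise_for_status()
except requests.Timeout:
    print("mapping timed out; large sites may need a bigger time budget")
except requests.RequestException as exc:
    print(f"mapping failed: {exc}")
else:
    print(f"mapped in {time.monotonic() - start:.1f}s")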

Common Patterns

E-commerce Mapping

/crawl-map find all product pages on online store

Blog Discovery

/crawl-map map blog subdomain for all articles

Documentation Crawl

/crawl-map discover all help and support pages

Brand Research

/crawl-map find pages mentioning specific brand names

Error Handling

Common Issues

  • Invalid URLs
  • Network timeouts
  • Access restrictions
  • Large site limits

Best Practices

  • Validate URLs before mapping
  • Handle partial failures
  • Check success flags
  • Retry failed operations
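
A sketch of the retry pattern suggested above, assuming a map_urls(urls) helper that wraps the request shown in the Notes section and returns the documented response shape (attempt counts and backoff are arbitrary):

import time

def retry_failed(map_urls, urls, attempts=3, backoff=2.0):
    # Re-map only the start URLs whose entries came back unsuccessful.
    pending = list(urls)
    results = {}
    for attempt in range(attempts):
        if not pending:
            break
        response = map_urls(pending)             # assumed helper, see Notes
        pending = []
        for url, result in response.items():
            if result.get("success"):
                results[url] = result
            else:
                pending.append(url)
        if pending:
            time.sleep(backoff * (attempt + 1))  # simple linear backoff
    return results, pending                      # pending = URLs still failing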

Performance Considerations

Speed Factors

  • Site size affects time
  • Subdomain inclusion impacts speed
  • Search filtering adds processing
  • Network conditions matter

Optimization Tips

  • Use specific starting URLs
  • Apply filters early
  • Limit subdomain scope
  • Monitor response times

Tips

  • Always validate starting URLs before mapping
  • Use search parameters to focus on relevant content
  • Include subdomains for comprehensive coverage
  • Check robots.txt and respect crawling guidelines
  • Plan scraping strategy based on discovered URLs