Scraping Commands Guide
Learn how to use the scraper skill in your automations
Introduction
The scraper allows you to navigate to a particular location on a website and bring back the contents. In the simplest case, you give the scraper a URL and it returns the contents of that page as text, optionally with a screenshot.
- Using actions, you can click buttons, navigate menus, and even loop through filling out forms with data you’ve collected in a previous step.
- Using AI Vision, you can answer questions about content on the page or structure the response you get back in a format that you specify.
Authentication:
- Using browser connections that you've created in Connections, you can log in automatically.
- Alternatively, using secrets you've stored in the datastore, you can fill in username, password, and API key details without needing to copy/paste those details into your automation as plaintext.
Basic Scraping
Scrape a web page with the /scraper command followed by one or more URLs:
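For example (example.com is a placeholder):

```
/scraper https://example.com
```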
Scrape and specify the return format:
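A minimal sketch, assuming format options are appended after the URL:

```
/scraper https://example.com format: html
```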
Format Options
- format: markdown - Returns content in readable text format (default)
- format: html - Returns HTML content
Content Filtering
- include: article, .main-content, #content - Specify elements to include
- exclude: nav, .sidebar, #footer - Specify elements to exclude
- mainContent: false - Include all content (default is true, which focuses on main content)
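For example, to keep the article body while dropping navigation and the sidebar (the selectors are placeholders, and appending filters after the URL is an assumption):

```
/scraper https://example.com include: article exclude: nav, .sidebar
```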
Available Actions
Use actions when you need to navigate to a certain location before doing your scrape.
Action Types
- Navigation Actions
  - goTo: [URL] - Navigate to a specific page
  - wait: [milliseconds] - Pause for specified time
  - scrollBottom - Scroll to bottom of page
  - scrollTo: [selector] - Scroll to specific element
- Interactive Actions
  - click: [text or selector] - Click on an element
  - hover: [text or selector] - Move mouse over element
  - fill: [field] with [text] - Enter text into input field
  - select: [option] - Choose from dropdown menu
  - check: [checkbox] - Toggle checkbox/radio button
  - press: [key] - Press keyboard key (Enter, Tab, etc.)
- Advanced Actions
  - screenshot: options: [fullpage (true|false), jpeg|png, quality (0-100)] - Capture screenshot
  - vision: [prompt, options] - Use AI to retrieve or analyze page content, with the same options as screenshot; defaults to jpeg, quality 75
  - runcode: [Playwright code] - Execute custom Playwright code
  - loop: [actions] - Repeat actions with input data
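Putting these together, here is a sketch of a scrape that navigates before extracting. The URLs are placeholders, and the multi-line action layout is an assumption:

```
/scraper https://example.com
actions:
- goTo: https://example.com/products
- click: "Load more"
- wait: 2000
- scrollBottom
```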
Waiting
Waiting between actions is a common pattern in scraping and often necessary. Use the wait action to pause for a specified number of milliseconds. If something isn't working, try adding a wait between actions.
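For instance, pausing between two clicks:

```
actions:
- click: "Load more"
- wait: 3000
- click: "Load more"
```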
Examples
Basic Navigation:
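A minimal sketch (the URLs are placeholders):

```
/scraper https://example.com
actions:
- goTo: https://example.com/blog
- scrollBottom
- wait: 1000
```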
Form Filling:
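A sketch using fill and click; the field labels and values are placeholders:

```
/scraper https://example.com/contact
actions:
- fill: "Name" with "Jane Smith"
- fill: "Email" with "jane@example.com"
- click: "Submit"
```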
Working with Dropdown Lists:
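A sketch using select; the dropdown and option names are placeholders:

```
/scraper https://example.com/search
actions:
- click: "Category"
- select: "Electronics"
- press: Enter
```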
Advanced Features
Loop Example:
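A sketch that repeats a pair of actions to page through results (whether loop accepts a plain repeat count like this is an assumption):

```
/scraper https://example.com/results
actions:
- loop: 3
  - click: "Next"
  - wait: 1000
```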
Loop with Data Example: First assemble data into an array of objects. You might do this in a previous step and then refer to the output of that step. Let’s say that your data output for Step 1 looks like this:
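For instance, a hypothetical array of objects:

```
[
  { "name": "Alice Johnson", "email": "alice@example.com" },
  { "name": "Bob Lee", "email": "bob@example.com" }
]
```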
Then you would write your scraper in Step 2 like this:
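A sketch; the {{ }} syntax for referencing the previous step's output and its fields is an assumption:

```
/scraper https://example.com/signup
actions:
- loop: {{Step 1 output}}
  - fill: "Name" with {{name}}
  - fill: "Email" with {{email}}
  - click: "Submit"
```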
Logging Into Websites
Using Browser Connections

If you've created a browser connection, you can use it with the scraper. When you're creating your prompt, first select the saved browser connection and then give your scraper prompt. The example below assumes that you've created a browser connection called "salesforceLogin".
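A sketch, assuming the "salesforceLogin" connection has already been selected and using a placeholder destination URL:

```
/scraper https://mycompany.lightning.force.com/lightning/o/Report/home
```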
Note that when you use a browser connection, you are already logged in (just as you would be if you had logged in manually and then visited the page in question), so you don't need to go to the login page first. You can jump right to your destination URL.
Using Vault

You can use items stored in Vault in combination with scraping to fill in username, password, and API key fields in order to avoid entering your credentials into your automation in plain text.
Assuming you’ve previously stored an item in Vault called “salesforceLogin”, you could run the scraper with your Vault item as follows:
In this case, you are first logging in and then going to your destination page. Note that if the site requires 2FA or a one-time password, this technique won't work with the scraper; use a browser connection (above) instead.
Parent Scoping
Specify a parent container to narrow down element selection:
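For example, using the "in" syntax covered under Combined Approaches below:

```
/scraper https://example.com
actions:
- click: "Submit" in .form-container
```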
AI Vision
Use Vision AI to extract specific information:
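A sketch; the URL and prompt are placeholders:

```
/scraper https://example.com/pricing
actions:
- vision: "List each plan name and its monthly price"
```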
Batch Processing
Process multiple pages at once:
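Because /scraper accepts one or more URLs, you can pass several in a single command (placeholder URLs shown):

```
/scraper https://example.com/page1 https://example.com/page2 https://example.com/page3
```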
Common Examples
- Login and Extract Content
- Shopping Cart Process
- Content Analysis
- Form Submission
Selector Types
When targeting elements on a page, you can use several different methods:
- By Text Content
  - click: "Login" - Clicks element containing exact text "Login"
  - click: "Sign up for free" - Works with longer phrases too
- By Placeholder Text
  - fill: "Search..." with "laptops" - Targets input field with placeholder text
  - click: "Enter your username" - Works with any placeholder text
- CSS Selectors
  - click: #login-button - Using ID (#)
  - click: .submit-btn - Using class name (.)
  - click: button.primary - Element type with class
  - click: .form > .submit - Using hierarchy
  - click: [data-testid="submit"] - Using attributes
- XPath Selectors
  - click: //button[@type="submit"] - Button with specific type
  - click: //div[@class="menu"]//a - Links within menu div
  - click: //h1[contains(text(), "Welcome")] - Heading containing text
  - click: //label[text()="Password"]/following-sibling::input - Input field after label
  - click: //*[@id="main"]//button[2] - Second button in main container
- Combined Approaches
  - click: "Submit" in .form-container - Text within specific container
  - click: "Login" in #auth-modal - Text within ID
  - hover: "Menu" in .nav-bar - Combining text and class
  - click: "Next" in //div[@class="pagination"] - Text within XPath-selected container
Remember:
- Text matching is case-sensitive
- For elements with spaces in selectors, use quotes: ".main content"
- When using text, prefer exact matches over partial ones
- CSS selectors provide simple targeting for basic needs
- XPath provides more powerful selection capabilities:
  - Can traverse up the DOM tree
  - Can select elements based on their contents
  - Supports complex relationships between elements
- Always try to use the most specific selector that works
Happy scraping!