Browseagent provides 12 browser automation tools organized into 4 categories. These tools let AI applications navigate pages, inspect page structure, interact with elements, and capture results.

Tool Categories

Core Workflow Pattern

Most browser automation follows this pattern:
  1. Navigate to the target page
  2. Snapshot to understand page structure
  3. Interact with specific elements using refs from snapshot
  4. Screenshot or capture results
navigate → snapshot → interact → screenshot/results
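As a rough sketch, the four steps can be written as an ordered list of tool-call payloads in the same JSON shape used on this page. The `tool_call` helper is a hypothetical convenience, not part of Browseagent, and in a real session the ref in step 3 would be read from the snapshot result rather than hard-coded.

```python
import json

def tool_call(name, **arguments):
    """Build a tool-call payload: {"name": ..., "arguments": {...}}."""
    return {"name": name, "arguments": arguments}

# navigate -> snapshot -> interact -> screenshot
workflow = [
    tool_call("browser_navigate", url="https://example.com"),
    tool_call("browser_snapshot"),
    tool_call("browser_click", element="Submit button", ref="button#submit-btn"),
    tool_call("browser_screenshot", fullPage=True),
]

# Serialize each payload the way it would be sent to the tool server.
for call in workflow:
    print(json.dumps(call))
```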

Essential Tools for Getting Started

browser_navigate

Navigate to any website:
{
  "name": "browser_navigate",
  "arguments": {
    "url": "https://example.com"
  }
}

browser_snapshot

Get page structure and element references:
{
  "name": "browser_snapshot",
  "arguments": {}
}
Returns structured HTML in which elements carry a ref attribute. Pass that ref to interaction tools such as browser_click to target the element.

browser_click

Click page elements using refs from snapshot:
{
  "name": "browser_click", 
  "arguments": {
    "element": "Submit button",
    "ref": "button#submit-btn"
  }
}

browser_screenshot

Capture visual results:
{
  "name": "browser_screenshot",
  "arguments": {
    "fullPage": true
  }
}

Element Reference System

Browseagent uses a reference system to reliably target page elements:
  1. Take snapshot to get current page structure
  2. Find target element in snapshot output
  3. Use the ref attribute for precise targeting
  4. Provide a human-readable description (the element argument) for context
Example snapshot output:
<button id="login-btn" ref="button#login-btn">Login</button>
<input type="email" ref="input[type='email']" placeholder="Email"/>
Use these refs in interaction tools:
{
  "element": "Login button", 
  "ref": "button#login-btn"
}
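One way to pull refs out of snapshot output like the example above is Python's standard `html.parser`. This is a minimal sketch that assumes the snapshot arrives as an HTML string; `RefCollector` is an illustrative name, not a Browseagent API.

```python
from html.parser import HTMLParser

class RefCollector(HTMLParser):
    """Collect (tag, ref) pairs for every element that has a ref attribute."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> also reach this method.
        ref = dict(attrs).get("ref")
        if ref is not None:
            self.refs.append((tag, ref))

snapshot = (
    '<button id="login-btn" ref="button#login-btn">Login</button>\n'
    '<input type="email" ref="input[type=\'email\']" placeholder="Email"/>'
)

collector = RefCollector()
collector.feed(snapshot)
collector.close()

# Reuse the first collected ref as the arguments for an interaction tool.
login_args = {"element": "Login button", "ref": collector.refs[0][1]}
```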

Common Usage Patterns

Form Automation

  1. Navigate to page with form
  2. Snapshot to see form structure
  3. Type into input fields using refs
  4. Click submit button
  5. Screenshot results
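The form steps above could be assembled as follows. This page does not name the typing tool, so `browser_type` and its `text` argument are assumptions here, and `form_calls` is a hypothetical helper. Refs are passed in directly for brevity; in practice they would come from the snapshot result.

```python
def tool_call(name, **arguments):
    """Build a tool-call payload: {"name": ..., "arguments": {...}}."""
    return {"name": name, "arguments": arguments}

def form_calls(url, fields, submit):
    """Return the ordered tool calls for filling in and submitting a form.

    fields: list of (description, ref, text) tuples, one per input.
    submit: (description, ref) tuple for the submit button.
    """
    calls = [tool_call("browser_navigate", url=url), tool_call("browser_snapshot")]
    for description, ref, text in fields:
        # ASSUMPTION: "browser_type" and "text" are illustrative names.
        calls.append(tool_call("browser_type", element=description, ref=ref, text=text))
    calls.append(tool_call("browser_click", element=submit[0], ref=submit[1]))
    calls.append(tool_call("browser_screenshot", fullPage=True))
    return calls

calls = form_calls(
    "https://example.com/login",
    [("Email field", "input[type='email']", "user@example.com")],
    ("Login button", "button#login-btn"),
)
```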

Data Extraction

  1. Navigate to data source page
  2. Screenshot for visual confirmation
  3. Snapshot to get structured data
  4. Extract specific information from snapshot
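For step 4, a small parser can collect the text of every element with a given tag. The sketch below assumes the snapshot is an HTML string; the sample markup and the `TextByTag` class are illustrative only.

```python
from html.parser import HTMLParser

class TextByTag(HTMLParser):
    """Collect the text content of each element with the given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # nesting depth inside matching elements
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            if self.depth == 1:
                self.items.append("")  # start a new item at the outermost match

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.items[-1] += data

# Hypothetical snapshot fragment.
snapshot = (
    '<ul><li ref="li:nth-child(1)">Alpha <b>1</b></li>'
    '<li ref="li:nth-child(2)">Beta</li></ul>'
)

parser = TextByTag("li")
parser.feed(snapshot)
parser.close()
items = [text.strip() for text in parser.items]
```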

Multi-step Workflows

  1. Navigate to starting page
  2. For each step:
    • Snapshot current state
    • Interact with elements
    • Wait if needed for page changes
  3. Screenshot final results
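The loop above might be driven by a function like this one. `send` stands in for whatever transport delivers a tool call and returns its result; it is hypothetical, as is the `time` argument passed to browser_wait, whose real parameters are not documented on this page.

```python
def run_workflow(send, start_url, steps):
    """Navigate, then snapshot and interact once per step, then screenshot.

    send:  callable taking a tool-call dict and returning the tool result.
    steps: list of dicts with "call" (a tool-call dict) and an optional
           "wait" entry holding arguments for browser_wait.
    """
    send({"name": "browser_navigate", "arguments": {"url": start_url}})
    for step in steps:
        send({"name": "browser_snapshot", "arguments": {}})
        send(step["call"])
        if "wait" in step:
            send({"name": "browser_wait", "arguments": step["wait"]})
    return send({"name": "browser_screenshot", "arguments": {"fullPage": True}})

# Recording stub so the sketch runs without a browser.
log = []

def fake_send(call):
    log.append(call["name"])
    return "ok"

click_next = {"name": "browser_click",
              "arguments": {"element": "Next button", "ref": "button#next"}}
result = run_workflow(
    fake_send,
    "https://example.com",
    # ASSUMPTION: {"time": 1} is an illustrative browser_wait argument.
    [{"call": click_next}, {"call": click_next, "wait": {"time": 1}}],
)
```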

Error Handling

Element Not Found

If an element ref doesn’t work:
  1. Take a new snapshot (page may have changed)
  2. Find the updated ref for your target element
  3. Retry the interaction
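These recovery steps can be wrapped in a retry helper. The sketch assumes a hypothetical `send` callable that raises `ElementNotFound` when a ref no longer matches; how Browseagent actually reports that failure is not specified here.

```python
class ElementNotFound(Exception):
    """Hypothetical error raised by send() when a ref matches nothing."""

def click_with_refresh(send, element, ref, find_ref, retries=1):
    """Click ref; on failure, take a new snapshot, look up the ref, retry."""
    for attempt in range(retries + 1):
        try:
            return send({"name": "browser_click",
                         "arguments": {"element": element, "ref": ref}})
        except ElementNotFound:
            if attempt == retries:
                raise
            snapshot = send({"name": "browser_snapshot", "arguments": {}})
            ref = find_ref(snapshot)

# Stub: the old ref fails, and the fresh snapshot exposes the new one.
log = []

def fake_send(call):
    log.append(call["name"])
    if call["name"] == "browser_snapshot":
        return '<button ref="button#login-btn">Login</button>'
    if call["arguments"]["ref"] == "button#old":
        raise ElementNotFound("button#old")
    return "clicked " + call["arguments"]["ref"]

def first_ref(snapshot):
    # Naive lookup: take the first ref="..." value in the snapshot.
    return snapshot.split('ref="')[1].split('"')[0]

result = click_with_refresh(fake_send, "Login button", "button#old", first_ref)
```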

Page Loading Issues

If pages don’t load completely:
  1. Use browser_wait to allow loading time
  2. Take screenshot to visually verify page state
  3. Retry snapshot once page is fully loaded
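A polling version of these steps might look like the sketch below. The `send` callable, the `is_loaded` check, and the `time` argument for browser_wait are all assumptions made for illustration.

```python
def snapshot_when_loaded(send, is_loaded, wait_args, attempts=3):
    """Wait, screenshot, and snapshot until is_loaded(snapshot) is true."""
    for _ in range(attempts):
        send({"name": "browser_wait", "arguments": wait_args})
        # The screenshot is for visual verification of the page state.
        send({"name": "browser_screenshot", "arguments": {"fullPage": True}})
        snapshot = send({"name": "browser_snapshot", "arguments": {}})
        if is_loaded(snapshot):
            return snapshot
    raise TimeoutError(f"page not loaded after {attempts} attempts")

# Stub: the first snapshot still shows a loading placeholder.
log = []
snapshots = iter(["<div>Loading</div>", '<main ref="main">Done</main>'])

def fake_send(call):
    log.append(call["name"])
    if call["name"] == "browser_snapshot":
        return next(snapshots)
    return None

page = snapshot_when_loaded(
    fake_send,
    is_loaded=lambda html: "Loading" not in html,
    wait_args={"time": 2},  # ASSUMPTION: illustrative argument name
)
```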

Connection Issues

If tools return connection errors:
  1. Verify Chrome extension is connected
  2. Check extension popup status
  3. Restart browser or reconnect extension