> For the complete documentation index, see [llms.txt](https://whitepaper.aitech.io/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://whitepaper.aitech.io/agentforge/tools/browser_use.md).

# browser\_use

[BrowserUse](https://browser-use.com/) enables agents to run **browser automation tasks** with natural-language instructions. It simulates human-like browsing to navigate sites, click elements, fill forms, scrape data, and execute multi-step workflows with optional live monitoring and structured results.

### Usage Instructions

Use this block when your agent needs to interact with the live web (research, form submission, scraping, testing).

{% stepper %}
{% step %}
Describe the task in plain English (the **`task`**).
{% endstep %}

{% step %}
(Optional) Provide **`variables`** (secrets/values the steps can use).
{% endstep %}

{% step %}
Choose whether to **`save_browser_data`** (persist cookies/session).
{% endstep %}

{% step %}
(Optional) Select **`model`** for reasoning (defaults to `gpt-4o`).
{% endstep %}

{% step %}
Provide your **`apiKey`**. (You can get it from here: <https://cloud.browser-use.com/>)
{% endstep %}

{% step %}
Run the block → it executes asynchronously and returns a **task `id`**, **`success`** status, **`output`**, and **`steps`** taken.
{% endstep %}
{% endstepper %}

**Great for**

* Automating repetitive web flows (login → search → click → extract).
* Collecting structured data from pages.
* Filing forms or tickets across internal tools.
* End-to-end smoke tests of web apps.

### Tools

#### `browser_use_run_task`

Run a single browser automation task.

**Input**

| Parameter           | Type    | Required | Description                                                                       |
| ------------------- | ------- | -------- | --------------------------------------------------------------------------------- |
| `task`              | string  | Yes      | Natural-language instruction describing what the browser should do.               |
| `variables`         | json    | No       | Key–value pairs available to the task (e.g., credentials, query terms).           |
| `save_browser_data` | boolean | No       | Persist session/browser data (cookies, history). Default: `false`.                |
| `model`             | string  | No       | LLM to use for reasoning (e.g., `gpt-4o`, `gemini-2.0-flash`). Default: `gpt-4o`. |
| `apiKey`            | string  | Yes      | BrowserUse API key (must be valid and funded).                                    |

**Output**

| Parameter | Type    | Description                                                                 |
| --------- | ------- | --------------------------------------------------------------------------- |
| `id`      | string  | Execution identifier for the task (useful for logs/support).                |
| `success` | boolean | Whether the task completed successfully.                                    |
| `output`  | any     | Raw result (e.g., extracted data, confirmation text, structured payload).   |
| `steps`   | json    | Detailed step list (visited URLs, DOM actions, extracted elements, timing). |

### Examples

#### Example 1 — Research summary

* **Task:** “Go to google.com, search ‘latest genAI eval frameworks’, open top 3 reputable results, summarize key differences.”
* **Variables:** `{ "region": "US" }` (optional)
* **Outcome:** `output` contains a concise comparison; `steps` shows search → click → parse.

#### Example 2 — Form submission

* **Task:** “Open example.com/login, sign in with provided credentials, go to /submit, fill and submit the form, confirm the success message.”
* **Variables:** `{ "username": "user", "password": "pass" }`
* **Outcome:** `success: true` and `output` includes confirmation; `steps` logs DOM actions.

#### Example 3 — Data extraction

* **Task:** “Visit site.com/pricing, extract the plan names, monthly prices, and feature lists into structured JSON.”
* **Outcome:** `output` returns structured data for downstream use.

### Best Practices

* **Keep instructions clear and bounded:** Provide goals, constraints, and what to return (e.g., “return JSON with fields X, Y, Z”).
* **Use variables for sensitive data:** Pass secrets via `variables` to keep prompts clean.
* **Persist sessions when needed:** Enable `save_browser_data` for flows that benefit from cookies (e.g., staying logged in).
* **Ask for structure:** If extracting data, tell the agent the exact fields/shape to return.
* **Fail fast with signal:** If a selector or page changes, ensure the task returns a helpful error in `output`.

### Troubleshooting

| Symptom                        | Likely Cause                          | What to Try                                                               |
| ------------------------------ | ------------------------------------- | ------------------------------------------------------------------------- |
| `success: false`, empty output | Site changed layout or selector       | Refine the `task` with clearer steps; specify buttons, forms, or URLs.    |
| Login keeps failing            | Session not persisted                 | Set `save_browser_data: true`. Pass credentials via `variables`.          |
| Rate-limit or bot detection    | Aggressive navigation/scraping        | Slow down steps; reduce frequency; add polite delays; target fewer pages. |
| Unexpected model behavior      | Model too weak/fast for complex flows | Set `model` to `gpt-4o` (default) or a more capable reasoning model.      |

### Notes

* **Category:** `tools`
* **Type:** `browser_use`
* The block executes tasks asynchronously and returns once the run completes (it polls for completion under the hood).


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://whitepaper.aitech.io/agentforge/tools/browser_use.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
