browser_use
BrowserUse enables agents to run browser automation tasks with natural-language instructions. It simulates human-like browsing to navigate sites, click elements, fill forms, scrape data, and execute multi-step workflows with optional live monitoring and structured results.
Usage Instructions
Use this block when your agent needs to interact with the live web (research, form submission, scraping, testing).
Describe the task in plain English (task).
(Optional) Provide variables (secrets/values the steps can use).
Choose whether to save_browser_data (persist cookies/session).
(Optional) Select model for reasoning (defaults to gpt-4o).
Provide your apiKey (available from https://cloud.browser-use.com/).
Run the block → it executes asynchronously and returns a task id, success status, output, and steps taken.
Great for
Automating repetitive web flows (login → search → click → extract).
Collecting structured data from pages.
Filing forms or tickets across internal tools.
End-to-end smoke tests of web apps.
Tools
browser_use_run_task
Run a single browser automation task.
Input
task (string, required): Natural-language instruction describing what the browser should do.
variables (json, optional): Key–value pairs available to the task (e.g., credentials, query terms).
save_browser_data (boolean, optional): Persist session/browser data (cookies, history). Default: false.
model (string, optional): LLM to use for reasoning (e.g., gpt-4o, gemini-2.0-flash). Default: gpt-4o.
apiKey (string, required): BrowserUse API key (must be valid and funded).
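The inputs above can be assembled and checked before the block runs. The following is a minimal sketch, not part of the block itself: build_run_task_input is a hypothetical helper name, and the only facts it relies on are the parameter names, required flags, and defaults listed above.

```python
from typing import Any, Optional


def build_run_task_input(
    task: str,
    api_key: str,
    variables: Optional[dict[str, Any]] = None,
    save_browser_data: bool = False,
    model: str = "gpt-4o",
) -> dict[str, Any]:
    """Assemble the inputs for browser_use_run_task (hypothetical helper)."""
    # task and apiKey are the two required inputs.
    if not task.strip():
        raise ValueError("task is required and must not be empty")
    if not api_key.strip():
        raise ValueError("apiKey is required and must not be empty")

    payload: dict[str, Any] = {
        "task": task,
        "save_browser_data": save_browser_data,  # documented default: false
        "model": model,  # documented default: gpt-4o
        "apiKey": api_key,
    }
    # variables is optional, so include it only when the caller supplied some.
    if variables:
        payload["variables"] = variables
    return payload
```

Calling build_run_task_input("Open example.com and return the page title", "my-key") yields a payload with the documented defaults and no variables entry.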
Output
id (string): Execution identifier for the task (useful for logs/support).
success (boolean): Whether the task completed successfully.
output (any): Raw result (e.g., extracted data, confirmation text, structured payload).
steps (json): Detailed step list (visited URLs, DOM actions, extracted elements, timing).
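A downstream step will usually branch on success before touching output. The sketch below shows one way to do that, assuming the result arrives as a dictionary with the four fields listed above and that steps is a list; summarize_result is a hypothetical helper name, not something the block provides.

```python
from typing import Any


def summarize_result(result: dict[str, Any]) -> str:
    """Turn a block result into a one-line summary for logs (hypothetical helper)."""
    task_id = result.get("id", "<unknown>")
    # Assumption: steps is a list of step records; treat a missing value as empty.
    steps = result.get("steps") or []
    if not result.get("success", False):
        # On failure, output may describe the error, or it may be empty.
        reason = result.get("output") or "no output returned"
        return f"task {task_id} failed after {len(steps)} step(s): {reason}"
    return f"task {task_id} succeeded in {len(steps)} step(s)"
```

Including the id in the summary keeps the log line usable when contacting support about a specific run.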
Examples
Example 1 — Research summary
Task: “Go to google.com, search ‘latest genAI eval frameworks’, open top 3 reputable results, summarize key differences.”
Variables: { "region": "US" } (optional)
Outcome: output contains a concise comparison; steps shows search → click → parse.
Example 2 — Form submission
Task: “Open example.com/login, sign in with provided credentials, go to /submit, fill and submit the form, confirm the success message.”
Variables: { "username": "user", "password": "pass" }
Outcome: success is true and output includes the confirmation; steps logs the DOM actions.
Example 3 — Data extraction
Task: “Visit site.com/pricing, extract the plan names, monthly prices, and feature lists into structured JSON.”
Outcome: output returns structured data for downstream use.
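Because output is typed as any, structured data from an extraction task may arrive either as already-parsed JSON or as a JSON string. The sketch below normalizes both cases under that assumption; parse_structured_output is a hypothetical helper name, and the plan fields in the usage note are illustrative only.

```python
import json
from typing import Any


def parse_structured_output(output: Any) -> Any:
    """Return output as Python data, decoding it first if it is a JSON string."""
    if isinstance(output, (dict, list)):
        # Already structured; nothing to decode.
        return output
    if isinstance(output, str):
        try:
            return json.loads(output)
        except json.JSONDecodeError as exc:
            # Free-text output (e.g., a confirmation message) is reported clearly.
            raise ValueError(f"output is not valid JSON: {exc}") from exc
    raise ValueError(f"unsupported output type: {type(output).__name__}")
```

For example, parse_structured_output('[{"plan": "Pro", "monthly_price": 20}]') returns a list containing one dictionary, while a plain-text output raises ValueError so the caller can fall back or retry.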
Best Practices
Keep instructions clear and bounded: Provide goals, constraints, and what to return (e.g., “return JSON with fields X, Y, Z”).
Use variables for sensitive data: Pass secrets via variables to keep prompts clean.
Persist sessions when needed: Enable save_browser_data for flows that benefit from cookies (e.g., staying logged in).
Ask for structure: If extracting data, tell the agent the exact fields/shape to return.
Fail fast with signal: If a selector or page changes, ensure the task returns a helpful error in output.
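Several of the practices above (bounded instructions, an explicit return shape, a helpful error) can be baked into the task text itself. The following is a rough sketch of a prompt builder; build_extraction_task is a hypothetical helper, and the wording of the generated instruction is only one possible phrasing, not a format the block requires.

```python
def build_extraction_task(url: str, fields: list[str]) -> str:
    """Compose a bounded extraction instruction with an explicit return shape."""
    if not fields:
        raise ValueError("at least one field name is required")
    field_list = ", ".join(fields)
    return (
        f"Visit {url} and extract the following fields for each item: {field_list}. "
        f"Return only JSON: a list of objects with exactly these keys: {field_list}. "
        # Ask for a structured error so a layout change produces a useful signal.
        "If the page layout prevents extraction, return a JSON object with a single "
        "key 'error' describing what was missing."
    )
```

The resulting string is passed as task; any credentials the flow needs still go in variables rather than in this text.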
Troubleshooting
Problem: success: false, empty output
Likely cause: Site changed layout or selector.
Fix: Refine the task with clearer steps; specify buttons, forms, or URLs.
Problem: Login keeps failing
Likely cause: Session not persisted.
Fix: Set save_browser_data: true. Pass credentials via variables.
Problem: Rate-limit or bot detection
Likely cause: Aggressive navigation/scraping.
Fix: Slow down steps; reduce frequency; add polite delays; target fewer pages.
Problem: Unexpected model behavior
Likely cause: Model too weak/fast for complex flows.
Fix: Set model to gpt-4o (default) or a more capable reasoning model.
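When runs fail because of rate limits, spacing out retries is one way to apply the "polite delays" advice from the caller's side. The sketch below retries with a doubling delay; run_task is a hypothetical callable standing in for however your agent invokes the block, and the attempt count and delays are arbitrary illustrative values.

```python
import time
from typing import Any, Callable


def run_with_backoff(
    run_task: Callable[[dict[str, Any]], dict[str, Any]],
    payload: dict[str, Any],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], Any] = time.sleep,
) -> dict[str, Any]:
    """Retry a failed run, doubling the wait between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    result: dict[str, Any] = {}
    for attempt in range(1, max_attempts + 1):
        result = run_task(payload)  # hypothetical invocation of the block
        if result.get("success"):
            return result
        if attempt < max_attempts:
            sleep(delay)  # wait before trying again
            delay *= 2
    # All attempts failed; return the last result so the caller can inspect it.
    return result
```

Retrying does not help when the cause is a changed layout or a vague task, so it is worth inspecting steps and output from the failed run before adding retries.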
Notes
Category: tools
Type: browser_use
The block executes tasks asynchronously and returns once the run completes (it polls for completion under the hood).
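The polling behavior described above follows a common pattern: check a status repeatedly until the run is finished or a time limit passes. The sketch below illustrates that general pattern only; it is not the block's actual implementation, and fetch_status, the "done" key, the interval, and the timeout are all hypothetical.

```python
import time
from typing import Any, Callable


def wait_for_completion(
    fetch_status: Callable[[], dict[str, Any]],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], Any] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Poll fetch_status until it reports completion or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()  # hypothetical status lookup
        if status.get("done"):  # "done" is an assumed field name
            return status
        if clock() >= deadline:
            raise TimeoutError("task did not complete before the timeout")
        sleep(interval)
```

Since the block already waits for completion, callers do not need this loop; it is shown only to make the "returns once the run completes" behavior concrete.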
