# firecrawl

[Firecrawl](https://firecrawl.dev/) is a web scraping and content extraction API. Its Agent Forge integration lets developers extract clean, structured content from websites and convert web pages into formats such as Markdown and HTML while preserving the main content.

With Firecrawl in Agent Forge, you can:

* **Extract clean content**: Remove ads, navigation elements, and other distractions to get just the main content
* **Convert to structured formats**: Transform web pages into Markdown, HTML, or JSON
* **Capture metadata**: Extract SEO metadata, Open Graph tags, and other page information
* **Handle JavaScript-heavy sites**: Process content from modern web applications that rely on JavaScript
* **Filter content**: Focus on specific parts of a page using CSS selectors
* **Process at scale**: Handle high-volume scraping needs with a reliable API
* **Search the web**: Perform intelligent web searches and retrieve structured results
* **Crawl entire sites**: Crawl multiple pages from a website and aggregate their content

In Agent Forge, the Firecrawl integration enables your agents to access and process web content programmatically as part of their workflows. Supported operations include:

* **Scrape**: Extract structured content (Markdown, HTML, metadata) from a single web page.
* **Search**: Search the web for information using Firecrawl's intelligent search capabilities.
* **Crawl**: Crawl multiple pages from a website, returning structured content and metadata for each page.

This lets your agents gather information from websites, extract structured data, and use it to make decisions or generate insights, without parsing raw HTML or automating a browser. To get started, configure the Firecrawl block with your API key, select the operation (Scrape, Search, or Crawl), and provide the relevant parameters. Your agents then receive web content in a clean, structured format.
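
The configuration step above can be sketched as a pre-flight check that every required parameter is present before the block runs. This is a minimal sketch, not part of Agent Forge: `REQUIRED_PARAMS` and `missing_params` are hypothetical names, and the required fields are copied from the Input tables on this page.

```python
# Hypothetical helper: the required fields mirror the Input tables on this page,
# but the function and dictionary names are illustrative only.
REQUIRED_PARAMS = {
    "scrape": {"url", "apiKey"},
    "search": {"query", "apiKey"},
    "crawl": {"url", "apiKey"},
}


def missing_params(operation: str, params: dict) -> list[str]:
    """Return the required parameters that are absent or empty, sorted by name."""
    if operation not in REQUIRED_PARAMS:
        raise ValueError(f"Unknown Firecrawl operation: {operation!r}")
    return sorted(
        name for name in REQUIRED_PARAMS[operation] if not params.get(name)
    )


search_input = {"query": "open source web crawlers", "apiKey": "YOUR_API_KEY"}
print(missing_params("search", search_input))  # []
print(missing_params("crawl", {"limit": 10}))  # ['apiKey', 'url']
```

An empty list means the input carries everything the selected operation requires; optional parameters such as `limit` are not checked here.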

## Usage Instructions

Use the Firecrawl block to scrape a single page, search the web, or crawl a site. Each operation returns clean, structured data, with options to keep only a page's main content.

## Tools

{% tabs %}
{% tab title="firecrawl\_scrape" %}
Extract structured content from a web page, including its metadata. Converts content to Markdown or HTML and captures SEO metadata, Open Graph tags, and other page information.

#### Input

| Parameter       | Type   | Required | Description                    |
| --------------- | ------ | -------- | ------------------------------ |
| `url`           | string | Yes      | The URL to scrape content from |
| `scrapeOptions` | json   | No       | Options for content scraping   |
| `apiKey`        | string | Yes      | Firecrawl API key              |
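
As an illustration of the parameters above, a scrape input might look like the following. This is a sketch under assumptions: the page does not list the keys accepted inside `scrapeOptions`, so `formats` and `onlyMainContent` are borrowed from Firecrawl's public scrape API and may not match what the block accepts. The environment variable name `FIRECRAWL_API_KEY` is also an assumption.

```python
import json
import os

# Read the key from the environment rather than hard-coding it.
# FIRECRAWL_API_KEY is an assumed variable name, not one defined by Agent Forge.
api_key = os.environ.get("FIRECRAWL_API_KEY", "YOUR_API_KEY")

scrape_input = {
    "url": "https://example.com/blog/post",
    "apiKey": api_key,
    # Assumed option names, borrowed from Firecrawl's public scrape API.
    "scrapeOptions": {
        "formats": ["markdown", "html"],
        "onlyMainContent": True,
    },
}

# scrapeOptions is typed as json, so confirm it serializes cleanly.
serialized = json.dumps(scrape_input["scrapeOptions"], sort_keys=True)
print(serialized)
```

Because `scrapeOptions` is optional, the key can be dropped entirely when the defaults are acceptable.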

#### Output

| Parameter     | Type   | Description           |
| ------------- | ------ | --------------------- |
| `markdown`    | string | Page content markdown |
| `html`        | any    | Raw HTML content      |
| `metadata`    | json   | Page metadata         |
| `data`        | json   | Search results data   |
| `warning`     | any    | Warning messages      |
| `pages`       | json   | Crawled pages data    |
| `total`       | number | Total pages found     |
| `creditsUsed` | number | Credits consumed      |

{% endtab %}

{% tab title="firecrawl\_search" %}
Search the web for information using Firecrawl.

#### Input

| Parameter | Type   | Required | Description             |
| --------- | ------ | -------- | ----------------------- |
| `query`   | string | Yes      | The search query to use |
| `apiKey`  | string | Yes      | Firecrawl API key       |

#### Output

| Parameter     | Type   | Description           |
| ------------- | ------ | --------------------- |
| `markdown`    | string | Page content markdown |
| `html`        | any    | Raw HTML content      |
| `metadata`    | json   | Page metadata         |
| `data`        | json   | Search results data   |
| `warning`     | any    | Warning messages      |
| `pages`       | json   | Crawled pages data    |
| `total`       | number | Total pages found     |
| `creditsUsed` | number | Credits consumed      |

{% endtab %}

{% tab title="firecrawl\_crawl" %}
Crawl a website and extract structured content from its accessible pages, up to the configured `limit`.

#### Input

| Parameter         | Type    | Required | Description                                     |
| ----------------- | ------- | -------- | ----------------------------------------------- |
| `url`             | string  | Yes      | The website URL to crawl                        |
| `limit`           | number  | No       | Maximum number of pages to crawl (default: 100) |
| `onlyMainContent` | boolean | No       | Extract only main content from pages            |
| `apiKey`          | string  | Yes      | Firecrawl API key                               |
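
The crawl parameters above could be assembled with a small builder such as the one below. It is a sketch only: `build_crawl_input` is a hypothetical helper, and the checks it performs (an absolute http(s) URL, a positive integer `limit`) are assumptions about sensible input rather than rules stated by Agent Forge.

```python
DEFAULT_CRAWL_LIMIT = 100  # default stated in the Input table


def build_crawl_input(url, api_key, limit=None, only_main_content=None):
    """Assemble a firecrawl_crawl input, omitting optional fields left unset."""
    # Assumed validation rules; the page does not specify them.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    if limit is not None and (
        isinstance(limit, bool) or not isinstance(limit, int) or limit < 1
    ):
        raise ValueError("limit must be a positive integer")

    payload = {"url": url, "apiKey": api_key}
    if limit is not None:
        payload["limit"] = limit
    if only_main_content is not None:
        payload["onlyMainContent"] = only_main_content
    return payload


crawl_input = build_crawl_input(
    "https://example.com", "YOUR_API_KEY", limit=25, only_main_content=True
)
effective_limit = crawl_input.get("limit", DEFAULT_CRAWL_LIMIT)
print(effective_limit)  # 25
```

Setting a low `limit` while testing keeps a crawl small; when `limit` is omitted, the documented default of 100 pages applies.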

#### Output

| Parameter     | Type   | Description           |
| ------------- | ------ | --------------------- |
| `markdown`    | string | Page content markdown |
| `html`        | any    | Raw HTML content      |
| `metadata`    | json   | Page metadata         |
| `data`        | json   | Search results data   |
| `warning`     | any    | Warning messages      |
| `pages`       | json   | Crawled pages data    |
| `total`       | number | Total pages found     |
| `creditsUsed` | number | Credits consumed      |
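
A downstream step could condense the output fields above into a short summary. The sketch below is illustrative: `summarize_crawl` is a hypothetical function, the sample data is hand-written, and the shape of each `pages` entry (an object with a `markdown` string) is an assumption, since this page only types `pages` as json.

```python
def summarize_crawl(output: dict) -> str:
    """Build a one-line summary from the documented crawl output fields."""
    pages = output.get("pages") or []
    # Assumption: `pages` is a list of objects that may carry a `markdown` string.
    chars = sum(
        len(page.get("markdown") or "") for page in pages if isinstance(page, dict)
    )
    summary = (
        f"{len(pages)} of {output.get('total', len(pages))} pages returned, "
        f"{chars} Markdown characters, {output.get('creditsUsed', 0)} credits used"
    )
    if output.get("warning"):
        summary += f" (warning: {output['warning']})"
    return summary


# Hand-written sample shaped like the Output table; not a real API response.
sample_output = {
    "pages": [{"markdown": "# Home"}, {"markdown": "# About us"}],
    "total": 2,
    "creditsUsed": 2,
    "warning": None,
}
print(summarize_crawl(sample_output))
# 2 of 2 pages returned, 16 Markdown characters, 2 credits used
```

Reading every field with `.get` keeps the summary from failing when a field is absent from a given response.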

{% endtab %}
{% endtabs %}

## Notes

* Category: `tools`
* Type: `firecrawl`


