mistral_parse
The Mistral Parse tool extracts and processes content from PDF documents using Mistral's OCR API. It uses optical character recognition to extract text and structure from PDF files, so you can bring document data into your agent workflows.
With the Mistral Parse tool, you can:
Extract text from PDFs: Accurately convert PDF content to text, markdown, or JSON formats
Process PDFs from URLs: Directly extract content from PDFs hosted online by providing their URLs
Maintain document structure: Preserve formatting, tables, and layout from the original PDFs
Extract images: Optionally include embedded images from the PDFs
Select specific pages: Process only the pages you need from multi-page documents
The Mistral Parse tool is particularly useful for scenarios where your agents need to work with PDF content, such as analyzing reports, extracting data from forms, or processing text from scanned documents. It simplifies the process of making PDF content available to your agents, allowing them to work with information stored in PDFs just as easily as with direct text input.
Usage Instructions
Extract text and structure from PDF documents using Mistral's OCR API. Configure processing options and get the content in your preferred format. URLs must be publicly accessible and point to a valid PDF file. Note: Google Drive, Dropbox, and other cloud storage links are not supported; use a direct download URL from a web server instead.
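The URL constraints above can be checked before a workflow runs. The following Python sketch is not part of the tool: `check_pdf_url` is a hypothetical helper, and the host list is an illustrative, non-exhaustive assumption. It checks only the URL scheme and host; it does not confirm that the file is reachable or is a valid PDF.

```python
from urllib.parse import urlparse

# Illustrative, non-exhaustive list of cloud storage hosts (an assumption,
# not a list published by the tool).
UNSUPPORTED_HOSTS = ("drive.google.com", "docs.google.com", "dropbox.com")


def check_pdf_url(url: str) -> list[str]:
    """Return a list of problems found with a candidate PDF URL (empty if none)."""
    problems = []
    parsed = urlparse(url)

    # The tool expects a URL served by a web server, so require http(s).
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must use http or https")

    # Flag known cloud storage hosts, including their subdomains.
    host = (parsed.hostname or "").lower()
    if any(host == h or host.endswith("." + h) for h in UNSUPPORTED_HOSTS):
        problems.append(f"cloud storage host not supported: {host}")

    return problems
```

An empty list means the URL passed these two checks only; the tool may still reject it if the file is private or is not a PDF.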
Tools
mistral_parser
Parse PDF documents using Mistral OCR API
Input
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| filePath | string | Yes | URL to a PDF document to be processed |
| fileUpload | object | No | File upload data from file-upload component |
| resultType | string | No | Type of parsed result (markdown, text, or json). Defaults to markdown. |
| includeImageBase64 | boolean | No | Include base64-encoded images in the response |
| pages | array | No | Specific pages to process (array of page numbers, starting from 0) |
| imageLimit | number | No | Maximum number of images to extract from the PDF |
| imageMinSize | number | No | Minimum height and width of images to extract from the PDF |
| apiKey | string | Yes | Mistral API key (MISTRAL_API_KEY) |
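As a rough illustration of how these inputs fit together, the Python sketch below assembles an input dictionary using the documented parameter names. `build_parser_input` is a hypothetical helper, not an API provided by the tool, and reading the key from the `MISTRAL_API_KEY` environment variable is an assumption based on the parameter description.

```python
import os
from typing import Any, Optional

# Result types listed in the parameter description.
RESULT_TYPES = ("markdown", "text", "json")


def build_parser_input(
    file_path: str,
    result_type: str = "markdown",
    pages: Optional[list[int]] = None,
    include_image_base64: bool = False,
    api_key: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble and sanity-check an input dict using the documented field names."""
    if result_type not in RESULT_TYPES:
        raise ValueError(f"resultType must be one of {RESULT_TYPES}")

    # Page numbers start from 0, so negative values are never valid.
    if pages is not None and any(p < 0 for p in pages):
        raise ValueError("pages are zero-based and cannot be negative")

    # Assumption: fall back to the MISTRAL_API_KEY environment variable.
    key = api_key or os.environ.get("MISTRAL_API_KEY")
    if not key:
        raise ValueError("apiKey is required")

    payload: dict[str, Any] = {
        "filePath": file_path,
        "resultType": result_type,
        "includeImageBase64": include_image_base64,
        "apiKey": key,
    }
    # Optional fields are only included when the caller sets them.
    if pages is not None:
        payload["pages"] = pages
    return payload
```

For example, `build_parser_input("https://example.com/a.pdf", pages=[0, 2], api_key="...")` requests the first and third pages as markdown.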
Output
| Parameter | Type | Description |
| --- | --- | --- |
| content | string | Extracted content |
| metadata | json | Processing metadata |
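A downstream step might read the two output fields as in the Python sketch below. `read_result` is a hypothetical helper, and because the keys inside `metadata` are not documented here, the sketch treats it as an opaque dictionary.

```python
from typing import Any


def read_result(output: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (content, metadata) from a tool output dict, checking basic types."""
    content = output.get("content")
    if not isinstance(content, str):
        raise TypeError("expected 'content' to be a string")

    # Metadata keys are not specified in this page, so keep them as-is.
    metadata = dict(output.get("metadata") or {})
    return content, metadata
```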
Notes
Category: tools
Type: mistral_parse
