What format does this tool output?

The tool outputs a structured JSON file containing the text chunks along with their corresponding metadata, such as page numbers, heading paths, and bounding boxes.
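The exact schema is not documented on this page, so the following is only a minimal Python sketch of how such a pack might be read. The top-level `chunks` key and the `text`, `page`, `heading_path`, and `bbox` field names are assumptions for illustration; check them against a real output file before relying on them.

```python
import json

# Hypothetical sample of the JSON pack; the real field names may differ.
SAMPLE_PACK = """
{
  "chunks": [
    {
      "text": "Revenue grew 12% year over year.",
      "page": 4,
      "heading_path": ["Annual Report", "Financial Highlights"],
      "bbox": [72.0, 540.0, 523.0, 560.0]
    }
  ]
}
"""


def load_chunks(raw: str) -> list[dict]:
    """Parse a JSON pack and return its list of chunk objects."""
    pack = json.loads(raw)
    return pack.get("chunks", [])


def format_citation(chunk: dict) -> str:
    """Build a human-readable citation from a chunk's (assumed) metadata."""
    heading = " > ".join(chunk.get("heading_path", []))
    return f"{heading} (p. {chunk['page']})"


chunks = load_chunks(SAMPLE_PACK)
citations = [format_citation(c) for c in chunks]
```

A citation string built this way can be stored next to each embedding so that an answer can point back to the section and page it came from.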

What is the difference between heading-aware and element-per-chunk modes?

Heading-aware mode groups content under its respective section titles up to the maximum character limit, while element-per-chunk treats every individual paragraph, list, or table as a separate, isolated chunk.
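To make the difference concrete, here is a rough Python sketch of the two strategies. It is not the tool's actual implementation: the `Element` type, the function names, and the rule for splitting at the character limit are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Element:
    kind: str  # assumed kinds: "heading", "paragraph", "list", "table"
    text: str


def element_per_chunk(elements: list[Element]) -> list[str]:
    """Every non-heading element becomes its own isolated chunk."""
    return [e.text for e in elements if e.kind != "heading"]


def heading_aware(elements: list[Element], max_chars: int) -> list[str]:
    """Group elements under the current heading.

    A new chunk starts when a new heading appears or when adding the next
    element would exceed max_chars. A single element longer than max_chars
    is kept whole in this sketch.
    """
    chunks: list[str] = []
    current = ""
    for e in elements:
        if e.kind == "heading":
            if current:
                chunks.append(current)
            current = e.text
            continue
        candidate = f"{current}\n{e.text}" if current else e.text
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = e.text
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


elements = [
    Element("heading", "Intro"),
    Element("paragraph", "First paragraph."),
    Element("paragraph", "Second paragraph."),
    Element("heading", "Methods"),
    Element("list", "- step one"),
]

isolated = element_per_chunk(elements)
grouped = heading_aware(elements, max_chars=200)
grouped_small = heading_aware(elements, max_chars=25)
```

With a generous limit, each section collapses into one chunk. With a tight limit, the same section is split into several chunks, and only the first carries the heading text.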

Can I control the size of the generated chunks?

Yes, you can set a maximum character limit per chunk, ranging from 200 to 4000 characters, to optimize retrieval performance for your specific vector store.
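If you compute the limit in your own pipeline before filling in the form, a small guard keeps the value inside the supported range. This is a sketch only: the constant and function names are hypothetical, and whether the tool itself clamps or rejects out-of-range values is not stated here.

```python
# Supported range as described above.
MIN_CHARS = 200
MAX_CHARS = 4000


def clamp_max_chars(requested: int) -> int:
    """Return the requested limit, pulled into the 200-4000 range."""
    return max(MIN_CHARS, min(MAX_CHARS, requested))


too_small = clamp_max_chars(50)
in_range = clamp_max_chars(1200)
too_large = clamp_max_chars(10000)
```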

Does the tool extract tables from the PDF?

Yes, as long as the 'Include Table Nodes' option is enabled, the tool will extract tables and include them in the generated RAG chunks.

What are bounding boxes used for in the output?

Bounding boxes provide the exact spatial coordinates of the text on the original PDF page, allowing frontend applications to visually highlight the cited source text for users.
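As an illustration of the highlighting step, the Python sketch below converts a bounding box into a top-left-origin overlay rectangle for a rendered page. It assumes the box is `[x0, y0, x1, y1]` in PDF points with the origin at the bottom-left of the page, which is the usual PDF convention but is not confirmed for this tool's output. The function name and the `scale` parameter are hypothetical.

```python
def bbox_to_overlay(
    bbox: list[float], page_height: float, scale: float = 1.0
) -> dict[str, float]:
    """Convert an assumed bottom-left-origin PDF box to a top-left-origin rect.

    bbox is [x0, y0, x1, y1] in PDF points; scale is rendered pixels per point.
    """
    x0, y0, x1, y1 = bbox
    return {
        "left": x0 * scale,
        # Flip the vertical axis: screen y grows downward, PDF y grows upward.
        "top": (page_height - y1) * scale,
        "width": (x1 - x0) * scale,
        "height": (y1 - y0) * scale,
    }


# US Letter page (792 points tall) rendered at 2 pixels per point.
rect = bbox_to_overlay([72.0, 540.0, 523.0, 560.0], page_height=792.0, scale=2.0)
```

If the output already uses a top-left origin, skip the vertical flip and only apply the scale.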

Elysia Tools

PDF RAG Chunker & Citation Pack

Convert a PDF into heading-aware RAG chunks with page numbers, bounding boxes, and citation metadata

Details

What this tool helps you do

Upload a PDF to generate retrieval-friendly chunks with page references, heading paths, and bounding boxes. The output is a JSON pack that works well for vector stores, answer citation, and PDF-grounded chat systems.

Execution

Run this tool

Fill in the form, run the tool, and review the result in one place.

Prepared example runs

Click an example to fill the form automatically. File inputs still need an upload.

1 example

Prepare a financial report for RAG ingestion

Create chunks with page numbers and bounding boxes so answers can cite the original PDF precisely.

{
  "type": "file",
  "filePath": "/public/samples/json/pdf-rag-chunker-citation-pack-example1.json"
}

Inputs

Set the required fields, then run the tool.

6 options

Files (1): Upload source files for this workflow.

PDF File (file, required)

Supported types: application/pdf

Settings (2): Adjust formats, ranges, numbers, and modes.

Chunk Mode (select, optional)
Maximum Characters Per Chunk (number, optional)

Toggles (3): Enable or disable optional behavior.

Use Struct Tree (checkbox, optional): enabled when checked
Sanitize Sensitive Data (checkbox, optional): enabled when checked
Include Table Nodes (checkbox, optional): enabled when checked

Result

Ready for a run

Run the tool to preview files, text, structured data, or streamed output here.

Examples

1. Prepare a financial report for RAG ingestion

2. Chunking legal contracts with sensitive data sanitization
