PDF Denoise

Remove visual noise from scanned PDF pages — salt-and-pepper speckle, random grain, and faint background haze — using real image-processing algorithms. Text pages are preserved as searchable vector content.

Cleans noisy scanned PDF pages with a pure-JavaScript pipeline (no external binaries required) and genuine image-processing kernels.

Per-page content-aware processing (important):

  • Image pages (scanned documents): rasterized and denoised. This is where noise removal matters.
  • Text pages (including mixed text + image): copied verbatim. Vector text, fonts, and searchability are fully preserved.
  • Empty pages: copied verbatim.

If your scan carries an OCR text layer (so it reads as a "text" page) but the underlying image is still noisy, enable "Rasterize Text Pages" to force processing.
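
The routing rules above can be summarised as a small decision function. This is a minimal sketch, not the tool's actual code: the names PageKind and plan_page_action are hypothetical, and since the page does not document how a page is classified as image, text, or empty, the classification is taken as given.

```python
from enum import Enum


class PageKind(Enum):
    """Hypothetical page classes mirroring the three bullets above."""
    IMAGE = "image"   # scanned, image-only page
    TEXT = "text"     # vector text, including mixed text + image
    EMPTY = "empty"


def plan_page_action(kind: PageKind, rasterize_text: bool = False) -> str:
    """Return "denoise" if the page would be rasterized and cleaned, else "copy"."""
    if kind is PageKind.IMAGE:
        return "denoise"
    # "Rasterize Text Pages" forces text pages through the denoiser.
    if kind is PageKind.TEXT and rasterize_text:
        return "denoise"
    # Text pages (by default) and empty pages are copied verbatim.
    return "copy"
```

With the option off, an OCR'd scan that reads as a text page maps to "copy"; turning it on moves the same page to "denoise".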

Denoise modes (all real algorithms):

  • Auto: 3x3 median filter + isolated-speck despeckle. Balanced cleanup that preserves tone and edges — the recommended default.
  • Median: 3x3 per-channel median filter (1–3 passes). The classic remedy for salt-and-pepper / impulse noise.
  • Binarize: Otsu threshold, chosen automatically from the image's histogram. Collapses faint background haze into clean white and renders foreground as solid black; ideal for legibility of scanned text.
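
To make the Median mode concrete, here is a minimal sketch of a multi-pass 3x3 median filter on a single grayscale plane. It illustrates the general technique rather than the tool's own kernel: the function name median3x3 is hypothetical, and clamping coordinates at the borders is an assumption, since edge handling is not documented here.

```python
def median3x3(gray, passes=1):
    """Apply a 3x3 median filter `passes` times.

    gray: list of rows, each a list of ints in 0..255.
    Border pixels reuse the nearest in-bounds neighbour (assumed behaviour).
    """
    if not gray or not gray[0]:
        return gray
    h, w = len(gray), len(gray[0])
    for _ in range(passes):
        out = [row[:] for row in gray]
        for y in range(h):
            for x in range(w):
                window = []
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy = min(max(y + dy, 0), h - 1)
                        xx = min(max(x + dx, 0), w - 1)
                        window.append(gray[yy][xx])
                window.sort()
                out[y][x] = window[4]  # middle of the 9 sorted samples
        gray = out
    return gray
```

A single outlier pixel never reaches the middle of its sorted window, so isolated salt-and-pepper pixels vanish, while a straight edge keeps its majority value on each side. Extra passes repeat the same step on the previous result, which is where the added softening comes from.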

How it works (image pages):

  1. Each image page is rasterized with pdf.js
  2. The chosen denoise kernel runs on the raw pixel buffer
  3. The cleaned image is embedded into a new PDF
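
Step 2 can be pictured as follows. A canvas pixel buffer is a flat, row-major sequence of 8-bit R, G, B, A values, so a per-channel kernel has to pull each colour plane out, process it, and write it back. The sketch below assumes the rasterized page is available in that RGBA layout; the helper names are hypothetical, and leaving alpha untouched is an assumption.

```python
def split_channel(buf, width, height, channel):
    """Extract one channel (0=R, 1=G, 2=B) from a flat RGBA buffer as a 2D plane."""
    return [
        [buf[(y * width + x) * 4 + channel] for x in range(width)]
        for y in range(height)
    ]


def apply_per_channel(buf, width, height, kernel):
    """Run `kernel` (2D plane in, 2D plane out) on R, G and B separately."""
    out = bytearray(buf)
    for channel in range(3):  # alpha (index 3) is copied through unchanged
        plane = kernel(split_channel(buf, width, height, channel))
        for y in range(height):
            for x in range(width):
                out[(y * width + x) * 4 + channel] = plane[y][x]
    return out
```

Any plane-to-plane function can be passed as the kernel, for example a median filter for the Median mode.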

Example Results

Auto-Denoise a Noisy Scan

Balanced median + despeckle cleanup of speckled scanned image pages, preserving any vector text pages

Output file: pdf_denoised.pdf
Input parameters:
{ "sourceFile": "/public/samples/pdf/sample-multipage.pdf", "mode": "auto", "strength": 2, "rasterizeText": "false", "pageRange": "" }

Binarize a Faded Scan for Legibility

Applies Otsu thresholding to turn a faint, hazy scan into crisp black-and-white text

Output file: pdf_denoised.pdf
Input parameters:
{ "sourceFile": "/public/samples/pdf/sample-multipage.pdf", "mode": "binarize", "rasterizeText": "false", "pageRange": "1-3" }

Maximum file size: 100MB
Supported formats: application/pdf

Mode: Auto: balanced median + despeckle (preserves tone). Median: best for salt-and-pepper/impulse noise. Binarize: Otsu threshold turns faint backgrounds white and text solid black.

Strength: Number of 3x3 median filter passes (1–3). Higher = stronger noise removal but more softening. Ignored in Binarize mode.

Rasterize Text Pages: By default text pages are preserved as searchable vector content (not denoised). Enable this only for OCR'd scans whose underlying image is noisy, accepting loss of text selectability.

Page Range: Specify pages to denoise (e.g., 1-3,5,7-9). Leave blank to process all pages.

Key Facts

Category: Documents & PDF
Input Types: file, select, number, text
Output Type: file
Sample Coverage: 4
API Ready: Yes

Overview

PDF Denoise is a browser-based utility designed to clean up scanned PDF documents by removing visual noise, such as salt-and-pepper speckles, random grain, and faint background haze. The tool uses real image-processing algorithms to restore clarity to scanned image pages while preserving vector text pages intact to maintain searchability and font quality.

When to Use

  • When scanned PDF documents contain distracting salt-and-pepper noise, grain, or dark speckles that hinder readability.
  • When faded or low-contrast scans need to be converted into high-contrast, crisp black-and-white text.
  • When cleaning up scanned documents that contain a mix of noisy image pages and clean, searchable vector text pages.

How It Works

  • The tool parses the uploaded PDF and identifies image pages versus vector text pages.
  • Image pages are rasterized, and the selected denoising algorithm (Auto, Median, or Otsu Binarization) is applied directly to the pixel buffer.
  • Vector text pages are preserved intact to maintain searchability and font quality, unless forced rasterization is enabled.
  • The processed image pages and preserved text pages are assembled into a single new PDF file.

Use Cases

Cleaning up old, grainy historical document scans to improve legibility.
Preparing scanned contracts or forms with faint backgrounds for OCR processing by binarizing them.
Removing salt-and-pepper noise from scanned textbook pages while keeping the digital text pages sharp.

Examples

1. Removing Speckle Noise from a Scanned Report

Archivist
Background
An archivist has a scanned PDF report filled with distracting salt-and-pepper noise and small black dots across the pages.
Problem
The speckles make the document look unprofessional and hard to read.
How to Use
Upload the PDF, select the 'Auto' denoise mode, set the strength to 2, and run the process.
Example Config
{
  "mode": "auto",
  "strength": 2,
  "rasterizeText": "false"
}
Outcome
The output PDF has clean, speckle-free pages with smooth backgrounds while preserving the original layout.
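
Auto mode pairs the median filter with an isolated-speck despeckle pass. The page does not say how a speck is defined, so the sketch below makes its own assumptions: a speck is a dark pixel (below a hypothetical cutoff of 128) with no dark pixel among its 8 neighbours, and it is replaced with white. The function name despeckle is also hypothetical.

```python
def despeckle(gray, dark=128, fill=255):
    """Whiten dark pixels that have no dark neighbour (assumed speck rule).

    gray: list of rows, each a list of ints in 0..255.
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    out = [row[:] for row in gray]
    for y in range(h):
        for x in range(w):
            if gray[y][x] >= dark:
                continue  # light pixel, nothing to remove
            isolated = True
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and gray[yy][xx] < dark:
                        isolated = False
            if isolated:
                out[y][x] = fill
    return out
```

Under this rule a lone dot disappears, while any stroke at least two pixels long is left alone, which is why a pass like this is gentle on text.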

2. Binarizing a Faint Scan for High Contrast

Legal Assistant
Background
A legal assistant receives a scanned contract that is faint, hazy, and has a grey background, making it difficult to read.
Problem
The text lacks contrast and needs to be converted to crisp black-and-white.
How to Use
Upload the PDF, select the 'Binarize' mode, and specify the page range to process.
Example Config
{
  "mode": "binarize",
  "pageRange": "1-3"
}
Outcome
The faint background haze turns white and the text is rendered in solid black, significantly improving readability.

FAQ

Will this tool make my searchable PDF text unsearchable?

No. By default, vector text pages are copied verbatim, so they stay searchable; only image-only pages are rasterized and denoised. Text selectability is lost only if you enable 'Rasterize Text Pages'.

What is the difference between the Auto and Binarize modes?

Auto mode uses a median filter and despeckling to preserve tones, while Binarize uses Otsu thresholding to turn backgrounds pure white and text solid black.
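
For readers curious about the Binarize side, here is a minimal sketch of Otsu's method on a grayscale plane: it tries every cutoff and keeps the one that maximises the between-class variance of the dark and light groups. The function names are hypothetical, and mapping pixels at or below the threshold to black is an assumed convention.

```python
def otsu_threshold(gray):
    """Return the 0..255 cutoff that maximises between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * hist[i] for i in range(256))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count at or below the candidate cutoff
    sum0 = 0  # sum of intensities at or below the candidate cutoff
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Map pixels at or below the Otsu cutoff to black, the rest to white."""
    t = otsu_threshold(gray)
    return [[0 if v <= t else 255 for v in row] for row in gray]
```

On a page with dark text and a light grey haze, the cutoff lands between the two groups, so every haze value becomes 255 and every text value becomes 0. No tone survives, which is the key contrast with Auto.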

How do I clean a scanned PDF that already has an OCR text layer?

Enable the 'Rasterize Text Pages' option to force the tool to process and denoise the underlying noisy images, though this will remove the text layer.

Can I denoise only specific pages of my PDF?

Yes, you can specify a page range (for example, '1-3,5') to target only the pages that require cleanup.
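
As an illustration of the page-range syntax, the sketch below expands a spec such as 1-3,5,7-9 into a sorted list of page numbers. It is not the tool's parser: the name parse_page_range is hypothetical, and 1-based numbering, tolerance of spaces, and clamping to the document's page count are all assumptions.

```python
def parse_page_range(spec, page_count):
    """Expand a spec like "1-3,5,7-9" into sorted 1-based page numbers.

    A blank spec selects every page. Out-of-range pages are dropped (assumed).
    """
    spec = spec.strip()
    if not spec:
        return list(range(1, page_count + 1))
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
        else:
            lo = hi = int(part)
        if lo > hi:
            raise ValueError(f"descending range: {part!r}")
        for page in range(max(lo, 1), min(hi, page_count) + 1):
            pages.add(page)
    return sorted(pages)
```

For a 10-page document, "1-3,5,7-9" selects pages 1, 2, 3, 5, 7, 8 and 9, and an empty string selects all ten.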

What does the strength setting do?

It controls the number of median filter passes (from 1 to 3) in Auto and Median modes; higher values remove more noise but may soften the image.

API Documentation

Request Endpoint

POST /en/api/tools/pdf-denoise

Request Parameters

Parameter Name | Type | Required | Description
sourceFile | file (upload required) | Yes | -
mode | select | Yes | Auto: balanced median + despeckle (preserves tone). Median: best for salt-and-pepper/impulse noise. Binarize: Otsu threshold turns faint backgrounds white and text solid black.
strength | number | No | Number of 3x3 median filter passes (1–3). Higher = stronger noise removal but more softening. Ignored in Binarize mode.
rasterizeText | select | No | By default text pages are preserved as searchable vector content (not denoised). Enable this only for OCR'd scans whose underlying image is noisy, accepting loss of text selectability.
pageRange | text | No | Specify pages to denoise (e.g., 1-3,5,7-9). Leave blank to process all pages.

File parameters must be uploaded before the tool is called: POST the file to /upload/pdf-denoise to obtain a filePath, then pass that filePath in the corresponding file field (here, sourceFile).
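
The second half of that flow, assembling the tool request once a filePath is in hand, might look like the sketch below. Several things here are assumptions rather than documented behaviour: that the endpoint accepts a JSON body, that the Median mode is spelled "median" (only "auto" and "binarize" appear in the examples), and the helper name build_denoise_request. No network call is made.

```python
import json

ENDPOINT = "/en/api/tools/pdf-denoise"
MODES = {"auto", "median", "binarize"}  # "median" spelling is assumed


def build_denoise_request(file_path, mode="auto", strength=None,
                          rasterize_text=False, page_range=""):
    """Return (endpoint path, JSON body) for a denoise call.

    file_path is the filePath returned by the earlier upload step.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    body = {
        "sourceFile": file_path,
        "mode": mode,
        # The sample configs pass this select value as a string.
        "rasterizeText": "true" if rasterize_text else "false",
        "pageRange": page_range,
    }
    # strength is ignored in Binarize mode, so it is left out there.
    if mode != "binarize" and strength is not None:
        if not 1 <= strength <= 3:
            raise ValueError("strength must be between 1 and 3")
        body["strength"] = strength
    return ENDPOINT, json.dumps(body)
```

Validating mode and strength on the client keeps obviously bad requests from ever reaching the server.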

Response Format

{
  "filePath": "/public/processing/randomid.ext",
  "fileName": "output.ext",
  "contentType": "application/octet-stream",
  "size": 1024,
  "metadata": {
    "key": "value"
  },
  "error": "Error message (optional)",
  "message": "Notification message (optional)"
}
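
A client consuming this response would typically check the optional error field before using filePath. The sketch below shows one way to do that; the names ToolError and parse_tool_response are hypothetical, and treating any non-empty error value as a failure is an assumption.

```python
import json


class ToolError(Exception):
    """Raised when the response carries a non-empty "error" field."""


def parse_tool_response(raw):
    """Parse the JSON response text and return the output file details."""
    data = json.loads(raw)
    if data.get("error"):
        raise ToolError(data["error"])
    return {
        "file_path": data["filePath"],
        "file_name": data.get("fileName"),
        "content_type": data.get("contentType"),
        "size": data.get("size"),
    }
```

The returned file_path is the value to fetch the denoised PDF from, or to pass on to another tool.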

AI MCP Documentation

Add this tool to your MCP server configuration:

{
  "mcpServers": {
    "elysiatools-pdf-denoise": {
      "name": "pdf-denoise",
      "description": "Remove visual noise from scanned PDF pages — salt-and-pepper speckle, random grain, and faint background haze — using real image-processing algorithms. Text pages are preserved as searchable vector content.",
      "baseUrl": "https://elysiatools.com/mcp/sse?toolId=pdf-denoise",
      "command": "",
      "args": [],
      "env": {},
      "isActive": true,
      "type": "sse"
    }
  }
}

You can chain multiple tools, e.g.: `https://elysiatools.com/mcp/sse?toolId=png-to-webp,jpg-to-webp,gif-to-webp`, max 20 tools.
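
Building the chained URL by hand is error-prone, so a small helper can enforce the 20-tool limit. This is a sketch with a hypothetical function name; it joins tool ids with plain commas to match the example URL above and does no escaping, on the assumption that tool ids contain only URL-safe characters.

```python
MCP_BASE = "https://elysiatools.com/mcp/sse?toolId="
MAX_TOOLS = 20


def build_mcp_url(tool_ids):
    """Return the SSE URL for one or more chained tool ids (at most 20)."""
    ids = [tool_id.strip() for tool_id in tool_ids if tool_id.strip()]
    if not ids:
        raise ValueError("at least one tool id is required")
    if len(ids) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools can be chained")
    return MCP_BASE + ",".join(ids)
```

Calling it with just pdf-denoise reproduces the baseUrl shown in the configuration above.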

Supports URL file links or Base64 encoding for file parameters.

If you encounter any issues, please contact us at [email protected]