PDF Word Count

Count words, characters, sentences and CJK characters in PDF documents

Detailed content statistics report: Latin words, CJK characters, characters, sentences, lines, paragraphs, per-page breakdown and top frequent words.

Example Results

1 example

Count words in a multi-page PDF

Get word, character and sentence statistics with a per-page breakdown.

{
  "totalWords": 72,
  "latinWords": 72,
  "cjkCharacters": 0,
  "charactersWithSpaces": 420,
  "pages": 6
}
Input parameters:
{ "sourceFile": "/public/samples/pdf/sample-multipage.pdf", "includePageBreakdown": true, "topFrequentWords": 10 }


Maximum file size: 100 MB
Supported formats: application/pdf


Key Facts

Category: Documents & PDF
Input Types: file, checkbox, number
Output Type: json
Sample Coverage: 4
API Ready: Yes

Overview

The PDF Word Count tool analyzes the text of a PDF document and reports Latin words, CJK characters, total character counts, sentences, lines, paragraphs, and a list of the most frequent words, optionally with a per-page breakdown.

When to Use

  • When you need to verify the exact word count of a PDF manuscript or essay for submission guidelines.
  • When analyzing bilingual or multilingual documents containing both Latin and CJK (Chinese, Japanese, Korean) characters.
  • When you require a page-by-page breakdown of text density and a list of the most frequently used terms in a PDF.

How It Works

  • Upload your PDF document using the file selector.
  • Choose whether to include a page-by-page breakdown and specify the number of top frequent words to extract.
  • Click the process button to generate a detailed JSON report containing word, character, sentence, and paragraph statistics.

Use Cases

  • Academic writers checking if their PDF paper complies with journal word limits.
  • Translators estimating translation costs based on CJK character and Latin word counts.
  • Content editors analyzing keyword density and word frequency in PDF drafts.

Examples

1. Verifying Academic Paper Word Count

Academic Researcher
Background
A researcher needs to submit a journal article in PDF format and must ensure it does not exceed the 8,000-word limit, excluding references.
Problem
The PDF contains mixed text, tables, and references, making it difficult to get an accurate count of body text words per page.
How to Use
Upload the draft PDF, enable the page breakdown option, and set the top frequent words to 15 to check for repetitive terminology.
Example Config
includePageBreakdown: true, topFrequentWords: 15
Outcome
The tool generates a JSON report showing the exact word count per page, allowing the researcher to isolate the main body text and verify compliance.
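
Once the per-page counts are in hand, the compliance check is plain arithmetic. The following is a minimal sketch, assuming the per-page word counts have already been read out of the report into a list. The field names of the breakdown are not shown on this page, so none are assumed here, and the page counts below are made up for illustration.

```python
def body_word_count(words_per_page, body_pages):
    """Sum word counts over the 1-based page numbers that hold body text."""
    return sum(words_per_page[p - 1] for p in body_pages)


def within_limit(words_per_page, body_pages, limit=8000):
    """Return (total, ok) for the selected pages against a word limit."""
    total = body_word_count(words_per_page, body_pages)
    return total, total <= limit


# Hypothetical 5-page draft in which page 5 holds only references.
counts = [1900, 2100, 2050, 1800, 600]
total, ok = within_limit(counts, body_pages=range(1, 5))
print(total, ok)
```

Passing the body pages explicitly keeps the decision about what counts as "references" with the researcher rather than with the script.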

2. Analyzing Bilingual Localization Files

Localization Manager
Background
A localization manager receives a bilingual English-Chinese PDF manual and needs to estimate translation and review costs.
Problem
Standard word counters fail to distinguish between English words and Chinese characters, leading to inaccurate quotes.
How to Use
Upload the bilingual PDF and run the analysis; CJK characters are counted separately from Latin words without any extra option.
Example Config
includePageBreakdown: false, topFrequentWords: 0
Outcome
The manager receives a precise breakdown of Latin words and CJK characters, enabling accurate budgeting.
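
A quote can then be derived from the two counts. This is a rough sketch, assuming the report exposes the `latinWords` and `cjkCharacters` fields shown in the example result; the rates and the counts are hypothetical placeholders, not real pricing.

```python
from decimal import Decimal


def estimate_cost(report, rate_per_latin_word, rate_per_cjk_char):
    """Price Latin text per word and CJK text per character."""
    latin = report["latinWords"]
    cjk = report["cjkCharacters"]
    # Decimal avoids binary floating-point rounding in money arithmetic.
    return latin * Decimal(rate_per_latin_word) + cjk * Decimal(rate_per_cjk_char)


# Hypothetical counts and rates, for illustration only.
report = {"latinWords": 1200, "cjkCharacters": 3000}
print(estimate_cost(report, "0.10", "0.05"))
```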

FAQ

Can this tool count Chinese, Japanese, and Korean characters?

Yes, it specifically counts CJK characters separately from Latin words to ensure accurate statistics for multilingual documents.
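
To illustrate what counting the two scripts separately can look like, here is a small sketch. It is not the tool's implementation: the Unicode ranges and the word pattern are assumptions chosen for this example, and the tool may classify edge cases differently.

```python
import re

# Assumed ranges: CJK ideographs (incl. Extension A), kana, Hangul syllables.
CJK_CHAR = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]")
# Runs of basic and accented Latin letters, allowing inner apostrophes or hyphens.
LATIN_WORD = re.compile(
    r"[A-Za-z\u00c0-\u024f]+(?:['\u2019-][A-Za-z\u00c0-\u024f]+)*"
)


def count_mixed(text):
    """Count Latin words and CJK characters independently."""
    return {
        "latinWords": len(LATIN_WORD.findall(text)),
        "cjkCharacters": len(CJK_CHAR.findall(text)),
    }


print(count_mixed("Hello world 你好世界 こんにちは"))
```

CJK text has no spaces between words, which is why it is counted per character while Latin text is counted per word.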

Does the tool support page-by-page word counts?

Yes, by enabling the page breakdown option, you will receive a detailed count of words and characters for each individual page.

How does the tool handle word frequency?

You can specify the number of top frequent words (up to 100) to list, helping you identify the most common terms in your document.
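
The idea behind a top-N word list can be sketched as follows. This is an illustration only: whether the tool lowercases words or filters stop words is not stated on this page, so the normalization below is an assumption.

```python
import re
from collections import Counter

MAX_TOP_WORDS = 100  # upper bound stated in the FAQ


def top_frequent_words(text, n):
    """Return up to n (word, count) pairs, most frequent first; n <= 0 disables."""
    if n <= 0:
        return []
    # Assumed normalization: lowercase, ASCII letters only.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(min(n, MAX_TOP_WORDS))


print(top_frequent_words("The cat and the hat and the bat", 2))
```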

Is there a file size limit for the PDF?

Yes, the tool supports PDF files up to 100 MB in size.

What metrics are included in the final report?

The report includes total words, Latin words, CJK characters, characters with spaces, sentences, lines, paragraphs, the top frequent words, and an optional per-page breakdown.

API Documentation

Request Endpoint

POST /en/api/tools/pdf-word-count

Request Parameters

| Parameter Name | Type | Required | Description |
| --- | --- | --- | --- |
| sourceFile | file (upload required) | Yes | - |
| includePageBreakdown | checkbox | No | - |
| topFrequentWords | number | No | Number of top frequent words to list (0 to disable) |

File parameters must be uploaded first via POST /upload/pdf-word-count to obtain a filePath; then pass that filePath in the corresponding file field.
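
The second step of that flow might be built like this. It is a sketch under several assumptions: the host is taken from the MCP configuration on this page and is assumed to serve the API as well, the body is assumed to be JSON, and `file_path` stands for whatever filePath the upload step returned. The upload request itself is omitted because its form field name is not documented here.

```python
import json
import urllib.request

# Assumption: same host as the MCP endpoint listed on this page.
BASE_URL = "https://elysiatools.com"


def build_count_request(file_path, include_page_breakdown=False, top_frequent_words=0):
    """Build (but do not send) the POST request for the word-count endpoint."""
    payload = {
        "sourceFile": file_path,  # filePath returned by the upload step
        "includePageBreakdown": include_page_breakdown,
        "topFrequentWords": top_frequent_words,
    }
    return urllib.request.Request(
        BASE_URL + "/en/api/tools/pdf-word-count",
        data=json.dumps(payload).encode("utf-8"),  # assumed JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_count_request("/path/from/upload.pdf", True, 10)
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` and decoding the JSON response.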

Response Format

{
  "key": {...},
  "metadata": {
    "key": "value"
  },
  "error": "Error message (optional)",
  "message": "Notification message (optional)"
}
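
A client can separate the result fields from the envelope fields in that template. The sketch below assumes that `"key"` is a placeholder for the tool's own result fields (such as `totalWords`) and that a non-empty `"error"` signals failure; `ToolError` and `unwrap_response` are hypothetical names introduced for this example.

```python
ENVELOPE_KEYS = {"metadata", "error", "message"}


class ToolError(Exception):
    """Raised when the response carries an error message."""


def unwrap_response(body):
    """Split a decoded response into (data, metadata); raise if "error" is set."""
    if body.get("error"):
        raise ToolError(body["error"])
    # Everything that is not an envelope field is treated as result data.
    data = {k: v for k, v in body.items() if k not in ENVELOPE_KEYS}
    return data, body.get("metadata", {})


data, meta = unwrap_response({"totalWords": 72, "metadata": {"key": "value"}})
print(data, meta)
```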

AI MCP Documentation

Add this tool to your MCP server configuration:

{
  "mcpServers": {
    "elysiatools-pdf-word-count": {
      "name": "pdf-word-count",
      "description": "Count words, characters, sentences and CJK characters in PDF documents",
      "baseUrl": "https://elysiatools.com/mcp/sse?toolId=pdf-word-count",
      "command": "",
      "args": [],
      "env": {},
      "isActive": true,
      "type": "sse"
    }
  }
}

You can chain multiple tools, e.g.: `https://elysiatools.com/mcp/sse?toolId=png-to-webp,jpg-to-webp,gif-to-webp`, max 20 tools.
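
A chained URL of that shape can be assembled programmatically. This is a minimal sketch: `build_mcp_url` is a hypothetical helper, and the only rules it encodes are the comma-separated `toolId` list and the 20-tool maximum stated above.

```python
MCP_SSE_URL = "https://elysiatools.com/mcp/sse"
MAX_TOOLS = 20  # limit stated in the documentation


def build_mcp_url(tool_ids):
    """Join tool ids into one comma-separated toolId query parameter."""
    ids = list(tool_ids)
    if not ids:
        raise ValueError("at least one tool id is required")
    if len(ids) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools can be chained, got {len(ids)}")
    # Commas are kept literal, matching the example URL in the documentation.
    return f"{MCP_SSE_URL}?toolId={','.join(ids)}"


print(build_mcp_url(["pdf-word-count"]))
```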

Supports URL file links or Base64 encoding for file parameters.

If you encounter any issues, please contact us at [email protected]