Text Similarity Detector

Calculate similarity percentage between two texts using multiple algorithms including Cosine Similarity, Jaccard Similarity, and Levenshtein Distance

Key Facts

Category: Text Processing
Input Types: textarea, select, checkbox, number
Output Type: text
Sample Coverage: 4
API Ready: Yes

Overview

The Text Similarity Detector calculates how similar two blocks of text are and reports the result as a percentage. It supports three measures: Cosine Similarity, Jaccard Similarity, and Levenshtein Distance.

When to Use

  • Comparing two versions of a document to identify content changes or revisions.
  • Checking for potential plagiarism or duplicate content across different articles.
  • Analyzing the linguistic consistency between two sets of marketing copy or technical descriptions.

How It Works

  • Paste your two text samples into the input fields.
  • Select your preferred algorithm, such as Cosine for vector-based analysis or Levenshtein for character-level edit distance.
  • Adjust optional settings like case sensitivity, whitespace handling, and minimum word length to refine your results.
  • Click the analyze button to generate the similarity score as a percentage.
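
The optional settings in the steps above can be pictured as a preprocessing pass that runs before any algorithm. The following is a minimal Python sketch, not the tool's actual implementation: the function names `normalize` and `tokenize` are hypothetical, and the defaults are assumptions.

```python
import re

def normalize(text, case_sensitive=False, ignore_whitespace=True):
    """Apply the case and whitespace settings to a text (assumed behavior)."""
    if not case_sensitive:
        # Treat 'Apple' and 'apple' as the same word.
        text = text.lower()
    if ignore_whitespace:
        # Collapse runs of spaces, tabs, and newlines into single spaces.
        text = re.sub(r"\s+", " ", text).strip()
    return text

def tokenize(text, min_word_length=1):
    """Split into words and drop those shorter than min_word_length."""
    return [w for w in re.findall(r"\w+", text) if len(w) >= min_word_length]

cleaned = normalize("The  Quick\tbrown\n fox")
print(cleaned)
print(tokenize(cleaned, min_word_length=4))
```

Word-based measures would then work on the token list, while a character-level measure would work on the normalized string.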

Use Cases

  • Academic integrity checks for student submissions.
  • SEO content auditing to avoid duplicate content penalties.
  • Version control verification for legal or technical documentation.

Examples

1. Content Duplication Check

SEO Specialist
Background: A content manager needs to ensure that a new blog post draft is sufficiently unique compared to an existing landing page.
Problem: Determining if the new draft contains too much recycled phrasing from the original site content.
How to Use: Paste the existing page text in the first field and the new draft in the second, then select 'Jaccard Similarity'.
Example Config: algorithm: jaccard, ignoreWhitespace: true, minWordLength: 3
Outcome: The tool returns a 15% similarity score, confirming the new content is unique enough for publication.
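
To make the configuration above concrete, here is a small Python sketch of a Jaccard percentage with a minimum word length of 3. It is an illustration under assumptions, not the tool's code: the helper name `jaccard_percent` is hypothetical, texts are lowercased (the example config does not set `caseSensitive`), and the sample sentences are invented.

```python
import re

def jaccard_percent(text1, text2, min_word_length=3):
    """Share of distinct words common to both texts, as a percentage."""
    def words(text):
        # Lowercase, split on non-word characters, drop short words.
        return {w for w in re.findall(r"\w+", text.lower())
                if len(w) >= min_word_length}

    a, b = words(text1), words(text2)
    if not a and not b:
        return 100.0  # two empty texts are treated as identical
    return round(100 * len(a & b) / len(a | b), 1)

# Invented sample texts, for illustration only.
existing = "Our tool compares two texts and reports a similarity score."
draft = "This guide explains how to compare a draft against older texts."
print(jaccard_percent(existing, draft))
```

Raising `minWordLength` removes short filler words such as "a" or "to" from both sets, so the score reflects overlap in the longer, more distinctive words.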

2. Document Revision Analysis

Legal Assistant
Background: A legal assistant needs to verify that a contract amendment only contains minor edits compared to the original agreement.
Problem: Identifying the extent of changes made to the document structure and wording.
How to Use: Input the original contract and the amended version, selecting 'Levenshtein Distance' to focus on character-level edits.
Example Config: algorithm: levenshtein, caseSensitive: true, ignoreWhitespace: false
Outcome: A high similarity percentage confirms that only minor character-level adjustments were made, saving time on manual review.

FAQ

Which algorithm should I choose?

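Use Cosine to compare the texts as word-frequency vectors, Jaccard to measure the overlap between their sets of distinct words, and Levenshtein to count character-level edits. All three compare surface wording rather than meaning, so paraphrased text can still score low.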
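
The difference between the three measures is easiest to see side by side. The sketch below uses the textbook definitions in plain Python; it assumes Cosine is computed over word-count vectors and that Levenshtein Distance is turned into a similarity by dividing by the longer text's length. Both are assumptions, since the page does not say how the tool computes them.

```python
import math
from collections import Counter

def jaccard(a_words, b_words):
    """Overlap of distinct words: |A ∩ B| / |A ∪ B|."""
    a, b = set(a_words), set(b_words)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cosine(a_words, b_words):
    """Cosine of the angle between the two word-count vectors."""
    ca, cb = Counter(a_words), Counter(b_words)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, 1):
        cur = [i]
        for j, ch_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                    # deletion
                           cur[j - 1] + 1,                 # insertion
                           prev[j - 1] + (ch_a != ch_b)))  # substitution
        prev = cur
    return prev[-1]

def levenshtein_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer text."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

t1, t2 = "the cat sat on the mat", "the cat lay on the mat"
print(jaccard(t1.split(), t2.split()))
print(cosine(t1.split(), t2.split()))
print(levenshtein_similarity(t1, t2))
```

Jaccard ignores how often a word repeats, Cosine weighs repeated words more heavily, and Levenshtein is sensitive to word order and spelling because it works on characters.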

What does the 'Combined' algorithm do?

The Combined option runs all available algorithms and provides an averaged similarity score for a balanced perspective.
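
As a rough picture of what "averaged" could mean, the Python sketch below takes an unweighted mean of per-algorithm scores. The equal weighting and the function name `combined_score` are assumptions; the tool may weight the algorithms differently.

```python
def combined_score(scores):
    """Unweighted mean of per-algorithm similarity scores (each in 0..1)."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores.values()) / len(scores)

# Hypothetical per-algorithm results for one pair of texts.
scores = {"cosine": 0.8, "jaccard": 0.5, "levenshtein": 0.8}
print(f"{combined_score(scores):.0%}")
```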

Does the tool ignore formatting?

Yes, by enabling 'Ignore Whitespace,' the tool strips extra spaces, tabs, and newlines to focus solely on the text content.

Can I compare very long documents?

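Yes, but very large documents may be processed more efficiently if you split them into smaller segments and compare those. Levenshtein Distance in particular slows down on long inputs, because the standard algorithm's running time grows with the product of the two text lengths.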
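
One way to break a long document into segments is by word count. The Python sketch below is a hypothetical helper, not part of the tool: the names `split_into_segments` and `compare_segments`, the segment size of 500 words, and the pairing of segments by position are all assumptions.

```python
def split_into_segments(text, words_per_segment=500):
    """Split a text into consecutive segments of at most N words."""
    words = text.split()
    return [" ".join(words[i:i + words_per_segment])
            for i in range(0, len(words), words_per_segment)]

def compare_segments(text1, text2, score, words_per_segment=500):
    """Score segment pairs by position using a caller-supplied function.

    zip() stops at the shorter document, so trailing segments of the
    longer one are not scored.
    """
    seg1 = split_into_segments(text1, words_per_segment)
    seg2 = split_into_segments(text2, words_per_segment)
    return [score(a, b) for a, b in zip(seg1, seg2)]

# Example with a trivial exact-match scorer and tiny segments.
print(compare_segments("a b c d", "a b x y", lambda a, b: float(a == b), 2))
```

Pairing by position works best when the two documents share the same overall structure, such as two revisions of one contract; inserted paragraphs shift later segments out of alignment.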

Is the comparison case-sensitive?

Only if you want it to be. With 'Case Sensitive' enabled, 'Apple' and 'apple' are treated as different; with it disabled, they are treated as identical.

API Documentation

Request Endpoint

POST /en/api/tools/text-similarity-detector

Request Parameters

Parameter Name | Type | Required | Description
text1 | textarea | Yes | First text to compare
text2 | textarea | Yes | Second text to compare
algorithm | select | Yes | Similarity algorithm to use (for example jaccard or levenshtein)
caseSensitive | checkbox | No | Treat uppercase and lowercase as different characters
ignoreWhitespace | checkbox | No | Remove extra spaces, tabs, and newlines before comparison
minWordLength | number | No | Ignore words shorter than this length

Response Format

{
  "result": "Processed text content",
  "error": "Error message (optional)",
  "message": "Notification message (optional)",
  "metadata": {
    "key": "value"
  }
}
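
A call to this endpoint might look like the Python sketch below, which uses only the standard library. Several details are assumptions rather than documented facts: the host `https://elysiatools.com` is taken from the MCP URL on this page, the body is sent as JSON, checkbox parameters are sent as booleans, and the helper names `build_request` and `compare` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://elysiatools.com"  # assumption: same host as the MCP URL
ENDPOINT = "/en/api/tools/text-similarity-detector"

def build_request(text1, text2, algorithm="jaccard", **options):
    """Build a POST request; options may include caseSensitive,
    ignoreWhitespace, and minWordLength."""
    payload = {"text1": text1, "text2": text2,
               "algorithm": algorithm, **options}
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),  # assumption: JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def compare(text1, text2, algorithm="jaccard", **options):
    """Send the request and return the 'result' field of the response."""
    request = build_request(text1, text2, algorithm, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    if body.get("error"):
        raise RuntimeError(body["error"])
    return body["result"]
```

For example, `compare(original, draft, "levenshtein", caseSensitive=True)` would send the second example's configuration, provided the assumptions above hold.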

AI MCP Documentation

Add this tool to your MCP server configuration:

{
  "mcpServers": {
    "elysiatools-text-similarity-detector": {
      "name": "text-similarity-detector",
      "description": "Calculate similarity percentage between two texts using multiple algorithms including Cosine Similarity, Jaccard Similarity, and Levenshtein Distance",
      "baseUrl": "https://elysiatools.com/mcp/sse?toolId=text-similarity-detector",
      "command": "",
      "args": [],
      "env": {},
      "isActive": true,
      "type": "sse"
    }
  }
}

You can chain multiple tools, e.g.: `https://elysiatools.com/mcp/sse?toolId=png-to-webp,jpg-to-webp,gif-to-webp`, max 20 tools.
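
If you generate the chained URL in a script, a small helper can enforce the 20-tool limit before the URL reaches your configuration. This Python sketch follows the URL pattern shown above; the function name `mcp_sse_url` is hypothetical.

```python
MCP_SSE_BASE = "https://elysiatools.com/mcp/sse?toolId="
MAX_TOOLS = 20  # limit stated on this page

def mcp_sse_url(tool_ids):
    """Join tool IDs into one comma-separated MCP SSE URL."""
    if not tool_ids:
        raise ValueError("at least one tool ID is required")
    if len(tool_ids) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools can be chained")
    return MCP_SSE_BASE + ",".join(tool_ids)

print(mcp_sse_url(["text-similarity-detector", "png-to-webp"]))
```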

If you encounter any issues, please contact us at [email protected]