Image Source Extractor

Extract image URLs (src attributes) from HTML source code. Supports lazy-loaded images and srcset attributes.

Options:

  • Also extract from data-src attributes (lazy-loaded images)
  • Also extract from srcset attributes (responsive images)
  • Remove duplicate image URLs from the results
  • Choose how to sort the extracted image URLs

Key Facts

Category: Text Processing
Input Types: textarea, checkbox, select
Output Type: json
Sample Coverage: 4
API Ready: Yes

Overview

The Image Source Extractor is a web-based tool that parses HTML source code to extract image URLs from src, data-src, and srcset attributes. It provides a quick way to gather image links for analysis, auditing, or downloading purposes.

When to Use

  • When auditing a website's images for SEO checks or performance optimization.
  • When scraping web pages to collect image URLs for downloading or cataloging.
  • When migrating content and you need to extract all image sources from HTML files.

How It Works

  • Paste the HTML source code into the input textarea.
  • Optionally enable extraction from data-src and srcset attributes using checkboxes.
  • Choose to remove duplicate URLs and set the sorting preference.
  • The tool outputs a JSON array of the extracted image URLs.
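
The steps above can be approximated with a short sketch built on Python's standard `html.parser` module. This is not the tool's actual implementation: the class and function names (`ImageSourceParser`, `extract_image_sources`) are hypothetical, the keyword arguments only mirror the option names documented on this page, and srcset handling is left out for brevity.

```python
import json
from html.parser import HTMLParser


class ImageSourceParser(HTMLParser):
    """Collects image URLs from <img> tags in document order (hypothetical helper)."""

    def __init__(self, include_data_src=False):
        super().__init__()
        self.include_data_src = include_data_src
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; self-closing <img/> also lands here.
        if tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("src"):
            self.urls.append(attributes["src"])
        if self.include_data_src and attributes.get("data-src"):
            self.urls.append(attributes["data-src"])


def extract_image_sources(html_code, include_data_src=False,
                          unique_only=False, sort_by="none"):
    """Return the image URLs found in html_code as a list of strings."""
    parser = ImageSourceParser(include_data_src)
    parser.feed(html_code)
    parser.close()
    urls = parser.urls
    if unique_only:
        # dict.fromkeys drops repeats while keeping first-occurrence order.
        urls = list(dict.fromkeys(urls))
    if sort_by == "alphabetical":
        urls = sorted(urls)
    return urls


if __name__ == "__main__":
    sample = '<img src="b.png"><img src="b.png" data-src="a.png">'
    print(json.dumps(extract_image_sources(sample, include_data_src=True,
                                           unique_only=True,
                                           sort_by="alphabetical")))
```

Deduplicating before sorting keeps the result stable regardless of the sort option; with sorting disabled, the first occurrence of each URL keeps its position in the HTML.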

Use Cases

  • SEO specialists extracting image URLs to check for alt text and broken links.
  • Web developers analyzing image sources for performance improvements or debugging.
  • Data analysts collecting image URLs from HTML for machine learning datasets or reports.

Examples

1. Auditing Blog Images for SEO

SEO Analyst
Background
You have the HTML source of a company blog and need to ensure all images have proper sources for search engine indexing.
Problem
Manually parsing HTML to find image URLs is time-consuming and error-prone.
How to Use
Paste the HTML code, enable 'Include data-src' to capture lazy-loaded images, enable 'Remove Duplicates', and choose alphabetical sorting.
Example Config
includeDataSrc: true, uniqueOnly: true, sortBy: alphabetical
Outcome
A clean JSON list of unique image URLs, sorted alphabetically for easy review and validation.

2. Extracting Responsive Image URLs

Front-end Developer
Background
You're debugging responsive image loading and need to verify all srcset URLs in the HTML.
Problem
Srcset attributes contain multiple URLs, making manual extraction complex.
How to Use
Input the HTML, check 'Include srcset Attributes', and choose 'No Sorting' to maintain order.
Example Config
includeSrcSet: true, sortBy: none
Outcome
All image URLs including those from srcset, output in the original order for accurate analysis.
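
To illustrate why srcset values need their own handling, here is a minimal sketch of splitting one srcset string into its candidate URLs and descriptors. The function name `parse_srcset` is hypothetical, and the comma split is a simplification: it would mis-split URLs that themselves contain commas, which the full HTML parsing rules account for.

```python
def parse_srcset(srcset):
    """Split a srcset value into (url, descriptor) pairs, in source order.

    Simplified sketch: assumes no candidate URL contains a comma.
    """
    candidates = []
    for entry in srcset.split(","):
        parts = entry.split()
        if not parts:
            continue  # skip empty entries such as a trailing comma
        url = parts[0]
        # The descriptor (e.g. "480w" or "2x") is optional.
        descriptor = parts[1] if len(parts) > 1 else ""
        candidates.append((url, descriptor))
    return candidates


if __name__ == "__main__":
    for url, descriptor in parse_srcset("small.jpg 480w, medium.jpg 800w, large.jpg"):
        print(url, descriptor)
```

Keeping the candidates in source order matches the 'No Sorting' choice in this example, so the list can be compared directly against the markup.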

FAQ

What image attributes does this tool extract?

It extracts from src attributes by default, and can optionally include data-src for lazy-loaded images and srcset for responsive images.

Can it handle large HTML files?

Yes, it processes HTML directly in your browser, but very large files may impact performance.

How are duplicate URLs handled?

Duplicates are removed by default, but you can disable this to preserve all occurrences.

Is the output sorted?

You can choose to sort URLs alphabetically or preserve the original order from the HTML.

What format is the output in?

The output is a JSON array containing the extracted image URLs as strings.

API Documentation

Request Endpoint

POST /en/api/tools/image-source-extractor

Request Parameters

| Parameter Name | Type | Required | Description |
|---|---|---|---|
| htmlCode | textarea | Yes | - |
| includeDataSrc | checkbox | No | Also extract from data-src attributes (lazy-loaded images) |
| includeSrcSet | checkbox | No | Also extract from srcset attributes (responsive images) |
| uniqueOnly | checkbox | No | Remove duplicate image URLs from the results |
| sortBy | select | No | Choose how to sort the extracted image URLs |
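
As a rough illustration of how these parameters might be sent, the sketch below builds a POST request with Python's standard library. Several details are assumptions rather than documented facts: that the API lives on the same host as the MCP URL on this page (`https://elysiatools.com`), that the body is JSON, and that checkbox parameters are passed as booleans. The helper name `build_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: the API shares the host that appears in the MCP configuration.
BASE_URL = "https://elysiatools.com"
ENDPOINT = "/en/api/tools/image-source-extractor"


def build_request(html_code, include_data_src=False, include_srcset=False,
                  unique_only=False, sort_by=None):
    """Build (but do not send) a POST request for the extractor endpoint."""
    payload = {
        "htmlCode": html_code,                # required
        "includeDataSrc": include_data_src,   # optional checkbox
        "includeSrcSet": include_srcset,      # optional checkbox
        "uniqueOnly": unique_only,            # optional checkbox
    }
    if sort_by is not None:
        payload["sortBy"] = sort_by           # optional select
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),  # assumed JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building the request separately makes the payload easy to inspect before any network call.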

Response Format

{
  "key": {...},
  "metadata": {
    "key": "value"
  },
  "error": "Error message (optional)",
  "message": "Notification message (optional)"
}
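
A client might check the optional `error` field before using the rest of the response. The sketch below assumes only the fields shown in the template above; the helper name `unwrap_response` is hypothetical, and since `key` is a placeholder in the template, the function returns the whole parsed object rather than guessing where the URLs live.

```python
import json


def unwrap_response(body):
    """Parse a response body and raise if it carries an error message."""
    data = json.loads(body)
    if data.get("error"):
        # "error" is documented as optional; treat a non-empty value as failure.
        raise RuntimeError(data["error"])
    return data
```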

AI MCP Documentation

Add this tool to your MCP server configuration:

{
  "mcpServers": {
    "elysiatools-image-source-extractor": {
      "name": "image-source-extractor",
      "description": "Extract image URLs (src attributes) from HTML source code. Supports lazy-loaded images and srcset attributes.",
      "baseUrl": "https://elysiatools.com/mcp/sse?toolId=image-source-extractor",
      "command": "",
      "args": [],
      "env": {},
      "isActive": true,
      "type": "sse"
    }
  }
}

You can chain multiple tools by passing a comma-separated list of tool IDs in `toolId` (maximum 20 tools), e.g.: `https://elysiatools.com/mcp/sse?toolId=png-to-webp,jpg-to-webp,gif-to-webp`.
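
A chained URL of this form can be assembled with a small helper. The function name `build_mcp_url` is hypothetical; the base URL, the comma-separated `toolId` format, and the 20-tool limit are taken from this page.

```python
MCP_SSE_URL = "https://elysiatools.com/mcp/sse"
MAX_TOOLS = 20  # limit stated on this page


def build_mcp_url(tool_ids):
    """Join tool IDs into a single MCP SSE URL, enforcing the documented limit."""
    if not tool_ids:
        raise ValueError("at least one tool ID is required")
    if len(tool_ids) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools can be chained")
    return f"{MCP_SSE_URL}?toolId={','.join(tool_ids)}"
```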

If you encounter any issues, please contact us at [email protected]