What is the difference between oversampling and undersampling?

Oversampling duplicates rows from the minority class to match the majority count, while undersampling randomly removes rows from the majority class to match the minority count.
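The two strategies can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the function name `resample`, its parameters, and the rule for topping up a class when the target is not an exact multiple of its size are all assumptions.

```python
import random
from collections import defaultdict


def resample(rows, label_key, strategy, seed=0):
    """Balance `rows` (a list of dicts) by duplicating or dropping rows.

    Hypothetical helper for illustration; not the tool's real code.
    """
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    sizes = [len(group) for group in by_class.values()]
    if strategy == "oversample":
        target = max(sizes)  # grow every class to the largest one
    elif strategy == "undersample":
        target = min(sizes)  # shrink every class to the smallest one
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    balanced = []
    for group in by_class.values():
        if len(group) >= target:
            # Keep a random subset (all rows if the class is already at target).
            balanced.extend(rng.sample(group, target))
        else:
            # Repeat whole copies of the class, then top up with random rows.
            full_copies, remainder = divmod(target, len(group))
            balanced.extend(group * full_copies)
            balanced.extend(rng.sample(group, remainder))
    return balanced
```

Calling `resample(rows, "is_fraud", "oversample")` on 8 normal rows and 2 fraud rows would return 16 rows, 8 per class; with `"undersample"` it would return 4 rows, 2 per class.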

What file formats are supported for the dataset?

You can paste raw CSV text directly into the input field, or upload dataset files in CSV or JSON format.

How do I know which resampling strategy to choose?

Undersampling generally suits very large datasets, where dropping majority rows is unlikely to cause severe information loss. Oversampling suits small datasets, where every data point matters, though duplicated rows can make a model overfit to the repeated minority examples.
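Before picking a strategy, it can help to see how many rows each one would leave you with. The sketch below is a rough aid under stated assumptions: `summarize_balance` and the keys it returns are hypothetical names, not part of the tool, and the labels reproduce the fraud scenario from the Examples section.

```python
from collections import Counter


def summarize_balance(labels):
    """Report class counts and the dataset size each strategy would produce."""
    counts = Counter(labels)
    largest = max(counts.values())
    smallest = min(counts.values())
    n_classes = len(counts)
    return {
        "counts": dict(counts),
        "imbalance_ratio": largest / smallest,
        # Oversampling grows every class to the largest class size.
        "rows_after_oversample": largest * n_classes,
        # Undersampling shrinks every class to the smallest class size.
        "rows_after_undersample": smallest * n_classes,
    }


# 10,000 normal rows and 500 fraud rows, as in the fraud example.
labels = ["normal"] * 10_000 + ["fraud"] * 500
summary = summarize_balance(labels)
print(summary["imbalance_ratio"])         # 20.0
print(summary["rows_after_oversample"])   # 20000
print(summary["rows_after_undersample"])  # 1000
```

Here undersampling would keep only 1,000 of 10,500 rows, which is the kind of information loss that usually argues for oversampling on a dataset this small.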

Can I export the fully balanced dataset?

Yes, the tool generates a balanced dataset based on your chosen strategy, which you can preview and export in either JSON or CSV format.
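If you want to reproduce the paste-and-export round trip outside the tool, Python's standard `csv` and `json` modules are enough. The helper names `parse_csv_text` and `export_rows` are assumptions made for this sketch, and the tool's own output layout may differ.

```python
import csv
import io
import json


def parse_csv_text(text):
    """Turn pasted CSV text into a list of dicts keyed by the header row.

    All values come back as strings, because CSV carries no type information.
    """
    return list(csv.DictReader(io.StringIO(text)))


def export_rows(rows, fmt):
    """Serialize a non-empty list of row dicts as a JSON array or CSV text."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt!r}")


rows = parse_csv_text("amount,is_fraud\n10,0\n250,1\n")
print(export_rows(rows, "json"))
```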

Does this tool apply SMOTE or synthetic data generation?

No. This tool uses exact row duplication for oversampling and random row removal for undersampling. It gives you a simple baseline before you decide whether more complex synthetic methods are necessary.

Dataset Imbalance Detector & Resampler | Online Free Tool

Examples

1. Balancing a highly skewed fraud dataset

Data Scientist

Background

A financial dataset contains 10,000 normal transactions but only 500 fraudulent ones, causing the initial model to predict 'normal' every time.

Problem

The minority class (fraud) needs to be amplified to match the majority class without writing custom Python scripts.

How to use

Upload the transaction CSV, set the Label Column to 'is_fraud', and select the 'oversample' strategy.

Label Column: is_fraud, Strategy: oversample, Export Format: csv

Outcome

The tool duplicates the 500 fraud rows until they match the 10,000 normal rows, producing a balanced 20,000-row dataset (10,000 rows per class) that you can preview and export as CSV.

2. Downsizing majority class for faster model training

Machine Learning Engineer

Background

A massive user database has 500,000 active users and 50,000 churned users. Training on the full dataset is slow, and the resulting model is biased toward the majority class.

Problem

Reduce the majority class to match the minority class size, which speeds up training and balances the class counts.

How to use

Label Column: status, Strategy: undersample, Export Format: json

Dataset Imbalance Detector & Resampler | Online Free Tool | Elysia Tools
