Key Facts
- Category
- Data Processing
- Input Types
- textarea, checkbox, number
- Output Type
- json
- Sample Coverage
- 4
- API Ready
- Yes
Overview
The JSON Data Lineage Tracer is a data engineering utility that maps field paths, derived dependencies, and transformation histories within JSON structures. By combining source JSON data with optional derivation rules, it generates a comprehensive field-level lineage graph to help you visualize how data flows and transforms across your ETL pipelines or API responses.
When to Use
- •When auditing ETL pipelines to understand how specific JSON fields are derived from raw source data.
- •When documenting API responses to show downstream consumers the exact origin of aggregated or transformed fields.
- •When debugging complex data mappings where you need to trace the transformation history of specific values.
How It Works
- •Paste your raw JSON payload into the Source JSON field to establish the base data structure.
- •Optionally, provide a Lineage Rules JSON to define how derived fields are calculated from source paths using specific transformations.
- •Toggle whether to include container nodes (objects and arrays) and set a maximum field limit for the output.
- •The tool processes the inputs and generates a structured JSON lineage graph detailing field counts, derived dependencies, and transformation steps.
Use Cases
Examples
1. Tracing API Response Derived Fields
Data Engineer- Background
- An API returns a customer order payload where some fields are calculated on the fly, such as converting cents to dollars and combining names.
- Problem
- Need to document exactly how the 'totalUsd' and 'customerLabel' fields are derived from the raw database output.
- How to Use
- Paste the raw order JSON, then input the derivation rules specifying that 'totalUsd' comes from 'totalCents' divided by 100, and 'customerLabel' concatenates 'firstName' and 'lastName'.
- Example Config
-
{ "rules": [ { "target": "$.order.totalUsd", "sources": ["$.order.totalCents"], "transforms": ["divide_by_100", "round(2)"] }, { "target": "$.order.customerLabel", "sources": ["$.order.customer.firstName", "$.order.customer.lastName"], "transforms": ["concat(\" \")"] } ] } - Outcome
- A JSON lineage graph is generated showing 9 total fields, explicitly linking the 2 derived fields to their original sources and transformation steps.
2. Mapping Flat JSON Structures
Backend Developer- Background
- A legacy system outputs a large, nested JSON file with dozens of keys that need to be mapped to a new schema.
- Problem
- Need a quick inventory of all JSON paths without worrying about derived fields or complex transformations.
- How to Use
- Paste the source JSON, leave the lineage rules blank, uncheck 'Include Object And Array Nodes' to focus only on leaf values, and run the tool.
- Outcome
- A clean JSON graph listing all absolute paths for the leaf nodes, making it easy to copy into a mapping spreadsheet.
Try with Samples
jsonRelated Hubs
FAQ
What format should the lineage rules be in?
The rules should be a JSON object containing a 'rules' array, where each rule specifies a 'target' path, an array of 'sources', and an array of 'transforms'.
Can I trace lineage without providing rules?
Yes, if you only provide the source JSON, the tool will map the base field paths without any derived dependencies.
What does the 'Include Object And Array Nodes' option do?
It determines whether parent container structures are included as distinct nodes in the lineage graph, or if only leaf nodes (actual values) are mapped.
Is there a limit to how many fields I can trace?
Yes, you can configure the maximum number of fields in the output, ranging from 10 up to 2,000.
What does the output look like?
The output is a JSON object containing a summary of field counts and a graph object detailing the nodes and their derivation paths.