Azure Data Factory - Lineage

This article outlines the lineage coverage, configuration requirements, metadata handling, supported scenarios, transformation behavior, process flow, and known limitations for lineage extraction in Azure Data Factory (ADF). The lineage process captures how data moves across pipelines, datasets, linked services, data flows, activities, triggers, and global parameters by analyzing JSON-based metadata definitions and transformation logic. Table-level and column-level lineage are available for supported activities and transformations, giving visibility into data movement, dependencies, transformations, and operational monitoring across cloud-based integration workflows.

Lineage Configuration Requirements

Accurate lineage extraction depends on properly configured authentication, metadata access, and pipeline definition visibility. These settings must be configured correctly to ensure the successful parsing of pipelines, activities, datasets, linked services, and transformation mappings.

Configuration Requirements Table

| Configuration | Required Detail |
| --- | --- |
| Access Permissions | Read access to pipelines, data flows, datasets, linked services, triggers, and related Azure resources is required |
| Azure API Access | Service principals or managed identities must be allowed to access Azure Data Factory APIs |
| Connection Information | Connection strings and dataset parameters must be accessible for source and target resolution |
| Metadata Extraction | JSON-based metadata extraction must be enabled for pipelines, data flows, datasets, and linked services |
| Dynamic Parameter Handling | Dataset parameters and dynamic paths must be accessible for lineage resolution |
| Cross-Subscription Access | Bridge or connector configuration is required for remote or cross-subscription environments |
| Transformation Parsing | Data flow transformations and activity references must be accessible for lineage processing |
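To make the access requirements above more concrete, the following sketch composes the Azure Resource Manager list endpoints for the resource types named in the table. It is an illustration only and does not describe how the connector itself retrieves metadata. The function name `list_urls` and the sample identifiers are hypothetical; the URL layout and the `2018-06-01` API version follow the public Azure Data Factory REST API. Each URL would be called with a bearer token issued to the service principal or managed identity.

```python
from urllib.parse import quote

ARM_BASE = "https://management.azure.com"
API_VERSION = "2018-06-01"  # Data Factory management API version

# Resource collections that the configuration table says must be readable.
RESOURCE_TYPES = ("pipelines", "dataflows", "datasets", "linkedservices", "triggers")


def list_urls(subscription_id, resource_group, factory_name):
    """Return the list endpoint for each resource collection of one factory.

    Hypothetical helper: it only builds URLs and performs no network calls.
    """
    factory = (
        f"{ARM_BASE}/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.DataFactory/factories/{quote(factory_name, safe='')}"
    )
    return {kind: f"{factory}/{kind}?api-version={API_VERSION}" for kind in RESOURCE_TYPES}


# Placeholder identifiers, for illustration only.
for kind, url in list_urls("my-subscription-id", "my-resource-group", "my-factory").items():
    print(kind, url)
```

A request that returns HTTP 403 for any of these endpoints would suggest that the identity lacks the read access listed under Access Permissions.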

Lineage Components

The connector supports lineage extraction across pipelines, activities, control flow components, and mapping data flow transformations.

Activity and Pipeline Component Support

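| Component | Availability |
| --- | --- |
| Copy Activity | Supported |
| Execute Data Flow | Supported |
| Execute Pipeline | Supported |
| Lookup Activity | Supported |
| Stored Procedure Activity | Supported |
| ForEach Activity | Supported |
| If Condition Activity | Supported |
| Switch Activity | Supported |
| Until Activity | Supported |
| Script Activity | Supported |
| Databricks Notebook Activity | Supported |
| Execute SSIS Package Activity | Supported |
| Set Variable Activity | ⚠️ |
| Filter Activity | ⚠️ |
| Wait Activity | ⚠️ |
| Web Activity | ⚠️ |
| Get Metadata Activity | ⚠️ |
| Delete Activity | ⚠️ |
| Validation Activity | ⚠️ |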

The ⚠️ icon indicates partially supported functionality with limited lineage coverage in applicable scenarios.
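As an illustration of table-level lineage from JSON metadata, the sketch below reads a pipeline definition and emits one source-to-target edge per Copy activity. It is a minimal example under stated assumptions, not the connector's implementation: the pipeline, dataset, and function names (`CopySalesPipeline`, `SalesCsv`, `SalesTable`, `copy_edges`) are hypothetical, while the `inputs`, `outputs`, and `referenceName` fields follow the ADF pipeline JSON layout.

```python
import json

# Hypothetical pipeline definition in the ADF JSON layout.
PIPELINE_JSON = """
{
  "name": "CopySalesPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopySales",
        "type": "Copy",
        "inputs": [{"referenceName": "SalesCsv", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "SalesTable", "type": "DatasetReference"}]
      },
      {"name": "Pause", "type": "Wait", "typeProperties": {"waitTimeInSeconds": 5}}
    ]
  }
}
"""


def copy_edges(pipeline):
    """Return (source dataset, activity, target dataset) for each Copy activity."""
    edges = []
    for activity in pipeline.get("properties", {}).get("activities", []):
        if activity.get("type") != "Copy":
            continue  # other activity types need their own handling
        sources = [ref["referenceName"] for ref in activity.get("inputs", [])]
        targets = [ref["referenceName"] for ref in activity.get("outputs", [])]
        for source in sources:
            for target in targets:
                edges.append((source, activity["name"], target))
    return edges


edges = copy_edges(json.loads(PIPELINE_JSON))
print(edges)  # [('SalesCsv', 'CopySales', 'SalesTable')]
```

The Wait activity contributes no edge because it references no datasets, which is consistent with its limited lineage coverage in the table above.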

Mapping Data Flow Transformation Support

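| Transformation | Availability |
| --- | --- |
| Source | Supported |
| Sink | Supported |
| Derived Column | Supported |
| Select | Supported |
| Filter | Supported |
| Join | Supported |
| Union | Supported |
| Lookup | Supported |
| Aggregate | Supported |
| Sort | Supported |
| Pivot | Supported |
| Unpivot | Supported |
| Window | Supported |
| Rank | Supported |
| Split | Supported |
| Alter Row | Supported |
| Distinct | Supported |
| Exists | Supported |
| Flatten | Supported |
| Key Generate | Supported |
| Conditional Split | Supported |
| Parse | Supported |
| Assert | ⚠️ |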

The ⚠️ icon indicates partially supported functionality with limited lineage coverage in applicable scenarios.

Column-Level Lineage Coverage

Column-level lineage extraction is supported for standard transformation scenarios where metadata and transformation logic are accessible during parsing.

Column Creation and Transformation Coverage

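| Transformation Type | Supported |
| --- | --- |
| Select and Rename Columns | Supported |
| Join Columns | Supported |
| Union Columns | Supported |
| Pivot and Unpivot Columns | Supported |
| Derived Columns and Expressions | ⚠️ |
| Aggregate Columns | ⚠️ |
| Expression Functions | ⚠️ |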

The ⚠️ icon indicates partially supported functionality with limited lineage coverage in applicable scenarios.

Column-level lineage is supported for standard source-to-transformation-to-target mappings across supported Azure Data Factory activities and data flow transformations.
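A source-to-target column mapping of this kind can be sketched for a Copy activity that declares an explicit schema mapping. The example assumes the `TabularTranslator` layout used in ADF Copy activity JSON, where each entry in `mappings` has a `source` and a `sink`. The activity, column names, and the `column_edges` helper are hypothetical, and the connector's own parsing logic may differ.

```python
# Hypothetical Copy activity with an explicit column mapping.
COPY_ACTIVITY = {
    "name": "CopyCustomers",
    "type": "Copy",
    "typeProperties": {
        "translator": {
            "type": "TabularTranslator",
            "mappings": [
                {"source": {"name": "Id"}, "sink": {"name": "CustomerID"}},
                {"source": {"name": "Name"}, "sink": {"name": "FullName"}},
                {"source": {"path": "$.address.city"}, "sink": {"name": "City"}},
            ],
        }
    },
}


def column_edges(activity):
    """Return (source column, target column) pairs from an explicit mapping."""
    translator = activity.get("typeProperties", {}).get("translator", {})
    if not isinstance(translator, dict):
        return []  # unexpected shape, nothing to resolve statically
    edges = []
    for mapping in translator.get("mappings", []):
        source = mapping.get("source", {})
        sink = mapping.get("sink", {})
        # Tabular columns use "name"; hierarchical sources use "path".
        source_column = source.get("name") or source.get("path")
        target_column = sink.get("name") or sink.get("path")
        if source_column and target_column:
            edges.append((source_column, target_column))
    return edges


for source_column, target_column in column_edges(COPY_ACTIVITY):
    print(f"{source_column} -> {target_column}")
```

When no explicit mapping is present, or the translator is supplied as a runtime expression, the function returns an empty list. Column mappings in those cases would have to come from dataset schemas or execution metadata instead.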

Supported Use Cases

The connector supports lineage extraction across standard Azure Data Factory integration workflows, transformation pipelines, and orchestration scenarios.

Supported Lineage Scenarios

| Supported Scenario | Details |
| --- | --- |
| Database-to-Database ETL | Lineage extraction across relational database integrations such as SQL Server, Azure SQL, Oracle, and Snowflake |
| File-to-Database Processing | Lineage extraction from Azure Blob Storage, ADLS CSV/JSON files, and other file-based sources into database targets |
| Database-to-File Processing | Lineage extraction from database queries into Parquet, CSV, or other file-based outputs |
| Complex Data Flow Transformations | Lineage extraction for joins, aggregates, derived columns, pivot/unpivot, and transformation workflows |
| Nested Pipeline Dependencies | Lineage extraction across parent-child pipeline relationships |
| Conditional Workflow Processing | Lineage extraction across If Condition, Switch, Until, and ForEach activities |
| Cross-Cloud Integrations | Lineage extraction across AWS S3, Salesforce, REST integrations, and Azure-based targets |
| Column-Level Transformation Lineage | Column-level lineage for supported transformations including Select, Join, Aggregate, Derived Column, and Pivot/Unpivot |

Table-level and column-level lineage are supported for standard Azure Data Factory activities, datasets, and transformation workflows where metadata definitions are accessible.
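The nested pipeline and conditional workflow scenarios above imply walking activities that sit inside control flow containers. The sketch below shows one way such a traversal could look, assuming the ADF JSON property names `activities` (ForEach, Until), `ifTrueActivities` and `ifFalseActivities` (If Condition), and `cases` and `defaultActivities` (Switch). The activity names and both helper functions are hypothetical.

```python
# Hypothetical activity tree: ForEach -> If Condition -> Copy / Execute Pipeline.
ACTIVITIES = [
    {
        "name": "PerRegion",
        "type": "ForEach",
        "typeProperties": {
            "activities": [
                {
                    "name": "HasRows",
                    "type": "IfCondition",
                    "typeProperties": {
                        "ifTrueActivities": [{"name": "CopyRegion", "type": "Copy"}],
                        "ifFalseActivities": [
                            {
                                "name": "RunFallback",
                                "type": "ExecutePipeline",
                                "typeProperties": {
                                    "pipeline": {
                                        "referenceName": "FallbackPipeline",
                                        "type": "PipelineReference",
                                    }
                                },
                            }
                        ],
                    },
                }
            ]
        },
    }
]


def iter_activities(activities, path=()):
    """Yield (parent path, activity) for every activity, including nested ones."""
    for activity in activities:
        yield path, activity
        props = activity.get("typeProperties", {})
        kind = activity.get("type")
        here = path + (activity["name"],)
        if kind in ("ForEach", "Until"):
            yield from iter_activities(props.get("activities", []), here)
        elif kind == "IfCondition":
            yield from iter_activities(props.get("ifTrueActivities", []), here)
            yield from iter_activities(props.get("ifFalseActivities", []), here)
        elif kind == "Switch":
            for case in props.get("cases", []):
                yield from iter_activities(case.get("activities", []), here)
            yield from iter_activities(props.get("defaultActivities", []), here)


def child_pipelines(activities):
    """Return the pipelines referenced by Execute Pipeline activities."""
    return [
        activity["typeProperties"]["pipeline"]["referenceName"]
        for _, activity in iter_activities(activities)
        if activity.get("type") == "ExecutePipeline"
    ]


for path, activity in iter_activities(ACTIVITIES):
    print("/".join(path + (activity["name"],)), activity["type"])
print(child_pipelines(ACTIVITIES))  # ['FallbackPipeline']
```

Recording the parent path alongside each activity keeps the control flow context available, so an edge produced by `CopyRegion` can still be attributed to the `PerRegion` loop and the `HasRows` branch.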

Partial or Limited Coverage

Certain scenarios provide partial lineage coverage due to runtime-generated logic, dynamic metadata resolution, or limitations in transformation visibility.

Partial Coverage Scenarios

| Scenario | Limitation Description |
| --- | --- |
| Dynamic SQL Queries | Parameterized or runtime-generated SQL queries are not fully parsed during lineage extraction |
| Complex Nested Expressions | Highly nested expressions and multi-step transformation logic may not resolve completely |
| Schema Drift in Data Flows | Dynamic schema changes may result in incomplete column-level lineage |
| Encrypted Connection Configurations | Azure Key Vault or encrypted connection values require accessible decrypted configurations |
| Databricks Notebook Logic | Internal notebook transformation logic is not fully visible for lineage extraction |
| Custom Activities | Internal processing logic within custom activities is partially supported |
| Parameterized File Paths | Wildcard paths and partition-based file structures may not resolve fully without pipeline execution metadata |
| Variable-Based Runtime Logic | Runtime-generated values and dynamically assigned variables may provide partial lineage mapping |
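Several of these limitations come down to values that cannot be resolved from static metadata. The heuristic sketch below flags such values in an activity's `typeProperties`, assuming two ADF conventions: expressions are either strings that start with `@` or contain `@{...}` interpolation (with `@@` as the escape for a literal `@`), or objects with `"type": "Expression"`; wildcard paths use the `wildcardFolderPath` and `wildcardFileName` settings. The sample properties and the `find_dynamic` helper are hypothetical, and the check is deliberately approximate.

```python
WILDCARD_KEYS = ("wildcardFolderPath", "wildcardFileName")

# Hypothetical Copy activity properties with a parameterized wildcard path.
TYPE_PROPERTIES = {
    "source": {
        "type": "DelimitedTextSource",
        "storeSettings": {
            "type": "AzureBlobFSReadSettings",
            "wildcardFolderPath": "raw/@{pipeline().parameters.region}",
            "wildcardFileName": "*.csv",
        },
    },
    "sink": {"type": "AzureSqlSink", "preCopyScript": "TRUNCATE TABLE dbo.Sales"},
}


def find_dynamic(node, path=""):
    """Return (property path, reason) for values that resolve only at run time."""
    findings = []
    if isinstance(node, dict):
        if node.get("type") == "Expression" and "value" in node:
            return [(path, "expression")]
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key in WILDCARD_KEYS:
                findings.append((child, "wildcard"))
            findings.extend(find_dynamic(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            findings.extend(find_dynamic(item, f"{path}[{index}]"))
    elif isinstance(node, str):
        unescaped = node.replace("@@", "")  # "@@" is a literal "@"
        if unescaped.startswith("@") or "@{" in unescaped:
            findings.append((path, "expression"))
    return findings


for property_path, reason in find_dynamic(TYPE_PROPERTIES):
    print(property_path, reason)
```

Properties flagged this way are candidates for partial lineage: the activity and its datasets are known, but the exact table, file, or column may be available only from pipeline execution metadata.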

Unsupported Scenarios

The connector does not extract lineage from external processing logic, undocumented APIs, or activities that do not expose sufficient metadata for lineage generation.

Unsupported Lineage

| Not Supported | Description |
| --- | --- |
| REST and Web Activity Payload Analysis | Request and response payloads are not analyzed for automatic lineage extraction |
| External APIs Without Metadata Contracts | APIs without structured metadata definitions are unsupported |
| Internal SSIS Package Lineage | SSIS package internals require separate SSIS lineage integration |
| Runtime-Generated External Logic | Externally generated transformation logic without accessible metadata is unsupported |
| Unresolved Dynamic Runtime Metadata | Runtime-generated schemas and inaccessible execution metadata are unsupported |

Unsupported scenarios do not generate lineage relationships, so the affected objects may appear disconnected in the lineage visualization.
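One simple way to spot such gaps is to compare the catalogued objects against the objects that take part in at least one lineage edge. The sketch below is a generic illustration with hypothetical object names and a hypothetical `disconnected` helper; it does not reflect how the lineage visualization is implemented.

```python
def disconnected(nodes, edges):
    """Return the nodes that take part in no lineage edge, sorted by name."""
    linked = {node for edge in edges for node in edge}
    return sorted(node for node in nodes if node not in linked)


# Hypothetical catalog objects and (source, target) lineage edges.
NODES = ["SalesCsv", "SalesTable", "WebCallResult", "SsisStagingTable"]
EDGES = [("SalesCsv", "SalesTable")]

print(disconnected(NODES, EDGES))  # ['SsisStagingTable', 'WebCallResult']
```

Objects reported here would be worth checking against the unsupported list above before treating the missing lineage as an extraction failure.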

Current Functional Status

This section summarizes the lineage coverage that the Azure Data Factory connector currently provides, along with its limitations.

| Status Area | Details |
| --- | --- |
| Overall Coverage | Production-ready coverage for standard Azure Data Factory pipelines, activities, and data flow transformations |
| Lineage Depth | Table-level and column-level lineage are supported for standard workflows |
| Supported Inputs | Pipelines, datasets, linked services, data flows, triggers, activities, and transformation metadata |
| Functional Scope | Lineage extraction supports standard ETL workflows, orchestration logic, and transformation processing |
| Partial Coverage Areas | Dynamic SQL, schema drift, parameterized paths, encrypted configurations, Databricks notebooks, and complex expressions |
| Unsupported Areas | REST payload analysis, undocumented APIs, unresolved runtime metadata, and SSIS package internals |
| Resulting Output | Lineage is generated for supported Azure Data Factory workflows with partial or unavailable mapping in unsupported scenarios |


Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA
