Azure Data Factory - Lineage
This article describes lineage extraction for Azure Data Factory (ADF): lineage coverage, configuration requirements, supported scenarios, transformation behavior, and known limitations. The lineage process analyzes JSON-based metadata definitions and transformation logic to capture how data moves across pipelines, datasets, linked services, data flows, activities, triggers, and global parameters. It produces table-level and column-level lineage for supported activities and transformations, giving visibility into data movement, dependencies, transformations, and operational monitoring across cloud-based integration workflows.
Lineage Configuration Requirements
Accurate lineage extraction depends on properly configured authentication, metadata access, and pipeline definition visibility. These settings must be configured correctly to ensure the successful parsing of pipelines, activities, datasets, linked services, and transformation mappings.
Configuration Requirements Table
Access Permissions: Read access to pipelines, data flows, datasets, linked services, triggers, and related Azure resources is required.
Azure API Access: Service principals or managed identities must be allowed to access Azure Data Factory APIs.
Connection Information: Connection strings and dataset parameters must be accessible for source and target resolution.
Metadata Extraction: JSON-based metadata extraction must be enabled for pipelines, data flows, datasets, and linked services.
Dynamic Parameter Handling: Dataset parameters and dynamic paths must be accessible for lineage resolution.
Cross-Subscription Access: Bridge or connector configuration is required for remote or cross-subscription environments.
Transformation Parsing: Data flow transformations and activity references must be accessible for lineage processing.
Insufficient permissions, inaccessible datasets, unresolved parameters, restricted linked services, or missing connection information may prevent lineage extraction or result in incomplete lineage mapping.
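One way to confirm that read access is in place is to call the Azure Resource Manager list endpoints for each metadata type with the same identity the connector uses. The sketch below only builds those URLs; it assumes the public Azure cloud endpoint and the `2018-06-01` Data Factory API version, and the helper name `factory_list_urls` and the sample identifiers are illustrative, not part of the connector. How the connector itself retrieves metadata is not described in this article.

```python
from urllib.parse import quote

# Assumptions: public Azure cloud and the 2018-06-01 Data Factory API version.
ARM_BASE = "https://management.azure.com"
API_VERSION = "2018-06-01"

# Metadata collections named in the requirements table above.
RESOURCE_COLLECTIONS = ("pipelines", "dataflows", "datasets", "linkedservices", "triggers")


def factory_list_urls(subscription_id, resource_group, factory_name):
    """Return the list endpoint for each metadata collection of one factory."""
    root = (
        f"{ARM_BASE}/subscriptions/{quote(subscription_id)}"
        f"/resourceGroups/{quote(resource_group)}"
        f"/providers/Microsoft.DataFactory/factories/{quote(factory_name)}"
    )
    return {name: f"{root}/{name}?api-version={API_VERSION}" for name in RESOURCE_COLLECTIONS}


# Hypothetical identifiers, for illustration only.
for name, url in factory_list_urls(
    "00000000-0000-0000-0000-000000000000", "rg-demo", "adf-demo"
).items():
    print(name, url)
```

Issuing a GET against each URL with a bearer token for the service principal or managed identity, and checking for a successful response, is a quick way to spot a missing permission before running lineage extraction.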
Lineage Components
The connector supports lineage extraction across pipelines, activities, control flow components, and mapping data flow transformations.
Activity and Pipeline Component Support
Copy Activity: ✅
Execute Data Flow: ✅
Execute Pipeline: ✅
Lookup Activity: ✅
Stored Procedure Activity: ✅
ForEach Activity: ✅
If Condition Activity: ✅
Switch Activity: ✅
Until Activity: ✅
Script Activity: ✅
Databricks Notebook Activity: ✅
Execute SSIS Package Activity: ✅
Set Variable Activity: ⚠️
Filter Activity: ⚠️
Wait Activity: ⚠️
Web Activity: ⚠️
Get Metadata Activity: ⚠️
Delete Activity: ⚠️
Validation Activity: ⚠️
The ⚠️ icon indicates partially supported functionality with limited lineage coverage in applicable scenarios.
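To illustrate what activity-level parsing involves, the following sketch reads a pipeline definition and derives table-level edges from a Copy activity's dataset references. It is a minimal illustration, not the connector's implementation: the sample pipeline, the function names, and the `PARTIAL_TYPES` set (the JSON `type` strings assumed to correspond to the ⚠️ rows above) are assumptions.

```python
import json

# Hypothetical pipeline definition in the shape ADF exports.
PIPELINE_JSON = """
{
  "name": "CopyOrders",
  "properties": {
    "activities": [
      {
        "name": "CopyOrdersToStaging",
        "type": "Copy",
        "inputs": [{"referenceName": "SqlOrders", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "StagingOrders", "type": "DatasetReference"}],
        "typeProperties": {"source": {"type": "AzureSqlSource"}, "sink": {"type": "ParquetSink"}}
      },
      {"name": "Pause", "type": "Wait", "typeProperties": {"waitTimeInSeconds": 5}}
    ]
  }
}
"""

# Assumed mapping of the partially supported rows to activity "type" values.
PARTIAL_TYPES = {"SetVariable", "Filter", "Wait", "WebActivity", "GetMetadata", "Delete", "Validation"}


def copy_edges(pipeline):
    """Return (source dataset, activity name, target dataset) for each Copy activity."""
    edges = []
    for activity in pipeline["properties"].get("activities", []):
        if activity.get("type") != "Copy":
            continue
        for src in activity.get("inputs", []):
            for dst in activity.get("outputs", []):
                edges.append((src["referenceName"], activity["name"], dst["referenceName"]))
    return edges


def partial_activities(pipeline):
    """Return names of activities whose type has only partial lineage coverage."""
    return [
        a["name"]
        for a in pipeline["properties"].get("activities", [])
        if a.get("type") in PARTIAL_TYPES
    ]


pipeline = json.loads(PIPELINE_JSON)
print(copy_edges(pipeline))
print(partial_activities(pipeline))
```

The edges refer to dataset names; resolving each dataset to a physical table or file additionally requires the dataset and linked service definitions.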
Mapping Data Flow Transformation Support
Source: ✅
Sink: ✅
Derived Column: ✅
Select: ✅
Filter: ✅
Join: ✅
Union: ✅
Lookup: ✅
Aggregate: ✅
Sort: ✅
Pivot: ✅
Unpivot: ✅
Window: ✅
Rank: ✅
Split: ✅
Alter Row: ✅
Distinct: ✅
Exists: ✅
Flatten: ✅
Key Generate: ✅
Conditional Split: ✅
Parse: ✅
Assert: ⚠️
The ⚠️ icon indicates partially supported functionality with limited lineage coverage in applicable scenarios.
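Mapping data flows are stored as a data flow script in which each transformation names its input streams and ends with `~> OutputStream`. The sketch below recovers the stream-to-stream graph from such a script. It assumes one transformation per line, which real scripts do not guarantee, and the regular expression, the sample script, and the function name are illustrative rather than the connector's parser.

```python
import re

# Matches "<inputs> <operation>(...) ~> <output>"; inputs are absent for sources.
# Assumption: each transformation statement fits on a single line.
STATEMENT = re.compile(
    r"^\s*(?:(?P<inputs>\w+(?:\s*,\s*\w+)*)\s+)?(?P<op>\w+)\(.*\)\s*~>\s*(?P<name>\w+)\s*$"
)

# Hypothetical data flow script.
SCRIPT = """\
source(output(id as integer, first as string, last as string), allowSchemaDrift: false) ~> Customers
source(output(id as integer, total as double), allowSchemaDrift: false) ~> Orders
Customers, Orders join(Customers@id == Orders@id, joinType:'inner') ~> Joined
Joined derive(fullName = concat(first, ' ', last)) ~> WithName
WithName sink(allowSchemaDrift: false) ~> Output
"""


def stream_edges(script):
    """Return (upstream stream, operation, downstream stream) for each statement."""
    edges = []
    for line in script.splitlines():
        match = STATEMENT.match(line)
        if not match:
            continue  # Unrecognized line: no lineage is inferred for it.
        inputs = [s.strip() for s in (match.group("inputs") or "").split(",") if s.strip()]
        for upstream in inputs:
            edges.append((upstream, match.group("op"), match.group("name")))
    return edges


for edge in stream_edges(SCRIPT):
    print(edge)
```

Source statements have no upstream stream, so they contribute no edge here; in a fuller implementation they would be linked to their datasets instead.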
Column-Level Lineage Coverage
Column-level lineage extraction is supported for standard transformation scenarios where metadata and transformation logic are accessible during parsing.
Column Creation and Transformation Coverage
Select and Rename Columns: ✅
Join Columns: ✅
Union Columns: ✅
Pivot and Unpivot Columns: ✅
Derived Columns and Expressions: ⚠️
Aggregate Columns: ⚠️
Expression Functions: ⚠️
The ⚠️ icon indicates partially supported functionality with limited lineage coverage in applicable scenarios.
Column-level lineage is supported for standard source-to-transformation-to-target mappings across supported Azure Data Factory activities and data flow transformations.
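For a Copy activity, an explicit column mapping appears in the activity JSON as a `TabularTranslator` with a list of source and sink column names. The following sketch extracts those pairs as column-level lineage. The sample activity and the function name are hypothetical, and activities without an explicit or statically defined mapping simply yield no pairs in this sketch.

```python
# Hypothetical Copy activity with an explicit column mapping.
ACTIVITY = {
    "name": "CopyCustomers",
    "type": "Copy",
    "typeProperties": {
        "source": {"type": "AzureSqlSource"},
        "sink": {"type": "AzureSqlSink"},
        "translator": {
            "type": "TabularTranslator",
            "mappings": [
                {"source": {"name": "Id"}, "sink": {"name": "CustomerID"}},
                {"source": {"name": "Name"}, "sink": {"name": "CustomerName"}},
            ],
        },
    },
}


def copy_column_lineage(activity):
    """Return (source column, sink column) pairs from an explicit Copy mapping."""
    translator = activity.get("typeProperties", {}).get("translator")
    # A missing or expression-based translator cannot be resolved statically.
    if not isinstance(translator, dict) or translator.get("type") != "TabularTranslator":
        return []
    pairs = []
    for mapping in translator.get("mappings", []):
        src = mapping.get("source", {}).get("name")
        dst = mapping.get("sink", {}).get("name")
        if src and dst:
            pairs.append((src, dst))
    return pairs


print(copy_column_lineage(ACTIVITY))
```

Mappings that address columns by ordinal or JSON path rather than by name are skipped here and would need additional handling.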
Supported Use Cases
The connector supports lineage extraction across standard Azure Data Factory integration workflows, transformation pipelines, and orchestration scenarios.
Supported Lineage Scenarios
Database-to-Database ETL: Lineage extraction across relational database integrations such as SQL Server, Azure SQL, Oracle, and Snowflake.
File-to-Database Processing: Lineage extraction from Azure Blob Storage, ADLS CSV/JSON files, and other file-based sources into database targets.
Database-to-File Processing: Lineage extraction from database queries into Parquet, CSV, or other file-based outputs.
Complex Data Flow Transformations: Lineage extraction for joins, aggregates, derived columns, pivot/unpivot, and transformation workflows.
Nested Pipeline Dependencies: Lineage extraction across parent-child pipeline relationships.
Conditional Workflow Processing: Lineage extraction across If Condition, Switch, Until, and ForEach activities.
Cross-Cloud Integrations: Lineage extraction across AWS S3, Salesforce, REST integrations, and Azure-based targets.
Column-Level Transformation Lineage: Column-level lineage for supported transformations including Select, Join, and Pivot/Unpivot, with partial coverage for Aggregate and Derived Column as noted under Column-Level Lineage Coverage.
Table-level and column-level lineage are supported for standard Azure Data Factory activities, datasets, and transformation workflows where metadata definitions are accessible.
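Parent-child and conditional scenarios require walking activities nested inside control flow containers. The sketch below, which is illustrative and not the connector's code, traverses ForEach, Until, If Condition, and Switch containers and collects the child pipelines referenced by Execute Pipeline activities. The sample pipeline and the function names are assumptions.

```python
def iter_activities(activities):
    """Yield every activity, descending into control flow containers."""
    for activity in activities:
        yield activity
        props = activity.get("typeProperties", {})
        kind = activity.get("type")
        if kind in ("ForEach", "Until"):
            yield from iter_activities(props.get("activities", []))
        elif kind == "IfCondition":
            yield from iter_activities(props.get("ifTrueActivities", []))
            yield from iter_activities(props.get("ifFalseActivities", []))
        elif kind == "Switch":
            for case in props.get("cases", []):
                yield from iter_activities(case.get("activities", []))
            yield from iter_activities(props.get("defaultActivities", []))


def child_pipelines(pipeline):
    """Return the names of pipelines invoked by Execute Pipeline activities."""
    return [
        a["typeProperties"]["pipeline"]["referenceName"]
        for a in iter_activities(pipeline["properties"].get("activities", []))
        if a.get("type") == "ExecutePipeline"
    ]


# Hypothetical parent pipeline.
PARENT = {
    "name": "LoadAllTables",
    "properties": {
        "activities": [
            {
                "name": "PerTable",
                "type": "ForEach",
                "typeProperties": {
                    "activities": [
                        {
                            "name": "RunLoad",
                            "type": "ExecutePipeline",
                            "typeProperties": {
                                "pipeline": {"referenceName": "LoadOneTable", "type": "PipelineReference"}
                            },
                        }
                    ]
                },
            },
            {
                "name": "CheckResult",
                "type": "IfCondition",
                "typeProperties": {
                    "ifTrueActivities": [
                        {
                            "name": "RunNotify",
                            "type": "ExecutePipeline",
                            "typeProperties": {
                                "pipeline": {"referenceName": "Notify", "type": "PipelineReference"}
                            },
                        }
                    ]
                },
            },
        ]
    },
}

print(child_pipelines(PARENT))
```

Applying the same traversal to each child pipeline in turn yields the full parent-child dependency chain.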
Partial or Limited Coverage
Certain scenarios provide partial lineage coverage due to runtime-generated logic, dynamic metadata resolution, or limitations in transformation visibility.
Partial Coverage Scenarios
Dynamic SQL Queries: Parameterized or runtime-generated SQL queries are not fully parsed during lineage extraction.
Complex Nested Expressions: Highly nested expressions and multi-step transformation logic may not resolve completely.
Schema Drift in Data Flows: Dynamic schema changes may result in incomplete column-level lineage.
Encrypted Connection Configurations: Azure Key Vault or encrypted connection values require accessible decrypted configurations.
Databricks Notebook Logic: Internal notebook transformation logic is not fully visible for lineage extraction.
Custom Activities: Internal processing logic within custom activities is partially supported.
Parameterized File Paths: Wildcard paths and partition-based file structures may not resolve fully without pipeline execution metadata.
Variable-Based Runtime Logic: Runtime-generated values and dynamically assigned variables may provide partial lineage mapping.
Incomplete metadata, unresolved parameters, inaccessible encrypted configurations, schema drift, or unsupported runtime logic may prevent complete lineage creation.
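Many of these cases can be anticipated before extraction by scanning definitions for values that are only known at run time. In ADF JSON, such values appear either as objects with `"type": "Expression"` or as strings beginning with `@`. The following sketch lists the JSON paths of those values in one activity; the sample activity and the function names are hypothetical, and the check is a heuristic rather than the connector's own logic.

```python
def is_dynamic(value):
    """Return True if a JSON value is an ADF expression rather than a literal."""
    if isinstance(value, dict):
        return value.get("type") == "Expression"
    # "@@" escapes a literal "@", so it does not start an expression.
    return isinstance(value, str) and value.startswith("@") and not value.startswith("@@")


def find_dynamic(node, path=""):
    """Return the dotted paths of all expression-valued properties under node."""
    hits = []
    if is_dynamic(node):
        hits.append(path or "$")
    elif isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_dynamic(value, f"{path}.{key}" if path else key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_dynamic(value, f"{path}[{index}]"))
    return hits


# Hypothetical Copy activity whose source query is built at run time.
ACTIVITY = {
    "name": "CopyDaily",
    "type": "Copy",
    "typeProperties": {
        "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": {
                "value": "@concat('SELECT * FROM ', pipeline().parameters.tableName)",
                "type": "Expression",
            },
        },
        "sink": {"type": "ParquetSink"},
    },
}

print(find_dynamic(ACTIVITY))
```

Paths reported by such a scan point to the properties where lineage is likely to be partial, for example a runtime-built query or a parameterized file path.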
Unsupported Scenarios
The connector does not extract lineage from external processing logic, undocumented APIs, or activities that do not expose sufficient metadata for lineage generation.
Unsupported Lineage
REST and Web Activity Payload Analysis: Request and response payloads are not analyzed for automatic lineage extraction.
External APIs Without Metadata Contracts: APIs without structured metadata definitions are unsupported.
Internal SSIS Package Lineage: SSIS package internals require separate SSIS lineage integration.
Runtime-Generated External Logic: Externally generated transformation logic without accessible metadata is unsupported.
Unresolved Dynamic Runtime Metadata: Runtime-generated schemas and inaccessible execution metadata are unsupported.
Unsupported scenarios do not generate lineage relationships and may appear disconnected in lineage visualization.
Current Functional Status
This section summarizes the lineage coverage currently provided by the Azure Data Factory connector, including its limitations.
Overall Coverage: Production-ready coverage for standard Azure Data Factory pipelines, activities, and data flow transformations.
Lineage Depth: Table-level and column-level lineage are supported for standard workflows.
Supported Inputs: Pipelines, datasets, linked services, data flows, triggers, activities, and transformation metadata.
Functional Scope: Lineage extraction supports standard ETL workflows, orchestration logic, and transformation processing.
Partial Coverage Areas: Dynamic SQL, schema drift, parameterized paths, encrypted configurations, Databricks notebooks, and complex expressions.
Unsupported Areas: REST payload analysis, undocumented APIs, unresolved runtime metadata, and SSIS package internals.
Resulting Output: Lineage is generated for supported Azure Data Factory workflows, with partial or unavailable mapping in unsupported scenarios.
Coverage remains partial for dynamic runtime logic, schema drift, encrypted configuration handling, Databricks notebook internals, parameterized queries, and unsupported external processing because these scenarios do not expose sufficient metadata for complete lineage extraction.
Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA

