Azure Data Lake

This article outlines the integration with the Azure Data Lake connector, which enables streamlined metadata management through features such as crawling and data preview. It also supports secure authentication via Credential Manager.

Overview

Connector Details

Connector Category

Cloud Storage

OvalEdge Releases Supported

Release 5.0 to Release 7.1.1

Connectivity

[How the connection is established with Azure Data Lake]

ADL SDK

Connector Features

Feature
Availability

Crawling

Delta Crawling

Profiling*

Sample Profiling

Query Sheet

Data Preview

Auto Lineage

Manual Lineage

Secure Authentication via Credential Manager

Data Quality

DAM (Data Access Management)

Bridge

*Full profiling is supported through DuckDB. To enable this capability, set the system setting enable.duckdb to True.

Metadata Mapping

The following objects are crawled from Azure Data Lake and mapped to the corresponding UI assets.

| Azure Data Lake Object | Azure Data Lake Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type |
|---|---|---|---|---|
| File/Folder | Folder | Folder | Folder | Folder |
| File | File | File | File | - |
| File | XLSX | Folder(subfile) | Folder(subfile) | Folder(subfile) |
| File | XLS | Folder(subfile) | Folder(subfile) | Folder(subfile) |
| File | CSV | File | File | File |
| File | TXT | File | File | File |
| File | PARQUET | File | File | File |
| File | ORC | File | File | File |
| File | JSON | File | File | File |
| File | YAML | File | File | File |
| File | PIP | File | File | File |

Set up a Connection

Prerequisites

The following are the prerequisites to establish a connection:

Ensure that the CSV files follow the required formatting standards for proper data processing and visibility. Refer to CSV Format Requirements.

Service Account User Permissions

👨‍💻 Who can provide these permissions? These permissions are typically granted by the Azure Data Lake administrator, as users may not have the required access to assign them independently.

| Operation | Objects | Access Permission |
|---|---|---|
| Connector Validation | Containers | Read |
| Crawling | Containers | Read |
| Crawling & Profiling | Buckets | Read |
| Crawling & Profiling | Folder | Read |
| Crawling & Profiling | Files | Read |
| View Data | profile/Get Data | Read |

Required Permissions

  • Ensure the following Azure permissions are assigned:

    • Microsoft.Storage/storageAccounts/blobServices/containers/read

    • Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read

  • If ACLs are enabled in ADLS Gen2, also configure the following access controls:

    • Folder access (traverse/list): x, r-x

    • File access (read): r--

  • These permissions are required for successful access and operations.
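The ACL requirements above can be sanity-checked with a small helper. The following is a minimal sketch, assuming ACL permissions are written as three-character POSIX-style strings such as r-x. The function names are hypothetical and are not part of OvalEdge or the Azure SDK.

```python
# Hypothetical helpers for reasoning about POSIX-style ACL strings (e.g. "r-x").
# They do not call Azure; they only interpret the permission text.

def parse_acl(perm: str) -> dict:
    """Turn a three-character permission string into read/write/execute flags."""
    if len(perm) != 3:
        raise ValueError(f"expected 3 characters, got {perm!r}")
    flags = {}
    for name, expected, actual in zip(("read", "write", "execute"), "rwx", perm):
        if actual not in (expected, "-"):
            raise ValueError(f"unexpected character {actual!r} in {perm!r}")
        flags[name] = actual == expected
    return flags


def can_traverse_folder(perm: str) -> bool:
    """Traversing a folder needs execute (x)."""
    return parse_acl(perm)["execute"]


def can_list_folder(perm: str) -> bool:
    """Listing a folder needs read and execute (r-x)."""
    flags = parse_acl(perm)
    return flags["read"] and flags["execute"]


def can_read_file(perm: str) -> bool:
    """Reading a file needs read (r--)."""
    return parse_acl(perm)["read"]
```

For example, a folder with --x can be traversed but not listed, while r-x allows both.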

Connection Configuration Steps

Users are required to have the Connector Creator role in order to configure a new connection.

  1. Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for Azure Data Lake, and complete the required parameters.

Fields marked with an asterisk (*) are mandatory for establishing a connection.

Field Name
Description

Connector Type

By default, "Azure Data Lake" is displayed as the selected connector type.

Credential Manager*

Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on your selection.

Supported Credential Managers:

  • OE Credential Manager

  • AWS Secrets Manager

  • HashiCorp Vault

  • Azure Key Vault

License Add-ons

Select the checkbox for Data Quality Add-On to identify data quality issues using data anomaly detection.

Connector Environment

Select the environment (Example: PROD, STG) configured for the connector.

Connector Name*

Enter a unique name for the Azure Data Lake connection (Example: "Azure_Data_Lake").

Connector Description

Enter a brief description of the connector.

Authentication Type*

The following two types of authentication are supported for Azure Data Lake:

  • ADL String

  • ADL Service Principal

Client Id*

Enter the Client ID (Application ID), which uniquely identifies the registered application.

Note: This field appears only when the authentication type is set to “ADL Service Principal”.

Client Secret*

Enter the Client Secret, which is used by the application to authenticate and request tokens.

Note: This field appears only when the authentication type is set to “ADL Service Principal”.

Tenant Id*

Provide the Tenant ID (Directory ID) that identifies the Azure Active Directory instance used for authentication.

Note: This field appears only when the authentication type is set to “ADL Service Principal”.

ADL Endpoint*

Provide the URL used to interact with ADL storage accounts.

Note: This field appears only when the authentication type is set to “ADL Service Principal”.

ADL Connection String*

Enter the ADL connection string generated for the Azure storage account.

Note: This field appears only when the authentication type is set to “ADL String”.

Default Governance Roles

Default Governance Roles*

Select the appropriate users or teams for each governance role from the drop-down list. All users and teams configured in OvalEdge Security are displayed for selection.

Admin Roles

Admin Roles*

Select one or more users from the dropdown list for Integration Admin and Security & Governance Admin. All users configured in OvalEdge Security are available for selection.

No of Archive Objects

No Of Archive Objects*

This shows the number of recent metadata changes to a dataset at the source. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive.

Example: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.

Bridge

Select Bridge*

If applicable, select the bridge from the drop-down list.

The drop-down list displays all active bridges configured in OvalEdge. These bridges enable communication between data sources and OvalEdge without altering firewall rules.

  2. After entering all connection details, the following actions can be performed:

    1. Click Validate to verify the connection.

    2. Click Save to store the connection for future use.

    3. Click Save & Configure to apply additional settings before saving.

  3. The saved connection will appear on the Connectors home page.

Manage Connector Operations

Crawl/Profile

To perform crawl operations, users must be assigned the Integration Admin role.

  1. Navigate to the Connectors page and click Crawl/Profile.

  2. This action initiates the metadata collection process from the data source and loads the retrieved metadata into the File Manager > File Explorer.

  3. In the File Manager, click the connector name, select the specific folder(s) or file(s), then click Catalog / Catalog and Profile from the Nine Dots menu. For more details, click here.

  4. The selected files or folders will be added to the Data Catalog > Files/File Columns tab.

Other Operations

The Connectors page in OvalEdge provides a centralized view of all configured connectors, including their health status.

Managing connectors includes:

  • Connectors Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.

  • Viewing: Click the Eye icon next to the connector name to view connector details, including databases, tables, columns, and codes.

Nine Dots Menu Options:

To view, edit, validate, configure, or delete connectors, click on the Nine Dots menu.

  • Edit Connector: Update and revalidate the data source.

  • Validate Connector: Check the connection's integrity.

  • Settings: Modify connector settings.

    • Crawler: Configure data extraction.

    • Access Instructions: Add notes on how data can be accessed.

    • Business Glossary Settings: Manage term associations at the connector level.

    • Anomaly Detection Settings: Configure anomaly detection preferences at the connector level.

  • Delete Connector: Remove a connector with confirmation.

Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

1. Connection Validation Failure

Error Description:

  • Connection validation fails due to incorrect credentials, an invalid connection string, missing containers, or network issues.

Resolution:

  • Verify connection string format (must start with DefaultEndpointsProtocol=https)

  • Check Client ID, Client Secret, and Tenant ID

  • Ensure the storage account exists

  • Confirm at least one container is available

  • Validate network connectivity
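Before validating in OvalEdge, the connection string format can be checked offline. The sketch below assumes the usual semicolon-separated key=value layout and only checks the points named in this article (the DefaultEndpointsProtocol=https prefix, AccountName, and AccountKey). The function names and the sample values are hypothetical.

```python
# Hypothetical offline check of an ADL connection string's format.
# It does not contact Azure and cannot tell whether the key is valid.

REQUIRED_KEYS = ("DefaultEndpointsProtocol", "AccountName", "AccountKey")


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' into a dict, keeping '=' inside values."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = value.strip()
    return parts


def find_problems(conn_str: str) -> list:
    """Return a list of human-readable format problems (empty if none)."""
    try:
        parts = parse_connection_string(conn_str)
    except ValueError as exc:
        return [str(exc)]
    problems = []
    if not conn_str.strip().startswith("DefaultEndpointsProtocol=https"):
        problems.append("must start with DefaultEndpointsProtocol=https")
    for key in REQUIRED_KEYS:
        if not parts.get(key):
            problems.append(f"missing {key}")
    return problems
```

Calling find_problems with a made-up string such as "DefaultEndpointsProtocol=https;AccountName=demo;AccountKey=abc123==" returns an empty list, while a string that omits AccountKey reports "missing AccountKey".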

2. Invalid Client Secret / Authentication Failure

Error Description:

Authentication fails when the client secret is incorrect, expired, or mismatched.

Resolution:

  • Verify Client Secret value (copy correctly from Azure Portal)

  • Check if the secret has expired

  • Generate a new secret if required

  • Ensure Client ID and Tenant ID are correct

3. Authorization Failure (Access Denied)

Error Description:

The service principal does not have the required permissions to access the storage account.

Resolution:

  • Assign the required role, for example Storage Blob Data Reader for read operations

  • Ensure the role is assigned at the storage account level

  • Verify access in the Azure Portal

4. Resource Not Found (File / Container / Path)

Error Description:

The file, container, or path does not exist or is incorrectly specified.

Resolution:

  • Verify the resource exists in the Azure Portal

  • Check the container name and file path

  • Use correct format: container/folder/file

  • Ensure no extra or missing slashes
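The path checks above can be expressed as a short validator. This is an illustrative sketch, assuming paths follow the container/folder/file format described here. The function name is hypothetical and the check is purely textual, so it cannot confirm that the resource exists.

```python
# Hypothetical validator for paths in the form container/folder/file.

def split_adl_path(path: str):
    """Return (container, folders, file_name) or raise ValueError."""
    if path != path.strip():
        raise ValueError("path has leading or trailing whitespace")
    if path.startswith("/") or path.endswith("/"):
        raise ValueError("path must not start or end with a slash")
    segments = path.split("/")
    if "" in segments:
        raise ValueError("path contains an empty segment (double slash)")
    if len(segments) < 2:
        raise ValueError("expected at least container/file")
    # First segment is the container, last is the file, the rest are folders.
    return segments[0], segments[1:-1], segments[-1]
```

For example, split_adl_path("sales/2024/q1/orders.csv") separates the container sales, the folders 2024 and q1, and the file orders.csv, while "sales//orders.csv" is rejected.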

5. No Containers Found During Validation

Error Description:

Validation fails when the storage account has no containers or permissions to list containers are missing.

Resolution:

  • Create at least one container in the storage account

  • Verify permissions to list containers

  • Confirm access using Azure Portal

6. No files found in the container

Error Description:

  • Occurs when the container is empty, the path is incorrect, or permissions are missing.

Resolution:

  • Verify files exist in the container

  • Check folder path

  • Confirm permissions

  • Validate container name

7. Container listing failed

Error Description:

  • Occurs due to missing permissions, invalid storage account, or network issues.

Resolution:

  • Assign role: Storage Blob Data Reader

  • Verify storage account

  • Check network connectivity

  • Validate authentication

8. File download failed

Error Description:

  • Occurs when the file does not exist, permissions are missing, or network issues occur.

Resolution:

  • Verify file exists

  • Ensure download permissions

  • Check network stability

  • Validate encryption access if applicable

9. Access Denied

Error Description:

  • Occurs when Client ID, Secret, or Tenant ID is incorrect, expired, or permissions are missing.

Resolution:

  • Verify Client ID, Client Secret, Tenant ID

  • Check if the secret is expired

  • Ensure service principal is active

  • Assign the required role, for example Storage Blob Data Reader for read operations

10. Authentication / Token error

Error Description:

  • Occurs when credentials are invalid, expired, or Azure AD authentication fails.

Resolution:

  • Verify Client ID, Secret, Tenant ID

  • Check secret expiry

  • Ensure Tenant ID is correct

  • Validate access to: login.microsoftonline.com
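Access to login.microsoftonline.com can be checked from the OvalEdge host with a plain TCP connection test. The following is a minimal sketch using only the Python standard library. The helper name is hypothetical, and a successful connection shows only that port 443 is reachable, not that the credentials are valid.

```python
import socket

# Hypothetical reachability check; it opens and closes a TCP connection
# and does not send any credentials.

def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Running can_reach("login.microsoftonline.com") on the server that hosts the connector indicates whether a firewall or DNS rule is blocking outbound HTTPS.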

11. Slow file listing

Error Description:

  • Occurs when containers have a large number of files or network latency is high.

Resolution:

  • Wait for pagination (batch loading)

  • Use specific folder paths

  • Check network speed
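The effect of using a specific folder path together with batch loading can be illustrated with a small generator. This sketch is not how the connector lists files internally. It assumes file names are available as an iterable of strings, and the function name and batch size are hypothetical.

```python
from itertools import islice

# Hypothetical illustration of prefix filtering plus batch loading.

def iter_batches(names, prefix: str = "", batch_size: int = 1000):
    """Yield lists of at most batch_size names that start with prefix."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    matching = (name for name in names if name.startswith(prefix))
    while True:
        batch = list(islice(matching, batch_size))
        if not batch:
            return
        yield batch
```

Narrowing the prefix reduces how many names have to be processed, and batching keeps each step small even when the container is large.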

12. Operation timeout

Error Description:

  • Occurs due to slow network, large data volume, or low timeout settings.

Resolution:

  • Check the internet connection

  • Retry operation

  • Perform during off-peak hours

  • Increase timeout settings if possible
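Where an operation is scripted outside OvalEdge, the retry advice above can be automated with exponential backoff. The following is a minimal sketch, assuming the operation raises Python's built-in TimeoutError on a timeout. The function name and default values are hypothetical.

```python
import time

# Hypothetical retry helper with exponential backoff between attempts.

def retry_on_timeout(operation, attempts: int = 3, base_delay: float = 1.0,
                     sleep=time.sleep):
    """Call operation(); on TimeoutError wait and retry, doubling the delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the last timeout
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep parameter is injectable so the helper can be exercised in tests without actually waiting.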

13. CSV column names not detected

Error Description:

  • Occurs when the file lacks a header, uses an incorrect delimiter, or has encoding issues.

Resolution:

  • Ensure the header row exists

  • Use standard delimiters (comma, tab, semicolon)

  • Save file in UTF-8

  • Avoid special characters in column names
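A CSV file can be checked locally for these issues before it is crawled. The sketch below uses Python's csv.Sniffer to detect the delimiter and applies a simple header heuristic (every cell in the first row is non-empty and none parses as a number). The heuristic and the function names are assumptions for illustration and are not the rules OvalEdge applies.

```python
import csv
import io

# Hypothetical pre-check for a CSV file's delimiter and header row.

def _is_number(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False


def inspect_csv(text: str, delimiters: str = ",;\t"):
    """Return (delimiter, first_row, looks_like_header) for UTF-8 decoded text."""
    dialect = csv.Sniffer().sniff(text, delimiters=delimiters)
    rows = list(csv.reader(io.StringIO(text), dialect))
    first_row = rows[0] if rows else []
    # Heuristic: a header has only non-empty, non-numeric cells.
    looks_like_header = bool(first_row) and all(
        cell.strip() and not _is_number(cell) for cell in first_row
    )
    return dialect.delimiter, first_row, looks_like_header
```

To check a file on disk, read it with open(path, encoding="utf-8") first. A decoding error at that step points to the encoding issue described above.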

14. Excel file cannot be read

Error Description:

  • Occurs when the file is corrupted, unsupported, password-protected, or too large.

Resolution:

  • Use .xls or .xlsx format

  • Remove password protection

  • Verify file opens in Excel

  • Reduce file size if needed

15. Data type detection failed

Error Description:

  • Occurs due to insufficient data, mixed data types, or inconsistent formats.

Resolution:

  • Provide at least 20–30 rows

  • Maintain consistent data types

  • Use standard date formats (YYYY-MM-DD)
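The sketch below shows why mixed or inconsistently formatted values lead to a fallback type. It is a simplified illustration, not the profiler's actual algorithm: a column is labeled integer, float, or date only when every non-empty sample parses as that type, and dates must use YYYY-MM-DD. All names are hypothetical.

```python
from datetime import datetime

# Hypothetical, simplified column type inference over sample values.

def _parse_iso_date(value: str):
    return datetime.strptime(value, "%Y-%m-%d")


def _parses(parser, value: str) -> bool:
    try:
        parser(value)
        return True
    except ValueError:
        return False


def infer_type(values) -> str:
    """Return 'integer', 'float', 'date', 'string', or 'unknown' (no data)."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "unknown"
    for name, parser in (("integer", int), ("float", float), ("date", _parse_iso_date)):
        if all(_parses(parser, v) for v in cleaned):
            return name
    return "string"  # mixed or unrecognized formats fall back to string
```

A column that mixes 2024-01-31 with 31/01/2024 is reported as string, which is why a single consistent date format matters.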

FAQs

Why does connection string authentication fail?

It fails when the connection string format is incorrect, the account key is wrong, or the storage account does not exist. Check that the string starts with DefaultEndpointsProtocol=https, verify AccountName and AccountKey, and confirm the storage account exists.

Why are folders not showing correctly?

Folders do not appear when the path format is incorrect or when the results are still loading. Verify the path structure and wait for all results to load.

Why is file download slow or timing out?

Downloads are slow due to large files, a slow network, or regional differences. Check your network speed, try during off-peak hours, or increase timeout settings.

Why can’t I see file details?

You cannot see file details if permissions are missing or the file does not exist. Verify file existence, check permissions, and refresh the connection.

Why can’t I upload files?

Upload fails when permissions are missing, the path is invalid, or the network is unstable. Verify upload permissions, container existence, and path format.

What role is required for file operations?

Use Storage Blob Data Reader for read operations and Storage Blob Data Contributor for upload or write operations. Assign roles at the storage account level.

Why do I see connection or client errors?

These errors occur due to network issues, SSL problems, or incorrect proxy settings. Check your network, system time, and proxy configuration, then retry.

Why does the setup fail or the client not initialize?

Setup fails when the authentication or connection setup does not complete. Verify credentials, storage account, and network connectivity, then retry.

Why do operations time out?

Operations time out due to a slow network or large data responses. Retry the operation, or improve network speed and increase timeout settings.

Why are CSV column names not detected?

This happens when the file has no header row or uses an incorrect delimiter or encoding. Add a header row and use standard delimiters with UTF-8 encoding.

Why can’t the system read Excel files?

This happens when the file is corrupted, unsupported, or too large. Use .xls or .xlsx format, remove protection, and reduce file size if needed.

Why are data types detected incorrectly?

This happens when data is inconsistent, insufficient, or incorrectly formatted. Use consistent data types and standard formats, and include enough sample rows.

What is the difference between a Connection String and a Service Principal?

A connection string uses account keys and is simple to set up. A service principal uses Azure AD and is more secure. Use the connection string for testing and the service principal for production.

Why is the endpoint URL important?

It tells the system which storage account to connect to. Use the format: https://<storage-account-name>.blob.core.windows.net
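An endpoint URL can be built and checked with a few lines of code. The sketch below assumes the blob.core.windows.net host suffix given in this article and Azure's storage account naming rule (3 to 24 lowercase letters and digits). The function names and the account name in the usage note are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical helpers for building and checking a Blob endpoint URL.

_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
_SUFFIX = ".blob.core.windows.net"


def blob_endpoint(account_name: str) -> str:
    """Build the endpoint URL for a storage account name."""
    if not _ACCOUNT_RE.fullmatch(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    return f"https://{account_name}{_SUFFIX}"


def account_from_endpoint(url: str) -> str:
    """Extract the storage account name, rejecting malformed endpoints."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if parsed.scheme != "https" or not host.endswith(_SUFFIX):
        raise ValueError(f"unexpected endpoint: {url!r}")
    account = host[: -len(_SUFFIX)]
    if not _ACCOUNT_RE.fullmatch(account):
        raise ValueError(f"invalid storage account name in {url!r}")
    return account
```

For a made-up account named mydatalake01, blob_endpoint returns https://mydatalake01.blob.core.windows.net, and account_from_endpoint rejects a URL whose account name is missing.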

Why must the connection string follow a specific format?

The system cannot parse or authenticate an incorrect format. Always use the exact format from the Azure Portal.

Why is Tenant ID required?

Tenant ID identifies your Azure AD directory for authentication. Copy it from the Azure Portal.

Why do operations fail when using a proxy?

Failures occur when proxy settings are incorrect or do not support HTTPS. Verify proxy configuration and authentication settings.

Why can’t I connect to Azure?

Connection fails due to firewall, DNS, or network restrictions. Allow outbound HTTPS (port 443) and verify DNS and network rules.

Why is file summary not available?

This happens when the file does not exist or when permissions are missing. Verify file existence and permissions.

Why are file paths not recognized?

This happens when the path format is incorrect or contains special characters. Use the format: container/folder/file.

Why does folder validation fail?

It fails when the path is incorrect or no files exist in that folder. Verify the path and ensure files exist.

Why does pagination fail?

Pagination fails due to large data volume or timeout issues. Load data in smaller batches or use specific paths.

Why does the temporary download fail?

It fails when disk space is low or permissions are missing. Ensure enough disk space and write permissions.

Why do I see “No containers found”?

This happens when the storage account has no containers or access is restricted. Create a container and verify permissions.


Copyright © 2026, OvalEdge LLC, Peachtree Corners, GA, USA.
