Delta Lake

This document outlines the integration with Delta Lake, enabling streamlined metadata management through features such as crawling, profiling, querying, data preview, and lineage building (both automatic and manual). It also ensures secure authentication via Credential Manager.

Overview

Connector Details

Connector Category

RDBMS System-Data Warehouse

Connector Version

Release 6.3.4

Releases Supported (Available from)

Release 6.1

Connectivity

[How the connection is established with Delta Lake]

REST APIs & JDBC driver

Verified Delta Lake Version

2.6.40

The Delta Lake connector has been internally verified with the above Delta Lake version and is expected to be compatible with other supported Delta Lake versions. If validation or metadata crawling issues occur, submit a support ticket for investigation and feedback.

Connector Features

Feature
Availability

Crawling

Delta Crawl

Profiling

Query Sheet

Data Preview

Auto Lineage

Manual Lineage

Authentication via Credential Manager

Data Quality

DAM (Data Access Management)

Bridge

Metadata Mapping

The following objects are crawled from Delta Lake and mapped to the corresponding UI assets.

Delta Lake Object | Delta Lake Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type
Table | Table Name | Table | Tables | Table
Table | Table Data Type | Table Type | Tables | Table
Columns | Column Name | Column | Table Columns | Columns
Columns | Column Datatype | Column Type | Table Columns | Columns
Columns | Column Comment | Source Description | Table Columns | Columns
Views | View Name | View | Tables | Views

Set up a Connection

Prerequisites

The following are the prerequisites to establish a connection:

Whitelisting Ports

Ensure the inbound port “443” is whitelisted to enable successful connectivity with the Delta Lake database.
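Before configuring the connector, it can help to confirm that port 443 is actually reachable from the machine that will open the connection. The following is a minimal sketch using only the Python standard library; the function name `port_is_reachable` is illustrative and not part of OvalEdge or Delta Lake, and a successful TCP connection only shows that the port is open, not that authentication will succeed.

```python
import socket


def port_is_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This is only a network-level check (hypothetical helper); it does not
    validate credentials or the HTTP path.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `port_is_reachable("<your Delta Lake server>")` from the OvalEdge host returns False when a firewall rule is still blocking the port, which is a hint to revisit the whitelisting step.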

Service Account User Permissions

👨‍💻 Who can provide these permissions? These permissions are typically granted by the Delta Lake administrator, as users may not have the required access to assign them independently.

Operations | Objects | System Tables | Access Permissions
Crawling & Profiling | Schemas | Schemas | USAGE
Crawling & Profiling | Tables / Views | Tables | USAGE
Crawling & Profiling | Table / View Columns | On the Table | SELECT
Crawling & Lineage Building | Lineage-related source codes | System.Access.Table_Lineage, System.Access.Column_Lineage | SELECT
Crawling | Column Relations | Information_Schema.Table_Constraints, Information_Schema.Key_Column_Usage, Information_Schema.Referential_Constraints | SELECT
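The permissions above can be handed to an administrator as a list of GRANT statements. The sketch below only builds the statement text; it is an illustration under assumptions, not a verified script. The privilege keywords (USAGE, SELECT) are copied from the table, while the exact GRANT syntax, the way system and information schema tables are qualified, and the principal name `svc_ovaledge` are assumptions that should be checked against the target environment before running anything.

```python
# System tables named in the permissions table above (lower-cased here).
SYSTEM_TABLES = [
    "system.access.table_lineage",
    "system.access.column_lineage",
    "information_schema.table_constraints",
    "information_schema.key_column_usage",
    "information_schema.referential_constraints",
]


def build_grant_statements(
    principal: str, schemas: list[str], tables: list[str]
) -> list[str]:
    """Build GRANT statement strings for a service account.

    Hypothetical helper: the statement syntax is an assumption and may need
    adjusting (for example, different privilege names in some setups).
    """
    statements = []
    # USAGE on each schema to be crawled and profiled.
    for schema in schemas:
        statements.append(f"GRANT USAGE ON SCHEMA {schema} TO `{principal}`;")
    # SELECT on user tables (for columns) and on the listed system tables.
    for table in tables + SYSTEM_TABLES:
        statements.append(f"GRANT SELECT ON TABLE {table} TO `{principal}`;")
    return statements
```

For example, `build_grant_statements("svc_ovaledge", ["sales"], ["sales.orders"])` produces one USAGE statement for the schema followed by SELECT statements for `sales.orders` and the five system tables, which the administrator can review and adapt.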

Connection Configuration Steps

  1. Log in to OvalEdge, navigate to Administration > Connectors, click + (New Connector), search for Delta Lake, and enter the required parameters.

Fields marked with an asterisk (*) are mandatory for establishing a connection.

Field Name
Description

Connector Type

By default, "Delta Lake" is displayed as the selected connector type.

Authentication

Delta Lake supports the following two authentication types:

  • Service Principal

  • Personal Access Token

Field Name
Description

Credential Manager*

Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on the selection.

Supported Credential Managers:

  • OE Credential Manager

  • AWS Secrets Manager

  • HashiCorp

  • Azure Key Vault

License Add Ons

Auto Lineage | Supported
Data Quality | Supported
Data Access | Not Supported

  • Select the checkbox for the Auto Lineage Add-On to build data lineage automatically.

  • Select the checkbox for the Data Quality Add-On to identify data quality issues using data quality rules and anomaly detection.

Connector Name*

Enter a unique name for the Delta Lake connection (Example: "Delta Lakedb").

Connector Environment

Select the environment (Example: PROD, STG) configured for the connector.

Connector Description

Enter a description to identify the purpose of the connector.

Client Id*

Enter the Client ID configured for accessing the Delta Lake database.

Client Secret*

Enter the Client Secret associated with the provided Client ID.

Server*

Enter the Delta Lake server name or IP address (Example: xxxx-sxxxxxx.xxxx4ijxxxl.xx-south-1.rxs.xxxxx.com or 1xx.xxx.1.xx).

Port*

By default, the Delta Lake port number "443" is auto-populated. If required, change it to the custom port number configured for the Delta Lake database.

Database Type*

Select the database type from the drop-down:

  • Delta Lake_Regular

  • Delta Lake_Unity_Catalog

Database

Enter the name of the database that the service account user can access within Delta Lake.

Driver*

By default, the Delta Lake driver details are auto-populated.

HTTP Path*

Enter the HTTP Path, which is typically associated with the cluster or warehouse.

Lineage Fetching Mode*

Select the Lineage Fetching Mode from the drop-down:

  • QUERY mode (Access lineage via system tables)

  • API mode (Access lineage via REST APIs)

Connection String

Configure the connection string for the Delta Lake database:

  • Automatic Mode: The system generates a connection string based on the provided credentials.

  • Manual Mode: Enter a valid connection string manually.

Replace placeholders with actual database details.

{sid} refers to the Database Name.

Proxy Enabled*

Select Yes to route API calls through a proxy server. Select No to bypass the proxy and connect directly.

Plugin Server

Enter the server’s name when running as a plugin server.

Plugin Port

Enter the port number on which the plugin is running.

Default Governance Roles

Default Governance Roles*

Select the appropriate users or teams for each governance role from the drop-down list. All users and teams configured in OvalEdge Security are displayed for selection.

Admin Roles

Admin Roles*

Select one or more users from the drop-down list for Integration Admin and Security and Governance Admin. All users configured in OvalEdge Security are available for selection.

No of Archive Objects

No Of Archive Objects*

Indicates the number of recent metadata changes to a dataset at the source that are archived. This option is off by default. Enable it by toggling the Archive button and specifying the number of objects to archive.

Example: Setting it to 4 retrieves the last 4 changes, shown in the 'version' column of the 'Metadata Changes' module.

Bridge

Select Bridge*

If applicable, select the bridge from the drop-down list.

The drop-down list displays all active bridges configured in OvalEdge. These bridges enable communication between data sources and OvalEdge without altering firewall rules.

  2. After entering all connection details, the following actions can be performed:

    1. Click Validate to verify the connection.

    2. Click Save to store the connection for future use.

    3. Click Save & Configure to apply additional settings before saving.

  3. The saved connection will appear on the Connectors home page.
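Two of the field rules above lend themselves to a quick pre-check before clicking Validate: the Server value must be a bare hostname or IP address, and a Manual Mode connection string has its placeholders replaced with actual values. The sketch below illustrates both. Only `{sid}` is named in this document; the `{server}` and `{port}` placeholders, the function names, and the `jdbc:example://` template in the usage note are hypothetical and do not reflect the actual connection string format used by the connector.

```python
from urllib.parse import urlsplit


def normalize_server(value: str) -> str:
    """Return only the host part of a Server value.

    Strips a leading http:// or https://, plus any port or path, so that
    only the hostname or IP address remains.
    """
    value = value.strip()
    # urlsplit needs a scheme or a leading // to recognise the host part.
    parts = urlsplit(value if "://" in value else "//" + value)
    if not parts.hostname:
        raise ValueError(f"No hostname or IP address found in {value!r}")
    return parts.hostname


def fill_connection_string(template: str, server: str, port: int, sid: str) -> str:
    """Replace placeholders in a manually entered connection string.

    {sid} is the database name; {server} and {port} are assumed
    placeholder names used for illustration only.
    """
    values = {"server": normalize_server(server), "port": port, "sid": sid}
    result = template
    for key, val in values.items():
        result = result.replace("{" + key + "}", str(val))
    return result
```

With the hypothetical template `"jdbc:example://{server}:{port}/{sid}"`, a Server value of `"https://host.example.com"`, port 443, and database `sales`, the helper yields `jdbc:example://host.example.com:443/sales`. Plain string replacement is used instead of `str.format` so that other braces in a real connection string are left untouched.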

Manage Connector Operations

Crawl/Profile

The Crawl/Profile button allows users to select one or more schemas for crawling and profiling.

  1. Navigate to the Connectors page and click Crawl/Profile.

  2. Select the schemas to crawl.

  3. The Crawl option is selected by default. To perform both operations, select the Crawl & Profile radio button.

  4. Click Run to collect metadata from the connected source and load it into the Data Catalog.

  5. After a successful crawl, the information appears in the Data Catalog > Databases tab.

The Schedule checkbox allows automated crawling and profiling at defined intervals, from a minute to a year.

  1. Click the Schedule checkbox to enable the Select Period drop-down.

  2. Select a time interval for the operation from the drop-down menu.

  3. Click Schedule to initiate metadata collection from the connected source.

  4. The system will automatically execute the selected operation (Crawl or Crawl & Profile) at the scheduled time.

Other Operations

The Connectors page provides a centralized view of all configured connectors, along with their health status.

Managing connectors includes:

  • Connectors Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.

  • Viewing: Click the Eye icon next to the connector name to view connector details, including databases, tables, columns, and codes.

Nine Dots Menu Options:

You can view, edit, validate, and delete connectors using the Nine Dots menu.

  • Edit Connector: Update and revalidate the data source.

  • Validate Connector: Check the connection's integrity.

  • Settings: Modify connector settings.

    • Crawler: Configure data that needs to be extracted.

    • Profiler: Customize data profiling rules and methods.

    • Query Policies: Define rules for executing queries based on roles.

    • Access Instructions: Specify how data can be accessed as a note.

    • Business Glossary Settings: Manage term associations at the connector level.

    • Anomaly Detection Settings: Configure anomaly detection preferences at the connector level.

    • Others: Configure notification recipients for metadata changes.

  • Build Lineage: Automatically build data lineage using source code parsing.

  • Delete Connector: Remove connectors with confirmation.

Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

S.No.
Error Message(s)
Error Description / Resolution

1

Error setting/closing session: 401 Unauthorized

Description:

The connector can't authenticate due to an expired or incorrect token/client secret.

Resolution:

  • Check if the token or client secret is expired.

  • Generate a new one if needed.

2

OAuth2 is currently supported on AWS, Azure, and GCP platforms.

Description:

The connector can't connect because the server address includes an unsupported protocol (http/https).

Resolution:

  • Enter only the IP address or hostname.

  • Do not include http:// or https:// before it.

  • Save the changes and test the connection.

3

Error setting/closing session: 401 Unauthorized

Description:

This error can occur due to:

  • Incorrect client ID

  • Wrong HTTP path

  • Invalid database name

Resolution:

  • Use only the IP address or hostname (no http:// or https://)

  • Verify the client ID, HTTP path, and database name

  • Correct any invalid values and retest the connection

4

Query processing time exceeded the queryTimeout(). SQLTimeoutException

Description:

The server is taking time to initialize the cluster, which may cause the connection to fail temporarily.

Resolution:

  • Wait for 2 minutes and try validating the connection again.
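The wait-and-retry advice for the timeout error can be automated when validation is driven from a script. The following is a minimal sketch: `validate` stands for any caller-supplied function that returns True when the connection check succeeds (no such function is defined in this document, so it is an assumption), and the 120-second default mirrors the 2-minute wait suggested above.

```python
import time
from typing import Callable


def validate_with_retry(
    validate: Callable[[], bool],
    attempts: int = 3,
    wait_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call validate() up to `attempts` times, pausing between failures.

    `validate` is a hypothetical callable supplied by the caller. `sleep`
    is injectable so the wait can be skipped in tests.
    """
    for attempt in range(1, attempts + 1):
        if validate():
            return True
        # Give the cluster time to finish starting before the next try.
        if attempt < attempts:
            sleep(wait_seconds)
    return False
```

A fixed wait is used here because the guidance is a single 2-minute pause; if the cluster regularly takes longer to start, increasing `wait_seconds` or `attempts` is simpler than adding backoff logic.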


Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA
