# Databricks

This article outlines the integration with the Databricks connector, which enables streamlined metadata management through features such as crawling and lineage building (both automatic and manual). It also covers secure authentication via the Credential Manager.

<div align="left"><figure><img src="https://1813356899-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FhTnkoJQml0pok9awFDhx%2Fuploads%2F73eKKhn1vOnrQjQQN2dz%2Funknown.png?alt=media&#x26;token=27e7fc05-634f-401b-b25f-6ab8884cad46" alt=""><figcaption></figcaption></figure></div>

## Overview

### Connector Details

| Attribute                                                                      | Details       |
| ------------------------------------------------------------------------------ | ------------- |
| Connector Category                                                             | ETL Tool      |
| Connector Version                                                              | Release 7.2.3 |
| Releases Supported (Available from)                                            | Release 5.1   |
| <p>Connectivity</p><p>\[How the connection is established with Databricks]</p> | REST APIs     |
| Verified Databricks Version                                                    | Cloud Version |

{% hint style="info" %}
The Databricks connector has been validated with the mentioned "Verified Databricks Versions" and is expected to be compatible with other supported Databricks versions. If there are any issues with validation or metadata crawling, please submit a support ticket for investigation and feedback.
{% endhint %}

### Connector Features

| Feature                                      | Availability |
| -------------------------------------------- | :----------: |
| Crawling                                     |       ✅      |
| Delta Crawling                               |       ❌      |
| Data Preview                                 |       ❌      |
| Auto Lineage                                 |       ✅      |
| Manual Lineage                               |       ✅      |
| Secure Authentication via Credential Manager |       ✅      |
| Data Quality                                 |      NA      |
| DAM (Data Access Management)                 |      NA      |
| Bridge                                       |       ✅      |

{% hint style="info" %}
'NA' indicates that the respective feature is 'Not Applicable.'
{% endhint %}

### Metadata Mapping

The following objects are crawled from Databricks and mapped to the corresponding UI assets.

<table><thead><tr><th width="164.11114501953125">Databricks Object</th><th width="184.111083984375">Databricks Attribute</th><th width="177.4444580078125">OvalEdge Attribute</th><th width="174.22216796875">OvalEdge Category</th><th>OvalEdge Type</th></tr></thead><tbody><tr><td>Notebook</td><td>Notebook file</td><td>Code Name</td><td>Codes</td><td>ADB_&#x3C;language type></td></tr><tr><td>Commands inside Notebook</td><td>Command code</td><td>Code Name</td><td>Codes</td><td>ADB_&#x3C;language type></td></tr></tbody></table>
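The mapping above can be sketched in code. This is a hypothetical illustration of how a crawled notebook might become an OvalEdge "Codes" asset; the function name and record structure are assumptions, not the actual OvalEdge implementation. Only the `ADB_<language type>` naming convention comes from the table above.

```python
# Hypothetical sketch: map a crawled Databricks notebook to an OvalEdge
# "Codes" asset record. Field names are illustrative assumptions; the
# ADB_<language type> convention is taken from the mapping table.

def map_notebook_to_code_asset(notebook_name: str, language: str) -> dict:
    """Return a Codes-category asset record for a Databricks notebook."""
    return {
        "code_name": notebook_name,         # Databricks notebook file name
        "category": "Codes",                # OvalEdge category
        "type": f"ADB_{language.upper()}",  # OvalEdge type: ADB_<language type>
    }

asset = map_notebook_to_code_asset("daily_etl", "python")
print(asset["type"])  # ADB_PYTHON
```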

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection:

#### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account token to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
👨‍💻 **Who can provide these permissions?** These permissions are typically granted by the Databricks administrator, as users may not have the required access to assign them independently.
{% endhint %}

<table><thead><tr><th width="210.77777099609375">Operations</th><th width="200.111083984375">Objects</th><th>Access Permission</th></tr></thead><tbody><tr><td>Crawling</td><td>Datasets</td><td>Read Permission for the API calls</td></tr></tbody></table>
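A quick way to confirm that the service token has the read permission required for crawling is to issue a read-only Databricks REST API call, such as the Workspace List endpoint. The sketch below only builds the request; the host and token values are placeholders you would replace with your own.

```python
import urllib.parse

# Sketch of a read-access check against the Databricks REST API. The
# /api/2.0/workspace/list endpoint is a standard read-only Databricks call;
# the host and token used below are placeholder values.

def build_workspace_list_request(host: str, token: str, path: str = "/"):
    """Build the URL and headers for a read-only Workspace List request."""
    query = urllib.parse.urlencode({"path": path})
    url = f"{host}/api/2.0/workspace/list?{query}"
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers

url, headers = build_workspace_list_request(
    "https://example.cloud.databricks.com", "dapiXXXX", "/Shared"
)
# A 200 response to this request indicates the token can read the workspace;
# a 401 or 403 indicates invalid credentials or missing permissions.
```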

### Connection Configuration Steps

{% hint style="warning" %}
Users must have the **Connector Creator** role to configure a new connection.
{% endhint %}

1. Log into **OvalEdge**, go to **Administration** > **Connectors**, click **+ (New Connector)**, search for **Databricks**, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (\*) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="220">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type*</td><td>By default, "Databricks" is displayed as the selected connector type.</td></tr><tr><td>Credential Manager*</td><td><p>Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul></td></tr><tr><td>License Add-Ons*</td><td><ul><li>Select the checkbox for the <strong>Auto Lineage</strong> Add-On to build data lineage automatically.</li></ul></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the Databricks connection</p><p>(Example: "DatabricksDB").</p></td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Connection Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Server*</td><td>Enter the Databricks server name or IP address.</td></tr><tr><td>Authentication*</td><td><p>The following two types of authentication are supported for Databricks:</p><ul><li>Username/Password Authentication</li><li>Token Authentication</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="Username/Password Authentication" %}

<table><thead><tr><th width="190.6666259765625">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Username*</td><td>Enter the username configured for Databricks server connectivity.</td></tr><tr><td>Password*</td><td>Enter the password associated with the Databricks server account.</td></tr><tr><td>Notebook Path</td><td>Enter the path to crawl specific folders from the Databricks workspace.</td></tr><tr><td>Proxy Enabled*</td><td>Select <strong>Yes</strong> to route the connection through a configured proxy, or <strong>No</strong> to connect directly without a proxy.</td></tr></tbody></table>
{% endtab %}

{% tab title="Token Authentication" %}

<table><thead><tr><th width="190.6666259765625">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Access Token*</td><td>Enter the access token generated for Databricks server connectivity.</td></tr><tr><td>Notebook Path</td><td>Enter the path to crawl specific folders from the Databricks workspace.</td></tr><tr><td>Proxy Enabled*</td><td>Select <strong>Yes</strong> to route the connection through a configured proxy, or <strong>No</strong> to connect directly without a proxy.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}
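The two authentication modes above differ mainly in the HTTP `Authorization` header sent to the Databricks server. The sketch below illustrates that difference under the standard HTTP basic/bearer conventions; the credential values and the helper itself are illustrative, not part of the connector's actual code.

```python
import base64

# Illustrative sketch of the two authentication header styles described in
# the tabs above. Credential values are placeholders.

def auth_header(username=None, password=None, token=None) -> dict:
    """Return the HTTP Authorization header for either authentication mode."""
    if token:  # Token Authentication: bearer token
        return {"Authorization": f"Bearer {token}"}
    # Username/Password Authentication: HTTP basic auth
    creds = f"{username}:{password}".encode()
    return {"Authorization": "Basic " + base64.b64encode(creds).decode()}

print(auth_header(token="dapi123")["Authorization"])  # Bearer dapi123
```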

**Default Governance Roles**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Default Governance Roles*</td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**Admin Roles**

<table data-header-hidden><thead><tr><th width="220.6666259765625"></th><th></th></tr></thead><tbody><tr><td>Admin Roles*</td><td>Select one or more users from the drop-down list for Integration Admin and Security &#x26; Governance Admin. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**No of Archive Objects**

<table data-header-hidden><thead><tr><th width="220.666748046875"></th><th></th></tr></thead><tbody><tr><td>No Of Archive Objects*</td><td><p>This shows the number of recent metadata changes to a dataset at the source. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive.</p><p><strong>Example</strong>: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.</p></td></tr></tbody></table>

**Bridge**

<table data-header-hidden><thead><tr><th width="220.6666259765625"></th><th></th></tr></thead><tbody><tr><td>Select Bridge*</td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.</p></td></tr></tbody></table>

2. After entering all connection details, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the **Connectors** home page.

## Manage Connector Operations

### Crawl

{% hint style="warning" %}
To perform crawl operations, users must be assigned the **Integration Admin** role.
{% endhint %}

The **Crawl/Profile** button allows users to select one or more **schemas** for crawling.

1. Navigate to the **Connectors** page and click **Crawl/Profile**.
2. Select the schemas to crawl.
3. The **Crawl** option is selected by default.
4. Click **Run** to collect metadata from the connected source and load it into the **Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog** > **Databases**/**Codes** tab.

The **Schedule** checkbox allows automated crawling at defined intervals, from a minute to a year.

1. Click the **Schedule** checkbox to enable the **Select Period** drop-down.
2. Select a time period for the operation from the drop-down menu.
3. Click **Schedule** to initiate metadata collection from the connected source.
4. The system will automatically execute the **crawl** operation at the scheduled time.

### Other Operations

The **Connectors** page provides a centralized view of all configured connectors, along with their health status.

**Managing connectors includes:**

* **Connector Health:** Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye** icon next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options:**

To view, edit, validate, build lineage, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector:** Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Lineage**: Select server dialects for parsing and set the connector priority for table lineage.
* **Build Lineage**: Automatically build data lineage using source code parsing.
* **Delete Connector**: Remove a connector with confirmation.

## Connectivity Troubleshooting

If incorrect parameters are entered, an error message is displayed. Verify that all inputs are accurate to resolve the issue. If the problem persists, contact the assigned **support team**.

<table><thead><tr><th width="84.5555419921875">S.No.</th><th width="161.8887939453125">Error Message(s)</th><th>Error Description &#x26; Resolution</th></tr></thead><tbody><tr><td>1</td><td>Unauthorized Access (401)</td><td><p><strong>Description</strong>: </p><p>The authentication token or credentials are invalid or no longer valid.</p><p><strong>Resolution</strong>:</p><ul><li>Validate the authentication token or credentials.</li><li>Regenerate the token if it has expired.</li><li>Ensure correct authentication details are configured.</li></ul></td></tr><tr><td>2</td><td>Forbidden Access (403)</td><td><p><strong>Description</strong>:</p><p>The account lacks the required roles or privileges to execute the operation.</p><p><strong>Resolution</strong>:</p><ul><li>Assign the necessary roles or permissions to the account.</li><li>Verify that the account has access to the requested resource.</li><li>Review access control settings for the operation.</li></ul></td></tr></tbody></table>
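The troubleshooting table above can be expressed as a simple lookup from HTTP status code to suggested resolution. This is a hypothetical helper for illustration only; the mapping content comes from the table, while the function itself is an assumption.

```python
# Hypothetical helper mirroring the troubleshooting table: map an HTTP
# status code returned by the Databricks REST API to a suggested fix.

RESOLUTIONS = {
    401: "Unauthorized: validate or regenerate the authentication token.",
    403: "Forbidden: assign the required roles or permissions to the account.",
}

def suggest_resolution(status_code: int) -> str:
    """Return the suggested resolution for a known error code."""
    return RESOLUTIONS.get(status_code, "Unrecognized error: contact support.")

print(suggest_resolution(401))  # Unauthorized: validate or regenerate the authentication token.
```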

***

Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA
