# Apache Hive

This document describes the Apache Hive connector, which enables streamlined metadata management through features such as crawling, data preview, and manual lineage building, and supports secure authentication via Credential Manager.

<figure><img src="https://content.gitbook.com/content/ztcvwwOJCeaE1n6oHp4C/blobs/Ccgof4bzXNHnPsjY30pi/image.png" alt=""><figcaption></figcaption></figure>

## Overview

### Connector Details

| Connector Category                                                              | Big Data Platform |
| ------------------------------------------------------------------------------- | ----------------- |
| Connector Version                                                               | Release 6.3.4     |
| Releases Supported (Available from)                                             | Legacy connector  |
| <p>Connectivity</p><p>\[How the connection is established with Apache Hive]</p> | JDBC              |
| Verified Apache Hive Version                                                    | 5.8.0             |

{% hint style="info" %}
The Apache Hive connector has been validated with the mentioned "Verified Apache Hive Versions" and is expected to be compatible with other supported Apache Hive versions. If there are any issues with validation or metadata crawling, please submit a support ticket for investigation and feedback.
{% endhint %}

### Connector Features

| Feature                                      | Availability |
| -------------------------------------------- | :----------: |
| Crawling                                     |       ✅      |
| Delta Crawling                               |       ❌      |
| Profiling                                    |       ✅      |
| Query Sheet                                  |       ✅      |
| Data Preview                                 |       ✅      |
| Auto Lineage                                 |       ✅      |
| Manual Lineage                               |       ✅      |
| Secure Authentication via Credential Manager |       ✅      |
| Data Quality                                 |       ❌      |
| DAM (Data Access Management)                 |       ❌      |
| Bridge                                       |       ✅      |

### Metadata Mapping

The following objects are crawled from Apache Hive and mapped to the corresponding UI assets.

<table><thead><tr><th width="179.58331298828125">Apache Hive Object</th><th width="189.5">Apache Hive Attribute</th><th width="178">OvalEdge Attribute</th><th width="174.25">OvalEdge Category</th><th width="156.75">OvalEdge Type</th></tr></thead><tbody><tr><td>Schema</td><td>Schema Name</td><td>Schema</td><td>Databases</td><td>Schema</td></tr><tr><td>Table</td><td>Table Name</td><td>Table</td><td>Tables</td><td>Table</td></tr><tr><td>Table</td><td>Table Type</td><td>Type</td><td>Tables</td><td>Table</td></tr><tr><td>Table</td><td>Table Comments</td><td>Source Description</td><td>Descriptions</td><td>Source Description</td></tr><tr><td>Columns</td><td>Column Name</td><td>Column</td><td>Table Columns</td><td>Columns</td></tr><tr><td>Columns</td><td>Data Type</td><td>Column Type</td><td>Table Columns</td><td>Columns</td></tr><tr><td>Columns</td><td>Description</td><td>Source Description</td><td>Table Columns</td><td>Columns</td></tr><tr><td>Views</td><td>View Name</td><td>View</td><td>Tables</td><td>View</td></tr></tbody></table>
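The mapping above can also be expressed as a simple lookup for programmatic reference. This is an illustrative sketch only (the dictionary mirrors the table; it is not a product API):

```python
# Sketch of the Hive-to-OvalEdge attribute mapping from the table above.
# Keys are (Hive object, Hive attribute); values are OvalEdge attributes.
# This is illustrative, not a product API.
HIVE_TO_OVALEDGE = {
    ("Schema", "Schema Name"): "Schema",
    ("Table", "Table Name"): "Table",
    ("Table", "Table Type"): "Type",
    ("Table", "Table Comments"): "Source Description",
    ("Columns", "Column Name"): "Column",
    ("Columns", "Data Type"): "Column Type",
    ("Columns", "Description"): "Source Description",
    ("Views", "View Name"): "View",
}

print(HIVE_TO_OVALEDGE[("Table", "Table Comments")])  # Source Description
```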

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection.

### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
👨‍💻 **Who can provide these permissions?** These permissions are typically granted by the Apache Hive administrator, as users may not have the required access to assign them independently.
{% endhint %}

<table><thead><tr><th width="215.74993896484375">Objects</th><th>Required Privileges</th><th>Access Permission</th></tr></thead><tbody><tr><td>Schema</td><td>USAGE on the database</td><td>USAGE</td></tr><tr><td>Tables</td><td><p>USAGE on the database</p><p>SELECT privilege on tables</p></td><td>SELECT and USAGE</td></tr><tr><td>Table Columns</td><td><p>USAGE on the database</p><p>SELECT on the table</p></td><td>SELECT and USAGE</td></tr><tr><td>Primary Keys (PK) and Foreign Keys (FK)</td><td><p>USAGE on the database</p><p>SELECT on the table</p></td><td>SELECT and USAGE</td></tr></tbody></table>
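As a rough illustration, an administrator could script the grants above for the service account. This is a hedged sketch: the account, database, and table names are placeholders, and the exact GRANT syntax depends on the authorization model (for example, SQL-standard-based authorization vs. Sentry/Ranger) enabled in your Hive deployment.

```python
# Hedged sketch: generate the GRANT statements implied by the table above for
# a service account. All names are illustrative; exact syntax depends on the
# authorization model (SQL-standard, Sentry, Ranger) enabled for Hive.
def grants_for(user: str, database: str, tables: list[str]) -> list[str]:
    stmts = [f"GRANT USAGE ON DATABASE {database} TO USER {user};"]
    stmts += [f"GRANT SELECT ON TABLE {database}.{t} TO USER {user};" for t in tables]
    return stmts

for stmt in grants_for("ovaledge_svc", "sales_db", ["orders", "customers"]):
    print(stmt)
```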

### Connection Configuration Steps

{% hint style="warning" %}
Users must have the Connector Creator role to configure a new connection.
{% endhint %}

1. Log into **OvalEdge**, go to **Administration > Connectors**, click **+ (New Connector)**, search for **Apache Hive**, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (\*) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="156">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type</td><td>By default, "Hive" is displayed as the selected connector type.</td></tr><tr><td>Authentication</td><td><p>The following two types of authentication are supported for Apache Hive:</p><ul><li>Kerberos </li><li>Non-Kerberos</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="Kerberos Authentication" %}

<table><thead><tr><th width="184.6666259765625">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on the selected option.</p><p>Supported Credential Managers:</p><ul><li>Database</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul></td></tr><tr><td>License Add Ons</td><td><ul><li>Select the checkbox for the Auto Lineage Add-On to build data lineage automatically.</li></ul></td></tr><tr><td>Connector Name*</td><td>Enter a unique name for the Apache Hive connection (Example: "ApacheHive").</td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Server*</td><td>Enter the Apache Hive database server name or IP address (Example: hive-server.company.com or 192.168.1.10).</td></tr><tr><td>Port*</td><td>By default, the Apache Hive port number "10000" is auto-populated. If a custom port is configured for Apache Hive, modify it accordingly.</td></tr><tr><td>Database*</td><td>Enter the database name to which the service account user has access within Apache Hive.</td></tr><tr><td>Driver*</td><td>By default, the Apache Hive driver details are auto-populated.</td></tr><tr><td>Principal</td><td>The Kerberos principal name used for authentication.</td></tr><tr><td>Connection String</td><td><p>Configure the connection string for the Apache Hive database:</p><ul><li><strong>Automatic Mode:</strong> The system generates a connection string based on the provided credentials.</li><li><strong>Manual Mode:</strong> Enter a valid connection string manually.</li></ul><p>Replace placeholders with actual database details. {sid} refers to the Database Name.</p></td></tr><tr><td>Keytab</td><td>Path to the Kerberos keytab file used for authentication.</td></tr><tr><td>Krb5-Configuration File*</td><td>Path to the Kerberos configuration file (krb5.conf) required for authentication.</td></tr></tbody></table>
{% endtab %}

{% tab title="Non-Kerberos Authentication" %}

<table><thead><tr><th width="160.66668701171875">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on the selected option.</p><p>Supported Credential Managers:</p><ul><li>Database</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul></td></tr><tr><td>License Add Ons</td><td><ul><li>Select the checkbox for the Auto Lineage Add-On to build data lineage automatically.</li></ul></td></tr><tr><td>Connector Name*</td><td>Enter a unique name for the Apache Hive connection (Example: "ApacheHive").</td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Server*</td><td>Enter the Apache Hive database server name or IP address (Example: hive-server.company.com or 192.168.1.10).</td></tr><tr><td>Port*</td><td>By default, the Apache Hive port number "10000" is auto-populated. If a custom port is configured for Apache Hive, modify it accordingly.</td></tr><tr><td>Database*</td><td>Enter the database name to which the service account user has access within Apache Hive.</td></tr><tr><td>Driver*</td><td>By default, the Apache Hive driver details are auto-populated.</td></tr><tr><td>Username*</td><td><p>Service account username used for accessing Hive.</p><p>Note:</p><ul><li>Visible only when the installation environment is Linux/Unix.</li></ul></td></tr><tr><td>Password*</td><td><p>Password for the service account user.</p><p>Note:</p><ul><li>Visible only when the installation environment is Linux/Unix.</li></ul></td></tr><tr><td>Connection String</td><td><p>Configure the connection string for the Apache Hive database:</p><ul><li><strong>Automatic Mode:</strong> The system generates a connection string based on the provided credentials.</li><li><strong>Manual Mode:</strong> Enter a valid connection string manually.</li></ul><p>Replace placeholders with actual database details. {sid} refers to the Database Name.</p></td></tr></tbody></table>
{% endtab %}
{% endtabs %}
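For reference, the connection string generated in Automatic Mode follows the standard HiveServer2 JDBC template, with the {sid} placeholder replaced by the configured Database name. A minimal sketch of that substitution (the function name and field values are illustrative, not part of the product):

```python
# Illustrative sketch of the Automatic Mode substitution: the {sid}
# placeholder in the JDBC template is replaced by the configured Database
# name, along with the Server and Port fields. Names are illustrative.
def build_hive_jdbc_url(server: str, port: int, sid: str) -> str:
    template = "jdbc:hive2://{server}:{port}/{sid}"
    return template.format(server=server, port=port, sid=sid)

print(build_hive_jdbc_url("hive-server.company.com", 10000, "sales_db"))
# jdbc:hive2://hive-server.company.com:10000/sales_db
```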

<table><thead><tr><th width="222">Default Governance Roles</th><th>Description</th></tr></thead><tbody><tr><td>Default Governance Roles*</td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.</td></tr><tr><td><strong>Admin Roles</strong></td><td></td></tr><tr><td>Admin Roles*</td><td>Select one or more users from the drop-down list for Integration Admin and Security &#x26; Governance Admin. All users configured in the security settings are available for selection.</td></tr><tr><td><strong>No of Archive Objects</strong></td><td></td></tr><tr><td>No of Archive Objects*</td><td><p>This specifies the number of recent metadata changes to a dataset at the source that are retained. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive.</p><p>Example: Setting it to 4 retrieves the last four changes, displayed in the "Version" column of the Metadata Changes module.</p></td></tr><tr><td><strong>Bridge</strong></td><td></td></tr><tr><td>Select Bridge*</td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.</p></td></tr></tbody></table>

2. After entering all connection details, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the Connectors home page.

## Manage Connector Operations

### Crawl

{% hint style="warning" %}
To perform crawl operations, users must be assigned the Integration Admin role.
{% endhint %}

1. Navigate to the **Connectors** page and click **Crawl/Profile**.
2. Select the schemas to be crawled.
3. The Crawl option is selected by default. To perform both operations, select the **Crawl & Profile** radio button.
4. Click **Run** to collect metadata from the connected source and load it into the **Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog > Databases** tab.

### Other Operations

The Connectors page provides a centralized view of all configured connectors, along with their health status.

**Managing connectors includes:**

* **Connectors Health:** Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing:** Click the Eye icon next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options:**

To view, edit, validate, build lineage, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector:** Update and revalidate the data source.
* **Validate Connector:** Check the connection's integrity.
* **Settings:** Modify connector settings.
  * **Crawler:** Configure data extraction.
  * **Profiler:** Customize data profiling rules and methods.
  * **Query Policies:** Define query execution rules based on roles.
  * **Access Instructions:** Add notes on how data can be accessed.
  * **Business Glossary Settings:** Manage term associations at the connector level.
  * **Others:** Configure notification recipients for metadata changes.
* **Build Lineage:** Automatically build data lineage using source code parsing.
* **Delete Connector:** Remove a connector with confirmation.

## Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

<table><thead><tr><th width="63">S.No.</th><th width="279.5" valign="top">Error Message(s)</th><th>Error Description &#x26; Resolution</th></tr></thead><tbody><tr><td>1</td><td valign="top"><p>Error while validating HIVE connection: Cannot create PoolableConnectionFactory (Could not open client transport with JDBC Uri: <code>jdbc:hive2://https:-1//xxxxxxx.com/:10000/SID</code>: Cannot open without port.)</p></td><td><p><strong>Error Description:</strong></p><p>The JDBC connection string is invalid because the port and URI are incorrectly defined.</p><p><strong>Error Resolution:</strong></p><p>Enter a valid JDBC URI in the format:<br><code>jdbc:hive2://&#x3C;host>:&#x3C;port>/&#x3C;database></code></p><p>Verify that <code>HiveServer2</code> is running and accessible on the specified port.</p><p>Check network/firewall settings to allow connectivity.</p></td></tr></tbody></table>
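A quick pre-flight check can catch the malformed URI above before the connector attempts validation. The sketch below is a hypothetical helper (the pattern and function name are not part of the product) that confirms a host, numeric port, and database are all present:

```python
import re

# Hypothetical pre-flight check (not part of the product): accepts only URIs
# of the form jdbc:hive2://<host>:<port>/<database>, which rules out the
# "Cannot open without port" failure shown above.
HIVE2_URI = re.compile(r"^jdbc:hive2://(?P<host>[\w.-]+):(?P<port>\d+)/(?P<db>\w+)$")

def is_valid_hive_uri(uri: str) -> bool:
    return HIVE2_URI.match(uri) is not None

print(is_valid_hive_uri("jdbc:hive2://hive-server.company.com:10000/sales_db"))  # True
print(is_valid_hive_uri("jdbc:hive2://https:-1//xxxxxxx.com/:10000/SID"))        # False
```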

***

Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA
