# AWS Glue

This article describes the AWS Glue connector, which supports metadata management through crawling and manual lineage building.

The connector supports SDK-based connectivity to AWS Glue environments for metadata extraction from schemas, tables, and columns. Role-Based Authentication and IAM User Authentication provide secure access to AWS Glue resources based on the configured AWS account and region.

<div align="left"><img src="/files/uSOET7ourdiKK5zDX5Vi" alt="" height="313" width="624"></div>

## Overview

### Connector Details

<table data-header-hidden><thead><tr><th width="414"></th><th></th></tr></thead><tbody><tr><td>Connector Category</td><td>ETL Tool</td></tr><tr><td>OvalEdge Release Supported</td><td>Release 6.3.4 and later</td></tr><tr><td><p>Connectivity</p><p>[How the connection is established with AWS Glue]</p></td><td>SDK</td></tr><tr><td>Verified AWS Glue Version</td><td>Glue 5.0</td></tr></tbody></table>

{% hint style="warning" %}
The AWS Glue connector has been validated with the "Verified AWS Glue Version" listed above and is expected to be compatible with other supported AWS Glue versions. If validation or metadata crawling fails, submit a support ticket for investigation and feedback.
{% endhint %}

**SDK Details**

<table><thead><tr><th width="91.54547119140625">Sl. No.</th><th width="190.727294921875">SDK Details</th><th width="204.727294921875">Supported Version</th><th>Reference</th></tr></thead><tbody><tr><td>1</td><td>AWS SDK</td><td>2.41.34</td><td>Click <a href="https://central.sonatype.com/artifact/software.amazon.awssdk/glue/2.41.34">here</a></td></tr></tbody></table>

### Connector Features

| Feature                                      | Availability |
| -------------------------------------------- | :----------: |
| Crawling                                     |       ✅      |
| Delta Crawling                               |       ❌      |
| Profiling                                    |       ❌      |
| Data Preview                                 |       ❌      |
| Auto Lineage                                 |       ❌      |
| Manual Lineage                               |       ✅      |
| Secure Authentication via Credential Manager |       ✅      |
| Data Quality                                 |       ❌      |
| DAM (Data Access Management)                 |       ❌      |

### Metadata Mapping

The following objects are crawled from AWS Glue and mapped to the corresponding UI assets.

<table><thead><tr><th width="160">AWS Glue Object</th><th width="174.45452880859375">AWS Glue Attribute</th><th width="172.45458984375">OvalEdge Attribute</th><th width="170.727294921875">OvalEdge Category</th><th width="168">OvalEdge Type</th></tr></thead><tbody><tr><td>Schema</td><td>Database Name</td><td>Schema</td><td>Tables</td><td>Schema</td></tr><tr><td>Schema</td><td>Description</td><td>Source Description</td><td>Descriptions</td><td>Schema</td></tr><tr><td>Table</td><td>Table Name</td><td>Table</td><td>Tables</td><td>Table</td></tr><tr><td>Table</td><td>Table Type (EXTERNAL, etc.)</td><td>Type</td><td>Tables</td><td>Table</td></tr><tr><td>Table</td><td>Description</td><td>Source Description</td><td>Descriptions</td><td>-</td></tr><tr><td>Column</td><td>Column Name</td><td>Column</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Column</td><td>Data Type</td><td>Column Type</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Column</td><td>Description</td><td>Source Description</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Column</td><td>Position</td><td>Column Position</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Column</td><td>Length / Precision</td><td>Data Type Size</td><td>Table Columns</td><td>Table Column</td></tr></tbody></table>
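
The mapping above can be illustrated with a minimal Python sketch that flattens one table definition, shaped like the `Table` object returned by the Glue `GetTable` API, into column-level rows. This is not the connector's implementation: the function names and output keys are hypothetical, the 1-based column position is an assumption, and reading the size from the text in parentheses of the data type is a simplification.

```python
import re


def parse_type_size(data_type):
    """Return the parenthesized size of a simple type such as 'varchar(50)', else None."""
    match = re.fullmatch(r"\w+\(([^)]*)\)", data_type.strip())
    return match.group(1) if match else None


def map_glue_table(database_name, table):
    """Flatten one Glue table definition into column-level rows.

    `table` follows the shape of the Glue GetTable response ("Name",
    "TableType", "StorageDescriptor" -> "Columns"). The output keys are
    hypothetical labels for the OvalEdge attributes in the mapping table.
    """
    columns = table.get("StorageDescriptor", {}).get("Columns", [])
    rows = []
    # Assumption: column position is the 1-based order in the Columns list.
    for position, column in enumerate(columns, start=1):
        rows.append({
            "schema": database_name,
            "table": table["Name"],
            "table_type": table.get("TableType"),
            "column": column["Name"],
            "column_type": column["Type"],
            "source_description": column.get("Comment"),
            "column_position": position,
            "data_type_size": parse_type_size(column["Type"]),
        })
    return rows
```

Nested types such as `array<...>` or `struct<...>` yield no size in this sketch, because only a plain `name(size)` type is parsed.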

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection:

#### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
👨‍💻**Who can provide these permissions?** These permissions are typically granted by the AWS Glue administrator, as users may not have the required access to assign them independently.
{% endhint %}

<table><thead><tr><th width="99.90911865234375">Objects</th><th width="257.63641357421875">AWS API or Service</th><th>Access Permissions</th></tr></thead><tbody><tr><td>Schema</td><td>glue:GetDatabases</td><td>IAM permission to call glue:GetDatabases</td></tr><tr><td>Schema</td><td>glue:GetDatabase</td><td>IAM permission to call glue:GetDatabase</td></tr><tr><td>Table</td><td>glue:GetTables</td><td>IAM permission to call glue:GetTables</td></tr><tr><td>Table</td><td>glue:GetTable</td><td>IAM permission to call glue:GetTable</td></tr><tr><td>Table</td><td>glue:GetTableVersions</td><td>IAM permission to call glue:GetTableVersions</td></tr><tr><td>Table</td><td>glue:GetTableVersion</td><td>IAM permission to call glue:GetTableVersion</td></tr><tr><td>Table</td><td>Athena + S3</td><td>IAM permissions athena:StartQueryExecution and s3:GetObject</td></tr></tbody></table>

{% hint style="info" %}

* glue:GetDatabases and glue:GetDatabase permissions are required to list and retrieve metadata for AWS Glue Data Catalog databases (logical schema containers).
* glue:GetTables permission is required to list all tables under a specific database (schema) in AWS Glue Data Catalog.
* glue:GetTable permission retrieves metadata for individual tables, including schema, location, and input format details.
* glue:GetTableVersions and glue:GetTableVersion permissions are required to access historical table version metadata when versioning is enabled.
* athena:StartQueryExecution and s3:GetObject permissions are required to query Glue tables through Athena and access underlying data stored in Amazon S3.
{% endhint %}
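
As an illustration of the permissions listed above, the following Python sketch assembles a minimal IAM policy document and prints it as JSON. It is a starting point under stated assumptions rather than a vetted policy: the statement IDs (`Sid`) are hypothetical, and `"Resource": "*"` is a placeholder that an AWS administrator would normally narrow to specific catalog, Athena, and S3 resources.

```python
import json

# Actions taken from the permissions table above.
GLUE_READ_ACTIONS = [
    "glue:GetDatabases",
    "glue:GetDatabase",
    "glue:GetTables",
    "glue:GetTable",
    "glue:GetTableVersions",
    "glue:GetTableVersion",
]
QUERY_ACTIONS = ["athena:StartQueryExecution", "s3:GetObject"]


def build_policy(resource="*"):
    """Return an IAM policy document (as a dict) granting the listed actions.

    Assumption: a single resource value is applied to both statements; in
    practice each statement would be scoped to its own ARNs.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueCatalogRead",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": GLUE_READ_ACTIONS,
                "Resource": resource,
            },
            {
                "Sid": "AthenaS3Read",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": QUERY_ACTIONS,
                "Resource": resource,
            },
        ],
    }


print(json.dumps(build_policy(), indent=2))
```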

### Connection Configuration Steps

{% hint style="warning" %}
Users are required to have the Connector Creator role in order to configure a new connection.
{% endhint %}

1. Log in to OvalEdge, go to Administration > Connectors, click + (New Connector), search for AWS Glue, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (\*) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="219.45458984375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type</td><td>By default, "<strong>AWS Glue</strong>" is displayed as the selected connector type.</td></tr><tr><td>Authentication*</td><td><p>Select the type of Authentication from the dropdown menu.</p><ul><li>Role-Based Authentication</li><li>IAM User Authentication</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="Role-Based Authentication" %}

<table><thead><tr><th width="190.81817626953125">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credential manager from the drop-down list. The corresponding parameters will be displayed based on the selected option.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul><p>For more details, click <a href="https://docs.ovaledge.com/connectors/additional-requirements/credential-manager-configuration">here</a>.</p></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the AWS Glue connection</p><p>(Example: "AWS_Glue").</p></td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Connector Environment</td><td><p>Select the environment (Example: PROD, STG) configured for the connector.</p><p>For more details, click <a href="https://docs.ovaledge.com/connectors/introduction-to-connectors/setup-and-connectivity/prerequisites#connector-environment">here</a>.</p></td></tr><tr><td>Cross-Account Role ARN</td><td>Enter the ARN (Amazon Resource Name) of the role used for cross-account access.</td></tr><tr><td>Database Region*</td><td>Select the AWS Region where the AWS Glue resources are configured (Example: us-xxxx-1, ap-xxxx-1). The selected region is used to establish connectivity and retrieve metadata from the configured AWS Glue environment.</td></tr></tbody></table>
{% endtab %}

{% tab title="IAM User Authentication" %}

<table><thead><tr><th width="189">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credential manager from the drop-down list. The corresponding parameters will be displayed based on the selected option.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul><p>For more details, click <a href="https://docs.ovaledge.com/connectors/additional-requirements/credential-manager-configuration">here</a>.</p></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the AWS Glue connection</p><p>(Example: "AWS_Glue").</p></td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Connector Environment</td><td><p>Select the environment (Example: PROD, STG) configured for the connector.</p><p>For more details, click <a href="https://docs.ovaledge.com/connectors/introduction-to-connectors/setup-and-connectivity/prerequisites#connector-environment">here</a>.</p></td></tr><tr><td>Access key*</td><td>Enter the AWS Access Key ID used to authenticate the IAM user.</td></tr><tr><td>Secret key*</td><td>Enter the AWS Secret Access Key associated with the Access Key ID.</td></tr><tr><td>Database Region*</td><td>Select the AWS Region where the AWS Glue resources are configured (Example: us-xxxx-1, ap-xxxx-1). The selected region is used to establish connectivity and retrieve metadata from the configured AWS Glue environment.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}

**Default Governance Roles**

<table data-header-hidden><thead><tr><th width="219.45452880859375"></th><th></th></tr></thead><tbody><tr><td>Default Governance Roles*</td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**Admin Roles**

<table data-header-hidden><thead><tr><th width="219.45458984375"></th><th></th></tr></thead><tbody><tr><td>Admin Roles*</td><td>Select one or more users from the dropdown list for Integration Admin and Security &#x26; Governance Admin. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**Bridge**

<table data-header-hidden><thead><tr><th width="218.54541015625"></th><th></th></tr></thead><tbody><tr><td>Select Bridge*</td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.</p></td></tr></tbody></table>

2. After entering all connection details, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the **Connectors home** page.
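
Before the values are entered in OvalEdge, the credentials and region can be checked independently of the connector. The sketch below is one way to do that in Python, assuming the third-party `boto3` package is installed; the connector itself uses the AWS SDK listed under SDK Details, so this is only an external sanity check. The function names are hypothetical.

```python
def make_glue_client(region, access_key=None, secret_key=None):
    """Create a Glue client for the given region.

    Assumption: boto3 is installed. When the keys are None, boto3 falls back
    to its default credential chain (environment, profile, or instance role).
    """
    import boto3  # imported lazily so the helper below has no hard dependency

    return boto3.client(
        "glue",
        region_name=region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def list_database_names(glue_client):
    """Return the names of all Data Catalog databases visible to the client.

    Uses the GetDatabases paginator, so it exercises the glue:GetDatabases
    permission from the prerequisites.
    """
    names = []
    paginator = glue_client.get_paginator("get_databases")
    for page in paginator.paginate():
        names.extend(database["Name"] for database in page["DatabaseList"])
    return names
```

Calling `list_database_names(make_glue_client("<region>"))` with the intended region and credentials should return the same schemas that are expected to appear for crawling. An access or endpoint error at this stage points to the credentials, permissions, or region rather than to the OvalEdge configuration.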

## Manage Connector Operations

### Crawl

{% hint style="warning" %}
To perform crawl operations, users must be assigned the Integration Admin role.
{% endhint %}

The **Crawl/Profile** button allows users to select one or more schemas for crawling.

1. Navigate to the Connectors page and click **Crawl/Profile**.
2. Select the schemas to crawl.
3. The **Crawl** option is selected by default.
4. Click **Run** to collect metadata from the connected source and load it into the **Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog** > **Databases**/**<>Codes** tab.

The **Schedule** checkbox allows automated crawling at defined intervals, from a minute to a year.

1. Select the **Schedule** checkbox to enable the **Select Period** drop-down.
2. Select a time interval for the operation from the drop-down menu.
3. Click **Schedule** to initiate metadata collection from the connected source.
4. The system will automatically execute the **crawl** operation at the scheduled time.

### Other Operations

The **Connectors** page provides a centralized view of all configured connectors, along with their health status.

**Managing connectors includes:**

* **Connector Health:** Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye** icon next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options:**

To view, edit, validate, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector**: Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Crawler**: Configure data extraction.
  * **Access Instructions:** Add notes on how data can be accessed.
  * **Business Glossary Settings**: Manage term associations at the connector level.
  * **Others**: Configure notification recipients for metadata changes.
* **Delete Connector**: Remove a connector with confirmation.

For more details on connector settings, click [here](https://docs.ovaledge.com/connectors/introduction-to-connectors/setup-and-connectivity/connector-settings).

### Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

<table><thead><tr><th width="91.727294921875">Sl. No.</th><th width="204.3636474609375">Error Message(s)</th><th>Error Description &#x26; Resolution</th></tr></thead><tbody><tr><td>1</td><td>Error while validating AwsGlue connection Failed to list databases from AWS Glue | Root cause: glue.rf.amazonaws.com</td><td><p><strong>Description</strong>: The selected Database Region is incorrect or does not contain the configured AWS Glue resources.</p><p><strong>Resolution</strong>: </p><ul><li>Verify that the correct AWS Region is selected in the Database Region field.</li><li>Confirm that the AWS Glue resources are available in the selected region.</li><li>Update the region configuration and validate the connection again.</li></ul></td></tr><tr><td>2</td><td>Invalid Access Key or Secret Key</td><td><p><strong>Description</strong>: The Access Key or Secret Key is invalid, expired, or does not have permission to access AWS Glue resources in the selected Database Region.</p><p><strong>Resolution</strong>:</p><ul><li>Verify that the Access Key and Secret Key are correct.</li><li>Confirm that the IAM credentials are active and have the required AWS Glue permissions.</li><li>Update the credentials in the connector configuration and validate the connection again.</li></ul></td></tr></tbody></table>
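
The root cause in the first error above names the host `glue.rf.amazonaws.com`, which suggests that the region value is used as part of the service endpoint hostname. Under that assumption, the following Python sketch checks a region value before it is entered. The regular expression is a heuristic for region names such as `us-east-1`, not an authoritative list, and the hostname format applies to the standard AWS partition only.

```python
import re
import socket

# Heuristic (assumption): two letters, one or more words, then a digit.
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d$")


def glue_endpoint(region):
    """Build the Glue endpoint hostname for a region in the standard partition."""
    return f"glue.{region}.amazonaws.com"


def looks_like_region(region):
    """Return True if the value has the general shape of an AWS region name."""
    return bool(REGION_PATTERN.match(region))


def endpoint_resolves(region):
    """Return True if DNS resolves the Glue endpoint for the region.

    Requires network access from the machine running the check.
    """
    try:
        socket.getaddrinfo(glue_endpoint(region), 443)
        return True
    except socket.gaierror:
        return False
```

A value that fails `looks_like_region`, or whose endpoint does not resolve, is likely to produce the first error in the table. A value that passes both checks can still be the wrong region if the Glue resources live elsewhere.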

***

Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA

