# AWS Glue ETL

An out-of-the-box connector is available for AWS Glue ETL entities. It supports crawling Jobs, Workflows, Triggers, and Crawlers, and building lineage for these entities.

## **Connectivity Summary**

<figure><img src="/files/AgfNhFrXj01fq4em4GBo" alt=""><figcaption></figcaption></figure>

Connectivity to AWS Glue ETL is through the AWS Glue SDK, which is included in the platform.

The Glue SDK used by the connector is given below:

<table data-header-hidden><thead><tr><th width="142.75"></th><th width="183.25"></th><th></th></tr></thead><tbody><tr><td><strong>Driver / API</strong></td><td><strong>Version</strong></td><td><strong>Details</strong></td></tr><tr><td>AWS Glue SDK</td><td>1.12.232</td><td><p><a href="https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-glue/1.12.232">https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-glue/1.12.232</a></p><p>Note: Latest version is 1.12.244.</p></td></tr></tbody></table>

## **Technical Specifications**

The connector capabilities are shown below.

The AWS Glue entities are created as Datasets of type Job, Workflow, Crawler, or Trigger. For jobs, the connector extracts the job's script and builds lineage from it. For crawlers, triggers, and workflows, it extracts information about the entities involved and builds the associations accordingly.
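To illustrate where a job's script comes from, here is a minimal Python sketch. It is not the connector's implementation: the connector uses the AWS Java SDK, while this example assumes a boto3-style Glue client is passed in. The helper names `job_script_location` and `split_s3_uri` are hypothetical. The `GetJob` response shape (`Job.Command.ScriptLocation`) follows the public AWS Glue API.

```python
def job_script_location(glue_client, job_name):
    """Return the script path recorded in a Glue job definition."""
    response = glue_client.get_job(JobName=job_name)
    # GetJob returns the job definition under "Job"; the script path
    # is stored in the job's command block.
    return response["Job"]["Command"]["ScriptLocation"]


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    return bucket, key
```

With a real client, for example `boto3.client("glue", region_name=...)`, the returned path could then be read from S3 and parsed for source and target references. How the connector parses the script is not described on this page.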

### **Crawling**

| Feature  | Supported Objects | Remarks |
| -------- | ----------------- | ------- |
| Crawling | Jobs              |         |
|          | Workflows         |         |
|          | Crawlers          |         |
|          | Triggers          |         |
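
The crawl scope in the table above can be sketched as a simple enumeration over the four entity types. This is an illustrative Python example only, assuming a boto3-style Glue client. The `list_*` operations and their response keys come from the public AWS Glue API; the function names and the `LIST_OPERATIONS` mapping are hypothetical and do not reflect how the connector is written internally.

```python
# Glue "List*" operation and the response key that holds the names.
LIST_OPERATIONS = {
    "Job": ("list_jobs", "JobNames"),
    "Workflow": ("list_workflows", "Workflows"),
    "Crawler": ("list_crawlers", "CrawlerNames"),
    "Trigger": ("list_triggers", "TriggerNames"),
}


def list_entity_names(glue_client, entity_type):
    """Collect every name of one entity type, following NextToken pages."""
    method_name, result_key = LIST_OPERATIONS[entity_type]
    method = getattr(glue_client, method_name)
    names, token = [], None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = method(**kwargs)
        names.extend(page.get(result_key, []))
        token = page.get("NextToken")
        if not token:
            return names


def list_all_entities(glue_client):
    """Return {entity type: [names]} for all four crawled types."""
    return {t: list_entity_names(glue_client, t) for t in LIST_OPERATIONS}
```

Each name could then be passed to the matching `get_*` call to retrieve the full definition.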

### **Lineage Building**

| Lineage entities | Details   |
| ---------------- | --------- |
| Jobs             | Supported |
| Workflows        | Supported |
| Crawlers         | Supported |
| Triggers         | Supported |
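
As a rough illustration of how associations between workflow entities could be derived, the sketch below reads a workflow graph and turns its edges into (source, destination) pairs. It assumes a boto3-style Glue client; the `GetWorkflow` call with `IncludeGraph=True` and the `Nodes`/`Edges` response fields are part of the public AWS Glue API, while `workflow_associations` is a hypothetical helper and not the connector's own logic.

```python
def workflow_associations(glue_client, workflow_name):
    """Return [((src_type, src_name), (dst_type, dst_name)), ...]."""
    response = glue_client.get_workflow(Name=workflow_name, IncludeGraph=True)
    graph = response["Workflow"].get("Graph", {})
    # Index nodes by their unique ID; node Type is JOB, CRAWLER, or TRIGGER.
    nodes = {
        node["UniqueId"]: (node["Type"], node["Name"])
        for node in graph.get("Nodes", [])
    }
    pairs = []
    for edge in graph.get("Edges", []):
        src, dst = edge["SourceId"], edge["DestinationId"]
        # Skip edges that point at nodes missing from the response.
        if src in nodes and dst in nodes:
            pairs.append((nodes[src], nodes[dst]))
    return pairs
```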

## **Pre-requisites:**

To use the connector, the following need to be available:

* Connection details, as specified in the following section.
* An admin/service account for crawling. The minimum privileges required are:

| **Operation**   | **Access Permission**             |
| --------------- | --------------------------------- |
| Crawl Jobs      | LIST, GET permission on Jobs      |
| Crawl Workflows | LIST, GET permission on workflows |
| Crawl Crawlers  | LIST, GET permission on crawlers  |
| Crawl Triggers  | LIST, GET permission on triggers  |
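
The LIST and GET permissions above might translate into an IAM policy along the following lines. This is a sketch under stated assumptions: the action names are real AWS Glue IAM actions, but the exact set of calls the connector makes is not listed on this page, so the selection, the `Sid`, and the wildcard resource are illustrative and should be reviewed before use.

```python
import json

# Read-only Glue actions matching the LIST/GET rows in the table (assumed set).
READ_ACTIONS = [
    "glue:ListJobs", "glue:GetJob", "glue:GetJobs",
    "glue:ListWorkflows", "glue:GetWorkflow",
    "glue:ListCrawlers", "glue:GetCrawler", "glue:GetCrawlers",
    "glue:ListTriggers", "glue:GetTrigger", "glue:GetTriggers",
]


def build_read_only_policy(resource="*"):
    """Build an IAM policy document granting the read actions above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueEtlCrawlRead",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": READ_ACTIONS,
                "Resource": resource,
            }
        ],
    }


print(json.dumps(build_read_only_policy(), indent=2))
```

If job scripts are stored in Amazon S3, reading them would presumably also require read access to the relevant bucket; this page does not state that requirement.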

### **Connection Details**

The following connection settings should be added for connecting to AWS Glue ETL:

| Property        | Details                                                                                                                                                                                                   |
| --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Database Type   | ETL                                                                                                                                                                                                       |
| Connection Name | <p>Select a Connection name for the AWS Glue ETL. The name that you specify is a reference name to easily identify your AWS Glue ETL connection in OvalEdge. </p><p>Example: AWS Glue ETL Connection.</p> |
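| Authentication  | Select the authentication type: Role-based Authentication or Basic Authentication. |
| Access key      | AWS Access Key ID used to authenticate the IAM user. |
| Secret key      | AWS Secret Access Key associated with the Access Key ID. |
| Region          | AWS Region where the AWS Glue resources are configured. |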

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection:

### Connection Configuration Steps

{% hint style="warning" %}
Users are required to have the Connector Creator role in order to configure a new connection.
{% endhint %}

1. Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for AWS Glue ETL, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (\*) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="219.45458984375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type</td><td>By default, "<strong>AWS Glue</strong> <strong>ETL</strong>" is displayed as the selected connector type.</td></tr><tr><td>Authentication*</td><td><p>Select the type of Authentication from the dropdown menu.</p><ul><li>Role-Based Authentication</li><li>IAM User Authentication</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="Role-Based Authentication" %}

<table><thead><tr><th width="190.81817626953125">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credentials manager from the drop-down list. The corresponding parameters will be displayed based on the selected option.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul><p>For more details, click <a href="https://docs.ovaledge.com/connectors/additional-requirements/credential-manager-configuration">here</a>.</p></td></tr><tr><td>License Add Ons</td><td>Select the Auto Lineage Add-On checkbox to build data lineage automatically.</td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the AWS Glue ETL connection              </p><p>(Example: "AWS_Glue_ETL").</p></td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Cross-Account Role ARN</td><td>Enter the ARN (Amazon Resource Name) of the role used for cross-account access.</td></tr><tr><td>Database Region*</td><td>Select the AWS Region where the AWS Glue resources are configured (Example: us-xxxx-1, ap-xxxx-1). The selected region is used to establish connectivity and retrieve metadata from the configured AWS Glue environment.</td></tr></tbody></table>
{% endtab %}

{% tab title="IAM User Authentication" %}

<table><thead><tr><th width="189">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credentials manager from the drop-down list. The corresponding parameters will be displayed based on the selected option.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul><p>For more details, click <a href="https://docs.ovaledge.com/connectors/additional-requirements/credential-manager-configuration">here</a>.</p></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the AWS Glue connection              </p><p>(Example: "AWS_Glue").</p></td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Connector Environment</td><td><p>Select the environment (Example: PROD, STG) configured for the connector.</p><p>For more details, click <a href="https://docs.ovaledge.com/connectors/introduction-to-connectors/setup-and-connectivity/prerequisites#connector-environment">here</a>.</p></td></tr><tr><td>Access key*</td><td>Enter the AWS Access Key ID used to authenticate the IAM user.</td></tr><tr><td>Secret key*</td><td>Enter the AWS Secret Access Key associated with the Access Key ID.</td></tr><tr><td>Database Region*</td><td>Select the AWS Region where the AWS Glue resources are configured (Example: us-xxxx-1, ap-xxxx-1). The selected region is used to establish connectivity and retrieve metadata from the configured AWS Glue environment.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}

**Default Governance Roles**

<table data-header-hidden><thead><tr><th width="219.45452880859375"></th><th></th></tr></thead><tbody><tr><td>Default Governance Roles*</td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**Admin Roles**

<table data-header-hidden><thead><tr><th width="219.45458984375"></th><th></th></tr></thead><tbody><tr><td>Admin Roles*</td><td>Select one or more users from the dropdown list for Integration Admin and Security &#x26; Governance Admin. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**Bridge**

<table data-header-hidden><thead><tr><th width="218.54541015625"></th><th></th></tr></thead><tbody><tr><td>Select Bridge*</td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.</p></td></tr></tbody></table>

2. After entering all connection details, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the **Connectors home** page.
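
For readers who want to check credentials outside OvalEdge before saving a connection, the two authentication options can be approximated as follows. This is a hedged Python sketch, not the connector's code: it assumes boto3-style clients, the function names and the default session name are hypothetical, and the page does not describe how OvalEdge resolves role-based credentials internally. The `AssumeRole` call and its `Credentials` fields follow the public AWS STS API.

```python
def client_kwargs_for_iam_user(access_key, secret_key, region):
    """Client settings for IAM User Authentication (static keys)."""
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,
    }


def client_kwargs_for_role(sts_client, role_arn, region,
                           session_name="glue-etl-crawl"):
    """Client settings for a cross-account role, via STS AssumeRole."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,  # hypothetical session name
    )
    creds = response["Credentials"]
    # Temporary credentials include a session token in addition to the keys.
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
        "region_name": region,
    }
```

Either dictionary could be passed to `boto3.client("glue", **kwargs)` to confirm that the account can list Glue jobs in the selected region.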

## Manage Connector Operations

### Crawl

{% hint style="warning" %}
To perform crawl operations, users must be assigned the Integration Admin role.
{% endhint %}

The **Crawl/Profile** button allows users to select one or more schemas for crawling.

1. Navigate to the Connectors page and click **Crawl/Profile**.
2. Select the schemas to crawl.
3. The **Crawl** option is selected by default.
4. Click **Run** to collect metadata from the connected source and load it into the **Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog** > **Databases**/**<>Codes** tab.

The **Schedule** checkbox allows automated crawling at defined intervals, from a minute to a year.

1. Select the **Schedule** checkbox to enable the **Select Period** drop-down.
2. Select a time interval for the operation from the drop-down menu.
3. Click **Schedule** to initiate metadata collection from the connected source.
4. The system will automatically execute the **crawl** operation at the scheduled time.

### Other Operations

The **Connectors** page provides a centralized view of all configured connectors, along with their health status.

**Managing connectors includes:**

* **Connector Health:** Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye** icon next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options:**

To view, edit, validate, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector**: Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Crawler**: Configure data extraction.
  * **Access Instructions:** Add notes on how data can be accessed.
  * **Business Glossary Settings**: Manage term associations at the connector level.
  * **Others**: Configure notification recipients for metadata changes.
* **Delete Connector**: Remove a connector with confirmation.

## **Points to note:**

The AWS Glue ETL connector does not support querying the Glue Data Catalog from OvalEdge.

***

Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA

