AWS Glue ETL

An out-of-the-box connector is available for AWS Glue ETL entities. It supports crawling Jobs, Workflows, Triggers, and Crawlers, and building lineage for these entities.

Connectivity Summary

OvalEdge connects to AWS Glue ETL through the AWS Glue SDK, which is included in the platform.

The Glue SDK used by the connector is given below:

| Driver / API | Version | Details |

Technical Specifications

The connector's capabilities are summarized in this section. AWS Glue entities are created as Datasets of type Job, Workflow, Crawler, or Trigger. For jobs, the connector extracts the job's script and builds lineage from it. For crawlers, triggers, and workflows, it extracts information about the entities involved and builds the associations between them.
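As an illustration of the job script extraction described above, here is a minimal Python sketch. It is not the connector's implementation. It assumes the script is stored in S3 at the location that the Glue GetJob API reports in `Command.ScriptLocation`, and that the two client arguments behave like boto3 clients for "glue" and "s3". The function names `split_s3_uri` and `fetch_job_script` are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/script.py' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def fetch_job_script(glue_client, s3_client, job_name):
    """Return the source text of a Glue job's script.

    glue_client and s3_client are expected to behave like
    boto3.client("glue") and boto3.client("s3") (an assumption
    of this sketch, not something the connector documents).
    """
    job = glue_client.get_job(JobName=job_name)["Job"]
    # Command.ScriptLocation holds the S3 URI of the job script.
    bucket, key = split_s3_uri(job["Command"]["ScriptLocation"])
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    return body.read().decode("utf-8")
```

The clients are passed in rather than created inside the function so that the sketch can be exercised with stand-in objects and does not hard-code credentials.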

Crawling

| Feature | Supported Objects | Remarks |
| Crawling | Jobs, Workflows, Crawlers, Triggers | |
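To make the crawl scope concrete, the following Python sketch enumerates the names of the four supported object types. It is an illustration only and assumes a client that behaves like `boto3.client("glue")`, whose `list_jobs`, `list_workflows`, `list_crawlers`, and `list_triggers` calls return pages linked by `NextToken`. The helper names `list_all` and `crawl_entity_names` are hypothetical.

```python
def list_all(call, result_key):
    """Collect every item from a NextToken-paginated Glue list call."""
    items, token = [], None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = call(**kwargs)
        items.extend(page.get(result_key, []))
        token = page.get("NextToken")
        if not token:
            return items


def crawl_entity_names(glue_client):
    """Return entity names keyed by the dataset type they would map to."""
    return {
        "Job": list_all(glue_client.list_jobs, "JobNames"),
        "Workflow": list_all(glue_client.list_workflows, "Workflows"),
        "Crawler": list_all(glue_client.list_crawlers, "CrawlerNames"),
        "Trigger": list_all(glue_client.list_triggers, "TriggerNames"),
    }
```

With boto3 installed and credentials configured, a call might look like `crawl_entity_names(boto3.client("glue", region_name="us-east-1"))`, where the region is only an example.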

Lineage Building

| Lineage entities | Details |
| Jobs | Supported |
| Workflows | Supported |
| Crawlers | Supported |
| Triggers | Supported |
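One way to picture how associations could be derived for a trigger is sketched below in Python. This is an assumption-laden illustration, not the connector's algorithm. The input is a dict shaped like the `Trigger` member of a Glue GetTrigger response (`Name`, `WorkflowName`, `Predicate.Conditions`, `Actions`). The edge labels and the `type:name` node naming are hypothetical.

```python
def trigger_associations(trigger):
    """Return (source, relation, target) tuples for one trigger.

    Node IDs use a hypothetical "<type>:<name>" convention.
    """
    name = f"trigger:{trigger['Name']}"
    edges = []
    if trigger.get("WorkflowName"):
        edges.append((f"workflow:{trigger['WorkflowName']}", "contains", name))
    # Conditions name the jobs or crawlers the trigger waits on.
    for cond in trigger.get("Predicate", {}).get("Conditions", []):
        for key, kind in (("JobName", "job"), ("CrawlerName", "crawler")):
            if cond.get(key):
                edges.append((f"{kind}:{cond[key]}", "fires", name))
    # Actions name the jobs or crawlers the trigger starts.
    for action in trigger.get("Actions", []):
        for key, kind in (("JobName", "job"), ("CrawlerName", "crawler")):
            if action.get(key):
                edges.append((name, "starts", f"{kind}:{action[key]}"))
    return edges
```

Returning plain tuples keeps the sketch independent of any particular graph or catalog model.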

Pre-requisites:

To use the connector, the following must be available:

  • Connection details, as specified in the following section.

  • An admin/service account for crawling. The minimum privileges required are listed below:

| Operation | Access Permission |
| Crawl Jobs | LIST, GET permission on jobs |
| Crawl Workflows | LIST, GET permission on workflows |
| Crawl Crawlers | LIST, GET permission on crawlers |
| Crawl Triggers | LIST, GET permission on triggers |
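The permissions above could be expressed as an IAM policy along the following lines. This Python sketch only builds the policy document. The mapping from "LIST, GET" to specific Glue IAM actions is an assumption, since the exact actions are not named here, and the `Sid` value and the wildcard resource are placeholders to be narrowed for a real account.

```python
import json

# Assumed mapping from the LIST/GET rows above to Glue IAM actions.
GLUE_READ_ACTIONS = {
    "Jobs": ["glue:ListJobs", "glue:GetJob", "glue:GetJobs"],
    "Workflows": ["glue:ListWorkflows", "glue:GetWorkflow"],
    "Crawlers": ["glue:ListCrawlers", "glue:GetCrawler", "glue:GetCrawlers"],
    "Triggers": ["glue:ListTriggers", "glue:GetTrigger", "glue:GetTriggers"],
}


def build_read_policy(resource="*"):
    """Return a read-only IAM policy document for the four entity types."""
    actions = sorted(a for group in GLUE_READ_ACTIONS.values() for a in group)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "GlueEtlCrawlReadOnly",  # placeholder statement ID
                "Effect": "Allow",
                "Action": actions,
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_read_policy(), indent=2))
```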

Connection Details

The following connection settings should be added for connecting to AWS Glue ETL:

| Property | Details |
| Database Type | ETL |
| Connection Name | Select a connection name for AWS Glue ETL. The name you specify is a reference name that identifies your AWS Glue ETL connection in OvalEdge. Example: AWS Glue ETL Connection. |
| Authentication | Select the authentication type: Role-based Authentication or Basic Authentication. |
| Access key | AWS access key |
| Secret key | AWS secret key |
| Region | AWS Region of the Glue resources |
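For readers who want to check the same settings outside the product, the sketch below shows how the access key, secret key, and region map onto a boto3 Glue client. It is an illustration, not how OvalEdge itself connects. The helper names are hypothetical, and boto3 is imported lazily so the argument builder can be used without it. When no keys are passed, boto3 falls back to its default credential chain, for example an attached IAM role.

```python
def glue_client_kwargs(region, access_key=None, secret_key=None):
    """Build keyword arguments for boto3.client("glue", ...)."""
    if bool(access_key) != bool(secret_key):
        raise ValueError("access key and secret key must be supplied together")
    kwargs = {"region_name": region}
    if access_key:
        # Key-based authentication: pass both keys explicitly.
        kwargs["aws_access_key_id"] = access_key
        kwargs["aws_secret_access_key"] = secret_key
    return kwargs


def make_glue_client(region, access_key=None, secret_key=None):
    """Create a Glue client from the connection settings."""
    import boto3  # imported here so glue_client_kwargs works without boto3

    return boto3.client("glue", **glue_client_kwargs(region, access_key, secret_key))
```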

Set up a Connection

Prerequisites

The following are the prerequisites to establish a connection:

Connection Configuration Steps

  1. Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for AWS Glue ETL, and complete the required parameters.

Fields marked with an asterisk (*) are mandatory for establishing a connection.

Field Name
Description

Connector Type

By default, "AWS Glue ETL" is displayed as the selected connector type.

Authentication*

Select the authentication type from the drop-down menu:

  • Role-Based Authentication

  • IAM User Authentication

Field Name
Description

Credential Manager*

Select the desired credentials manager from the drop-down list. The corresponding parameters will be displayed based on the selected option.

Supported Credential Managers:

  • OE Credential Manager

  • AWS Secrets Manager

  • HashiCorp

  • Azure Key Vault

For more details, click here.

License Add Ons

Select the Auto Lineage Add-On checkbox to build data lineage automatically.

Connector Name*

Enter a unique name for the AWS Glue ETL connection (Example: "AWS_Glue_ETL").

Connector Description

Enter a brief description of the connector.

Connector Environment

Select the environment (Example: PROD, STG) configured for the connector.

Cross-Account Role ARN

Enter the ARN (Amazon Resource Name) of the role used for cross-account access.

Database Region*

Select the AWS Region where the AWS Glue resources are configured (Example: us-xxxx-1, ap-xxxx-1). The selected region is used to establish connectivity and retrieve metadata from the configured AWS Glue environment.

Default Governance Roles

Default Governance Roles*

Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.

Admin Roles

Admin Roles*

Select one or more users from the drop-down list for Integration Admin and Security & Governance Admin. All users configured in the security settings are available for selection.

Bridge

Select Bridge*

If applicable, select the bridge from the drop-down list.

The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.

  2. After entering all connection details, the following actions can be performed:

    1. Click Validate to verify the connection.

    2. Click Save to store the connection for future use.

    3. Click Save & Configure to apply additional settings before saving.

  3. The saved connection will appear on the Connectors home page.
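The Cross-Account Role ARN field and the Validate action in the steps above can be pictured with the following Python sketch. It is an assumption about what such a check might involve, not a description of OvalEdge internals. `sts_client` and `glue_client` are expected to behave like `boto3.client("sts")` and `boto3.client("glue")`, and the function names and the default session name are hypothetical.

```python
def assume_cross_account_role(sts_client, role_arn, session_name="glue-etl-crawl"):
    """Exchange the caller's identity for temporary credentials of role_arn.

    The returned dict uses the keyword names that boto3.client() accepts.
    """
    resp = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = resp["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def validate_connection(glue_client):
    """Return True if a minimal Glue list call succeeds, else False."""
    try:
        glue_client.list_jobs(MaxResults=1)
        return True
    except Exception:
        # Any failure (credentials, permissions, network) counts as invalid.
        return False
```

With boto3 available, the temporary credentials could be passed on as `boto3.client("glue", region_name=region, **assume_cross_account_role(boto3.client("sts"), role_arn))`.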

Manage Connector Operations

Crawl

The Crawl/Profile button allows users to select one or more schemas for crawling.

  1. Navigate to the Connectors page and click Crawl/Profile.

  2. Select the schemas to crawl.

  3. The Crawl option is selected by default.

  4. Click Run to collect metadata from the connected source and load it into the Data Catalog.

  5. After a successful crawl, the information appears in the Data Catalog > Databases/Codes tab.

The Schedule checkbox allows automated crawling at defined intervals, from a minute to a year.

  1. Click the Schedule checkbox to enable the Select Period drop-down.

  2. Select a time interval for the operation from the drop-down menu.

  3. Click Schedule to initiate metadata collection from the connected source.

  4. The system will automatically execute the crawl operation at the scheduled time.

Other Operations

The Connectors page provides a centralized view of all configured connectors, along with their health status.

Managing connectors includes:

  • Connector Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.

  • Viewing: Click the Eye icon next to the connector name to view connector details, including databases, tables, columns, and codes.

Nine Dots Menu Options:

To view, edit, validate, configure, or delete connectors, click on the Nine Dots menu.

  • Edit Connector: Update and revalidate the data source.

  • Validate Connector: Check the connection's integrity.

  • Settings: Modify connector settings.

    • Crawler: Configure data extraction.

    • Access Instructions: Add notes on how data can be accessed.

    • Business Glossary Settings: Manage term associations at the connector level.

    • Others: Configure notification recipients for metadata changes.

  • Delete Connector: Remove a connector with confirmation.

Points to note:

The AWS Glue ETL connector does not support querying the Glue Data Catalog from OvalEdge.


Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA
