Amazon Redshift

This article outlines the integration with Amazon Redshift, enabling streamlined metadata management through features such as crawling, profiling, querying, data preview, and lineage building (both automatic and manual).

This connector uses the JDBC driver to establish connectivity with Amazon Redshift databases and supports metadata extraction from schemas, tables, columns, views, and other database objects. Authentication is established using service account credentials for secure access to Amazon Redshift metadata objects.

Overview

Connector Details

Connector Category: Data Warehouse

OvalEdge Release Supported: Release4 and later

Connectivity (how the connection is established with Amazon Redshift): JDBC driver

Verified Amazon Redshift Version: 1.0.109768

Note: The Amazon Redshift connector has been validated with the version listed under "Verified Amazon Redshift Version" and is expected to be compatible with other supported Amazon Redshift versions. If any issues occur during validation or metadata crawling, submit a support ticket for investigation and feedback.

Connector Features

Feature
Availability

Crawling

Delta Crawling

Profiling

Sample Profiling

Query Sheet

Data Preview

Auto Lineage

Manual Lineage

Secure Authentication via Credential Manager

Data Quality

DAM (Data Access Management)

Bridge

Metadata Mapping

The following objects are crawled from Amazon Redshift and mapped to the corresponding UI assets.

| Redshift Object | Redshift Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type |
| --- | --- | --- | --- | --- |
| Schema | Schema name | Schema | Databases | Schema |
| Schema | Schema comment | Source Description | Databases | Schema |
| Table | Table Name | Table | Tables | Table |
| Table | Table Type | Type | Tables | Table |
| Table | Table Comments | Source Description | Descriptions | Source Description |
| Columns | Column Name | Column | Table Columns | - |
| Columns | Data Type | Column Type | Table Columns | - |
| Columns | Description | Source Description | Table Columns | - |
| Columns | Ordinal Position | Column Position | Table Columns | - |
| Columns | Length | Data Type Size | Table Columns | - |
| Views | View Name | View | Tables | View |
| Views | text | View Query | Views | View |
| Procedures | ROUTINE_NAME | Name | Procedures | - |
| Procedures | DESCRIPTION | Source Description | Descriptions | - |
| Procedures | ROUTINE_DEFINITION | Procedure | Procedures | - |
| Functions | ROUTINE_NAME | Name | Functions | - |
| Functions | ROUTINE_DEFINITION | Function | Functions | - |
| Functions | DESCRIPTION | Source Description | Descriptions | - |

Set up a Connection

Prerequisites

The following are the prerequisites to establish a connection:

External Supporting Files

The required external JAR files are included as part of the OvalEdge installation artifacts. For driver installation and configuration details, refer to the Connector Drivers Setup Guide. Please contact the OvalEdge Team for assistance related to the driver files and configuration setup.

| File Name | Description |
| --- | --- |
| JDBC driver file (RedshiftJDBC42) | Use this file to enable connectivity with Amazon Redshift without relying on the AWS SDK. Place the file in the Third Party Jars folder. |

Whitelisting Ports

Ensure that inbound port 5439 (the default Amazon Redshift port, or the custom port configured for your environment) is whitelisted so that OvalEdge can connect to the Amazon Redshift database.
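
Before validating the connector, it can help to confirm that the port is reachable from the host running OvalEdge. The following is a minimal Python sketch, not part of OvalEdge, that attempts a plain TCP connection. The function name `is_port_reachable` and the timeout value are assumptions chosen for illustration.

```python
import socket


def is_port_reachable(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host names that do not resolve.
        return False
```

Calling `is_port_reachable` with the Amazon Redshift server name and port from the connection form and getting `False` suggests that a firewall or security group rule still blocks the port, or that the host name does not resolve from this machine. A `True` result only confirms network reachability, not valid credentials.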

Service Account User Permissions

👨‍💻 Who can provide these permissions? These permissions are typically granted by the Amazon Redshift administrator, as users may not have the required access to assign them independently.

| Operation | Objects | pg_catalog Tables | Access Permissions | Comment |
| --- | --- | --- | --- | --- |
| Crawling | Schema | pg_catalog.pg_namespace, SVV_EXTERNAL_SCHEMAS | USAGE, SELECT | Access to pg_catalog.pg_namespace is required to crawl schemas and schema comments into OvalEdge. SVV_EXTERNAL_SCHEMAS is used to crawl external schemas and tables. |
| Crawling | Tables | INFORMATION_SCHEMA.TABLES, pg_catalog.pg_class, SVV_EXTERNAL_TABLES | SELECT | Access to these catalog tables is required to crawl tables and table comments into OvalEdge. |
| Crawling | Table Columns | INFORMATION_SCHEMA.COLUMNS, INFORMATION_SCHEMA.TABLE_CONSTRAINTS, SVV_EXTERNAL_COLUMNS, INFORMATION_SCHEMA.KEY_COLUMN_USAGE, pg_catalog.pg_description | SELECT | To fetch the column name, type, length, position, constraints, and comments. |
| Crawling & Lineage Building | Views | pg_catalog.pg_class, pg_get_viewdef(), pg_catalog.pg_namespace | SELECT | To fetch view code, metadata, and schema association. |
| Crawling & Lineage Building | Functions/Stored Procedures & Source Code | pg_proc, PG_PROC_INFO, pg_namespace, pg_get_late_binding_view_cols() | SELECT | To get procedures, functions, and source code from system catalogs and to retrieve view column definitions. |
| Crawling | Relationships | INFORMATION_SCHEMA.TABLE_CONSTRAINTS, INFORMATION_SCHEMA.KEY_COLUMN_USAGE, INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS | SELECT | To identify the constraint names for reference tables/columns and relationships. |
| Profiling | Row Count | User Data Tables | SELECT | To fetch the row count of tables for profiling analysis. |
| Profiling | Data Profiling & Sample Data | User Data Tables | SELECT | To fetch sample rows and statistical metrics (MIN, MAX, COUNT, DISTINCT) for data profiling. |
| Data Access & Query Execution | Data Query Execution | User Data Tables | SELECT | To execute user-defined SELECT queries with filtering and governance. |
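
As an illustration of the user data table permissions listed above, the following Python sketch generates GRANT statements that an Amazon Redshift administrator could review before running them. It assumes schema-level USAGE plus SELECT on all tables in each schema is the desired scope. The helper names are hypothetical, and the user `oesauser` and schema `salesinfo` are example values taken from this article. Access to system catalog objects may need to be handled separately, depending on how the environment is configured.

```python
from typing import Iterable, List


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_grants(user: str, schemas: Iterable[str]) -> List[str]:
    """Build USAGE and SELECT grants for each schema the service account should read."""
    statements = []
    for schema in schemas:
        # USAGE lets the user reference objects inside the schema.
        statements.append(
            f"GRANT USAGE ON SCHEMA {quote_ident(schema)} TO {quote_ident(user)};"
        )
        # SELECT on all existing tables covers profiling, preview, and querying.
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {quote_ident(schema)} "
            f"TO {quote_ident(user)};"
        )
    return statements


if __name__ == "__main__":
    # Example values only; replace with the real service account and schemas.
    for statement in build_grants("oesauser", ["salesinfo"]):
        print(statement)
```

Generating the statements rather than executing them keeps the administrator in control of what is granted. Note that `ON ALL TABLES` applies only to tables that exist when the statement runs, so tables created later would need a further grant.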

Connection Configuration Steps

  1. Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for Redshift, and complete the required parameters.

Note: Fields marked with an asterisk (*) are mandatory for establishing a connection.

Field Name
Description

Connector Type

By default, "Redshift" is displayed as the selected connector type.

Credential Manager*

Select the desired credential manager from the drop-down list. Relevant parameters will be displayed based on your selection.

Supported Credential Managers:

  • OE Credential Manager

  • AWS Secrets Manager

  • HashiCorp Vault

  • Azure Key Vault

For more details, click here.

License Add Ons

  • Select the checkbox for the Auto Lineage Add-On to build data lineage automatically.

  • Select the checkbox for the Data Quality Add-On to identify data quality issues using data quality rules and anomaly detection.

  • Select the checkbox for the Data Access Add-On to enforce connector access through OvalEdge with the Data Access Management (DAM) feature enabled.

For more details, click here.

Connector Name*

Enter a unique name for the Amazon Redshift connection (Example: "Redshift_Prod").

Connector Environment

Select the environment (Example: PROD, STG) configured for the connector. For more details, click here.

Connector Description

Enter a brief description of the connector.

Server*

Enter the Amazon Redshift database server name or IP address (Example: xxxx-redshift.xxxx4ijtzasl.xx-south-1.rds.xxxx.com or 192.xxx.1.xx).

Port*

By default, the Amazon Redshift port number "5439" is auto-populated. If Amazon Redshift is configured to use a custom port, update the value accordingly.

Database*

Enter the name of the Amazon Redshift database that the service account user can access.

Driver*

By default, the Redshift driver details are auto-populated.

Username*

Enter the service account username set up to access the Amazon Redshift database (Example: "oesauser").

Password*

Enter the password associated with the service account user.

Connection String

Configure the connection string for the Amazon Redshift database:

  • Automatic Mode: The system generates a connection string based on the provided credentials.

  • Manual Mode: Enter a valid connection string manually.

Replace the placeholders with the actual database details. {sid} refers to the database name.

Plugin Server

Enter the server name when running as a plugin server.

Plugin Port

Enter the port number on which the plugin is running.

Default Governance Roles

Default Governance Roles*

Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.

Admin Roles

Admin Roles*

Select one or more users from the drop-down list for Integration Admin and Security & Governance Admin. All users configured in the security settings are available for selection.

No of Archive Objects

No Of Archive Objects*

This shows the number of recent metadata changes to a dataset at the source. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive. Example: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.

Bridge

Select Bridge*

If applicable, select the bridge from the drop-down list. The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.

  2. After entering all connection details, the following actions can be performed:

    1. Click Validate to verify the connection.

    2. Click Save to store the connection for future use.

    3. Click Save & Configure to apply additional settings before saving.

  3. The saved connection will appear on the Connectors home page.
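
For the Manual Mode of the Connection String field, the string can be assembled from the Server, Port, and Database values. The Python sketch below assumes the standard Amazon Redshift JDBC URL form `jdbc:redshift://host:port/database`. The template and its `{server}` and `{port}` placeholder names are assumptions for illustration (only `{sid}` is named in this article), so compare it with the template shown in the OvalEdge UI before relying on it.

```python
# Assumed template; only the {sid} placeholder is documented in this article.
TEMPLATE = "jdbc:redshift://{server}:{port}/{sid}"


def build_connection_string(
    server: str, database: str, port: int = 5439, template: str = TEMPLATE
) -> str:
    """Fill the connection string template with the server, port, and database name."""
    if not server or not database:
        raise ValueError("server and database are both required")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # {sid} is replaced with the database name, as described for the field.
    return template.format(server=server, port=port, sid=database)


if __name__ == "__main__":
    # Hypothetical host name and database; replace with actual values.
    print(build_connection_string("redshift.example.internal", "ovaledge"))
```

Validating the inputs before formatting catches the kinds of mistakes that otherwise surface later as connection validation errors, such as an empty database name.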

Manage Connector Operations

Crawl/Profile

The Crawl/Profile button allows users to select one or more schemas for crawling and profiling.

  1. Navigate to the Connectors page and click Crawl/Profile.

  2. Select the schemas to be crawled.

  3. The Crawl option is selected by default. To perform both operations, select the Crawl & Profile radio button.

  4. Click Run to collect metadata from the connected source and load it into the Data Catalog.

  5. After a successful crawl, the information appears in the Data Catalog > Databases tab.

The Schedule checkbox allows automated crawling and profiling at defined intervals, from a minute to a year.

  1. Click the Schedule checkbox to enable the Select Period drop-down.

  2. Select a time period for the operation from the drop-down menu.

  3. Click Schedule to initiate metadata collection from the connected source.

  4. The system will automatically execute the selected operation (Crawl or Crawl & Profile) at the scheduled time.

Other Operations

The Connectors page provides a centralized view of all configured connectors, along with their health status.

Managing connectors includes:

  • Connectors Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.

  • Viewing: Click the Eye icon next to the connector name to view connector details.

Nine Dots Menu Options:

To view, edit, validate, build lineage, configure, or delete connectors, click on the Nine Dots menu.

  • Edit Connector: Update and revalidate the data source.

  • Validate Connector: Check the connection's integrity.

  • Settings: Modify connector settings.

    • Crawler: Configure data extraction.

    • Profiler: Customize data profiling rules and methods.

    • Query Policies: Define query execution rules based on roles.

    • Access Instructions: Include notes on how to access the data.

    • Business Glossary Settings: Manage term associations at the connector level.

    • Anomaly Detection Settings: Configure anomaly detection preferences at the connector level.

    • Others: Configure notification recipients for metadata changes.

  • Build Lineage: Automatically build data lineage using source code parsing.

  • Delete Connector: Remove a connector with confirmation.

For more details, click here.

Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

S. No.
Error Message(s)
Description & Resolution

1

Error while validating connection: Error while validating Redshift Connection : Failed to load driver class com.amazon.redshift.jdbc.Driver in either of HikariConfig class loader or Thread context classloader

Description: Connection validation failed due to missing Redshift JDBC driver.

Resolution: Download the Redshift JDBC driver from Amazon's official site and upload it under Admin > Drivers. Then retry the connection.

2

Error while validating connection: Error while validating Redshift Connection : Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: [Amazon](500150) Error setting/closing connection: UnknownHostException

Description: Connection validation failed due to an UnknownHostException, indicating that the hostname or IP is incorrect or the Redshift server is not reachable.

Resolution: Verify that the provided host/IP is correct and ensure the Redshift server is up and accessible from the network.

3

Error while validating connection: Error while validating Redshift Connection : Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: [Amazon](500310) Invalid operation: password authentication failed for user "ovaledge1";

Description: Connection validation failed due to incorrect credentials.

Resolution: Check and update the username or password. Ensure the credentials are correct and have access to the Redshift cluster.

4

Error while validating connection: Error while validating Redshift Connection : Failed to obtain JDBC Connection; nested exception is java.sql.SQLException: [Amazon](500310) Invalid operation: database "ovaledge" does not exist;

Description: Connection failed because the specified database does not exist.

Resolution: Verify and correct the database name in the connection settings.

5

ERROR: schema "pg_temp_91" does not exist

Description: The query references a temporary schema that no longer exists. Temporary schemas in Amazon Redshift are session-based and are automatically removed when the session ends or when temporary objects are dropped.

Resolution:

  • Verify whether the temporary schema exists in the Redshift environment.

  • Ensure the referenced temporary tables or schemas are available during query execution.

  • Re-run the operation using valid and active schemas.

6

Unsupported Datatype

Description: Profiling is skipped for columns where the column length exceeds the configured profiling size limit (default: 4000). The operation is skipped intentionally to avoid performance issues during profiling.

Resolution:

  • Review the warning logs for skipped columns.

  • Adjust the profiling size limit if required.

  • Exclude large unsupported columns from profiling where applicable.

7

ERROR: Query (<query_id>) cancelled on user's request

Description: The query execution was interrupted manually or automatically due to query timeout settings, resource constraints, or administrative actions in Amazon Redshift.

Resolution:

  • Execute the failed query directly in Redshift to measure its execution time.

  • Increase the query timeout value in the connection profile settings if required.

  • Optimize long-running queries or reduce profiling scope for large datasets.

8

ERROR: permission denied for schema pg_autocopy

ERROR: permission denied for schema pg_automv

Description: The service account does not have sufficient privileges to access Redshift system-managed schemas such as pg_autocopy or pg_automv.

Resolution:

  • Grant the required USAGE or SELECT permissions on the respective schemas if access is required.

  • Verify that the service account has sufficient privileges to access system schemas used during crawling or profiling.

9

ERROR: S3ServiceException: User is not authorized to perform: s3:ListBucket

Description: The IAM role associated with the Amazon Redshift cluster or session does not have sufficient permissions to access the target Amazon S3 bucket or prefix.

Resolution:

  • Update the IAM policy attached to the Redshift role.

  • Grant the required s3:ListBucket permission on the target S3 bucket and prefix.

  • Verify that the IAM role is correctly associated with the Redshift cluster or session.

10

Error while retrieving Redshift SQL Profile Results : StatementCallback; bad SQL grammar [SELECT MAX("last_name") AS maxo, MIN("last_name") AS mino, COUNT(DISTINCT "last_name") AS distincto, COUNT(*) AS notnullo FROM "salesinfo"."actor" WHERE "last_name" IS NOT NULL]; nested exception is java.sql.SQLException: Amazon Invalid operation: could not identify an equality operator for type "unknown";

Description: The Redshift profiling operation fails while retrieving SQL profile results because the query includes a DISTINCT operation on a column type for which Redshift cannot determine a valid equality operator. This issue commonly occurs when profiling unsupported, ambiguous, or incompatible data types.

Resolution:

  • Verify the data type of the affected column and ensure it supports comparison and DISTINCT operations in Amazon Redshift.

  • Review the table schema for columns with unsupported, custom, or ambiguous data types that may be interpreted as unknown.

  • Modify the column data type to a supported Redshift data type if required.

  • Exclude unsupported columns from profiling when DISTINCT-based metrics are not applicable.

  • Validate the query directly in Amazon Redshift to identify the specific column causing the failure.

  • Retry the profiling operation after updating the schema or profiling configuration.
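
The profiling behavior described in the "Unsupported Datatype" and "could not identify an equality operator" entries above can be illustrated with a short Python sketch. It rebuilds the profiling query quoted in the last error message and skips columns whose length exceeds the default profiling size limit of 4000. The function names are hypothetical, and this is a sketch of the idea rather than OvalEdge's actual implementation.

```python
from typing import Optional

# Default profiling size limit stated in the troubleshooting entry.
PROFILING_SIZE_LIMIT = 4000


def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_profile_query(
    schema: str,
    table: str,
    column: str,
    length: Optional[int],
    size_limit: int = PROFILING_SIZE_LIMIT,
) -> Optional[str]:
    """Return the profiling query for a column, or None if the column is skipped."""
    # Columns longer than the limit are skipped to avoid expensive scans.
    if length is not None and length > size_limit:
        return None
    col = quote_ident(column)
    return (
        f"SELECT MAX({col}) AS maxo, MIN({col}) AS mino, "
        f"COUNT(DISTINCT {col}) AS distincto, COUNT(*) AS notnullo "
        f"FROM {quote_ident(schema)}.{quote_ident(table)} "
        f"WHERE {col} IS NOT NULL"
    )


if __name__ == "__main__":
    # Example identifiers taken from the error message in this article.
    print(build_profile_query("salesinfo", "actor", "last_name", 50))
```

Running the generated statement directly in Amazon Redshift is one way to follow the resolution step of validating the query and isolating the column that causes the failure.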


Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA
