Amazon Athena
This article outlines the integration with the Amazon Athena connector, enabling streamlined metadata management through features such as crawling, profiling, querying, data preview, and lineage building (both automatic and manual).
This connector supports connectivity to Amazon Athena using the AWS SDK and enables metadata extraction for schemas, tables, columns, views, named queries, and prepared statements. It supports both IAM User Authentication and Role-Based Authentication, allowing access to Athena resources, AWS Glue Data Catalog metadata, and Amazon S3 query result locations required for crawling, profiling, and query execution.

Overview
Connector Details
Connector Category: RDBMS
OvalEdge Release Supported: Release 6.3.4 and later
Connectivity (how the connection is established with Amazon Athena): AWS SDK
Verified Amazon Athena Version: Athena Engine v3
The Amazon Athena connector has been validated with the version listed under "Verified Amazon Athena Version" and is expected to be compatible with other supported Amazon Athena versions. If validation or metadata crawling fails, please submit a support ticket for investigation and feedback.
Connector Features
Crawling: ✅
Delta Crawling: ❌
Profiling: ✅
Sample Profiling: ✅
Query Sheet: ✅
Data Preview: ✅
Auto Lineage: ✅
Manual Lineage: ✅
Secure Authentication via Credential Manager: ✅
Data Quality: ❌
DAM (Data Access Management): ❌
Bridge: ✅
Metadata Mapping
The following objects are crawled from Amazon Athena and mapped to the corresponding UI assets.
Athena Object | Athena Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type
Schema | database.name | Schema | Schemas | schema
Schema | database.description | Source Description | Descriptions | Source Description
Table | table.name | Table | Tables | table
Table | table.tableType (TABLE/VIEW/EXTERNAL_TABLE) | Table Data Type | Tables | table
Table | table.parameters.location | Table Location | Tables | table
Table | - (Athena doesn’t carry comments via API) | Table Comments | Descriptions | Source Description
Columns | column_name | Column | Table Columns | -
Columns | data_type | Column Type | Table Columns | -
Columns | ordinal_position | Column Position | Table Columns | -
Columns | IS_NULLABLE (YES/NO) | Nullable | Table Columns | -
Columns | comment (if present; often empty in Athena) | Source Description | Table Columns | -
Views | table.name (where tableType = VIRTUAL_VIEW) | View | Tables | view
Views | SHOW CREATE VIEW result | View Query | Views | View
Named Queries | namedQuery.name | Name | Views | other
Named Queries | namedQuery.queryString | View/Query Text | Views | Other
Prepared Statements | preparedStatement.statementName | Name | Views | other
Set up a Connection
Prerequisites
The following are the prerequisites to establish a connection:
External Supporting Files
The required external JAR files are included as part of the OvalEdge installation artifacts. For driver installation and configuration details, refer to the Connector Drivers Setup Guide. Please contact the OvalEdge Team for assistance related to the driver files and configuration setup.
athena-2.30.2.jar: Use this file when connecting to Amazon Athena using the AWS SDK.
Service Account User Permissions
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
👨💻 Who can provide these permissions? These permissions are typically granted by the Amazon Athena administrator, as users may not have the required access to assign them independently.
The IAM role/user (for example: ovaxxxge-bxxxge-xxx-xxx) must have appropriate Athena, S3, and Glue permissions.
An admin/service account for OvalEdge Data Catalog Operations.
Operation | Object | API Calls / Queries Used | Minimum Permissions
Crawling | Schema (Databases) | athena:ListDatabases, glue:GetDatabases | athena:ListDatabases, glue:GetDatabase
Validation | S3 Bucket | s3:HeadBucket, s3:ListBucket, s3:HeadObject | s3:HeadBucket, s3:ListBucket, s3:HeadObject
Crawling | Tables | athena:ListTableMetadata, glue:GetTables | athena:ListTableMetadata, glue:GetTable
Crawling | Table Columns | athena:StartQueryExecution, athena:GetQueryResults, information_schema.columns | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Crawling & Lineage Building | Views | athena:StartQueryExecution, athena:GetQueryResults, SHOW CREATE VIEW | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Crawling & Lineage Building | External Tables | athena:ListTableMetadata (table parameters) | athena:ListTableMetadata
Crawling & Lineage Building | Named Queries | athena:ListNamedQueries, athena:GetNamedQuery, athena:ListWorkGroups | athena:ListNamedQueries, athena:GetNamedQuery, athena:ListWorkGroups
Crawling & Lineage Building | Prepared Statements | athena:ListPreparedStatements, athena:GetPreparedStatement, athena:ListWorkGroups | athena:ListPreparedStatements, athena:GetPreparedStatement, athena:ListWorkGroups
Profiling | Row Count | athena:StartQueryExecution, athena:GetQueryResults, SELECT COUNT(*) | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Profiling | Data Profiling – Top Values | athena:StartQueryExecution, athena:GetQueryResults, SELECT ... GROUP BY ... ORDER BY ... LIMIT | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Profiling | Sample Data | athena:StartQueryExecution, athena:GetQueryResults, SELECT * ... LIMIT | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Profiling | Non-Null Count | athena:StartQueryExecution, athena:GetQueryResults, SELECT COUNT(*) WHERE column IS NOT NULL | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Profiling | Max / Min / Distinct Count | athena:StartQueryExecution, athena:GetQueryResults, SELECT MAX(), MIN(), COUNT(DISTINCT) | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Data Access & Governance | Governed Data Query Execution | athena:StartQueryExecution, athena:GetQueryResults, SELECT ... WHERE ... | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject
Data Access & Query Execution | Data Query Execution (Async) | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults, s3:GetObject
Data Access & Query Execution | Data Query Execution (Real-Time) | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults, s3:GetObject
Connection Validation | Connection Validation | s3:ListBucket, s3:GetBucketLocation, athena:ListWorkGroups | s3:ListBucket, s3:GetBucketLocation, athena:ListWorkGroups
All Operations | S3 Output Location | s3:ListBucket, s3:GetBucketLocation, s3:GetObject | s3:ListBucket, s3:GetBucketLocation, s3:GetObject
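The permissions above could be collected into a single IAM policy document. The following is a minimal sketch, not an official OvalEdge policy: the helper name build_policy, the statement IDs, and the bucket name are hypothetical, and the "*" resources are placeholders that should be narrowed in practice. The s3:HeadBucket and s3:HeadObject entries are left out because those calls are generally authorized through s3:ListBucket and s3:GetObject; verify this against the AWS IAM documentation before relying on it.

```python
import json

# Actions copied from the permissions listed above, grouped by AWS service.
ATHENA_ACTIONS = [
    "athena:ListDatabases",
    "athena:ListTableMetadata",
    "athena:ListWorkGroups",
    "athena:ListNamedQueries",
    "athena:GetNamedQuery",
    "athena:ListPreparedStatements",
    "athena:GetPreparedStatement",
    "athena:StartQueryExecution",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
]
GLUE_ACTIONS = [
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
]
S3_ACTIONS = ["s3:ListBucket", "s3:GetBucketLocation", "s3:GetObject"]


def build_policy(results_bucket):
    """Return an IAM policy document as a dict (hypothetical helper)."""
    bucket_arn = "arn:aws:s3:::" + results_bucket
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Assumption: "*" is only a placeholder resource for the sketch.
            {"Sid": "AthenaAccess", "Effect": "Allow",
             "Action": ATHENA_ACTIONS, "Resource": "*"},
            {"Sid": "GlueCatalogRead", "Effect": "Allow",
             "Action": GLUE_ACTIONS, "Resource": "*"},
            # Bucket-level and object-level ARNs for the query result location.
            {"Sid": "QueryResultsBucket", "Effect": "Allow",
             "Action": S3_ACTIONS,
             "Resource": [bucket_arn, bucket_arn + "/*"]},
        ],
    }


if __name__ == "__main__":
    # "bucket-name" mirrors the placeholder used in the Output S3 Folder Path example.
    print(json.dumps(build_policy("bucket-name"), indent=2))
```

The printed JSON can be reviewed by the AWS administrator and attached to the IAM user or role used by the connector.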
Connection Configuration Steps
Users are required to have the Connector Creator role in order to configure a new connection.
Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for Amazon Athena, and complete the required parameters.
Note: Fields marked with an asterisk (*) are mandatory for establishing a connection.
Connector Type
By default, "Amazon Athena" is displayed as the selected connector type.
Authentication*
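Select the authentication type from the drop-down:
Role based Authentication
IAM User Authentication
The parameters below are listed twice: first for Role based Authentication (which includes Cross-Account Role ARN), then for IAM User Authentication (which includes Access Key and Secret Key).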
Credential Manager*
Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection.
Supported Credential Managers:
OE Credential Manager
AWS Secrets Manager
HashiCorp Vault
Azure Key Vault
For more details, click here.
License Add Ons
Select the checkbox for the Auto Lineage Add-On to build data lineage automatically. For more details, click here.
Connector Name*
Enter a unique name for the connector.
Connector Description
Enter a brief description to describe the purpose of the connector.
Connector Environment
Select the environment (Example: PROD, STG) configured for the connector. For more details, click here.
Cross-Account Role ARN
Enter the ARN of the IAM role that allows access to the target account for establishing the connection.
Database Region*
Enter the AWS region where the Amazon Athena resources and associated S3 output location are configured (for example, xx-xxx-1).
Catalog Name*
Enter the name of the Data Catalog that contains the databases and tables to be crawled (default: AwsDataCatalog).
Output S3 Folder Path*
Enter the Amazon S3 folder path where Athena query results are stored (for example, s3://bucket-name/athena/results/). The configured account or role must have access to this location.
Credential Manager*
Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection.
Supported Credential Managers:
OE Credential Manager
AWS Secrets Manager
HashiCorp Vault
Azure Key Vault
For more details, click here.
License Add Ons
Select the checkbox for the Auto Lineage Add-On to build data lineage automatically. For more details, click here.
Connector Name*
Enter a unique name for the connector.
Connector Description
Enter a brief description to describe the purpose of the connector.
Connector Environment
Select the environment (Example: PROD, STG) configured for the connector.
For more details, click here.
Access Key*
Enter the AWS Access Key ID associated with the IAM user account used to connect to Amazon Athena. This key is used along with the Secret Key to authenticate API requests.
Secret Key*
Enter the AWS Secret Access Key associated with the specified Access Key ID. This key is used to securely authenticate access to Amazon Athena services.
Database Region*
Enter the AWS region where the Amazon Athena resources and associated S3 output location are configured (for example, xx-xxx-1).
Catalog Name*
Enter the name of the Data Catalog that contains the databases and tables to be crawled (default: AwsDataCatalog).
Output S3 Folder Path*
Enter the Amazon S3 folder path where Athena query results are stored (for example, s3://bucket-name/athena/results/). The configured account or role must have access to this location.
Default Governance Roles
Default Governance Roles*
Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.
Admin Roles
Admin Roles*
Select one or more users from the dropdown list for Integration Admin and Security & Governance Admin. All users configured in the security settings are available for selection.
Bridge
Select Bridge*
If applicable, select the bridge from the drop-down list. The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.
After entering all connection details, the following actions can be performed:
Click Validate to verify the connection.
Click Save to store the connection for future use.
Click Save & Configure to apply additional settings before saving.
The saved connection will appear on the Connectors home page.
Manage Connector Operations
Crawl/Profile
To perform crawl and profile operations, users must be assigned the Integration Admin role.
The Crawl/Profile button allows users to select one or more schemas for crawling and profiling.
Navigate to the Connectors page and click Crawl/Profile.
Select the schemas to be crawled.
The Crawl option is selected by default. To perform both operations, select the Crawl & Profile radio button.
Click Run to collect metadata from the connected source and load it into the Data Catalog.
After a successful crawl, the information appears in the Data Catalog > Databases tab.
The Schedule checkbox allows automated crawling and profiling at defined intervals, from a minute to a year.
Click the Schedule checkbox to enable the Select Period drop-down.
Select a time period for the operation from the drop-down menu.
Click Schedule to initiate metadata collection from the connected source.
The system will automatically execute the selected operation (Crawl or Crawl & Profile) at the scheduled time.
Other Operations
The Connectors page provides a centralized view of all configured connectors, along with their health status.
Managing connectors includes:
Connectors Health: Displays the current status of each connector using a green icon for active connections and a red icon for inactive connections, helping to monitor the connectivity with data sources.
Viewing: Click the Eye icon next to the connector name to view connector details.
Nine Dots Menu Options:
To view, edit, validate, build lineage, configure, or delete connectors, click on the Nine Dots menu.
Edit Connector: Update and revalidate the data source.
Validate Connector: Check the connection's integrity.
Settings: Modify connector settings.
Crawler: Configure data extraction.
Profiler: Customize data profiling rules and methods.
Query Policies: Define query execution rules based on roles.
Access Instructions: Include notes on how to access the data.
Business Glossary Settings: Manage term associations at the connector level.
Others: Configure notification recipients for metadata changes.
Build Lineage: Automatically build data lineage using source code parsing.
Delete Connector: Remove a connector with confirmation.
For more details on connector settings, click here.
Additional Information
Athena restricts each account to 100 databases, and a database cannot contain more than 100 tables.
Athena DDL query limit: a maximum of 20 active DDL queries.
The Amazon S3 bucket limit is 100 buckets per account by default; an increase of up to 1,000 S3 buckets per account can be requested.
Connectivity Troubleshooting
If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.
1
S3 bucket does not exist / Invalid S3 output location
Error Description: The configured Amazon S3 output location is invalid, inaccessible, or does not exist.
Resolution:
Verify the S3 path format is s3://bucket-name/path/.
Ensure the path ends with a forward slash (/).
Verify the S3 bucket exists and is accessible.
Ensure the configured credentials have s3:GetBucketLocation, s3:ListBucket, and s3:PutObject permissions.
Verify the S3 bucket region matches the Athena region.
Validate access using AWS CLI commands.
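The two format checks in this resolution (the s3:// prefix and the trailing slash) can be run locally before saving the connection. This is a minimal sketch using only the Python standard library; the function name check_output_location is hypothetical, and the sketch checks syntax only, not whether the bucket exists or is accessible.

```python
from typing import List
from urllib.parse import urlparse


def check_output_location(path: str) -> List[str]:
    """Return a list of format problems in an S3 output path (empty if none)."""
    problems = []
    parsed = urlparse(path)
    if parsed.scheme != "s3":
        problems.append("path must start with s3://")
    if not parsed.netloc:
        # For s3:// URLs the network location part is the bucket name.
        problems.append("bucket name is missing")
    if not path.endswith("/"):
        problems.append("path must end with a forward slash (/)")
    return problems


if __name__ == "__main__":
    for candidate in ("s3://bucket-name/athena/results/",
                      "s3://bucket-name/athena/results"):
        print(candidate, "->", check_output_location(candidate) or "OK")
```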
2
Invalid credentials / Access Denied
Error Description: Authentication to Amazon Athena failed due to invalid credentials or insufficient permissions.
Resolution:
Verify that the Access Key and Secret Key are valid.
Verify the IAM Role ARN is correct when role-based authentication is used.
Ensure the IAM user or role has the required Athena, AWS Glue, and S3 permissions.
Confirm the credentials have not expired.
Validate access using AWS CLI commands.
3
Unable to assume cross-account role
Error Description: The specified cross-account IAM role cannot be assumed.
Resolution:
Verify the Role ARN format and account details.
Ensure the role trust policy allows role assumption.
Verify the role has the required Athena, AWS Glue, and S3 permissions.
Confirm AWS STS access is enabled in the target region.
Verify the session duration does not exceed the configured maximum.
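For reference, a role trust policy that allows role assumption has the general shape sketched below. This is an illustration under assumptions, not the policy OvalEdge requires: the helper name build_trust_policy, the account ID 111122223333, and the role name are hypothetical placeholders.

```python
import json


def build_trust_policy(trusted_principal_arn):
    """Return a trust policy dict letting one principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The principal that is allowed to call sts:AssumeRole.
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARN of the identity used by the OvalEdge service account.
    principal = "arn:aws:iam::111122223333:role/example-ovaledge-role"
    print(json.dumps(build_trust_policy(principal), indent=2))
```

The trust policy is attached to the role named in Cross-Account Role ARN; the permissions policy for Athena, AWS Glue, and S3 is attached separately.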
4
Athena client initialization failed
Error Description: The Athena client could not be initialized due to configuration, credential, or connectivity issues.
Resolution:
Verify the configured AWS region is valid.
Confirm credentials are valid.
Verify network connectivity to AWS services.
Ensure all required AWS SDK dependencies are available.
Review application logs for detailed initialization errors.
5
Invalid region
Error Description: The configured AWS region is invalid or unsupported.
Resolution:
Verify the region follows the correct AWS format (for example, us-east-1).
Ensure Athena is supported in the selected region.
Verify the S3 output location resides in the same region.
Confirm the region is enabled in the AWS account.
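A quick syntax check can catch malformed region values before validation. The sketch below is an approximation: the pattern and the function name looks_like_aws_region are assumptions, and a matching string is not proof that the region exists, supports Athena, or is enabled in the account.

```python
import re

# Two lowercase letters, one or more "-word" parts, then "-<number>",
# for example us-east-1 or us-gov-west-1.
REGION_PATTERN = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def looks_like_aws_region(region: str) -> bool:
    """Return True if the string has the general shape of an AWS region code."""
    return REGION_PATTERN.fullmatch(region) is not None


if __name__ == "__main__":
    for value in ("us-east-1", "US-EAST-1", "us-east"):
        print(value, looks_like_aws_region(value))
```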
6
Query execution timeout
Error Description: Query execution exceeded the configured timeout limits.
Resolution:
Optimize the query to reduce execution time.
Use filters or LIMIT clauses when appropriate.
Review Athena workgroup timeout settings.
Verify network latency and AWS service availability.
Retry the operation after validating query performance.
7
Query cancelled / Query failed
Error Description: Athena terminated the query due to execution failures, workgroup policies, syntax issues, or resource limitations.
Resolution:
Review the detailed error message returned by Athena.
Verify query syntax and object references.
Check data scan limits and workgroup restrictions.
Ensure the queried objects exist and are accessible.
Review Athena and OvalEdge logs for additional details.
8
No schemas returned / Failed to retrieve databases
Error Description: OvalEdge could not retrieve database metadata from the configured catalog.
Resolution:
Verify the Catalog Name is correct.
Ensure the catalog exists and is accessible.
Verify permissions for glue:GetDatabases and related Athena APIs.
Confirm the selected region contains the catalog metadata.
Review logs for API or permission-related errors.
9
Failed to retrieve tables
Error Description: Table metadata could not be retrieved from Amazon Athena.
Resolution:
Verify the database exists and is accessible.
Confirm the Catalog Name is configured correctly.
Ensure permissions for athena:ListTableMetadata and glue:GetTables are granted.
Review logs for metadata retrieval failures.
10
Failed to retrieve columns
Error Description: Column metadata could not be retrieved from Athena metadata tables.
Resolution:
Verify the database and table exist.
Ensure access to information_schema.columns.
Confirm the required query permissions are granted.
Review query execution logs for errors.
11
Failed to retrieve query results
Error Description: Query results could not be processed or returned successfully.
Resolution:
Verify the query completed successfully in Athena.
Ensure result-set metadata is available.
Confirm access to the configured S3 output location.
Review query execution logs and API responses for details.
FAQs
Is there a step-by-step way to upgrade to the AWS Glue Data Catalog?
Yes. A step-by-step guide can be found here.
Can I run any Hive Query on Athena?
Amazon Athena uses Hive only for DDL (Data Definition Language) statements, that is, for creating, modifying, and deleting tables and partitions. Please click here for a complete list of supported statements. For SQL queries on data in Amazon S3, Athena uses a Presto-based engine (Trino in Athena engine version 3), so you can run ANSI-compliant SQL SELECT statements against your data in Amazon S3.
Some databases are missing after crawling. Why?
Amazon Athena retrieves databases using paginated API calls. If some databases are missing, verify IAM permissions, catalog accessibility, and review crawl logs for pagination-related messages.
Why are some external tables not crawled?
External tables must contain valid input format, output format, and SerDe definitions. Tables with incomplete definitions may be excluded during crawling.
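To find tables with incomplete definitions, the table metadata can be inspected for the three fields mentioned above. The sketch below assumes a dictionary shaped like the Table element of an AWS Glue GetTable response (StorageDescriptor, InputFormat, OutputFormat, SerdeInfo.SerializationLibrary); the function name is hypothetical, and this is not necessarily the check the connector itself performs.

```python
from typing import Dict, List


def missing_storage_fields(table: Dict) -> List[str]:
    """List the storage definition fields that are absent or empty."""
    descriptor = table.get("StorageDescriptor") or {}
    serde = descriptor.get("SerdeInfo") or {}
    checks = {
        "InputFormat": descriptor.get("InputFormat"),
        "OutputFormat": descriptor.get("OutputFormat"),
        "SerdeInfo.SerializationLibrary": serde.get("SerializationLibrary"),
    }
    return [name for name, value in checks.items() if not value]


if __name__ == "__main__":
    # Hypothetical table with an input format but no output format or SerDe.
    incomplete = {
        "Name": "example_table",
        "StorageDescriptor": {
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
        },
    }
    print(missing_storage_fields(incomplete))
```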
Why is the nullable status of a column incorrect?
Nullable status is derived from the IS_NULLABLE attribute in Athena metadata. Verify the column definition in the AWS Glue Data Catalog.
How are large query results processed?
Query results are retrieved using paginated API requests until all result pages are processed.
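The pagination pattern can be sketched as a loop that follows the continuation token until none is returned. The example assumes responses shaped like the Athena GetQueryResults API (rows under ResultSet.Rows, continuation in NextToken); iter_rows and make_fake_fetcher are hypothetical names, and the fake fetcher stands in for a real API call so the sketch runs without AWS access.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_rows(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator:
    """Yield rows from every page, following NextToken until it is absent."""
    token = None
    while True:
        page = fetch_page(token)
        for row in page.get("ResultSet", {}).get("Rows", []):
            yield row
        token = page.get("NextToken")
        if not token:
            break


def make_fake_fetcher(pages: List[List]) -> Callable[[Optional[str]], Dict]:
    """Build a stand-in for the API call; tokens are page indexes as strings."""
    def fetch(token: Optional[str]) -> Dict:
        index = int(token) if token else 0
        response = {"ResultSet": {"Rows": pages[index]}}
        if index + 1 < len(pages):
            response["NextToken"] = str(index + 1)
        return response
    return fetch


if __name__ == "__main__":
    fetcher = make_fake_fetcher([["r1", "r2"], ["r3"], ["r4", "r5"]])
    print(list(iter_rows(fetcher)))
```

The same loop applies to the paginated database listing mentioned earlier; only the response keys differ.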
Why does submitQuery() display a warning for DML or DDL statements?
The connector supports query execution for SELECT statements only. DML and DDL operations are not supported.
Why does a query remain in the RUNNING state for a long time?
Query status is monitored through periodic polling. Long-running queries may require optimization or workgroup configuration review.
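Periodic polling of this kind can be sketched as follows. The state names are the Athena query execution states; the function name wait_for_query, the default timeout, and the polling interval are assumptions for illustration, not the connector's actual settings. The get_state callable stands in for a call that reads the current state of a query.

```python
import time
from typing import Callable

# Athena query execution states that end the polling loop.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(get_state: Callable[[], str],
                   timeout_s: float = 300.0,
                   poll_interval_s: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_state until a terminal state is seen or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "query still %s after %s seconds" % (state, timeout_s))
        sleep(poll_interval_s)


if __name__ == "__main__":
    # Simulated sequence of states instead of a real Athena call.
    states = iter(["QUEUED", "RUNNING", "RUNNING", "SUCCEEDED"])
    print(wait_for_query(lambda: next(states), sleep=lambda _: None))
```

A query that stays in QUEUED or RUNNING past the deadline raises TimeoutError, which corresponds to the case where query optimization or a workgroup configuration review is needed.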
Why does getRowCount() return 0?
The table may not contain data, or the query may not have returned any records. Verify the source table and query execution results.
Why are some columns skipped during profiling?
Unsupported data types and columns exceeding profiling limits are automatically excluded.
Why do profiling statistics appear incorrect?
Verify the column data type supports aggregation functions and review query execution logs.
Why are some columns excluded from sample profiling?
Columns whose data types are unsupported or are listed in the profiling exclusion list are omitted.
Why are Top 50 Values not displayed?
Verify that the query executed successfully and that the column contains sufficient data values.
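To check a column by hand, a top-values query of the SELECT ... GROUP BY ... ORDER BY ... LIMIT form listed in the permissions section can be run in the Athena console. The sketch below only builds the SQL text; the function names, the column aliases, and the sample database, table, and column are hypothetical, and the exact query the connector issues may differ.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def top_values_query(database: str, table: str, column: str,
                     limit: int = 50) -> str:
    """Build a query returning the most frequent values of one column."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    col = quote_identifier(column)
    source = quote_identifier(database) + "." + quote_identifier(table)
    return (
        "SELECT " + col + " AS col_value, COUNT(*) AS occurrences "
        "FROM " + source + " "
        "GROUP BY " + col + " "
        "ORDER BY occurrences DESC "
        "LIMIT " + str(int(limit))
    )


if __name__ == "__main__":
    print(top_values_query("sales", "orders", "status"))
```

If this query returns fewer rows than expected, the column simply has fewer distinct values than the limit.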
Why are null or invalid entries removed from generated JSON?
The connector applies validation and security filtering to remove invalid or potentially unsafe values.
Why are views not displayed after crawling?
View extraction is supported only for catalogs that support view definitions. Verify permissions and view availability.
Why does SHOW CREATE VIEW fail?
Ensure the view exists and that the configured user has permission to access the view definition.
Why are prepared statements not retrieved?
Verify workgroups exist and ensure permissions for athena:ListWorkGroups and athena:GetPreparedStatement.
Why are named queries not retrieved?
Verify permissions for athena:ListNamedQueries and athena:GetNamedQuery, and ensure named queries exist in the selected workgroups.
Why are workgroups not displayed?
Verify athena:ListWorkGroups permission and confirm the configured region is correct.
What causes "No result set metadata" errors?
The query result did not return the expected metadata. Verify successful query execution and review the API response.
Why does query result processing return unexpected values?
Verify column metadata, result-set structure, and source data consistency.
Why do governed data queries return no results?
Verify filter conditions, query syntax, and object accessibility.
Why do SUM, AVG, or STDDEV calculations fail?
These functions are supported only for numeric columns. Verify the selected column data type.
Why does Account ID retrieval fail?
Verify sts:GetCallerIdentity permission, credential validity, and AWS STS accessibility.
What should be checked when API calls frequently time out?
Review query complexity, network latency, AWS service availability, and timeout settings.
How can Athena connectivity be validated outside OvalEdge?
Use AWS CLI commands to validate Athena, AWS Glue, STS, and S3 access with the configured credentials or IAM role.
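One possible set of AWS CLI checks is assembled below. The sketch only builds and prints the commands; it does not run them. The function name validation_commands is hypothetical, and the region and S3 path in the usage example are sample values to be replaced with the connector's own settings.

```python
import shlex
from typing import List


def validation_commands(region: str, catalog: str,
                        output_location: str) -> List[List[str]]:
    """Return AWS CLI commands that exercise STS, Athena, Glue, and S3 access."""
    return [
        # Confirms the credentials resolve to an identity (STS).
        ["aws", "sts", "get-caller-identity"],
        # Confirms Athena access in the configured region.
        ["aws", "athena", "list-work-groups", "--region", region],
        ["aws", "athena", "list-databases",
         "--catalog-name", catalog, "--region", region],
        # Confirms read access to the AWS Glue Data Catalog.
        ["aws", "glue", "get-databases", "--region", region],
        # Confirms the query result location can be listed.
        ["aws", "s3", "ls", output_location],
    ]


if __name__ == "__main__":
    commands = validation_commands(
        "us-east-1", "AwsDataCatalog", "s3://bucket-name/athena/results/")
    for command in commands:
        print(shlex.join(command))
```

Run the printed commands in a shell configured with the same credentials or IAM role as the connector; a failure in any one of them points to the service whose permissions need review.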
What permissions are required for successful crawling and profiling?
The configured IAM user or role must have the required Athena, AWS Glue, S3, and STS permissions described in the Service Account User Permissions section.
Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA