# Amazon Athena

This article outlines the integration with the Amazon Athena connector, enabling streamlined metadata management through features such as crawling, profiling, querying, data preview, and lineage building (both automatic and manual).

This connector supports connectivity to Amazon Athena using the AWS SDK and enables metadata extraction for schemas, tables, columns, views, named queries, and prepared statements. It supports both IAM User Authentication and Role-Based Authentication, allowing access to Athena resources, AWS Glue Data Catalog metadata, and Amazon S3 query result locations required for crawling, profiling, and query execution.

## Overview

### Connector Details

| Connector Category | RDBMS |
| --- | --- |
| OvalEdge Release Supported | Release6.3.4 and later |
| Connectivity \[How the connection is established with Amazon Athena] | AWS SDK |
| Verified Amazon Athena Version | Athena Engine v3 |

{% hint style="info" %}
The Amazon Athena connector has been validated with the mentioned "Verified Amazon Athena Versions" and is expected to be compatible with other supported Amazon Athena versions. If there are any issues with validation or metadata crawling, please submit a support ticket for investigation and feedback.
{% endhint %}

### Connector Features

| Feature | Availability |
| -------------------------------------------- | :----------: |
| Crawling | ✅ |
| Delta Crawling | ❌ |
| Profiling | ✅ |
| Sample Profiling | ✅ |
| Query Sheet | ✅ |
| Data Preview | ✅ |
| Auto Lineage | ✅ |
| Manual Lineage | ✅ |
| Secure Authentication via Credential Manager | ✅ |
| Data Quality | ❌ |
| DAM (Data Access Management) | ❌ |
| Bridge | ✅ |

### Metadata Mapping

The following objects are crawled from Amazon Athena and mapped to the corresponding UI assets.

| Amazon Athena Object | Amazon Athena Attribute | OvalEdge Attribute | OvalEdge Category | OvalEdge Type |
| --- | --- | --- | --- | --- |
| Schema | database.name | Schema | Schemas | schema |
| Schema | database.description | Source Description | Descriptions | Source Description |
| Table | table.name | Table | Tables | table |
| Table | table.tableType (TABLE/VIEW/EXTERNAL_TABLE) | Table Data Type | Tables | table |
| Table | table.parameters.location | Table Location | Tables | table |
| Table | - (Athena doesn’t carry comments via API) | Table Comments | Descriptions | Source Description |
| Columns | column_name | Column | Table Columns | - |
| Columns | data_type | Column Type | Table Columns | - |
| Columns | ordinal_position | Column Position | Table Columns | - |
| Columns | IS_NULLABLE (YES/NO) | Nullable | Table Columns | - |
| Columns | comment (if present; often empty in Athena) | Source Description | Table Columns | - |
| Views | table.name (where tableType = VIRTUAL_VIEW) | View | Tables | view |
| Views | SHOW CREATE VIEW result | View Query | Views | View |
| Named Queries | namedQuery.name | Name | Views | other |
| Named Queries | namedQuery.queryString | View/Query Text | Views | Other |
| Prepared Statements | preparedStatement.statementName | Name | Views | other |

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection:

#### **External Supporting Files**

{% hint style="info" %}
The required external JAR files are included as part of the OvalEdge installation artifacts. For driver installation and configuration details, refer to the [Connector Drivers Setup Guide](https://docs.ovaledge.com/connectors/additional-requirements/connector-drivers-setup-guide). Please contact the OvalEdge Team for assistance related to the driver files and configuration setup.
{% endhint %}

| File Name | Description |
| --- | --- |
| athena-2.30.2.jar | Use this file when connecting to Amazon Athena using the AWS SDK |

#### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
**👨‍💻 Who can provide these permissions?** These permissions are typically granted by the Amazon Athena administrator, as users may not have the required access to assign them independently.
{% endhint %}

The IAM role/user (for example: ovaxxxge-bxxxge-xxx-xxx) must have appropriate Athena, S3, and Glue permissions.

* An admin/service account for OvalEdge Data Catalog Operations.

### Connection Configuration Steps

{% hint style="warning" %}
*Users are required to have the Connector Creator role in order to configure a new connection.*
{% endhint %}

1. Log into **OvalEdge**, go to **Administration > Connectors**, click **+ (New Connector)**, search for **Amazon Athena**, and complete the required parameters.

{% hint style="info" %}
***Note:** Fields marked with an asterisk (\*) are mandatory for establishing a connection.*
{% endhint %}

| Field Name | Description |
| --- | --- |
| Connector Type | By default, "Amazon Athena" is displayed as the selected connector type. |
| Authentication\* | Select the authentication type from the drop-down. Available options: Role based Authentication, IAM User Authentication. |


{% tabs %} {% tab title="Role based Authentication" %}

| Field Name | Description |
| --- | --- |
| Credential Manager\* | Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection. Supported Credential Managers: OE Credential Manager, AWS Secrets Manager, HashiCorp Vault, Azure Key Vault. For more details, click here. |
| License Add Ons | Select the checkbox for the Auto Lineage Add-On to build data lineage automatically. For more details, click here. |
| Connector Name\* | Enter a unique name for the connector. |
| Connector Description | Enter a brief description to describe the purpose of the connector. |
| Connector Environment | Select the environment (Example: PROD, STG) configured for the connector. For more details, click here. |
| Cross-Account Role ARN | Enter the ARN of the IAM role that allows access to the target account for establishing the connection. |
| Database Region\* | Enter the AWS region where the Amazon Athena resources and associated S3 output location are configured (for example, xx-xxx-1). |
| Catalog Name\* | Enter the name of the Data Catalog that contains the databases and tables to be crawled (default: AwsDataCatalog). |
| Output S3 Folder Path\* | Enter the Amazon S3 folder path where Athena query results are stored (for example, s3://bucket-name/athena/results/). The configured account or role must have access to this location. |

{% endtab %} {% tab title="IAM User Authentication" %}

| Field Name | Description |
| --- | --- |
| Credential Manager\* | Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection. Supported Credential Managers: OE Credential Manager, AWS Secrets Manager, HashiCorp Vault, Azure Key Vault. For more details, click here. |
| License Add Ons | Select the checkbox for the Auto Lineage Add-On to build data lineage automatically. For more details, click here. |
| Connector Name\* | Enter a unique name for the connector. |
| Connector Description | Enter a brief description to describe the purpose of the connector. |
| Connector Environment | Select the environment (Example: PROD, STG) configured for the connector. For more details, click here. |
| Access Key\* | Enter the AWS Access Key ID associated with the IAM user account used to connect to Amazon Athena. This key is used along with the Secret Key to authenticate API requests. |
| Secret Key\* | Enter the AWS Secret Access Key associated with the specified Access Key ID. This key is used to securely authenticate access to Amazon Athena services. |
| Database Region\* | Enter the AWS region where the Amazon Athena resources and associated S3 output location are configured (for example, xx-xxx-1). |
| Catalog Name\* | Enter the name of the Data Catalog that contains the databases and tables to be crawled (default: AwsDataCatalog). |
| Output S3 Folder Path\* | Enter the Amazon S3 folder path where Athena query results are stored (for example, s3://bucket-name/athena/results/). The configured account or role must have access to this location. |

{% endtab %}
{% endtabs %}

**Default Governance Roles**

| Field Name | Description |
| --- | --- |
| Default Governance Roles\* | Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection. |

**Admin Roles**

| Field Name | Description |
| --- | --- |
| Admin Roles\* | Select one or more users from the drop-down list for Integration Admin and Security & Governance Admin. All users configured in the security settings are available for selection. |

**Bridge**

| Field Name | Description |
| --- | --- |
| Select Bridge\* | If applicable, select the bridge from the drop-down list. The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules. |

2. After entering all connection details, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the Connectors home page.

## Manage Connector Operations

### Crawl/Profile

{% hint style="warning" %}
*To perform crawl and profile operations, users must be assigned the Integration Admin role.*
{% endhint %}

The **Crawl/Profile** button allows users to select one or more schemas for **crawling** and **profiling**.

1. Navigate to the **Connectors** page and click **Crawl/Profile**.
2. Select the schemas to be crawled.
3. The **Crawl** option is selected by default. To perform both operations, select the **Crawl & Profile** radio button.
4. Click **Run** to collect metadata from the connected source and load it into the **Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog > Databases** tab.

The **Schedule** checkbox allows automated crawling and profiling at defined intervals, from a minute to a year.

1. Click the **Schedule** checkbox to enable the Select Period drop-down.
2. Select a time period for the operation from the drop-down menu.
3. Click **Schedule** to initiate metadata collection from the connected source.
4. The system will automatically execute the selected operation (**Crawl** or **Crawl & Profile**) at the scheduled time.

### Other Operations

The **Connectors** page provides a centralized view of all configured connectors, along with their health status.

#### Managing connectors includes:

* **Connectors Health**: Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye icon** next to the connector name to view connector details.
#### **Nine Dots Menu Options:**

To view, edit, validate, build lineage, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector**: Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Crawler**: Configure data extraction.
  * **Profiler**: Customize data profiling rules and methods.
  * **Query Policies**: Define query execution rules based on roles.
  * **Access Instructions**: Include notes on how to access the data.
  * **Business Glossary Settings**: Manage term associations at the connector level.
  * **Others**: Configure notification recipients for metadata changes.
* **Build Lineage**: Automatically build data lineage using source code parsing.
* **Delete Connector**: Remove a connector with confirmation.

For more details on connector settings, click [here](https://docs.ovaledge.com/connectors/introduction-to-connectors/setup-and-connectivity/connector-settings).

### **Additional Information**

1. Athena restricts each account to 100 databases, and databases cannot include over 100 tables.
2. Athena DDL max query limit: 20 DDL active queries.
3. Amazon S3 bucket limit is 100 buckets per account by default – you can request to increase it up to 1,000 S3 buckets per account.

## Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

| S.No. | Error Message(s) | Error Description & Resolution |
| --- | --- | --- |
| 1 | S3 bucket does not exist<br>Invalid S3 output location | Error Description: The configured Amazon S3 output location is invalid, inaccessible, or does not exist.<br>Resolution: Verify the S3 path format is s3://bucket-name/path/. Ensure the path ends with a forward slash (/). Verify the S3 bucket exists and is accessible. Ensure the configured credentials have s3:GetBucketLocation, s3:ListBucket, and s3:PutObject permissions. Verify the S3 bucket region matches the Athena region. Validate access using AWS CLI commands. |
| 2 | Invalid credentials<br>Access Denied | Error Description: Authentication to Amazon Athena failed due to invalid credentials or insufficient permissions.<br>Resolution: Verify that the Access Key and Secret Key are valid. Verify the IAM Role ARN is correct when role-based authentication is used. Ensure the IAM user or role has the required Athena, AWS Glue, and S3 permissions. Confirm the credentials have not expired. Validate access using AWS CLI commands. |
| 3 | Unable to assume cross-account role | Error Description: The specified cross-account IAM role cannot be assumed.<br>Resolution: Verify the Role ARN format and account details. Ensure the role trust policy allows role assumption. Verify the role has the required Athena, AWS Glue, and S3 permissions. Confirm AWS STS access is enabled in the target region. Verify the session duration does not exceed the configured maximum. |
| 4 | Athena client initialization failed | Error Description: The Athena client could not be initialized due to configuration, credential, or connectivity issues.<br>Resolution: Verify the configured AWS region is valid. Confirm credentials are valid. Verify network connectivity to AWS services. Ensure all required AWS SDK dependencies are available. Review application logs for detailed initialization errors. |
| 5 | Invalid region | Error Description: The configured AWS region is invalid or unsupported.<br>Resolution: Verify the region follows the correct AWS format (for example, us-east-1). Ensure Athena is supported in the selected region. Verify the S3 output location resides in the same region. Confirm the region is enabled in the AWS account. |
| 6 | Query execution timeout | Error Description: Query execution exceeded the configured timeout limits.<br>Resolution: Optimize the query to reduce execution time. Use filters or LIMIT clauses when appropriate. Review Athena workgroup timeout settings. Verify network latency and AWS service availability. Retry the operation after validating query performance. |
| 7 | Query cancelled<br>Query failed | Error Description: Athena terminated the query due to execution failures, workgroup policies, syntax issues, or resource limitations.<br>Resolution: Review the detailed error message returned by Athena. Verify query syntax and object references. Check data scan limits and workgroup restrictions. Ensure the queried objects exist and are accessible. Review Athena and OvalEdge logs for additional details. |
| 8 | No schemas returned<br>Failed to retrieve databases | Error Description: OvalEdge could not retrieve database metadata from the configured catalog.<br>Resolution: Verify the Catalog Name is correct. Ensure the catalog exists and is accessible. Verify permissions for glue:GetDatabases and related Athena APIs. Confirm the selected region contains the catalog metadata. Review logs for API or permission-related errors. |
| 9 | Failed to retrieve tables | Error Description: Table metadata could not be retrieved from Amazon Athena.<br>Resolution: Verify the database exists and is accessible. Confirm the Catalog Name is configured correctly. Ensure permissions for athena:ListTableMetadata and glue:GetTables are granted. Review logs for metadata retrieval failures. |
| 10 | Failed to retrieve columns | Error Description: Column metadata could not be retrieved from Athena metadata tables.<br>Resolution: Verify the database and table exist. Ensure access to information_schema.columns. Confirm the required query permissions are granted. Review query execution logs for errors. |
| 11 | Failed to retrieve query results | Error Description: Query results could not be processed or returned successfully.<br>Resolution: Verify the query completed successfully in Athena. Ensure result-set metadata is available. Confirm access to the configured S3 output location. Review query execution logs and API responses for details. |
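Several of the resolutions above come down to checking the format of two inputs before a connection is attempted: the region code and the S3 output path. The following is a minimal sketch of such a pre-flight check. The function name `preflight` and both regular expressions are assumptions made for illustration; they are not the connector's own validation logic, and passing them does not prove that the bucket or region actually exists.

```python
import re

# Illustrative patterns only (assumptions, not the connector's validation rules).
REGION_RE = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")
S3_PATH_RE = re.compile(r"^s3://([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])/(.*)$")


def preflight(region: str, output_path: str) -> list[str]:
    """Return a list of problems found; an empty list means both inputs look well-formed."""
    problems = []
    if not REGION_RE.match(region):
        problems.append(
            f"Region '{region}' does not look like an AWS region code (for example, us-east-1)."
        )
    if not S3_PATH_RE.match(output_path):
        problems.append("Output S3 Folder Path must look like s3://bucket-name/path/.")
    elif not output_path.endswith("/"):
        problems.append("Output S3 Folder Path must end with a forward slash (/).")
    return problems


# Hypothetical values: the region is malformed and the path lacks the trailing slash.
for problem in preflight("useast1", "s3://bucket-name/athena/results"):
    print(problem)
```

A check like this only catches formatting mistakes early; permission and existence problems still need to be verified against AWS itself.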

## **FAQs**

Is there a step-by-step way to upgrade to the AWS Glue Data Catalog?

Yes. A step-by-step guide can be found [here](http://docs.aws.amazon.com/athena/latest/ug/glue-athena.html).

Can I run any Hive Query on Athena?

Amazon Athena uses Hive only for DDL (Data Definition Language) statements, that is, for creating, modifying, and deleting tables and partitions. For a complete list of supported statements, [click here](http://docs.aws.amazon.com/athena/latest/ug/language-reference.html). SQL queries on data in Amazon S3 run on Presto (Trino in Athena engine version 3), so you can run ANSI-compliant SQL SELECT statements to query your data in Amazon S3.

Some databases are missing after crawling. Why?

Amazon Athena retrieves databases using paginated API calls. If some databases are missing, verify IAM permissions, catalog accessibility, and review crawl logs for pagination-related messages.
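As an illustration of the paginated retrieval described above, here is a minimal sketch that follows `NextToken` until every page has been read. It assumes a boto3-style Athena client is passed in (its `list_databases` call returns `DatabaseList` and, when more pages remain, `NextToken`); the helper name `list_all_databases` is hypothetical and this is not the connector's internal code.

```python
def list_all_databases(athena_client, catalog_name="AwsDataCatalog"):
    """Collect database names from every page returned by ListDatabases."""
    names = []
    next_token = None
    while True:
        kwargs = {"CatalogName": catalog_name}
        if next_token:
            # Ask for the page that follows the one just processed.
            kwargs["NextToken"] = next_token
        page = athena_client.list_databases(**kwargs)
        names.extend(db["Name"] for db in page.get("DatabaseList", []))
        next_token = page.get("NextToken")
        if not next_token:
            # No token means the last page has been read.
            return names
```

If a loop like this stops after the first page, only the first batch of databases is seen, which is one way databases can appear to be missing.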

Why are some external tables not crawled?

External tables must contain valid input format, output format, and SerDe definitions. Tables with incomplete definitions may be excluded during crawling.

Why is the nullable status of a column incorrect?

Nullable status is derived from the IS\_NULLABLE attribute in Athena metadata. Verify the column definition in the AWS Glue Data Catalog.
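To inspect the values the nullable flag is derived from, the column metadata can be queried directly. The sketch below builds a query against `information_schema.columns` and maps the YES/NO text to a boolean. The helper names `columns_query` and `to_nullable` are hypothetical, and the exact query the connector issues is not documented here, so treat this as an approximation.

```python
def columns_query(schema: str, table: str) -> str:
    """Build a query that lists column metadata for one table."""

    def quote(value: str) -> str:
        # Double any single quote so the value is a safe SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return (
        "SELECT column_name, data_type, ordinal_position, is_nullable "
        "FROM information_schema.columns "
        f"WHERE table_schema = {quote(schema)} AND table_name = {quote(table)} "
        "ORDER BY ordinal_position"
    )


def to_nullable(is_nullable: str) -> bool:
    """Map the YES/NO text from the metadata to a boolean."""
    return is_nullable.strip().upper() == "YES"


# Hypothetical schema and table names.
print(columns_query("sales", "orders"))
```

Running the generated statement in the Athena console shows the same `is_nullable` values that a crawl would read.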

How are large query results processed?

Query results are retrieved using paginated API requests until all result pages are processed.

Why does submitQuery() display a warning for DML or DDL statements?

The connector supports query execution for SELECT statements only. DML and DDL operations are not supported.
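A coarse way to anticipate this warning is to check the leading keyword of a statement before submitting it. The sketch below is an assumption about how such a check could look, not the connector's actual rule; the function name `is_read_only` is hypothetical, and treating `WITH` as read-only is a simplification.

```python
import re


def is_read_only(sql: str) -> bool:
    """Return True when the first keyword, after comments are stripped, is SELECT or WITH."""
    no_block = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)  # drop /* ... */ comments
    no_line = re.sub(r"--[^\n]*", " ", no_block)                # drop -- comments
    match = re.match(r"[\s(]*([A-Za-z]+)", no_line)             # first word, past "(" and spaces
    return bool(match) and match.group(1).upper() in {"SELECT", "WITH"}


# Hypothetical statements.
print(is_read_only("SELECT * FROM sales.orders LIMIT 10"))
print(is_read_only("DROP TABLE sales.orders"))
```

A keyword check is only a first filter; IAM and workgroup permissions remain the real control over what a query may do.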

Why does a query remain in the RUNNING state for a long time?

Query status is monitored through periodic polling. Long-running queries may require optimization or workgroup configuration review.
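The polling pattern mentioned above can be sketched as follows. The example assumes a boto3-style Athena client whose `get_query_execution` call reports the state under `QueryExecution.Status.State`; the function name `wait_for_query` and the default interval and timeout are illustrative assumptions, not the connector's configured values.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(athena_client, query_execution_id,
                   timeout_s=300.0, poll_s=2.0, sleep=time.sleep):
    """Poll GetQueryExecution until the query finishes or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        response = athena_client.get_query_execution(QueryExecutionId=query_execution_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Query {query_execution_id} still {state} after {timeout_s} seconds"
            )
        # QUEUED or RUNNING: wait before asking again.
        sleep(poll_s)
```

A query that stays in RUNNING until the deadline surfaces as a timeout on the caller's side even though Athena may still be executing it, which is why workgroup limits and query cost are worth reviewing together.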

Why does getRowCount() return 0?

The table may not contain data, or the query may not have returned any records. Verify the source table and query execution results.

Why are some columns skipped during profiling?

Unsupported data types and columns exceeding profiling limits are automatically excluded.

Why do profiling statistics appear incorrect?

Verify the column data type supports aggregation functions and review query execution logs.

Why are some columns excluded from sample profiling?

Columns belonging to unsupported data types configured in the profiling exclusion list are omitted.

Why are Top 50 Values not displayed?

Verify that the query executed successfully and that the column contains sufficient data values.

Why are null or invalid entries removed from generated JSON?

The connector applies validation and security filtering to remove invalid or potentially unsafe values.

Why are views not displayed after crawling?

View extraction is supported only for catalogs that support view definitions. Verify permissions and view availability.

Why does SHOW CREATE VIEW fail?

Ensure the view exists and that the configured user has permission to access the view definition.

Why are prepared statements not retrieved?

Verify workgroups exist and ensure permissions for athena:ListWorkGroups and athena:GetPreparedStatement.

Why are named queries not retrieved?

Verify permissions for athena:ListNamedQueries and athena:GetNamedQuery, and ensure named queries exist in the selected workgroups.

Why are workgroups not displayed?

Verify athena:ListWorkGroups permission and confirm the configured region is correct.

What causes "No result set metadata" errors?

The query result did not return the expected metadata. Verify successful query execution and review the API response.

Why does query result processing return unexpected values?

Verify column metadata, result-set structure, and source data consistency.

Why do governed data queries return no results?

Verify filter conditions, query syntax, and object accessibility.

Why do SUM, AVG, or STDDEV calculations fail?

These functions are supported only for numeric columns. Verify the selected column data type.
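The rule above can be sketched as a type check that decides which aggregates go into a profiling query. The set of numeric type names, the helper names `is_numeric` and `profile_query`, and the output aliases are assumptions for illustration; the connector's own profiling SQL may differ, and the table name is assumed to be an already qualified identifier.

```python
# Athena type names treated as numeric in this sketch (an assumption).
NUMERIC_TYPES = {"tinyint", "smallint", "int", "integer", "bigint",
                 "real", "float", "double", "decimal"}


def is_numeric(data_type: str) -> bool:
    """True for numeric types, ignoring precision such as decimal(10,2)."""
    base = data_type.strip().lower().split("(", 1)[0]
    return base in NUMERIC_TYPES


def profile_query(table: str, column: str, data_type: str) -> str:
    """Build one profiling query; SUM, AVG, and STDDEV are added only for numeric columns."""
    col = '"' + column.replace('"', '""') + '"'  # quote the column identifier
    parts = [
        "COUNT(*) AS row_count",
        f"COUNT({col}) AS non_null_count",
        f"COUNT(DISTINCT {col}) AS distinct_count",
        f"MIN({col}) AS min_value",
        f"MAX({col}) AS max_value",
    ]
    if is_numeric(data_type):
        parts += [
            f"SUM({col}) AS sum_value",
            f"AVG({col}) AS avg_value",
            f"STDDEV({col}) AS stddev_value",
        ]
    return f"SELECT {', '.join(parts)} FROM {table}"


# Hypothetical table and columns.
print(profile_query("sales.orders", "amount", "decimal(10,2)"))
print(profile_query("sales.orders", "customer_name", "varchar"))
```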

Why does Account ID retrieval fail?

Verify sts:GetCallerIdentity permission, credential validity, and AWS STS accessibility.

What should be checked when API calls frequently time out?

Review query complexity, network latency, AWS service availability, and timeout settings.

How can Athena connectivity be validated outside OvalEdge?

Use AWS CLI commands to validate Athena, AWS Glue, STS, and S3 access with the configured credentials or IAM role.
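The same checks can be scripted with the AWS SDK for Python instead of the CLI. The sketch below makes one read-only call per service and reports which ones succeed. It assumes boto3-style clients are passed in; the function name `check_access` and the choice of calls are illustrative, and a passing result does not cover every permission the connector needs.

```python
def check_access(sts, athena, glue, s3, bucket):
    """Run one read-only call per service and report which ones succeed."""
    checks = {
        "sts:GetCallerIdentity": lambda: sts.get_caller_identity(),
        "athena:ListWorkGroups": lambda: athena.list_work_groups(),
        "glue:GetDatabases": lambda: glue.get_databases(),
        "s3:GetBucketLocation": lambda: s3.get_bucket_location(Bucket=bucket),
    }
    results = {}
    for name, call in checks.items():
        try:
            call()
            results[name] = "ok"
        except Exception as exc:
            # Record the failure and continue, so every service is reported.
            results[name] = f"failed: {exc}"
    return results


# Usage, assuming boto3 is installed and credentials are configured
# (the region and bucket name are placeholders):
#   import boto3
#   session = boto3.Session(region_name="us-east-1")
#   clients = [session.client(n) for n in ("sts", "athena", "glue", "s3")]
#   print(check_access(*clients, bucket="bucket-name"))
```

Running the checks with the same credentials or role that OvalEdge uses helps separate AWS-side permission problems from connector configuration problems.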

What permissions are required for successful crawling and profiling?

The configured IAM user or role must have the required Athena, AWS Glue, S3, and STS permissions described in the Service Account User Permissions section.

***

Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA

| Operation | Objects | AWS Athena System APIs / Objects | Access Permissions |
| --- | --- | --- | --- |
| Crawling | Schema (Databases) | athena:ListDatabases, glue:GetDatabases | athena:ListDatabases, glue:GetDatabase |
| Validation | S3 Bucket | s3:HeadBucket, s3:ListBucket, s3:HeadObject | s3:HeadBucket, s3:ListBucket, s3:HeadObject |
| Crawling | Tables | athena:ListTableMetadata, glue:GetTables | athena:ListTableMetadata, glue:GetTable |
| Crawling | Table Columns | athena:StartQueryExecution, athena:GetQueryResults, information_schema.columns | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Crawling & Lineage Building | Views | athena:StartQueryExecution, athena:GetQueryResults, SHOW CREATE VIEW | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Crawling & Lineage Building | External Tables | athena:ListTableMetadata (table parameters) | athena:ListTableMetadata |
| Crawling & Lineage Building | Named Queries | athena:ListNamedQueries, athena:GetNamedQuery, athena:ListWorkGroups | athena:ListNamedQueries, athena:GetNamedQuery, athena:ListWorkGroups |
| Crawling & Lineage Building | Prepared Statements | athena:ListPreparedStatements, athena:GetPreparedStatement, athena:ListWorkGroups | athena:ListPreparedStatements, athena:GetPreparedStatement, athena:ListWorkGroups |
| Profiling | Row Count | athena:StartQueryExecution, athena:GetQueryResults, SELECT COUNT(\*) | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Profiling | Data Profiling – Top Values | athena:StartQueryExecution, athena:GetQueryResults, SELECT ... GROUP BY ... ORDER BY ... LIMIT | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Profiling | Sample Data | athena:StartQueryExecution, athena:GetQueryResults, SELECT \* ... LIMIT | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Profiling | Non-Null Count | athena:StartQueryExecution, athena:GetQueryResults, SELECT COUNT(\*) WHERE column IS NOT NULL | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Profiling | Max / Min / Distinct Count | athena:StartQueryExecution, athena:GetQueryResults, SELECT MAX(), MIN(), COUNT(DISTINCT) | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Data Access & Governance | Governed Data Query Execution | athena:StartQueryExecution, athena:GetQueryResults, SELECT ... WHERE ... | athena:StartQueryExecution, athena:GetQueryResults, s3:GetObject |
| Data Access & Query Execution | Data Query Execution (Async) | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults, s3:GetObject |
| Data Access & Query Execution | Data Query Execution (Real-Time) | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults | athena:StartQueryExecution, athena:GetQueryExecution, athena:GetQueryResults, s3:GetObject |
| Connection Validation | Connection Validation | s3:ListBucket, s3:GetBucketLocation, athena:ListWorkGroups | s3:ListBucket, s3:GetBucketLocation, athena:ListWorkGroups |
| All Operations | S3 Output Location | s3:ListBucket, s3:GetBucketLocation, s3:GetObject | s3:ListBucket, s3:GetBucketLocation, s3:GetObject |
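As a starting point for the service account, the permissions in the table above can be assembled into an IAM policy document. The sketch below is only that: the action names are copied verbatim from the Access Permissions column, the function name `build_policy` and the `Sid` values are hypothetical, and `Resource` is left as `*` purely as a placeholder that should be narrowed to the actual workgroups, catalog, and buckets. Each action name should be checked against the IAM reference before use, since IAM action names do not always match API operation names.

```python
import json

# Copied from the "Access Permissions" column; verify each name against the IAM reference.
ACTIONS = [
    "athena:ListDatabases", "athena:ListTableMetadata", "athena:ListWorkGroups",
    "athena:ListNamedQueries", "athena:GetNamedQuery",
    "athena:ListPreparedStatements", "athena:GetPreparedStatement",
    "athena:StartQueryExecution", "athena:GetQueryExecution", "athena:GetQueryResults",
    "glue:GetDatabase", "glue:GetTable",
    "s3:HeadBucket", "s3:HeadObject", "s3:ListBucket",
    "s3:GetBucketLocation", "s3:GetObject",
]


def build_policy(actions, resource="*"):
    """Group actions by service prefix into one Allow statement per service."""
    by_service = {}
    for action in sorted(set(actions)):
        by_service.setdefault(action.split(":", 1)[0], []).append(action)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": f"OvalEdge{service.capitalize()}",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": service_actions,
                "Resource": resource,  # placeholder; scope this down in practice
            }
            for service, service_actions in sorted(by_service.items())
        ],
    }


print(json.dumps(build_policy(ACTIONS), indent=2))
```

The table does not list `s3:PutObject`, although the troubleshooting guidance for the S3 output location mentions it, so the generated document may need to be extended for the query results bucket.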