# Amazon S3

This article describes how to connect OvalEdge to Amazon S3 using the Amazon S3 connector. The connector supports crawling of files and folders, profiling, and data preview. It also extracts metadata from multiple file formats, including CSV, XLSX, XLS, PARQUET, ORC, JSON, YAML, TXT, and PIP files.

This connector uses the AWS S3 SDK to establish connectivity with Amazon S3 and supports Role-Based Authentication and IAM User Authentication for accessing buckets, folders, and files.

<figure><img src="/files/vxBFQvKmYHYQq6n13ErA" alt=""><figcaption></figcaption></figure>

## **Overview**

### **Connector Details**

| Connector Category                                                       | Cloud Storage |
| ------------------------------------------------------------------------ | ------------- |
| OvalEdge Release Current Connector Version                               | 6.3.4         |
| <p>Connectivity</p><p><em>\[How OvalEdge connects to Amazon S3]</em></p> | AWS S3 SDK    |
| <p>OvalEdge Releases Supported</p><p>(Available from)</p>                | Release 4.0   |

### **Connector Features**

| Feature                                      | Availability |
| -------------------------------------------- | :----------: |
| Crawling / Cataloging                        |       ✅      |
| Delta Crawling                               |       ❌      |
| Profiling\*                                  |       ✅      |
| Sample Profiling                             |       ✅      |
| Query Sheet                                  |      NA      |
| Data Preview                                 |       ✅      |
| Auto Lineage                                 |      NA      |
| Manual Lineage                               |       ✅      |
| Secure Authentication via Credential Manager |       ✅      |
| Data Quality                                 |       ✅      |
| DAM (Data Access Management)                 |       ✅      |
| Bridge                                       |       ✅      |

{% hint style="info" %}
"NA" indicates that the respective feature is 'Not Applicable.'
{% endhint %}

{% hint style="info" %}
\*Full profiling is supported through DuckDB. To enable this capability, set the system setting `enable.duckdb` to **True**.
{% endhint %}

### **Metadata Mapping**

The following objects are crawled from Amazon S3 and mapped to the corresponding UI assets.

<table><thead><tr><th width="170.66668701171875">Amazon S3 Object</th><th width="182.8333740234375">Amazon S3 Attribute</th><th width="182.6666259765625">OvalEdge Attribute</th><th width="180.5">OvalEdge Category</th><th width="177.1666259765625">OvalEdge Type</th></tr></thead><tbody><tr><td>Bucket</td><td>Folder</td><td>Folder</td><td>Folder</td><td>Folder</td></tr><tr><td>File</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>XLSX</td><td>Folder(subfile)</td><td>Folder(subfile)</td><td>Folder(subfile)</td><td>Folder(subfile)</td></tr><tr><td>XLS</td><td>Folder(subfile)</td><td>Folder(subfile)</td><td>Folder(subfile)</td><td>Folder(subfile)</td></tr><tr><td>CSV</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>TXT</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>PARQUET</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>ORC</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>JSON</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>YAML</td><td>File</td><td>File</td><td>File</td><td>File</td></tr><tr><td>PIP</td><td>File</td><td>File</td><td>File</td><td>File</td></tr></tbody></table>

## **Set up a Connection**

### **Prerequisites**

The following are the prerequisites to establish a connection:

Ensure that the CSV files follow the required formatting standards for proper data processing and visibility. Refer to [CSV Format Requirements](https://docs.ovaledge.com/connectors/additional-requirements/csv-format-requirements-for-file-connectors).

#### **File Naming Convention**

Ensure that all file names within the selected folders and subfolders follow the supported naming convention and do not contain the NTFS Alternate Data Stream (ADS) separator character (:).

{% hint style="warning" %}
Files with unsupported naming patterns may be skipped during Catalog and Profile operations. To ensure successful execution and complete metadata extraction, rename such files before triggering the job.
{% endhint %}
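The naming rule above can be checked before a job is triggered. The following is a minimal sketch, assuming the object keys have already been listed from the bucket. The function name `find_unsupported_keys` and the sample keys are hypothetical, and only the colon rule stated above is checked; any other naming rules the connector applies are not covered.

```python
from typing import Iterable, List

# The only rule checked here is the one stated above: no colon (:) in a file name.
ADS_SEPARATOR = ":"


def find_unsupported_keys(keys: Iterable[str]) -> List[str]:
    """Return the object keys whose file name (last path segment) contains a colon."""
    unsupported = []
    for key in keys:
        file_name = key.rsplit("/", 1)[-1]  # text after the last "/" in the key
        if ADS_SEPARATOR in file_name:
            unsupported.append(key)
    return unsupported


if __name__ == "__main__":
    # Hypothetical keys for illustration only.
    sample_keys = [
        "sales/2024/orders.csv",
        "sales/2024/orders:backup.csv",
        "hr/employees.parquet",
    ]
    for key in find_unsupported_keys(sample_keys):
        print(f"Rename before cataloging: {key}")
```

Files reported by a check like this can be renamed at the source before running Catalog or Profile.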

#### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
👨‍💻**Who can provide these permissions?** These permissions are typically granted by the Amazon S3 administrator, as users may not have the required access to assign them independently.
{% endhint %}

<table><thead><tr><th width="247.3333740234375">Objects</th><th>Access Permission</th></tr></thead><tbody><tr><td>Buckets</td><td><p>ListAllMyBuckets</p><p>GetBucketLocation</p><p>GetBucketTagging</p><p>GetEncryptionConfiguration</p></td></tr><tr><td>Folder</td><td><p>ListBucket</p><p>GetBucketLocation</p><p>GetEncryptionConfiguration</p></td></tr><tr><td>Files</td><td><p>ListBucket</p><p>GetBucketLocation</p><p>GetEncryptionConfiguration</p></td></tr><tr><td>Profile</td><td>GetObject</td></tr></tbody></table>
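As an illustration, the permissions in the table could be expressed as an IAM policy document along the following lines. This is a sketch under stated assumptions, not an official OvalEdge policy: the bucket name `example-bucket` is hypothetical, the `s3:` prefix is the standard IAM form of the action names listed above, and the resource scoping (bucket-level versus object-level) is one reasonable choice that the Amazon S3 administrator may tighten or adjust.

```python
import json


def build_read_policy(bucket: str) -> dict:
    """Build an IAM policy document covering the permissions listed in the table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing all buckets is an account-level action.
                "Sid": "ListBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": "*",
            },
            {
                # Bucket-level actions used for buckets, folders, and files.
                "Sid": "BucketMetadata",
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:GetBucketTagging",
                    "s3:GetEncryptionConfiguration",
                ],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level read access, needed for profiling.
                "Sid": "ProfileObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a hypothetical bucket name.
    print(json.dumps(build_read_policy("example-bucket"), indent=2))
```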

#### **Cross-Account Role Prerequisites**

For cross-account access, ensure the following configurations are completed:

* The target AWS IAM role must include a trust policy that allows the source account or IAM principal to assume the role.
* The IAM principal or role used by the application must have the `sts:AssumeRole` permission for the target role.
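The two requirements above correspond to two separate policy documents: a trust policy on the target role and an identity policy on the source principal. The sketch below builds both as plain dictionaries. The ARNs and function names are hypothetical placeholders, and real setups may add conditions (for example, an external ID) that are not shown here.

```python
import json

# Hypothetical ARNs for illustration only; replace with real values.
SOURCE_PRINCIPAL_ARN = "arn:aws:iam::111111111111:role/ovaledge-app"
TARGET_ROLE_ARN = "arn:aws:iam::222222222222:role/s3-metadata-reader"


def build_trust_policy(source_principal_arn: str) -> dict:
    """Trust policy attached to the TARGET role: states who may assume it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": source_principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_assume_role_policy(target_role_arn: str) -> dict:
    """Identity policy attached to the SOURCE principal: states which role it may assume."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": target_role_arn,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(SOURCE_PRINCIPAL_ARN), indent=2))
    print(json.dumps(build_assume_role_policy(TARGET_ROLE_ARN), indent=2))
```

Both documents are needed: if either side is missing, the role cannot be assumed.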

#### **SSE-KMS Bucket Permissions**

For Amazon S3 buckets encrypted using SSE-KMS, additional AWS KMS permissions are required.

Along with `s3:GetObject`, the IAM principal or assumed role must also have the following permissions on the associated Customer Managed Key (CMK):

* `kms:Decrypt`
* `kms:DescribeKey` (if applicable)
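A policy statement granting these KMS permissions might look like the sketch below. The key ARN and the function name are hypothetical; `kms:DescribeKey` is controlled by a flag because the text above marks it as needed only where applicable.

```python
import json
from typing import Dict, List, Union


def build_kms_statement(
    key_arn: str, include_describe_key: bool = True
) -> Dict[str, Union[str, List[str]]]:
    """Build one IAM policy statement for reading objects encrypted with SSE-KMS."""
    actions = ["kms:Decrypt"]
    if include_describe_key:
        # Optional, per the "(if applicable)" note above.
        actions.append("kms:DescribeKey")
    return {
        "Sid": "DecryptWithCustomerManagedKey",
        "Effect": "Allow",
        "Action": actions,
        "Resource": key_arn,
    }


if __name__ == "__main__":
    # Hypothetical key ARN for illustration only.
    example_key_arn = "arn:aws:kms:us-east-1:111111111111:key/EXAMPLE-KEY-ID"
    print(json.dumps(build_kms_statement(example_key_arn), indent=2))
```

A statement like this would be added to the same policy that grants `s3:GetObject`, scoped to the key that encrypts the bucket.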

### **Connection Configuration Steps**

{% hint style="warning" %}
Users are required to have the Connector Creator role in order to configure a new connection.
{% endhint %}

1. Log into **OvalEdge**, go to **Administration > Connectors**, click **+ (New Connector)**, search for **Amazon S3**, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (**\***) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="219">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type</td><td>By default, "Amazon S3" is displayed as the selected connector type.</td></tr><tr><td>Authentication<strong>*</strong></td><td><p>The following two types of authentication are supported for Amazon S3:</p><ul><li>Role Based Authentication (Default)</li><li>IAM User Authentication</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="Role Based Authentication" %}

<table><thead><tr><th width="209.8333740234375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul></td></tr><tr><td>License Add Ons</td><td><p> </p><ul><li>Select the checkbox for <strong>Data Quality Add-On</strong> to identify data quality issues using data quality rules and anomaly detection.</li><li>Select the checkbox for <strong>Data Access Add-On</strong> to enable the data access functionality.</li></ul></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the Amazon S3 connection              </p><p>(Example: "AmazonS3db").</p></td></tr><tr><td>Connector Description</td><td>Enter a brief summary or details about the connector.</td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Cross-Account Role ARN</td><td>Enter the ARN (Amazon Resource Name) of the role used for cross-account access.</td></tr><tr><td>Filter by tags</td><td>Enter one or more tags to narrow down and display only the items associated with those tags.</td></tr><tr><td>Region</td><td>Enter the region where the Amazon S3 files or resources are located.</td></tr></tbody></table>
{% endtab %}

{% tab title="IAM User Authentication" %}

<table><thead><tr><th width="209.8333740234375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Credential Manager*</td><td><p>Select the desired credentials manager from the drop-down list. Relevant parameters will be displayed based on the selection.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp</li><li>Azure Key Vault</li></ul></td></tr><tr><td>License Add Ons</td><td><p> </p><ul><li>Select the checkbox for Data Quality Add-On to identify data quality issues using data quality rules and anomaly detection.</li><li>Select the checkbox for Data Access Add-On to enable the data access functionality.</li></ul></td></tr><tr><td>Auto Lineage</td><td>Not Supported</td></tr><tr><td>Data Quality</td><td>Supported</td></tr><tr><td>Data Access</td><td>Supported</td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the Amazon S3 connection              </p><p>(Example: "AmazonS3db").</p></td></tr><tr><td>Connector Description</td><td>Enter a brief summary or details about the connector.</td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Access key*</td><td>Enter the AWS Access Key ID used to authenticate the IAM user.</td></tr><tr><td>Secret key*</td><td>Enter the AWS Secret Access Key associated with the Access Key ID.</td></tr><tr><td>Filter by tags</td><td>Enter one or more tags to narrow down and display only the items associated with those tags.</td></tr><tr><td>Region</td><td>Enter the region where the Amazon S3 files or resources are located.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}

**Default Governance Roles**

<table data-header-hidden><thead><tr><th width="219.8333740234375"></th><th></th></tr></thead><tbody><tr><td>Default Governance Roles<strong>*</strong></td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**Admin Roles**

<table data-header-hidden><thead><tr><th width="219.8333740234375"></th><th></th></tr></thead><tbody><tr><td>Admin Roles<strong>*</strong></td><td>Select one or more users from the dropdown list for Integration Admin and Security &#x26; Governance Admin. All users configured in the security settings are available for selection.</td></tr></tbody></table>

**No of Archive Objects**

<table data-header-hidden><thead><tr><th width="219.83331298828125"></th><th></th></tr></thead><tbody><tr><td>No Of Archive Objects<strong>*</strong></td><td><p>This shows the number of recent metadata changes to a dataset at the source. By default, it is off. To enable it, toggle the Archive button and specify the number of objects to archive.</p><p><strong>Example</strong>: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.</p></td></tr></tbody></table>

**Bridge**

<table data-header-hidden><thead><tr><th width="219.8333740234375"></th><th></th></tr></thead><tbody><tr><td>Select Bridge<strong>*</strong></td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges that have been configured. These bridges facilitate communication between data sources and the system without requiring changes to firewall rules.</p></td></tr></tbody></table>

2. After entering all connection details, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the Connectors home page.
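Before clicking **Validate**, the value entered in the **Cross-Account Role ARN** field can be sanity-checked for shape. The sketch below is a hypothetical helper, not part of OvalEdge: it checks the general IAM role ARN layout (`arn:<partition>:iam::<12-digit account ID>:role/<role name>`) and extracts the account ID and role name. It does not confirm that the role exists or can be assumed.

```python
import re
from typing import NamedTuple, Optional

# General shape of an IAM role ARN; the partitions listed are the standard AWS ones.
ROLE_ARN_PATTERN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):role/([\w+=,.@/-]+)$"
)


class RoleArn(NamedTuple):
    partition: str
    account_id: str
    role_name: str


def parse_role_arn(arn: str) -> Optional[RoleArn]:
    """Return the parts of a role ARN, or None if the text is not shaped like one."""
    match = ROLE_ARN_PATTERN.match(arn.strip())
    if match is None:
        return None
    return RoleArn(*match.groups())


if __name__ == "__main__":
    # Hypothetical ARN for illustration only.
    parsed = parse_role_arn("arn:aws:iam::222222222222:role/s3-metadata-reader")
    print(parsed)
```

Checking the extracted account ID against the intended AWS account can help catch a mistyped ARN early.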

## **Manage Connector Operations**

### **Crawl/Profile**

{% hint style="info" %}
To perform crawl and profile operations, users must be assigned the Integration Admin role.
{% endhint %}

1. Navigate to the **Connectors** page and click **Crawl/Profile.**
2. This action initiates the metadata collection process from the data source and loads the retrieved metadata into the **File Manager > File Explorer.**
3. In the File Manager, click the connector name, select the specific **folder(s) or file(s)**, then click **Catalog / Catalog and Profile** from the **Nine Dots** menu. For more details, click [here](https://docs.ovaledge.com/release8.1/file-manager/file-explorer).

{% hint style="info" %}
Profiling is supported only at the individual file level through the File Nine Dots menu in File Manager. File columns are fetched into the system only after the profiling process has been successfully completed.
{% endhint %}

4. The selected files or folders will be added to the **Data Catalog > Files/File Columns** tab.

#### **Other Operations**

The **Connectors** page provides a centralized view of all configured connectors, along with their health status.

**Managing connectors includes:**

* **Connectors Health**: Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye icon** next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options**:

To view, edit, validate, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector**: Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Crawler**: Configure data extraction.
  * **Access Instructions**: Add notes on how data can be accessed.
  * **Business Glossary Settings**: Manage term associations at the connector level.
  * **Anomaly Detection Settings**: Configure anomaly detection preferences at the connector level.
  * **Others**: Configure notification recipients for metadata changes.
* **Delete Connector:** Remove a connector with confirmation.

For more details, click [here](https://docs.ovaledge.com/connectors/introduction-to-connectors/setup-and-connectivity/connector-settings).

## **Connectivity Troubleshooting**

If incorrect parameters are entered, error messages may appear. Ensure all inputs are accurate to resolve these issues. If issues persist, contact the assigned support team.

<table><thead><tr><th width="91.19696044921875">S.No.</th><th width="317.04547119140625">Error Message(s)</th><th>Error Description/Resolution</th></tr></thead><tbody><tr><td>1</td><td>Error while validating connection: Please provide valid credentials: The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: 73GVA0Y9H15Q5K7G; S3 Extended Request ID: jmNMT5vyMU9kEiT68EgfY6IYRwTdvzSh+51qL/6IzxpguBCYe7e1JOJYLpbHOl1t2mqyKlmArTw=; Proxy: null)</td><td><p><strong>Error Description:</strong> Invalid Access Key</p><p><strong>Resolution:</strong></p><ul><li>Verify that the configured AWS Access Key ID is correct and active.</li><li>Ensure that the access key belongs to the intended AWS account.</li><li>Update the connection with a valid access key and revalidate the connection.</li></ul></td></tr><tr><td>2</td><td>Error while validating connection: Please provide valid credentials: The request signature we calculated does not match the signature you provided. Check your key and signing method. If you start to see this issue after you upgrade the SDK to 1.12.460 or later, it could be because the bucket provided contains '/'. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: NWGSQ9BDSZ2A3H5H; S3 Extended Request ID: 319yH7h/x76swRiPpjxjs8KB/6dLrdGHrrAJs9rD2/HgQWudiMCQJMzj1ItUQAJ1zEsVm/YsCbU=; Proxy: null)</td><td><p><strong>Error Description:</strong> Invalid Secret Key</p><p><strong>Resolution:</strong></p><ul><li>Verify that the configured AWS Secret Access Key is correct.</li><li>Ensure that only the bucket name is provided without any folder path or prefix.</li><li>Revalidate the connection after updating the secret key or bucket configuration.</li></ul><p><strong>Note</strong>: With AWS SDK version <code>1.12.460</code> and later, entering a bucket value that includes a forward slash (<code>/</code>) or path prefix may result in a <code>SignatureDoesNotMatch</code> error.</p></td></tr><tr><td>3</td><td>Error while validating connection: Exception while fetching AWSCredentialsProvider : User: arn:aws:iam::479930578883:user/connector_testing is not authorized to perform: sts: AssumeRole on resource: arn:aws:iam::479930578883:role/airflow_MWAA (Service: AWSSecurityTokenService; Status Code: 403; Error Code: AccessDenied; Request ID: 6bd3e40e-6e9c-43e9-8f51-e631727b6afe; Proxy: null)</td><td><p><strong>Error Description:</strong> Missing <code>sts:AssumeRole</code> permission for cross-account role authentication.</p><p><strong>Resolution:</strong></p><ul><li>Verify that the IAM user or role has permission to perform <code>sts:AssumeRole</code>.</li><li>Ensure that the target IAM role trust relationship is configured correctly.</li><li>Update the required permissions and revalidate the connection.</li></ul></td></tr><tr><td>4</td><td>Error while validating connection: Incorrect Account ID!</td><td><p><strong>Error Description:</strong> Invalid account ID</p><p><strong>Resolution:</strong></p><ul><li>Verify that the configured AWS Account ID is correct.</li><li>Ensure that the account ID matches the configured AWS environment.</li><li>Update the account ID and validate the connection again.</li></ul></td></tr></tbody></table>
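Each AWS-originated message in the table carries an `Error Code:` field that identifies the failure. As a rough illustration, the sketch below extracts that code from a validation message and maps it to a short hint summarizing the resolutions above. The function names and hint wording are hypothetical. An `AccessDenied` code can have causes other than a missing `sts:AssumeRole` permission, so the hint is a starting point only.

```python
import re
from typing import Optional

# Matches the "Error Code: <code>" field found in AWS SDK error messages.
ERROR_CODE_PATTERN = re.compile(r"Error Code:\s*(\w+)")

# Hints summarizing the resolutions listed in the table above.
RESOLUTION_HINTS = {
    "InvalidAccessKeyId": "Check that the AWS Access Key ID is correct, active, "
    "and belongs to the intended AWS account.",
    "SignatureDoesNotMatch": "Check the AWS Secret Access Key and provide only "
    "the bucket name, without a folder path or prefix.",
    "AccessDenied": "Check the sts:AssumeRole permission and the trust "
    "relationship of the target IAM role.",
}


def extract_error_code(message: str) -> Optional[str]:
    """Return the AWS error code embedded in a message, or None if absent."""
    match = ERROR_CODE_PATTERN.search(message)
    return match.group(1) if match else None


def suggest_resolution(message: str) -> str:
    """Map a validation error message to a short troubleshooting hint."""
    code = extract_error_code(message)
    if code in RESOLUTION_HINTS:
        return RESOLUTION_HINTS[code]
    if "Incorrect Account ID" in message:
        return "Check that the AWS Account ID matches the configured AWS environment."
    return "No known hint; recheck all connection parameters or contact support."


if __name__ == "__main__":
    sample = (
        "Error while validating connection: Please provide valid credentials "
        "(Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId)"
    )
    print(suggest_resolution(sample))
```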

***

Copyright © 2026, OvalEdge LLC, Peachtree Corners GA USA


