# BigQuery

This article describes the BigQuery connector integration, which enables streamlined metadata management through crawling, profiling, querying, data preview, and lineage building (both automatic and manual). It also supports secure authentication via Credential Manager.

<figure><img src="https://1813356899-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FhTnkoJQml0pok9awFDhx%2Fuploads%2Fe7i0I4N3F9oMlub2DWiI%2Funknown.png?alt=media&#x26;token=191fe49c-22c5-45e6-b99f-9311b704b7f8" alt=""><figcaption></figcaption></figure>

## Overview

### Connector Details

<table data-header-hidden><thead><tr><th width="407.333251953125"></th><th></th></tr></thead><tbody><tr><td>Connector Category</td><td>Data Warehouse</td></tr><tr><td>Connector Version</td><td>Release 7.2</td></tr><tr><td>Releases Supported (Available from)</td><td>Release 6.3.4.x</td></tr><tr><td><p>Connectivity</p><p>[How the connection is established with BigQuery]</p></td><td>JDBC and SDK</td></tr></tbody></table>

### Connector Features

| Feature                                      | Availability |
| -------------------------------------------- | :----------: |
| Crawling                                     |       ✅      |
| Delta Crawling                               |       ❌      |
| Profiling                                    |       ✅      |
| Query Sheet                                  |       ✅      |
| Data Preview                                 |       ✅      |
| Auto Lineage                                 |       ✅      |
| Manual Lineage                               |       ✅      |
| Secure Authentication via Credential Manager |       ✅      |
| Data Quality                                 |       ❌      |
| DAM (Data Access Management)                 |       ❌      |
| Bridge                                       |       ✅      |

### Metadata Mapping

The following objects are crawled from BigQuery and mapped to the corresponding UI assets.

<table><thead><tr><th width="231.6666259765625">BigQuery Object</th><th width="198">BigQuery Attribute</th><th width="190.666748046875">OvalEdge Attribute</th><th width="181.3333740234375">OvalEdge Category</th><th width="170.666748046875">OvalEdge Type</th></tr></thead><tbody><tr><td>Project</td><td>Project</td><td>Database name</td><td>Database</td><td>Database</td></tr><tr><td>Dataset / Schema</td><td>Dataset Id / Schema Name</td><td>Schema name</td><td>Schemas</td><td>Schema</td></tr><tr><td>Table</td><td>Table Name</td><td>Table name</td><td>Tables</td><td>Table</td></tr><tr><td>Table</td><td>Table Type</td><td>Table Type</td><td>Tables</td><td>Table</td></tr><tr><td>Columns</td><td>column_name</td><td>Column name</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Columns</td><td>data_type</td><td>Column Type</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Views</td><td>Table_Name</td><td>View name</td><td>Views</td><td>Table</td></tr><tr><td>Views</td><td>View_Definition</td><td>Query</td><td>Views</td><td>Table</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Routine_Name</td><td>Routine name</td><td>Procedures/ Functions</td><td>Code</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Routine_Definition</td><td>Query</td><td>Procedures/ Functions</td><td>Code</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Routine_Type</td><td>Job Type</td><td>Procedures/ Functions</td><td>Code</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Created</td><td>Created date</td><td>Procedures/ Functions</td><td>Code</td></tr></tbody></table>
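The mapping above is driven by BigQuery's `INFORMATION_SCHEMA` views (the "Sys Tables" listed later in this article). As a rough illustration only, not the connector's actual implementation, the metadata queries behind each object type could be sketched as follows; `project` and `dataset` are hypothetical placeholders:

```python
def crawl_queries(project: str, dataset: str) -> dict:
    """Return one illustrative metadata query per crawled object type."""
    scope = f"`{project}.{dataset}`"
    return {
        # Schemas (datasets) are listed at the project level.
        "schemas": f"SELECT schema_name FROM `{project}`.INFORMATION_SCHEMA.SCHEMATA",
        "tables": f"SELECT table_name, table_type FROM {scope}.INFORMATION_SCHEMA.TABLES",
        "columns": f"SELECT column_name, data_type FROM {scope}.INFORMATION_SCHEMA.COLUMNS",
        "views": f"SELECT table_name, view_definition FROM {scope}.INFORMATION_SCHEMA.VIEWS",
        "routines": (
            f"SELECT routine_name, routine_definition, routine_type, created "
            f"FROM {scope}.INFORMATION_SCHEMA.ROUTINES"
        ),
    }

# Example with placeholder identifiers:
queries = crawl_queries("my-project", "my_dataset")
```

Each result column here corresponds to a "BigQuery Attribute" row in the mapping table above.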

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection:

#### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
👨‍💻 Who can provide these permissions? They are typically granted by the BigQuery administrator, as users may not have the required access to assign them independently.
{% endhint %}

<table><thead><tr><th width="121.6666259765625">Objects</th><th width="166.6666259765625">Sys Tables</th><th width="286.333251953125">Role</th><th width="222.3333740234375">Access Permissions</th></tr></thead><tbody><tr><td>Schemas</td><td>Information_Schema.Schemata</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataEditor</p><p>roles/bigquery.dataOwner</p><p>roles/bigquery.dataViewer</p></td><td>bigquery.datasets.get</td></tr><tr><td>Tables</td><td>Information_Schema.Tables</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataViewer</p><p>roles/bigquery.metadataViewer</p></td><td><p>bigquery.tables.get</p><p>bigquery.tables.list</p></td></tr><tr><td>Columns</td><td>Information_Schema.Columns</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataViewer</p><p>roles/bigquery.dataEditor</p><p>roles/bigquery.metadataViewer</p></td><td><p>bigquery.tables.get</p><p>bigquery.tables.list</p></td></tr><tr><td>Views</td><td>Information_Schema.Views</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataEditor</p><p>roles/bigquery.metadataViewer</p><p>roles/bigquery.dataViewer</p></td><td><p>bigquery.tables.get</p><p>bigquery.tables.list</p></td></tr><tr><td>Routines (Procedures / Functions)</td><td>Information_Schema.Routines</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.metadataViewer</p><p>roles/bigquery.dataViewer</p></td><td><p>bigquery.routines.get</p><p>bigquery.routines.list</p></td></tr></tbody></table>
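As an illustrative sketch (not part of the product), the minimum access permissions in the table above can be expressed as data, which makes it easy to see what a given service account is still missing. In practice, the granted set could come from Google Cloud IAM's `testIamPermissions` check:

```python
# Minimum permissions per crawled object type, taken from the table above.
REQUIRED_PERMISSIONS = {
    "schemas": {"bigquery.datasets.get"},
    "tables": {"bigquery.tables.get", "bigquery.tables.list"},
    "columns": {"bigquery.tables.get", "bigquery.tables.list"},
    "views": {"bigquery.tables.get", "bigquery.tables.list"},
    "routines": {"bigquery.routines.get", "bigquery.routines.list"},
}

def missing_permissions(granted: set) -> dict:
    """Return, per object type, the required permissions not yet granted."""
    return {
        obj: sorted(required - granted)
        for obj, required in REQUIRED_PERMISSIONS.items()
        if required - granted
    }

# Example: an account that can only read table metadata.
gaps = missing_permissions({"bigquery.tables.get", "bigquery.tables.list"})
```

In this example, `gaps` reports that dataset and routine permissions are still required before schemas and procedures/functions can be crawled.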

### Connection Configuration Steps

{% hint style="warning" %}
Users must have the Connector Creator role to configure a new connection.
{% endhint %}

1. Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for BigQuery, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (\*) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="219.3333740234375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type</td><td>By default, "BigQuery" is displayed as the selected connector type.</td></tr><tr><td>Credential Manager*</td><td><p>Select the desired Credential Manager from the drop-down list. Relevant parameters will be displayed based on the selection.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp Vault</li><li>Azure Key Vault</li></ul></td></tr><tr><td>Server*</td><td>By default, the BigQuery server URL (https://bigquery.googleapis.com) is pre-populated. If required, update the field with the BigQuery server endpoint or IP address (e.g., xxxx-xxxx.xxxx4ijtzasl.xx-south-1.rds.xxxxx.com or 1xx.xxx.1.x0).</td></tr><tr><td>License Add Ons</td><td><ul><li>Select the checkbox for the <strong>Auto Lineage</strong> Add-On to build data lineage automatically.</li><li>Select the checkbox for the <strong>Data Quality</strong> Add-On to identify data quality issues using data quality rules and anomaly detection.</li></ul></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the BigQuery connection (Example: "BigQuery_Prod").</p></td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Billing Project ID</td><td>Enter the Billing Project ID used for BigQuery query and data access billing.</td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Validation Type*</td><td><p>The following two types of validation are supported for BigQuery:</p><ul><li>File Authentication</li><li>UI Authentication</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="File Authentication" %}

<table><thead><tr><th width="193.3333740234375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Project Id*</td><td>Enter the Project ID associated with the BigQuery environment.</td></tr><tr><td>Application*</td><td>Enter the Application ID associated with the BigQuery environment and authorized for required operations.</td></tr><tr><td>File Path*</td><td>Enter the file path of the BigQuery service account JSON key file.</td></tr><tr><td>Regions (comma-separated)</td><td>Enter one or more BigQuery regions (e.g., US, EU, us-central1); separate multiple regions with commas.</td></tr></tbody></table>
{% endtab %}

{% tab title="UI Authentication" %}

<table><thead><tr><th width="189.333251953125">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Account Type*</td><td>Enter the service account type used for BigQuery UI authentication.</td></tr><tr><td>Client Id*</td><td>Enter the client ID associated with the BigQuery service account.</td></tr><tr><td>Client Email*</td><td>Enter the client email associated with the BigQuery service account.</td></tr><tr><td>Private Key*</td><td>Enter the private key for the BigQuery service account.</td></tr><tr><td>Private Key Id*</td><td>Enter the private key ID associated with the BigQuery service account.</td></tr><tr><td>Token Uri*</td><td>Enter the token URI used for authentication with BigQuery.</td></tr><tr><td>Project Id*</td><td>Enter the Project ID associated with the BigQuery environment.</td></tr><tr><td>Application*</td><td>Enter the Application ID associated with the BigQuery environment and authorized for required operations.</td></tr><tr><td>Regions (comma-separated)</td><td>Enter one or more BigQuery regions (e.g., US, EU, us-central1); separate multiple regions with commas.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}
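The UI Authentication fields correspond to the keys of a standard Google service-account key. As a minimal sketch (field names assumed from the standard key format, not from OvalEdge internals), they can be assembled and lightly validated like this, along with parsing of the comma-separated Regions field:

```python
def build_service_account_info(account_type, client_id, client_email,
                               private_key, private_key_id, token_uri,
                               project_id):
    """Assemble the UI Authentication fields into a service-account info dict."""
    info = {
        "type": account_type,          # typically "service_account"
        "client_id": client_id,
        "client_email": client_email,
        "private_key": private_key,
        "private_key_id": private_key_id,
        "token_uri": token_uri,        # typically https://oauth2.googleapis.com/token
        "project_id": project_id,
    }
    missing = [k for k, v in info.items() if not v]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return info

def parse_regions(value: str) -> list:
    """Split the comma-separated Regions field, e.g. 'US, EU, us-central1'."""
    return [r.strip() for r in value.split(",") if r.strip()]
```

Catching an empty mandatory field before validation avoids a round trip to the data source for an error the UI can report immediately.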

**Default Governance Roles**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Default Governance Roles*</td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users and teams configured in OvalEdge Security are displayed for selection.</td></tr></tbody></table>

**Admin Roles**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Admin Roles*</td><td>Select one or more users from the dropdown list for Integration Admin and Security &#x26; Governance Admin. All users configured in OvalEdge Security are available for selection.</td></tr></tbody></table>

**No of Archive Objects**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>No Of Archive Objects*</td><td><p>Specifies the number of recent metadata changes retained for objects at the source. By default, archiving is off. To enable it, toggle the Archive button and specify the number of objects to archive.</p><p>Example: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.</p></td></tr></tbody></table>

**Bridge**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Select Bridge*</td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges configured in OvalEdge. These bridges enable communication between data sources and OvalEdge without altering firewall rules.</p></td></tr></tbody></table>

2. After entering all **connection details**, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the **Connectors home page.**

## Manage Connector Operations

### Crawl/Profile

{% hint style="warning" %}
To perform crawl and profile operations, users must be assigned the Integration Admin role.
{% endhint %}

The **Crawl/Profile** button allows users to select one or more **schemas** for **crawling** and **profiling**.

1. Navigate to the **Connectors page** and click **Crawl/Profile**.
2. Select the schemas to crawl.
3. The **Crawl** option is selected by default. Click the **Crawl & Profile** radio button to run both operations.
4. Click **Run** to collect metadata from the connected source and load it into the **OvalEdge Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog > Databases** tab.

The **Schedule** checkbox allows automated **crawling** and **profiling** at defined intervals, from a minute to a year.

1. Click the **Schedule** checkbox to enable the **Select** Period drop-down.
2. Select a time period for the operation from the drop-down menu.
3. Click **Schedule** to initiate metadata collection from the connected source.
4. The system will automatically execute the selected operation (**Crawl** or **Crawl & Profile**) at the scheduled time.

#### Other Operations

The **Connectors** page in OvalEdge provides a centralized view of all configured connectors, including their health status.

**Managing connectors includes:**

* **Connectors Health:** Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye** icon next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options:**

To view, edit, validate, build lineage, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector**: Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Crawler**: Configure data extraction.
  * **Profiler**: Customize data profiling rules and methods.
  * **Query Policies:** Define query execution rules based on roles.
  * **Access Instructions:** Add notes on how data can be accessed.
  * **Business Glossary Settings:** Manage term associations at the connector level.
  * **Anomaly Detection Settings:** Configure anomaly detection preferences at the connector level.
  * **Connection Pooling:** Configure parameters such as maximum pool size, idle time, and timeouts directly within the application.
  * **Others**: Configure notification recipients for metadata changes.
* **Build Lineage**: Automatically build data lineage using source code parsing.
* **Delete Connector**: Remove a connector with confirmation.

## Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Verify that all inputs are accurate to resolve these issues; if they persist, contact the assigned support team.

<table><thead><tr><th width="85">S.No.</th><th width="208.333251953125">Error Message(s)</th><th>Error Description &#x26; Resolution</th></tr></thead><tbody><tr><td>1</td><td>Error while validating connection: Error while validating BIGQUERY Connection : IOException block 1: Use JsonReader.setLenient(true) to accept malformed JSON at line 1 column 2 path</td><td><p><strong>Description</strong>:</p><p>The system is unable to authenticate because the JSON key file path is incorrect, the file is missing, or the file is not in valid JSON format.</p><p><strong>Resolution</strong>:</p><p>Verify the file path in the connection settings and ensure the file is a valid JSON service account key. If connecting through a bridge, confirm that the JSON file exists in the bridge client machine and is accessible.</p></td></tr><tr><td>2</td><td>Invalid details in Application or Project Id</td><td><p><strong>Description</strong>:</p><p>Connection to BigQuery failed due to an incorrect Project ID or Application ID, or the dataset associated with the Application ID does not exist or is inaccessible.</p><p><strong>Resolution</strong>:</p><p>Ensure the Project ID matches the correct Google Cloud project. Confirm the Application ID corresponds to an existing dataset in the project and that the dataset is accessible.</p></td></tr><tr><td>3</td><td>com.google.api.client.auth.oauth2.TokenResponseException: 401 Unauthorized</td><td><p><strong>Description</strong>:</p><p>Authentication failed because the service account key is invalid, expired, or deleted, or because the associated service account is disabled.</p><p><strong>Resolution</strong>:</p><p>Regenerate a new service account key from the GCP Console. Verify the service account is active and has the required BigQuery permissions. Replace the old key in the connection configuration.</p></td></tr><tr><td>4</td><td>com.google.api.client.googleapis.json.GoogleJsonResponseException: 403</td><td><p><strong>Description</strong>:</p><p>The service account does not have sufficient IAM roles to query data or run BigQuery jobs.</p><p><strong>Resolution</strong>:</p><p>Grant the required roles to the service account in Google Cloud IAM.</p><p>Minimum recommended roles:</p><ul><li>roles/bigquery.user</li><li>roles/bigquery.dataViewer</li><li>roles/bigquery.jobUser</li></ul></td></tr></tbody></table>
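The causes behind error 1 above can usually be checked locally before validating the connection. A minimal diagnostic sketch (the required field list is assumed from the standard Google service-account key format):

```python
import json
import os

# Fields assumed from the standard service-account key format.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email", "token_uri"}

def check_key_file(path: str) -> str:
    """Classify the common local causes behind the 'malformed JSON' error."""
    if not os.path.isfile(path):
        return "file not found: verify the File Path (and the bridge machine, if one is used)"
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return "not valid JSON: re-download the service account key from the GCP Console"
    if not isinstance(data, dict):
        return "not valid JSON: the file is not a service account key object"
    missing = sorted(REQUIRED_KEYS - data.keys())
    if missing:
        return f"missing fields: {', '.join(missing)}"
    return "ok"
```

Running this against the configured File Path distinguishes a wrong path, a corrupted download, and a truncated key file, the three cases described in the Resolution for error 1.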

***

Copyright © 2025, OvalEdge LLC, Peachtree Corners, GA, USA.
