# BigQuery

This article describes the BigQuery connector integration, which enables streamlined metadata management through crawling, profiling, querying, data preview, and lineage building (both automatic and manual). It also supports secure authentication via Credential Manager.

<figure><img src="https://1813356899-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FhTnkoJQml0pok9awFDhx%2Fuploads%2Fe7i0I4N3F9oMlub2DWiI%2Funknown.png?alt=media&#x26;token=191fe49c-22c5-45e6-b99f-9311b704b7f8" alt=""><figcaption></figcaption></figure>

## Overview

### Connector Details

<table data-header-hidden><thead><tr><th width="407.333251953125"></th><th></th></tr></thead><tbody><tr><td>Connector Category</td><td>Data Warehouse</td></tr><tr><td>Connector Version</td><td>Release 7.2</td></tr><tr><td>Releases Supported (Available from)</td><td>Release 6.3.4.x</td></tr><tr><td><p>Connectivity</p><p>[How the connection is established with BigQuery]</p></td><td>JDBC and SDK</td></tr></tbody></table>

### Connector Features

| Feature                                      | Availability |
| -------------------------------------------- | :----------: |
| Crawling                                     |       ✅      |
| Delta Crawling                               |       ❌      |
| Profiling                                    |       ✅      |
| Query Sheet                                  |       ✅      |
| Data Preview                                 |       ✅      |
| Auto Lineage                                 |       ✅      |
| Manual Lineage                               |       ✅      |
| Secure Authentication via Credential Manager |       ✅      |
| Data Quality                                 |       ❌      |
| DAM (Data Access Management)                 |       ❌      |
| Bridge                                       |       ✅      |

### Metadata Mapping

The following objects are crawled from BigQuery and mapped to the corresponding UI assets.

<table><thead><tr><th width="231.6666259765625">BigQuery Object</th><th width="198">BigQuery Attribute</th><th width="190.666748046875">OvalEdge Attribute</th><th width="181.3333740234375">OvalEdge Category</th><th width="170.666748046875">OvalEdge Type</th></tr></thead><tbody><tr><td>Project</td><td>Project</td><td>Database name</td><td>Database</td><td>Database</td></tr><tr><td>Dataset / Schema</td><td>Dataset Id / Schema Name</td><td>Schema name</td><td>Schemas</td><td>Schema</td></tr><tr><td>Table</td><td>Table Name</td><td>Table name</td><td>Tables</td><td>Table</td></tr><tr><td>Table</td><td>Table Type</td><td>Table Type</td><td>Tables</td><td>Table</td></tr><tr><td>Columns</td><td>column_name</td><td>Column name</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Columns</td><td>data_type</td><td>Column Type</td><td>Table Columns</td><td>Table Column</td></tr><tr><td>Views</td><td>Table_Name</td><td>View name</td><td>Views</td><td>Table</td></tr><tr><td>Views</td><td>View_Definition</td><td>Query</td><td>Views</td><td>Table</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Routine_Name</td><td>Routine name</td><td>Procedures/ Functions</td><td>Code</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Routine_Definition</td><td>Query</td><td>Procedures/ Functions</td><td>Code</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Routine_Type</td><td>Job Type</td><td>Procedures/ Functions</td><td>Code</td></tr><tr><td>Routines (Procedures and Functions)</td><td>Created</td><td>Created date</td><td>Procedures/ Functions</td><td>Code</td></tr></tbody></table>
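The mapping above is driven by BigQuery's `INFORMATION_SCHEMA` views (the "Sys Tables" listed later in this article). As a rough illustration only, not the connector's actual implementation, the metadata queries behind each object type could be sketched as follows; `project` and `dataset` are hypothetical placeholders:

```python
def crawl_queries(project: str, dataset: str) -> dict:
    """Return one illustrative metadata query per crawled object type."""
    scope = f"`{project}.{dataset}`"
    return {
        # Schemas (datasets) are listed at the project level.
        "schemas": f"SELECT schema_name FROM `{project}`.INFORMATION_SCHEMA.SCHEMATA",
        "tables": f"SELECT table_name, table_type FROM {scope}.INFORMATION_SCHEMA.TABLES",
        "columns": f"SELECT column_name, data_type FROM {scope}.INFORMATION_SCHEMA.COLUMNS",
        "views": f"SELECT table_name, view_definition FROM {scope}.INFORMATION_SCHEMA.VIEWS",
        "routines": (
            f"SELECT routine_name, routine_definition, routine_type, created "
            f"FROM {scope}.INFORMATION_SCHEMA.ROUTINES"
        ),
    }

# Example with placeholder identifiers:
queries = crawl_queries("my-project", "my_dataset")
```

Each result column here corresponds to a "BigQuery Attribute" row in the mapping table above.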

## Set up a Connection

### Prerequisites

The following are the prerequisites to establish a connection:

#### **Service Account User Permissions**

{% hint style="warning" %}
It is recommended to use a separate service account to establish the connection to the data source, configured with the following minimum set of permissions.
{% endhint %}

{% hint style="info" %}
👨‍💻 Who can provide these permissions? They are typically granted by the BigQuery administrator, as users may not have the required access to assign them independently.
{% endhint %}

<table><thead><tr><th width="121.6666259765625">Objects</th><th width="166.6666259765625">Sys Tables</th><th width="286.333251953125">Role</th><th width="222.3333740234375">Access Permissions</th></tr></thead><tbody><tr><td>Schemas</td><td>Information_Schema.Schemata</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataEditor</p><p>roles/bigquery.dataOwner</p><p>roles/bigquery.dataViewer</p></td><td>bigquery.datasets.get</td></tr><tr><td>Tables</td><td>Information_Schema.Tables</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataViewer</p><p>roles/bigquery.metadataViewer</p></td><td><p>bigquery.tables.get</p><p>bigquery.tables.list</p></td></tr><tr><td>Columns</td><td>Information_Schema.Columns</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataViewer</p><p>roles/bigquery.dataEditor</p><p>roles/bigquery.metadataViewer</p></td><td><p>bigquery.tables.get</p><p>bigquery.tables.list</p></td></tr><tr><td>Views</td><td>Information_Schema.Views</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.dataEditor</p><p>roles/bigquery.metadataViewer</p><p>roles/bigquery.dataViewer</p></td><td><p>bigquery.tables.get</p><p>bigquery.tables.list</p></td></tr><tr><td>Routines (Procedures / Functions)</td><td>Information_Schema.Routines</td><td><p>roles/bigquery.admin</p><p>roles/bigquery.metadataViewer</p><p>roles/bigquery.dataViewer</p></td><td><p>bigquery.routines.get</p><p>bigquery.routines.list</p></td></tr></tbody></table>
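As an illustrative sketch (not part of the product), the minimum access permissions in the table above can be expressed as data, which makes it easy to see what a given service account is still missing. In practice, the granted set could come from Google Cloud IAM's `testIamPermissions` check:

```python
# Minimum permissions per crawled object type, taken from the table above.
REQUIRED_PERMISSIONS = {
    "schemas": {"bigquery.datasets.get"},
    "tables": {"bigquery.tables.get", "bigquery.tables.list"},
    "columns": {"bigquery.tables.get", "bigquery.tables.list"},
    "views": {"bigquery.tables.get", "bigquery.tables.list"},
    "routines": {"bigquery.routines.get", "bigquery.routines.list"},
}

def missing_permissions(granted: set) -> dict:
    """Return, per object type, the required permissions not yet granted."""
    return {
        obj: sorted(required - granted)
        for obj, required in REQUIRED_PERMISSIONS.items()
        if required - granted
    }

# Example: an account that can only read table metadata.
gaps = missing_permissions({"bigquery.tables.get", "bigquery.tables.list"})
```

In this example, `gaps` reports that dataset and routine permissions are still required before schemas and procedures/functions can be crawled.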

### Connection Configuration Steps

{% hint style="warning" %}
Users must have the Connector Creator role to configure a new connection.
{% endhint %}

1. Log into OvalEdge, go to Administration > Connectors, click + (New Connector), search for BigQuery, and complete the required parameters.

{% hint style="info" %}
Fields marked with an asterisk (\*) are mandatory for establishing a connection.
{% endhint %}

<table><thead><tr><th width="219.3333740234375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Connector Type</td><td>By default, "BigQuery" is displayed as the selected connector type.</td></tr><tr><td>Credential Manager*</td><td><p>Select the desired Credential Manager from the drop-down list. Relevant parameters will be displayed based on the selection.</p><p>Supported Credential Managers:</p><ul><li>OE Credential Manager</li><li>AWS Secrets Manager</li><li>HashiCorp Vault</li><li>Azure Key Vault</li></ul></td></tr><tr><td>Server*</td><td>By default, the BigQuery server URL (https://bigquery.googleapis.com) is pre-populated. If required, update the field with the BigQuery server endpoint or IP address (e.g., xxxx-xxxx.xxxx4ijtzasl.xx-south-1.rds.xxxxx.com or 1xx.xxx.1.x0).</td></tr><tr><td>License Add Ons</td><td><ul><li>Select the checkbox for the <strong>Auto Lineage</strong> Add-On to build data lineage automatically.</li><li>Select the checkbox for the <strong>Data Quality</strong> Add-On to identify data quality issues using data quality rules and anomaly detection.</li></ul></td></tr><tr><td>Connector Name*</td><td><p>Enter a unique name for the BigQuery connection (Example: "BigQuery_Prod").</p></td></tr><tr><td>Connector Environment</td><td>Select the environment (Example: PROD, STG) configured for the connector.</td></tr><tr><td>Billing Project ID</td><td>Enter the Billing Project ID used for BigQuery query and data access billing.</td></tr><tr><td>Connector Description</td><td>Enter a brief description of the connector.</td></tr><tr><td>Validation Type*</td><td><p>The following two types of validation are supported for BigQuery:</p><ul><li>File Authentication</li><li>UI Authentication</li></ul></td></tr></tbody></table>

{% tabs %}
{% tab title="File Authentication" %}

<table><thead><tr><th width="193.3333740234375">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Project Id*</td><td>Enter the Project ID associated with the BigQuery environment.</td></tr><tr><td>Application*</td><td>Enter the Application ID associated with the BigQuery environment and authorized for required operations.</td></tr><tr><td>File Path*</td><td>Enter the file path of the BigQuery service account JSON key file.</td></tr><tr><td>Regions (comma-separated)</td><td>Enter one or more BigQuery regions (e.g., US, EU, us-central1); separate multiple regions with commas.</td></tr></tbody></table>
{% endtab %}

{% tab title="UI Authentication" %}

<table><thead><tr><th width="189.333251953125">Field Name</th><th>Description</th></tr></thead><tbody><tr><td>Account Type*</td><td>Enter the service account type used for BigQuery UI authentication.</td></tr><tr><td>Client Id*</td><td>Enter the client ID associated with the BigQuery service account.</td></tr><tr><td>Client Email*</td><td>Enter the client email associated with the BigQuery service account.</td></tr><tr><td>Private Key*</td><td>Enter the private key for the BigQuery service account.</td></tr><tr><td>Private Key Id*</td><td>Enter the private key ID associated with the BigQuery service account.</td></tr><tr><td>Token Uri*</td><td>Enter the token URI used for authentication with BigQuery.</td></tr><tr><td>Project Id*</td><td>Enter the Project ID associated with the BigQuery environment.</td></tr><tr><td>Application*</td><td>Enter the Application ID associated with the BigQuery environment and authorized for required operations.</td></tr><tr><td>Regions (comma-separated)</td><td>Enter one or more BigQuery regions (e.g., US, EU, us-central1); separate multiple regions with commas.</td></tr></tbody></table>
{% endtab %}
{% endtabs %}
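The UI Authentication fields correspond to the keys of a standard Google service-account key. As a minimal sketch (field names assumed from the standard key format, not from OvalEdge internals), they can be assembled and lightly validated like this, along with parsing of the comma-separated Regions field:

```python
def build_service_account_info(account_type, client_id, client_email,
                               private_key, private_key_id, token_uri,
                               project_id):
    """Assemble the UI Authentication fields into a service-account info dict."""
    info = {
        "type": account_type,          # typically "service_account"
        "client_id": client_id,
        "client_email": client_email,
        "private_key": private_key,
        "private_key_id": private_key_id,
        "token_uri": token_uri,        # typically https://oauth2.googleapis.com/token
        "project_id": project_id,
    }
    missing = [k for k, v in info.items() if not v]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return info

def parse_regions(value: str) -> list:
    """Split the comma-separated Regions field, e.g. 'US, EU, us-central1'."""
    return [r.strip() for r in value.split(",") if r.strip()]
```

Catching an empty mandatory field before validation avoids a round trip to the data source for an error the UI can report immediately.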

**Default Governance Roles**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Default Governance Roles*</td><td>Select the appropriate users or teams for each governance role from the drop-down list. All users and teams configured in OvalEdge Security are displayed for selection.</td></tr></tbody></table>

**Admin Roles**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Admin Roles*</td><td>Select one or more users from the dropdown list for Integration Admin and Security &#x26; Governance Admin. All users configured in OvalEdge Security are available for selection.</td></tr></tbody></table>

**No of Archive Objects**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>No Of Archive Objects*</td><td><p>Specifies the number of recent metadata changes retained for objects at the source. By default, archiving is off. To enable it, toggle the Archive button and specify the number of objects to archive.</p><p>Example: Setting it to 4 retrieves the last four changes, displayed in the 'Version' column of the 'Metadata Changes' module.</p></td></tr></tbody></table>

**Bridge**

<table data-header-hidden><thead><tr><th width="219.3333740234375"></th><th></th></tr></thead><tbody><tr><td>Select Bridge*</td><td><p>If applicable, select the bridge from the drop-down list.</p><p>The drop-down list displays all active bridges configured in OvalEdge. These bridges enable communication between data sources and OvalEdge without altering firewall rules.</p></td></tr></tbody></table>

2. After entering all **connection details**, the following actions can be performed:
   1. Click **Validate** to verify the connection.
   2. Click **Save** to store the connection for future use.
   3. Click **Save & Configure** to apply additional settings before saving.
3. The saved connection will appear on the **Connectors home page.**

## Manage Connector Operations

### Crawl/Profile

{% hint style="warning" %}
To perform crawl and profile operations, users must be assigned the Integration Admin role.
{% endhint %}

The **Crawl/Profile** button allows users to select one or more **schemas** for **crawling** and **profiling**.

1. Navigate to the **Connectors page** and click **Crawl/Profile**.
2. Select the schemas to crawl.
3. The **Crawl** option is selected by default. Click the **Crawl & Profile** radio button to run both operations.
4. Click **Run** to collect metadata from the connected source and load it into the **OvalEdge Data Catalog**.
5. After a successful crawl, the information appears in the **Data Catalog > Databases** tab.

The **Schedule** checkbox allows automated **crawling** and **profiling** at defined intervals, from a minute to a year.

1. Click the **Schedule** checkbox to enable the **Select** Period drop-down.
2. Select a time period for the operation from the drop-down menu.
3. Click **Schedule** to initiate metadata collection from the connected source.
4. The system will automatically execute the selected operation (**Crawl** or **Crawl & Profile**) at the scheduled time.

#### Other Operations

The **Connectors** page in OvalEdge provides a centralized view of all configured connectors, including their health status.

**Managing connectors includes:**

* **Connectors Health:** Displays the current status of each connector using a **green** icon for active connections and a **red** icon for inactive connections, helping to monitor the connectivity with data sources.
* **Viewing**: Click the **Eye** icon next to the connector name to view connector details, including databases, tables, columns, and codes.

**Nine Dots Menu Options:**

To view, edit, validate, build lineage, configure, or delete connectors, click on the **Nine Dots** menu.

* **Edit Connector**: Update and revalidate the data source.
* **Validate Connector**: Check the connection's integrity.
* **Settings**: Modify connector settings.
  * **Crawler**: Configure data extraction.
  * **Profiler**: Customize data profiling rules and methods.
  * **Query Policies:** Define query execution rules based on roles.
  * **Access Instructions:** Add notes on how data can be accessed.
  * **Business Glossary Settings:** Manage term associations at the connector level.
  * **Anomaly Detection Settings:** Configure anomaly detection preferences at the connector level.
  * **Connection Pooling:** Configure parameters such as maximum pool size, idle time, and timeouts directly within the application.
  * **Others**: Configure notification recipients for metadata changes.
* **Build Lineage**: Automatically build data lineage using source code parsing.
* **Delete Connector**: Remove a connector with confirmation.

## Connectivity Troubleshooting

If incorrect parameters are entered, error messages may appear. Verify that all inputs are accurate to resolve these issues; if they persist, contact the assigned support team.

<table><thead><tr><th width="85">S.No.</th><th width="208.333251953125">Error Message(s)</th><th>Error Description &#x26; Resolution</th></tr></thead><tbody><tr><td>1</td><td>Error while validating connection: Error while validating BIGQUERY Connection : IOException block 1: Use JsonReader.setLenient(true) to accept malformed JSON at line 1 column 2 path</td><td><p><strong>Description</strong>:</p><p>The system is unable to authenticate because the JSON key file path is incorrect, the file is missing, or the file is not in valid JSON format.</p><p><strong>Resolution</strong>:</p><p>Verify the file path in the connection settings and ensure the file is a valid JSON service account key. If connecting through a bridge, confirm that the JSON file exists in the bridge client machine and is accessible.</p></td></tr><tr><td>2</td><td>Invalid details in Application or Project Id</td><td><p><strong>Description</strong>:</p><p>Connection to BigQuery failed due to an incorrect Project ID or Application ID, or the dataset associated with the Application ID does not exist or is inaccessible.</p><p><strong>Resolution</strong>:</p><p>Ensure the Project ID matches the correct Google Cloud project. Confirm the Application ID corresponds to an existing dataset in the project and that the dataset is accessible.</p></td></tr><tr><td>3</td><td>com.google.api.client.auth.oauth2.TokenResponseException: 401 Unauthorized</td><td><p><strong>Description</strong>:</p><p>Authentication failed because the service account key is invalid, expired, or deleted, or because the associated service account is disabled.</p><p><strong>Resolution</strong>:</p><p>Regenerate a new service account key from the GCP Console. Verify the service account is active and has the required BigQuery permissions. Replace the old key in the connection configuration.</p></td></tr><tr><td>4</td><td>com.google.api.client.googleapis.json.GoogleJsonResponseException: 403</td><td><p><strong>Description</strong>:</p><p>The service account does not have sufficient IAM roles to query data or run BigQuery jobs.</p><p><strong>Resolution</strong>:</p><p>Grant the required roles to the service account in Google Cloud IAM.</p><p>Minimum recommended roles:</p><ul><li>roles/bigquery.user</li><li>roles/bigquery.dataViewer</li><li>roles/bigquery.jobUser</li></ul></td></tr></tbody></table>
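The causes behind error 1 above can usually be checked locally before validating the connection. A minimal diagnostic sketch (the required field list is assumed from the standard Google service-account key format):

```python
import json
import os

# Fields assumed from the standard service-account key format.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email", "token_uri"}

def check_key_file(path: str) -> str:
    """Classify the common local causes behind the 'malformed JSON' error."""
    if not os.path.isfile(path):
        return "file not found: verify the File Path (and the bridge machine, if one is used)"
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return "not valid JSON: re-download the service account key from the GCP Console"
    if not isinstance(data, dict):
        return "not valid JSON: the file is not a service account key object"
    missing = sorted(REQUIRED_KEYS - data.keys())
    if missing:
        return f"missing fields: {', '.join(missing)}"
    return "ok"
```

Running this against the configured File Path distinguishes a wrong path, a corrupted download, and a truncated key file, the three cases described in the Resolution for error 1.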

***

Copyright © 2025, OvalEdge LLC, Peachtree Corners, GA, USA.
