GitHub
OvalEdge provides an out-of-the-box connector for GitHub repositories. It supports crawling files and datasets, that is, GitHub folders and files.

OvalEdge Crawling: Crawling is the process of collecting information about data from various data sources such as on-premise and cloud databases, Hadoop, visualization software, and file systems.
When an OvalEdge crawler connects to a data source, it collects and catalogs all the data elements (i.e., metadata) and stores them in the OvalEdge data repository. The crawler creates an index for every stored data element, which can later be used for data exploration through the smart search in the OvalEdge Data Catalog.
OvalEdge crawlers can be scheduled to scan the databases regularly, so the index of data elements always stays up to date.
Data Sources: Data sources are the systems that the OvalEdge crawler integrates with to help users extract metadata and build a data catalog.
This document describes how to connect to your GitHub repository and crawl data from various repositories.
Connect to the Data: Before you can crawl a data source, you must first connect to your data. OvalEdge requires users to configure a separate connection for each type of data source. Users must enter the source credentials and database information for each connection type. Once a data connection is made, a single click of the Crawl button starts the crawling process.
Connector Capabilities
Connectivity to GitHub is established via the REST API. The user must be a collaborator in the organization, must have read access to the repository, and must generate a personal access token under the profile settings in GitHub.
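As an illustration of token-based REST access, the following minimal sketch builds (but does not send) a request for the contents of a repository folder. It uses GitHub's public repository contents endpoint; whether OvalEdge calls this exact endpoint internally is an assumption, and the function name build_contents_request and the token value are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request

GITHUB_API = "https://api.github.com"


def build_contents_request(owner: str, repo: str, path: str, token: str) -> Request:
    """Build a GET request that lists a folder (or reads a file) in a repository.

    Hypothetical helper for illustration; it is not part of OvalEdge.
    """
    # Percent-encode the path but keep the slashes between folders.
    clean_path = quote(path.strip("/"), safe="/")
    url = f"{GITHUB_API}/repos/{quote(owner)}/{quote(repo)}/contents"
    if clean_path:
        url += "/" + clean_path
    headers = {
        "Accept": "application/vnd.github+json",
        # The personal access token is passed as a bearer token.
        "Authorization": f"Bearer {token}",
    }
    return Request(url, headers=headers, method="GET")


# Example values taken from the field examples in this document;
# the token is a placeholder, not a real credential.
request = build_contents_request(
    "ovaledgeindia", "ovaledgesuperset", "Test/", "ghp_example"
)
print(request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send the request. A token without read access to the repository is expected to produce an HTTP error rather than a listing.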
Technical Specifications
Crawling
Feature | Supported Objects | Remarks
Crawling | Jobs | Fetches all folders and files from the GitHub instance.
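Fetching all folders and files amounts to a recursive traversal of the repository tree. The sketch below shows one way such a traversal could work. It assumes directory listings shaped like GitHub's contents API responses (a list of entries with "type" and "path"); the names walk_repository and Fetcher are hypothetical, and this is not OvalEdge's actual crawler implementation.

```python
from typing import Callable, Dict, Iterator, List

# A fetcher takes a repository path and returns that folder's listing:
# a list of dicts with at least "type" ("file" or "dir") and "path".
Fetcher = Callable[[str], List[Dict[str, str]]]


def walk_repository(fetch: Fetcher, start_path: str = "") -> Iterator[Dict[str, str]]:
    """Yield every folder and file found under start_path."""
    pending = [start_path]
    while pending:
        current = pending.pop()
        for entry in fetch(current):
            yield entry
            # Only folders are descended into; other entry types are
            # reported but not followed.
            if entry["type"] == "dir":
                pending.append(entry["path"])


# A small in-memory tree stands in for real API calls.
fake_tree = {
    "": [{"type": "dir", "path": "Test"}, {"type": "file", "path": "README.md"}],
    "Test": [{"type": "file", "path": "Test/data.csv"}],
}
for item in walk_repository(lambda path: fake_tree[path]):
    print(item["type"], item["path"])
```

Injecting the fetcher keeps the traversal logic independent of the network, so it can be exercised against an in-memory tree as shown.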
Connection Details
Pre-requisites
To use the GitHub Connector, the details specified in the following section should be available.
An admin/service account for Crawling.
The minimum privileges required for users are as follows:
Operation | Access Permission
Connection Validation | The user should be associated with the organization for the given repository, or should be the owner of the repository.
Configuration Setting: The configuration key ovaledge.extauth.authtype needs to be set to HYBRID for the OAuth authentication setup.
To connect to the GitHub instance using the OvalEdge application, complete the following steps.
Log in to the OvalEdge application.
In the left menu, click the Administration module. The sub-modules associated with Administration are displayed.
Click the Crawler sub-module. The Crawler Information page is displayed.
On the Crawler Information page, click the + icon. The Manage Connection pop-up window, with the Search Connector option, is displayed.
In the Manage Connection pop-up window, select GitHub as the connection type. The Manage Connection pop-up window with GitHub-specific details is displayed.
The following field attributes are required for the GitHub connection.
Property | Details
Connection Type | GitHub
License Type | Standard
Name | Enter a connection name for GitHub. The name you specify is a reference name that makes the GitHub instance connection easy to identify in OvalEdge. Example: GitHub1
GitHub Organization | Enter the name of the organization. Example: ovaledgeindia
GitHub Owner | Enter the owner name of the repository. Example: John David
Repo Name | Enter the name of the repository. Example: ovaledgesuperset
Personal Access Token | Enter the personal access token of the user. Note: This token is generated by the GitHub owner in the GitHub instance. Example: ghp_z0HvXmn1vnz6wes1naOhEKZ8CYygJo0ixtew
GitHub Path | Enter the path of the particular repository folder. Example: Test/
Default Governance Roles | Select the required governance roles for the Steward, Custodian, and Owner.
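The text fields listed above can be modeled as a small record that is checked for blank values before validation is attempted. This is only a sketch: the class GitHubConnectionDetails, its field names, and the helper blank_fields are hypothetical, and treating every text field as mandatory is an assumption rather than documented OvalEdge behavior.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GitHubConnectionDetails:
    """Hypothetical container mirroring the text fields of the connection form."""

    name: str
    organization: str
    owner: str
    repo_name: str
    personal_access_token: str
    github_path: str


def blank_fields(details: GitHubConnectionDetails) -> List[str]:
    """Return the names of fields that are empty or whitespace only."""
    return [
        f.name for f in fields(details) if not getattr(details, f.name).strip()
    ]


# Example values come from the field examples above; the token is left
# blank on purpose to show what the check reports.
details = GitHubConnectionDetails(
    name="GitHub1",
    organization="ovaledgeindia",
    owner="John David",
    repo_name="ovaledgesuperset",
    personal_access_token="",
    github_path="Test/",
)
print(blank_fields(details))
```

A check like this only catches missing input. Whether the token actually grants access to the repository can be confirmed only by validating the connection against GitHub.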
After entering the connection details in the required fields, click the Validate button. Once the entered connection details are validated, the Save and Save & Configure buttons are enabled.
Click the Save button to establish the connection, or click the Save & Configure button to establish the connection and configure the connection settings. When you click the Save & Configure button, the Connection Settings pop-up window is displayed, where you can configure the connection settings for the selected connector. The Save & Configure button is displayed only for connectors that require settings configuration.
Crawling/Profiling
Once connectivity is established successfully, select the GitHub connection on the Crawler Information page and click the Crawl/Profile button. The Crawling and Profiling pop-up window is displayed.
Select the specific schema that needs to be crawled, select the Crawl option, and click the Run button. The job associated with the GitHub connection is triggered, and the data in the specified GitHub repository is fetched and displayed on the Data Catalog Queries page.
Security Information
OvalEdge does not lift any secured data from the source system (Version Control).
Any security information in the config (JSON) files is filtered.
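To make the idea of filtering security information from JSON config files concrete, here is a minimal sketch that drops entries whose key names look sensitive. The marker list, the function names, and the choice to remove entries rather than mask them are all assumptions made for illustration; the source does not describe how OvalEdge performs this filtering.

```python
import json
from typing import Any

# Assumed substrings that mark a key as security related.
SENSITIVE_MARKERS = ("password", "secret", "token", "key")


def is_sensitive(name: str) -> bool:
    """Return True when a key name contains one of the assumed markers."""
    lowered = name.lower()
    return any(marker in lowered for marker in SENSITIVE_MARKERS)


def filter_config(value: Any) -> Any:
    """Return a copy of parsed JSON with sensitive entries removed."""
    if isinstance(value, dict):
        return {
            k: filter_config(v) for k, v in value.items() if not is_sensitive(k)
        }
    if isinstance(value, list):
        return [filter_config(v) for v in value]
    return value


raw = json.loads('{"db": {"host": "h", "password": "p"}, "apiToken": "t"}')
print(json.dumps(filter_config(raw)))
```

Matching on key names is a coarse heuristic: it can drop harmless entries such as a "keyboard" setting and will miss secrets stored under neutral names.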
Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA