Ways of Building Auto Lineage
Users can select source codes from the list and build auto lineage by selecting the Nine Dots menu. It provides options to meet specific requirements, making the process flexible. Here is an overview of these options with context:
RDBMS Connectors
Reporting and ETL Connectors
Build Lineage for Selected Codes
This option allows the user to build lineage only on the selected codes.
Build Lineage for New or Changed Codes
This option allows the user to build lineage for the new codes, or if the source has modified codes after the recrawling.
Build Lineage for All Codes
This option initiates the process of lineage building for all available codes within the selected connector, giving users a complete view of the data lineage.
Fetch Code Log and Build Lineage
The platform offers a special lineage-building option for Oracle, SQL Server, PostgreSQL, MySQL, and Snowflake using the code log. This enables users to build lineage using query logs of the source system without crawling the connection again. Users can tailor the selection of the logs as per their needs. The query log setting must be done during connection creation to build the lineage using this option.
Re-build Summary Lineage for Selected Source Code
This option allows the user to rebuild the Summary Lineage only for the selected source code. It updates the simplified data flow view for the selected source without affecting other sources in the instance. This option applies only to reporting and ETL connectors.
Re-build Summary Lineage for All Source Code
This option rebuilds the Summary Lineage for all available source codes in the instance. It refreshes the simplified data flow view across every source, ensuring that the Summary Lineage reflects the latest available information. This option applies only to reporting and ETL connectors.
Query Log Settings
Data crawling can be achieved in two ways: through the crawler and via the query log. Queries play a crucial role in building data lineage and relationships.
With Query Log Settings, They can select the query type and the lookback period. However, the query text is predetermined based on the connector, although users can customize it to meet their specific requirements. Additionally, users can specify the schema from which they want to retrieve the query and exclude specific users. The objective of query log settings is to retrieve the query from the data source, process the queries, and create data lineage for (INSERT) for the specified duration.
Various options to configure query log settings include
Query
Displays the queries that are used to retrieve the Query Logs from the schema
Look back Period
This feature allows users to retrieve queries that were processed in the datasource a certain number of days ago. For example, the lookback period can be set to 1 day, two days, or more. The maximum Look back period is seven days.
Include Query Type
Users can include query types such as Select, Insert, Update, and Delete
Exclude Users
In the Query Logs, users have the option to exclude unnecessary users. If a username is specified in the excluded user field, the query log for queries executed by that particular user is excluded from retrieval.
Schema
The user has the option to select a specific schema to apply query log settings. If a user leaves this blank, it will fetch the query log based on all the schema available for that database connection.
Export Source Code to File
In certain instances, the parsing of the source code fails, and the lineage is not built. OvalEdge allows users to export selected source codes in a .zip format (password-protected for security). A click starts a job submission, and the file is quickly downloaded. Share this file with the analysis team for further processing, research, and curation in the dedicated lab environment.
Import Source Code from File
Import source code allows the user to import the corrected source code in JSON format to build the lineage. It usually works under the export functionality, and the user uploads the JSON file from the dedicated lab environment.
Viewing Source Codes
It allows users to access important information for each query on the "Build Auto Lineage" page after they submit and execute a job. This feature gives users detailed insights into creating their data lineage. Here's a close look at the key information for each query.
Schema Name: This field shows the code schema in the connector. It helps understand the code's structure and context, ensuring clarity in the data lineage.
Code Name: The "Code Name" section shows the name of the code. This helps users identify and refer to the code in their lineage quickly.
Lineage Status: Lineage Status" indicates the data lineage progress. It shows the current status of the query in the application. Various statuses are used to communicate the query's progress, making the lineage creation process transparent and clear. The statuses include:
SUCCESS_LINEAGE_BUILD
100%
YES
Real Table
SUCCESS_WITH_TEMP
100%
YES
Temp Table
SUCCESS_LINEAGE_DOES_NOT_EXIST
100%
NO
–
SUCCESS_LINEAGE_FIXED
100%
YES
Real Table
SUCCESS_LINEAGE_PARTIALLY_BUILD
<100%
YES
Real Table
SUCCESS_LINEAGE_PARTIALLY_BUILDWITH_TEMP
<100%
YES
Real Table + Temp Table
PARSE_FAILED
FAILED
–
–
FAILED_UNSUPPORTED_QUERY
FAILED
–
–
SUCCESS_LINEAGE_BUILD_WITH_OvalEdgeLAB
100%
YES
Real Table
SUCCESS_MANUALLY_BUILD_WITH_OvalEdgeLAB
100%
YES
Real Table
SUCCESS_NO LINEAGE BUT ASSOCIATION
100%
NO
-
Correct Query
OvalEdge helps users fix issues with the "Correct Query" option. Author users with Metawrite permission can see the query by clicking the ‘Eye’ icon or correct the query by clicking the tick mark. This is useful for addressing parsing problems or improving the success of the lineage creation process.
View Code: Users can easily check the code by clicking the ‘Eye’ icon in the "Correct Query" column on the "Build Auto Lineage" page. This shows what's in the code and helps them understand data lineage creation.
Correct Query: The "Correct Query" option allows users to fix code issues, make changes, and revalidate the code, ensuring accurate and reliable data lineage. This feature is crucial for managing errors, adapting to changes, and maintaining a robust data lineage system.
In the application, users can improve and handle their code using the "Correct Query Screen" feature.
Operations in Correct Query
Compare Screens:
Shows a side-by-side comparison.
The left screen shows the original query.
The right screen is a blank workspace for code changes, where users can copy the existing code and make changes as per the requirements.
Reset Code: The "Reset" option restores the query to the previous version. It undoes validation changes, allowing users to make further adjustments or fix parsing problems.
Copy Query: Users can copy the original code from the left window and place it in the right window to edit and build lineage.
Replace: The "Replace" feature lets users replace incorrect strings with correct ones throughout the code with a single click. It will be applied to all the connectors' source codes.
Source Server Type: Users can specify or change the type of source server from which the data originates. It ensures the ETL process applies the correct parsing rules and connection configurations based on the selected server type.
Format Query: The "Format Query" operation aligns the query into a structured format for better readability, understanding, and smoother curation.
Validate: The "Validate" button lets users check the correctness of their edited query, ensuring it follows the required syntax and structure.
Save & Build Lineage: After validations, click the Save & Build Lineage option to save the edited query and start the lineage-building process.
Result: This section shows users the outcome of the query parsing process. It displays parsing results and associated logs, giving a clear overview of the code's status.
Build Lineage using Shutter box: The "Shutter box" feature in OvalEdge allows users to build lineage in various ways:
Lineage: Users can directly select source and destination objects from dropdown menus to build lineage, simplifying data flow definition.
Relationships: Users can select schemas, tables, and columns to establish connections between data elements.
Associations: Users can select objects to associate with queries, creating meaningful connections between data elements.
Error Message: When a query can't parse and shows "Parse failed," the "Error Message" section helps. It gives users the error log, showing specific issues to fix. This information allows users to solve parsing errors.
Viewing Lineage on Code
After successfully building lineage, users can access the queries they worked on. Users can view the lineage for these queries in the Data Catalog > Codes section.
Code Summary Tab:
This section provides a detailed summary of the query. It includes a business description, a technical description, associated business glossary terms, tags, the top users who have interacted with the query, the last crawled date, and the lineage status.
Associations:
The "Associations" tab lists objects directly associated with the query. Understanding these connections is crucial for grasping the query's impact on other data elements in the system.
References:
In the "References" tab, users can see a list of objects that refer to the root object. This information helps users understand where and how the object is used across the system.
Lineage:
In the "Lineage" tab, users can see how the lineage was built. This includes two viewing options:
Graphical
Tabular
Exploring lineage helps users understand the data's journey thoroughly. For more detailed insights into each field, users can refer to the Data Catalog Deep Dive Article.
Schedule
The lineage process can be scheduled in advance for user convenience. To schedule it, navigate to Administration → Job Workflow. Lineage can also be scheduled for new processes or changed queries. For more details on the Schedule, please refer to the schedule section.
Advanced Job
Refresh Has-Lineage: This job refreshes the Has Lineage column in the Data Catalog list page. The column indicates whether the object has lineage or not.
Build Dashboard stats: This job will populate the data into lineage dashboard after migrations or fresh deployments.
For more details on the Advanced Job, please refer to the Advanced Jobs.
Copyright © 2025, OvalEdge LLC, Peachtree Corners, GA, USA.
Last updated
Was this helpful?

