> For the complete documentation index, see [llms.txt](https://docs.ovaledge.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.ovaledge.com/release8.1/advanced-tools/build-auto-lineage/ways-of-building-auto-lineage.md). # Ways of Building Auto Lineage Users can select source codes from the list and build auto lineage by selecting the Nine Dots menu. It provides options to meet specific requirements, making the process flexible. Here is an overview of these options with context: ### RDBMS Connectors

### Reporting and ETL Connectors

### **Build Lineage for Selected Codes** This option allows the user to build lineage only on the selected codes. ### **Build Lineage for New or Changed Codes** This option allows the user to build lineage for the new codes, or if the source has modified codes after the recrawling. ### **Build Lineage for All Codes** This option initiates the process of lineage building for all available codes within the selected connector, giving users a complete view of the data lineage. ### **Fetch Code Log and Build Lineage** The platform offers a special lineage-building option for Oracle, SQL Server, PostgreSQL, MySQL, and Snowflake using the code log. This enables users to build lineage using query logs of the source system without crawling the connection again. Users can tailor the selection of the logs as per their needs. The query log setting must be done during connection creation to build the lineage using this option. ### **Re-build Summary Lineage for Selected Source Code** This option allows the user to rebuild the Summary Lineage only for the selected source code. It updates the simplified data flow view for the selected source without affecting other sources in the instance. This option applies only to reporting and ETL connectors. ### **Re-build Summary Lineage for All Source Code** This option rebuilds the Summary Lineage for all available source codes in the instance. It refreshes the simplified data flow view across every source, ensuring that the Summary Lineage reflects the latest available information. This option applies only to reporting and ETL connectors. ### **Query Log Settings** Data crawling can be achieved in two ways: through the crawler and via the query log. Queries play a crucial role in building data lineage and relationships. With Query Log Settings, They can select the query type and the lookback period. However, the query text is predetermined based on the connector, although users can customize it to meet their specific requirements. Additionally, users can specify the schema from which they want to retrieve the query and exclude specific users. The objective of query log settings is to retrieve the query from the data source, process the queries, and create data lineage for (INSERT) for the specified duration. **Various options to configure query log settings include**

Query Log Settings	Description
Query	Displays the queries that are used to retrieve the Query Logs from the schema
Look back Period	This feature allows users to retrieve queries that were processed in the datasource a certain number of days ago. For example, the lookback period can be set to 1 day, two days, or more. The maximum Look back period is seven days.
Include Query Type	Users can include query types such as Select, Insert, Update, and Delete
Exclude Users	In the Query Logs, users have the option to exclude unnecessary users. If a username is specified in the excluded user field, the query log for queries executed by that particular user is excluded from retrieval.
Schema	The user has the option to select a specific schema to apply query log settings. If a user leaves this blank, it will fetch the query log based on all the schema available for that database connection.

### **Export Source Code to File** In certain instances, the parsing of the source code fails, and the lineage is not built. OvalEdge allows users to export selected source codes in a .zip format (password-protected for security). A click starts a job submission, and the file is quickly downloaded. Share this file with the analysis team for further processing, research, and curation in the dedicated lab environment. {% hint style="info" %} The user must set the value to "False" in “is.ovaledge.lab” (System Settings). {% endhint %} ### **Import Source Code from File** Import source code allows the user to import the corrected source code in JSON format to build the lineage. It usually works under the export functionality, and the user uploads the JSON file from the dedicated lab environment. {% hint style="info" %} The user must set the value to “True” in “is.ovaledge.lab” (System Settings). {% endhint %} ### Viewing Source Codes It allows users to access important information for each query on the "Build Auto Lineage" page after they submit and execute a job. This feature gives users detailed insights into creating their data lineage. Here's a close look at the key information for each query.

**Schema Name:** This field shows the code schema in the connector. It helps understand the code's structure and context, ensuring clarity in the data lineage. **Code Name:** The "Code Name" section shows the name of the code. This helps users identify and refer to the code in their lineage quickly. **Lineage Status:** Lineage Status" indicates the data lineage progress. It shows the current status of the query in the application. Various statuses are used to communicate the query's progress, making the lineage creation process transparent and clear. The statuses include: | Status | Parse | Lineage Dashboard | Table Type | | -------------------------------------------- | ------ | ----------------- | ----------------------- | | SUCCESS\_LINEAGE\_BUILD | 100% | YES | Real Table | | SUCCESS\_WITH\_TEMP | 100% | YES | Temp Table | | SUCCESS\_LINEAGE\_DOES\_NOT\_EXIST | 100% | NO | – | | SUCCESS\_LINEAGE\_FIXED | 100% | YES | Real Table | | SUCCESS\_LINEAGE\_PARTIALLY\_BUILD | <100% | YES | Real Table | | SUCCESS\_LINEAGE\_PARTIALLY\_BUILDWITH\_TEMP | <100% | YES | Real Table + Temp Table | | PARSE\_FAILED | FAILED | – | – | | FAILED\_UNSUPPORTED\_QUERY | FAILED | – | – | | SUCCESS\_LINEAGE\_BUILD\_WITH\_OvalEdgeLAB | 100% | YES | Real Table | | SUCCESS\_MANUALLY\_BUILD\_WITH\_OvalEdgeLAB | 100% | YES | Real Table | | SUCCESS\_NO LINEAGE BUT ASSOCIATION | 100% | NO | - | ### Correct Query OvalEdge helps users fix issues with the "Correct Query" option. Author users with Metawrite permission can see the query by clicking the ‘Eye’ icon or correct the query by clicking the tick mark. This is useful for addressing parsing problems or improving the success of the lineage creation process. 1. **View Code:** Users can easily check the code by clicking the ‘Eye’ icon in the "Correct Query" column on the "Build Auto Lineage" page. This shows what's in the code and helps them understand data lineage creation.

2. **Correct Query:** The "Correct Query" option allows users to fix code issues, make changes, and revalidate the code, ensuring accurate and reliable data lineage. This feature is crucial for managing errors, adapting to changes, and maintaining a robust data lineage system.

In the application, users can improve and handle their code using the "Correct Query Screen" feature.

### Operations in Correct Query 1. **Compare Screens:** 1. Shows a side-by-side comparison. 2. The left screen shows the original query. 3. The right screen is a blank workspace for code changes, where users can copy the existing code and make changes as per the requirements. 2. **Reset Code:** The "Reset" option restores the query to the previous version. It undoes validation changes, allowing users to make further adjustments or fix parsing problems. 3. **Copy Query:** Users can copy the original code from the left window and place it in the right window to edit and build lineage. 4. **Replace:** The "Replace" feature lets users replace incorrect strings with correct ones throughout the code with a single click. It will be applied to all the connectors' source codes. 5. **Source Server Type:** Users can specify or change the type of source server from which the data originates. It ensures the ETL process applies the correct parsing rules and connection configurations based on the selected server type. 6. **Format Query:** The "Format Query" operation aligns the query into a structured format for better readability, understanding, and smoother curation. 7. **Validate:** The "Validate" button lets users check the correctness of their edited query, ensuring it follows the required syntax and structure. 8. **Save & Build Lineage:** After validations, click the Save & Build Lineage option to save the edited query and start the lineage-building process. 9. **Result:** This section shows users the outcome of the query parsing process. It displays parsing results and associated logs, giving a clear overview of the code's status. 10. **Build Lineage using Shutter box:** The "Shutter box" feature in OvalEdge allows users to build lineage in various ways:

* **Lineage:** Users can directly select source and destination objects from dropdown menus to build lineage, simplifying data flow definition. * **Relationships:** Users can select schemas, tables, and columns to establish connections between data elements. * **Associations:** Users can select objects to associate with queries, creating meaningful connections between data elements. **Error Message:** When a query can't parse and shows "Parse failed," the "Error Message" section helps. It gives users the error log, showing specific issues to fix. This information allows users to solve parsing errors. ### Viewing Lineage on Code After successfully building lineage, users can access the queries they worked on. Users can view the lineage for these queries in the **Data Catalog > Codes section.**

#### **Code Summary Tab:** This section provides a detailed summary of the query. It includes a business description, a technical description, associated business glossary terms, tags, the top users who have interacted with the query, the last crawled date, and the lineage status. #### **Associations:** The "Associations" tab lists objects directly associated with the query. Understanding these connections is crucial for grasping the query's impact on other data elements in the system. #### **References:** In the "References" tab, users can see a list of objects that refer to the root object. This information helps users understand where and how the object is used across the system. #### **Lineage:** In the "Lineage" tab, users can see how the lineage was built. This includes two viewing options: * Graphical * Tabular ## AI-Augmented Lineage AI-Augmented Lineage enhances the lineage-building process by extending the existing deterministic parsing framework with AI-driven intelligence. This capability improves lineage coverage for complex transformation logic, reduces dependency on manual intervention, and accelerates support for new or evolving data patterns. The feature operates within the existing lineage architecture and integrates seamlessly into the Auto Lineage workflow without altering the core lineage-building mechanisms. ### **AI-Augmented Parsing Behavior** During lineage construction, the application continues to execute standard deterministic parsing for all supported patterns. When the parser encounters unsupported, complex, or unrecognized logic, the system automatically invokes AI-based parsing to infer lineage. This fallback-based approach ensures: * Continuity of lineage generation without failure * Extended coverage for complex transformation logic * Preservation of reliability from deterministic parsing while augmenting it with AI capabilities The AI-based parsing operates only when required and does not override existing parser results. ### **Handling Complex Transformations** AI-Augmented Lineage improves extraction for scenarios that are traditionally difficult to process using rule-based parsing, including: * Nested queries and multi-layer transformations * Complex scripting logic (e.g., M-code in reporting tools) * Non-standard or evolving query patterns This ensures improved completeness of lineage across data pipelines, especially in reporting and ETL systems. ### **Confidence-Based Lineage Processing** All AI-inferred lineage is evaluated using a confidence scoring mechanism. * High-confidence lineage results are automatically merged into the lineage graph * Lower-confidence results are not directly applied and are flagged for review Flagged lineage appears as pending lineage, allowing users to validate and accept it before inclusion. This ensures accuracy while maintaining governance control through a human-in-the-loop approach. ### **Data Security through Pseudonymization** Before processing any source code through AI models, sensitive information is protected using pseudonymization. * Identifiers such as database names, schema names, table names, and credentials are replaced with neutral tokens * AI processing is performed on masked data * After lineage extraction, the system restores original values through a controlled de-pseudonymization process This ensures that no sensitive data is exposed externally while maintaining accurate lineage reconstruction. ### **Zero-Deployment Template Management** AI-Augmented Lineage supports a zero-deployment enhancement model. Administrators can update: * Parsing templates (code templates) * AI prompt templates These templates are version-controlled and stored within the application. Updates can be applied without requiring application redeployment, enabling faster adaptation to new transformation patterns and reducing turnaround time for lineage improvements. ### **Feature Enablement and Configuration** AI-Augmented Lineage is controlled through both global and connector-level configurations. * A system-level setting enables AI-based lineage capabilities across the instance * Connector-level configuration allows selective enablement under the AI Lineage settings This enables phased adoption and controlled rollout across different data sources. ### **Supported Connectors** AI-Augmented Lineage is currently supported for: * RDBMS connectors * Reporting connectors such as Power BI (with a focus on complex transformation logic) Support is continuously expanding to additional connectors and transformation frameworks. ### **Integration with Auto Lineage** AI-Augmented Lineage is integrated into the existing Build Auto Lineage workflow. * AI parsing is triggered automatically during lineage job execution when parsing gaps are identified * Confidence scoring is applied as part of lineage processing * Results are reflected within lineage status and review workflows {% hint style="warning" %} No changes are required to the existing lineage job execution steps. {% endhint %} ### **Impact on Lineage Visualization** AI-generated lineage integrates into existing lineage views: * Accepted lineage becomes part of the standard lineage graph * Pending lineage can be reviewed and validated before inclusion * Lineage visualization continues to support both Summary and Detailed views without structural changes Exploring lineage helps users understand the data's journey thoroughly. For more detailed insights into each field, users can refer to the [Data Catalog Deep Dive Article](/release8.1/data-catalog/data-catalog-overview.md). ## Schedule The lineage process can be scheduled in advance for user convenience. To schedule it, navigate to **Administration → Job Workflow**. Lineage can also be scheduled for new processes or changed queries. For more details on the Schedule, please refer to the [schedule section](https://docs.ovaledge.com/administration/schedule). ## Advanced Job **Refresh Has-Lineage**: This job refreshes the Has Lineage column in the Data Catalog list page. The column indicates whether the object has lineage or not. **Build Dashboard stats**: This job will populate the data into lineage dashboard after migrations or fresh deployments. For more details on the Advanced Job, please refer to the [Advanced Jobs](https://docs.ovaledge.com/administration/advanced-jobs). *** Copyright © 2025, OvalEdge LLC, Peachtree Corners, GA, USA. --- # Agent Instructions This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com. ## Querying This Documentation If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question. Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter: ``` GET https://docs.ovaledge.com/release8.1/advanced-tools/build-auto-lineage/ways-of-building-auto-lineage.md?ask=&goal= ``` `ask` is the immediate question: it should be specific, self-contained, and written in natural language. `goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal. The response will contain a direct answer to the question and relevant excerpts and sources from the documentation. Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.