AI Enrichments
Why AI Enrichment Matters
Exploratory analysis provides answers, but deeper insight often requires creating new columns, classifying values, or enriching existing data with advanced intelligence. Traditionally, these tasks involve writing formulas, scripts, or running models in external tools.
askEdgi integrates these capabilities directly into the Workspace through AI Functions. Using natural language prompts, new calculated columns can be created, records classified, or text data analyzed without the need for external scripting or additional tools.
What Can Be Done
AI Functions provide the ability to:
Create calculated columns (e.g., profit margin percentage)
Perform classifications (e.g., High/Medium/Low risk)
Apply text analysis techniques, including sentiment analysis, intent detection, emotion recognition, and classification, to text data

Use Case & Real-Life Scenario
Continuing the return analysis, a product manager explores severity levels using AI enrichment:
“Create a new column for return_rate = (returns/orders) * 100.”
askEdgi generates a calculated column return_rate for each product.

“Classify products into High, Medium, or Low return categories based on return_rate (High: >30%, Medium: 10–30%, Low: <10%).”
askEdgi creates a new column, return_category, to highlight product risk levels.
“Perform sentiment analysis on customer_reviews.”
askEdgi enriches the dataset with a sentiment_label column (Positive, Neutral, Negative).
By combining return patterns with sentiment insights, the analysis reveals that products with negative sentiment also report the highest return rates, signaling potential quality issues.
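The two calculated columns in this scenario can be sketched in plain Python. This only illustrates the arithmetic and thresholds stated in the prompts, not how askEdgi computes them internally. The sample rows, the helper function names, and the handling of products with zero orders are assumptions made for the example.

```python
def return_rate(returns: int, orders: int) -> float:
    """Return rate as a percentage: (returns / orders) * 100."""
    if orders == 0:
        return 0.0  # assumption: a product with no orders counts as 0% returns
    return returns * 100 / orders


def return_category(rate: float) -> str:
    """Bucket a return rate using the thresholds from the prompt."""
    if rate > 30:
        return "High"
    if rate >= 10:
        return "Medium"
    return "Low"


# Hypothetical sample rows; the product names and counts are made up.
products = [
    {"product": "A", "orders": 200, "returns": 70},
    {"product": "B", "orders": 150, "returns": 30},
    {"product": "C", "orders": 400, "returns": 20},
]

for row in products:
    row["return_rate"] = return_rate(row["returns"], row["orders"])
    row["return_category"] = return_category(row["return_rate"])
```

With these sample rows, product A falls in the High category, B in Medium, and C in Low.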
Types of AI Enrichments
Prompt Analysis: Evaluates the clarity and effectiveness of user-generated prompts to ensure accurate results.
Example: Generate a new column comparing income in the dataset against the average income to provide deeper insight into earning levels.
Sentiment Analysis: Classifies text data into Positive, Neutral, or Negative categories.
Example: Analyze customer reviews, browsing history, and purchase records to gain insights into customer behavior.
Intent Analysis: Identifies underlying intent in textual data, classifying it into predefined categories.
Example: Detect intent in customer support or compliance interactions.
Emotion Analysis: Detects emotional tones in text for better understanding of customer experiences.
Example: Assess emotions in product reviews or support conversations.
Text Classification: Categorizes text into domain-specific classes such as fraud detection or spam filtering.
Example: Apply classification models to incoming emails or financial records to automate data analysis and processing.
Proofreading: Identifies grammatical, clarity, and structural issues to ensure professional communication.
Example: Refine business documents or product descriptions for improved readability.

Availability
Public – Available (AI column creation and enrichment)
SaaS – Available (full AI functions)
On-Prem – Limited (Metadata analytics only)
Retrieval Augmented Generation (RAG) in askEdgi
Retrieval Augmented Generation (RAG) in askEdgi refers to answering questions by first retrieving trusted enterprise context and then generating responses based on that context.
askEdgi is not a generic AI chatbot.
It does not rely on assumptions or general knowledge.
Instead, askEdgi:
Understands business terms defined by the organization
Knows what data exists and how it is governed
Considers structural relationships between datasets
Uses metadata, lineage, and contextual statistics
Suggests relevant and trusted assets before execution
Explains answers using enterprise context
This approach ensures that responses are grounded in how the organization understands and manages its data.
Business Value of RAG
Business users frequently need answers to questions such as:
Which dataset should be used for a metric?
What does a business term mean?
Where does a number originate?
Why do different reports show different values?
Which datasets should be combined?
These questions require understanding business meaning, structural relationships, and governance alignment.
Retrieval Augmented Generation ensures:
Accurate answers
Consistent interpretation
Connected datasets
Trustworthy analysis
Enterprise Context Used by RAG
The enterprise context retrieved by askEdgi includes the following elements.
Business glossary – Approved business definitions and terminology
Curated datasets – Trusted and governed data assets
Governance information – Ownership, classification, and access controls
Metadata and documentation – Business and technical descriptions
Dataset relationships – Structural connections between assets
Lineage information – Data movement and dependency paths
Contextual statistics – Sample characteristics and value distributions
Top values – Frequently occurring values used for interpretation
Sample statistics and top value summaries improve interpretation but remain secondary to governed metadata and definitions.
askEdgi Modes and the Role of RAG
askEdgi operates in two clearly defined modes to support different user needs. Each mode has a clear purpose and boundary that ensures trust and predictability.
Analysis Mode
Analysis Mode is the default and primary mode. It supports the complete journey from understanding a question to generating insights.
Purpose of Analysis Mode
Analysis Mode supports the following activities:
Understanding a question
Identifying the correct data
Validating how datasets relate
Performing analysis
Receiving business-aligned explanations
Analysis Mode supports the complete journey from discovery to insight without switching between guidance and execution modes.
Understand RAG Usage in Analysis Mode
In Analysis Mode, Retrieval Augmented Generation performs the following functions:
Interpret the business meaning behind a question
Retrieve relevant glossary definitions
Surface business and technical descriptions of assets
Evaluate asset metadata and documentation
Analyze relationships between datasets
Eliminate unrelated or disconnected tables
Confirm that selected datasets combine correctly
Retrieval Augmented Generation ensures reasoning in a business context before execution begins.
Understand Relationship Aware Intelligence
When a question spans multiple datasets, askEdgi performs structural validation.
askEdgi performs the following actions:
Confirm that selected assets are structurally connected
Avoid combining unrelated datasets
Suggest additional related assets only when required
Limit context expansion to what is necessary
This prevents:
Incorrect joins
Over-selection of irrelevant tables
Misleading analysis
Loss of trust
Datasets are validated as part of a connected data ecosystem rather than isolated objects.
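One way to picture the structural validation described above is a connectivity check over a graph of dataset relationships. The sketch below is an assumption about how such a check could work, not askEdgi's actual implementation. The function name `are_connected` and the sample relationships are hypothetical.

```python
from collections import deque


def are_connected(selected, relationships):
    """Return True if every selected dataset can be reached from any other
    through relationships that involve only selected datasets."""
    selected = set(selected)
    if len(selected) <= 1:
        return True
    # Build an undirected adjacency map restricted to the selected datasets.
    neighbors = {name: set() for name in selected}
    for left, right in relationships:
        if left in selected and right in selected:
            neighbors[left].add(right)
            neighbors[right].add(left)
    # Breadth-first search from an arbitrary starting dataset.
    start = next(iter(selected))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbors[current] - seen:
            seen.add(other)
            queue.append(other)
    return seen == selected


# Hypothetical catalog relationships (for example, foreign-key links).
relationships = [("orders", "customers"), ("orders", "returns")]
```

In this sketch, `customers` and `returns` are not connected on their own because they only relate through `orders`. That is the kind of case where suggesting one additional related asset would be required.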
Understand Workspace First Execution
Analysis Mode respects the Workspace as the execution boundary.
Execution rules are as follows:
If tables are pinned, analysis is restricted to the pinned tables
If tables are not pinned, eligible workspace tables are considered
Additional catalog assets are surfaced only when necessary
Data outside the intended scope is not analyzed
This ensures controlled execution.
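The pinning rules above amount to a simple scope selection. The following sketch assumes every workspace table counts as eligible, since the eligibility criteria are not spelled out here. The function name `execution_scope` is hypothetical.

```python
def execution_scope(workspace_tables, pinned_tables):
    """Pick the tables an analysis may read.

    If any workspace tables are pinned, only those are in scope.
    Otherwise all workspace tables are considered (assumption: all of
    them are eligible).
    """
    pinned = set(pinned_tables)
    in_scope = [table for table in workspace_tables if table in pinned]
    return in_scope if in_scope else list(workspace_tables)
```

For example, `execution_scope(["orders", "returns", "reviews"], ["returns"])` limits the analysis to `returns`, while an empty pin list leaves all three tables in scope.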
Discovery Mode
Discovery Mode supports structured exploration of the Data Catalog. This mode does not use RAG and does not perform analysis execution.
Purpose of Discovery Mode
Discovery Mode supports:
Asset browsing
Data availability validation
Metadata understanding
Documentation review
Understand Discovery Mode Behavior
In Discovery Mode, askEdgi performs the following actions:
Retrieve assets from the catalog
Surface business descriptions and technical documentation
Apply governance-aware filters
Return metadata and definitions
RAG-based reasoning does not occur in this mode; only Analysis Mode uses RAG. Discovery Mode provides clarity without execution.
How askEdgi Finds the Right Context in Analysis Mode
The following sequence describes how askEdgi retrieves context and prepares for execution.
Step 1: Determine Workspace Dependency
askEdgi evaluates whether the request requires existing workspace data.
Workspace data is required when the request:
Reads existing tables
Computes metrics from data
References workspace objects
Validates schemas
Workspace data is not required when the request:
Requests an example
Requests sample SQL or Python code
Requires logical reasoning without data
Creates new structures without referencing existing data
This separation improves clarity and efficiency.
Step 2: Evaluate Existing Workspace Context
askEdgi checks whether sufficient context already exists within the Workspace.
If sufficient context exists, search expansion does not occur.
Step 3: Enrich Business Understanding
When additional clarity is required, askEdgi retrieves:
Glossary definitions
Asset descriptions
Metadata context
This ensures the correct interpretation of business intent.
Step 4: Suggest Relevant Assets
When necessary, askEdgi identifies additional datasets aligned with business intent.
Only governed and relevant assets are considered.
Step 5: Validate Dataset Compatibility
Before execution, askEdgi confirms:
Required attributes exist
Datasets are structurally connected
Necessary relationships are available
Required elements are complete
If validation fails, execution does not proceed.
askEdgi stops and requests clarification instead of executing with incomplete data.
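The attribute check in Step 5 can be illustrated with a small validation gate that reports what is missing instead of proceeding. This is a minimal sketch under assumed data shapes: the `ValidationResult` class, the `validate_before_execution` function, and the sample schemas are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    ok: bool
    problems: list = field(default_factory=list)


def validate_before_execution(required_columns, schemas):
    """Check that every required column exists in its table's schema.

    required_columns maps table name -> columns the analysis needs.
    schemas maps table name -> columns that actually exist.
    """
    problems = []
    for table, columns in required_columns.items():
        if table not in schemas:
            problems.append(f"table '{table}' is not available")
            continue
        for column in columns:
            if column not in schemas[table]:
                problems.append(f"column '{table}.{column}' is missing")
    return ValidationResult(ok=not problems, problems=problems)


# Hypothetical workspace schemas.
schemas = {"orders": ["order_id", "product_id"], "returns": ["order_id"]}
result = validate_before_execution(
    {"orders": ["order_id"], "returns": ["order_id", "reason"]}, schemas
)
```

Here `result.ok` is false because `returns.reason` does not exist, so a caller would ask for clarification rather than run the analysis.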
Step 6: Execute Analysis
Execution occurs only after:
Context is sufficient
Relationships are confirmed
Required data elements exist
Execution is intentional and validated.
How askEdgi Handles Missing or Incomplete Information
askEdgi stops intentionally to ensure accurate and trustworthy results.
askEdgi stops when:
Required data is missing
Structural compatibility cannot be confirmed
Business context is unclear
Confidence is insufficient
askEdgi does not:
Guess
Partially execute
Assume schema
This results in predictable and trustworthy outcomes.
RAG Trust in Analysis Mode
The RAG framework in askEdgi relies on controlled enterprise grounding.
RAG uses the following information sources:
Curated business descriptions
Technical documentation
Structured metadata
Relationship context between assets
Lineage information
Contextual data statistics, such as sample data characteristics
Top 50 values
Contextual data statistics and top 50 values improve relevance and interpretation. Governed metadata and business definitions remain the primary reference.
askEdgi enforces the following controls:
Respect governance and access rules
Validate structural compatibility before dataset combination
Confirm required attributes before execution
Stop when information remains incomplete
Maintain clear separation between discovery and execution
This layered grounding ensures business-aligned, structurally valid, and explainable responses.
RAG Limitations
Certain behaviors remain intentionally restricted to maintain trust.
askEdgi avoids:
Guessing or fabricating answers
Ignoring governance controls
Combining unrelated datasets
Excessive expansion across the data ecosystem
Execution with incomplete schema validation
Restraint remains a core system principle.
Business Impact
Organizations gain the following benefits:
Faster and safer data discovery
Reduced dependency on technical teams
Fewer incorrect dataset combinations
Strong structural validation before analysis
Higher trust in analytics and reporting
Streamlined workflows without mode confusion
Better alignment between business and data teams
askEdgi serves as a reliable entry point to enterprise data knowledge and trusted analysis.
Summary
RAG forms the foundation that makes askEdgi:
Context-aware instead of generic
Relationship-aware instead of isolated
Schema-aware instead of assumptive
Dependency-aware instead of speculative
Trusted instead of uncertain
Business aligned instead of technically driven
Clear separation between Analysis Mode and Discovery Mode with structured validation before execution ensures intentional, explainable, and trustworthy interactions.
Code Explanation Panel in askEdgi
The Code Explanation Panel is a new feature that provides natural language summaries of the SQL and Python code used to generate askEdgi results. This feature improves accessibility for non-technical users and increases transparency in result generation.
Accessing the Code Explanation Panel
Open the Code View for a query or analysis result.
Click on the Explanation tab, positioned next to the existing Code and Copy options.
askEdgi automatically generates a concise, human-readable description of the code logic.

Functionality
SQL Example: “This query retrieves the total sales for each region in 2023 and ranks them by revenue.”
Python Example: “This script converts unstructured balance sheet text into a structured DataFrame for two fiscal years.”
Explanations are concise and reflect the logic of the code.
Users can toggle Show More / Show Less to expand or collapse longer explanations.
The explanation auto-refreshes whenever the code changes or is re-run.
Performance & UX
A loading indicator is shown while the explanation is being generated.
Explanations are cached for the session to improve response time on repeated views.
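Session caching of explanations could work roughly as sketched below: the cache is keyed by a hash of the code text, so a repeated view is served from memory and changed code triggers regeneration. This is an assumption about the mechanism, not the documented implementation, and it does not model the refresh on re-run. `ExplanationCache` and `fake_explain` are hypothetical names, and `fake_explain` stands in for the real explanation service.

```python
import hashlib


class ExplanationCache:
    """Session-scoped cache keyed by a hash of the code text."""

    def __init__(self, explain):
        self._explain = explain  # callable: code string -> explanation string
        self._cache = {}

    def get(self, code: str) -> str:
        key = hashlib.sha256(code.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Cache miss: the code is new or has changed, so regenerate.
            self._cache[key] = self._explain(code)
        return self._cache[key]


calls = []


def fake_explain(code):
    """Stand-in for the explanation service; records each invocation."""
    calls.append(code)
    return f"Explains {len(code)} characters of code."


cache = ExplanationCache(fake_explain)
cache.get("SELECT 1")
cache.get("SELECT 1")  # repeated view: served from the cache
cache.get("SELECT 2")  # changed code: regenerated
```

After these three lookups, `fake_explain` has been called only twice, once per distinct piece of code.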
Error Handling
If the explanation cannot be generated (e.g., API error, timeout, unsupported code format), the following message is displayed:
“Explanation could not be generated. Please try again or refresh.”
For large or multi-step Python scripts, explanations are summarized in chunks (e.g., function-level or step-wise).
Users can optionally view a Detailed Explanation for step-wise logic.
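Function-level chunking of a Python script can be approximated with the standard `ast` module, as in the sketch below. How askEdgi actually splits scripts is not documented here, so treat this as one plausible approach. The function name `function_chunks` and the sample script are hypothetical.

```python
import ast


def function_chunks(source: str):
    """Split a script into (function name, source text) pairs, one per
    top-level function, so each chunk can be explained separately."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks


# Hypothetical two-step script used only to demonstrate the split.
script = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.splitlines()
'''

chunks = function_chunks(script)
```

Each chunk can then be passed to the explanation step on its own, which keeps individual summaries short for long scripts.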
Intelligent Query Source Detection in askEdgi
askEdgi supports Intelligent Query Source Detection, enabling automatic optimization of where a query is executed. Instead of requiring all data to be ingested into the workspace, the system can determine whether a query should run within the workspace engine or be executed directly on the original data source, such as a database or data warehouse.
This capability improves performance, reduces unnecessary data movement, and supports efficient analysis for large or real-time datasets.
Why Intelligent Source Detection Matters
Previously, all datasets needed to be fully ingested into the workspace before analysis could begin. This approach could be inefficient when working with:
Large datasets
Live enterprise databases
Real-time or frequently updated data
Data warehouses such as Snowflake
With Intelligent Source Detection, askEdgi removes this limitation by dynamically selecting the most appropriate execution environment.
How Intelligent Query Execution Works
When a user submits a query or analytical request, askEdgi automatically evaluates:
Where the relevant data resides
Whether execution is more efficient in the workspace or at the source
Performance, scale, and execution feasibility
Based on this evaluation, askEdgi chooses one of the following execution paths:
Workspace Execution
Queries run inside the askEdgi workspace engine when data is already ingested or best suited for in-workspace processing.
Source Execution
Queries are pushed directly to the original source system, such as a data warehouse, when execution outside the workspace is more efficient.
This ensures faster response times, reduced resource consumption, and improved scalability.
Live Source Query Mode
Live Source Query Mode allows askEdgi to execute SQL queries directly on supported source systems instead of ingesting data into DuckDB. This enables real-time analytics, minimizes data duplication, and supports environments where data movement is restricted.
When enabled, Live Source becomes the default data querying mode, and all newly added tables are placed under the Live Source section for direct execution.
Connector Configuration - Live Source Checkbox
A Live Source checkbox is available in the askEdgi settings for supported connectors (e.g., Snowflake).
Cached Mode (Default)
Tables are ingested into DuckDB
Full AI enrichment, transformations, recipes, and cross-source joins supported
Live Query Mode
SQL executes directly on the source system
No data is copied into DuckDB
AI enrichment, transformations, and cross-source joins are disabled
Only tables from the same live connector can be queried together
Live Connections in the Workspace
When Live Query Mode is enabled:
A Live Connections section appears in the workspace
Tables added from the source catalog display a Live indicator
No ingestion into DuckDB occurs
Live tables remain queryable directly on the source
Pinning Rules (Execution)
Cached (Imported) Table – ✅ Allowed
Live Table – ❌ Not Allowed
Live Table Pin Attempt Behavior
A blocking popup is shown:
Title: Live Table Execution Not Supported
Message:
This table is queried directly from the source system and cannot be pinned for execution. To analyze this data using AskEdgi features, move the table to Imported Data.
Actions:
Move to Imported Data
Cancel
Hybrid Execution Blocking (Live + Cached)
If a query references both Live and Cached tables, execution is blocked before SQL generation.
System Message (Chat)
⚠️ Mixed Data Sources Detected
AskEdgi cannot analyze Live and Imported data together. To continue, move the Live table into Imported Data so both tables run in the same engine.
CTA: Move Live Table to Imported Data
Move-to-Cache Workflow
Users may explicitly move a Live table into Cached mode.
Flow
User confirms move
Table is removed from Live Source
Table is ingested into DuckDB
Progress indicator shown
On success, pinning becomes available
User is prompted to rerun query
AI & Feature Limitations for Live Tables
Live tables do not support:
AI enrichment
Transformations
Calculated columns
DDL / DML operations
Cross-connector joins
Hybrid execution
Disabled features display tooltips explaining the limitation.
SQL Execution Routing Logic
All tables cached – DuckDB
All tables live (same connector) – Source System
Mixed Live + Imported – ❌ Blocked
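The routing rules can be expressed as a small decision function. This is a sketch of the stated rules only, with an assumed input shape of `(mode, connector)` pairs. The function name `route_query` and the return labels are hypothetical.

```python
def route_query(tables):
    """Decide where a query runs.

    tables is a list of (mode, connector) pairs, where mode is "cached" or
    "live" and connector names the source system for the table.
    """
    modes = {mode for mode, _ in tables}
    if modes == {"cached"}:
        return "duckdb"
    if modes == {"live"}:
        connectors = {connector for _, connector in tables}
        # Live tables can only be combined within a single connector.
        return "source" if len(connectors) == 1 else "blocked"
    return "blocked"  # mixed Live + Imported tables (or no tables at all)
```

For example, two live Snowflake tables route to the source system, while a live table combined with a cached one is blocked before any SQL is generated.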
Copyright © 2025, OvalEdge LLC, Peachtree Corners, GA USA