Amazon DocumentDB
Document DB stores data in flexible, JSON-like documents, meaning that fields can vary from document to document and the data structure can change over time.
NOTE: You need an SSL certificate in your Java environment to connect to Document DB.
Please refer to the following document before creating a connection in OvalEdge.
https://docs.google.com/document/d/1nxgdbdyncCtrZFcbyGKzk2BoiWbBFUoaTvdexzz2xC8/
OvalEdge uses the Document DB (Mongo) Client API to connect to a running Document DB instance.
OvalEdge Crawling: Crawling is the process of collecting information about data from various data sources such as on-premise and cloud databases, Hadoop, visualization software, and file systems.
When an OvalEdge crawler connects to a data source, it collects and catalogs all the data elements (i.e., metadata) and stores them in the OvalEdge data repository. The crawler creates an index for every stored data element, which the smart search in the OvalEdge Data Catalog later uses for data exploration.
The OvalEdge crawlers can be scheduled to scan the databases regularly, so they always have an up-to-date index of the data elements.
Data Sources: The OvalEdge crawler integrates with various data sources to help users extract metadata and build a data catalog.
This document provides information about how to make a connection to your Document DB instance and crawl the data from various workspaces.
Connect to the Data: Before crawling and building a data catalog, you must first connect to your data. OvalEdge requires users to configure a separate connection for each type of data source. Users must enter the source credentials and database information for each type of connectivity. Once a data connection is made, a single click of the Crawl button starts the crawling process.
Connector Capabilities
Connectivity to Document DB is established via the Document DB (Mongo) Client API. The connector currently uses the following driver/API versions:

| Driver/API | Version | Details |
|---|---|---|
| Mongodb driver (DocumentDB) | 3.12.5 | https://mvnrepository.com/artifact/org.mongodb/mongo-java-driver/3.12.5 Note: Latest version is 3.12.8 |
| sql-to-mongo-db-query-converter | 1.11 | |
Technical Specifications
Crawling
| Feature | Supported Objects | Remarks |
|---|---|---|
| Crawling | Tables | |
| | Table Columns | Supported Data Types: String, Integer, Boolean, Double, Timestamp, Object, Date, Object ID, Binary Data. |
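
Because fields can vary from document to document, the set of table columns has to be derived from the documents themselves. The following is a minimal sketch of that idea, not OvalEdge's actual crawler logic. The class `ColumnInferenceSketch`, its methods, and the use of plain `Map` objects in place of driver documents are assumptions made for illustration, and only a subset of the data types listed above is mapped.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: derive columns and data types from documents whose fields vary.
final class ColumnInferenceSketch {

    // Maps a Java value to one of the type names from the table above (subset only).
    static String typeName(Object value) {
        if (value instanceof String) return "String";
        if (value instanceof Integer) return "Integer";
        if (value instanceof Boolean) return "Boolean";
        if (value instanceof Double) return "Double";
        if (value instanceof Date) return "Date";
        if (value instanceof byte[]) return "Binary Data";
        if (value instanceof Map) return "Object";
        return "Unknown";
    }

    // Returns every field name seen in any document, with the set of types observed for it.
    static Map<String, Set<String>> inferColumns(List<Map<String, Object>> documents) {
        Map<String, Set<String>> columns = new LinkedHashMap<>();
        for (Map<String, Object> doc : documents) {
            for (Map.Entry<String, Object> field : doc.entrySet()) {
                Set<String> types = columns.computeIfAbsent(field.getKey(), k -> new TreeSet<>());
                // A null value registers the column but contributes no type.
                if (field.getValue() != null) {
                    types.add(typeName(field.getValue()));
                }
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "Ada");
        first.put("age", 36);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("name", "Bob");
        second.put("active", true);
        System.out.println(inferColumns(Arrays.asList(first, second)));
    }
}
```

A field that appears with different types in different documents ends up with more than one type name, which is one reason a crawler may need to sample several documents per collection.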
Profiling
| Feature | Support | Remarks |
|---|---|---|
| Table Profiling | Row count, Columns count, View sample data | Supports all data types |
| Column Profiling | Min, Max, Null count, distinct, top 50 values | |
| Full Profiling | Supported | |
| Sample Profiling | Supported | |
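
The column profiling statistics listed above (min, max, null count, distinct count, and most frequent values) can be illustrated with a short sketch. This is an assumption-laden example, not the OvalEdge profiler: the class `ColumnProfileSketch` and its `profile` method are hypothetical, values are compared by their natural ordering, and ties in frequency are broken by value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of column profiling over an in-memory list of column values.
final class ColumnProfileSketch<T extends Comparable<T>> {
    final T min;
    final T max;
    final long nullCount;
    final int distinctCount;
    final List<T> topValues;

    private ColumnProfileSketch(T min, T max, long nullCount, int distinctCount, List<T> topValues) {
        this.min = min;
        this.max = max;
        this.nullCount = nullCount;
        this.distinctCount = distinctCount;
        this.topValues = topValues;
    }

    // Computes the statistics in one pass, then ranks values by frequency.
    static <T extends Comparable<T>> ColumnProfileSketch<T> profile(List<T> values, int topN) {
        T min = null;
        T max = null;
        long nulls = 0;
        Map<T, Long> counts = new HashMap<>();
        for (T value : values) {
            if (value == null) {
                nulls++;
                continue;
            }
            if (min == null || value.compareTo(min) < 0) min = value;
            if (max == null || value.compareTo(max) > 0) max = value;
            counts.merge(value, 1L, Long::sum);
        }
        // Most frequent first; equal counts are ordered by value for a stable result.
        List<Map.Entry<T, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<T> top = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < topN; i++) {
            top.add(entries.get(i).getKey());
        }
        return new ColumnProfileSketch<>(min, max, nulls, counts.size(), top);
    }

    public static void main(String[] args) {
        // 50 mirrors the "top 50 values" entry in the table above.
        ColumnProfileSketch<Integer> p = profile(Arrays.asList(3, 1, null, 3, 2, null, 3, 1), 50);
        System.out.println(p.min + " " + p.max + " " + p.nullCount + " " + p.distinctCount + " " + p.topValues);
    }
}
```

In sample profiling, the same computation would presumably run over a limited number of rows rather than the full collection.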
Lineage Building
| Feature | Remarks |
|---|---|
| Table Lineage | Not supported |
| Column Lineage | Not supported |
Querying
| Operation | Remarks |
|---|---|
| Select | Supported |
| Insert | Not supported, by default. |
| Update | Not supported, by default. |
| Delete | Not supported, by default. |
| Joins within Database | Not supported |
| Joins outside Database | Not supported |
| Aggregations | Supported |
| Group By | Supported |
| Order By | Supported |
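
The table above amounts to a read-only query policy: Select statements (including aggregations, Group By, and Order By) are accepted, while Insert, Update, and Delete are rejected by default. A minimal sketch of such a check is shown here. The class `ReadOnlyQueryGuardSketch` is hypothetical and only inspects the first keyword of the statement; how OvalEdge actually enforces this is not described in this document.

```java
import java.util.Locale;

// Hypothetical sketch: accept only statements whose first keyword is SELECT.
final class ReadOnlyQueryGuardSketch {

    static boolean isAllowed(String sql) {
        if (sql == null) {
            return false;
        }
        // Take the first whitespace-delimited token, ignoring case.
        String firstToken = sql.trim().split("\\s+", 2)[0].toLowerCase(Locale.ROOT);
        return firstToken.equals("select");
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("SELECT city, COUNT(*) FROM customers GROUP BY city"));
        System.out.println(isAllowed("DELETE FROM customers"));
    }
}
```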
Connection Details
Pre-requisites
To use the Amazon Document DB Connector, the details specified in the following section should be available.
An admin/service account for Crawling and Profiling.
The minimum privileges required for a cluster user are:

| Operation | Access Permission |
|---|---|
| Connection Validation | Read Any Database |
| Crawl Schemas | Read Any Database |
| Crawl Tables | Read Any Database |
| Profile Schemas, Tables | Read Any Database |
To connect to the Amazon Document DB database using the OvalEdge application, complete the following steps.
1. Log in to the OvalEdge application.
2. In the left menu, click the Administration module. The sub-modules associated with Administration are displayed.
3. Click the Crawler sub-module. The Crawler Information page is displayed.
4. On the Crawler Information page, click the + icon. The Manage Connection with Search Connector pop-up window is displayed.
5. In the Manage Connection pop-up window, select Amazon Document DB as the connection type. The Manage Connection pop-up window with Amazon Document DB specific details is displayed.
6. The following field attributes are required for the Amazon Document DB connection.
| Property | Details |
|---|---|
| Connection Type | Amazon Document DB |
| License Type | Standard |
| Connection Name | Enter a connection name for the Amazon Document DB database. The name you specify is a reference name that identifies the Amazon Document DB database connection in OvalEdge. Example: Amazon Document DB1 |
| Cluster Endpoint | Document DB cluster URL. Example: 13.59.52.223:27017 |
| Port | 27017. Note: This value may differ in your environment. |
| Database | Admin. Note: This value may differ in your environment. |
| Username | User account login credentials |
| Password | User's password |
| JAVA Home Path | Enter the Java Home Path. Example: C:\Program Files\Java\jdk1.8.0_333\ (if there is no JDK, the Java Home Path can be C:\Program Files\Java\ instead) |
| Connection String | |
| Plugin Server | Optional |
| Plugin Port | Optional |
| Default Governance Roles | Select the required governance roles for the Steward, Custodian, and Owner. |
| No of Archive Objects | Enter the count for the archive objects. |
7. After entering the connection details in the required fields, click the Validate button. Once the entered details are validated, the Save and Save & Configure buttons are enabled.
8. Click the Save button to establish the connection, or click the Save & Configure button to establish the connection and configure its settings. Clicking Save & Configure displays the Connection Settings pop-up window, where you can configure the connection settings for the selected connector. The Save & Configure button is displayed only for connectors that require settings configuration.
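
To show how the connection fields above relate to one another, the sketch below assembles them into a standard MongoDB-style connection URI with SSL enabled. It is an illustration under stated assumptions, not the string OvalEdge builds internally: the class `DocumentDbUriSketch`, the sample credentials, and the choice to pass host and port separately are all hypothetical. The `ssl=true` option reflects the SSL requirement noted at the top of this document.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch: combine the connection fields into a MongoDB-style URI.
final class DocumentDbUriSketch {

    // Percent-encodes a value so characters such as '@' or ':' are safe in the user-info part.
    static String encode(String value) {
        try {
            // URLEncoder emits '+' for spaces; a URI needs %20 instead.
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    static String buildUri(String user, String password, String host, int port, String database) {
        return "mongodb://" + encode(user) + ":" + encode(password)
                + "@" + host + ":" + port + "/" + database + "?ssl=true";
    }

    public static void main(String[] args) {
        // Placeholder credentials; the host and port come from the example in the table above.
        System.out.println(buildUri("svc_user", "p@ss word", "13.59.52.223", 27017, "admin"));
    }
}
```

With the Mongo Java driver on the classpath, a URI of this form could be passed to `MongoClients.create(uri)`. The SSL certificate must already be trusted by the Java environment for the handshake to succeed.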
Crawler/Profiler Settings
Once connectivity is established, additional configurations for crawling and profiling can be specified:
Settings
| Property | Details |
|---|---|
| Order | Priority of the rule |
| Start Time and End Time | Used when crawling/profiling is to be scheduled |
| No. of Threads | No. of threads used to perform profiling |
| Profile Type | Disabled/Auto/Sample |
| Row Count Constraint | No. of rows to be fetched |
| Row Count Limit | The maximum limit of rows to be fetched |
| Sample Profile Size | Sample profile row limit |
| Sample Data Count | Sample count of the data |
| Query Timeout | Time to wait for response |
| Crawler Options | Only Tables can be crawled |
| Crawler Rules | Only Table and Columns Include and Exclude Regex. |
Note: In the Crawler Rules, the include and exclude regex options do not apply to functions and procedures, because these objects do not exist in Document DB.
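
As an illustration of how include and exclude regex rules can narrow the set of crawled tables, consider the following sketch. It assumes that a name must fully match the include pattern and must not fully match the exclude pattern. Whether OvalEdge uses full or partial matching is not stated here, and the class `CrawlerRuleSketch` and the sample table names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: filter table (or column) names with include and exclude regex rules.
final class CrawlerRuleSketch {

    // Keeps names that fully match includeRegex and do not fully match excludeRegex.
    // A null excludeRegex means nothing is excluded.
    static List<String> select(List<String> names, String includeRegex, String excludeRegex) {
        Pattern include = Pattern.compile(includeRegex);
        Pattern exclude = excludeRegex == null ? null : Pattern.compile(excludeRegex);
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            boolean included = include.matcher(name).matches();
            boolean excluded = exclude != null && exclude.matcher(name).matches();
            if (included && !excluded) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("orders", "orders_tmp", "customers", "audit_log");
        System.out.println(select(tables, "orders.*|customers", ".*_tmp"));
    }
}
```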
Copyright © 2025, OvalEdge LLC, Peachtree Corners GA USA

