Data Discovery
Last updated: 2026-02-05 11:18:21
Data Discovery is the unified search and entry portal for global data assets in WeData. By integrating metadata from multiple data sources, it gives users a platform-level capability to quickly search, understand, and evaluate data assets. To open it, click Data Assets on the data governance page, then select Data Discovery in the left sidebar.


Search Box

The search bar is at the top of the Data Discovery page, where you can enter keywords and reset the search as needed. Next to the search bar, click Recently Viewed to display your 10 most recent browsing records, then click a record to open its details page.

Search Area

The search area is on the left side of the Data Discovery page and covers all released and unreleased assets in the asset inventory. You can refine search results by data source type, data source, database, and other filters.
The filter options are: asset catalog, data source type, data source, database, owner, associated project, asset grade, data warehouse layering, asset tag, asset status, and search scope.

Search Result

Search results list the related data assets sorted by match rate. Each data table entry shows information such as data source type, owner, asset catalog, tag, and field, and provides features such as viewing lineage, requesting permissions, and adding or removing favorites.
Data table search results
Data tables are displayed according to their match rate, with the search keywords highlighted. You can filter the results to tables you are responsible for or have added to favorites, and sort them by table name or popularity.
Results can be displayed in card view or table view; card view is the default. Click Card View to switch to "Table View". In Settings, you can adjust the displayed fields for easier search and inspection, and choose between side-sliding and window navigation.

Metric search results (To be launched)
Metrics are displayed according to their match rate, with the search keywords highlighted. You can filter the results to metrics you are responsible for or have added to favorites, and sort them by table name or popularity.
Results can be displayed in card view or table view; card view is the default. Click Card View to switch to "Table View". In Settings, you can adjust the displayed fields for easier search and inspection.
Click a table name or metric name to open the table details page or metric details page.


Table Detail

Note:
Since data sources differ in support level, not all types include the following features. Please refer to the actual display result on the webpage.
Under each data governance feature, click the name of the table you want to view to open its table details page. Depending on the data source, the table details page contains business information, technical information, asset score, basic information, data preview, outputs and changes, data lineage, data temperature, partition information, data quality, access log, and note.

Business information
Shows the asset catalog, tags, asset status, importance level, release date, associated project, asset owner, and data warehouse layering of the current data table.
To modify business information, click "Modify Business Information" at the top right of the page and make the changes in the popup.
Technical information
Displays the data source type, data source, database, collection task, engine Owner, engine ID, table type, storage capacity, storage path, lifecycle, recent data, DDL changes, creation time, and metadata source of the current data table.
Basic Information
Provides the edit, view DDL, and view Select features, and displays the field name, field type, field description, instructions, security level, and security category.

Partition Information
Partition information contains partition field information and partition details.

Data preview
Previews the table content, displaying up to the first 5 data entries. Preview data is updated on a T+1 basis.

Outputs and changes
Output: Filter by output task or instance time. Displays the task ID, execution count, scheduled time, start time, output time, execution duration, and production time consumption.
Changes: Displays the change records of the table over the past 30 days, including the change time, change type, change log, operator, and number of affected tables.
Lineage
WeData lineage shows the complete data flow chain under the root account across all projects, including data source, destination, and associated tasks. The lineage feature provides table/field-level lineage and impact analysis, covering inter-table lineage of formal data tables used in tasks.

The lineage feature mainly displays upstream and downstream data flows and impact analysis for central tables/fields, defaulting to direct first-level upstream and downstream table lineage. Users can trace lineage relationships on the canvas, switch display object granularity, and more. Main features and operations are as follows:

Table View/Field View/Impact Analysis: Switch between the table and field dimensions to show lineage relationships, or analyze impact.
Table View: Displays inter-table upstream and downstream relationships at table granularity, where a node on the canvas represents a table. By default, it shows the direct first-level upstream and downstream formal table lineage of the central table.
Field View: Displays inter-table upstream and downstream associated fields at field granularity, where a node on the canvas represents a field. By default, field lineage uses the first field of a table as the center.
Impact analysis: Analyzes the impact and dependency relationship of the current table on tasks and other tables.

Map/hierarchy mode: Table lineage supports two modes, full-table traceability and single-link traceability, and defaults to map mode.

Search: Searches for existing tables/fields in the canvas. The object found is centered in the canvas.

Canvas tools (zoom in/resize/restore/full-screen): Set the size of the lineage canvas and nodes.

Lineage canvas: Shows table/field lineage.
Table/Field: A node on the canvas represents a table/field. By default, the table entered from the details page serves as the central table on the canvas, with its associated upstream and downstream tables/fields displayed on the left and right.
Name: The table/field name. For non-central tables/fields, click the link above the node for quick access to the table details page.
Number of upstream objects: The number of first-level upstream tables/fields. 0 means no upstream.
Number of downstream objects: The number of first-level downstream tables/fields. 0 means no downstream.
Data flow: The arrow direction represents the data flow direction, with the source data on the left and the destination data on the right.
Associated tasks: Click the arrow to view information about the sync/SQL task that generated this lineage association.
Expand/Collapse: Click the upstream/downstream number on a node to expand or collapse its upstream/downstream objects. If a table/field is downstream of the central node, clicking expands only its downstream objects; if it is upstream, only its upstream objects expand.
Quick expand: In map mode, right-click a target object to quickly expand multiple levels of upstream/downstream lineage.

Table View
By default, table lineage shows the directly connected first-level upstream and downstream tables of the central table, along with the associated tasks. You can select a target table to trace its upstream/downstream lineage; this expands all direct first-level upstream/downstream tables of the selected table at once, while the expanded state of other tables at the same level stays unchanged.
Field View
Field lineage uses the first field of the central table as the initial object. By default, it expands the number of direct first-level upstream and downstream associations, the associated tables, and the tasks of this field. You can select a target field to trace its upstream/downstream lineage. To switch the displayed field, click the field selector in the upper-left corner of the canvas.
Impact analysis
Analyze the impact and dependency relationship of the current table on tasks and other tables.
Click Impact Analyze to start the analysis, which may take from several minutes to tens of minutes. The analysis results are displayed upon completion.

By default, the current analysis result is retained for 1 month. After it expires, run the analysis again.
Affected tables: Display associated data table objects (such as default.cleanedcarsales in figure).
Affected tasks: Display associated data processing tasks (such as Data Integration/SQL computing tasks).
Download bill detail: Click Download bill detail to export the impact list, including affected tables and tasks, in Excel format.
Re-analyze: Because lineage changes over time, we recommend re-analyzing each time to retrieve the latest impact results.
The system automatically records the "last analysis time" (history is retained for 1 month), for example: 2025-07-18 17:43:06. Use it to assess whether updates have occurred since that point.
Current constraints: Only downstream impact analysis is currently supported.
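Downstream-only impact analysis can be pictured as a breadth-first walk over lineage edges, collecting every table and task reachable from the current table. The following is a minimal sketch under stated assumptions, not WeData's implementation: the `LINEAGE` edge list, the task names, every table name other than default.cleanedcarsales, and the `downstream_impact` function are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: (source table, task that produces the edge, destination table).
LINEAGE = [
    ("default.raw_car_sales", "task_clean", "default.cleanedcarsales"),
    ("default.cleanedcarsales", "task_agg", "default.sales_daily"),
    ("default.cleanedcarsales", "task_export", "default.sales_export"),
    ("default.sales_daily", "task_report", "default.sales_report"),
]

def downstream_impact(table, edges):
    """Return (affected_tables, affected_tasks) reachable downstream of `table`."""
    children = {}
    for src, task, dst in edges:
        children.setdefault(src, []).append((task, dst))

    tables, tasks = set(), set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for task, dst in children.get(current, []):
            tasks.add(task)
            # Visit each downstream table once, and never re-add the starting table.
            if dst not in tables and dst != table:
                tables.add(dst)
                queue.append(dst)
    return sorted(tables), sorted(tasks)

affected_tables, affected_tasks = downstream_impact("default.cleanedcarsales", LINEAGE)
print(affected_tables)
print(affected_tasks)
```

Upstream edges, such as the one from default.raw_car_sales, are deliberately ignored, which mirrors the downstream-only constraint.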
Data temperature
Data temperature provides the temperature trend and information about frequently accessing tasks.
Temperature trend: The data access count and table details page views over the last 7 days.
Frequent access: The tasks that accessed the table most frequently in the last 30 days (task ID, access type, task status, affiliated project, associated workflow, owner, and count).

Data quality
Data quality provides the quality monitoring rules configured for the data table and a quality overview dashboard for the table data.

For EMR-Hive tables, you can create an analysis dashboard here to view metrics related to reasoning, profiling, and data drift. Other table types have no data quality dashboard entry.
Note:
1. Only the table owner, asset administrator, and project administrator have permission to enable the dashboard, and the operating user must belong to the project corresponding to the selected data source.

Operation steps
1. Under Data Asset > Table Detail > Data Quality tab, enable the data quality dashboard.
2. Fill in the dashboard parameters.


Execution resource: The scheduling resource group already bound to the project.
Execution Engine: Hive or Spark, depending on the purchased EMR resources. In general, select the Hive engine for Hive tables.
Computing Resource: The resource group in the EMR cluster. In general, select default.
Analysis Type: The analysis type of the table. Options: reasoning table, time series table, snapshot table. The remaining parameters depend on the selected analysis type.

Parameters for reasoning tables:
Problem Type: The type of model output. Classification and regression are supported.
Prediction column: Target column for model training. Lists all fields in the table; select the one relevant to your business.
Tag column: Prediction column for model output. Lists all fields in the table; select the one relevant to your business.
Model ID column: Unique identifier field of the model. Lists all fields in the table; select the one relevant to your business.
Timestamp column: Lists the fields in the table whose type is timestamp; select the one relevant to your business.
Statistical Cycle: Runs periodic quality inspection on the selected table by minute, hour, day, week, or month.
Statistical time: Quality tasks calculate only the data rows in the current cycle (real-time incremental data). Historical data is not updated.
Interval: The statistics interval. Select according to your business.
Metric granularity: The finest level at which statistics are calculated. Each metric computation covers all data rows whose timestamps fall within the same granularity window. Note: When the metric granularity is 30 minutes, only data from the last 3 months is computed.
Baseline Data Table: Optional. Reference data, taken at a fixed point in time or in a standard state, against which data changes are compared and analyzed. It is used to calculate drift-related metrics. If left empty, the drift metric results are empty.

Parameters for time series tables:
Timestamp column: Lists the fields in the table whose type is timestamp; select the one relevant to your business.
Statistical Cycle: Runs periodic quality inspection on the selected table by minute, hour, day, week, or month.
Statistical time: Quality tasks calculate only the data rows in the current cycle (real-time incremental data). Historical data is not updated.
Interval: The statistics interval. Select according to your business.
Baseline Data Table: Optional. Reference data, taken at a fixed point in time or in a standard state, against which data changes are compared and analyzed. It is used to calculate drift-related metrics. If left empty, the drift metric results are empty.

Parameters for snapshot tables:
Baseline Data Table: Optional. Reference data, taken at a fixed point in time or in a standard state, against which data changes are compared and analyzed. It is used to calculate drift-related metrics. If left empty, the drift metric results are empty.

3. Click confirm to create the dashboard.
4. View the dashboard data.
Note:
1. For tables with existing dashboard data, users without permission can view only the charts and metric data in their current state. They cannot edit or delete the dashboard, and no drop-down filters for other conditions are available.
2. A value displayed as "-" means the metric has no data.
Different types of tables (reasoning, time series, and snapshot) show different charts and metrics on the dashboard.
For reasoning tables, you can view field analysis data, regression/classification metrics, and data drift metrics.
Field Analysis
Field analysis includes: field name, data type, minimum value, maximum value, median, mean, number of unique values, uniqueness rate, minimum length, maximum length, average length.
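To make the field analysis statistics concrete, here is a minimal sketch of how such a profile could be computed for one numeric column. It is an illustration only: `profile_field` is a hypothetical helper, the uniqueness rate is assumed to be the unique count divided by the number of non-missing values, and lengths are assumed to be measured on the string form of each value. The column must contain at least one non-missing value.

```python
from statistics import mean, median

def profile_field(values):
    """Profile one numeric column given as a list (None marks a missing value)."""
    present = [v for v in values if v is not None]
    lengths = [len(str(v)) for v in present]  # assumption: length of the string form
    unique = len(set(present))
    return {
        "min": min(present),
        "max": max(present),
        "median": median(present),
        "mean": mean(present),
        "unique_count": unique,
        "uniqueness_rate": unique / len(present),  # assumption: share of distinct values
        "min_length": min(lengths),
        "max_length": max(lengths),
        "avg_length": mean(lengths),
    }

stats = profile_field([12, 7, 7, 150, None, 30])
for name, value in stats.items():
    print(f"{name}: {value}")
```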

View reasoning analysis data
When a regression model is selected, the displayed values are: mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the R2 metric.

Recalculate the metric value by entering the model ID in the table.
Select a date to view values in different time ranges.
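For reference, the regression metrics listed above follow standard definitions. The sketch below computes them from a label column and a prediction column using only the standard library; `regression_metrics` and the sample values are hypothetical, and the code assumes no label is zero (for MAPE) and that the labels are not all identical (for R2).

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, MAPE, and R2 from labels and predictions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE divides by the label, so zero labels are not supported here.
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    ss_residual = sum(e * e for e in errors)
    r2 = 1 - ss_residual / ss_total
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "mape": mape, "r2": r2}

metrics = regression_metrics(
    [100.0, 200.0, 300.0, 400.0],  # labels
    [110.0, 190.0, 310.0, 380.0],  # predictions
)
print(metrics)
```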
When a classification model is selected, the dashboard shows classification model metrics, fairness, and deviation.
Classification metrics are displayed as statistical values and charts. Statistical values include accuracy, precision, recall, and F1 score. Charts include metric trend charts and confusion matrix charts.
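The classification statistics all derive from the confusion matrix. As a minimal sketch for the binary case (the function name `classification_metrics` and the sample labels are hypothetical, and undefined ratios are assumed to be reported as 0.0):

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the metrics derived from them."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "confusion_matrix": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

result = classification_metrics(
    [1, 1, 1, 0, 0, 0, 1, 0],  # labels
    [1, 0, 1, 0, 1, 0, 1, 0],  # predictions
)
print(result)
```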

Fairness and deviation
Enter the comparison column, protection group value, and positive class value.

Click the confirm button, wait for the quality task to run successfully, then view the statistical values and chart.
Statistical values: predictive parity, predictive equality, equal opportunity, and statistical parity. The chart shows the fairness rate trend.
Note:
1. The statistical results cannot be viewed until the task is completed.
2. The task running status can be viewed in "Configuration Information".
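The page does not define how the four fairness values are calculated. The sketch below uses their common textbook definitions, expressed as the difference between the protected group and all other rows; this is an assumption and may not match the dashboard's formulas. The names `group_rates` and `fairness_gaps` and the sample records are hypothetical. The three inputs correspond to the comparison column, protection group value, and positive class value.

```python
def group_rates(rows, positive):
    """rows: list of (label, prediction). Returns the rates used by the fairness metrics."""
    tp = sum(1 for y, p in rows if y == positive and p == positive)
    fp = sum(1 for y, p in rows if y != positive and p == positive)
    fn = sum(1 for y, p in rows if y == positive and p != positive)
    tn = sum(1 for y, p in rows if y != positive and p != positive)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "selection_rate": ratio(tp + fp, len(rows)),  # share predicted positive
        "tpr": ratio(tp, tp + fn),                    # true positive rate
        "fpr": ratio(fp, fp + tn),                    # false positive rate
        "precision": ratio(tp, tp + fp),
    }

def fairness_gaps(records, protected_value, positive):
    """records: list of (comparison column value, label, prediction)."""
    protected = [(y, p) for g, y, p in records if g == protected_value]
    others = [(y, p) for g, y, p in records if g != protected_value]
    a, b = group_rates(protected, positive), group_rates(others, positive)
    return {
        "statistical_parity": a["selection_rate"] - b["selection_rate"],
        "equal_opportunity": a["tpr"] - b["tpr"],
        "predictive_equality": a["fpr"] - b["fpr"],
        "predictive_parity": a["precision"] - b["precision"],
    }

records = [
    ("A", 1, 1), ("A", 1, 0), ("A", 0, 0), ("A", 0, 1),
    ("B", 1, 1), ("B", 1, 1), ("B", 0, 0), ("B", 0, 0),
]
gaps = fairness_gaps(records, protected_value="A", positive=1)
print(gaps)
```

A gap of 0 means the two groups are treated alike on that criterion; in this sample only statistical parity is 0.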


View data drift
Note:
1. After creating the dashboard, select the feature columns for which to calculate the data drift metric.
2. The feature columns of the baseline data table and the monitored table must fully align (field name, field type, field meaning, and value range must match). Otherwise the two cannot be compared effectively, and the drift metric returns null because no dimension matches.
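Before enabling drift metrics, it can help to check the alignment yourself. The sketch below compares only field names and types, since field meaning and value range cannot be verified from a schema alone. The `misaligned_columns` helper, the schema dictionaries, and the column names are all hypothetical.

```python
def misaligned_columns(baseline_schema, monitored_schema, feature_columns):
    """Schemas map column name to type. Returns the feature columns that do not align."""
    problems = {}
    for column in feature_columns:
        base_type = baseline_schema.get(column)
        monitored_type = monitored_schema.get(column)
        if base_type is None or monitored_type is None:
            problems[column] = "missing"
        elif base_type != monitored_type:
            problems[column] = f"type mismatch: {base_type} vs {monitored_type}"
    return problems

baseline = {"price": "double", "brand": "string", "year": "int"}
monitored = {"price": "double", "brand": "string", "year": "string"}
issues = misaligned_columns(baseline, monitored, ["price", "brand", "year", "color"])
print(issues)
```

An empty result means the names and types match; any reported column would be expected to produce a null drift metric.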

The calculation results are then displayed.


For time series tables, you can view field analysis data and data drift metrics.
Field analysis data is displayed as field charts. Fields include: field name, data type, minimum value, maximum value, median, mean, number of unique values, uniqueness rate, minimum length, maximum length, average length.

View data drift:

Drift metrics for numeric columns include: KS test, Wasserstein distance, Population Stability Index (PSI).
Drift metrics for categorical columns include: Chi-Squared Test, L-infinity distance, JS divergence.
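Several of these drift metrics compare the frequency distribution of a column in the baseline table with that in the monitored table. The sketch below illustrates PSI, L-infinity distance, and JS divergence on category (or bin) frequencies using only the standard library. It is not the dashboard's implementation: the function names, the `eps` floor used to avoid log(0), the base-2 logarithm for JS divergence, and the sample data are assumptions.

```python
import math
from collections import Counter

def distribution(values, categories):
    """Relative frequency of each category (or bin) in `values`."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]

def psi(expected, actual, eps=1e-6):
    """Population Stability Index; eps floors empty bins to avoid log(0)."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

def l_infinity(p, q):
    """Largest absolute difference between corresponding frequencies."""
    return max(abs(a - b) for a, b in zip(p, q))

def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the result lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return (kl(p, m) + kl(q, m)) / 2

categories = ["a", "b", "c"]
baseline = distribution(["a"] * 50 + ["b"] * 30 + ["c"] * 20, categories)
current = distribution(["a"] * 30 + ["b"] * 30 + ["c"] * 40, categories)
print(psi(baseline, current), l_infinity(baseline, current), js_divergence(baseline, current))
```

For the remaining metrics, SciPy provides `scipy.stats.ks_2samp`, `scipy.stats.wasserstein_distance`, and `scipy.stats.chisquare`, which could be applied to the raw samples or counts.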
For snapshot tables, you can view field analysis data and data drift metrics.
Field analysis data is displayed as field charts. Fields include: field name, data type, minimum value, maximum value, median, mean, number of unique values, uniqueness rate, minimum length, maximum length, average length.

View data drift:

Drift metrics for numeric columns include: KS test, Wasserstein distance, Population Stability Index (PSI).
Drift metrics for categorical columns include: Chi-Squared Test, L-infinity distance, JS divergence.
Disable dashboard
Toggle the switch off to disable the dashboard. Once disabled, the dashboard no longer collects statistics, but historical data is not cleared. You can re-enable it if needed.

Edit dashboard
Click the edit button to update the dashboard configuration.
Note:
All items can be modified except the associated project, analysis type, and problem type.

Delete dashboard
Click delete to delete the current dashboard. Once deleted, the data cannot be recovered, and the dashboard can only be recreated.

Configuration Information
View the dashboard configuration parameters and run history in Configuration Information.

The instance execution status is the same as that of the data quality task.
Access log
Access logs provide a statistical overview of data table access details, including access date, access account, task ID, access type, and number of executions.

Note
Provides an editable page where users can enter business information, such as how to use the table.

Database Details

On the Governance Center > Asset Inventory > Data Table page, the technical information of a table displays the database of the asset. Click the database to view its details.

Database overview
The database overview shows the total number of tables and storage capacity in the database.
Basic Information
The basic information mainly shows the data source type, associated data source, collection task, engine ID, affiliated project, creator, and creation time. Click the collection task link to go to the corresponding task details.
Table list
The table list mainly shows the name, owner, importance level, tag, asset catalog, release status, storage size, and last update time of each data table in the database. Click a table name to go to the table details page.
The table list also supports the batch operations available in asset inventory, such as one-click transfer, batch modification of asset catalog, tags, importance level, and lifecycle, and adding to favorites.

Data Source Details

On the Governance Center > Asset Inventory > Table Detail page, the technical information of a table displays the data source and associated data source of the asset. Click the data source to view its details. The data source details page contains the data source overview, basic information, and all databases and data tables under the data source.

Data source overview
The data source overview shows the total number of databases and tables under the data source.
Basic Information
The basic information mainly shows the data source name, data source type, engine ID, creator, creation time, affiliated project, authorized project, collection task, Co-source Collection Task, and so on. Click the collection task link to go to the corresponding task details.
Database list
The database list mainly shows the name, storage capacity, and affiliated project of every database under this data source. You can batch modify the affiliated projects of these databases here. Click a database name to go to the database details page.
Table list
The table list mainly shows the name, database name.Schema, owner, importance level, tag, asset catalog, release status, storage size, and last update time of every data table under this data source. Click a table name to go to the table details page.
The table list also supports the batch operations available in asset inventory, such as one-click transfer, batch modification of asset catalog, tags, importance level, and lifecycle, and adding to favorites.