| Module | Core Features |
| --- | --- |
| Model experiment | After enabling the MLflow service in Studio, you can call MLflow functions in an experiment to record the parameters, metrics, and results of each run, then view them in Experiment Management for traceability and reproducibility. AutoML is also provided to support no-code development. |
| Feature management | Use the feature processing API provided by WeData in Studio to create, write, read, search, sync, and consume feature tables. You can also view and manage features in Feature Management to achieve unified management and consumption of features. |
| Model Management | After enabling the MLflow service in Studio, you can register a model by calling MLflow functions in an experiment, or register it visually in Experiment Management. Key model information can be viewed, along with the model's associations with experiments, runs, and services. |
| Model Service | Supports creating an API service from a model in Model Management, monitoring the service, and viewing the relationships between models for easy information tracing. |
| Module | Core Features |
| --- | --- |
| Studio | Studio is the main workspace for AI development, where users can edit, debug, and run code and call the MLflow and feature engineering APIs to perform CRUD operations on feature tables, train models, and register models. |
| Workflow | The main workspace for automated processes. Users can debug code in Studio and submit it to a workflow for periodic scheduling, so that models are produced automatically and periodically. |
| Data Quality | Model service inference tables, feature tables, and training data tables can trigger data quality tasks to view quality information such as field analysis, drift analysis, and model metrics. |
| Engine | Data science integrates two types of engines, Data Lake Compute (DLC) and Elastic MapReduce (EMR), which serve as data sources, offline feature storage, and training resources for AI development. |
| Type | Description |
| --- | --- |
| Engines support | DLC standard engine: applicable to model training, experiment reporting, feature management, and model registration. Only the DLC engine can be used for AutoML experiments, and only when the "wedata-data-science" image is selected when creating the resource group. EMR on CVM, EMR on TKE: applicable to model training, experiment reporting, feature management, and model registration; AutoML experiments are not supported. EMR (Ray on TKE): applicable to model training, experiment reporting, and model registration; AutoML experiments and feature processing are not supported. |
| MLflow version | WeData's experiment management is fully compatible with MLflow version 2.17.2, which is pre-installed in the related image. After connecting to the Studio runtime environment, you can run a command to check the installed version. |
| Offline feature store | WeData-managed offline feature tables currently only support Data Lake Compute (DLC) Iceberg tables and Elastic MapReduce (EMR) Hive tables. The primary key and timestamp key must be specified during feature table registration; subsequent operations perform feature indexing based on these keys. |
| Table operation permissions | DLC: if Catalog is enabled, authorize users with the corresponding database/table permissions in DLC and grant permissions in the corresponding Catalog under "Data Assets > Catalog directory" in WeData; if Catalog is not enabled, grant database/table permissions in DLC. EMR: in WeData, set the engine access account and account mapping under "Project Management > Storage-Compute Engine Settings" to determine table permissions. |
| Model operation permissions | DLC: if Catalog is enabled, grant permissions to users in the corresponding Catalog under "Data Assets > Catalog directory" in WeData; if Catalog is not enabled, permissions are managed at project granularity. EMR: permissions are managed at project granularity. |
| Online feature store | WeData supports Redis as an online feature store. Prepare as follows. Step 1: ensure the region and network are consistent with those of the scheduling resource group and the engine (DLC or EMR) in use. Step 2: add a Redis data source under "Project Management > Data Source Management" in WeData and test connectivity with the scheduling resource group. Step 3: add the default feature library (online) in WeData Feature Management. |
| Feature engineering package | Connect to the engine and run a command to install the latest feature engineering package. If you use DLC, select the "wedata-data-science" image when creating the resource group; the feature engineering package is pre-installed, and you can run a command to check its version. |
| Key management description | Calling the feature engineering package requires the SecretId and SecretKey of a TencentCloud API access key, which you can create in Tencent Cloud CAM. If your organization requires that key information not be transmitted in plain text, you can use Tencent Cloud SSM or Tencent Cloud KMS to manage keys. |
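To check the pre-installed MLflow version mentioned in the table above from within the Studio runtime environment, one option is a short standard-library check (a sketch; it only reports what is installed in the current Python environment):

```python
import importlib.metadata

# Query the installed package metadata; no MLflow import is needed.
try:
    print("mlflow", importlib.metadata.version("mlflow"))
except importlib.metadata.PackageNotFoundError:
    print("mlflow is not installed in this environment")
```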
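The role of the primary key and timestamp key in offline feature indexing (see the offline feature store row above) can be illustrated with a plain-Python sketch. The table contents, key names, and feature names here are made up for illustration; the real lookup is performed by the feature engineering package against Iceberg/Hive tables.

```python
from datetime import datetime

# A toy offline feature table: each row is identified by a primary key
# ("user_id") and a timestamp key ("event_time").
rows = [
    {"user_id": "u1", "event_time": datetime(2024, 1, 1), "clicks_7d": 3},
    {"user_id": "u1", "event_time": datetime(2024, 1, 8), "clicks_7d": 9},
    {"user_id": "u2", "event_time": datetime(2024, 1, 5), "clicks_7d": 1},
]

def point_in_time_lookup(rows, user_id, as_of):
    """Return the latest row for `user_id` at or before `as_of`."""
    candidates = [
        r for r in rows
        if r["user_id"] == user_id and r["event_time"] <= as_of
    ]
    return max(candidates, key=lambda r: r["event_time"], default=None)

row = point_in_time_lookup(rows, "u1", datetime(2024, 1, 5))
print(row["clicks_7d"])  # prints 3: only the Jan 1 row exists as of Jan 5
```

Indexing by primary key plus timestamp key is what makes such point-in-time lookups possible, which in turn keeps training data free of feature leakage from the future.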
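For the key management requirement above, one common pattern is to read the SecretId/SecretKey from environment variables instead of hardcoding them, with the values injected from SSM/KMS at deploy time. This is a sketch: the variable names follow the convention used by Tencent Cloud SDKs, but verify them against your own setup.

```python
import os

def load_credentials():
    """Read the TencentCloud API SecretId/SecretKey from the environment.

    TENCENTCLOUD_SECRET_ID / TENCENTCLOUD_SECRET_KEY follow the naming
    convention of Tencent Cloud SDKs (assumption; adjust if your
    environment uses different names).
    """
    secret_id = os.environ.get("TENCENTCLOUD_SECRET_ID")
    secret_key = os.environ.get("TENCENTCLOUD_SECRET_KEY")
    if not secret_id or not secret_key:
        raise RuntimeError(
            "Set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY "
            "(e.g. injected from SSM/KMS) before calling the feature "
            "engineering package."
        )
    return secret_id, secret_key
```

This keeps plain-text keys out of notebooks and scheduled workflow code; only the runtime environment holds the secret material.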