Prerequisites
Enable WeData Studio
Purchase a Compute & Storage Engine
SQL files in WeData Studio can be executed on the following big data engines:
1. DLC engine
2. EMR engine: Hive-type data sources
Studio Development IDE
Create a New SQL File
In the left-side file directory, click “+”, select Create New SQL, and specify the file name and target folder path.
Once created, you can write and debug SQL directly within WeData Studio.
SQL Editor
This section describes the key features of the SQL editor in WeData Studio.
1. Run Button
Several execution options let users control the execution scope and the result size for different query scenarios.
Run/Run All
Run: Executes only the currently selected SQL statement.
Run All: Executes all SQL statements in the script sequentially.
LIMIT 1000
When enabled: Query results are limited to a preview of up to 1,000 rows.
When disabled: Results can return up to the maximum limit supported by the system.
2. Batch Submit
Submits the SQL script content and any updated parameters in Studio to all tasks that reference the script.
3. Formatting
Click the Formatting button to automatically format SQL code in the editor.
The system adjusts indentation, line breaks, and keyword structure to make SQL statements clearer and more standardized, improving readability and maintainability.
4. Parameters
The Parameters button is used to manage parameters defined in the SQL script. The system automatically detects parameters defined in the format ${parameter_name} (for example, ${param1}) and displays them in a unified parameter dialog, where users can enter or confirm parameter values.
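The detection described above can be approximated with a regular expression. The following is a minimal sketch, not WeData Studio's actual implementation: it assumes parameter names consist of letters, digits, and underscores, and the helper names `find_parameters` and `render` are hypothetical.

```python
import re
from typing import Dict, List

# Assumption: a parameter is written as ${name}, where name is made of
# letters, digits, or underscores. The real detection rules may differ.
PARAM_PATTERN = re.compile(r"\$\{(\w+)\}")


def find_parameters(sql: str) -> List[str]:
    """Return parameter names in order of first appearance, without duplicates."""
    names: List[str] = []
    for name in PARAM_PATTERN.findall(sql):
        if name not in names:
            names.append(name)
    return names


def render(sql: str, values: Dict[str, str]) -> str:
    """Replace each ${name} with its value; raises KeyError if a value is missing."""
    return PARAM_PATTERN.sub(lambda match: values[match.group(1)], sql)


sql = (
    "SELECT * FROM catalog.schema.table "
    "WHERE dt = '${param1}' AND region = '${param2}' AND day = '${param1}'"
)
print(find_parameters(sql))  # ['param1', 'param2']
print(render(sql, {"param1": "2024-01-01", "param2": "ap"}))
```

Listing each name once mirrors the idea of a single dialog where every parameter is entered or confirmed one time, even if the script uses it in several places.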
5. Data Source and Resource Group Configuration
The Data Source and Resource Group Configuration section lets users select the data source type and the corresponding compute and scheduling resources before running a query.
6. SQL Editor Shortcut Keys
7. Version History
Click Save to generate a saved version of the current file.
Click Version History on the right to view all historical versions. Version history supports version comparison and version rollback, so users can track changes and restore previous versions when needed.
Script Execution
1. Enter SQL code in the editor. You can directly read tables from the data catalog using a three-part identifier in the format catalog.schema.table, for example:
SELECT *
FROM catalog.schema.table;
2. At the top of the editor, configure the required execution settings for this run, including the data source and scheduling resource group.
3. Click Run.
Execution Result
After SQL execution in Studio completes, the results are displayed in the Results panel. The panel shows the query results, the execution logs, and the executed SQL code, which helps users quickly check the execution status of the SQL.
The left side of the result panel displays the execution timestamp of the SQL.
If a single execution contains multiple SQL statements, each statement generates a separate result tab. Users can switch between results of different SQL statements using the result tabs.
Query result display
SQL query results are displayed in a tabular format and support the following operations:
Result search
Field settings:
Users can configure which columns are displayed.
Frequently used columns can be pinned to the top by clicking the pin icon next to the column name.
Result download:
Results can be exported in the following formats:
CSV
Excel
TXT
Column sorting:
Click a column header to sort results in ascending or descending order.
Copy result:
Select and copy data directly from the result area.
Execution Log
Users can switch to the execution logs in the result panel. Execution logs support:
Log content search
Log file download
Log information can be used to troubleshoot SQL execution issues or analyze the execution process.
Note:
Currently, SQL execution in Studio runs on a scheduling resource group. For each execution, the result preview supports up to 1,000 rows.
Referencing Studio SQL Files in Orchestration Workflow Tasks
This section describes how to use SQL files created in WeData Studio as the code source for workflow tasks in the Orchestration Space.
Supported task types:
Supported Data Source Type | Task Type
DLC Data Source | DLC SQL
EMR Data Source | Hive SQL/Spark SQL
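The mapping in the table above can be expressed as a simple lookup, which is useful when checking in advance whether a script's data source type fits a given task type. This is an illustrative sketch only: the dictionary keys and the function name `is_supported` are hypothetical and are not WeData identifiers.

```python
from typing import Dict, Set

# Data source type -> task types, copied from the table above.
# The short keys "DLC" and "EMR" are labels chosen for this sketch.
SUPPORTED_TASK_TYPES: Dict[str, Set[str]] = {
    "DLC": {"DLC SQL"},
    "EMR": {"Hive SQL", "Spark SQL"},
}


def is_supported(data_source_type: str, task_type: str) -> bool:
    """Return True if the task type is listed for the given data source type."""
    return task_type in SUPPORTED_TASK_TYPES.get(data_source_type, set())


print(is_supported("EMR", "Spark SQL"))  # True
print(is_supported("DLC", "Hive SQL"))   # False
```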
Create a Task
1. When creating a task, select Studio directory.
2. Select a SQL file from one of the following locations:
Workspace: Only SQL scripts whose data source type matches the task type are supported.
Studio
Note:
For SQL files newly created in Workspace, you must click Batch Submission once in WeData Studio before the file can be selected in the Orchestration module.
Editing Restrictions in Workflow Tasks
Direct modification of SQL code is not supported in the Orchestration module. SQL code must be edited in WeData Studio.
The following configurations can be modified in workflow tasks:
Scheduling parameters
Scheduling configurations
Data source and scheduling resource group settings
Parameter Synchronization Rules
When new parameters are added to a SQL script, clicking Batch Submission in Studio synchronizes them to the task's scheduling parameters.
For parameters that already exist in the task, changing their values in Studio does not update them in the task during batch submission.
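The two rules above amount to an "add only, never overwrite" merge. The following is a minimal sketch of that behavior under stated assumptions: parameters are modeled as plain name-to-value dictionaries, the function name `sync_parameters` is hypothetical, and the case of a parameter being removed from the script is not modeled because this section does not describe it.

```python
from typing import Dict


def sync_parameters(
    task_params: Dict[str, str], studio_params: Dict[str, str]
) -> Dict[str, str]:
    """Return the task's parameters after a batch submission.

    New parameters from Studio are added; parameters that already exist
    in the task keep the task's value.
    """
    merged = dict(task_params)  # copy so the caller's dict is not mutated
    for name, value in studio_params.items():
        if name not in merged:
            merged[name] = value
    return merged


task = {"dt": "2024-01-01"}
studio = {"dt": "2024-02-02", "region": "ap"}
print(sync_parameters(task, studio))  # {'dt': '2024-01-01', 'region': 'ap'}
```

In this example `region` is new and is added to the task, while `dt` keeps the value already configured in the task.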
Referencing Remote Git Repository Files in Workflow Tasks
When a project is configured with a Git repository, a new Remote Git Repository option is available under Creation Method when creating a task.
Users can select a SQL file from any branch of the remote Git repository bound to the project as the task's code source.
Supported task types:
Supported Data Source Type | Task Type
DLC Data Source | DLC SQL
EMR Data Source | Hive SQL/Spark SQL