Tencent Cloud

Cloud Object Storage

Connecting Oceanus to COS

Last updated: 2024-03-25 15:16:26

Oceanus Overview

Oceanus is a powerful real-time analysis tool in the big data ecosystem. With it, you can build applications such as website clickstream analysis, targeted e-commerce recommendation, and IoT data processing in just a few minutes. Oceanus is built on Apache Flink and is provided as a fully managed cloud service, so you don't need to worry about infrastructure operations; it also connects to data sources in the cloud to form a complete set of supporting services.
Oceanus provides a convenient console where you can write SQL analysis statements, upload and run custom JAR packages, and manage jobs. Based on Flink, it can achieve sub-second processing latency on petabyte-scale datasets.
This document describes how to connect Oceanus to COS. Currently, Oceanus is available in dedicated cluster mode, where you can run various jobs and manage related resources in your own cluster.

Prerequisites

Creating Oceanus cluster

Log in to the Oceanus console and create an Oceanus cluster.

Creating COS bucket

1. Log in to the COS console.
2. Click Bucket List on the left sidebar.
3. Click Create Bucket to create a bucket as instructed in Creating a Bucket.
Note:
When writing data to COS, the Oceanus job must run in the same region as the COS bucket.

Directions

Go to the Oceanus console, create an SQL job, and select a cluster in the same region as COS.

1. Create a source

CREATE TABLE `random_source` (
f_sequence INT,
f_random INT,
f_random_str VARCHAR
) WITH (
'connector' = 'datagen',
'rows-per-second'='10', -- Number of data rows generated per second
'fields.f_sequence.kind'='random', -- Random number
'fields.f_sequence.min'='1', -- Minimum value of f_sequence
'fields.f_sequence.max'='10', -- Maximum value of f_sequence
'fields.f_random.kind'='random', -- Random number
'fields.f_random.min'='1', -- Minimum value of f_random
'fields.f_random.max'='100', -- Maximum value of f_random
'fields.f_random_str.length'='10' -- Length of the random string
);
Note:
Here, the built-in connector datagen is selected. Select a data source based on your actual business needs.
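To make the shape of the test data concrete, here is a minimal Python sketch (illustrative only, not part of Oceanus or the datagen connector) that produces rows matching the `random_source` schema and value ranges defined above:

```python
import json
import random
import string

def make_row():
    """Build one record matching the random_source schema above."""
    return {
        "f_sequence": random.randint(1, 10),    # same range as fields.f_sequence.min/max
        "f_random": random.randint(1, 100),     # same range as fields.f_random.min/max
        "f_random_str": "".join(random.choices(string.ascii_lowercase, k=10)),
    }

# Roughly one second of data at rows-per-second = 10.
rows = [make_row() for _ in range(10)]
print(json.dumps(rows[0]))
```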

2. Create a sink

-- Replace `<bucket name>` and `<folder name>` with your actual bucket and folder names.
CREATE TABLE `cos_sink` (
f_sequence INT,
f_random INT,
f_random_str VARCHAR
) PARTITIONED BY (f_sequence) WITH (
'connector' = 'filesystem',
'path'='cosn://<bucket name>/<folder name>/', -- Directory path to which data is written
'format' = 'json', -- Format of the written data
'sink.rolling-policy.file-size' = '128MB', -- Maximum file size before rolling
'sink.rolling-policy.rollover-interval' = '30 min', -- Maximum time a file stays open for writing
'sink.partition-commit.delay' = '1 s', -- Partition commit delay
'sink.partition-commit.policy.kind' = 'success-file' -- Partition commit policy
);
Note:
For more WITH parameters of a sink, see "Filesystem (HDFS/COS)".
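Because the sink is partitioned by `f_sequence` and writes JSON, the objects that land in COS follow the filesystem connector's partition layout: one subdirectory per partition value, each holding rolled part files with one JSON object per line. A hedged Python sketch of that layout (the keys and values are hypothetical placeholders):

```python
import json

# Hypothetical object keys, illustrating the partition layout
# under cosn://<bucket name>/<folder name>/.
keys = [
    "folder/f_sequence=1/part-0-0",
    "folder/f_sequence=2/part-0-0",
]

def partition_value(key):
    """Extract the f_sequence partition value from an object key."""
    segment = key.split("/")[1]          # e.g. "f_sequence=1"
    return int(segment.split("=")[1])

# Each part file holds one JSON object per line ('format' = 'json').
line = json.dumps({"f_sequence": 1, "f_random": 42, "f_random_str": "abcdefghij"})
print([partition_value(k) for k in keys])
```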

3. Configure the business logic

INSERT INTO `cos_sink`
SELECT * FROM `random_source`;
Note:
This statement is for demonstration only and has no actual business purpose.

4. Set job parameters

Select flink-connector-cos as the Built-in Connector and configure the following COS parameters in Advanced Parameters:
fs.AbstractFileSystem.cosn.impl: org.apache.hadoop.fs.CosN
fs.cosn.impl: org.apache.hadoop.fs.CosFileSystem
fs.cosn.credentials.provider: org.apache.flink.fs.cos.OceanusCOSCredentialsProvider
fs.cosn.bucket.region: <COS region>
fs.cosn.userinfo.appid: <COS user appid>
In the parameters above:
Replace <COS region> with the actual COS region, such as ap-guangzhou.
Replace <COS user appid> with your actual APPID, which can be viewed in the Account Center.
Note:
For more information on job parameter settings, see "File System (HDFS/COS)".
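For reference, the advanced-parameter map above can be sketched programmatically; this is a minimal Python illustration of the same key-value pairs (the region and APPID arguments are placeholders you must replace with your own values):

```python
def cos_advanced_params(region, appid):
    """Build the advanced-parameter map shown above for a COS sink."""
    return {
        "fs.AbstractFileSystem.cosn.impl": "org.apache.hadoop.fs.CosN",
        "fs.cosn.impl": "org.apache.hadoop.fs.CosFileSystem",
        "fs.cosn.credentials.provider":
            "org.apache.flink.fs.cos.OceanusCOSCredentialsProvider",
        "fs.cosn.bucket.region": region,   # e.g. ap-guangzhou
        "fs.cosn.userinfo.appid": appid,   # viewable in the Account Center
    }

params = cos_advanced_params("ap-guangzhou", "1250000000")  # placeholder APPID
```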

5. Start the job

Click Save > Check Syntax > Release Draft, wait for the SQL job to start, and go to the corresponding COS directory to view the written data.
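Because the sink uses the success-file commit policy, each fully committed partition directory contains a `_SUCCESS` marker file. When checking the written data, a downstream consumer can use that marker to tell finished partitions from ones still being written; a hedged Python sketch (the object keys are illustrative):

```python
def committed_partitions(keys):
    """Return partition prefixes that contain a _SUCCESS marker file."""
    return sorted(
        key.rsplit("/", 1)[0]
        for key in keys
        if key.endswith("/_SUCCESS")
    )

keys = [
    "folder/f_sequence=1/_SUCCESS",
    "folder/f_sequence=1/part-0-0",
    "folder/f_sequence=2/part-0-0",   # still being written, no marker yet
]
print(committed_partitions(keys))      # only f_sequence=1 is committed
```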
