This document describes how to quickly deploy GooseFS. You can create a GooseFS cluster directly in the console and use Cloud Object Storage (COS) as remote storage without writing code or running programs.
Preparations
Account and Permission Preparation
1. Register a Tencent Cloud account.
Before using the Tencent Cloud Data Accelerator Goose FileSystem (GooseFS) service, you need to register a Tencent Cloud account. Click the button below to start registration. (If you are already registered, skip this step.)
2. Complete real-name authentication.
3. Enable GooseFS service.
Before logging in to the console to use GooseFS, you need to contact us to enable the GooseFS service.
4. Authorize GooseFS.
After enabling the GooseFS service, you need to set up a preset role and grant Data Accelerator GooseFS the relevant permissions. Click Authorize to grant them. If you do not have authorization permissions, copy the link and send it to the root account or a sub-account with authorization permissions so that it can authorize the role.
After authorization is completed, a GooseFS cluster can be created normally.
5. Associate a bucket with a node.
To create a namespace using GooseFS, you need to mount a COS bucket. Create the bucket in the COS Console. If you use Management Plane Hosting Mode or Master-Only Hosting Mode, you need to prepare the Master or Worker nodes yourself. Create the Cloud Virtual Machine (CVM) instances in the Cloud Virtual Machine Console.
Cluster Deployment
Step 1: Create a GooseFS Cluster
1. Log in to the Data Accelerator GooseFS console. In the left sidebar, select GooseFS > Instance List to enter the GooseFS cluster list page.
2. Click Create to enter the GooseFS creation flow.
3. Fill in the GooseFS cluster info and configure the following parameters.
Cluster Name: The unique identifier of the cluster, unique across the account. The cluster name must follow the naming specification described in GooseFS cluster overview.
Cluster Description: Describes the attributes, purpose, and other information of the cluster.
Cluster Category: The cluster type or application scenario, either a Tencent Public Cloud cluster or an IDC cloud cluster.
Region: The region where the cluster is located, usually the same region as the computing cluster.
Availability Zone: The AZ where the cluster is located, usually the same as the computing cluster.
VPC: The VPC where the cluster is located, generally the same VPC as the computing cluster. Avoid cross-VPC deployment where possible.
Subnet: The subnet where the cluster is located.
Cluster Tag: Cluster tag information.
4. Configure the cluster resources, which determine the cache storage capacity that the GooseFS cluster manages.
GooseFS deployment modes include Fully Managed Hosting Mode, Master-Only Hosting Mode, and Management Plane Hosting Mode.
In Fully Managed Hosting Mode, users do not need to purchase Master nodes or Worker nodes themselves. All nodes are hosted by Tencent Cloud. The Tencent Cloud team is responsible for the overall availability and stability of the cluster, so users can focus on their core business.
In Master-Only Hosting Mode, users purchase Master instances on the GooseFS platform. For Worker nodes, users purchase CVM instances and add them to the cluster via Add Node after the cluster is created.
In Management Plane Hosting Mode, users purchase CVM instances for both the Master and Worker nodes of GooseFS.
The following example shows how to create a cluster in Fully Managed Hosting Mode:
In Fully Managed Hosting Mode, the user only needs to select the Instance Capacity (10TB to 1000TB, in steps of 10TB).
5. Enter the required parameters, then click Next. After confirming the information is correct, click OK to execute.
6. The cluster creation process starts. You can view the detailed progress of cluster deployment in the Task Center.
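The capacity rule for fully managed mode (10TB to 1000TB, in steps of 10TB) can be checked before opening the console. The following is a minimal sketch; the function names are hypothetical and are not part of any GooseFS tool.

```python
import math

MIN_TB = 10    # smallest selectable instance capacity
MAX_TB = 1000  # largest selectable instance capacity
STEP_TB = 10   # capacity must be a multiple of this step


def is_valid_capacity(capacity_tb: int) -> bool:
    """Return True if capacity_tb is in range and falls on a 10TB step."""
    return MIN_TB <= capacity_tb <= MAX_TB and capacity_tb % STEP_TB == 0


def round_up_capacity(required_tb: float) -> int:
    """Round a required cache size up to the nearest selectable capacity."""
    if required_tb > MAX_TB:
        raise ValueError(f"{required_tb}TB exceeds the {MAX_TB}TB maximum")
    steps = max(1, math.ceil(required_tb / STEP_TB))
    return steps * STEP_TB
```

For example, a workload that needs about 25TB of cache would select 30TB, since `round_up_capacity(25)` rounds up to the next 10TB step.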
Step 2: Adding a Node
For details about GooseFS nodes, see GooseFS cluster overview. The following describes how to add a node.
1. Select the GooseFS cluster where you want to add a node, open the cluster details page, choose the node management option from the sidebar, and enter the node management subpage.
2. On the node management page, click Add Node and fill in the following parameters.
Instance Type: Currently only CVM instances can be added.
Node Type: Supports Worker nodes and Client nodes. Fully Managed Hosting Mode only supports Client nodes.
Node IP: Select the node you need to add.
Formatted Mounting: Used to format the disk and perform file system mounting. Users can choose whether to perform formatted mounting. If not enabled, there is no need to set the data disk mounting option during the creation process; it can be mounted manually or using scripts. If enabled, the device name, system formatting, mount point, and other parameters need to be filled in.
Device Name: The hardware device that needs to be mounted, usually a specific disk partition. You need to log in to the machine to view it, typically located under the /dev path.
Format System: Linux file system type.
Mount Point: File system mount point.
3. Fill in the parameters as needed, then click Confirm to add the node.
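The Device Name for formatted mounting has to be looked up on the node itself. As a minimal sketch, the helper below parses the JSON output of `lsblk -J -o NAME,TYPE,MOUNTPOINT` and lists disks or partitions that have no mount point yet. The function name and the sample data are hypothetical; device names on a real CVM will differ.

```python
import json

# Hypothetical sample of `lsblk -J -o NAME,TYPE,MOUNTPOINT` output.
SAMPLE = """
{"blockdevices": [
  {"name": "vda", "type": "disk", "mountpoint": null,
   "children": [{"name": "vda1", "type": "part", "mountpoint": "/"}]},
  {"name": "vdb", "type": "disk", "mountpoint": null}
]}
"""


def unmounted_devices(lsblk_json: str) -> list[str]:
    """Return /dev paths of disks and partitions that are not mounted."""
    found = []

    def walk(nodes):
        for node in nodes:
            children = node.get("children", [])
            # A disk that already has partitions is not itself a candidate.
            is_leaf = not children
            is_block = node.get("type") in ("disk", "part")
            if is_leaf and is_block and not node.get("mountpoint"):
                found.append("/dev/" + node["name"])
            walk(children)

    walk(json.loads(lsblk_json).get("blockdevices", []))
    return found


print(unmounted_devices(SAMPLE))
```

On a node, the same function could be fed the real command output, for example via `subprocess.run(["lsblk", "-J", "-o", "NAME,TYPE,MOUNTPOINT"], capture_output=True, text=True).stdout`.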
Step 3: Creating a Namespace
GooseFS organizes and manages cache data through namespaces. The namespace of GooseFS serves as a unified access API for the GooseFS cluster to access data, and can mount multiple underlying storage systems, such as COS and Cloud HDFS.
1. Select the GooseFS cluster that requires creating a namespace, enter the cluster details page, select the Namespace option from the sidebar, and enter the namespace subpage.
2. On the namespace page, click Add Namespace and fill in the following fields.
Bucket Source: You can select buckets from your own account or other accounts. Note: When selecting buckets from other accounts, ensure that the corresponding account has read/write permissions for the corresponding bucket.
COS Bucket: Select the bucket you need to mount.
Project Name: Namespace name.
Mounting Range: Either the entire bucket or a specified directory prefix.
UFS URI: Used for accessing and operating files within the UFS file system.
Read Policy: Used to manage the caching policy of data in the GooseFS cluster when reading data from GooseFS. Currently only supports CACHE.
Write Policy: Used to manage the caching policy of data in the GooseFS cluster when writing data through GooseFS. Currently supports CACHE_THROUGH and THROUGH.
UFS Attributes: Used to set configuration information for accessing underlying storage, such as access keys.
fs.cosn.bucket.region: Selected bucket region, such as ap-beijing.
fs.cosn.flush.enabled: Whether to enable the data refresh mechanism, default is true.
fs.cosn.upload.part.size: The part size for multipart upload, defaults to 8MB.
fs.cosn.userinfo.secretId: Used to identify the API caller. One APPID can create multiple cloud API keys.
fs.cosn.userinfo.secretKey: Encrypted signature key for your Tencent Cloud account, used to verify request legitimacy.
Note:
SecretId and SecretKey are collectively referred to as cloud API keys. They are the security credentials that Tencent Cloud APIs use to verify a user's identity. Create them in the Cloud Access Management (CAM) console: select Access Key > API Key Management in the left sidebar, then click Create Key. The SecretKey can only be obtained on the key creation page, so save it carefully.
3. Fill in the parameters as needed, then click Confirm to create the namespace.
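The UFS attributes listed above can be prepared ahead of time as key-value pairs. The sketch below is illustrative only: the environment variable names `COS_SECRET_ID` and `COS_SECRET_KEY` and the function name are hypothetical, and expressing the 8MB part size in bytes is an assumption. Reading the keys from the environment avoids writing them into files or scripts.

```python
import os


def build_ufs_attributes(region: str, env=os.environ) -> dict[str, str]:
    """Assemble the UFS attribute map for a COS-backed namespace."""
    # Hypothetical variable names; keep credentials out of source files.
    secret_id = env.get("COS_SECRET_ID")
    secret_key = env.get("COS_SECRET_KEY")
    if not secret_id or not secret_key:
        raise ValueError("COS_SECRET_ID and COS_SECRET_KEY must be set")
    return {
        "fs.cosn.bucket.region": region,
        "fs.cosn.flush.enabled": "true",  # documented default
        # Documented default of 8MB; the byte unit is an assumption.
        "fs.cosn.upload.part.size": str(8 * 1024 * 1024),
        "fs.cosn.userinfo.secretId": secret_id,
        "fs.cosn.userinfo.secretKey": secret_key,
    }
```

The resulting values can then be copied into the UFS Attributes fields of the Add Namespace form.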
Step 4: Adding a Client
After adding a node and creating a namespace, you need to add a GooseFS-FUSE client to the newly-added node to access the namespace.
1. Select the GooseFS cluster where you want to add a client, open the cluster details page, choose the client management option from the sidebar, and enter the client management subpage.
2. On the client management page, perform the Add New Client operation to deploy a GooseFS client on a designated node. The optional parameters are as follows.
Instance Type: Currently only CVM instances can be added.
Client Type: Currently only the FUSE type is supported. For an introduction to the GooseFS FUSE client, see GooseFS FUSE Tool.
Node IP: The nodes on which to add clients.
Note:
In Management Plane Hosting Mode and Master-Only Hosting Mode, clients can be deployed on Worker nodes and Client nodes of the GooseFS cluster.
In Fully Managed Hosting Mode, clients can only be deployed on Client nodes of the GooseFS cluster.
Mount Information: Used to add local file system mount points. After adding, users can access cached files on GooseFS via client. Optional parameters are as follows:
Local Mounting Directory: The FUSE client's local mounting directory, used to access cached files on the GooseFS cluster, with a default value of /data/goosefs.
GooseFS Directory: GooseFS file system directory, the path in the GooseFS namespace, such as /0-aaa-int-1250000000/data. If not specified, it defaults to mounting the namespace root path, such as /0-aaa-int-1250000000.
Mount Parameters: FUSE client mount parameters. The default mount parameter allow_other allows other users to access the mount; direct_io enables direct I/O. More parameters can be found in the FUSE mount parameter list.
Batch Concurrency: Controls the number of tasks executed simultaneously (default is 1); suitable when mounting clients in batches.
Batch Failure Strategy: Strategy after task failure, with options to continue execution or terminate subsequent batches.
3. Fill in the parameters as needed, then click Confirm to add the client.
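To illustrate how Batch Concurrency and Batch Failure Strategy interact, the following sketch runs a mount task over a list of nodes in batches. It is not the console's implementation: `run_in_batches`, `mount_fn`, and the node names are hypothetical, and the sketch only models the documented semantics (a concurrency limit, and either continuing or terminating later batches after a failure).

```python
from concurrent.futures import ThreadPoolExecutor


def _safe(mount_fn, node):
    """Run mount_fn(node) and report success as a boolean."""
    try:
        mount_fn(node)
        return True
    except Exception:
        return False


def run_in_batches(nodes, mount_fn, concurrency=1, stop_on_failure=True):
    """Process nodes batch by batch; return (succeeded, failed, skipped)."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    succeeded, failed = [], []
    for start in range(0, len(nodes), concurrency):
        batch = nodes[start:start + concurrency]
        # Tasks within one batch run at the same time.
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            results = list(pool.map(lambda n: _safe(mount_fn, n), batch))
        for node, ok in zip(batch, results):
            (succeeded if ok else failed).append(node)
        # "Terminate" strategy: skip every batch after the failing one.
        if failed and stop_on_failure:
            return succeeded, failed, nodes[start + concurrency:]
    return succeeded, failed, []
```

With four nodes, a concurrency of 2, and a failure on the second node, the terminate strategy leaves the last two nodes untouched, while the continue strategy still attempts them.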
Step 5: Mount Verification
The bucket is now mounted on the corresponding Client node. Log in to the CVM node and enter the local mounting directory /data/goosefs to view all files in the bucket associated with the namespace.
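The same check can be scripted. The sketch below assumes the default local mounting directory /data/goosefs and uses only the Python standard library; `check_mount` is a hypothetical helper name.

```python
import os


def check_mount(path="/data/goosefs", limit=10):
    """Return (is_mount_point, first_entries) for the given directory."""
    if not os.path.isdir(path):
        return False, []
    # List at most `limit` entries so large buckets do not flood the output.
    entries = sorted(os.listdir(path))[:limit]
    return os.path.ismount(path), entries


if __name__ == "__main__":
    mounted, entries = check_mount()
    print("mounted:", mounted)
    for name in entries:
        print(name)
```

If the directory is reported as not mounted or the listing is empty, review the client's mount information in the console.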