Open-Source Ecosystem Data Lake
Customers who build big data processing and analysis on the open-source Hadoop ecosystem often find that compute resources and storage resources need to scale at different rates, and that the storage system must integrate multiple data sources.
Recommended Products
Data Accelerator Goose FileSystem (GooseFS) is recommended for this scenario.
Main Capabilities
Compute-storage separation
Separating compute from storage lets compute resources scale elastically to meet customers' flexible scheduling needs.
Support for multiple data sources
Integrates multiple data sources and stores structured, semi-structured, and unstructured data at any scale.
High-performance business architecture
Improves data access performance for compute workloads through multi-level acceleration services such as Data Accelerator Goose FileSystem (GooseFS), Metadata Accelerator, and AZ Accelerator.
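The idea behind an acceleration tier placed in front of object storage can be illustrated with a toy read-through cache. This is a conceptual sketch only and does not represent how GooseFS or the other accelerators are implemented. The class name TieredReader, the dictionary standing in for remote storage, and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredReader:
    """Toy read path: check a small local cache first, then fall back to remote storage."""

    def __init__(self, remote, cache_capacity):
        self.remote = remote                  # dict standing in for object storage (assumption)
        self.cache = OrderedDict()            # LRU cache standing in for an acceleration tier
        self.cache_capacity = cache_capacity
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)       # mark as most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.remote[key]              # slow path: fetch from remote storage
        self.cache[key] = value
        if len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)    # evict the least recently used entry
        return value


# Hypothetical object keys; repeated reads of the same key are served from the cache.
remote_store = {"part-0": b"a", "part-1": b"b", "part-2": b"c"}
reader = TieredReader(remote_store, cache_capacity=2)
for key in ["part-0", "part-1", "part-0", "part-2", "part-1"]:
    reader.read(key)
```

In this toy model, only reads that miss the cache reach the remote store, which is the general reason a cache tier close to the compute nodes can reduce access latency for repeatedly read data.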
Interactive Query Data Lake
Customers store data from multiple sources, including real-time computing data, in Cloud Object Storage (COS). They need to run OLAP analysis on that data and visualize the results.
Main Capabilities
Support for multiple data sources
Integrates multiple data sources and stores structured, semi-structured, and unstructured data at any scale.
Performance Acceleration
Achieves performance surpassing local HDFS through multi-level acceleration services such as Data Accelerator, Metadata Accelerator, and AZ Accelerator.
Machine Learning Data Lake
In typical machine learning scenarios, the training data volume is large, so high private network bandwidth is required.
Main Capabilities
Ultra-high bandwidth
Provides ultra-high private network bandwidth to meet the bandwidth demands of machine learning scenarios.
Support for multiple data sources
Integrates multiple data sources and stores structured, semi-structured, and unstructured data at any scale.
Performance Acceleration
Achieves performance surpassing local HDFS through multi-level acceleration services such as Data Accelerator, Metadata Accelerator, and AZ Accelerator.
Cloud-Native Data Lake
A container service is combined with open-source applications such as Flink and TensorFlow to build cloud-native data ETL clusters and analysis clusters, which makes compute resources elastic. Multi-level acceleration services such as Data Accelerator, Metadata Accelerator, and AZ Accelerator improve data access performance for compute workloads. Object storage serves as the storage foundation of the data lake and provides low-cost storage for massive heterogeneous data.
Main Capabilities
Compute-storage separation
Separating compute from storage lets compute resources scale elastically and meets customers' flexible scheduling needs.
High-performance business architecture
Improves data access performance for compute workloads through multi-level acceleration services such as Data Accelerator, Metadata Accelerator, and AZ Accelerator.
Rich ecosystem support
Stores data in various formats such as Parquet and ORC, and supports big data components such as Spark, Presto, and Flink.