where condition. This greatly reduces the scanned data volume and improves the query efficiency.PARTITIONED BY parameter in the table creation statement.
Example: Creating the test_part partition tableCREATE EXTERNAL TABLE IF NOT EXISTS `DataLakeCatalog`.`test_a_db`.`test_part` (`_c0` int,`_c1` int,`_c2` string,`dt` string) USING PARQUET PARTITIONED BY (dt) LOCATION 'cosn://testbucket/data/';
ALTER TABLE ADD PARTITION
ALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202206')ALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202207')ALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202208')ALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202209')ALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202210')
ALTER TABLEALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202211') LOCATION='cosn://testbucket/data2/202211'ALTER TABLE `DataLakeCatalog`.`test_a_db`.`test_part` add PARTITION (dt = '202212') LOCATION='cosn://testbucket/data2/202212'
MSCK REPAIR TABLEMSCK REPAIR TABLE statement to scan the data catalog specified during table creation. If there is a new partition catalog, the system will automatically add the partitions to the metadata of the data table.
Sample SQL:MSCK REPAIR TABLE `DataLakeCatalog`.`test_a_db`.`test_part`
ALTER TABLE to add a partition preferably, as automatic adding through MSCK REPAIR TABLE has the following restraints:MSCK REPAIR TABLE only adds partitions to the metadata of the data table but does not delete them.MSCK REPAIR TABLE is not recommended if the data volume is large, as it will scan all the data, which may cause a timeout.MSCK REPAIR TABLE cannot be used.Feedback