1. Copy hadoop-cos-2.x.x-${version}.jar, cos_api-bundle-${version}.jar, and chdfs_hadoop_plugin_network-${version}.jar to plugin/reader/hdfsreader/libs/ and plugin/writer/hdfswriter/libs/ in the extracted DataX path.
2. Open the bin/datax.py script in the DataX decompression directory and modify the CLASS_PATH variable in the script as follows:

```python
CLASS_PATH = ("%s/lib/*:%s/plugin/reader/hdfsreader/libs/*:%s/plugin/writer/hdfswriter/libs/*:.") % (DATAX_HOME, DATAX_HOME, DATAX_HOME)
```
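The copy in step 1 can be scripted. The following is a minimal sketch, assuming DataX is extracted to /usr/local/service/datax and the three jars have been downloaded to the current directory (both paths are examples, not fixed by DataX):

```bash
# Assumption: adjust DATAX_HOME to your actual extraction path.
export DATAX_HOME=/usr/local/service/datax

# Copy the COSN jars into both the reader and writer plugin libs.
for d in plugin/reader/hdfsreader/libs plugin/writer/hdfswriter/libs; do
  cp hadoop-cos-2.*.jar cos_api-bundle-*.jar chdfs_hadoop_plugin_network-*.jar "${DATAX_HOME}/${d}/"
done
```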
3. Configure hdfsreader and hdfswriter in the JSON configuration file:

```json
{
  "job": {
    "setting": {
      "speed": {
        "byte": 10485760
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.02
      }
    },
    "content": [
      {
        "reader": {
          "name": "hdfsreader",
          "parameter": {
            "path": "/test/",
            "defaultFS": "cosn://examplebucket1-1250000000/",
            "column": ["*"],
            "fileType": "text",
            "encoding": "UTF-8",
            "hadoopConfig": {
              "fs.cosn.impl": "org.apache.hadoop.fs.CosFileSystem",
              "fs.cosn.trsf.fs.ofs.bucket.region": "ap-guangzhou",
              "fs.cosn.bucket.region": "ap-guangzhou",
              "fs.cosn.tmp.dir": "/tmp/hadoop_cos",
              "fs.cosn.trsf.fs.ofs.tmp.cache.dir": "/tmp/",
              "fs.cosn.userinfo.secretId": "COS_SECRETID",
              "fs.cosn.userinfo.secretKey": "COS_SECRETKEY",
              "fs.cosn.trsf.fs.ofs.user.appid": "1250000000"
            },
            "fieldDelimiter": ","
          }
        },
        "writer": {
          "name": "hdfswriter",
          "parameter": {
            "path": "/",
            "fileName": "hive.test",
            "defaultFS": "cosn://examplebucket2-1250000000/",
            "column": [
              {"name": "col1", "type": "int"},
              {"name": "col2", "type": "string"}
            ],
            "fileType": "text",
            "encoding": "UTF-8",
            "hadoopConfig": {
              "fs.cosn.impl": "org.apache.hadoop.fs.CosFileSystem",
              "fs.cosn.trsf.fs.ofs.bucket.region": "ap-guangzhou",
              "fs.cosn.bucket.region": "ap-guangzhou",
              "fs.cosn.tmp.dir": "/tmp/hadoop_cos",
              "fs.cosn.trsf.fs.ofs.tmp.cache.dir": "/tmp/",
              "fs.cosn.userinfo.secretId": "COS_SECRETID",
              "fs.cosn.userinfo.secretKey": "COS_SECRETKEY",
              "fs.cosn.trsf.fs.ofs.user.appid": "1250000000"
            },
            "fieldDelimiter": ",",
            "writeMode": "append"
          }
        }
      }
    ]
  }
}
```
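Before proceeding, you can optionally confirm the file is valid JSON. One quick check uses Python's built-in json.tool module, assuming the file is saved as job/hdfs_job.json (the filename used in the step below):

```bash
# Pretty-prints the file if it parses, or reports the first syntax error.
python -m json.tool job/hdfs_job.json
```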
4. Modify hadoopConfig as required for cosn:
- Set defaultFS to the COSN path, such as cosn://examplebucket-1250000000/.
- Set fs.cosn.userinfo.region and fs.cosn.trsf.fs.ofs.bucket.region to the bucket region, such as ap-guangzhou. For more information, see Regions and Access Endpoints.
- For COS_SECRETID and COS_SECRETKEY, use your own COS key information.
- Set fs.ofs.user.appid and fs.cosn.trsf.fs.ofs.user.appid to your appid.
5. Save the configuration as hdfs_job.json in the job directory and run the following command:

```bash
[root@172 /usr/local/service/datax]# python bin/datax.py job/hdfs_job.json
```
When the job completes, statistics similar to the following are printed:

```
2022-10-23 00:25:24.954 [job-0] INFO  JobContainer -
	 [total cpu info] =>
		averageCpu | maxDeltaCpu | minDeltaCpu
		-1.00%     | -1.00%      | -1.00%

	 [total gc info] =>
		 NAME         | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
		 PS MarkSweep | 1            | 1               | 1               | 0.034s      | 0.034s         | 0.034s
		 PS Scavenge  | 14           | 14              | 14              | 0.059s      | 0.059s         | 0.059s

2022-10-23 00:25:24.954 [job-0] INFO  JobContainer - PerfTrace not enable!
2022-10-23 00:25:24.954 [job-0] INFO  StandAloneJobContainerCommunicator - Total 1000003 records, 9322478 bytes | Speed 910.40KB/s, 100000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 1.000s | All Task WaitReaderTime 6.259s | Percentage 100.00%
2022-10-23 00:25:24.955 [job-0] INFO  JobContainer -
Job start time               : 2022-10-23 00:25:12
Job end time                 : 2022-10-23 00:25:24
Job duration                 : 12s
Average job traffic          : 910.40 KB/s
Record write speed           : 100000 records/s
Total number of read records : 1000003
Read/Write failure count     : 0
```
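To double-check the result, you can list the destination bucket with a Hadoop client that already has the COSN jars and credentials configured (for example, on an EMR node). The bucket name below is the example destination from the job file above:

```bash
# Lists the files hdfswriter produced under the bucket root.
hadoop fs -ls cosn://examplebucket2-1250000000/
```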
To use Ranger for authorization, do the following:
1. Copy cosn-ranger-interface-1.x.x-${version}.jar and hadoop-ranger-client-for-hadoop-${version}.jar to plugin/reader/hdfsreader/libs/ and plugin/writer/hdfswriter/libs/ in the extracted DataX path. Click here to download them.
2. Configure hdfsreader and hdfswriter in the JSON configuration file:

```json
{
  "job": {
    "setting": {
      "speed": {
        "byte": 10485760
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.02
      }
    },
    "content": [
      {
        "reader": {
          "name": "hdfsreader",
          "parameter": {
            "path": "/test/",
            "defaultFS": "cosn://examplebucket1-1250000000/",
            "column": ["*"],
            "fileType": "text",
            "encoding": "UTF-8",
            "hadoopConfig": {
              "fs.cosn.impl": "org.apache.hadoop.fs.CosFileSystem",
              "fs.cosn.trsf.fs.ofs.bucket.region": "ap-guangzhou",
              "fs.cosn.bucket.region": "ap-guangzhou",
              "fs.cosn.tmp.dir": "/tmp/hadoop_cos",
              "fs.cosn.trsf.fs.ofs.tmp.cache.dir": "/tmp/",
              "fs.cosn.trsf.fs.ofs.user.appid": "1250000000",
              "fs.cosn.credentials.provider": "org.apache.hadoop.fs.auth.RangerCredentialsProvider",
              "qcloud.object.storage.zk.address": "172.16.0.30:2181",
              "qcloud.object.storage.ranger.service.address": "172.16.0.30:9999",
              "qcloud.object.storage.kerberos.principal": "hadoop/172.16.0.30@EMR-5IUR9VWW"
            },
            "haveKerberos": "true",
            "kerberosKeytabFilePath": "/var/krb5kdc/emr.keytab",
            "kerberosPrincipal": "hadoop/172.16.0.30@EMR-5IUR9VWW",
            "fieldDelimiter": ","
          }
        },
        "writer": {
          "name": "hdfswriter",
          "parameter": {
            "path": "/",
            "fileName": "hive.test",
            "defaultFS": "cosn://examplebucket2-1250000000/",
            "column": [
              {"name": "col1", "type": "int"},
              {"name": "col2", "type": "string"}
            ],
            "fileType": "text",
            "encoding": "UTF-8",
            "hadoopConfig": {
              "fs.cosn.impl": "org.apache.hadoop.fs.CosFileSystem",
              "fs.cosn.trsf.fs.ofs.bucket.region": "ap-guangzhou",
              "fs.cosn.bucket.region": "ap-guangzhou",
              "fs.cosn.tmp.dir": "/tmp/hadoop_cos",
              "fs.cosn.trsf.fs.ofs.tmp.cache.dir": "/tmp/",
              "fs.cosn.trsf.fs.ofs.user.appid": "1250000000",
              "fs.cosn.credentials.provider": "org.apache.hadoop.fs.auth.RangerCredentialsProvider",
              "qcloud.object.storage.zk.address": "172.16.0.30:2181",
              "qcloud.object.storage.ranger.service.address": "172.16.0.30:9999",
              "qcloud.object.storage.kerberos.principal": "hadoop/172.16.0.30@EMR-5IUR9VWW"
            },
            "haveKerberos": "true",
            "kerberosKeytabFilePath": "/var/krb5kdc/emr.keytab",
            "kerberosPrincipal": "hadoop/172.16.0.30@EMR-5IUR9VWW",
            "fieldDelimiter": ",",
            "writeMode": "append"
          }
        }
      }
    ]
  }
}
```
3. Modify the configuration items as follows:
- Set fs.cosn.credentials.provider to org.apache.hadoop.fs.auth.RangerCredentialsProvider to use Ranger for authorization.
- Set qcloud.object.storage.zk.address to the ZooKeeper address.
- Set qcloud.object.storage.ranger.service.address to the COS Ranger address.
- Set haveKerberos to true.
- Set qcloud.object.storage.kerberos.principal and kerberosPrincipal to the Kerberos authentication principal name (which can be read from core-site.xml in the EMR environment with Kerberos enabled).
- Set kerberosKeytabFilePath to the absolute path of the keytab authentication file (which can be read from ranger-admin-site.xml in the EMR environment with Kerberos enabled).

FAQ

What should I do if the java.io.IOException: Permission denied: no access groups bound to this mountPoint examplebucket2-1250000000, access denied or java.io.IOException: Permission denied: No access rules matched error is reported?
These errors indicate that Ranger found no access rule matching the request. Check whether read/write permission policies for the bucket have been configured in Ranger.

What should I do if the java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.cosn.ranger.client.RangerQcloudObjectStorageClientImpl not found error is reported?
Check whether cosn-ranger-interface-1.x.x-${version}.jar and hadoop-ranger-client-for-hadoop-${version}.jar have been copied to plugin/reader/hdfsreader/libs/ and plugin/writer/hdfswriter/libs/ in the extracted DataX path (click here to download them).

What should I do if the java.io.IOException: Login failure for hadoop/_HOST@EMR-5IUR9VWW from keytab /var/krb5kdc/emr.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user error is reported?
This happens when kerberosPrincipal and qcloud.object.storage.kerberos.principal are mistakenly set to hadoop/_HOST@EMR-5IUR9VWW instead of hadoop/172.16.0.30@EMR-5IUR9VWW. As DataX cannot resolve a _HOST domain name, you need to replace _HOST with an IP. You can run the klist -ket /var/krb5kdc/emr.keytab command to find an appropriate principal, as shown in the sketch after this FAQ.

What should I do if the java.io.IOException: init fs.cosn.ranger.plugin.client.impl failed error is reported?
Check whether qcloud.object.storage.kerberos.principal is configured in hadoopConfig in the JSON file; if not, configure it.
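A quick way to inspect the keytab and confirm a usable principal; the keytab path and principal below are the examples used throughout this document, so substitute your own values:

```bash
# List the principals in the keytab, with timestamps and encryption types.
klist -ket /var/krb5kdc/emr.keytab

# Confirm the chosen principal can actually log in from the keytab.
kinit -kt /var/krb5kdc/emr.keytab hadoop/172.16.0.30@EMR-5IUR9VWW
klist   # shows the ticket cache if the login succeeded
```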