Check-sum mismatch error when running distcp across Hadoop versions

Error message:

Caused by: java.io.IOException: Check-sum mismatch between hdfs://192.168.81.30:8020/user/hive/warehouse/xy_ods.db/ods_verification_cardno_d_incr/pk_year=2017/pk_month=2017-07/pk_day=2017-07-23/000011_0 and hdfs://172.20.85.39:8020/user/hive/warehouse/xy_ods.db/ods_verification_cardno_d_incr/pk_year=2017/.distcp.tmp.attempt_1534853344008_892046_m_000020_1. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
     at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareCheckSums(RetriableFileCopyCommand.java:221)
     at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:134)

Solution:

The destination Hadoop cluster runs a newer version than the source, so it is best to use the -skipcrccheck option together with -update. -skipcrccheck tells distcp to skip the FileChecksum comparison, since a version difference can produce different checksum values; this is exactly the situation between CDH5 and CDH6.

The command is:

hadoop distcp -D mapreduce.job.queuename=xy_yarn_pool.development -D ipc.client.fallback-to-simple-auth-allowed=true -update -skipcrccheck hdfs://192.168.81.30:8020/user/hive/warehouse/xy_ods.db/ods_verification_cardno_d_incr/pk_year=2017/pk_month=2017-07/pk_day=2017-07-23/000011_0 hdfs://172.20.85.39:8020/user/hive/warehouse/xy_ods.db/ods_verification_cardno_d_incr/pk_year=2017/pk_month=2017-07/pk_day=2017-07-23/000011_0
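If you prefer not to bypass checksum verification, the error message above also suggests -pb, which preserves the source block size on the target so that the block-level checksums have a chance to match. A sketch of that variant, using the same queue settings and paths as above, would be:

hadoop distcp -D mapreduce.job.queuename=xy_yarn_pool.development -D ipc.client.fallback-to-simple-auth-allowed=true -pb -update hdfs://192.168.81.30:8020/user/hive/warehouse/xy_ods.db/ods_verification_cardno_d_incr/pk_year=2017/pk_month=2017-07/pk_day=2017-07-23/000011_0 hdfs://172.20.85.39:8020/user/hive/warehouse/xy_ods.db/ods_verification_cardno_d_incr/pk_year=2017/pk_month=2017-07/pk_day=2017-07-23/000011_0

Note that -pb may still fail if the two versions compute file checksums with different algorithms, in which case -skipcrccheck remains the fallback.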

