On HDP 2.4, some services may have corrupted jar and tar.gz files on HDFS. The specific files I have seen broken are as follows:

  • hive.tar.gz
  • mapreduce.tar.gz
  • hadoop-streaming.jar
  • pig.tar.gz
  • spark-hdp-assembly.jar
  • sqoop.tar.gz
  • tez.tar.gz

All of these are found in the /hdp/apps/<hdp-version> directory. On my install, they all had zero size (reported as 0.1 kB on HDFS File View). This led to errors in a variety of services, including the following:
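To find affected files without opening HDFS File View, the recursive listing can be filtered by size. This is a minimal sketch, not part of the original fix: the function name `find_small_files` and the 1024-byte default threshold are assumptions, and it expects the usual eight-column `hdfs dfs -ls -R` output with paths that contain no spaces.

```shell
# Print "size path" for every regular file at or below a size threshold.
# Reads `hdfs dfs -ls -R` output on stdin.
# The default threshold of 1024 bytes is an assumption; pass another value as $1.
find_small_files() {
  max_bytes="${1:-1024}"
  # Columns: perms, replication, owner, group, size, date, time, path.
  # Regular files have perms starting with "-"; directories start with "d".
  awk -v max="$max_bytes" '$1 ~ /^-/ && $5 + 0 <= max + 0 { print $5, $8 }'
}

# Usage (substitute your own version directory):
#   hdfs dfs -ls -R /hdp/apps/<hdp-version> | find_small_files
```

Any tar.gz or jar that shows up in the output with a size near zero is a candidate for replacement.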

  • gzip: /foo/bar/yarn/local/filecache/11_tmp/tmp_mapreduce.tar.gz: unexpected end of file
  • tar: This does not look like a tar archive
  • Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher

There may be other errors, but those are the ones I personally experienced. This is fairly easy to fix. Each corrupt file has a healthy version on the local file system. The healthy version must be copied from the local system to HDFS, replacing the corrupt version. For example, to update Tez, perform the following:

$ hdfs dfs -rm /hdp/apps/<hdp-version>/tez/*

$ hdfs dfs -put /usr/hdp/current/tez-client/lib/tez.tar.gz /hdp/apps/<hdp-version>/tez/

$ hdfs dfs -chmod 444 /hdp/apps/<hdp-version>/tez/tez.tar.gz

Problems caused by a corrupt tar on Tez should now be fixed.
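The same three steps apply to the other files in the list, so they can be wrapped in a small helper. This is a sketch under assumptions: `replace_on_hdfs` and the `DRY_RUN` variable are hypothetical names, and the local path of each healthy file has to be supplied by you, since only the Tez path is given above. Unlike the Tez example, it removes only the one named file rather than everything in the directory.

```shell
# Replace one file on HDFS with its local copy: remove, upload, make read-only.
# Set DRY_RUN=1 to print the commands instead of running them.
replace_on_hdfs() {
  src="$1"        # local healthy file, e.g. the tez.tar.gz under /usr/hdp/current
  dest_dir="$2"   # HDFS directory that holds the corrupt copy
  name=$(basename "$src")
  run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then
    run="echo"
  fi
  # Stop at the first failing step so a failed upload is never chmod-ed.
  $run hdfs dfs -rm -f "$dest_dir/$name" &&
  $run hdfs dfs -put "$src" "$dest_dir/" &&
  $run hdfs dfs -chmod 444 "$dest_dir/$name"
}

# Usage (dry run first, then for real):
#   DRY_RUN=1 replace_on_hdfs /usr/hdp/current/tez-client/lib/tez.tar.gz /hdp/apps/<hdp-version>/tez
```

Running it once with DRY_RUN=1 shows the exact commands before anything on HDFS is deleted.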


I am having the same issue with broken links. Any idea why the links break? It's happening every week.

