Question-23: A developer running PySpark jobs complains that his jobs are not picking up the required version of Python. What do you recommend?

  1. Ask the developer to set the PYSPARK_PYTHON environment variable to point to the correct Python executable before running the pyspark command.
  2. Ask the developer to set the PYSPARK_DRIVER_PYTHON environment variable to point to the correct Python executable before running the pyspark command.
  3. You need to have only one version of Python on the Cloudera CDP Private Cloud Base cluster.
  4. You need to make the change in Cloudera Manager, in the Spark service, to point to the correct Python version.

Answer:

Exp: Spark

Spark 2.4 supports Python 2.7 and 3.4-3.7.

Spark 3.0 supports Python 2.7 and 3.4 and higher, although support for Python 2 and 3.4 to 3.5 is deprecated.

Spark 3.1 supports Python 3.6 and higher.
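Before troubleshooting further, it helps to confirm which Python version the interpreter actually reports, so it can be compared against the support matrix above. A minimal check (assuming `python3` is on the PATH):

```shell
# Print the major.minor version of the default python3 interpreter,
# to compare against the Spark support matrix (e.g. Spark 3.1 needs 3.6+).
python3 -c 'import sys; print(".".join(map(str, sys.version_info[:2])))'
```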

If the right level of Python is not picked up by default, set the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables to point to the correct Python executable before running the pyspark command.
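In practice this can be sketched as the following shell commands; the interpreter path `/usr/bin/python3` is an example only and should be replaced with the executable actually installed on the cluster nodes:

```shell
# Point both the executors (PYSPARK_PYTHON) and the driver
# (PYSPARK_DRIVER_PYTHON) at the desired interpreter.
# /usr/bin/python3 is an assumed example path.
export PYSPARK_PYTHON=/usr/bin/python3
export PYSPARK_DRIVER_PYTHON=/usr/bin/python3

# Then launch PySpark or submit the job as usual, e.g.:
# pyspark
# spark-submit my_job.py
echo "Using PYSPARK_PYTHON=$PYSPARK_PYTHON"
```

Setting both variables keeps the driver and executor Python versions consistent, which PySpark requires.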
