Question-23: A developer running PySpark jobs complains that the jobs are not picking up the required version of Python. What do you recommend?
- Ask the developer to set the PYSPARK_PYTHON environment variable to point to the correct Python executable before running the pyspark command.
- Ask the developer to set the PYSPARK_DRIVER_PYTHON environment variable to point to the correct Python executable before running the pyspark command.
- You need to have only one version of Python on the Cloudera CDP Private Cloud Base cluster.
- You need to change the Spark service configuration in Cloudera Manager to point to the correct Python version.
Answer:
This Question is from QuickTechie Cloudera CDP Certification Preparation Kit.
Exp: Spark
Spark 2.4 supports Python 2.7 and 3.4-3.7.
Spark 3.0 supports Python 2.7 and 3.4 and higher, although support for Python 2 and 3.4 to 3.5 is deprecated.
Spark 3.1 supports Python 3.6 and higher.
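If the right level of Python is not picked up by default, set the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables to point to the correct Python executable before running the pyspark command.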
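As a rough illustration of setting both variables before launching PySpark, here is a minimal Python sketch. The helper name pyspark_env is hypothetical (it is not part of Spark or Cloudera tooling), and using sys.executable as the interpreter path is only an assumption for the example; on a real cluster the path would need to exist on the worker nodes as well.

```python
import os
import sys


def pyspark_env(python_executable, base_env=None):
    """Return a copy of the environment with both PySpark interpreter variables set.

    pyspark_env is a hypothetical helper for illustration only.
    """
    # Copy so the caller's environment mapping is not modified.
    env = dict(os.environ if base_env is None else base_env)
    # Interpreter used by the executors (and by the driver unless overridden).
    env["PYSPARK_PYTHON"] = python_executable
    # Interpreter used by the driver process.
    env["PYSPARK_DRIVER_PYTHON"] = python_executable
    return env


# Assumption: the interpreter running this script is the desired one.
env = pyspark_env(sys.executable)
print(env["PYSPARK_PYTHON"])
print(env["PYSPARK_DRIVER_PYTHON"])
```

The resulting mapping could then be passed to a launcher, for example subprocess.run(["pyspark"], env=env), assuming the pyspark command is on the PATH.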