QuickTechie | Software & IT Professional Network

Cloudera CDP: How to change a column type in Spark SQL?

The key thing to remember is that in Spark, RDDs and DataFrames are immutable: once created, they cannot be changed.

However, there are many situations where you need a column to have a different type.

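For example, Spark's test resources include cars.csv, where the year column is read as a String. If you want to use a numeric or datetime function on it, the column needs a matching type. You cannot alter the existing column, but you can cast it to the type you need, for instance from string to integer or date, in a new DataFrame.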

Here is an example that casts the year column from String to Int.

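// Needed for the 'year symbol-to-Column syntax used below
import sqlContext.implicits._
// Load the CSV with a header row; every column is read as a String (read replaces the deprecated sqlContext.load)
val df2 = sqlContext.read.format("com.databricks.spark.csv").options(Map("path" -> "file:///Users/vshukla/projects/spark/sql/core/src/test/resources/cars.csv", "header" -> "true")).load()
df2.printSchema()
// Cast year to Int under a temporary name, then select it back as year
val df4 = df2.withColumn("year2", 'year.cast("Int")).select('year2 as 'year, 'make, 'model, 'comment)
df4.printSchema()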
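
On Spark 2.2 or later, the same idea can be written against SparkSession. The following is a minimal sketch rather than the code above: it uses a small in-memory DataFrame in place of cars.csv (the rows are made up for illustration), casts year to Int, and also derives a date column with to_date, assuming the year is a four-digit string. The names CastYearExample, castYear and year_date are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_date}

object CastYearExample {
  // Returns a new DataFrame; the input DataFrame is left unchanged.
  def castYear(df: DataFrame): DataFrame =
    df.withColumn("year_date", to_date(col("year"), "yyyy")) // String -> Date (assumes "yyyy")
      .withColumn("year", col("year").cast("int"))            // String -> Int, same column name

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("cast-year").getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for cars.csv; all columns start as String.
    val cars = Seq(("2012", "Tesla", "S"), ("1997", "Ford", "E350")).toDF("year", "make", "model")

    cars.printSchema()           // year is a string here
    castYear(cars).printSchema() // year is now an integer, year_date is a date
    spark.stop()
  }
}
```

Because withColumn replaces a column when the name already exists, the temporary year2 column and the extra select are not strictly needed in this form.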

Hi,

It's not working for me. I am unable to cast the column data type; I am getting only null values. Here is my code. Thanks!

val df4 = businessDF.withColumn("Estimated Gross Loss1", $"Estimated Gross Loss".cast("Int")).select($"Estimated Gross Loss1" as "Estimated Gross Loss")
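
One possible cause of this symptom, offered as a guess since the data is not shown: with ANSI mode off, cast("Int") returns null for any string that is not a plain integer, for example values carrying a currency symbol or thousands separators. A minimal sketch, assuming the column holds strings such as "$1,200" (the sample rows and the cleanLoss helper are hypothetical), strips the non-digit characters before casting:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_replace}

object CleanBeforeCast {
  val LossCol = "Estimated Gross Loss"

  // Hypothetical helper: remove everything except digits, then cast to Int.
  // This also drops minus signs and decimal points; adjust the pattern if the data has them.
  def cleanLoss(df: DataFrame): DataFrame =
    df.withColumn(LossCol, regexp_replace(col(LossCol), "[^0-9]", "").cast("int"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("clean-cast")
      .config("spark.sql.ansi.enabled", "false") // non-ANSI: bad casts yield null, not an error
      .getOrCreate()
    import spark.implicits._

    val businessDF = Seq("$1,200", "350").toDF(LossCol) // made-up sample values

    businessDF.select(col(LossCol).cast("int")).show() // "$1,200" becomes null
    cleanLoss(businessDF).show()                        // both rows become integers
    spark.stop()
  }
}
```

If the real values look different, inspecting a few raw rows with show(false) before casting is usually the quickest way to see what the cast is rejecting.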

Nice article. I am looking for a solution that avoids specifying each column separately. Is there any way to handle the data types dynamically? Say my DataFrame has 50 columns, 8 of which are decimals, and I need to convert all the decimal columns to double. Can we do that directly, without specifying the column names?

I am also looking for something like this. I need to convert all the date columns to varchar in a DataFrame with more than 300 columns.
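
Both requests above can be handled by walking the DataFrame's schema and casting by data type instead of by name. The following is a minimal sketch, not a built-in Spark feature: castColumnsOfType is a hypothetical helper, the sample columns are made up, and StringType is used as Spark's counterpart of varchar.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DataType, DateType, DecimalType, DoubleType, StringType}

object CastByType {
  // Hypothetical helper: cast every column whose type satisfies `matches` to `target`,
  // leaving all other columns and the column order untouched.
  def castColumnsOfType(df: DataFrame, matches: DataType => Boolean, target: DataType): DataFrame = {
    val projected = df.schema.fields.map { field =>
      val c = col(s"`${field.name}`") // backticks protect names with spaces or dots
      if (matches(field.dataType)) c.cast(target).as(field.name) else c
    }
    df.select(projected: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("cast-by-type").getOrCreate()
    import spark.implicits._

    // Made-up sample: one decimal, one date, one string column.
    val df = Seq((BigDecimal("1.50"), java.sql.Date.valueOf("2020-01-31"), "a"))
      .toDF("amount", "booked_on", "label")

    // All decimal columns (any precision/scale) -> double.
    val doubles = castColumnsOfType(df, _.isInstanceOf[DecimalType], DoubleType)
    // All date columns -> string.
    val strings = castColumnsOfType(doubles, _ == DateType, StringType)

    strings.printSchema()
    spark.stop()
  }
}
```

Building one select over the whole schema keeps the plan to a single projection, which tends to scale better on wide DataFrames than chaining hundreds of withColumn calls.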
