Spark convert json string to dataframe

In this article, we will explore how we can covert a json string into a Spark dataframe. In the below code we have an employeeSystemStruct which is a string that we want to convert to a Spark dataframe.

Below is the code

object ConvertJsonStringToDataFrame extends App {

val employeeSystemStruct = "[
{
"id": "343434",
"name": "Mark",
"dob": "2019-01-01",
"company": "microsoft"
},
{
"id": "56565",
"name": "Steve",
"dob": "2019-01-01",
"company": "google"
},
{
"id": "787878",
"name": "Cummins",
"dob": "2019-01-01",
"company": "microsoft"
},
{
"id": "7872323",
"name": "nassir",
"dob": "2019-01-01",
"company": "microsoft"
}
]"

val df=Seq(employeeSystemStruct).toDS()
val jsondf = spark.read.json(df)

display(jsondf)

}

We can also convert this to a list of a custom object using the below code

object ConvertJsonStringToDataFrame extends App {

case class EmployeeStruct(id: String,name:String,
dob: String,name:String)

val employeeSystemStruct = "[
{
"id": "343434",
"name": "Mark",
"dob": "2019-01-01",
"company": "microsoft"
},
{
"id": "56565",
"name": "Steve",
"dob": "2019-01-01",
"company": "google"
},
{
"id": "787878",
"name": "Cummins",
"dob": "2019-01-01",
"company": "microsoft"
},
{
"id": "7872323",
"name": "nassir",
"dob": "2019-01-01",
"company": "microsoft"
}
]"

val df=Seq(employeeSystemStruct).toDS()
val jsondf = spark.read.json(df)

val employeeList = jsondf
.collect()
.map(row => {
EmployeeStruct
.apply(row.getAs("id"), row.getAs("name"),row.getAs("dob"),row.getAs("name"))
})

employeeList.foreach(hs => {
println(employeeList)

})

}