Both foldByKey() and reduceByKey() require that the return type of our result be the same type as that of the…
Spark supports partition-level functions, which operate on the data of one partition at a time. Working with data on a per-partition…
Spark supports zipping RDDs using functions such as zip, zipPartitions, zipWithIndex and zipWithUniqueId. Let's go through each of…
A window function calculates a return value for every input row of a table based on a group of rows,…
Here we want to find the difference between two DataFrames at the column level. We can use the dataframe1.except(dataframe2)…
Let's say we have a dataset as below and we want to split a single column into multiple columns using withColumn…
Let's create a DataFrame from a list of Row objects. First populate the list with Row objects and then we…