There are a number of ways to get pair RDDs in Spark and many formats will directly load pair RDDs…
An RDD in Spark is simply an immutable distributed collection of objects. Each RDD is split into multiple partitions, which…
Chain of Responsibility is a behavioral design pattern that helps decouple the sender of a request from its receiver…
The Decorator pattern is a structural design pattern that attaches additional responsibilities to…
The builder pattern is an object creation software design pattern. The intention of the builder pattern is to find a…
I will explain how to use MultipleInputs to process linelength and speeddata from ems. The input format we will…
I have covered most of the Oozie actions in the previous tutorial, and below are some of the random topics…