Big Data

Analytics And More

Tag: mapreduce filtering patterns

mapreduce example to find the distinct set of data

June, 2017 adarsh

This pattern exploits MapReduce’s ability to group keys together to remove duplicates. This pattern uses a mapper to transform the…

Posted in: Data Analytics, Map Reduce Filed under: map reduce, map reduce design pattern, mapreduce filtering patterns
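
The teaser for the distinct pattern is cut short, so here is only a rough sketch of the general idea, not the post's own code. It simulates the map, shuffle and reduce phases in plain Python; the function names (`map_record`, `reduce_key`, `run_distinct`) are made up for illustration, and sorting stands in for Hadoop's shuffle-and-sort.

```python
from itertools import groupby


def map_record(record):
    # Emit the whole record as the key; the value carries no information.
    yield record, None


def reduce_key(key, values):
    # All duplicates of a key arrive in one group, so emit the key once.
    yield key


def run_distinct(records):
    # Map phase: turn every record into a (key, value) pair.
    pairs = [kv for record in records for kv in map_record(record)]
    # Stand-in for shuffle-and-sort: bring equal keys next to each other.
    pairs.sort(key=lambda kv: kv[0])
    # Reduce phase: one call per distinct key.
    output = []
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        output.extend(reduce_key(key, (value for _, value in group)))
    return output
```

Calling `run_distinct(["b", "a", "b", "c", "a"])` returns each value once, in sorted order because of the simulated shuffle.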

mapreduce example to find top n records in a sample data

adarsh 1 Comment

Finding outliers is an important part of data analysis because these records are typically the most interesting and unique pieces…

Posted in: Data Analytics, Map Reduce Filed under: map reduce, map reduce design pattern, mapreduce filtering patterns
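
One common way to get the top n records, sketched here under the assumption that a single reducer is acceptable: each mapper keeps only its local top n, and the reducer merges those short candidate lists. The helper names and the `key` ranking function are hypothetical; the full post may do it differently.

```python
import heapq


def map_top_n(split, n, key):
    # Each mapper sees one input split and keeps only its n best records.
    return heapq.nlargest(n, split, key=key)


def reduce_top_n(candidate_lists, n, key):
    # A single reducer merges the per-mapper candidates into the global top n.
    merged = [record for candidates in candidate_lists for record in candidates]
    return heapq.nlargest(n, merged, key=key)


def run_top_n(splits, n, key=lambda record: record):
    # Simulate one mapper per split followed by one reducer.
    candidate_lists = [map_top_n(split, n, key) for split in splits]
    return reduce_top_n(candidate_lists, n, key)
```

Because every mapper forwards at most n records, the reducer handles at most n records per mapper regardless of input size.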

mapreduce bloom filter example, pattern and optimization with sample data

June, 2017 adarsh 4 Comments

Bloom filtering is similar to generic filtering in that it is looking at each record and deciding whether to keep…

Posted in: Data Analytics, Map Reduce Filed under: map reduce, map reduce design pattern, mapreduce filtering patterns
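
To make the Bloom filtering idea concrete, below is a toy sketch in plain Python. It assumes the filter is built ahead of time from a set of "hot" keys and then consulted in a map-only step. The `BloomFilter` class, its sizes and the SHA-256 based hashing are illustrative choices only, not Hadoop's Bloom filter implementation.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = 1

    def might_contain(self, item):
        # False means definitely absent; True means "probably present".
        return all(self.bits[position] for position in self._positions(item))


def bloom_map(records, bloom, key):
    # Map-only step: keep a record only if its key passes the membership test.
    for record in records:
        if bloom.might_contain(key(record)):
            yield record
```

A Bloom filter never produces false negatives, so every record whose key was added is kept; a few records with other keys may slip through as false positives.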

mapreduce example for simple random sampling of data

June, 2017 adarsh

In simple random sampling (SRS), we want to grab a subset of our larger data set in which each record…

Posted in: Data Analytics, Map Reduce Filed under: map reduce, map reduce design pattern, mapreduce filtering patterns
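
A minimal sketch of sampling as a map-only job, assuming each record is kept independently with the same probability. Strictly speaking this is Bernoulli sampling: the sample size varies around p × n instead of being fixed. The `sample_map` name and the `seed` parameter are hypothetical, added only to make runs repeatable.

```python
import random


def sample_map(records, keep_probability, seed=None):
    # A private generator keeps the sketch reproducible when a seed is given.
    rng = random.Random(seed)
    for record in records:
        # rng.random() is uniform on [0, 1), so every record has the
        # same chance of being kept, independent of the others.
        if rng.random() < keep_probability:
            yield record
```

For example, `list(sample_map(lines, 0.01))` keeps roughly one percent of `lines`; no reducer is needed because records are never compared with each other.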

mapreduce example to filter data

adarsh

Filtering serves as an abstract pattern for some of the other patterns. Filtering simply evaluates each record separately and decides,…

Posted in: Data Analytics, Map Reduce Filed under: map reduce, map reduce design pattern, mapreduce filtering patterns
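
Since filtering looks at each record on its own, it can be sketched as a map-only step with a user-supplied predicate. The `filter_map` function and the `keep` predicate below are illustrative names, not taken from the post.

```python
def filter_map(records, keep):
    # Evaluate each record separately; emit it unchanged if the predicate holds.
    for record in records:
        if keep(record):
            yield record


if __name__ == "__main__":
    log_lines = ["INFO start", "ERROR disk full", "INFO done"]
    # Keep only the error lines.
    for line in filter_map(log_lines, lambda l: l.startswith("ERROR")):
        print(line)
```

No shuffle or reduce phase is involved, so the output is simply the surviving records in input order.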

Copyright © 2017 Time Pass Techies