Big Data

Analytics And More

Category: Data Analytics

using hive udaf in spark sql

November, 2018 adarsh

In this article I will demonstrate how to build a Hive UDAF and execute it in Apache Spark. In Hive…

Posted in: Data Analytics, Hive, Spark Filed under: datasets and dataframe, hive, Spark Rdd

using hive udf in spark sql

October, 2018 adarsh

In this article I will demonstrate how to build a Hive UDF and execute it in Apache Spark. Hive user-defined…

Posted in: Hive, Spark Filed under: datasets and dataframe, hive, Spark Rdd

spark textFileStream to find Relative Strength Index or RSI of stocks with sliding window and reduceByKeyAndWindow example

October, 2018 adarsh

The Relative Strength Index is a momentum indicator that measures the magnitude of recent price changes to analyze overbought or…

Posted in: Data Analytics, Spark Filed under: Spark Rdd
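
The teaser above describes RSI only in words, so here is a minimal sketch of the arithmetic in plain Python. It is not the post's Spark streaming code. It assumes the simple-average variant of RSI (not Wilder's smoothed variant), and the function name `simple_rsi` and the default period of 14 are illustrative choices, not taken from the post.

```python
def simple_rsi(closes, period=14):
    """Simple-average RSI over the last `period` price changes.

    Assumption: gains and losses are averaged with a plain mean,
    not Wilder's exponential smoothing.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")

    # Keep the last period + 1 closes, which yield `period` changes.
    window = closes[-(period + 1):]
    changes = [later - earlier for earlier, later in zip(window, window[1:])]

    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period

    # With no losses in the window, RSI is defined as 100.
    if avg_loss == 0:
        return 100.0

    relative_strength = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + relative_strength)
```

In a sliding-window streaming job, a function like this would be applied per stock symbol to the closing prices collected in each window.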

spark textFileStream example to process json data

adarsh 2d Comments

Problem to solve: Calculate the trading volume of the stocks every 10 minutes and decide which stock to purchase.…

Posted in: Data Analytics, Spark Filed under: Spark Rdd

spark textFileStream with sliding window and reduceByKeyAndWindow example

adarsh 1 Comment

Problem to solve: Calculate the maximum profit (average closing price – average opening price) in a 5-minute sliding window…

Posted in: Data Analytics, Spark Filed under: Spark Rdd

spark file streaming with sliding window to calculate the simple moving average using reduceByKeyAndWindow

October, 2018 adarsh 1 Comment

Problem to solve: Calculate the simple moving average closing price of stocks in a 5-minute sliding window for the…

Posted in: Data Analytics, Spark Filed under: Spark Rdd

spark example for jaccard similarity for lsh algorithm

adarsh

The Jaccard similarity index, also called the Jaccard similarity coefficient, compares two datasets to see which data is shared and which…

Posted in: Data Analytics, Spark Filed under: Spark Rdd
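
For readers who want the definition behind the teaser above: the Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The following plain-Python sketch shows that calculation. It is not the post's Spark code; the function name `jaccard_similarity` is illustrative, and returning 1.0 for two empty sets is an assumed convention.

```python
def jaccard_similarity(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b

    # Assumption: two empty sets are treated as identical.
    if not union:
        return 1.0

    return len(set_a & set_b) / len(union)
```

For example, `{1, 2, 3}` and `{2, 3, 4}` share two of four distinct elements, giving a similarity of 0.5.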

Copyright © 2017 Time Pass Techies