Big Data

Analytics And More

Category: Pig

pig tutorial 7 – pig load and store functions with compression and shell, file and utility commands

July, 2017 adarsh

Load/Store functions determine how data goes into Pig and comes out of Pig. Pig provides a set of built-in load/store…

Posted in: Data Analytics, Pig, pig latin Filed under: pig latin, pig script

pig tutorial 6 – Eval Functions AVG, CONCAT, COUNT, COUNT_STAR, DIFF, IsEmpty, MAX, MIN, SIZE, SUM and TOKENIZE

adarsh

AVG: Use the AVG function to compute the average of the numeric values in a single-column bag. AVG requires a…

pig tutorial 5 – debugging pig with diagnostic operators like describe, dump, explain and illustrate

adarsh

DESCRIBE: Use the DESCRIBE operator to review the schema of a particular alias.

Input Service Data:
1,NDATEST,/shelf=0/slot/port=1
2,NDATEST,/shelf=0/slot/port=2
3,NDATEST,/shelf=0/slot/port=3
4,NDATEST,/shelf=0/slot/port=4…

pig tutorial 4 – inner join, outer join, replicated join, skewed join

adarsh

Inner JOIN: Use the JOIN operator to perform an inner equijoin of two or more relations based on common…

pig tutorial 3 – Flatten, GROUP, COGROUP, CROSS, DISTINCT, FILTER, FOREACH, LIMIT, Load, ORDER, SAMPLE, SPLIT, STORE, STREAM and UNION Operators

adarsh

Flatten Operator: The FLATTEN operator, which the Pig documentation groups with the arithmetic operators, looks like a UDF syntactically, but it is actually an…

pig tutorial 2 – pig data types, relations, bags, tuples, fields and parameter substitution

adarsh

Relations, Bags, Tuples, Fields: Pig Latin statements work with relations. A relation is a bag and a bag is a…

pig tutorial 1 – multiquery execution, store, dump, dependencies and replicated, skewed, merge joins

adarsh

A Pig Latin statement is an operator that takes a relation as input and produces another relation as output. This…

Copyright © 2017 Time Pass Techies