Hive Archives

Hive tutorial 9 – Hive performance tuning using join optimization with common, map, bucket and skew join

August, 2017 adarsh

Common join: The common join is also called the reduce-side join. It is a basic join in Hive and works…

Local mode: Hadoop can run in standalone, pseudo-distributed, and fully distributed modes. Most of the time, we need to configure…

Hive supports TEXTFILE, SEQUENCEFILE, RCFILE, ORC, and PARQUET file formats. The three ways to specify the file format are as…

Hive partitioning is one of the most effective methods to improve query performance on large tables. The query with…

Hive provides an EXPLAIN command to return a query execution plan without running the query. We can use an EXPLAIN…

Analytic functions are usually used with OVER, PARTITION BY, ORDER BY, and the windowing specification. Standard aggregations – COUNT(), SUM(),…

Hive offers several built-in aggregate functions, such as MAX, MIN, AVG, and so on. Hive also supports advanced aggregation by…