@avkashchauhan
http://www.linkedin.com/in/avkashchauhan
Introduction to Apache Pig
http://pig.apache.org/philosophy.html
http://www.slideshare.net/mortardata/mongodb-pig-on-hadoop

http://www.slideshare.net/jeromatron/pig-with-cassandra-adventures-in-analytics
http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.pig%22



http://pig.apache.org/docs/r0.11.0/basic.html
Pig Version
     #pig -version
{1,{1,2,3}}
http://pig.apache.org/docs/r0.11.0/basic.html#Relational+Operators
http://pig.apache.org/docs/r0.11.0/func.html
AVG
CONCAT
COUNT
COUNT_STAR
DIFF
IsEmpty
MAX
MIN
SIZE
SUM
TOKENIZE
http://pig.apache.org/docs/r0.11.0/func.html#eval-functions
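Of the eval functions listed above, COUNT and COUNT_STAR are the pair most easily confused. The following is a rough Python model of their documented behavior, not Pig's own implementation: COUNT skips a tuple when its first field is null, while COUNT_STAR counts every tuple. The helper names `pig_count` and `pig_count_star` are invented for this sketch, bags are modeled as lists of tuples, and null is modeled as `None`.

```python
def pig_count(bag):
    """Model of COUNT: ignore tuples whose first field is null."""
    return sum(1 for t in bag if len(t) > 0 and t[0] is not None)


def pig_count_star(bag):
    """Model of COUNT_STAR: count every tuple, nulls included."""
    return len(bag)


# A bag of three tuples; the second has a null first field.
bag = [(1, "a"), (None, "b"), (3, None)]

print(pig_count(bag))       # 2
print(pig_count_star(bag))  # 3
```

The third tuple still counts under COUNT because only the first field is inspected.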
ABS      FLOOR
ACOS     LOG
ASIN     LOG10
ATAN     RANDOM
CBRT     ROUND
CEIL     SIN
COS      SINH
COSH     SQRT
EXP      TAN
         TANH
http://pig.apache.org/docs/r0.11.0/func.html#math-functions
http://pig.apache.org/docs/r0.11.0/udf.html
UDF Category           Function Name
Load UDF Functions     PigStorage, HBaseStorage, TextLoader
Store UDF Functions    PigStorage, HBaseStorage
Evaluation Functions   ABS, ROUND, EXP, LOG, SUM, SIZE, ACOS, ASIN, ATAN, RANDOM
Filter Functions       IsEmpty
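The table lists built-in functions in each category, but user code can fill the same roles. As a minimal sketch of an evaluation-style UDF, the Python below is written the way Pig's Jython UDF support expects: a plain function annotated with `@outputSchema`. The function `normalize`, the file name `udfs.py`, and the alias `myfuncs` are hypothetical, and the fallback decorator is a stand-in so the file also runs outside Pig.

```python
# Pig makes `outputSchema` available when it runs this file under Jython.
# The fallback below is a stand-in (an assumption for this sketch) so the
# same file can be exercised under plain Python.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("word:chararray")
def normalize(value):
    """Hypothetical evaluation UDF: trim and lowercase a chararray, null-safe."""
    if value is None:
        return None
    return value.strip().lower()


# From a Pig script, assuming this file is saved as udfs.py:
#   REGISTER 'udfs.py' USING jython AS myfuncs;
#   B = FOREACH A GENERATE myfuncs.normalize(name);
print(normalize("  Pig "))
```

Returning `None` for null input keeps the UDF from failing on missing fields, which mirrors how most built-in functions pass nulls through.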
http://pig.apache.org/docs/r0.7.0/udf.html#Schema
http://sivaanalytics.wordpress.com/2013/03/14/fundamentals-of-pig-exploring-more-on-schema-and-data-models/
http://developer.yahoo.com/hadoop/tutorial/pigtutorial.html

http://stackoverflow.com/questions/4968843/how-do-i-store-gzipped-files-using-pigstorage-in-apache-pig
