This Edureka Pig Tutorial will help you understand the concepts of Apache Pig in depth. Below are the topics covered in this Pig Tutorial:

1) Entry of Apache Pig
2) Pig vs MapReduce
3) Twitter Case Study on Apache Pig
4) Apache Pig Architecture
5) Pig Components
6) Pig Data Model
7) Running Pig Commands and Pig Scripts (Log Analysis)

How Does It Work?

1. This is a 5-week instructor-led online course, with 40 hours of assignments and 30 hours of project work.
2. We have 24×7 one-on-one LIVE technical support to help you with any problems you might face or any clarifications you may require during the course.
3. At the end of the training you will take a 2-hour LIVE practical exam, based on which we will give you a grade and a verifiable certificate!

– – – – – – – – – – – – – –

About the Course

Edureka’s Big Data and Hadoop online training is designed to help you become a top Hadoop developer. During this course, our expert Hadoop instructors will help you:

1. Master the concepts of HDFS and the MapReduce framework
2. Understand Hadoop 2.x Architecture
3. Set up a Hadoop cluster and write complex MapReduce programs
4. Learn data loading techniques using Sqoop and Flume
5. Perform data analytics using Pig, Hive and YARN
6. Implement HBase and MapReduce integration
7. Implement Advanced Usage and Indexing
8. Schedule jobs using Oozie
9. Implement best practices for Hadoop development
10. Work on a real life Project on Big Data Analytics
11. Understand Spark and its Ecosystem
12. Learn how to work with RDDs in Spark

– – – – – – – – – – – – – –

Who should go for this course?

If you belong to any of the following groups, knowledge of Big Data and Hadoop is crucial to progressing in your career:
1. Analytics professionals
2. BI/ETL/DW professionals
3. Project managers
4. Testing professionals
5. Mainframe professionals
6. Software developers and architects
7. Recent graduates passionate about building a successful career in Big Data

– – – – – – – – – – – – – –

Why Learn Hadoop?

Big Data! A Worldwide Problem?

According to Wikipedia, “Big data is collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.” In simpler terms, Big Data refers to the large volumes of data that organizations store and process. Storing, retrieving and processing this ever-increasing data is becoming very difficult for companies. Any company that gets a handle on managing its data well is positioned to become the next BIG success!

The problem lies in using traditional systems to store enormous amounts of data. Though these systems were successful a few years ago, the increasing volume and complexity of data is quickly making them obsolete. The good news is Hadoop, which is close to a panacea for companies working with BIG DATA in a variety of applications and has become integral to storing, handling, evaluating and retrieving hundreds of terabytes, and even petabytes, of data.


Customer Review:

Michael Harkins, System Architect, Hortonworks says: “The courses are top rate. The best part is live instruction, with playback. But my favorite feature is viewing a previous class. Also, they are always there to answer questions, and prompt when you open an issue if you are having any trouble. Added bonus ~ you get lifetime access to the course you took!!! Edureka lets you go back later, when your boss says ‘I want this ASAP!’ ~ This is the killer education app… I’ve taken two courses, and I’m taking two more.”

30 comments on “Pig Tutorial | Apache Pig Script | Hadoop Pig Tutorial | Edureka”

  • OMG… “Can you give me a quick confirmation on the chat window?” … Afthar says yes, Kumar says yes, Jessica says yes, Remi says great. Please, buddy, don't do this in the next video. It's so boring.

  • Thank you, from Portugal; I am studying for my exam on "Big Data Systems", and I have missed the class on Hadoop/Pig (the problem of being a working student); Now I think I got it clearly!

  • Sir, very good presentation. Very clear to understand. Sir, where can I find the log file? Could you please send it to my email?

  • Vineeth was really a fabulous presenter. The way he explained was really amazing, and it went straight into my head without any confusion. Thanks a lot, sir… I'm expecting more from you, and I need more Pig videos.

  • This is a wonderful tutorial with detailed explanation. I just have a query about the sample.log script. What are the parameters in REGEX_EXTRACT? Can you please explain in detail what $0 is and what 1 is in the REGEX_EXTRACT call?

    Thank you so much for your videos. Keep the good work going 🙂
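
On the REGEX_EXTRACT question: in the script posted further down this thread, the call is `REGEX_EXTRACT($0, '(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)', 1)`. `$0` is Pig's positional reference to the first field of each loaded record, the middle argument is the regular expression, and `1` selects the first parenthesized group of the match. The sketch below is a rough Python analogue, not Pig code: the helper name `regex_extract` and the sample lines are hypothetical, and using `re.search` assumes the pattern may match anywhere in the line, as the tutorial's usage suggests.

```python
import re

# Same alternation as in the Pig script; the parentheses form capture group 1.
LEVEL_PATTERN = r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)"


def regex_extract(text, pattern, index):
    """Hypothetical Python stand-in for REGEX_EXTRACT(text, pattern, index)."""
    match = re.search(pattern, text)
    if match is None:
        # The Pig script later drops these rows with "is not null".
        return None
    return match.group(index)


# Illustrative line, modeled on the log format shown in this thread.
line = "2017-04-10 23:24:32,697 [main] ERROR org.apache.pig.tools.grunt.Grunt"
level = regex_extract(line, LEVEL_PATTERN, 1)
no_level = regex_extract("a line without a log level", LEVEL_PATTERN, 1)
print(level, no_level)
```

With only one pair of parentheses in the pattern, index 1 is the only capture group available, which is why the script passes `1`.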

  • Thanks for posting informative videos.

    I tried the Pig script as it was explained in the video, but it failed. Can you please let me know how to make it succeed?
    Content of sampleLog.pig:

    log = LOAD '/sample.log';
    LEVELS = foreach log generate REGEX_EXTRACT($0,'(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)', 1) as LOGLEVEL;
    FILTEREDLEVELS = FILTER LEVELS by LOGLEVEL is not null;
    GROUPEDLEVELS = GROUP FILTEREDLEVELS by LOGLEVEL;
    FREQUENCIES = foreach GROUPEDLEVELS generate group as LOGLEVEL, COUNT(FILTEREDLEVELS.LOGLEVEL) as COUNT;
    RESULT = order FREQUENCIES by COUNT desc;
    DUMP RESULT;

    $ pig /home/hduser/HDFS_Practice_Dir/new_edureka/sampleLog.pig

    Failed Jobs:
    JobId Alias Feature Message Outputs
    job_1491887529789_0011 FILTEREDLEVELS,FREQUENCIES,GROUPEDLEVELS,LEVELS,log GROUP_BY,COMBINER Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://localhost:9000/sample.log
    .
    .
    .
    .
    .
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
    Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://localhost:9000/sample.log
    Input(s):
    Failed to read data from "/sample.log"

    Output(s):

    Counters:
    Total records written : 0
    Total bytes written : 0
    Spillable Memory Manager spill count : 0
    Total bags proactively spilled: 0
    Total records proactively spilled: 0

    Job DAG:
    job_1491887529789_0011 -> null,
    null -> null,
    null

    2017-04-10 23:24:32,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
    2017-04-10 23:24:32,697 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias RESULT
    Details at logfile: /home/hduser/pig_1491891860556.log

    Log file content:
    Pig Stack Trace
    ---------------
    ERROR 1066: Unable to open iterator for alias RESULT

    org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias RESULT
    .
    .
    .
    Caused by: java.io.IOException: Couldn't retrieve job.
    at org.apache.pig.PigServer.store(PigServer.java:1083)
    at org.apache.pig.PigServer.openIterator(PigServer.java:994)
    ... 13 more
    ================================================================================
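
A note on the failure above: the job reports ERROR 2118, "Input path does not exist: hdfs://localhost:9000/sample.log", which suggests the script itself may be fine and the file simply is not at `/sample.log` in HDFS. Copying it there (for example with `hdfs dfs -put`) or running Pig in local mode (`pig -x local`) against a local path are the usual ways around this; which one applies depends on the setup. To check what the script should produce once it can read its input, here is a small Python sketch of the same dataflow. It is an analogue under assumptions, not Pig: the function name `count_log_levels` and the sample lines are hypothetical.

```python
import re
from collections import Counter

LEVEL_PATTERN = re.compile(r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)")


def count_log_levels(lines):
    """Mirror the Pig script's steps on an in-memory list of log lines."""
    # LEVELS: pull the first log level out of each line (None when absent).
    levels = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        levels.append(match.group(1) if match else None)

    # FILTEREDLEVELS: keep only the lines where a level was found.
    filtered = [level for level in levels if level is not None]

    # GROUPEDLEVELS + FREQUENCIES: group by level and count each group.
    frequencies = Counter(filtered)

    # RESULT: order by count, highest first.
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample input; a real sample.log will differ.
sample_lines = [
    "2017-04-10 23:24:30 [main] INFO job started",
    "2017-04-10 23:24:31 [main] INFO reading input",
    "2017-04-10 23:24:31 [main] WARN slow disk",
    "2017-04-10 23:24:32 [main] ERROR input path missing",
    "2017-04-10 23:24:32 [main] INFO retrying",
    "2017-04-10 23:24:33 [main] ERROR giving up",
    "a line with no level at all",
]

result = count_log_levels(sample_lines)
for level, count in result:
    print(level, count)
```

The output has the same shape as `DUMP RESULT` in the Pig script: one (level, count) pair per log level, most frequent first.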

  • Hi, it's a nice tutorial about Pig. I just want to know in which cases Pig is best used over Hive in real-world scenarios.
