Programming Pig: Dataflow Scripting with Hadoop, 2nd Edition

  • Length: 347 pages
  • Edition: 2
  • Publisher: O'Reilly Media
  • Publication Date: 2016-12-05
  • ISBN-10: 1491937092
  • ISBN-13: 9781491937099
  • Sales Rank: #800434
Description

This guide is an ideal learning tool and reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application—making it easy for you to experiment with new datasets.

This fully updated edition of Programming Pig introduces new users to Pig and gives experienced users comprehensive coverage of key features such as the Pig Latin scripting language, the Grunt shell, and user-defined functions (UDFs) for extending Pig. If you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig.

Table of Contents

Chapter 1. What Is Pig?
Chapter 2. Installing and Running Pig
Chapter 3. Pig’s Data Model
Chapter 4. Introduction to Pig Latin
Chapter 5. Advanced Pig Latin
Chapter 6. Developing and Testing Pig Latin Scripts
Chapter 7. Making Pig Fly
Chapter 8. Embedding Pig
Chapter 9. Writing Evaluation and Filter Functions
Chapter 10. Writing Load and Store Functions
Chapter 11. Pig on Tez
Chapter 12. Pig and Other Members of the Hadoop Community
Chapter 13. Use Cases and Programming Examples
