Apache Hadoop Pig 0.11.0 ~ INTERNET AND TECHNOLOGY REVIEW

24 February, 2013

Apache Hadoop Pig 0.11.0


· Developer: Apache Software Foundation
· Website: hadoop.apache.org
· License / Price: Apache License
· Platforms: Windows / Linux / Mac OS / BSD / Solaris
· Databases: N/A
· Language: Java
· Last Updated: February 24th, 2013, 21:53 GMT
· Category: Database Tools

Pig is a platform for analyzing large data sets. It consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating those programs.

The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.

Here are some key features of "Apache Hadoop Pig":

· Ease of programming. It is trivial to achieve parallel execution of simple, "embarrassingly parallel" data analysis tasks. Complex tasks composed of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
· Optimization opportunities. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency.
· Extensibility. Users can create their own functions to do special-purpose processing.
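
The data flow style and user-defined functions described above can be illustrated outside Pig itself. What follows is a minimal plain-Python sketch, not Pig Latin and not Pig's API: the sample records, the `normalize_site` function, and the `run_flow` steps are all hypothetical, chosen only to mirror a filter, transform, group, aggregate sequence.

```python
from collections import defaultdict

# Hypothetical input: (user, site, seconds_on_page) records.
records = [
    ("alice", "Example.com", 30),
    ("bob", "example.com", 5),
    ("alice", "pig.apache.org", 120),
    ("carol", "PIG.apache.org", 45),
]


def normalize_site(site):
    """Stand-in for a user-defined function doing special-purpose cleanup."""
    return site.lower()


def run_flow(rows, min_seconds=10):
    # Step 1: filter out short visits.
    kept = [r for r in rows if r[2] >= min_seconds]
    # Step 2: apply the user-defined function to each remaining record.
    cleaned = [(user, normalize_site(site), secs) for user, site, secs in kept]
    # Step 3: group the records by site.
    groups = defaultdict(list)
    for _user, site, secs in cleaned:
        groups[site].append(secs)
    # Step 4: aggregate each group independently of the others.
    return {site: sum(secs) for site, secs in groups.items()}


totals = run_flow(records)
```

Each step consumes only the output of the previous one, and the per-record and per-group work in steps 1, 2, and 4 does not depend on other records or groups. That independence is the kind of structure that lends itself to parallel execution.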

Requirements:

· Java 1.6.x or higher
· Ant
· Cygwin
· Apache Hadoop 0.20.x or higher

What's New in This Release:

· This release includes the DateTime datatype, the RANK, CUBE and ROLLUP operators, Groovy UDFs, custom reducer estimation, schema-based tuples and HCatalog DDL integration.
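
For readers unfamiliar with the CUBE and ROLLUP operators, the general idea behind them, as the terms are commonly used in data analysis, can be sketched in plain Python. This is an illustration of the concept only, not Pig Latin syntax and not how Pig implements the operators: the `sales` rows and the helper names `rollup_masks`, `cube_masks`, and `aggregate` are hypothetical, and `None` stands in for "all values" of a dimension.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales rows: (region, product, amount).
sales = [
    ("east", "apple", 10),
    ("east", "pear", 5),
    ("west", "apple", 7),
]


def rollup_masks(n):
    # ROLLUP idea: keep the first k dimensions, for k = n down to 0.
    return [tuple(i < k for i in range(n)) for k in range(n, -1, -1)]


def cube_masks(n):
    # CUBE idea: keep every possible subset of the dimensions.
    masks = []
    for k in range(n, -1, -1):
        for kept in combinations(range(n), k):
            masks.append(tuple(i in kept for i in range(n)))
    return masks


def aggregate(rows, masks):
    # Sum the amount under every grouping key produced by the masks.
    totals = defaultdict(int)
    for *dims, amount in rows:
        for mask in masks:
            key = tuple(d if keep else None for d, keep in zip(dims, mask))
            totals[key] += amount
    return dict(totals)


rollup_totals = aggregate(sales, rollup_masks(2))
cube_totals = aggregate(sales, cube_masks(2))
```

In this sketch the rollup yields per-(region, product) totals, per-region subtotals, and a grand total, while the cube additionally yields per-product subtotals across all regions.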

Via: Apache Hadoop Pig 0.11.0
