Apache Pig

Pig Latin can be extended using user-defined functions (UDFs) which the user can write in Java, Python, JavaScript, Ruby or Groovy[3] and then call directly from the language.

While some have hinted that the name sounds coy or silly, it has provided us with an entertaining nomenclature, such as Pig Latin for the language, Grunt for the shell, and PiggyBank for the CPAN-like shared repository.Below is an example of a "Word Count" program in Pig Latin: The above program will generate parallel executable tasks which can be distributed across multiple machines in a Hadoop cluster to count the number of words in a dataset such as all the webpages on the internet.

[11] In effect, Pig Latin programming is similar to specifying a query execution plan, making it easier for programmers to explicitly control the flow of their data processing task.

SQL handles trees naturally, but has no built in mechanism for splitting a data processing stream and applying different operators to each sub-stream.

Pig Latin script describes a directed acyclic graph (DAG) rather than a pipeline.