Apache Impala is an open source massively parallel processing (MPP) SQL query engine for data stored in a computer cluster running Apache Hadoop.
The project was announced in October 2012 with a public beta test distribution[3][4] and became generally available in May 2013.
[5] Impala brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries to data stored in HDFS and Apache HBase without requiring data movement or transformation.
Impala is integrated with Hadoop to use the same file and data formats, metadata, security and resource management frameworks used by MapReduce, Apache Hive, Apache Pig and other Hadoop software.
[8] In 2015, another format called Kudu was announced, which Cloudera proposed to donate to the Apache Software Foundation along with Impala.