Pivotal HAWQ is a SQL query engine for HDFS, created by Pivotal Inc.
HAWQ was designed from the ground up as a massively parallel SQL processing engine, optimized for analytics and offering full transaction support. It is designed to have no single point of failure, and it works with HDFS so that recovery from hardware failures is automatic and happens online. Internally, system health is monitored continuously. When a server failure is detected, the failed server is removed from the cluster dynamically and automatically while the system continues serving queries. Recovered servers can be added back to the system on the fly. HAWQ is priced per node, per year.
PivotalR (http://cran.r-project.org/web/packages/PivotalR/PivotalR.pdf) provides access from R to data stored on HDFS by supporting HAWQ. The user does not need to worry about memory limits even when the data set is very large, because PivotalR minimizes the amount of data transferred between the database and R: the user manipulates the data from R, but the data itself stays in the database.
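The idea that the data stays in the database while the client only sends queries can be illustrated with a minimal sketch. This is not PivotalR's API or implementation: it is written in Python, uses an in-memory SQLite database as a stand-in for HAWQ, and the `TableProxy` class, the `build_demo` helper, and the `sales` table are hypothetical names introduced only for this illustration.

```python
import sqlite3


class TableProxy:
    """Hypothetical stand-in for a database-backed data frame.

    It stores only a connection and a table name, never the rows.
    Every operation is translated to SQL and run inside the database.
    """

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def dim(self):
        # Only two integers cross the wire, regardless of table size.
        n_rows = self.conn.execute(
            f"SELECT COUNT(*) FROM {self.table}"
        ).fetchone()[0]
        cursor = self.conn.execute(f"SELECT * FROM {self.table} LIMIT 0")
        n_cols = len(cursor.description)
        return (n_rows, n_cols)

    def mean(self, column):
        # The aggregate is computed by the database; one number comes back.
        return self.conn.execute(
            f"SELECT AVG({column}) FROM {self.table}"
        ).fetchone()[0]

    def preview(self, n=5):
        # Fetch a small, bounded sample instead of the whole table.
        return self.conn.execute(
            f"SELECT * FROM {self.table} LIMIT {int(n)}"
        ).fetchall()


def build_demo():
    # Assumption: SQLite stands in for the real database in this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("west", 20.0), ("east", 30.0), ("west", 40.0)],
    )
    return TableProxy(conn, "sales")


sales = build_demo()
print(sales.dim())
print(sales.mean("amount"))
print(sales.preview(2))
```

In this sketch the client-side object never grows with the table, which is why client memory stops being the limiting factor.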
PivotalR also provides an R wrapper for MADlib, an open-source library for scalable in-database analytics. MADlib provides data-parallel implementations of mathematical, statistical and machine-learning algorithms for structured and unstructured data. PivotalR thus also enables the user to apply machine-learning algorithms to big data.
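To give a feel for what in-database analytics means, the following sketch fits a simple linear regression by letting the database compute the required sums, so that only five numbers are returned to the client. It is an illustration of the general idea only, not MADlib's algorithm or interface: it is written in Python, uses an in-memory SQLite database as a stand-in, and the function `fit_line_in_database` and the `points` table are hypothetical.

```python
import sqlite3


def fit_line_in_database(conn, table, x, y):
    """Least-squares fit of y = slope * x + intercept.

    The database computes the sufficient statistics in one pass;
    the client only solves the closed-form formulas.
    """
    n, sx, sy, sxx, sxy = conn.execute(
        f"SELECT COUNT(*), SUM({x}), SUM({y}), "
        f"SUM({x} * {x}), SUM({x} * {y}) FROM {table}"
    ).fetchone()
    if not n:
        raise ValueError("table is empty")
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x has no variance; slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Assumption: SQLite stands in for the real database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL)")
# Synthetic data lying exactly on y = 2x + 1.
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(float(i), 2.0 * i + 1.0) for i in range(5)],
)

slope, intercept = fit_line_in_database(conn, "points", "x", "y")
print(slope, intercept)
```

Because sums can be computed independently on each data partition and then added together, this kind of aggregate-based formulation is a natural fit for a parallel database.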