Win-Win Strategy for Analytical Queries

Because Drill and Hive can be installed side by side on the same system, you can run some queries on Drill with Hive as a data source and others directly on Hive.
For each analytical query you need, simply run it both ways and see which option is faster.
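One way to make that comparison repeatable is a small timing harness. The sketch below is only an illustration: `median_runtime` and `pick_faster` are hypothetical helper names, and the two runners are stand-ins that sleep instead of calling Drill or Hive. In practice each runner would submit the same query through whatever client you already use for that engine.

```python
import statistics
import time
from typing import Callable, Dict


def median_runtime(run_query: Callable[[], object], repeats: int = 3) -> float:
    """Return the median wall-clock time in seconds over `repeats` runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def pick_faster(engines: Dict[str, Callable[[], object]], repeats: int = 3) -> str:
    """Time every engine's runner and return the name of the fastest one."""
    medians = {name: median_runtime(fn, repeats) for name, fn in engines.items()}
    return min(medians, key=medians.get)


if __name__ == "__main__":
    # Stand-in runners (assumption): replace each lambda with a real call
    # that executes the same analytical query on Drill or on Hive.
    engines = {
        "drill": lambda: time.sleep(0.01),
        "hive": lambda: time.sleep(0.05),
    }
    print("faster engine:", pick_faster(engines))
```

Using the median of several runs rather than a single run reduces the effect of cold caches and one-off cluster noise on the decision.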

Drill supports the ANSI standard for SQL, so you can use SQL to query your Hive, HBase, and distributed file system data sources. You can simply point Drill at the Hive metastore and start running low-latency queries on Hive tables and views with no modifications. Drill supports Hive SerDe integration, which lets it query data in all Hive file formats.
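As a rough illustration of querying a Hive table through Drill, the sketch below builds a request for Drill's REST endpoint (`/query.json`) using only the Python standard library. Several things here are assumptions rather than facts from this post: the Drillbit address `http://localhost:8047`, a Hive storage plugin registered under the name `hive`, and the table `orders` with a `region` column. The helper names `build_drill_request` and `run_drill_query` are hypothetical.

```python
import json
import urllib.request


def build_drill_request(
    sql: str, base_url: str = "http://localhost:8047"
) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits `sql` to Drill."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql: str, base_url: str = "http://localhost:8047") -> dict:
    """Send the query to a running Drillbit and return the decoded JSON reply."""
    with urllib.request.urlopen(build_drill_request(sql, base_url)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical Hive table, addressed through the assumed `hive` plugin.
    sql = (
        "SELECT region, COUNT(*) AS order_count "
        "FROM hive.`default`.`orders` GROUP BY region"
    )
    request = build_drill_request(sql)
    print(request.full_url)
    print(request.data.decode("utf-8"))
    # run_drill_query(sql) would execute it; this needs a live Drillbit.
```

Separating request construction from execution keeps the example checkable without a cluster, and the same SQL text can be submitted to Hive itself for the speed comparison.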

Drill provides JDBC/ODBC drivers for integrating with BI and SQL-based tools such as Tableau and MicroStrategy.

The two core technologies of Apache Drill are columnar storage for nested data and a tree architecture for query execution.
Hive 0.13 has adopted these good ideas: it uses the ORC file format for columnar storage and Tez as an execution engine that structures the computation as a directed acyclic graph. Both, along with other work from the Stinger Initiative, significantly improve the performance of Hive.
Hive 0.14 will bring cost-based query optimization (see the picture).

