The Presto Service

Presto is an open-source distributed SQL query engine for running interactive analytic queries. The platform has a pre-deployed tenant-wide Presto service that you can use to run SQL queries and perform high-performance, low-latency interactive data analytics. You can ingest data into the platform using your preferred method (such as Spark, the NoSQL Web API, a Nuclio function, or V3IO Frames) and then use Presto to analyze the data interactively with your preferred visualization tool. Running Presto over the platform's data services allows you to filter data as close as possible to the source.

You can run ANSI SQL SELECT statements, which Presto executes, from a Jupyter Notebook, a serverless Nuclio function, or a local or remote Presto client. The platform comes pre-deployed with the native Presto CLI client (presto-cli), a convenience wrapper for this CLI that preconfigures some options for local execution (presto), and the Presto web UI, which you can log into from the dashboard's Services page. You can also integrate the platform's Presto service with a remote Presto client, such as Tableau or QlikView, to query and analyze data in the platform remotely over a Java database connectivity (JDBC) connector.
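As a rough illustration of invoking the CLI from a script or notebook, the following Python sketch assembles an argument list for a single query. The `--catalog`, `--schema`, and `--execute` options are standard Presto CLI options; the `v3io` catalog name, the `users` schema, the `mytable` table, and the assumption that the `presto` wrapper forwards these options unchanged are illustrative and should be checked against your environment.

```python
import shlex
from typing import List


def build_presto_cli_command(
    sql: str,
    catalog: str = "v3io",       # assumed catalog name, verify for your deployment
    schema: str = "users",       # hypothetical schema (data container) name
    executable: str = "presto",  # wrapper name from the text; could also be "presto-cli"
) -> List[str]:
    """Return an argv list that runs one SQL statement through the Presto CLI."""
    return [executable, "--catalog", catalog, "--schema", schema, "--execute", sql]


# Hypothetical table and column names, used only for illustration.
cmd = build_presto_cli_command('SELECT id, amount FROM "mytable" LIMIT 10')

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

To actually run the query, you could pass `cmd` to `subprocess.run(cmd, check=True)` on a machine where the CLI is installed. Building the command as a list rather than a single string avoids shell-quoting problems with the double-quoted table name.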

The Iguazio Presto connector enables you to use Presto to run queries on data in the platform's NoSQL store. The connector supports partitioning, predicate pushdown, and column pruning, which let you optimize your queries.
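To make the terms concrete, here is a toy Python model of what predicate pushdown and column pruning save. It is not the connector's implementation: the in-memory `TABLE`, the `scan` function, and the "field values shipped" counter are all hypothetical stand-ins that only illustrate the idea of filtering rows and dropping columns at the source instead of in the query engine.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Row = Dict[str, object]

# Hypothetical sample rows standing in for a NoSQL table.
TABLE: List[Row] = [
    {"id": 1, "country": "US", "amount": 10, "note": "a"},
    {"id": 2, "country": "DE", "amount": 25, "note": "b"},
    {"id": 3, "country": "US", "amount": 40, "note": "c"},
    {"id": 4, "country": "FR", "amount": 5, "note": "d"},
]


def scan(
    table: List[Row],
    columns: Optional[Sequence[str]] = None,
    predicate: Optional[Callable[[Row], bool]] = None,
) -> Tuple[List[Row], int]:
    """Simulate a source-side scan.

    Returns the rows sent to the engine and the number of field values shipped.
    """
    shipped = 0
    out: List[Row] = []
    for row in table:
        # Predicate pushdown: non-matching rows are dropped at the source.
        if predicate is not None and not predicate(row):
            continue
        # Column pruning: only the requested columns leave the source.
        kept = dict(row) if columns is None else {c: row[c] for c in columns}
        shipped += len(kept)
        out.append(kept)
    return out, shipped


def query_without_pushdown(table: List[Row]) -> Tuple[List[Row], int]:
    """Fetch every row and column, then filter and project in the 'engine'."""
    rows, shipped = scan(table)
    result = [{"id": r["id"], "amount": r["amount"]} for r in rows if r["country"] == "US"]
    return result, shipped


def query_with_pushdown(table: List[Row]) -> Tuple[List[Row], int]:
    """Let the source apply the filter and return only the needed columns."""
    return scan(table, columns=["id", "amount"], predicate=lambda r: r["country"] == "US")


slow_rows, slow_shipped = query_without_pushdown(TABLE)
fast_rows, fast_shipped = query_with_pushdown(TABLE)
print(slow_rows == fast_rows, slow_shipped, fast_shipped)
```

Both functions return the same two rows, but in this toy data the pushed-down scan ships 4 field values instead of 16. In practice this suggests naming only the columns you need and filtering in the `WHERE` clause, rather than using `SELECT *` and filtering afterwards.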

You can also use Presto's built-in Hive connector to query data of the supported file types, such as Parquet or ORC, or to save table-query views to the default Hive schema. Note that to use the Hive connector, you first need to create a Hive Metastore by enabling Hive for the platform's Presto service. For more information, see Using the Hive Connector in the Presto overview.
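As a minimal sketch of saving a query as a view in the default Hive schema, the following Python helper composes a Presto `CREATE OR REPLACE VIEW` statement. The `hive` catalog name, the `big_orders` view name, and the `v3io.users."mytable"` source table are assumptions made for illustration; the helper only builds the SQL text and does not contact any service.

```python
def build_hive_view_statement(
    view_name: str,
    select_sql: str,
    hive_catalog: str = "hive",    # assumed catalog name for the Hive connector
    hive_schema: str = "default",  # the default Hive schema mentioned in the text
) -> str:
    """Return a CREATE OR REPLACE VIEW statement wrapping the given SELECT query."""
    if not view_name.isidentifier():
        raise ValueError(f"unsupported view name: {view_name!r}")
    # Drop surrounding whitespace and any trailing semicolon from the query.
    query = select_sql.strip().rstrip(";").strip()
    if not query.upper().startswith("SELECT"):
        raise ValueError("only SELECT queries can be saved as a view")
    return f"CREATE OR REPLACE VIEW {hive_catalog}.{hive_schema}.{view_name} AS {query}"


# Hypothetical source table and columns, used only for illustration.
statement = build_hive_view_statement(
    "big_orders",
    'SELECT id, amount FROM v3io.users."mytable" WHERE amount > 10;',
)
print(statement)
```

The resulting statement can be run through any Presto client once Hive is enabled for the Presto service. Validating the view name before interpolating it keeps the helper from producing malformed SQL.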

The platform also has a built-in process that uses Presto SQL to create a Hive view that monitors both real-time data in the platform's NoSQL store and historical data in Parquet or ORC tables [Tech Preview].

For more information about using Presto in the platform, see the Presto Reference. See also the Presto and Hive restrictions in the Software Specifications and Restrictions documentation.
