Time Series Database Overview

Until now, most enterprises have settled for a reactive approach, using a traditional time series database to visualize current trends and run batch analysis after the fact. Modern businesses, however, need to be proactive, with sophisticated predictions and real-time actions that maximize the value of their data. This requires new platforms that correlate time series data with multiple variables and large data volumes in real time, run advanced AI algorithms, generate interactive dashboards and automate actions.

Extending Prometheus

  • Time series database with horizontal scaling
  • High speed push (ingestion) and streaming
  • Support for Spark and AI on the same data without copies
  • Accelerated queries through pre-aggregation and automatic rollups
  • High availability, consistency, security and management
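
To illustrate the pre-aggregation and rollup idea from the list above, here is a minimal Python sketch. It is not Iguazio's or Prometheus's actual implementation. It assumes samples arrive as (timestamp, value) pairs with integer timestamps in seconds; the function names rollup and average and the bucket layout are hypothetical. Each sample updates a running count, sum, minimum and maximum for its time window, so a query over a window reads one small record instead of scanning the raw samples.

```python
from collections import defaultdict


def rollup(samples, window_seconds):
    """Pre-aggregate (timestamp, value) samples into fixed-size time windows.

    Hypothetical helper for illustration only. Returns a dict keyed by the
    window start time, holding running aggregates for that window.
    """
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for ts, value in samples:
        # Align the timestamp to the start of its window.
        start = ts - (ts % window_seconds)
        agg = buckets[start]
        agg["count"] += 1
        agg["sum"] += value
        agg["min"] = min(agg["min"], value)
        agg["max"] = max(agg["max"], value)
    return dict(buckets)


def average(buckets, start):
    """Answer an average query from the pre-aggregated window alone."""
    agg = buckets[start]
    return agg["sum"] / agg["count"]


# Four raw samples rolled up into two one-minute windows.
samples = [(0, 1.0), (30, 3.0), (60, 10.0), (90, 20.0)]
per_minute = rollup(samples, window_seconds=60)
first_minute_avg = average(per_minute, 0)
```

Storing count and sum rather than a finished average keeps the aggregates mergeable: one-minute windows can later be combined into hourly ones without going back to the raw data.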

Time Series Database Challenges

  • Volume and variety: Processing high volumes of incoming time series data in different formats or protocols
  • Context: Real-time enrichment with other data models taken from historical context, operational databases, social or environmental sources
  • Actionable insights: feeding data into AI algorithms and serving the results back to users, dashboards or control systems
  • Complexity: tedious integrations of various AI frameworks, data pipelines, a legacy time series database and/or expensive in-memory database solutions

Our Solution

The Iguazio Continuous Data Platform is the first integrated solution for simple delivery of real-time AI applications across cloud, on-premises and edge. With Iguazio’s platform, users ingest large data volumes, add relevant historical or operational context, run AI tasks and serve the results in real time, without the complexity and high costs of traditional solutions.

Iguazio integrates the following features:

  • Various standard and open data APIs: SQL, NoSQL (DynamoDB), time series (Prometheus), streaming (Kinesis), object (S3) and file. The same data can be accessed through multiple APIs simultaneously.
  • An innovative real-time database engine designed to reach in-memory performance while using lower-cost, high-density Flash storage.
  • Open source AI/machine learning services (Spark, Python, TensorFlow).
  • Real-time serverless functions (Nuclio) with low-latency data access.
  • End-to-end security, management and self-service operations in the cloud or at the edge.

Iguazio’s time series database leverages advanced APIs (such as row and column layouts, array vectors, random and sequential indexes and complex expressions) to provide an extremely fast and scalable time series database service. It is integrated with open-source frameworks to deliver a seamless development experience throughout the different stages of the processing pipeline.

While traditional time series databases are limited to a single data type, the Iguazio Continuous Data Platform (CDP) supports multiple data models (time series, SQL/NoSQL table, document, stream, object, file). This allows real-time correlation of time series data with static and operational data tables for AI inferencing, as well as simplified deployment, security and maintenance.
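
As a rough illustration of correlating time series data with a static table, the Python sketch below joins each incoming reading with a record from a device table. It uses plain in-memory dictionaries rather than any Iguazio API, and every name in it (enrich, device_id, site, max_temp) is a hypothetical choice made for the example.

```python
def enrich(readings, device_table):
    """Attach static device context to each time series reading.

    Hypothetical helper: `readings` is a list of dicts that carry a
    "device_id" key, and `device_table` maps device ids to static attributes.
    Readings from unknown devices are passed through unchanged.
    """
    enriched = []
    for reading in readings:
        context = device_table.get(reading["device_id"], {})
        # Merge into a new dict so the original reading is not mutated.
        enriched.append({**reading, **context})
    return enriched


# Static/operational table (assumed shape).
devices = {"pump-1": {"site": "plant-a", "max_temp": 80.0}}

# Incoming time series readings (assumed shape).
readings = [
    {"ts": 1, "device_id": "pump-1", "temp": 72.5},
    {"ts": 2, "device_id": "pump-9", "temp": 65.0},
]

enriched = enrich(readings, devices)
```

With the context attached, a downstream model or rule can compare temp against max_temp per reading without a second lookup.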

Iguazio’s time series database works with Nuclio, the open-source real-time serverless engine. Nuclio supports ingestion from a variety of sources over HTTP or a wide range of streaming and triggering protocols (Kafka, Kinesis, Azure Event Hub, RabbitMQ, NATS, Iguazio streams, MQTT and Cron tasks) and provides limitless auto-scaling and automatic deployment across cloud, edge and on-premises. Nuclio functions can be customized to pre-process incoming data (examine metric data, alert and convert formats), run real-time AI inferencing or handle post-processing (send notifications/triggers, write to external systems and provide custom query APIs).
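
A pre-processing function of the kind described above might look like the following Python sketch. It uses the handler(context, event) entry point of Nuclio's Python runtime. Everything else is an assumption made for illustration: the JSON payload shape with name and value fields, the ALERT_THRESHOLD constant and the preprocess helper.

```python
import json

# Hypothetical alert threshold; a real deployment would configure this.
ALERT_THRESHOLD = 90.0


def preprocess(metric):
    """Examine one metric, convert its value to float and flag alerts."""
    value = float(metric["value"])
    return {
        "name": metric["name"],
        "value": value,
        "alert": value > ALERT_THRESHOLD,
    }


def handler(context, event):
    """Nuclio entry point: called once per incoming event."""
    body = event.body
    # Depending on the trigger and content type, the body may arrive as raw
    # bytes/str or as an already decoded object, so handle both cases.
    if isinstance(body, (bytes, str)):
        body = json.loads(body)

    result = preprocess(body)
    if result["alert"]:
        context.logger.warn("metric %s is above the alert threshold" % result["name"])

    return context.Response(
        body=json.dumps(result),
        content_type="application/json",
        status_code=200,
    )
```

Keeping the metric logic in a plain function, separate from handler, lets it be unit-tested without a running Nuclio instance.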
