
What is Real-Time Machine Learning?

Today, machine learning has become commonplace, and many applications integrate machine intelligence behind their UI and API. However, having a well-trained machine learning model is not enough to achieve the performance required by many real-time applications. We need to understand essential aspects of real-time machine learning and build a high-performance operational pipeline that can handle streaming data in real time.

Real-time machine learning applications handle real-time data streams to make time-critical decisions, such as those required by user-facing applications, fraud prediction, recommender systems, and predictive maintenance, to name a few. For example, online shopping applications need to react in real time, because the probability of a user clicking away rises sharply when a response takes even one second too long. Moreover, online learning requires real-time feature engineering and online feature stores, because the features used for predictions must stay relevant as user behavior changes—sometimes drastically.

Making a machine learning model lighter and faster isn’t enough to achieve real-time processing. In real-time applications, well-trained machine learning models aren’t complete without fast data pipelines capable of efficiently handling ever-increasing data streams, because real-time machine learning is about both the machine learning model and the system/infrastructure.

Real-Time Machine Learning Applications

Real-time machine learning serves many applications. Examples include:

  • TikTok—a popular application for creating and sharing short videos—uses their recommendation system to curate a stream of videos for each user by predicting individual preferences in real time. By interacting with the app (commenting on a post, for example), users reveal their preferences. So TikTok’s system/infrastructure ingests streams of user interactions to generate online features for use by their machine learning models.
  • Booking.com discovered that a roughly 30% increase in latency cost about 0.5% in conversion rates. So they made considerable efforts to minimize prediction-serving latency in production, where they ran about 150 machine learning models.
  • Payoneer, a digital payment platform, uses machine learning for fraud prediction. Previously, their fraud detection could only work retroactively, after the transaction had already happened. Because today's financial transactions involve so many parameters—such as the transaction amount, history, trend, time, and location—machine learning over a real-time data pipeline is required for real-time analysis.

Real-Time Data Pipeline

A data pipeline ingests raw data from an event source and applies a chain of transformations before delivering it to a destination such as a machine learning model. A data pipeline must be resilient against failure so that machine learning models won’t miss any data.

We need real-time data pipelines for real-time machine learning applications, where source data is coming from real-time data streaming. Such a data pipeline often makes use of event-driven architecture, reacting to source events as they occur. In this broader view of the system architecture, a machine learning model plays just one part of the process.
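To make the idea concrete, here is a minimal sketch of such an event-driven pipeline in Python. All names (the `Event` type, the transformation steps, the sink) are illustrative stand-ins, not a real streaming API: the point is that events flow through a chain of transformations before reaching the model, which is just the final stage of the process.

```python
# Minimal event-driven pipeline sketch: events pass through a chain of
# transformations before being delivered to a sink (e.g., a model).
from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass
class Event:
    user_id: str
    action: str
    amount: float

def parse(event: Event) -> dict:
    """First stage: turn a raw event into a record of raw fields."""
    return {"user_id": event.user_id, "action": event.action, "amount": event.amount}

def enrich(record: dict) -> dict:
    """A transformation step: derive a new feature from raw fields."""
    record["is_large"] = record["amount"] > 100.0
    return record

def run_pipeline(events: Iterable[Event],
                 steps: List[Callable[[dict], dict]],
                 sink: Callable[[dict], None]) -> None:
    """React to each source event as it occurs, applying each step in order."""
    for event in events:
        record = parse(event)
        for step in steps:
            record = step(record)
        sink(record)

predictions = []
run_pipeline(
    [Event("u1", "purchase", 250.0), Event("u2", "click", 5.0)],
    steps=[enrich],
    sink=lambda rec: predictions.append(rec["is_large"]),  # stand-in for model.predict
)
print(predictions)  # [True, False]
```

In a production system the event source would be a streaming engine such as Apache Kafka or Amazon Kinesis rather than an in-memory list, and the pipeline would need retry and checkpointing logic so that, as noted above, the model never misses data.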

Moreover, real-time feature engineering is only possible thanks to real-time data pipelines.

Real-Time Feature Engineering

Online learning adapts to changes in user behavior by performing real-time feature engineering and making features available via online feature stores, where features always stay relevant and ready to be used for real-time inference. A feature store serves as a data transformation service that can run complex calculations in real time and also enables feature sharing for machine learning models at scale. Source data could come from databases, data lakes, HTTP requests, or events from a streaming engine, such as Apache Kafka and Amazon Kinesis.
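The core mechanic of an online feature store—incrementally updating aggregates as events stream in, then serving the latest values at inference time—can be sketched as follows. This is a toy in-memory stand-in, not the API of any real feature store; a production system would persist state and run these calculations at scale.

```python
# Toy stand-in for an online feature store: per-user aggregates are updated
# incrementally as each event arrives, so features stay fresh for inference.
from collections import defaultdict

class OnlineFeatureStore:
    def __init__(self):
        self._counts = defaultdict(int)
        self._totals = defaultdict(float)

    def ingest(self, user_id: str, amount: float) -> None:
        """Update running aggregates as each event streams in."""
        self._counts[user_id] += 1
        self._totals[user_id] += amount

    def get_features(self, user_id: str) -> dict:
        """Serve the latest feature values for real-time inference."""
        n = self._counts[user_id]
        return {
            "txn_count": n,
            "avg_amount": self._totals[user_id] / n if n else 0.0,
        }

store = OnlineFeatureStore()
for uid, amt in [("u1", 100.0), ("u1", 50.0), ("u2", 20.0)]:
    store.ingest(uid, amt)

print(store.get_features("u1"))  # {'txn_count': 2, 'avg_amount': 75.0}
```

Because aggregates are updated per event rather than recomputed over the full history, feature values remain current no matter how fast the stream grows—which is exactly why real-time feature engineering depends on a real-time data pipeline.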

Online training is a contentious term which, by definition, means learning from each new data point. Since online training involves adjusting the model parameters as data arrives, it can be very difficult to get it right. Instead, it is more realistic to perform offline training using periodically sampled data from production. For example, Weibo (a Twitter-like service in China) uses a process called streaming machine learning. This process involves downloading real-time logs to generate training data, performing offline training, and uploading updated model parameters to the online parameter service. The whole process is done in the span of minutes to hours, which is much faster than traditional offline training (which can take hours or even months).
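One iteration of that cycle—sample recent production logs, retrain offline, publish updated parameters—can be sketched in a few lines. The "model" here is deliberately trivial (a single mean parameter) to keep the example self-contained; the function names are illustrative, not part of any real system.

```python
# One iteration of the streaming-ML cycle: download recent logs, retrain
# offline, and publish updated parameters to the online parameter service.

def sample_recent_logs(logs, window):
    """Stand-in for downloading the latest real-time logs from production."""
    return logs[-window:]

def train_offline(samples):
    """Stand-in for offline training: fit a single parameter (the mean)."""
    return {"mean": sum(samples) / len(samples)}

parameter_service = {}  # stand-in for the online parameter service

production_logs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
recent = sample_recent_logs(production_logs, window=4)
parameter_service.update(train_offline(recent))  # publish updated parameters
print(parameter_service)  # {'mean': 4.5}
```

In practice this loop would run continuously on a minutes-to-hours cadence, which is what lets the deployed model track shifting user behavior far faster than traditional offline retraining.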

Real-time feature engineering and online feature stores are must-have components for online learning and training to keep machine learning models up to date.

Real-Time Machine Learning Powered By Iguazio

Real-time machine learning applications typically evolve iteratively, and an automated, fast deployment workflow is highly desirable in order to make this iterative process faster and easier. A machine learning pipeline needs to integrate streaming tools like Apache Kafka and Amazon Kinesis.

The Iguazio MLOps platform can abstract the complexities of ingesting and transforming real-time data—providing a streamlined and accelerated way of deploying such complex pipelines to production—along with built-in monitoring that detects model drift in real time.

Want to learn more about real-time machine learning? Book a live demo here.