
How do I move my batch pipeline over to real time?

Moving a pipeline from batch to real time is a complex endeavor. In many cases, the pipeline needs to be redesigned on a new tech stack that supports event-based architectures. Before you contemplate moving your pipeline online, determine whether the data that feeds it can also be delivered in (near) real time; streaming technology is currently the most popular way to do this. If your batch pipeline depends on a daily ETL job that takes hours to run, moving the pipeline processing online will not help, because the input data will still arrive only once a day.

Start with the data: Where does it originate? Can you capture the data at the moment it is created, such as when a user clicks a button in a mobile app? Answering these questions may mean changing the source applications so that they publish events to a stream. If it can be done, the battle is halfway won!
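To make the difference concrete, here is a minimal sketch in plain Python that computes the same result (clicks per user) in both styles. The event fields and the names `batch_click_counts` and `StreamingClickCounter` are hypothetical, chosen only for illustration. A real pipeline would read from an actual stream rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical click events, as they might be emitted by a mobile app.
EVENTS = [
    {"user": "u1", "action": "click"},
    {"user": "u2", "action": "click"},
    {"user": "u1", "action": "click"},
]


def batch_click_counts(events):
    """Batch style: wait for the full extract, then compute in one pass."""
    counts = defaultdict(int)
    for event in events:
        counts[event["user"]] += 1
    return dict(counts)


class StreamingClickCounter:
    """Event style: keep state and update it as each event arrives."""

    def __init__(self):
        self.counts = defaultdict(int)

    def on_event(self, event):
        # The up-to-date count is available immediately after every event.
        self.counts[event["user"]] += 1
        return self.counts[event["user"]]


counter = StreamingClickCounter()
running = [counter.on_event(event) for event in EVENTS]
```

Both approaches end with the same totals. The difference is that the streaming version has a usable answer after every event, while the batch version has one only after the whole extract has been processed.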

Now that you have access to data streaming from the source, you need a way to run that data through your pipeline logic at scale. Nuclio, Iguazio's open-source serverless framework, lets you build complex pipelines driven by many different event-based triggers. It also scales horizontally to match the demand of the workload.
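As a rough illustration, a Nuclio Python function is a handler that receives a context and an event for each trigger invocation. The sketch below assumes a JSON event payload with hypothetical `user` and `action` fields, and the scoring rule is a placeholder for your own pipeline logic. The stub objects at the bottom exist only so the handler can be tried locally without a Nuclio deployment.

```python
import json
from types import SimpleNamespace


def handler(context, event):
    """Entry point in the (context, event) shape used by Nuclio's Python runtime."""
    body = event.body
    # Depending on the trigger and content type, the body may arrive as bytes,
    # a string, or an already-parsed dict, so normalize it first.
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    record = json.loads(body) if isinstance(body, str) else body

    # Placeholder pipeline logic (an assumption for illustration only).
    score = 1.0 if record.get("action") == "click" else 0.0

    context.logger.info("processed event")
    return json.dumps({"user": record.get("user"), "score": score})


if __name__ == "__main__":
    # Local stand-ins for the objects Nuclio would normally pass in.
    fake_context = SimpleNamespace(logger=SimpleNamespace(info=lambda msg: None))
    fake_event = SimpleNamespace(body=b'{"user": "u1", "action": "click"}')
    print(handler(fake_context, fake_event))
```

Because the handler processes one event at a time and keeps no local state, the platform can run more replicas of it as the event rate grows.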
