You are basically asking about model serving: a way to manage your models and deliver them to production in a secure, governed way.
There are a few things you need to think about:
- How will my models be managed?
- How will my models be delivered (served) for inference?
- Do I need real-time or batch inference?
In its simplest form, you store (deploy) the trained model in a remote repository known as a model server. Then at runtime you retrieve the model, pass features (inputs) into it, and predict.
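As a minimal sketch of that roundtrip, here's what it could look like using MLflow as the repository (any registry with similar log/load calls would work; the scikit-learn iris model is just a stand-in):

```python
# Minimal store-then-retrieve sketch using MLflow. By default this logs to a
# local ./mlruns directory; in production you'd point the tracking URI at a
# remote MLflow server instead.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Store: log the trained model to the repository.
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, artifact_path="model")

# Retrieve: load the model back at runtime and predict.
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
print(loaded.predict(X[:5]))
```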
There's a lot of value in this simple pattern. Firstly, your models live in a central repository, which gives you governance, shareability, versioning, and reusability. Storing a model should be as easy as a few function calls.
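For instance, with MLflow's Model Registry those "few function calls" might look like the following (the registry name "iris-classifier" is hypothetical, and this continues from the run logged above):

```python
# Sketch: registering the logged model under a name gives it an
# auto-incremented version that teammates can discover and reuse.
import mlflow

result = mlflow.register_model(
    model_uri=f"runs:/{run.info.run_id}/model",  # the model logged above
    name="iris-classifier",                      # hypothetical registry name
)
print(result.name, result.version)  # e.g. "iris-classifier", version 1
```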
Secondly, retrieving the model should be just as easy, ideally a single function call. However, you must make sure the repository supports the appropriate protocols and that access to it is secure.
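Continuing the MLflow sketch, retrieval really is one call; the HTTPS endpoint below is a placeholder for whatever secure transport and authentication your registry supports:

```python
# Placeholder HTTPS endpoint; MLflow reads credentials from environment
# variables such as MLFLOW_TRACKING_TOKEN, so secrets stay out of code.
import mlflow

mlflow.set_tracking_uri("https://mlflow.example.com")

# One call fetches a specific registered version by name.
model = mlflow.pyfunc.load_model("models:/iris-classifier/1")

feature_rows = [[5.1, 3.5, 1.4, 0.2]]  # example feature vector for inference
predictions = model.predict(feature_rows)
print(predictions)
```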