After spending a long time developing and training our model, it’s finally time to go to production. But how do we know whether the model is still making accurate predictions a week or a month from now? How do we know how many resources it is using?
In short, model monitoring keeps track of your model after it goes into production. There are several facets of monitoring, including:
- Kubernetes resources: How many resources (CPU, memory, GPU) are we using?
- Latency: How long does inference take?
- Invocations: How often is our model being used, and how many calls does it receive on average?
- Data drift: How different (statistically) is the incoming live data from the data we used to train the model?
- Concept drift: Has the meaning (statistical properties) of our prediction target changed?
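To make the latency and invocation facets concrete, here is a minimal sketch of tracking them in application code. The `ModelMonitor` class and its names are hypothetical, not part of any serving framework; real frameworks expose these metrics for you:

```python
import time
import statistics

class ModelMonitor:
    """Hypothetical helper: times every prediction call and counts invocations."""

    def __init__(self):
        self.latencies = []

    def track(self, predict_fn):
        """Wrap a prediction function so each call is timed and counted."""
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = predict_fn(*args, **kwargs)
            self.latencies.append(time.perf_counter() - start)
            return result
        return wrapper

    @property
    def invocations(self):
        return len(self.latencies)

    @property
    def average_latency(self):
        return statistics.mean(self.latencies)

# Example usage with a stand-in model.
monitor = ModelMonitor()

@monitor.track
def predict(x):
    return x * 2  # stand-in for real inference

for i in range(3):
    predict(i)

print(monitor.invocations)          # number of calls so far
print(monitor.average_latency > 0)  # latencies were recorded
```

In production you would export these numbers to a metrics system (e.g. Prometheus) rather than keep them in memory, but the quantities being measured are the same.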
While all of these facets are important, some are easier to compute than others. Kubernetes reports resource utilization natively, and most model serving frameworks expose latency and invocation metrics. However, data and concept drift are difficult to compute in real time and on an ongoing basis.
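To illustrate what a data drift check involves, one common building block is a statistical distance between a training feature and its live counterpart, such as the two-sample Kolmogorov-Smirnov statistic. The sketch below (plain numpy, not how any particular platform implements drift detection) measures the maximum gap between the two empirical CDFs:

```python
import numpy as np

def ks_statistic(train_sample, live_sample):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the empirical CDFs. 0.0 means the samples look identical; values
    near 1.0 indicate heavy drift."""
    a = np.sort(np.asarray(train_sample, dtype=float))
    b = np.sort(np.asarray(live_sample, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Example: compare training data against live data with and without drift.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=1000)
live_same = rng.normal(0.0, 1.0, size=1000)     # same distribution
live_shifted = rng.normal(3.0, 1.0, size=1000)  # shifted (drifted) distribution

print(ks_statistic(train, live_same))     # small: no drift detected
print(ks_statistic(train, live_shifted))  # large: drift detected
```

The hard part in production is not this computation but running it continuously over streaming data, per feature, with windowing and alert thresholds, which is why it typically requires platform support.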
This is where an MLOps platform such as Iguazio helps, letting you deploy your models with built-in model monitoring and dashboards.