How to Easily Deploy Your Hugging Face Model to Production at Scale

Name: How to Easily Deploy Your Hugging Face Model to Production at Scale
Uploaded: 2022-10-26T11:06:49+00:00
Duration: 1 h 5 min 33 s
Description: It seems like almost everyone uses Hugging Face to simplify and reuse advanced models and to work collectively as a community.

But how do you deploy these models into real business environments, along with the required application logic? How do you serve them continuously, at scale? How do you manage their lifecycle in production (deploy, monitor, retrain)?

Oh, there’s a tool for that 😉

MLRun is an open-source MLOps orchestration framework that lets you automate the deployment and management of your Hugging Face models in production.

Join us for this technical session and learn how to:

Use GitHub Codespaces with MLRun to quickly develop, test, and deploy Hugging Face models with zero configuration.
Build an application pipeline that incorporates your Hugging Face model, leveraging the MLRun serving graph and Gradio as a front end.
Fine-tune and retrain your model using GPUs with MLRun, then push the model back to Hugging Face, both to the Model Registry and to Hugging Face Spaces.
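
To make the "application pipeline" idea concrete, here is a minimal sketch of a serving graph as a chain of steps wrapped around a model. It is plain Python and does not use the MLRun, Hugging Face, or Gradio APIs; every name in it (run_graph, stub_sentiment_model, GRAPH) is hypothetical, and the keyword rule is only a stand-in for a real model call.

```python
from typing import Callable, List

# A step takes an event dict and returns a new event dict.
Step = Callable[[dict], dict]


def preprocess(event: dict) -> dict:
    # Application logic before the model: normalize the raw request text.
    return {**event, "text": event["text"].strip().lower()}


def stub_sentiment_model(event: dict) -> dict:
    # ASSUMPTION: stand-in for a real Hugging Face model call.
    # A keyword rule is used here only so the sketch runs without downloads.
    positive = {"good", "great", "love"}
    hits = sum(word in positive for word in event["text"].split())
    return {**event, "label": "POSITIVE" if hits > 0 else "NEGATIVE"}


def postprocess(event: dict) -> dict:
    # Application logic after the model: shape the response for a front end.
    return {"text": event["text"], "label": event["label"]}


def run_graph(steps: List[Step], event: dict) -> dict:
    # Pass the event through each step in order, like a linear serving flow.
    for step in steps:
        event = step(event)
    return event


# Hypothetical three-step graph: pre-processing, model, post-processing.
GRAPH = [preprocess, stub_sentiment_model, postprocess]

result = run_graph(GRAPH, {"text": "  I LOVE this library  "})
print(result)
```

In a real deployment, the stub step would presumably be replaced by an actual model call and the graph would be defined with the serving framework's own constructs; the point here is only the separation of application logic from the model step.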
