Today we all choose between the simplicity of Python tools (pandas, scikit-learn), the scalability of Spark and Hadoop, and the operational readiness of Kubernetes. We end up using them all.
You’ve played around with machine learning, learned about the mysteries of neural networks, almost won a Kaggle competition, and now you feel ready to turn all this into real-world impact. It’s time to build some real AI-based applications.
Here’s the problem: we are always under pressure to reduce the time it takes to develop a new model, while datasets only grow in size. Running a training job on a single node is pretty easy, but nobody wants to wait hours and then run it again, only to realize that it wasn’t right to begin with.
With all the turmoil and uncertainty surrounding large Hadoop distributors in the past few weeks, many wonder what’s happening to the data framework we’ve all been working on for years.
Still waiting for ML training to be over? Tired of running experiments manually? Not sure how to reproduce results? Wasting too much of your time on DevOps and data wrangling?
Yaron Haviv explains serverless and its limitations, providing a hands-on example of using a serverless architecture to simplify data science development and accelerate time to production for data collection, exploration, model training and serving.