Application-Services Overview

Introducing the Platform's Application Services

In addition to its core data services, the platform comes pre-deployed with essential proprietary and third-party open-source tools and libraries that facilitate the implementation of a full data science workflow, from data collection to production (see Introducing the Platform). Both built-in and integrated tools are exposed to the user as application services that the platform manages using Kubernetes. Each application is packaged as a logical unit within a Docker container and is fully orchestrated by Kubernetes, which automates its deployment, scaling, and management. This gives users the flexibility to run any application anywhere as part of their operational pipeline.

The application services can be viewed and managed from the dashboard Services page using a self-service model. This approach enables users to quickly get started with their development and focus on the business logic without having to spend precious time on deploying, configuring, and managing multiple tools and services. In addition, users can independently install additional software — such as real-time data analytics and visualization tools — and run it on top of the platform services.

The platform's application development ecosystem includes

  • Distributed data frameworks and engines — such as Spark, Presto, Horovod, and Hadoop.
  • The Nuclio serverless framework (a minimal handler sketch appears after this list).
  • Enhanced support for time-series databases (TSDBs) — including a CLI tool, serverless functions, and integration with Prometheus.
  • Jupyter Notebook and Zeppelin interactive web notebooks for development and testing of data science and general data applications.
  • A web-based shell service and Jupyter terminals, which provide bash command-line shells for running application services and performing basic file-system operations.
  • Integration with popular Python machine-learning and scientific-computation packages for development of ML and artificial intelligence (AI) applications — such as TensorFlow, Keras, scikit-learn, pandas, PyTorch, Pyplot, and NumPy.
  • Integration with common Python libraries that enable high-performance Python-based data processing — such as Dask and RAPIDS (a Dask sketch appears after this list).
  • Support for Data Science Automation (MLOps) services using the MLRun library and Kubeflow Pipelines — including defining, running, and tracking managed, scalable, and portable ML tasks and full workflow pipelines (an MLRun sketch appears after this list).
  • The V3IO Frames open-source unified high-performance DataFrame API library for working with NoSQL, stream, and time-series data in the platform (a Frames sketch appears after this list).
  • Support for executing code over GPUs.
  • Integration with data analytics, monitoring, and visualization tools — including built-in integration with the open-source Grafana metric analytics and monitoring tool and easy integration with commercial business-intelligence (BI) analytics and visualization tools such as Tableau, Looker, and QlikView.
  • Logging and monitoring services for monitoring, indexing, and viewing application-service logs — including a log-forwarder service and integration with Elasticsearch.
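
As a concrete illustration of the Nuclio item above, the following is a minimal Python handler sketch. Nuclio's Python runtime passes a context object (which includes a logger) and the triggering event to the handler; the greeting logic and the assumption that the event body is UTF-8 text are illustrative placeholders.

```python
# Minimal Nuclio handler sketch (illustrative).
def handler(context, event):
    context.logger.info("Handling event")
    # For HTTP triggers the body arrives as bytes; assume UTF-8 text here.
    name = event.body.decode("utf-8") if event.body else "world"
    return f"Hello, {name}"
```

Such a function can be deployed from the dashboard, the Nuclio CLI, or a Jupyter notebook; see the Nuclio documentation for details.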
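
The Dask integration listed above can be illustrated with a short, self-contained sketch; the file pattern and column names are placeholders, and by default the code runs with Dask's local scheduler rather than a cluster.

```python
# Dask sketch: lazy, partitioned DataFrame operations evaluated in parallel.
import dask.dataframe as dd

ddf = dd.read_csv("data/*.csv")                    # builds a lazy task graph
mean_per_sensor = ddf.groupby("sensor_id")["value"].mean()
print(mean_per_sensor.compute())                   # triggers parallel execution
```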
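
For the MLRun item above, the following illustrative sketch wraps a plain Python handler as a tracked MLRun job and runs it locally for testing. The handler name, parameter, and logged result are hypothetical placeholders, and the exact SDK calls may differ between MLRun versions.

```python
# MLRun sketch (handler name, parameter, and result are placeholders).
import mlrun

def train(context, p1: int = 1):
    # The injected context records parameters, results, and artifacts.
    context.log_result("accuracy", 0.1 * p1)

if __name__ == "__main__":
    fn = mlrun.code_to_function(name="trainer", filename=__file__,
                                kind="job", handler="train")
    run = fn.run(params={"p1": 5}, local=True)   # quick local test run
    print(run.outputs)
```

The same function can later be executed as a Kubernetes job or composed into a Kubeflow pipeline, as described in the MLRun documentation.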
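
The V3IO Frames item above can be illustrated with a short sketch that writes a pandas DataFrame to a NoSQL (key-value) table in the platform's data store and reads it back through the same API. The Frames service address, data container, and table path are environment-dependent assumptions.

```python
# V3IO Frames sketch (address, container, and table path are assumptions).
import pandas as pd
import v3io_frames as v3f

client = v3f.Client("framesd:8081", container="users")
df = pd.DataFrame({"name": ["a", "b"], "score": [90, 75]}).set_index("name")

client.write("kv", table="examples/scores", dfs=df)    # NoSQL (key-value) backend
print(client.read("kv", table="examples/scores"))
```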

For basic information about how to manage and create services in the dashboard, see Working with Services. For detailed service specifications, see the platform's Support and Certification Matrix.

DNS-Configuration Prerequisite

As a prerequisite to using the platform's application services, you need to configure conditional forwarding for your cluster's DNS server. For more information and step-by-step instructions, see Configuring the DNS Server.

Services List

To help you locate the services and tools that interest you, following is an alphabetical list with links to relevant documentation:

See Also