A Look Under the Hood

Iguazio AI Platform
to Automate and Orchestrate Your (Gen) AI Pipelines

The Iguazio AI Platform enables you to operationalize and de-risk the entire (gen) AI pipeline from data management and development to deployment. Use guardrails to monitor and meet requirements.

Data scientists, data engineers and DevOps teams can deliver business impact with AI at scale, developing on their own or using ready-made AI application recipes and components. The platform includes four resilient and scalable pipelines covering everything from development to production; the ability to eliminate risks by adding guardrails; LLM customization options to improve model accuracy and performance; GPU provisioning capabilities to optimize resources; and flexible deployment options, including on-premises, hybrid and multi-cloud.

Data Management

A data pipeline is the foundation of any AI system. The Iguazio AI platform provides structured and unstructured data pipelines for collecting and ingesting the data, processing it (transformations, cleaning, arranging, versions, tags, labels, indexes, etc.) and managing the data. This helps prepare your data for model training and fine-tuning, supplies real-time data for responsive generative AI applications and supports RAG applications.

Here are these steps in more detail:

Data ingestion – Collecting the data from documents, databases, online content, data warehouses, and other sources.
Quality filtering – Filtering by language, metrics, statistics, keywords, and more. This helps cleanse the data.
De-duplication – Removing sections that repeat themselves. This also makes indexing more efficient.
Privacy reduction – Detecting and removing PII, like emails, phone numbers, SSNs, etc.
Tokenization – Analyzing and labeling the content. For example, indexing in a vector database, indexing keywords, labeling the year of the document to be used as a filter, and more.
Keyword and metadata extraction
Splitting and chunking
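
As a rough illustration of the steps above, the following Python sketch chains quality filtering, de-duplication, PII removal and chunking using only the standard library. It is not the platform's implementation: the function names, the regular expressions and the word-count quality rule are all assumptions made for this example, and real PII detection needs far broader coverage.

```python
import hashlib
import re

# Hypothetical patterns for illustration only; they catch simple emails and
# US-style SSNs and will miss many real-world PII formats.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def passes_quality_filter(text: str, min_words: int = 5) -> bool:
    """Keep documents with at least `min_words` words (a stand-in quality metric)."""
    return len(text.split()) >= min_words


def deduplicate(docs: list[str]) -> list[str]:
    """Drop exact duplicates after case/whitespace normalization, keeping the first copy."""
    seen: set[str] = set()
    unique = []
    for doc in docs:
        normalized = " ".join(doc.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def redact_pii(text: str) -> str:
    """Replace matched emails and SSNs with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return SSN_RE.sub("[SSN]", text)


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word windows of `chunk_size` that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def prepare(docs: list[str], chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Filter, de-duplicate, redact and chunk a batch of documents."""
    kept = [doc for doc in docs if passes_quality_filter(doc)]
    chunks: list[str] = []
    for doc in deduplicate(kept):
        chunks.extend(chunk_words(redact_pii(doc), chunk_size, overlap))
    return chunks
```

The resulting chunks are what would typically be embedded and indexed in a vector database. Redaction runs before chunking so that no chunk boundary can split a PII token.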

When the data is ready as indexed or featured data, it’s time to perform actions like:

Transmitting to Vector DB + K/V Store
Data security
Data governance
Data versioning
Cataloging and labeling

The final aspect is managing metadata, which includes:

Data pipeline orchestration
Data lineage and traceability
Resource management and observability

Model Development

This pipeline ensures that the generative AI models are up-to-date and optimized for the specific tasks they are designed to perform. The Iguazio AI platform provides an automated flow of data prep, tuning, validation and LLM optimization for specific data, running efficiently on elastic resources (CPUs, GPUs, etc.). This includes:

Training – Initial training of the model on a large dataset to capture a broad understanding of the language or task.
LLM Customization – Adjusting the model on a more specialized dataset to optimize performance for specific tasks or domains. LLM customization can take place with RAG, RAFT, or other ways to improve model accuracy and performance at minimal cost.
Evaluation – Regularly evaluating the model against benchmarks and real-world tasks to ensure it maintains high accuracy and relevance.

Iguazio supports any open source or commercial LLM.
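
The evaluation step described above can be pictured as a small benchmark harness with a quality gate. The sketch below is a minimal example under stated assumptions: the model is any Python callable that maps a prompt to a string, exact-match accuracy stands in for whatever metric a real benchmark would use, and the names `exact_match_accuracy` and `passes_gate` are hypothetical.

```python
from typing import Callable


def exact_match_accuracy(
    model: Callable[[str], str],
    benchmark: list[tuple[str, str]],
) -> float:
    """Return the fraction of prompts whose answer matches the expected text.

    Comparison ignores case and surrounding whitespace.
    """
    if not benchmark:
        raise ValueError("benchmark must not be empty")
    hits = sum(
        1
        for prompt, expected in benchmark
        if model(prompt).strip().lower() == expected.strip().lower()
    )
    return hits / len(benchmark)


def passes_gate(accuracy: float, threshold: float = 0.8) -> bool:
    """Decide whether a model version is good enough to promote (assumed threshold)."""
    return accuracy >= threshold
```

Running the same harness on every new or customized model version gives a consistent signal for whether that version should replace the one currently serving.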

Application Deployment

The application pipeline is responsible for handling user requests, sending data to the model so it generates responses (model inference) and validating that those responses are accurate, relevant and delivered promptly. The Iguazio AI platform’s application deployment pipeline ensures rapid deployment of scalable real-time serving and application pipelines that use LLMs (locally hosted or external) as well as the required data integration and business logic. This is the pipeline that brings business value to the organization. Deployment is supported across on-prem, hybrid and multi-cloud.
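
To make the request flow concrete, here is a minimal sketch of one request passing through preprocessing, model inference and response validation while its latency is measured. Everything in it is an assumption for illustration: the `Response` type, the `handle_request` function and the `non_empty` validator are hypothetical, and the model is any callable rather than a specific serving API.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Response:
    text: str
    latency_seconds: float
    valid: bool


def non_empty(prompt: str, answer: str) -> bool:
    """Hypothetical validator: accept any answer that contains visible text."""
    return bool(answer.strip())


def handle_request(
    prompt: str,
    model: Callable[[str], str],
    validate: Callable[[str, str], bool] = non_empty,
    clock: Callable[[], float] = time.perf_counter,
) -> Response:
    """Clean the prompt, run inference, validate the answer and time the whole call."""
    start = clock()
    cleaned = prompt.strip()           # preprocessing / business logic
    text = model(cleaned)              # model inference (local or external LLM)
    valid = validate(cleaned, text)    # response validation
    return Response(text=text, latency_seconds=clock() - start, valid=valid)
```

The `valid` flag and `latency_seconds` are the kind of per-request signals that can be forwarded to monitoring.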

LiveOps

Constant monitoring and governance of all your AI / gen AI applications in production helps ensure your applications work accurately and at peak performance. To do so, Iguazio provides a monitoring system pipeline for gathering application and data telemetry to identify resource usage, application performance, risks, etc. The monitoring data can be used to further enhance the application performance. On top of this, the Iguazio AI platform provides built-in monitoring for the LLM data, training, model and resources, with automated model re-tuning and RLHF.
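
One simple way to picture telemetry-based monitoring is a rolling window of a metric (latency, accuracy, input length and so on) compared against a baseline. The sketch below is an illustration only, not the platform's monitoring logic: the `TelemetryMonitor` class, its relative-tolerance rule and its default values are assumptions.

```python
from collections import deque
from statistics import mean


class TelemetryMonitor:
    """Track recent values of one metric and flag drift from a baseline."""

    def __init__(self, baseline: float, tolerance: float = 0.2, window: int = 100):
        self.baseline = baseline
        self.tolerance = tolerance          # allowed relative deviation, e.g. 0.2 = 20%
        self.values = deque(maxlen=window)  # oldest values fall out automatically

    def record(self, value: float) -> None:
        self.values.append(value)

    def drifted(self) -> bool:
        """True when the window mean deviates from the baseline beyond the tolerance."""
        if not self.values:
            return False
        deviation = abs(mean(self.values) - self.baseline)
        return deviation > self.tolerance * abs(self.baseline)
```

A drift signal like this could be used to raise an alert or to trigger a re-tuning job.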

Guardrails for Protecting Against LLM Risks

Throughout these four pipelines, the Iguazio AI platform eliminates LLM risks with guardrails that ensure:

Fair and unbiased outputs
Intellectual property protection
PII elimination to safeguard user privacy
Improved LLM accuracy and performance for minimizing AI hallucinations
Filtering of offensive or harmful content
Alignment with legal and regulatory standards
Ethical use of LLMs
Elimination of risks and improved accuracy with control and guardrail components.
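
A guardrail can be thought of as a set of checks that every model output must pass before it reaches the user. The following sketch shows that pattern with two toy checks. It is an assumption-laden example rather than the platform's guardrail components: the check functions, the blocked-term list, the phone-number pattern and the fallback message are all hypothetical.

```python
import re
from typing import Callable, Optional

# A check returns a violation label, or None when the text passes.
Check = Callable[[str], Optional[str]]

BLOCKED_TERMS = {"badword"}  # hypothetical placeholder list
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")  # simple US-style numbers only
FALLBACK = "Sorry, I can't share that response."


def check_pii(text: str) -> Optional[str]:
    return "pii" if PHONE_RE.search(text) else None


def check_blocked_terms(text: str) -> Optional[str]:
    words = set(re.findall(r"[a-z']+", text.lower()))
    return "harmful_content" if words & BLOCKED_TERMS else None


def apply_guardrails(
    text: str,
    checks: list[Check],
    fallback: str = FALLBACK,
) -> tuple[str, list[str]]:
    """Run every check; return the original text if all pass, else the fallback.

    The list of violation labels is returned as well so it can be logged.
    """
    violations = [v for check in checks if (v := check(text)) is not None]
    return (fallback if violations else text), violations
```

Keeping each check as a separate function makes it straightforward to add checks for bias, intellectual property or regulatory rules without changing the calling code.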

GPU Provisioning

GPUs are often under-utilized due to inefficient resource allocation, data bottlenecks, complicated DevOps and limited support for use-cases beyond deep learning. The Iguazio AI platform provides provisioning capabilities that help customers use their GPU investments efficiently, saving heavy compute costs, simplifying complex infrastructure and improving performance.

Iguazio users can:

Run experiments with GPU resources attached, along with full resource control.
Assign GPUs to training engines like Spark or Horovod or to a Jupyter notebook, all within a simple UI.
Automatically free up resources with the scale to zero option, which triggers when a Jupyter notebook with assigned GPUs is idle for a certain amount of time.
Scale up when the workload demands, and easily release GPU resources to scale down on-demand.
Simplify GPU management with out-of-the-box GPU monitoring reports on both the cluster and the application level.
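
The scale-to-zero behavior listed above boils down to an idle-timeout rule: a service that holds GPUs and has shown no activity for long enough is released. The sketch below illustrates only that decision logic. The `Notebook` type, the `services_to_scale_down` function and the one-hour default are assumptions for this example and do not reflect how the platform tracks activity or releases resources.

```python
from dataclasses import dataclass


@dataclass
class Notebook:
    name: str
    gpus: int             # number of GPUs currently assigned
    last_activity: float  # timestamp of last activity, in seconds


def services_to_scale_down(
    services: list[Notebook],
    now: float,
    idle_timeout_seconds: float = 3600,
) -> list[str]:
    """Return names of GPU-holding services idle for at least the timeout."""
    return [
        s.name
        for s in services
        if s.gpus > 0 and now - s.last_activity >= idle_timeout_seconds
    ]
```

Services without GPUs are skipped because scaling them down frees no GPU capacity.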

Open Source Core

Iguazio operates with open source at its core, allowing users to future-proof their stack, integrate with any third-party service and maintain flexibility.

MLRun, the open-source AI orchestration framework built and maintained by Iguazio, manages ML and generative AI applications across their lifecycle. It automates data preparation, model tuning, customization, validation and optimization of ML models, LLMs and live AI applications over elastic resources.

MLRun enables the rapid deployment of scalable real-time serving and application pipelines, while providing built-in observability and flexible deployment options, supporting multi-cloud, hybrid and on-prem environments.

Iguazio also maintains Nuclio, the open source serverless framework used to minimize development overhead, increase performance and automate the deployment of AI applications. Nuclio’s capabilities support data ingestion, model training, deployments, resource optimization and more.