Implementing Gen AI for Financial Services

Alexandra Quinn | February 20, 2024

Gen AI is quickly reshaping industries, and the pace of innovation is incredible to witness. The introduction of ChatGPT, Microsoft Copilot, Midjourney, Stable Diffusion and many more incredible tools has opened up new possibilities we couldn’t have imagined 18 months ago.

While building gen AI application pilots is fairly straightforward, scaling them to production-ready, customer-facing implementations is a novel challenge for enterprises, and especially for the financial services sector. Risk, compliance, data privacy and escalating costs are just a few of the acute concerns that financial services companies are grappling with today.

This blog post will discuss:

  • The potential impact of generative AI for financial services
  • The challenges of deploying LLMs in production
  • Which engineering and risk-related considerations financial services companies need to take into account to successfully implement gen AI in their business environments

To learn more, watch the webinar “Implementing Gen AI for Financial Services” with Larry Lerner, Partner & Global Lead - Banking and Securities Analytics, McKinsey & Company, and Yaron Haviv, Co-founder and CTO, Iguazio (acquired by McKinsey), which this blog post is based on. View the entire webinar here.

What is the Potential Value of Gen AI and Analytics for Financial Services?

The potential annual value of AI and analytics for global banking could reach as high as $1 trillion. The evolution from analytical AI to generative AI has led to major advancements in the power of advanced analytics. Gen AI could deliver significant incremental value on top of this, leading to margin improvements of 3-5%. These productivity lifts are worth approximately $200B - $340B.

Potential Gen AI Use Cases

Which use cases is gen AI best suited for? There is a long list of potential use cases across enterprise functions and business groups in banking and securities. This includes dozens of use cases across Marketing, Operations, Legal and Compliance, and Talent and Organization.

For example, applications range from creating engaging customer content to profiling wealth prospects to drafting financial reports to fraud monitoring to generating job profiles. See the entire list in the webinar.

Today, approximately 75% of value from generative AI use cases falls under one of the following four use cases:

  1. Virtual expert - Summarizing and extracting insights from unstructured data sources, retrieving information efficiently to assist problem-solving and validating sources for credibility.
  2. Content generation - Creating contracts, NDAs, etc. to reduce manual work, and creating personalized messaging and next-product-to-buy recommendations.
  3. Customer engagement - Co-pilots that guide customers through personalized journeys and intelligent chatbots for enhanced 24/7 customer support.
  4. Coding acceleration - Interpreting, translating, and generating code (e.g., migration from legacy systems at scale), synthetic data generation and application prototyping.

Each of these provides significant productivity gains for users. To enjoy the monetary benefits listed above, financial services need to go beyond use cases like marketing, sales and risk. Gen AI can amplify capital markets and investment banking, asset management, corporate banking, wealth management, retail banking and more.

What are Some Gen AI Pitfalls?

However, gen AI is not yet the answer to every situation. It’s not recommended to use gen AI for:

  • High-stakes scenarios where errors, factual inaccuracies, or value judgements can cause harm, like disease diagnostics.
  • Applications involving heavy volume of requests and/or tight response time limits, like high frequency stock trading.
  • Unconstrained, long, open-ended generation that may expose harmful or biased content to users, like legal document creation.
  • Applications requiring explainability and/or full understanding of potential failure modes (e.g., highly regulated environments), like credit scoring.
  • Applications requiring numerical reasoning (from basic arithmetic to optimization), like demand forecasting.

This is because gen AI introduces significant and novel risks. Risk categories include impaired fairness, IP infringement, privacy concerns, malicious use, performance and explainability risks, security threats, negative ESG impact and third-party risks. Even when deploying gen AI for the recommended use cases, organizations need to implement guardrails and operate within them to minimize these risks.

Eating Our Own Dogfood: Meet Lilli

McKinsey’s developers built a gen AI-based conversational AI that can shorten weeks of research into hours. Lilli is a conversational AI tool powered by GPT 3.5 with access to a carefully selected corpus of McKinsey knowledge. Lilli can provide tailored answers to questions posed by McKinsey colleagues, including financial institutions, and enable them to access and synthesize proprietary information.

Lilli took approximately 4-5 months to build and another 4-5 months to deploy. The solution has access to 60,000 internal knowledge resources, financial data and insights on 50,000 companies, 10,000 publications and 25 data sources. These are based on McKinsey’s 30 years of experience, complemented with 150,000 hours of expert interviews. As of now, 66% of employees use Lilli multiple times a week, and 50,000 questions were asked within the first two weeks of launch. The tool has cut the time spent on research and planning from weeks to hours, and from hours to minutes.

Building Gen AI for Production

The ability to successfully scale and drive adoption of a generative AI application requires a comprehensive enterprise approach. This includes management vision and strategy, resource commitment, data and tech and operating model alignment, robust risk management and change management.

Considerations to take into account include:

  • The positioning in the enterprise
  • Data architecture, including access to large bodies of unstructured data (models are required but not sufficient)
  • Cloud infrastructure choices
  • The right UI/UX interface
  • Processes and people implications ("human in the loop" and analytics becomes a third leg of the stool with tech and business)

Taking a gen AI application to production also requires significant engineering, and is much more complicated than the prototyping phase alone. The required architecture includes a data pipeline, ML pipeline, application pipeline and a multi-stage pipeline. Read more here.

Let’s dive into the data management pipeline.

What are the Key Elements of Data Management in Gen AI?

The data pipeline ingests data from different sources and performs multiple actions like transformations, cleaning, versioning, tagging, labeling, indexing and more. This is essential for ensuring high quality outputs, which lead to high quality models.

Once data is ingested, it goes through transformations. These include:

  • Text cleansing and correcting
  • Toxicity detection and filtering
  • Bias detection and mitigation
  • Privacy (PII) protection
  • Deduplication
  • Formatting and tagging
  • Keyword and metadata extraction
  • Splitting and chunking
  • Tokenization and embedding
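
As an illustration of the splitting and chunking step, here is a minimal sketch that splits a document into overlapping word windows. The function name `chunk_words` and the default sizes are assumptions made for this example, not values from the webinar. Production pipelines usually chunk by model tokens or by document structure rather than by whitespace-separated words.

```python
def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(12))
    for chunk in chunk_words(sample, chunk_size=5, overlap=2):
        print(chunk)
```

The overlap trades some storage and indexing cost for better retrieval of content that straddles a boundary.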

When the data is ready as indexed or featured data, it’s time to perform actions like:

  • Transmitting to Vector DB + K/V Store
  • Data security
  • Data governance
  • Data versioning
  • Cataloging and labeling
  • Ensuring data quality

The final aspect is managing metadata, which includes:

  • Data pipeline orchestration
  • Data lineage and traceability
  • Resource management and observability

For example, let’s take a document that contains symbols that don't contribute to the language. The first step is filtering out the symbols. The second is deduplication, to prevent overfitting and improve accuracy. The third is substituting names and SSNs with masked data. The fourth is tokenization. Finally, the data is indexed in the vector database.
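
The five steps in this example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the pipeline shown in the webinar: all function names are hypothetical, names are masked from a supplied list (real systems typically use a named-entity recognition model), tokenization is a plain regex rather than a model tokenizer, and an in-memory dictionary of token counts stands in for the vector database.

```python
import hashlib
import re
from collections import Counter

# Assumption: US-style SSNs written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Anything that is not a word character, whitespace or common punctuation.
SYMBOL_PATTERN = re.compile(r"[^\w\s.,;:!?'()\[\]$%-]")


def filter_symbols(text: str) -> str:
    """Step 1: drop symbols that carry no language content."""
    cleaned = SYMBOL_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def deduplicate(docs: list[str]) -> list[str]:
    """Step 2: keep the first copy of each document (case-insensitive)."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def mask_pii(text: str, known_names: list[str]) -> str:
    """Step 3: replace SSNs and the supplied names with placeholders."""
    masked = SSN_PATTERN.sub("[SSN]", text)
    for name in known_names:
        masked = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", masked)
    return masked


def tokenize(text: str) -> list[str]:
    """Step 4: split into lowercase words, keeping mask placeholders intact."""
    tokens = re.findall(r"\[[A-Z]+\]|\w+", text)
    return [t if t.startswith("[") else t.lower() for t in tokens]


def run_pipeline(raw_docs: list[str], known_names: list[str]) -> dict[int, Counter]:
    """Step 5: 'index' each document as a bag of tokens (vector DB stand-in)."""
    unique = deduplicate([filter_symbols(d) for d in raw_docs])
    return {
        doc_id: Counter(tokenize(mask_pii(doc, known_names)))
        for doc_id, doc in enumerate(unique)
    }


if __name__ == "__main__":
    docs = [
        "★ Jane Doe's SSN is 123-45-6789 ★",
        "Jane Doe's SSN is 123-45-6789",
        "Rates rose 5% in Q1.",
    ]
    for doc_id, bag in run_pipeline(docs, known_names=["Jane Doe"]).items():
        print(doc_id, dict(bag))
```

Cleaning runs before deduplication here so that two copies of a document that differ only in stray symbols are recognized as duplicates.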

Once these are completed, it’s still important to validate requests and responses to lower risks.
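
A minimal sketch of what request and response validation could look like is shown below. The rules, limits and names here (`MAX_REQUEST_CHARS`, `BLOCKED_PHRASES`, `validate_request`, `validate_response`) are hypothetical and chosen only for illustration. Real guardrails would be defined with risk and compliance teams and would typically combine rules like these with model-based checks.

```python
import re
from dataclasses import dataclass, field

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MAX_REQUEST_CHARS = 2000                               # assumed limit
BLOCKED_PHRASES = ("guaranteed return", "risk-free")   # hypothetical policy


@dataclass
class ValidationResult:
    ok: bool
    reasons: list = field(default_factory=list)


def validate_request(prompt: str) -> ValidationResult:
    """Check a user prompt before it is sent to the model."""
    reasons = []
    if not prompt.strip():
        reasons.append("empty request")
    if len(prompt) > MAX_REQUEST_CHARS:
        reasons.append("request too long")
    if SSN_PATTERN.search(prompt):
        reasons.append("request contains an SSN")
    return ValidationResult(ok=not reasons, reasons=reasons)


def validate_response(answer: str) -> ValidationResult:
    """Check a model answer before it is shown to the user."""
    reasons = []
    if SSN_PATTERN.search(answer):
        reasons.append("response contains an SSN")
    lowered = answer.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            reasons.append(f"response contains blocked phrase: {phrase}")
    return ValidationResult(ok=not reasons, reasons=reasons)


if __name__ == "__main__":
    print(validate_request("My SSN is 123-45-6789, what is my balance?"))
    print(validate_response("This fund offers a guaranteed return."))
```

Returning the list of reasons, rather than a bare pass/fail flag, makes it easier to log and audit why a request or response was rejected.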


Generative AI has the potential to drive substantial margin improvements and operational efficiencies for financial services. However, reaching the point where institutions can enjoy that potential is not without its challenges. Financial institutions need to invest resources in risk management, compliance, data privacy and technological integration to realize the full benefits of generative AI.

More specifically, implementing gen AI in production requires a robust, well-thought-out data management pipeline for ingesting, transforming, cleaning, versioning, tagging, labeling, indexing and enhancing data. This helps address risks and improve data quality.

You can watch the entire webinar here.