Interview: Using AI agents as judges in GenAI workflows

By Computer Weekly
September 16, 2025


Around 40 years ago, a bank branch manager probably knew the name of every customer and was able to offer personalised advice and guidance. But as Ranil Boteju, chief data and analytics officer at Lloyds Banking Group, points out, in today’s world, that model cannot scale.

“In the world of financial planning, most people in the UK cannot afford to see a financial planner,” he says.

There is also an insufficient number of trained financial advisers to help everyone seeking advice, which is why financial institutions are looking at how they can deploy generative artificial intelligence (GenAI) to support customers directly.

But the large language models (LLMs) and GenAI services from hyperscalers are rather like black boxes and can deliver incorrect responses, known in AI terms as hallucinations. Neither the opacity nor the errors is acceptable in a sector regulated by the Financial Conduct Authority (FCA).

What excites Boteju is the ability to scale that 40-year-old bank manager model to meet current demand using artificial intelligence, deployed in a way that gives the bank confidence the AI understands what people need, offers them the right guidance, and can be assessed against FCA guidelines.



“It would be a great ‘unlock’ for the UK in terms of giving access to high-quality financial guidance to a much broader and larger set of the population,” he says.

As Boteju notes, banks have been using AI for many years. “We’ve been using all sorts of machine learning algorithms for things like credit risk assessments and fraud screening for more than 15 years,” he says. “We’ve also been using chatbots for at least 10 years.”

As such, AI is a well-established capability in financial services. What is new, however, are generative AI and agentic AI. “Generative AI burst on the scene in late 2022 with ChatGPT. It’s been about for almost two-and-a-half years now,” says Boteju.

While banks have experience with AI, they have needed to figure out how to use generative AI and large language models. Speaking of his own experience, Boteju says: “We think about things like model performance and whether we are using the right algorithm.”

There is also transparency, ethics, guardrails and how the AI models are deployed. Boteju says: “These are common both to large language models and traditional AI. But generative AI has specific challenges in financial services because we are a regulated industry.”

Since generative AI can often lead to hallucinations, he says banks have to be very cautious about how they expose large language models directly to customers. “We put a lot of effort into ensuring that the outputs of the large language models are correct, accurate and transparent, and there’s no bias.”

In a regulated industry, it is vital to ensure the AI models are not hallucinating. “That’s probably one of the key things we need to be really cognisant of,” he says.

The need for specialist AI models

As Boteju notes, a model like Google Gemini is trained on everything. “If you ask it a question, the output will be based on its knowledge of everything. It’s been trained on lots and lots of data.”

Not all of this data is relevant to financial services, however. If the AI model is restricted to data specific to financial services, it should, in theory, hallucinate less.

“We felt quite strongly that we wanted to use a language model or a group of models that were specifically trained on financial services data relevant to the UK,” says Boteju.

This led to Lloyds Banking Group approaching Scottish startup Aveni to support the development of FinLLM, a financial services-specific large language model. In 2024, the company secured £11m of investment from Puma Private Equity, with participation from Lloyds and Nationwide.

Discussing the work with Aveni, Boteju says Lloyds Banking Group did not want to be tied to one specific model, so it decided to take an open approach to foundation models. From an AI sovereignty perspective, he says: “We don’t want to be limited to the large hyperscale models. There’s a fantastic ecosystem of open source models that we want to encourage, and the fact that we could create a FinLLM that is UK-centric in the UK is something we found very appealing.”

The bank has been testing FinLLM in its audit team, where an audit chatbot developed by Group Audit & Conduct Investigations (GA&CI) is transforming how auditors access and interact with audit intelligence. The chatbot integrates generative AI with the group’s internal documentation system, Atlas, making information retrieval faster, smarter and more intuitive.

Boteju says the bank effectively trained the chatbot using FinLLM and its knowledge of audits, based on all the audit data it has collected.
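In rough terms, a retrieval-grounded assistant of this kind fetches the most relevant internal documents and then asks the model to answer only from that material. The sketch below is a minimal illustration under assumed names: the in-memory document store, search_docs and call_finllm are placeholders, not the bank's Atlas system or a real FinLLM API.

# Minimal sketch of a retrieval-grounded audit assistant.
# The document store and call_finllm are hypothetical stand-ins for the
# Atlas documentation system and FinLLM described in the article.

AUDIT_DOCS = [
    {"id": "AUD-001", "text": "Audit scope and controls testing for retail lending."},
    {"id": "AUD-002", "text": "Findings on model risk governance and validation."},
]

def search_docs(query: str, docs: list[dict], top_k: int = 2) -> list[dict]:
    """Naive keyword retrieval: rank documents by terms shared with the query."""
    terms = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(terms & set(d["text"].lower().split())),
                    reverse=True)
    return scored[:top_k]

def call_finllm(prompt: str) -> str:
    """Placeholder for a call to a domain-tuned model such as FinLLM."""
    return f"[model answer grounded in a prompt of {len(prompt)} characters]"

def answer_audit_question(question: str) -> str:
    context = search_docs(question, AUDIT_DOCS)
    context_block = "\n".join(f"[{d['id']}] {d['text']}" for d in context)
    prompt = (
        "Answer using only the audit documents below. "
        "If they do not contain the answer, say so.\n"
        f"{context_block}\n\nQuestion: {question}"
    )
    return call_finllm(prompt)

print(answer_audit_question("What did the audit find on model risk governance?"))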

He describes the approach Lloyds Banking Group has taken to reduce errors as “agent as a judge”. “You may have a specific model or agent that comes up with a specific outcome,” he says. “Then we’ll develop different models and different agents that review those outcomes and effectively score them.”

The bank has been working closely with Aveni to develop the approach of using AI agents as judges to assess the output of other AI models.

Each outcome is independently assessed by a set of different models. The review of the outputs from the AI models enables Lloyds to ensure they are aligned with FCA guidelines as well as the bank’s internal regulations.
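One way to picture the “agent as a judge” pattern is as a drafting model whose answer must clear a panel of independent judge agents, each scoring it against a criterion such as accuracy or compliance, before it is released. The sketch below is illustrative only; the criteria, threshold and call_model stub are assumptions, not details of the Lloyds and Aveni implementation.

# Illustrative "agent as a judge" loop: one model drafts an answer, several
# independent judge agents score it against criteria, and the answer is only
# released if every judge clears a threshold. All model calls are stubbed.

from dataclasses import dataclass

def call_model(role: str, prompt: str) -> str:
    """Placeholder for a call to an LLM playing the given role."""
    return f"[{role} output for: {prompt[:40]}...]"

@dataclass
class Judgement:
    criterion: str
    score: float   # 0.0 (fail) to 1.0 (pass)
    rationale: str

def judge(criterion: str, question: str, draft: str) -> Judgement:
    """A judge agent scores the draft on a single criterion.
    The score is hard-coded here; a real judge would parse the model's reply."""
    rationale = call_model(f"judge:{criterion}",
                           f"Question: {question}\nDraft: {draft}\nScore this for {criterion}.")
    return Judgement(criterion, score=1.0, rationale=rationale)

def answer_with_guardrails(question: str, threshold: float = 0.8) -> str:
    draft = call_model("drafter", question)
    criteria = ["factual accuracy", "regulatory compliance", "absence of bias"]
    judgements = [judge(c, question, draft) for c in criteria]
    if all(j.score >= threshold for j in judgements):
        return draft
    # Failing drafts are escalated rather than shown to the customer.
    return "Escalated to a human adviser for review."

print(answer_with_guardrails("Should I move my savings into a stocks and shares ISA?"))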

Checking the outputs of AI models is an effective way to verify that the customer is not being given bad advice, according to Boteju, who adds: “We’re in the process of refining these guardrails, and it’s imperative that we have [this process] in place.”

Boteju points out that having a human in the loop will remain important regardless of the “agent as a judge” approach. “There is still very much a place for humans in the loop in the future,” he says.

The power of different AI models in agentic AI

While an AI model like FinLLM has been tuned to understand the ins and outs of banking, Boteju says other models are much better at understanding human behaviour. This means the bank could, for instance, use one of the AI models from a hyperscaler, such as ChatGPT 5 or Google Gemini, to understand what the customer is actually saying.

“We would then use different models to break down what they’re saying into component parts,” he says. Different models are then tasked with tackling each distinct part of the customer query. “The way we think about this is that there are different models with different strengths, and what we want to do is to use the best model for each task.”
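That decompose-and-route idea can be sketched as a small orchestration loop: one step splits the customer query into parts, and each part is dispatched to whichever model is registered as strongest for that kind of task. The routing table, model names and call_model stub below are illustrative assumptions, not the bank's actual configuration.

# Illustrative decompose-and-route orchestration: a query is split into
# sub-tasks, each sub-task is routed to the model registered as best suited
# for it, and the partial answers are combined. All names are placeholders.

def call_model(model: str, prompt: str) -> str:
    """Placeholder for a call to the named model."""
    return f"[{model}: {prompt[:40]}...]"

# Which model handles which kind of sub-task (illustrative only).
ROUTING_TABLE = {
    "understand_intent": "general-hyperscaler-model",
    "product_rules": "domain-tuned-model",
    "tone_and_wording": "general-hyperscaler-model",
}

def decompose(query: str) -> dict[str, str]:
    """In practice a model would split the query; here the split is fixed."""
    return {
        "understand_intent": f"What is the customer asking? {query}",
        "product_rules": f"Which product rules apply to: {query}",
        "tone_and_wording": f"Draft a plain-English reply to: {query}",
    }

def handle_query(query: str) -> str:
    parts = decompose(query)
    answers = {task: call_model(ROUTING_TABLE[task], prompt)
               for task, prompt in parts.items()}
    return "\n".join(f"{task}: {answer}" for task, answer in answers.items())

print(handle_query("Can I overpay my mortgage without a penalty?"))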

This approach is how the bank sees agentic AI being deployed. With agentic AI, says Boteju, problems are broken down into smaller and smaller parts, where different agents respond to each part. Here, having an agent as a judge is almost like a second-line colleague acting as an observer.


