Blog
How JetBlue aligns costs, culture, and AI for CX success
In part two of our blog series on JetBlue’s approach to customer service, we dive into the financial and labor dynamics that shape the airline’s contact center operations. Shelly Griessel, VP of Customer Support at JetBlue, offers a candid look at the factors driving up costs, from first-contact resolution challenges to the role of AI in improving efficiency. But it’s not just about numbers—JetBlue is deeply committed to supporting their workforce, investing in training, and fostering a culture where employees feel valued and empowered. In this post, we’ll explore how GenerativeAgent, what JetBlue calls Amelia 2.0, is not just a tool for improving customer interactions, but a vital partner in alleviating burnout and keeping agents engaged.
* Minor edits have been made to the transcript for clarity and readability.
Understanding Cost Drivers in Customer Service
Dan: On cost reduction in the contact center, some will say, “okay, we're trying to reduce or maintain costs.” When you think of the financial aspect that either you or your peers are seeing, what is driving up that cost? Any insight that you have on what's affecting that?
Shelly: What's driving up the cost is that if you don't keep FCR at the highest possible level, customers will keep on calling back. And more volume means the cost of calls goes up.
So you have to try and find a way of doing two things in the customer contact center environment going forward. For us, it is bringing down the cost per call, and a big part of that is containment – through conventional chat or, now, through something more progressive: what we call Amelia 2.0 (powered by ASAPP’s GenerativeAgent).
We call our bot Amelia (after Amelia Earhart) because we had to give her a name.
We really like her. She shows up every day. She's got no absenteeism problems, never wants PTO, and she's always friendly.
The Role of AI in Enhancing Customer Interactions
Shelly: So now we've got Amelia 2.0, which is gen AI, and she's a little bit more conversational, sometimes too much, but we're getting her there.
And I think that that is the next evolution.
We have to free crew members (JetBlue’s contact center agents) from having to deal with very basic stuff, and frankly, they get bored with it.
Our tenure is extremely long at JetBlue. The average tenure is about ten and a half years; they don't leave. But also the majority of our crew members are part-time.
So they work anywhere between fifteen and thirty hours a week, and that also helps prevent burnout. They don't burn out, and that's why.
But gen AI has a massive role to play in this. A massive role.
Dan: When you're approaching the big topic of AI – something that’s so ubiquitous now that it's become very generic and losing a lot of its meaning and power – how are you and the team approaching those waters in the contact center space? What are the concerns and the outcomes that you're trying to get?
Shelly: So when we started really ramping up AI, there was obviously a massive fear by our crew members about “it's gonna take my job away.” And that was a very real fear for them.
And then they started realizing that she (Amelia 2.0) actually covers the shifts that they don't want to do. So Amelia became very handy. She would work weekends, and she would work through the night. So from that perspective, it became less of a threat for them because they knew that she was complementing them.
And our containment is extremely high. We started two years ago in the low 30s – 35, 36% containment. And now we're sitting at between 68% and 70% containment.
It just never gets past Amelia. Amelia keeps it. But when a call eventually does come over to a crew member, all the hard work has already been done, and they can step in and start making the decisions the customer needs made. So for us, we've embraced AI as a company. Crew members are still more afraid of “will a BPO take over my job?” than they are of AI now.
Addressing Labor Concerns in the Age of AI
Dan: When we talk about how agents are gonna lose jobs because of AI, I'm wondering why we aren't talking more about how agents are removing themselves from the job in record numbers. I think we are at 52-62% average turnover, major absenteeism, and the majority of contact center leaders are saying they are having a hard time recruiting.
I'm really interested in your thoughts on AI when it comes to this big issue of labor in the contact center and the high cost and absenteeism, and how contact centers are just dealing with that.
Shelly: We are proud of the fact that we don't lose people. But I think it's got a lot to do with the part-time model that we run. In fairness, there's a massive burnout level.
If you take call after call after call – and customers, let's be honest, don't call in to say, “way to go,” “I had a great flight,” “my flight was on time.” That's a given.
They don't. So it's every single call, call after call after call.
So we obviously manage it through the fact that they don't work 40-hour weeks; they work part-time, and that makes life a whole lot easier. But we also have a responsibility – and we spend a lot of time on culture. So we double down on culture. We watch our ratios of crew members to supervisors.
Maintaining Relevance in a Changing Workforce
Shelly: I was sharing with you earlier how much time we spend with crew members – what we call the PDRs, which is protecting the direct relationship that we have with them. It's a big deal for us. And we explain to them the why behind BPOs, the why behind Amelia. And the more they understand it, the better.
But the other obligation that we have in this industry is to make crew members or agents, wherever they are, relevant in five years from now because they won't be relevant if we don't make the effort. It's kind of our responsibility to do it.
And we have to teach them different scenarios of how to deal with de-escalation or really complicated problems. We have to, and that is where technology comes into it. Technology becomes their friend.
So I think we have a massive obligation and a responsibility to keep them relevant in the new world, and the communication has to be so big and so wide open to bring them along.
The worst thing that we can do is make them feel disenchanted and have them stay. Because if they stay, your customers will feel the impact of their unhappiness.
You've got an obligation. Employee engagement scores are huge at JetBlue. They're very, very big.
So we spend a lot of time on that, but I think that embracing technology as a part of it, we've never shied away from the fact that Amelia is there and this is what she does. These were her stats. We give them her stats on a daily basis because she's part of the team.
It's weird. She's not really a person, you know that.
Dan: Amelia is ASAPP's GenerativeAgent, but when she talks about it as Amelia, I'm like, yeah, she's part of the team. This nice woman in the front row said, “I wanna be Amelia's friend now.”
Fostering a Valued Workforce Culture
Dan: So, to put a finer point on it, because we are gonna talk about the tech part of this and the partnership with ASAPP later. Can you touch on what you are doing in the culture that you think, “this is going to have a really appreciable effect on keeping the crew members in the job?”
Shelly: People really want to feel valued. The feedback we constantly get from our crew members is, “This is the first place I have worked where I don't feel like a number. I'm not a number.”
They've got access, and when we say they've got access to me as the VP, that is not just empty talk. It's real.
They really do have access. Every leader on my team has to dedicate two hours a week on their calendar for any crew member to talk to them about anything. Anything. And they block out those hours.
We have monthly what we call “directly to you” meetings, in which everything is on the table. We tell them all the good and all the bad. The company is not doing well; the company is doing well; this is where we're going. This is the good and the bad.
We've got a CEO that absolutely believes in 100% transparency. There's absolutely no point in sugarcoating anything. You have to be very honest with people.
And that's how we bring 24,000 people along with us. That's the total number of crew members we've got in the company, and that's how we bring them along and how we protect the culture. The customer contact center area is not unionized, and there's no talk of unionizing, because people know that things get done faster when they come directly to Shelly, or to their director or manager, and say, “I'm really not happy about this. What are we gonna do about it?” “Oh, well, this is why I can't do anything about it.”
So we absolutely believe in a very transparent relationship with them. We tell them the good and the bad all the time.
Read Part 1, JetBlue’s CX journey: tackling challenges in an evolving industry.
Part 3 coming soon.
JetBlue’s CX journey: tackling challenges in an evolving industry
Customer support is not just about answering customer questions—it's also about resolving customer issues quickly and adapting to rapidly changing expectations. JetBlue has been at the forefront of this shift, with a model built on 25 years of remote work and a deep commitment to empowering its agents. In this post, part one of a three-part blog series based on a discussion between Dan Rood, SVP Marketing at ASAPP, and Shelly Griessel, VP of Customer Support at JetBlue, we explore how the airline's approach has adapted to the challenges of a rapidly evolving industry, where the unpredictable nature of customer needs and agents' lives can make every day a unique experience.
Read Part 2, How JetBlue aligns costs, culture, and AI for CX success.
* Minor edits have been made to the transcript for clarity and readability.
The unpredictable joys of contact center work
Dan: I'd love to hear from you - what is a moment or moments that occur to you in this industry in your role that you go, “I love this job. I love this.”
Shelly: I think without a doubt, it is the unpredictability.
We all think we've got our week planned out, we've got our day planned out, but it really doesn't work like that. I think that I love, love, love the fact that I've got absolutely no idea what's going to come at me today… And the unpredictability of what's happening in your crew members' lives (that's what we call our agents).
And it makes for constant entertainment. Sometimes it's heart-wrenching stuff that you have to deal with. So we go permanently from a state of being absolutely ecstatic about life – like, “we're winning every day” – and then there are just heartbreaking times.
And I think that's what I love most about the job.
I don't think… this is not what I planned to do in life, but just being in contact centers for so long, I think I've been blessed to have experienced so much over the thirty-five years that I've been in the business.
Why contact center challenges persist
Dan: If we were to go back five or ten years, the conversation in any survey we take in a room like this is that three or four things happen in a contact center – controlling, or even reducing, costs and improving CSAT are the themes over and over again. And yet, in 2024, there's a lot of evidence to say it's not necessarily getting better. If you were to step back, why does it feel like it is not getting better?
Shelly: I think, to start off with, the last four to five years have made it exponentially harder. Something has just clicked in the minds of consumers, and they're not tolerating as much.
It feels like people are a lot harder to please. And I think that people are also more stretched for time than ever before. There's just so little tolerance for long hold times, and then they get through to somebody who isn't going to wrap it up quickly enough.
I think the most precious commodity in today's world is absolutely time.
I am of the belief that they want to get an answer quickly. They want to get a consistent answer. If they don't like it, then they'll phone back three or four or five or six or seven times, until they get the answer that they're looking for. That drives up costs, that drives down FCR.
It's just a lot more complicated, and I think COVID has a lot to do with it. It has also got a lot to do with how much more challenging crew members' lives are. It is harder, and the value of the job hasn't gone up. So people are not necessarily getting paid more, and they're holding down two or three jobs, which makes their lives complicated.
Remote work and customer expectations
Shelly: In the JetBlue model, our contact center crew members work one hundred percent from home, and have since the beginning, 25 years ago.
In addition, because they work from home, they were challenged during COVID with being alone and having no contact with anybody else. So it's been a lot worse for them.
But I do think that for customers, it is a lot more about getting through to somebody that can give me an answer really quickly, and let me be on my way. And I want to do it when I want to do it, any time of the day or night. It's just a different experience that we have to offer customers today versus what it was.
Empowering contact center roles for speed and resolution
Dan: At ASAPP, we call it abbreviating customer pain. From a performance standpoint, the main threshold that you're trying to get to is speed and obviously real resolution. Is that the strategy when it comes to keeping brand differentiation with JetBlue versus your competitors?
Shelly: Absolutely. So we do want to speed it up, and we want to make sure that we use human intellect in a different way than we did three or five years ago. Which means that we are investing a lot more in the training and development of people to deal with the really tough issues. It doesn't help to say to a person, “You're empowered, and you can do what you need to do.” You actually have to teach them what empowerment looks like. So there's a lot more time and resources that we're spending on evolving the role of a customer support person.
And I do believe that the footprint of humans in a contact center is going to get smaller, but I think it's finally going to become a higher-paid role, because they're going to deal with a lot more complicated matters – a lot more complicated – and we should pay them more.
So I think the footprint will get smaller, but we should be able to pay them more.
Read Part 2, How JetBlue aligns costs, culture, and AI for CX success.
The road ahead: What the next year will bring for AI in CX
We all hear the same thing about the future of the CX landscape every year, that it’s poised for a dramatic transformation. Most years, the steady evolution of CX tech delivers real gains for customers and businesses but falls short of true transformation. But this year seems different. As CX leaders embrace generative AI, the role of contact centers—and the agents who work within them—will undergo a profound shift. The key to navigating this colossal change will be effectively using the full range of genAI’s capabilities to improve both customer and employee experiences.
The growing complexity of AI use cases for CX
Over the next year, two things will happen in tandem – new CX use cases for AI-native solutions will continue to emerge, and enterprises will shift focus to improving data curation. A whopping 95% of customer service leaders say that improving analytics, intelligence, and quality capabilities will be a bigger priority in 2025. That focus will lead to better, cleaner data. And with better data, new AI solutions will be able to tackle more complex customer interactions.
AI is already automating highly repeatable human-assisted transactions. The capabilities that power this automation, such as summarization, will be commoditized and widely deployed. At the same time, more advanced capabilities will be developed. As AI automates more complex and time-consuming tasks, the role of humans in the contact center will evolve to focus on critical interactions that build customer relationships and strengthen the brand.
Productivity improvements are already obvious early wins with AI, and that will continue. But businesses will lean into generative AI’s capacity to personalize experiences at scale. The vast majority of CX leaders (90%) say that determining the best use cases for AI will be a key focus in 2025. The blend of efficiency and personalization that new use cases offer will finally begin to transform the way companies execute across the entire customer journey.
From legacy tech to LLMs – and beyond
Legacy technologies simply aren't up to the task of delivering the kind of service consumers expect these days. Chatbots, websites, mobile apps, and IVRs are all fine for very specific tasks, but their inability to support the dynamic, complex, and sometimes emotive needs of customers is where LLM-native AI makes a massive leap forward – when it's deployed thoughtfully.
Here’s the catch. An open-source LLM alone cannot manage the complexity of your customer interactions. That’s a lesson some businesses have learned the hard way with some very public failures. But a system of models grounded on your business lexicon and integrated with your company’s backend systems is incredibly powerful at achieving first contact resolution. And this is where leading companies are headed, fast.
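To make that pattern concrete, here is a minimal sketch of what grounding and backend integration can look like. Everything in it is a hypothetical stand-in: the knowledge base, the `backend_lookup_booking` call, and the escalation marker. In a production system, an LLM would compose the final conversational reply from the retrieved context rather than returning it verbatim.

```python
# Minimal sketch of a "grounded" agent: replies are drawn from a business
# knowledge base and actions go through backend integrations, rather than
# letting a bare LLM improvise. All names and data here are hypothetical.

KNOWLEDGE_BASE = {
    "baggage": "Each ticket includes one carry-on bag; checked bags start at $35.",
    "refund": "Refundable fares can be cancelled online for a full refund.",
}

def backend_lookup_booking(confirmation_code):
    """Stand-in for a real reservation-system API call."""
    return {"code": confirmation_code, "status": "confirmed"}

def grounded_answer(question, confirmation_code=None):
    # 1. Retrieve policy text so the response is grounded in business facts.
    context = [text for topic, text in KNOWLEDGE_BASE.items()
               if topic in question.lower()]
    # 2. Call backend systems when the question needs account-specific data.
    if confirmation_code:
        booking = backend_lookup_booking(confirmation_code)
        context.append("Booking {} is {}.".format(booking["code"], booking["status"]))
    # 3. With no grounding material, hand off instead of guessing.
    if not context:
        return "ESCALATE_TO_AGENT"
    # 4. In production, an LLM would compose a conversational reply from
    #    this context; here we simply return it.
    return " ".join(context)

print(grounded_answer("What is your baggage policy?"))
print(grounded_answer("Can you write me a poem?"))  # no grounding -> escalate
```

The design point is the routing: the system only answers from retrieved business facts and live backend data, and hands off to a human when neither applies – which is what makes first contact resolution achievable without the public failure modes of an ungrounded model.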
If you haven’t already moved in this direction, you would be well served to get started now by finding an AI solution partner who can implement a range of use cases tailored to your business. You can start simple and deploy use cases of increasing complexity over the next couple of years. This iterative process will ensure ongoing learning and deliver transformational customer, employee, and shareholder value.
The evolving role of human agents
In my experience, no one knows the best and worst of your company better than the agents in the contact center. They talk to the customers that use your products and services all day, every day. So, they know what’s working and what’s not. That makes them a valuable resource, one that too often remains untapped.
As the role of AI expands to handle more routine work, the role of human agents will evolve in ways that create opportunities to leverage their unique knowledge and perspective. Nearly 89% of customer service leaders say that the typical agent of the future will handle complex customer service and support interactions. Other likely aspects of the evolving agent role include participating in client retention and winback efforts (68%), contributing to knowledge management (66%), and collecting and sharing customer feedback (66%).
The expertise of your best agents makes them uniquely equipped to train and supervise AI in a way that will completely change how your contact center operates. Not only will AI resolve the systemic issues that make contact center work incredibly difficult and expensive, but it will also empower companies to create historically high efficiency and personalization. That shift will do a lot to address consumer dissatisfaction, which creates massive headwinds to growth and profitability.
To achieve this transformation, CX employees must be part of the solution from day one. It will be crucial for businesses to involve them directly in AI adoption and expansion. While some jobs, over time, will be reduced or even eliminated, new roles will also be created. Educating employees on the importance of their role with AI, upskilling those tasked with making the technology successful, and evolving incentive programs will accelerate benefits to the customer and company.
The transformational opportunity for CX leaders has never been this great. But it depends heavily on balancing the technology with the people in the right way.
- Chris Arnold, VP of Contact Center Strategy, ASAPP
Human involvement will be crucial for improving the performance of AI solutions and for ensuring safety and data security. Businesses that successfully evolve a subset of agent roles to provide supervision, human reasoning and judgment, and advanced triage will be better positioned to deliver safe and highly efficient first contact resolution at scale using AI.
Not all tech providers are equally mindful of the need for this balance of humans and AI. You’ll need to choose a partner that acknowledges the importance of people and process, rather than simply providing software. How technology is deployed matters as much as the technology itself, maybe more in this case.
Navigating this new path forward
There is a tremendous amount of noise in the market right now as companies in every industry develop their CX strategy for generative AI. This has created a massive amount of marketing from technology providers that serve contact centers, many of them painting bolt-on solutions as truly AI-native. That can make it difficult to evaluate and compare solutions and vendors. For long-term value, it will be important to identify the vendors who have deep expertise in AI and CX and have built and deployed AI-native solutions successfully. So, you’ll need to ask tough questions of vendors prior to signing a contract.
As you add generative and agentic AI solutions to your mix, don’t be afraid to start small with a plan to learn and iterate quickly. In the contact center, small changes can lead to massive benefits. Take the time to be proactive and thoughtful in choosing the right use cases to target first. Consider where the AI is strong out of the box, what data sources are available, and where pain points can be quickly eliminated. This requires more thought than simply looking at the top five call drivers. You’ll get there more quickly if you leverage the inherent strengths of the AI and invest the time and energy into creating space for the solution to learn and do what it’s built to do – meet the needs of your customers in the most effective and efficient manner possible.
The shift from cost center to value center
As we look toward the future, it's clear that the contact center is undergoing a fundamental shift. The days of seeing contact centers as mere cost centers are numbered. In the new AI-powered world, contact centers will become value centers that contribute directly to customer loyalty, retention, and revenue generation.
Traditional metrics like Average Handle Time (AHT) and First Contact Resolution (FCR) will be replaced by more holistic measures that align with the new workflows and possibilities created by AI. These new metrics will shift the focus to omnichannel resolution, customer lifetime value, and concurrency.
As more human agents shift into new roles supervising the work of autonomous AI agents, concurrency improvements will be dramatic, even with voice. The gains this will bring in productivity and efficiency will dwarf the incremental improvements we've seen with other recent technologies. We're on the cusp of a radical change that will require a new approach to measuring performance. The new metrics will help drive contact centers to deliver exceptional customer experiences at scale, improving both the top and bottom lines of the business.
Thriving in this rapidly evolving landscape
The next year will see significant advancements in the contact center industry. AI will continue to reshape the way we deliver service, while human agents will play an even more critical role in driving customer satisfaction and brand loyalty. By embracing AI, empowering agents, and focusing on personalization and efficiency, businesses can position themselves to thrive in this rapidly evolving landscape. The future of customer service is here, and it’s one where technology and humans work together to deliver outstanding experiences.
Have we missed the point of empathy in CX?
Empathy in customer service doesn’t always look the way we expect. Sometimes it wears a disguise.
A few years ago, I bought my daughter a new mobile phone for Christmas. She planned to make a 3-hour drive, mostly through rural areas, to visit her cousins the next morning. We needed to get the phone working before she pulled out of the driveway.
But nothing we tried did the trick. So, late in the afternoon on Christmas day, we needed customer support.
Our service provider’s website offered two options, phone or chat. I hate chat for support, but the wait time for a phone call was more than I could commit to while getting ready for a family dinner. So, I fired up the chat and asked for help. At first, I wasn’t sure whether it was a bot or a human. I didn’t care either way as long as we got the phone working. Over nearly two hours, I alternated between the chat and my family. And in the end, the problem was fixed.
What does empathy actually look like in customer experience?
I don’t recall the agent (human after all) saying anything particularly compassionate. And yet, this was one of the most empathetic customer service experiences I’ve ever had. Here’s why:
- The interaction resolved my problem on my schedule without requiring a call or visit to the store.
- I had a clear choice between phone and chat and knew the current wait times.
- I got the problem resolved without missing Christmas dinner with my family.
The bottom line is that my service provider gave me options for how to engage and a convenient way to get what I needed.
This is how empathy sometimes wears a disguise. It masquerades as efficiency, convenience, and ease.
In an industry hyper-focused on the emotional side of empathy, we too often overlook this crucial practical side. But we shouldn’t. It matters to customers, a lot.
- 93% of customers expect their issue to be resolved on the first call
- 62% of customers would rather “hand out parking tickets” than wait in an automated phone tree for service or have to repeat themselves multiple times to different team members
- 80% of American consumers say efficiency is the most important factor in the customer experience, and more than half say it’s worth paying more for
The often-overlooked practical side of empathy in CX
In recent years, the CX industry has focused intently on empathy. Businesses spend time and resources to upskill agents on active listening, emotional intelligence, and expressing care and compassion. They even provide lists of empathetic phrases their agents can use. And a growing number of contact centers use AI to detect customer sentiment throughout each interaction. All of that is great. It reminds the agents that customers are human, too, and they need to hear that someone cares about their problem.
Validating a customer’s feelings is an important component of putting empathy into practice. But it’s only one component.
Caring alone doesn’t resolve a customer’s issue, and it doesn’t automatically make the process of reaching a resolution easy or convenient.
Long wait times, multiple interactions, and chatbot failures are not empathetic. Many CX leaders view those points of friction through the lens of contact center efficiency with metrics like transfer rates and digital containment. But friction also increases customer effort, which is an important component of empathy in CX. And too many contact centers deliver experiences that require a lot of customer effort – ineffective self-service, complicated IVR menus, disconnected channels, and more. An agent who says they understand your frustration can’t erase all that effort and wasted time.
Empathy in CX strategy: Are we making it too complicated?
The concept of empathy is somewhat vague and squishy, so it’s not surprising that CX leaders sometimes convert it into something else when crafting CX strategy. The problem is, they often convert empathy into the equally vague concept of customer-centricity. What does that mean? Keeping the customer front and center at all times, sure – but how? It isn’t always clear how centering the customer translates into actions and processes for the contact center to follow.
The vague nature of both empathy and customer-centricity tends to give rise to complex frameworks that attempt to make the strategy more concrete. For example, a framework might categorize elements in the CX ecosystem into systems of listening, understanding, action, and learning. Those frameworks can help shape perspectives within your business, but they still require additional translation to make them actionable for your frontline CX team.
Here’s a simpler approach. Embedding empathy into your CX strategy means consistently aiming to do these four things:
- Resolve the customer’s issue in the first interaction.
- Take up as little of the customer’s time as possible.
- Make the entire process easy and convenient.
- Treat your customers and employees like the human beings they are.
Getting to the point of empathy with generative AI
In contact centers, early AI implementations increased efficiency, but employees felt the impact more than customers did. In some cases, AI deployments actually increased frustration by raising customers’ hopes with big promises of faster, more convenient service that never materialized. Consider chatbots. Even with improved language processing, bots can’t take action to resolve a customer’s issue. So they require time and effort from the customer but often can’t truly help. When it comes to the practical side of empathy, they fail to deliver.
But that was then, and this is now. The technology has matured, and current implementations of generative AI are improving contact centers’ performance on both the emotional and practical sides of empathy. AI solutions increasingly take over repetitive and time-consuming tasks, freeing agents to focus more effectively on the customers they’re serving. This shift makes space to engage with more empathy across the board.
Customer-facing AI agents will generate a larger, even seismic, shift in how empathy is embedded into customer experiences. Generative AI agents can listen, understand, problem-solve, and take action to resolve customers’ issues. That ticks all the empathy boxes for me. This massive leap forward lays the groundwork for CX leaders to shift the emphasis of their AI investments toward solutions that do more than talk in a natural way.
Practical empathy that’s just a chat or call away
That Christmas a few years ago when I needed customer service, I didn’t care whether I chatted with a human or AI. I just wanted my problem resolved before my daughter left town, preferably without having to call or visit the store the next day. I got lucky that time. My service provider had agents available. But we all know that’s not always the case. With a generative AI agent ready to respond 24/7/365, the customer’s luck never runs out. Effective, efficient, and convenient service will always be just a call or chat away. For me, that’s the part of empathy in CX that too many businesses are missing today. But I suspect that’s about to change.
5 questions to ask before building your own AI solution for CX
The build vs. buy debate in software is nothing new. With the rise of large language models (LLMs), spinning up prototypes that can complete tasks has become easier than ever. However, developing an AI application for CX that’s scalable and production-worthy demands extra consideration. In this post, we’ll explore key questions to ask when deciding whether to build or buy generative AI solutions for CX.
While the right answer depends on industry, use case, and business goals, it’s important to carefully weigh your options.
Why building an AI application for CX requires extra consideration
The reason for this has a lot to do with the nature of generative AI.
Unlike most software development, where the goal is a deterministic system that produces the same expected outcome every time, generative AI generates new information and content based on patterns, often producing varied responses that are difficult to predict.
This variability is what makes gen AI applications so powerful, but also adds complexities in maintaining consistency and control.
Differing from standard applications where the workflow is predefined, generative AI models require continuous monitoring, training, and refinement. There is a misconception that generative AI just needs instructions. In fact, these models need much more, especially if you are training on your own data.
The flexibility makes them ideal for tasks that benefit from creative problem-solving or personalized engagement, but it also means that development doesn’t stop at launch. This, compounded by the fact that CX applications can ultimately impact a brand’s relationship with its customers, brings additional layers of considerations when deciding whether to build the application yourself or to partner with a vendor.
While the following is not an exhaustive list, here are some top questions you should consider when making a decision.
Is building AI applications part of your business objectives?
If building AI applications aligns with your business objectives, you’re likely prepared for the short- and long-term investment in development and maintenance that AI applications demand, ensuring the solution doesn’t become a burden on your resources.
In contrast, if developing your own AI applications isn’t a core objective, partnering with a vendor may be the more sensible approach. Vendors are equipped to provide solutions that fit your needs without the complexities of internal development, allowing you to focus on your primary business goals.
Based on McKinsey & Company’s recent estimates, building and maintaining your own foundational model could cost up to $200 million, with an annual recurring cost of $1-5 million. Using off-the-shelf solutions or fine-tuning existing models with sector-specific knowledge can dramatically reduce this cost.
What’s your timeline for deployment?
If your deployment timeline is urgent—perhaps due to a competitor already leveraging AI solutions—working with an AI solution provider can help you hit the ground running and accelerate implementation. A vendor with solid experience in enterprise AI development will help you avoid unnecessary trial and error, minimizing AI safety risks such as hallucinations.
Instead, leaders should strongly consider partnering with gen AI solution providers and enterprise software vendors for solutions that aren’t very complex or [industry] specific. This is particularly critical in instances where any delays in implementation will put them at a disadvantage against competitors already leveraging these services.
- McKinsey & Company, in "How generative AI could revitalize profitability for telcos"
Do you have the internal expertise, resources, and infrastructure to build, scale, monitor, and maintain an AI solution?
While widely available large language models (LLMs) have significantly accelerated the process of building a working prototype on a laptop, this does not equate to having a production-ready, scalable solution that can effectively address your business needs. Additionally, the demand for ongoing maintenance means that development does not stop after the application is launched.
Developing AI applications requires a skilled team knowledgeable in machine learning, data management, and software engineering, along with the necessary technological resources and datasets. Even choosing the right LLM for a given use case requires a good understanding of how various LLMs differ. Experience working with AI solutions is also crucial for successful deployment in an enterprise context.
In the last two months, people have started to understand that LLMs, open source or not, could have different characteristics, that you can even have smaller ones that work better for specific scenarios.
- Docugami CEO Jean Paoli, in CIO’s “Should you build or buy generative AI?”
If your organization lacks this expertise or the infrastructure to support such a project, not just at the prototype stage but also to scale, monitor, and maintain the solution over time, it may be more prudent to consider vendor solutions that can provide the capabilities you need without overextending your internal teams.
Also consider: if problems arise, do you have the staffing and expertise to investigate the root cause and implement long-term fixes, or will you be crossing your fingers and hoping the issues don’t recur? A strong internal team capable of addressing challenges as they emerge is necessary to ensure the reliability and effectiveness of your AI application for CX. Without this capability, you risk operational disruptions and even diminished trust in your brand.
What ROI do you expect from your AI application? And how comfortable are you with the associated risks?
When considering your expected return from an AI investment, it’s essential to balance potential returns with the associated risks.
Agent-facing projects generally carry lower risks, as the AI solution won’t directly interact with customers. This allows for more trial and error, with agents able to provide feedback on the AI application’s performance. That said, such solutions might yield only incremental gains in agent productivity, with little or no impact on the customer experience, and they do not take advantage of the full capabilities of gen AI.
In contrast, customer-facing gen AI applications can offer a much better return because they directly improve customers’ ability to self-serve and, in some cases, resolve their issues end to end. Consider the kinds of results a customer can achieve when deploying generative AI agents capable of resolving Tier-1 tasks.
Allowing AI to handle a broader range of tasks introduces added complexity. There are risks, such as AI hallucination, where the system may generate incorrect or irrelevant responses, but these challenges can be managed with the right approach. A strong internal team, or an experienced vendor with a well-informed strategy for handling AI behavior, can ensure guardrails are placed around customer-facing interactions so you can get the most out of your AI applications with confidence.
Making a thoughtful decision
Ultimately, the choice between building or buying an AI solution should align with your organization’s long-term vision. Each option carries its own set of challenges and opportunities, and taking the time to assess your specific needs can set the stage for success.
Considering the evolving landscape of AI, it's not just about deploying technology, but also ensuring that it fits well into your operational framework and readies you for the future. With careful evaluation, you can make a choice that enhances your customer experience. Whether you decide to build or partner, the key is to stay focused on your goals and embrace a strategic approach to generative AI applications.
Preventing hallucinations in generative AI agents: Strategies to ensure responses are safely grounded
The term “hallucination” has become both a buzzword and a significant concern. Unlike traditional IT systems, Generative AI can produce a wide range of outputs based on its inputs, often leading to unexpected and sometimes incorrect responses. This unpredictability is what makes Generative AI both powerful and risky. In this blog post, we will explore what hallucinations are, why they occur, and how to ensure that AI responses are safely grounded to prevent these errors.
What is a Hallucination?
Definition and Types
In the context of Generative AI, a hallucination refers to an output that is not grounded in the input data or the knowledge base the AI is supposed to rely on. Hallucinations can be broadly categorized into two types:
- Harmless Hallucinations: These are errors that do not significantly impact the user experience or the integrity of the information provided. For example, an AI might generate a slightly incorrect but inconsequential detail in a story.
- Harmful Hallucinations: These are errors that can mislead users, compromise brand safety, or result in incorrect actions. For instance, an AI providing incorrect medical advice or financial information falls into this category.
Two Axes of Hallucinations
To better understand hallucinations, we can consider two axes: justification (whether the AI had information indicating that its statement was true) and truthfulness (whether the statement was actually true).
Based on these axes, we can classify hallucinations into four categories:
- Justified and True: The ideal scenario where the AI’s response is both correct and based on available information.
- Justified but False: The AI’s response is based on outdated or incorrect information. This can be fixed by improving the information available. For example, if an API response is ambiguous, it can be updated so that it is clear what it is referring to.
- Unjustified but True: The AI’s response is correct but not based on the information it was given. For example, the AI tells the customer to arrive at the airport two hours before departure. If this advice was not grounded in, say, a knowledge base article, it is technically a hallucination even though it is true.
- Unjustified and False: The worst-case scenario where the AI’s response is both incorrect and not based on any available information. These are harmful hallucinations that could require an organization to reach out to the customer to fix the mistake.
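To make the taxonomy concrete, here is a minimal sketch (the flag names are hypothetical, assumed to come from upstream grounding and fact-checking steps) that maps the two axes to the four categories above:

```python
# Sketch: classifying a model statement along the two axes described above.
# `justified` and `true` are assumed outputs of upstream checks (e.g., a
# grounding classifier and a comparison against source data).

def classify_hallucination(justified: bool, true: bool) -> str:
    """Map the two axes to the four categories from the taxonomy."""
    if justified and true:
        return "justified_and_true"      # ideal: correct and grounded
    if justified and not true:
        return "justified_but_false"     # fix the underlying information
    if not justified and true:
        return "unjustified_but_true"    # technically still a hallucination
    return "unjustified_and_false"       # worst case: harmful hallucination

print(classify_hallucination(justified=False, true=True))
# unjustified_but_true
```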
Why Do Hallucinations Occur?
Hallucinations in generative AI occur due to several reasons. Generative models are inherently stochastic, meaning they can produce different outputs for the same input. Additionally, the large output space of these models increases the likelihood of errors, as they are capable of generating a wide range of responses. AI systems that rely on incomplete or outdated data are also prone to making incorrect statements. Finally, the complexity of instructions can result in misinterpretation, which may cause the model to generate unjustified responses.
Hallucination prevention and management
We typically think about four pillars when it comes to preventing and managing hallucinations:
- Preventing hallucinations from occurring in the first place
- Catching hallucinations before they reach the customer
- Tracking hallucinations post-production
- Improving the system
1. Preventing Hallucinations: Ensuring Responses are Properly Grounded
One of the most effective ways to prevent hallucinations is to ensure that AI responses are grounded in reliable data. This can be achieved through:
- Explicit Instructions: Providing clear and unambiguous instructions can help the AI interpret and respond accurately.
- API Responses: Integrating real-time data from APIs ensures that the AI has access to the most current information.
- Knowledge Base (KB) Articles: Relying on a well-maintained knowledge base can provide a solid foundation for AI responses.
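As an illustration of grounding, here is a minimal sketch of composing a prompt from the sources above; the helper name, instruction wording, and sample data are all hypothetical:

```python
# Sketch: grounding an AI response by packing explicit instructions,
# a KB article, and real-time API data into the prompt. Illustrative only.

def build_grounded_prompt(question: str, kb_article: str, api_data: dict) -> str:
    context = "\n".join(f"{k}: {v}" for k, v in api_data.items())
    return (
        "Answer ONLY using the knowledge base article and account data below. "
        "If the answer is not present, say you don't know.\n\n"
        f"Knowledge base article:\n{kb_article}\n\n"
        f"Account data (from API):\n{context}\n\n"
        f"Customer question: {question}"
    )

prompt = build_grounded_prompt(
    "When does my flight board?",
    kb_article="Boarding begins 40 minutes before scheduled departure.",
    api_data={"flight": "B6 123", "departure": "14:05"},
)
```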
2. Catching hallucinations: Approaches to Catch Hallucinations in Production
To minimize the risk of hallucinations, several safety mechanisms can be implemented:
- Input Safety Mechanisms: Detecting and filtering out abusive or out-of-scope requests before they reach the AI system can prevent inappropriate responses.
- Output Safety Mechanisms: Assessing the proposed AI response for safety and accuracy before it is delivered to the user. This can involve:
- Deterministic Checks using rules-based systems to ensure that certain types of language or content are never sent to users, or
- Classification Models, which employ machine learning models to classify and filter out potentially harmful responses. For example, a model can classify whether the information contained in a proposed response has been grounded on information retrieved from the right data sources. If this model suspects that the information is not grounded, it can reprompt the AI system to try again with this feedback.
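To illustrate, here is a rough sketch of an output safety layer combining both mechanisms; `generate` and `is_grounded` are hypothetical stand-ins for the AI system and the grounding classifier, and the blocked-phrase list is illustrative:

```python
import re

# Sketch: deterministic rules-based check, then a grounding classifier
# that reprompts the AI system with feedback on failure.

BLOCKED = re.compile(r"guarantee|legal advice", re.IGNORECASE)

def safe_response(generate, is_grounded, max_retries: int = 2) -> str:
    feedback = None
    for _ in range(max_retries + 1):
        draft = generate(feedback)
        if BLOCKED.search(draft):            # deterministic check
            feedback = "Remove prohibited language."
            continue
        if not is_grounded(draft):           # classification model
            feedback = "Response was not grounded in retrieved sources."
            continue
        return draft
    return "Let me connect you with an agent."   # safe fallback

ok = safe_response(lambda fb: "Checked bags are $35 each.", lambda d: True)
```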
3. Tracking hallucinations: post-production
A separate post-production model can be used to classify AI responses as mistakes in more detail. While the “catching hallucination” model should balance effectiveness with latency, the post-production mistake monitoring model can be much larger, as latency is not a concern.
A well-defined hallucination taxonomy is crucial for systematically identifying, categorizing, and addressing errors in Generative AI systems. With such a taxonomy in place, users can aggregate reports that make it easy to identify, prioritize, and resolve issues quickly.
The following categories help identify the type of error, its source of misinformation, and the impact.
- Error Category - broad classification of the types of generative AI system errors.
- Error Type - specific nature or cause of an AI system error.
- Information Source - origin of data used by the AI system.
- System Source - component responsible for generating or processing AI output.
- Customer Impact Severity - level of negative effect on the customer.
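As a sketch, a post-production mistake report built from these taxonomy fields might look like the following; the field values shown are illustrative, not a fixed schema:

```python
from dataclasses import dataclass

# Sketch: a structured mistake report using the taxonomy fields above,
# so errors can be aggregated, prioritized, and resolved.

@dataclass
class MistakeReport:
    error_category: str       # broad classification of the error
    error_type: str           # specific nature or cause
    information_source: str   # origin of the data involved
    system_source: str        # component that produced the output
    customer_impact: str      # severity of the effect on the customer

reports = [
    MistakeReport("hallucination", "unjustified_and_false",
                  "knowledge_base", "response_generator", "high"),
    MistakeReport("hallucination", "justified_but_false",
                  "api_response", "response_generator", "low"),
]
high_impact = [r for r in reports if r.customer_impact == "high"]
```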
4. Improving the system
Continuous improvement is crucial for managing and reducing hallucinations in AI systems. This involves several key practices. Regular updates ensure that the AI system is regularly updated with the latest data and information. Implementing feedback loops allows for the reporting and analysis of errors, which helps improve the system over time. Regular training and retraining of the AI model are essential to enable it to adapt to new data and scenarios. Finally, human oversight involving contact center supervisors to review and correct AI responses, especially in high-stakes situations, is critical.
Conclusion
By understanding the nature of hallucinations and implementing robust mechanisms to prevent, catch, and manage them, organizations can harness the power of Generative AI while minimizing risks. Just as human agents in contact centers are managed and coached to improve performance, Generative AI systems can also be continually refined to ensure they deliver accurate and reliable responses. By focusing on grounding responses in reliable data, employing safety mechanisms, and fostering continuous improvement, we can ensure that AI responses are safely grounded and free from harmful hallucinations.
Building enterprise-grade AI solutions for CX in a post-LLM world
The advent of commercially available cloud-hosted large language models (LLMs) such as OpenAI’s GPT-4o and Anthropic’s Claude family of models has completely changed how we solve business problems using machine learning. Large pre-trained models produce state-of-the-art results on many typical natural language processing (NLP) tasks such as question answering or summarization out-of-the-box. Agentic AI systems built on top of these LLMs incorporate tool usage and allow LLMs to perform goal-directed tasks and manipulate data autonomously, for example by using APIs to connect to the company knowledge base, or querying databases.
The extension of these powerful capabilities to audio and video has already started, and will endow these models with the ability to make inferences or generate multi-modal output (produce responses not only in text, but also audio, image, and video) based on complex, multi-modal input. As massive as these models are, the cost and the time it takes to produce outputs is decreasing rapidly, and this trend is expected to continue.
While the barrier to entry is now much lower, applying AI across an enterprise is no straightforward task. It’s necessary to consider hidden costs and risks involved in managing such solutions for the long-term. In this article, we will explore the challenges and considerations involved in building your own enterprise-grade AI solutions using LLMs, and look at a contact center example.
The first step of a long journey
LLMs have served to raise the level of abstraction at which AI practitioners can create solutions. Tremendously so. Problems that took immense time, skill, experience and effort to solve using machine learning (ML) are now trivial to solve on a laptop. This makes it tempting to think of the authoring of a prompt as being equivalent to the development of a feature, replacing the prior complex and time-consuming development process.
However, showing that something can be done isn’t quite the same as building a production-worthy, scalable solution to a business problem.
For one, how do you know you have solved the problem? In traditional ML development, it is standard practice to create a high-quality dataset for training and evaluating the model, and building such a benchmark takes time and effort. Since LLMs come pretrained, such a dataset is no longer needed for training. Impressive-looking results can be obtained with little effort – but without anchoring results to data, the actual performance is unknown.
Creating the evaluation methodology for a generative model is much harder because the potential output space is tremendous. In addition, evaluation of output is much more subjective. These problems can be addressed, for example, using techniques such as “LLM as a judge”, but the evaluation becomes a demanding task - as the evaluator itself becomes a component to build, tune and maintain.
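A minimal sketch of the “LLM as a judge” pattern might look like this, assuming a hypothetical `call_llm` client; the rubric and the 1-5 scale are illustrative:

```python
# Sketch: using one LLM to score another's output. The judge prompt and
# scale are illustrative; in practice the judge itself must be built,
# tuned, and maintained as its own component.

JUDGE_PROMPT = """Rate the summary of the conversation on a 1-5 scale for
faithfulness (no claims absent from the conversation).
Conversation:
{conversation}
Summary:
{summary}
Respond with only the number."""

def judge_summary(call_llm, conversation: str, summary: str) -> int:
    raw = call_llm(JUDGE_PROMPT.format(conversation=conversation,
                                       summary=summary))
    return int(raw.strip())
```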
The need to continuously monitor quality requires complex data pipelines that sample model outputs from the production system, send them to a specialized evaluation system, and track the scores. The sampling method itself adds complexity: the desired data distribution, an appropriate sampling frequency (sampling and measuring quality often, on a large amount of data, gives more confidence in the quality estimate but is expensive), and considerations around protecting data in transit and retaining it according to policy. In addition, the scoring instructions, the interpretation of the scores, the type of LLM used for scoring, the type of input data, and so on can all change often.
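One small piece of such a pipeline, sampling production outputs for evaluation at a configurable rate, could be sketched as follows (the rate and seed are illustrative):

```python
import random

# Sketch: sample a fraction of production outputs for quality evaluation.
# A higher rate gives more confidence in the quality estimate but costs
# more to score; the seed makes a given audit reproducible.

def sample_for_eval(outputs, rate: float = 0.01, seed: int = 7):
    rng = random.Random(seed)
    return [o for o in outputs if rng.random() < rate]

sampled = sample_for_eval([f"summary-{i}" for i in range(10_000)], rate=0.01)
```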
A fantastic-looking prototype or demo running on a laptop is very promising, but it is the first step in a long journey that allows you to be assured of the quality of the models’ outputs in a production system.
In effect, the real work of preparing and annotating data to create a reliable evaluation takes place after the development of the initial prompt that shows promising results.
- Nirmal Mukhi, VP AI Engineering
Safety considerations
The generality and power of LLMs is a double-edged sword. Ultimately, they are models that predict and generate the text that, based on the huge amount of pre-trained data they have been exposed to, best matches the context and instructions they are given - they will “do their best” to produce an answer. But they don’t have any in-built notion of confidence or accuracy.
They can hallucinate, that is, generate outputs that make completely baseless claims not supported by the provided context. Some hallucinations could be harmless, while others could have detrimental consequences and can risk damaging the brand.
Hallucinations can be minimized with clever system design and by following the best practices in prompting (instructions to the LLM). But the chances of hallucinating are always non-zero; it is a fundamental limitation of how LLMs work. Hence, it becomes imperative to monitor them, and continuously attempt to improve the hallucination prevention approach, whatever that may be.
Beyond hallucinations, since LLMs can do so many things, getting them to “stay on script” when solving a problem can be challenging. An LLM being used to answer questions about insurance policies could also answer questions about Newton’s laws – and constraining it to a domain is akin to teaching an elephant to balance on a tightrope. You are trying to limit an immensely powerful tool, narrowing its focus to a single problem, such as the customer query it needs to resolve right in front of it.
One solution to these problems is to fine-tune an LLM so that it is “better behaved” at solving a specific problem. Fine-tuning involves collecting high-quality data that allows the model to be trained to follow specific behaviors, using techniques such as few-shot learning or Reinforcement Learning from Human Feedback (RLHF). Doing so, however, takes specialized skills: even assembling an appropriate dataset can be challenging, and ideally fine-tuning would be a periodic if not continuous process, which is difficult to achieve. Thus, managing a large number of fine-tuned models and keeping them up to date requires specialized skills, expertise, and resources.
Inherent Limitations
While LLMs can do amazing things, they cannot do everything. Their reasoning and math capabilities can fail. Even a task like text matching can be hard to get right with an LLM (or at any rate can be much more easily and reliably solved with a regular expression matcher). So they aren’t always the right tool to use, even for NLP problems. These issues are inherent to how LLMs work. They are probabilistic systems that are optimized for generating a sequence of words, and are thus ill-suited for tasks that are completely deterministic.
Sophisticated prompting techniques can be designed to work around some of these limitations. Another approach is to attempt to solve a multi-step problem that involves complex reasoning using agentic orchestration. A different approach would allow for a model pipeline where tasks suited to being solved using LLMs are routed to LLMs, while tasks suited to being solved via a deterministic system or regular expression matcher are routed to other models. Identifying and designing for these situations is a requirement to support a diversity of use cases.
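A rough sketch of that routing idea, with a regular expression handling a deterministic extraction task and a hypothetical `call_llm` stand-in handling open-ended ones:

```python
import re

# Sketch: route deterministic tasks to a regex matcher and open-ended
# tasks to an LLM, per the model-pipeline approach described above.

CONFIRMATION_CODE = re.compile(r"\b[A-Z]{6}\b")   # e.g., airline record locator

def route(task: str, text: str, call_llm):
    if task == "extract_confirmation_code":       # deterministic: regex
        match = CONFIRMATION_CODE.search(text)
        return match.group(0) if match else None
    return call_llm(f"{task}: {text}")            # open-ended: LLM

code = route("extract_confirmation_code",
             "My booking ABCDEF was cancelled.", call_llm=lambda p: "")
# code == "ABCDEF"
```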
Managing and running production systems
While vendors like OpenAI and Anthropic have made LLMs relatively cheap to run, they still rely on complex serving architecture and hardware. Many LLM hosting platforms are still in beta, and typical service level agreements (SLAs), when offered at all, promise far below 99.99% availability. This risk can be managed by adding fallbacks and other mechanisms, but doing so represents additional effort in building a production system.
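One such fallback mechanism can be sketched as a simple provider chain; the providers here are hypothetical callables, and a real system would add timeouts, retries, and backoff:

```python
# Sketch: try each LLM provider in order, falling through on failure.

def complete_with_fallback(prompt: str, providers):
    last_error = None
    for call in providers:
        try:
            return call(prompt)
        except Exception as err:    # provider outage, rate limit, etc.
            last_error = err
    raise RuntimeError("all LLM providers failed") from last_error

def flaky_primary(prompt):
    raise TimeoutError("primary unavailable")

answer = complete_with_fallback("hi", [flaky_primary, lambda p: f"backup: {p}"])
# answer == "backup: hi"
```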
And, in the end, building on LLMs is all software development and needs to follow standard software development processes. An LLM or even just a prompt is an artifact that has to be authored, evaluated, versioned, and released carefully just like any software artifact or ML model. You need observability, the ability to conduct experiments, auditability and so on. While this is all fairly standard, the discipline of MLOps introduces an additional layer of complexity because of the need to continuously monitor and tune for safety (including security concerns like prompt injection) and hallucinations. Additional resources need to be made available to handle such tasks.
A contact center example
Consider the problem of offering conversation summarization for all your conversations with customers, across voice and digital channels. The NLP problem to be solved here, that of abstractive summarization on multi-participant conversations with a potentially large number of turns, was difficult to solve as recently as 2021. Now, it is trivial to address using LLMs - writing a suitable prompt for an LLM to produce a summary that superficially looks high quality is easy.
But how good would the generated summaries be at scale as an enterprise solution? It is possible that the LLM might hallucinate and generate summaries that include completely fictitious information that’s not easily discerned from the conversation itself. But would that happen 1% of the time or 0.001% of the time, and how harmful would those hallucinations be? Factor in interaction volume: a 1% rate means 1 affected customer out of 100 interactions, but 1,000 out of 100,000. Evaluating the quality of the prompt, and optimizing it, would require the creation of an evaluation dataset. Detecting hallucinations and classifying them into different categories so that they can be mitigated would take extra effort, and preventing, or at least minimizing, the chances that they occur, even more so.
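The scale arithmetic above is worth making explicit; a tiny sketch, with a hypothetical helper name:

```python
# Sketch: the same hallucination rate implies very different absolute
# customer impact as interaction volume grows.

def affected_customers(interactions: int, hallucination_rate: float) -> int:
    return round(interactions * hallucination_rate)

print(affected_customers(100, 0.01))        # 1 out of 100 interactions
print(affected_customers(100_000, 0.01))    # 1,000 out of 100,000
```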
Summarization could be solved using just about any good LLM - but which one would provide the best balance between cost and quality? That’s not a simple question to answer - it would require the availability of an evaluation dataset, followed by a lot of trials with different models and applications of prompt optimization.
A one-size-fits-all summary output rarely meets the needs of an enterprise. For example, the line of business responsible for handling customer care may want the summary to explicitly include details on any promised follow-up in the case of a complaint. The line of business responsible for sales may want to include information about promotions offered or whether the customer purchased the product or service, and so on. Serving these requirements would mean managing intricate configurations, down to having different prompts for different use cases, or ideally fine-tuned LLMs that better serve the specific needs of these businesses. These requirements may change often, and so would need versioning, change management, and careful rollouts.
Summary quality would need to be monitored, and as changes in technology (such as improvements in models, inference servers or hardware) occur, things would need to be updated. Consider the availability of a new LLM that is launched with a lot of buzz in the industry. It would have to be evaluated to determine its effectiveness at summarization of the sort you are doing - which would mean updating the various prompts underlying the system, and checking this model’s output against the evaluation dataset, itself compiled from samples of data from a variety of use cases. Let’s say that it appears to produce higher quality summaries at a lower price on the evaluation dataset, and a decision is taken to roll it into production. This would have to be monitored and the expected boost in performance and reduction in price verified. In case something does go wrong (say it is cheaper and better…but takes unacceptably long and customers complain about the latency of producing summaries), it would need to be rolled back.
What about feedback loops? Perhaps summaries could be rated, or they could be edited. Edited summaries or highly rated ones could be used to fine tune a model to improve performance or lower cost by moving to a smaller, fine-tuned model.
This is not an exhaustive list of considerations - and this example is only about summarization, which is a ubiquitous, commodity capability in the LLM world. More complex use cases requiring agentic orchestration with far more sophisticated prompting techniques require more thought and effort to deploy responsibly.
Conclusions
Pre-trained foundational large language models have changed the paradigm of how ML solutions are built. But, as always, there’s no free lunch. Enterprises attempting to build from scratch using LLMs have to account for the total lifetime cost of maintaining such a solution from a business standpoint.
There is a point early in the deployment of an LLM-based solution where things look great: hallucinations haven’t been noticed, edge cases haven’t been encountered, and the value being derived is significant. But this can breed false confidence, and under-investing at this stage is a dangerous mistake. Without sufficient continued investment, running the solution in production without the necessary fallbacks, safeguards, and flexibility becomes an ever-present, non-linear risk. Going beyond prototyping is therefore harder than it might seem.
An apt analogy is architecting a data center. Purchasing commodity hardware, and using open source software for running servers, managing clusters, setting up virtualization, and so on are all possible. But putting that package together and maintaining it for the long haul is enough of a burden that enterprises would prefer to use public cloud providers in most cases.
Similarly, when choosing whether to build AI solutions or partner with vendors who have the experience deploying enterprise solutions, organizations should be clear-eyed about the choices they are making, understand the long-term costs and the tradeoff associated with them.
Getting through the last mile of generative AI agent development
There’s been a huge explosion in large language models (LLMs) over the past two years. It can be hard to keep up – much less figure out where the real value is happening. The performance of these LLMs is impressive, but how do they deliver consistent and reliable business results?
OpenAI demonstrated that very, very large models, trained on very large amounts of data, can be surprisingly useful. Since then, there’s been a lot of innovation in the commercial and open-source spaces; it seems like every other day there’s a new model that beats others on public benchmarks.
These days, most of the innovation in LLMs isn’t even really coming from the language part. It’s coming in three different places:
- Tool use: That’s the ability of the LLM to basically call functions. There’s a good argument to be made that tool use defines intelligence in humans and animals, so it’s pretty important.
- Task-specific innovations: This refers to innovations that deliver considerable improvements in some kind of very narrow domain, such as answering binary questions or summarizing research data.
- Multimodality: This is the ability of models to incorporate images, audio, and other kinds of non-text data as inputs.
New capabilities – and challenges
Two really exciting things emerge out of these innovations. First, it’s much easier to create prototypes of new solutions – instead of needing to collect data to make a machine learning (ML) model, you can just write a specification, or in other words, a prompt. Many solution providers are jumping on that quick prototyping process to roll out new features, which simply “wrap” an LLM and a prompt. Because the process is so fast and inexpensive, these so-called “feature factories” can create a bunch of new features and then see what sticks.
Second, making LLMs useful in real time relies on the LLM not leaning on its “intrinsic knowledge” – that is, what it learned during training. It is more valuable instead to feed it contextually relevant data, and then have it produce a response based on that data. This is commonly called retrieval augmented generation, or RAG. As a result, there are many companies making it easier to put your data in front of the LLM – connecting it to search engines, databases, and more.
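A minimal sketch of the RAG pattern, with a toy keyword retriever and a hypothetical `call_llm` stub standing in for a real search engine and model:

```python
# Sketch: retrieve contextually relevant documents, then have the LLM
# answer from that context rather than from its intrinsic knowledge.
# The keyword-overlap retriever is a toy; real systems use search
# engines or vector databases.

def retrieve(query: str, documents: list, k: int = 2) -> list:
    words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: -len(words & set(d.lower().split())))
    return scored[:k]

def rag_answer(query: str, documents: list, call_llm) -> str:
    context = "\n".join(retrieve(query, documents))
    return call_llm(
        f"Answer using only this context:\n{context}\n\nQuestion: {query}")

docs = ["The checked bag fee is $35.", "Boarding begins 40 minutes early."]
reply = rag_answer("What is the bag fee?", docs, call_llm=lambda prompt: prompt)
```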
The thing about these rapidly developed capabilities is that they always place the burden of making the technology work on you and your organization. You need to evaluate if the LLM-based feature works for your business. You need to determine if the RAG-type solution solves a problem you have. And you need to figure out how to drive adoption. How do you then evaluate the performance of those things? And how many edge cases do you have to test to make sure it is dependable?
This “last mile” in the AI solution development and deployment process costs time and resources. It also requires specific expertise to get it right.
Getting over the finish line with a generative AI agent
High-quality LLMs are widely available. Their performance is dramatically improved by RAG. And it’s easier than ever to spin up new prototypes. That still doesn’t make it easy to develop a customer-facing generative AI solution that delivers reliable value. Enabling all of this new technology - namely, LLMs capable of using contextually relevant tools at their disposal - requires expertise to make sure that it works, and that it doesn’t cause more problems than it solves.
It takes a deep understanding of the performance of each system, how customers interact with the system, and how to reliably configure and customize the solution for your business. This expertise is where the real value comes from.
Plenty of solution providers can stand up an AI agent that uses the best LLMs enhanced with RAG to answer customers’ questions. But not all of them cover that last mile: making everything work for contact center purposes, and work well enough that you can confidently deploy it to your customers without worrying about your AI agent mishandling queries and causing customer frustration.
Generative AI services provided by the major public cloud providers can offer foundational capabilities. And feature factories churn out a lot of products. But neither one gets you across the finish line with a generative AI agent you can trust. Developing solutions that add value requires investing in input safety, output safety, appropriate language, task adherence, solution effectiveness, continued monitoring and refining, and more. And it takes significant technical sophistication to optimize the system to work for real consumers.
That should narrow your list of viable vendors for generative AI agents to those that don’t abandon you in the last mile.
A new era of unprecedented capacity in the contact center
Ever heard the phrase, “customer service is broken”?
It’s melodramatic, right? Something a Southern lawyer might declaim with a raised finger. Regardless, there’s some truth to it, and the reason is a deadly combination of interaction volume and staffing issues: too many inbound interactions, too few people to handle them. The demands of scale do, in fact, break customer service.
This challenge of scaling up is a natural phenomenon. You find it everywhere, from customer service to pizza parlors.
Too much appetite, too little dough
If you want to scale a pizza, you have to stretch the dough, but you can't stretch it infinitely. There’s a limit. Stretch it too far, and it breaks.
Customer service isn’t exactly physical, but physical beings deliver it – the kind who have bad days, sickness, and fatigue. When you stretch physical things too far (like balloons, hamstrings, or contact center agents), they break. In contact centers, broken agents lead to broken customer service.
Contact centers are currently stretched pretty thin. Sixty-three percent of them face staffing shortages. Why are they struggling? Some cite rising churn rates year after year. Others note shrinking agent workforces in North America and Europe. While workers flee agent jobs for coding classes, pastry school, and duck farming, customer request volumes are up. In 2022, McKinsey reported that 61% of customer care leaders saw growth in total calls.
To put it in pizza terms (because why not?), your agent dough ball is shrinking even as your customers' insatiable pizza appetite expands.
What’s a contact center to do? There are two predominant strategies right now:
- reduce request volumes (shrink the appetite)
- stretch your contact center’s service capacity (expand the dough)
Contact centers seem intent on squeezing more out of their digital self-service capabilities in an attempt to contain interactions and shrink agent queues. At the same time, they’re feverishly investing in technology to expand capacity with performance management, process automation, and real-time agent support.
But even with both of these strategies working at full force, contact centers are struggling to keep up. Interaction volume continues to increase, while agent turnover carries on unabated. Too much appetite. Not enough dough to go around.
How do we make more dough?
Here’s the harsh reality – interaction volume isn’t going to slow down. Customers will always need support and service, and traditional self-service options can’t handle the scope and complexity of their needs. We’ll never reduce the appetite for customer service enough to solve the problem.
We need more dough. And that means we need to understand the recipe for customer service and find a way to scale it. The recipe is no secret. It’s what your best agents do every day:
- Listen to the customer
- Understand their needs
- Propose helpful solutions
- Take action to resolve the issue
The real question is, how do we scale the recipe up when staffing is already a serious challenge?
Scaling up the recipe for customer service
We need to scale up capacity in the contact center without scaling up the workforce. Until recently, that idea was little more than a pipe dream. But the emergence of generative AI agents has created new opportunities to solve the long-running problem of agent attrition and its impact on CX.
Generative AI agents are a perfect match for the task. Like your best human agents, they can and should listen, understand, propose solutions, and take action to resolve customers’ issues. When you combine these foundational capabilities into a generative AI agent to automate customer interactions, you expand your contact center’s capacity – without having to hire additional agents.
Here’s how generative AI tools can and should help you scale up the recipe for customer service:
- Generative AI should listen to the customer
Great customer service starts with listening. Your best agents engage in active listening to ensure that they take in every word the customer is saying. Transcription solutions powered by generative AI should do the same. The most advanced solutions combine speed and exceptional accuracy to capture conversations in the moment, even in challenging acoustic environments.
- Generative AI should understand the customer’s needs
Your best agents figure out what the customer wants by listening and interpreting what the customer says. An insights and summarization solution powered by generative AI can also determine customer intent, needs, and sentiment. The best ones don’t wait until after the conversation to generate the summary and related data. They do it in real time.
- Generative AI should propose helpful solutions
With effective listening and understanding capabilities in place, generative AI can provide real-time contextual guidance for agents. Throughout a customer interaction, agents perform a wide range of tasks – listening to the customer, interpreting their needs, accessing relevant information, problem-solving, and crafting responses that move the interaction toward resolution. It’s a lot to juggle. Generative AI that proposes helpful solutions at the right time can ease both the cognitive load and manual typing burden on agents, allowing them to focus more effectively on the customer.
- Generative AI should take action to resolve customers’ issues
This is where generative AI combines all of its capabilities to improve customer service. It can integrate the ingredients of customer care—listening, understanding, and proposing—to safely and autonomously act on complex customer interactions. More than a conversational bot, it can resolve customers’ issues by proposing and executing the right actions, and autonomously determining which backend systems to use to retrieve information and securely perform issue-resolving tasks.
Service with a stretch: Expanding your ball of dough
Many contact centers are already using generative AI to listen, understand, and propose. But it’s generative AI’s ability to take action based on those other qualities that dramatically stretches contact center capacity (without breaking your agents).
A growing number of brands have already rolled out fully capable generative AI agents that handle Tier 1 customer interactions autonomously from start to finish. That does more than augment your agents’ capabilities or increase efficiency in your workflows. It expands your frontline team without the endless drain of recruiting, onboarding, and training caused by high agent turnover.
A single generative AI agent can handle multiple interactions at the same time. And when paired with a human agent who provides direct oversight, a generative AI agent can achieve one-to-many concurrency even with voice interactions. So when inbound volume spikes, your generative AI agent scales up to handle it.
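The one-to-many concurrency described above is possible because most of an AI agent’s time in a conversation is spent waiting – on the customer, on model responses, on backend systems. A hypothetical sketch using Python’s asyncio shows the idea: one agent process interleaves many conversations rather than handling them serially. The function names and timings are invented for illustration.

```python
import asyncio

async def handle_interaction(conv_id: int) -> str:
    """Simulate one customer interaction; the sleep stands in for waits
    on the customer, the model, and backend systems."""
    await asyncio.sleep(0.01)
    return f"conversation {conv_id} resolved"

async def ai_agent(conversation_ids: list[int]) -> list[str]:
    """One AI agent serving many conversations concurrently."""
    tasks = (handle_interaction(c) for c in conversation_ids)
    return list(await asyncio.gather(*tasks))

# Ten concurrent conversations finish in roughly the time of one,
# because the agent works on each while the others are waiting.
results = asyncio.run(ai_agent(list(range(10))))
```

This is why an inbound spike doesn’t require a proportional spike in headcount: concurrency, not staffing, absorbs the load.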
More dough. More capacity. All without stretching your employees to the breaking point. For contact center leaders, that really might be as good as pizza for life.
Want more? Read our eBook on the impact of agent churn
Agent churn rates are historically high, and the problem persists no matter what we throw at it — greater schedule flexibility, gamified performance dashboards, and even higher pay.
Instead of incremental changes to timeworn tools, what if we could bypass the problem altogether?
Download the Agent Churn: Go Through It or Around It? eBook to learn why traditional strategies for agent retention aren't working, and how generative AI enables a radical new paradigm.