Generative AI for Agent Augmentation: Agent Models, not Language Models
Generative AI and Large Language Models (LLMs) have made massive waves in consumer and enterprise technology news thanks to the remarkable capabilities of tools like ChatGPT and GPT-4. These models produce fluent text that integrates reasoning into its responses, drawing on the knowledge absorbed from the huge volume of general documents they are trained on. Not surprisingly, their use in chatbots has been one of the most active development areas for enterprise businesses.
While increasing the containment of calls can save call centers money, the primary cost is still front-line agents, who must handle the most challenging calls and customers (otherwise a bot would have handled them). At ASAPP, we already have many tools that assist the agent during a live call. AutoCompose helps agents craft messages and significantly increases throughput in the call center while increasing CSAT in tandem. AutoSummary helps automate dispositioning steps for agents. Both use Generative AI models, and have in some cases for approximately five years.
However, agents spend their time doing much more than just writing messages to customers. They must execute actions on the customer’s behalf (e.g., change a seat on a flight or schedule a technician visit) and follow flows and instructions in knowledge base articles to stay compliant when handling issues touching safety or business regulations. To do this, agents use a large number of tools. These tools are rarely homogeneous; they are a frankenstack of vendors and user interfaces. On top of that, agents handling digital calls are usually managing more than one issue at a time, which leads to a huge number of applications open at once. Any model that focuses only on the text of a conversation, and not on all the actions the agent is executing, leaves a huge amount of value on the table. For many of our customers, agents can spend upwards of 60% of their time in tools outside of the conversation!
Thus, to truly augment the agent, a model must not just be a Language Model – it must be an Agent Model. That is, it needs to be a multimodal model that operates not just on the text of the conversation, but also on all the information the agent is currently interacting with, as well as information hidden in business documents and logic that is salient to the issues at hand. At ASAPP, we have already invested in understanding the data stream of all agent actions, and we have used that data stream to build multimodal models that improve augmentation for the agent. There is a powerful synergy in this data. First, conditioning on the agent action data stream improves our predictions of what the agent should say and do next. Conversely, information from the conversation informs what actions the agent should take. For example, ‘I need to book a flight from New York to San Fran tomorrow in the morning’ allows the model to predict a flight search action, populate the origin with ‘New York’, the destination with ‘San Francisco’, and the date with tomorrow’s date, and then execute that command.
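To make the flight-search example concrete, here is a minimal sketch of what the model's output might look like as a structured action with filled slots. All names here (`FlightSearchAction`, `predict_action`, the alias table) are hypothetical illustrations, not ASAPP APIs, and the rule-based extraction is a toy stand-in for what the Agent Model learns to do: spot the intent, normalize city mentions like 'San Fran', and resolve relative dates like 'tomorrow'.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical structured action an Agent Model might emit alongside
# its next-message suggestion.
@dataclass
class FlightSearchAction:
    origin: str
    destination: str
    depart_date: date

# Normalize informal city mentions to canonical names.
CITY_ALIASES = {"san fran": "San Francisco", "new york": "New York"}

def predict_action(utterance: str, today: date) -> Optional[FlightSearchAction]:
    """Toy rule-based stand-in for a learned action-prediction head."""
    text = utterance.lower()
    if "book a flight" not in text:
        return None  # no booking intent detected
    # Pull out the "from X to Y" span of the utterance.
    _, _, rest = text.partition("from ")
    origin_raw, _, dest_raw = rest.partition(" to ")
    dest_raw = dest_raw.split(" tomorrow")[0]
    origin = CITY_ALIASES.get(origin_raw.strip(), origin_raw.strip().title())
    destination = CITY_ALIASES.get(dest_raw.strip(), dest_raw.strip().title())
    # Resolve the relative date against the current day.
    depart = today + timedelta(days=1) if "tomorrow" in text else today
    return FlightSearchAction(origin, destination, depart)

action = predict_action(
    "I need to book a flight from New York to San Fran tomorrow in the morning",
    today=date(2023, 5, 1),
)
print(action)
# FlightSearchAction(origin='New York', destination='San Francisco',
#                    depart_date=datetime.date(2023, 5, 2))
```

In a real system, the extraction would be done by the multimodal model itself, conditioned on both the conversation and the agent's current screen state; the structured action is what makes the result executable against a downstream tool rather than just a text suggestion.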
Varying levels of experience with internal tools affect how consistently agents solve customer problems. We commonly see less-tenured representatives reaching out to their colleagues more often after getting stuck in an internal tool, spending more time searching for knowledge base articles, and switching back and forth between screens more often when handling workflows. Agent Models can help newer agents become more comfortable and guide them to use their tools more effectively.
A core aspect of ASAPP’s mission is to ‘multiply agent productivity’. This can only be fully achieved with Agent Models, not just Language Models.