Conversational Agents
And the Ethical Risks of Human Manipulation
By Louis B. Rosenberg
Conversational agents are AI-powered systems that engage users in natural, interactive dialog. When the interactions are text-based, these systems are generally referred to as chatbots. When enabled through natural spoken language (via real-time voice processing and voice generation), they are more commonly called voice bots and can be deployed for a wide range of uses, including personal assistants and customer service representatives. When deployed with a simulated appearance through an avatar (either on flat screens or on immersive displays), they become embodied entities that are sometimes referred to as virtual spokespersons (VSPs) or virtual assistants [1]. When implemented with sufficient visual and audio fidelity, these interactive avatars can be indistinguishable from authentic human representatives [2].
Whether conversational agents communicate through text, audio, or a visual avatar, the conversational dialog is generally converted to text before processing by a Large Language Model (LLM). In addition, Multimodal Large Language Models (MLLMs) can now process audio and video streams in near real time to supplement language processing. This enables rapid analysis of a human user's emotional affect from audio cues (including vocal inflections) and visual cues (including facial expressions, body posture, and gestures). In this way, conversational agents are poised to engage humans naturally through convincing avatars that can express empathetic facial expressions and vocal inflections while reacting to the facial expressions, vocal inflections, body posture, and other cues of the engaged human.
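To make this pipeline concrete, the minimal sketch below shows, under stated assumptions, how a spoken turn might be reduced to text and combined with inferred affect cues before being handed to a language model. The field names and the helper function are hypothetical illustrations, not part of any specific MLLM product or API.

```python
# Illustrative sketch only: the fields and function below are hypothetical.
from dataclasses import dataclass

@dataclass
class AffectEstimate:
    """Emotional-affect cues inferred from supplemental audio/video streams."""
    vocal_tone: str         # e.g., "hesitant", "enthusiastic"
    facial_expression: str  # e.g., "furrowed brow", "smile"
    posture: str            # e.g., "leaning back", "leaning in"

def build_llm_turn(transcript: str, affect: AffectEstimate) -> str:
    """Combine the text transcript with affect cues before passing it to an LLM.

    Mirrors the pipeline described above: dialog is reduced to text, while
    multimodal cues are folded in as additional context for the model.
    """
    return (
        f"User said: {transcript}\n"
        f"Observed affect: tone={affect.vocal_tone}, "
        f"face={affect.facial_expression}, posture={affect.posture}\n"
        "Respond conversationally, adapting to the user's apparent emotional state."
    )

if __name__ == "__main__":
    cue = AffectEstimate("hesitant", "furrowed brow", "leaning back")
    print(build_llm_turn("I'm not sure I believe that.", cue))
```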
The Benefits: Conversational agents with this level of sophistication have significant potential to make computing more natural and intuitive, enabling us to interact with software-based systems using the most human of all skills – language. This, more than any advancement over the last 20 years, will enable mainstream computing to finally move away from traditional interface models based on flat screens and keyboards or touchscreens. It will also likely bolster the adoption of XR interfaces (AR, VR, MR) by enabling hands-free interactions with embodied agents through natural conversation, and it will transform traditional flat-screen computing. In the near future, it will become commonplace to embed conversational agents on websites to act as salespeople, customer service representatives, and the “friendly face” of critical software utilities such as search engines. In most cases, these interactions will likely be perceived as highly convenient and intuitive.
The Risks: While offering value, conversational agents also pose a unique threat to human agency, as they represent an interactive and adaptive form of media that can be used to impart targeted influence. While many forms of media can be used as tools of influence, conversational agents are unique in that they can easily target users with personalized content in real time. This creates new risks that extend beyond the dangers of traditional forms of media, including social media. As described below, it is helpful to formalize these risks in the context of Control Theory, as this can help stakeholders appreciate that interactive agents represent a very real threat that requires new safeguards, guidelines, and ethical use policies. Without enhanced protections, conversational agents could be deeply misused, easily crossing the line from deploying targeted influence to driving targeted manipulation.
All forms of media can be used to impart influence on individuals. Therefore, technology developers and business leaders need to be mindful of the consequences of misuse or abuse. Conversational agents are a unique form of media in that they can easily be designed to adapt their influence tactics during real-time conversations, optimizing their impact on individual users [3]. The AI Manipulation Problem formalizes this risk by identifying the four basic steps by which a conversational agent could be used to excessively influence a user. These steps create a “feedback loop” around the user as follows:
(i) Impart real-time targeted influence on an individual user through AI-crafted dialog.
(ii) Sense the user’s real-time reaction to the imparted conversational influence.
(iii) Adjust conversational tactics to increase the persuasive impact on the user.
(iv) Repeat steps (i), (ii), and (iii) to maximize influence effectiveness.
Any system that follows these four steps could create a “feedback loop” around the user in which the individual is repeatedly targeted, assessed, and re-targeted to progressively optimize the influence. In the field of engineering, this is typically called a “Feedback Control System.” As shown in Figure 1 below, the basic pieces of any control system include a System you aim to control, Sensors for detecting the system’s behavior, and a Controller that adjusts the influence on the system to optimize the desired impact. A classic example is a thermostat for maintaining a temperature goal in a house. If the temperature of the house (i.e., the System) falls below the goal, the thermostat turns the heat on. If the temperature rises above the goal, the thermostat turns the heat off. This simple feedback loop keeps the temperature close to the specified goal (i.e., imparts optimized influence).
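For readers unfamiliar with control loops, the toy simulation below makes the System / Sensor / Controller arrangement concrete. It is a minimal sketch only; the temperature dynamics and thresholds are arbitrary placeholders.

```python
# A minimal thermostat-style feedback controller, included only to make the
# generic System / Sensor / Controller loop concrete. Values are arbitrary.
def thermostat_step(measured_temp: float, goal_temp: float) -> str:
    """Controller: compare the sensor reading with the goal and adjust influence."""
    if measured_temp < goal_temp - 0.5:
        return "heat_on"
    if measured_temp > goal_temp + 0.5:
        return "heat_off"
    return "hold"

def simulate_house(hours: int = 8, goal_temp: float = 20.0) -> None:
    temp = 15.0  # the System's state (house temperature in °C)
    for hour in range(hours):
        action = thermostat_step(temp, goal_temp)      # Controller decision
        # System responds: heats while the heater is on, otherwise slowly cools.
        temp += 1.0 if action == "heat_on" else -0.5
        print(f"hour {hour}: temp={temp:.1f}°C, action={action}")

if __name__ == "__main__":
    simulate_house()
```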
The AI Manipulation Problem considers the case where the target system is a human user, the controller is a conversational agent, and the sensors are microphones and cameras that monitor the user’s response via language processing, supplemented with analysis of vocal inflections, facial expressions, and other physical cues. As shown in Figure 2 below, this “human control system” is essentially the same as a simple thermostat, but the Input is not a temperature goal but an Influence Objective to be imparted on the user. The AI agent will engage the user in interactive dialog, gradually adapting its conversational tactics based on the measured behavior of the system (i.e., the verbal responses of the user, potentially supplemented with the emotional content in the user’s vocal inflections, facial expressions, and body posture).
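A schematic sketch of this “human control system” is shown below, with the four steps (i–iv) marked in the code. The crafting, sensing, and adaptation helpers are stand-in stubs invented for illustration, not real APIs or any particular deployed system.

```python
# Schematic sketch of the closed influence loop described above.
# All helpers are toy stand-ins; no real LLM or sensing pipeline is used.
def craft_message(objective: str, tactic: str) -> str:
    # (i) Impart targeted influence via AI-crafted dialog (stubbed as a string).
    return f"[{tactic}] argument for: {objective}"

def sense_reaction(user_reply: str) -> float:
    # (ii) Sense the user's reaction; here, a toy "agreement" score in [0, 1].
    return 1.0 if "agree" in user_reply.lower() else 0.2

def adjust_tactic(current: str, agreement: float) -> str:
    # (iii) Adjust conversational tactics when the persuasion is not landing.
    if agreement > 0.5:
        return current
    return "emotional" if current == "factual" else "factual"

def influence_loop(objective: str, replies: list[str]) -> None:
    tactic = "factual"
    for reply in replies:                      # (iv) Repeat to maximize impact
        print(craft_message(objective, tactic))
        agreement = sense_reaction(reply)
        tactic = adjust_tactic(tactic, agreement)

if __name__ == "__main__":
    influence_loop("candidate X is untrustworthy",
                   ["I don't buy that.", "Hmm, maybe.", "I agree, that is concerning."])
```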
Consider a conversational agent deployed to convey misinformation about a political candidate. That AI agent will engage a target user in interactive dialog, likely adopting a conversational style that is custom-selected for the specific user based on stored personal data about that user. For example, one user might be targeted with an AI agent that speaks in a casual style and makes emotional arguments if that user’s personal data suggests this approach will be effective. Conversely, a different user might be targeted with more formal language and logical arguments if that user’s personal data suggests that will be effective.
With the style selected, the conversational content can then be custom-crafted to optimize impact on the target user, for example by referencing that user’s personal interests, profession, or political values. The user will react through a conversational response. The controller will assess the user’s reaction in real time, for example by determining whether their resistance to the influence is based on factual and/or emotional barriers. The controller will then adapt its tactics to overcome the resistance, offering custom-tailored counterpoints. This process repeats as the conversation continues, with the controller working to iteratively overcome objections and efficiently guide the user toward accepting the influence [2,3]. Such a controller could be deployed to persuade an individual user into beliefs or behaviors that he or she may not normally adopt through traditional media.
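The sketch below illustrates this personalization step under stated assumptions: a hypothetical stored profile selects the conversational style, and the assessed form of resistance selects the type of counterpoint. The profile fields and tactic categories are invented for illustration only.

```python
# Hypothetical illustration of the personalization described above.
# No real user data, model, or persuasion system is involved.
def select_style(profile: dict) -> str:
    # Personalization: casual/emotional vs. formal/logical, per stored traits.
    return "casual-emotional" if profile.get("responds_to_emotion") else "formal-logical"

def select_counterpoint(resistance: str) -> str:
    # Assess whether the pushback is factual or emotional and tailor the rebuttal.
    if resistance == "factual":
        return "cite tailored statistics and expert claims"
    if resistance == "emotional":
        return "appeal to the user's stated values and fears"
    return "ask an open question to draw out the objection"

if __name__ == "__main__":
    profile = {"responds_to_emotion": True, "profession": "teacher"}
    print(select_style(profile))          # -> "casual-emotional"
    print(select_counterpoint("emotional"))
```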
To make these dangers clear for XR environments, the 2023 short film Privacy Lost gives a quick fictionalized example of conversational manipulation through AI-powered XR glasses.
Some argue that conversational manipulation is not a new threat, as human salespeople already use interactive dialogue to influence customers by overcoming objections. While true, AI-powered systems are likely to be significantly more effective agents of manipulation. For example, a 2024 study performed at the Swiss Federal Institute of Technology found that when human users debated an AI agent powered by GPT-4, they had 81.7% (p < 0.01) higher odds of increased agreement with their opponent compared to participants who debated other humans [4]. In addition, AI agents could be trained on aggressive sales tactics, methods of persuasion, human psychology, and cognitive biases.
AI agents are also likely to have superhuman abilities to sense human emotions during interactive conversations. Already, AI systems can detect micro-expressions on human faces that are far too subtle for human observers [5]. Similarly, AI systems can read faint changes in human complexion known as facial blood flow patterns, as well as subtle changes in pupil dilation, to assess emotions in real time. In addition, these platforms are likely to store data from conversational interactions over time, tracking and analyzing which types of arguments and approaches are most effective on each user personally.
For example, the system could be designed to learn whether a target user is more easily swayed by factual data, emotional appeals, or by playing on their insecurities or fear of missing out. In other words, these systems will not only adapt to real-time emotions, but could also get progressively better at influencing the target user over time, potentially learning how to draw them into conversations, guide them into accepting new ideas, or convince them to buy things they don’t need, believe things that are untrue, or even support extreme policies or politicians that they would otherwise reject.
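The sketch below shows, in schematic form, the kind of per-user “persuasion profile” record-keeping described here, which is precisely what the safeguards discussed next could restrict or ban. The class, scoring scheme, and tactic labels are hypothetical illustrations, not a description of any deployed system.

```python
# Sketch of per-user storage of which persuasive tactics have worked before.
# Entirely hypothetical; included to make the stored-profile risk concrete.
from collections import defaultdict

class PersuasionProfileStore:
    def __init__(self) -> None:
        # user_id -> tactic -> cumulative observed effectiveness
        self._scores: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))

    def record(self, user_id: str, tactic: str, effectiveness: float) -> None:
        """Accumulate how well a tactic worked on a given user."""
        self._scores[user_id][tactic] += effectiveness

    def best_tactic(self, user_id: str, default: str = "factual") -> str:
        """Return the historically most effective tactic for this user."""
        tactics = self._scores.get(user_id)
        if not tactics:
            return default
        return max(tactics, key=tactics.get)

if __name__ == "__main__":
    store = PersuasionProfileStore()
    store.record("user42", "fear_of_missing_out", 0.8)
    store.record("user42", "factual", 0.3)
    print(store.best_tactic("user42"))  # -> "fear_of_missing_out"
```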
In these ways, the AI Manipulation Problem cautions technology developers, business leaders, and policymakers that human users will be increasingly vulnerable to targeted manipulation unless ethical guidelines and/or regulations are put in place to restrict or ban the use of real-time feedback loops in which AI agents can adapt and optimize their tactics during interactive conversations to maximize impact. Protections could also include banning the storage of data that characterizes the responsiveness of individual human users to different conversational styles and/or persuasive tactics, thereby preventing systems from learning over time how to best target individual users with conversational influence [6,7].
[1] Rosenberg, L. (2022) Regulation of the Metaverse: A Roadmap: The risks and regulatory solutions for largescale consumer platforms. In Proceedings of the 6th International Conference on Virtual and Augmented Reality Simulations (ICVARS '22). Association for Computing Machinery, New York, NY, USA, 21–26. https://doi.org/10.1145/3546607.3546611
[2] Rosenberg, L. (2023). The Metaverse as an AI-mediated Tool of Targeted Persuasion. In: Zallio, M. (Ed.), Human-Centered Metaverse and Digital Environments, AHFE (2023) International Conference. AHFE Open Access, vol 99. AHFE International, USA. http://doi.org/10.54941/ahfe1003938
[3] Rosenberg, L. (2023). The Manipulation Problem: Conversational AI as a Threat to Epistemic Agency. 2023 CHI Workshop on Generative AI and HCI (GenAICHI 2023). Association for Computing Machinery, Hamburg, Germany (April 28, 2023).
[4] Salvi, F., et al. (2024). On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial. arXiv:2403.14380.
[5] Li, X. et al. (2018). "Towards Reading Hidden Emotions: A Comparative Study of Spontaneous Micro-Expression Spotting and Recognition Methods," in IEEE Transactions on Affective Computing, vol. 9, no. 4, pp. 563-577, 1 Oct.-Dec. 2018, doi: 10.1109/TAFFC.2017.2667642.
[6] Wallace, C., Rosenberg, L., Pearlman, K., et al. (2023). Whitepaper: The Metaverse and Standards. Sydney, Australia: Standards Australia. https://www.standards.org.au/documents/h2-3061-metaverse-report
[7] Graylin, A., Rosenberg, L. (2024). Our Next Reality. Nicholas Brealey Publishing.