Monday, September 11, 2023

AI Emotional Detection and Response

This is a response to a question on Quora about the challenges of developing AI systems capable of dealing with human emotions. 

Going from the other answers [on Quora], I would say that the first obstacle might be getting people to believe it is possible. I am strongly of the opinion that it is operationally possible and conceptually a “slam dunk”. By ‘operationally’, I mean that we can train an AI to recognize the various signals that people use to convey their state of mind, including the temporal, social, geographical, and cultural context of those signals. We can do that in the same relatively well-established way we use GANs to work back and forth with photos, speech, and other types of input data. If we can assemble the data, we can create an AI that both recognizes emotional states and responds by expressing an appropriate emotional state of its own.

Going from what we have seen in the past year, such a system would likely take less than a year to train from the ground up, both to ‘read’ human emotions and to generate appropriate responses. It would become much better than humans at both. Other people answering here don’t seem to be aware that we are not working from the ground up. Work is already well underway.

Others answering here seem to believe that we are not as far along as we are with training and generative AI. The difficulties they describe are ones that would make the task hard for a human being. They also seem hung up on what the responding system is ‘thinking’, assuming that both reading and expressing emotions require a particular theory of mind. That assumption is open to question: “Theorists have suggested that emotions are canonical responses to situations ancestrally linked to survival.” (Kragel et al., 2019) And: “... it is possible to argue that infants can see emotions in others even though they lack the sort of knowledge that, in the theory of mind view, is necessary to see patterns of changes in the face as expressions of emotions.” (Zamuner, 2013)

AI can read emotions by reading facial features. It can interpret visual cues and vocal cues separately, and it does better with both: “... by combining both audio and visual features, the overall system accuracy has been significantly improved up to 80.27 %.” (Rashid et al., 2013) We reveal emotions through vocal cues, body language, facial expressions, language, and more. Cardiac activity changes with emotion, and wearable sensors are already used to measure it (Marín-Morales et al., 2018). If the sensory apparatus is available to read the cues, we can collect the data. If we can collect the data, we can use it to train AI. If we can train AI, we can train it to match and surpass the human ability to detect and respond to emotions.
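To make the idea of combining audio and visual cues concrete, here is a minimal sketch of one common approach, usually called late fusion: each modality produces its own probability for each emotion, and the two are blended with a weighted average. This is an illustration only and is not claimed to be the method used by Rashid et al. The label set, the `fuse_late` and `predict` names, the equal weighting, and the example probabilities are all assumptions made up for the sketch.

```python
from typing import Dict

# Hypothetical label set, chosen only for illustration.
EMOTIONS = ("anger", "happiness", "sadness", "surprise")


def fuse_late(
    audio: Dict[str, float],
    visual: Dict[str, float],
    audio_weight: float = 0.5,
) -> Dict[str, float]:
    """Blend per-emotion probabilities from two modalities (late fusion).

    audio_weight is an assumed tuning knob: 0.0 trusts only the visual
    model, 1.0 trusts only the audio model.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    fused = {
        label: audio_weight * audio[label] + (1.0 - audio_weight) * visual[label]
        for label in EMOTIONS
    }
    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("fused scores must sum to a positive value")
    # Renormalize so the fused scores form a probability distribution.
    return {label: score / total for label, score in fused.items()}


def predict(probabilities: Dict[str, float]) -> str:
    """Return the label with the highest probability."""
    return max(probabilities, key=probabilities.get)


# Made-up outputs from two hypothetical single-modality classifiers.
audio_probs = {"anger": 0.40, "happiness": 0.10, "sadness": 0.35, "surprise": 0.15}
visual_probs = {"anger": 0.20, "happiness": 0.05, "sadness": 0.60, "surprise": 0.15}

fused = fuse_late(audio_probs, visual_probs)
print(predict(audio_probs), predict(visual_probs), predict(fused))
```

In this made-up case the voice alone points to anger, while the face and the fused result both point to sadness. In practice the weight would be tuned on held-out data rather than fixed at 0.5.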

We’ve passed the ‘can we do it’ point on this journey. We are now travelling through ‘how much cheaper, faster, and better can we do it’. Soon, we will arrive at ‘we can do it for free, instantly, and better than we can measure’.

References

Marín-Morales, J., Higuera-Trujillo, J.L., Greco, A. et al. Affective computing in virtual reality: emotion recognition from brain and heartbeat dynamics using wearable sensors. Sci Rep 8, 13657 (2018).

Kragel, P.A. et al. Emotion schemas are embedded in the human visual system. Sci. Adv. 5, eaaw4358 (2019). DOI: 10.1126/sciadv.aaw4358

Rashid, M., Abu-Bakar, S.A.R. & Mokji, M. Human emotion recognition from videos using spatio-temporal and audio features. Vis Comput 29, 1269–1275 (2013).

Zamuner, E. The Role of the Visual System in Emotion Perception. Acta Anal 28, 179–187 (2013).
