Computers as marriage counsellors
Analysis of speech patterns can tell researchers a great deal about the feelings underlying the way we speak
In the study, human experts made predictions using psychological assessment based on the vocal (and other) attributes — including the words spoken and body language. Surprisingly, their prediction of the eventual outcome (they were correct in 75.6 per cent of the cases) was inferior to predictions made by the AI based only on vocal characteristics (79.3 per cent). Clearly there are elements encoded in the way we speak that not even experts are aware of. But the best results came from combining the automated assessment with the experts’ assessment (79.6 per cent correct).
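The idea of combining expert and automated assessments can be illustrated with a minimal sketch. This is not the researchers’ actual method; it simply averages two hypothetical sets of outcome probabilities and scores them against known outcomes (all names and numbers here are invented for illustration):

```python
# A minimal sketch (not the study's method) of combining expert and
# automated predictions by averaging their estimated probabilities.

def combine_predictions(expert_probs, model_probs, weight=0.5):
    """Weighted average of two probability lists (values in 0..1)."""
    return [weight * e + (1 - weight) * m
            for e, m in zip(expert_probs, model_probs)]

def accuracy(probs, labels, threshold=0.5):
    """Fraction of cases where the thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Invented example: three couples, label 1 = relationship improved
expert = [0.8, 0.4, 0.6]
model = [0.7, 0.3, 0.2]
labels = [1, 0, 0]

combined = combine_predictions(expert, model)
print(accuracy(combined, labels))
```

An ensemble like this can beat either source alone when the two make different kinds of mistakes, which is one plausible reading of why the combined figure (79.6 per cent) edged out both individual ones.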
The significance is not so much about involving AI in marriage counselling or getting couples to speak more nicely. The significance is revealing how much information about our underlying feelings is encoded in the way we speak — some of it completely unknown to us.
Words on a page or screen have lexical meanings derived from their dictionary definitions. These are modified by surrounding words. There can be great complexity in writing. But when words are spoken aloud, they take on additional meanings conveyed by word stress, volume, speaking rate and tone. In a typical conversation there is also meaning in how long each speaker talks for, and how quickly one or other might interject.
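Two of the vocal attributes mentioned above, volume and tone (pitch), can be measured directly from an audio signal. The following is a minimal sketch, using NumPy and a synthetic tone in place of recorded speech, of how loudness (RMS energy) and a rough pitch estimate (via autocorrelation) might be extracted; real speech-analysis pipelines are far more sophisticated:

```python
import numpy as np

def rms_energy(signal):
    """Overall loudness of the signal (root-mean-square energy)."""
    return float(np.sqrt(np.mean(np.square(signal))))

def estimate_pitch(signal, sample_rate):
    """Rough fundamental-frequency estimate via autocorrelation."""
    sig = signal - np.mean(signal)
    # Autocorrelation; keep only non-negative lags
    corr = np.correlate(sig, sig, mode="full")[len(sig) - 1:]
    # Skip the lag-0 peak: find where the curve first starts rising again,
    # then take the strongest peak after that point as the pitch period
    rising = np.nonzero(np.diff(corr) > 0)[0]
    if len(rising) == 0:
        return 0.0
    period = rising[0] + np.argmax(corr[rising[0]:])
    return sample_rate / period

# Synthetic 200 Hz tone standing in for a voiced speech segment
sr = 8000
t = np.arange(0, 0.5, 1 / sr)
tone = 0.5 * np.sin(2 * np.pi * 200 * t)
print(estimate_pitch(tone, sr))  # close to 200 Hz
print(rms_energy(tone))          # about 0.354 for a 0.5-amplitude sine
```

Features like these, tracked over the course of a conversation, are the kind of raw material an automated system can work with.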
Consider the simple question “Who are you?” Try speaking this with stress on different words: “WHO are you?”, “Who ARE you?” and “Who are YOU?”. The semantic meaning can change with how we read even when the words stay the same.
Computers reading ‘leaking senses’
It is not surprising that words convey different meanings depending on how they are spoken. It is also not surprising that computers can interpret some of the meaning behind how we choose to speak (maybe one day they will even understand irony).
But this research takes matters further than just looking at the meaning conveyed by a sentence. It seems to reveal underlying attitudes and thoughts that lie behind the sentences. This is a much deeper level of understanding.
The therapy participants were not reading words like actors. They were just talking naturally — or as naturally as they could in a therapist’s office. Yet the analysis revealed information about their feelings that was “leaking” inadvertently into their speech. This may be one of the first steps in using computers to determine what we are really thinking or feeling. Imagine for a moment conversing with future smartphones — will we “leak” information that they can pick up? How will they respond?
Could they advise us about potential partners by listening to us talking together? Could they detect a propensity to antisocial behaviour, violence or depression?
Don’t worry just yet: such a future is years away. But it does raise privacy issues, especially as our interactions with computers deepen even as the machines themselves grow more powerful.
Consider the other senses apart from sound (speech); perhaps we also leak information through sight (such as body language and blushing), touch (temperature and movement) or even smell (pheromones). If smart devices can learn so much by listening to how we speak, one wonders how much more they could glean from the other senses.