Weekend Herald

Noooooooo! Social media’s stretchable words confuse computers

- Michelle Dickinson

It’s been almost 30 years since the world was introduced to the internet through the world wide web, and in this short time it has had a profound impact on our everyday lives. The way that we keep in touch with friends and family across the world has changed significantly, and with it the way that we use language with them.

Lots of new words have been introduced with technology, such as “unfriending” and “photobombing”. We have also appropriated existing vocabulary to mean something different, such as “tablet”, “wireless” or “cloud”. This week new research looked into the relatively new world of stretchable words and how they have become part of our everyday communication through social media.

Stretchable words are those used to emphasise a regular word and are often used when communicating by text message or through social media. While rarely used in formal writing, many of us will have written text messages that include “hahahahaha” to emphasise that something was funny, or “gooooooaaaaaalll” when our favourite team scores in a sports game.

Dr Michelle Dickinson, creator of Nanogirl, is a nanotechnologist who is passionate about getting Kiwis hooked on science and engineering. Tweet her your science questions @medickinson

Though we as humans may be able to guess what is meant by these words, there is generally no correct or defined spelling of stretched words, making it very difficult for artificial intelligence algorithms to be programmed to recognise them.

How can you programme a computer to know that “suuuuuure” might imply sarcasm whereas “yeeeeeessss” might imply excitement? This is especially difficult when the number of uuuuu’s or eeeeee’s in each stretched word is often determined by how we are feeling at the time of writing.

The journal PLOS ONE has published one of the most comprehensive studies of stretchable words in social media to help create automated ways of identifying and analysing them. This new recognition method was applied to more than 100 billion tweets published over eight years to see if there was a pattern in the way that we stretch our words online.

The study characterised stretchable words using two key measures: their balance and their stretch. Balance refers to the degree to which different letters are repeated. For example, when laughing online we may write “ha”, “haha” or “hahaha”, where we repeat the h and the a equally, whereas the word “no” can be emphasised by writing “noooooooo”, where the balance is unequal because only the letter o is repeated.

Stretch refers to how long a word is typically stretched. Short words tended to be stretched more, and people often repeated them many times, as in “hahahahaha”, while other words like “huuuuuuuge” typically had just one letter repeated.
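To make the two ideas concrete, here is a toy Python sketch that compares a single stretched spelling with its regular form. It is not the study's method, which looks for patterns across billions of tweets. The function names (`repeat_profile`, `balance`, `stretch`) and the simple ratios they compute are assumptions made purely for illustration, and the sketch assumes the stretched spelling uses the same letters as the regular word.

```python
from collections import Counter


def repeat_profile(stretched: str, base: str) -> dict[str, float]:
    """For each letter of the regular word, how many times it appears in
    the stretched spelling per occurrence in the regular word."""
    stretched_counts = Counter(stretched.lower())
    base_counts = Counter(base.lower())
    return {ch: stretched_counts[ch] / n for ch, n in base_counts.items()}


def balance(stretched: str, base: str) -> float:
    """Illustrative balance score (hypothetical, not the study's formula):
    1.0 when every letter is repeated equally, closer to 0 when a single
    letter does most of the stretching."""
    ratios = repeat_profile(stretched, base).values()
    return min(ratios) / max(ratios)


def stretch(stretched: str, base: str) -> float:
    """Illustrative stretch score (hypothetical): how many times longer
    the stretched spelling is than the regular word."""
    return len(stretched) / len(base)


if __name__ == "__main__":
    examples = [
        ("hahahaha", "ha"),
        ("n" + "o" * 8, "no"),
        ("h" + "u" * 7 + "ge", "huge"),
    ]
    for stretched_word, regular_word in examples:
        print(
            stretched_word,
            balance(stretched_word, regular_word),
            stretch(stretched_word, regular_word),
        )
```

In this sketch “hahahaha” gets a balance of 1.0 because the h and the a are repeated equally, while “no” written with eight o’s gets 0.125 because only the o is stretched.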

If we are ever going to get to the point where computers and artificial intelligence can understand the range of communication that people use day to day then being able to model how humans stretch words is important. This research shows how much more work there still is to do around understanding how humans modify our language online and how few rules we apply when doing it.

Studies like this bring us closer to a world where computers can quantify and translate words, including stretched words and newly appropriated words, in a way that machines can understand, by developing tools to improve natural language processing, search engines and spam filters.

While future linguistic trends and the effect technology will have on them are still unknown, what is apparent is that it will still take a very loooooong time for computers to understand what we mean.
