What’s All the Fuss About? Seeing AI
A new app that says what it sees
What is it?
A new iPhone app for blind and visually impaired people that describes the world around them. Built by Microsoft, it uses artificial intelligence (AI) to narrate what the phone’s camera is seeing. The company’s ambition is to “turn the visual world into an audible experience”.
Does it recognise people?
Yes, if you save them to a list of contacts. For strangers, it tries to identify their gender, age and even their emotion. In the main screenshot, for example (right), Seeing AI recognises the subject as a “28 year old female wearing glasses looking happy”. Spot on.
What else can it recognise?
Objects, short text (such as the address on an envelope or a street sign), long documents, currency and product barcodes (to identify what the product is). It has several clever tricks that make it easier to use, including audio cues when you’re looking for a barcode on a product, which beep more quickly as the barcode comes into view. Also, when you scan a document, you’re told where to position it so no words are cut off, something OCR (optical character recognition) software lacks.
But perhaps the most useful tool is the “experimental” scene description, which tells you what’s happening in a photo you’ve taken.
How accurate is that?
Fairly accurate, at least according to an online demonstration by Saqib Shaikh, a software developer for Microsoft who lost his sight aged seven. To watch his video visit www.snipca.com/24947, then scroll down and click ‘Scene Demo’. Saying that the app feels like “science fiction”, Shaikh shows us how the app can recognise “a man sitting on a couch using a laptop” and “a bus that is parked on the side of a road”. More impressively, another video on the site shows the app recognising “a young girl throwing a Frisbee in the park” (see screenshot left). Spot on again.
Is it free?
Yes, unlike similar apps such as Aipoly (http://aipoly.com), which costs £4.99 a month. But it’s not yet available in the UK, only in the US, Canada, Hong Kong, India, New Zealand and Singapore.
Will there be an Android version?
Microsoft won’t say, but it would be crazy to ignore hundreds of millions of Android users, particularly as its main tech rivals – Google, Apple and Facebook – are also investing heavily in AI. All three are working on tools that help the blind and visually impaired.
What are they doing?
Google’s ‘Show and Tell’ technology (www.snipca.com/24958), available for app developers, generates captions for photos. Apple’s screen reader VoiceOver (www.snipca.com/24950) tells you what’s on your iPad and iPhone when you tap the screen in a particular way (actions called ‘gestures’). Facebook is using ‘automatic alternative text’ (www.snipca.com/24955) to describe the contents of photos posted to the site. These could improve the lives of millions of people, but Microsoft’s ambition for AI is bigger.
How so?
It wants to secure the future of the planet, no less, having also launched ‘AI for Earth’ (www.microsoft.com/en-us/aiforearth), a project “empowering people and organizations to solve global environmental challenges”. But its aims are not entirely selfless. The company that dominates AI in future will make billions. Recognising frisbees is just the start.
Seeing AI describes the world for the blind – and it feels like science fiction