Seeing AI: Translating sight into sound
Artificial intelligence is all around us. AI, as it is popularly known, tries to simulate human intelligence through a combination of machines and software-based algorithms. The next time the Uber app on your iPhone or Android phone tries to predict where you want to go, or the Google Assistant on your phone suggests the best Mediterranean-cuisine restaurants, remember that this is AI at work. Like Google, Apple, Amazon, Facebook and the rest, Microsoft is betting big on AI to make life easier for users. One step in that direction is Seeing AI, a free app that is now available for the Apple iOS platform.
With it, Microsoft has integrated several features that translate the visual world into a stream of audible information for those who may not be able to see well. The user simply points the phone’s camera at whatever needs deciphering.
First among the many features is Short text, which identifies and reads aloud any piece of text in front of the phone’s camera. It could be notes, advertisements, documents, even flyers. It works well, though we noticed that when we pointed it at a complex layout, it would sometimes pick out just the first segment and fail to identify the rest. This was particularly true of text spread across multiple boxes.
Second, if you want to photograph a document, the app will identify the document formatting and offer voice-based guidance on how to frame the shot. You will be told, for example, whether to move the camera left or right to capture the entire document cleanly.
Among other features, you will be able to use the app to describe people (it will be able to tell you their gender, age, even the mood they are in at the time). During testing, it could tell us the gender and age of the person in the frame, but not how far the face was from the camera.
Seeing AI will also be able to scan barcodes and identify products.
The app uses neural networks to identify whatever you point the phone camera at, much like the algorithms used in autonomous and semi-autonomous cars.
Some features of the app, such as describing a scene or decoding handwriting, require cloud connectivity. Unlike face detection or document scanning, these features will not work if your internet connection is absent or inconsistent. The scene feature is still experimental, and doesn’t always offer the level of detail you might expect from an AI solution. For instance, a rather busy shot of our newsroom elicited only “it seems to be indoor” from Seeing AI, with no description of the televisions, computers, books, plants and chairs in the very same frame. Future updates should be able to tackle some of this.
One feature expected soon is the ability to identify currency notes while paying in cash. This could be handy, though we suspect it will be limited to certain countries initially.
The Seeing AI app is currently available for Apple iPhone, iPad and iPod Touch users in the US, Canada, India, Hong Kong, New Zealand and Singapore. There is no word yet on when, or whether, the app will be made available for Android phones.