Object recognition has come a long way from the days of Optical Character Recognition being used to recognise text characters.
Nowadays, we are using computers to recognise all sorts of things, like voice commands, images or the contents of legal documents.
However, one of the most fascinating applications of improved image recognition appears in the video above, in which graduate student Joseph Redmon talks about his new object detection system called YOLO (You Only Look Once), which is built on an open source neural network framework called Darknet.
When he began working in this field, a system would take at least 20 seconds to analyse an image and could only estimate with low accuracy whether a specific object was in it.
This would be useless for anyone trying to analyse a video, especially where objects move around the screen. Speed matters for any system that has to track objects in real time, for example one responsible for the safe operation of a self-driving car.
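To put that gap in numbers, here is a rough back-of-the-envelope sketch. The 20 seconds per image comes from the talk as described above; the frame rate of 30 frames per second is an assumption for illustration, and the function name is hypothetical.

```python
def slowdown_factor(seconds_per_image: float, frames_per_second: float) -> float:
    """Return how many times slower than real time the analysis runs.

    One second of video contains `frames_per_second` images, and each one
    takes `seconds_per_image` to analyse.
    """
    return seconds_per_image * frames_per_second


SECONDS_PER_IMAGE = 20.0  # from the article: "at least 20 seconds" per image
ASSUMED_FPS = 30.0        # assumption: a common video frame rate

factor = slowdown_factor(SECONDS_PER_IMAGE, ASSUMED_FPS)
minutes_per_video_second = factor / 60.0

print(f"About {factor:.0f}x slower than real time")
print(f"About {minutes_per_video_second:.0f} minutes to analyse one second of video")
```

Under these assumptions, a single second of footage would take around ten minutes to process, which is why the older systems could not keep up with moving objects.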
However, as machine learning has been enhanced and refined in the past few years, the performance of these object recognition systems has skyrocketed.
The YOLO system is now able to track and identify a wide variety of objects in real time, using only the processing power of a modern laptop computer.
In the latest release, there is a slimmed-down working version that can run on just a smartphone, making the system accessible to a far greater range of people and researchers.
And because it is a “general purpose” system, it can be trained to identify certain types of objects faster and more effectively than a human counterpart.
For example, the system is already being used to identify cancer cells in medical images in real time, and to track animals in Kenya from video footage.
And in a few years, systems like this will be able to sort through visual information significantly faster than any human expert. This may lead to certain types of human jobs becoming redundant as they are replaced by this form of artificial intelligence, but it could also free thousands of workers from manual data classification to focus on more value-adding work.
For a view of how the latest version of the system (version 3) works, check out this additional video below.