The problem of unknowns in Artificial Intelligence

In 2017, The Guardian reported on how Volvo’s self-driving cars failed to avoid kangaroos during road tests in Australia. Volvo – which is headquartered in Sweden – initially tested the technology in a local setting. As a result, their cars recognized animals native to Sweden (such as moose). However, once the cars were driving in a new setting like Australia, unforeseen elements started popping up, such as kangaroos.

The problem of unknowns

The Volvo anecdote above illustrates a major problem with current artificial intelligence (AI): the problem of unknowns. Typically, AI systems* learn to recognize things by looking at a lot of examples. For instance, if we want a system to recognize a cat, it needs to be shown thousands if not millions of images of cats. Afterwards the system is (hopefully) able to recognize images of cats it has never seen before.

In the case of Volvo, the AI system probably looked at a lot of images and movement patterns of moose. Then, while driving on actual Swedish roads, the cars were able to recognize and effectively respond to moose. However, because kangaroos were initially not on the minds of Volvo’s R&D team based in Sweden, they did not show their system enough examples of kangaroos. Consequently, kangaroos were quite literally unknown to the cars.
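To make this concrete, here is a minimal toy sketch in Python of a system that learns from labeled examples. It is not Volvo's system: the two features (body height and typical speed), the numbers, and the function names are all made up for illustration. The point is only that a system trained on moose and deer has no way to answer "kangaroo"; it can only pick the closest label it already knows.

```python
import math

# Hypothetical labeled examples: (body height in m, typical speed in km/h).
# All numbers are invented for illustration only.
TRAINING_EXAMPLES = {
    "moose": [(2.0, 20.0), (1.9, 25.0), (2.1, 22.0)],
    "deer": [(1.0, 40.0), (1.1, 45.0), (0.9, 42.0)],
}


def centroid(points):
    """Average a list of feature tuples into one 'typical' example."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def train(examples):
    """'Learning' here is just remembering the average example per label."""
    return {label: centroid(points) for label, points in examples.items()}


def classify(model, features):
    """Return the closest known label and the distance to it."""
    label = min(model, key=lambda name: math.dist(model[name], features))
    return label, math.dist(model[label], features)


model = train(TRAINING_EXAMPLES)

# A moose-like input lands close to the learned moose average.
print(classify(model, (2.0, 21.0)))

# A kangaroo-like input still gets one of the known labels ("deer" here),
# only at a much larger distance. "Kangaroo" is simply not an option.
print(classify(model, (1.5, 60.0)))
```

The large distance for the second input hints that something is off, but the system only notices if someone thought to check for it. Real systems are far more complex than this sketch, yet the same limitation applies: what was never in the examples cannot be in the answers.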

The problem of unknowns is certainly not unique to Volvo; it is in fact quite common in the field of AI. For example, Facebook researchers reported that five of the most popular public image recognition systems failed to recognize common household objects (for example, soap and spices) from non-Western countries. Google Health software that screens people for diabetic retinopathy performed considerably worse in screening rooms in Thailand because of unfamiliar lighting conditions. Similar systems also performed worse with less drastic switches in environment, such as between Seattle and Atlanta. In another case, the passport renewal photo of a New Zealand man of Asian descent was rejected because the AI system was not familiar enough with the eyes of people of Asian descent.

Tackling the problem of unknowns

Tackling the problem of unknowns in AI is very hard. The world is vast and diverse, so finding all relevant unknowns with limited time and resources requires a lot of expertise. The question therefore becomes: how do AI companies efficiently and thoroughly search for these unknowns? That question is the central theme of this blog.

* When I talk about AI systems on this blog, I mean supervised learning systems with at least some labeling involved (unless specified otherwise).