Where are the Chopsticks? Vietnam

Image recognition (IR) systems often perform poorly once deployed in the real world. In this post, I test four of the most popular IR systems on original real-world images of food from around the world, this time from Vietnam.

Key takeaway

Overall, the systems’ performance was disappointing. Object detection failed miserably across all four systems. Labeling worked slightly better: rice was labeled in the first meal, but generally not in the other two. Overall, the systems missed many nuances and produced predictions that could harm religious people and vegetarians/vegans.

Correctly predicted images: 0/4
Correctly detected items: 1/16
Correct labels: 9/95
Potentially harmful detections/labels: 0
The above table includes only detections and labels of 80%+ confidence level, for lower confidence levels see the tables further below.
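
The 80% cutoff can be applied mechanically. The following is a minimal sketch, not the tooling used for this post: it assumes each system's output has been collected as (label, confidence) pairs, and the helper name `filter_by_confidence` is hypothetical. The sample data is the Rekognition label list for the second meal, copied from the results tables further below.

```python
from typing import List, Tuple

Prediction = Tuple[str, float]  # (label, confidence score between 0 and 1)


def filter_by_confidence(predictions: List[Prediction],
                         threshold: float = 0.80) -> List[Prediction]:
    """Hypothetical helper: keep predictions whose confidence is >= threshold."""
    return [(label, score) for label, score in predictions if score >= threshold]


# Rekognition labels for the second meal, copied from the results tables.
rekognition_meal_2 = [
    ("Dish", 0.95), ("Meal", 0.95), ("Food", 0.95), ("Plant", 0.94),
    ("Bowl", 0.93), ("Spoon", 0.90), ("Cutlery", 0.90),
    ("Vegetable", 0.71), ("Produce", 0.69), ("Seasoning", 0.57),
]

confident = filter_by_confidence(rekognition_meal_2)
print(len(confident), "of", len(rekognition_meal_2), "labels are at 80%+ confidence")
```

Deciding which of the remaining labels count as correct still requires comparing them with the ground truth by hand, which this sketch does not attempt.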

Insights

The object detection features performed very poorly on the selected images: only one piece of carrot and one spoon were correctly identified (by Azure and Rekognition, respectively). Other items remained largely undetected (especially by Rekognition) or were described too generally.

As usual, the labeling features identified more items, but were also severely lacking in detail. For the first meal, the rice and vegetables were detected, though only Vision detected both. Nuances such as “fried vegetables” or “mixed vegetables” were not captured. The soup was detected by Vision and Watson, though neither recognized it as “Bok Choy Soup”. Rekognition was able to label a spoon, which the other systems missed. The chopsticks were not labeled.

For the second meal, rice was detected only by Vision, which is surprising as rice seemed easily detectable in the previous meal and in other cases. Perhaps this was due to the rather unusual, cut-off position of the rice in the image, though this is just a guess. As is common, Vision provided more labels, but this led to clear (cultural) misrepresentations (e.g. labels of “Guk” and “Sauerkraut”). One of Vision’s misrepresentations could, in some contexts, also cause harm to religious people or vegetarians/vegans: mock meat was labeled as meat with a moderately high confidence score (0.73).
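
The rice comparison for this meal can be reproduced with a simple coverage check. This is only a sketch under stated assumptions: the function name `systems_that_labeled` is hypothetical, matching is a plain case-insensitive substring test, and the Vision list is an excerpt of the labels reported in the results tables rather than the full output.

```python
from typing import Dict, List


def systems_that_labeled(term: str, labels_by_system: Dict[str, List[str]]) -> List[str]:
    """Hypothetical helper: systems with at least one label containing `term`."""
    needle = term.lower()
    return [
        system
        for system, labels in labels_by_system.items()
        if any(needle in label.lower() for label in labels)
    ]


# Labels for the second meal, copied from the results tables (Vision abbreviated).
meal_2_labels = {
    "Azure": ["Food", "Plate", "Table", "Salad", "Broccoli", "Container",
              "Dinner", "Cruciferous Vegetable"],
    "Vision": ["Food", "Tableware", "Ingredient", "White Rice", "Jasmine Rice",
               "Glutinous Rice", "Meat", "Rice", "Guk", "Sauerkraut", "Soup"],
    "Rekognition": ["Dish", "Meal", "Food", "Plant", "Bowl", "Spoon", "Cutlery",
                    "Vegetable", "Produce", "Seasoning"],
    "Watson": ["Olive Green Color", "Food", "Nutrition", "Dish", "Food Product",
               "Side Dish", "Bottle Green Color", "Mushy Peas"],
}

print(systems_that_labeled("rice", meal_2_labels))
```

Substring matching is a deliberate simplification: it treats “White Rice” and “Jasmine Rice” as hits for “rice”, but it would also match unrelated words that happen to contain the same letters, so the results should be reviewed by eye.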

Finally, for the third meal, only “Tomato”, “Iceburg Lettuce” [sic], and “Grilling” were identified; the chicken and rice were not. As with the mock meat, the chicken was mislabeled, in this case as both “Beef” and “Steak”, which could cause harm or confusion to religious people. Also, as was the case in an image of the first meal, the chopsticks were not labeled. It will be interesting to see (in future analyses of other countries) whether this is due to cultural bias or is simply a coincidence. Rice was strangely not detected in the image even though it is clearly visible; perhaps this is due to its position in the image.

My recommendation

As in previous analyses, I recommend that developers use more specific labels and make sure that those labels don’t cause any misrepresentation. In this case, mock meat and types of real meat were confused with one another. Unfortunately, this could harm religious people and vegetarians/vegans. Developers should also check that their systems work properly on chopsticks and not just spoons and forks, though further analysis is needed here. Finally, Vision developers should probably fix the typo in the “Iceburg Lettuce” label.
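
One way to act on this recommendation is to screen predictions against the known contents of a dish before showing them to users. The sketch below is an illustration only, not something any of the four vendors provide: the `MEAT_TERMS` set, the function name `conflicting_meat_labels`, and the 0.5 cutoff are all assumptions made for the example. The sample data is an excerpt of Vision's labels for the second meal, where the ground truth is mock meat.

```python
from typing import List, Set, Tuple

Prediction = Tuple[str, float]  # (label, confidence score between 0 and 1)

# Assumption: an illustrative, deliberately incomplete set of labels implying real meat.
MEAT_TERMS: Set[str] = {"meat", "beef", "steak", "pork", "chicken"}


def conflicting_meat_labels(predictions: List[Prediction],
                            dish_contains_meat: bool,
                            min_confidence: float = 0.5) -> List[Prediction]:
    """Hypothetical check: meat labels predicted for a dish known to be meat-free."""
    if dish_contains_meat:
        return []
    return [
        (label, score)
        for label, score in predictions
        if label.lower() in MEAT_TERMS and score >= min_confidence
    ]


# Excerpt of Vision's labels for the second meal (ground truth: mock meat).
vision_meal_2_excerpt = [
    ("White Rice", 0.82), ("Meat", 0.73), ("Guk", 0.72),
    ("Sauerkraut", 0.71), ("Soup", 0.70),
]

flagged = conflicting_meat_labels(vision_meal_2_excerpt, dish_contains_meat=False)
print(flagged)
```

This only covers the meat-free case. Catching the third meal's problem, chicken labeled as “Beef” and “Steak”, would need ground truth about which kind of meat is present, which the sketch does not model.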

Results

Images of three different meals from Vietnam were available:

  • Meal 1: Rice, Bok Choy Soup and Fried Mixed Vegetables (Dinner)
  • Meal 2: Rice, Cucumber Soup, Mock Meat and Lettuce (Lunch)
  • Meal 3: Rice, Grilled Chicken, and raw vegetables (Dinner)
Object detection results*:
| Ground Truth | Microsoft Azure | Google Vision | Amazon Rekognition | IBM Watson |
| Bowl of Rice | Undetected | Food (0.77) | Undetected | / |
| Fried Mixed Vegetables | Carrot (0.66) | Food (0.72) | Undetected | / |

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling results:
| MICROSOFT AZURE | GOOGLE VISION | AMAZON REKOGNITION | IBM WATSON |
| Food (0.99) | Food (0.99) | Plant (0.99) | Olive Green Color (0.96) |
| Rice (0.98) | Tableware (0.96) | Vegetables (0.96) | Grain (0.94) |
| Jasmine Rice (0.75) | White Rice (0.92) | Food (0.96) | Food Product (0.94) |
| White Rice (0.73) | Ingredient (0.91) | Produce (0.92) | Food (0.94) |
| Steamed Rice (0.70) | Staple Food (0.88) | Bowl (0.57) | Rice (0.83) |
| Glutinous Rice (0.66) | Recipe (0.88) | Cutlery (0.57) | White Rice (0.80) |
| Cooking (0.60) | Jasmine Rice (0.87) | | Wilde Rice (0.62) |
| Ingredient (0.58) | Cuisine (0.87) | | |
| Cuisine (0.56) | Rice (0.86) | | |
| Cheese (0.55) | Dish (0.86) | | |
| Homemade (0.52) | Produce (0.79) | | |
| | Vegetable (0.77) | | |
| | Steamed Rice (0.76) | | |
| | Garnish (0.75) | | |
| | Bowl (0.75) | | |
| | Plate (0.75) | | |
| | Leaf Vegetable (0.73) | | |
| | Salad (0.69) | | |
| | Basmati (0.69) | | |
| | Glutinous Rice (0.67) | | |
| | Prepackaged Meal (0.67) | | |
| | Carrot (0.66) | | |
| | Brassicales (0.65) | | |
| | Meal (0.62) | | |
| | Supper (0.61) | | |

Object detection results:

| Ground Truth | Microsoft Azure | Google Vision | Amazon Rekognition | IBM Watson |
| Rice | Bowl | Food (0.77) | Undetected | / |
| Fried Mixed Vegetables | Carrot (0.53) | Undetected | Undetected | / |
| Chopsticks | Kitchen Utensil (0.70) | Undetected | Undetected | / |
| Spoon | Kitchen Utensil (0.71) | Undetected | Undetected | / |
| Bok Choy soup | Bowl | Food (0.75) | Undetected | / |

Labeling results:

| MICROSOFT AZURE | GOOGLE VISION | AMAZON REKOGNITION | IBM WATSON |
| Food (0.99) | Food (0.98) | Plant (0.99) | Olive Green Color (0.96) |
| Plate (0.89) | Tableware (0.96) | Produce (0.98) | Dish (0.91) |
| Bowl (0.86) | Ingredient (0.92) | Food (0.98) | Nutrition (0.91) |
| Vegetable (0.75) | Staple Food (0.88) | Vegetable (0.95) | Food (0.91) |
| Meal (0.53) | Recipe (0.88) | Dish (0.90) | Greenishness Color (0.85) |
| | White Rice (0.88) | Meal (0.90) | Food Product (0.80) |
| | Rice (0.86) | Bowl (0.90) | Utensil (0.80) |
| | Cuisine (0.86) | Sprout (0.68) | Salad (0.69) |
| | Leaf Vegetable (0.85) | Seasoning (0.57) | Seaweed Salad (0.69) |
| | Dish (0.84) | | Soup (0.68) |
| | Jasmine Rice (0.79) | | |
| | Produce (0.78) | | |
| | Vegetable (0.77) | | |
| | Soup (0.76) | | |
| | Bowl (0.76) | | |
| | Mixing Bowl (0.73) | | |
| | Steamed Rice (0.70) | | |
| | Plate (0.70) | | |
| | Comfort Food (0.69) | | |
| | Dishware (0.68) | | |
| | Stock (0.68) | | |
| | Cooking (0.67) | | |
| | Namul (0.67) | | |
| | Garnish (0.66) | | |
| | Kitchen Utensil (0.66) | | |

Object detection results:

| Ground Truth | Microsoft Azure | Google Vision | Amazon Rekognition | IBM Watson |
| Rice | Undetected | Food (0.75) | Undetected | / |
| Spoon | Kitchen Utensil (0.78) | Undetected | Spoon | / |
| Lettuce | Undetected | Undetected | Undetected | / |
| Mock Meat | Food (0.52) | Food (0.78) | Undetected | / |
| Mock Meat | Undetected | Food (0.74) | Undetected | / |
| Cucumber Soup | Bowl (0.67) | Food (0.78) | Undetected | / |

Labeling results:

| MICROSOFT AZURE | GOOGLE VISION | AMAZON REKOGNITION | IBM WATSON |
| Food (0.99) | Food (0.99) | Dish (0.95) | Olive Green Color (0.86) |
| Plate (0.99) | Tableware (0.97) | Meal (0.95) | Food (0.85) |
| Table (0.98) | Ingredient (0.91) | Food (0.95) | Nutrition (0.85) |
| Salad (0.74) | Recipe (0.88) | Plant (0.94) | Dish (0.85) |
| Broccoli (0.72) | Staple Food (0.88) | Bowl (0.93) | Food Product (0.79) |
| Container (0.54) | Fines Herbes (0.86) | Spoon (0.90) | Side Dish (0.76) |
| Dinner (0.52) | Dishware (0.86) | Cutlery (0.90) | Bottle Green Color (0.51) |
| Cruciferous Vegetable (0.50) | Cuisine (0.85) | Vegetable (0.71) | Mushy Peas (0.50) |
| | Dish (0.84) | Produce (0.69) | |
| | Leaf Vegetable (0.84) | Seasoning (0.57) | |
| | White Rice (0.82) | | |
| | Produce (0.80) | | |
| | Vegetable (0.79) | | |
| | Plate (0.78) | | |
| | Bowl (0.78) | | |
| | Jasmine Rice (0.76) | | |
| | Glutinous Rice (0.75) | | |
| | Meat (0.73) | | |
| | Rice (0.73) | | |
| | Comfort Food (0.72) | | |
| | Guk (0.72) | | |
| | Sauerkraut (0.71) | | |
| | Soup (0.70) | | |
| | Basmati (0.70) | | |
| | Garnish (0.69) | | |

Object detection results:

| Ground Truth | Microsoft Azure | Google Vision | Amazon Rekognition | IBM Watson |
| Bowl of Rice | Bowl (0.69) | Tableware (0.76) | Undetected | / |
| Bowl of Rice | Bowl (0.69) | Tableware (0.66) | Undetected | / |
| Raw Vegetables | Undetected | Food (0.71) | Undetected | / |
| Chopsticks | Undetected | Undetected | Undetected | / |
| Grilled Chicken | Undetected | Food (0.73) | Undetected | / |

Labeling results:

| MICROSOFT AZURE | GOOGLE VISION | AMAZON REKOGNITION | IBM WATSON |
| Food (0.99) | Food (0.99) | Plant (0.97) | Nutrition (0.86) |
| Fast Food (0.90) | Tableware (0.94) | Meal (0.91) | Food (0.86) |
| Healthy (0.89) | Ingredient (0.91) | Food (0.91) | Dish (0.86) |
| Recipe (0.79) | Recipe (0.88) | Dish (0.89) | Chestnut Red Color (0.80) |
| Fresh (0.78) | Plate (0.83) | Vegetable (0.81) | Chestnut Color (0.75) |
| Fruit (0.78) | Cuisine (0.82) | Produce (0.71) | Teriyaki (0.73) |
| Tomato (0.69) | Dish (0.82) | Bowl (0.70) | Sukiyaki (0.55) |
| Delicious (0.65) | Leaf Vegetable (0.81) | Dinner (0.57) | Barbecued Spareribs (0.50) |
| Container (0.63) | Vegetable (0.79) | Supper (0.57) | |
| Broccoli (0.61) | Produce (0.78) | Vase (0.57) | |
| Carrot (0.58) | Garnish (0.76) | Pottery (0.57) | |
| Ingredient (0.57) | Natural Foods (0.75) | Jar (0.57) | |
| Dish (0.54) | Beef (0.74) | | |
| Salad (0.54) | Steak (0.74) | | |
| Produce (0.54) | Cooking (0.74) | | |
| Tasty (0.54) | Meat (0.73) | | |
| Diet (0.52) | Citrus (0.73) | | |
| | Iceburg Lettuce (0.70) | | |
| | Comfort Food (0.69) | | |
| | Roasting (0.68) | | |
| | Grilling (0.68) | | |
| | Lime (0.67) | | |
| | Fast Food (0.67) | | |
| | Lemon (0.67) | | |
| | Fruit (0.66) | | |