Guinea Pig Pizza: Ecuador

Image recognition (IR) systems often perform poorly once deployed in the real world. In this post, I test four of the most popular IR systems on original, real-world images of food from around the world, this time from Ecuador.

Key takeaway

The IR systems performed very poorly at both object detection and labeling. Cuy was not recognized in any of the four images, and several (cultural) misrepresentations were present.

Correctly predicted images: 0/4
Correctly detected items: 0/4
Correct labels: 0/89
Potentially harmful detections/labels: 6
The above table includes only detections and labels with a confidence level of 80% or higher; for lower confidence levels, see the tables further below.
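As an aside, the cut-off itself is simple to apply programmatically. The sketch below is purely illustrative (the helper name and the sample data are mine, not output from any of the four APIs); it shows how a list of (label, confidence) pairs could be filtered at the 80% threshold used here:

```python
# Illustrative helper (not from any of the four APIs): keep only labels
# whose confidence meets the cut-off used in the tables in this post.
def filter_by_confidence(labels, threshold=0.80):
    return [(name, conf) for name, conf in labels if conf >= threshold]

# Sample data loosely modeled on the Rekognition results below.
labels = [("Computer Keyboard", 0.98), ("Food", 0.79), ("Pizza", 0.74)]
high_confidence = filter_by_confidence(labels)
print(high_confidence)  # [('Computer Keyboard', 0.98)]
```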

Insights

Object Detection

The object detection systems failed to accurately describe the Cuy in any of the four images. Vision gave the general description Food for the Cuy in all four images, while Rekognition described the Cuy as Pizza in all four. As such, object detection performed very poorly.

Labeling

The labeling systems performed very poorly as well. The labels given remained surface level and did not come close to describing the meal. Perhaps the most relevant label was Fried Food. This calls into question the usefulness of the results for the presented meal.

Furthermore, several cultural misrepresentations were present, the most obvious one being Rekognition consistently mistaking the Cuy for pizza. With labels such as Hendl and British cuisine, Vision also gave descriptions that significantly misrepresent the meal.

As in previous analyses, here too we have to address confusion (mostly by Vision) between meats. While Cuy is a typical dish in Ecuador and neighboring countries, people from other cultures might prefer not to eat it. Yet the systems described the meal as chicken meat, duck meat, turkey meat, pork, etc. If someone were to rely heavily on the results of these labeling systems, they might eat Cuy while thinking it was something else.

Finally, we see that Rekognition primarily returned labels for the laptop in the background. While not wrong as such, I wonder whether Rekognition found it easier to present results for something more common and visually simple/distinctive, and thereby failed to return many results for the food.

Suggestions for improvement

  • Provide more specific and relevant labels for Cuy;
  • Address (cultural) misrepresentations (i.e. Cuy is not pizza);
  • Make sure labels of meat do not harm people of certain religions or with certain diets (i.e. Cuy is not duck meat or chicken meat);
  • Check to what extent the systems can distinguish between less relevant, yet visually simple, background objects and meals in the foreground (especially for Rekognition).

Results

Four images of one meal from Ecuador were available:

  • Meal 1: Cuy (Fried Guinea pig) (Lunch)

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Cuy Undetected Food (0.66) Pizza (0.74) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
  Food (0.98) Computer Keyboard (0.98) nutrition (0.95)
  Laptop (0.94) Hardware (0.98) food (0.95)
  Lechona (0.9) Keyboard (0.98) reddish orange color (0.86)
  Computer (0.89) Computer Hardware (0.98) dish (0.83)
  Ingredient (0.88) Computer (0.98) light brown color (0.83)
  Tableware (0.88) Electronics (0.98) Apple Pie (0.78)
  Recipe (0.86) Pc (0.97) dessert (0.78)
  Chicken meat (0.8) Laptop (0.94) fish and chips (0.67)
  Fried food (0.8) Food (0.79) turnover (0.51)
  Roasting (0.79) Pizza (0.74) samosa (0.5)
  Cuisine (0.79)    
  Cooking (0.78)    
  Duck meat (0.78)    
  Produce (0.76)    
  Turkey meat (0.75)    
  Dish (0.75)    
  Meat (0.74)    
  Drunken chicken (0.73)    
  Vegetable (0.71)    
  Personal computer (0.7)    
  Fast food (0.7)    
  Comfort food (0.66)    
  Pork (0.66)    
  Hendl (0.63)    
  Flesh (0.63)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Cuy Undetected Food (0.73) Pizza (0.84) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_grilled (0.69) Food (0.98) Pc (0.97) light brown color (0.91)
  Tableware (0.93) Electronics (0.97) nutrition (0.87)
  Laptop (0.88) Computer (0.97) food (0.87)
  Ingredient (0.88) Food (0.91) fish and chips (0.87)
  Recipe (0.87) Laptop (0.88) dish (0.87)
  Computer (0.84) Pizza (0.84) food product (0.79)
  Chicken meat (0.84) Computer Keyboard (0.83)
  Deep frying (0.83) Hardware (0.83)
  Cuisine (0.81) Keyboard (0.83)
  Dish (0.78) Computer Hardware (0.83)
  Drunken chicken (0.78)    
  Plate (0.78)    
  Produce (0.77)    
  Fried food (0.77)    
  Vegetable (0.76)    
  Cooking (0.75)    
  Hendl (0.75)    
  Meat (0.74)    
  Seafood (0.72)    
  Roasting (0.72)    
  Comfort food (0.7)    
  Fast food (0.7)    
  Duck meat (0.69)    
  Frying (0.67)    
  British cuisine (0.66)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Cuy Undetected Food (0.77) Pizza (0.91) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
  Food (0.98) Pc (0.99) reddish orange color (0.95)
  Computer (0.98) Computer (0.99) nutrition (0.75)
  Laptop (0.95) Electronics (0.99) food (0.75)
  Tableware (0.93) Laptop (0.99) dish (0.75)
  Personal computer (0.93) Computer Keyboard (0.97) fish and chips (0.75)
  Ingredient (0.89) Hardware (0.97)
  Recipe (0.86) Keyboard (0.97)
  Input device (0.84) Computer Hardware (0.97)
  Cuisine (0.83) Pizza (0.91)  
  Dish (0.83) Food (0.91)  
  Fast food (0.79)    
  Chicken meat (0.79)    
  Peripheral (0.77)    
  Fried food (0.76)    
  Produce (0.75)    
  Netbook (0.74)    
  Space bar (0.74)    
  Meat (0.74)    
  Drunken chicken (0.74)    
  Output device (0.73)    
  Cooking (0.72)    
  Junk food (0.71)    
  Comfort food (0.69)    
  Baked goods (0.68)    
  Touchpad (0.66)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Cuy Undetected Food (0.77) Pizza (0.62) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_grilled (0.67) Food (0.98) Computer Keyboard (0.99) fish and chips (0.92)
  Computer (0.98) Hardware (0.99) dish (0.92)
  Laptop (0.96) Keyboard (0.99) nutrition (0.92)
  Personal computer (0.95) Computer Hardware (0.99) food (0.92)
  Ingredient (0.88) Computer (0.99) reddish orange color (0.79)
  Input device (0.88) Electronics (0.99) light brown color (0.57)
  Recipe (0.87) Pc (0.98)  
  Tableware (0.83) Laptop (0.96)  
  Output device (0.82) Food (0.78)  
  Chicken meat (0.81) Pizza (0.62)  
  Cuisine (0.78) Pork (0.58)  
  Cooking (0.78)    
  Office equipment (0.76)    
  Fried food (0.76)    
  Space bar (0.75)    
  Produce (0.75)    
  Dish (0.75)    
  Meat (0.74)    
  Roasting (0.73)    
  Plate (0.72)    
  Laptop part (0.71)    
  Deep frying (0.71)    
  Fried chicken (0.7)    
  Comfort food (0.68)    
  Computer hardware (0.68)

Soup is (not) served: Bulgaria

Image recognition (IR) systems often perform poorly once deployed in the real world. In this post, I test four of the most popular IR systems on original, real-world images of food from around the world, this time from Bulgaria.

Key takeaway

The IR systems performed poorly at both detecting and labeling the one meal tested. Neither the name of the dish nor any of its ingredients was correctly identified.

Correctly predicted images: 0/2
Correctly detected items: 0/5
Correct labels: 0/47
Potentially harmful detections/labels: 4
The above table includes only detections and labels with a confidence level of 80% or higher; for lower confidence levels, see the tables further below.

Insights

Object Detection

The object detection systems performed poorly. Neither the dish nor any of its ingredients was predicted. Only general descriptions (e.g. bowl, food, etc.) were given, and Rekognition again described the dish as ice cream in both images.

Labeling

The labeling systems performed similarly poorly. Not one correct and accurate label for the meal was given by any of the four IR systems. Most labels remained too general (e.g. bowl, food, etc.).

In the second image, egg and sausage are clearly visible, yet none of the systems picked up on these items. One wonders if this is because they are part of a larger meal with visually mixed items.

Many labels also clearly (culturally) misrepresented the meal. For instance, Vision and Watson both described the dish as soup, which it clearly is not. The liquid-like substance in a pot might have fooled the systems, as this is often how soup is portrayed. A similar thing can be said about Watson's descriptions of custard and creme brulee [sic].

Finally, Vision also provided wrong cultural and origin-specific descriptions (e.g. Arancini, Kai Yang, Chiboust Cream, American Food, Skyr). As in previous analyses, one has to wonder about the consequences of wrongly labeling the culture and origin of food.

Suggestions for improvement

  • Provide more specific and relevant labels for Sirene Po Shopski, Egg, and Sausage;
  • Address (cultural) misrepresentations (i.e. Sirene Po Shopski is not soup or creme brulee [sic]).

Results

Two images of one meal from Bulgaria were available:

  • Meal 1: Sirene Po Shopski (Cheese, tomatoes, butter, eggs, and sausage) (Dinner)

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Sirene Po Shopski in a Pot Undetected Food (0.61) Ice Cream (0.82) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_ (0.63) Food (0.98) Bowl (0.95) reddish brown color (0.87)
  Tableware (0.94) Dish (0.84) nutrition (0.83)
  Dishware (0.9) Meal (0.84) food (0.83)
  Bottle (0.89) Food (0.84) dish (0.83)
  Table (0.88) Ice Cream (0.82) food product (0.8)
  Ingredient (0.87) Dessert (0.82)
chocolate color (0.75)
  Recipe (0.87) Cream (0.82) soup (0.71)
  Plate (0.86) Creme (0.82) borsch (0.71)
  Serveware (0.86) Plant (0.82) custard (0.51)
  Soup (0.85) Pottery (0.79) creme brulee (0.5)
  Liquid (0.82) Shelf (0.78)  
  Dish (0.8) Wood (0.7)  
  Drink (0.8) Furniture (0.65)  
  Condiment (0.79) Pot (0.64)  
  Cuisine (0.79) Cup (0.57)  
  Produce (0.77) Plywood (0.56)  
  Flowerpot (0.76)    
  Gravy (0.75)    
  Spoon (0.73)    
  Cookware and bakeware (0.73)
  Porcelain (0.72)    
  Cooking (0.72)    
  Drinkware (0.72)    
  Cup (0.72)    
  Kitchen utensil (0.71)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Sirene Po Shopski in a Pot Bowl (0.82) Packaged Goods (0.77) Undetected /
Spoon Kitchen Utensil (0.52) Undetected Undetected /
Egg Food (0.66) Food (0.73) Ice Cream (0.63) /
Sausage /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_ (0.66) Food (0.99) Dish (0.99) reddish brown color (0.87)
  Tableware (0.95) Meal (0.99) nutrition (0.83)
  Plate (0.9) Food (0.99) food (0.83)
  Ingredient (0.9) Bowl (0.88) dish (0.83)
  Recipe (0.87) Dessert (0.87) food product (0.8)
  Cuisine (0.84) Plant (0.81) chocolate color (0.75)
  Dish (0.83) Pottery (0.8) soup (0.71)
  Dishware (0.76) Cake (0.8) borsch (0.71)
  Produce (0.75) Cream (0.75) custard (0.51)
  Meat (0.74) Creme (0.75) creme brulee (0.5)
  Arancini (0.71) Cutlery (0.74)  
  Comfort food (0.7) Ice Cream (0.63)  
  Fried food (0.67) Pie (0.6)  
  Kai yang (0.66) Icing (0.59)  
  Chiboust cream (0.64) Porcelain (0.57)  
  Dairy (0.64) Art (0.57)  
  Junk food (0.62) Platter (0.56)  
  Cooking (0.62) Pasta (0.56)  
  Dessert (0.62)    
  Delicacy (0.6)    
  Side dish (0.59)    
  Baked goods (0.59)    
  American food (0.58)    
  Skyr (0.57)    
  Breakfast (0.57)    

Waffles are Easy: Singapore

Image recognition (IR) systems often perform poorly once deployed in the real world. In this post, I test four of the most popular IR systems on original, real-world images of food from around the world, this time from Singapore.

Key takeaway

Finally, the IR systems performed somewhat well! Though object detection was still lacking, several systems correctly labeled the meal in two different pictures.

Of course, the meal contained only a single, simple, and visually distinguishable item (a waffle), but nevertheless a meal was correctly labeled in full for the first time.

It was also almost detected correctly in one image: waffle had a confidence rating of 76%, just below our cut-off point of 80%. Still, many faulty labels were present as well.

Correctly predicted images: 0/2
Correctly detected items: 2/6
Correct labels: 6/39
Potentially harmful detections/labels: 7
The above table includes only detections and labels with a confidence level of 80% or higher; for lower confidence levels, see the tables further below.

Insights

Object Detection

For the first image, object detection did not work well: all three detections made (out of nine possible) were too general (e.g. food). However, for the second image, Azure and Rekognition combined detected all three items with fairly high confidence ratings.

Unfortunately, Rekognition also described the waffle as bread, which I feel is a missed opportunity and a clear misrepresentation. Vision’s descriptions remained too general for the second image as well.

Labeling

As always, the labeling systems performed better than the object detection systems. What stands out compared to previous analyses is that both Vision and Rekognition labeled the meal in both pictures with (very) high confidence ratings (85%-100%). Of course, compared to those previous analyses, the meal consists only of a waffle, a fork, and a knife – all simple and visually distinguishable items. Nevertheless, they labeled them well.

In the first image, the waffle is spread open and clearly shows the texture of Kaya and Margarine. This detail was not picked up by the IR systems. While understandable given its fine-grained nature, one has to wonder whether we can expect IR systems to pick up on such details. And if we can, how much does visual similarity between foods from different countries confuse such systems (and humans)?

For instance, I personally had never heard of Kaya, though its popularity in Singapore is undeniable. So, as a human coming from a Western country, I probably would have described it as butter. Butter looks very similar, yet that label clearly misses the mark. This is a clear case where something looks visually similar, but – depending on your background and the context surrounding the image – is something substantially different.

Some wrongly labeled items were also prominent. For the second image, Rekognition correctly labeled knife first with 99% confidence. However, the next three labels were weapon, blade, and weaponry, also with 99% confidence. While wrongly labeling a weapon as not a weapon would perhaps have worse consequences, one has to wonder about the consequences of labeling a simple table knife as a weapon.

Finally, Vision labeled the waffle as Belgian waffle with the same confidence as waffle (95%). One wonders whether the fame of Belgian waffles influenced Vision’s prediction. Again, one also has to wonder to what degree an IR system can determine the origin of a meal.

Suggestions for improvement

  • Address (cultural) misrepresentations (i.e. [not all waffles are Belgian waffles]);
  • Understand the limits of IR systems and think about the consequences of these limits:
    • Can we expect IR systems to detect if, for example, a waffle has Kaya and Margarine on it simply based on an image without further context or input?
  • Understand the consequences of labeling [a simple dinner knife] as a weapon with high confidence.

Results

Two images of one meal from Singapore were available:

  • Meal 1: Waffle with Kaya and Margarine (Dessert)

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Waffle Food (0.59) Food (0.77) Undetected /
Fork Undetected Undetected Undetected /
Knife Undetected Tableware (0.67) Undetected /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE** GOOGLE VISION AMAZON REKOG. IBM WATSON
  Food (0.99) Waffle (1) food (0.95)
  Tableware (0.96) Food (1) beige color (0.93)
  Waffle (0.95)   bread (0.86)
  Belgian waffle (0.95) food product (0.86)
  Ingredient (0.91)   nan (0.77)
  Baked goods (0.87) Chicken Quesadilla (0.76)
  Staple food (0.87)   dish (0.76)
  Fast food (0.86)   nutrition (0.76)
  Cuisine (0.85)   flatbread (0.5)
  Recipe (0.85)    
  Dish (0.82)    
  Finger food (0.73)    
  Junk food (0.72)    
  Dessert (0.72)    
  Produce (0.72)    
  Plate (0.7)    
  Dishware (0.69)    
  Comfort food (0.65)    
  Sweetness (0.65)    
  Kitchen utensil (0.64)    
  Snack (0.64)    
  Delicacy (0.63)    
  Waffle iron (0.63)    
  Breakfast (0.61)    
  Meal (0.59)    

**It appears that the Azure labeling API was not returning any results at the time of analysis (only _others with a confidence rating of 0.004; model version 2021-05-01 [the object detection API used is model version 2021-04-01]).

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Waffle Waffle (0.76) Food (0.74) Bread (0.89) /
Knife Undetected Tableware (0.59) Knife (0.99) /
Fork Undetected Undetected Fork (0.99) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
  Food (0.98) Knife (0.99) beige color (0.98)
  Belgian waffle (0.96) Weapon (0.99) utensil (0.63)
  Tableware (0.96) Blade (0.99) food (0.6)
  Waffle (0.96) Weaponry (0.99) food product (0.6)
  Hood (0.9) Fork (0.99) tableware (0.56)
  Plate (0.89) Cutlery (0.99) tablefork (0.55)
  Ingredient (0.89) Food (0.91) restaurant (0.55)
  Recipe (0.86) Bread (0.89) building (0.55)
  Cuisine (0.82) Waffle (0.85) cafe (0.54)
  Baked goods (0.82)   spoon (0.51)
  Dish (0.8)    
  Kitchen utensil (0.8)    
  Grille (0.79)    
  Dishware (0.79)    
  Staple food (0.75)    
  Pizzelle (0.73)    
  Waffle iron (0.72)    
  Fork (0.72)    
  Dessert (0.71)    
  Sweetness (0.7)    
  Comfort food (0.69)    
  Produce (0.69)    
  Finger food (0.66)    
  Junk food (0.66)    
  Cooking (0.64)    

**It appears that the Azure API is currently not returning any results (only abstract_ with a confidence rating of 0.004).

Frittatapizza: England

Image recognition (IR) systems often perform poorly once deployed in the real world. In this post, I test four of the most popular IR systems on original, real-world images of food from around the world, this time from England.

Key takeaway

The four IR systems performed better than in previous analyses, but still made quite a few mistakes. One meal (Frittata) was finally correctly labeled by Watson, though unfortunately with a confidence rating of only 50%. Object detection picked up on a lemon and chopsticks, but failed for the 17 other items.

Correctly predicted images: 0/8
Correctly detected items: 2/19
Correct labels: 11/169
Potentially harmful detections/labels: 10
The above table includes only detections and labels with a confidence level of 80% or higher; for lower confidence levels, see the tables further below.

Insights

Object Detection

In meal 3, Azure and Vision correctly detected a lemon, while the latter also detected chopsticks. Though detecting two out of a total of 19 items still signifies poor performance, it’s a refreshing change from previous analyses to see exact items being detected. Unfortunately, the confidence rating for both detections was around 50%, well below an acceptable threshold for most IR systems.

Unfortunately, many faulty detections were made as well. Indian Spinach and Laksa (noodle soup) were both described as ice cream by Rekognition, while the same system also labeled a rice-based dish as a Birthday Cake. The latter is quite ironic, as the first meal actually was a cake, yet Rekognition made no mention of cake there. Rekognition also mistook a sliced lemon for an egg.

Finally, Azure and Rekognition described Frittata as pizza with high confidence ratings (86% and 97%, respectively). While understandable from a visual perspective (both can look cheesy), Frittata and pizza are very different dishes.

Labeling

The labeling systems were much better than the detection systems and appeared to work somewhat better than in previous analyses. Though Laksa was not described by its name, Vision labeled it noodle soup with a confidence rating of 80%. Rekognition also labeled the Laksa as Noodle with a confidence rating of 96%. It’s strange, then, that the same system’s object detection labeled the Laksa as ice cream at a much lower confidence rating of 75%.

Azure and Vision labeled a sliced carrot cake as cake with fairly high confidence ratings (93% and 91%, respectively). Unfortunately, the same systems, as well as the other two, labeled the same cake, not yet sliced, as meat, beef, steak, red meat, etc. This is interesting, as a human would clearly see it as the same cake.

On a positive note, Azure labeled roti (and chapati) well with fairly high confidence ratings (86%+), while Vision was able to do the same, but with lower confidence ratings. Unfortunately, Vision and Rekognition also culturally misrepresented roti by labeling it tortilla and pita.

One meal (Frittata) was finally correctly labeled by Watson, though unfortunately with a confidence rating of only 50%, while pizza, for example, was given a confidence rating of 92% for the same meal. This is a missed opportunity.

Again, labels of meat were common across most images, even though all meals were vegetarian.

Suggestions for improvement

  • Provide more specific and relevant labels for Raita, Aubergine, Indian Spinach and carrot cake;
  • Fix (cultural) misrepresentations (i.e. roti is not tortilla or pita);
  • Make sure labels of meat do not harm people of certain religions or with certain diets;
  • Check why wrong labels with a lower confidence rating are assigned to items during object detection while the correct label, with a nearly perfect confidence rating, is not (specifically for Rekognition);
  • Check why Frittata (the correct label) had a significantly lower confidence rating than pizza (specifically for Rekognition);
  • For cake, make sure to include examples of both sliced and unsliced cake, as this small difference may result in a completely different outcome.

Results

Eight images of six different meals from England were available:

  • Meal 1: Carrot Cake (Snack)
  • Meal 2: Indian Spinach with Wild Garlic and Roti (Dinner)
  • Meal 3: Malaysian/Singaporean Laksa (Dinner)
  • Meal 4: Rice, Aubergine, Mint, Cashew nuts, Raita (Dinner)
  • Meal 5: Spanish Style Frittata (Dinner)
  • Meal 6: Pasta Bake (Tomato, basil and Mozzarella, Dinner)

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Carrot Cake Food (0.58) Food (0.73) Bread (0.98) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (0.86) Food (0.98) Bread (0.98) chestnut red color (0.93)
indoor (0.61) Ingredient (0.9) Food (0.98) dish (0.9)
meat (0.58) Recipe (0.88) Steak (0.93) nutrition (0.9)
  Beef (0.85) Meat Loaf (0.83) food (0.9)
  Tableware (0.85) reddish brown color (0.86)
  Baked goods (0.84)   food product (0.8)
  Dish (0.82)   meat loaf (0.78)
  Cuisine (0.82)   Prime Rib (0.5)
  Cooking (0.82)    
  Steak (0.81)    
  Red meat (0.79)    
  Produce (0.76)    
  Pork (0.75)    
  Meat (0.75)    
  Fried food (0.73)    
  Dessert (0.72)    
  Comfort food (0.71)    
  Baking (0.69)    
  Soil (0.65)    
  Cake (0.64)    
  Flesh (0.63)    
  Pastrami (0.63)    
  Venison (0.6)    
  Ostrich meat (0.57)    
  Kuchen (0.56)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Carrot Cake Food (0.64) Food (0.7) Bread (0.96) /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
dessert (0.97) Food (0.98) Bread (0.96) chocolate color (1)
baking (0.96) Tableware (0.92) Food (0.96) nutrition (0.91)
baked goods (0.96) Cake (0.91) Sweets (0.91) food (0.91)
cake (0.93) Ingredient (0.9) Confectionery (0.91) meat loaf (0.87)
snack (0.92) Recipe (0.88) Cookie (0.88) dish (0.87)
chocolate cake (0.91) Dish (0.83) Biscuit (0.88) food product (0.8)
chocolate brownie (0.9) Baked goods (0.83) Dessert (0.85) dessert (0.5)
parkin (0.89) Cuisine (0.83) Chocolate (0.83) tiramisu (0.5)
snack cake (0.88) Kuchen (0.79) Meat Loaf (0.6)  
chocolate (0.87) Flourless chocolate cake (0.79) Brownie (0.58)
muscovado (0.86) Gluten (0.78)    
flourless chocolate cake (0.86) Produce (0.77)
sweetness (0.86) Frozen dessert (0.75)    
food (0.8) Dessert (0.75)    
indoor (0.6) Birthday cake (0.74)    
  Cooking (0.74)    
  Sweetness (0.72)    
  Baking (0.72)    
  Lekach (0.7)    
  Icing (0.7)    
  Buttercream (0.69)    
  Torta caprese (0.67)    
  Beef (0.67)    
  Chocolate cake (0.66)    
  Torte (0.65)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Indian Spinach Food (0.84) Food (0.77) Ice Cream (0.7) /
Roti Undetected Undetected Bread (0.94) /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (0.99) Food (0.98) Bread (0.94) chestnut color (0.88)
indoor (0.89) Tableware (0.91) Food (0.94) dish (0.77)
roti (0.89) Ingredient (0.88) Ice Cream (0.7) nutrition (0.77)
recipe (0.87) Recipe (0.88) Dessert (0.7) food (0.77)
cooking (0.86) Staple food (0.87) Cream (0.7) beige color (0.75)
cookware and bakeware (0.86) Cookware and bakeware (0.82) Creme (0.7) meat loaf (0.69)
chapati (0.86) Dish (0.8) Plant (0.7) food product (0.6)
ingredient (0.84) Cuisine (0.8) Pita (0.57) utensil (0.6)
pan (0.72) Cooking (0.8) Seasoning (0.56) Filet Mignon (0.5)
stove (0.62) Produce (0.77)    
kitchen (0.58) Vegetable (0.76)    
  Chapati (0.76)    
  Tortilla (0.75)    
  Corn tortilla (0.74)    
  Jolada rotti (0.74)    
  Bhakri (0.71)    
  Comfort food (0.7)    
  Roti (0.69)    
  Piadina (0.67)    
  Metal (0.65)    
  Kitchen utensil (0.65)    
  Condiment (0.64)    
  Finger food (0.63)    
  Meat (0.63)    
  Fast food (0.61)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Laksa (Noodle Soup) Undetected Food (0.7) Ice Cream (0.75) /
Chopsticks Undetected Chopsticks (0.5) Undetected /
Lemon Lemon (0.51) Lemon (0.51) Egg (0.6) /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_ (0.75) Food (0.98) Noodle (0.96) chestnut color (0.88)
  Tableware (0.96) Food (0.96) dish (0.77)
  Ingredient (0.91) Pasta (0.96) nutrition (0.77)
  Recipe (0.88) Plant (0.9) food (0.77)
  Soup (0.87) Vermicelli (0.82) beige color (0.75)
  Noodle (0.86) Ice Cream (0.75) meat loaf (0.69)
  Cuisine (0.86) Dessert (0.75) food product (0.6)
  Dish (0.84) Cream (0.75) utensil (0.6)
  Stew (0.84) Creme (0.75) Filet Mignon (0.5)
  Staple food (0.83) Produce (0.64)  
  Bowl (0.83) Dish (0.61)  
  Noodle soup (0.8) Meal (0.61)  
  Produce (0.79) Egg (0.6)  
  Chopsticks (0.78) Citrus Fruit (0.59)  
  Meat (0.76) Fruit (0.59)  
  Chinese noodles (0.75) Grapefruit (0.56)  
  Hot and sour soup (0.74)    
  Thukpa (0.74)    
  Vegetable (0.73)    
  Rice noodles (0.73)    
  Guk (0.73)    
  Spoon (0.73)    
  Cooking (0.73)    
  Comfort food (0.72)    
  Fast food (0.72)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Rice Undetected Food Undetected /
Raita Undetected / Undetected /
Aubergine Undetected / Undetected /
Mint Undetected / Undetected /
Cashew nuts Undetected / Undetected /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_ (0.86) Food (0.98) Dish (0.99) food (0.89)
  Tableware (0.96) Meal (0.99) nutrition (0.89)
  White rice (0.9) Food (0.99) beige color (0.85)
  Dishware (0.88) Plant (0.98) dish (0.83)
  Plate (0.88) Vegetable (0.88) risotto (0.83)
  Recipe (0.88) Platter (0.69) food product (0.8)
  Ingredient (0.87) Seasoning (0.58) emerald color (0.72)
  Fines herbes (0.83) Seasoning (0.58) plate (0.5)
  Staple food (0.82)    
  Jasmine rice (0.81)    
  Rice (0.78)    
  Cuisine (0.78)    
  Produce (0.77)    
  Garnish (0.76)    
  Steamed rice (0.76)    
  Dish (0.76)    
  Lime (0.75)    
  Meat (0.75)    
  Kitchen utensil (0.72)    
  Leaf vegetable (0.71)    
  Comfort food (0.7)    
  Vegetable (0.69)    
  Cooking (0.68)    
  Culinary art (0.68)    
  Xôi (0.67)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Rice Undetected Food (0.79) Birthday Cake (0.67) /
Raita Undetected /
Aubergine Undetected /
Mint Undetected /
Cashew Nuts Undetected /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_ (0.87) Food (0.98) Plant (0.98) dish (0.95)
  White rice (0.94) Dish (0.95) nutrition (0.95)
  Tableware (0.93) Meal (0.95) food (0.95)
  Ingredient (0.9) Food (0.95) beige color (0.95)
  Recipe (0.88) Vegetable (0.85) risotto (0.93)
  Staple food (0.87) Produce (0.78) food product (0.79)
  Rice (0.85) Seasoning (0.75) light brown color (0.74)
  Jasmine rice (0.84) Birthday Cake (0.67) Grilled Salmon (0.5)
  Dish (0.84) Dessert (0.67)  
  Cuisine (0.84) Cake (0.67)  
  Leaf vegetable (0.8)    
  Plate (0.79)    
  Basmati (0.79)    
  Glutinous rice (0.79)    
  Produce (0.79)    
  Fines herbes (0.78)    
  Steamed rice (0.76)    
  Vegetable (0.75)    
  Meat (0.75)    
  Comfort food (0.71)    
  Culinary art (0.7)    
  Dishware (0.7)    
  Coriander (0.68)    
  Rice and curry (0.61)    
  À la carte food (0.6)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Frittata Pizza (0.86) Packaged Goods (0.84) Pizza (0.97) /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food_pizza (0.78) Food (0.95) Pizza (0.97) pale yellow color (1)
  Ingredient (0.9) Food (0.97) dish (0.95)
  Recipe (0.88) Bread (0.91) nutrition (0.95)
  Baked goods (0.84) Cake (0.76) food (0.95)
  Cuisine (0.83) Dessert (0.76) pizza (0.92)
  Rectangle (0.81) Cornbread (0.6) Sicilian pizza (0.82)
  Fast food (0.8) Lasagna (0.56) food product (0.8)
  Dish (0.8) Pasta (0.56) cheese pizza (0.7)
  Comfort food (0.71) Pie (0.56) frittata (0.5)
  Staple food (0.67) Dish (0.55)  
  Linens (0.61) Meal (0.55)  
  Side dish (0.61)    
  Pattern (0.61)    
  Junk food (0.59)    
  Metal (0.58)    
  Cookware and bakeware (0.57)
  Meal (0.56)    
  Cooking (0.56)    
  Tin (0.55)    
  Italian food (0.55)    
  American food (0.55)    
  Mixture (0.51)    
  Pattern (0.51)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Pasta Bake Food (0.65) Food (0.74) Bread (0.9) /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
  Food (0.97) Food (0.95) chestnut red color (0.96)
  Ingredient (0.89) Plant (0.92) nutrition (0.83)
  Recipe (0.88) Bread (0.9) food (0.83)
  Cuisine (0.8) Meat Loaf (0.64) dish (0.83)
  Fried food (0.79) Lasagna (0.6) food product (0.79)
  Dish (0.79) Pasta (0.6) pasta (0.76)
  Fast food (0.79) Vegetable (0.6) Spaghetti Bolognese (0.68)
  Produce (0.76)   meat loaf (0.52)
  Meat (0.74)   lasagna (0.5)
  Comfort food (0.71)    
  Mixture (0.65)    
  Side dish (0.62)    
  Dessert (0.62)    
  Deep frying (0.57)    
  Metal (0.56)    
  Panko (0.56)    
  Soil (0.54)    
  Rock (0.54)    
  Energy bar (0.52)    

Suggestions to Improve Image Recognition Systems to Recognize Food from Around the World? (Report)

I’m testing four of the most popular image recognition systems (Azure, Vision, Rekognition and Watson) on self-collected, original images of food from around the world. The results so far in numbers are:

  • Correctly predicted images: 0/24 (0%)
  • Correctly detected items: 7/111 (6%)
  • Correct labels: 37/516 (7%)
  • Potentially harmful detections/labels: 49
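For transparency, the percentages in these running totals are simple rounded ratios. A minimal sketch of how I tally them (the helper name is my own; the counts are the ones listed above):

```python
def summarize(correct, total):
    """Format a correct/total count as 'correct/total (pct%)'."""
    pct = round(100 * correct / total) if total else 0
    return f"{correct}/{total} ({pct}%)"

# Running totals from the list above
print("Correctly predicted images:", summarize(0, 24))   # 0/24 (0%)
print("Correctly detected items:", summarize(7, 111))    # 7/111 (6%)
print("Correct labels:", summarize(37, 516))             # 37/516 (7%)
```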

Update: 29/06

I’ve tested the IR systems on five new countries: the Philippines, Canada, the US, Yemen, and Germany (6 meals and 13 images). The results were poor across all countries. If correct at all, detection and labeling descriptions overall remained too general (e.g. Food, Dish, etc.).

Results in numbers:

  • Correctly predicted images: 0/13 (0%)
  • Correctly detected items: 1/45 (2%)
  • Correct labels: 14/258 (5%)
  • Potentially harmful detections/labels: 44

Insights

Watson, Rekognition and, to a lesser degree, Azure seem trigger-happy to label food items (e.g. Rice and Kadyos) as Ice Cream. Is this due to a (Western) bias towards ice cream? It is too early to tell, but this is nonetheless an interesting direction to explore in future updates.

Though Rice was labeled as Ice Cream multiple times, at other times Rice seemed to be one of the most easily detected items, especially for Azure and Vision. This leads us to wonder where the threshold between Rice and Ice Cream lies.

Unfortunately, potentially harmful descriptions were common, especially in the form of misplaced references to the origin of the meals as well as confusion between types of meat. More blatant forms of cultural misrepresentation were found too. For example, Azure assigned 18 different sausage types to an image of spring rolls.

Finally, with labels such as Gluten and Sugar, one has to wonder what we can realistically expect from IR systems. How can these systems possibly know whether a Bagel is gluten- or sugar-free without any context? Even most humans would find this task impossible.

Suggested improvements:

  • Provide more specific and relevant labels for:
    • BBQ, [Cup of] BBQ sauce, Kadyos, sliced melon, crab cake, spring rolls, bagel, minestrone, bottle of wine, olives and pesto
  • Fix (cultural) misrepresentations:
    • Rice, Kadyos and Crab Cakes are not ice cream, sliced melon is not a banana, spring rolls are not sausages, rice is not oatmeal;
    • A Thali (Indian serving plate) in an image possibly skews Yemeni food results towards Indian food results.
  • Understand the limits of IR systems and think about the consequences of these limits:
    • Can we expect IR systems to distinguish between similar dishes of different countries without further context or input?
    • Can we expect IR systems to detect if, for example, a meal is gluten or sugar free simply based on an image without further context or input?

Update: 17/06

So far, I’ve tested the systems on food from Belgium, Myanmar, Vietnam and Malaysia (9 meals and 11 images). With these admittedly limited samples, I wanted to write an interim conclusion to make the results of the project easily digestible.

Thus far, the results have been disappointing to very disappointing. Object detection consistently failed across the four systems. The systems labeled the images better, though these labels also often included too general or irrelevant descriptions.

In some cases, these descriptions culturally misrepresented the food (e.g. mistaking the country or culture of the dish), which could be controversial in some contexts. In one instance, “Chicken” was described as “Beef”, which in some contexts could disadvantage people of certain religions. In another instance, “Mock Meat” was labeled as “Meat”, which could similarly disadvantage people of certain religions as well as vegetarians/vegans.

Note: all numbers on this page are only from detections and labels of 80%+ confidence level; for lower confidence levels see the countries’ individual analyses.
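The 80%+ cut-off mentioned in this note amounts to a simple filter over (label, confidence) pairs. A sketch with made-up data (this is not the systems’ actual response format, just the idea):

```python
def filter_labels(labels, min_confidence=0.8):
    """Keep only the labels at or above the confidence threshold."""
    return [(name, conf) for name, conf in labels if conf >= min_confidence]

# Hypothetical raw output from one system
raw = [("Food", 0.98), ("Rice", 0.85), ("Ice Cream", 0.6), ("Dessert", 0.57)]
print(filter_labels(raw))  # [('Food', 0.98), ('Rice', 0.85)]
```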

Rice is not Ice Cream: the Philippines

Image recognition (IR) systems often perform poorly once in the real world. In this post, I test four of the most popular IR systems on original real world images of food from around the world, this time from the Philippines.

Key takeaway

Overall, the systems performed very poorly. Not a single item was correctly detected, and severe cultural misrepresentations were made. Vision was able to pick up on the visual characteristics of the BBQ, but unfortunately ascribed it to many different cultures and countries, none of them the Philippines.

Correctly predicted images 0/3
Correctly detected items 0/9
Correct labels 4/104
Potentially harmful detections/labels 35
The above table includes only detections and labels of 80%+ confidence level; for lower confidence levels see the tables further below.

Insights

Object Detection

The object detection features failed to identify a single item across all three images. The descriptions that were given were very general (e.g. “Food”). Also, Rekognition identified the rice as “Ice Cream” across all three images, while Vision identified it as “Ice Cream” and “Dessert”. Many of the items remained undetected.

Labeling

The BBQ is perhaps the most visually distinctive item in the three images. Vision clearly recognized this distinctiveness, as it consistently gave labels of skewered BBQ-like dishes such as Sate Kambing (Indonesian mutton satay), Shashlik (Caucasus/Central Asian skewered meat), Yakitori (Japanese skewered chicken), and many more.

Unfortunately, none of these labels referred to the Philippines, which makes us question whether any cultural misrepresentation is at play here (BBQ is a very common dish in the Philippines). It also makes us question to what extent it is possible to distinguish visually very similar dishes between cultures based on an image without context.

While in previous analyses rice seemed somewhat easy to detect, Watson and Rekognition mistook rice for “Ice Cream” across multiple images. Vision also made this error (once as “Ice Cream” and another time as “Dessert”), though it did also label it as “Rice”. Here, again, we have to wonder whether any cultural misrepresentation is at play. Though one can imagine the visual similarities between ice cream and the round-shaped presentation of the rice in the images (which is common in many Asian countries), one cannot help but feel that ice cream was simply better represented in the training data.

As always, many labels were also too general or irrelevant.

My recommendation

  • Provide more specific and relevant labels for “BBQ” and “[Cup of] BBQ Sauce”;
  • Fix (cultural) misrepresentations (i.e. rice is not ice cream);
  • Understand the limits of how well an IR system can distinguish between similar dishes of different countries without further context or input, and what consequences this limit could have.
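For readers who want to replicate this kind of scoring themselves, a naive sketch of matching one system’s labels against the ground-truth items (case-insensitive substring matching; a simplification of my own, not the exact method behind the numbers above):

```python
def count_correct(ground_truth, predictions):
    """Count ground-truth items that overlap (as substrings) with any prediction."""
    preds = [p.lower() for p in predictions]
    hits = 0
    for item in ground_truth:
        if any(item.lower() in p or p in item.lower() for p in preds):
            hits += 1
    return hits

# Toy example, loosely modeled on the tables below
truth = ["White Rice", "BBQ", "Cup of BBQ Sauce"]
vision = ["dessert", "tableware", "rice"]
print(count_correct(truth, vision))  # 1 ("rice" matches "White Rice")
```

Note that substring matching is generous: it would count “Rice” as a hit for “White Rice”, which a stricter scorer might reject.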

Results

Three images of one meal from the Philippines were available:

  • Meal 1: White rice and BBQ (dinner)

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
White Rice Undetected Dessert (0.57) Ice Cream (0.96) /
BBQ Undetected Tableware (0.51) Undetected /
Cup of BBQ Sauce Bowl (0.61) Tableware (0.88) Undetected /

*Green = the right prediction; Yellow= the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (0.98) Food (0.98) Ice Cream (0.96) Ice Cream or Frozen Yoghurt (0.9)
person (0.94) Tableware (0.96) Dessert (0.96) dessert (0.9)
cuisine (0.93) Ingredient (0.91) Cream (0.96) nutrition (0.9)
snack (0.91) Suya (0.9) Creme (0.96) food (0.9)
fast food (0.9) Dishware (0.89) Food (0.96) Ice Cream Parlor (0.8)
dish (0.89) Plate (0.88) Meal (0.92) shop (0.8)
dairy (0.85) Shashlik (0.88) Person (0.88) retail store (0.8)
indoor (0.82) Recipe (0.87) Human (0.88) building (0.8)
table (0.78) Anticuchos (0.86) Bowl (0.85) chocolate color (0.61)
  Cuisine (0.85) Dish (0.8) dark red color (0.57)
  Dish (0.84) Dish (0.8)  
  Brochette (0.84)    
  Sate kambing (0.83)    
  Fried food (0.75)    
  Beef (0.72)    
  Meat (0.72)    
  Cooking (0.72)    
  Produce (0.72)    
  Churrasco food (0.71)    
  Bowl (0.7)    
  Platter (0.69)    
  Fork (0.69)    
  Buffalo wing (0.67)    
  Finger food (0.66)    
  Comfort food (0.65)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Rice Undetected Ice Cream (0.69) Ice Cream (0.94) /
BBQ Undetected Food (0.71) Undetected /
Cup of BBQ Sauce Bowl (0.53) Tableware (0.84) Undetected /

*Green = the right prediction; Yellow= the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (0.99) Food (0.98) Ice Cream (0.94) Ice Cream or Frozen Yogurt (0.89)
cuisine (0.92) Tableware (0.95) Dessert (0.94) dessert (0.89)
dairy (0.91) White rice (0.93) Cream (0.94) nutrition (0.89)
ice cream (0.89) Ingredient (0.9) Creme (0.94) food (0.89)
person (0.62) Recipe (0.88) Food (0.94) alizarine red color (0.84)
  Plate (0.87) Meal (0.89) Ice Cream Parlor (0.62)
  Jasmine rice (0.87) Plant (0.77) shop (0.62)
  Staple food (0.85) Dish (0.73) retail store (0.62)
  Sate kambing (0.84) Outdoors (0.59) building (0.62)
  Shashlik (0.84)    
  Rice (0.84)    
  Brochette (0.83)    
  Glutinous rice (0.82)    
  Anticuchos (0.81)    
  Fork (0.81)    
  Suya (0.8)    
  Dish (0.78)    
  Nasi lemak (0.77)    
  Steamed rice (0.76)    
  Produce (0.76)    
  Basmati (0.75)    
  Cuisine (0.73)    
  Meat (0.72)    
  Fried food (0.71)    
  Comfort food (0.7)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Rice Undetected Food (0.6) Ice Cream (0.81) /
BBQ Undetected Undetected Undetected /
Cup of BBQ Sauce Bowl (0.66) Bowl (0.88) Undetected /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
snack (0.9) Food (0.99) Person (0.98) building (0.82)
dairy (0.89) Brochette (0.95) Human (0.98) food (0.8)
fast food (0.88) Suya (0.94) Ice Cream (0.81) shop (0.73)
person (0.84) Tableware (0.94) Dessert (0.81) retail store (0.73)
food (0.82) Ingredient (0.91) Cream (0.81) chestnut color (0.73)
indoor (0.61) Anticuchos (0.91) Creme (0.81) deli (0.64)
  Recipe (0.88) Food (0.81) restaurant (0.55)
  Shish taouk (0.88) Meal (0.65) bakery (0.5)
  White rice (0.87)    
  Shashlik (0.87)    
  Sate kambing (0.85)    
  Pincho (0.84)    
  Satay (0.84)    
  Dish (0.83)    
  Arrosticini (0.83)    
  Rice (0.82)    
  Cuisine (0.82)    
  Souvlaki (0.81)    
  Plate (0.81)    
  Churrasco food (0.8)    
  Cooking (0.79)    
  Jasmine rice (0.79)    
  Yakitori (0.77)    
  Kebab (0.77)    
  Produce (0.75)    

All Sausage, but no Spring Roll: Canada

Image recognition (IR) systems often perform poorly once in the real world. In this post, I test four of the most popular IR systems on original real world images of food from around the world, this time from Canada.

Key takeaway

Overall, the systems performed very poorly. No images were correctly described, nor were any items in the images correctly detected. Some labels such as “Rice” and “White Rice” described an item in the images, but in general the labels remained superficial. Many potentially harmful labels were found.

Correctly predicted images 0/5
Correctly detected items 0/10
Correct labels 5/89
Potentially harmful detections/labels 9
The above table includes only detections and labels of 80%+ confidence level; for lower confidence levels see the tables further below.

Insights

Object Detection

Across the two meals, the object detection systems performed very disappointingly. Most items were described with the general terms “Food” or “Bowl”, while two descriptions were completely off: both the crab cakes and the Kadyos were detected as “Ice Cream” by Rekognition, while the sliced melon was recognized as “Banana”. Though difficult to verify, these mistakes might indicate a (cultural) misrepresentation in the training data (i.e. more images of bananas than melons).

Labeling

Labeling remained surface level as well. General labels such as “Food”, “Dish” and “Plate” were common, as well as irrelevant labels such as “Ingredient” and “Recipe”. As is, the value of these labels is questionable, though this of course depends on the applications of use.

More specific (though still somewhat general) labels were also found, including “Stew”, “Brown Sauce” and “Comfort Food”. As in previous cases, descriptions such as “Rice” and “White Rice” were plentiful, as one image clearly depicted rice.

Unfortunately, the systems also gave many potentially harmful or simply wrong descriptions. In one instance, a meal was described as “German Food”. In other instances, the meals were described as typical dishes of other origins (e.g. “Semur” [an Indonesian dish], “Dumpling”, “Varenyky”) rather than Canadian/Filipino.

Azure seemed to be thoroughly confused by the image of three spring rolls, giving it 18 different sausage descriptions (e.g. “Bratwurst”, “Loukaniko”, etc.) and not a single one of “Spring Rolls”. Rekognition also described the spring rolls as “Hot Dog” with 0.99 confidence.

As in the analyses of previous countries, types of meat were often used interchangeably, with “Pork”, “Beef”, “Chicken meat” and “Clam” all given as labels for the same items. Depending on the context, this could be detrimental to people whose religion or diet avoids certain meats.

My recommendation

  • Provide more specific and relevant labels for “Kadyos”, “Sliced Melon”, “Crab Cake” and “Spring Rolls”;
  • Fix (cultural) misrepresentations (i.e. crab cakes are not ice cream, sliced melon is not a banana, spring rolls are not sausages);
  • Make sure labels of meat do not harm people of certain religions or with certain diets.
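Until the vendors address the last point, users of these APIs could approximate it on their side by flagging meat-related labels for human review. A sketch with a hand-picked (and surely incomplete) term list of my own:

```python
# Terms whose presence in a label warrants a human check (my own list)
MEAT_TERMS = {"pork", "beef", "chicken", "clam", "wurst", "sausage", "hot dog"}

def flag_meat_labels(labels):
    """Return labels that mention a meat term and so deserve manual review."""
    return [l for l in labels if any(term in l.lower() for term in MEAT_TERMS)]

labels = ["Fast food", "Bratwurst", "Pork steak", "Chicken meat", "Plate"]
print(flag_meat_labels(labels))  # ['Bratwurst', 'Pork steak', 'Chicken meat']
```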

Results

Five images of two different meals from Canada (with a Filipino origin) were available:

  • Meal 1: White rice, Kadyos (beans, jackfruit, pork [usually beef]), Sliced Melon (lunch)
  • Meal 2: Crab Cake, Spring Rolls (filled with crab meat), Sliced Melon (snack)

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Kadyos Food (0.68) Food (0.68) Ice Cream (0.85) /
Sliced Melon Undetected Food (0.64) Undetected /

*Green = the right prediction; Yellow= the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (0.97) Food (0.99) food (0.97) chocolate color (0.94)
plate (0.96) Tableware (0.92) Dish (0.99) food (0.74)
indoor (0.89) Ingredient (0.91) Meal (0.99) light brown color (0.7)
brown sauce (0.76) Staple food (0.86) Food (0.99) nutrition (0.67)
mole sauce (0.73) Stew (0.86) Plant (0.89) dish (0.67)
chocolate (0.67) Cuisine (0.84) Ice Cream (0.85) beef Bourguignonne (0.58)
braising (0.61) Recipe (0.84) Dessert (0.85) restaurant (0.55)
recipe (0.58) Dish (0.84) Cream (0.85) building (0.55)
steak sauce (0.51) Brown sauce (0.83) Creme (0.85) food product (0.5)
  Produce (0.78) Soup Bowl (0.74) bourguignon (0.5)
  Soup (0.76) Sliced (0.59)  
  Kitchen appliance (0.74) Gravy (0.59)  
  Gravy (0.73) Clam (0.59)  
  Meat (0.73) Animal (0.59)  
  Curry (0.68) Sea Life (0.59)  
  Condiment (0.66) Invertebrate (0.59)  
  Stock (0.66) Seashell (0.59)  
  Semur (0.62) Breakfast (0.57)  
  Comfort food (0.61) Stew (0.56)  
  Braising (0.61)    
  Chasseur (0.6)    
  Mole sauce (0.59)    
  Hoisin sauce (0.58)    
  German food (0.57)    
  Boiled beef (0.55)    

Object detection results.

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Kadyos Bowl (0.5) Food (0.74) Ice Cream (0.6) /
Rice Undetected Food (0.69) Undetected /

*Green = the right prediction; Yellow= the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (1) Food (0.97) food (0.97) food (0.96)
plate (0.98) Ingredient (0.89) Rice (0.98) dish (0.88)
rice (0.98) Staple food (0.88) Vegetable (0.98) nutrition (0.88)
jasmine rice (0.84) Recipe (0.88) Food (0.98) alabaster color (0.8)
glutinous rice (0.76) Jasmine rice (0.87) Ice Cream (0.6) Chicken Curry (0.73)
white rice (0.74) Rice (0.87) Dessert (0.6) rice (0.66)
basmati (0.74) Cuisine (0.81) Cream (0.6) grain (0.66)
steamed rice (0.7) Dish (0.79)   food product (0.66)
white (0.65) White rice (0.79)   beige color (0.63)
arborio rice (0.59) Tableware (0.71)   white rice (0.6)
  Steamed rice (0.71)    
  Basmati (0.55)    
  Comfort food (0.53)    
  Plate (0.51)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Sliced Melon Banana (0.52) Food (0.55) Undetected /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
food (0.98) Food (0.97) food (0.97) lemon yellow color (0.85)
plate (0.91) Tableware (0.93) Plant (0.92) nutrition (0.81)
fast food (0.88) Dishware (0.9) Fruit (0.71) food (0.81)
banana (0.76) Ingredient (0.86) Food (0.71) plant (0.8)
fruit (0.75) Recipe (0.83)   food product (0.8)
peel (0.58) Plate (0.77)   dish (0.77)
  Staple food (0.76)   custard (0.7)
  Produce (0.74)   beige color (0.69)
  Platter (0.72)   creme anglais (0.58)
  Cuisine (0.72)   creme caramel (0.55)
  Baked goods (0.72)    
  Jiaozi (0.71)    
  Junk food (0.7)    
  Dish (0.69)    
  Comfort food (0.67)    
  Fruit (0.65)    
  Finger food (0.63)    
  Delicacy (0.63)    
  Dumpling (0.62)    
  Serveware (0.62)    
  Gluten (0.6)    
  Peach (0.58)    
  Varenyky (0.58)    
  Cooking (0.58)    
  Fast food (0.57)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Crab Cake Food (0.6) Food (0.73) Ice Cream (0.56) /
Kadyos Bowl (0.84) Food (0.56) Undetected /
Sliced Melon Food (0.7) Food (0.71) Undetected /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
plate (0.99) Food (0.98) food (0.97) food (0.89)
food (0.98) Tableware (0.96) Food (0.93) sandwich (0.89)
dessert (0.8) Ingredient (0.89) Meat Loaf (0.57) dish (0.89)
baked goods (0.79) Plate (0.89) Pottery (0.57) nutrition (0.89)
  Recipe (0.88) Ice Cream (0.56) food product (0.79)
  Cuisine (0.86) Dessert (0.56) reddish orange color (0.78)
  Dish (0.83) Cream (0.56) chili dog (0.71)
  Pork steak (0.8) Creme (0.56) Sloppy Joe (0.67)
  Fried food (0.78) Pork (0.56) reddish brown color (0.66)
  Roasting (0.78)   Pulled Pork Sandwich (0.5)
  Staple food (0.77)    
  Produce (0.77)    
  Fast food (0.76)    
  Meat (0.74)    
  Finger food (0.72)    
  Beef (0.72)    
  Comfort food (0.7)    
  Chicken meat (0.69)    
  Deep frying (0.68)    
  Junk food (0.68)    
  Baked goods (0.68)    
  Pork (0.65)    
  Serveware (0.64)    
  Bread (0.63)    
  Cooking (0.63)    

Object Detection Results:

GROUND TRUTH MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
Spring Rolls Hot Dog (0.58) Food (0.71) Hot Dog (0.99) /
Sliced Melon Undetected Food (0.56) Undetected /

Labeling Results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOG. IBM WATSON
fast food (0.98) Food (0.98) Hot Dog (0.99) alabaster color (0.95)
food (0.95) Tableware (0.96) Food (0.99) dish (0.63)
indoor (0.91) Ingredient (0.89) Bun (0.59) nutrition (0.63)
plate (0.9) Plate (0.89) Bread (0.59) food (0.63)
bratwurst (0.8) Recipe (0.88)   food product (0.6)
bread (0.79) Cuisine (0.86)   bouillabaisse (0.56)
diot (0.78) Dish (0.83)   stew (0.56)
sausage (0.77) Pork steak (0.8)   bathtub (0.51)
bockwurst (0.77) Fried food (0.78)   sashimi (raw fish) (0.5)
thuringian sausage (0.77) Roasting (0.78)   footbath (0.5)
cervelat (0.77) Staple food (0.77)    
kielbasa (0.77) Produce (0.77)    
knackwurst (0.76) Fast food (0.76)    
loukaniko (0.74) Meat (0.74)    
andouille (0.72) Finger food (0.72)    
longaniza (0.71) Beef (0.72)    
frankfurter würstchen (0.7) Comfort food (0.7)
debrecener (0.67) Chicken meat (0.69)    
boudin (0.67) Deep frying (0.68)    
breakfast sausage (0.67) Junk food (0.68)    
italian sausage (0.67) Baked goods (0.68)    
morteau sausage (0.63) Pork (0.65)    
chistorra (0.6) Serveware (0.64)    
seafood (0.6) Bread (0.63)    
cumberland sausage (0.59) Cooking (0.63)

To Gluten or not to Gluten: The US

Image recognition (IR) systems often perform poorly once in the real world. In this post, I test four of the most popular IR systems on original real world images of food from around the world, this time from the US.

Key takeaway

Overall, the systems’ performances were disappointing. “Bread” appeared somewhat easy to detect, though “Soup” was not. The labeling was a bit more generous and did also pick up the soup, though specific descriptions remained elusive. Finally, one wonders if labels such as “Gluten” and “Sugar” can be determined simply from a picture of a meal, and what the consequences could be of including these labels.

Correctly predicted images 0/1
Correctly detected items 0/3
Correct labels 1/19
Potentially harmful detections/labels 0
The above table includes only detections and labels of 80%+ confidence level; for lower confidence levels see the tables further below.

Insights

The object detection features performed poorly on the selected image, though Azure and Rekognition did detect “Baked Goods” and “Bread”. Though the bagel itself was not specifically detected, the image’s perspective would perhaps lead many humans to say “bread” instead of “bagel” as well. However, Azure and Vision detected the soup simply as “Food” (it remained undetected by Rekognition and Watson), which is a very general description for a fairly typical part of a Western meal.

The labeling features again performed better. “Soup” was labeled, as well as “Bread”, “Baked goods”, “Bun” and “Seed”. Unfortunately, many labels were also too general (e.g. “Food”, “Meal”, “Dish”, etc.), irrelevant (e.g. “Recipe”, “Nutrition”, etc.), wrong (e.g. “Cake”, “Chocolate”, “Mole”, etc.), or a cultural misrepresentation (e.g. “Curry”).

Lastly, labels such as “Gluten” and “Sugar” are perhaps not wrong, but they are hard to discern from an image (i.e. there are also gluten- and sugar-free bagels). As these labels could have a strong impact on people with certain diets, this leads one to wonder whether such labels should be present in IR systems at all.

My recommendation

As always, the developers should implement more specific and relevant labels. Bread seems to be recognized well, so the developers can be proud of that. Developers should also look into labels such as “Gluten” and “Sugar” and whether these can actually be recognized from a picture of a meal.
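As a stopgap on the consumer side, such diet-claim labels could simply be dropped before the results are shown to anyone. A sketch, with an “unverifiable” list that is entirely my own:

```python
# Labels that arguably cannot be verified from pixels alone (my own list)
UNVERIFIABLE = {"gluten", "sugar", "organic", "vegan"}

def drop_unverifiable(labels):
    """Filter out labels that make claims an image cannot support."""
    return [(name, conf) for name, conf in labels
            if name.lower() not in UNVERIFIABLE]

labels = [("Bread", 0.93), ("Gluten", 0.60), ("Baked Goods", 0.99)]
print(drop_unverifiable(labels))  # [('Bread', 0.93), ('Baked Goods', 0.99)]
```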

Results

An image of one meal from the US was available:

  • Meal 1: New York “Everything” Bagel and Tomato Soup (Lunch)
Object detection results*:
Ground Truth Microsoft Azure Google Vision Amazon Rekognition IBM Watson
Cup of Soup Food (0.53) Food (0.73) Undetected /
Spoon Undetected Undetected Undetected /
"Everything" Bagel Baked Goods (0.77) Food (0.63) Bread (0.99) /

*Green = the right prediction; Yellow= the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling results:
MICROSOFT AZURE GOOGLE VISION AMAZON REKOGNITION IBM WATSON
Baked Goods (0.99) Food (0.99) Bread (0.99) Light Brown Color (0.85)
Food (0.97) Ingredient (0.91) Food (0.99) Food (0.71)
Dessert (0.93) Tableware (0.88) Bun (0.88) Reddish Orange Color (0.76)
Bread (0.93) Recipe (0.87) Bowl (0.64) Nutrition (0.58)
Chocolate (0.83) Cuisine (0.86) Food Product (0.56)
Recipe (0.82) Dish (0.85) Sauce (0.56)
Cake (0.79) Seed (0.81) Condiment (0.56)
Delicious (0.78) Staple Food (0.80) Food Seasoning (0.56)
Fast Food (0.75) Produce (0.80) Food Ingredient (0.56)
Ingredient (0.60) Soup (0.79) Mole (0.54)
Gluten (0.60) Bun (0.78)
Staple Food (0.54) Gravy (0.78)
Gravy (0.52) Cake (0.76)
Dish (0.51) Gluten (0.76)
Bread (0.74)
Stew (0.73)
Curry (0.72)
Baked Goods (0.71)
Sugar (0.69)
Bowl (0.69)
Bread Roll (0.68)
Finger Food (0.67)
Baking (0.67)
Brown Bread (0.65)
Comfort Food (0.65)

This is not India: Yemen

Image recognition (IR) systems often perform poorly once in the real world. In this post, I test four of the most popular IR systems on original real world images of food from around the world, this time from Yemen.

Key takeaway

Overall, the systems’ performances were disappointing. In one image, “Rice” was labeled by multiple systems (though not detected), but not in the other, where the rice was much more visible. Labels were often too general or irrelevant. Some labels culturally misrepresented the Yemeni food and utensils as Indian or Western. One mislabeling instance was found that could disadvantage people with certain diets or religions.

Correctly predicted images 0/2
Correctly detected items 0/8
Correct labels 3/35
Potentially harmful detections/labels 0
The above table includes only detections and labels of 80%+ confidence level; for lower confidence levels see the tables further below.

Insights

The object detection feature failed to provide specific descriptions of the objects across all four systems. The descriptions that were given remained surface level (e.g. “Bowl” instead of “Bowl of Chicken”, or “Bowl” instead of “Cup of Vegetable Sauce”) and many objects simply remained undetected.

Interestingly, the labeling features of Azure and Vision gave Indian-origin labels (e.g. “Masala”) to the first image. One explanation for this could be the plate on which the meal is served, which is also commonly used in Indian cuisine (a “Thali”). Perhaps the prevalence of Indian meals in the training images (e.g. due to a larger population, more common use of English, etc.) could contribute to this misrepresentation of a Yemeni meal as Indian food.

Several cultural misrepresentations were present as well. For instance, Rekognition labeled (presumably) the bowl of rice as “Oatmeal” and IBM Watson labeled the food as “Irish Stew”. These misrepresentations should make us question whether Yemeni food was represented enough in the training images.

The systems provided more correct labels on the second image (e.g. “Chicken”, “Rice”, and “Carrot”). This is an interesting outcome because, in the first image, all the items are separated into different bowls. One could assume that separating items helps the systems distinguish between them, but this was not the case. It will be interesting to see if this happens for pictures from other countries as well, as serving food as separate items is common in many Asian countries and less so in Western countries.

Finally, in one instance, Vision labeled the food as “Seafood”, which could disadvantage people with certain diets (e.g. pescatarian) or religions.

My recommendation

As stated in the analyses of previous countries, the object detection features need to become better at detecting all the objects and at giving more specific descriptions. The latter is similarly true for the labeling features. Also, further attention is needed towards the idea that the recognition of Yemeni food is influenced by Indian training images. Developers should also be careful that the presentation of Yemeni food and Asian food in general (i.e. as separate items) does not impact their system’s performance. Vision appears to be very good at detecting “Carrot” (see also Vietnam), so congratulations to the developers for that.

Results

Two images of one meal from Yemen were available:

  • Meal 1: Rice, Cooked Chicken, Raw Vegetables, Vegetable Sauce (Lunch)
Object detection results*:
Ground Truth Microsoft Azure Google Vision Amazon Rekognition IBM Watson
Bowl of Rice Undetected Bowl (0.87) Undetected /
Bowl of Chicken Bowl (0.58) Food (0.78) Undetected /
Cup of Vegetable Sauce Undetected Bowl (0.80) Undetected /
Plate Tableware (0.63)

*Green = the right prediction; Yellow= the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling results:
MICROSOFT AZURE GOOGLE VISION AMAZON REKOGNITION IBM WATSON
Plate (0.99) Food (0.98) Bowl (0.97) Chestnut color (0.68)
Table (0.98) Tableware (0.97) Breakfast (0.91) Food (0.65)
Food (0.98) Dishware (0.88) Food (0.91) Orange Color (0.62)
Indoor (0.86) Ingredient (0.88) Produce (0.76) Beverage (0.60)
Mixture (0.72) Recipe (0.88) Meal (0.71) Nutrition (0.59)
Bowl (0.70) Cuisine (0.84) Dish (0.70) Dish (0.58)
Masala (0.66) Dish (0.81) Plant (0.67) Bowl (0.55)
Spoon (0.54) Staple Food (0.81) Oatmeal (0.60) Tableware (0.55)
Tableware (0.50) Bowl (0.79) Utensil (0.55)
Mixture (0.77) Slop Bowl (0.52)
Produce (0.75)
Serveware (0.70)
Comfort Food (0.69)
Masala (0.69)
Spoon (0.68)
Kitchen Utensil (0.68)
Metal (0.67)
Rice (0.67)
Mixing Bowl (0.66)
Breakfast (0.62)
Gravy (0.61)
South Indian Cuisine (0.61)
Tandoori Masala (0.59)
Side Dish (0.59)
Plate (0.59)

Object detection results:

Ground Truth Microsoft Azure Google Vision Amazon Rekognition IBM Watson
Rice Undetected Food (0.76) Undetected /
Chicken Undetected / Undetected /
Raw Vegetables Food (0.52) Carrot (0.66) Undetected /

Labeling results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOGNITION IBM WATSON
Food (0.90) Food (0.98) Plant (0.99) Reddish Orange Color (0.97)
Plate (0.88) Tableware (0.93) Dish (0.96) Dish (0.89)
Chicken (0.72) Ingredient (0.90) Meal (0.96) Nutrition (0.89)
Dinner (0.65) Recipe (0.87) Food (0.96) Food (0.89)
White Rice (0.60) White Rice (0.85) Vegetable (0.88) Stew (0.85)
Steamed Rice (0.59) Rice (0.82) Rice (0.84) Food Product (0.80)
Recipe (0.55) Staple Food (0.78) Bowl (0.63) Tableware (0.79)
Glutinous Rice (0.52) Cuisine (0.78) Produce (0.56) Bouillabaisse (0.71)
Dish (0.76) Curry (0.56) Irish Stew (0.68)
Produce (0.76) Curry (0.50)
Meat (0.75)
Vegetable (0.75)
Seafood (0.75)
Stew (0.73)
Jasmine Rice (0.70)
Fast Food (0.69)
Plate (0.68)
Comfort Food (0.68)
Carrot (0.65)
Baby Carrot (0.64)
Gosht (0.64)
Mixture (0.63)
Brassicales (0.63)
Take-out Food (0.63)
Leaf Vegetable (0.62)

A bit more Wine, Please: Germany

Image recognition (IR) systems often perform poorly once in the real world. In this post, I test four of the most popular IR systems on original real world images of food from around the world, this time from Germany.

Key takeaway

Overall, the systems’ performances were disappointing. On the positive side, Amazon Rekognition did correctly label “Alcohol”, which is great because of alcohol’s sensitive nature in some contexts. Other than that, the systems failed to detect many objects, failed to provide specific labels, and presented many irrelevant labels.

Correctly predicted images 0/2
Correctly detected items 1/15
Correct labels 1/38
Potentially harmful detections/labels 0
The above table includes only detections and labels of 80%+ confidence level; for lower confidence levels see the tables further below.

Insights

The object detection feature failed to provide specific descriptions of the objects across all four systems. The descriptions that were given remained surface-level (e.g. “Food” instead of “Minestrone”, or “Bottle” instead of “Bottle of Wine”), and many objects simply went undetected. In the case of Vision, the object detection bounding boxes were all scrambled and unintelligible (see picture below); it is unclear why they showed up like this.

As was the case for the previous countries, the labeling feature performed slightly better, but still unsatisfactorily. Amazon Rekognition provided the labels “Alcohol” and “Wine”, which, given the sensitive nature of alcohol in certain contexts, works well. Unfortunately, the other three systems failed to provide these labels.

Except for “Alcohol” and “Wine”, the other labels remained very unspecific and uninformative: “Food”, “Plate”, “Dish”, “Kitchen Utensil”, and the like were common. None of the systems detected the olives, the pesto, or the Minestrone. In one case, the Minestrone was labeled “Curry”, which could be considered a (cultural) misrepresentation.

My recommendation

As with the previous analyses, both the object detection and labeling features need much improvement. For object detection, this means detecting the various items in the first place, and giving more specific descriptions in the second. The labeling feature of Amazon Rekognition correctly identified “Alcohol” and “Wine”, both sensitive items in certain contexts; the other three systems would perhaps do well to also implement the identification of alcohol.
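Part of why a reliable “Alcohol” label matters is that it enables simple downstream checks, for instance warning users with dietary or religious restrictions. A hypothetical sketch, assuming label names arrive as plain strings; the `SENSITIVE` set and `flag_sensitive` helper are illustrative, not part of any system's API:

```python
# Labels a downstream application might treat as sensitive (illustrative set)
SENSITIVE = {"Alcohol", "Wine", "Beer"}

def flag_sensitive(labels):
    """Return, sorted, the sensitive labels present among a system's outputs."""
    return sorted(set(labels) & SENSITIVE)

# Amazon Rekognition's labels for the second Germany image include all three
labels = ["Dish", "Meal", "Food", "Spoon", "Alcohol", "Beverage", "Wine", "Beer"]
print(flag_sensitive(labels))  # ['Alcohol', 'Beer', 'Wine']
```

A check like this only works if the labeling system emits the sensitive label in the first place, which is exactly what three of the four systems failed to do here.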

Results

Two images of one meal from Germany were available:

  • Meal 1: Minestrone with Pesto and Olives, and a glass of Wine (Dinner)
Object detection results*:
Ground Truth | Microsoft Azure | Google Vision | Amazon Rekognition | IBM Watson
Bowl of Minestrone | Bowl (0.66) | Scrambled* | Undetected | /
Bowl of Pesto | Bowl (0.58) | Scrambled* | Undetected | /
Spoon | Undetected | Scrambled* | Undetected | /
Glass of wine | Undetected | Scrambled* | Undetected | /
Cup of Olives | Undetected | Scrambled* | Undetected | /
Wine Bottle | Bottle (0.56) | Scrambled* | Undetected | /
Spoon | Undetected | Scrambled* | Undetected | /

*Green = the right prediction; Yellow = the right prediction, but too general; Red = potentially harmful prediction; White = largely not relevant

Labeling results:
MICROSOFT AZURE GOOGLE VISION AMAZON REKOGNITION IBM WATSON
Food (0.98) Food (0.98) Dish (0.95) Olive Green Color (0.80)
Plate (0.89) Tableware (0.97) Meal (0.95) Orange Color (0.66)
Ingredient (0.90) Food (0.95) Food (0.65)
Dishware (0.89) Plant (0.73) Dish (0.65)
Recipe (0.88) Pottery (0.68) Nutrition (0.65)
Serveware (0.85) Alcohol (0.58) Tableware (0.60)
Kitchen Utensil (0.84) Beverage (0.58) Side Dish (0.59)
Cuisine (0.83) Drink (0.58) Curry (0.51)
Dish (0.82) Garnish (0.50)
Rectangle (0.80)
Vegetable (0.78)
Leaf Vegetable (0.78)
Soup (0.77)
Produce (0.74)
Mixture (0.74)
Curry (0.70)
Comfort Food (0.69)
Spoon (0.69)
Yellow Curry (0.68)
Cutlery (0.68)
Circle (0.68)
Garnish (0.68)
Condiment (0.67)
Plate (0.65)
Stew (0.65)

Object detection results:

Ground Truth | Microsoft Azure | Google Vision | Amazon Rekognition | IBM Watson
Bowl of Minestrone | Undetected | Food (0.78) | Undetected | /
Bowl of Minestrone | Bowl (0.75) | Food (0.68) | Undetected | /
Bowl of Pesto | Kitchen Utensil (0.52) | Food (0.55) | Undetected | /
Spoon | Undetected | Undetected | Undetected | /
Glass of wine | Undetected | Undetected | Undetected | /
Glass of wine | Cup (0.58) | Undetected | Undetected | /
Wine Bottle | Bottle (0.84) | Packaged Goods | Beer | /
Spoon | Undetected | Undetected | Spoon | /

Labeling results:

MICROSOFT AZURE GOOGLE VISION AMAZON REKOGNITION IBM WATSON
Table (0.99) Food (0.99) Dish (0.99) Nutrition (0.74)
Plate (0.99) Tableware (0.97) Meal (0.99) Food (0.74)
Food (0.99) Table (0.95) Food (0.99) Charcoal Color (0.62)
Indoor (0.98) Bottle (0.94) Spoon (0.92) Dish (0.60)
Wall (0.98) Dog (0.93) Cutlery (0.92) Piece de Resistance Dish (0.59)
Bottle (0.90) Plate (0.90) Alcohol (0.74) Table (0.55)
Drink (0.75) Dishware (0.90) Beverage (0.74) Furniture (0.55)
Tableware (0.65) Ingredient (0.89) Drink (0.74) Dining Table (0.54)
Counter (0.60) Recipe (0.87) Person (0.67) Dinner Table (0.53)
Dish (0.53) Houseplant (0.85) Human (0.67) Plate (0.52)
Leaf vegetable (0.83) Stew (0.65)
Kitchen Utensil (0.78) Beer (0.62)
Vegetable (0.78) Pasta (0.60)
Cooking (0.77) Curry (0.58)
Cuisine (0.76) Glass (0.58)
Companion Dog (0.75) Wine (0.58)
Broccoli (0.75) Restaurant (0.55)
Produce (0.74)
Serveware (0.74)
Garnish (0.74)
Dish (0.73)
Bowl (0.73)
Comfort Food (0.71)
Fork (0.70)
Culinary Art (0.70)