Diners, Drive-Ins and Menu Transcription

6/11/2023

Introduction

For my next adventure with SingularAgent, I was looking for some repetitive and boring work on the internet. The kind of work that artificial intelligence software would be a perfect fit for. I came across a website called Amazon Mechanical Turk. Amazon Mechanical Turk (MTurk) is a crowdsourcing marketplace that makes it easier for individuals and businesses to outsource their processes and jobs to a distributed workforce who can perform these tasks virtually. There is a wide variety of tasks available for workers to perform on the website. The tasks include: taking surveys, typing up receipts / PDFs / LinkedIn profiles, drawing bounding boxes around cars / people / mice in pictures, rating profile photos, and more. I found one task that was particularly suited to SingularAgent's current abilities. Today I am announcing that SingularAgent can read a restaurant menu off of a website and then type up the results.

Read Process

To start off, the restaurant website can't contain a menu that is an image or a link to a PDF. The menu actually needs to be stored in the HTML of the web pages. The more structured the data is, the easier it is for SingularAgent to parse it. The information a restaurant menu typically contains is exactly what you would expect: category, name, description (or ingredients), and price. Before I start the read process, I inspect the HTML code to see how the restaurant menu is structured.

The read process has the following parameters: url, parentClass, categoryXPath, categoryNameXPath, itemXPath, itemNameXPath, itemDescriptionXPath, itemPriceXPath, and groupIndex. The url parameter contains the address of the website with the restaurant menu. The parentClass parameter tells SingularAgent which parent class the menu lives under, so it doesn't need to copy the HTML of the entire page. SingularAgent uses an expression language called XPath to query (read) the web page. An example of an XPath expression is .//*[@class='item-name']. This expression selects all nodes descendant of the current node that have a class attribute with a value of "item-name". The categoryXPath expression selects each category element, and the categoryNameXPath selects the name within that category; the item XPath parameters work the same way for the individual menu items. The groupIndex parameter, 0 by default, is used when a menu is spread across multiple pages of a website. If the first page retrieved 50 menu items, then before I rerun the read process I change the groupIndex value to 50 and update the url to the next web page so that it doesn't overwrite the original menu items.
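To make the parameter set concrete, here is a minimal Python sketch of the same idea (SingularAgent's actual implementation is different; the class names in the sample menu and the helper function are invented for illustration). It uses the standard library's ElementTree, whose limited XPath support covers the [@class='...'] predicates used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical parameter set mirroring the read-process parameters above.
# The XPath values are invented to match the sample markup below.
params = {
    "categoryXPath": ".//*[@class='menu-category']",
    "categoryNameXPath": ".//*[@class='category-name']",
    "itemXPath": ".//*[@class='menu-item']",
    "itemNameXPath": ".//*[@class='item-name']",
    "itemDescriptionXPath": ".//*[@class='item-description']",
    "itemPriceXPath": ".//*[@class='item-price']",
}

# A tiny, already-valid XML snippet standing in for the cleaned-up menu HTML.
menu_xml = """
<div class="menu">
  <div class="menu-category">
    <span class="category-name">Appetizers</span>
    <div class="menu-item">
      <span class="item-name">Mozzarella Sticks</span>
      <span class="item-description">Breaded, with marinara</span>
      <span class="item-price">$7.99</span>
    </div>
  </div>
</div>
"""

def read_menu(xml_text, p):
    """Select category/name/description/price values with the XPath parameters."""
    root = ET.fromstring(xml_text)
    items = []
    for category in root.findall(p["categoryXPath"]):
        cat_name = category.find(p["categoryNameXPath"]).text
        for item in category.findall(p["itemXPath"]):
            items.append({
                "category": cat_name,
                "name": item.find(p["itemNameXPath"]).text,
                "description": item.find(p["itemDescriptionXPath"]).text,
                "price": item.find(p["itemPriceXPath"]).text,
            })
    return items

print(read_menu(menu_xml, params))
```

The nested loop mirrors the structure of most menu pages: categories first, then the items inside each category, so each item row carries its category name along with it.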

The read process starts by opening up Google Chrome and navigating to the website provided by the url parameter. Then the browser's Inspect tool is opened and the HTML of the web page is copied to the computer's clipboard. SingularAgent reads the HTML text off of the clipboard and converts it to valid XML. This step is necessary because the XmlDocument object doesn't allow empty tags such as <br> that have a start tag but no closing tag (</br> doesn't exist in HTML), so those tags have to be rewritten in self-closing form (<br/>). After the text is loaded into the XmlDocument object, the XPath expressions select the menu item values. The menu item values are then saved into files on the computer.
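The HTML-to-XML cleanup can be sketched as follows. This is a hedged Python approximation of the idea, not SingularAgent's code (which works against .NET's XmlDocument): a regex rewrites HTML void tags into self-closing form, which is good enough for a demo but is not a general HTML parser.

```python
import re
import xml.etree.ElementTree as ET

# HTML "void" tags that may legally appear without a closing tag.
VOID_TAGS = ("br", "hr", "img", "input", "meta", "link")

def html_to_xml(html):
    """Self-close void tags like <br> so the text parses as XML."""
    for tag in VOID_TAGS:
        # Rewrite e.g. "<br>" or "<img src='x'>" to the self-closing form.
        html = re.sub(r"<(%s)(\b[^>]*?)/?>" % tag, r"<\1\2/>", html)
    return html

snippet = "<div>Served hot<br>with fries</div>"
xml_text = html_to_xml(snippet)
root = ET.fromstring(xml_text)  # now parses without error
```

Without the rewrite, ET.fromstring (like XmlDocument) rejects the snippet, because XML requires every start tag to have a matching close.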

Type Process

First, SingularAgent determines how many menu items it needs to type in (based on the files saved earlier) and presses the Add Row button that many times minus ten (since the form starts with ten rows). Then it clicks into the first Menu Category field and starts entering the data. I quickly discovered that it is much faster for SingularAgent to save the text to the clipboard and paste it into each field than to type out each character individually. Once it finishes typing everything, the menu item files that were saved on the computer are deleted.

Demos

The YouTube videos of SingularAgent demoing these processes can be seen below.

Using artificial intelligence software for repetitive tasks is much more efficient than having a human do them; the human can instead review the work that the AI has completed for quality purposes. Together we can accomplish so much more.