The relationship between food and language can be fascinating. The idiosyncrasies of a culture’s cuisine are often reflected in its vocabulary, and it is common for food words in one language to lack direct translations into other languages due to the uniqueness of cuisines around the world. Of course, a culture can adopt other foods, ingredients, and techniques, but when it does, the names are often borrowed along with the concepts themselves. Sushi, for example, is a wanderword, with examples such as Cherokee ᏑᏏ (susi), Azerbaijani suşi, and Bengali সুশি (suśi). (Wanderword is a term linguists use to refer to words borrowed into many different languages. “Taxi” is another good example: most languages in the world call taxis “taxi” or something very similar.)
There are also times in which two cultures end up creating a similar dish independently—a “convergent evolution of cuisine”, perhaps. A good example of this is the Persian dish ته دیگ (tah dig, literally “pot bottom”) and the Chinese dish 鍋巴 (Mandarin guō bā, Cantonese wo1 baa1, literally “pot crust”). These both refer to the layer of rice that gets crunchy at the bottom of a pot of rice, an important part of both Persian and Chinese cuisines. While there is a common, unambiguous term for it in both Persian and Chinese, there is no such term in English.
This makes translation between Persian ته دیگ and Chinese 鍋巴 a very interesting test case for PanLex. The PanLex Database contains data derived from thousands of multilingual dictionaries, but one of its main strengths is inferred translations—translations that are not directly attested in any of our sources, but can be inferred via translations through intermediate languages. Many translation engines attempt this strategy, often relying on a single “pivot language”, typically English.
However, in this case English does not have a standard term for the rice stuck to the bottom of the pot. Dictionaries translating from English into another language would therefore not include a word with this meaning. Dictionaries translating into English might include explanatory definitions of equivalent terms, but these could take many forms. For example, a Persian-English dictionary could translate ته دیگ as “a crust of rice that forms at the bottom of a rice pot” and a Chinese-English dictionary could translate 鍋巴 as “the rice that sticks to the bottom of the pan”. It would be difficult or impossible for a translation engine to recognize that these phrases are referring to the same concept and thus useful as a pivot between Persian and Chinese. An inferred translation cannot be made using English as the pivot language.
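The failure can be sketched in a few lines of Python. This is a toy illustration, not how any particular translation engine works: it assumes a naive engine that pivots only on identical English strings. The two glosses are the hypothetical dictionary entries quoted above; the "rice" entries and the `pivot_matches` helper are added here purely for contrast.

```python
# Toy Persian-English and Chinese-English dictionaries (illustrative data only).
persian_to_english = {
    "برنج": {"rice"},
    "ته دیگ": {"a crust of rice that forms at the bottom of a rice pot"},
}
chinese_to_english = {
    "米": {"rice"},
    "鍋巴": {"the rice that sticks to the bottom of the pan"},
}

def pivot_matches(source_word, source_dict, target_dict):
    """Return target words that share an identical English gloss with source_word."""
    source_glosses = source_dict[source_word]
    return {word for word, glosses in target_dict.items() if glosses & source_glosses}

# A word with a standard English term pivots cleanly.
print(pivot_matches("برنج", persian_to_english, chinese_to_english))

# The two explanatory phrases never match, so nothing is inferred.
print(pivot_matches("ته دیگ", persian_to_english, chinese_to_english))
```

The first call finds 米 because both dictionaries use the same English word. The second returns an empty set: the two descriptions mean the same thing, but nothing in the strings themselves tells the engine so.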
To avoid the problems that come with relying on a single pivot language, at PanLex we do not limit ourselves to Some-Language-to-English or English-to-Some-Language dictionaries; we actively seek out and extract data from sources such as Persian-Chinese or Telugu-Russian dictionaries. In the case of rice that sticks to the bottom of the pan, a direct Persian-Chinese translation lets the PanLex Database infer translations into the languages of other cultures that have this culinary concept, such as Japanese お焦げ (okoge). Here, the Persian-Japanese translation is inferred using Chinese as the pivot language. The PanLex Database’s inferred translation process is not restricted to a single pivot language; it draws on as many pivot languages as are available. PanLex’s strategy of collecting a wide variety of multilingual data aids this process.
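A minimal sketch of pivoting through any available language might look like the following. It is an assumption-laden toy, not the PanLex Database's actual inference procedure: the two edges stand in for hypothetical Persian-Chinese and Chinese-Japanese dictionary entries, the three-letter codes are simply used as language labels, and `infer` and `max_hops` are names invented for this example.

```python
from collections import defaultdict, deque

# Toy set of attested translations, each a pair of (language, expression) nodes.
edges = [
    (("fas", "ته دیگ"), ("zho", "鍋巴")),   # hypothetical Persian-Chinese source
    (("zho", "鍋巴"), ("jpn", "お焦げ")),   # hypothetical Chinese-Japanese source
]

# Build an undirected graph: a translation can be followed in either direction.
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

def infer(source, target_lang, max_hops=2):
    """Breadth-first search for target_lang expressions within max_hops of source."""
    seen = {source}
    queue = deque([(source, 0)])
    found = set()
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor[0] == target_lang:
                found.add(neighbor[1])
            else:
                queue.append((neighbor, hops + 1))  # usable as a pivot
    return found

# Persian to Japanese is not attested directly, but Chinese serves as the pivot.
print(infer(("fas", "ته دیگ"), "jpn"))
```

Because the search follows whatever edges exist, no language is privileged as the pivot. Setting `max_hops=1` restricts the search to directly attested translations, in which case the Persian-Japanese lookup finds nothing.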
We should note that this strategy not only helps PanLex create more accurate translations, but also reflects our appreciation and support of linguistic diversity. To PanLex, non-English languages are not simply a curiosity or a barrier to overcome. Instead, they are integral to the connectivity between all languages that we envision. We believe the best approach to panlingual translation is not one that simply connects other languages to what is available in English, but one in which the words and concepts of all languages contribute to the connectivity, accuracy, and richness of the translation engine.