Adapting Entities across Languages and Cultures

How would you explain Bill Gates to a German? He is associated with founding a company in the United States, so perhaps the German founder Carl Benz could stand in for Gates in those contexts. This type of translation is called adaptation in the translation community (Vinay and Darbelnet, 1995). Until now, this task has not been done computationally. Automatic adaptation could be used in natural language processing for machine translation and indirectly for generating new question answering datasets and education. We propose two automatic methods and compare them to human results for this novel NLP task. First, a structured knowledge base adapts named entities using their shared properties. Second, vector arithmetic and orthogonal embedding mappings identify better candidates, but at the expense of interpretable features. We evaluate our methods through a new dataset of human adaptations.


When Translation Misses the Mark
Imagine reading a translation from German: "I saw Merkel eating a Berliner from Dietsch on the ICE." This sentence is opaque without cultural context.
An extreme cultural adaptation for an American audience could render the sentence as "I saw Biden eating a Boston Cream from Dunkin' Donuts on the Acela", elucidating that Merkel holds a similar political post to Biden; that Dietsch (like Dunkin' Donuts) is a mid-range purveyor of baked goods; that both Berliners and Boston Creams are filled, sweet pastries named after a city; and that the ICE and Acela are slightly ritzier high-speed trains. Human translators make this adaptation when it is appropriate to the translation (Gengshen, 2003). Because adaptation is understudied, we leave the full translation task to future work. Instead, we focus on the task of cultural adaptation of entities: given an entity in a source culture, what is the corresponding entity in English? Most Americans would not recognize Christian Drosten, but the most efficient explanation to an American would be to say that he is the "German Anthony Fauci" (Loh, 2020). We provide top adaptations suggested by algorithms and humans for another American involved with the pandemic response, Bill Gates, in Table 1.
Can machines reliably find these analogs with minimal supervision? We generate these adaptations with structured knowledge bases (Section 3) and word embeddings (Section 4). We elicit human adaptations (Section 5) to evaluate whether our automatic adaptations are plausible (Section 5.3).

Wer ist Bill Gates?
We define cultural adaptation and motivate its application for tasks like creating culturally-centered training data for QA. Vinay and Darbelnet (1995) define adaptation as translation in which the relationship, not the literal meaning, between the receiver and the content needs to be recreated.
You could formulate our task as a traditional analogy, Drosten : Germany :: Fauci : United States (Turney, 2008; Gladkova et al., 2016), but despite this superficial resemblance (explored in Section 4), traditional approaches to analogy ignore the influence of culture and typically stay within a language. Yet analogies are tightly bound with culture; humans struggle with analogies outside their culture (Freedle, 2003).
We can use this task to identify named entities (Kasai et al., 2019; Arora et al., 2019; Jain et al., 2019) and to understand other cultures (Katan and Taibi, 2004).

…and why Bill Gates?
This task requires a list of named entities adaptable to other cultures. Our entities come from two sources: a subset of the top 500 most visited German/English Wikipedia pages and the Non-Official Characterization list (Veale, 2016, NOC), "a source of stereotypical knowledge regarding popular culture, famous people (real and fictional) and their trademark qualities, behaviours and settings". Wikipedia contains a plethora of singers and actors; we filter the top 500 pages to avoid a pop culture skew. We additionally select all Germans and a subset of Americans from the Veale NOC list, as it is human-curated, verified, and covers a broader historical period than popular Wikipedia pages. Like other semantic relationships (Boyd-Graber et al., 2006), adaptation is not symmetric. Thus, we adapt entities in both directions; while Berlin is the German Washington, DC, there is less consensus on what is the American Berlin, as Berlin is at once the capital, a tech hub, and a film hub. A full list of our entities is provided in Appendix D.

Adaptation from a Knowledge Base
We first adapt entities with a knowledge base. We use WikiData (Vrandečić and Krötzsch, 2014), a structured, human-annotated representation of Wikipedia entities that is actively developed. This resource is well-suited to the task as features are standardized both within and across languages.
Many knowledge bases explicitly encode the nationality of individuals, places, and creative works. Each entity in the knowledge base is a discrete sparse vector, where most dimensions are unknown or not applicable (e.g., a building does not have a spouse).
For example, Angela Merkel is a human (instance of), German (country of citizenship), politician (occupation), Rotarian (member of), Lutheran (religion), 1.65 meters tall (height), and has a PhD (academic degree). How would we find the "most similar" American adaptation to Angela Merkel? Intuitively, we should find someone whose nationality is American.
Some issues immediately present themselves; contemporary entities will have more non-zero entries than older entities. Some characteristics are more important than others: matching unique attributes like "worked as journalist" is more important than matching "is human".
Each entity in WikiData has "properties", which we can think of as the dimensions of a sparse vector, and "values" that those properties can take on. For example, Merkel has the properties "occupation" and "academic degree". Values for those properties are that her "occupation" is "politician" and her "academic degree" is a "doctorate". To match entities across cultures, we focus on matching properties rather than values; many of the values are only relevant inside a culture. For example, we cannot find American politicians who belong to the Christian Democratic Union, but we can find politicians who have an academic degree and a dissertation title.
As a toy example, suppose Beethoven, Merkel, and Bach each have only two properties: Beethoven has an "occupation" and a "genre", Merkel has an "Erdős number" and a "political party", and Bach has an "occupation" and a "genre". Then Beethoven and Bach have a distance of zero and are the closest entities, while Merkel has a distance of two, since {"Erdős number", "political party"} is two away from {"occupation", "genre"}.
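This toy distance is easy to verify directly; a minimal sketch with binary property-indicator vectors (the entities and properties mirror the example above):

```python
import math

# Indicator vectors over the properties
# ["occupation", "genre", "Erdős number", "political party"].
beethoven = [1, 1, 0, 0]
bach      = [1, 1, 0, 0]
merkel    = [0, 0, 1, 1]

def l2(u, v):
    """L2 distance; for binary vectors it measures mismatched properties."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

print(l2(beethoven, bach))    # 0.0: closest pair
print(l2(beethoven, merkel))  # 2.0: no shared properties
```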
First, we bifurcate WikiData into two sets: an American set A for items which contain the value "United States of America" and a German set D for those with German values. This is a liberal approximation, but it successfully excludes roughly seven out of the eight million items in WikiData. Then we explore the properties from WikiData. We create entity vectors with dimensions corresponding to frequently-occurring properties.
The properties are discrete and categorical; Merkel either has an "occupation" or she does not. Each entity then has a sparse vector. We calculate the similarity of the vectors with Faiss's L2 distance (Johnson et al., 2021) and for each vector in A find the closest vector in D and vice versa.
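As a sketch of this nearest-neighbor search with the faiss library (the toy vectors here are invented; real property vectors are much higher-dimensional):

```python
import faiss
import numpy as np

# Toy binary property vectors (rows: entities, columns: WikiData properties).
american = np.array([[1, 1, 0, 1],
                     [0, 1, 1, 0]], dtype=np.float32)
german = np.array([[1, 1, 0, 0],
                   [0, 1, 1, 1]], dtype=np.float32)

index = faiss.IndexFlatL2(german.shape[1])  # exact L2 search over German entities
index.add(german)

# For each American entity, retrieve its closest German entity; swapping the
# roles of the two sets gives the German->American direction.
distances, neighbors = index.search(american, 1)
print(neighbors.ravel())  # row index into the German set for each American entity
```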
So who is the American Angela Merkel? One possible answer is Woodrow Wilson, a member of a "political party", who had a "doctoral advisor" and a "religion", and ended up with "awards". This answer may be unsatisfying, as it was Barack Obama who sat across from Merkel for nearly a decade. To capture these more nuanced similarities, we turn to large text corpora in Section 4.

An Alternate Embedding Approach
While the classic NLP vector example (Mikolov et al., 2013c) isn't as magical as initially claimed (Rogers et al., 2017), it provides useful intuition. We can use the intuition of the cliché king − man + woman ≈ queen to adapt between languages. This, however, requires relevant embeddings. First, we use the entire Wikipedia in English and German, preprocessed using Moses (Koehn et al., 2007). We follow Mikolov et al. (2013b) and use named entity recognition (Honnibal et al., 2020) to tokenize entities such as Barack_Obama.
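For instance, multi-word entities can be rewritten as single tokens before training word2vec; a minimal sketch with spaCy (assuming the en_core_web_sm model is installed; merge_entities is our own illustrative helper):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumption: small English model is available

def merge_entities(text: str) -> str:
    """Rewrite multi-word named entities as single underscore-joined tokens."""
    doc = nlp(text)
    out, last = [], 0
    for ent in doc.ents:
        out.append(doc.text[last:ent.start_char])
        out.append(ent.text.replace(" ", "_"))
        last = ent.end_char
    out.append(doc.text[last:])
    return "".join(out)

print(merge_entities("Barack Obama met Angela Merkel in Berlin."))
# -> Barack_Obama met Angela_Merkel in Berlin.
```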
We use word2vec (Mikolov et al., 2013b), rather than FastText (Bojanowski et al., 2017), as we do not want orthography to influence the similarity of entities. Angela Merkel in English and in German have quite different neighbors, and we intend to keep it that way by preserving the distinction between languages.
However, the standard word2vec model assumes a single monolingual embedding space. We use unsupervised Vecmap (Artetxe et al., 2018), a leading tool for creating cross-lingual word embeddings, to map the two monolingual spaces into a shared bilingual space. We propose two approaches for adaptation.
3CosAdd We follow the word analogy approach of 3CosAdd (Levy and Goldberg, 2014; Köper et al., 2016). American→German adaptation takes the source entity's (v) embedding in the English vector space and looks for its adaptation (u*) based on embeddings in the German space. This is like the word analogy task, i.e., which entity has the same role in German culture as v does in American culture. As an example, Merkel has a similar role in German culture as Biden does in American culture. Formally, the adaptation of the English entity v into German is

$$u^* = \operatorname*{argmax}_{u \in V_{de}} \mathrm{sim}\!\left(\vec{E}^{de}_{u},\; \vec{E}^{en}_{v} - \vec{a} + \vec{d}\right), \tag{1}$$

where $\vec{E}^{l}_{w}$ is the embedding of word w in language l, $V_{de}$ is the German vocabulary, and sim is the cosine similarity. The American anchor word $\vec{a}$ and German anchor $\vec{d}$ represent the American and German cultures. We average the English and German embeddings of the individual word types for robust anchor vectors. In standard analogies, the $\vec{a}$ and $\vec{d}$ vectors of Equation 1 are different for each test pair; here they are the same for each example, as we are always pivoting between the two cultures.
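A minimal numpy sketch of Equation 1 (the embedding matrix, vocabulary, and anchor vectors are placeholders the caller must supply):

```python
import numpy as np

def three_cos_add(v_en, a, d, E_de, vocab_de, k=5):
    """Return the k German entities u maximizing sim(E_de[u], v_en - a + d)."""
    target = v_en - a + d
    target = target / np.linalg.norm(target)
    E = E_de / np.linalg.norm(E_de, axis=1, keepdims=True)  # row-normalize
    sims = E @ target                                       # cosine similarity
    top = np.argsort(-sims)[:k]
    return [(vocab_de[i], float(sims[i])) for i in top]

# v_en: embedding of, e.g., Joe_Biden in the mapped English space;
# a, d: averaged American and German anchor vectors;
# E_de: German embedding matrix aligned with the vocab_de list.
```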
Learned adaptation To eliminate the need for manual anchor selection for both cultures, our second approach learns the adaptation as a linear transformation of source embeddings to the target culture, given a few adaptation examples. Specifically, we use the human adaptations sourced for the Wikipedia entities as training for the Veale NOC ones. We follow the work of Mikolov et al. (2013a) and learn a transformation matrix $W_{en \rightarrow de}$ for American→German adaptation by minimizing

$$\min_{W} \sum_{i=1}^{n} \left\lVert W\vec{x}_i - \vec{z}_i \right\rVert^2,$$

where $\vec{x}_i$ is the English embedding of the i-th training entity and $\vec{z}_i$ is the embedding of its German adaptation. Likewise, we learn the reverse mapping $W_{de \rightarrow en}$ for German→American adaptation. This requires supervised training data (though not much; Conneau et al., 2018), which we collect in Section 5.
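A least-squares sketch of this mapping, assuming paired training embeddings stacked as rows of X (source) and Z (target); the helper names are our own:

```python
import numpy as np

def learn_mapping(X, Z):
    """Solve min_W sum_i ||W x_i - z_i||^2 for (n, d) arrays X and Z."""
    # lstsq finds M minimizing ||X M - Z||_F, so W = M.T maps column vectors.
    M, *_ = np.linalg.lstsq(X, Z, rcond=None)
    return M.T

def adapt(W, x, E_tgt, vocab_tgt, k=5):
    """Rank target-culture entities by cosine similarity to the mapped W @ x."""
    y = W @ x
    y = y / np.linalg.norm(y)
    E = E_tgt / np.linalg.norm(E_tgt, axis=1, keepdims=True)
    return [vocab_tgt[i] for i in np.argsort(-(E @ y))[:k]]
```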

Comparing Automation to Human Judgment
The automated methods can generate entities at scale, but humans have to evaluate their relevance.

Adaptation by Locals
Since quality control is difficult for generation (Peskov et al., 2019), we need users who will answer the task accurately. We recruit five American citizens educated at American universities and five German citizens educated at German ones. These human annotations serve as a gold standard against which we can compare our automated approaches. To improve the user experience, we create an interface that provides a brief summary of each source entity from Wikipedia and asks the users to select a target adaptation in a text box that autocompletes Wikipedia page titles (all entities; targets are not limited to the lists in Section 2), à la answer selection in Wallace et al. (2019). The annotation task requires two hours for our users to complete. Obviously, German annotators are more familiar with German culture than the Americans, and vice versa. Annotators translate into their native language. Since we are focusing on popular entities, they are often known despite the cultural divide, but the introductory paragraph from Wikipedia reminds users if not.

Are the Adaptations Plausible?
To validate and compare all our adaptation strategies' precision, five German translators who understand American culture assess the adaptations.
The top five adaptations from WikiData, 3CosAdd, learned adaptation, and humans, as well as five randomly selected options from the human pool, are evaluated for plausibility on a five-level Likert scale. Fleiss' kappa (0.382) and Krippendorff's alpha (0.381) assess inter-annotator agreement; this "fair" agreement suggests that vetting an adaptation is challenging and sometimes subjective, even for translators.
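Such chance-corrected agreement can be computed, for example, with statsmodels; a minimal sketch in which the ratings matrix is invented for illustration:

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical ratings: rows are adaptation candidates, columns are the five
# translators, values are Likert levels 1-5 (invented for illustration).
ratings = np.array([
    [5, 4, 5, 4, 5],
    [2, 3, 2, 1, 2],
    [4, 4, 3, 5, 4],
])

# aggregate_raters converts per-rater labels into per-category counts per item.
table, _ = aggregate_raters(ratings)
print(fleiss_kappa(table))  # values in 0.21-0.40 are conventionally "fair"
```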

Why Adaptation is Difficult
Embedding adaptations are better than WikiData's, and human adaptations are better still (Figure 1). Thus, we use human adaptations as the gold standard for evaluating recall. Only the learned embedding method uses training data, so we use human adaptations of the Wikipedia entities to train the projection matrix and evaluate (for all methods) using human adaptations of the NOC list. Given that the task is subjective, we take our results with a grain of salt given cultural variation (e.g., some people view Angela Merkel's conservatism as a defining characteristic, while others focus on her science pedigree). We use the mean reciprocal rank (Voorhees, 1999, MRR) to measure how high the gold adaptations are ranked by our other adaptation strategies. Since MRR decreases geometrically and our gold standard is not exhaustive, the Recall@5 and Recall@100 metrics are more intuitive. We calculate Recall@n by measuring what fraction of the correct adaptations of a source entity is retrieved in the top n predictions. Table 2 validates that the human annotations are near the top of the automatic adaptations; the precision-oriented evaluation (Figure 1) validates whether the top of the list is reasonable. All human annotations and a sample of the automatic adaptations are provided in Appendix D.
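A minimal sketch of these two metrics, with gold adaptations as sets and each method's predictions as ranked lists (the example entities are placeholders):

```python
def mean_reciprocal_rank(gold_sets, ranked_preds):
    """Average 1/rank of the first gold adaptation found in each ranked list."""
    total = 0.0
    for gold, preds in zip(gold_sets, ranked_preds):
        for rank, p in enumerate(preds, start=1):
            if p in gold:
                total += 1.0 / rank
                break
    return total / len(gold_sets)

def recall_at_n(gold_sets, ranked_preds, n):
    """Average fraction of gold adaptations retrieved in the top-n predictions."""
    per_entity = [len(gold & set(preds[:n])) / len(gold)
                  for gold, preds in zip(gold_sets, ranked_preds)]
    return sum(per_entity) / len(per_entity)

gold = [{"Angela_Merkel", "Olaf_Scholz"}]
preds = [["Helmut_Kohl", "Angela_Merkel", "Gerhard_Schroeder"]]
print(mean_reciprocal_rank(gold, preds))  # 0.5 (first hit at rank 2)
print(recall_at_n(gold, preds, 2))        # 0.5 (one of two gold items in top 2)
```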

Qualitative Analysis
There is no single answer to what makes a good adaptation. Let us return to the question of who Bill Gates is, which underlines how there is often no one right answer but several context-specific possibilities. The human adaptations show the range of plausible adaptations, each appropriate for a particular facet of the position Bill Gates has in US society. As previously mentioned, Carl Benz represents a larger-than-life founder who created an entire industry with his company. However, Carl Benz made cars, not computers.
Even within technology, different adaptations highlight different aspects of Bill Gates. Like the implementer of the BASIC programming language, Konrad Zuse contributed to computers that were more than single-purpose machines. Just as Bill Gates's Microsoft is seen as a stodgy tech giant, Dietmar Hopp founded SAP, a giant German tech company that is more often discussed in board rooms than in living rooms. And because the epicenter of modern tech is America's West Coast, Andreas von Bechtolsheim represents a German founder of Sun Microsystems and early Google investor who made his way to Silicon Valley.
Other times, there is more consensus: a majority of raters declare Angela Merkel the German Hillary Clinton, and Joseph Smith the American Martin Luther. There are even some unanimous adaptations: Bavaria is the German California. Adaptations of fictional characters seem particularly difficult, although this may reflect the supremacy of American popular culture; Superman and Homer Simpson are so well known in Germany that there are no clear adaptations; Till Eulenspiegel, Maverick, and Bibi Blocksberg are not superheroes from a dying world, and Heidi is not a dumb, bald everyman.

A New Computational Task
We formally introduce entity adaptation as a new computational task. Word2vec embeddings and WikiData can be used to figuratively, not just literally, translate entities into a different culture. Humans are better at generating candidates for this task than our computational methods (Figure 1). These methods are well-motivated but have room for improvement. Knowledge bases improve over time, and increased coverage of entities, as well as improved information about each entity, would improve the method. Alternate word embedding approaches, perhaps those that discard orthography, may provide better candidates. Even humans occasionally disagree with other humans on this task, so evaluation for this task is nontrivial.
Our new dataset of machine-generated adaptations, human adaptations, and human evaluation of these adaptations can serve as an evaluation for future automatic methods.
People need NLP systems that reflect their language and culture, but datasets are lacking: adaptation can help. There has been an explosion of English-language QA datasets, but other languages continue to lag behind. Several approaches try to transfer English's bounty to other languages (Lewis et al., 2020; Artetxe et al., 2019), but most of the entities asked about in major QA datasets are American (Gor et al., 2021). Adapting entire questions will require not just adapting entities and non-entities in tandem but also integration with machine translation (Kim et al., 2019; Hangya and Fraser, 2019). Our automatic methods did not create precise adaptations, but the alternative "incorrect" adaptations may be useful for low-precision tasks, such as generating numerous simple open-ended questions or gauging the popularity of an entity.
Given the existence of robust datasets in high-resource languages, can we adapt, rather than literally translate, them to other cultures and languages?
We worked with human participants to collect our data. They are all adults who participated of their own volition, and no payment was made for annotation. No personal data was collected or used for the dataset. For evaluation of the adaptations, we hired translators through Upwork. They were paid $40 for a task that took roughly between one and two hours.
The broad motivation of this work is to spread cultural understanding. Humans must be kept in the loop for making claims about cultural relevance. Having multiple diverse opinions is necessary for supporting any cultural claim. As with language, nationality is often correlated with culture but is not synonymous with it. Large countries contain multitudes, while some nationalities (e.g., Kurds) lack a de jure nation but span many nations. We elide this detail and focus on information often available in knowledge bases.
These lists contain figures that are controversial. Research datasets should reflect the real world and prior work; thus we include prominent entities as identified by Veale NOC and Wikipedia. Any list may contain biases from its collection process, and ours should not be thought of as an exclusive and definitive list, but as a start that can be refined and ultimately expanded to other cultures.

A Appendix
Our appendix contains our entire human-collected dataset, as well as a sample of the adaptations from our WikiData and embedding approaches.
Figure 2 shows our collection tool. Table 3 shows German→American Veale NOC items. Table 4 shows American→German Veale NOC items. Table 5 shows German→American Wikipedia items. Table 6 shows American→German Wikipedia items.
Table 7 shows our WikiData predictions, Table 8 shows our 3CosAdd predictions, and Table 9 shows our learned adaptation predictions. We pose several background questions about Wikipedia and WikiData as well:

B Wikipedia Analysis
Are the Wikipedia pages in German and English visited from the associated country? Yes; the Wikipedias for the respective languages are most used by visitors located in those countries: 63% of German Wikipedia was visited from Germany and 32% of English Wikipedia was visited from the United States in the past year.

Are the top Wikipedia topics notably different across languages? Yes; less than a quarter of the top 500 searches for 2019 are identical across English and German.

Does WikiData cover areas outside of the United States? Wikipedia coverage does not mean that WikiData annotations are conducted equally across German and American entities. Analyzing WikiData reveals a discrepancy in coverage of Germans and Americans.
Out of 8,126,559 titles, 1,030,762 include a reference to the United States in any capacity. However, only 184,692 contain a reference to (broader) Germany. This imbalance is significant, but there are enough German items for our methodology. As WikiData is a maintained resource, there is room for future additional coverage and standardization of fields.
Countries use different names throughout history. While the United States of America is straightforward, Germany includes several variations, such as the German Empire, the Kingdom of Bavaria, and the Kingdom of Prussia. The WikiData feature-based approach can be used for other countries as well (…or anything that is consistently coded). For example, there are 65,957 Russian, 152,701 French, and 48,026 Chinese items in WikiData.

Do the top Wikipedia topics necessarily belong to the culture? No; the top 10 most visited pages on German Wikipedia include a cultural potpourri: Germany, Greta Thunberg, Asperger syndrome, Game of Thrones, and Freddie Mercury. While there are uniquely German entities in the longer list (ZDF, Capital Bra, The Cratez, Niki Lauda), we cannot conclude that all top entities in a language belong culturally to a given country. Therefore, we need a stricter methodology.
Where does one find entities? We rely on a human-sourced dataset: Veale's Non-Official Characterization list (Veale, 2016). This list contains 1031 people, real and fictional, such as Daniel Day-Lewis, Anton Chekhov, and Bridget Jones. These people are annotated with properties, one of which is conveniently their address. There are 25 people with a German location and 575 with an American one. Removing fictional characters written by non-nationals leaves the German list with 20 entities. An American author filters the list of Americans down to 35 iconic ones with achievements that span politics, music, activism, athletics, and pop culture.
Wikipedia provides another avenue for gauging popular topics in a language. We manually filter the top 500 German/English Wikipedia topics to remove non-German/non-American entities; Game of Thrones and Unix-Shell are popular in the German Wikipedia, but they are not culturally idiosyncratic. For the 2019 German Wikipedia we are left with roughly 200 items, which we further reduce to 120 after putting a cap on pop culture entities. For the American counterpart, over 300 items are culturally American. We add a three-year filter to remove pop items and make the list comparable to the German one.

Figure 2: Our interface provides users with information about the entity and asks them to select an option from possible Wikipedia pages.

Table 1: WikiData and unsupervised embeddings (3CosAdd) generate adaptations of an entity, such as Bill Gates. Human adaptations are gathered for evaluation. American and German entities are color coded.