InfoSync: Information Synchronization across Multilingual Semi-structured Tables

Information synchronization of semi-structured data across languages is challenging. For instance, Wikipedia tables in one language should be synchronized with their counterparts in other languages. To address this problem, we introduce a new dataset, InfoSync, and a two-step method for tabular synchronization. InfoSync contains 100K entity-centric tables (Wikipedia Infoboxes) across 14 languages, of which a subset (∼3.5K pairs) is manually annotated. The proposed method includes 1) information alignment, to map rows across multilingual tables, and 2) information update, to fill in missing or outdated information in aligned tables. When evaluated on InfoSync, information alignment achieves an F1 score of 87.91 (en↔non-en). To evaluate information updates, we perform human-assisted Wikipedia edits on Infoboxes for 603 table pairs. Our approach obtains an acceptance rate of 77.28% on Wikipedia, showing the effectiveness of the proposed method.


Introduction
English articles on the web tend to be updated more promptly than their counterparts in other languages. Meanwhile, cultural differences, topic preferences, and editing inconsistencies lead to information mismatches across multilingual data, e.g., outdated or missing information (Jang et al., 2016; Nguyen et al., 2018). Online encyclopedias, e.g., Wikipedia, contain millions of articles that need to be updated constantly, which involves expanding existing articles, modifying content such as correcting facts in sentences (Shah et al., 2019), and altering Wikipedia categories (Zhang et al., 2020b). However, more than 40% of Wikipedia's active editors work in English, while only 15% of the world population speaks English as their first language. Therefore, information in languages other than English may not be as up to date (Bao et al., 2012). See Figure 1 for an example of an information mismatch for the same entity across different languages. In this work, we look at synchronizing information across multilingual content.
To overcome the above-mentioned problem, we formally introduce the task of information synchronization for multilingual articles, which covers paragraphs, tables, lists, categories, and images. Given the magnitude and complexity of synchronizing all information across different modalities on a webpage, this work focuses on semi-structured data, i.e., table synchronization, in a few languages as a first step toward our mission.
We consider the Infobox, a particular type of semi-structured Wikipedia table (Zhang and Balog, 2020a) that contains entity-centric information, where we observe various information mismatches, e.g., missing rows (cf. Figure 1). One intuitive idea to address them is translation-based. However, Infoboxes contain rows with implicit context; translating these short phrases is prone to errors and leads to ineffective synchronization (Minhas et al., 2022). To systematically assess the challenge, we curate a dataset, INFOSYNC, consisting of 100K multilingual Infobox tables across 14 languages and 21 Wikipedia categories. ∼3.5K table pairs (English to non-English or non-English to non-English) are sampled and manually synchronized.
We propose a table synchronization approach that comprises two steps: (1) information alignment, which aligns table rows, and (2) information update, which updates missing or outdated rows across language pairs to resolve inconsistencies. The information alignment component aligns the rows of multilingual tables using corpus statistics across Wikipedia, such as key- and value-based similarities. The information update step relies on an effective rule-based approach. We manually curate eight rules: row transfer, multi-key matching, time-based updates, value trends, append value, high-to-low resource transfer, number-of-rows differences, and rare keys. Both tasks are evaluated on INFOSYNC to demonstrate their effectiveness. Beyond the automatic evaluation, we deploy an online experiment that submits the mismatches detected by our method to Wikipedia, strictly following Wikipedia's editing guidelines, rule set, and policies. All proposed edits are performed manually, and we monitor the number of edits accepted and rejected by Wikipedia editors; the acceptance of these changes demonstrates the efficacy of our methodology.
The contributions of this work are as follows: 1) we investigate the problem of information synchronization across multilingual semi-structured data, i.e., tables, and construct a large-scale dataset, INFOSYNC; 2) we propose a two-step approach (alignment and update) and demonstrate its superiority over existing baselines; 3) the rule-based update system achieves an excellent acceptance rate when utilized for human-assisted Wikipedia editing. Our INFOSYNC dataset and method source code are available at https://info-sync.github.io/info-sync/.

Challenges in Table Synchronization
We observe the following challenges, taking Wikipedia Infoboxes as a running example. Note that this is not an exhaustive list.
MI: Missing Information represents the problem where information appears in one language but is missing in others. This may be because the table is out of date, or due to cultural, social, or demographic preferences in editing (cf. Figure 1).
OI: Outdated Information denotes that information is updated in one language but not others.
IR: Information Representation varies across languages. For example, an attribute such as "parents" can be placed in a single row or in separate rows ("Father" and "Mother").
UI: Unnormalized Information presents cases where table attributes can be expressed differently. For example, "known for" and "major achievements" of a person represent the same attribute (i.e., paraphrases).
LV: Language Variation means that information is expressed in different variants across languages. This problem is exacerbated by the implicit context of tables during translation. E.g., "Died" in English might be translated to "Overleden" (passed away) or "overlijdensplaats" (place of death) in Dutch due to missing context.
SV: Schema Variation denotes that the schema (template structure) varies. For example, extracting "awards" from Musician tables can be harrowing due to dynamic on-click lists (Full Award Lists).
EEL: Erroneous Entity Linking is caused by mismatched links between table entities across languages, e.g., "ABV" and "Alcohol by Volume".

Wikipedian "Biases"
Wikipedia is a global resource spanning over 300 languages. However, the information is skewed toward English-speaking countries (Roy et al., 2020), as English has the largest Wikipedia, covering 23% (11%) of total pages (articles). Most user edits (76%) are also made on English Wikipedia. English Wikipedia also has the highest share of page reads (49%) and page edits (34%), followed by German (20% and 12%) and Spanish (12% and 6%), respectively. Outside the top 25 languages, the total number of active editors, pages, and edits is less than 1% (Warncke-Wang et al., 2012; Alonso and Robinson, 2016).
Multilingual Wikipedia articles evolve separately due to cultural and geographical bias (Callahan and Herring, 2011; Reagle and Rhue, 2011; Tinati et al., 2014), which prevents information synchronization. For example, information on "Narendra Modi" (India's Prime Minister) is more likely to be well reflected in Hindi Wikipedia than in other Wikipedias. This means that, in addition to the obvious fact that smaller Wikipedias can be expanded with content from larger Wikipedias, larger Wikipedias can also be augmented with information from smaller ones. Thus, information synchronization could assist Wikipedia communities by ensuring that information is consistent and of good quality across all language versions.

The INFOSYNC Dataset
To systematically assess the challenge of information synchronization and evaluate the methodologies, we aim to build a large-scale table synchronization dataset INFOSYNC based on entity-centric Wikipedia Infoboxes.

Table Extraction
We extract Wikipedia Infoboxes from pages appearing in multiple languages on the same date, to simultaneously preserve Wikipedia's original information and the potential discrepancies. The extracted tables span 14 languages and cover 21 Wikipedia categories.
Language Selection. We consider the following languages: English (en), French (fr), German (de), Korean (ko), Russian (ru), Arabic (ar), Chinese (zh), Hindi (hi), Cebuano (ceb), Spanish (es), Swedish (sv), Dutch (nl), Turkish (tr), and Afrikaans (af). With respect to the total number of Infobox tables (see Table 1), four of these 14 languages are low resource (af, ceb, hi, tr; fewer than 6,000 tables), eight are medium resource (ar, ko, nl, sv, zh, ru, de, es; 6,000-10,000 tables), and the remaining two are high resource (en, fr). Our choices were motivated by the following factors: a) the languages cover all continents, capturing a large and diverse population; seven of them (English, French, German, Spanish, Swedish, Dutch, and Turkish) are European; b) they have sufficient pages with Infoboxes, with each entity's Infobox present in at least five languages; and c) an adequate number of rows (5 and above) facilitates better data extraction.
Categories. The extracted tables cover twenty-one simple, diverse, and popular topics, including Airport, Album, Animal, Athlete, Book, City, College, Company, Country, Food, Monument, Movie, Musician, Nobel, Painting, Person, Planet, Shows, and Stadiums. Airport has the largest number of entity tables, followed by Movie and Shows, as shown in Table 10. Other extraction details are provided in Appendix A.1. We analyze the extracted tables in the context of the synchronization problem and identify the information gap. The number of tables is biased across languages, as shown in Table 1: Afrikaans, Hindi, and Cebuano have significantly fewer tables. Similarly, table size is biased across languages; Dutch and Cebuano tables have the fewest rows. In addition, the number of tables across categories is uneven (cf. Table 10).

Table Synchronization Method
This section explains our proposed table synchronization method for addressing missing or outdated information. The method includes two steps: information alignment and information update. The former aims to align rows across a pair of tables; the latter updates missing or outdated information. We further deploy our update process in a human-assisted Wikipedia edit framework to test its efficacy in the real world.

Information Alignment
An Infobox consists of multiple rows, where each row has a key and value pair. Given a pair of tables T_x = [..., (k_x^i, v_x^i), ...] and T_y = [..., (k_y^j, v_y^j), ...] in two languages, table alignment aims to align all possible pairs of rows, e.g., (k_x^i, v_x^i) and (k_y^j, v_y^j) should be aligned if they refer to the same information. We propose a method that consists of five modules, each of which relaxes the matching requirements of the previous one in order to create additional alignments.
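The row and alignment representations described above can be sketched as follows. This is a minimal illustration of the data structures only; the table contents, the `rows` helper, and all names are invented for this example and are not from the paper's released code.

```python
# An Infobox is an ordered list of (key, value) string pairs; an alignment
# is a set of (i, j) index pairs linking row i of T_x to row j of T_y.

T_en = [("Born", "17 September 1950"), ("Occupation", "Politician")]
T_hi = [("जन्म", "17 सितम्बर 1950"), ("पद", "प्रधानमंत्री")]  # Hindi Infobox rows

def rows(table):
    """Yield (index, key, value) triples for one table."""
    for i, (k, v) in enumerate(table):
        yield i, k, v

# Row 0 of T_en aligns with row 0 of T_hi; row 1 of T_hi has no counterpart.
alignment = {(0, 0)}
```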
M1. Corpus-based. A pair of rows (k_x, v_x) in T_x and (k_y, v_y) in T_y is aligned if cosine(em(tr_en(k_x)), em(tr_en(k_y))) > θ_1, where em(·) is an embedding function, θ_1 is a threshold, and tr_en(k) denotes the English translation of k when k is not already in English. To obtain accurate key translations, we adopt a majority-voting approach, considering multiple translations of the same key from tables of different categories. We use the key's values and category as additional context for better translation during the voting process. To simplify the voting procedure, we pre-compute mappings by selecting only the most frequent keys for each category across all languages.
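A hedged sketch of the M1 decision rule follows. A real system would use a multilingual sentence encoder and a machine translation system; here a toy bag-of-words embedding and a tiny hand-written translation table stand in for both, purely for illustration.

```python
import math
from collections import Counter

# Illustrative stand-in for the pre-computed key translation mapping.
TRANSLATE = {"geboren": "born", "born": "born", "beruf": "occupation"}

def embed(text):
    """Toy bag-of-words embedding (a real system would use a neural encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def m1_align(key_x, key_y, theta_1=0.9):
    """Align two keys when the cosine of their English translations clears theta_1."""
    en_x = TRANSLATE.get(key_x.lower(), key_x)
    en_y = TRANSLATE.get(key_y.lower(), key_y)
    return cosine(embed(en_x), embed(en_y)) > theta_1

# "geboren" (German) and "Born" both map to "born", so they align.
```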
M2. Key-only. This module attempts to align pairs left unaligned by module M1. Using their English translations, it first computes cosine similarity for all possible key pairs. k_x is aligned to k_y only if each is the other's most similar key and the similarity is above a threshold θ_2. This resembles maximum bipartite matching, treating similarity scores as edge weights followed by threshold-based pruning, and it ensures we capture the highest-similarity mapping from both language directions. Note that here we use only the keys as text for similarity computation.
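The mutual-best-match criterion of M2 can be sketched as below. The similarity matrix `sim` is a placeholder; in the method it would be derived from embeddings of English key translations.

```python
def mutual_best_matches(sim, theta_2):
    """sim[i][j] = similarity of key i in T_x to key j in T_y.

    Keep (i, j) only if j is i's best match, i is j's best match,
    and the similarity exceeds theta_2.
    """
    pairs = []
    for i, row in enumerate(sim):
        j = max(range(len(row)), key=lambda c: row[c])          # best match for i
        i_back = max(range(len(sim)), key=lambda r: sim[r][j])  # best match for j
        if i_back == i and sim[i][j] > theta_2:
            pairs.append((i, j))
    return pairs

sim = [
    [0.95, 0.10, 0.20],
    [0.15, 0.40, 0.88],
]
# Keys 0<->0 and 1<->2 are mutually most similar; a higher threshold prunes 1<->2.
```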
M3. Key-value bidirectional. This module is similar to M2, except that it uses the entire table row, i.e., key + value, for computing similarities, with threshold θ_3.
M4. Key-value unidirectional. This module relaxes the bidirectional mapping constraint of M3, removing the requirement of the highest similarity score from both sides. We shift to unidirectional matching between row pairs, i.e., we consider the highest similarity in either direction. Since this may introduce spurious alignments, we use a higher threshold (θ_4) than in the prior step.
M5. Multi-key. The previous modules align only the single most similar key when it exceeds the threshold. This module relaxes that constraint to select multiple keys (at most two), given that they exceed a threshold (θ_5). True multi-key mappings are sparse, but this relaxation alone would lead to dense mappings. To avoid this, we introduce a soft constraint based on value combination, where the values of the multiple keys are merged. We accept a multi-key alignment only when the similarity score of the merged value combination exceeds that of the single most similar key.
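M5's acceptance test can be sketched as follows, assuming some row-similarity function. The token-overlap `overlap_score` is an illustrative stand-in for embedding similarity, and the "parents vs. Father/Mother" values are invented for the example.

```python
def accept_multi_key(score, row_y, row_x_a, row_x_b):
    """Keep a one-to-two alignment only if the merged value combination
    scores higher than the single most similar candidate row."""
    best_single = max(score(row_y, row_x_a), score(row_y, row_x_b))
    merged = row_x_a + " " + row_x_b  # merge the two candidate rows' values
    return score(row_y, merged) > best_single

def overlap_score(a, b):
    """Jaccard token overlap as a toy similarity measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

# A combined "Parents" value vs. separate "Mother"/"Father" values:
keep = accept_multi_key(overlap_score,
                        "jane doe john doe",   # value of the single "Parents" row
                        "jane doe", "john doe")  # values of the two candidate rows
```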
The thresholds of the five modules are tuned sequentially, in the order stated above.

Information Updation
Information update includes Row Append (adding missing rows), Row Update (replacing or adding values), and Merge Rows. We propose a rule-based heuristic approach for information updates. The rules take the form of logical expressions ∀(R_Tx, R_Ty): L → R applied to Infobox tables, where R_Tx and R_Ty represent table rows in languages x and y, respectively. The rules are applied sequentially according to their priority rank (P.R.). The rules are described below.
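The priority-ranked, sequential application of rules can be sketched as a small rule engine. The engine shape, the `row_transfer` body, and the edit tuples are illustrative assumptions, not the paper's actual implementation.

```python
def apply_rules(rules, table_x, table_y):
    """rules: list of (priority, fn), where fn(tx, ty) -> list of edits.
    Rules fire in ascending priority-rank order; edits are accumulated."""
    edits = []
    for _, rule in sorted(rules, key=lambda r: r[0]):
        edits.extend(rule(table_x, table_y))
    return edits

def row_transfer(tx, ty):
    """Toy stand-in for R1: propose appending rows of tx whose keys
    are absent from ty (key equality replaces the alignment mapping here)."""
    keys_y = {k for k, _ in ty}
    return [("append", k, v) for k, v in tx if k not in keys_y]

tx = [("Born", "1950"), ("Awards", "Padma Vibhushan")]
ty = [("Born", "1950")]
edits = apply_rules([(1, row_transfer)], tx, ty)
# Proposes appending the missing "Awards" row to ty.
```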
R1. Row Transfer. Unaligned rows are transferred from one table to the other, where Al_{Tx}^{Ty}(·;·) represents the alignment mapping between tables T_y and T_x.
R2. Multi-Match. To handle multi-key alignments, we remove the multiply-aligned rows and replace them with their merged information.
R3. Time-based.We update aligned values using the latest timestamp.
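A toy illustration of this time-based comparison is below. A real system would need multilingual date parsing to attach a timestamp to each value; here timestamps are given directly, and the population figures are invented.

```python
from datetime import date

def newer_value(val_x, date_x, val_y, date_y):
    """Return the value whose attached timestamp is most recent (rule R3)."""
    return val_x if date_x > date_y else val_y

latest = newer_value("Population: 1.4B", date(2023, 1, 1),
                     "Population: 1.3B", date(2019, 1, 1))
# The 2023 value replaces the 2019 value in the outdated table.
```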
R4. Trends (positive/negative). This update applies to values that are highly likely to follow a monotonic pattern (increasing or decreasing) over time, e.g., athlete career statistics. The authors curated the positive/negative trend lists.
R5. Append Values.Additional value information from an up-to-date row is appended to the outdated row.
R6. HR to LR.This rule transfers information from high to low resource language to update outdated information.
R7. #Rows.This rule transfers information from bigger (more rows) to smaller (fewer rows) tables.
R8. Rare Keys (Non-popular). Rows with non-popular keys, which are likely to have been added recently, are transferred to the outdated table. The authors also curate the list of non-popular keys.
Detailed formulations of the logical rules and their priority ranking are listed in Table 3. Figure 3 in the Appendix shows an example of a table update.
Human-assisted Wikipedia Infobox Edits: We apply the above rules to assist humans in updating Wikipedia Infoboxes. Following Wikipedia's editing guidelines, rule set, and policies, we accompany each update request with a description providing evidence, which contains (a) the URL of the up-to-date entity page in the source language, (b) the exact table row information, the source language, and the details of the changes, and (c) one additional citation discovered by the editor for extra validation. We also make updates beyond our heuristic-based rules for rows aligned by our information alignment method.

Experiments
Our experiments assess the efficacy of our proposed two-stage approach by investigating the following questions.
- What is the efficacy of the unsupervised multilingual method for table alignment? (§5.2)
- How significant are the different modules of the alignment algorithm? (§5.2 and §A.6)
- Is the rule-based update approach effective for information synchronization? (§5.3)
- Can the two-step approach assist humans in updating Wikipedia Infoboxes? (§5.3)

Figure 2: Explanation of alignment performance metrics. T_en and T_hi are the collections of all rows in the English and Hindi tables, respectively. R_x^n represents the n-th row in the table of language x. R_x(X) retrieves all rows in language x using mapping X. |·| denotes set cardinality. Every alignment is saved as a tuple of the form (R_x^m, R_y^n). G is the set of all gold (human) alignments; P is the set of predicted alignments (which may contain mistakes).
Information Alignment. We consider English as our reference language for alignment. Specifically, we translate all multilingual tables into English using the effective table translation approach of XInfoTabS (Minhas et al., 2022). Then, we apply the incremental modules discussed in §4.1. We tune the thresholds independently on the validation sets for Non-English ↔ Non-English and English ↔ Non-English pairs.
The method is assessed with two metrics: (a) the matched score, the F1-score between ground-truth matched rows and predicted alignments, and (b) the unmatched score, the F1-score between independent (unmatched) rows in the ground truth and predicted unaligned rows. See Figure 2 for an explanation of these metrics.
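The matched score described above can be sketched with the set definitions from Figure 2: gold G and prediction P are sets of (row_x, row_y) tuples, and precision, recall, and F1 follow the usual set-overlap formulas.

```python
def alignment_f1(gold, pred):
    """F1 between gold and predicted alignment tuple sets."""
    if not pred or not gold:
        return 0.0
    tp = len(gold & pred)            # correctly predicted alignments
    precision = tp / len(pred)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

G = {(0, 0), (1, 2), (3, 1)}  # gold alignments
P = {(0, 0), (1, 2), (2, 2)}  # predictions: two correct, one spurious
# tp = 2, precision = 2/3, recall = 2/3, so F1 = 2/3
```

The unmatched score follows the same formula applied to the sets of unaligned rows instead of alignment tuples.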
Information Update. We apply the heuristic-based approach and deploy the predicted updates for human-assisted edits on Wikipedia Infoboxes. 532 table pairs are edited, distributed among T_en → T_x, T_x → T_y, and T_x → T_en, where x and y are non-English languages.

Information Alignment
Algorithm Efficacy. Table 4 reports the matched and unmatched scores. For matched scores, we observe that the corpus-based module achieves an F1 score exceeding 50 for all language pairs. Adding the key-only module boosts performance by about 5-15 points. Taking the whole row context (key-value pair) with a strict bidirectional (two-way similarity) constraint improves performance substantially (more than 16 points). Further relaxing the bidirectional constraint to unidirectional matching (one-way similarity) improves results only marginally, by less than 0.5 points; thus, relaxing the bidirectional mapping constraint does not lead to significantly better alignments. The multi-key module, which considers one-to-many alignments, also improves accuracy only marginally, because there are very few instances of one-to-many mappings.
For unmatched scores, we see results similar to the matched scores. The only significant difference is in key-only performance, where we observe a 0.5x improvement compared to the matched scores. We also analyze precision and recall in Tables 17, 18, 19, and 20 of Appendix §A.3. We observe that precision decreases and recall increases for matched scores as modules are added, whereas the reverse holds for unmatched scores. The number of alignments increases as we add modules with relaxed constraints; this increases the number of incorrect alignments, reducing precision but increasing recall. Similarly, the accuracy on unaligned rows increases because more incorrect alignments are added under relaxed constraints. We also report each module's coverage in Appendix A.4. The performance of our proposed approach grouped by language, category, and row keys is detailed in Appendix A.5.
Error Analysis. Error analyses (cf. §2.1) for matched and unmatched rows are reported in Tables 5 and 6, respectively. Our method relaxes constraints sequentially, so the number of falsely aligned rows increases with each added module (cf. Table 6). The modules contribute unequally to alignment mistakes: for T_en ↔ T_x and T_x ↔ T_y respectively, (25%, 56%) of the mistakes come from the corpus-based module, (39%, 22%) from the key-only module, (17%, 35%) from the key-value bidirectional module, (7%, 4%) from the key-value unidirectional module, and (7.6%, 5%) from the multi-key module. The corpus-based module performs worst for T_x ↔ T_y because of the difficulty of multilingual mapping, while the key-only module performs worst for T_en ↔ T_x because it is the first relaxation in the algorithm. Further analysis of the error cases is in Appendix §A.7.

The row addition rule accounts for the most updates, ∼64% of the total for gold and predicted aligned table pairs. The flow of information from high-resource to low-resource languages accounts for ∼13% of the remaining updates, and transfer from tables with more rows to tables with fewer rows adds another 8%. ∼9% of the updates come from the value update rule; all the other rules combined contribute the remaining 8% of suggested updates. These results show that most information gaps can be resolved by row transfer, while the magnitude of rules such as value updates and multi-key shows that table information needs to be synchronized regularly. Examples of Infoboxes edited using the proposed algorithm are shown in Appendix Figures 4 and 5.

Information Updation
Table 8 reports a similar analysis for human-assisted Wikipedia Infobox edits. We also report Wikipedia editors' accept/reject rates for the deployed system in Table 9. We obtained an acceptance rate of 77.28% (as of May 2023), with the highest acceptance when information flows between non-English languages and the lowest when information flows from non-English to English Infoboxes. This highlights that our two-step procedure is effective in a real-world scenario. Examples of live updates are shown in Appendix Figures 6 and 7.

Related Works
Information Alignment. Multilingual table attribute alignment has previously been addressed via supervised (Adar et al., 2009; Zhang et al., 2017; Ta and Anutariya, 2015) and unsupervised methods (Bouma et al., 2009; Nguyen et al., 2011). Supervised methods train classifiers on features extracted from multilingual tables, including cross-language links, text similarity, and schema features. Unsupervised methods make use of corpus statistics and template/schema matching for alignment. Other techniques (Jang et al., 2016; Nguyen et al., 2018) focus on using external knowledge graphs such as DBpedia for updating Infoboxes, or vice versa. In their experiments, most of these methods use fewer than three languages, and machine translation is rarely used. Additionally, we do not require manually curated features for strong supervision, and we study the problem more thoroughly with grouped analyses along languages, categories, and keys. The works closest to ours are Nguyen et al. (2011) and Rinser et al. (2013), both of which use cross-language hyperlinks for feature or entity matching; Nguyen et al. (2011) apply translation before calculating text similarity. Utilizing cross-language links can provide a robust alignment supervision signal. In contrast, our approach does not use external knowledge or cross-language links for alignment, as this additional information is rarely available for languages other than English.
Information Update. Prior work on information updates (Iv et al., 2022; Spangher et al., 2022; Panthaplackel et al., 2022; Zhang et al., 2020b,d) covers Wikipedia or news articles rather than semi-structured data such as tables. Spangher et al. (2022) study the problem of updating multilingual news articles across different languages over 15 years, classifying edits as additions, deletions, updates, and retractions; these were the primary intuitions behind our challenge classification in §2.1. Iv et al. (2022) focus on automating article updates with new facts using large language models. Panthaplackel et al. (2022) focus on generating updated headlines when presented with new information. Some prior works also focus on automatic classification of Wikipedia edits for content moderation and review (Sarkar et al., 2019; Daxenberger and Gurevych, 2013). Even modeling editors' behavior to gauge collaborative editing and the development of Wikipedia pages has been studied (Jaidka et al., 2021; Yang et al., 2017). Other related work includes automated sentence updating as new information arrives (Shah et al., 2020; Dwivedi-Yu et al., 2022). None of these works focus on tables, especially Wikipedia Infoboxes, and they do not address the multilingual aspects of information updates.

Conclusion and Future Work
Information synchronization is a common issue for semi-structured data across languages. Taking Wikipedia Infoboxes as our case study, we created INFOSYNC and proposed a two-step procedure consisting of alignment and update. The alignment method outperforms baseline approaches with an F1-score greater than 85, and the rule-based update method received a 77.28% approval rate when suggesting edits to Wikipedia.
We identify the following future directions. (a) Beyond Infobox synchronization. While our technique is relatively general, it is optimized for Wikipedia Infoboxes. We want to test whether the strategy applies to technical, scientific, legal, and medical domain tables (Wang et al., 2013; Gottschalk and Demidova, 2017). It will also be intriguing to widen the update rules to include social, economic, and cultural aspects. (b) Beyond pairwise alignment. Currently, independent language pairs are considered for (bi-)alignment; multiple languages could instead be utilized jointly for (multi-)alignment. (c) Beyond pairwise updates. Similar to multi-alignment, one could jointly update all language variants simultaneously. This can be done in two ways: (1) with English as a pivot language to propagate updates across all languages, where English acts as a central server with message passing; or (2) in a round-robin fashion, where pairwise updates are transferred in a round-robin ring across all language pairs, with a leader selected in each round, similar to leader election in distributed systems. (d) Joint alignment and update. While our current approach is accurate, it employs a two-step process for synchronization, namely alignment followed by update; we want to create fast approaches that align and update in a single step. (e) Text for updates. Our method does not yet consider Wikipedia article text for updating tables (Lange et al., 2010; Sáez and Hogan, 2018; Sultana et al., 2012).

Limitations
We only consider 14 languages and 21 categories, whereas Wikipedia has pages in more than 300 languages and 200 broad categories; increasing the scale and diversity would further improve the method's generalization. Our proposed method relies on good multilingual translation of keys and values from table pairs. Although we use the key, value, and category together for better context, improvements in table translation (Minhas et al., 2022) would benefit our approach. Because our rule-based system requires manual intervention, it has automation limits; upgrading to fully automated methods based on large language models may be advantageous. Finally, we only consider updates for semi-structured tables. Updating other page elements, such as images and article text, could also be considered, although a direct extension of our method to a multi-modal setting is complex (Suzuki et al., 2012).

Ethics Statement
We aimed to create a balanced dataset, free of bias regarding demographic and socioeconomic factors. We picked a wide range of languages, including those with limited resources, and ensured that the categories were diversified. Humans curate the majority of information on Wikipedia, and using unrestricted automated tools for edits might introduce biased information. For this reason, we adhere to the "human in the loop" methodology (Smith et al., 2020) for editing Wikipedia, and we follow Wikipedia's editing guidelines, rule set, and policies for all manual edits. We therefore ask the community to use our method only as a recommendation tool for revising Wikipedia, and to utilize INFOSYNC strictly for scientific and non-commercial purposes.
Row Difference Across Paired Languages: There is substantial variation in the number of Infobox rows across the 14 languages under consideration. Table 11 shows that German, followed by Arabic and Afrikaans, has the highest row difference, indicating that tables in these languages are incomplete (with missing rows).

A.3 Precision and Recall
We also evaluate precision and recall for information alignment, for both matched and unmatched scores (§5.2). Precision-recall values for T_en ↔ T_x, T_x ↔ T_y, T_en ↔ T_hi, and T_en ↔ T_zh are reported in Tables 17, 18, 19, and 20, respectively.

A.4 Algorithm Coverage
We measure coverage on the entire corpus as the rate of rows aligned relative to the smaller table in a table pair. Table 12 reports ablation results of coverage for the various modules. Our proposed method aligns 72.54% and 67.96% of rows for T_en ↔ T_x and T_x ↔ T_y, respectively. The corpus-based module is the most constrained, focusing on precision; hence, removing it gives better coverage in both cases. The key-value unidirectional module is the most important for coverage, followed by the key-only module, in both cases.
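The coverage metric defined above can be sketched as follows, under the assumption that an alignment is a set of (i, j) row-index pairs as in the earlier sections.

```python
def coverage(alignment, n_rows_x, n_rows_y):
    """Fraction of rows aligned, relative to the smaller table in the pair."""
    aligned_x = {i for i, _ in alignment}
    aligned_y = {j for _, j in alignment}
    if n_rows_x <= n_rows_y:
        return len(aligned_x) / n_rows_x
    return len(aligned_y) / n_rows_y

# 2 of the 4 rows in the smaller table are aligned -> coverage 0.5
cov = coverage({(0, 0), (1, 2)}, 4, 6)
```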

A.5 Domain and Language Wise Analysis
Tables 13, 14, and 15 show the performance of our proposed method grouped by languages, domains, and keys, respectively.
Group-wise Analysis. From Table 13, for T_en ↔ T_x, Cebuano, Arabic, German, and Dutch are the worst-performing languages, with alignment F1-scores close to 85, whereas Turkish, Chinese, and Hindi have F1-scores greater than 90. Korean, German, and Swedish are the lowest-performing languages in the unaligned setting, with F1-scores close to 86, while Cebuano, Turkish, and Dutch get the highest unaligned scores (greater than 90). For non-English language pairs, the lowest matched F1-scores are observed for the German-Arabic and Hindi-Korean pairs (close to 78), as shown in Table 13, and the highest for Russian-German and Hindi-German (exceeding 88.8). For unmatched data, the Korean-Hindi, French-Hindi, French-Korean, and Russian-Korean pairs have the lowest F1-scores (less than 85), whereas German-Hindi and Russian-German exceed an unaligned F1-score of 90.
Category-wise Analysis. As reported in Table 14, our method performs worst in the Airport and College categories for the matched setting when one of the languages is English. For non-English matched settings, Movie and City are the worst-performing categories. For the unmatched setting with English as one of the languages, Airport and Painting have the lowest F1-scores, whereas Movie and Stadium perform worst for non-English languages.
Key-wise Analysis. Table 15 shows the average F1-scores across tables for frequent and non-frequent keys. We observe an F1-score degradation for non-frequent keys.

Table 13: Language-wise analysis: alignment F1-scores for T_en ↔ T_x and T_x ↔ T_y, averaged over all entities.
• Key-value unidirectional and multi-key together resolve another (18.5%, 7.5%) of the information representation cases in T_en ↔ T_x and T_x ↔ T_y, respectively, but are not effective against the other challenges.
Figure 4: Example from the Proposed Update Algorithm: the updated English infobox is obtained using the Spanish infobox as a reference, and vice versa. "Country of origin" is updated in the English infobox, and "website" is updated in the Spanish infobox.
Figure 5: Example from the Proposed Update Algorithm: the updated English infobox is obtained using the French infobox as a reference, and vice versa. Multiple keys are updated in both infoboxes: "Opened," "Location," "Owner," "Coordinates," "Operating Season," and "Slogan" in the French infobox, and "number of visitors," "surface," "Type of park," and "number of attractions" in the English infobox.
Figure 6: Example from Live Updates: in the above figure, the target infobox needs to be updated using the reference infobox (available in the English version) as extra/grounding information. The updated infobox is shown in column 3, where the key "job" is updated. This is an example of "Value substitution," as in Table 8. The red box highlights the updated information.

After the conclusion, before the ethics statement.

A2. Did you discuss any potential risks of your work?
In the limitations section.

A3. Do the abstract and introduction summarize the paper's main claims?
In the abstract and in the introduction (Section 1).

A4. Have you used AI writing assistants when working on this paper?
Left blank.
B. Did you use or create scientific artifacts?
Section 2 (Dataset) and Section 4 (Model).

B1. Did you cite the creators of artifacts you used?
Yes, for models (Section 5).

B2. Did you discuss the license or terms for use and/or distribution of any artifacts?
Non-commercial academic use (dataset and models), discussed in the ethics statement section.

B3. Did you discuss if your use of existing artifact(s) was consistent with their intended use, provided that it was specified? For the artifacts you create, do you specify intended use and whether that is compatible with the original access conditions (in particular, derivatives of data accessed for research purposes should not be used outside of research contexts)?
In the ethics statement section.

B4. Did you discuss the steps taken to check whether the data that was collected/used contains any information that names or uniquely identifies individual people or offensive content, and the steps taken to protect/anonymize it?
Not applicable. Left blank.
B5. Did you provide documentation of the artifacts, e.g., coverage of domains, languages, and linguistic phenomena, demographic groups represented, etc.?
Section 3 and appendix.

B6. Did you report relevant statistics like the number of examples, details of train/test/dev splits, etc. for the data that you used/created? Even for commonly-used benchmark datasets, include the number of examples in train/validation/test splits, as these provide necessary context for a reader to understand experimental results. For example, small differences in accuracy on large test sets may be significant, while on small test sets they may not be.

D3. Did you discuss whether and how consent was obtained from people whose data you're using/curating? For example, if you collected data via crowdsourcing, did your instructions to crowdworkers explain how the data would be used?
Not applicable. Left blank.
D4. Was the data collection protocol approved (or determined exempt) by an ethics review board?
Not applicable. Left blank.
D5. Did you report the basic demographic and geographic characteristics of the annotator population that is the source of the data?
Not applicable. Left blank.

Figure 1: Janaki Ammal infoboxes in English (right) and Hindi (left). The Hindi table lacks "British Rule of India" as cultural context. Two value mismatches: (a) the Hindi table does not list the state in the "Died" key; (b) the "Institution" values differ. The Hindi table mentions "residence" while the English table does not. The Hindi table is missing the "Thesis," "Awards," and "Alma Mater" keys. Neither mentions parents, early education, or honors.

Figure 3: Update Example: "Shirley Strickland de la Hunty" infoboxes in two languages, i.e., English and Spanish, showing row transfer for missing information, and value substitution because "Aged 78" is absent in "Died." One medal entry (Bronze, 1952, 100m) is added to the medal tally.

Figure 7: Example from Live Updates: in the above figure, the target infobox needs to be updated using a reference infobox as extra/grounding information. The updated infobox is shown in column 3, where the "Load/Cargo Traffic" key is updated. This is an example of Row Addition, as referred to in Table 8. The red box highlights the updated information.

Table 20: T en* ↔ T zh alignment performance on Human-Annotated Test Data.

ACL 2023 Responsible NLP Checklist

A. For every submission:
A1. Did you describe the limitations of your work?

Table 3: Logical Rules for Information Updation. Notation: T z represents a table in language z, and R Tz represents a row of the table. In R Tz [k] = v, k and v represent a key-value pair; in R Tz [k] = V, V denotes the value list mapped to a key k. Al Ty Tx (.; .) represents the alignment mapping between two tables T y and T x. Translation between two languages p and q is represented by tr p q (.). exKey extracts the key from a table row. isTime is true if the row has a time entry. exTime extracts the time from a table row. PosTrend/NegTrend represent lists of keys whose values always increase or decrease with time. RarKey represents the set of keys that are least frequent in the corpora.
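One rule family from Table 3 (PosTrend: a newer reference value may replace an older target value for keys that only grow over time) can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the dictionary row layout, the `POS_TREND` key list, and the function names are assumptions standing in for the PosTrend/exTime primitives described in the caption:

```python
from datetime import date

# Hypothetical PosTrend list: keys whose values only increase with time.
POS_TREND = {"population", "number of visitors"}

def ex_time(row):
    """exTime stand-in: extract the timestamp attached to a row, if any."""
    return row.get("time")

def update_pos_trend(row_tgt, row_ref):
    """If the key is in PosTrend and the reference row is newer,
    substitute the target value with the reference value."""
    key = row_tgt["key"]
    if key not in POS_TREND:
        return row_tgt
    t_tgt, t_ref = ex_time(row_tgt), ex_time(row_ref)
    if t_tgt is not None and t_ref is not None and t_ref > t_tgt:
        return {"key": key, "value": row_ref["value"], "time": t_ref}
    return row_tgt
```

A NegTrend rule would be symmetric, and the RarKey and translation-based rules would plug into the same per-row update loop after alignment.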

Table 4: Matched and Unmatched Scores: F1-score for all test sets of INFOSYNC.

Table 5: Error Analysis for the Matched Score: T en ↔ T x and T x ↔ T y.

Table 6: Error Analysis for the Unmatched Score: total unaligned mistakes for T en ↔ T x and T x ↔ T y.

Table 7: Updates on Test Corpora: count of the number of updates done by the different rules listed in §4.2. Al is the number of alignments; R1-R8 are the rules in the same sequential order as listed in §4.2.

Table 7 reports the results of the different types of update rules explained in §4.2. We observe that

Table 8: Analysis of Human-Assisted Updates: accept/reject rate of different types of edits for human-assisted Wikipedia infobox updates.

Table 9: Human-Assisted Wikipedia Infobox Updates: accept/reject rate for different flows of information.

Table 10: Missing Information Analysis in Categories: for each category, the unique number of entities and their average standard deviation across languages.

Table 11: Row Difference Across Paired Languages: column 2 shows the average row-count difference between languages for all entities.
handle these variations, which requires the following steps: (a) Detecting Infoboxes: we locate Wikipedia infoboxes that appear in at least five languages. (b) Extracting HTML: after detection, we extract the HTML and preprocess it to remove images, links, and signatures. (c) Table Representation: we convert the extracted tables and store them in JSON.
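Step (c) can be sketched with the Python standard library; the two-column infobox HTML below and the JSON layout (a list of key-value rows) are illustrative assumptions, not the dataset's exact schema:

```python
import json
from html.parser import HTMLParser

class InfoboxParser(HTMLParser):
    """Collect (key, value) rows from a simple two-column infobox table."""

    def __init__(self):
        super().__init__()
        self.rows, self._cells, self._text = [], [], None

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):        # start buffering cell text
            self._text = []

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._text is not None:
            self._cells.append("".join(self._text).strip())
            self._text = None
        elif tag == "tr":
            if len(self._cells) == 2:  # a complete key/value row
                self.rows.append({"key": self._cells[0], "value": self._cells[1]})
            self._cells = []

def infobox_to_json(html):
    """Convert a two-column infobox table to a JSON list of key-value rows."""
    parser = InfoboxParser()
    parser.feed(html)
    return json.dumps(parser.rows, ensure_ascii=False)
```

In practice the preprocessing of step (b), i.e., stripping images, links, and signatures, would run before this conversion.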

Table 14: Category Wise Analysis: Alignment F1-score reported for same-group entities, averaged over all languages.

Table 16: Ablation Study of Matched and Unmatched Scores, i.e., F1-score for all test sets of INFOSYNC.

Table 17: T en ↔ T x alignment performance on Human-Annotated Test Data.


Table 18: T x ↔ T y alignment performance on Human-Annotated Test Data.

Table 19: T en* ↔ T hi alignment performance on Human-Annotated Test Data.
C1. Did you report the number of parameters in the models used, the total computational budget (e.g., GPU hours), and computing infrastructure used?
Section 5.

C2. Did you discuss the experimental setup, including hyperparameter search and best-found hyperparameter values?

C3. Did you report descriptive statistics about your results (e.g., error bars around results, summary statistics from sets of experiments), and is it transparent whether you are reporting the max, mean, etc. or just a single run?
Section 5.

C4. If you used existing packages (e.g., for preprocessing, for normalization, or for evaluation), did you report the implementation, model, and parameter settings used (e.g., NLTK, Spacy, ROUGE, etc.)?
Section 5.

D. Did you use human annotators (e.g., crowdworkers) or research with human participants?

D1. Did you report the full text of instructions given to participants, including e.g., screenshots, disclaimers of any risks to participants or annotators, etc.?
Not applicable. Left blank.

D2. Did you report information about how you recruited (e.g., crowdsourcing platform, students) and paid participants, and discuss if such payment is adequate given the participants' demographic (e.g., country of residence)?
Not applicable. Left blank.