Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach

Although remarkable progress has been made on neural table-to-text methods, generalization issues arising from limited source tables hinder the applicability of these models. Large-scale pretrained language models are a promising remedy. However, how to effectively bridge the gap between a structured table and the text input, fully leveraging table information to fuel the pretrained model, remains underexplored. Another seldom-studied challenge is integrating a deliberation mechanism into a text-to-text pretrained model for the table-to-text task. In this paper, we propose a table structure understanding and text deliberating approach, namely TASD, to implement table-to-text generation with a pretrained language model. Specifically, we devise a three-layered multi-head attention network to realize a table-structure-aware text generation model on top of the pretrained language model. Furthermore, a multi-pass decoder framework is adopted to strengthen the capability of polishing the generated table descriptions. Empirical studies on two public datasets, together with human evaluation, validate that our approach generates faithful and fluent descriptive texts for different types of tables.


Introduction
The task of learning to generate natural language descriptions from non-linguistic input, referred to as data-to-text, is important for many applications, such as weather forecast generation (Mei et al., 2016), sports news writing (Wiseman et al., 2017), biography writing (Lebret et al., 2016), market comment writing (Murakami et al., 2017), and automatic question answering (Li et al., 2021b). Although the input data for data-to-text can take various forms, here we focus on the text generation task that takes a table as input.
Inspired by neural machine translation models, previous studies on table-to-text tasks mainly adopt traditional seq2seq methods to generate table descriptions (Lebret et al., 2016; Wiseman et al., 2017; Liu et al., 2018; Gong et al., 2019b; Wang et al., 2020; Li et al., 2021a). Despite generating text with high fluency, the lack of numerous source tables leads to lower generalizability of the table-to-text model. Recent progress on pretrained language models (Devlin et al., 2019; Radford et al., 2019) shows remarkable performance on natural language processing tasks. A model pretrained on large-scale data possesses rich knowledge, which suggests its potential for solving the generalization issues of the text generation task.
To exploit the expressive power of a pretrained model for the table-to-text task, it is necessary to serialize the input table effectively. Several works have made efforts to bridge this gap, such as serializing the table into a token sequence (Zhang et al., 2020; Suadaa et al., 2021; Xing and Wan, 2021) or introducing an extra task to control the table representation (Gong et al., 2020). However, none of these leverages the table structure information effectively. Furthermore, a text-to-text pretrained model decodes and generates a sequence in a one-pass forward process, which means it cannot perceive future words on the target side in advance. Recently, the deliberation mechanism (Niehues et al., 2016; Geng et al., 2018), implemented by a multi-pass decoder, was proposed to tackle this problem. However, how to adapt this approach to text-to-text pretraining, so that it can be further applied to the table-to-text task, is another challenge.
To this end, we propose a table structure understanding and text deliberating approach, namely TASD, which solves the table-to-text task with a pretrained language model enhanced by the deliberation mechanism. Specifically, we first serialize the table input with customized templates that do not require the target cells to be labeled. Then, we employ multi-head attention in a hierarchical way to learn a table representation that is aware of table structure, and use it to guide the fine-tuning of the text-to-text pretrained model. Afterward, we adopt a multi-pass decoder to realize text deliberation. More specifically, we treat the above table-structure-aware fine-tuned model as the first-pass decoder and adopt another pretrained model as the second-pass decoder to further polish the descriptive text. In the second-pass decoding phase, the table representation can be conveniently leveraged as the "original text" in the text deliberation mechanism. The main contributions of this work can be summarized as follows:
• We propose a novel table-to-text generation approach (i.e., TASD) that assimilates the complete table information with the help of table structure distillation, the pretrained language model, and text deliberation.
• We devise a table-structure-aware text generation model (TASATG) via a hierarchical multi-head attention network, which realizes content selection automatically. We also develop an effective text deliberation method dedicated to the table-to-text task.
• Extensive experiments conducted on two different datasets demonstrate that TASD outperforms comparable baselines in terms of various metrics.
Related Work

Table-to-Text Generation
Encouraged by the success of seq2seq methods in machine translation and text summarization, researchers proposed to formulate the input table as a sequence of records (Lebret et al., 2016; Wiseman et al., 2017) and further improved the performance of seq2seq-based table-to-text methods by modeling the table representation (Liu et al., 2018; Gong et al., 2019a). Introducing auxiliary tasks to enrich the table representation (Tian et al., 2019; Li et al., 2021a) is another promising paradigm for the table-to-text problem. Moreover, there have been studies on how to decompose the table-to-text pipeline to generate more faithful and fluent text, e.g., leveraging content selection and planning (Puduppully et al., 2019; Trisedya et al., 2020; Bai et al., 2021) or combining autoregressive and non-autoregressive methods (Wang et al., 2021). In addition, Transformers have also been applied to the table-to-text task (Gong et al., 2019b; Wang et al., 2020; Obeid and Hoque, 2020). However, current table-to-text methods may fail to tackle the overfitting problem arising from the lack of diversity in small datasets.
Fine-tuning a model pretrained on a large corpus and adapting it to a specific task is an effective way to tackle generation issues caused by small data and large parameter counts (Radford et al., 2019). Kale and Rastogi (2020) explored the feasibility of applying a text-to-text pretrained model to the table-to-text task, Gong et al. (2020) applied multi-task learning to solve the table-to-text task with a pretrained language model, and Suadaa et al. (2021) leveraged a pretrained language model for fact inference over numerical table contents. However, these approaches seldom perceive and integrate the complete table information into the fine-tuning of the pretrained model. Although a table-to-text pretrained model was proposed (Xing and Wan, 2021), the large and diversified table corpus it requires is often unavailable. In addition, recent works on fact verification with tabular input (Yin et al., 2020; Dong and Smith, 2021) have suggested the effectiveness of table-structure-aware pretrained models.

Text Deliberation
The encoder-decoder framework has been widely applied to neural machine translation, yet subsequent words are invisible on the target side when decoding a sequence. To alleviate this, researchers proposed to decode and refine the output sequence in multiple passes, mimicking human behavior when polishing an article. Studies on text deliberation include a solution with two separate stages (i.e., generating and polishing) (Niehues et al., 2016), combining the two stages into one framework (Xia et al., 2017), and deliberating the generated text in multiple passes adaptively via reinforcement learning (Geng et al., 2018) or a customized evaluating architecture (Li and Yao, 2021). To the best of our knowledge, we are the first to apply the deliberation mechanism to the table-to-text problem.

Problem Formulation
Our table-to-text problem takes a table as input, and we formulate a table as a sequence of records T = {τ_{i,j} | 1 ≤ i ≤ m, 1 ≤ j ≤ n}, where m and n denote the number of rows and columns of T, respectively. Then, we aim to generate a document Y = y_1 y_2 ⋯ y_l that describes the content of T precisely, where l is the document length. Formally, given a table T, the table-to-text model is expected to generate a descriptive document Y in an auto-regressive way, i.e., P(Y | T; Θ) = ∏_{t=1}^{l} P(y_t | y_1, …, y_{t-1}, T; Θ), where Θ is the set of model parameters.

Data
NumericNLG Dataset. The numericNLG dataset was released by Suadaa et al. (2021). Its tables present experimental results from research papers, so most table contents are numerical values. We use this dataset to evaluate the accuracy and fluency of the generated descriptions for tables with numerical content. In particular, for each table of numericNLG, <table_id> acts as the pronoun of the table, and <caption> is the descriptive text of the table. Moreover, each cell of a table offers <metric>, (row and column) <header>, and <value> as different views of the cell.
Totto Dataset. The Totto dataset (Parikh et al., 2020) consists of open-domain tables whose contents are mostly linguistic; we use it to evaluate the generated descriptions for non-numerical tables.

Methodology
In this section, we introduce the proposed framework in detail. As shown in Fig. 1, our framework mainly consists of three components, i.e., template-based table serialization, table-structure-aware fine-tuning, and text deliberation. Specifically, we first produce a sequence describing the table contents with customized templates; the templates we adopt do not require the target cells to be labeled. Then, to generate informative text, we learn a full table representation to guide the description generation, so that the output text is capable of emphasizing and delineating the facts in the table from a macroscopic perspective. Finally, we employ and adapt the multi-pass decoder to our data-to-text problem, which further refines the generated table description. Technical details of the three modules are introduced in the following subsections.

Template-based Table Serialization
To harness the expressive power of the text-to-text pretrained model for the input table, it is necessary to serialize the raw table first. The template-based representation offers a simple yet effective linearization approach that produces descriptive texts reflecting the facts in a table without yielding an intractable downstream model.
In particular, the templates we adopt in this work are devised to mention all the available facts in the table without knowing the emphasized cells in advance, which differs from (Suadaa et al., 2021). The template for describing facts consists of two parts: (1) the title or descriptive text that comes with the table, and (2) a series of expressions, each describing the content of one cell.
More specifically, we apply one template to the numericNLG dataset and another to the Totto dataset; in both, the second part of the template enumerates all the cells in the table. This preliminary table representation, denoted by T_S, covers all the available facts in a raw table. Note that the templates we adopt may face the content selection problem: in table-to-text applications, target cells in the input table are often not highlighted, yet the generated table description should emphasize certain cells.
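To make the serialization concrete, here is a minimal Python sketch of the cell-enumeration idea; the function name, field names, and the phrasing of each clause are illustrative assumptions, not the exact templates used for numericNLG or Totto.

```python
def serialize_table(caption, row_headers, col_headers, values):
    """Serialize a table into a token sequence T_S: the caption (first part
    of the template) followed by one clause per cell (second part), without
    requiring target cells to be labeled in advance."""
    parts = [caption + "."]
    for i, row in enumerate(row_headers):
        for j, col in enumerate(col_headers):
            parts.append(f"The {col} of {row} is {values[i][j]}.")
    return " ".join(parts)

# toy table: 2 rows x 2 columns of made-up scores
t_s = serialize_table(
    "Experimental results on the test set",
    ["Model A", "Model B"],
    ["BLEU-4", "ROUGE-L"],
    [[41.2, 52.3], [43.8, 54.1]],
)
```

The resulting string covers every available fact in the table, at the cost of leaving content selection to the downstream model.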

Table-Structure-Aware Text Generation
A text-to-text pretrained model can take a large-scale corpus as input to acquire vast knowledge and generate texts in an unsupervised way, so it has been widely applied to text generation tasks. When handling a specific text generation task, it is effective to fine-tune the pretrained model on new data. However, for the table-to-text task, some hidden information, such as the table structure, is likely to be overlooked, even though the drafted T_S mentions all the available facts in the table. Thus, we propose to exploit table structure information to guide the fine-tuning of the text-to-text pretrained model.
As shown in Fig. 2, we first encode the table content in a multi-view fashion. To be specific, a cell τ_{i,j} in a table T can be viewed from different perspectives, such as the value of τ_{i,j}, the row header of τ_{i,j}, the column header of τ_{i,j}, etc. We treat the k-th view of τ_{i,j} as a token sequence denoted by x^{(k)}_{i,j}. Afterward, we pad each x^{(k)}_{i,j} with placeholders (if necessary) and concatenate these token sequences as x_{i,j} = x^{(1)}_{i,j} ⊛ x^{(2)}_{i,j} ⊛ ⋯, where ⊛ denotes the concatenation operator. Each token of the multi-viewed representation x_{i,j} is encoded as a d-dimensional embedding by looking up the text-to-text pretrained model and is updated accordingly when fine-tuning it. In this way, we obtain the semantic representation of table T, denoted by E^{(0)} ∈ R^{m×n×s×d}, where s is the length of the concatenated sequence x_{i,j}.
To realize TASATG for table-to-text, we propose to employ multi-head attention (Vaswani et al., 2017) to guide the fine-tuning of the text-to-text pretrained model. In particular, we adopt three multi-head attention (MHA) layers to interactively extract the information in the table in a hierarchical way. The MHA layer is defined as MHA(Q, K, V) = Concat(head_1, …, head_h) W^O with head_i = softmax(Q W_i^Q (K W_i^K)^T / √d_k) V W_i^V, where Q, K, and V represent the query, key, and value in the attention mechanism, respectively. As illustrated in Fig. 2, in the first MHA layer, we add a cell text position embedding (E^{(ctpe)} ∈ R^{s×d}) to each cell of the aforementioned E^{(0)}, where ⊕ denotes the element-wise addition operation, and feed the result to the multi-head attention to implement cell text self-attention. Consequently, E^{(1)} ∈ R^{m×n×d} can be deemed an initial aggregated table representation. Next, in the second MHA layer, we add a table position embedding (E^{(tpe)} ∈ R^{m×n×d}) to E^{(1)} to implement table structure self-attention, yielding the table-structure-aware embedding E^{(2)}.
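The hierarchical aggregation of the first two MHA layers can be sketched in NumPy as follows. This is a simplified single-head version without learned projection matrices, and the mean pooling over cell tokens is an assumption for illustration; the actual model uses multi-head attention with trained parameters.

```python
import numpy as np

def attention(Q, K, V):
    # scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ V

# toy table embedding E0: m x n cells, s tokens per cell, d dims
m, n, s, d = 2, 3, 4, 8
rng = np.random.default_rng(0)
E0 = rng.normal(size=(m, n, s, d))
ctpe = rng.normal(size=(s, d))        # cell text position embedding

# layer 1: self-attention over each cell's tokens, then pool -> E1 (m, n, d)
X = (E0 + ctpe).reshape(m * n, s, d)
E1 = attention(X, X, X).mean(axis=1).reshape(m, n, d)

# layer 2: self-attention across the m x n grid of cells -> E2 (m, n, d)
tpe = rng.normal(size=(m, n, d))      # table position embedding
C = (E1 + tpe).reshape(m * n, d)
E2 = attention(C, C, C).reshape(m, n, d)
```

The first layer collapses each cell's token sequence into one vector; the second lets cells attend to each other across the grid, which is where the table structure enters.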

Text Deliberation
The encoder-decoder framework applied in many sequence generation tasks often adopts a one-pass process when decoding a sequence. Though efficient, a one-pass decoder cannot perceive future context for further text deliberation. A multi-pass decoder extends the capability of generating more refined text by exploring global information in the sequence (Niehues et al., 2016; Xia et al., 2017).
For the text-to-text pretrained model, the huge number of parameters makes it unwise to directly combine the models of different passes. A common solution is to concatenate the original serialized table content with the text generated in the previous pass to fine-tune the pretrained model in the next-pass decoding. However, in this way, the input text probably exceeds the length limit of the text-to-text pretrained model, and the time complexity is too high.
To effectively implement the fine-tuning of the text-to-text pretrained model in multiple passes, as shown in Figs. 3a and 3b, we take the table representation as the "original text" and feed the text generated in the first-pass fine-tuning, together with the table representation, to the second-pass fine-tuning. Note that, as shown in Fig. 3a, we separately fine-tune the table-to-text generation task and the text-to-text deliberation task with two independent TASATG models, each of which takes a text-to-text pretrained model as the backbone.
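The two-pass data flow can be sketched as follows, with placeholder callables standing in for the two fine-tuned TASATG models. String concatenation is a simplification of how the table representation conditions the second pass; the point is that the deliberation pass consumes the table representation plus the first-pass draft rather than the full serialized table.

```python
def two_pass_generate(table_repr, first_decoder, second_decoder):
    """Two-pass decoding: the table representation plays the role of the
    'original text' in the deliberation mechanism, and the second pass
    polishes the draft produced by the first pass."""
    draft = first_decoder(table_repr)              # first-pass description
    return second_decoder(table_repr + " " + draft)  # deliberation pass

# toy stand-ins for the two independently fine-tuned models
first = lambda src: "model a obtains best bleu"   # pretend first-pass output
second = lambda src: src                          # identity; a real model rewrites
out = two_pass_generate("The BLEU of Model A is 43.8.", first, second)
```

Keeping the second-pass input to the compact table representation plus the draft bounds the input length, which is what makes multi-pass fine-tuning of a pretrained model tractable.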

Experimental Settings
Data. We conducted experiments on the aforementioned datasets, i.e., numericNLG and Totto. The statistics of the numericNLG dataset can be found in (Suadaa et al., 2021). The original Totto dataset contains 120K tables, which is much larger than numericNLG. To evaluate table-to-text methods on comparable data sizes, we filtered out the tables with fewer rows and columns (i.e., #rows < 8 and #columns < 8), leaving 1.8K tables, and then randomly selected 1.2K tables to form the new Totto dataset.
Evaluation Metrics. We calculated BLEU (from gram-1 to gram-4) (Papineni et al., 2002), ROUGE-L (Lin, 2004), and METEOR (Denkowski and Lavie, 2014) to evaluate the quality of the generated text. BLEU-n with a small n measures word-level accuracy, while BLEU-n with a large n measures sentence fluency. ROUGE-L measures the recall rate based on the longest common subsequence between source and target texts. METEOR is based on the harmonic mean of unigram precision and recall, with recall weighted higher than precision. These metrics are widely used to measure the accuracy and fluency of generated sentences.
Baselines. We compare TASD with the following baselines.
• Template-based Table Serialization. We use the template designed for table serialization as a baseline. Note that the token sequence generated by the template-based method is denoted as T_S.
• Pointer Generator (See et al., 2017).This is a seq2seq model with the attention and copy mechanism.We take T S as input for the pointer generator model.
• TRM. We implemented a simplified version of TASD, namely TRM, that discards the knowledge possessed by the pretrained language model and removes text deliberation, in order to focus on table representation modeling. In particular, TRM adopts the architecture of GPT2 but initializes the parameters randomly and trains for at most 100 epochs. TRM takes T_S plus the table structure representation as input for training and is fed with T_S in the inference phase.
• Fine-tuned GPT2 (Radford et al., 2019). We take the concatenation of T_S and Y as the input for fine-tuning. In the inference phase, we feed only T_S to the model, which generates Y starting after the last token of T_S.
• TableGPT (Gong et al., 2020). TableGPT is a state-of-the-art table-to-text method. To improve text fidelity and exploit structural information at the same time, TableGPT employs a multi-task learning paradigm with two auxiliary tasks: one reconstructs the table structure from GPT2 representations, and the other aligns the tables with the information in the generated text.
Implementation Details. The split settings for training, validation, and testing were 1084:136:135 for the numericNLG dataset and 960:120:120 for the Totto dataset, respectively. Regarding automatic evaluation, all results of deep models were obtained on a Linux machine with an Nvidia A100 GPU, and results averaged over 5 runs are reported. An Adam optimizer (with an initial learning rate of 3e-5) was used for GPT2 fine-tuning, and training ran for at most 20 epochs. A beam search algorithm was adopted when decoding a sequence, with the beam width set to 5.
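The beam search used at decoding time keeps the highest-scoring partial sequences at each step. A generic toy sketch follows; the hand-crafted scorer stands in for the model's next-token log-probabilities and is purely illustrative.

```python
def beam_search(step_scores_fn, vocab, start, max_len, beam_width=5):
    """Generic beam search: at each step, expand every beam with every
    token and keep the beam_width highest-scoring partial sequences."""
    beams = [(0.0, [start])]
    for _ in range(max_len):
        cand = []
        for score, seq in beams:
            for tok, logp in step_scores_fn(seq, vocab).items():
                cand.append((score + logp, seq + [tok]))
        beams = sorted(cand, key=lambda x: x[0], reverse=True)[:beam_width]
    return beams[0][1]

def toy_scores(seq, vocab):
    # toy log-probabilities: strongly favor the token that follows
    # the current last token in the vocab list
    i = vocab.index(seq[-1]) if seq[-1] in vocab else -1
    return {t: (0.0 if vocab.index(t) == (i + 1) % len(vocab) else -2.0)
            for t in vocab}

vocab = ["model", "a", "wins", "<eos>"]
out = beam_search(toy_scores, vocab, "<bos>", 3, beam_width=2)
```

With the toy scorer, the search deterministically follows the favored chain of tokens, illustrating how the beam keeps alternatives alive without committing greedily.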
In other words, for different types of source tables, TASD generates better descriptive texts w.r.t. word-level accuracy, sequence recall, and sentence fluency.
Besides, we have the following observations: 1) The template-based method performs much better on the numericNLG dataset than on the Totto dataset, since the referenced table descriptions in numericNLG were collected from scientific papers, whereas the table summaries in the Totto dataset are more diverse. 2) On the Totto dataset, the pointer generator model tends to cover more words in the descriptive text and generate more fluent sentences than the template-based method, as the contents of Totto source tables are mostly linguistic. This can also explain why the pointer generator performs worse than the template-based method on the numericNLG dataset w.r.t. BLEU and METEOR. 3) Fine-tuned GPT2 generates more faithful and fluent text than the other baselines most of the time (refer to Tables 1 and 2), which validates the effectiveness of the pretrained language model. 4) In general, TableGPT performs better than, and sometimes best among, all the baselines. In the numericNLG dataset, the headers of the input tables (a.k.a. the attributes of records for TableGPT) are more diverse, which may explain why the performance of TableGPT is not as promising as expected on the numericNLG dataset. 5) TRM can generate comparable, or even better, descriptive text than fine-tuned GPT2, which further suggests the effectiveness of table structure understanding.
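For reference, the modified n-gram precision at the core of the BLEU-n scores discussed above can be computed as follows; this sketch omits BLEU's brevity penalty and the geometric averaging over n = 1..4.

```python
from collections import Counter

def modified_ngram_precision(candidate, reference, n):
    """Core of BLEU-n: candidate n-gram counts, clipped by the reference
    counts, divided by the total number of candidate n-grams."""
    cand = [tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1)]
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    clipped = sum(min(c, ref[g]) for g, c in Counter(cand).items())
    return clipped / len(cand)

ref = "the model achieves the best bleu score".split()
hyp = "the model achieves best bleu".split()
p1 = modified_ngram_precision(hyp, ref, 1)  # all 5 unigrams match -> 1.0
p2 = modified_ngram_precision(hyp, ref, 2)  # 3 of 4 bigrams match -> 0.75
```

The example shows why small n rewards word-level accuracy while larger n rewards longer contiguous matches, i.e., fluency.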

Ablation Analysis
Moreover, to verify the effectiveness of different modules, we compare TASD with its variants.
• TASD w/o TAS. After generating text with fine-tuned GPT2, we fed the generated text concatenated with T_S to another fine-tuned GPT2 to realize the second-pass decoder without the table structure representation.
• TASD w/o D. We implemented TASD without deliberating on the output text, i.e., we realized TASATG based on GPT2 in a one-pass forward process.
• TASD w/o 1st-TAS. We removed table structure modeling from the first-pass decoding of TASD, i.e., we took the fine-tuned GPT2 as the first-pass decoder and the table-structure-aware fine-tuned GPT2 as the second-pass decoder.
As can be seen in Tables 1 and 2, TASD w/o TAS performs worse than TASD under all metrics, since table structure modeling benefits the fine-tuning of GPT2. This can also be validated by comparing fine-tuned GPT2 with TASD w/o D. Besides, the effectiveness of deliberating text is shown by comparing TASD w/o D with TASD (and likewise by comparing fine-tuned GPT2 with TASD w/o TAS). However, text deliberation may harm sentence fluency, as depicted by the BLEU-3 & 4 results in Table 1. In addition, TASD w/o 1st-TAS outperforms TASD w/o TAS under all metrics, suggesting that taking the table representation as the "original text" in the deliberation mechanism is also effective.

Qualitative Analysis
Figs. 4(a) and (b) show two selected source tables and the corresponding descriptive texts (i.e., caption and section_text) from the numericNLG and Totto datasets. Fig. 4(c) shows the descriptions generated by different methods. Text that correctly reflects the facts of the source table is in green, erroneous text is in red, and confusing text is in blue. We can see that the text produced by the pointer generator contains many grammatical errors. TRM tends to repeat phrases and sentences due to its limited knowledge about the input table, which can also explain why it obtains a falsely high BLEU-n score as n grows. Thanks to the semantic knowledge brought by pretraining, fine-tuned GPT2 can generate more natural descriptions, which, however, contain perplexing factual errors. Compared to fine-tuned GPT2, the description generated by TASD is more relevant to the table contents. Since the target cells are not known in advance, the generated text may miss the emphasized points described in the reference. The text generated by TableGPT is also fluent, though counterfactual descriptions may exist.

Human Evaluation
We randomly selected 30 samples from the test sets of the numericNLG and Totto datasets, respectively, and invited 10 volunteers to evaluate the quality of the output text according to three criteria, i.e., grammar, coherence & concise, and factual perspective (correct and relevant). Each criterion is scored on five degrees, ranging from 1 (the worst) to 5 (the best). The averaged scores, reported in Table 3, show that TASD can generate more readable and coherent texts and describe more correct facts. Moreover, the pretrained models consistently achieve better grammar and coherence scores than the pointer generator because of the expressive power learned from the large-scale corpus. On the Totto dataset, the improvement from table structure modeling is smaller than that from the polishing mechanism, which is consistent with the automatic evaluation results in Table 2.

Discussion
In our work, we devised a two-pass decoder framework dedicated to the table-to-text task with the help of the table-structure-aware text generation model (i.e., TASATG). However, the effectiveness of text deliberation for the table-to-text task should be further explored and integrated into the table-structure-aware modeling in a more harmonious manner. To discuss this limitation of the text deliberation in TASD, we additionally developed a table content reconstruction loss and integrated it into TASD in a multi-task learning fashion.
Specifically, given the table-structure-aware embedding E^{(2)} generated with Eq. (3), we randomly mask certain cells of the input table and yield a partially corrupted embedding of the input table, denoted by Ẽ^{(2)}. Then, a two-layer MLP (i.e., multi-layer perceptron) is adopted to restore the table-structure-aware embedding. Afterward, an MSE (i.e., mean squared error) loss measures the quality of the table reconstruction and is integrated into the TASD framework in the multi-task learning paradigm. The process of table reconstruction is illustrated in Fig. 5.
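A minimal NumPy sketch of this reconstruction objective is given below. The masking rate, the zero-masking strategy, and the layer sizes are illustrative assumptions, and the randomly initialized MLP weights would in practice be trained jointly with the generation loss.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, d, h = 2, 3, 8, 16

E2 = rng.normal(size=(m, n, d))          # table-structure-aware embedding
mask = rng.random((m, n, 1)) < 0.3       # randomly mask ~30% of cells
E2_corrupt = np.where(mask, 0.0, E2)     # corrupted embedding (masked cells zeroed)

# two-layer MLP attempting to restore the embedding (untrained weights here)
W1 = rng.normal(size=(d, h)) * 0.1
W2 = rng.normal(size=(h, d)) * 0.1
restored = np.maximum(E2_corrupt @ W1, 0.0) @ W2

tr_loss = np.mean((restored - E2) ** 2)  # MSE table reconstruction loss (TRLoss)
```

In the multi-task setup, this loss would be added to the generation loss with a trade-off weight, encouraging the table encoder to retain enough structure to reconstruct masked cells.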
We carried out a series of experiments to evaluate the performance of TASD with and without the table reconstruction loss (i.e., TRLoss) on the numericNLG and Totto datasets in terms of BLEU-n (1 to 4), METEOR, and ROUGE-L. The results can be found in Tables 4 and 5. According to the results on the numericNLG dataset, TRLoss helps enhance the capability of table comprehension, yet the best performance is achieved by TASD w/o D w/ TRLoss. It seems that the performance gain from the enhanced table comprehension is sacrificed once text deliberation is adopted. Meanwhile, on the Totto dataset, TASD with table reconstruction (i.e., TASD w/ TRLoss) does achieve the best performance in terms of BLEU-1, BLEU-2, METEOR, and ROUGE-L, though the improvement is not remarkable. That the contents of the Totto input tables are mainly linguistic and the table structures are not very diverse may explain the improvement of TASD w/ TRLoss on this dataset. From the above comparisons, we conclude that, for input tables with diverse structures, the limitation of the current text deliberation mechanism cannot be neglected if one aims to enhance the capability of table comprehension for the table-to-text task. This also suggests that the generalization capability of the text deliberation in TASD should be improved in the future.
Limitations. In this work, long tables in the Totto dataset were removed, since the efficiency and performance of TASD on large tables could degrade. In the future, the capability of table-to-text models to handle long tables should be further explored. Besides, a larger-scale and more exhaustive human evaluation is necessary; we plan to recruit more volunteers to conduct the human annotation.

Conclusion
In this paper, to realize table-to-text generation with a pretrained language model, we proposed a table structure understanding and text deliberating approach, namely TASD. Table structure understanding is realized by a hierarchical multi-head attention network, which benefits the fine-tuning of the text-to-text pretrained model. The fully represented table information benefits not only the pretrained language model but also the text deliberation process, since the structure information with rich semantics can be fed naturally into the second-pass decoding. We carried out extensive experiments on two public datasets with different table types. Automatic and human evaluations, as well as qualitative analysis, validate the effectiveness of our approach in generating faithful and fluent table descriptions. In the future, we will improve text deliberation by devising a unified framework that integrates the multi-pass decoder and refines the descriptive text with more attention to sentence fluency.

A Human Evaluation Settings
The criteria adopted in our human evaluation are (1) Grammar (e.g., is this paragraph grammatical?), (2) Coherence & Concise (e.g., is this paragraph coherent and contextually consistent? does it repeat redundant information?), and (3) Factual perspective (e.g., are the facts that this paragraph describes correct? are these facts related to the references and tables?). More specifically, we list the detailed justifications for scoring the generated text on each criterion as follows.

Grammar
• 1 It is more like garbled code than a paragraph.
• 2 There are many obvious grammatical mistakes.
• 3 There are a few obvious grammatical mistakes.
• 4 There are few grammatical mistakes.
• 5 There are no grammatical mistakes.

Coherence & Concise
• 1 The logic of text expression is chaotic and nonsense.
• 2 There are a lot of logical inconsistencies or redundant information.
• 3 There are some logical inconsistencies or redundant information.
• 4 There are a few logical inconsistencies or redundant information, but they do not affect reading.
• 5 The logic of the text is smooth without redundant information.

Factual Perspective
• 1 The paragraph does not coincide with the reference or table, and it is full of information inconsistent with the facts.
• 2 The paragraph describes the facts incorrectly and has a low correlation with the reference, but is related to the information in the table.
• 3 The paragraph description is incorrect, but it is highly coincident with the reference.
• 4 The paragraph description is basically correct and has low coincidence with the reference, but it also describes the information in the table.
• 5 The paragraph description is correct and highly coincident with the reference.

B Illustrative Examples of Generated Descriptions
We additionally selected two more examples of generated table descriptions from the numericNLG and Totto datasets, respectively. The results are shown in Figs. 6 and 7. From these four examples, we can see that TASD generates more accurate and fluent descriptive texts. However, incorrect descriptions can be found in the outputs of different models for cases D and F, which suggests that generating faithful descriptions for open-domain tables is much more challenging and requires more powerful, and thus larger, pretrained language models.

C Extra Implementation Details
The learning rate of GPT2 was searched over {3e-4, 3e-5, 3e-6}. In the evaluation discussing the limitation of text deliberation (see Section 6), a trade-off parameter balancing the GPT2 fine-tuning loss and the TRLoss was adopted; it was searched over {1e-1, 5e-2, 1e-2, 5e-3, 1e-3}, and 1e-2 was selected for the reported performance. Besides, the results reported in Tables 4 and 5 were averaged over 3 runs.
Figure 3: Training, validation, and testing procedures of the proposed TASD approach. (a) Training. (b) First- and second-pass fine-tuning of TASATG with validation data. (c) Testing.

Figure 4: Two examples of the generated table descriptions.

Figure 6: Generated table descriptions for cases C and D.

Figure 7: Generated table descriptions for cases E and F.

Table 3: Results of human evaluation.

Figure 5: Table reconstruction for table-structure-aware modeling enhancement.