Decoupled Dialogue Modeling and Semantic Parsing for Multi-Turn Text-to-SQL

Recently, Text-to-SQL for multi-turn dialogue has attracted great interest. Here, the user input of the current turn is parsed into the corresponding SQL query over the appropriate database, given all previous dialogue history. Current approaches mostly employ end-to-end models and consequently face two challenges. First, dialogue history modeling and Text-to-SQL parsing are implicitly combined, so it is hard to carry out interpretable analysis and obtain targeted improvements. Second, SQL annotation of multi-turn dialogue is very expensive, leading to training data sparsity. In this paper, we propose a novel decoupled multi-turn Text-to-SQL framework, in which an utterance rewrite model first explicitly completes the dialogue context, and a single-turn Text-to-SQL parser then follows. A dual learning approach is also proposed for the utterance rewrite model to address the data sparsity problem. Compared with end-to-end approaches, the proposed decoupled method can achieve excellent performance without any annotated in-domain data. With just a few annotated rewrite cases, the decoupled method outperforms the released state-of-the-art end-to-end models on both the SParC and CoSQL datasets.


Introduction
Text-to-SQL has lately become an interesting research topic along with the high demand to query a database using natural language (NL). Standard large databases can only be accessed with Structured Query Language (SQL), which requires special knowledge from users, hence lowering the accessibility of these databases. Text-to-SQL tasks, however, greatly narrow this gap and allow queries based on NL.*

* The corresponding authors are Lu Chen and Kai Yu.

[Figure 1: An example demonstrating the co-reference and ellipsis phenomena in a conversation, where the right column shows the annotated semantic-completion utterances.

Utterance → Semantic-completion utterance
1. Show the treatment details. → Show the treatment details. (SQL: SELECT * FROM Treatments)
2. Order the cost in ascending order. → (Ellipsis) Show the treatment details ordered the cost in ascending order. (SQL: SELECT cost_of_treatment FROM Treatments ORDER BY cost_of_treatment ASC)
3. What about in descending order? → (Ellipsis) Show the treatment details ordered the cost in descending order.
4. Which one is the most recent cost? → (Co-reference) Which treatment is the most recent cost? (SQL: SELECT cost_of_treatment FROM Treatments ORDER BY date_of_treatment DESC LIMIT 1)]

Previous work on
Text-to-SQL mostly focuses on single-turn utterance inference, evaluated on context-independent Text-to-SQL benchmarks. Nevertheless, in practice, users usually need to interact with the Text-to-SQL system step by step to express their query intent clearly. Under such conversational scenarios, co-reference and information ellipsis are always present, as shown in Figure 1. Recently proposed methods are mostly end-to-end: they endeavor to design a suitable model to encode the dialogue context and infer the corresponding SQL based on the whole dialogue context. The main limitation of end-to-end multi-turn Text-to-SQL models lies in their extreme reliance on annotated multi-turn Text-to-SQL data. Collecting large-scale multi-turn Text-to-SQL data is time-consuming and expensive: the annotators not only need to be SQL experts but also have to infer the complete and exact query intent of the speaker's latest utterance.

Different from previous end-to-end approaches, we propose a DEcoupled muLti-Turn pArsing (DELTA) framework, which decouples multi-turn Text-to-SQL into two subsequent pipeline tasks: utterance rewrite and single-turn Text-to-SQL. In recent years, both individual tasks have been well studied. The utterance rewrite task aims to generate the latest semantic-completion question based on the dialogue context. The single-turn Text-to-SQL task aims to parse the semantic-completion question into a SQL query, where state-of-the-art methods (Shi et al., 2020; Chen et al., 2021; Rubin and Berant, 2021) can achieve over 70% exact match accuracy on Spider (Yu et al., 2018), a cross-domain single-turn Text-to-SQL dataset, and even more than 80% on easier Text-to-SQL benchmarks (Dahl et al., 1994; Zhong et al., 2017). However, there is no rewrite data for the existing multi-turn Text-to-SQL benchmarks, and the existing utterance rewrite datasets normally pay more attention to the co-reference problem but ignore information ellipsis.
Due to the limited in-domain annotated rewrite data, we further propose a dual learning method to make comprehensive use of the unlabeled multi-turn data to learn a reliable rewrite model. Our proposed framework DELTA is evaluated on both the SParC (Yu et al., 2019b) and CoSQL (Yu et al., 2019a) datasets, the two existing large-scale benchmarks for the multi-turn Text-to-SQL task.
Contributions are highlighted below:

• We propose a decoupled parsing framework for the multi-turn Text-to-SQL task, whose annotated data is much easier to collect. Even without any in-domain multi-turn Text-to-SQL data, the decoupled parsing method can achieve encouraging results on multi-turn Text-to-SQL benchmarks.
• The decoupled framework includes an utterance rewrite model which is adapted from the pretrained BART (Lewis et al., 2020), with a newly implemented dual learning method to make comprehensive use of the unlabeled multi-turn data. Our adapted rewrite model achieves new state-of-the-art performance on the utterance rewrite benchmarks.
• With fully labeled multi-turn Text-to-SQL data, our decoupled parsing method outperforms all the released end-to-end multi-turn Text-to-SQL models.

Decoupled Parsing Framework
In this section, we elaborate on our decoupled parsing framework, which consists of two phases: 1) an utterance rewrite model (Section 2.1), which generates a semantic-completion question based on the dialogue context; 2) a single-turn Text-to-SQL parser (Section 2.2), which is fed with the rewritten question to predict the corresponding SQL query. To further improve the rewrite model's performance, we propose a dual learning method to make use of large-scale unlabeled data, which is detailed in Section 3.
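The data flow of the two phases can be sketched as follows. This is a schematic illustration only: the `rewriter` and `parser` callables are hypothetical stand-ins for the trained BART rewriter and RATSQL parser, not the actual models.

```python
# A schematic of DELTA's two-phase inference pipeline (hypothetical
# rewriter/parser interfaces; not the authors' actual code).
def delta_parse(history, utterance, schema, rewriter, parser):
    """Phase-I: rewrite the utterance into a self-contained question.
    Phase-II: parse the rewritten question with a single-turn parser."""
    rewritten = rewriter(history, utterance)   # semantic-completion question
    sql = parser(rewritten, schema)            # single-turn Text-to-SQL
    return rewritten, sql

# Toy stand-ins, just to illustrate the data flow:
rewriter = lambda h, u: u if not h else u.replace("one", "treatment")
parser = lambda q, s: f"SELECT ... FROM {s[0]}"
print(delta_parse(["Show the treatment details."],
                  "Which one is the most recent cost?",
                  ["Treatments"], rewriter, parser))
```

Because the parser only ever sees a single self-contained question, any single-turn Text-to-SQL model can be plugged into Phase-II unchanged.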

Phase-I: BART as Rewrite Model
We leverage the pretrained BART (Lewis et al., 2020), a Transformer-based encoder-decoder architecture, as the utterance rewrite model. This idea is inspired by its success on text generation tasks, including question answering and summarization. Along with the success of pretrained language models, the Transformer architecture has been widely applied in natural language processing (NLP) tasks. A Transformer encodes a sequence X = [x_i]_{i=1}^{n} with the self-attention mechanism (Vaswani et al., 2017). Let [y_i]_{i=1}^{n} be the representation of the sequence X at the l-th Transformer layer. The next Transformer layer performs the following operations with H attention heads:

α_ij^(h) = softmax_j( (x_i W_Q^(h)) (x_j W_K^(h))^T / sqrt(d_z) ),
z_i^(h) = Σ_{j=1}^{n} α_ij^(h) (x_j W_V^(h)),    z_i = z_i^(1) ⊕ … ⊕ z_i^(H),    (1)
ỹ_i = LN(x_i + z_i),    y_i = LN(ỹ_i + FFN(ỹ_i)),

where h is the head index, d_z is the hidden dimension of each head, α_ij^(h) is the attention probability, ⊕ denotes the concatenation operation, LN(·) is layer normalization (Ba et al., 2016), and FFN(·) is a feed-forward network consisting of two linear transformations.
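For concreteness, the single-head case of Equation (1), without the LN/FFN sublayers, can be sketched in a few lines of NumPy. This is a minimal illustration, not the actual BART implementation; the weight matrices here are randomly initialized stand-ins.

```python
import numpy as np

def softmax(e):
    e = e - e.max(axis=-1, keepdims=True)  # numerical stabilization
    p = np.exp(e)
    return p / p.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention of Equation (1)."""
    d_z = Wq.shape[1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    alpha = softmax(Q @ K.T / np.sqrt(d_z))  # attention probabilities alpha_ij
    return alpha @ V, alpha                  # z_i = sum_j alpha_ij (x_j W_V)

rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Z, alpha = self_attention(X, Wq, Wk, Wv)
print(Z.shape)  # (5, 8): one contextualized vector per input token
```

Each row of `alpha` sums to 1, so `z_i` is a convex combination of the value vectors, which is exactly the weighted-sum form of Equation (1).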
Similar to other large-scale pretrained language models, BART uses a standard Transformer-based sequence-to-sequence architecture, where the encoder is a bidirectional Transformer and the decoder is an auto-regressive Transformer. BART's pretraining method reconstructs the original text from its corrupted version. In essence, BART is a denoising autoencoder, which is applicable to a very wide range of NLP tasks. In our utterance rewrite task, both co-reference and information ellipsis can be regarded as corrupting noise on an utterance. Based on this idea, BART is an appropriate method to denoise the co-reference and information ellipsis. In addition, rewrite data for the public multi-turn Text-to-SQL benchmarks are lacking. Therefore, we propose a dual learning method to learn a reliable rewrite model with large-scale unlabeled dialogue data. The details are introduced in Section 3.

Phase-II: RATSQL as Parsing Model
Given a natural language question and a schema for a relational database, the goal of a Text-to-SQL parser is to generate the corresponding SQL query. For the single-turn Text-to-SQL parsing model, we directly use the current state-of-the-art RATSQL model (Wang et al., 2020). RATSQL provides a unified framework, based on a relation-aware Transformer (RAT), to encode the question and the corresponding schema. The relation-aware Transformer is an important extension of the traditional Transformer, which treats the input sequence as a labeled, directed, fully-connected graph, so that the pairwise relations between input elements are considered. RAT incorporates the relation information into Equation 1: the edge from element x_i to element x_j is represented by a vector r_ij, which is incorporated as bias terms in the Transformer layer, as follows:

e_ij^(h) = x_i W_Q^(h) (x_j W_K^(h) + r_ij^K)^T / sqrt(d_z),    α_ij^(h) = softmax_j(e_ij^(h)),
z_i^(h) = Σ_{j=1}^{n} α_ij^(h) (x_j W_V^(h) + r_ij^V),    (2)

where r_ij^K and r_ij^V are the key and value representations of the relation r_ij. The relations among the Text-to-SQL input elements can be categorized into three types: intra-question, question-schema, and intra-schema. An intra-question relation means both tokens are elements of the question. The question-schema relations are normally called schema linking, and represent the matching degree between a question token and a schema token. The intra-schema relations include the relation types of the relational database: primary key, foreign key, etc. These relations within the input elements are independent of the domain information of the database; incorporating such domain-independent relations into the representation of the Text-to-SQL input is thus beneficial to the Text-to-SQL parser.
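The relation-aware bias terms amount to a small modification of ordinary self-attention. Below is a simplified single-head NumPy sketch, under the assumption that each pair (i, j) carries learned bias vectors r_ij^K and r_ij^V (here random stand-ins); it illustrates the mechanism, not the authors' code.

```python
import numpy as np

def softmax(e):
    e = e - e.max(axis=-1, keepdims=True)
    p = np.exp(e)
    return p / p.sum(axis=-1, keepdims=True)

def relation_aware_attention(X, Wq, Wk, Wv, Rk, Rv):
    """Single-head relation-aware self-attention:
    e_ij = x_i Wq (x_j Wk + r_ij^K)^T / sqrt(d_z);
    z_i  = sum_j alpha_ij (x_j Wv + r_ij^V)."""
    d_z = Wq.shape[1]
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # K[None] broadcasts (n, d) keys against the (n, n, d) relation biases
    e = np.einsum('id,ijd->ij', Q, K[None, :, :] + Rk) / np.sqrt(d_z)
    alpha = softmax(e)
    z = np.einsum('ij,ijd->id', alpha, V[None, :, :] + Rv)
    return z, alpha

rng = np.random.default_rng(1)
n, d = 4, 6
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Rk = rng.normal(size=(n, n, d)) * 0.1  # r_ij^K: relation bias on keys
Rv = rng.normal(size=(n, n, d)) * 0.1  # r_ij^V: relation bias on values
Z, alpha = relation_aware_attention(X, Wq, Wk, Wv, Rk, Rv)
print(Z.shape)  # (4, 6)
```

With `Rk` and `Rv` set to zero this reduces to ordinary self-attention, which is why RAT is described as an extension of Equation 1 rather than a different mechanism.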
During decoding, the SQL query is first represented as an abstract syntax tree (AST) following a well-designed grammar. After that, the AST is flattened into a sequence by depth-first search (DFS). RATSQL uses an LSTM to generate the flattened AST sequence. The actions defined by the grammar have two types: (1) expanding the last generated node into a grammar rule, called APPLYRULE; or (2), when completing a leaf node, selecting a column or table from the schema, called SELECTCOLUMN and SELECTTABLE respectively.
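The flattening step can be illustrated with a toy grammar. The node layout and rule names below are hypothetical simplifications for illustration; the actual RATSQL grammar is far richer.

```python
# A minimal sketch of flattening a SQL abstract syntax tree into the
# APPLYRULE / SELECTCOLUMN / SELECTTABLE action sequence that a
# RATSQL-style decoder emits (toy grammar, not the real one).
def flatten_ast(node):
    """Depth-first traversal producing the decoder's action sequence."""
    kind = node["type"]
    if kind == "column":
        return [("SELECTCOLUMN", node["name"])]
    if kind == "table":
        return [("SELECTTABLE", node["name"])]
    actions = [("APPLYRULE", node["rule"])]
    for child in node.get("children", []):
        actions.extend(flatten_ast(child))  # DFS: expand children left to right
    return actions

# AST for: SELECT cost_of_treatment FROM Treatments
ast = {"type": "sql", "rule": "sql -> select from",
       "children": [
           {"type": "select", "rule": "select -> col",
            "children": [{"type": "column", "name": "cost_of_treatment"}]},
           {"type": "from", "rule": "from -> tab",
            "children": [{"type": "table", "name": "Treatments"}]}]}
print(flatten_ast(ast))
```

Generating rule applications instead of raw SQL tokens guarantees that every decoded sequence corresponds to a syntactically valid query tree.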

Dual Learning for Utterance Rewrite
Due to the limited in-domain annotated rewrite data, we propose a semi-supervised learning method via dual learning to make full use of the unlabeled multi-turn data and learn a reliable rewrite model. In this section, we first introduce the primal and dual tasks of utterance rewrite. We then demonstrate the dual learning algorithm for utterance rewrite in detail, where a large amount of unlabeled utterance rewrite data participates in optimizing the primal and dual models under the dual learning framework.

Primal and Dual Tasks
In a conversation scenario, co-reference and information ellipsis are always present in users' expressions (Androutsopoulos et al., 1995). Recent work (Iyyer et al., 2017; Andreas et al., 2020) has made a significant step toward analyzing the co-reference and ellipsis phenomena at a fine-grained level. Co-reference has been divided into five types according to the pronoun involved: Bridging Anaphora, Definite Noun Phrases, One Anaphora, Demonstrative Pronoun, and Possessive Determiner. Ellipsis has been characterized by its intention: Continuation and Substitution, where substitution can be further classified into four types: explicit vs. implicit and schema vs. operator. We refer readers to the cited work for a detailed introduction of these fine-grained types.
The primal task aims to denoise the above co-reference and ellipsis and generate a semantic-completion utterance, given the latest utterance x^(t) at the t-th turn and the dialogue history h = [x^(j)]_{j=1}^{t-1}. We directly use the pretrained BART as the rewrite model (named the rewriter). We concatenate the dialogue history and the latest utterance as the input of the rewriter, separated by the special token "</s>". The dual task is to generate a simplified expression based on the latest utterance and the dialogue history. The simplified expression contains the above co-reference and ellipsis as much as possible without changing the original semantic meaning of the dialogue. Similar to the rewriter, we use the pretrained BART as the initial simplification model (named the simplifier).
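The input construction for the rewriter can be sketched as follows. The separator string matches BART's "</s>" token as described above; the example dialogue is taken from Figure 1, and the exact spacing around the separator is an assumption of this sketch.

```python
SEP = "</s>"

def build_rewriter_input(history, utterance):
    """Concatenate the dialogue history and the latest utterance,
    separated by the special token </s>."""
    return f" {SEP} ".join(history + [utterance])

history = ["Show the treatment details.", "Order the cost in ascending order."]
latest = "What about in descending order?"
print(build_rewriter_input(history, latest))
# Show the treatment details. </s> Order the cost in ascending order. </s> What about in descending order?
```

The simplifier consumes the same concatenated format; only the generation target differs (a context-dependent utterance instead of a semantic-completion one).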

Dual Learning Algorithm
Under the dual learning framework, the dual models can be regarded as two agents in a closed-loop game. The game starts with one of the dual agents. The output of the start agent will be scored by an external reward function. Since the reward feedback is non-differentiable, the start agent is optimized by the policy gradient method (Sutton et al., 1999). The end agent is fed with the output of the start agent, where the end agent aims to reconstruct the initial input of the start agent. Thus, the end agent can be optimized by maximum likelihood estimation (MLE). Before deep-diving into the dual learning algorithm, we first introduce the definitions of the dual framework for the utterance rewrite.

Definition
There are two dual models: the rewriter with parameters Θ_c and the simplifier with parameters Θ_s. Two language models (LM_c(·) and LM_s(·)) are used to evaluate the quality of the utterances generated by the rewriter and the simplifier respectively. Both language models are fine-tuned from the GPT-2 model (Radford et al., 2019). LM_c(·) is trained on the semantic-completion Spider dataset. LM_s(·) is trained on the multi-turn Text-to-SQL data (SParC and CoSQL), where the utterances at the first turn are removed. There is also an external single-turn Text-to-SQL parser RATSQL(·), which parses a question into a SQL query. Next, we introduce the strategy of agent optimization under the dual learning framework.

Loop Starts from Rewriter
As shown in Fig. 2, we sample an unlabeled dialogue (x^(t); h) from D_u. The rewriter generates k possible rewritten candidates [ĉ_i^(t)]_{i=1}^{k} with a beam search mechanism. Two levels of external reward functions evaluate the quality of a generated ĉ_i^(t): a token-level reward and a sentence-level reward.

Token-level Reward To preserve the schema information of the database mentioned in the original utterance x^(t), the generated token ĉ_i^(t,j) gets a +0.1 reward at the j-th step when it is a database-related token mentioned in x^(t). To decrease the co-reference phenomenon in the rewritten utterance, we punish generated pronoun words (e.g., it, their, and so on) with −0.1. Otherwise, the generated token gets zero reward.

Sentence-level Reward We first use the pretrained language model LM_c(·) to evaluate the quality of the rewritten utterance with r_{LM_c} = log(LM_c(ĉ_i^(t))) / len(ĉ_i^(t)), where len(ĉ_i^(t)) denotes the number of tokens in ĉ_i^(t). In practice, the rewritten utterance ĉ_i^(t) could be directly evaluated by the user, who does not need any SQL background: the user can give an indicator score (0 or 1) showing whether ĉ_i^(t) meets his/her real intent. Instead, we feed the rewritten utterance ĉ_i^(t) into the Text-to-SQL parser and get the corresponding SQL query with q̂ = RATSQL(ĉ_i^(t)). If q̂ equals the gold SQL, we say the rewritten utterance meets the user's intent (r_u = 1) and vice versa (r_u = 0). The final sentence-level reward of ĉ_i^(t) is r_i^c = r_{LM_c} + r_u.

For the j-th token in the rewritten utterance ĉ_i^(t), the accumulated reward can be represented as R_i^(t,j) = Σ_{m=j}^{M} λ^{m−j} r_i^(t,m), where M = len(ĉ_i^(t)), r_i^(t,m) denotes the m-th token reward of the rewritten utterance, and the final token reward equals the sentence-level reward, r_i^(t,M) = r_i^c.
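The token-level reward above can be sketched as follows. This is a simplified illustration: the pronoun list and the schema-token set are hypothetical stand-ins for the real pronoun vocabulary and the database-related tokens detected in x^(t).

```python
# Token-level reward for the rewriter loop: +0.1 for database-related
# tokens from the original utterance, -0.1 for pronouns, 0 otherwise.
PRONOUNS = {"it", "its", "they", "their", "them", "one", "this", "that"}

def token_rewards(generated_tokens, schema_tokens_in_x):
    rewards = []
    for tok in generated_tokens:
        t = tok.lower()
        if t in schema_tokens_in_x:
            rewards.append(0.1)   # reserve schema mentions
        elif t in PRONOUNS:
            rewards.append(-0.1)  # punish remaining co-reference
        else:
            rewards.append(0.0)
    return rewards

toks = "which treatment is the most recent cost ?".split()
print(token_rewards(toks, {"treatment", "cost"}))
# [0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.1, 0.0]
```

In the simplifier loop the same shape is used with the signs flipped: schema mentions are punished and pronouns rewarded.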
The rewriter can then be optimized by the policy gradient method as:

∇_{Θ_c} J = (1/k) Σ_{i=1}^{k} Σ_{j=1}^{M} R_i^(t,j) ∇_{Θ_c} log p(ĉ_i^(t,j) | ĉ_i^(t,<j), x^(t), h; Θ_c).

To force the simplifier to reconstruct the original input x^(t) as closely as possible, the simplifier can be optimized with maximum likelihood estimation (MLE) as:

L(Θ_s) = − Σ_{i=1}^{k} log p(x^(t) | ĉ_i^(t), h; Θ_s).

Note that x^(t) could already be a semantic-completion utterance, in which case it is not reasonable to force the simplifier to reconstruct it. Thus, we first compare the length of the original utterance len(x^(t)) with the length of the rewritten one len(ĉ_i^(t)); only when len(x^(t)) < len(ĉ_i^(t)) do we optimize the simplifier with MLE.
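The accumulated per-token return used in the policy gradient update can be computed in pure Python. In this sketch `lam` is the discount rate λ (set to 1 in the paper's experiments), and, following the description above, the final token's reward is set to the sentence-level reward.

```python
def accumulated_returns(token_rewards, sentence_reward, lam=1.0):
    """R^(j) = sum_{m>=j} lam^(m-j) * r^(m), computed right-to-left,
    where the final token's reward equals the sentence-level reward."""
    rewards = list(token_rewards)
    rewards[-1] = sentence_reward  # r^(t,M) = sentence-level reward
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + lam * running
        returns.append(running)
    return returns[::-1]

print(accumulated_returns([0.1, 0.0, -0.1], sentence_reward=1.0))
# [1.1, 1.0, 1.0]
```

Each R^(j) then weights the log-probability of the j-th generated token in the REINFORCE-style gradient, so early tokens are credited with everything the utterance earns afterwards.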

Loop Starts from Simplifier
As shown in Fig. 2, we also sample an unlabeled dialogue (x^(t); h) from D_u. The simplifier generates k possible simplified candidates [ŝ_i^(t)]_{i=1}^{k} with a beam search mechanism, evaluated by two levels of external reward functions.

Token-level Reward To decrease the schema information mentioned in the original utterance x^(t), the generated token ŝ_i^(t,j) gets a −0.1 punishment at the j-th step when it is a database-related token mentioned in x^(t) and the history h. To encourage the co-reference phenomenon in the simplified utterance, we award pronoun words with a +0.1 reward. Otherwise, the generated token gets zero reward.

Sentence-level Reward We only use the pretrained language model LM_s(·) to evaluate the quality of the simplified utterance, with r_{LM_s} = log(LM_s(ŝ_i^(t))) / len(ŝ_i^(t)). As before, r_i^(t,j) denotes the j-th token reward of the simplified utterance, and the final token reward equals the sentence-level reward, r_i^(t,m) = r_{LM_s}. Similar to the first loop in Section 3.2.2, the simplifier and the rewriter can be optimized with the policy gradient method and MLE respectively. Note that only when len(x^(t)) > len(ŝ_i^(t)) do we optimize the rewriter with MLE.

Experiments
A series of experiments is conducted to validate our proposed utterance rewrite model and the decoupled framework. We first validate the pretrained BART's performance on the utterance rewrite benchmarks. Then, multi-turn Text-to-SQL with the decoupled parsing method (DELTA) is evaluated with limited utterance rewrite data. Finally, we analyze the interpretability of the decoupled parsing method through a case study.

Experimental Setup
Datasets & Metrics Our proposed rewrite model is validated on two utterance rewrite datasets: TASK (Quan et al., 2019) and CANARD (Elgohary et al., 2019b). We use the automatic metrics BLEU, ROUGE, EM (exact match), and rewrite F-score for evaluation. BLEU_n (B_n) and ROUGE_n (R_n) calculate the similarity and the overlap at the n-gram level between predictions and gold references. EM is the exact match rate, where the prediction exactly equals the gold reference. The rewrite F-score F_n is calculated on the collection of n-grams that contain at least one word from the context. Our decoupled parsing method is evaluated on two multi-turn Text-to-SQL tasks: SParC and CoSQL. Following Yu et al. (2019b), we use Question Match and Interaction Match as the metrics. Question Match means the predicted SQL equals the gold one for each question, while Interaction Match requires the predicted SQL queries of all the questions in an interaction to be correct.
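A simplified sketch of the exact-match and rewrite F-score computations described above is given below. This is illustrative only; the official evaluation scripts may tokenize and normalize differently.

```python
from collections import Counter

def exact_match(pred, gold):
    """EM: case-insensitive exact string match."""
    return pred.strip().lower() == gold.strip().lower()

def rewrite_fscore(pred, gold, context_words, n=1):
    """F-score over the n-grams that contain at least one word
    from the dialogue context."""
    def grams(text):
        toks = text.lower().split()
        return Counter(g for g in zip(*[toks[i:] for i in range(n)])
                       if any(w in context_words for w in g))
    p, g = grams(pred), grams(gold)
    if not p or not g:
        return 0.0
    overlap = sum((p & g).values())  # multiset intersection of n-grams
    prec, rec = overlap / sum(p.values()), overlap / sum(g.values())
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

ctx = {"treatment", "details"}
print(rewrite_fscore("show the treatment details",
                     "show the treatment details", ctx))  # 1.0
```

Restricting the F-score to context-bearing n-grams focuses the metric on exactly the words the rewriter must recover from the dialogue history, rather than on words copied verbatim from the current turn.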
Implementation Details Our implementation is based on PyTorch (Paszke et al., 2019) and HuggingFace's Transformers library (Wolf et al., 2020). We reproduce RATSQL with the same setup presented in Wang et al. (2020), where the encoder consists of eight relation-aware Transformer (RAT) layers. When fine-tuning the BART models (rewriter and simplifier) on the utterance rewrite datasets, we use AdamW as the optimizer with a learning rate of 2e-6. During the dual learning and co-training periods, we set the learning rate to 1e-6. Specifically, the BART models mentioned above refer to BART_large. The discount rate λ in the dual learning method is 1.

BART as Rewrite Model
For the rewrite task, we compare the pretrained BART with state-of-the-art rewrite models: L-Ptr-Gen (See et al., 2017), GECOR (Quan et al., 2019), and RUN (Liu et al., 2020b). Table 1 shows the experimental results on the TASK and CANARD datasets. As indicated, using BART as the rewrite model surpasses the best baseline RUN by a large margin on all metrics. Even on the most challenging metric, EM, BART exceeds the previous best model by 5.0 points on TASK. BART also obtains a large boost on CANARD, improving the state of the art by 4.1 points and 6.8 points on B_4 and R_L respectively. These results demonstrate the superiority of BART as the rewrite model.

DELTA for Decoupled Parsing
Regarding the multi-turn Text-to-SQL task, we compare the decoupled parsing method with all the released end-to-end multi-turn Text-to-SQL models: EditSQL, RichContext, IGSQL (Cai and Wan, 2020), and R2SQL (Hui et al., 2021).
[Table 3: Three instances parsed by our proposed decoupled parsing method with the rewritten utterance and the final predicted SQL query. Red indicates where an error happens; green marks our modification.]

Since there is no utterance rewrite data for the SParC and CoSQL datasets, we randomly sample 10% of the dialogues from these two datasets and annotate them as rewrite in-domain data. There are 741 annotated turns and 695 annotated turns on SParC and
CoSQL respectively. In Phase-I, we first use the rewrite in-domain data to warm up the rewrite model and the simplification model, whose encoders share parameters, inspired by Lample et al. (2018). Then, we use the remaining 90% of the dialogues as unlabeled data to further improve the rewrite model with the dual learning method detailed in Section 3. In Phase-II, we first use the single-turn Text-to-SQL data (Spider) to warm up the RATSQL parser. Since there is an annotation gap¹ between Spider and the multi-turn Text-to-SQL datasets, we use the annotated Text-to-SQL data of SParC and CoSQL to fine-tune the pretrained RATSQL parser, where the multi-turn dialogue data are rewritten as single-turn data by the rewrite model trained in Phase-I. Table 2 shows that our proposed decoupled framework (DELTA + Dual) obtains a considerable performance boost on the SParC and CoSQL datasets.

Ablation Study
We conducted an ablation study to analyze the contribution of our proposed decoupled parsing framework on the SParC dataset. To compare with our proposed dual learning method on the rewrite task, we examined another semi-supervised learning method, co-training (Blum and Mitchell, 1998), which uses the pretrained rewrite model to annotate unlabeled data and adds these pseudo-labeled data to iteratively improve the original rewrite model. To compare fairly with the dual learning method, we only use the pseudo-labeled rewrite data that are correctly predicted by the RATSQL parser at each iteration of the co-training method.

¹ For example, "Tell me how many rooms cost more than 120, for each different decor." is annotated as "SELECT decor, count(*) FROM Rooms WHERE basePrice > 120 GROUP BY decor" in SParC, while it would tend to be annotated as "SELECT count(*) FROM Rooms WHERE basePrice > 120 GROUP BY decor" in Spider.

[Table 4: Ablation study on SParC.

Variants                        QM    IM
(0) DELTA + Dual               58.6  35.6
(1) DELTA + Co-training        57.2  33.6
(2) − Dual                     55.5  31.7
(3) − parsing in-domain data   54.7  31.5
(4) − rewrite in-domain data   42.1  14.4
(5) − rewriter                 34.5   7.1]

As shown in Table 4, our proposed dual learning method outperforms the co-training method at row (1). To further validate the effect of the dual learning method, we remove the dual learning part of Phase-I at row (2). Compared with our adapted dual learning method, these two variants show a significant performance degradation, which demonstrates the superiority of the dual learning method on the rewrite task.

Compared with end-to-end multi-turn Text-to-SQL models, our proposed decoupled parsing framework does not even require any annotated multi-turn in-domain data. We first evaluate the performance of the decoupled method without any Text-to-SQL parsing in-domain data at row (3). There are 3.9-point and 4.1-point degradations on question match accuracy and interaction match accuracy respectively, caused by the annotation gap between Spider and SParC. We further drop our annotated rewrite in-domain data at row (4) and warm up the rewrite model and the simplification model with the TASK and CANARD datasets. As shown in Table 4, the decoupled parsing framework still achieves 42.1% question match accuracy without any annotated multi-turn in-domain data. Lastly, we remove Phase-I (the rewrite model) entirely at row (5), where the RATSQL parser is trained on Spider and fine-tuned with multi-turn Text-to-SQL data. It can be regarded as the baseline of all the ablations, obtaining only 34.5% question match accuracy.

Case Analysis
Compared with end-to-end multi-turn Text-to-SQL models, our decoupled parser can generate the intermediate rewritten utterance, which is easier for the user to understand than a SQL query. As introduced in Section 3.2.2, the user's feedback can be used to optimize the rewrite model. Additionally, our decoupled parser makes data collection more convenient than end-to-end methods: rewriting an utterance does not require annotators to be familiar with SQL, and when collecting single-turn Text-to-SQL data, the annotator does not need to consider the dialogue context. It is also costly to collect dialogue data for the SQL query task. Table 3 displays three cases parsed by our proposed decoupled method. Under the decoupled parsing framework, we can pinpoint exactly in which phase an error occurred. Through such fine-grained error analysis, the bottleneck of the multi-turn parser can be located accurately, and we can then optimize that bottleneck individually. Figure 3(a) shows the error rate of the utterance rewrite model (DELTA + Dual) on the SParC development set at a fine-grained level. The orange line denotes the error rate on each individual co-reference or ellipsis type; the blue line denotes the overall error ratio. We can see that most rewrite errors happen on the co-reference side, especially for the Demonstrative Pronoun type. For ellipsis, the Continuation type is a serious problem. Figure 3(b) shows the error ratios attributable to the rewrite model (Phase-I) and the parsing model (Phase-II). We find that in the first three turns the parsing model is still the bottleneck, while after the third turn the rewrite model shows a larger error rate; the error rate of the rewrite model is more sensitive to the turn number than that of the parser. We conclude that more annotated rewrite data are needed, especially for the Continuation and Demonstrative Pronoun types.

Related Work
Utterance Rewrite Recently, utterance rewrite has attracted much attention. Some works use a sequence-to-sequence architecture with a copy mechanism (Elgohary et al., 2019b; Quan et al., 2019; Rastogi et al., 2019) to solve the incomplete question problem. Another line of work decomposes the utterance rewrite model into two-phase subtasks, split and recombine, where both models are learned from a well-designed reward function by the policy gradient method. Borrowing the idea from image segmentation, Liu et al. (2020b) formulate utterance rewrite as a semantic segmentation task, where the rewrite model is implemented with UNet (Ronneberger et al., 2015). For downstream tasks, utterance rewrite has been successfully applied in dialogue state tracking (DST) (Rastogi et al., 2019; Han et al., 2020). There is also work proposing a rule-based and self-supervised learning method to generate weakly-supervised rewrite data, which is used to fine-tune GPT-2. Different from these previous works, we directly use the pretrained BART, which is a denoising autoencoder, as the utterance rewrite model.

End-to-End Text-to-SQL Parser EditSQL proposes an edit-based model that reuses the SQL query generated in the previous step to alleviate the pressure of increasing turns. RichContext conducts an exploratory study on semantic parsing in context and performs a fine-grained analysis. IGSQL (Cai and Wan, 2020) presents a schema interaction graph encoder to capture the historical information of database schema items. R2SQL (Hui et al., 2021) presents a dynamic graph framework that employs dynamic memory decay mechanisms to introduce inductive bias and construct enriched contextual relation representations at both the utterance and token level.

Dual Learning The dual learning method was first proposed to improve neural machine translation (NMT) (He et al., 2016). The dual learning mechanism enables a pair of dual systems to automatically learn from unlabeled data through a closed-loop game. The idea of dual learning has been applied to various tasks, such as question answering (Tang et al., 2017) and question generation (Tang et al., 2018), image-to-image translation (Yi et al., 2017), open-domain information extraction/narration, text simplification, semantic parsing (Cao et al., 2019; Cao et al., 2020), and dialogue state tracking (Chen et al., 2020c).

Conclusion and Future Work
In this paper, we propose a decoupled parsing framework (DELTA + Dual) to solve the multi-turn Text-to-SQL task. Previous end-to-end multi-turn Text-to-SQL models rely on large-scale multi-turn data, whereas DELTA can achieve considerable performance without any multi-turn Text-to-SQL data. We adapt the pretrained BART as the rewrite model and achieve new state-of-the-art performance on the utterance rewrite benchmarks. We further propose an efficient dual learning method to make full use of unlabeled dialogue data. On the challenging multi-turn Text-to-SQL benchmarks, DELTA surpasses all the released end-to-end models with fully labeled data. In the future, we will try to reformulate the decoupled parsing method as a multi-task setting, where the rewrite model and the Text-to-SQL model are trained simultaneously. The proposed DELTA is also easy to extend to other conversational semantic parsing tasks, such as dialogue state tracking.