Reconstruct Before Summarize: An Efficient Two-Step Framework for Condensing and Summarizing Meeting Transcripts

Meetings typically involve multiple participants and lengthy conversations, resulting in redundant and trivial content. To overcome these challenges, we propose a two-step framework, Reconstruct before Summarize (RbS), for effective and efficient meeting summarization. RbS first leverages a self-supervised paradigm to annotate essential content by reconstructing the meeting transcripts. Second, we propose a relative positional bucketing (RPB) algorithm that equips conventional summarization models to generate the summary. Despite the additional reconstruction process, our proposed RPB significantly compresses the input, leading to faster processing and reduced memory consumption compared to traditional summarization methods. We validate the effectiveness and efficiency of our method through extensive evaluations and analysis. On two meeting summarization datasets, AMI and ICSI, our approach outperforms previous state-of-the-art approaches without relying on large-scale pre-training or expert-grade annotating tools.


Introduction
Although numerous achievements have been made in abstractive summarization of well-structured text (Zhang et al., 2020a; Liu* et al., 2018; Lewis et al., 2020), research on meeting summarization remains limited. Several outstanding challenges persist in this field: 1) heavy noise introduced by automatic speech recognition models; 2) lengthy meeting transcripts consisting of casual conversations, redundant content, and diverse topics; 3) salient information scattered across this noisy and lengthy context, making it difficult for models to capture pertinent details.
To this end, previous works adapt language models to long inputs through techniques such as long-sequence processing (Beltagy et al., 2020; Tay et al., 2020; Zhong et al., 2022) and hierarchical
learning (Zhu et al., 2020; Rohde et al., 2021), or tailor the input to an acceptable length through sentence compression (Shang et al., 2018a) and coarse-to-fine generation (Zhang et al., 2022). However, these approaches do not specifically target the critical information in meeting transcripts. Feng et al. (2021b) utilize the token-wise loss as a criterion to annotate content with DialoGPT (Zhang et al., 2020b), but suffer from labeling unpredictable content as critical information. Besides, the commonly used pre-processing procedure, which extends models' positional embeddings by copying and truncates the lengthy input, compromises the positional relationships learned during pre-training and loses important information through brutal truncation. Consequently, a natural question arises: how can we precisely capture the salient contents of noisy and lengthy meeting transcripts, and summarize them with conventional language models?

Our observation is that meetings are characterized by extensive communication and interaction, with specific texts often containing the pivotal content that drives these interactions. Based on this understanding, we propose a two-step meeting summarization framework, Reconstruct before Summarize (RbS), to address the challenge of scattered information in meetings. RbS adopts a reconstructor to reconstruct the responses in the meeting; it also synchronously traces which texts in the meeting drove the responses and marks them as essential content. Salient information is thereby captured and annotated as anchor tokens in RbS. To preserve the anchors while compressing the lengthy and noisy input, we propose relative positional bucketing (RPB), a dynamic embedding-compression algorithm inspired by relative positional encoding (RPE) (Shaw et al., 2018; Huang et al., 2019). Our RPB-integrated summarizer preserves the anchors and compresses the less important contents according to their relative position to the anchors. This allows the
summarizer to generate a concise and informative summary of the meeting transcripts.
Although RbS introduces an additional importance-assessment step, RPB greatly compresses the length of the original text, making RbS faster and more memory-efficient than the traditional one-step approach. Experimental results on AMI (Mccowan et al., 2005) and ICSI (Janin et al., 2003) show that RbS outperforms previous state-of-the-art approaches and surpasses a strong baseline pre-trained on a large-scale dialogue corpus and tasks. Extensive experiments and analyses verify the effectiveness and efficiency of each component of our approach.
To sum up, our contributions are as follows: (1) we propose RbS, an efficient and effective framework for summarizing long meeting transcripts; (2) without external annotating tools or large-scale pre-training corpora and tasks, our method can efficiently generate meeting minutes with conventional pre-trained language models (PLMs); (3) extensive experiments demonstrate the effectiveness of our framework.

Methods
The main architecture is shown in Figure 1. Our framework comprises two components: the reconstructor and the summarizer. The reconstructor is responsible for reconstructing meeting transcripts and identifying the context that drives the interaction. The summarizer then compresses the lengthy input while preserving critical content before generating the summary.

Reconstruction and Retracing
To capture essential information, we propose retracing the contexts that drive interactions with a reconstructor. We split the meeting transcripts into context-response pairs. By reconstructing each response from its context and tracking the contributing contexts, we can effectively capture the important content of the transcript.
Reconstruction The architecture in Figure 1 illustrates the process. To recover each response, we use a window of size w to limit the input history, with w set to 3 in Figure 1. We assume that a meeting transcript contains m sentences and create a sub-dataset consisting of m pairs, i.e., {S_[max(0, i-w) : i-1], S_i}, where i ∈ [2 : m]. To prompt the language model to predict the end of the meeting, we add a special token [EOM] at the end of the transcript. Finally, a reconstructor recovers each response, from S_2 through [EOM], as closely as possible from the input S_[max(0, i-w) : i-1]. The reconstruction is conducted via the forward pass of the language model with teacher forcing (Williams and Zipser, 1989; Lamb et al., 2016).

Retracing As shown in Figure 2, during the reconstruction, RbS synchronously retraces the contribution of each token of the context to the recovery of the response, from the perspective of attention weights (Bahdanau et al., 2014; Kim et al., 2017; Vaswani et al., 2017) and gradients. Recapping the attention mechanism, the cross-attention (Vaswani et al., 2017) is formulated as:

Attention(Q, K, V) = softmax(QK^T / √d_k) V,

where Q is the representation of the response to be generated in the decoder, and K and V are the memories and values that come from the encoded contexts. Inspired by works that adopt attention and gradients to retrace which part of the input drives the model to make predictions (Jain et al., 2020; Kindermans et al., 2016; Atanasova et al., 2020; Sundararajan et al., 2017), we extract the importance scores with scaled attention (Serrano and Smith, 2019), a∇a, from the last cross-attention layer to determine the contribution of each token in the contexts S_[max(0, i-w) : i-1] to the restoration of the response S_i. The scaled attention a∇a denotes the attention scores a_i scaled by their corresponding gradients ∇a_i = ∂ŷ/∂a_i, where ŷ is the model's prediction.
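As a minimal sketch of the a∇a scoring (assuming the attention weights and their gradients have already been extracted from the last cross-attention layer; the array shapes and toy values are illustrative, not from the paper):

```python
import numpy as np

def scaled_attention_scores(attn, grad):
    """Score each context token by a * grad(a) (scaled attention),
    then aggregate over the response (query) positions."""
    return (attn * grad).sum(axis=0)

# Toy example: 2 response tokens attending over 4 context tokens.
attn = np.array([[0.1, 0.6, 0.2, 0.1],
                 [0.3, 0.3, 0.2, 0.2]])   # attention weights a
grad = np.array([[0.0, 1.0, 0.5, 0.0],
                 [0.2, 0.9, 0.1, 0.0]])   # gradients of the prediction w.r.t. a
scores = scaled_attention_scores(attn, grad)  # one score per context token
```

Context tokens with the highest scores become anchor candidates; here the second token dominates because it has both high attention and high gradient.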

Scores Aggregation
We utilize a context window of size w to reconstruct responses, which yields reasonable reconstructions. However, this also means each sentence serves as context up to w times when recovering responses, which poses a challenge in combining importance-related scores during retracing and scoring. To address this issue, we propose the hypothesis that each token, after being rated by w different responses, can be considered scored according to w different criteria. We therefore investigate two strategies in this paper: averaging and multi-view voting. Averaging takes the mean of the w ratings for each token during the reconstruction process; we then select the top-k tokens with the highest average rating as the salient information, referred to as anchor tokens in RbS. This approach uses the average score to express the overall contribution of each token. Multi-view voting selects the top-k_m tokens with the highest score under each criterion after the reconstruction is completed. This approach considers multiple perspectives for evaluating contexts, selecting the contexts that contribute most prominently under each perspective as anchors.
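The two strategies can be sketched as follows (a hypothetical score matrix of shape (w, n), one rating of every token per response criterion, stands in for the real retraced scores):

```python
import numpy as np

def averaging(scores, k):
    """Top-k tokens by mean score across the w criteria."""
    return set(np.argsort(-scores.mean(axis=0))[:k].tolist())

def multiview_voting(scores, k_m):
    """Union of the top-k_m tokens under each criterion separately."""
    anchors = set()
    for view in scores:                  # one view per criterion
        anchors.update(np.argsort(-view)[:k_m].tolist())
    return anchors

# Toy scores: w = 2 criteria rating n = 4 tokens.
scores = np.array([[0.9, 0.4, 0.5, 0.0],
                   [0.0, 0.4, 0.5, 0.9]])
avg = averaging(scores, k=2)           # overall contribution
mvv = multiview_voting(scores, k_m=1)  # one winner per view
```

Note how the last token, prominent under only one criterion, is picked up by multi-view voting but missed by averaging, which illustrates the motivation for the voting scheme.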

Summarization
With the obtained anchors, our hypothesis is that tokens in meeting texts are more relevant to salient information when they are closer to the anchors, and conversely, tokens farther from the anchors are less relevant to the important content. We therefore propose relative positional bucketing (RPB), which compresses the original input as losslessly as possible by preserving the anchors and dynamically compressing the less important contexts around them.
Relative Positional Bucketing RbS employs a conventional language model that accepts c-token inputs as the summarizer. Consider a sequence {t_0, t_1, ..., t_m} with embeddings {e_0, e_1, ..., e_m} and anchor positions {i_0, i_1, ...}. The summarizer first takes the midpoints between all adjacent anchors as boundaries. Each pair of adjacent boundaries forms a sub-sequence containing one anchor. For each sub-sequence, the summarizer obtains the position of each token relative to the anchor. Inspired by T5 (Raffel et al., 2020), which translates relative positions into bucket numbers for memory-efficient and long-sequence-friendly attention, we compress the sequence by bucketing the embeddings {e_0, e_1, ..., e_m} into c buckets.
We assign larger buckets to embeddings that have a large relative distance to the anchors, and smaller buckets to embeddings close to the anchors. Embeddings that share the same bucket are compressed by average pooling. The bucket assignment for each token is computed as in Algorithm 1. Finally, the summarizer processes the compressed embeddings and generates the summary. The only difference between our proposed summarizer and the original BART is the addition of the RPB module, inserted between the embedding layer and the first attention layer. Summarization is based on the compressed embeddings instead of the original ones, which greatly saves memory and computation time. Furthermore, it is noteworthy that the bucketing operation and the batched average pooling are a parallel constant-time calculation and a scatter-reduce operation, respectively, making the process highly efficient.
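A simplified, self-contained sketch of the bucketing idea follows. This is not the paper's Algorithm 1 verbatim: the exact/log-spaced bucket schedule and the parameter names (`exact`, `max_buckets`, `max_dist`) are illustrative assumptions in the spirit of T5's relative-position buckets.

```python
import numpy as np

def bucket_for_distance(d, exact=4, max_buckets=8, max_dist=64):
    """Map a relative distance to a bucket id: one bucket per position
    up close, logarithmically coarser buckets further away (T5-style)."""
    if d < exact:
        return d
    frac = np.log(d / exact) / np.log(max_dist / exact)
    return min(max_buckets - 1, exact + int(frac * (max_buckets - exact)))

def rpb_compress(emb, anchors):
    """emb: (n, d) token embeddings; anchors: sorted anchor positions.
    Tokens are grouped by (nearest-anchor segment, side, distance bucket)
    and average-pooled within each group, preserving anchors exactly."""
    n = emb.shape[0]
    # Segment boundaries: midpoints between adjacent anchors.
    bounds = [0] + [(a + b + 1) // 2 for a, b in zip(anchors, anchors[1:])] + [n]
    groups = {}
    for seg, (lo, hi) in enumerate(zip(bounds, bounds[1:])):
        a = anchors[seg]
        for pos in range(lo, hi):
            key = (seg, np.sign(pos - a), bucket_for_distance(abs(pos - a)))
            groups.setdefault(key, []).append(pos)
    # Keep original order; average-pool each bucket.
    keys = sorted(groups, key=lambda k: min(groups[k]))
    return np.stack([emb[groups[k]].mean(axis=0) for k in keys])

emb = np.arange(60, dtype=float).reshape(20, 3)   # 20 toy token embeddings
compressed = rpb_compress(emb, anchors=[5, 14])   # shorter than the input
```

Each anchor sits alone in its distance-0 bucket, so anchor embeddings survive pooling untouched while distant context is merged, which is the "soft truncation" behavior described above.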

Setup
Dataset & Preprocessing RbS is evaluated on two meeting summarization datasets, AMI (Mccowan et al., 2005) and ICSI (Janin et al., 2003). AMI is a dataset of business project meeting scenarios that contains 137 transcripts; the average lengths of the input and target are 6,007 and 296, respectively. ICSI covers academic discussion scenarios, where professors and other students hold discussions; the average lengths of the input and target reach 13,317 and 488.5, respectively, while only 59 meeting transcripts are included. Following the preprocessing pipeline proposed in Shang et al. (2018a), we split the data into training/development/testing sets with the lists provided in Shang et al. (2018b): 97/20/20 for AMI and 42/11/6 for ICSI. Besides the meeting minutes, the gold summaries include decisions, actions (progress in ICSI), and problems encountered in the meeting. Extra spaces and duplicate punctuation are removed to further clean the data.
Baseline & Metric BART (Lewis et al., 2020) is selected as both the baseline and the backbone. BART-CNN, fine-tuned on CNN-Daily Mail (Hermann et al., 2015), is also evaluated. Sentence Gated (Goo and Chen, 2018) utilizes dialogue acts to generate summaries. PGNet (See et al., 2017) is a traditional approach that summarizes the meeting with a pointer network. HMNet (Zhu et al., 2020) adopts cross-domain pre-training before summarizing with a hierarchical attention mechanism. HAT (Rohde et al., 2021) uses a hierarchical attention transformer-based architecture. DDAMS (Feng et al., 2021a) incorporates discourse information to learn the diverse relationships among utterances. SummˆN (Zhang et al., 2022) performs multi-stage split-then-summarize for lengthy input. Feng et al. (2021b) employ DialoGPT as an annotator to label keywords and topics in meeting transcripts. Besides, DialogLM (Zhong et al., 2022), pre-trained on large-scale dialogue-related corpora and tasks, is also compared to show our efficiency. All approaches are evaluated with ROUGE (Lin, 2004), namely ROUGE-1, ROUGE-2, and ROUGE-L.
Implementation Details We use the released BART checkpoints from Huggingface's Transformers (Wolf et al., 2020) for RbS. Specifically, we initialize RbS with BART-large checkpoints, while the parameters of RbS-CNN are initialized from BART-large-CNN. During response reconstruction, we use eight sentences as contexts. The reconstructor is trained for 2,300 steps on the split AMI and 1,500 steps on the split ICSI, with a learning rate of 5e-5 and a total batch size of 256. Once the reconstructor is able to recover the responses, we perform one forward pass with teacher forcing to retrace the contribution of the contexts. During this process, 6.4% of the tokens are annotated as anchors. The total bucket number equals the maximum acceptable input of the backbone, which is 1024 for BART. The number of buckets for each sub-sequence depends on its length ratio to the total length. For the summarizer, we set the learning rate to 3e-5 with a total batch size of 64. It is worth noting that RbS is trained solely on AMI and ICSI without any external data or tools; we do not introduce any pre-training from other domains.

Analysis
In this section, we conduct further analysis to show the effectiveness of RbS. We investigate the correlation between anchor tokens and salient information in the meeting transcripts. Through extensive experiments, we demonstrate the validity of our approach in capturing anchor tokens and the significance of anchor tokens in conveying important information. Furthermore, we analyze the impact of different methods for aggregating the importance scores. We also justify our bucketing algorithm and analyze the computational complexity to establish the efficiency of the framework. Additionally, the potential for reusing the parameters of the reconstructor is explored in Appendix A.3.

Importance Scoring
In this section, we examine the impact of importance scoring on our framework. To demonstrate the criticality of the anchors selected by our reconstruction and retracing process, we conduct experiments in various settings: (1) we delete or substitute the selected anchors at different ratios and observe the resulting changes in performance; (2) we test the framework with different indicators of importance, including attention weights, gradients of the attention weights, random scoring, and token-wise loss similar to Feng et al. (2021b), which uses the r percent of words with the highest reconstruction loss as keywords; (3) we extract and visualize the heatmap of our approach to see whether the anchor words are precisely those we need; (4) we investigate the number of anchor tokens required to achieve acceptable performance for different scoring algorithms.
Anchor Deletion and Substitution For substitution, we take two measures.One is to replace the anchor tokens with other tokens randomly sampled from the meeting transcript, and the other is to replace anchors with high-frequency tokens.
For anchor token deletion, there are also two different strategies.One is to delete the anchor tokens randomly, and the other is to sort the anchors in descending order of importance score and divide them into four fractions.Then, we remove one at a time to observe the performance change.
The results in Table 2 demonstrate that both anchor-token substitution and deletion hurt performance. Specifically, randomly substituting the anchor tokens with other tokens causes a plunge in the ROUGE-1 score (54.99 → 52.47). Although the score improves slightly when anchors are replaced with high-frequency tokens, the performance still falls far short of that with the anchor tokens intact. This indicates that the anchor tokens selected by our framework are informative and play irreplaceable roles. The phenomenon is even more evident for random removal of anchor tokens. Results for different percentages of removed anchor tokens also show that our framework produces strongly importance-correlated rankings.
Attention, Gradient, and Token-wise Loss We conduct an ablation study on different types of scoring indicators, namely the attention weights, the gradients of the attention, and the token-wise loss. Attention weights and their corresponding gradients are extracted from the last transformer layer of the model. As for the token-wise loss, unlike our framework, which treats the response as a query to rate the importance of the context, this approach scores the response directly according to the generation loss:

ℓ = − Σ_{j=1}^{V} 1[t = j] log p(t̂ = j),

where t̂ is the generated token, t is the ground truth, and V is the vocabulary size. Similar to our setting, all methods extract 6.4% of tokens as anchors.
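A minimal sketch of this token-wise scoring baseline (the probability table, token ids, and helper names are toy values for illustration; a real implementation would take the softmax outputs of the reconstructor under teacher forcing):

```python
import numpy as np

def tokenwise_loss(probs, targets):
    """Cross-entropy of each ground-truth token: -log p(t_i)."""
    return -np.log(probs[np.arange(len(targets)), targets])

def loss_keywords(probs, targets, r):
    """Mark the r highest-loss tokens as keywords, in the style of
    Feng et al. (2021b)."""
    losses = tokenwise_loss(probs, targets)
    return set(np.argsort(-losses)[:r].tolist())

# Toy distributions over a 3-word vocabulary for 3 positions.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])
targets = np.array([0, 1, 0])        # ground-truth token ids
keywords = loss_keywords(probs, targets, r=1)
```

The hardest-to-predict token (lowest probability on its ground truth) gets the highest loss and is labeled a keyword, which is exactly why this criterion can mistake merely unpredictable content for important content.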
As shown in Table 3, scaled attention achieves the best performance on AMI. The performance of the gradient is comparable to that of scaled attention, while there are sharp decreases when switching from scaled attention to raw attention weights or token-wise loss. These results demonstrate that scaled attention weights are more importance-correlated than the others. This finding is consistent with Chrysostomou and Aletras (2022) and Serrano and Smith (2019).

Visualization To validate the importance-scoring approaches, we visualize the ratings of context in the meeting transcripts. Figure 4 displays the heatmap for each scoring method using the response "if that's possible, we might consider getting into it," where "it" refers to voice recognition and cutting-edge technologies. This excerpt is from a meeting discussing the need to include voice recognition in remote controls. The middle of Figure 4 shows that most recovered tokens assign high attention scores to punctuation, indicating that attention weights do not accurately reflect context importance. The bottom part of the figure shows that while gradients can select essential content, they also assign high weights to irrelevant content, making them an unsatisfactory indicator of importance. The top of Figure 4 shows that scaled attention weights accurately detect important content, assigning high scores to tokens such as "pay more for voice recognition" and "cutting-edge technology in remote control," while giving low scores to most other content, especially punctuation. This visualization provides an intuitive picture of our framework's ability to capture key points, further explaining its superior performance.

Number of anchors
We conduct ablation studies on how the number of anchors influences the framework's performance, as shown in Figure 5. We gradually increase the number of anchors and observe the change in ROUGE score. Surprisingly, we find that the total number of anchors does not need to be very high; in fact, increasing the number of anchor tokens degrades performance. We attribute this phenomenon to the fact that the total number of buckets for the BART model is limited to 1024: the more anchor tokens there are, the fewer buckets the other tokens can share, leading to over-compressed context and performance degradation.
We also observe that our method achieves strong performance with fewer top-ranked tokens, while the other two methods require more anchor tokens to achieve acceptable performance. This indicates that our approach effectively captures salient information.

Scores Aggregation
We conduct an ablation study on the two proposed score-aggregation methods: averaging and multi-view voting. Results in Table 4 show that multi-view voting outperforms averaging. We attribute this to the fact that averaging disrupts the multi-perspective rating mechanism. This result is consistent with our motivation that multi-view voting brings multiple horizons to the choice of anchor tokens. We therefore conclude that multi-view voting is necessary and beneficial for filtering anchor tokens.

Bucketing and Truncation
Our bucketing strategy can be viewed as a "soft truncation" that pools contents dynamically instead of truncating the sequence brutally. To justify this compression process, we compare bucketing with truncation. For sequence truncation, we truncate the sequence from the left/right/middle or at a random position to fit the input into the summarization model. We also test anchor-based hard truncation, which keeps only the top-30% anchors as input. Table 5 shows significant performance degradation with hard truncation, suggesting that it is more sensible to compress sequences dynamically according to importance than to truncate them brutally. Nevertheless, cutting sequences based on anchors still outperforms direct left/right/middle truncation. These results further demonstrate that anchors are informative tokens.

Computational Complexity
The reconstructor divides meetings of length n into r context-response pairs, where the average length of each context is c. The values of n, r, and c are in the ranges 5k-20k, 2-60, and 100-300, respectively. The time complexity of the reconstruction process is approximately O(r × c² × d_model). For the summarizer, the introduced RPB greatly compresses the input length from n to l (1024 by default), without altering the model structure beyond its original form; the time complexity is O(l² × d_model). Therefore, given the lengthy meeting texts, despite the additional reconstructor, the combined complexity O(r × c² × d_model) + O(l² × d_model) is much lower than that of a regular summarization model, whose complexity is O(n² × d_model). Our approach thus handles lengthy meeting texts with lower time complexity, making it a promising solution for real-world applications.
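Plugging representative values from the ranges above into these expressions gives a back-of-the-envelope check (the shared d_model factor cancels out; the specific n, r, c chosen here are illustrative, not measurements from the paper):

```python
# Representative values: meeting length n, pair count r, context length c,
# RPB-compressed length l.
n, r, c, l = 13_000, 40, 200, 1024

reconstruction = r * c ** 2   # O(r * c^2): many short attention passes
summarization = l ** 2        # O(l^2): one pass over the compressed input
one_step = n ** 2             # O(n^2): attention over the full transcript

# Two cheap steps vs. one quadratic pass over the whole meeting.
speedup = one_step / (reconstruction + summarization)
```

For these values the two-step cost is roughly 60x smaller than the one-step cost, consistent with the claim that the extra reconstruction pass is more than paid for by RPB's compression.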

Related Work
Long-sequence processing techniques such as sliding-window attention (Beltagy et al., 2020), sparse Sinkhorn attention (Tay et al., 2020), and hierarchical learning (Zhu et al., 2020; Rohde et al., 2021) are well explored. These approaches specifically target lengthy input but do not capture salient information. Sentence compression (Shang et al., 2018a) and coarse-to-fine generation (Zhang et al., 2022) have been developed to tailor the input length; however, error propagation across intermediate steps severely limits their performance.
Meanwhile, language models have gradually been equipped with dialogue acts (Goo and Chen, 2018), discourse relationships (Feng et al., 2021a), coreference resolution (Liu et al., 2021), and topic segmentation (Liu et al., 2019; Li et al., 2019; Feng et al., 2021b). Despite the modest advances, these methods require external annotating tools or expert-grade annotators. Feng et al. (2021b) employ DialoGPT (Zhang et al., 2020b) as an annotator to capture keywords and topics. Despite the performance, adopting the token-wise loss to label keywords deserves deeper consideration.

Conclusion
In this paper, we proposed RbS, a meeting summarization framework that accurately captures salient contents from noisy and lengthy transcripts. RbS uses a two-step process to evaluate content importance and dynamically compress the text to generate summaries. We introduced RPB, an anchor-based dynamic compression algorithm that condenses the original text, making RbS faster and more memory-efficient than one-step approaches. Without resorting to expert-grade annotation tools or large-scale dialogue-related pre-training tasks, RbS outperforms various previous approaches on the AMI and ICSI datasets and reaches state-of-the-art performance.

Limitations
Our exploration of the summarization algorithm focuses on traditional summarization models. However, in the era of large language models (LLMs), effective compression of LLM inputs is worth exploring. Our future work will investigate how to effectively compress the input of LLMs to make them more efficient.

Ethical Considerations
We use publicly released datasets to train/dev/test our models. Generally, the creators of these datasets have considered ethical issues. For the datasets used in this work, we manually checked some samples and did not find any obvious ethical concerns, such as violent or offensive content. Source code and models will be released with instructions to support correct use.

Gold
The meeting opens with the group doing introductions by giving their name and role, betty is the project manager, francina is the user interface specialist eileen is the marketing expert and jeanne is the industrial designer. The project manager tells them they will be designing a new remote control that should be original trendy and userfriendly. They will be concerned with functional conceptional and detailed design. To try out the whiteboard, each group member draws their favorite animal on the board. They discuss the project budget and then talk about their experiences with remote controls. They seemed to agree that the remote should be compact and have a multi-purpose functions. They also agree that it should do something different that current controls cannot do, and that it should be made of different colors, materials, and shapes. They also discuss a way of helping people find the remote when it is lost, a signal whether it is a beep or light. Then they close the meeting with the project manager going over the tasks they are to complete and telling them they will meet again in about thirty minutes. Selling price will be twenty five euro. Company aims to profit fifty million euro. It should be compact, multi-functional, different in shape, color, material. Have a locator to help find the remote when it is lost. The industrial designer will work on the working design and technical function. The interface specialist will do the working design and functional design. The marketing manager will look for user requirement specifications, such as friendliness. The group is not sure if they will have the budget to make the gadget multi-functional, but they would like to make one that would control basically all household machines.

RbS

The project manager introduced the upcoming project to the team members and introduced the name and role of each participant in the project. The team then began a training exercise in which they learned how to use the white board, and practiced drawing on the whiteboard. The project manager also introduced the project budget and the projected profit aim of the project which was fifty million euros. The team then discussed their experiences with remote controls and what features they would like to see in the remote they will be producing. They discussed the features they would like to include in the remote control design, such as color options and different shapes. They also discussed the possibility of adding a locator function to help locate the remote when it is lost. They then discussed what features the remote should have and what price point it should be. The industrial designer will work on the working design. The user interface designer, and the marketing expert will work together on the technical design. Whether to have a light on the remote to help find the remote if it is misplaced.

SummˆN

The project manager introduced the project to the team members and went over the agenda. The team members discussed the project budget and discussed the features of the remote. The remote will control televisions, computers, and other household appliances. The group decided that the remote should be small, compact, and have a fancy look and feel. The industrial designer and user interface specialist will work on the technical and functional design. The marketing expert will work with the marketing expert to figure out how to sell the product. The project manager closes the meeting and the project manager gives each team member their individual assignments. They will get instructions to work with and if they have any questions, they can ask them. It was decided that it would be a good idea to include a throw signal to help locate the remote when it is lost. It would be possible to make the remote more fashionable by using different colors and materials and using different shapes. The device will be for televisions only, and will not be for teletext. It should be a multi-functional gadget that controls all household machines. It could be used for voice recognition as well as voice recognition. It will be made of plastic and rubber and will be shaped like a kidney. There will be no LCD screen, and the remote will have buttons for power, volume, mute, channel-changing, channel up/down, channel down, and mute. They were not sure how much the remote would cost to produce. They did not know what the profit aim was for the project. They decided to use a white board to draw their favorite animals on the white board. They also decided to include an indicator on the remote so that it will light up when a button is pressed. They discussed how to incorporate the company colors and logo into the design.

Figure 1 :
Figure 1: By computing the scaled attention, the reconstructor identifies the contribution of each token in the contexts toward recovering the response. Tokens that make significant contributions are marked as anchors (labeled grey in the figure). Following this, the summarizer embeds the annotated texts. RPB then compresses the embedding from n × d_model to c × d_model based on the anchors, where n is the length of the input (normally 5k-20k), c is a constant (1024 by default), and n ≫ c.

Figure 2 :
Figure 2: For context-response generation, RbS utilizes the scaled-attention to retrace how much each token in context contributes to the recovery of the response.

Figure 3 :
Figure 3: An illustration of compressing a d_model × n sequence embedding with two annotated anchors to d_model × 10 using RPB, where b is the bucket number. The two orange blocks are the anchors.
Figure 3 demonstrates an example in which RPB compresses d_model × n embeddings containing two anchors into d_model × 10 embeddings. Such a process forms a dynamic compression based on the importance of the contexts.

Figure 4 :
Figure 4: Visualization of the heatmap.From top to bottom are heatmaps of scaled attention, gradient, and attention weights, respectively.

Figure 5 :
Figure 5: Trend of ROUGE score of different methods with increasing anchor ratio

Table 1 :
The performance on AMI and ICSI.l is the maximum number of input tokens for the corresponding model.
* denotes the metrics are calculated without sentence split.RbS takes the BART-large as the backbone, while the backbone of RbS-CNN is BART-large-CNN.

Table 1
Meanwhile, our RbS-CNN outperforms the previous state-of-the-art approach, SummˆN, by approximately 1.5 ROUGE-1 on AMI and 4 ROUGE-1 on ICSI, without requiring large-scale dialogue-specific pre-training. Even when compared to DialogLM, which is pre-

Table 2 :
Ablation studies on the substitution and deletion of anchor tokens

Table 4 :
Ablation study on channel aggregation

Table 5 :
Ablation study on bucketing and truncation

Table 6 :
Case study of RbS