<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="4500">
    <title>Proceedings of the Workshop on New Frontiers in Summarization</title>
    <editor>Lu Wang</editor>
    <editor>Jackie Chi Kit Cheung</editor>
    <editor>Giuseppe Carenini</editor>
    <editor>Fei Liu</editor>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W17-45</url>
    <bibtype>book</bibtype>
    <bibkey>FrontiersSummarization:2017</bibkey>
  </paper>

  <paper id="4501">
    <title>Video Highlights Detection and Summarization with Lag-Calibration based on Concept-Emotion Mapping of Crowdsourced Time-Sync Comments</title>
    <author><first>Qing</first><last>Ping</last></author>
    <author><first>Chaomei</first><last>Chen</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;11</pages>
    <url>http://www.aclweb.org/anthology/W17-4501</url>
    <abstract>With the prevalence of video sharing, there are increasing demands for
	automatic video digestion such as highlight detection. Recently, platforms with
	crowdsourced time-sync video comments have emerged worldwide, providing a good
	opportunity for highlight detection. However, this task is non-trivial: (1)
	time-sync comments often lag behind their corresponding shot; (2) time-sync
	comments are semantically sparse and noisy; (3) determining which shots are
	highlights is highly subjective. The present paper aims to tackle these
	challenges by proposing a framework that (1) uses concept-mapped lexical chains
	for lag-calibration; (2) models video highlights based on comment intensity and
	the combination of emotion and concept concentration of each shot; and (3)
	summarizes each detected highlight using improved SumBasic with emotion and concept
	mapping. Experiments on large real-world datasets show that our highlight
	detection method and summarization method both outperform other benchmarks with
	considerable margins.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>ping-chen:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4502">
    <title>Multimedia Summary Generation from Online Conversations: Current Approaches and Future Directions</title>
    <author><first>Enamul</first><last>Hoque</last></author>
    <author><first>Giuseppe</first><last>Carenini</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>12&#8211;19</pages>
    <url>http://www.aclweb.org/anthology/W17-4502</url>
    <abstract>With the proliferation of Web-based social media, asynchronous conversations
	have become very common for supporting online communication and collaboration.
	Yet the increasing volume and complexity of conversational data often make
	it very difficult to get insights into the discussions. We consider combining
	textual summaries with visual representations of conversational data as a
	promising way of supporting the user in exploring conversations. In this paper,
	we report our current work on developing visual interfaces that
	present multimedia summaries combining text and visualization for online
	conversations, and how our solutions have been tailored to a variety of domain
	problems. We then discuss the key challenges and opportunities for future work
	in this research space.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hoque-carenini:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4503">
    <title>Low-Resource Neural Headline Generation</title>
    <author><first>Ottokar</first><last>Tilk</last></author>
    <author><first>Tanel</first><last>Alum&#228;e</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>20&#8211;26</pages>
    <url>http://www.aclweb.org/anthology/W17-4503</url>
    <abstract>Recent neural headline generation models
	have shown great results, but are generally
	trained on very large datasets. We focus
	our efforts on improving headline quality
	on smaller datasets by means of pre-training.
	We propose new methods that
	enable pre-training all the parameters of
	the model and utilize all available text, resulting
	in improvements of up to 32.4%
	relative in perplexity and 2.84 points in
	ROUGE.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>tilk-alumae:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4504">
    <title>Towards Improving Abstractive Summarization via Entailment Generation</title>
    <author><first>Ramakanth</first><last>Pasunuru</last></author>
    <author><first>Han</first><last>Guo</last></author>
    <author><first>Mohit</first><last>Bansal</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>27&#8211;32</pages>
    <url>http://www.aclweb.org/anthology/W17-4504</url>
    <abstract>Abstractive summarization, the task of rewriting and compressing a document
	into a short summary, has achieved considerable success with neural
	sequence-to-sequence models. However, these models can still benefit from
	stronger natural language inference skills, since a correct summary is
	logically entailed by the input document, i.e., it should not contain any
	contradictory or unrelated information. We incorporate such knowledge into an
	abstractive summarization model via multi-task learning, where we share its
	decoder parameters with those of an entailment generation model. We achieve
	promising initial improvements based on multiple metrics and datasets
	(including a test-only setting). The domain mismatch between the entailment
	(captions) and summarization (news) datasets suggests that the model is
	learning some domain-agnostic inference skills.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>pasunuru-guo-bansal:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4505">
    <title>Coarse-to-Fine Attention Models for Document Summarization</title>
    <author><first>Jeffrey</first><last>Ling</last></author>
    <author><first>Alexander</first><last>Rush</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>33&#8211;42</pages>
    <url>http://www.aclweb.org/anthology/W17-4505</url>
    <abstract>Sequence-to-sequence models with attention have been successful for a variety
	of NLP problems, but their speed does not scale well for tasks with long source
	sequences such as document summarization.
	We propose a novel coarse-to-fine attention model that hierarchically reads a
	document, using coarse attention to select top-level chunks of text and fine
	attention to read the words of the chosen chunks. While the computation for
	training standard attention models scales linearly with source sequence length,
	our method scales with the number of top-level chunks and can handle much
	longer sequences.
	Empirically, we find that while coarse-to-fine attention models lag behind
	state-of-the-art baselines, our method achieves the desired behavior of
	sparsely attending to subsets of the document for generation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>ling-rush:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4506">
    <title>Automatic Community Creation for Abstractive Spoken Conversations Summarization</title>
    <author><first>Karan</first><last>Singla</last></author>
    <author><first>Evgeny</first><last>Stepanov</last></author>
    <author><first>Ali Orkan</first><last>Bayer</last></author>
    <author><first>Giuseppe</first><last>Carenini</last></author>
    <author><first>Giuseppe</first><last>Riccardi</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>43&#8211;47</pages>
    <url>http://www.aclweb.org/anthology/W17-4506</url>
    <abstract>Summarization of spoken conversations is a challenging task, since it requires
	deep understanding of dialogs. Abstractive summarization techniques rely on
	linking the summary sentences to sets of original conversation sentences, i.e.
	communities. Unfortunately, such linking information is rarely available or
	requires trained annotators. We propose and experiment with automatic community
	creation using cosine similarity on different levels of representation: raw
	text, WordNet SynSet IDs, and word embeddings. We show that the abstractive
	summarization systems with automatic communities significantly outperform
	previously published results on both English and Italian corpora.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>singla-EtAl:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4507">
    <title>Combining Graph Degeneracy and Submodularity for Unsupervised Extractive Summarization</title>
    <author><first>Antoine</first><last>Tixier</last></author>
    <author><first>Polykarpos</first><last>Meladianos</last></author>
    <author><first>Michalis</first><last>Vazirgiannis</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>48&#8211;58</pages>
    <url>http://www.aclweb.org/anthology/W17-4507</url>
    <abstract>We present a fully unsupervised, extractive text summarization system that
	leverages a submodularity framework introduced by past research. The framework
	allows summaries to be generated in a greedy way while preserving near-optimal
	performance guarantees. Our main contribution is the novel coverage reward term
	of the objective function optimized by the greedy algorithm. This component
	builds on the graph-of-words representation of text and the k-core
	decomposition algorithm to assign meaningful scores to words. We evaluate our
	approach on the AMI and ICSI meeting speech corpora, and on the DUC2001 news
	corpus. We reach state-of-the-art performance on all datasets. Results indicate
	that our method is particularly well-suited to the meeting domain.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>tixier-meladianos-vazirgiannis:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4508">
    <title>TL;DR: Mining Reddit to Learn Automatic Summarization</title>
    <author><first>Michael</first><last>V&#246;lske</last></author>
    <author><first>Martin</first><last>Potthast</last></author>
    <author><first>Shahbaz</first><last>Syed</last></author>
    <author><first>Benno</first><last>Stein</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>59&#8211;63</pages>
    <url>http://www.aclweb.org/anthology/W17-4508</url>
    <abstract>Recent advances in automatic text summarization have used deep neural networks
	to generate high-quality abstractive summaries, but the performance of these
	models strongly depends on large amounts of suitable training data. We propose
	a new method for mining social media for author-provided summaries, taking
	advantage of the common practice of appending a &#x201c;TL;DR&#x201d; to long posts. A case
	study using a large Reddit crawl yields the Webis-TLDR-17 dataset,
	complementing existing corpora primarily from the news genre. Our technique is
	likely applicable to other social media sites and general web crawls.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>volske-EtAl:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4509">
    <title>Topic Model Stability for Hierarchical Summarization</title>
    <author><first>John</first><last>Miller</last></author>
    <author><first>Kathleen</first><last>McCoy</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>64&#8211;73</pages>
    <url>http://www.aclweb.org/anthology/W17-4509</url>
    <attachment type="attachment">W17-4509.Attachment.zip</attachment>
    <abstract>We envisioned responsive generic hierarchical text summarization with summaries
	organized by section and paragraph based on hierarchical structure topic
	models. But we had to be sure that topic models were stable for the sampled
	corpora. To that end we developed a methodology for aligning multiple
	hierarchical structure topic models run over the same corpus under similar
	conditions, calculating a representative centroid model, and reporting
	stability of the centroid model. We ran stability experiments for standard
	corpora and a development corpus of Global Warming articles. We found flat and
	hierarchical structures of two levels plus the root offer stable centroid
	models, but hierarchical structures of three levels plus the root did not seem
	stable enough for use in hierarchical summarization.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>miller-mccoy:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4510">
    <title>Learning to Score System Summaries for Better Content Selection Evaluation</title>
    <author><first>Maxime</first><last>Peyrard</last></author>
    <author><first>Teresa</first><last>Botschen</last></author>
    <author><first>Iryna</first><last>Gurevych</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>74&#8211;84</pages>
    <url>http://www.aclweb.org/anthology/W17-4510</url>
    <abstract>The evaluation of summaries is a challenging but crucial task of the
	summarization field. In this work, we propose to learn an automatic scoring
	metric based on the human judgments available as part of classical
	summarization datasets such as TAC-2008 and TAC-2009. Any existing automatic
	scoring metric can be included as a feature; the model learns the combination
	exhibiting the best correlation with human judgments. The reliability of the
	new metric is tested in a further manual evaluation where we ask humans to
	evaluate summaries covering the whole scoring spectrum of the metric. We
	release the trained metric as an open-source tool.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>peyrard-botschen-gurevych:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4511">
    <title>Revisiting the Centroid-based Method: A Strong Baseline for Multi-Document Summarization</title>
    <author><first>Demian</first><last>Gholipour Ghalandari</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>85&#8211;90</pages>
    <url>http://www.aclweb.org/anthology/W17-4511</url>
    <abstract>The centroid-based model for extractive document summarization is a simple and
	fast baseline that ranks sentences based on their similarity to a centroid
	vector. In this paper, we apply this ranking to possible summaries instead of
	sentences and use a simple greedy algorithm to find the best summary.
	Furthermore, we show possibilities to scale up to larger input document
	collections by selecting a small number of sentences from each document prior
	to constructing the summary.
	Experiments were done on the DUC2004 dataset for multi-document summarization.
	We observe higher performance than the original model, on par with more
	complex state-of-the-art methods.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>gholipourghalandari:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4512">
    <title>Reader-Aware Multi-Document Summarization: An Enhanced Model and The First Dataset</title>
    <author><first>Piji</first><last>Li</last></author>
    <author><first>Lidong</first><last>Bing</last></author>
    <author><first>Wai</first><last>Lam</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>91&#8211;99</pages>
    <url>http://www.aclweb.org/anthology/W17-4512</url>
    <abstract>We investigate the problem of reader-aware multi-document summarization
	(RA-MDS) and introduce a new dataset for this problem. To tackle RA-MDS, we
	extend a variational auto-encoder (VAE) based MDS framework by jointly
	considering news documents and reader comments. To evaluate
	summarization performance, we prepare a new dataset. We describe the methods
	for data collection, aspect annotation, and summary writing as well as
	scrutinizing by experts. Experimental results show that reader comments can
	improve the summarization performance, which also demonstrates the usefulness
	of the proposed dataset.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>li-bing-lam:2017:FrontiersSummarization</bibkey>
  </paper>

  <paper id="4513">
    <title>A Pilot Study of Domain Adaptation Effect for Neural Abstractive Summarization</title>
    <author><first>Xinyu</first><last>Hua</last></author>
    <author><first>Lu</first><last>Wang</last></author>
    <booktitle>Proceedings of the Workshop on New Frontiers in Summarization</booktitle>
    <month>September</month>
    <year>2017</year>
    <address>Copenhagen, Denmark</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>100&#8211;106</pages>
    <url>http://www.aclweb.org/anthology/W17-4513</url>
    <abstract>We study the problem of domain adaptation for neural abstractive summarization.
	We make initial efforts in investigating what information can be transferred to
	a new domain. Experimental results on news stories and opinion articles
	indicate that a neural summarization model benefits from pre-training based on
	extractive summaries. We also find that a combined in-domain and
	out-of-domain setup yields better summaries when in-domain data is
	insufficient. Further analysis shows that the model is capable of selecting
	salient content even when trained on out-of-domain data, but requires in-domain
	data to capture the style of a target domain.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hua-wang:2017:FrontiersSummarization</bibkey>
  </paper>

</volume>