In this paper, we present our approach to solve the LongSumm 2020 Shared Task, at the 1st Workshop on Scholarly Document Processing. The objective of the long summaries task is to generate long summaries that cover salient information in scientific articles. The task is to generate abstractive and extractive summaries of a given scientific article. In the proposed approach, we are inspired by the concept of Argumentative Zoning (AZ) that de- fines the main rhetorical structure in scientific articles. We define two aspects that should be covered in scientific paper summary, namely Claim/Method and Conclusion/Result aspects. We use Solr index to expand the sentences of the paper abstract. We formulate each abstract sentence in a given publication as query to retrieve similar sentences from the text body of the document itself. We utilize a sentence selection algorithm described in previous literature to select sentences for the final summary that covers the two aforementioned aspects.
Volatility prediction—an essential concept in financial markets—has recently been addressed using sentiment analysis methods. We investigate the sentiment of annual disclosures of companies in stock markets to forecast volatility. We specifically explore the use of recent Information Retrieval (IR) term weighting models that are effectively extended by related terms using word embeddings. In parallel to textual information, factual market data have been widely used as the mainstream approach to forecast market risk. We therefore study different fusion methods to combine text and market data resources. Our word embedding-based approach significantly outperforms state-of-the-art methods. In addition, we investigate the characteristics of the reports of the companies in different financial sectors.
In this paper, we address the shortage of evaluation benchmarks on Persian (Farsi) language by creating and making available a new benchmark for English to Persian Cross Lingual Word Sense Disambiguation (CL-WSD). In creating the benchmark, we follow the format of the SemEval 2013 CL-WSD task, such that the introduced tools of the task can also be applied on the benchmark. In fact, the new benchmark extends the SemEval-2013 CL-WSD task to Persian language.