Stanislaw Jastrzębski

Also published as: Stanislaw Jastrzebski


Can Wikipedia Categories Improve Masked Language Model Pretraining?
Diksha Meghwal | Katharina Kann | Iacer Calixto | Stanislaw Jastrzebski
Proceedings of the Fourth Widening Natural Language Processing Workshop

Pretrained language models have obtained impressive results for a large set of natural language understanding tasks. However, training these models is computationally expensive and requires huge amounts of data. Thus, it would be desirable to automatically detect groups of more or less important examples. Here, we investigate whether we can leverage a commonly overlooked source of information, Wikipedia categories as listed in DBpedia, to identify useful or harmful data points during pretraining. We define an experimental setup in which we analyze correlations between language model perplexity on specific clusters and downstream NLP task performance during pretraining. Our experiments show that Wikipedia categories are not a good indicator of the importance of specific sentences for pretraining.


Commonsense mining as knowledge base completion? A study on the impact of novelty
Stanislaw Jastrzębski | Dzmitry Bahdanau | Seyedarian Hosseini | Michael Noukhovitch | Yoshua Bengio | Jackie Cheung
Proceedings of the Workshop on Generalization in the Age of Deep Learning

Commonsense knowledge bases such as ConceptNet represent knowledge in the form of relational triples. Inspired by recent work by Li et al., we analyse whether knowledge base completion models can be used to mine commonsense knowledge from raw text. We propose novelty of predicted triples with respect to the training set as an important factor in interpreting results. We critically analyse the difficulty of mining novel commonsense knowledge, and show that a simple baseline method outperforms the previous state of the art on predicting more novel triples.