Learning about Word Vector Representations and Deep Learning through Implementing Word2vec

David Jurgens


Abstract
Word vector representations are an essential part of an NLP curriculum. Here, we describe a homework assignment in which students implement word2vec, a popular method for learning word vectors. Students build the core parts of the method, including text preprocessing, negative sampling, and gradient descent. Starter code provides guidance and handles basic operations, allowing students to focus on the conceptually challenging aspects. After generating their vectors, students evaluate them using qualitative and quantitative tests.
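
The assignment centers on skip-gram word2vec trained with negative sampling and gradient descent. As a rough illustration of how those pieces fit together (a hypothetical toy sketch in Python/NumPy, not the paper's starter code; the corpus, hyperparameters, and uniform negative sampler below are assumptions for brevity, whereas real word2vec samples negatives from a smoothed unigram distribution):

import numpy as np

rng = np.random.default_rng(0)

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
word2id = {w: i for i, w in enumerate(vocab)}
ids = [word2id[w] for w in corpus]

V, D = len(vocab), 16               # vocabulary size, embedding dimension
W_in = rng.normal(0, 0.1, (V, D))   # "input" (target word) vectors
W_out = rng.normal(0, 0.1, (V, D))  # "output" (context word) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, window, k = 0.05, 2, 5  # learning rate, context window, negatives per pair

for epoch in range(200):
    for pos, target in enumerate(ids):
        lo, hi = max(0, pos - window), min(len(ids), pos + window + 1)
        for ctx_pos in range(lo, hi):
            if ctx_pos == pos:
                continue
            context = ids[ctx_pos]
            # One positive (label 1) pair plus k uniformly drawn negatives (label 0).
            samples = [(context, 1.0)] + [(int(rng.integers(V)), 0.0) for _ in range(k)]
            v = W_in[target]
            grad_v = np.zeros(D)
            for c, label in samples:
                u = W_out[c]
                g = sigmoid(v @ u) - label  # gradient of the logistic loss w.r.t. the score
                grad_v += g * u
                W_out[c] -= lr * g * v
            W_in[target] -= lr * grad_v

# Qualitative check: nearest neighbors by cosine similarity.
def most_similar(word, topn=3):
    v = W_in[word2id[word]]
    sims = W_in @ v / (np.linalg.norm(W_in, axis=1) * np.linalg.norm(v) + 1e-9)
    return [vocab[i] for i in np.argsort(-sims)[1 : topn + 1]]

print(most_similar("quick"))

The two separate matrices (target vs. context vectors) mirror standard word2vec; keeping the gradient update explicit, rather than delegating to an autodiff library, is what exposes the negative-sampling loss to students.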
Anthology ID: 2021.teachingnlp-1.19
Volume: Proceedings of the Fifth Workshop on Teaching NLP
Month: June
Year: 2021
Address: Online
Editors: David Jurgens, Varada Kolhatkar, Lucy Li, Margot Mieskes, Ted Pedersen
Venue: TeachingNLP
Publisher: Association for Computational Linguistics
Pages: 108–111
URL: https://aclanthology.org/2021.teachingnlp-1.19
DOI: 10.18653/v1/2021.teachingnlp-1.19
Cite (ACL): David Jurgens. 2021. Learning about Word Vector Representations and Deep Learning through Implementing Word2vec. In Proceedings of the Fifth Workshop on Teaching NLP, pages 108–111, Online. Association for Computational Linguistics.
Cite (Informal): Learning about Word Vector Representations and Deep Learning through Implementing Word2vec (Jurgens, TeachingNLP 2021)
PDF: https://aclanthology.org/2021.teachingnlp-1.19.pdf