Creative Language Encoding under Censorship

Heng Ji, Kevin Knight


Abstract
People often create obfuscated language for online communication to avoid Internet censorship, share sensitive information, express strong sentiment or emotion, plan secret actions, trade illegal products, or simply hold interesting conversations. In this position paper we systematically categorize human-created obfuscated language at various levels, investigate its basic mechanisms, and give an overview of the automated techniques needed to simulate human encoding. Such encoders have the potential to frustrate and evade dynamic human or automated decoders, co-evolve with them, and produce interesting and adoptable code words. We also summarize remaining challenges for future research on the interaction between Natural Language Processing (NLP) and encryption, and on leveraging NLP techniques for encoding and decoding.
Anthology ID:
W18-4203
Volume:
Proceedings of the First Workshop on Natural Language Processing for Internet Freedom
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Chris Brew, Anna Feldman, Chris Leberknight
Venue:
NLP4IF
Publisher:
Association for Computational Linguistics
Pages:
23–33
URL:
https://aclanthology.org/W18-4203
Cite (ACL):
Heng Ji and Kevin Knight. 2018. Creative Language Encoding under Censorship. In Proceedings of the First Workshop on Natural Language Processing for Internet Freedom, pages 23–33, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Creative Language Encoding under Censorship (Ji & Knight, NLP4IF 2018)
PDF:
https://aclanthology.org/W18-4203.pdf