George Flint
2025
Quantifying Phonosemantic Iconicity Distributionally in 6 Languages
George Flint | Kaustubh Kislay
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Language is, as commonly theorized, largely arbitrary. Yet systematic relationships between phonetics and semantics have been observed in many specific cases. To what degree could those systematic relationships manifest themselves in large-scale quantitative investigations, both in previously identified and unidentified phenomena? This work undertakes a distributional approach to quantifying phonosemantic iconicity at scale across 6 diverse languages (English, Spanish, Hindi, Finnish, Turkish, and Tamil). In each language, we analyze the alignment of morphemes’ phonetic and semantic similarity spaces with a suite of statistical measures, and discover an array of interpretable phonosemantic alignments not previously identified in the literature, along with crosslinguistic patterns. We also analyze 5 previously hypothesized phonosemantic alignments, finding support for some such alignments and mixed results for others.
Automated Coding of Counsellor and Client Behaviours in Motivational Interviewing Transcripts: Validation and Application
Armaity Katki | Nathan Choi | Son Sophak Otra | George Flint | Kevin Zhu | Sunishchal Dev
NLP-AI4Health
Protein language models (PLMs) are powerful tools for protein engineering, but remain difficult to steer toward specific biochemical properties, where small sequence changes can affect stability or function. We adapt two prominent unsupervised editing methods: task arithmetic (TA; specifically, Forgetting via Negation) in weight space and feature editing with a sparse autoencoder (SAE) in activation space. We evaluate their effects on six biochemical properties of generations from three PLMs (ESM3, ProGen2-Large, and ProLLaMA). Across models, we observe complementary efficacies: TA more effectively controls some properties while SAE more effectively controls others. Property response patterns show some consistency across models. We suggest that the response pattern of biochemical properties should be considered when steering PLMs.