Long-Huei Chen


Multilingualization of Medical Terminology: Semantic and Structural Embedding Approaches
Long-Huei Chen | Kyo Kageura
Proceedings of the 12th Language Resources and Evaluation Conference

The multilingualization of terminology is an essential step in the translation pipeline, ensuring the correct transfer of domain-specific concepts. Many institutions and language service providers construct and maintain multilingual terminologies, which constitute important assets. However, the curation of such multilingual resources requires significant human effort. Although automatic multilingual term extraction methods have been proposed, they have had limited success, because term translation cannot be satisfied by simply conveying meaning; it requires the knowledge of terminologists and domain experts to fit the term within the existing terminology. Here we propose a method to encode the structural properties of terms by aligning their embeddings using graph convolutional networks trained on separate languages. We observe that the structural information can augment the semantic methods also explored in this work, and recognize that the unique nature of terminologies allows our method to take full advantage of this information and produce superior results.


Translating Terminologies: A Comparative Examination of NMT and PBSMT Systems
Long-Huei Chen | Kyo Kageura
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

Entropic characterisation of termino-conceptual structure : A preliminary study
Kyo Kageura | Long-Huei Chen
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Terminologie et Intelligence Artificielle (atelier TALN-RECITAL & IC)

Terms represent concepts, which consist of conceptual characteristics. In actual concept-term formation, which is done by researchers, the process runs in reverse: conceptual elements/characteristics are consolidated to form concepts, which are represented by terms. As concepts do not exist on the fly, what we may call the termino-conceptual system provides scaffolding in this process. Terminologists, both in practice and in research, not only collect and list terms but also analyse, describe and define terms and systematise terminologies. To carry out these tasks, terminologists must refer to conceptual systems, to the extent that they contribute to systematising terminologies; terminologists thus also deal with the sphere of the termino-conceptual system. In this paper, we consolidate the status of the termino-conceptual sphere and propose a way to characterise the structure of the termino-conceptual system by using entropy. The entropic characterisation of English terminologies of six domains, i.e. agriculture, botany, chemistry, computer science, physics and psychology, is presented.