Changran Hu
2024
SambaLingo: Teaching Large Language Models New Languages
Zoltan Csaki | Bo Li | Jonathan Lingjie Li | Qiantong Xu | Pian Pawakapan | Leon Zhang | Yun Du | Hengyu Zhao | Changran Hu | Urmish Thakker
Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
Despite the widespread availability of LLMs, there remains a substantial gap in their capabilities and availability across diverse languages. One approach to addressing these issues has been to take an existing pre-trained LLM and continue to train it on new languages. While prior work has experimented with language adaptation, many questions around best practices and methodology remain open. In this paper, we present a comprehensive investigation into the adaptation of LLMs to new languages. Our study covers the key components in this process, including vocabulary extension, direct preference optimization, and the data scarcity problem for human alignment in low-resource languages. We scale these experiments across 9 languages and 2 parameter scales (7B and 70B). We compare our models against Llama 2, Aya-101, XGLM, BLOOM and existing language experts, outperforming all prior published baselines. Additionally, all evaluation code and checkpoints are made public to facilitate future research.
2021
Jointly Extracting Explicit and Implicit Relational Triples with Reasoning Pattern Enhanced Binary Pointer Network
Yubo Chen | Yunqi Zhang | Changran Hu | Yongfeng Huang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Relational triple extraction is a crucial task for knowledge graph construction. Existing methods mainly focus on explicit relational triples that are directly expressed, but usually ignore implicit triples that lack explicit expressions, leading to serious incompleteness in the constructed knowledge graphs. Fortunately, other triples in the sentence provide supplementary information for discovering entity pairs that may have implicit relations, and the relation types between implicitly connected entity pairs can be identified with real-world relational reasoning patterns. In this paper, we propose a unified framework to jointly extract explicit and implicit relational triples. To explore entity pairs that may be implicitly connected by relations, we propose a binary pointer network that extracts overlapping relational triples relevant to each word sequentially and retains information about previously extracted triples in an external memory. To infer the relation types of implicit relational triples, we introduce real-world relational reasoning patterns into our model and capture these patterns with a relation network. Experiments on several benchmark datasets demonstrate the effectiveness of our method.