Dandan Huang


2024

Which Sense Dominates Multisensory Semantic Understanding? A Brain Decoding Study
Dandan Huang | Lu Cao | Zhenting Li | Yue Zhang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Decoding semantic meanings from brain activity has attracted increasing attention. Neurolinguists have found that semantic perception is open to multisensory stimulation, as word meanings can be delivered by both auditory and visual inputs. Prior work that decodes semantic meanings from neuroimaging data largely exploits brain activation patterns triggered by cross-modal stimulation (i.e., text-audio pairs, text-picture pairs). Its goal is to develop more sophisticated computational models to probe what information from the act of language understanding is represented in the human brain. However, the influence of how the brain receives such information on decoding performance is underestimated. This study dissociates the multisensory integration of word understanding into written text, spoken text, and image perception, exploring the decoding efficiency and reliability of unisensory information in the brain representation. The findings suggest that, among unisensory conditions, decoding is most successful when semantics is represented in pictures, but the effect disappears for congeneric words, which share a related meaning. These results reveal the modality dependence and multisensory enhancement in the brain decoding methodology.

2021

A Comparison between Pre-training and Large-scale Back-translation for Neural Machine Translation
Dandan Huang | Kun Wang | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Investigating Rich Feature Sources for Conceptual Representation Encoding
Lu Cao | Yulong Chen | Dandan Huang | Yue Zhang
Proceedings of the Workshop on the Cognitive Aspects of the Lexicon

Functional Magnetic Resonance Imaging (fMRI) provides a means to investigate human conceptual representation in cognitive and neuroscience studies, where researchers predict fMRI activations from stimulus inputs. Previous work mainly uses a single source of features, particularly linguistic features, to predict fMRI activations. However, relatively little work has investigated rich-source features for conceptual representation. In this paper, we systematically compare linguistic, visual, and auditory input features in conceptual representation, and further introduce associative conceptual features, obtained from the Small World of Words game, to predict fMRI activations. Our experimental results show that these rich-source features can enhance performance in predicting fMRI activations. Our analysis indicates that information from rich sources is present in the conceptual representation of human brains. In particular, visual features carry the most weight in conceptual representation, which is consistent with recent cognitive science research.

What Have We Achieved on Text Summarization?
Dandan Huang | Leyang Cui | Sen Yang | Guangsheng Bao | Kun Wang | Jun Xie | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep learning has led to significant improvement in text summarization, with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and those written by human professionals. To better understand the strengths and limits of summarization systems at a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and manually quantify 8 major sources of errors on 10 representative summarization models. Primarily, we find that 1) under similar settings, extractive summarizers are in general better than their abstractive counterparts thanks to their strength in faithfulness and factual consistency; 2) milestone techniques such as copy, coverage, and hybrid extractive/abstractive methods do bring specific improvements but also demonstrate limitations; 3) pre-training techniques, and in particular sequence-to-sequence pre-training, are highly effective for improving text summarization, with BART giving the best results.