Decoding semantic meanings from brain activity has attracted increasing attention. Neurolinguists have found that semantic perception is open to multisensory stimulation, as word meanings can be delivered by both auditory and visual inputs. Prior work which decodes semantic meanings from neuroimaging data largely exploits brain activation patterns triggered by stimulation in cross-modality (i.e. text-audio pairs, text-picture pairs). Their goal is to develop a more sophisticated computational model to probing what information from the act of language understanding is represented in human brain. While how the brain receiving such information influences decoding performance is underestimated. This study dissociates multisensory integration of word understanding into written text, spoken text and image perception respectively, exploring the decoding efficiency and reliability of unisensory information in the brain representation. The findings suggest that, in terms of unisensory, decoding is most successful when semantics is represented in pictures, but the effect disappears in the case of congeneric words which share a related meaning. These results reveal the modality dependence and multisensory enhancement in the brain decoding methodology.
Functional Magnetic Resonance Imaging (fMRI) provides a means to investigate human conceptual representation in cognitive and neuroscience studies, where researchers predict the fMRI activations with elicited stimuli inputs. Previous work mainly uses a single source of features, particularly linguistic features, to predict fMRI activations. However, relatively little work has been done on investigating rich-source features for conceptual representation. In this paper, we systematically compare the linguistic, visual as well as auditory input features in conceptual representation, and further introduce associative conceptual features, which are obtained from Small World of Words game, to predict fMRI activations. Our experimental results show that those rich-source features can enhance performance in predicting the fMRI activations. Our analysis indicates that information from rich sources is present in the conceptual representation of human brains. In particular, the visual feature weights the most on conceptual representation, which is consistent with the recent cognitive science study.
Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 representative summarization models manually. Primarily, we find that 1) under similar settings, extractive summarizers are in general better than their abstractive counterparts thanks to strength in faithfulness and factual-consistency; 2) milestone techniques such as copy, coverage and hybrid extractive/abstractive methods do bring specific improvements but also demonstrate limitations; 3) pre-training techniques, and in particular sequence-to-sequence pre-training, are highly effective for improving text summarization, with BART giving the best results.