As companies aim to enhance and expand their product portfolios, Technology Opportunity Discovery (TOD) has attracted increasing interest. To better understand the role of emerging technologies in innovation, we introduce a novel technology-market corpus in English and Japanese and conduct a comprehensive empirical evaluation of the linkage between technology and the market. Our dataset comprises English patents extracted from the USPTO database and Japanese patents from the Japanese Patent Office (JPO), together with the associated products of each publicly listed company. We compare several static and contextualized word embedding methods for constructing a technology-market space and propose an effective methodology based on a fine-tuned BERT model for linking technology to the market.
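The abstract does not give implementation details, so the following is only a minimal sketch of the kind of linking pipeline it describes: embed patent texts and product descriptions with a BERT encoder and rank technology-market links by cosine similarity. The checkpoint name, the pooling choice, and the sample texts are placeholders, not the paper's fine-tuned model or data.

```python
# Minimal sketch: embed patent abstracts and product descriptions with a
# BERT encoder (a placeholder for the paper's fine-tuned model), then rank
# technology-market links by cosine similarity.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-multilingual-cased"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Masked mean-pooling of the last hidden states, one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state     # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)      # (B, T, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)     # ignore padding tokens
    return torch.nn.functional.normalize(pooled, dim=-1)

patents = ["A lithium-ion battery electrode with silicon nanowires ..."]
products = ["High-capacity EV battery pack", "Smartphone camera module"]

sim = embed(patents) @ embed(products).T              # cosine similarities
print(sim, sim.argmax(dim=1))  # link each patent to its most similar product
```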
As a novel way to enrich the viewing experience, time-sync comments on videos have attracted considerable interest, and many efforts have explored their effectiveness for various applications. However, due to the complexity of the interactions among users, videos, and comments, understanding users’ behavior on time-sync comments remains challenging. Along this line, we study the problem of time-sync comment behavior prediction, taking into account both historical behaviors and the multi-modal information of visual frames and textual comments. Specifically, we propose a novel Multi-modal short- and long-Range Temporal Convolutional Network model (MRT). First, we design two amplified Temporal Convolutional Networks with different receptive field sizes to capture both short- and long-range surrounding context for each frame and its time-sync comments. Then, we design a bottleneck fusion module to obtain a multi-modal enhanced representation. Furthermore, we take user preferences into account to generate a personalized multi-modal semantic representation at each timestamp. Finally, we optimize MRT with a binary cross-entropy loss over users’ historical records. Through extensive experiments against representative baselines, we demonstrate the effectiveness of MRT and qualitatively verify the necessity and utility of short- and long-range contextual and multi-modal information.
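As a rough illustration of the two-branch temporal design, bottleneck fusion, and BCE objective described above, here is a toy PyTorch sketch. All layer sizes, the user-embedding table, and the use of plain dilated 1-D convolutions in place of the paper's "amplified" TCNs are assumptions for illustration, not the authors' implementation.

```python
# Toy sketch: a short-range and a long-range dilated Conv1d branch per
# modality, fused through a bottleneck layer, personalized with a user
# embedding, and trained with binary cross-entropy. Sizes are illustrative.
import torch
import torch.nn as nn

class TwoRangeTCN(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        # Different receptive fields via dilation (stand-in for amplified TCNs).
        self.short = nn.Conv1d(dim, hidden, kernel_size=3, padding=1)
        self.long = nn.Conv1d(dim, hidden, kernel_size=3, padding=4, dilation=4)

    def forward(self, x):                        # x: (B, T, dim)
        x = x.transpose(1, 2)                    # Conv1d expects (B, dim, T)
        h = torch.relu(self.short(x)) + torch.relu(self.long(x))
        return h.transpose(1, 2)                 # (B, T, hidden)

class MRTSketch(nn.Module):
    def __init__(self, vis_dim=512, txt_dim=300, hidden=64, n_users=1000):
        super().__init__()
        self.vis = TwoRangeTCN(vis_dim, hidden)
        self.txt = TwoRangeTCN(txt_dim, hidden)
        # Bottleneck fusion: compress the concatenated modalities.
        self.fuse = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        self.user = nn.Embedding(n_users, hidden)  # toy user-preference table
        self.head = nn.Linear(hidden, 1)

    def forward(self, frames, comments, user_id):
        h = self.fuse(torch.cat([self.vis(frames), self.txt(comments)], -1))
        h = h + self.user(user_id).unsqueeze(1)  # personalize each timestamp
        return self.head(h).squeeze(-1)          # (B, T) behavior logits

model = MRTSketch()
frames, comments = torch.randn(2, 20, 512), torch.randn(2, 20, 300)
logits = model(frames, comments, torch.tensor([3, 7]))
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (2, 20)).float())
loss.backward()
```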
Large language models (LLMs) can perform complex reasoning via in-context learning (ICL) when provided with a few input-output demonstrations (demos), and become even more capable when intermediate reasoning steps (chain of thought, CoT) are given for the demos. Are multiple demos actually necessary for ICL? In this paper, we study ICL using fewer demos per test query on the tasks in (Wei et al., 2022). Surprisingly, we do not observe significant degradation when using only one randomly chosen demo. To study this phenomenon, for each test query we categorize demos into “positive demos”, which lead to the correct answer, and “negative demos”, which result in wrong answers. Our analysis reveals an inherent bias in these widely studied datasets and a redundancy of demos: most demos are positive for a majority of test queries, which explains the good performance of ICL with one random demo. Moreover, ICL (with and without CoT) using only one positive demo significantly outperforms the multi-demo ICL adopted by most previous works, indicating a weakness of LLMs in finding positive demos for input queries, an ability that is difficult to evaluate on these biased datasets. Furthermore, we observe a counterintuitive behavior of multi-demo ICL: its accuracy degrades (improves) when given more positive (negative) demos. This implies that ICL can easily be misguided by interference among demos and their spurious correlations. Our analyses highlight several fundamental challenges that need to be addressed in LLM training, ICL, and benchmark design.
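The one-random-demo setup and the positive/negative labeling can be made concrete with a small sketch. GPT-2 stands in here purely so the code runs end-to-end; the demo pool, the prompt format, and the answer matching are illustrative assumptions, not the paper's protocol.

```python
# Sketch of one-random-demo ICL and positive/negative demo labeling.
# GPT-2 is a stand-in for a large model; data and parsing are illustrative.
import random
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

demos = [
    {"q": "Tom has 3 apples and buys 2 more. How many apples now?", "a": "5"},
    {"q": "A pen costs $2. How much do 4 pens cost?", "a": "8"},
]
query = {"q": "Sara reads 10 pages a day. How many pages in 3 days?", "a": "30"}

def icl_answer(demo, query):
    """Prompt the model with exactly one demo, then the test question."""
    prompt = f"Q: {demo['q']}\nA: {demo['a']}\n\nQ: {query['q']}\nA:"
    out = generator(prompt, max_new_tokens=10, do_sample=False)[0]["generated_text"]
    return out[len(prompt):].strip().split("\n")[0]  # first line of completion

# One-demo ICL: a single randomly chosen demo per test query.
demo = random.choice(demos)
pred = icl_answer(demo, query)

# A demo is "positive" for this query if it yields the correct answer.
label = "positive" if query["a"] in pred else "negative"
print(pred, label)
```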