Japanese Language Analysis for Syntactic Tree Mining to Extract Characteristic Contents
Yohsuke Sakao | Takahiro Ikeda | Kenji Satoh | Susumu Akamine
Proceedings of Machine Translation Summit X: Posters
Existing methods for mining syntactic ordered trees to extract characteristic contents from text sets have two problems: 1) subtrees that are semantically the same but differ as ordered trees are not treated as equivalent, and 2) raw extracted subtrees can be difficult to understand. To avoid these problems, we have developed a method of transforming all ordered trees so that ordered trees with the same meaning are treated as equivalent. We have also developed a method of constructing Japanese text from extracted subtrees, and we have evaluated the effectiveness of both methods when applied to syntactic tree mining.
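The first problem can be illustrated with a minimal sketch. The abstract does not specify the transformation the authors use, so the code below assumes the simplest possible one: sorting sibling nodes into a fixed order so that trees differing only in sibling order map to the same canonical form. The `Node` class, the `canonical` function, and the romanized example phrases are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A node of an ordered syntactic tree (hypothetical structure)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def canonical(node: Node) -> Tuple:
    """Return a nested tuple in which siblings are sorted.

    Assumption: sibling order carries no meaning, so two ordered trees
    that differ only in sibling order get the same canonical form.
    """
    kids = sorted(canonical(child) for child in node.children)
    return (node.label, tuple(kids))


# "Taro-ga hon-o kau" and "hon-o Taro-ga kau" (both: "Taro buys a book")
# yield different ordered trees because the dependents appear in a
# different order under the predicate.
tree_a = Node("kau", [Node("Taro-ga"), Node("hon-o")])
tree_b = Node("kau", [Node("hon-o"), Node("Taro-ga")])

# A tree with a different dependent should remain distinct.
tree_c = Node("kau", [Node("Taro-ga"), Node("pen-o")])

same_meaning = canonical(tree_a) == canonical(tree_b)
different_meaning = canonical(tree_a) == canonical(tree_c)
```

Under this assumption, `tree_a` and `tree_b` compare equal after canonicalization while `tree_c` does not, so a subtree miner that counts canonical forms would treat the two word-order variants as one pattern. A real normalization for Japanese would likely need to handle more than sibling order.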