Felix Heine


2026

IT systems generate log messages that contain important information about system health. To gather information about system entities, we extract technical terms and proper nouns as multi-word expressions (MWEs) from a wide range of log messages from 16 different real systems. We apply Gries’ information-theoretic approach, which iteratively determines the best MWE candidates using an eight-dimensional ranking method. These candidates are evaluated in an annotation study, achieving a precision of 66 %. This value is significantly higher than the values reported in evaluations on general-purpose texts, reflecting the higher occurrence of compound technical terms and proper nouns in log messages. The MWEs found can be used to reduce the number of nodes in a system behavior graph while increasing the information density of each node.
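
To illustrate the iterative idea behind the extraction, the following is a minimal sketch and not the method evaluated here. It assumes whitespace tokenization and replaces the eight-dimensional ranking with a single stand-in score, an approximate pointwise mutual information (PMI) over adjacent tokens. The function names `best_bigram`, `merge`, and `extract_mwes`, the `min_count` threshold, and the example messages are all hypothetical.

```python
import math
from collections import Counter


def best_bigram(token_lists, min_count=2):
    """Return the adjacent token pair with the highest score, or None.

    Assumption: approximate PMI is a stand-in for the multi-dimensional
    ranking; it is NOT the ranking used in the paper.
    """
    unigrams = Counter(tok for toks in token_lists for tok in toks)
    bigrams = Counter(
        pair for toks in token_lists for pair in zip(toks, toks[1:])
    )
    total = sum(unigrams.values())
    best, best_score = None, 0.0
    for (a, b), n in bigrams.items():
        if n < min_count:
            continue  # skip pairs that are too rare to be reliable
        score = math.log2(n * total / (unigrams[a] * unigrams[b]))
        if score > best_score:  # strict: ties keep the first pair seen
            best, best_score = (a, b), score
    return best


def merge(tokens, pair):
    """Fuse every occurrence of `pair` in `tokens` into a single token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def extract_mwes(messages, iterations=5, min_count=2):
    """Repeatedly pick the best pair, record it, and merge it in place.

    Because merged pairs become single tokens, later iterations can
    extend them into longer expressions.
    """
    token_lists = [m.split() for m in messages]
    found = []
    for _ in range(iterations):
        pair = best_bigram(token_lists, min_count)
        if pair is None:
            break
        found.append(" ".join(pair))
        token_lists = [merge(toks, pair) for toks in token_lists]
    return found


# Hypothetical log messages, for illustration only.
logs = [
    "data source timeout",
    "data source unavailable",
    "data source restarted",
]
print(extract_mwes(logs))  # ['data source']
```

Merging the winning pair back into the token stream before re-scoring is what lets a two-word candidate grow into a longer MWE over successive iterations; a real implementation would combine several association dimensions rather than rely on one score.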