David Burian
2024
Unveiling Semantic Information in Sentence Embeddings
Leixin Zhang | David Burian | Vojtěch John | Ondřej Bojar
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
This study evaluates the extent to which semantic information is preserved in sentence embeddings generated by two state-of-the-art sentence embedding models, SBERT and LaBSE. Specifically, we analyze 13 semantic attributes in sentence embeddings. Our findings indicate that some semantic features (such as tense-related classes) can be decoded from the sentence embeddings. We also identify a limitation of current sentence embedding models: inferring meaning beyond the lexical level proves difficult.
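To illustrate what "decoding" an attribute from an embedding can mean, here is a minimal sketch of a linear probing classifier. It is not the authors' setup: the embeddings are synthetic random vectors standing in for SBERT or LaBSE outputs, the binary label is a hypothetical stand-in for an attribute such as tense, and the names `train_linear_probe` and `probe_accuracy` are invented for this example.

```python
import numpy as np


def train_linear_probe(X, y, lr=0.1, epochs=200):
    """Fit a logistic-regression probe with plain gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        logits = X @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - y  # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def probe_accuracy(X, y, w, b):
    """Fraction of examples whose label the probe predicts correctly."""
    preds = (X @ w + b > 0).astype(int)
    return float((preds == y).mean())


# Synthetic stand-in for sentence embeddings (assumption: real vectors
# would come from a sentence embedding model, not a random generator).
rng = np.random.default_rng(0)
n, d = 400, 16
y = rng.integers(0, 2, size=n)  # hypothetical binary attribute, e.g. past vs. present
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * (2 * y - 1)  # assumption: the attribute is encoded along one direction

# Train on the first 300 examples, evaluate on the held-out 100.
train, test = slice(0, 300), slice(300, 400)
w, b = train_linear_probe(X[train], y[train])
acc = probe_accuracy(X[test], y[test], w, b)
print(f"held-out probe accuracy: {acc:.2f}")
```

High held-out accuracy suggests the attribute is linearly recoverable from the vectors, while accuracy near chance suggests it is not encoded in a form this simple probe can read.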