Unsupervised Estimation of Subjective Content Descriptions in an Information System

Bender, Magnus; Braun, Tanya; Möller, Ralf; Gehrke, Marcel

Research article (journal) | Peer reviewed

Abstract

Let us consider the following scenario: A human is working with a corpus of text documents. In this corpus, the human needs to find documents with similar content and to highlight relevant locations in retrieved documents. An information system that displays the contents of the corpus and provides an information retrieval agent will help the human. To perform information retrieval on the corpus, the agent used internally in the information system may need additional data associated with the documents. To support this, so-called Subjective Content Descriptions (SCDs) provide additional location-specific data for text documents. SCDs are subjective in the sense that the agent associates data with sentences to reflect beliefs of users. In our scenario, the agent needs SCDs referencing sentences of similar content across various documents in the corpus, but most text documents are not associated with SCDs. Therefore, this paper presents UESM, the Unsupervised Estimator for SCDs Matrices, an approach to associate any corpus with SCDs. In an evaluation, we show that the performance of UESM in estimating topics of similar content in the corpus is on par with latent Dirichlet allocation, while UESM additionally provides SCDs referencing sentences of similar content.

Publication details

Journal: International Journal of Semantic Computing (IJSC)
Volume: 18
Issue: 1
Pages: 51-75
Status: Published
Year of publication: 2024
Language of publication: English
DOI: 10.1142/S1793351X24410034
Keywords: Subjective content description; text annotation; topic modelling; sentence clustering; information system

Authors from the University of Münster

Braun, Tanya
Junior Professorship for Practical Computer Science - Modern Aspects of Data Processing / Data Science (Prof. Braun)