Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining
J. Kersting, Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining, Universität der Bundeswehr München, Neubiberg, 2023.
Dissertation | Published | German
Abstract
Reading between the lines has so far been reserved for humans. The present dissertation addresses this research gap using machine learning methods.
Implicit expressions are not comprehensible to computers and cannot be localized in the text. However, many texts deal with interpersonal topics and, unlike commercial evaluation texts, often imply information only by means of longer phrases. Examples are the kindness and the attentiveness of a doctor, which are only paraphrased (“he didn’t even look me in the eye”). The analysis of such data, especially the identification and localization of implicit statements, is a research gap (1). This work uses so-called Aspect-based Sentiment Analysis as a method for this purpose. It remains open how the aspect categories to be extracted can be discovered and thematically delineated based on the data (2). Furthermore, it has not yet been explored what a collection of tools should look like with which implicit phrases can be identified and thus made explicit (3).
Finally, it is an open question how to correlate the phrases identified in the text data with other data, including the relationship between quantitative scores (e.g., school grades) and the thematically related text (4). Based on these research gaps, the research question is posed as follows: Using text mining methods, how can implicit rating content be properly interpreted and thus made explicit before it is automatically categorized and quantified?
The uniqueness of this dissertation lies in the automated recognition of implicit linguistic statements alongside explicit statements. These are identified in unstructured text data so that features expressed only in the text can later be compared across data sources, even though they were not included in rating categories such as stars or school grades. German-language physician ratings from websites in three countries serve as the sample domain. The solution approach consists of data creation, a pipeline for text processing, and analyses based on that pipeline. In the data creation, aspect classes are identified and delineated across platforms and marked in text data. This results in six datasets with over 70,000 annotated sentences and detailed guidelines. The models that were created based on the training data extract and categorize the aspects. In addition, the sentiment polarity and the evaluation weight, i.e., the importance of each phrase, are determined. The models, which are combined in a pipeline, are used in a prototype in the form of a web application. The analyses built on the pipeline quantify the rating contents by linking the obtained information with further data, thus allowing new insights.
As a result, a toolbox is provided to identify quantifiable rating content and categories using text mining for a sample domain. This is used to evaluate the approach, which in principle can also be adapted to any other domain.
Publishing Year
2023
Page
208
Cite this
Kersting J. Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining. Universität der Bundeswehr München; 2023.
Kersting, J. (2023). Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining. Universität der Bundeswehr München.
@book{Kersting_2023, place={Neubiberg}, title={Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining}, publisher={Universität der Bundeswehr München}, author={Kersting, Joschka}, year={2023} }
Kersting, Joschka. Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining. Neubiberg: Universität der Bundeswehr München, 2023.
J. Kersting, Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining. Neubiberg: Universität der Bundeswehr München, 2023.
Kersting, Joschka. Identifizierung quantifizierbarer Bewertungsinhalte und -kategorien mittels Text Mining. Universität der Bundeswehr München, 2023.