TY - CHAP AB - Opinion mining from physician rating websites depends on the quality of the extracted information. Sometimes reviews are user-error prone and the assigned stars or grades contradict the associated content. We therefore aim at detecting random individual error within reviews. Such errors comprise the disagreement in polarity of review texts and the respective ratings. The challenges that thereby arise are (1) the content and sentiment analysis of the review texts and (2) the removal of the random individual errors contained therein. To solve these tasks, we assign polarities to automatically recognized opinion phrases in reviews and then check for divergence in rating and text polarity. The novelty of our approach is that we improve user-generated data quality by excluding error-prone reviews on German physician websites from average ratings. AU - Geierhos, Michaela AU - Bäumer, Frederik Simon AU - Schulze, Sabine AU - Stuß, Valentina ED - Ali, Moonis ED - Kwon, Young Sig ED - Lee, Chang-Hwan ED - Kim, Juntae ED - Kim, Yongdai ID - 293 SN - 978-3-319-19065-5 T2 - Proceedings of the 28th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE 2015) TI - Filtering Reviews by Random Individual Error VL - 9101 ER - TY - CONF AU - Stuß, Valentina AU - Geierhos, Michaela ID - 1141 T2 - DHd 2015: Book of Abstracts TI - Identifikation kognitiver Effekte in Online-Bewertungen ER - TY - CONF AU - Geierhos, Michaela AU - Bäumer, Frederik Simon ID - 1142 T2 - DHd 2015: Book of Abstracts TI - Erfahrungsberichte aus zweiter Hand: Erkenntnisse über die Autorschaft von Arztbewertungen in Online-Portalen ER - TY - JOUR AB - Patients increasingly share their experiences online. Rating portals such as jameda, DocInsider or imedo.de give patients and their relatives the opportunity to voice complaints or make recommendations anonymously.
At the same time, these hundred thousand individual experiences make it possible to survey patient satisfaction and to check existing rumors, for example that privately insured patients get a doctor's appointment faster and spend less time in the waiting room. The analysis of anonymous online physician reviews can only succeed if the interpretation of patients' experience reports takes into account that factors independent of treatment quality affect the subjective rating and the complaint behavior. A new approach is therefore to identify significant indicators of patient satisfaction in the Web 2.0 in order to generate a detailed picture of patient experience and sentiment, taking demographic and regional influences into account. AU - Geierhos, Michaela AU - Schulze, Sabine ID - 1143 JF - ForschungsForum Paderborn TI - Der zufriedene Patient 2.0: Analyse anonymer Arztbewertungen zur Generierung eines Patientenstimmungsbildes VL - 18 ER - TY - CONF AB - Adopting the concept of “Local Grammars” (M. Gross), which was successfully applied in practice by Geierhos (2010) to biographical information extraction in English, our project aims to detect, encode, and finally visualize relations between persons. Our corpus consists of the digitised biographical lexicon “Neue Deutsche Biographie (NDB)”, roughly 21,000 biographies in 25 volumes, in print since 1953. We developed local grammars and suitable dictionaries to describe interpersonal relations and applied them to the corpus with Unitex 3.1. The local grammars were designed to integrate existing TEI-XML structures in the corpus. Using the ability of local grammars in Unitex to act as transducers, we were able to produce XML tags and encode semantic information. Based on grammars for personal names and places, we described interpersonal relations such as “to study”, predecessors and successors, as well as friends and circles.
Afterwards, we identified persons (as given in the authority file or index). Finally, we displayed relations on our website in an interactive and dynamic way. Using the JavaScript library D3.js, we represented named relations between identified individuals as ego-centred network graphs. AU - Stotz, Sophia AU - Stuß, Valentina AU - Reinert, Matthias AU - Schrott, Maximilian ED - ter Braake, Serge ED - Fokkens, Antske ED - Sluijter, Ronald ED - Declerck, Thierry ED - Wandl-Vogt, Eveline ID - 1144 KW - Local Grammar KW - Relation Extraction KW - Visualisation SN - 16130073 T2 - Proceedings of the First Conference on Biographical Data in a Digital World 2015 TI - Interpersonal relations in biographical dictionaries. A case study VL - 1399 ER - TY - CONF AB - Received medical services are increasingly discussed and recommended on physician rating websites (PRWs). The reviews and ratings on these platforms are valuable sources of information for patient opinion mining. In this paper, we have tackled three issues that come along with inconsistency analysis on PRWs: (1) natural language processing of user-generated reviews, (2) the disagreement in polarity of review text and its corresponding numerical ratings (individual inconsistency) and (3) the differences in patients’ rating behavior for the same service category (e.g. ‘treatment’) expressed by varying grades on the entire data set (collective inconsistency). Thus, the basic idea is first to identify relevant opinion phrases that describe service categories and to determine their polarity. Subsequently, the particular phrase has to be assigned to its corresponding numerical rating category before checking the (dis-)agreement of polarity values. For this purpose, several local grammars for the pattern-based analysis as well as domain-specific dictionaries for the recognition of entities, aspects and polarity were applied to 593,633 physician reviews from the two German PRWs jameda.de and docinsider.de.
Furthermore, our research contributes to content quality improvement of PRWs because we provide a technique to detect inconsistent reviews that could be ignored for the computation of average ratings. AU - Geierhos, Michaela AU - Bäumer, Frederik Simon AU - Schulze, Sabine AU - Stuß, Valentina ID - 1145 SN - 9783000502842 T2 - ECIS 2015 Completed Research Papers TI - "I grade what I get but write what I think." Inconsistency Analysis in Patients' Reviews ER - TY - GEN AB - Patients increasingly share their experiences via physician rating portals. The anonymity of the web allows largely honest complaint behavior that leaves the sensitive relationship of trust between physician and patient undamaged. In this contribution, anonymous physician reviews in the Web 2.0 were evaluated automatically in order to determine factors that influence the complaint behavior of German patients and to clear up “patient myths” that are supposedly established in society. In the longer term, uncovering misconceptions and satisfaction indicators is intended to help interpret patient statements in a more differentiated way and thus to contribute to a lasting improvement of the physician-patient relationship. AU - Geierhos, Michaela AU - Schulze, Sabine AU - Bäumer, Frederik Simon ID - 1147 TI - Der zufriedene Patient 2.0: Analyse anonymer Arztbewertungen im Web 2.0 VL - 3 ER - TY - CONF AB - The individual search for information about physicians on Web 2.0 platforms can affect almost all aspects of our lives. People can directly access physician rating websites via web browsers or use any search engine to find physician reviews and ratings filtered by location or specialty. However, sometimes keyword search does not meet user needs because of the mismatch between users’ common terms queries for symptoms and the widespread medical terminology.
In this paper, we present the prototype of a specialised search engine that overcomes this by indexing user-generated content (i.e., review texts) for physician discovery and provides automatic suggestions as well as an appropriate visualisation. On the one hand, we consider the available numeric physician ratings as a sorting criterion for the ranking of query results. On the other hand, we extended existing ranking algorithms with respect to domain-specific types and physician ratings. We gathered more than 860,000 review texts and collected more than 213,000 physician records. A random test shows that about 19.7% of 5,100 different words in total are health-related and partly belong to consumer health vocabularies. Our evaluation results show that the query results fit users’ particular health issues when they search for physicians. AU - Bäumer, Frederik Simon AU - Dollmann, Markus AU - Geierhos, Michaela ED - Shakshuki, Elhadi M. ID - 1148 KW - Physician Discovery KW - Consumer Health Vocabulary KW - Common Terms Query SN - 18770509 T2 - The 6th International Conference on Emerging Ubiquitous Systems and Pervasive Networks (EUSPN 2015) / The 5th International Conference on Current and Future Trends of Information and Communication Technologies in Healthcare (ICTH-2015) / Affiliated Workshops TI - Find a Physician by Matching Medical Needs described in your Own Words VL - 63 ER - TY - CHAP AB - The contacts that a health care provider (HCP), such as a physician, has to other HCPs are perceived by patients as a quality characteristic. So far, only the German physician rating website jameda.de gives information about the interconnectedness of HCPs in business networks. However, this network has to be maintained manually and is thus incomplete. We therefore developed a system for uncovering latent connectivity of HCPs in online reviews to provide users with more valuable information about their HCPs.
The overall goal of this approach is to extend already existing business networks of HCPs by integrating connections that are newly discovered by our system. Our most recent evaluation results are promising: 70.8% of the connections extracted from the review texts were correctly identified, and in total 3,788 relations were recognized that had not been displayed in jameda.de’s network before. AU - Bäumer, Frederik Simon AU - Geierhos, Michaela AU - Schulze, Sabine ED - Dregvaite, Giedre ED - Damasevicius, Robertas ID - 1149 KW - Latent Connectivity KW - Person Named Entity Recognition and Disambiguation KW - Health Care Provider Reviews SN - 978-3-319-24769-4 T2 - Information and Software Technologies. 21st International Conference, ICIST 2015, Druskininkai, Lithuania, October 15-16, 2015. Proceedings TI - A System for Uncovering Latent Connectivity of Health Care Providers in Online Reviews VL - 538 ER - TY - CHAP AB - Patients 2.0 increasingly inform themselves about the quality of medical services on physician rating websites. However, little is known about whether the reviews and ratings on these websites truly reflect the quality of services or whether the ratings on these websites are rather influenced by patients’ individual rating behavior. Therefore, we investigate more than 790,000 physician reviews on Germany’s most used physician rating website jameda.de. Our results show that patients’ ratings do not only reflect treatment quality but are also influenced by factors independent of treatment quality, such as age and complaint behavior. Hence, we provide evidence that users should be well aware of user-specific rating distortions when intending to make their physician choice based on these ratings. AU - Geierhos, Michaela AU - Bäumer, Frederik Simon AU - Schulze, Sabine AU - Klotz, Caterina ED - Christiansen, Henning ED - Stojanovic, Isidora ED - Papadopoulos, George A.
ID - 1150 KW - Health 2.0 KW - Rating Behavior KW - Patient Opinion Mining on Physician Rating Websites SN - 9783319255903 T2 - Modeling and Using Context. 9th International and Interdisciplinary Conference, CONTEXT 2015, Lanarca, Cyprus, November 2-6, 2015. Proceedings TI - Understanding the Patient 2.0: Gaining Insight into Patients' Rating Behavior by User-generated Physician Review Mining VL - 9405 ER - TY - CONF AB - Existing approaches towards service composition require customers to state their requirements in terms of service templates, service query profiles, or partial process models. However, the non-expert customers addressed may be unable to fill in the slots of service templates as requested or to describe, for example, pre- and postconditions, or may even have difficulties in formalizing their requirements. Thus, our idea is to provide non-experts with suggestions on how to complete or clarify their requirement descriptions written in natural language. Two main issues have to be tackled: (1) partial or full inability (incapacity) of non-experts to specify their requirements correctly in formal and precise ways, and (2) problems in text analysis due to fuzziness in natural language. We present ideas on how to face these challenges by means of requirement disambiguation and completion. To this end, we conduct ontology-based requirement extraction and similarity retrieval based on requirement descriptions that are gathered from App marketplaces. The innovative aspect of our work is that we support users without expert knowledge in writing their requirements by simultaneously resolving ambiguity, vagueness, and underspecification in natural language.
AU - Geierhos, Michaela AU - Schulze, Sabine AU - Bäumer, Frederik Simon ED - Loiseau, Stephane ED - Filipe, Joaquim ED - Duval, Béatrice ED - van den Herik, Jaap ID - 231 SN - 978-989-758-073-4 T2 - Proceedings of the 7th International Conference on Agents and Artificial Intelligence (ICAART), Special Session on Partiality, Underspecification, and Natural Language Processing (PUaNLP 2015) TI - What did you mean? Facing the Challenges of User-generated Software Requirements ER - TY - CHAP AB - Finding information about people in the World Wide Web is one of the most common activities of Internet users. It is now impossible to manually analyze all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. The Wikipedia community still puts much effort in manually adding structured data to biographical articles, the so-called {{Persondata}} template. Thanks to this kind of metadata, semantically-enriched information concerning the biographee (e.g. name, date of birth, place of birth) can be extracted and processed by search engines. But it is a rather time-consuming task and users quite often forget to add this template: some biographies contain persondata, others do not. There is considerably less work done on developing approaches to automatically enhance English Wikipedia biographies with persondata and therefore improve the quality of structured user contributions. Within this paper, we describe our method to automatically generate persondata from biographical information in Wikipedia articles. AU - Geierhos, Michaela ED - Kakoyianni-Doa, Fryni ID - 1124 SN - 9782745325129 T2 - Penser le Lexique-Grammaire TI - Towards a Local Grammar-based Persondata Generator for Wikipedia Biographies ER - TY - CONF AB - In this paper, we focus on the acronym representation, the concept of abbreviation of major terminology. 
To this end, we try to find the most efficient method to disambiguate the sense of the acronym. Comparing the various feature types, we found that using single nouns (NN) clearly outperformed the noun phrase (NP) basis. Moreover, the results also showed that collocation information (CL) was not efficient for enhancing performance, given the large amount of extra data processing it requires. We expect to apply the open knowledge base Wikipedia to scholarly services to enhance the quality of the local knowledge base and to develop value-added services. AU - Jeong, Do-Heon AU - Gim, Jangwon AU - Jung, Hanmin AU - Geierhos, Michaela AU - Bäumer, Frederik Simon ID - 1130 SN - 20930542 T2 - Conference Proceedings of the 9th Asia Pacific International Conference on Information Science and Technology (APIC-IST 2014) TI - Comparative study on disambiguating acronyms in the scientific papers using the open knowledge base ER - TY - GEN AU - Geierhos, Michaela AU - Schulze, Sabine ID - 1131 T2 - Challenges for Consumer Research and Consumer Policy in Europe TI - The same but not the same - Challenges in comparing patient opinions ER - TY - GEN AU - Geierhos, Michaela AU - Siri, Jasmin ID - 1133 T2 - Tagungsband Forschungsethik in der qualitativen und quantitativen Sozialforschung TI - Was beobachtet die Forschungsethik? Eine interdisziplinäre Diskussion zwischen Computerlinguistik und qualitativ-konstruktivistischer Sozialforschung ER - TY - CONF AB - This paper focuses on the first step in combining prescriptive analytics with scenario techniques in order to provide strategic development after the use of InSciTe, a data prescriptive analytics application. InSciTe supports the improvement of researchers’ individual performance by recommending new research directions. Standardized influential factors are presented as a foundation for automated scenario modelling such as the prototypical report generation function of InSciTe.
Additionally, a use case is shown which validates the potential of the standardized influential factors for raw scenario development. AU - Weber, Jens AU - Minhee, Cho AU - Lee, Mikyoung AU - Song, Sa-kwang AU - Geierhos, Michaela AU - Jung, Hanmin ED - Jung, Hanmin ED - Mandl, Thomas ED - Womser-Hacker, Christa ED - Xu, Shuo ID - 1134 KW - Standardized Influential Factors KW - Prescriptive Analytics KW - Role Model Group KW - Scenario Technique SN - 16130073 T2 - Proceedings of the First International Workshop on Patent Mining and Its Applications (IPaMin 2014) co-located with Konvens 2014 TI - System Thinking: Crafting Scenarios for Prescriptive Analytics VL - 1292 ER - TY - CONF AB - In this paper, we describe our system developed for the GErman SenTiment AnaLysis shared Task (GESTALT) for participation in the Maintask 2: Subjective Phrase and Aspect Extraction from Product Reviews. We present a tool which identifies subjective and aspect phrases in German product reviews. For the recognition of subjective phrases, we pursue a lexicon-based approach. For the extraction of aspect phrases from the reviews, we consider two possible ways: Besides the subjectivity and aspect look-up, we also implemented a method to establish which subjective phrase belongs to which aspect. The system achieves better results for the recognition of aspect phrases than for the subjective identification. AU - Dollmann, Markus AU - Geierhos, Michaela ED - Faaß, Gertrud ED - Ruppenhofer, Josef ID - 1135 KW - corpus linguistics KW - sentiment analysis SN - 978-3-934105-47-8 T2 - Workshop Proceedings of the 12th Edition of the KONVENS Conference TI - SentiBA: Lexicon-based Sentiment Analysis on German Product Reviews ER - TY - CONF AB - In this paper, we present a system which makes scientific data available following the linked open data principle using standards like RDF and URI as well as the popular D2R server (D2R) and the customizable D2RQ mapping language.
Our scientific data sets include acronym data and expansions, as well as researcher data such as author name, affiliation, coauthors, and abstracts. The system can easily be extended to other records. In this respect, a domain adaptation to patent mining seems possible. For this reason, obvious similarities and differences are presented here. The data set is collected from several different providers like publishing houses and digital libraries, which follow different standards in data format and structure. Most of them do not support Semantic Web technologies but only the legacy HTML standard. The integration of these large amounts of scientific data into the Semantic Web is challenging, and it needs flexible data structures to access this information and interlink it. Based on these data sets, we will be able to derive a general technology trend as well as the individual research domain for each researcher. The goal of our Linked Open Data System for scientific data is to provide access to this data set for other researchers using the Web of Linked Data. Furthermore, we implemented an application for visualization, which allows us to explore the relations between single data sets. AU - Bäumer, Frederik Simon AU - Gim, Jangwon AU - Jeong, Do-Heon AU - Geierhos, Michaela AU - Jung, Hanmin ED - Jung, Hanmin ED - Mandl, Thomas ED - Womser-Hacker, Christa ED - Xu, Shuo ID - 1137 KW - Linked Open Data KW - Researcher Data KW - Acronym Data KW - D2R SN - 16130073 T2 - Proceedings of the First International Workshop on Patent Mining and Its Applications (IPaMin 2014) co-located with Konvens 2014 TI - Linked Open Data System for Scientific Data Sets VL - 1292 ER - TY - CONF AB - Customized planning, engineering and build-up of factory plants are very complex tasks, where project management contains lots of risks and uncertainties.
Existing simulation techniques could help massively to evaluate these uncertainties and achieve improved and at least more robust plans during project management, but are typically not applied in industry, especially at SMEs (small and medium-sized enterprises). This paper presents some results of the joint research project simject of the Universities of Paderborn and Kassel, which aims at the development of a demonstrator for a simulation-based and logistic-integrated project planning and scheduling. Based on the researched state-of-the-art, requirements and a planning process are derived and described, as well as a draft of the current technical infrastructure of the intended modular prototype. First plug-ins for project simulation and multi-project optimization are implemented and already show possible benefits for the project management process. AU - Gutfeld, Thomas AU - Jessen, Ulrich AU - Wenzel, Sigrid AU - Weber, Jens ED - Tolk, Andreas ED - Diallo, Saikou Y. ED - Ryzhov, Ilya O. ED - Yilmaz, Levent ED - Buckley, Stephen J. ED - Miller, John A. ID - 1140 SN - 9781479974863 T2 - Proceedings of the 2014 Winter Simulation Conference TI - A Technical Concept for Plant Engineering by Simulation-Based and Logistic-Integrated Project Management ER - TY - CONF AB - The conceptual condensability of technical terms permits us to use them as effective queries to search scientific databases. However, authors often employ alternative expressions to represent the meanings of specific terms, in other words, Terminological Paraphrases (TPs) in the literature for certain reasons. In this paper, we propose an effective way to retrieve “de facto relevance documents” which only contain those TPs and cannot be searched by conventional models in an environment with only controlled vocabularies by adapting Predicate Argument Tuple (PAT). 
The experiment confirms that PAT-based document retrieval is an effective and promising method to search those kinds of documents and to improve terminology-based scientific information access models. AU - Choi, Sung-Pil AU - Song, Sa-kwang AU - Jung, Hanmin AU - Geierhos, Michaela AU - Myaeng, Sung Hyon ED - Chang, Chin-Chen ED - Gelogo, Yvette E. ED - Caytiles, Ronnie E. ID - 1127 SN - 22871233 T2 - Information Science and Industrial Applications: Proceedings, International Conference, ISI 2012, Cebu, Philippines, May 2012 TI - Scientific Literature Retrieval based on Terminological Paraphrases using Predicate Argument Tuple VL - 4 ER - TY - CONF AB - Our purpose is to perform data record extraction from online event calendars exploiting sublanguage and domain characteristics. We therefore use so-called domain-dependent data (D³) completely based on language-specific key expressions and HTML patterns to recognize every single event given on the investigated web page. One of the most remarkable advantages of our method is that it does not require any additional classification steps based on machine learning algorithms or keyword extraction methods; it is a so-called one-step mining technique. Another important criterion is that our system is robust to DOM and layout modifications made by web designers. Preliminary experimental results are provided to demonstrate a proof of concept of such an approach, tested on websites in the German opera domain. Furthermore, we could show that our proposed technique outperforms other data record mining applications run on event sites. AU - Lee, Yeong Su AU - Geierhos, Michaela AU - Song, Sa-Kwang AU - Jung, Hanmin ED - Chang, Chin-Chen ED - Gelogo, Yvette E. ED - Caytiles, Ronnie E.
ID - 1128 SN - 22871233 T2 - Software Technology: Proceedings, International Conference, SoftTech 2012, Cebu, Philippines, May 2012 TI - A Proof-of-Concept of D³ Record Mining using Domain-Dependent Data VL - 5 ER - TY - CHAP AB - Within this chapter, we will describe a novel technical service dealing with the integration of social networking channels into existing business processes. Since many businesses are moving to online communities as a means of communicating directly with their customers, social media has to be explored as an additional communication channel between individuals and companies. While the English-speaking consumers on Facebook are more likely to respond to communication rather than to initiate communication with an organisation, some German companies already have regularly updated Facebook pages for customer service and support, e.g. Telekom. Therefore, the idea of classifying and evaluating public comments addressed to German companies is based on an existing demand. In order to maintain an active Facebook wall, the consumer posts have to be categorised and then automatically assigned to the corresponding business processes (e.g. the technical service, shipping, marketing, accounting, etc.). This service works like an issue tracking system sending e-mails to the corresponding person in charge of customer service and support. That way, business process management systems which are already used to e-mail communication can benefit from social media. This allows the company to follow general trends in customer opinions on the Internet; moreover, it facilitates the recording of two-sided communication for customer relationship management, and the company’s response will be delivered through the consumer’s preferred medium: Facebook.
AU - Geierhos, Michaela AU - Ebrahim, Mohamed ED - Abraham, Ajith ED - Hassanien, Aboul-Ella ID - 1129 SN - 9781447140474 T2 - Computational Social Networks: Tools, Perspectives and Applications TI - Customer Interaction Management goes Social: Getting Business Processes plugged in Social Networks ER - TY - CONF AU - Geierhos, Michaela AU - Lee, Yeong Su AU - Schuster, Jörg AU - Kobothanassi, Despina AU - Bargel, Matthias ED - De Bra, Paul ED - Grønbæk, Kaj ID - 1119 T2 - Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia TI - A Social Media Customer Service ER - TY - CONF AB - SCM is a simple, modular and flexible system for web monitoring and customer interaction management. In our view, its main advantages are the following: It is completely web based. It combines all technologies, data, software agents and human agents involved in the monitoring and customer interaction process. It can be used for messages written in any natural language. Although the prototype of SCM is designed for classifying and processing messages about mobile-phone related problems in social networks, SCM can easily be adapted to other text types such as discussion board posts, blogs or emails. Unlike comparable systems, SCM uses linguistic technologies to classify messages and recognize paraphrases of product names. For two reasons, product name paraphrasing plays a major role in SCM: First, product names typically have many, sometimes hundreds or thousands of intralingual paraphrases. Secondly, product names have interlingual paraphrases: The same products are often called or spelt differently in different countries and/or languages. By mapping product name variants to an international canonical form, SCM allows for answering questions like Which statements are made about this mobile phone in which languages/in which social networks/in which countries/...? 
The SCM product name paraphrasing engine is designed in such a way that standard variants are assigned automatically, regular variants are assigned semiautomatically and idiosyncratic variants can be added manually. With this and similar features we try to realize our philosophy of simplicity, modularity and flexibility: Whatever can be done automatically is done automatically. But manual intervention is always possible and easy and it does not conflict in any way with the automatic functions of SCM. AU - Schuster, Jörg AU - Lee, Yeong Su AU - Kobothanassi, Despina AU - Bargel, Matthias AU - Geierhos, Michaela ID - 1120 KW - Social Media Business Integration KW - Contact Center Application Support KW - Monitoring Social Conversations KW - Social Customer Interaction Management KW - Monitoring KW - Software Agents SN - 978-1-61284-148-9 T2 - International Conference on Information Society (i-Society 2011) TI - SCM - A Simple, Modular and Flexible Customer Interaction Management System ER - TY - CHAP AB - This paper presents a novel linguistic information extraction approach exploiting analysts’ stock ratings for statistical decision making. Over a period of one year, we gathered German stock analyst reports in order to determine market trends. Our goal is to provide business statistics over time to illustrate market trends for a user-selected company. We therefore recognize named entities within the very short stock analyst reports such as organization names (e.g. BASF, BMW, Ericsson), analyst houses (e.g. Gartner, Citigroup, Goldman Sachs), ratings (e.g. buy, sell, hold, underperform, recommended list) and price estimations by using lexicalized finite-state graphs, so-called local grammars. Then, company names and their acronyms respectively have to be cross-checked against data the analysts provide. Finally, all extracted values are compared and presented into charts with different views depending on the evaluation criteria (e.g. by time line). 
Thanks to this approach, it will be easier and even more comfortable in the future to pay attention to analysts’ buy/sell signals without reading all their reports. AU - Lee, Yeong Su AU - Geierhos, Michaela ED - Beigl, Michael ED - Christiansen, Henning ED - Roth-Berghofer, Thomas R. ED - Kofod-Petersen, Anders ED - Coventry, Kenny R. ED - Schmidtke, Hedda R. ID - 1121 SN - 9783642242786 T2 - Modeling and Using Context: 7th International and Interdisciplinary Conference, CONTEXT 2011, Karlsruhe, Germany, September 26-30, 2011, Proceedings TI - Buy, Sell, or Hold? Information Extraction from Stock Analyst Reports VL - 6967 ER - TY - CONF AB - Within this paper, we will describe a new approach to customer interaction management by integrating social networking channels into existing business processes. Until now, contact center agents still read these messages and forward them to the persons in charge of customer service in the company. But with the introduction of Web 2.0 and social networking, clients are more likely to communicate with the companies via Facebook and Twitter instead of filling data in contact forms or sending e-mail requests. In order to maintain an active communication with international clients via social media, the multilingual consumer contacts have to be categorized and then automatically assigned to the corresponding business processes (e.g. technical service, shipping, marketing, and accounting). This allows the company to follow general trends in customer opinions on the Internet, but also to record two-sided communication for customer relationship management.
AU - Geierhos, Michaela AU - Lee, Yeong Su AU - Bargel, Matthias ED - Hedeland, Hanna ED - Schmidt, Thomas ED - Wörner, Kai ID - 1122 KW - Classification of Multilingual Customer Contacts KW - Contact Center Application Support KW - Social Media Business Integration SN - 0176-599X T2 - Multilingual Resources, Multilingual Applications: Proceedings of the Conference of the German Society for Computational Linguistics and Language Technology (GSCL) 2011 TI - Processing Multilingual Customer Contacts via Social Media VL - 96 ER - TY - CONF AB - Within this paper, we describe the special requirements of a semantic annotation scheme used for biographical event extraction in the framework of the European collaborative research project Biographe. This annotation scheme supports interlingual search for people due to its multilingual support covering four languages: English, German, French and Dutch. AU - Geierhos, Michaela AU - Bouraoui, Jean-Leon AU - Watrin, Patrick ED - Hedeland, Hanna ED - Schmidt, Thomas ED - Wörner, Kai ID - 1123 KW - Biographical Event Extraction for Interlingual People Search KW - Semantic Annotation Scheme SN - 0176-599X T2 - Multilingual Resources, Multilingual Applications. Proceedings of the Conference of the German Society for Computational Linguistics and Language Technology (GSCL) 2011 TI - Towards Multilingual Biographical Event Extraction VL - 96 ER - TY - JOUR AB - Since customers first share their problems with a social networking community before directly addressing a company, social networking sites such as Facebook, Twitter, MySpace or Foursquare will be the interface between customer and company. For this reason, it is assumed that social networks will evolve into a common communication channel – not only between individuals but also between customers and companies. However, social networking has not yet been integrated into customer interaction management (CIM) tools.
In general, a CIM application is used by the agents in a contact centre while communicating with the customers. Such systems handle communication across multiple different channels, such as e-mail, telephone, Instant Messaging, letter etc. What we do now is to integrate social networking into CIM applications by adding another communication channel. This allows the company to follow general trends in customer opinions on the Internet, but also record two-sided communication for customer service management and the company’s response will be delivered through the customer’s preferred social networking site. AU - Geierhos, Michaela ID - 1125 IS - 4 JF - Journal of Advances in Information Technology KW - Social Media Business Integration KW - Multichannel Customer Interaction Management KW - Contact Centre Application Support SN - 17982340 TI - Customer Interaction 2.0: Adopting Social Media as Customer Service Channel VL - 2 ER - TY - CHAP AU - Geierhos, Michaela AU - Blanc, Olivier ED - De Gioia, Michele ID - 1117 SN - 9788854831667 T2 - Actes du 27e Colloque international sur le lexique et la grammaire (L’Aquila, 10-13 septembre 2008) TI - BiographIE - Biographical Information Extraction from Business News VL - 2 ER - TY - BOOK AB - Das wesentliche Ziel der vorliegenden Publikation ist die Erstellung von sprachspezifischen Modulen im Bereich der Biographischen InformationsExtraktion (BiographIE). Unter Informationsextraktion verstehen wir die automatisierte Analyse von Dokumenten im Hinblick auf das Entdecken und Normalisieren von semantisch interessanten Entitäten und deren Eigenschaften. Das Hauptgewicht der Arbeit liegt auf sehr detaillierten und umfangreichen linguistischen Grammatiken im Bereich der Beschreibung von Personen und deren Beziehungen zu anderen relevanten Entitäten (z.B. Organisationen, Orte, Datums- und Zeitangaben) in Texten. Neben den öffentlichen und privaten Eigenschaften von Personen (Geburtsdatum, Nationalität etc.) 
sollen vor allem alle biographisch relevanten Attribute aus Texten extrahiert werden können. Dazu gehören in erster Linie berufliche Werdegänge, Anstellungsverhältnisse, Rollen in Firmen und ähnliche Eigenschaften. Da alle diese Attribute in unzählbar verschiedenen Formen ausgedrückt werden können, müssen sehr umfangreiche Lexika und sehr detaillierte grammatische Beschreibungen erstellt werden. Dies geschieht hauptsächlich bei der systematischen Evaluierung von Korpora. Je umfangreicher diese sind, desto adäquater werden die erstellten Grammatiken sein. Im Gegensatz zu den heute üblichen statistischen, auf maschinellem Lernen basierenden Verfahren setzen wir auch umfangreiche semi-automatisch erstellte, linguistische Module ein, die dann durch systematische Evaluierung auf Korpora schnell ergänzt und verbessert werden können. Basierend auf unseren Extraktionsmethoden ist es nun möglich, im Bereich der semantischen Suche deutliche Fortschritte zu machen. Insbesondere Personensuchmaschinen können sich unsere detaillierten Analysemethoden zu Nutze machen, um beispielsweise zu ermitteln, wer in welcher Funktion bei welcher Firma von wann bis wann beschäftigt war. AU - Geierhos, Michaela ID - 1118 KW - Natural Language Processing SN - 9783862880133 TI - BiographIE - Klassifikation und Extraktion karrierespezifischer Informationen VL - 5 ER - TY - CONF AB - This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifies business specific information. We therefore concentrate on the extraction of characteristic vocabulary like company names, addresses, contact details, CEOs, etc. Above all, we interpret the HTML structure of documents and analyze some contextual facts to transform the unstructured web pages into structured forms. Our approach is quite robust in variability of the DOM, upgradeable and keeps data up-to-date. 
The evaluation experiments show high efficiency of information access to the generated data. Hence, the developed technique is adaptive to non-German websites with slight language-specific modifications, and experimental results on real-life websites confirm the feasibility of the approach. AU - Lee, Yeong Su AU - Geierhos, Michaela ED - Aly, Robin ED - Hauff, C. ED - Hiemstra, Djoerd ED - Huibers, Theo W.C. ED - de Jong, Franciska M.G. ID - 1114 KW - company search KW - information extraction KW - sublanguage SN - 0929-0672 T2 - Proceedings of the 9th Dutch-Belgian Information Retrieval Workshop TI - Business Specific Online Information Extraction from German Websites ER - TY - CONF AB - This paper presents an approach to extract data records from websites, particularly ones with event calendars. We therefore use language-specific key expressions and HTML patterns to recognize every single event given on the investigated web page. One of the most remarkable advantages of our method is that it does not require any additional classification steps based on machine learning algorithms or keyword extraction methods; it is a so-called one-step mining technique. Our experimental results obtained on German opera websites show excellent results in precision and recall. Furthermore, we could demonstrate that our proposed technique outperforms other data record mining applications run on event sites. AU - Lee, Yeong Su AU - Geierhos, Michaela ID - 1115 T2 - KDML’09 Tagungsband. Workshop-Woche: Lernen - Wissen - Adaptivität. LWA 2009. 21.-23.09.2009 TI - Key Expression driven Record Mining for Event Calendar Search ER - TY - CHAP AB - This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifies business specific information. We therefore concentrate on the extraction of characteristic vocabulary like company names, addresses, contact details, CEOs, etc. 
Above all, we interpret the HTML structure of documents and analyze some contextual facts to transform the unstructured web pages into structured forms. Our approach is quite robust in variability of the DOM, upgradeable and keeps data up-to-date. The evaluation experiments show high efficiency of information access to the generated data. Hence, the developed technique is adaptive to non-German websites with slight language-specific modifications, and experimental results on real-life websites confirm the feasibility of the approach. AU - Lee, Yeong Su AU - Geierhos, Michaela ED - Gelbukh, Alexander ID - 1116 SN - 978-3-642-00381-3 T2 - Computational Linguistics and Intelligent Text Processing: 10th International Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009, Proceedings TI - Business Specific Online Information Extraction from German Websites VL - 5449 ER - TY - CONF AB - Ce papier présente le contexte linguistique et la modélisation de notre système iBeCOOL (Informations Biographiques Extraites à l’aide de COntextes Observés Linguistiquement) dédié à l’extraction d’informations biographiques dans les textes de la presse financière en langue anglaise. La notion d’événement biographique (tel que la naissance, le mariage, la carrière professionnelle) est caractérisée formellement par un schéma prédicatif à plusieurs arguments dont l’un étant une instance de la classe d’objets . Notre approche consiste à décrire ces types de relations à l’aide de grammaires locales et de lexiques terminologiques. Nos résultats montrent que cette approche semble viable et nous poussent à élargir cette étude par l’analyse de nouveaux genres textuels.
AU - Geierhos, Michaela AU - Blanc, Olivier AU - Bsiri, Sandra ID - 1108 KW - extraction d’informations biographiques KW - relations sémantiques KW - grammaires locales KW - entités nommées KW - enrichissement du lexique T2 - Proceedings of the Lexis and Grammar Conference 2008 TI - iBeCOOL - Extraction d'informations biographiques dans les textes financiers ER - TY - CHAP AU - Geierhos, Michaela AU - Bsiri, Sandra ED - Gross, Gaston ED - Schulz, Klaus U. ID - 1109 SN - 978-1-904987-80-2 T2 - Linguistics, Computer Science and Language Processing: Festschrift for Franz Guenthner on the Occasion of His 60th Birthday (Tributes 6) TI - ProfilPro: Reconstitution automatique d'un profil professionnel à partir des documents du Web VL - 6 ER - TY - JOUR AU - Geierhos, Michaela AU - Blanc, Olivier AU - Bsiri, Sandra ID - 1110 IS - 1 JF - Traitement Automatique des Langues (TAL) TI - RELAX - Extraction de relations sémantiques dans les contextes biographiques VL - 49 ER - TY - JOUR AB - The standard approach of job search engines disregards the structural aspect of job announcements on the Web. Bag-of-words indexing leads to a high amount of noise. In this paper we describe a method that uses local grammars to transform unstructured Web pages into structured forms. Evaluation experiments show high efficiency of information access to the generated documents. AU - Bsiri, Sandra AU - Geierhos, Michaela AU - Ringlstetter, Christoph ID - 1169 JF - Research in Computing Science SN - 1870-4069 TI - Structuring Job Search via Local Grammars VL - 33 ER - TY - BOOK AU - Geierhos, Michaela ID - 1106 TI - Grammatik der Menschenbezeichner in biographischen Kontexten VL - 2 ER - TY - CONF AB - Dieser Beitrag beschäftigt sich mit der Informationsextraktion aus Stellenanzeigen im französischsprachigen Web. Ziel dieser Arbeit ist es, unstrukturierte Dokumente in Repräsentationsvektoren anhand lokaler Grammatiken zu transformieren.
Auf diese Weise wird es möglich, den Stellenmarkt für Jobsuchmaschinen transparenter zu gestalten, indem nur auf dem Inhalt der Anzeige in Form von Darstellungsvektoren anstatt auf unübersichtlichem Fließtext gesucht werden muss. AU - Bsiri, Sandra AU - Geierhos, Michaela ED - Hinneburg, Alexander ID - 1107 SN - 978-3-86010-907-6 T2 - LWA 2007: Lernen - Wissen - Adaption, Halle, September 2007, Workshop Proceedings TI - Informationsextraktion aus Stellenanzeigen im Internet ER -