TY - CHAP AB - Linked knowledge graphs build the backbone of many data-driven applications such as search engines, conversational agents and e-commerce solutions. Declarative link discovery frameworks use complex link specifications to express the conditions under which a link between two resources can be deemed to exist. However, understanding such complex link specifications is a challenging task for non-expert users of link discovery frameworks. In this paper, we address this drawback by devising NMV-LS, a language model-based verbalization approach for translating complex link specifications into natural language. NMV-LS relies on the results of rule-based link specification verbalization to apply continuous training on T5, a large language model based on the Transformer architecture. We evaluated NMV-LS on English and German datasets using well-known machine translation metrics such as BLEU, METEOR, ChrF++ and TER. Our results suggest that our approach achieves a verbalization performance close to that of humans and outperforms state-of-the-art approaches. Our source code and datasets are publicly available at https://github.com/dice-group/NMV-LS. AU - Ahmed, Abdullah Fathi Ahmed AU - Firmansyah, Asep Fajar AU - Sherif, Mohamed AU - Moussallem, Diego AU - Ngonga Ngomo, Axel-Cyrille ID - 46516 SN - 0302-9743 T2 - Natural Language Processing and Information Systems TI - Explainable Integration of Knowledge Graphs Using Large Language Models ER - TY - CONF AU - Vogelsang, Christoph AU - Janzen, Thomas AU - Meier, Jana AU - Wotschel, Philipp ID - 46522 TI - „Die Prüfungen werden mich sicherlich nicht zu einer besseren Lehrkraft machen.“ Wie beurteilen Studierende Prüfungen und Feedback im Lehramtsstudium? ER - TY - CONF AU - Wotschel, Philipp AU - Vogelsang, Christoph AU - Janzen, Thomas AU - Meier, Jana ID - 46524 TI - Als Lehrkraft gut beraten? 
Entwicklung und Erprobung eines handlungsnahen Prüfungsformates zur Erfassung von Beratungskompetenz von Lehramtsstudierenden ER - TY - CONF AB - Purpose: This study addresses the limitations of current short abstracts of DBpedia entities, which often lack a comprehensive overview due to their creation method (i.e., selecting the first two to three sentences from the full DBpedia abstracts). Methodology: We leverage pre-trained language models to generate abstractive summaries of DBpedia abstracts in six languages (English, French, German, Italian, Spanish, and Dutch). We performed several experiments to assess the quality of the summaries generated by language models. In particular, we evaluated the generated summaries using human judgments and automated metrics (Self-ROUGE and BERTScore). Additionally, we studied the correlation between human judgments and automated metrics in evaluating the generated summaries under different aspects: informativeness, coherence, conciseness, and fluency. Findings: Pre-trained language models generate summaries that are more concise and informative than existing short abstracts. Specifically, BART-based models effectively overcome the limitations of DBpedia short abstracts, especially for longer ones. Moreover, we show that BERTScore and ROUGE-1 are reliable metrics for assessing the informativeness and coherence of the generated summaries with respect to the full DBpedia abstracts. We also find a negative correlation between conciseness and human ratings. Furthermore, fluency evaluation remains challenging without human judgment. Value: This study has significant implications for various applications in machine learning and natural language processing that rely on DBpedia resources. 
By providing succinct and comprehensive summaries, our approach enhances the quality of DBpedia abstracts and contributes to the semantic web community. AU - Zahera, Hamada Mohamed Abdelsamee AU - Vitiugin, Fedor AU - Sherif, Mohamed AU - Castillo, Carlos AU - Ngonga Ngomo, Axel-Cyrille ID - 46518 KW - dice enexa kiam ngonga porque sherif zahera T2 - SEMANTiCS TI - Using Pre-trained Language Models for Abstractive DBpedia Summarization: A Comparative Study ER - TY - DATA AB - Graffiti is an urban phenomenon that is increasingly attracting the interest of the sciences. To the best of our knowledge, no suitable data corpora have been available for systematic research until now. The Information System Graffiti in Germany project (Ingrid) closes this gap by dealing with graffiti image collections that have been made available to the project for public use. Within Ingrid, the graffiti images are collected, digitized and annotated. With this work, we aim to support rapid access to a comprehensive data source on Ingrid, targeted especially at researchers. In particular, we present IngridKG, an RDF knowledge graph of annotated graffiti that abides by the Linked Data and FAIR principles. We update IngridKG weekly by adding newly annotated graffiti to our knowledge graph. Our generation pipeline applies RDF data conversion, link discovery and data fusion approaches to the original data. The current version of IngridKG contains 460,640,154 triples and is linked to 3 other knowledge graphs by over 200,000 links. In our use case studies, we demonstrate the usefulness of our knowledge graph for different applications. AU - Sherif, Mohamed AU - Morim da Silva, Ana Alexandra AU - Pestryakova, Svetlana AU - Ahmed, Abdullah Fathi Ahmed AU - Niemann, Sven AU - Ngonga Ngomo, Axel-Cyrille ID - 45558 TI - IngridKG: A FAIR Knowledge Graph of Graffiti ER - TY - JOUR AB - Multiprotein adsorption from complex body fluids represents a highly important and complicated phenomenon in medicine. 
In this work, multiprotein adsorption from diluted human serum at gold and oxidized iron surfaces is investigated at different serum concentrations and pH values. Adsorption-induced changes in surface topography and the total amount of adsorbed proteins are quantified by atomic force microscopy (AFM) and polarization-modulation infrared reflection absorption spectroscopy (PM-IRRAS), respectively. For both surfaces, stronger protein adsorption is observed at pH 6 compared to pH 7 and pH 8. PM-IRRAS furthermore provides some qualitative insights into the pH-dependent alterations in the composition of the adsorbed multiprotein films. Changes in the amide II/amide I band area ratio and in particular side-chain IR absorption suggest that the increased adsorption at pH 6 is accompanied by a change in protein film composition. Presumably, this is mostly driven by the adsorption of human serum albumin, which at pH 6 adsorbs more readily and thereby replaces other proteins with lower surface affinities in the resulting multiprotein film. AU - Huang, Jingyuan AU - Qiu, Yunshu AU - Lücke, Felix AU - Su, Jiangling AU - Grundmeier, Guido AU - Keller, Adrian ID - 46542 IS - 16 JF - Molecules KW - Chemistry (miscellaneous) KW - Analytical Chemistry KW - Organic Chemistry KW - Physical and Theoretical Chemistry KW - Molecular Medicine KW - Drug Discovery KW - Pharmaceutical Science SN - 1420-3049 TI - Multiprotein Adsorption from Human Serum at Gold and Oxidized Iron Surfaces Studied by Atomic Force Microscopy and Polarization-Modulation Infrared Reflection Absorption Spectroscopy VL - 28 ER - TY - JOUR AB - The influence of nanoscale surface topography on protein adsorption is highly important for numerous applications in medicine and technology. Herein, ferritin adsorption at flat and nanofaceted, single-crystalline Al2O3 surfaces is investigated using atomic force microscopy and X-ray photoelectron spectroscopy. 
The nanofaceted surfaces are generated by the thermal annealing of Al2O3 wafers at temperatures above 1000 °C, which leads to the formation of faceted saw-tooth-like surface topographies with periodicities of about 160 nm and amplitudes of about 15 nm. Ferritin adsorption at these nanofaceted surfaces is notably suppressed compared to the flat surface at a concentration of 10 mg/mL, which is attributed to lower adsorption affinities of the newly formed facets. Consequently, adsorption is restricted mostly to the pattern grooves, where the proteins can maximize their contact area with the surface. However, this effect depends on the protein concentration, with an inverse trend being observed at 30 mg/mL. Furthermore, different ferritin adsorption behavior is observed at topographically similar nanofacet patterns fabricated at different annealing temperatures and attributed to different step and kink densities. These results demonstrate that while protein adsorption at solid surfaces can be notably affected by nanofacet patterns, fine-tuning protein adsorption in this way requires the precise control of facet properties. AU - Pothineni, Bhanu K. AU - Kollmann, Sabrina AU - Li, Xinyang AU - Grundmeier, Guido AU - Erb, Denise J. 
AU - Keller, Adrian ID - 46543 IS - 16 JF - International Journal of Molecular Sciences KW - Inorganic Chemistry KW - Organic Chemistry KW - Physical and Theoretical Chemistry KW - Computer Science Applications KW - Spectroscopy KW - Molecular Biology KW - General Medicine KW - Catalysis SN - 1422-0067 TI - Adsorption of Ferritin at Nanofaceted Al2O3 Surfaces VL - 24 ER - TY - CONF AU - Kouagou, N’Dah Jean AU - Heindorf, Stefan AU - Demir, Caglar AU - Ngomo, Axel-Cyrille Ngonga ID - 46459 T2 - NeSy TI - Neural Class Expression Synthesis (Extended Abstract) VL - 3432 ER - TY - CHAP AB - Indonesian is classified as underrepresented in the Natural Language Processing (NLP) field, despite being the tenth most spoken language in the world with 198 million speakers. The paucity of datasets is recognized as the main reason for the slow advancements in NLP research for underrepresented languages. Significant attempts were made in 2020 to address this drawback for Indonesian. The Indonesian Natural Language Understanding (IndoNLU) benchmark was introduced alongside the IndoBERT pre-trained language model. The second benchmark, Indonesian Language Evaluation Montage (IndoLEM), was presented in the same year. These benchmarks support several tasks, including Named Entity Recognition (NER). However, all NER datasets are in the public domain and do not contain domain-specific datasets. To alleviate this drawback, we introduce IndQNER, a manually annotated NER benchmark dataset in the religious domain that adheres to a meticulously designed annotation guideline. Since Indonesia has the world’s largest Muslim population, we build the dataset from the Indonesian translation of the Quran. The dataset includes 2475 named entities representing 18 different classes. To assess the annotation quality of IndQNER, we perform experiments with BiLSTM and CRF-based NER, as well as IndoBERT fine-tuning. The results reveal that the first model outperforms the second model, achieving 0.98 F1 points. 
This outcome indicates that IndQNER may be an acceptable evaluation metric for Indonesian NER tasks in the aforementioned domain, widening the research’s domain range. AU - Gusmita, Ria Hari AU - Firmansyah, Asep Fajar AU - Moussallem, Diego AU - Ngonga Ngomo, Axel-Cyrille ID - 46572 SN - 0302-9743 T2 - Natural Language Processing and Information Systems TI - IndQNER: Named Entity Recognition Benchmark Dataset from the Indonesian Translation of the Quran ER - TY - JOUR AB - External visualization (i.e., physically embodied visualization) is central to the teaching and learning of mathematics. As external visualization is an important part of mathematics at all levels of education, it is diverse, and research on external visualization has become a wide and complex field. The aim of this scoping review is to characterize external visualizations in recent mathematics education research in order to develop a common ground and guide future research. A qualitative content analysis of the full texts of 130 studies published between 2018 and 2022 applied a deductive-inductive coding procedure to assess four dimensions: visualization product or process, type of visualization, media, and purpose. The analysis revealed different types of external visualizations including visualizations with physical resemblance ranging from pictorial to abstract visualizations as well as three types of visualizations with structural resemblance: length, area, and relational visualizations. Future research should include measures of visualization products or processes to help explain the demands and affordances that different types of visualizations present to learners and teachers. AU - Schoenherr, Johanna AU - Schukajlow, Stanislaw ID - 46569 JF - ZDM – Mathematics Education KW - General Mathematics KW - Education SN - 1863-9690 TI - Characterizing external visualization in mathematics education research: a scoping review ER -