mikemacmarketing (https://commons.wikimedia.org/wiki/File:Artificial_Intelligence_&_AI_&_Machine_Learning_-_30212411048.jpg), "Artificial Intelligence & AI & Machine Learning", cropped, https://creativecommons.org/licenses/by/2.0/legalcode,
This image was originally posted to Flickr by mikemacmarketing at https://flickr.com/photos/152824664@N07/30212411048. 
Image via www.vpnsrus.com


Publications



2023

Counterfactual Explanations for Concepts in ELH

L.N. Sieger, S. Heindorf, L. Blübaum, A. Ngonga Ngomo, in: arXiv:2301.05109, 2023

Knowledge bases are widely used for information management on the web, enabling high-impact applications such as web search, question answering, and natural language processing. They also serve as the backbone for automatic decision systems, e.g. for medical diagnostics and credit scoring. As stakeholders affected by these decisions would like to understand their situation and verify fair decisions, a number of explanation approaches have been proposed using concepts in description logics. However, the learned concepts can become long and difficult to fathom for non-experts, even when verbalized. Moreover, long concepts do not immediately provide a clear path of action to change one's situation. Counterfactuals answering the question "How must feature values be changed to obtain a different classification?" have been proposed as short, human-friendly explanations for tabular data. In this paper, we transfer the notion of counterfactuals to description logics and propose the first algorithm for generating counterfactual explanations in the description logic $\mathcal{ELH}$. Counterfactual candidates are generated from concepts and the candidates with fewest feature changes are selected as counterfactuals. In case of multiple counterfactuals, we rank them according to the likeliness of their feature combinations. For evaluation, we conduct a user survey to investigate which of the generated counterfactual candidates are preferred for explanation by participants. In a second study, we explore possible use cases for counterfactual explanations.
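
The selection step described in this abstract (keep the candidates with the fewest feature changes) can be illustrated on plain tabular data. The following is a minimal sketch, not the paper's algorithm for $\mathcal{ELH}$: the names `num_changes` and `select_counterfactuals` and the example features are hypothetical, and every candidate is assumed to already receive the desired classification.

```python
from typing import Dict, Hashable, List

# Hypothetical representation: an instance maps feature names to values.
Instance = Dict[str, Hashable]


def num_changes(original: Instance, candidate: Instance) -> int:
    """Count the features whose value differs between the two instances."""
    keys = set(original) | set(candidate)
    return sum(1 for k in keys if original.get(k) != candidate.get(k))


def select_counterfactuals(original: Instance,
                           candidates: List[Instance]) -> List[Instance]:
    """Return all candidates that need the fewest feature changes.

    Assumption: each candidate already flips the classification.
    """
    if not candidates:
        return []
    fewest = min(num_changes(original, c) for c in candidates)
    return [c for c in candidates if num_changes(original, c) == fewest]


# Made-up example data for illustration only.
applicant = {"income": "low", "employed": False, "owns_home": False}
candidates = [
    {"income": "high", "employed": True, "owns_home": False},  # 2 changes
    {"income": "low", "employed": True, "owns_home": False},   # 1 change
    {"income": "low", "employed": False, "owns_home": True},   # 1 change
]
best = select_counterfactuals(applicant, candidates)
```

Here two candidates tie with one change each, which is the situation in which the abstract's additional ranking by likeliness of feature combinations would apply.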


Neural Class Expression Synthesis

N.J. Kouagou, S. Heindorf, C. Demir, A. Ngonga Ngomo, in: Extended Semantic Web Conference (ESWC), 2023

Most existing approaches for class expression learning in description logics are search algorithms. As the search space of these approaches is infinite, they often fail to scale to large learning problems. Our main intuition is that class expression learning can be regarded as a translation problem. Based thereupon, we propose a new family of class expression learning approaches which we dub neural class expression synthesis. Instances of this new family circumvent the high search costs entailed by current algorithms by translating training examples into class expressions in a fashion akin to machine translation solutions. Consequently, they are not subject to the runtime limitations of search-based approaches post training. We study three instances of this novel family of approaches to synthesize class expressions from sets of positive and negative examples. An evaluation of our approach on four benchmark datasets suggests that it can effectively synthesize high-quality class expressions with respect to the input examples in approximately one second on average. Moreover, a comparison to other state-of-the-art approaches suggests that we achieve better F-measures on large datasets. For reproducibility purposes, we provide our implementation as well as pretrained models in our public GitHub repository at https://github.com/dice-group/NeuralClassExpressionSynthesis


2022

AI-Based Assistance System for Manufacturing

S. Deppe, L. Brandt, M. Brünninghaus, J. Papenkordt, S. Heindorf, G. Tschirner-Vinke, 2022

Manufacturing companies are challenged to make the increasingly complex work processes equally manageable for all employees to prevent an impending loss of competence. In this contribution, an intelligent assistance system is proposed enabling employees to help themselves in the workplace and provide them with competence-related support. This results in increasing the short- and long-term efficiency of problem solving in companies.


Tab2Onto: Unsupervised Semantification with Knowledge Graph Embeddings

H.M.A. Zahera, S. Heindorf, S. Balke, J. Haupt, M. Voigt, C. Walter, F. Witter, A. Ngonga Ngomo, in: The Semantic Web: ESWC 2022 Satellite Events, Springer International Publishing, 2022



EvoLearner: Learning Description Logics with Evolutionary Algorithms

S. Heindorf, L. Blübaum, N. Düsterhus, T. Werner, V.N. Golani, C. Demir, A. Ngonga Ngomo, in: WWW, ACM, 2022, pp. 818-828

Classifying nodes in knowledge graphs is an important task, e.g., predicting missing types of entities, predicting which molecules cause cancer, or predicting which drugs are promising treatment candidates. While black-box models often achieve high predictive performance, they are only post-hoc and locally explainable and do not allow the learned model to be easily enriched with domain knowledge. Towards this end, learning description logic concepts from positive and negative examples has been proposed. However, learning such concepts often takes a long time and state-of-the-art approaches provide limited support for literal data values, although they are crucial for many applications. In this paper, we propose EvoLearner, an evolutionary approach to learn ALCQ(D), which is the attributive language with complement (ALC) paired with qualified cardinality restrictions (Q) and data properties (D). We contribute a novel initialization method for the initial population: starting from positive examples (nodes in the knowledge graph), we perform biased random walks and translate them to description logic concepts. Moreover, we improve support for data properties by maximizing information gain when deciding where to split the data. We show that our approach significantly outperforms the state of the art on the benchmarking framework SML-Bench for structured machine learning. Our ablation study confirms that this is due to our novel initialization method and support for data properties.
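
The idea of choosing a split point for a numeric data property by maximizing information gain can be sketched as follows. This is a generic, textbook-style illustration and not EvoLearner's implementation: the function names, the use of midpoints between distinct values as candidate thresholds, and the example data are all assumptions.

```python
import math
from typing import List, Optional


def entropy(labels: List[bool]) -> float:
    """Binary entropy (in bits) of a list of positive/negative labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(values: List[float], labels: List[bool],
                     threshold: float) -> float:
    """Entropy reduction obtained by splitting at `value <= threshold`."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_threshold(values: List[float], labels: List[bool]) -> Optional[float]:
    """Pick the midpoint between neighbouring distinct values with maximal gain."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not midpoints:
        return None  # a single distinct value cannot be split
    return max(midpoints, key=lambda t: information_gain(values, labels, t))


# Made-up literal values of one data property and the example labels.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [False, False, False, True, True, True]
split = best_threshold(values, labels)
```

In this toy example the positives and negatives are perfectly separable, so the best split recovers the full one bit of entropy.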


COVIDPUBGRAPH: A FAIR Knowledge Graph of COVID-19 Publications

S. Pestryakova, D. Vollmers, M. Sherif, S. Heindorf, M. Saleem, D. Moussallem, A. Ngonga Ngomo, Scientific Data (2022)


CausalQA: A Benchmark for Causal Question Answering

A. Bondarenko, M. Wolska, S. Heindorf, L. Blübaum, A. Ngonga Ngomo, B. Stein, P. Braslavski, M. Hagen, M. Potthast, in: Proceedings of the 29th International Conference on Computational Linguistics, International Committee on Computational Linguistics, 2022, pp. 3296–3308

At least 5% of questions submitted to search engines ask about cause-effect relationships in some way. To support the development of tailored approaches that can answer such questions, we construct Webis-CausalQA-22, a benchmark corpus of 1.1 million causal questions with answers. We distinguish different types of causal questions using a novel typology derived from a data-driven, manual analysis of questions from ten large question answering (QA) datasets. Using high-precision lexical rules, we extract causal questions of each type from these datasets to create our corpus. As an initial baseline, the state-of-the-art QA model UnifiedQA achieves a ROUGE-L F1 score of 0.48 on our new benchmark.
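
For readers unfamiliar with the metric named in this abstract, ROUGE-L F1 is based on the longest common subsequence (LCS) between a generated answer and a reference answer. The sketch below is a simplified illustration that assumes lowercased whitespace tokenization and equal weighting of precision and recall; it is not the evaluation code used for the benchmark, and established ROUGE implementations may tokenize and weight differently.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, via row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 on whitespace tokens (a simplifying assumption)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up question-answering example.
score = rouge_l_f1("smoking causes lung cancer", "smoking can cause lung cancer")
```

The LCS here is "smoking lung cancer" (3 tokens), giving a precision of 3/4 and a recall of 3/5.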


User Involvement in Training Smart Home Agents

L.N. Sieger, J. Hermann, A. Schomäcker, S. Heindorf, C. Meske, C. Hey, A. Doğangün, in: International Conference on Human-Agent Interaction, ACM, 2022

Smart home systems contain plenty of features that enhance wellbeing in everyday life through artificial intelligence (AI). However, many users feel insecure because they do not understand the AI’s functionality and do not feel they are in control of it. Combining technical, psychological and philosophical views on AI, we rethink smart homes as interactive systems where users can partake in an intelligent agent’s learning. Parallel to the goals of explainable AI (XAI), we explored the possibility of user involvement in supervised learning of the smart home to have a first approach to improve acceptance, support subjective understanding and increase perceived control. In this work, we conducted two studies: In an online pre-study, we asked participants about their attitude towards teaching AI via a questionnaire. In the main study, we performed a Wizard of Oz laboratory experiment with human participants, where participants spent time in a prototypical smart home and taught activity recognition to the intelligent agent through supervised learning based on the user’s behaviour. We found that involvement in the AI’s learning phase enhanced the users’ feeling of control, perceived understanding and perceived usefulness of AI in general. The participants reported positive attitudes towards training a smart home AI and found the process understandable and controllable. We suggest that involving the user in the learning phase could lead to better personalisation and increased understanding and control by users of intelligent agents for smart home automation.


2021

ASSET: A Semi-supervised Approach for Entity Typing in Knowledge Graphs

H.M.A. Zahera, S. Heindorf, A. Ngonga Ngomo, in: Proceedings of the 11th on Knowledge Capture Conference, ACM, 2021


Drift Detection in Text Data with Document Embeddings

R. Feldhans, A. Wilke, S. Heindorf, M.H. Shaker, B. Hammer, A. Ngonga Ngomo, E. Hüllermeier, in: Intelligent Data Engineering and Automated Learning – IDEAL 2021, Springer International Publishing, 2021


Convolutional Hypercomplex Embeddings for Link Prediction

C. Demir, D. Moussallem, S. Heindorf, A. Ngonga Ngomo, in: The 13th Asian Conference on Machine Learning, ACML 2021, 2021

Knowledge graph embedding research has mainly focused on the two smallest normed division algebras, $\mathbb{R}$ and $\mathbb{C}$. Recent results suggest that trilinear products of quaternion-valued embeddings can be a more effective means to tackle link prediction. In addition, models based on convolutions on real-valued embeddings often yield state-of-the-art results for link prediction. In this paper, we investigate a composition of convolution operations with hypercomplex multiplications. We propose the four approaches QMult, OMult, ConvQ and ConvO to tackle the link prediction problem. QMult and OMult can be considered as quaternion and octonion extensions of previous state-of-the-art approaches, including DistMult and ComplEx. ConvQ and ConvO build upon QMult and OMult by including convolution operations in a way inspired by the residual learning framework. We evaluated our approaches on seven link prediction datasets including WN18RR, FB15K-237 and YAGO3-10. Experimental results suggest that the benefits of learning hypercomplex-valued vector representations become more apparent as the size and complexity of the knowledge graph grows. ConvO outperforms state-of-the-art approaches on FB15K-237 in MRR, Hit@1 and Hit@3, while QMult, OMult, ConvQ and ConvO outperform state-of-the-art approaches on YAGO3-10 in all metrics. Results also suggest that link prediction performances can be further improved via prediction averaging. To foster reproducible research, we provide an open-source implementation of approaches, including training and evaluation scripts as well as pretrained models.
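
The quaternion multiplication that such models rely on is the Hamilton product. The sketch below shows it together with a simple multiplicative triple score (Hamilton product of head and relation, followed by an inner product with the tail). It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, embeddings are plain lists of 4-tuples, and details such as relation normalization, convolutions and training are omitted.

```python
from typing import List, Tuple

# A quaternion a + b*i + c*j + d*k stored as (a, b, c, d).
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Non-commutative product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def triple_score(head: List[Quaternion], relation: List[Quaternion],
                 tail: List[Quaternion]) -> float:
    """Sum over dimensions of <head * relation, tail> (assumed scoring form)."""
    total = 0.0
    for h, r, t in zip(head, relation, tail):
        hr = hamilton_product(h, r)
        total += sum(x * y for x, y in zip(hr, t))
    return total


# Basis quaternions: i * j = k, whereas j * i = -k.
i = (0.0, 1.0, 0.0, 0.0)
j = (0.0, 0.0, 1.0, 0.0)
k = (0.0, 0.0, 0.0, 1.0)
score = triple_score([i], [j], [k])
```

Because the product is not commutative, swapping head and relation changes the score, which is one reason hypercomplex embeddings can model asymmetric relations.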



Automatically generating instructions from tutorials for search and user navigation

S. Heindorf, Patent 10936684, 2021


2020

CauseNet: Towards a Causality Graph Extracted from the Web

S. Heindorf, Y. Scholten, H. Wachsmuth, A. Ngonga Ngomo, M. Potthast, in: Proceedings of the 29th ACM International Conference on Information and Knowledge Management (CIKM 2020), 2020, pp. 3023-3030


2019


Vandalism Detection in Crowdsourced Knowledge Bases

S. Heindorf, Universität Paderborn, 2019



2018

Semantic Data Mediator: Linking Services to Websites

D. Wolters, S. Heindorf, J. Kirchhoff, G. Engels, in: Service-Oriented Computing – ICSOC 2017 Workshops, Springer International Publishing, 2018, pp. 388-392

Many websites offer links to social media sites for convenient content sharing. Unfortunately, those sharing capabilities are quite restricted and it is seldom possible to share content with other services, like those provided by a user's favorite applications or smart devices. In this paper, we present Semantic Data Mediator (SDM), a flexible middleware linking a vast number of services to millions of websites. Based on reusable repositories of service descriptions defined by the crowd, users can easily fill a personal registry with their favorite services, which can then be linked to websites by SDM. For this, SDM leverages semantic data, which is already available on millions of websites due to search engine optimization. Further support for our approach from website or service developers is not required. To enable the use of a broad range of services, data conversion services are automatically composed by SDM to transform data according to the needs of the different services. In addition to linking web services, various service adapters allow services of applications and smart devices to be linked as well. We have fully implemented our approach and present a real-world case study demonstrating its feasibility and usefulness.


2017


Linking Services to Websites by Leveraging Semantic Data

D. Wolters, S. Heindorf, J. Kirchhoff, G. Engels, in: 2017 IEEE International Conference on Web Services (ICWS), IEEE, 2017

Websites increasingly embed semantic data for search engine optimization. The most common ontology for semantic data, schema.org, is supported by all major search engines and describes over 500 data types, including calendar events, recipes, products, and TV shows. As of today, users wishing to pass this data to their favorite applications, e.g., their calendars, cookbooks, price comparison applications or even smart devices such as TV receivers, rely on cumbersome and error-prone workarounds such as reentering the data or a series of copy and paste operations. In this paper, we present Semantic Data Mediator (SDM), an approach that allows the easy transfer of semantic data to a multitude of services, ranging from web services to applications installed on different devices. SDM extracts semantic data from the currently displayed web page on the client-side, offers suitable services to the user, and by the press of a button, forwards this data to the desired service while doing all the necessary data conversion and service interface adaptation in between. To realize this, we built a reusable repository of service descriptions, data converters, and service adapters, which can be extended by the crowd. Our approach for linking services to websites relies solely on semantic data and does not require any additional support by either website or service developers. We have fully implemented our approach and present a real-world case study demonstrating its feasibility and usefulness.


Overview of the Wikidata Vandalism Detection Task at WSDM Cup 2017

S. Heindorf, M. Potthast, G. Engels, B. Stein, in: WSDM Cup 2017 Notebook Papers, 2017

We report on the Wikidata vandalism detection task at the WSDM Cup 2017. The task received five submissions; this paper describes their evaluation and compares them to state-of-the-art baselines. Unlike previous work, we recast Wikidata vandalism detection as an online learning problem, requiring participant software to predict vandalism in near real-time. The best-performing approach achieves a ROC-AUC of 0.947 at a PR-AUC of 0.458. In particular, this task was organized as a software submission task: to maximize reproducibility as well as to foster future research and development on this task, the participants were asked to submit their working software to the TIRA experimentation platform along with the source code for open source release.


Proceedings of the WSDM Cup 2017: Vandalism Detection and Triple Scoring

M. Potthast, S. Heindorf, H. Bast, in: arXiv:1712.09528, 2017

The WSDM Cup 2017 was a data mining challenge held in conjunction with the 10th International Conference on Web Search and Data Mining (WSDM). It addressed key challenges of knowledge bases today: quality assurance and entity search. For quality assurance, we tackle the task of vandalism detection, based on a dataset of more than 82 million user-contributed revisions of the Wikidata knowledge base, all of which are annotated with regard to whether or not they are vandalism. For entity search, we tackle the task of triple scoring, using a dataset that comprises relevance scores for triples from type-like relations including occupation and country of citizenship, based on about 10,000 human relevance judgements. For reproducibility's sake, participants were asked to submit their software on TIRA, a cloud-based evaluation platform, and they were incentivized to share their approaches open source.


2016

Vandalism Detection in Wikidata

S. Heindorf, M. Potthast, B. Stein, G. Engels, in: Proceedings of the 25th International Conference on Information and Knowledge Management (CIKM 2016), 2016, pp. 327–336

Wikidata is the new, large-scale knowledge base of the Wikimedia Foundation. Its knowledge is increasingly used within Wikipedia itself and various other kinds of information systems, imposing high demands on its integrity. Wikidata can be edited by anyone and, unfortunately, it frequently gets vandalized, exposing all information systems using it to the risk of spreading vandalized and falsified information. In this paper, we present a new machine learning-based approach to detect vandalism in Wikidata. We propose a set of 47 features that exploit both content and context information, and we report on 4 classifiers of increasing effectiveness tailored to this learning task. Our approach is evaluated on the recently published Wikidata Vandalism Corpus WDVC-2015 and it achieves an area under curve value of the receiver operating characteristic, ROC-AUC, of 0.991. It significantly outperforms the state of the art represented by the rule-based Wikidata Abuse Filter (0.865 ROC-AUC) and a prototypical vandalism detector recently introduced by Wikimedia within the Objective Revision Evaluation Service (0.859 ROC-AUC).
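
The ROC-AUC figures quoted in this abstract can be read as the probability that a randomly chosen vandalism revision receives a higher score than a randomly chosen regular revision. The following sketch computes the metric directly from that pairwise definition. It is an illustration only, not the evaluation code of the paper: the function name and example scores are made up, and the quadratic loop is chosen for clarity rather than efficiency.

```python
from typing import List


def roc_auc(scores: List[float], labels: List[bool]) -> float:
    """ROC-AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up classifier scores; True marks a vandalism revision.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [True, False, True, False]
auc = roc_auc(scores, labels)
```

In the example, three of the four positive/negative pairs are ordered correctly. A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking.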


2012

Optimized XPath evaluation for Schema-compressed XML data

S. Böttcher, R. Hartel, S. Heindorf, in: ADC, Australian Computer Society, 2012, pp. 137-144

