Explainable Graph Query Answering (Project Group)

Content

This project aims to develop a comprehensive framework for answering complex queries over graph-structured data while incorporating advanced explainability components. Graph Databases (GDBs) offer a powerful way to represent and query interconnected data. Unlike traditional relational databases that store data in tables, GDBs use nodes and edges to represent entities and their relationships. This structure is particularly well-suited for domains where relationships are as important as the data itself, such as social networks, knowledge graphs, and recommendation systems.

Complex queries (CQs) in GDBs go beyond simple lookups and involve traversing multiple relationships to find answers. For example, in a social network, a complex query might be "Find friends of friends who live in a particular city and have a specific interest". These queries often require specialized query languages such as SPARQL to navigate the intricate connections within the graph.
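To make the friends-of-friends example concrete, here is a minimal sketch in plain Python over a toy triple list. The entity names, relation names (friend_of, lives_in, interested_in), and helper functions are all hypothetical and chosen only for illustration; a real graph database would express the same query in its own query language. The sketch treats friend_of as a directed edge, which is a simplifying assumption.

```python
# Toy graph stored as (subject, relation, object) triples.
# All names below are made up for illustration.
TRIPLES = [
    ("alice", "friend_of", "bob"),
    ("alice", "friend_of", "erin"),
    ("bob", "friend_of", "carol"),
    ("bob", "friend_of", "dave"),
    ("erin", "friend_of", "carol"),
    ("carol", "lives_in", "city_a"),
    ("dave", "lives_in", "city_b"),
    ("carol", "interested_in", "chess"),
    ("dave", "interested_in", "chess"),
]


def neighbors(node, relation):
    """Return all objects reachable from `node` via one `relation` edge."""
    return {o for s, r, o in TRIPLES if s == node and r == relation}


def friends_of_friends(person, city, interest):
    """Two-hop traversal followed by two attribute filters."""
    direct = neighbors(person, "friend_of")
    second_hop = set()
    for friend in direct:
        second_hop |= neighbors(friend, "friend_of")
    # Exclude the person and their direct friends from the candidates.
    second_hop -= direct | {person}
    return {
        p
        for p in second_hop
        if city in neighbors(p, "lives_in")
        and interest in neighbors(p, "interested_in")
    }


print(sorted(friends_of_friends("alice", "city_a", "chess")))
```

Even this small query combines two traversal steps with two filters, which is what distinguishes it from a single-edge lookup.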

Traditional GDBs, however, face limitations when dealing with incomplete data and complex reasoning tasks. This is a significant issue, as most real-world graphs are inherently incomplete. Neural Graph Databases (NGDBs) [1] have been proposed as a conceptual framework to address these limitations. NGDBs aim to bring together different technologies, such as graph databases and graph representation learning, into one unified system. By embedding entities and relationships into a latent space, NGDBs can handle incomplete graphs, perform approximate reasoning, and offer a more flexible and efficient querying mechanism. Currently, NGDBs are more of a concept than a fully realized technology. This project will be one of the first attempts to implement a comprehensive NGDB with a strong focus on explainability components.
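As a rough illustration of how embeddings can answer a query whose edge is missing from the graph, the sketch below uses a TransE-style score (head + relation should land near tail). TransE is used here only as a well-known example of an embedding model, not as the method this project will adopt. The 2-dimensional vectors are hand-picked rather than learned, and every name is hypothetical.

```python
import math

# Hand-picked 2-d embeddings for illustration only;
# a real system would learn these from the observed edges.
ENTITY = {
    "carol": [0.0, 0.0],
    "dave": [-0.9, 1.0],
    "city_a": [1.0, 0.0],
    "city_b": [0.0, 1.0],
}
RELATION = {"lives_in": [1.0, 0.0]}


def score(head, relation, tail):
    """TransE-style plausibility: negative Euclidean distance
    between (head + relation) and tail. Higher is more plausible."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, candidates):
    """Order candidate answers from most to least plausible."""
    return sorted(candidates, key=lambda c: score(head, relation, c), reverse=True)


# Suppose the edge (dave, lives_in, ?) is missing from the stored graph.
# The embeddings still yield a ranked list of candidate answers.
print(rank_tails("dave", "lives_in", ["city_a", "city_b"]))
```

The answer is approximate by design: the system returns a ranking by plausibility instead of an exact match against stored edges.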

The explainability components in this framework aim to demystify the process of answering complex graph queries by providing insights into intermediate results. Unlike traditional black-box approaches, the framework enables transparency by revealing key subgraph patterns, intermediate query states, and multi-hop traversal paths that contribute to the final answers. For instance, instead of merely outputting a result, the system can highlight which relationships, entities, or topological features (centrality, betweenness, etc.) were most influential in the reasoning process. By offering interpretable outputs, such as confidence scores, feature attributions, and visualizations of reasoning steps, the framework bridges the gap between the computational complexity of graph-based neural models and the practical need for users to debug, validate, and trust the results.
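One simple form of such an explanation is to return, alongside each answer, the traversal paths that led to it. The sketch below does this for a two-hop query over a toy adjacency dictionary. The data, the function name, and the use of the path count as a crude indicator of support are assumptions made for illustration, not part of the framework's design.

```python
from collections import defaultdict

# Toy directed friendship graph; names are made up.
FRIEND_OF = {
    "alice": ["bob", "erin"],
    "bob": ["carol", "dave"],
    "erin": ["carol"],
}


def two_hop_with_paths(start):
    """Return each two-hop answer together with the paths that support it."""
    paths = defaultdict(list)
    for mid in FRIEND_OF.get(start, []):
        for end in FRIEND_OF.get(mid, []):
            if end != start:
                # Record the full path so the intermediate node stays visible.
                paths[end].append((start, mid, end))
    return dict(paths)


explanations = two_hop_with_paths("alice")
for answer, supporting in sorted(explanations.items()):
    print(answer, "supported by", len(supporting), "path(s):", supporting)
```

Keeping the intermediate node in each path lets a user see why an answer was returned, and the number of supporting paths gives a first, very rough sense of how strongly the graph backs it.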

This framework will define a structured approach for complex query answering over graphs, encompassing a standardized methodology for dataset creation and loading, modeling, evaluation, explanation, and visualization. This will make the framework an invaluable tool for researchers and practitioners dealing with graph-structured data.

Benefits

This project offers students the opportunity to gain hard skills in areas such as graph database technologies (e.g., Neo4j), graph representation learning (e.g., PyTorch Geometric), explainability techniques, and visualization tools. You will also develop soft skills including problem-solving, collaboration, communication, and critical thinking.

You will gain practical experience in a team-based coding project, using Git for version control. Additionally, the project involves reading scientific papers and writing technical reports in LaTeX, which will prepare you for your master's thesis by strengthening your academic research and writing skills.

Given the framework’s modularity and the inclusion of diverse components, students with various interests and skill sets can find meaningful opportunities to contribute. Whether your passion lies in programming, algorithm design, data modeling, or machine learning, this project welcomes participants from both Computer Engineering and Computer Science backgrounds.

Requirements

Strong programming skills in Python
A habit of documenting and commenting code for clarity and collaboration
Willingness to develop skills in reading scientific papers and writing technical reports
Openness to learning new concepts (e.g., Graph ML, Graph DB, PyTorch, etc.) in an interdisciplinary project

Related Work

[1] Hongyu Ren, Mikhail Galkin, Michael Cochez, Zhaocheng Zhu, & Jure Leskovec (2023). Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases. CoRR, abs/2303.14617

Contact

Slides

You can access the slides from the introductory presentation using this link.