
DEER: RDF Data Extraction and Enrichment Framework

21.08.2017

A contribution from Dr. Mohamed Sherif

RDF enrichment with DEER

Linked Data enrichment is the process of adding, altering or deleting triples in a source dataset in order to obtain an enriched version of that dataset. The enriched dataset usually provides significant benefits for a specific use case. These benefits include (but are not limited to) more data (quantity), better data quality, better data organization (a refined ontology) and interoperability with other datasets (interlinking). Over the past few years, several frameworks for RDF data enrichment have been developed. Such frameworks provide enrichment methods such as entity recognition, link discovery and schema enrichment.
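To make the definition concrete, the following minimal Python sketch models a dataset as a set of (subject, predicate, object) triples and treats enrichment as a combination of deletions and additions. It is an illustration only and does not use DEER's API: the function name `enrich`, the `ex:` and `ex2:` identifiers and the sample data are all assumptions made for this example.

```python
from typing import Set, Tuple

# A triple is modelled as a plain (subject, predicate, object) tuple.
# This is a simplification for illustration, not an RDF library.
Triple = Tuple[str, str, str]


def enrich(source: Set[Triple],
           additions: Set[Triple],
           deletions: Set[Triple]) -> Set[Triple]:
    """Return an enriched copy of `source`: drop `deletions`, then add `additions`."""
    return (source - deletions) | additions


# Hypothetical source dataset with a misspelled label.
source = {
    ("ex:cafe1", "rdf:type", "ex:POI"),
    ("ex:cafe1", "rdfs:label", "Cafe Centrl"),
}

# Altering a triple is expressed as deleting the old one and adding the new one.
deletions = {("ex:cafe1", "rdfs:label", "Cafe Centrl")}
additions = {
    ("ex:cafe1", "rdfs:label", "Cafe Central"),      # quality: corrected label
    ("ex:cafe1", "owl:sameAs", "ex2:central_cafe"),  # interlinking with another dataset
}

enriched = enrich(source, additions, deletions)
for triple in sorted(enriched):
    print(triple)
```

The source set is left untouched, so the original and the enriched version can be compared afterwards.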

POI Enrichment

During the SLIPO project (http://slipo.eu), we will adapt the DEER framework (http://aksw.org/projects/deer) to handle the enrichment of Point of Interest (POI) data effectively. DEER was designed as a modular framework that can easily be extended and repurposed. Therefore, in addition to using the enrichment functions already available in DEER, we intend to extend them by implementing POI-related enrichment functions. These include retrieving the location of a POI from a third-party geolocation service, determining the validity of a POI at a particular time based on a given time stamp, and grouping POIs into areas of interest. Currently, DEER implements a supervised machine learning approach for generating the sequence of enrichment functions and operators that must be applied to the input dataset. This supervised approach is currently limited to a single input dataset and a single enriched output dataset. In SLIPO, we will extend it to accept a set of input datasets and generate a set of output datasets. We also plan to apply unsupervised or weakly supervised approaches for automatically detecting enrichment configurations for POI data, with a focus on its geospatial and temporal dimensions. Our approaches will aim not only to enrich POI data, but also to provide the enriched data in a format suitable for industrial consumption.
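Two of the POI-related enrichment functions named above, the validity check and the grouping into areas of interest, could look roughly like the Python sketch below. It is not DEER code. The `POI` class, its validity fields, the function names and the grid-based grouping are all assumptions chosen for illustration, and the geolocation lookup is left out because it depends on an external service.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class POI:
    """Hypothetical POI record; the field names are assumptions."""
    name: str
    lat: float
    lon: float
    valid_from: Optional[datetime] = None   # None means "no lower bound"
    valid_until: Optional[datetime] = None  # None means "no upper bound"


def is_valid_at(poi: POI, ts: datetime) -> bool:
    """Check whether `ts` falls inside the POI's validity interval (inclusive)."""
    if poi.valid_from is not None and ts < poi.valid_from:
        return False
    if poi.valid_until is not None and ts > poi.valid_until:
        return False
    return True


def group_into_areas(pois: List[POI],
                     cell_deg: float = 0.01) -> Dict[Tuple[int, int], List[POI]]:
    """Group POIs by the grid cell that contains them.

    A fixed latitude/longitude grid is a simple stand-in for a real
    area-of-interest method such as spatial clustering.
    """
    areas: Dict[Tuple[int, int], List[POI]] = defaultdict(list)
    for poi in pois:
        cell = (math.floor(poi.lat / cell_deg), math.floor(poi.lon / cell_deg))
        areas[cell].append(poi)
    return dict(areas)


# Hypothetical sample data.
pois = [
    POI("Market stall", 51.3397, 12.3731,
        valid_from=datetime(2017, 6, 1), valid_until=datetime(2017, 9, 30)),
    POI("Bakery", 51.3399, 12.3735),
    POI("Museum", 52.5200, 13.4050),
]

when = datetime(2017, 8, 21)
valid_now = [p.name for p in pois if is_valid_at(p, when)]
areas = group_into_areas(pois)
print(valid_now)
print({cell: [p.name for p in members] for cell, members in areas.items()})
```

With the default cell size of 0.01 degrees, the first two sample POIs fall into the same cell and the third into a different one. A production implementation would more likely work with projected coordinates or a clustering algorithm than with a raw degree grid.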