Linked Data and PageRank based classification

Michal Nykl
IADIS International Conference Theory and Practice in Modern Computing 2013 (part of MCCSIS 2013) (2013)
Semantic analysis | Document classification


Authors: Michal Nykl, Karel Ježek, Martin Dostal, Dalibor Fiala. In this article, we present a new approach to classification based on Linked Data and PageRank. Our research focuses on classification methods enhanced by semantic information, which can be obtained from an ontology or from Linked Data; in our case, DBpedia served as the source of Linked Data. The feature selection method is semantically based, so the resulting features are human-readable and can be understood by non-expert users. PageRank is used during the feature selection and generation phase to expand basic features into more general representatives. Feature selection and processing are thus based on network relations obtained from Linked Data. The resulting features can be used by standard classification algorithms. We present promising preliminary results that show this approach is easily applicable to different datasets.
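
The abstract does not say which PageRank variant is used or how the Linked Data graph is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a personalized PageRank that restarts at the basic features, and a tiny hand-made graph whose edges point from a resource to a broader concept. The function name, the node names, and the edges are all hypothetical and are not taken from DBpedia or from the paper.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank that restarts at the seed nodes (assumed variant)."""
    nodes = sorted({n for edge in edges for n in edge} | set(seeds))
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                # Split this node's damped rank evenly among its out-links.
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: hand its damped rank back to the seeds.
                for s in nodes:
                    new_rank[s] += damping * rank[n] * restart[s]
        rank = new_rank
    return rank


# Hypothetical toy graph: each edge points from a resource to a broader concept.
edges = [
    ("PageRank", "Link_analysis"),
    ("HITS_algorithm", "Link_analysis"),
    ("Link_analysis", "Network_theory"),
    ("Naive_Bayes_classifier", "Statistical_classification"),
]
seeds = {"PageRank", "HITS_algorithm"}  # basic features found in a document

scores = personalized_pagerank(edges, seeds)
# Candidate "more general representatives": the best-ranked non-seed nodes.
expanded = sorted(
    (n for n in scores if n not in seeds), key=scores.get, reverse=True
)[:2]
print(expanded)
```

In this sketch, concepts reachable from the basic features accumulate rank, while concepts unrelated to them receive none, so the top-ranked non-seed nodes can serve as generalized, human-readable features for a standard classifier.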

