WEKA-LR: A Label Ranking Extension for WEKA



The problem setting of label ranking, which has recently been introduced in machine learning research, is a specific type of preference learning and can be seen as an extension of conventional multi-class classification. In comparison to the latter, there are notable differences regarding the type of training data and the type of models (predictions) produced.

While a classifier is a mapping from instances to class labels, assigning to each instance x one label y among a finite set of candidates Y, a label ranker is a mapping from instances to rankings (total orders) over Y. Thus, given any instance x as an input, a label ranker produces a prediction in the form of a ranking of the complete set of labels Y as an output. Typically, the ranking thus produced is interpreted as a preference relation.

As an example, consider the problem of predicting the preferences of people (e.g., characterized in terms of a feature vector which constitutes the input of the label ranker) regarding the set of music genres {classic, jazz, popular, traditional}. The prediction popular > jazz > classic > traditional suggests that the person for whom the prediction is made likes popular music most, prefers it to jazz, prefers jazz to classic, and in turn prefers classic to traditional music. Note, however, that this is only an example, and that the order relation > need not be interpreted in terms of preference semantics. The prediction of a ranking may also be of interest in other cases. For example, if instances are proteins and labels are small molecules, then y>y’ could mean that y binds more strongly to a protein x than y’ does.
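The music example can be sketched in a few lines of Java. This is only an illustration under simple assumptions: a predicted ranking is represented as a list of all labels ordered from most to least preferred, and the class and method names (LabelRankingSketch, ranksAhead) are hypothetical and not part of WEKA or WEKA-LR.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the WEKA-LR API.
final class LabelRankingSketch {

    // Returns true if label a is ranked ahead of label b.
    // The ranking lists all labels from most to least preferred.
    static boolean ranksAhead(List<String> ranking, String a, String b) {
        int posA = ranking.indexOf(a);
        int posB = ranking.indexOf(b);
        if (posA < 0 || posB < 0) {
            throw new IllegalArgumentException("Both labels must occur in the ranking");
        }
        return posA < posB;
    }

    public static void main(String[] args) {
        // The prediction popular > jazz > classic > traditional from the text.
        List<String> prediction = Arrays.asList("popular", "jazz", "classic", "traditional");
        System.out.println("jazz > traditional: " + ranksAhead(prediction, "jazz", "traditional"));
        System.out.println("classic > popular: " + ranksAhead(prediction, "classic", "popular"));
    }
}
```

Because a ranking is a total order over Y, any pair of labels can be compared this way, whether > is read as a preference or, as in the protein example, as stronger binding.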

Just as a classification algorithm induces a classifier, a label ranking algorithm learns a label ranker from a set of training data. Here, the training data essentially consists of exemplary preference information. In the simplest case, this information is given in the form of pairwise comparisons, i.e., an instance x together with a comparison y>y’ suggesting that, for x as an input, y should be ranked ahead of y’.
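To make the shape of such training data concrete, the following Java sketch stores pairwise comparisons and counts how many of them a given ranking contradicts. It is a minimal sketch under stated assumptions: the names PairwisePreferenceSketch, Preference and countViolations are hypothetical, the feature values are made up for illustration, and nothing here reflects how WEKA-LR or the .xarff format represents preferences internally.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the WEKA-LR API.
final class PairwisePreferenceSketch {

    // One training observation: for the instance x described by 'features',
    // the label 'preferred' should be ranked ahead of the label 'other'.
    static final class Preference {
        final double[] features;
        final String preferred;
        final String other;

        Preference(double[] features, String preferred, String other) {
            this.features = features;
            this.preferred = preferred;
            this.other = other;
        }
    }

    // Counts the comparisons y > y' that the ranking contradicts.
    // Assumes every label mentioned in a comparison occurs in the ranking.
    static int countViolations(List<String> ranking, List<Preference> preferences) {
        int violations = 0;
        for (Preference p : preferences) {
            if (ranking.indexOf(p.preferred) > ranking.indexOf(p.other)) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        double[] x = {34.0, 1.0}; // made-up feature vector for one person
        List<Preference> training = Arrays.asList(
                new Preference(x, "popular", "jazz"),
                new Preference(x, "traditional", "classic"));
        List<String> predicted = Arrays.asList("popular", "jazz", "classic", "traditional");
        System.out.println("Violated comparisons: " + countViolations(predicted, training));
    }
}
```

A count of this kind is one simple way to compare a predicted ranking with observed pairwise comparisons for the same instance; a learning algorithm would typically use its own loss function.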

For a more detailed introduction to label ranking, see the references in the list below.

We have developed an extension of the Java machine learning framework WEKA which is able to handle preference data and includes label ranking algorithms. This extension, called WEKA-LR, can be downloaded here. The essentials of WEKA-LR are described in the accompanying short documentation.

Finally, here are some sample data sets for label ranking, stored in our new data format .xarff, which is an extension of the conventional .arff format of WEKA. 


References:

J. Fürnkranz and E. Hüllermeier.
Preference Learning.
Künstliche Intelligenz, 1/05, pp. 60-61, 2005.
[ a very concise introduction formalizing the settings of label and object ranking, PDF ]

J. Fürnkranz and E. Hüllermeier.
Preference Learning.
Springer-Verlag, Berlin, 2010.
[ our edited book on preference learning, PDF of the introductory chapter ]

E. Hüllermeier, J. Fürnkranz, W. Cheng, and K. Brinker.
Label Ranking by Learning Pairwise Preferences.
Artificial Intelligence 172:1897-1917, 2008.
[ Draft-PDF ]

W. Cheng, J. Hühn, and E. Hüllermeier.
Decision Tree and Instance-Based Learning for Label Ranking.
Proc. ICML-09, International Conference on Machine Learning.
Montreal, Canada, June 2009.
[ PDF ]

W. Cheng, K. Dembczynski and E. Hüllermeier.
Label Ranking Based on the Plackett-Luce Model.
Proc. ICML-2010, International Conference on Machine Learning.
Haifa, Israel, June 2010.
[ PDF ]
