Decision Tree Modeling for Ranking Data
Reference
Philip L. H. Yu, Wai Ming Wan, Paul H. Lee: Decision Tree Modeling for Ranking Data. In: Fürnkranz, J. and Hüllermeier, E.: Preference Learning, 2011, 83-106.
DOI
http://dx.doi.org/10.1007/978-3-642-14125-6_5
Abstract
Ranking/preference data arises from many applications in marketing, psychology, and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of classification and regression tree. The existing splitting criteria are modified in a way that allows them to precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namely g-wise and top-k measures. Theoretical results show that the new measures exhibit properties of impurity functions. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. Experiments are carried out to investigate the predictive performance of the tree model for complete and partially ranked data and promising results are obtained. Finally, a real-world application of the proposed methodology to analyze a set of political rankings data is presented.
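As a rough illustration of what an impurity measure for a set of rankings can look like, the sketch below computes two Gini-style quantities: one over pairwise comparisons and one over the top-ranked item. The function names `pairwise_gini` and `top1_gini`, the input layout, and the exact formulas are assumptions made for this example. They are not taken from the chapter, whose definitions of the g-wise and top-k measures may differ.

```python
from itertools import combinations


def pairwise_gini(rankings):
    """Average Gini impurity over all item pairs (illustrative assumption).

    rankings: list of equal-length tuples; rankings[n][i] is the rank
    (1 = most preferred) that judge n assigns to item i.
    """
    if not rankings:
        raise ValueError("need at least one ranking")
    n_items = len(rankings[0])
    pairs = list(combinations(range(n_items), 2))
    total = 0.0
    for i, j in pairs:
        # Share of judges who rank item i ahead of item j.
        p = sum(r[i] < r[j] for r in rankings) / len(rankings)
        # Two-class Gini impurity for this pair.
        total += 1.0 - p * p - (1.0 - p) ** 2
    return total / len(pairs)


def top1_gini(rankings):
    """Gini impurity of the distribution of the top-ranked item."""
    if not rankings:
        raise ValueError("need at least one ranking")
    n_items = len(rankings[0])
    counts = [0] * n_items
    for r in rankings:
        counts[r.index(1)] += 1  # position of rank 1 = top item
    n = len(rankings)
    return 1.0 - sum((c / n) ** 2 for c in counts)


# A node where every judge agrees is pure; a node split between two
# opposite rankings is maximally mixed for every pair.
pure = [(1, 2, 3)] * 4
mixed = [(1, 2, 3), (3, 2, 1)]
print(pairwise_gini(pure), pairwise_gini(mixed))
print(top1_gini(pure), top1_gini(mixed))
```

In a tree-growing procedure of the CART kind, a candidate split would be scored by how much it reduces such an impurity from the parent node to the weighted average of the child nodes.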
Extended Abstract
Bibtex
@incollection{
  year={2011},
  isbn={978-3-642-14124-9},
  booktitle={Preference Learning},
  editor={Fürnkranz, Johannes and Hüllermeier, Eyke},
  doi={10.1007/978-3-642-14125-6_5},
  title={Decision Tree Modeling for Ranking Data},
  url={http://dx.doi.org/10.1007/978-3-642-14125-6_5, http://de.evo-art.org/index.php?title=Decision_Tree_Modeling_for_Ranking_Data},
  publisher={Springer Berlin Heidelberg},
  author={Yu, Philip L. H. and Wan, Wai Ming and Lee, Paul H.},
  pages={83--106},
  language={English}
}
Used References
1. A. Asuncion, D.J. Newman, UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science (2007)
2. A.P. Bradley, The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 30, 1145–1159 (1997) http://dx.doi.org/10.1016/S0031-3203(96)00142-2
3. L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees (Wadsworth, Belmont, California, 1984)
4. W. Cheng, J. Hühn, E. Hüllermeier, Decision tree and instance-based learning for label ranking, in Proceedings of the 26th International Conference on Machine Learning (ICML 2009) (Montreal, Canada, 2009)
5. P.A. Chou, Optimal partitioning for classification and regression trees. IEEE Trans. Pattern Anal. Mach. Intell. 13, 340–354 (1991) http://dx.doi.org/10.1109/34.88569
6. D.E. Critchlow, M.A. Fligner, J.S. Verducci, Probability models on rankings. J. Math. Psychol. 35, 294–318 (1991) http://dx.doi.org/10.1016/0022-2496(91)90050-4
7. C. Drummond, R.C. Holte, What ROC curves can’t do (and cost curves can), in Proceedings of the 1st Workshop on ROC Analysis in AI (Valencia, Spain, 2004), pp. 19–26
8. R.M. Duch, M.A. Taylor, Postmaterialism and the economic condition. Am. J. Pol. Sci. 37, 747–778 (1993) http://dx.doi.org/10.2307/2111573
9. J. Fürnkranz, E. Hüllermeier, Pairwise preference learning and ranking, in Proceedings of the 14th European Conference on Machine Learning (ECML-03) (Springer, Cavtat, Croatia, 2003), pp. 145–156
10. P. Geurts, L. Wehenkel, F. d'Alché-Buc, Kernelizing the output of tree-based methods, in Proceedings of the 23rd International Conference on Machine Learning (ICML-06) (Pittsburgh, Pennsylvania, 2006), pp. 345–352
11. D.J. Hand, R.J. Till, A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach. Learn. 45(2), 171–186 (2001) http://dx.doi.org/10.1023/A%3A1010920819831
12. J.A. Hanley, B.J. McNeil, The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982)
13. E. Hüllermeier, J. Fürnkranz, On Minimizing the Position Error in Label Ranking, in Proceedings of the 17th European Conference on Machine Learning (ECML-07) (Springer, Warsaw, Poland, 2007), pp. 583–590
14. E. Hüllermeier, J. Fürnkranz, W. Cheng, K. Brinker, Label ranking by learning pairwise preferences. Artif. Intell. 172(16–17), 1897–1916 (2008) http://dx.doi.org/10.1016/j.artint.2008.08.002
15. R. Inglehart, The Silent Revolution: Changing Values and Political Styles among Western Publics (Princeton University Press, Princeton, 1977)
16. R. Jowell, L. Brook, L. Dowds, International Social Attitudes: the 10th BSA Report (Dartmouth Publishing, Aldershot, 1993)
17. M.G. Karlaftis, Predicting mode choice through multivariate recursive partitioning. J. Trans. Eng. 130(2), 245–250 (2004) http://dx.doi.org/10.1061/(ASCE)0733-947X(2004)130%3A2(245)
18. G.V. Kass, An exploratory technique for investigating large quantities of categorical data. Appl. Stat. 29, 119–127 (1980) http://dx.doi.org/10.2307/2986296
19. W.-Y. Loh, Y.-S. Shih, Split selection methods for classification trees. Statistica Sinica 7, 815–840 (1997)
20. J.I. Marden, Analyzing and Modeling Rank Data (Chapman & Hall, 1995)
21. J.R. Quinlan, Induction of decision trees. Mach. Learn. 1, 81–106 (1986)
22. J.R. Quinlan, C4.5: Programs for Machine Learning (Morgan Kaufmann, 1993)
23. B.D. Ripley, Pattern Recognition and Neural Networks (Cambridge University Press, Cambridge, 1996)
24. L. Rokach, O. Maimon, Decision trees, in The Data Mining and Knowledge Discovery Handbook, ed. by O. Maimon, L. Rokach (Springer, Berlin, 2005), pp. 165–192 http://dx.doi.org/10.1007/0-387-25465-X_9
25. R. Siciliano, F. Mola, Multivariate data analysis and modeling through classification and regression trees. Comput. Stat. Data Anal. 32, 285–301 (2000) http://dx.doi.org/10.1016/S0167-9473(99)00082-1
Links
Full Text
http://www.researchgate.net/publication/252998138