Improving a SVM Meta-classifier for Text Documents by using Naive Bayes
Keywords:
Meta-classification, Support Vector Machine, Naive Bayes, Text document and Performance Evaluation
Abstract
Text categorization is the problem of classifying text documents into a set of predefined classes. In this paper, we investigate two approaches: (a) developing a classifier for text documents based on Naive Bayes theory and (b) integrating this classifier into a meta-classifier in order to increase classification accuracy. The basic idea is to learn a meta-classifier that optimally selects the best component classifier for each data point. The experimental results show that combining classifiers can significantly improve classification accuracy and that our improved meta-classification strategy gives better results than each individual classifier. For Reuters2000 text documents we obtained classification accuracies of up to 93.87%.
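As a rough illustration of the two ideas summarized above, the following Python sketch implements a small multinomial Naive Bayes text classifier and a per-document selection step over several component classifiers. It is not the authors' implementation: the class name `NaiveBayes`, the function `meta_predict`, the add-one smoothing, and the rule "trust the most confident component" are all assumptions made for this example. The meta-classifier in the paper is learned, whereas the selection rule here is a fixed heuristic.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        # docs: list of token lists; labels: one class name per document.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, tokens):
        # Unnormalized log posterior of each class for one document.
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in tokens:
                if w in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[w] + 1) / denom)
            scores[label] = score
        return scores

    def predict_with_confidence(self, tokens):
        # Returns (best class, posterior probability of that class).
        scores = self.log_scores(tokens)
        best = max(scores, key=scores.get)
        norm = sum(math.exp(s - scores[best]) for s in scores.values())
        return best, 1.0 / norm


def meta_predict(components, tokens):
    # Hypothetical selection rule: use the component that is most
    # confident on this particular document.
    label, _ = max(
        (c.predict_with_confidence(tokens) for c in components),
        key=lambda pair: pair[1],
    )
    return label


# Toy data, invented for this example.
train_docs = [
    ["stocks", "market", "shares"],
    ["market", "profit", "shares"],
    ["match", "goal", "team"],
    ["team", "season", "goal"],
]
train_labels = ["finance", "finance", "sport", "sport"]

nb_full = NaiveBayes().fit(train_docs, train_labels)
nb_small = NaiveBayes().fit(train_docs[:3], train_labels[:3])

print(meta_predict([nb_full, nb_small], ["goal", "team"]))
```

In a setup closer to the one described in the abstract, the components would include SVM classifiers alongside Naive Bayes, and the selection step would itself be trained rather than fixed.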
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free under the Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.