Paper title: Framework for Urdu News Headlines Classification
Published in: Issue 1, (Vol. 10) / 2016
Publishing date: 2016-04-14
Pages: 17-21
Author(s): AHMED Kashif, ALI Mubashir, KHALID Shehzad, KAMRAN Muhammad
Abstract. Automatic text classification has great significance in the field of text mining and plays a pivotal role in areas such as spam filtering, news classification, and noise reduction. The literature shows that ample research has been conducted on classifying text documents, e.g. English news classification and Persian text classification, but there is little work on the classification of short Urdu text or Urdu news headlines. Therefore, after examining various existing news classification methodologies, we propose in this paper an SVM-based framework for the classification of Urdu news headlines. This approach classifies Urdu news headlines into their respective predefined categories by utilizing the maximum indexes of their feature vectors. The proposed system is compared with existing state-of-the-art techniques.
Keywords: Urdu News Classification, Feature Vector, SVM

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright JACSM 2007-2020