Large-scale Text-based Video Classification using Contextual Features
Video production has grown dramatically, creating a need for accurate video classification. In this work, we use deep learning as a means to accelerate video retrieval by classifying videos into categories. We classify each video based on the text extracted from it. We trained our model using fastText, a library for efficient text classification and representation learning, and tested it on 15,000 videos. Experimental results show that our approach is efficient and performs well. The technique scales to very large datasets and produces a model that can classify any video into a specific category very quickly.
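As an illustration of the kind of pipeline described above, the following is a minimal sketch rather than the authors' implementation. It assumes each video's extracted text is available as a plain string paired with a category name. The helper names (`to_fasttext_line`, `write_training_file`, `train_and_predict`) and the hyperparameter values are hypothetical and chosen only for this example. The sketch relies on fastText's supervised input format, in which every line carries a `__label__` prefixed category followed by the text.

```python
import re
from pathlib import Path


def to_fasttext_line(category: str, text: str) -> str:
    """Format one video's extracted text as a fastText supervised training line."""
    # Lowercase and collapse whitespace so each example sits on a single line.
    cleaned = re.sub(r"\s+", " ", text.lower()).strip()
    # fastText recognises labels by the "__label__" prefix (its default).
    label = category.strip().lower().replace(" ", "_")
    return f"__label__{label} {cleaned}"


def write_training_file(examples, path):
    """Write (category, text) pairs to a file, one example per line."""
    lines = [to_fasttext_line(cat, txt) for cat, txt in examples]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def train_and_predict(train_path, query_text):
    """Train a supervised model and return the top category for new text."""
    # Third-party dependency; imported here so the helpers above run without it.
    import fasttext

    # Hyperparameters are illustrative assumptions, not values from the paper.
    model = fasttext.train_supervised(
        input=str(train_path), epoch=25, lr=0.5, wordNgrams=2
    )
    labels, probabilities = model.predict(query_text.lower(), k=1)
    return labels[0], float(probabilities[0])
```

In use, one would call `write_training_file` on the labelled examples, then pass the resulting path and the text of an unseen video to `train_and_predict`. The returned label keeps the `__label__` prefix, which can be stripped to recover the category name.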