Video Saliency Detection by using an Enhance Methodology Involving a Combination of 3DCNN with Histograms
DOI:
https://doi.org/10.15837/ijccc.2022.2.4299
Keywords:
Histogram of optical flow (HoF), Histogram of oriented gradient (HoG), Human Visual System (HVS), Saliency detection, salient object detection, salient region detection
Abstract
When viewing images or videos, the Human Visual System (HVS) tends to concentrate on important locations. Saliency detection is a tool for detecting the abnormality and randomness of images or videos by replicating the human visual system. Video saliency detection has received a lot of attention in recent decades, but computational modelling of spatial perception for video sequences is still limited, because temporal abstraction and its fusion with spatial saliency remain challenging. Unlike the detection of salient objects in still images, one of the most difficult aspects of video saliency detection is working out how to isolate and integrate spatial and temporal features. Because saliency detection recognizes the areas in images and videos that catch the attention of the human visual system, it may benefit multimedia applications such as video or image retrieval and copy detection. The two crucial steps in trajectory-based video classification methods are feature point identification and local feature extraction. In this paper, we propose a new spatio-temporal saliency detection method that uses an enhanced 3D convolutional neural network (3D CNN) combined with the histogram of optical flow (HoF) and the histogram of oriented gradients (HoG).
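The two histogram descriptors named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the bin count, the cell or block layout, the normalisation, or how the histograms are combined with the 3D CNN, so everything below (the function names, 8 bins, L1 normalisation, a precomputed flow field passed in as a list of `(dx, dy)` pairs) is an assumption chosen for illustration. It only shows the shared idea that HoG bins image-gradient directions (spatial cue) while HoF bins optical-flow directions (temporal cue), each weighted by vector magnitude.

```python
import math

def orientation_histogram(vectors, bins=8, signed=True):
    """Magnitude-weighted histogram of vector directions, L1-normalised.

    signed=True bins angles over [0, 2*pi); signed=False folds opposite
    directions together and bins over [0, pi).
    """
    span = 2 * math.pi if signed else math.pi
    hist = [0.0] * bins
    for dx, dy in vectors:
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % span
        idx = min(int(angle / span * bins), bins - 1)  # guard rounding at the edge
        hist[idx] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

def image_gradients(img):
    """Central-difference (gx, gy) at every interior pixel of a 2D list."""
    h, w = len(img), len(img[0])
    return [((img[y][x + 1] - img[y][x - 1]) / 2.0,
             (img[y + 1][x] - img[y - 1][x]) / 2.0)
            for y in range(1, h - 1) for x in range(1, w - 1)]

def hog(img, bins=8):
    """HoG-style descriptor of one grayscale frame (unsigned orientations)."""
    return orientation_histogram(image_gradients(img), bins, signed=False)

def hof(flow, bins=8):
    """HoF-style descriptor of a precomputed flow field (signed directions)."""
    return orientation_histogram(flow, bins, signed=True)

# Hypothetical inputs: a 4x4 horizontal intensity ramp and three flow vectors.
frame = [[float(x) for x in range(4)] for _ in range(4)]
flow = [(0.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
hog_descriptor = hog(frame)
hof_descriptor = hof(flow)
```

In this toy input every gradient of the ramp points along the x-axis, so all of the HoG mass falls into the first bin, while the flow vectors split the HoF mass between the bin for the +x direction and the bin for the +y direction. A practical pipeline would compute such histograms per spatial cell and obtain the flow field from an optical flow estimator; both of those steps are outside this sketch.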
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is allowed for free under the terms of the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.