Research on Key Technology of Web Hierarchical Topic Detection and Evolution Based on Behaviour Tracking Analysis
Keywords:
Web hierarchical topic, topic detection, event evolution, behaviour tracking analysis
Abstract
In today's big data era, Web hierarchical topic detection and evolution over semi-structured and unstructured data has attracted wide attention from researchers. Taking network big data as the research object, this paper proposes an approach to Web hierarchical topic detection and evolution based on behaviour tracking analysis. It describes the main implementation methods: instance analysis of the usage mode, instance analysis of the seed, analysis of the sets of similar instances supporting topics, analysis of the sets of similar instances supporting events, and evolution analysis of events. It also presents the corresponding algorithm. The experimental analysis has three parts. First, it assesses the quality of topic detection: how the accuracy rate varies with the number of instances concerned and the seed threshold, and with the number of instances concerned and the probability threshold. Second, it assesses the quality of topic evolution: how the accuracy rate varies with parameter adjustment, and with the number of instances concerned and the similarity threshold. Finally, it compares the time consumed by different methods in solving the main research problems, and the qualitative results of topic detection and evolution on different data sets. The results show that the approach is feasible, verifiable and superior, and that it plays a major role in reconfiguring the Web hierarchical topic corpus and in providing an intelligent big data warehouse for network information evolution applications.
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is allowed for free under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.