An Ensemble Machine Learning Approach to Understanding the Effect of a Global Pandemic on Twitter Users’ Attitudes
Keywords:
COVID-19, Coronavirus, Machine Learning, Natural Language Processing, Automatic Hate-Speech Detection, Racism
Abstract
It is thought that the COVID-19 outbreak has significantly fuelled racism and discrimination, especially towards Asian individuals [10]. To test this hypothesis, we build upon existing work to classify racist tweets posted before and after COVID-19 was declared a global pandemic. To overcome the linguistic difficulty and class imbalance of the classification task, we combine an ensemble of machine learning techniques such as Linear Support Vector Classifiers, Logistic Regression models, and Deep Neural Networks. We fill a gap in the existing literature by (1) using a combined machine learning approach to understand the effect of COVID-19 on Twitter users’ attitudes and (2) improving on the performance of automatic racism detectors. We show that there has not been a sharp increase in racism towards Asian people on Twitter and that users who posted racist tweets before the pandemic tend to post approximately the same number during the outbreak. Previous research on racism and other virus outbreaks suggests that racism towards communities associated with the region where a virus originated is not exclusively attributable to the outbreak but is rather a continued symptom of deep-rooted biases towards minorities [13]. Our research supports these findings. We conclude that the COVID-19 outbreak is an additional outlet for discrimination against Asian people rather than its main cause.
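As an illustration of the ensemble idea, the following is a minimal sketch of hard (majority) voting over three binary classifiers, followed by the share of tweets the ensemble flags in a sample. The abstract does not state how the individual models are combined, so majority voting is an assumption here. The three stub functions and the placeholder tokens `slur_a` and `slur_b` are hypothetical stand-ins for trained Linear SVC, Logistic Regression, and Deep Neural Network models; they are not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier maps a tweet to a label: 1 = racist, 0 = not racist.
Classifier = Callable[[str], int]


def majority_vote(classifiers: Sequence[Classifier], tweet: str) -> int:
    """Return the label predicted by most classifiers (hard voting).

    With binary labels and an odd number of classifiers, ties cannot occur.
    """
    votes = Counter(clf(tweet) for clf in classifiers)
    return votes.most_common(1)[0][0]


def share_flagged(classifiers: Sequence[Classifier], tweets: Sequence[str]) -> float:
    """Fraction of tweets that the ensemble labels as racist."""
    if not tweets:
        return 0.0
    return sum(majority_vote(classifiers, t) for t in tweets) / len(tweets)


# Hypothetical stand-ins for trained models (assumption: real models would
# be fitted on labelled tweets; these stubs only match placeholder tokens).
def svc_stub(tweet: str) -> int:
    return int("slur_a" in tweet.lower())


def logreg_stub(tweet: str) -> int:
    text = tweet.lower()
    return int("slur_a" in text or "slur_b" in text)


def dnn_stub(tweet: str) -> int:
    return int("slur_b" in tweet.lower())


ENSEMBLE = (svc_stub, logreg_stub, dnn_stub)
```

Calling `share_flagged(ENSEMBLE, tweets)` on a sample of tweets collected before the pandemic declaration and on a sample collected after it would give two proportions that can be compared directly.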
References
[2] Atkeson, A. (2020). What Will Be the Economic Impact of COVID-19 in the US? Rough Estimates of Disease Scenarios, National Bureau of Economic Research, DOI: https://doi.org/10.3386/w26867
[3] Chen, E., Lerman, K., Ferrara, E. (2020). COVID-19: The First Public Coronavirus Twitter Dataset JMIR Public Health Surveill , DOI: 10.2196/19273 https://doi.org/10.2196/19273
[4] Davidson T., Warmsley D., Macy M., Weber I. (2017). Automated Hate Speech Detection and the Problem of Offensive Language, AAAI Publications, Eleventh International AAAI Conference on Web and Social Media, DOI: http://arxiv.org/abs/1703.04009
[5] Devakumar, D., Shannon, G., Bhopal, S. S., Abubakar, I. (2020). Racism and discrimination in COVID-19 responses The Lancet, DOI: https://doi.org/10.1016/S0140-6736(20)30792-3
[6] Godin, F., Vandersmissen, B., De Neve, W., Van de Walle, R. (2015). Named Entity Recognition for Twitter Microposts using Distributed Word Representations Proceedings of the Workshop on Noisy User-generated Text , DOI: 10.18653/v1/W15-4322 https://doi.org/10.18653/v1/W15-4322
[7] Hanasoge, S., Horiuchi, N., Huang, C., Jia, H., Kim, N. Y., Murao, M., Seo, M., Tan, R., Wilkinson, J. (2020). Visibility challenges for Asian scientists Nature Reviews Physics, DOI: https://doi.org/10.1038/s42254-020-0162-z
[8] Kwok, I., Wang, Y. (2013). Locate the Hate: Detecting Tweets against Blacks, AAAI Publications, Twenty-Seventh AAAI Conference on Artificial Intelligence , DOI: 10.5555/2891460.2891697
[9] Li J., Guo K., Viedma E. H., Lee H., Liu J., Zhong N., Gomes L. F. A. M., Filip F.G., Fang SC., Özdemir M.S., Liu X., Lu G., Shi Y. (2020). Culture versus Policy: More Global Collaboration to Effectively Combat COVID-19 The Innovation, Volume 1, Issue 2, DOI: https://doi.org/10.1016/j.xinn.2020.100023
[10] Nature (2020). Stop the coronavirus stigma now Nature 580, 165, DOI: https://doi.org/10.1038/d41586-020-01009-0
[11] Pitsilis, G., Ramampiaro, H., Langseth, H. (2018). Effective hate-speech detection in Twitter data using recurrent neural networks Appl Intell 48, DOI: https://doi.org/10.1007/s10489-018-1242-y
[12] Saars H. A., Keil, R. (2006). Global Cities and the Spread of Infectious Disease: The Case of Severe Acute Respiratory Syndrome (SARS) in Toronto, Canada Urban Studies , DOI: https://doi.org/10.1080/00420980500452458
[13] Siu, J. Y. (2015). Influence of social experiences in shaping perceptions of the Ebola virus among African residents of Hong Kong during the 2014 outbreak: a qualitative study International Journal for Equity in Health , DOI: https://doi.org/10.1186/s12939-015-0223-6
[14] Dong E, Du H, Gardner L. (2020). An interactive web-based dashboard to track COVID-19 in real time Lancet Inf Dis. 20(5):533-534, DOI: 10.1016/S1473-3099(20)30120-1 https://doi.org/10.1016/S1473-3099(20)30120-1
[15] Wang, W., Chen, L., Thirunarayan, K., Sheth, A. P. (2014). Cursing in English on Twitter Association for Computing Machinery , DOI: https://doi.org/10.1145/2531602.2531734
[16] Waseem, Z., Hovy, D. (2016). Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter Proceedings of the NAACL Student Research Workshop, DOI: https://doi.org/10.18653/v1/N16-2013
[17] World Health Organization (2020). Coronavirus disease 2019 (COVID-19): situation report, 52 World Health Organization
[18] Zimmerman, S., Kruschwitz, U., Fox, C. (2018). Improving Hate Speech Detection with Deep Learning Ensembles Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
[19] Zubiaga, A., Voss, A., Procter, R., Liakata, M., Wang, B., Tsakalidis, A. (2016). Towards Real-Time, Country-Level Location Classification of Worldwide Tweets IEEE Transactions on Knowledge and Data Engineering, Volume: 29, Issue: 9, Sept. 1 2017, DOI: https://doi.org/10.1109/TKDE.2017.2698463
License
ONLINE OPEN ACCESS: Access to the full text of each article and each issue is free under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
You are free to:
-Share: copy and redistribute the material in any medium or format;
-Adapt: remix, transform, and build upon the material.
The licensor cannot revoke these freedoms as long as you follow the license terms.
DISCLAIMER: The author(s) of each article appearing in International Journal of Computers Communications & Control is/are solely responsible for the content thereof; the publication of an article shall not constitute or be deemed to constitute any representation by the Editors or Agora University Press that the data presented therein are original, correct or sufficient to support the conclusions reached or that the experiment design or methodology is adequate.