Hate Speech Patterns in Social Media: A Methodological Framework and Fat Stigma Investigation Incorporating Sentiment Analysis, Topic Modelling and Discourse Analysis


  • Vajisha Udayangi Wanniarachchi School of Mathematical and Computational Sciences, Massey University, Auckland
  • Chris Scogings School of Mathematical and Computational Sciences, Massey University, Auckland
  • Teo Susnjak School of Mathematical and Computational Sciences, Massey University, Auckland
  • Anuradha Mathrani School of Mathematical and Computational Sciences, Massey University, Auckland




social media, hate speech, sentiment analysis, discourse analysis, fat stigma, topic modelling


Social media offers users an online platform to express themselves freely; however, opinionated and offensive comments that target particular individuals or communities can instigate animosity towards them. Widespread condemnation of obesity (fatness) has led to much fat-stigmatizing content being posted online. We propose a methodological framework that uses a novel mixed-method approach to unearth hate speech patterns in large text-based corpora gathered from social media. We explain the use of computer-mediated quantitative methods, comprising natural language processing techniques such as sentiment analysis, emotion analysis and topic modelling, alongside qualitative discourse analysis. We then apply the framework to a corpus of gendered and weight-based texts extracted from Twitter and Reddit. Applying it helped detect the different emotions being expressed, the composition of word frequency patterns and the broader fat-based themes underpinning the hateful content posted online. The framework provides a synthesis of quantitative and qualitative methods that draws on social science and data mining techniques to build real-world knowledge in hate speech detection. Current information systems research makes limited use of mixed analytic approaches for studying hate speech in social media. Our study therefore contributes to future research by establishing a roadmap for conducting mixed-method analyses to better understand hate speech patterns.
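As a rough illustration of the quantitative side of such a framework, the sketch below counts word frequencies and tallies emotion labels with a lexicon lookup. It is not the authors' implementation: the lexicon, the function names and the sample posts are all hypothetical and invented for this example, and a real study would rely on a published emotion lexicon and a far larger corpus.

```python
from collections import Counter
import re

# Hypothetical toy lexicon, invented for illustration only; a real study
# would use a published emotion lexicon instead.
TOY_EMOTION_LEXICON = {
    "hate": {"anger", "disgust"},
    "ugly": {"disgust"},
    "lazy": {"disgust"},
    "afraid": {"fear"},
    "love": {"joy"},
    "proud": {"joy"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(posts):
    """Count how often each token occurs across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts


def emotion_counts(posts, lexicon=TOY_EMOTION_LEXICON):
    """Tally the emotion labels of every token found in the lexicon."""
    counts = Counter()
    for post in posts:
        for token in tokenize(post):
            counts.update(lexicon.get(token, ()))
    return counts


# Made-up sample posts (assumption: not drawn from any real dataset).
posts = [
    "I hate how people call her lazy",
    "Proud of my body, love it",
]
print(word_frequencies(posts).most_common(3))
print(emotion_counts(posts))
```

Counts of this kind only surface candidate patterns; in a mixed-method design they would still need to be read against the posts themselves through qualitative discourse analysis.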






How to Cite

Wanniarachchi, V. U., Scogings, C., Susnjak, T., & Mathrani, A. (2023). Hate Speech Patterns in Social Media: A Methodological Framework and Fat Stigma Investigation Incorporating Sentiment Analysis, Topic Modelling and Discourse Analysis. Australasian Journal of Information Systems, 27. https://doi.org/10.3127/ajis.v27i0.3929



Research Articles