Leveraging Deep Learning for Cyber Bullying Detection on Social Media Platforms: A Holistic Approach to Mitigate Online Harassment

Thanuja S, Dr. Priya V, Vanthanadevi S, Yuvanidhi S

doi:10.15662/IJSRAT.2025.0802003

Abstract

Cyberbullying is a growing concern on social media platforms. This research presents an automated system that detects cyberbullying in social media messages using supervised machine learning (ML) and natural language processing (NLP) techniques. The system uses a pre-trained model to classify messages as bullying or non-bullying based on textual features. It combines traditional ML models, such as Logistic Regression, with deep learning methods, such as Long Short-Term Memory (LSTM) networks, to analyze user-generated content. Preprocessing steps, including tokenization, stop word removal, and TF-IDF vectorization, transform raw text into structured data for model training. The model is trained on a labeled dataset containing both bullying and non-bullying messages to improve classification accuracy. A Streamlit-based web application lets users enter a message and receive real-time feedback on whether it is classified as cyberbullying. The system aims to help social media platforms identify and prevent cyberbullying, contributing to safer online communities. Experimental results highlight its potential for real-world deployment.
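The preprocessing steps named in the abstract (tokenization, stop word removal, TF-IDF vectorization) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the stop-word list, the example messages, the function names, and the particular TF-IDF weighting (term frequency times ln(N/df) + 1) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "so", "to", "and", "of"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def preprocess(text):
    """Tokenize a message and drop stop words."""
    return [tok for tok in tokenize(text) if tok not in STOP_WORDS]


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with weights tf * (ln(N / df) + 1)."""
    token_lists = [preprocess(doc) for doc in documents]
    vocabulary = sorted({tok for toks in token_lists for tok in toks})
    n_docs = len(token_lists)
    # Document frequency: number of messages that contain each term.
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    vectors = []
    for toks in token_lists:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty messages
        vectors.append([
            (counts[term] / total) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term in vocabulary
        ])
    return vocabulary, vectors


# Made-up example messages, for illustration only.
messages = [
    "You are so stupid and ugly",
    "Have a great day",
    "Nobody likes you, stupid loser",
]
vocab, vectors = tfidf_vectors(messages)
```

Each row of `vectors` is a fixed-length numeric representation of one message, which is the kind of input a classifier such as Logistic Regression would be trained on. Terms that appear in fewer messages receive a higher weight than terms shared across messages.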

Article Information

Journal	International Journal of Science, Research and Technology
Volume (Issue)	Vol. 8 No. 2 (2025): International Journal of Science, Research and Technology (IJSRAT)
DOI	https://doi.org/10.15662/IJSRAT.2025.0802003
Pages	13856-13867
Published	April 7, 2025
Copyright	All rights reserved
Open Access	This work is licensed under a Creative Commons Attribution 4.0 International License.
How to Cite	Thanuja S, Dr. Priya V, Vanthanadevi S, Yuvanidhi S (2025). Leveraging Deep Learning for Cyber Bullying Detection on Social Media Platforms: A Holistic Approach to Mitigate Online Harassment. International Journal of Science, Research and Technology (IJSRAT), Vol. 8 No. 2 (2025), pp. 13856-13867. https://doi.org/10.15662/IJSRAT.2025.0802003
