Detecting Bengali Spam SMS Using Recurrent Neural Network

Md. Mohsin Uddin 1, Monica Yasmin 1, M Saddam Hossain Khan 1, Md Istianatur Rahman 2, and Tabassum Islam 1
1. East West University, Dhaka, 1212, Bangladesh
2. World University of Bangladesh

Abstract—An SMS is spam when the sender sends it to targeted users to obtain important personal information. If the targeted users respond with personal information, the sender achieves that goal. This phenomenon is increasing rapidly, and Machine Learning (ML) is the approach most often used to classify such messages. In Bangladesh, email spam detection is common, but detecting SMS spam with a Bengali dataset is completely new as a research problem. This research detects Bengali spam SMS using traditional Machine Learning algorithms along with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). The performance of all the algorithms is then compared to find the best among them. LSTM and GRU both achieve the highest testing accuracy, 99%. To the best of our knowledge, this work is the first to apply the deep learning algorithms LSTM and GRU to detecting Bengali spam. In addition, a comparative analysis is performed between several traditional supervised ML algorithms and the deep learning algorithms. The effects of various activation functions and optimizers on the LSTM and GRU deep learning algorithms are also evaluated experimentally. The ADAGRAD optimizer achieves the best accuracy, ahead of RMSPROP, ADAMAX, ADADELTA, and SGD. Finally, the best combinations of deep learning algorithms, activation functions, and optimizers are proposed based on the experimental analysis.
Index Terms—SMS spam, RNN, LSTM, GRU, ADAGRAD, Machine Learning, Logistic regression, SVM, Naïve Bayes

Cite: Md. Mohsin Uddin, Monica Yasmin, M Saddam Hossain Khan, Md Istianatur Rahman, and Tabassum Islam, "Detecting Bengali Spam SMS Using Recurrent Neural Network," Journal of Communications, vol. 15, no. 4, pp. 325-331, April 2020. DOI: 10.12720/jcm.15.4.325-331

Copyright © 2020 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.