Detecting Bengali Spam SMS Using Recurrent Neural Network

Md. Mohsin Uddin 1, Monica Yasmin 1, M Saddam Hossain Khan 1, Md Istianatur Rahman 2, and Tabassum Islam 1
1. East West University, Dhaka, 1212, Bangladesh
2. World University of Bangladesh
Abstract—An SMS message is spam when the sender targets recipients in order to obtain important personal information. If the targeted users respond with that information, the sender achieves their goal. This phenomenon is increasing rapidly, and Machine Learning (ML) is the most widely used approach to classifying such messages. In Bangladesh, email spam detection is common, but detecting SMS spam with a Bengali dataset is entirely new as a research problem. This research detects Bengali spam SMS using traditional Machine Learning algorithms along with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. The performance of all algorithms is then compared to find the best among them. Both LSTM and GRU achieve the highest testing accuracy, 99%. To the best of our knowledge, this work is the first to apply the deep learning algorithms LSTM and GRU to detecting Bengali spam. In addition, a comparative analysis of several traditional supervised ML algorithms and the deep learning algorithms is performed, and the effects of various activation functions and optimizers on LSTM and GRU are examined experimentally. The ADAGRAD optimizer achieves better accuracy than RMSPROP, ADAMAX, ADADELTA, and SGD. Finally, the best combinations of deep learning algorithm, activation function, and optimizer are proposed based on the experimental analysis.
Index Terms—SMS spam, RNN, LSTM, GRU, ADAGRAD, Machine Learning, Logistic regression, SVM, Naïve Bayes

Cite: Md. Mohsin Uddin, Monica Yasmin, M Saddam Hossain Khan, Md Istianatur Rahman, and Tabassum Islam, "Detecting Bengali Spam SMS Using Recurrent Neural Network," Journal of Communications, vol. 15, no. 4, pp. 325-331, April 2020. DOI: 10.12720/jcm.15.4.325-331