Home > Published Issues > 2021 > Volume 16, No. 8, August 2021 >

Improvement of K-nearest Neighbors (KNN) Algorithm for Network Intrusion Detection Using Shannon-Entropy

Nguyen Gia Bach, Le Huy Hoang, and Tran Hoang Hai
School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam

Abstract—Non-parametric Nearest Neighbor is an algorithm seeking for the closest data points based on the Euclidean Norm (the standard distance between two data points in a multidimensional space). The classical K-nearest Neighbor (KNN) algorithm applies this theory to find K data points in a vicinity of the considering data, then uses majority voting to label its category. This paper proposes a modification to the original KNN to improve its accuracy by changing that Euclidean Norm based on Shannon-Entropy theory in the context of Network Intrusion Detecton System. Shannon-Entropy calculates the importance of features based on the labels of those data points, then the distance between data points would be re-calculated through the new weights found for these features. Therefore, it is possible to find the more suitable K data points nearby. NSL - KDD dataset is used in this paper to evaluate the performance of the proposed model. A comparison is drawn between the results of the classic KNN, related work on its improvement and the proposed algorithm as well as novel deep learning approaches to evaluate its effectivenes in different scenarios. Results reveal that the proposed algorithm shows good performance on NSL - KDD data set. Specifically, an accuracy up to 99.73% detecting DoS attacks is obtained, 5.46% higher than the original KNN, and 1.15% higher than the related work of M-KNN. Recalculating the Euclidean-Norm distance retains the contribution of the features with low importance to the data classification, while assuring that features with higher importance will have a higher impact. Thus, the proposal does not raise any concern for losing information, and even achieves high efficiency in the classification of features and data classification.
 
Index Terms—KNN, Shannon-entropy, classification, improving KNN, NSL-KDD, intrusion detection

Cite: Nguyen Gia Bach, Le Huy Hoang, and Tran Hoang Hai, "Improvement of K-nearest Neighbors (KNN) Algorithm for Network Intrusion Detection Using Shannon-Entropy," Journal of Communications vol. 16, no. 8, pp. 347-354, August 2021. Doi: 10.12720/jcm.16.8.347-354

Copyright © 2021 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY-NC-ND 4.0), which permits use, distribution and reproduction in any medium, provided that the article is properly cited, the use is non-commercial and no modifications or adaptations are made.