Abstract: Data compression plays an important role in data mining, both as a way of assessing the minability of data and as a means of evaluating similarities between complex objects. We discuss several mining applications, ranging from the compressibility of strings of symbols and of languages to graph compressibility and the compression of market basket data. We also examine the role of compression in computing similarity in text corpora, and we propose a novel approach for assessing the quality of text summarization.

Index Terms: Compression ratio, Thue-Morse sequence, lossless compression, stemming, lemmatizing

Cite: Dan A. Simovici, Ping Chen, Tong Wang, and Dan Pletea, "Compression and Data Mining," Journal of Communications, vol. 10, no. 9, pp. 677-684, 2015. Doi: 10.12720/jcm.10.9.677-684