CONCEPT DRIFT IN STREAMING DATA: A SYSTEMATIC LITERATURE REVIEW

Authors

  • Tariq Mahmood, Institute of Business Administration
  • Tatheer Fatima, Institute of Business Administration, Karachi

DOI:

https://doi.org/10.51153/kjcis.v4i1.43

Keywords:

machine learning, concept drift, systematic literature review, streaming data

Abstract

The world generates an immeasurable amount of data every minute, and this data needs to be analyzed to support better decision making. To meet this demand for faster analytics, businesses are adopting efficient stream processing and machine learning techniques. However, data streams are particularly challenging to handle. One of the prominent problems in dealing with streaming data is concept drift, described as an unexpected change over time in the underlying distribution of the streaming data. In this work, we conduct a systematic literature review to identify methods that deal with the problem of concept drift. We review the most frequently used supervised and unsupervised techniques, and we also survey the commonly used, publicly available artificial and real-world datasets employed in concept drift research.
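
As an illustration of the definition above, the following is a minimal sketch of how a change in the underlying distribution of a stream might be flagged. It is not one of the methods covered by the review: the function name detect_mean_shift, its window and threshold parameters, and the synthetic stream are all assumptions made for this example, and the check only compares the mean of a fixed reference window against the mean of a sliding window of recent values.

```python
import random
from collections import deque


def detect_mean_shift(stream, window=50, threshold=0.5):
    """Return the index where the recent-window mean first differs from the
    reference-window mean by more than `threshold`, or None if it never does.

    Hypothetical helper for illustration only; not a published detector.
    """
    reference = []                 # first `window` values, taken as the old concept
    recent = deque(maxlen=window)  # sliding window over the newest values
    for i, x in enumerate(stream):
        if len(reference) < window:
            reference.append(x)
            continue
        recent.append(x)
        if len(recent) == window:
            ref_mean = sum(reference) / window
            rec_mean = sum(recent) / window
            if abs(rec_mean - ref_mean) > threshold:
                return i
    return None


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic stream (assumed data): the mean jumps from 0.0 to 2.0 at index 300.
drifting = [rng.gauss(0.0, 0.2) for _ in range(300)] + \
           [rng.gauss(2.0, 0.2) for _ in range(300)]
# Control stream with no change in distribution.
stationary = [rng.gauss(0.0, 0.2) for _ in range(600)]

drift_at = detect_mean_shift(drifting)
stable = detect_mean_shift(stationary)
print("drift flagged at index:", drift_at)
print("stationary stream result:", stable)
```

In this sketch the drift is flagged shortly after index 300, once enough post-change values have entered the recent window, while the stationary stream is never flagged. A fixed reference window and a hand-picked threshold are the simplest possible design; they only react to a shift in the mean and would miss changes that leave the mean unchanged.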

Published

2021-01-01