CONCEPT DRIFT IN STREAMING DATA: A SYSTEMATIC LITERATURE REVIEW
Keywords: machine learning, concept drift, systematic literature review, streaming data
The world generates an immense amount of data every minute, and this data must be analyzed to support better decision making. To meet this demand for faster analytics, businesses are adopting efficient stream processing and machine learning techniques. However, data streams are particularly challenging to handle. One of the prominent problems in dealing with streaming data is concept drift, which is described as an unexpected change, observed over time, in the underlying distribution of the streaming data. In this work, we conduct a systematic literature review to identify methods that address the problem of concept drift. We review the most frequently used supervised and unsupervised techniques, and we survey the publicly available artificial and real-world datasets that are commonly used in work on concept drift.
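To make the definition of concept drift concrete, the following minimal Python sketch builds a synthetic stream whose mean shifts once and flags the first point at which a sliding window departs from an initial reference window. It is an illustration only and is not one of the methods covered by the review; the function names, window size, threshold, and synthetic data are all assumptions chosen for this example.

```python
import random
from statistics import mean


def make_stream(n_before, n_after, mean_before, mean_after, seed=0):
    """Synthetic stream (assumed data): Gaussian values whose mean
    changes abruptly after the first n_before items."""
    rng = random.Random(seed)
    before = [rng.gauss(mean_before, 1.0) for _ in range(n_before)]
    after = [rng.gauss(mean_after, 1.0) for _ in range(n_after)]
    return before + after


def first_drift_index(stream, window=100, threshold=1.0):
    """Return the first index at which the mean of the most recent
    `window` items differs from the mean of the first `window` items
    by more than `threshold`; return None if that never happens.

    This is a naive fixed-window comparison used only to illustrate
    the definition; the window and threshold are illustrative values.
    """
    if len(stream) < 2 * window:
        return None
    reference = mean(stream[:window])
    for end in range(2 * window, len(stream) + 1):
        recent = mean(stream[end - window:end])
        if abs(recent - reference) > threshold:
            return end - 1
    return None


# The distribution changes at index 500 (mean 0.0 -> 3.0).
drifting = make_stream(500, 500, 0.0, 3.0)
detected_at = first_drift_index(drifting)

# A stream with no change in distribution should raise no alarm.
stable = make_stream(500, 500, 0.0, 0.0)
stable_result = first_drift_index(stable)
```

In this sketch the alarm fires some time after the true change point, because the recent window must first fill with enough post-change items. That delay reflects the trade-off between detection speed and false alarms that drift detectors have to balance.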