Mining interesting and useful knowledge from the huge amount of data gathered in wireless sensor networks (WSNs) is a challenging task. Works reported in the literature use support metric-based sensor association rules, which employ the occurrence frequency of patterns as the criterion. Such a criterion may not be appropriate for finding significant patterns. Moreover, temporal regularity in occurrence behavior should be considered as another important measure for assessing the importance of patterns in WSNs. Frequent sensor patterns that occur at regular intervals are called regularly frequent sensor patterns. Even though mining regularly frequent sensor patterns from sensor data streams is extremely important in many real-time applications, no such algorithm has been proposed yet. In this paper, we propose a novel tree structure called the Regularly Frequent Sensor Pattern-tree (RSP-tree) and an efficient mining approach for finding regularly frequent sensor patterns from WSNs. Extensive performance analyses show that our technique is time and memory efficient in finding regularly frequent sensor patterns.
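To make the notion of a regularly frequent sensor pattern concrete, the following is a minimal brute-force sketch, not the RSP-tree algorithm itself. The abstract does not give a formal definition, so the sketch assumes one commonly used in regular pattern mining: the support of a pattern is the number of epochs in which all of its sensors fire, and its regularity is the largest gap between consecutive occurrences, counting the gaps to the start and end of the data. The thresholds `min_sup` and `max_reg`, the function names, and the toy data are all hypothetical.

```python
from itertools import combinations


def occurrence_epochs(epochs, pattern):
    """Return the 1-based indices of epochs in which every sensor of `pattern` fired."""
    wanted = set(pattern)
    return [i for i, sensors in enumerate(epochs, start=1) if wanted <= set(sensors)]


def support_and_regularity(epochs, pattern):
    """Support is the occurrence count; regularity is the largest gap between
    consecutive occurrences, including the gaps to the start and end of the data
    (an assumed definition, see the lead-in)."""
    hits = occurrence_epochs(epochs, pattern)
    boundaries = [0] + hits + [len(epochs)]
    max_gap = max(b - a for a, b in zip(boundaries, boundaries[1:]))
    return len(hits), max_gap


def regularly_frequent(epochs, min_sup, max_reg, max_len=2):
    """Enumerate all patterns of up to `max_len` sensors and keep those that are
    both frequent (support >= min_sup) and regular (regularity <= max_reg)."""
    sensors = sorted({s for epoch in epochs for s in epoch})
    result = {}
    for k in range(1, max_len + 1):
        for pattern in combinations(sensors, k):
            sup, reg = support_and_regularity(epochs, pattern)
            if sup >= min_sup and reg <= max_reg:
                result[pattern] = (sup, reg)
    return result


# Toy data: each set lists the sensors that reported an event in one epoch.
epochs = [
    {"s1", "s2"},
    {"s2", "s3"},
    {"s1", "s2"},
    {"s3"},
    {"s1", "s2", "s3"},
    {"s1", "s2"},
]

patterns = regularly_frequent(epochs, min_sup=3, max_reg=2)
```

On this toy data the pair `("s1", "s2")` is kept with support 4 and regularity 2, while `("s2", "s3")` is rejected because it occurs only twice and has a gap of 3 epochs. Exhaustive enumeration grows exponentially with the number of sensors, which is the cost a compact tree structure is meant to avoid.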
Wireless sensor networks (WSNs) will be an integral part of the future Internet of Things (IoT) environment and will generate large volumes of data. However, these data are only of benefit if useful knowledge can be mined from them. A data mining framework for WSNs includes data extraction, storage and mining techniques, and must be efficient and dependable. In this paper, we propose a new type of behavioral pattern mining technique for sensor data, called regularly frequent sensor patterns (RFSPs). RFSPs can identify a set of temporally correlated sensors, which can reveal significant knowledge from the monitored data. A distributed data extraction model to prepare the data required for mining RFSPs is proposed, as the distributed scheme ensures higher availability through greater redundancy. The tree structure for RFSPs is compact, requires less memory and can be constructed using only a single scan through the dataset, and the mining technique is efficient with low runtime. Current mining techniques in the literature on sensor data employ a single memory-based sequential approach and hence are not efficient. Moreover, usage of the MapReduce model for the distributed solution has not been explored extensively. Since MapReduce is becoming the de facto model for computation on large data, we also propose a parallel implementation of the RFSP mining algorithm, called RFSP on Hadoop (RFSP-H), which uses a MapReduce-based framework to gain further efficiency. Experiments conducted to evaluate the compactness and performance of the data extraction model, RFSP-tree and RFSP-H mining show improved results. © 2016 Elsevier Inc. All rights reserved.
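The idea of splitting the work into a map phase and a reduce phase can be illustrated with a small single-process sketch. It is not RFSP-H and does not use Hadoop: it only shows, for single sensors, how per-partition occurrence lists might be emitted and then merged before support and regularity are computed. The partitioning scheme, the function names, the toy data, and the regularity definition (largest gap between consecutive occurrences, including the gaps to the start and end of the data) are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce


def map_partition(start_index, partition):
    """Map phase: for one partition of epochs, emit sensor -> list of global epoch indices."""
    local = defaultdict(list)
    for offset, sensors in enumerate(partition):
        for sensor in sensors:
            local[sensor].append(start_index + offset)
    return dict(local)


def merge(acc, partial):
    """Reduce phase: concatenate the occurrence lists of each sensor across partitions."""
    for sensor, indices in partial.items():
        acc.setdefault(sensor, []).extend(indices)
    return acc


def summarize(occurrences, total_epochs):
    """Compute (support, regularity) per sensor from the merged occurrence lists."""
    stats = {}
    for sensor, indices in occurrences.items():
        bounds = [0] + sorted(indices) + [total_epochs]
        stats[sensor] = (len(indices), max(b - a for a, b in zip(bounds, bounds[1:])))
    return stats


# Toy data: each set lists the sensors that reported an event in one epoch.
epochs = [
    {"s1", "s2"},
    {"s2", "s3"},
    {"s1", "s2"},
    {"s3"},
    {"s1", "s2", "s3"},
    {"s1", "s2"},
]

# Two hypothetical partitions, each tagged with the global index of its first epoch.
partitions = [(1, epochs[:3]), (4, epochs[3:])]
partials = [map_partition(start, part) for start, part in partitions]
merged = reduce(merge, partials, {})
stats = summarize(merged, len(epochs))
```

Because each partition carries its global starting index, the merged occurrence lists are the same as those from a sequential scan, so the statistics do not depend on how the epochs are split.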