Malware detection in edge devices with fuzzy oversampling and dynamic class weighting
- Authors: Khoda, Mahbub , Kamruzzaman, Joarder , Gondal, Iqbal , Imam, Tasadduq , Rahman, Ashfaqur
- Date: 2021
- Type: Text , Journal article
- Relation: Applied Soft Computing Vol. 112, no. (2021), p.
- Full Text: false
- Reviewed:
- Description: In the Internet-of-Things (IoT) domain, edge devices are increasingly used for data accumulation, preprocessing, and analytics. Intelligent integration of edge devices with Artificial Intelligence (AI) facilitates real-time analysis and decision making. However, these devices also provide additional attack opportunities for malware developers, potentially leading to information and financial loss. Machine learning approaches can detect such attacks, but their performance degrades when benign samples substantially outnumber malware samples in the training data. Existing approaches for such imbalanced data assume that samples are represented as continuous features and thus can generate invalid samples when malware applications are represented by binary features. We propose a novel malware oversampling technique that addresses this issue. Further, we propose two approaches for malware detection. Our first approach uses fuzzy set theory, while the second approach dynamically assigns higher priority to malware samples using a novel loss function. Combined with our oversampling technique, these approaches attain an improvement of over 9% in F1 score over competing methods. Our approaches can, therefore, result in enhanced privacy and security in edge computing services. © 2021 Elsevier B.V.
Optimization based clustering algorithms for authorship analysis of phishing emails
- Authors: Seifollahi, Sattar , Bagirov, Adil , Layton, Robert , Gondal, Iqbal
- Date: 2017
- Type: Text , Journal article
- Relation: Neural Processing Letters Vol. 46, no. 2 (2017), p. 411-425
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Phishing has given attackers the power to masquerade as legitimate users of organizations, such as banks, to scam money and private information from victims. Phishing is so widespread that combating phishing attacks could overwhelm the victim organization. It is important to group phishing attacks in order to formulate an effective defence mechanism. In this paper, we use clustering methods to analyze and characterize phishing emails and perform their relative attribution. Emails are first tokenized to a bag-of-words space and then transformed to a numeric vector space using the frequencies of words in documents. The WordNet vocabulary is used to account for the effect of similar words and to reduce sparsity. The word similarity measure is combined with the term frequencies to introduce a novel transformation of text into numeric features. To improve accuracy, we apply inverse document frequency weighting, which gives higher weights to features used by fewer authors. The k-means algorithm and three recently introduced optimization-based algorithms (MS-MGKM, INCA and DCClust) are applied for clustering. The optimization-based algorithms indicate the existence of well-separated clusters in the phishing emails dataset. © 2017, Springer Science+Business Media New York.
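The term-frequency and inverse-document-frequency steps mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' transformation: the WordNet similarity step and the clustering algorithms are omitted, the IDF is computed over documents rather than authors, and the names `tokenize`, `tfidf_vectors` and the sample `emails` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Plain IDF = log(N / df); a term found in every document gets weight 0.
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Assumed toy corpus, for illustration only.
emails = [
    "verify your account now",
    "verify your password now",
    "your invoice is attached",
]
vocab, vectors = tfidf_vectors(emails)
```

In this toy corpus the word "your" occurs in every email and so receives zero weight, while a word confined to one email, such as "invoice", is weighted most heavily. The resulting vectors could then be passed to any clustering routine.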
A data mining approach for machine fault diagnosis based on associated frequency patterns
- Authors: Rashid, Md. Mamunur , Amar, Muhammad , Gondal, Iqbal , Kamruzzaman, Joarder
- Date: 2016
- Type: Text , Journal article
- Relation: Applied Intelligence Vol. 45, no. 3 (2016), p. 638-651
- Full Text: false
- Reviewed:
- Description: Bearings play a crucial role in rotational machines and their failure is one of the foremost causes of breakdowns in rotary machinery. Their functionality is directly relevant to the operational performance, service life and efficiency of these machines. Therefore, bearing fault identification is very significant. The accuracy of fault or anomaly detection by current techniques is not adequate. We propose a data mining-based framework for fault identification and anomaly detection from machine vibration data. In this framework, to capture the useful knowledge from the vibration data stream (VDS), we first pre-process the data using the Fast Fourier Transform (FFT) to extract the frequency signature, then build a compact tree called the SAFP-tree (sliding window associated frequency pattern tree), and propose a mining algorithm called SAFP. Our SAFP algorithm can mine associated frequency patterns (i.e., fault frequency signatures) in the current window of the VDS and use them to identify faults in the bearing data. Finally, SAFP is further enhanced to SAFP-AD for anomaly detection by determining the normal behavior measure (NBM) from the extracted frequency patterns. The results show that our technique is very efficient in identifying faults and detecting anomalies over the VDS and can be used for remote machine health diagnosis. © 2016, Springer Science+Business Media New York.
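The FFT pre-processing step named in this abstract can be sketched as follows. This is only an illustration of extracting dominant frequencies from one window of a signal: the SAFP-tree and the SAFP and SAFP-AD algorithms are not reproduced, and the function names, the 64 Hz sample rate and the synthetic two-tone signal are all assumptions.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequencies(window, sample_rate, top=2):
    """Return the `top` frequencies (Hz) with the largest spectral magnitude."""
    n = len(window)
    spectrum = fft(window)
    # Keep positive-frequency bins only and skip the DC component (bin 0).
    bins = [(abs(spectrum[k]), k * sample_rate / n) for k in range(1, n // 2)]
    bins.sort(reverse=True)
    return [freq for _, freq in bins[:top]]


# Assumed synthetic window: a strong 5 Hz tone plus a weaker 12 Hz tone.
sample_rate = 64
window = [
    math.sin(2 * math.pi * 5 * t / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(64)
]
peaks = dominant_frequencies(window, sample_rate)
```

Here `peaks` holds the two strongest frequency components of the window, ordered by magnitude. In a streaming setting the same function would be applied to each sliding window before any pattern mining.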
On temporal order invariance for view-invariant action recognition
- Authors: Ul-Haq, Anwaar , Gondal, Iqbal , Murshed, Manzur
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Circuits and Systems for Video Technology Vol. 23, no. 2 (2013), p. 203-211
- Full Text: false
- Reviewed:
- Description: View-invariant action recognition is one of the most challenging problems in computer vision. Various representations are being devised for matching actions across different viewpoints to achieve view invariance. In this paper, we explore the invariance property of the temporal order of action instances during action execution and utilize it to devise a new view-invariant action recognition approach. To ensure temporal order during matching, we utilize spatiotemporal features, feature fusion and a temporal order consistency constraint. We start by extracting spatiotemporal cuboid features from video sequences and applying feature fusion to encapsulate within-class similarity for the same viewpoints. For each action class, we construct a feature fusion table to facilitate feature matching across different views. An action matching score is then calculated based on a global temporal order constraint and the number of matching features. Finally, the action label of the class with the maximum matching score is assigned to the query action. Experimentation is performed on the multiple-view Inria Xmas motion acquisition sequences and West Virginia University action datasets, with encouraging results that are comparable to existing view-invariant action recognition techniques.
How to improve postgenomic knowledge discovery using imputation
- Authors: Sehgal, Muhammad Shoaib B , Gondal, Iqbal , Dooley, Laurence , Coppel, Ross
- Date: 2009
- Type: Text , Journal article
- Relation: Eurasip Journal on Bioinformatics and Systems Biology Vol. 2009, no. 1 (2009), p. 1-14
- Full Text:
- Reviewed:
- Description: While microarrays make it feasible to rapidly investigate many complex biological problems, their multistep fabrication has the proclivity for error at every stage. The standard tactic has been to either ignore or regard erroneous gene readings as missing values, though this assumption can exert a major influence upon postgenomic knowledge discovery methods like gene selection and gene regulatory network (GRN) reconstruction. This has been the catalyst for a raft of new flexible imputation algorithms, including local least square impute and the recent heuristic collateral missing value imputation, which exploit the biological transactional behaviour of functionally correlated genes to afford accurate missing value estimation. This paper examines the influence of missing value imputation techniques upon postgenomic knowledge inference methods, with results for various algorithms consistently corroborating that, instead of ignoring missing values, recycling microarray data by flexible and robust imputation can provide substantial performance benefits for subsequent downstream procedures.