A nethack learning environment language wrapper for autonomous agents
- Authors: Goodger, Nikolaj , Vamplew, Peter , Foale, Cameron , Dazeley, Richard
- Date: 2023
- Type: Text , Journal article
- Relation: Journal of Open Research Software Vol. 11, no. (2023), p.
- Full Text:
- Reviewed:
- Description: This paper describes a language wrapper for the NetHack Learning Environment (NLE) [1]. The wrapper replaces the non-language observations and actions with comparable language versions. The NLE offers a grand challenge for AI research while MiniHack [2] extends this potential to more specific and configurable tasks. By providing a language interface, we can enable further research on language agents and directly connect language models to a versatile environment. © 2023 The Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.
Evolved similarity techniques in malware analysis
- Authors: Black, Paul , Gondal, Iqbal , Vamplew, Peter , Lakhotia, Arun
- Date: 2019
- Type: Text , Conference proceedings
- Relation: 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), 5-8th Aug, 2019 p. 404-410
- Full Text: false
- Reviewed:
- Description: Malware authors are known to reuse existing code. This development process results in software evolution and a sequence of versions of a malware family containing functions that diverge from the initial version. This paper proposes the term evolved similarity to account for this gradual divergence of similarity across the version history of a malware family. While existing techniques are able to match functions in different versions of malware, these techniques work best when the version changes are relatively small. This paper introduces the concept of evolved similarity and presents automated Evolved Similarity Techniques (EST). EST differs from existing malware function similarity techniques by focusing on the identification of significantly modified functions in adjacent malware versions, and it may also be used to identify function similarity in malware samples that differ by several versions. The challenge in identifying evolved malware function pairs lies in identifying features that are relatively invariant across evolved code. The research in this paper makes use of the function call graph to establish these features and then demonstrates the use of these techniques using Zeus malware.
Patient admission prediction using a pruned fuzzy min-max neural network with rule extraction
- Authors: Wang, Jin , Lim, Cheepeng , Creighton, Douglas , Khosravi, Abbas , Nahavandi, Saeid , Ugon, Julien , Vamplew, Peter , Stranieri, Andrew , Martin, Laura , Freischmidt, Anton
- Date: 2015
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 26, no. 2 (2015), p. 277-289
- Full Text: false
- Reviewed:
- Description: A useful patient admission prediction model that helps the emergency department of a hospital admit patients efficiently is of great importance. It not only improves the care quality provided by the emergency department but also reduces waiting time of patients. This paper proposes an automatic prediction method for patient admission based on a fuzzy min–max neural network (FMM) with rules extraction. The FMM neural network forms a set of hyperboxes by learning through data samples, and the learned knowledge is used for prediction. In addition to providing predictions, decision rules are extracted from the FMM hyperboxes to provide an explanation for each prediction. In order to simplify the structure of FMM and the decision rules, an optimization method that simultaneously maximizes prediction accuracy and minimizes the number of FMM hyperboxes is proposed. Specifically, a genetic algorithm is formulated to find the optimal configuration of the decision rules. The experimental results using a large data set consisting of 450740 real patient records reveal that the proposed method achieves comparable or even better prediction accuracy than state-of-the-art classifiers with the additional ability to extract a set of explanatory rules to justify its predictions.
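The link between hyperboxes and explanatory rules described in the abstract above can be illustrated with a minimal sketch. It assumes a hyperbox is stored as a min point and a max point with a class label; the `Hyperbox` class, the feature names, and the crisp containment test are hypothetical simplifications (a fuzzy min-max network uses a graded membership function rather than a hard inside/outside test), and nothing here reproduces the paper's pruning or genetic algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hyperbox:
    """Axis-aligned box in feature space with an associated class label (illustrative only)."""
    min_point: List[float]
    max_point: List[float]
    label: str

    def contains(self, x: List[float]) -> bool:
        # Crisp containment check; a real FMM network would compute a fuzzy membership value.
        return all(lo <= xi <= hi
                   for lo, xi, hi in zip(self.min_point, x, self.max_point))

    def to_rule(self, feature_names: List[str]) -> str:
        # Each dimension of the box becomes one interval condition in the rule.
        conditions = [f"{lo} <= {name} <= {hi}"
                      for name, lo, hi in zip(feature_names, self.min_point, self.max_point)]
        return "IF " + " AND ".join(conditions) + f" THEN {self.label}"


# Hypothetical normalised features, not taken from the paper's data set.
box = Hyperbox([0.2, 0.5], [0.6, 0.9], "admit")
rule = box.to_rule(["age", "triage"])
```

Fewer hyperboxes mean fewer rules of this form, which is one reason simplifying the network also simplifies the explanation.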
Detecting K-complexes for sleep stage identification using nonsmooth optimization
- Authors: Moloney, David , Sukhorukova, Nadezda , Vamplew, Peter , Ugon, Julien , Li, Gang , Beliakov, Gleb , Philippe, Carole , Amiel, Hélène , Ugon, Adrien
- Date: 2012
- Type: Text , Journal article
- Relation: ANZIAM Journal Vol. 52, no. 4 (2012), p. 319-332
- Full Text:
- Reviewed:
- Description: The process of sleep stage identification is a labour-intensive task that involves the specialized interpretation of the polysomnographic signals captured from a patient's overnight sleep session. Automating this task has proven to be challenging for data mining algorithms because of noise, complexity and the extreme size of data. In this paper we apply nonsmooth optimization to extract key features that lead to better accuracy. We develop a specific procedure for identifying K-complexes, a special type of brain wave crucial for distinguishing sleep stages. The procedure contains two steps. We first extract "easily classified" K-complexes, and then apply nonsmooth optimization methods to extract features from the remaining data and refine the results from the first step. Numerical experiments show that this procedure is efficient for detecting K-complexes. It is also found that most classification methods perform significantly better on the extracted features. © 2012 Australian Mathematical Society.
Optimization and matrix constructions for classification of data
- Authors: Kelarev, Andrei , Yearwood, John , Vamplew, Peter , Abawajy, Jemal , Chowdhury, Morshed
- Date: 2011
- Type: Journal article
- Relation: New Zealand Journal of Mathematics Vol. 41, no. 2011 (2011), p. 65-73
- Full Text:
- Reviewed:
- Description: Max-plus algebras and more general semirings have many useful applications and have been actively investigated. On the other hand, structural matrix rings are also well known and have been considered by many authors. The main theorem of this article completely describes all optimal ideals in the more general structural matrix semirings. Originally, our investigation of these ideals was motivated by applications in data mining for the design of multiple classification systems combining several individual classifiers.
An online scalarization multi-objective reinforcement learning algorithm : TOPSIS Q-learning
- Authors: Mirzanejad, Mohammad , Ebrahimi, Morteza , Vamplew, Peter , Veisi, Hadi
- Date: 2022
- Type: Text , Journal article
- Relation: Knowledge Engineering Review Vol. 37, no. 4 (2022), p.
- Full Text: false
- Reviewed:
- Description: Conventional reinforcement learning focuses on problems with a single objective. However, many problems have multiple objectives or criteria that may be independent, related, or contradictory. In such cases, multi-objective reinforcement learning is used to propose a compromise among the solutions to balance the objectives. TOPSIS is a multi-criteria decision method that selects the alternative with minimum distance from the positive ideal solution and maximum distance from the negative ideal solution, so it can be used effectively in the decision-making process to select the next action. In this research a single-policy algorithm called TOPSIS Q-Learning is provided, with a focus on its performance in online mode. Unlike other single-policy methods, the first version of the algorithm does not require the user to specify the weights of the objectives. The user's preferences may not be completely definite, so all weight preferences are combined as decision criteria and a solution is generated by considering all these preferences at once. The user can thereby model the uncertainty and weight changes of objectives around their specified preferences. If the user only wants to apply the algorithm for a specific set of weights, the second version of the algorithm efficiently accomplishes that.
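The TOPSIS selection rule mentioned in the abstract above can be sketched as follows. This is a generic textbook-style TOPSIS applied to per-action value vectors, not the paper's TOPSIS Q-Learning algorithm; the function names `topsis_scores` and `select_action`, the vector normalisation, and the assumption that every objective is to be maximised are all illustrative choices.

```python
import math
from typing import List, Sequence


def topsis_scores(alternatives: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Closeness of each alternative to the positive ideal (higher is better).

    Assumes every criterion is a benefit (larger is better).
    """
    n = len(weights)
    # Vector-normalise each criterion column; guard against an all-zero column.
    norms = [math.sqrt(sum(a[j] ** 2 for a in alternatives)) or 1.0 for j in range(n)]
    weighted = [[weights[j] * a[j] / norms[j] for j in range(n)] for a in alternatives]
    ideal = [max(row[j] for row in weighted) for j in range(n)]
    anti_ideal = [min(row[j] for row in weighted) for j in range(n)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)       # distance to the positive ideal solution
        d_neg = math.dist(row, anti_ideal)  # distance to the negative ideal solution
        total = d_pos + d_neg
        scores.append(d_neg / total if total > 0 else 0.5)
    return scores


def select_action(q_vectors: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> int:
    """Greedy choice: index of the action with the highest TOPSIS closeness."""
    scores = topsis_scores(q_vectors, weights)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-objective value estimates for three actions.
q = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.8]]
best = select_action(q, [0.5, 0.5])
```

With equal weights, the balanced third action is closest to the ideal point and farthest from the anti-ideal point, so it is the one selected.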
Using stereotypes to improve early-match poker play
- Authors: Layton, Robert , Vamplew, Peter , Turville, Christopher
- Date: 2008
- Type: Text , Journal article
- Relation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 5360 LNAI, no. (1 December 2008 through 5 December 2008 2008), p. 584-593
- Full Text: false
- Description: Agent modelling is a critical aspect of many artificial intelligence systems. Many different techniques are used to learn the tendencies of another agent, though most suffer from a slow learning time. The research proposed in this paper examines stereotyping as a method to improve the learning time of poker playing agents. Poker is a difficult domain for opponent modelling due to its hidden information, stochastic elements and complex strategies. However, the literature suggests there are clusters of similar poker strategies, making it an ideal environment to test the effectiveness of stereotyping. This paper presents a method for using stereotyping in a poker bot, and shows that stereotyping improves performance in early-match play in many scenarios. © 2008 Springer Berlin Heidelberg.
Rapid anomaly detection using integrated prudence analysis (IPA)
- Authors: Maruatona, Omaru , Vamplew, Peter , Dazeley, Richard , Watters, Paul
- Date: 2018
- Type: Text , Conference proceedings
- Relation: PAKDD 2018.Trends and Applications in Knowledge Discovery and Data Mining. p. 137-141
- Full Text: false
- Reviewed:
- Description: Integrated Prudence Analysis has been proposed as a method to maximize the accuracy of rule based systems. The paper presents evaluation results of the three Prudence methods on public datasets which demonstrate that combining attribute-based and structural Prudence produces a net improvement in Prudence Accuracy.
Taming the devil: a game based approach to teaching immunology
- Authors: Nankervis, Scott , Meredith, Grant , Vamplew, Peter , Fotinatos, Nina
- Date: 2012
- Type: Text , Conference paper
- Relation: ascilite 2012: Future challenges, sustainable futures
- Full Text: false
- Reviewed:
- Description: Immunology is a complex field requiring rapid memorisation of numerous components. An in-depth understanding of cellular and molecular biology is required before even moderately advanced concepts can be taught. We sought methods that actively engage students and help develop new knowledge and consolidate existing concepts to support lectures. We created an interactive and entertaining prototype immunology computer game as a tool for learning and revision, with the ability to interactively cover course content outside of class that modern learners expect. Our prototype appears to be a successful study aid when used in addition to attendance at lectures. We seek to continue the development of the game in a higher education context, but also produce a modified version for a secondary school context, in an effort to raise the profile of this key health area and promote learning for the future through the study of the sciences prior to students entering higher education.
Unsupervised color textured image segmentation using cluster ensembles and MRF model
- Authors: Islam, Mofakharul , Yearwood, John , Vamplew, Peter
- Date: 2008
- Type: Text , Book chapter
- Relation: Advances in computer and information sciences and engineering p. 323-328
- Full Text: false
- Reviewed:
- Description: We propose a novel, robust, unsupervised approach to color image content understanding that segments a color image into its constituent parts automatically. The aim of this work is to produce precise segmentation of color images using color and texture information along with neighborhood relationships among image pixels, which will provide more accuracy in segmentation. Here, unsupervised means automatic discovery of classes or clusters in images rather than generating the class or cluster descriptions from training image sets. As a whole, in this particular work, the problem we want to investigate is to implement a robust unsupervised SVFM model based color medical image segmentation tool using Cluster Ensembles and an MRF model along with wavelet transforms for increasing the content sensitivity of the segmentation model. In addition, Cluster Ensemble has been utilized for introducing a robust technique for finding the number of components in an image automatically. The experimental results reveal that the proposed tool is able to find the accurate number of objects or components in a color image and is ultimately capable of producing more accurate and faithful segmentation. A statistical model based approach has been developed to estimate the Maximum a posteriori (MAP) to identify the different objects/components in a color image. The approach utilizes a Markov Random Field model to capture the relationships among the neighboring pixels and integrate that information into the Expectation Maximization (EM) model fitting MAP algorithm. The algorithm simultaneously calculates the model parameters and segments the pixels iteratively in an interleaved manner. It converges to a solution where the model parameters and pixel labels are stabilized within a specified criterion. Finally, we have compared our results with another well-known segmentation approach.
A polynomial ring construction for the classification of data
- Authors: Kelarev, Andrei , Yearwood, John , Vamplew, Peter
- Date: 2009
- Type: Text , Journal article
- Relation: Bulletin of the Australian Mathematical Society Vol. 79, no. 2 (2009), p. 213-225
- Full Text:
- Reviewed:
- Description: Drensky and Lakatos (Lecture Notes in Computer Science, 357 (Springer, Berlin, 1989), pp. 181-188) have established a convenient property of certain ideals in polynomial quotient rings, which can now be used to determine error-correcting capabilities of combined multiple classifiers following a standard approach explained in the well-known monograph by Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques (Elsevier, Amsterdam, 2005)). We strengthen and generalise the result of Drensky and Lakatos by demonstrating that the corresponding nice property remains valid in a much larger variety of constructions and applies to more general types of ideals. Examples show that our theorems do not extend to larger classes of ring constructions and cannot be simplified or generalised.
Constructing stochastic mixture policies for episodic multiobjective reinforcement learning tasks
- Authors: Vamplew, Peter , Dazeley, Richard , Barker, Ewan , Kelarev, Andrei
- Date: 2009
- Type: Text , Book chapter
- Relation: AI 2009 : Advances in Artificial Intelligence : 22nd Australasian Joint Conference, Melbourne, Australia, December 1-4, 2009. Proceedings Chapter p. 340-349
- Full Text:
- Description: Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for deriving mixture policies for episodic multiobjective tasks from deterministic base policies found via scalarised reinforcement learning. It is shown that these approaches are an efficient means of identifying solutions which offer a superior match to the user’s preferences than can be achieved by methods based strictly on deterministic policies.
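The idea of a mixture policy in the abstract above can be illustrated with a small sketch. It assumes, as an interpretation of that abstract, that a mixture policy picks one deterministic base policy at random at the start of each episode, so that the expected return is a convex combination of the base policies' returns. The function names and the return vectors are hypothetical, and this is not either of the two derivation methods the paper proposes.

```python
import random
from typing import List, Sequence


def mixture_probability(return_a: Sequence[float], return_b: Sequence[float],
                        objective: int, target: float) -> float:
    """Probability of running policy A so the expected return on `objective` equals `target`."""
    span = return_a[objective] - return_b[objective]
    if span == 0:
        raise ValueError("base policies do not differ on this objective")
    p = (target - return_b[objective]) / span
    if not 0.0 <= p <= 1.0:
        raise ValueError("target lies outside the range spanned by the base policies")
    return p


def expected_return(return_a: Sequence[float], return_b: Sequence[float],
                    p: float) -> List[float]:
    """Expected per-episode return vector when A is chosen with probability p."""
    return [p * a + (1.0 - p) * b for a, b in zip(return_a, return_b)]


def pick_policy_for_episode(p: float, rng=random) -> str:
    """Sample which base policy to follow for the whole of the next episode."""
    return "A" if rng.random() < p else "B"


# Hypothetical average returns of two deterministic base policies on two objectives.
ret_a, ret_b = [10.0, 2.0], [4.0, 8.0]
p = mixture_probability(ret_a, ret_b, objective=0, target=7.0)
mixed = expected_return(ret_a, ret_b, p)
```

Neither base policy alone averages 7.0 on the first objective, but mixing them does, which is the kind of closer match to a user's preferences that deterministic policies alone cannot offer.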
Levels of explainable artificial intelligence for human-aligned conversational explanations
- Authors: Dazeley, Richard , Vamplew, Peter , Foale, Cameron , Young, Cameron , Aryal, Sunil , Cruz, Francisco
- Date: 2021
- Type: Text , Journal article
- Relation: Artificial Intelligence Vol. 299, no. (2021), p.
- Full Text:
- Reviewed:
- Description: Over the last few years there has been rapid research growth into eXplainable Artificial Intelligence (XAI) and the closely aligned Interpretable Machine Learning (IML). Drivers for this growth include recent legislative changes and increased investments by industry and governments, along with increased concern from the general public. People are affected by autonomous decisions every day and the public need to understand the decision-making process to accept the outcomes. However, the vast majority of the applications of XAI/IML are focused on providing low-level ‘narrow’ explanations of how an individual decision was reached based on a particular datum. While important, these explanations rarely provide insights into an agent's: beliefs and motivations; hypotheses of other (human, animal or AI) agents' intentions; interpretation of external cultural expectations; or, processes used to generate its own explanation. Yet all of these factors, we propose, are essential to providing the explanatory depth that people require to accept and trust the AI's decision-making. This paper aims to define levels of explanation and describe how they can be integrated to create a human-aligned conversational explanation system. In so doing, this paper will survey current approaches and discuss the integration of different technologies to achieve these levels with Broad eXplainable Artificial Intelligence (Broad-XAI), and thereby move towards high-level ‘strong’ explanations. © 2021 Elsevier B.V.
API based discrimination of ransomware and benign cryptographic programs
- Authors: Black, Paul , Sohail, Ammar , Gondal, Iqbal , Kamruzzaman, Joarder , Vamplew, Peter , Watters, Paul
- Date: 2020
- Type: Text , Conference paper
- Relation: 27th International Conference on Neural Information Processing, ICONIP 2020, Bangkok, 18 to 22 November 2020, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12533 LNCS, p. 177-188
- Full Text: false
- Reviewed:
- Description: Ransomware is a widespread class of malware that encrypts files in a victim’s computer and extorts victims into paying a fee to regain access to their data. Previous research has proposed methods for ransomware detection using machine learning techniques. However, this research has not examined the precision of ransomware detection. While existing techniques show an overall high accuracy in detecting novel ransomware samples, previous research does not investigate the discrimination of novel ransomware from benign cryptographic programs. This is a critical, practical limitation of current research; machine learning based techniques would be limited in their practical benefit if they generated too many false positives (at best) or deleted/quarantined critical data (at worst). We examine the ability of machine learning techniques based on Application Programming Interface (API) profile features to discriminate novel ransomware from benign-cryptographic programs. This research provides a ransomware detection technique that provides improved detection accuracy and precision compared to other API profile based ransomware detection techniques while using significantly simpler features than previous dynamic ransomware detection research. © 2020, Springer Nature Switzerland AG.
Reanimating historic malware samples
- Authors: Black, Paul , Gondal, Iqbal , Vamplew, Peter , Lakhotia, Arun
- Date: 2021
- Type: Text , Book chapter
- Relation: Malware Analysis Using Artificial Intelligence and Deep Learning p. 345-360
- Full Text: false
- Reviewed:
- Description: Many types of malicious software are controlled from an attacker’s command and control (C2) servers. Anti-virus organizations seek to defeat malware attacks by requesting removal of C2 server Domain Name Server (DNS) records. As a result, the life span of most malware samples is relatively short. Large datasets of historical malware samples are available for countermeasures research. However, due to the age of these malware samples, their C2 servers are no longer available. To cope with high volumes of malware production, malware analysis is increasingly performed using machine learning techniques. Dynamic analysis is commonly used for feature extraction. However, due to the absence of their C2 servers, after initialization, malware samples may exit or loop attempting to establish C2 server connections and, as a result, no longer exhibit their original capabilities. Therefore, partial execution of historical malware samples in a sandbox results in features that differ from those that would be extracted in-the-wild, thus invalidating the results of any machine learning research based on these features. One approach to extracting accurate features is to build an emulated C2 server to provide an environment that allows control of the full capabilities of the malware in an isolated environment. To illustrate the benefits of building C2 server emulators, this chapter provides examples of techniques for the creation of C2 server emulators for three malware families (Zeus, CryptoWall, and CryptoLocker) using manual reverse engineering techniques and a review of semi-automated techniques for the construction of C2 server emulators.
Applying clustering and ensemble clustering approaches to phishing profiling
- Authors: Webb, Dean , Yearwood, John , Vamplew, Peter , Ma, Liping , Ofoghi, Bahadorreza , Kelarev, Andrei
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at Eighth Australasian Data Mining Conference, AusDM 2009, University of Melbourne, Melbourne, Victoria : 1st–4th December 2009
- Full Text:
Softmax exploration strategies for multiobjective reinforcement learning
- Authors: Vamplew, Peter , Dazeley, Richard , Foale, Cameron
- Date: 2017
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 263, no. (2017), p. 74-86
- Full Text:
- Reviewed:
- Description: Despite growing interest over recent years in applying reinforcement learning to multiobjective problems, there has been little research into the applicability and effectiveness of exploration strategies within the multiobjective context. This work considers several widely-used approaches to exploration from the single-objective reinforcement learning literature, and examines their incorporation into multiobjective Q-learning. In particular this paper proposes two novel approaches which extend the softmax operator to work with vector-valued rewards. The performance of these exploration strategies is evaluated across a set of benchmark environments. Issues arising from the multiobjective formulation of these benchmarks which impact on the performance of the exploration strategies are identified. It is shown that of the techniques considered, the combination of the novel softmax–epsilon exploration with optimistic initialisation provides the most effective trade-off between exploration and exploitation.
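As background for the abstract above, softmax (Boltzmann) exploration over vector-valued action estimates can be sketched in its simplest form by first collapsing each vector to a scalar. The linear weighting used here is a stand-in chosen for illustration; it is not one of the two novel operators the paper proposes, and the function names, weights, and temperature are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def softmax_probs(values: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Boltzmann distribution over actions; subtracting the max keeps exp() stable."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def scalarised_softmax_action(q_vectors: Sequence[Sequence[float]],
                              weights: Sequence[float],
                              temperature: float = 1.0,
                              rng=random) -> Tuple[int, List[float]]:
    """Sample an action after linearly scalarising each Q-vector (illustrative stand-in)."""
    scalars = [sum(w * q for w, q in zip(weights, qv)) for qv in q_vectors]
    probs = softmax_probs(scalars, temperature)
    action = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return action, probs


# Hypothetical two-objective estimates for three actions.
action, probs = scalarised_softmax_action(
    [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]], weights=[0.5, 0.5], temperature=0.5)
```

Lowering the temperature concentrates probability on the highest-valued action, while raising it moves the distribution towards uniform exploration.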
Visualising the value of water
- Authors: Block, Jessica , Graymore, Michelle , Wallis, Anne , Vamplew, Peter , Mitchell, Bradley , O'Toole, Kevin , McRae-Williams, Pamela
- Date: 2012
- Type: Text , Book chapter
- Relation: Pipes, Ponds and People: Adaptive water management p. 195-225
- Full Text: false
- Reviewed:
Griefers versus the Griefed - what motivates them to play Massively Multiplayer Online Role-Playing Games?
- Authors: Achterbosch, Leigh , Miller, Charlynn , Turville, Christopher , Vamplew, Peter
- Date: 2014
- Type: Text , Journal article
- Relation: The Computer Games Journal Vol. 3, no. 1 (2014), p. 5-18
- Full Text:
- Reviewed:
- Description: 'Griefing' is a term used to describe when a player within a multiplayer online environment intentionally disrupts another player’s game experience for his or her own personal enjoyment or gain. Every day a certain percentage of users of Massively Multiplayer Online Role-Playing Games (MMORPG) are experiencing some form of griefing. There have been studies conducted in the past that attempted to ascertain the factors that motivate users to play MMORPGs. A limited number of studies specifically examined the motivations of users who perform griefing (who are also known as 'griefers'). However, those studies did not examine the motivations of users subjected to griefing. Therefore, the aim of this paper is to examine the factors that motivate the subjects of griefing to play MMORPGs, as well as the factors motivating the griefers. The authors conducted an online survey with the intention to discover the motivations for playing MMORPGs among those who identified themselves as (i) those that perform griefing, and (ii) those who have been subjected to griefing. A previously devised motivational model by Nick Yee that incorporated ten factors was used to determine the respondents’ motivational trends. In general, players who identified themselves as griefers were more likely to be motivated by all three 'achievement' sub-factors (advancement, game mechanics and competition) to the detriment of all other factors. The subjects of griefing were highly motivated by 'advancement' and 'mechanics', but they ranked 'competition' significantly lower (compared to the griefers). In addition, 'immersion' factors were rated highly by the respondents who were subjected to griefing, with a significantly higher rating of the 'escapism' factor (compared with rankings by griefers). In comparison to the griefers, the respondents subjected to griefing who had many years’ experience in the genre of MMORPGs also placed a greater emphasis on the 'socializing' and 'relationship' factors. Overall, the griefers in this survey considered 'achievement' to be a prime motivating factor, whereas the griefed players tended to be motivated by all ten factors to a similar degree.
Identifying cross-version function similarity using contextual features
- Authors: Black, Paul , Gondal, Iqbal , Vamplew, Peter , Lakhotia, Arun
- Date: 2020
- Type: Text , Conference paper
- Relation: 19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2020 p. 810-818
- Full Text: false
- Reviewed:
- Description: The identification of similar functions in malware assists analysis by supporting the exclusion of functions that have been previously analysed, allows the identification of new variants, supports authorship attribution, and the analysis of malware phylogeny. A function's context is a set comprising the function itself and all the program functions that may be executed when this function is called. Contextual features consist of data that is extracted from the functions contained in the function context. This paper presents a novel technique called Cross Version Contextual Function Similarity (CVCFS) to identify function pairs in two programs using features based on both individual functions and function context. The CVCFS technique uses Support Vector Machine (SVM) machine learning of function similarity features to pre-filter function pairs and then applies an edit distance technique using function semantics to reduce false positives. A case study is provided where individual and contextual features are extracted from three versions of Zeus malware. The SVM pre-filtering, followed by the use of an edit distance technique to filter false positives, gives a function pair identification accuracy of 85 percent. © 2020 IEEE.
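The second stage described in the abstract above, an edit distance used to filter false positives, can be illustrated with a generic Levenshtein distance over sequences. How CVCFS actually encodes function semantics is not stated in the abstract, so the token sequences below are hypothetical, and the `similarity` normalisation is an assumption made for this sketch.

```python
from typing import Sequence


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute, or match at no cost
        prev = cur
    return prev[-1]


def similarity(a: Sequence, b: Sequence) -> float:
    """Distance normalised to [0, 1], where 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical per-function token sequences from two program versions.
old_version = ["push", "mov", "call", "test", "ret"]
new_version = ["push", "mov", "call", "cmp", "jz", "ret"]
score = similarity(old_version, new_version)
```

A candidate pair that passes a classifier pre-filter could then be rejected when such a score falls below a chosen threshold; the threshold itself would need tuning on labelled pairs.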