The Combative Accretion Model : Multiobjective Optimisation Without Explicit Pareto Ranking
- Authors: Berry, Adam , Vamplew, Peter
- Date: 2005
- Type: Text , Conference paper
- Relation: Paper presented at Third International Conference, EMO 2005: Evolutionary multi-criterion optimization, Guanajuato, Mexico : 9-11 March 2005, p. 77-91
- Description: Contemporary evolutionary multiobjective optimisation techniques are becoming increasingly focussed on the notions of archiving, explicit diversity maintenance and population-based Pareto ranking to achieve good approximations of the Pareto front. While it is certainly true that these techniques have been effective, they come at a significant complexity cost that ultimately limits their application to complex problems. This paper proposes a new model that moves away from explicit population-wide Pareto ranking, abandons both complex archiving and diversity measures and incorporates a continuous accretion-based approach that is divergent from the discretely generational nature of traditional evolutionary algorithms. Results indicate that the new approach, the Combative Accretion Model (CAM), achieves markedly better approximations than NSGA across a range of well-recognised test functions. Moreover, CAM is more efficient than NSGAII with respect to the number of comparisons (by an order of magnitude), while achieving comparable, and generally preferable, fronts.
- Description: 2003002711
An efficient approach to unbounded bi-objective archives : Introducing the Mak_Tree algorithm
- Authors: Vamplew, Peter , Berry, Adam
- Date: 2006
- Type: Text , Conference paper
- Relation: Paper presented at GECCO 2006, 8th Annual Genetic and Evolutionary Computation Conference, Seattle, USA : 8th July, 2006
- Full Text: false
- Description: Given the prominence of elite archiving in contemporary multiobjective optimisation research and the limitations inherent in bounded population sizes, it is unusual that the vast majority of popular techniques aggressively truncate the capacity of archives and are based upon inefficient list representations. By forming better data structures and algorithms for the storage of archival members, the need for truncation is reduced and unbounded elite sets become viable. While work does exist in this vein, it is always of a general nature and significant improvements can be made in the bi-objective case. As such, this paper elucidates the unique properties of two-dimensional non-dominated sets and capitalises on these notions to develop the highly efficient and specialised bi-objective Mak_Tree algorithm. Theoretical results indicate that the specialised approach is preferable to pre-existing general techniques, while empirical analysis illustrates improved performance over both unbounded and bounded list techniques.
- Description: E1
- Description: 2003001715
On the limitations of scalarisation for multi-objective reinforcement learning of Pareto fronts
- Authors: Vamplew, Peter , Yearwood, John , Dazeley, Richard , Berry, Adam
- Date: 2008
- Type: Text , Conference paper
- Relation: Paper presented at 21st Australasian Joint Conference on Artificial Intelligence, Auckland, New Zealand : 1st-5th December 2008 Vol. 5360, p. 372-378
- Full Text: false
- Description: Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting objectives. This paper argues for designing MORL systems to produce a set of solutions approximating the Pareto front, and shows that the common MORL technique of scalarisation has fundamental limitations when used to find Pareto-optimal policies. The work is supported by the presentation of three new MORL benchmarks with known Pareto fronts.
- Description: 2003006504
Empirical evaluation methods for multiobjective reinforcement learning algorithms
- Authors: Vamplew, Peter , Dazeley, Richard , Berry, Adam , Issabekov, Rustam , Dekker, Evan
- Date: 2011
- Type: Text , Journal article
- Relation: Machine Learning Vol. 84, no. 1-2 (2011), p. 51-80
- Full Text: false
- Description: While a number of algorithms for multiobjective reinforcement learning have been proposed, and a small number of applications developed, there has been very little rigorous empirical evaluation of the performance and limitations of these algorithms. This paper proposes standard methods for such empirical evaluation, to act as a foundation for future comparative studies. Two classes of multiobjective reinforcement learning algorithms are identified, and appropriate evaluation metrics and methodologies are proposed for each class. A suite of benchmark problems with known Pareto fronts is described, and future extensions and implementations of this benchmark suite are discussed. The utility of the proposed evaluation methods is demonstrated via an empirical comparison of two example learning algorithms. © 2010 The Author(s).
Steering approaches to Pareto-optimal multiobjective reinforcement learning
- Authors: Vamplew, Peter , Issabekov, Rustam , Dazeley, Richard , Foale, Cameron , Berry, Adam , Moore, Tim , Creighton, Douglas
- Date: 2017
- Type: Text , Journal article
- Relation: Neurocomputing Vol. 263 (2017), p. 26-38
- Description: For reinforcement learning tasks with multiple objectives, it may be advantageous to learn stochastic or non-stationary policies. This paper investigates two novel algorithms for learning non-stationary policies which produce Pareto-optimal behaviour (w-steering and Q-steering), by extending prior work based on the concept of geometric steering. Empirical results demonstrate that both new algorithms offer substantial performance improvements over stationary deterministic policies, while Q-steering significantly outperforms w-steering when the agent has no information about recurrent states within the environment. It is further demonstrated that Q-steering can be used interactively by providing a human decision-maker with a visualisation of the Pareto front and allowing them to adjust the agent’s target point during learning. To demonstrate broader applicability, the use of Q-steering in combination with function approximation is also illustrated on a task involving control of local battery storage for a residential solar power system.