On the limitations of scalarisation for multi-objective reinforcement learning of Pareto fronts
- Authors: Vamplew, Peter; Yearwood, John; Dazeley, Richard; Berry, Adam
- Date: 2008
- Type: Text, Conference paper
- Relation: Paper presented at 21st Australasian Joint Conference on Artificial Intelligence, Auckland, New Zealand : 1st-5th December 2008 Vol. 5360, p. 372-378
- Full Text: false
- Description: Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting objectives. This paper argues for designing MORL systems to produce a set of solutions approximating the Pareto front, and shows that the common MORL technique of scalarisation has fundamental limitations when used to find Pareto-optimal policies. The work is supported by the presentation of three new MORL benchmarks with known Pareto fronts.
- Description: 2003006504
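The limitation named in the abstract above can be illustrated with a small sketch. It assumes linear (weighted-sum) scalarisation over two objectives that are both maximised. The three return vectors are hypothetical and are not taken from the paper's benchmarks. The helper names `pareto_front` and `scalarised_winners` are likewise illustrative only.

```python
def pareto_front(points):
    """Return the points not dominated by any other point (all objectives maximised)."""
    def dominates(p, q):
        # p dominates q if it is at least as good everywhere and strictly better somewhere.
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))
    return [p for p in points if not any(dominates(q, p) for q in points)]


def scalarised_winners(points, steps=101):
    """Collect the points that maximise w*f1 + (1-w)*f2 for some weight w in [0, 1]."""
    winners = set()
    for i in range(steps):
        w = i / (steps - 1)
        best = max(points, key=lambda p: w * p[0] + (1 - w) * p[1])
        winners.add(best)
    return winners


# Hypothetical expected returns of three deterministic policies.
# (4, 4) lies in a concave region of the front formed by the other two points.
policies = [(10, 0), (4, 4), (0, 10)]

front = pareto_front(policies)
found = scalarised_winners(policies)
missed = [p for p in front if p not in found]
print("Pareto-optimal:", front)
print("Found by weighted sum:", sorted(found))
print("Missed:", missed)
```

In this sketch all three points are Pareto-optimal, yet no weight selects (4, 4): its weighted sum is always 4, while the better of the two extreme points always scores at least 5.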
Constructing stochastic mixture policies for episodic multiobjective reinforcement learning tasks
- Authors: Vamplew, Peter; Dazeley, Richard; Barker, Ewan; Kelarev, Andrei
- Date: 2009
- Type: Text, Book chapter
- Relation: AI 2009 : Advances in Artificial Intelligence : 22nd Australasian Joint Conference, Melbourne, Australia, December 1-4, 2009. Proceedings Chapter p. 340-349
- Full Text:
- Description: Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for deriving mixture policies for episodic multiobjective tasks from deterministic base policies found via scalarised reinforcement learning. It is shown that these approaches are an efficient means of identifying solutions that match the user’s preferences better than can be achieved by methods based strictly on deterministic policies.
- Description: 2003007906
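The general idea of a mixture policy described in the abstract above can be sketched as follows. This is not either of the two methods proposed in the chapter. It assumes the simplest case: one of two deterministic base policies is picked at the start of each episode with a fixed probability, so the expected return is a convex combination of the base policies' returns. The return vectors and the names `mixture_return`, `mixing_probability_for_target`, and `choose_base_policy` are hypothetical.

```python
import random


def mixture_return(returns_a, returns_b, p):
    """Expected return vector when policy A is chosen with probability p, else policy B."""
    return tuple(p * a + (1 - p) * b for a, b in zip(returns_a, returns_b))


def mixing_probability_for_target(returns_a, returns_b, objective, target):
    """Probability of choosing A so the expected return on one objective equals target."""
    a, b = returns_a[objective], returns_b[objective]
    if a == b:
        raise ValueError("base policies do not differ on this objective")
    p = (target - b) / (a - b)
    if not 0.0 <= p <= 1.0:
        raise ValueError("target lies outside the range spanned by the base policies")
    return p


def choose_base_policy(p, rng=random):
    """Sample which base policy to follow for the whole of the next episode."""
    return "A" if rng.random() < p else "B"


# Hypothetical expected returns of two deterministic base policies.
base_a = (10.0, 0.0)
base_b = (0.0, 10.0)

# Suppose the user wants an expected return of 4 on the first objective.
p = mixing_probability_for_target(base_a, base_b, objective=0, target=4.0)
expected = mixture_return(base_a, base_b, p)
print("probability of policy A:", p)
print("expected return of mixture:", expected)
print("policy for next episode:", choose_base_policy(p))
```

The mixture reaches expected returns that lie on the line between the two base policies, which is why it can match a preference that neither deterministic policy satisfies on its own. The trade-off is that the stated return holds only in expectation over many episodes.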