Your selections:
On the limitations of scalarisation for multi-objective reinforcement learning of Pareto fronts
- Vamplew, Peter, Yearwood, John, Dazeley, Richard, Berry, Adam
Constructing stochastic mixture policies for episodic multiobjective reinforcement learning tasks
- Vamplew, Peter, Dazeley, Richard, Barker, Ewan, Kelarev, Andrei
A survey of multi-objective sequential decision-making
- Roijers, Diederik, Vamplew, Peter, Whiteson, Shimon, Dazeley, Richard
Empirical evaluation methods for multiobjective reinforcement learning algorithms
- Vamplew, Peter, Dazeley, Richard, Berry, Adam, Issabekov, Rustam, Dekker, Evan
The Ballarat incremental knowledge engine
- Dazeley, Richard, Warner, Philip, Johnson, Scott, Vamplew, Peter
RM and RDM, a preliminary evaluation of two prudent RDR techniques
- Maruatona, Omaru, Vamplew, Peter, Dazeley, Richard
Prudent fraud detection in internet banking
- Maruatona, Omaru, Vamplew, Peter, Dazeley, Richard
Steering approaches to Pareto-optimal multiobjective reinforcement learning
- Vamplew, Peter, Issabekov, Rustam, Dazeley, Richard, Foale, Cameron, Berry, Adam, Moore, Tim, Creighton, Douglas
Coarse Q-Learning: Addressing the convergence problem when quantizing continuous state variables
- Dazeley, Richard, Vamplew, Peter, Bignold, Adam
Non-functional regression: A new challenge for neural networks
- Vamplew, Peter, Dazeley, Richard, Foale, Cameron, Choudhury, Tanveer
Human-aligned artificial intelligence is a multiobjective problem
- Vamplew, Peter, Dazeley, Richard, Foale, Cameron, Firmin, Sally, Mummery, Jane
Softmax exploration strategies for multiobjective reinforcement learning
- Vamplew, Peter, Dazeley, Richard, Foale, Cameron
Reinforcement learning of Pareto-optimal multiobjective policies using steering
- Vamplew, Peter, Issabekov, Rustam, Dazeley, Richard, Foale, Cameron
Levels of explainable artificial intelligence for human-aligned conversational explanations
- Dazeley, Richard, Vamplew, Peter, Foale, Cameron, Young, Cameron, Aryal, Sunil, Cruz, Francisco
An evaluation methodology for interactive reinforcement learning with simulated users
- Bignold, Adam, Cruz, Francisco, Dazeley, Richard, Vamplew, Peter, Foale, Cameron
Rapid anomaly detection using integrated prudence analysis (IPA)
- Maruatona, Omaru, Vamplew, Peter, Dazeley, Richard, Watters, Paul
The impact of environmental stochasticity on value-based multiobjective reinforcement learning
- Vamplew, Peter, Foale, Cameron, Dazeley, Richard
Language representations for generalization in reinforcement learning
- Goodger, Nikolaj, Vamplew, Peter, Foale, Cameron, Dazeley, Richard
A multi-objective deep reinforcement learning framework
- Nguyen, Thanh, Nguyen, Ngoc, Vamplew, Peter, Nahavandi, Saeid, Dazeley, Richard, Lim, Chee
A prioritized objective actor-critic method for deep reinforcement learning
- Nguyen, Ngoc, Nguyen, Thanh, Vamplew, Peter, Dazeley, Richard, Nahavandi, Saeid