A conceptual framework for externally-influenced agents: an assisted reinforcement learning review
- Bignold, Adam, Cruz, Francisco, Taylor, Matthew, Brys, Tim, Dazeley, Richard, Vamplew, Peter, Foale, Cameron
- Authors: Bignold, Adam , Cruz, Francisco , Taylor, Matthew , Brys, Tim , Dazeley, Richard , Vamplew, Peter , Foale, Cameron
- Date: 2023
- Type: Text , Journal article
- Relation: Journal of Ambient Intelligence and Humanized Computing Vol. 14, no. 4 (2023), p. 3621-3644
- Full Text:
- Reviewed:
- Description: A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, while reviewing externally-influenced methods, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering collaboration by classifying and comparing various methods that use external information in the learning process. The proposed taxonomy details the relationship between the external information source and the learner agent, highlighting the process of information decomposition, structure, retention, and how it can be used to influence agent learning. As well as reviewing state-of-the-art methods, we identify current streams of reinforcement learning that use external information in order to improve the agent’s performance and its decision-making process. These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others. These streams of reinforcement learning operate with the shared objective of scaffolding the learner agent. Lastly, we discuss further possibilities for future work in the field of assisted reinforcement learning systems. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
- Authors: Bignold, Adam , Cruz, Francisco , Taylor, Matthew , Brys, Tim , Dazeley, Richard , Vamplew, Peter , Foale, Cameron
- Date: 2023
- Type: Text , Journal article
- Relation: Journal of Ambient Intelligence and Humanized Computing Vol. 14, no. 4 (2023), p. 3621-3644
- Full Text:
- Reviewed:
- Description: A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, while reviewing externally-influenced methods, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering collaboration by classifying and comparing various methods that use external information in the learning process. The proposed taxonomy details the relationship between the external information source and the learner agent, highlighting the process of information decomposition, structure, retention, and how it can be used to influence agent learning. As well as reviewing state-of-the-art methods, we identify current streams of reinforcement learning that use external information in order to improve the agent’s performance and its decision-making process. These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others. These streams of reinforcement learning operate with the shared objective of scaffolding the learner agent. Lastly, we discuss further possibilities for future work in the field of assisted reinforcement learning systems. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
An evaluation methodology for interactive reinforcement learning with simulated users
- Bignold, Adam, Cruz, Francisco, Dazeley, Richard, Vamplew, Peter, Foale, Cameron
- Authors: Bignold, Adam , Cruz, Francisco , Dazeley, Richard , Vamplew, Peter , Foale, Cameron
- Date: 2021
- Type: Text , Journal article
- Relation: Biomimetics Vol. 6, no. 1 (2021), p. 1-15
- Full Text:
- Reviewed:
- Description: Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users. Simulated users allow human knowledge, bias, and interaction to be simulated. The use of simulated users allows the development and testing of reinforcement learning agents, and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluative assisted agents. We introduce a method for performing a preliminary evaluation utilising simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulating users in evaluating agent performance when assisted by different types of trainers. Experimental results show that the use of this methodology allows for greater insight into the performance of interactive reinforcement learning agents when advised by different users. The use of simulated users with varying characteristics allows for evaluation of the impact of those characteristics on the behaviour of the learning agent. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
- Authors: Bignold, Adam , Cruz, Francisco , Dazeley, Richard , Vamplew, Peter , Foale, Cameron
- Date: 2021
- Type: Text , Journal article
- Relation: Biomimetics Vol. 6, no. 1 (2021), p. 1-15
- Full Text:
- Reviewed:
- Description: Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gain a sufficient sample size. In this regard, to require human interaction every time an experiment is restarted is undesirable, particularly when the expense in doing so can be considerable. Additionally, reusing the same people for the experiment introduces bias, as they will learn the behaviour of the agent and the dynamics of the environment. This paper presents a methodology for evaluating interactive reinforcement learning agents by employing simulated users. Simulated users allow human knowledge, bias, and interaction to be simulated. The use of simulated users allows the development and testing of reinforcement learning agents, and can provide indicative results of agent performance under defined human constraints. While simulated users are no replacement for actual humans, they do offer an affordable and fast alternative for evaluative assisted agents. We introduce a method for performing a preliminary evaluation utilising simulated users to show how performance changes depending on the type of user assisting the agent. Moreover, we describe how human interaction may be simulated, and present an experiment illustrating the applicability of simulating users in evaluating agent performance when assisted by different types of trainers. Experimental results show that the use of this methodology allows for greater insight into the performance of interactive reinforcement learning agents when advised by different users. The use of simulated users with varying characteristics allows for evaluation of the impact of those characteristics on the behaviour of the learning agent. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
Human engagement providing evaluative and informative advice for interactive reinforcement learning
- Bignold, Adam, Cruz, Francisco, Dazeley, Richard, Vamplew, Peter, Foale, Cameron
- Authors: Bignold, Adam , Cruz, Francisco , Dazeley, Richard , Vamplew, Peter , Foale, Cameron
- Date: 2023
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 35, no. 25 (2023), p. 18215-18230
- Full Text:
- Reviewed:
- Description: Interactive reinforcement learning proposes the use of externally sourced information in order to speed up the learning process. When interacting with a learner agent, humans may provide either evaluative or informative advice. Prior research has focused on the effect of human-sourced advice by including real-time feedback on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent, while minimising the time demands on the human. This work focuses on answering which of two approaches, evaluative or informative, is the preferred instructional approach for humans. Moreover, this work presents an experimental setup for a human trial designed to compare the methods people use to deliver advice in terms of human engagement. The results obtained show that users giving informative advice to the learner agents provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation from participants using the informative approach has indicated that the agent’s ability to follow the advice is higher, and therefore, they feel their own advice to be of higher accuracy when compared to people providing evaluative advice. © 2022, The Author(s).
- Authors: Bignold, Adam , Cruz, Francisco , Dazeley, Richard , Vamplew, Peter , Foale, Cameron
- Date: 2023
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 35, no. 25 (2023), p. 18215-18230
- Full Text:
- Reviewed:
- Description: Interactive reinforcement learning proposes the use of externally sourced information in order to speed up the learning process. When interacting with a learner agent, humans may provide either evaluative or informative advice. Prior research has focused on the effect of human-sourced advice by including real-time feedback on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent, while minimising the time demands on the human. This work focuses on answering which of two approaches, evaluative or informative, is the preferred instructional approach for humans. Moreover, this work presents an experimental setup for a human trial designed to compare the methods people use to deliver advice in terms of human engagement. The results obtained show that users giving informative advice to the learner agents provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation from participants using the informative approach has indicated that the agent’s ability to follow the advice is higher, and therefore, they feel their own advice to be of higher accuracy when compared to people providing evaluative advice. © 2022, The Author(s).
Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety
- Vamplew, Peter, Foale, Cameron, Dazeley, Richard, Bignold, Adam
- Authors: Vamplew, Peter , Foale, Cameron , Dazeley, Richard , Bignold, Adam
- Date: 2021
- Type: Text , Journal article
- Relation: Engineering Applications of Artificial Intelligence Vol. 100, no. (2021), p.
- Full Text:
- Reviewed:
- Description: The concept of impact-minimisation has previously been proposed as an approach to addressing the safety concerns that can arise from utility-maximising agents. An impact-minimising agent takes into account the potential impact of its actions on the state of the environment when selecting actions, so as to avoid unacceptable side-effects. This paper proposes and empirically evaluates an implementation of impact-minimisation within the framework of multiobjective reinforcement learning. The key contributions are a novel potential-based approach to specifying a measure of impact, and an examination of a variety of non-linear action-selection operators so as to achieve an acceptable trade-off between achieving the agent's primary task and minimising environmental impact. These experiments also highlight a previously unreported issue with noisy estimates for multiobjective agents using non-linear action-selection, which has broader implications for the application of multiobjective reinforcement learning. © 2021
- Authors: Vamplew, Peter , Foale, Cameron , Dazeley, Richard , Bignold, Adam
- Date: 2021
- Type: Text , Journal article
- Relation: Engineering Applications of Artificial Intelligence Vol. 100, no. (2021), p.
- Full Text:
- Reviewed:
- Description: The concept of impact-minimisation has previously been proposed as an approach to addressing the safety concerns that can arise from utility-maximising agents. An impact-minimising agent takes into account the potential impact of its actions on the state of the environment when selecting actions, so as to avoid unacceptable side-effects. This paper proposes and empirically evaluates an implementation of impact-minimisation within the framework of multiobjective reinforcement learning. The key contributions are a novel potential-based approach to specifying a measure of impact, and an examination of a variety of non-linear action-selection operators so as to achieve an acceptable trade-off between achieving the agent's primary task and minimising environmental impact. These experiments also highlight a previously unreported issue with noisy estimates for multiobjective agents using non-linear action-selection, which has broader implications for the application of multiobjective reinforcement learning. © 2021
- «
- ‹
- 1
- ›
- »