- Title
- Coarse Q-Learning : Addressing the convergence problem when quantizing continuous state variables
- Creator
- Dazeley, Richard; Vamplew, Peter; Bignold, Adam
- Date
- 2015
- Type
- Text; Conference paper
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/162416
- Identifier
- vital:12644
- Identifier
- https://doi.org/10.13140/RG.2.1.1965.1041
- Abstract
- Value-based approaches to reinforcement learning (RL) maintain a value function that measures the long-term utility of a state or state-action pair. A long-standing issue in RL is how to create a finite representation in a continuous, and therefore infinite, state environment. The common approach is to use function approximators such as tile coding, or memory- or instance-based methods. These provide some balance between generalisation, resolution, and storage, but converge slowly in multidimensional state environments. Another approach, quantizing state into lookup tables, has been commonly regarded as highly problematic due to large memory requirements and poor generalisation. In particular, attempting to reduce memory requirements and increase generalisation by using coarser quantization forms a non-Markovian system that does not converge. This paper investigates the problem in using quantized lookup tables and presents an extension to the Q-Learning algorithm, referred to as Coarse Q-Learning (CQL), which resolves these issues. The presented algorithm will be shown to drastically reduce the memory requirements and increase generalisation by simulating the Markov property. In particular, this algorithm means the size of the input space is determined by the granularity required by the policy being learnt, rather than by the inadequacies of the learning algorithm or the nature of the state-reward dynamics of the environment. Importantly, the method presented addresses the problem represented by the curse of dimensionality.
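- The baseline the abstract critiques can be illustrated with a minimal sketch of standard tabular Q-learning over a coarsely quantized continuous state. Everything here (bin count, learning rate, state bounds, function names) is an illustrative assumption, not taken from the paper; coarse bins make the quantized process non-Markovian, which is precisely the convergence problem the paper's Coarse Q-Learning (CQL) extension is designed to resolve.

```python
import numpy as np

N_BINS = 10          # coarse quantization: 10 cells per state dimension (assumed)
N_ACTIONS = 2        # illustrative discrete action set
ALPHA, GAMMA = 0.1, 0.95  # assumed learning rate and discount factor

def quantize(x, lo=-1.0, hi=1.0, n=N_BINS):
    """Map a continuous scalar state in [lo, hi] onto a lookup-table index."""
    i = int((x - lo) / (hi - lo) * n)
    return min(max(i, 0), n - 1)

# Q-table indexed by quantized state cell and action
Q = np.zeros((N_BINS, N_ACTIONS))

def q_update(x, a, r, x_next):
    """One standard tabular Q-learning step on quantized states.

    With coarse cells, many distinct continuous states share one row of Q,
    so the induced process is non-Markovian and convergence can fail --
    the issue the paper's CQL extension targets.
    """
    s, s_next = quantize(x), quantize(x_next)
    td_target = r + GAMMA * Q[s_next].max()
    Q[s, a] += ALPHA * (td_target - Q[s, a])
    return s, s_next
```

- Note the trade-off the abstract describes: a larger `N_BINS` improves the Markov approximation but the table grows exponentially with state dimensions (the curse of dimensionality), while a smaller `N_BINS` saves memory at the cost of aliasing distinct states into one cell.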
- Relation
- 2nd Multidisciplinary Conference on Reinforcement Learning and Decision Making
- Rights
- This metadata is freely available under a CC0 license
- Subject
- Reinforcement learning; Temporal difference learning; Continuous state; Quantized state; Function approximation
- Reviewed