Sarath Sreedharan, Tathagata Chakraborti, et al.
AAAI 2020
The combination of Monte Carlo tree search (MCTS) and deep reinforcement learning has achieved remarkably strong performance and has attracted much attention recently. However, convergence of learning is very time-consuming. On the other hand, when humans acquire skills efficiently, it is important to learn from failure: locating its cause and modifying the strategy accordingly. By analogy, we propose an efficient tree search method that introduces a failure ratio, which takes high values in critical game phases. We applied our method to the board game Othello. Experiments show that our method achieves a higher winning ratio than the state-of-the-art method, especially in the early stages of learning.
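The abstract only describes the failure-ratio idea at a high level. One way it could plug into MCTS is as an extra bonus term in the UCT selection rule; the sketch below illustrates that, where the weight `beta` and the exact definition of `failure_ratio` are assumptions for illustration, not the paper's actual formulation.

```python
import math

def uct_score(wins, visits, parent_visits, failure_ratio, c=1.4, beta=0.5):
    """Standard UCT value plus a hypothetical bonus that grows with the
    observed failure ratio, biasing search toward phases where past
    play went wrong. beta and this additive form are assumed."""
    if visits == 0:
        return float("inf")  # always explore unvisited children first
    exploit = wins / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore + beta * failure_ratio

def select_child(children, parent_visits):
    """Pick the child with the highest augmented UCT score.
    Each child is a dict with 'wins', 'visits', 'failure_ratio'."""
    return max(
        children,
        key=lambda ch: uct_score(
            ch["wins"], ch["visits"], parent_visits, ch["failure_ratio"]
        ),
    )

# With identical win/visit statistics, the child associated with a
# higher failure ratio is searched first.
children = [
    {"wins": 5, "visits": 10, "failure_ratio": 0.0},
    {"wins": 5, "visits": 10, "failure_ratio": 0.8},
]
chosen = select_child(children, parent_visits=20)
```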
Sijia Liu, Parikshit Ram, et al.
AAAI 2020
Gosia Lazuka, Andreea Simona Anghel, et al.
SC 2024
Natalia Martinez Gil, Dhaval Patel, et al.
UAI 2024