Abstract
Autonomous mobile robots used to be built for domain-specific tasks in factories or similarly safe environments, but we are now seeing a shift towards the general market. Automated lawn mowers and cleaning robots are sold in ordinary stores. Such robots must adapt to unknown environments while remaining safe around humans and animals, which means we have to think differently when building their decision-making systems. Reinforcement learning is a branch of machine learning inspired by humans' ability to learn by trial and error. Agents trained with reinforcement learning have been developed and successfully applied to computer games, performing at a world-class level. This master's thesis describes the implementation of an AI designed for a robot competing in the 2015 Eurobot competition. The task is set in a dynamic environment with a robotic opponent. A Goal Oriented Action Planner was implemented as the planner for the AI. To adapt the planner to a changing environment, a decision-making policy trained with reinforcement learning was used to rate actions depending on the state of the game. An implementation of SARSA with feature-based value function approximation was used to train the policy. The learned decision-making policy showed promising results in the experiments conducted for this thesis. The AI found a good static solution to the Eurobot task and was able to adapt its strategy to a dynamic environment by avoiding the opponent and respecting time limits.
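As a rough illustration of the training method named above, the following is a minimal sketch of one SARSA step with a linear, feature-based value function, together with an epsilon-greedy rule for rating candidate actions. It is not the thesis implementation: the function names, the feature vectors, and the default values of `alpha` and `gamma` are assumptions made for this example only.

```python
import random


def q_value(weights, features):
    """Linear action-value estimate: Q(s, a) = sum_i w_i * phi_i(s, a)."""
    return sum(w * f for w, f in zip(weights, features))


def sarsa_update(weights, features, reward, next_features,
                 alpha=0.1, gamma=0.9, terminal=False):
    """Return the weights after one on-policy SARSA step.

    features      -- hypothetical feature vector of the (state, action) just taken
    next_features -- feature vector of the next (state, action) chosen by the
                     same policy; ignored when terminal is True
    alpha, gamma  -- learning rate and discount factor (illustrative defaults)
    """
    if terminal:
        target = reward
    else:
        target = reward + gamma * q_value(weights, next_features)
    td_error = target - q_value(weights, features)
    # The gradient of a linear Q with respect to the weights is the feature vector.
    return [w + alpha * td_error * f for w, f in zip(weights, features)]


def epsilon_greedy(weights, candidates, epsilon, rng=random):
    """Pick an action index: random with probability epsilon, else the best rated."""
    if rng.random() < epsilon:
        return rng.randrange(len(candidates))
    scores = [q_value(weights, f) for f in candidates]
    return scores.index(max(scores))
```

In a setup like the one described, a planner could call `epsilon_greedy` over the feature vectors of its currently available actions and apply `sarsa_update` after each action completes; how the actual planner and policy are coupled is not specified here.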