Deep Skill Chaining
Akhil Bagaria
May 18, 2020
Figure 1: [Top] Combined value function learned by deep skill chaining. [Bottom] Value functions learned by the discovered skills. In this U-shaped maze, the goal state is in the top-left and the start state is in the bottom-left.
While modern RL algorithms have achieved impressive results on hard problems, they continue to struggle on long-horizon problems with sparse rewards. Hierarchical reinforcement learning is a promising approach to overcoming these challenges.
While the benefits of using hierarchies have long been known, the question of how useful hierarchies can be discovered autonomously has remained largely unanswered. In this work, we present an algorithm that constructs temporally extended, higher-level actions (called skills) from the primitive actions already available to the RL agent.
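A skill of this kind is commonly modeled as an option: an initiation condition describing where the skill can be invoked, a closed-loop policy over primitive actions, and a termination condition. The sketch below is a minimal, hypothetical illustration of that structure, assuming a gym-style environment; names such as `Option` and `execute_option` are illustrative and are not taken from the released code.

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class Option:
    """A temporally extended action (skill) in the options framework."""
    # Predicate over states: where this skill may be invoked.
    initiation_classifier: Callable[[np.ndarray], bool]
    # Closed-loop policy mapping states to primitive (continuous) actions.
    policy: Callable[[np.ndarray], np.ndarray]
    # Predicate over states: where execution of this skill terminates.
    termination_classifier: Callable[[np.ndarray], bool]

def execute_option(env, state, option, max_steps=100):
    """Run an option's policy until its termination condition or a step limit."""
    for _ in range(max_steps):
        action = option.policy(state)
        state, reward, done, _ = env.step(action)  # gym-style step assumed
        if done or option.termination_classifier(state):
            break
    return state
```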
Not only is the ability to break down complex problems into simpler sub-problems a hallmark of intelligence, it is also a key missing piece in traditional, flat reinforcement learning. By constructing useful hierarchies, RL agents can combine modular solutions to easy sub-problems to reliably solve hard real-world problems.
We propose Deep Skill Chaining as a step towards realizing the goal of autonomous skill discovery in high-dimensional problems with continuous state and action spaces.
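At a high level, skill chaining grows a chain of skills backward from the goal: the first skill terminates at the goal, and each subsequent skill is trained to terminate inside the initiation region of the skill discovered before it. The following is a rough, hypothetical sketch of that outer loop, reusing the `Option` structure above; helpers such as `train_option_policy` and `fit_initiation_classifier` are placeholders, not functions from the paper or released code.

```python
def build_skill_chain(env, goal_predicate, num_skills, trajectories):
    """Discover a chain of skills backward from the goal (illustrative only)."""
    chain = []
    target = goal_predicate  # The first skill aims directly for the goal region.
    for _ in range(num_skills):
        # Placeholder: learn a policy that drives the agent into the target region
        # (e.g., with an off-policy actor-critic trained on the agent's experience).
        policy = train_option_policy(env, target)
        # Placeholder: fit a classifier over states from which that policy succeeds.
        initiation = fit_initiation_classifier(trajectories, policy, target)
        chain.append(Option(initiation, policy, termination_classifier=target))
        # The next skill's job is to reach this skill's initiation region.
        target = initiation
    return chain
```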
To learn more, see the full blog post, read the ICLR paper, or check out the code.