Parsing Motion and Composing Behavior for Semi-Autonomous Manipulation

Abstract

Robots are becoming an ever bigger part of our day to day life. They take up simple tasks in households, like vacuum cleaning and lawn mowing. They ensure a steady and reliable process at many work places in large scale manufacturing, like the automotive and electronics industry. Furthermore, robots are becoming more and more socially accepted, for instance as autonomous drivers. They even start to engage in special and elderly care, aiming to fill a void created by a rapidly aging population. Additionally, the increasing complexity and capability of robotic systems allows to solve ever more complicated tasks in increasingly difficult scenarios and environments. Soon, encountering and interacting with robots will be considered as natural as interacting with other humans. However, when it comes to defining and understanding the behavior of robots, experts are still necessary. Robots usually follow predefined routines which are programmed and tuned by people with years of experience. Unintended behavior is traced back to a certain part of the source code which can be modified using a specific programming language. Most of the people that will interact with robotic servants or coworkers in the future, will not have the necessary skill set to instruct robots in such detail. This need for an expert represents a significant bottleneck to the deployment of robots as our everyday companion in households and at work. This thesis presents several novel approaches aiming at facilitating the interaction between non-expert humans and robots in terms of intuitive instruction and simple understanding of the robot capabilities with respect to a given task. Chapter 3 introduces a novel method that segments unlabeled demonstrations into sequence of movement primitives while simultaneously learning a movement primitive library. This method allows the non-expert to teach an entire task rather than every single primitive. Movement primitives represent a simple, atomic and commonly parameterized motion. The presented method segments each demonstration by identifying similar patterns across all demonstrations and treating them as samples drawn from a learned probabilistic representation of a movement primitive. The method is formulated as an expectation-maximization approach and was evaluated in several tasks,including a chair assembly and segmenting table tennis demonstrations. In Chapter 4 the previously segmented demonstrations and the learned primitive library are used to induce a formal grammar for movements. Formal grammars are a well established concept in formal language theory and have been applied in several fields, reaching from linguistics, over compiler architecture to robotics. The simplest class of grammars, regular grammars, correspond in their probabilistic form to Hidden Markov Models. However, the intuitive, hierarchical representation of transitions as a set of rules makes it easier for non-experts to comprehend the possible behaviors the grammar implies. A sequence of movements can now be considered a sentence produced by the learned grammar. The production of each sentence can be illustrated by a tree structure, allowing an easy understanding of the involved rules. Probabilistic context-free grammars are a superset of regular grammars and, hence, are more expressive and exceed the capabilities of Hidden Markov Models. While the induction of probabilistic context-free grammars is considered a difficult, unsolved problem for natural languages, the observed sequences of movement primitives show much simpler structures, making the induction more feasible. The method was successfully evaluated on several tasks, such as a pick-and-place task in a tic-tac-toe setting or a handover task in a collaborative tool box assembly. Chapter 5 introduces the concept of reinforcement learning into the domain of formal grammars. Given an objective, we apply a natural policy gradient approach in order to learn the grammar parameters that produces sequences of primitives that solve that objective. This allows the autonomous improvement of robot behavior. For instance, a cleaning up task can be optimized for efficiency while avoiding self collisions. The parameters of the grammar are the probabilities of each production. Therefore, probability constraints have to be maintained while learning the parameters. The applied natural policy gradient method ensures reasonably small parameter updates, such that the grammar probabilities change gradually. We derive the natural policy gradient method for formal grammars and evaluate the method on several tasks. Together, the individual contributions presented in this thesis form an imitation learning pipeline that facilitates the instruction, interaction and collaboration with robots. Starting from unlabeled demonstrations, an underlying movement primitive library is learned while simultaneously segmenting the given demonstrations into sequences of primitives. These sequences are than used to induce a formal grammar. The structure of the grammar and the produced parse trees form a comprehensible representation of the robot capabilities with respect to the demonstrated task. Finally, a reinforcement learning approach allows the autonomous optimization of the grammar given an objective.

Publication
PhD Thesis