Publications

2012
G. Konidaris, S. Kuindersma, R. Grupen, and A. Barto, “Robot learning from demonstration by constructing skill trees,” The International Journal of Robotics Research, vol. 31, no. 3, pp. 360–375, 2012.

We describe CST, an online algorithm for constructing skill trees from demonstration trajectories. CST segments a demonstration trajectory into a chain of component skills, where each skill has a goal and is assigned a suitable abstraction from an abstraction library. These properties permit skills to be improved efficiently using a policy learning algorithm. Chains from multiple demonstration trajectories are merged into a skill tree. We show that CST can be used to acquire skills from human demonstration in a dynamic continuous domain, and from both expert demonstration and learned control sequences on the uBot-5 mobile manipulator.

cst-ijrr.pdf
S. Kuindersma, R. Grupen, and A. Barto, “Variable Risk Dynamic Mobile Manipulation,” in RSS 2012 Workshop on Mobile Manipulation, Sydney, Australia, 2012.

The ability to operate effectively in a variety of contexts will be a critical attribute of deployed mobile manipulators. In general, a variety of properties, such as battery charge, workspace constraints, and the presence of dangerous obstacles, will determine the suitability of particular control policies. Some context changes will cause shifts in risk sensitivity, or tendency to seek or avoid policies with high performance variation. We describe a policy search algorithm designed to address the problem of variable risk control. We generalize the simple stochastic gradient descent update to the risk-sensitive case, and show that, under certain conditions, it leads to an unbiased estimate of the gradient of the risk-sensitive objective. We show that the local critic structure used in the update can be exploited to interweave offline and online search to select local greedy policies or quickly change risk sensitivity. We evaluate the algorithm in experiments with a dynamically stable mobile manipulator lifting a heavy liquid-filled bottle while balancing.

vrmm12.pdf
S. Kuindersma, R. Grupen, and A. Barto, “Variational Bayesian Optimization for Runtime Risk-Sensitive Control,” in Robotics: Science and Systems VIII (RSS), Sydney, Australia, 2012, pp. 201–206.

We present a new Bayesian policy search algorithm suitable for problems with policy-dependent cost variance, a property present in many robot control tasks. We extend recent work on variational heteroscedastic Gaussian processes to the optimization case to achieve efficient minimization of very noisy cost signals. In contrast to most policy search algorithms, our method explicitly models the cost variance in regions of low expected cost and permits runtime adjustment of risk sensitivity without relearning. Our experiments with artificial systems and a real mobile manipulator demonstrate that flexible risk-sensitive policies can be learned in very few trials.

vbo.pdf
2011
G. D. Konidaris, S. R. Kuindersma, R. A. Grupen, and A. G. Barto, “Acquiring Transferrable Mobile Manipulation Skills,” in RSS 2011 Workshop on Mobile Manipulation: Learning to Manipulate, Los Angeles, CA, 2011.

This abstract summarizes recent research on the autonomous acquisition of transferrable manipulation skills. We describe a robot system that learns to sequence a set of innate controllers to solve a task, and then extracts transferrable manipulation skills from the resulting solution. Using the extracted skills, the robot is able to significantly reduce the time required to discover the solution to a second task.

rss-ws11.pdf
G. Konidaris, S. Kuindersma, R. Grupen, and A. Barto, “Autonomous Skill Acquisition on a Mobile Manipulator,” in Proceedings of the Twenty-Fifth Conference on Artificial Intelligence (AAAI-11), San Francisco, CA, 2011, pp. 1468–1473.

We describe a robot system that autonomously acquires skills through interaction with its environment. The robot learns to sequence the execution of a set of innate controllers to solve a task, extracts and retains components of that solution as portable skills, and then transfers those skills to reduce the time required to learn to solve a second task.

arsa-aaai.pdf
G. D. Konidaris, S. R. Kuindersma, R. A. Grupen, and A. G. Barto, “CST: Constructing Skill Trees by Demonstration,” in Proceedings of the ICML Workshop on New Developments in Imitation Learning, Bellevue, WA, 2011.

We describe recent work on CST, an online algorithm for constructing skill trees from demonstration trajectories. CST segments a demonstration trajectory into a chain of component skills, where each skill has a goal and is assigned a suitable abstraction from an abstraction library. These properties permit skills to be improved efficiently using a policy learning algorithm. Chains from multiple demonstration trajectories are merged into a skill tree. We describe applications of CST to acquiring skills from human demonstration in a dynamic continuous domain and from both expert demonstration and learned control sequences on a mobile manipulator.

cst-ws.pdf
S. Kuindersma, R. Grupen, and A. Barto, “Learning Dynamic Arm Motions for Postural Recovery,” in Proceedings of the 11th IEEE-RAS International Conference on Humanoid Robots, Bled, Slovenia, 2011, pp. 7–12.

The biomechanics community has recently made progress toward understanding the role of rapid arm movements in human stability recovery. However, comparatively little work has been done exploring this type of control in humanoid robots. We provide a summary of recent insights into the functional contributions of arm recovery motions in humans and experimentally demonstrate advantages of this behavior on a dynamically stable mobile manipulator. Using Bayesian optimization, the robot efficiently discovers policies that reduce total energy expenditure and recovery footprint, and increase ability to stabilize after large impacts.

armstab-humanoids.pdf
2010
S. Kuindersma, “Control Model Learning for Whole-Body Mobile Manipulation,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2010.
G. Konidaris, S. Kuindersma, A. Barto, and R. Grupen, “Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories,” in Advances in Neural Information Processing Systems 23, 2010, pp. 1162–1170.

We introduce CST, an algorithm for constructing skill trees from demonstration trajectories in continuous reinforcement learning domains. CST uses a changepoint detection method to segment each trajectory into a skill chain by detecting a change of appropriate abstraction, or that a segment is too complex to model as a single skill. The skill chains from each trajectory are then merged to form a skill tree. We demonstrate that CST constructs an appropriate skill tree that can be further refined through learning in a challenging continuous domain, and that it can be used to segment demonstration trajectories on a mobile manipulator into chains of skills where each skill is assigned an appropriate abstraction.

cst-nips.pdf
S. Kuindersma, G. Konidaris, R. Grupen, and A. Barto, “Learning from a Single Demonstration: Motion Planning with Skill Segmentation,” in NIPS Workshop on Learning and Planning from Batch Time Series Data, Vancouver, BC, 2010.

We propose an approach to control learning from demonstration that first segments demonstration trajectories to identify subgoals, then uses model-based control methods to sequentially reach these subgoals to solve the overall task. Using this approach, we show that a mobile robot is able to solve a combined navigation and manipulation task robustly after observing only a single successful trajectory.

mbskillseg.pdf
