S. Kuindersma, R. Grupen, and A. Barto, “Learning Dynamic Arm Motions for Postural Recovery,” in Proceedings of the 11th IEEE-RAS International Conference on Humanoid Robots, Bled, Slovenia, 2011, pp. 7–12.

The biomechanics community has recently made progress toward understanding the role of rapid arm movements in human stability recovery. However, comparatively little work has been done exploring this type of control in humanoid robots. We provide a summary of recent insights into the functional contributions of arm recovery motions in humans and experimentally demonstrate advantages of this behavior on a dynamically stable mobile manipulator. Using Bayesian optimization, the robot efficiently discovers policies that reduce total energy expenditure and recovery footprint, and increase its ability to stabilize after large impacts.
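
The policy-search loop described in the abstract can be illustrated with a toy Gaussian-process Bayesian optimization over a single normalized policy parameter. Everything below is an illustrative stand-in: the one-dimensional objective, the kernel length-scale, and the UCB acquisition are not the robot's actual cost function or the paper's implementation.

```python
import numpy as np

# Toy stand-in for the (unknown) negative recovery cost; purely illustrative.
def objective(x):
    return -(x - 0.3) ** 2 + 0.05 * np.sin(10 * x)

def rbf(a, b, ell=0.2):
    # Squared-exponential kernel between 1-D point sets.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def bo_maximize(f, n_iter=15, noise=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, 3)          # a few initial random policies
    y = f(X)
    grid = np.linspace(0.0, 1.0, 201)     # candidate policies
    for _ in range(n_iter):
        K = rbf(X, X) + noise * np.eye(len(X))
        Kinv = np.linalg.inv(K)
        ks = rbf(grid, X)
        mu = ks @ Kinv @ y                               # posterior mean
        var = 1.0 - np.sum((ks @ Kinv) * ks, axis=1)     # posterior variance
        ucb = mu + 2.0 * np.sqrt(np.maximum(var, 0.0))   # acquisition
        x_next = grid[np.argmax(ucb)]                    # most promising policy
        X = np.append(X, x_next)
        y = np.append(y, f(x_next))                      # "run the trial"
    best = np.argmax(y)
    return X[best], y[best]

x_best, y_best = bo_maximize(objective)
```

The point of the method in this setting is sample efficiency: each "trial" (a fall and recovery on hardware) is expensive, so the surrogate model is queried instead of the robot wherever possible.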

S. Kuindersma, “Control Model Learning for Whole-Body Mobile Manipulation,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2010.
G. Konidaris, S. Kuindersma, A. Barto, and R. Grupen, “Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories,” in Advances in Neural Information Processing Systems 23, 2010, pp. 1162–1170.

We introduce CST, an algorithm for constructing skill trees from demonstration trajectories in continuous reinforcement learning domains. CST uses a changepoint detection method to segment each trajectory into a skill chain by detecting a change of appropriate abstraction, or that a segment is too complex to model as a single skill. The skill chains from each trajectory are then merged to form a skill tree. We demonstrate that CST constructs an appropriate skill tree that can be further refined through learning in a challenging continuous domain, and that it can be used to segment demonstration trajectories on a mobile manipulator into chains of skills where each skill is assigned an appropriate abstraction.
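
As a rough illustration of the segmentation idea (not the changepoint model used by CST, which operates on value-function and model fits with multiple abstractions), a greedy penalized binary segmentation over piecewise-linear fits behaves similarly on a simple trajectory. The data, penalty, and minimum segment length below are made up:

```python
import numpy as np

def sse_linear(y):
    # Residual sum of squares of a least-squares line fit to y.
    t = np.arange(len(y), dtype=float)
    A = np.vstack([t, np.ones_like(t)]).T
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ coef) ** 2))

def segment(y, penalty=0.5, min_len=5):
    # Greedy binary segmentation: split where it most reduces the
    # penalized piecewise-linear cost, then recurse on both halves.
    n = len(y)
    if n < 2 * min_len:
        return []
    base = sse_linear(y)
    best_gain, best_k = 0.0, None
    for k in range(min_len, n - min_len):
        gain = base - sse_linear(y[:k]) - sse_linear(y[k:]) - penalty
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None:
        return []
    left = segment(y[:best_k], penalty, min_len)
    right = [best_k + c for c in segment(y[best_k:], penalty, min_len)]
    return left + [best_k] + right

# Synthetic "trajectory": a rising phase followed by a holding phase.
y = np.concatenate([np.linspace(0.0, 5.0, 50), np.full(50, 5.0)])
changepoints = segment(y)
```

The penalty term plays the same structural role as a model-complexity prior: without it, every residual fluctuation would justify another split.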

S. Kuindersma, G. Konidaris, R. Grupen, and A. Barto, “Learning from a Single Demonstration: Motion Planning with Skill Segmentation,” in NIPS Workshop on Learning and Planning from Batch Time Series Data, Vancouver, BC, 2010.

We propose an approach to control learning from demonstration that first segments demonstration trajectories to identify subgoals, then uses model-based control methods to sequentially reach these subgoals to solve the overall task. Using this approach, we show that a mobile robot is able to solve a combined navigation and manipulation task robustly after observing only a single successful trajectory.

S. R. Kuindersma, E. Hannigan, D. Ruiken, and R. A. Grupen, “Dexterous mobility with the uBot-5 mobile manipulator,” in Proceedings of the 14th International Conference on Advanced Robotics, Munich, Germany, 2009.

We present an initial demonstration of dexterous mobility using the uBot-5, a dynamically balancing mobile manipulator. Dexterous mobility refers generally to a level of bodily resourcefulness that permits the autonomous reassignment of effectors for the purpose of maintaining mobility in a variety of situations. We begin by describing a set of postural stability controllers in terms of a small number of simple control objectives. We then show how the resulting postures support dexterous mobility by enabling a new “knuckle walking” mobility mode. In a preliminary experiment, we develop this mobility mode by formulating a practical reinforcement learning problem that allows the robot to learn an efficient gait on-line in a single trial.

B. S. Blais et al., “Recovery From Monocular Deprivation Using Binocular Deprivation,” Journal of Neurophysiology, vol. 100, no. 4, pp. 2217–2224, 2008.

Ocular dominance (OD) plasticity is a robust paradigm for examining the functional consequences of synaptic plasticity. Previous experimental and theoretical results have shown that OD plasticity can be accounted for by known synaptic plasticity mechanisms, using the assumption that deprivation by lid suture eliminates spatial structure in the deprived channel. Here we show that in the mouse, recovery from monocular lid suture can be obtained by subsequent binocular lid suture but not by dark rearing. This poses a significant challenge to previous theoretical results. We therefore performed simulations with a natural input environment appropriate for mouse visual cortex. In contrast to previous work, we assume that lid suture causes degradation but not elimination of spatial structure, whereas dark rearing produces elimination of spatial structure. We present experimental evidence that supports this assumption, measuring responses through sutured lids in the mouse. The change in assumptions about the input environment is sufficient to account for new experimental observations, while still accounting for previous experimental results.

S. R. Kuindersma and B. S. Blais, “Teaching Bayesian Model Comparison With the Three-Sided Coin,” The American Statistician, vol. 61, no. 3, pp. 239–244, 2007.

This article introduces the problem of determining the probability that a rotating and bouncing cylinder (i.e., a flipped coin) will land and come to rest on its edge. We present this problem and analysis as a practical, nontrivial example to introduce the reader to Bayesian model comparison. Several models are presented, each of which takes into consideration different physical aspects of the problem and their relative effects on the edge-landing probability. The Bayesian formulation of model comparison is then used to compare the models and their predictive agreement with data from hand-flipped cylinders of several sizes.
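
The flavor of the comparison can be shown with a toy version: two hypothetical models of the edge-landing probability, one fixing p at a physics-style point estimate and one placing a uniform prior on p, scored by their marginal likelihoods. The counts and the value 0.08 are invented for illustration and are not data or models from the paper:

```python
from math import comb, lgamma, exp, log

def log_beta(a, b):
    # log of the Beta function via log-gamma.
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_evidence_fixed(k, n, p0):
    # Model 1: edge probability fixed at a point value p0.
    return log(comb(n, k)) + k * log(p0) + (n - k) * log(1.0 - p0)

def log_evidence_uniform(k, n):
    # Model 2: p ~ Uniform(0, 1); integrating the binomial likelihood
    # against the prior gives C(n, k) * B(k + 1, n - k + 1) in closed form.
    return log(comb(n, k)) + log_beta(k + 1, n - k + 1)

# Invented data: 8 edge landings in 100 flips of a thick cylinder.
k, n = 8, 100
bayes_factor = exp(log_evidence_fixed(k, n, 0.08) - log_evidence_uniform(k, n))
```

Here the point model wins (Bayes factor above one) because its fixed p0 happens to sit at the data's maximum-likelihood value, while the uniform prior spreads its predictions thinly over all of [0, 1]; this automatic penalty for vague models is the pedagogical point of Bayesian model comparison.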

B. S. Blais and S. Kuindersma, “Synaptic Modification in Spiking-Rate Models: A Comparison between Learning in Spiking Neurons and Rate-Based Neuron Models,” in Proceedings of the Society for Neuroscience Meeting, Washington, DC, 2005.

Rate-based neuron models have been successful in understanding many aspects of development, such as the development of orientation selectivity (Bienenstock et al., 1982; Oja, 1982; Linsker, 1986; Miller, 1992; Bell and Sejnowski, 1997), the particular dynamics of visual deprivation (Blais et al., 1999), and the development of direction selectivity (Wimbauer et al., 1997; Blais et al., 2000). These models do not address phenomena such as temporal coding, spike-timing dependent synaptic plasticity, or any short-time behavior of neurons. More detailed spiking models (Song, 2000; Shouval, 2002; Yeung, 2004) address these issues and have had some success, but have failed to develop receptive fields in natural environments. These more detailed models are difficult to explore, given their large number of parameters and run-time computational limitations. In addition, their results are often difficult to compare directly with those of the rate-based models. We propose a model, which we call the spiking-rate model, that can serve as a middle ground between the overly simplistic rate-based models and the more detailed spiking models. The spiking-rate model is a spiking model in which all of the underlying processes are continuous Poisson, the summation of inputs is entirely linear (although non-linearities can be added), and the generation of outputs is done by calculating a rate output and then generating an appropriate Poisson spike train. In this way, the limiting behavior is identical to a rate-based model, but the properties of spiking models can be incorporated more easily. We present the development of receptive fields with this model in various visual environments. We then present the necessary conditions for receptive field development in the spiking-rate models, and make comparisons to detailed spiking models, in order to more clearly understand the necessary conditions for receptive field development.
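
A minimal sketch of the spiking-rate idea described above: compute a linear rate from the input rates, then emit Poisson spikes at that rate, so that the empirical firing rate converges to the rate-model prediction in the limit. The weights, input rates, and time step below are arbitrary illustrative values, not parameters from the abstract:

```python
import numpy as np

rng = np.random.default_rng(1)

def spiking_rate_step(w, x, dt=0.001):
    # Linear summation of input rates gives the output rate; output
    # spikes are then drawn from a Poisson process at that rate.
    rate = max(float(np.dot(w, x)), 0.0)   # rectify: rates are non-negative
    spikes = rng.poisson(rate * dt)
    return rate, spikes

w = np.array([0.5, 1.5])        # synaptic weights (arbitrary)
x = np.array([10.0, 20.0])      # presynaptic firing rates in Hz (arbitrary)

steps, dt = 100_000, 0.001      # simulate 100 s in 1 ms bins
total_spikes = sum(spiking_rate_step(w, x, dt)[1] for _ in range(steps))

rate_model = float(np.dot(w, x))            # rate-based prediction: 35 Hz
empirical = total_spikes / (steps * dt)     # measured firing rate in Hz
```

Because the spike generation is a thin Poisson layer over a linear rate computation, averaging over time recovers the rate-based model exactly, which is what makes the two model classes directly comparable.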