Technique for teaching a computer or a robot new behaviors
In computer science, programming by demonstration (PbD) is an end-user development technique for teaching a computer or a robot new behaviors by demonstrating the task directly, instead of programming it through machine commands.
The terms programming by example (PbE) and programming by demonstration (PbD) appeared in software development research as early as the mid-1980s[1] to describe a way of defining a sequence of operations without having to learn a programming language. The usual distinction in the literature is that in PbE the user gives a prototypical product of the computer execution, such as a row in the desired results of a query, while in PbD the user performs a sequence of actions that the computer must repeat, generalizing it so that it can be applied to different data sets.
The two terms were at first used interchangeably, but PbE was then mostly adopted by software development researchers, while PbD was mostly adopted by robotics researchers. Today, PbE refers to an entirely different concept, supported by new programming languages that are similar to simulators. This framework can be contrasted with Bayesian program synthesis.
Robot programming by demonstration
The PbD paradigm first attracted the robotics industry because of the costs involved in developing and maintaining robot programs. In this field, the operator often has implicit knowledge of the task to achieve (knowing how to do it), but usually lacks the programming skills (or the time) required to reconfigure the robot. Demonstrating how to achieve the task through examples thus allows the robot to learn the skill without each detail being explicitly programmed.
The first PbD strategies proposed in robotics were based on teach-in, guiding or play-back methods, which consisted essentially of moving the robot (through a dedicated interface or manually) through a set of relevant configurations that it should adopt sequentially (position, orientation, state of the gripper). The method was then progressively improved, mainly by focusing on teleoperation control and by using different interfaces such as vision.
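The teach-in and play-back idea described above can be illustrated with a minimal sketch. The names Waypoint, TeachInRecorder and move_to are hypothetical and stand in for whatever interface a real robot controller would offer; the sketch only shows that the recorded configurations are repeated verbatim, with no generalization.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Waypoint:
    """One taught configuration (hypothetical structure for illustration)."""
    position: Tuple[float, float, float]     # x, y, z of the end effector
    orientation: Tuple[float, float, float]  # roll, pitch, yaw
    gripper_closed: bool                     # state of the gripper


class TeachInRecorder:
    """Stores configurations in the order the operator demonstrates them."""

    def __init__(self) -> None:
        self._waypoints: List[Waypoint] = []

    def record(self, waypoint: Waypoint) -> None:
        # Called each time the operator confirms a relevant configuration.
        self._waypoints.append(waypoint)

    def play_back(self, move_to: Callable[[Waypoint], None]) -> int:
        # Direct repetition: send every stored configuration, in order,
        # to the (assumed) robot command function `move_to`.
        for waypoint in self._waypoints:
            move_to(waypoint)
        return len(self._waypoints)


if __name__ == "__main__":
    recorder = TeachInRecorder()
    recorder.record(Waypoint((0.3, 0.0, 0.2), (0.0, 0.0, 0.0), False))
    recorder.record(Waypoint((0.3, 0.0, 0.1), (0.0, 0.0, 0.0), True))
    recorder.play_back(print)  # `print` stands in for a real robot command
```

Because the waypoints are replayed exactly as taught, any change in the workpiece or the robot requires a new demonstration.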
However, these PbD methods still relied on direct repetition, which was useful in industry only for assembly lines that used exactly the same product components. To apply the concept to products with different variants, or to transfer the programs to new robots, generalization became a crucial issue. The first attempts at generalizing a skill relied mainly on help from the user, through queries about the user's intentions. Different levels of abstraction were then proposed to resolve the generalization issue, broadly divided into learning methods at a symbolic level and at a trajectory level.
The development of humanoid robots naturally brought a growing interest in robot programming by demonstration. Because a humanoid robot is expected by its nature to adapt to new environments, not only is its human appearance important, but the algorithms used to control it must also be flexible and versatile. Because environments change continuously and the variety of tasks a robot is expected to perform is large, the robot needs the ability to continuously learn new skills and to adapt existing skills to new contexts.
Research in PbD also progressively departed from its original purely engineering perspective to adopt an interdisciplinary approach, taking insights from neuroscience and social sciences to emulate the process of imitation in humans and animals. With the increasing consideration of this body of work in robotics, the notion of robot programming by demonstration (also known as RPD or RbD) was progressively replaced by the more biological label of learning by imitation.
Neurally-imprinted Stable Vector Fields (NiVF)
Neurally-imprinted Stable Vector Fields[2] (NiVF) was introduced as a novel learning scheme at ESANN 2013. It shows how to imprint vector fields into neural networks such as Extreme Learning Machines (ELMs) in a guaranteed stable manner; the paper won the best student paper award. The networks represent movements, and asymptotic stability is incorporated through constraints derived from Lyapunov stability theory. The approach was shown to successfully perform stable and smooth point-to-point movements learned from human handwriting movements.
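The kind of constraint that Lyapunov stability theory provides can be written out as follows. This is a sketch under stated assumptions rather than the formulation of the cited paper: the symbols v, L, x^*, W^out, W^inp, b, sigma and the sample points x_k are notation introduced here for illustration.

```latex
% Movement represented as an autonomous dynamical system with target x^*:
\dot{x} = v(x), \qquad v(x^*) = 0 .

% L is a Lyapunov function for asymptotic stability at x^* if
L(x^*) = 0, \qquad L(x) > 0 \quad \forall x \neq x^*, \qquad
\dot{L}(x) = \nabla L(x)^{\top} v(x) < 0 \quad \forall x \neq x^* .

% Assumed network form (ELM with fixed random input weights):
v(x) = W^{\mathrm{out}} \, \sigma\!\left(W^{\mathrm{inp}} x + b\right) .

% Enforced at sampled points x_k, the decrease condition is linear in W^out:
\nabla L(x_k)^{\top} W^{\mathrm{out}} \, \sigma\!\left(W^{\mathrm{inp}} x_k + b\right) < 0 .
```

Since only the output weights are trained in an ELM, a least-squares fit to the demonstrations subject to such inequalities stays a convex problem, which is one plausible reason this network type suits the scheme.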
It is also possible to learn the Lyapunov candidate that is used to stabilize the dynamical system.[3] This calls for a neural learning scheme that estimates stable dynamical systems from demonstrations in a two-stage process: first, a data-driven Lyapunov function candidate is estimated; second, stability is incorporated by means of a novel method that respects local constraints in the neural learning. This allows stable dynamics to be learned while sustaining the accuracy of the dynamical system and robustly generating complex movements.
Diffeomorphic Transformations
Diffeomorphic transformations turn out to be particularly suitable for substantially increasing the learnability of dynamical systems for robotic motions. The stable estimator of dynamical systems (SEDS) is an approach for learning time-invariant systems to control robotic motions. However, it is restricted to dynamical systems with quadratic Lyapunov functions. The Tau-SEDS approach[4] overcomes this limitation in a mathematically elegant manner.
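Why a diffeomorphism helps can be seen from a short derivation. This is a generic sketch, not the exact construction of the cited work; the symbols tau, J_tau, g, P and V are notation assumed here, with the target placed at the origin of the transformed space.

```latex
% Diffeomorphism into a space where a quadratic Lyapunov function fits the data:
y = \tau(x), \qquad \tau(x^*) = 0, \qquad \dot{y} = J_{\tau}(x)\,\dot{x} .

% A system learned in the transformed space, stable with respect to a quadratic function:
\dot{y} = g(y), \qquad \tilde{V}(y) = y^{\top} P\, y, \quad P \succ 0, \qquad
\nabla \tilde{V}(y)^{\top} g(y) < 0 \quad \forall y \neq 0 .

% Pulling the dynamics back to the original space:
\dot{x} = J_{\tau}(x)^{-1}\, g\big(\tau(x)\big) .

% The composed function V = \tilde{V} \circ \tau is a (generally non-quadratic) Lyapunov function:
\dot{V}(x) = \nabla \tilde{V}(y)^{\top} J_{\tau}(x)\,\dot{x}
           = \nabla \tilde{V}(y)^{\top} g(y) < 0 \quad \forall x \neq x^* .
```

The pulled-back system is therefore stable even when the demonstrations are incompatible with any quadratic Lyapunov function in the original coordinates.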
Parameterized skills
After a task has been demonstrated by a human operator, the trajectory is stored in a database. Parameterized skills provide easier access to this raw data.[5] A skill queries the database and generates a trajectory. For example, the skill “opengripper(slow)” is first sent to the motion database, which responds with the stored movement of the robot arm. The parameters of a skill allow the policy to be modified to fulfill external constraints.
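A parameterized skill lookup of this kind might be sketched as follows. The database contents, the name open_gripper, the speed labels and the function run_skill are all hypothetical; the sketch assumes that the only parameter is a speed setting that rescales the timing of the stored demonstration.

```python
from typing import Dict, List, Tuple

# One sample of a stored demonstration: (time in seconds, gripper opening in metres).
Sample = Tuple[float, float]

# Hypothetical motion database filled from earlier demonstrations.
MOTION_DB: Dict[str, List[Sample]] = {
    "open_gripper": [(0.0, 0.00), (0.5, 0.04), (1.0, 0.08)],
}

# Hypothetical mapping from the skill parameter to a playback speed factor.
SPEED_FACTORS: Dict[str, float] = {"slow": 0.5, "normal": 1.0, "fast": 2.0}


def run_skill(name: str, speed: str = "normal") -> List[Sample]:
    """Query the database for a skill and adapt the result to the parameter."""
    demonstration = MOTION_DB[name]   # raw stored trajectory
    factor = SPEED_FACTORS[speed]     # external constraint on timing
    # Stretch or compress the time stamps; the spatial path stays unchanged.
    return [(t / factor, opening) for t, opening in demonstration]


if __name__ == "__main__":
    for t, opening in run_skill("open_gripper", "slow"):
        print(f"t={t:.1f} s  opening={opening:.2f} m")
```

Here the call run_skill("open_gripper", "slow") plays the role of “opengripper(slow)”: the caller names the skill and its parameter without touching the raw trajectory data.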
A skill is an interface between a task name, given in natural language, and the underlying spatiotemporal movement in 3D space, which consists of points. Single skills can be combined into a task to define longer motion sequences from a high-level perspective. For practical applications, different actions are stored in a skill library. To raise the abstraction level further, skills can be converted into dynamic movement primitives (DMPs). These generate, on the fly, a robot trajectory that was unknown at the time of the demonstration, which helps to increase the flexibility of the solver.[6]
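How a dynamic movement primitive produces a trajectory that was never demonstrated can be sketched with a one-dimensional example. The code below follows a commonly used discrete DMP formulation (a critically damped spring driven by a learned forcing term); the class name DMP1D, the gain values and the basis-function layout are assumptions chosen for illustration, not the method of the cited work.

```python
import numpy as np


class DMP1D:
    """One-dimensional discrete dynamic movement primitive (illustrative sketch)."""

    def __init__(self, n_basis=20, alpha_z=25.0, alpha_x=4.0):
        self.alpha_z = alpha_z
        self.beta_z = alpha_z / 4.0  # critically damped spring
        self.alpha_x = alpha_x
        self.tau = 1.0               # movement duration, set in fit()
        # Centres spaced evenly in time, hence exponentially in the phase x.
        self.centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
        spacing = np.abs(np.diff(self.centers))
        self.widths = 1.0 / np.append(spacing, spacing[-1]) ** 2
        self.weights = np.zeros(n_basis)

    def fit(self, demo, dt):
        """Learn the forcing-term weights from one demonstrated trajectory."""
        demo = np.asarray(demo, dtype=float)
        self.tau = dt * (len(demo) - 1)
        vel = np.gradient(demo, dt)
        acc = np.gradient(vel, dt)
        x = np.exp(-self.alpha_x * np.arange(len(demo)) * dt / self.tau)
        # Forcing term that would make the spring system reproduce the demo.
        f_target = self.tau ** 2 * acc - self.alpha_z * (
            self.beta_z * (demo[-1] - demo) - self.tau * vel)
        s = x * (demo[-1] - demo[0])
        psi = np.exp(-self.widths[:, None] * (x[None, :] - self.centers[:, None]) ** 2)
        # Locally weighted regression, one weight per basis function.
        self.weights = (psi @ (s * f_target)) / (psi @ (s * s) + 1e-10)
        return self

    def rollout(self, y0, goal, dt, duration):
        """Generate a trajectory from y0 towards a possibly new goal."""
        y, z, x = float(y0), 0.0, 1.0
        path = [y]
        for _ in range(int(round(duration / dt))):
            psi = np.exp(-self.widths * (x - self.centers) ** 2)
            f = (psi @ self.weights) / (psi.sum() + 1e-10) * x * (goal - y0)
            z_dot = (self.alpha_z * (self.beta_z * (goal - y) - z) + f) / self.tau
            y_dot = z / self.tau
            x_dot = -self.alpha_x * x / self.tau
            z, y, x = z + z_dot * dt, y + y_dot * dt, x + x_dot * dt  # Euler step
            path.append(y)
        return np.array(path)


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 101)
    demo = 10 * t ** 3 - 15 * t ** 4 + 6 * t ** 5  # smooth demonstration from 0 to 1
    dmp = DMP1D().fit(demo, dt=0.01)
    new_path = dmp.rollout(y0=0.0, goal=2.0, dt=0.002, duration=1.5)
    print("final position for new goal:", new_path[-1])
```

The demonstration ends at 1, but the rollout is asked to reach 2: the learned forcing term shapes the motion while the spring term pulls it to the new goal, which is the sense in which the trajectory is generated on the fly.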
Non-robotic use
For end users who want to automate a workflow in a complex tool (e.g. Photoshop), the simplest case of PbD is the macro recorder.
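A macro recorder can be sketched in a few lines. The example below uses a hypothetical text-editing tool with three actions; the names ACTIONS and MacroRecorder are invented for illustration and do not correspond to any real application's API. The user performs the actions once on one document, and the recorded sequence is then replayed on another.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical set of actions offered by the tool.
ACTIONS: Dict[str, Callable[..., str]] = {
    "strip": lambda text: text.strip(),
    "upper": lambda text: text.upper(),
    "replace": lambda text, old, new: text.replace(old, new),
}


class MacroRecorder:
    """Records the actions a user performs and replays them on new input."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, tuple]] = []

    def perform(self, text: str, action: str, *args: str) -> str:
        # Demonstration phase: apply the action and remember it.
        self.steps.append((action, args))
        return ACTIONS[action](text, *args)

    def replay(self, text: str) -> str:
        # Execution phase: repeat the recorded sequence on different data.
        for action, args in self.steps:
            text = ACTIONS[action](text, *args)
        return text


if __name__ == "__main__":
    recorder = MacroRecorder()
    doc = recorder.perform("  hello world ", "strip")
    doc = recorder.perform(doc, "replace", "world", "robot")
    print(doc)                                   # result of the demonstration
    print(recorder.replay("  good world   "))    # same steps on a new document
```

The recorder repeats the steps literally, so it works only when the new document has the same structure as the demonstrated one; this is the same direct-repetition limitation that early robot play-back methods had.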
Alizadeh, Tohid; Saduanov, Batyrkhan (2017). "Robot programming by demonstration of multiple tasks within a common environment". 2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI). IEEE. pp. 608–613. doi:10.1109/mfi.2017.8170389. ISBN 978-1-5090-6064-1. S2CID 40697084.
Schaal, Stefan; Ijspeert, Auke; Billard, Aude (2003), "Computational approaches to motor learning by imitation", Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 358 (1431): 537–547, doi:10.1098/rstb.2002.1258, PMC 1693137, PMID 12689379
Robots that imitate humans, Cynthia Breazeal and Brian Scassellati, Trends in Cognitive Sciences, 6:1, 2002, pp. 481–87
Billard, A, "Imitation", in Arbib, MA (ed.), Handbook of Brain Theory and Neural Networks, MIT Press, pp. 566–69.