[TOC]
- Title: Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf Restraining Specification
- Author: Giuseppe De Giacomo et al.
- Publish Year: 2019
- Review Date: Mar 2022
Summary of paper
The authors investigate the concept of a “restraining bolt” that controls the behaviour of a learning agent. Essentially, the bolt controls the RL agent by providing it with additional rewards.
Although this mechanism is closely related to reward shaping (both inject extra rewards into the agent), the contributions of this paper are:
- theoretical support for combining MDPs with LTLf/LDLf specifications
- a construction that compiles the LTLf/LDLf specification into a deterministic finite-state automaton (DFA) whose states drive the additional reward signal.
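The DFA-as-reward-monitor idea can be sketched in a few lines. This is a hand-written toy (the class name `RestrainingBolt`, the transition encoding, and the formula are my own illustration, not the paper's construction): in the paper the automaton is compiled automatically from an LTLf/LDLf formula, whereas here it is written out by hand for the task "first observe fluent A, then fluent B".

```python
# Hypothetical sketch of a DFA acting as a "restraining bolt" reward monitor.
# In the paper the DFA is compiled from an LTLf/LDLf formula; here it is
# hand-written for the task "first observe A, then observe B".

class RestrainingBolt:
    def __init__(self, transitions, initial, accepting, reward=1.0):
        self.transitions = transitions  # (state, fluent) -> next state
        self.state = self.initial = initial
        self.accepting = accepting
        self.reward = reward

    def reset(self):
        self.state = self.initial

    def step(self, fluent):
        """Advance the DFA on the observed fluent; return the bolt's extra reward."""
        self.state = self.transitions.get((self.state, fluent), self.state)
        return self.reward if self.state in self.accepting else 0.0


# DFA for "first A, then B": q0 --A--> q1 --B--> q2 (accepting)
bolt = RestrainingBolt(
    transitions={("q0", "A"): "q1", ("q1", "B"): "q2"},
    initial="q0",
    accepting={"q2"},
)
print([bolt.step(f) for f in ["B", "A", "B"]])  # reward only once q2 is reached
```

Note that the bolt only sees fluents (its own representation of the world), never the agent's internal state — which is exactly the decoupling discussed below.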
Some key terms
Restraining Bolts
Restraining bolts were small, cylindrical devices that could be affixed to a droid in order to limit its functions and enforce its obedience.
- two distinct representations of the world
    - one for the agent, chosen by the designer of the agent
    - one for the restraining bolt, chosen by the authority imposing the bolt.
- are these two representations related to each other?
    - NO: the agent designer and the authority imposing the bolt are not aligned (why should they be?)
    - YES: the agent and the bolt act in the same real world,
        - so the glue between the two representations is the world itself.
- but can a restraining bolt exist at all?
    - YES: for example, one based on RL.
The learning agent and the restraining bolt are highly decoupled.
You can attach a specific restraining bolt to an agent, and the same agent can then play new games (because the bolt changes the agent's behaviour).
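One way this decoupling plays out in practice is to augment the agent's state with the bolt's automaton state and run ordinary tabular Q-learning over the product. The following is a self-contained toy (the corridor environment, fluent placement, and all hyperparameters are my own illustration, not the paper's experiments): a 1-D corridor with fluent "A" at cell 1 and "B" at cell 3, and a hand-written DFA for "visit A, then B" supplying the only reward.

```python
import random
from collections import defaultdict

# Hypothetical toy setup (not from the paper): 1-D corridor, cells 0..3,
# fluent "A" at cell 1, fluent "B" at cell 3. The bolt's DFA encodes the
# LTLf-style task "first visit A, then visit B" and supplies all reward.

TRANS = {("q0", "A"): "q1", ("q1", "B"): "q2"}  # hand-written DFA
ACCEPT = {"q2"}
ACTIONS = (-1, +1)

def fluent(pos):
    return {1: "A", 3: "B"}.get(pos)

def bolt_step(q, f):
    return TRANS.get((q, f), q)

def train(episodes=300, horizon=10, alpha=0.5, gamma=0.9, eps=0.2):
    # Q is indexed by the PRODUCT state (position, DFA state): the agent
    # never sees the LTLf formula, only the automaton state and its reward.
    Q = defaultdict(float)
    for _ in range(episodes):
        pos, q = 0, "q0"
        for _ in range(horizon):
            s = (pos, q)
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda x: Q[(s, x)]))
            pos = min(3, max(0, pos + a))
            q = bolt_step(q, fluent(pos))
            r = 1.0 if q in ACCEPT else 0.0   # reward comes from the bolt only
            best = max(Q[((pos, q), x)] for x in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
    return Q

random.seed(0)
Q = train()

# Greedy rollout: the learned policy should drive the DFA into q2.
pos, q = 0, "q0"
for _ in range(10):
    a = max(ACTIONS, key=lambda x: Q[((pos, q), x)])
    pos = min(3, max(0, pos + a))
    q = bolt_step(q, fluent(pos))
print(q)
```

Swapping in a different DFA (i.e. a different bolt) changes what the agent learns to do without touching the environment or the learning code, which is the sense in which the bolt and the agent are decoupled.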
Minor comments
check this ICAPS 2019 video about this paper
https://www.youtube.com/watch?v=qGLRxfKD40s