Friday, December 22nd, 2017

From the Inside: Optimization (1 of 2)

by Benjamin Recht, UC Berkeley

Simons Minisymposium on Optimization Challenges in Robotics

This semester at the Simons Institute, we experimented with holding minisymposia, inviting experts on topics related to the program on Bridging Continuous and Discrete Optimization to come for a half day and present what they see as grand challenge problems facing their research and society. The goal of these symposia was twofold: to expose program participants to new problems in their areas, and to give the visiting experts access to theorists who might suggest new approaches to their core challenges.

Our first symposium focused on optimization challenges in robotics. Indeed, optimization is one of the core components of contemporary robotic technology, where the main paradigm for planning and actuation is Model Predictive Control (MPC). In MPC, a robot makes a plan based on a simulation from the present until a short time into the future. The robot executes exactly one step of this plan, and then, based on what it observes after taking this action, performs another short-time simulation to plan the next action. This feedback loop couples the robot's internal simulator to the actual behavior of the robot and its environment. Model predictive control lets practitioners put all of their cleverness into solving optimization problems – but those problems need to be solved in real time! The faster one can solve the associated simulation optimization, the more intricate the maneuvers that can be executed.
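
To make the loop concrete, here is a minimal runnable sketch of receding-horizon control for a toy double-integrator "robot" (a point mass with position and velocity). The dynamics, horizon, cost, and the use of cvxpy to solve each short-horizon plan are illustrative assumptions, not a description of any particular robotic system.

```python
import numpy as np
import cvxpy as cp

A = np.array([[1.0, 0.1], [0.0, 1.0]])  # discrete dynamics: x+ = A x + B u
B = np.array([[0.0], [0.1]])
H = 20                                  # short planning horizon

def plan(x0):
    """Solve the short-horizon planning problem from state x0."""
    x = cp.Variable((2, H + 1))
    u = cp.Variable((1, H))
    cost = cp.sum_squares(x) + 0.1 * cp.sum_squares(u)  # drive state to origin
    cons = [x[:, 0] == x0, cp.abs(u) <= 1.0]            # actuator limits
    cons += [x[:, t + 1] == A @ x[:, t] + B @ u[:, t] for t in range(H)]
    cp.Problem(cp.Minimize(cost), cons).solve()
    return u.value

x = np.array([5.0, 0.0])      # start away from the goal
for step in range(50):
    u_plan = plan(x)          # simulate/optimize a short time into the future
    u0 = u_plan[:, 0]         # execute exactly one step of the plan
    x = A @ x + B @ u0        # the simulator stands in for the real robot
```

Each pass through the loop re-solves the planning problem from the latest observed state, which is exactly the feedback coupling described above.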

The overarching challenges in robotics can thus be summarized as: What can happen in the world – what range of different situations can the robot encounter? What is the best framework to model these situations? What can humans who are interacting with the robots do? How can we model human actions? And how can we handle the combinatorial explosions that arise from such environmental uncertainty? These problems lie on the bridge between continuous and discrete optimization and require innovations in mixed integer programming, real-time and distributed optimization, game theory, and machine learning in order to make progress.

Russ Tedrake presented a discussion of challenges in robotic control that arise through contacts. When robots encounter contacts (for example, if they are walking, if they are pushed, or if they accidentally bump a wall), the simulation in MPC requires planning around discontinuities in the dynamics models. At contact points, the dynamics switch, making the optimization problem nonsmooth and highly nonconvex. Tedrake showed how such contact problems can be tackled by solving mixed integer nonlinear programs, but shortcomings of existing tools mean that most roboticists avoid such discrete optimization problems. To use mixed integer optimization in real time on walking robots, sophisticated branch-and-bound heuristics and warm-starting strategies are needed. Improvements in discrete optimization would thus translate directly into improvements in the capabilities of walking robots.
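
To see the mode-switching structure, consider a toy sketch in this spirit (not Tedrake's actual formulation): a unit point mass that is either in flight (no contact force) or on the ground (force allowed), with the either/or logic encoded by binary variables and big-M constraints. The model, constants, and cvxpy encoding are illustrative assumptions.

```python
import cvxpy as cp

T, dt, g, M = 10, 0.1, 9.81, 100.0   # horizon, step size, gravity, big-M

q = cp.Variable(T + 1)               # height of a unit point mass
v = cp.Variable(T + 1)               # vertical velocity
f = cp.Variable(T)                   # ground contact force at each step
z = cp.Variable(T, boolean=True)     # 1 = contact mode, 0 = flight mode

cons = [q[0] == 0, v[0] == 0, q >= 0, f >= 0]
for t in range(T):
    # Semi-implicit Euler dynamics: gravity plus contact force (unit mass).
    cons += [v[t + 1] == v[t] + dt * (f[t] - g),
             q[t + 1] == q[t] + dt * v[t + 1]]
    # Big-M complementarity: force is allowed only in contact mode, and
    # contact mode is allowed only when the mass is on the ground (q = 0).
    cons += [f[t] <= M * z[t], q[t] <= M * (1 - z[t])]

# Jump so the final height is 0.5, using as little force as possible.
prob = cp.Problem(cp.Minimize(cp.abs(q[T] - 0.5) + 1e-3 * cp.sum(f)), cons)
prob.solve()   # needs a MIP-capable solver installed, e.g. GLPK_MI or CBC
```

Each binary variable selects a contact mode, and the branch-and-bound tree the solver explores over those variables is precisely where the heuristics and warm starts mentioned above would pay off.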

Francesco Borrelli discussed MPC in the context of autonomous driving. Here, the dynamics are usually predictable because cars tend to operate in safe regimes. But in emergency situations, such as when a vehicle encounters black ice, more complicated control procedures need to be integrated to ensure safety. Borrelli described how the resulting nonlinear models are solved with off-the-shelf solvers such as IPOPT. One of the major obstacles to the widespread deployment of self-driving cars is incorporating updates at fleet scale: each new version of the internal controller will have to work on a fleet of millions of vehicles that all interact with highly variable environmental conditions. The major challenge problem in this space is to define control laws that incorporate the experiences of millions of cars and improve performance on all of them. This task will require novel integration of control design, robust optimization, and large-scale machine learning to provide new, safe control policies.
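
As a sketch of what such a nonlinear program can look like in practice, the snippet below sets up a small trajectory-planning problem for a kinematic bicycle model and hands it to IPOPT through the CasADi modeling layer. The vehicle model, bounds, and target are illustrative placeholders, not Borrelli's actual controller.

```python
import casadi as ca

H, dt = 30, 0.1
opti = ca.Opti()

X = opti.variable(3, H + 1)      # state trajectory: x, y, heading
U = opti.variable(2, H)          # controls: speed and steering angle
v, delta = U[0, :], U[1, :]

for t in range(H):
    # Kinematic bicycle model, discretized with forward Euler.
    xdot = ca.vertcat(v[t] * ca.cos(X[2, t]),
                      v[t] * ca.sin(X[2, t]),
                      v[t] * ca.tan(delta[t]) / 2.5)  # 2.5 m wheelbase
    opti.subject_to(X[:, t + 1] == X[:, t] + dt * xdot)

opti.subject_to(X[:, 0] == 0)                    # start at the origin
opti.subject_to(opti.bounded(-0.5, delta, 0.5))  # steering limits (rad)
opti.subject_to(opti.bounded(0.0, v, 15.0))      # speed limits (m/s)
opti.minimize(ca.sumsqr(X[0:2, H] - ca.DM([20.0, 5.0])))  # reach a waypoint

opti.solver('ipopt')   # hand the nonlinear program to IPOPT
sol = opti.solve()     # sol.value(X) holds the planned trajectory
```

In a real-time setting, a problem of this shape would be re-solved at every control step, which is why solver speed directly limits what maneuvers are possible.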

Anca Dragan presented an optimization paradigm for robotic planning when humans are in the loop. Two basic scenarios in which humans and robots encounter each other are robot-assisted assembly lines and autonomous cars sharing the road with human drivers. These cases are quite distinct: in manufacturing, humans and robots work in tandem to build products, while in driving, the objectives of robotic and human drivers are not necessarily aligned. Dragan described an optimization framework for modeling human behavior. The robot runs its usual MPC loop, but it includes in its simulation a human running another MPC loop. That is, the robot tries to simulate what the human would do if the human were acting optimally. The resulting problem is a challenging one in game theory, and tools from this area could be applied to simplify the simulation procedure and to design and understand policies that satisfy both human and robotic agents. Another layer of difficulty arises when we admit that we typically don't know the objectives being pursued by human agents, who may not act optimally or even rationally. These objectives must be estimated from observations, a process commonly called inverse reinforcement learning, or inverse optimal control. Understanding the errors introduced in learning human motives from data is a challenging problem that needs to be addressed in order for robots to interact with humans in a reliable and safe manner.
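
The nesting can be seen in a stylized one-dimensional example: each agent picks a lateral position on the road, prefers its own target, and pays a cost for being close to the other, and the robot optimizes its choice through the human's predicted best response. The cost functions and numbers below are hypothetical stand-ins for learned objectives, and the single best-response step is a Stackelberg-style simplification of the nested-MPC idea, not Dragan's full framework.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def human_cost(u_h, u_r):
    # The human prefers lateral position 1.0 but pays for being close
    # to the robot (a hypothetical stand-in for a learned objective).
    return (u_h - 1.0) ** 2 + 2.0 * np.exp(-(u_h - u_r) ** 2)

def human_best_response(u_r):
    # Inner loop: the human's simulated (approximately optimal) reaction.
    return minimize_scalar(human_cost, args=(u_r,), bounds=(0.0, 2.0),
                           method='bounded').x

def robot_cost(u_r):
    u_h = human_best_response(u_r)  # predict the human's reaction
    return (u_r - 1.5) ** 2 + 2.0 * np.exp(-(u_h - u_r) ** 2)

# Outer loop: the robot optimizes through the human's predicted response.
best = minimize_scalar(robot_cost, bounds=(0.0, 2.0), method='bounded')
print(best.x, human_best_response(best.x))
```

Replacing the hand-written human_cost with one estimated from observed behavior is the inverse reinforcement learning step discussed above, and errors in that estimate propagate directly into the robot's plan.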
