Vive Center Publications

2020

Feature Expansive Reward Learning: Rethinking Human Input
When a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input, but they rely on handcrafted features. When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input in which the person guides the robot from states…

Author(s): Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan. 
Journal/Conference: 
arxiv.org/abs/2006.13208

Learned Initializations for Optimizing Coordinate-Based Neural Representations
We propose applying standard meta-learning algorithms to learn the initial weight parameters for coordinate based neural representations based on the underlying class of signals being represented (e.g., images of faces or 3D models of chairs). Despite requiring only a minor change in implementation, using these learned initial weights enables faster convergence during optimization and can serve as a strong prior over the signal class being modeled, resulting in better generalization when only partial observations of a given signal are available…

Author(s): Matthew Tancik*, Ben Mildenhall*, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng
Journal/Conference: 
arxiv.org/abs/2012.02189

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location (x, y, z) and viewing direction (θ, φ)) and whose output is the volume density and view-dependent emitted radiance at that spatial location. We synthesize views by querying 5D coordinates along camera rays…

Author(s): Ben Mildenhall*, Pratul P. Srinivasan*, Matthew Tancik*, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
Journal/Conference: ECCV (2020) Oral – Best Paper Honorable Mention
arxiv.org/abs/2003.08934

Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains
We show that passing input points through a simple Fourier feature mapping enables a multilayer perceptron (MLP) to learn high-frequency functions in low-dimensional problem domains. These results shed light on recent advances in computer vision and graphics that achieve state-of-the-art results by using MLPs to represent complex 3D objects and scenes. Using tools from the neural tangent kernel (NTK) literature, we show that a standard MLP fails to learn high frequencies both in theory and in practice. To overcome this spectral bias, we use a Fourier feature mapping to transform the effective NTK into a stationary kernel with a tunable bandwidth. We suggest an approach for selecting problem-specific Fourier features that greatly improves the performance of MLPs for low-dimensional regression tasks relevant to the computer vision and graphics communities.

Author(s): Matthew Tancik*, Ben Mildenhall*, Pratul P. Srinivasan*, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, Ren Ng
Journal/Conference: NeurIPS (2020) Spotlight
arxiv.org/abs/2006.10739
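
The mapping described in this abstract can be illustrated with a minimal sketch. It assumes the common random-Fourier form, in which a point v is projected by a Gaussian matrix B and passed through cosine and sine; the scale `sigma`, the number of frequencies, and the function name are illustrative choices, not values taken from the paper.

```python
import numpy as np

def fourier_features(v, B):
    """Map points v of shape (n, d) to [cos(2*pi*v B^T), sin(2*pi*v B^T)], shape (n, 2m)."""
    proj = 2.0 * np.pi * v @ B.T  # (n, m) projections onto the m frequency vectors
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
sigma = 10.0                                # assumed bandwidth scale; this is the tunable knob
B = sigma * rng.standard_normal((256, 2))   # m = 256 random frequencies for 2D inputs
v = rng.uniform(0.0, 1.0, size=(5, 2))      # five 2D coordinates in [0, 1)^2
features = fourier_features(v, B)           # these features would be fed to the MLP
```

The inner product of two mapped points equals a sum of cosines of B applied to their difference, so it depends only on v1 - v2. That is the sense in which the induced kernel is stationary, and changing `sigma` widens or narrows it.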

GenScan: A Generative Method for Populating Parametric 3D Scan Datasets
The availability of rich 3D datasets corresponding to the geometrical complexity of the built environments is considered an ongoing challenge for 3D deep learning methodologies. To address this challenge, we introduce GenScan, a generative system that populates synthetic 3D scan datasets in a parametric fashion. The system takes an existing captured 3D scan as an input and outputs alternative variations of the building layout including walls, doors, and furniture with corresponding textures. GenScan is a fully automated system that can also be manually controlled by a user through an assigned user interface. Our proposed system utilizes a combination of a hybrid deep neural network and a parametrizer module to extract and transform elements of a given 3D scan…

Author(s): Mohammad Keshavarzi, Oladapo Afolabi, Luisa Caldas, Allen Y. Yang, Avideh Zakhor. 
Journal/Conference: December 2020
arxiv.org/abs/2012.03998

Optimistic Dual Extrapolation for Non-monotone Variational Inequality
Author(s): Chaobing Song, Yichao Zhou, Zhengyuan Zhou, Yong Jiang, and Yi Ma
Journal/Conference: NeurIPS, December 2020.

Voronoi Progressive Widening: Efficient Online Solvers for Continuous Space MDPs and POMDPs with Provably Optimal Components
Markov decision processes (MDPs) and partially observable MDPs (POMDPs) can effectively represent complex real-world decision and control problems. However, continuous space MDPs and POMDPs, i.e. those having continuous state, action and observation spaces, are extremely difficult to solve, and there are few online algorithms with convergence guarantees. This paper introduces Voronoi Progressive Widening (VPW), a general technique to modify tree search algorithms to effectively handle continuous or hybrid action spaces, and proposes…

Author(s): Michael H. Lim, Claire J. Tomlin, Zachary N. Sunberg. 
Journal/Conference: December 2020.
arxiv.org/abs/2012.10140

Stochastic Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization
In this paper, we introduce a simplified and unified method for finite-sum convex optimization, named Stochastic Variance Reduction via Accelerated Dual Averaging (SVR-ADA). In the nonstrongly convex and smooth setting, SVR-ADA can attain an O(1/n)-accurate solution in O(n log log n) number of stochastic gradient evaluations, where n is the number of samples; meanwhile, SVR-ADA matches the lower bound…

Author(s): Chaobing Song, Yong Jiang, and Yi Ma
Journal/Conference: NeurIPS, December 2020.
arXiv:2006.10281

Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization
Recent advances have shown that implicit bias of gradient descent on over-parameterized models enables the recovery of low-rank matrices from linear measurements, even with no prior knowledge on the intrinsic rank. In contrast, for robust low-rank matrix recovery from grossly corrupted measurements, over-parameterization leads to overfitting without prior knowledge on both the intrinsic rank and sparsity of corruption. This paper shows that with a double over-parameterization for both…

Author(s): Chong You, Zhihui Zhu, Qing Qu, and Yi Ma
Journal/Conference: NeurIPS (spotlight), December 2020.
arXiv:2006.08857

Dynamic Legged Manipulation of a Ball Through Multi-Contact Optimization
The feet of robots are typically used to design locomotion strategies, such as balancing, walking, and running. However, they also have great potential to perform manipulation tasks. In this paper, we propose a model predictive control (MPC) framework for a quadrupedal robot to dynamically balance on a ball and simultaneously manipulate it to follow various trajectories such as straight lines, sinusoids, circles and in-place turning…

Author(s): Chenyu Yang, Bike Zhang, Jun Zeng, Ayush Agrawal, Koushil Sreenath. 
Journal/Conference: 
arxiv.org/abs/2008.00191

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction
To learn intrinsic low-dimensional structures from high-dimensional data that most discriminate between classes, we propose the principle of Maximal Coding Rate Reduction (MCR²), an information-theoretic measure that maximizes the coding rate difference between the whole dataset and the sum of each individual class. We clarify its relationships with most existing frameworks such as cross-entropy, information bottleneck, information gain, contractive…

Author(s): Yaodong Yu, Kwan Ho Ryan Chan, Chong You, Chaobing Song, and Yi Ma
Journal/Conference: NeurIPS, December 2020.
arXiv:2006.08558
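
The "coding rate difference between the whole dataset and the sum of each individual class" can be sketched numerically. The block below assumes the log-det coding rate R(Z, ε) = ½ log det(I + d/(nε²) ZZᵀ) for d-dimensional features Z with n columns. This formula is not given in the truncated abstract above, and the distortion `eps` and the function names are illustrative.

```python
import numpy as np

def coding_rate(Z, eps):
    """Assumed log-det coding rate of the columns of Z (shape d x n) at distortion eps."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * Z @ Z.T)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted rates of the individual classes."""
    n = Z.shape[1]
    whole = coding_rate(Z, eps)
    per_class = sum(
        (np.sum(labels == c) / n) * coding_rate(Z[:, labels == c], eps)
        for c in np.unique(labels)
    )
    return whole - per_class

labels = np.array([0, 0, 1, 1])
# Two classes on orthogonal directions versus two classes on the same direction.
Z_orthogonal = np.array([[1.0, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 1.0]])
Z_collapsed = np.array([[1.0, 1.0, 1.0, 1.0],
                        [0.0, 0.0, 0.0, 0.0]])
delta_orthogonal = rate_reduction(Z_orthogonal, labels)
delta_collapsed = rate_reduction(Z_collapsed, labels)
```

In this toy case the orthogonal classes give a strictly positive difference while the collapsed classes give zero, which matches the intuition that features of different classes should occupy different directions.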

Trajectory Optimization for Nonlinear Multi-Agent Systems using Decentralized Learning Model Predictive Control
We present a decentralized minimum-time trajectory optimization scheme based on learning model predictive control for multi-agent systems with nonlinear decoupled dynamics and coupled state constraints. By performing the same task iteratively, data from previous task executions is used to construct and improve local time-varying safe sets and an approximate value function…

Author(s): Edward L. Zhu, Yvonne R. Stürz, Ugo Rosolia, Francesco Borrelli. 
Journal/Conference: Conference on Decision and Control 2020
arxiv.org/abs/2004.01298

Formation and Reconfiguration of Tight Multi-Lane Platoons
Advances in vehicular communication technologies are expected to facilitate cooperative driving. Connected and Automated Vehicles (CAVs) are able to collaboratively plan and execute driving maneuvers by sharing their perceptual knowledge and future plans. In this paper, an architecture for autonomous navigation of tight multi-lane platoons travelling on public roads is presented. Using the proposed approach, CAVs are able to form single or multi-lane platoons of various geometrical configurations. They are able to reshape and adjust their configurations according to changes in the environment…

Author(s): Roya Firoozi, Xiaojing Zhang, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2003.08595

TransceiVR: Bridging Asymmetrical Communication Between VR Users and External Collaborators
Virtual Reality (VR) users often need to work with other users, who observe them outside of VR using an external display. Communication between them is difficult; the VR user cannot see the external user’s gestures, and the external user cannot see VR scene elements outside of the VR user’s view. We carried out formative interviews with experts to understand these asymmetrical interactions and identify their goals and challenges. From this, we identify high-level system design goals to facilitate asymmetrical interactions and a corresponding space of implementation approaches based on the level of programmatic access to a VR application. We present TransceiVR, a system that utilizes VR platform APIs to enable asymmetric communication interfaces for third-party applications without requiring source code access…

Author(s): Balasaravanan Thoravi Kumaravel, Cuong Nguyen, Stephen DiVerdi, Bjoern Hartmann. 2020.
Conference: Will be presented at UIST’20.

Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Group
This paper considers the fundamental problem of learning a complete (orthogonal) dictionary from samples of sparsely generated signals. Most existing methods solve the dictionary (and sparse representations) based on heuristic algorithms, usually without theoretical guarantees for either optimality or complexity. The recent ℓ1-minimization based methods do provide such guarantees but the associated algorithms recover the dictionary one column at a time. In this work, we propose a new formulation that maximizes the ℓ4-norm over the orthogonal group, to learn the entire dictionary…

Author(s): Yuexiang Zhai, Zitong Yang, Zhenyu Liao, John Wright, and Yi Ma
Journal/Conference: Journal of Machine Learning Research (JMLR), 2020.
arXiv:1906.02435

Understanding L4-based Dictionary Learning: Interpretation, Stability, and Robustness

Author(s): Yuexiang Zhai, Hermish Mehta, Zhengyuan Zhou, and Yi Ma
Journal/Conference: International Conference on Learning Representations (ICLR), 2020.

Staging energy sources to extend flight time of a multirotor UAV
Energy sources such as batteries do not decrease in mass after consumption, unlike combustion-based fuels. We present the concept of staging energy sources, i.e. consuming energy in stages and ejecting used stages, to progressively reduce the mass of aerial vehicles in-flight which reduces power consumption, and consequently increases flight time. A flight time vs. energy storage mass analysis is presented to show the endurance benefit of staging to multirotors. We consider two specific problems in discrete staging: optimal order of staging given a certain number of energy sources, and optimal partitioning of a given energy storage mass budget into a given number of stages…

Author(s): Karan P. Jain, Jerry Tang, Koushil Sreenath, Mark W. Mueller. 
Journal/Conference: IROS 2020
arxiv.org/abs/2003.04290

Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects
This paper presents a method to design a min-norm Control Lyapunov Function (CLF)-based stabilizing controller for a control-affine system with uncertain dynamics using Gaussian Process (GP) regression. We propose a novel compound kernel that captures the control-affine nature of the problem, which permits the estimation of both state and input-dependent model uncertainty in a single GP regression problem. Furthermore, we provide probabilistic guarantees of convergence by the use of GP Upper Confidence Bound analysis and the formulation of a CLF-based stability chance constraint which can be incorporated in a min-norm optimization problem…

Author(s): Fernando Castañeda, Jason J. Choi, Bike Zhang, Claire J. Tomlin, Koushil Sreenath. 
Journal/Conference: 
arxiv.org/abs/2011.07183

Collision Avoidance in Tightly-Constrained Environments without Coordination: a Hierarchical Control Approach
We present a hierarchical control approach for maneuvering an autonomous vehicle (AV) in a tightly-constrained environment where other moving AVs and/or human driven vehicles are present. A two-level hierarchy is proposed: a high-level data-driven strategy predictor and a lower-level model-based feedback controller. The strategy predictor maps a high-dimensional environment encoding into a set of high-level strategies…

Author(s): Xu Shen, Edward L. Zhu, Yvonne R. Stürz, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2011.00413

DeepReach: A Deep Learning Approach to High-Dimensional Reachability
Hamilton-Jacobi (HJ) reachability analysis is an important formal verification method for guaranteeing performance and safety properties of dynamical control systems. Its advantages include compatibility with general nonlinear system dynamics, formal treatment of bounded disturbances, and the ability to deal with state and input constraints. However, it involves solving a PDE, whose computational and memory complexity scales exponentially…

Author(s): Somil Bansal, Claire Tomlin. 
Journal/Conference: November, 2020.
arxiv.org/abs/2011.02082

Multi-Hypothesis Interactions in Game-Theoretic Motion Planning
We present a novel method for handling uncertainty about the intentions of non-ego players in dynamic games, with application to motion planning for autonomous vehicles. Equilibria in these games explicitly account for interaction among other agents in the environment, such as drivers and pedestrians. Our method models the uncertainty about the intention of other agents by constructing multiple hypotheses about the objectives and constraints of other agents in the scene…

Author(s): Forrest Laine, David Fridovich-Keil, Chih-Yuan Chiu, Claire Tomlin. 
Journal/Conference: November, 2020.
arxiv.org/abs/2011.06047

Testing for Typicality with Respect to an Ensemble of Learned Distributions
Methods of performing anomaly detection on high-dimensional data sets are needed, since algorithms which are trained on data are only expected to perform well on data that is similar to the training data. There are theoretical results on the ability to detect if a population of data is likely to come from a known base distribution, which is known as the goodness-of-fit problem. One-sample approaches to this problem offer significant computational advantages for online testing, but require knowing a model of the base distribution. The ability to correctly reject anomalous data in this setting hinges on the accuracy of the model of the base distribution…

Author(s): Forrest Laine, Claire Tomlin. 
Journal/Conference: November, 2020.
arxiv.org/abs/2011.06041

Encoding Defensive Driving as a Dynamic Nash Game
Robots deployed in real-world environments should operate safely in a robust manner. In scenarios where an “ego” agent navigates in an environment with multiple other “non-ego” agents, two modes of safety are commonly proposed: adversarial robustness and probabilistic constraint satisfaction. However, while the former is generally computationally-intractable and leads to overconservative solutions, the latter typically relies on strong distributional assumptions and ignores strategic coupling between agents. To avoid these drawbacks, we present a novel formulation of robustness within the framework of general sum dynamic game theory, modeled on defensive driving…

Author(s): Chih-Yuan Chiu, David Fridovich-Keil, Claire J. Tomlin. 
Journal/Conference: November, 2020.
arxiv.org/abs/2011.04815

Approximate Solutions to a Class of Reachability Games
In this paper, we present a method for finding approximate Nash equilibria in a broad class of reachability games. These games are often used to formulate both collision avoidance and goal satisfaction. Our method is computationally efficient, running in real-time for scenarios involving multiple players and more than ten state dimensions. The proposed approach forms a family of increasingly exact approximations to the original game. Our results characterize the quality of these approximations and show operation in a receding horizon, minimally-invasive control context. Additionally, as a special case, our method reduces to local optimization in the single-player (optimal control) setting, for which a wide variety of efficient algorithms exist.

Author(s): David Fridovich-Keil, Claire J. Tomlin. 
Journal/Conference: November, 2020.
arxiv.org/abs/2011.00601

Incremental Learning via Rate Reduction
Current deep learning architectures suffer from catastrophic forgetting, a failure to retain knowledge of previously learned classes when incrementally trained on new classes. The fundamental roadblock faced by deep learning methods is that deep learning models are optimized as “black boxes,” making it difficult to properly adjust the model parameters to preserve knowledge about previously seen data. To overcome the problem of catastrophic forgetting…

Author(s): Ziyang Wu, Christina Baek, Chong You, and Yi Ma
Journal/Conference: November, 2020.
arXiv:2011.14593

Comments on Efficient Singular Value Thresholding Computation
We discuss how to evaluate the proximal operator of a convex and increasing function of a nuclear norm, which forms the key computational step in several first-order optimization algorithms such as (accelerated) proximal gradient descent and ADMM. Various special cases of the problem arise in low-rank matrix completion, dropout training in deep learning and high-order low-rank tensor recovery, although they have all been solved on a case-by-case basis. We provide a unified and efficiently computable procedure for solving this problem…

Author(s): Zhengyuan Zhou and Yi Ma.
Journal/Conference: November, 2020.
arXiv:2011.06710
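
For orientation, the simplest special case of the problem in this abstract is the proximal operator of τ times the nuclear norm, which is classical singular value soft-thresholding. The sketch below shows only that special case, not the unified procedure the paper proposes; the function name and the toy matrix are illustrative.

```python
import numpy as np

def singular_value_threshold(X, tau):
    """Return argmin_Y 0.5 * ||Y - X||_F^2 + tau * ||Y||_* by soft-thresholding singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink each singular value toward zero by tau
    return (U * s_shrunk) @ Vt           # rebuild the matrix with the shrunken spectrum

X = np.diag([3.0, 1.0, 0.5])
Y = singular_value_threshold(X, tau=1.0)  # singular values 3, 1, 0.5 become 2, 0, 0
```

Thresholding zeroes out the small singular values, which is why this step appears inside proximal gradient and ADMM solvers for low-rank recovery.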

Safety-Critical Model Predictive Control with Discrete-Time Control Barrier Function
The optimal performance of robotic systems is usually achieved near the limit of state and input bounds. Model predictive control (MPC) is a prevalent strategy to handle these operational constraints; however, safety still remains an open challenge for MPC as it needs to guarantee that the system stays within an invariant set. In order to obtain safe optimal performance in the context of set invariance, we present a safety-critical model predictive control strategy utilizing discrete-time control barrier functions…

Author(s): Jun Zeng, Bike Zhang, Koushil Sreenath. 
Journal/Conference: 
arxiv.org/abs/2007.11718

Expert Selection in High-Dimensional Markov Decision Processes
In this work we present a multi-armed bandit framework for online expert selection in Markov decision processes and demonstrate its use in high-dimensional settings. Our method takes a set of candidate expert policies and switches between them to rapidly identify the best performing expert using a variant of the classical upper confidence bound algorithm, thus ensuring low regret in the overall performance of the system. This is useful in applications where several expert policies may be available, and one needs to be selected at run-time for the underlying environment.

Author(s): Vicenc Rubies-Royo, Eric Mazumdar, Roy Dong, Claire Tomlin, S. Shankar Sastry. 
Journal/Conference: In proceedings of the 59th IEEE Conference on Decision and Control 2020
arxiv.org/abs/2010.15599
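
The expert-switching idea in this abstract can be illustrated with the classical UCB1 rule. This is a sketch of the textbook algorithm, not the variant used in the paper; the deterministic toy experts and the names `ucb1_select` and `run_bandit` are hypothetical, and each rollout is assumed to return a value in [0, 1].

```python
import math

def ucb1_select(counts, means, t, c=2.0):
    """Pick the expert with the highest upper confidence bound at round t (1-indexed)."""
    for i, n in enumerate(counts):
        if n == 0:
            return i  # try every expert once before comparing bounds
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)

def run_bandit(expert_returns, rounds):
    """Repeatedly select an expert, observe its return, and update its running mean."""
    k = len(expert_returns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, rounds + 1):
        i = ucb1_select(counts, means, t)
        reward = expert_returns[i]()  # hypothetical: roll out expert i and observe a return
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
    return counts, means

# Toy experts with fixed returns; a real expert would be a policy run in the MDP.
experts = [lambda: 0.2, lambda: 0.8, lambda: 0.5]
counts, means = run_bandit(experts, rounds=500)
```

The exploration bonus shrinks as an expert is selected more often, so the selections concentrate on the best-performing expert while the others are still sampled occasionally.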

Deep Networks from the Principle of Rate Reduction
This work attempts to interpret modern deep (convolutional) networks from the principles of rate reduction and (shift) invariant classification. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction of learned features naturally leads to a multi-layer deep network, one iteration per layer. The layered architectures, linear…

Author(s): Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, and Yi Ma
Journal/Conference: October, 2020.
arXiv:2010.14765

Control of Unknown Nonlinear Systems with Linear Time-Varying MPC
We present a Model Predictive Control (MPC) strategy for unknown input-affine nonlinear dynamical systems. A non-parametric method is used to estimate the nonlinear dynamics from observed data. The estimated nonlinear dynamics are then linearized over time varying regions of the state space to construct an Affine Time Varying (ATV) model…

Author(s): Dimitris Papadimitriou, Ugo Rosolia, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2004.03041

Animated Cassie: A Dynamic Relatable Robotic Character
Creating robots with emotional personalities will transform the usability of robots in the real world. As previous emotive social robots are mostly based on statically stable robots whose mobility is limited, this paper develops an animation to real world pipeline that enables dynamic bipedal robots that can twist, wiggle, and walk to behave with emotions. First, an animation method is introduced to design emotive motions for the virtual robot character. Second, a dynamics optimizer is used to convert the animated motion to dynamically feasible motion…

Author(s): Zhongyu Li, Christine Cummings, Koushil Sreenath. 
Journal/Conference: IROS 2020
arxiv.org/abs/2009.02846

Adversarial Robustness of Stabilized Neural ODEs Might be from Obfuscated Gradients
In this paper we introduce a provably stable architecture for Neural Ordinary Differential Equations (ODEs) which achieves non-trivial adversarial robustness under white-box adversarial attacks even when the network is trained naturally. For most existing defense methods withstanding strong white-box attacks, to improve robustness of neural networks, they need to be trained adversarially, hence have to strike a trade-off between natural accuracy and adversarial robustness. Inspired by…

Author(s): Yifei Huang, Yaodong Yu, Hongyang Zhang, Yi Ma, and Yuan Yao. 
Journal/Conference: September, 2020.
arXiv:2009.13145

Robust MPC for Linear Systems with Parametric and Additive Uncertainty: A Novel Constraint Tightening Approach
We propose a novel approach to design a robust Model Predictive Controller (MPC) for constrained uncertain linear systems. The system dynamics matrices are not known exactly, leading to parametric model mismatch. We also consider the presence of an additive disturbance. Set based bounds for each component of the model uncertainty are assumed to be known…

Author(s): Monimoy Bujarbaruah, Ugo Rosolia, Yvonne R Stürz, Xiaojing Zhang, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2007.00930

Learning to Satisfy Unknown Constraints in Iterative MPC
We propose a control design method for linear time-invariant systems that iteratively learns to satisfy unknown polyhedral state constraints. At each iteration of a repetitive task, the method constructs an estimate of the unknown environment constraints using collected closed-loop trajectory data…

Author(s): Monimoy Bujarbaruah, Charlott Vallon, Francesco Borrelli. 
Journal/Conference: IEEE-CDC 2020
arxiv.org/abs/2006.05054

SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors
Spatial computing experiences are constrained by the real-world surroundings of the user. In such experiences, augmenting virtual objects to existing scenes requires a contextual approach, where geometrical conflicts are avoided, and functional and plausible relationships to other objects are maintained in the target environment. Yet, due to the complexity and diversity of user environments, automatically calculating ideal positions of virtual content that is adaptive to the context of the scene is considered a challenging task. Motivated by this problem, in this paper we introduce SceneGen, a generative contextual augmentation framework that predicts virtual object positions and orientations within existing scenes…

Author(s): Mohammad Keshavarzi, Aakash Parikh, Xiyu Zhai, Melody Mao, Luisa Caldas, Allen Y. Yang. 
Journal/Conference: September 2020
arxiv.org/abs/2009.12395

HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures
We present HoliCity, a city-scale 3D dataset with rich structural information. Currently, this dataset has 6,300 real-world panoramas of resolution 13312×6656 that are accurately aligned with the CAD model of downtown London with an area of more than 20 km², in which the median reprojection error of the alignment of an average image is less than half a degree. This dataset aims…

Author(s): Yichao Zhou, Jingwei Huang, Xili Dai, Linjie Luo, Zhili Chen, and Yi Ma.
Journal/Conference: August, 2020.
arXiv:2008.03286

Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
Learning long-term dynamics models is the key to understanding physical common sense. Most existing approaches on learning dynamics from visual input sidestep long-term predictions by resorting to rapid re-planning with short-term models. This not only requires such models to be super accurate but also limits them only to tasks where an agent can continuously obtain feedback and take action at each step until completion. In this paper, we aim to…

Author(s): Haozhi Qi, Xiaolong Wang, Deepak Pathak, Yi Ma, and Jitendra Malik. 
Journal/Conference: August, 2020.
arXiv:2008.02265

Reinforcement Learning for Safety-Critical Control under Model Uncertainty, using Control Lyapunov Functions and Control Barrier Functions
In this paper, the issue of model uncertainty in safety-critical control is addressed with a data-driven approach. For this purpose, we utilize the structure of an input-output linearization controller based on a nominal model along with a Control Barrier Function and Control Lyapunov Function based Quadratic Program (CBF-CLF-QP). Specifically, we propose a novel reinforcement learning framework which learns the model uncertainty present in the CBF and CLF constraints…

Author(s): Jason Choi, Fernando Castañeda, Claire J. Tomlin, Koushil Sreenath. 
Journal/Conference: 
arxiv.org/abs/2004.07584

Rethinking Bias-Variance Trade-off for Generalization of Neural Networks
The classical bias-variance trade-off predicts that bias decreases and variance increases with model complexity, leading to a U-shaped risk curve. Recent work calls this into question for neural networks and other over-parameterized models, for which it is often observed that larger models generalize better. We provide a simple explanation for this by measuring the bias and variance of neural networks: while the bias is monotonically decreasing as in the classical theory, the variance is unimodal or bell-shaped: it increases then decreases with the width of the network…

Author(s): Zitong Yang, Yaodong Yu, Chong You, Jacob Steinhardt, and Yi Ma
Journal/Conference: International Conference on Machine Learning (ICML), June 2020.
arXiv:2002.11328 [cs.LG]

Distributed Learning Model Predictive Control for Linear Systems
This paper presents a distributed learning model predictive control (DLMPC) scheme for distributed linear time invariant systems with coupled dynamics and state constraints. The proposed solution method is based on an online distributed optimization scheme with nearest-neighbor communication. If the control task is iterative and data from previous feasible iterations are available, local data are exploited by the subsystems in order to construct the local terminal set and terminal cost, which guarantee recursive feasibility and asymptotic stability, as well as performance improvement over iterations…

Author(s): Yvonne R. Stürz, Edward L. Zhu, Ugo Rosolia, Karl H. Johansson, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2006.13406

A Distributed Multi-Robot Coordination Algorithm for Navigation in Tight Environments
This work presents a distributed method for multi-robot coordination based on nonlinear model predictive control (NMPC) and dual decomposition. Our approach allows the robots to coordinate in tight spaces (e.g., highway lanes, parking lots, warehouses, canals, etc.) by using a polytopic description of each robot’s shape and formulating the collision avoidance as a dual optimization problem. Our method accommodates heterogeneous teams of robots (i.e., robots with different polytopic shapes and dynamic models can be part of the same team) and can be used to avoid collisions in…

Author(s): Roya Firoozi, Laura Ferranti, Xiaojing Zhang, Sebastian Nejadnik, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2006.11492

Learning to Detect 3D Reflection Symmetry for Single-View Reconstruction
3D reconstruction from a single RGB image is a challenging problem in computer vision. Previous methods are usually solely data-driven, which leads to inaccurate 3D shape recovery and limited generalization capability. In this work, we focus on object-level 3D reconstruction and present a geometry-based end-to-end deep learning framework that first detects the mirror plane of reflection symmetry that commonly exists in man-made objects and then predicts depth maps by finding the intra-image pixel-wise correspondence of the symmetry…

Author(s): Yichao Zhou, Sichen Liu, and Yi Ma
Journal/Conference: June 2020.
arXiv:2006.10042

Deep Isometric Learning for Visual Recognition
Initialization, normalization, and skip connections are believed to be three indispensable techniques for training very deep convolutional neural networks and obtaining state-of-the-art performance. This paper shows that deep vanilla ConvNets without normalization or skip connections can also be trained to achieve surprisingly good performance on standard image recognition benchmarks. This is achieved by enforcing the convolution kernels to be near isometric during initialization and training, as well as by using a variant of ReLU that is shifted towards being isometric…

Author(s): Haozhi Qi, Chong You, Xiaolong Wang, Yi Ma, and Jitendra Malik. 
Journal/Conference: International Conference on Machine Learning (ICML), June 2020.
arXiv:2006.16992

Data-Driven Hierarchical Predictive Learning in Unknown Environments
We propose a hierarchical learning architecture for predictive control in unknown environments. We consider a constrained nonlinear dynamical system and assume the availability of state-input trajectories solving control tasks in different environments. A parameterized environment model generates state constraints specific to each task, which are satisfied by the stored trajectories. Our goal is to find a feasible trajectory for a new task in an unknown environment…

Author(s): Charlott Vallon, Francesco Borrelli
Journal/Conference: 
arxiv.org/abs/2005.05948

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning
While most approaches to the problem of Inverse Reinforcement Learning (IRL) focus on estimating a reward function that best explains an expert agent’s policy or demonstrated behavior on a control task, it is often the case that such behavior is more succinctly represented by a simple reward combined with a set of hard constraints. In this setting, the agent is attempting to maximize cumulative rewards subject to these given constraints on their behavior. We reformulate the problem of IRL on Markov Decision Processes (MDPs) such that…

Author(s): Dexter R.R. Scobee and S. Shankar Sastry
Journal/Conference: International Conference on Learning Representations (ICLR), 2020

Improving Input-Output Linearizing Controllers for Bipedal Robots via Reinforcement Learning
The main drawbacks of input-output linearizing controllers are the need for precise dynamics models and not being able to account for input constraints. Model uncertainty is common in almost every robotic application and input saturation is present in every real world system. In this paper, we address both challenges for the specific case of bipedal robot control by the use of reinforcement learning techniques…

Author(s): Fernando Castañeda, Mathias Wulfman, Ayush Agrawal, Tyler Westenbroek, Claire J. Tomlin, S. Shankar Sastry, Koushil Sreenath. 
Journal/Conference: Learning for Dynamics and Control (L4DC) 2020 Conference
arxiv.org/abs/2004.07276

Optimal Robust Safety-Critical Control for Dynamic Robotics
We present a novel method of optimal robust control through quadratic programs that offers tracking stability while subject to input and state-based constraints as well as safety-critical constraints for nonlinear dynamical robotic systems in the presence of model uncertainty…

Author(s): Quan Nguyen, Koushil Sreenath. 
Journal/Conference: 
arxiv.org/abs/2005.07284

Task Decomposition for MPC: A Computationally Efficient Approach for Linear Time-Varying Systems
A Task Decomposition method for iterative learning Model Predictive Control (TDMPC) for linear time-varying systems is presented. We consider the availability of state-input trajectories which solve an original task T1, and design a feasible MPC policy for a new task, T2, using stored data from T1…

Author(s): Charlott Vallon, Francesco Borrelli. 
Journal/Conference: 
arxiv.org/abs/2005.01673

Eyes-Closed Safety Kernels: Safety for Autonomous Systems Under Loss of Observability
A framework is presented for handling a potential loss of observability of a dynamical system in a provably-safe way. Inspired by the fragility of data-driven perception systems used by autonomous vehicles, we formulate the problem that arises when a sensing modality fails or is found to be untrustworthy during autonomous operation. We cast this problem as a differential game played between the dynamical system being controlled and the external system factor(s) for which observations are lost…

Author(s): Forrest Laine, Chih-Yuan Chiu, Claire Tomlin. 
Journal/Conference: Robotics: Science and Systems 2020
arxiv.org/abs/2005.07144

Inference-Based Strategy Alignment for General-Sum Differential Games
In many settings where multiple agents interact, the optimal choices for each agent depend heavily on the choices of the others. These coupled interactions are well-described by a general-sum differential game, in which players have differing objectives, the state evolves in continuous time, and optimal play may be characterized by one of many equilibrium concepts, e.g., a Nash equilibrium…

Author(s): Lasse Peters, David Fridovich-Keil, Claire J. Tomlin, Zachary N. Sunberg. 
Journal/Conference: 
arxiv.org/abs/2002.04354

Learning Min-norm Stabilizing Control Laws for Systems with Unknown Dynamics
This paper introduces a framework for learning a minimum-norm stabilizing controller for a system with unknown dynamics using model-free policy optimization methods. The approach begins by first designing a Control Lyapunov Function (CLF) for a (possibly inaccurate) dynamics model for the system, along with a function which specifies a minimum acceptable rate of energy dissipation for the CLF at different points in the state-space. Treating the energy dissipation condition as a constraint on the desired closed-loop behavior of the real-world system, we use penalty methods to formulate an unconstrained optimization problem over the parameters of a learned controller, which can be solved using model-free policy optimization algorithms using data collected from the plant…

Author(s): Tyler Westenbroek, Fernando Castañeda, Ayush Agrawal, S. Shankar Sastry, Koushil Sreenath. 
Journal/Conference: April 21, 2020
arxiv.org/abs/2004.10331

Technical Report: Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning
This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities…

Author(s): Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin, S. Shankar Sastry. 
Journal/Conference: 
arxiv.org/abs/2004.02766

ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots
We investigate the problem of predicting driver behavior in parking lots, an environment which is less structured than typical road networks and features complex, interactive maneuvers in a compact space. Using the CARLA simulator, we develop a parking lot environment and collect a dataset of human parking maneuvers. We then study the impact of model complexity and feature information by comparing…

Author(s): Xu Shen, Ivo Batkovic, Vijay Govindarajan, Paolo Falcone, Trevor Darrell, Francesco Borrelli. 
Journal/Conference: IEEE Intelligent Vehicles Symposium (IV) 2020
arxiv.org/abs/2004.10293

Extending DeepSDF for automatic 3D shape retrieval and similarity transform estimation
Recent advances in computer graphics and computer vision have found successful application of deep neural network models for 3D shapes based on signed distance functions (SDFs) that are useful for shape representation, retrieval, and completion. However, this approach has been limited by the need to have query shapes in the same canonical scale and pose as those observed during training, restricting its effectiveness on real world scenes. We present a formulation to overcome this issue by jointly estimating shape and similarity transform parameters. We conduct experiments to demonstrate the effectiveness of this formulation on synthetic and real datasets and report favorable comparisons to the state of the art. Finally, we also emphasize the viability of this approach as a form of 3D model compression.

Author(s): Oladapo Afolabi, Allen Yang, and S. Shankar Sastry
Journal/Conference: Apr 20, 2020
arXiv:2004.09048

Adaptive Control for Linearizable Systems using On-Policy Reinforcement Learning
This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instances of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities…

Author(s): Tyler Westenbroek, Eric Mazumdar, David Fridovich-Keil, Valmik Prabhu, Claire J. Tomlin and S. Shankar Sastry. 
Journal/Conference: Technical Report, April 2020
arxiv.org/abs/2004.02766

Exponentially Stable First Order Control on Matrix Lie Groups
We present a novel first order controller for systems evolving on matrix Lie groups, a major use case of which is Cartesian velocity control on robot manipulators. This controller achieves global exponential trajectory tracking on a number of commonly used Lie groups including the Special Orthogonal Group SO(n), the Special Euclidean Group SE(n), and the General Linear Group over complex numbers GL(n, C). Additionally, this controller achieves local exponential trajectory tracking on all matrix Lie groups. We demonstrate the effectiveness of this controller in simulation on a number of different Lie groups as well as on hardware with a 7-DOF Sawyer robot arm.

Author(s): Valmik Prabhu, Amay Saxena, and S. Shankar Sastry
Journal/Conference: April 1, 2020
arxiv.org/abs/2004.00239

Visual Navigation Among Humans with Optimal Control as a Supervisor
Real world navigation requires robots to operate in unfamiliar, dynamic environments, sharing spaces with humans. Navigating around humans is especially difficult because it requires predicting their future motion, which can be quite challenging. We propose a novel framework for navigation around humans which combines learning-based perception with model-based optimal control. Specifically, we train a Convolutional Neural Network (CNN)-based perception module which maps the robot’s visual inputs…

Author(s): Varun Tolani, Somil Bansal, Aleksandra Faust, Claire Tomlin. 
Journal/Conference: 
arxiv.org/abs/2003.09354

Output-Lifted Learning Model Predictive Control
We propose a computationally efficient Learning Model Predictive Control (LMPC) scheme for constrained optimal control of a class of nonlinear systems, performing iterative tasks. For the considered class of systems, we show how to use historical trajectory data to construct a convex value function approximation along with a convex safe set in a lifted space of virtual outputs…

Author(s): Siddharth H. Nair, Ugo Rosolia, Francesco Borrelli 
Journal/Conference: 
arxiv.org/abs/2004.05173

Optimization and Manipulation of Contextual Mutual Spaces for Multi-User Virtual and Augmented Reality Interaction
Spatial computing experiences are physically constrained by the geometry and semantics of the local user environment. This limitation is elevated in remote multi-user interaction scenarios, where finding a common virtual ground physically accessible for all participants becomes challenging. Locating a common accessible virtual ground is difficult for the users themselves, particularly if they are not aware of the spatial properties of other participants. In this paper, we introduce a framework to generate an optimal mutual virtual space for a multi-user interaction setting where remote users’ room spaces can have different layouts and sizes…

Author(s): Mohammad Keshavarzi, Allen Y. Yang, Woojin Ko, Luisa Caldas. 
Journal/Conference: 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), February 2020
arxiv.org/abs/1910.05998

Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games
We show by counterexample that policy-gradient algorithms have no guarantees of even local convergence to Nash equilibria in continuous action and state space multi-agent settings. To do so, we analyze gradient-play in N-player general-sum linear quadratic games, a classic game setting which is recently emerging as a benchmark in the field of multi-agent learning. In such games the state and action spaces are continuous and global Nash equilibria can be found by solving coupled Riccati equations…

Author(s): Eric Mazumdar, Lillian J. Ratliff, Michael I. Jordan, S. Shankar Sastry
Journal/Conference: AAMAS, 2020

LESS is More: Rethinking Probabilistic Models of Human Behavior
Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function…

Author(s): Andreea Bobu, Dexter R.R. Scobee, Jaime F. Fisac, S. Shankar Sastry, and Anca D. Dragan. 
Journal/Conference: ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2020
dl.acm.org/doi/10.1145/3319502.3374811

2019

TutoriVR: A Video-Based Tutorial System for Design Applications in Virtual Reality
Virtual Reality painting is a form of 3D-painting done in a Virtual Reality (VR) space. Being a relatively new kind of art form, there is a growing interest within the creative practices community to learn it. Currently, most users learn using community posted 2D-videos on the internet, which are a screencast recording of the painting process by an instructor. While such an approach may suffice for teaching 2D-software tools, these videos by themselves fail in delivering crucial details that are required by the user to understand actions in a VR space…

Author(s): Balasaravanan Thoravi Kumaravel, Cuong Nguyen, Stephen DiVerdi, Bjoern Hartmann.
Journal/Conference: Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI ’19), 2019
DOI: https://doi.org/10.1145/3290605.3300514 

Tracking of Deformable Human Avatars through Fusion of Low-Dimensional 2D and 3D Kinematic Models
We propose a method to estimate and track the 3D posture as well as the 3D shape of the human body from a single RGB-D image. We estimate the full 3D mesh of the body and show that 2D joint positions greatly improve 3D estimation and tracking accuracy. The problem is inherently very challenging because of the complexity of the human body, lighting, clothing, and occlusion. To solve the problem, we leverage a custom MobileNet implementation of OpenPose CNN to construct a 2D skeletal model of the human body. We then fit a low-dimensional deformable body model called SMPL to the observed point cloud using initialization from the 2D skeletal model…

Author(s): Ningjian Zhou and S. Shankar Sastry
Technical Report No. UCB/EECS-2019-87
Publication Date: May 19, 2019

Temporal IK: Data-Driven Pose Estimation for Virtual Reality
High-quality human avatars are an important part of compelling virtual reality (VR) experiences. Animating an avatar to match the movement of its user, however, is a fundamentally difficult task, as most VR systems only track the user’s head and hands, leaving the rest of the body undetermined. In this report, we introduce Temporal IK, a data-driven approach to predicting full-body poses from standard VR headset and controller inputs. We describe a recurrent neural network that, when given a sequence of positions and rotations from VR tracked objects, predicts the corresponding full-body poses in a manner that exploits the temporal consistency of human motion…

Author(s): James Lin and James O’Brien
Technical Report No. UCB/EECS-2019-59
Publication Date: May 17, 2019

Real-Time Hand Model Estimation from Depth Images for Wearable Augmented Reality Glasses
This work presents a hand model estimation method designed specifically with augmented reality (AR) glasses and 3D AR interface in mind. The proposed work is capable of estimating the 3D positions of all ten fingers from a single depth image. By leveraging a low-dimensional hand model and exploiting hand geometries from an ego-centric view, we build a lightweight algorithm that is accurate, environment agnostic, and runs in real time on mobile hardware. One major consideration in our design for AR is that the user’s hand is likely to interact with planar surfaces since they serve as ideal “touchscreens”…

Author(s): Bill Zhou, Alex Yu, Joseph Menke and Allen Yang
DOI: 10.1109/ISMAR-Adjunct.2019.00-31

A User Experience Study of Locomotion Design in Virtual Reality Between Adult and Minor Users
Virtual reality (VR) is an important new technology that is fundamentally changing the way people experience entertainment and education content. Due to the fact that most currently available VR products are one-size-fits-all, the user experience of the content interface and user interaction for children is not well understood compared to that for adults. In this study, we seek to explore user experience of locomotion in VR between healthy adults and healthy minors along both objective and subjective dimensions…

Author(s): Zhijiong Huang, Yu Zhang, Kathryn C. Quigley, Ramya Sankar, and Allen Yang
DOI: 10.1109/ISMAR-Adjunct.2019.00027

NeurVPS: Neural Vanishing Point Scanning via Conic Convolution
We present a simple yet effective end-to-end trainable deep network with geometry-inspired convolutional operators for detecting vanishing points in images. Traditional convolutional neural networks rely on aggregating edge features and do not have mechanisms to directly exploit the geometric properties of vanishing points as the intersections of parallel lines. In this work, we identify a canonical conic space in which the neural network can effectively compute the global geometric information of vanishing points locally, and we propose a novel operator named conic convolution…

Author(s): Yichao Zhou, Haozhi Qi, and Yi Ma
Conference: NeurIPS, 2019

L-CNN: End-to-End Wireframe Parsing
We present a conceptually simple yet effective algorithm to detect wireframes in a given image. Compared to the previous methods which first predict an intermediate heat map and then extract straight lines with heuristic algorithms, our method is end-to-end trainable and can directly output a vectorized wireframe that contains semantically meaningful and geometrically salient junctions and lines…

Author(s): Yichao Zhou, Haozhi Qi, and Yi Ma
Conference: International Conference on Computer Vision (ICCV), 2019

Learning to Reconstruct 3D Manhattan Wireframes from a Single Image
In this paper, we propose a method to obtain a compact and accurate 3D wireframe representation from a single image by effectively exploiting global structural regularities. Our method trains a convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points. Compared with the state-of-the-art learning-based wireframe detection methods, our network is much simpler and more unified, leading to better 2D wireframe detection…

Author(s): Yichao Zhou, Haozhi Qi, Yuexiang Zhai, Qi Sun, Zhili Chen, Li-Yi Wei, and Yi Ma
Conference: International Conference on Computer Vision (ICCV), 2019

Faculty Researchers

Ruzena Bajcsy

Topics: Exoskeletons, Human Kinematic & Dynamic Modeling, Telemedicine, Health Telemonitoring, and Human Musculoskeletal Modeling
Publications

Francesco Borrelli

Topics: Applications
Publications

Luisa Caldas

Publications

Bjoern Hartmann

Publications

Richard Koci Hernandez

Publications

Ren Ng

Topics: Imaging
Publications

James O'Brien

Topics: Graphics
Publications

Shankar Sastry

Publications

Claire Tomlin

Publications

Stella Yu

Publications

Allen Yang

Topics: Localization, Immersion, Applications, and Interaction
Publications