Last updated on May 10, 2023. This conference program is tentative and subject to change.
Technical Program for Tuesday, May 30, 2023

TuAT1 Oral Session, ICC Cap Suite 7-9
SLAM 1

Chair: Tardos, Juan D. | Universidad De Zaragoza
Co-Chair: Ott, Lionel | ETH Zurich
08:30-08:40, Paper TuAT1.1
Picking up Speed: Continuous-Time Lidar-Only Odometry Using Doppler Velocity Measurements

Wu, Yuchen | University of Toronto
Yoon, David Juny | University of Toronto
Burnett, Keenan | University of Toronto
Kammel, Sören | Aeva Inc
Chen, Yi | Aeva
Vhavle, Heethesh | Aeva, Inc
Barfoot, Timothy | University of Toronto
Keywords: SLAM, Localization, Range Sensing
Abstract: Frequency-Modulated Continuous-Wave (FMCW) lidar is a recently emerging technology that additionally enables per-return instantaneous relative radial velocity measurements via the Doppler effect. In this letter, we present the first continuous-time lidar-only odometry algorithm using these Doppler velocity measurements from an FMCW lidar to aid odometry in geometrically degenerate environments. We apply an existing continuous-time framework that efficiently estimates the vehicle trajectory using Gaussian process regression to compensate for motion distortion due to the scanning-while-moving nature of any mechanically actuated lidar (FMCW and non-FMCW). We evaluate our proposed algorithm on several real-world datasets, including publicly available ones and datasets we collected. Our algorithm outperforms the only existing method that also uses Doppler velocity measurements, and we study difficult conditions where including this extra information greatly improves performance. We additionally demonstrate state-of-the-art performance of lidar-only odometry with and without using Doppler velocity measurements in nominal conditions. Code for this project can be found at: https://github.com/utiasASRL/steam_icp.
08:40-08:50, Paper TuAT1.2
Stein ICP for Uncertainty Estimation in Point Cloud Matching

Afzal Maken, Fahira | Data61, CSIRO
Ramos, Fabio | University of Sydney, NVIDIA
Ott, Lionel | ETH Zurich
Keywords: SLAM, Sensor Fusion, Perception for Grasping and Manipulation
Abstract: Quantification of uncertainty in point cloud matching is critical in many tasks such as pose estimation, sensor fusion, and grasping. Iterative closest point (ICP) is a commonly used pose estimation algorithm which provides a point estimate of the transformation between two point clouds. There are many sources of uncertainty in this process that may arise due to sensor noise, ambiguous environments, initial conditions, and occlusion. However, for safety-critical problems such as autonomous driving, a point estimate of the pose transformation is not sufficient as it does not provide information about the multiple solutions. Current probabilistic ICP methods usually do not capture all sources of uncertainty and may provide unreliable transformation estimates, which can have a detrimental effect on state estimation or decision-making tasks that use this information. In this work we propose a new algorithm to align two point clouds that can precisely estimate the uncertainty of ICP's transformation parameters. We develop a Stein variational inference framework with gradient-based optimization of ICP's cost function. The method provides a non-parametric estimate of the transformation, can model complex multi-modal distributions, and can be effectively parallelized on a GPU. Experiments using 3D Kinect data as well as sparse indoor/outdoor LiDAR data show that our method is capable of efficiently producing accurate pose uncertainty estimates.
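For intuition, here is a minimal Stein variational gradient descent (SVGD) sketch on a toy 1D target, assuming only numpy; it illustrates the particle update at the core of this class of method, not the authors' Stein ICP implementation over 6-DoF transformation parameters.

```python
# Minimal SVGD sketch: particles approximate a 1D posterior here instead of
# ICP's transformation parameters. Toy example, not the paper's code.
import numpy as np

def rbf_kernel(x, h):
    """RBF kernel matrix and its gradient w.r.t. the first argument."""
    diff = x[:, None] - x[None, :]            # pairwise differences
    K = np.exp(-diff**2 / (2 * h**2))         # kernel values k(x_j, x_i)
    dK = -diff / h**2 * K                     # d k(x_j, x_i) / d x_j
    return K, dK

def svgd(grad_log_p, x0, steps=500, eps=0.05, h=0.3):
    x = x0.copy()
    n = len(x)
    for _ in range(steps):
        K, dK = rbf_kernel(x, h)
        # Kernel-weighted gradients pull particles toward high density;
        # the kernel-gradient term repels them to preserve diversity.
        phi = (K @ grad_log_p(x) + dK.sum(axis=0)) / n
        x += eps * phi
    return x

# Toy target: standard normal, so grad log p(x) = -x.
rng = np.random.default_rng(0)
particles = svgd(lambda x: -x, rng.uniform(-3, 3, size=50))
print(particles.mean(), particles.std())      # approximately 0 and 1
```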
08:50-09:00, Paper TuAT1.3
Direct and Sparse Deformable Tracking

Lamarca, Jose | Apple Inc
Gomez Rodriguez, Juan Jose | Universidad De Zaragoza
Tardos, Juan D. | Universidad De Zaragoza
Montiel, J.M.M | I3A, Universidad De Zaragoza
Keywords: SLAM, Localization, Computer Vision for Medical Robotics
Abstract: Deformable Monocular SLAM algorithms recover the localization of a camera in an unknown deformable environment. Current approaches use a template-based deformable tracking to recover the camera pose and the deformation of the map. These template-based methods use an underlying global deformation model. In this paper, we introduce a novel deformable camera tracking method with a local deformation model for each point. Each map point is defined as a single textured surfel that moves independently of the other map points. Thanks to a direct photometric error cost function, we can track the position and orientation of the surfel without an explicit global deformation model. In our experiments, we validate the proposed system and observe that our local deformation model estimates more accurately the targeted deformations of the map in both laboratory-controlled experiments and in-body scenarios undergoing quasi-isometric deformations, with changing topology or discontinuities.
09:00-09:10, Paper TuAT1.4
ASRO-DIO: Active Subspace Random Optimization Based Depth Inertial Odometry (I)

Zhang, Jiazhao | National University of Defense Technology
Tang, Yijie | National University of Defense Technology
Wang, He | Peking University
Xu, Kai | National University of Defense Technology
Keywords: SLAM, RGB-D Perception, Sensor Fusion, Evolution Strategy
Abstract: High-dimensional nonlinear state estimation is at the heart of inertial-aided navigation systems (INS). Traditional methods usually rely on good initialization and have difficulty handling large inter-frame transformations due to fast camera motion. We opt to tackle these challenges by solving the depth inertial odometry (DIO) problem with random optimization. To address the exponentially increasing number of candidate states sampled for the high-dimensional state space, we propose a highly efficient variant of random optimization based on the idea of active subspaces. Our method identifies the active dimensions which contribute most significantly to the decrease of the cost function in each iteration, and samples candidate states only within the corresponding subspace. This allows us to efficiently explore the 18D state space of DIO and achieve good optimality by sampling and evaluating only thousands of candidate states. Experiments show that our method attains highly robust and accurate DIO under fast camera motion and low-light conditions, without needing a slow-motion warm-up for initialization.
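A hedged toy sketch of the active-subspace idea described above: rank the state dimensions by cost sensitivity and sample random candidates only along the most influential ones. The quadratic cost and all parameters are invented for illustration; this is not the paper's 18D DIO estimator.

```python
# Toy active-subspace random optimization on a synthetic quadratic cost.
import numpy as np

def active_subspace_step(cost, x, n_active=3, n_samples=200, scale=0.5):
    d = len(x)
    eps = 1e-4
    # Finite-difference sensitivity of the cost in each dimension.
    g = np.array([(cost(x + eps * e) - cost(x - eps * e)) / (2 * eps)
                  for e in np.eye(d)])
    active = np.argsort(-np.abs(g))[:n_active]      # most influential dims
    # Sample candidate states restricted to the active subspace.
    cand = np.tile(x, (n_samples, 1))
    cand[:, active] += scale * np.random.randn(n_samples, n_active)
    best = min(cand, key=cost)
    return best if cost(best) < cost(x) else x

rng = np.random.default_rng(0)
target = rng.standard_normal(18)                    # 18D state, as in the paper
cost = lambda s: np.sum((s - target)**2)
x = np.zeros(18)
for _ in range(50):
    x = active_subspace_step(cost, x)
print(cost(x))                                      # decreases toward 0
```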
09:10-09:20, Paper TuAT1.5
Discrete-Continuous Smoothing and Mapping

Doherty, Kevin | Massachusetts Institute of Technology
Lu, Ziqi | MIT
Singh, Kurran | Massachusetts Institute of Technology
Leonard, John | MIT
Keywords: SLAM, Localization, Mapping
Abstract: We describe a general approach for maximum a posteriori (MAP) inference in a class of discrete-continuous factor graphs commonly encountered in robotics applications. While there are openly available tools providing flexible and easy-to-use interfaces for specifying and solving inference problems formulated in terms of either discrete or continuous graphical models, at present, no similarly general tools exist enabling the same functionality for hybrid discrete-continuous problems. We aim to address this problem. In particular, we provide a library, DC-SAM, extending existing tools for inference problems defined in terms of factor graphs to the setting of discrete-continuous models. A key contribution of our work is a novel solver for efficiently recovering approximate solutions to discrete-continuous inference problems. The key insight to our approach is that while joint inference over continuous and discrete state spaces is often hard, many commonly encountered discrete-continuous problems can naturally be split into a “discrete part” and a “continuous part” that can individually be solved easily. Leveraging this structure, we optimize discrete and continuous variables in an alternating fashion. In consequence, our proposed work enables straightforward representation of and approximate inference in discrete-continuous graphical models. We also provide a method to approximate the uncertainty in estimates of both discrete and continuous variables.
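A minimal sketch of the alternating discrete/continuous scheme the abstract describes, on a toy 1D estimation problem where each measurement carries a discrete inlier/outlier label. This is illustrative only and is not the DC-SAM library API; all names and noise parameters are invented.

```python
# Alternating optimization: fix labels and solve the continuous part in
# closed form, then fix the estimate and choose each label greedily.
import numpy as np

def dc_alternating(z, sigma_in=0.1, sigma_out=3.0, iters=10):
    x = np.median(z)                            # continuous init
    for _ in range(iters):
        # Discrete step: each label picks the likelier noise model.
        nll_in = (z - x)**2 / sigma_in**2 + 2 * np.log(sigma_in)
        nll_out = (z - x)**2 / sigma_out**2 + 2 * np.log(sigma_out)
        labels = (nll_in < nll_out).astype(int)  # 1 = inlier, 0 = outlier
        # Continuous step: MAP estimate given the labels (weighted mean).
        w = np.where(labels == 1, 1 / sigma_in**2, 1 / sigma_out**2)
        x = np.sum(w * z) / np.sum(w)
    return x, labels

rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(2.0, 0.1, 20),    # inliers near 2.0
                    rng.normal(8.0, 0.5, 3)])    # gross outliers
x, labels = dc_alternating(z)
print(x, labels)   # x close to 2.0; the last three labels become 0
```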
09:20-09:30, Paper TuAT1.6
Anderson Acceleration for On-Manifold Iterated Error State Kalman Filters

Gao, Xiang | Idriverplus.com
Xiao, Tao | Beijing Idriverplus Technology Co. Ltd
Bai, Chunge | Tsinghua University
Zhang, Dezhao | 441422198412035111
Zhang, Fang | Beijing Idriverplus Technology Co., Ltd
Keywords: SLAM, Localization, Mapping
Abstract: The Iterated Extended Kalman Filter is a promising and widely used estimator for real-time localization applications. It iterates the observation equation to find a better linearization point and, simultaneously, maintains the state estimate at only a single time step to save computational resources. Inspired by the recent development of the iterative closest point algorithm, this paper investigates an approach to accelerating the iterations in iterated error state Kalman filters (IESKFs). We show that the IESKF can be seen as a fixed-point problem, and that Anderson acceleration (AA) can be elegantly applied to the iterations of the IESKF since the error state naturally lies in the tangent space and does not require additional transforms. However, the tangent space of the current estimate may change during the iterations, so we switch the tangent space to the starting point to perform Anderson acceleration. We propose the AA-IEKF and apply it to lidar-inertial odometry (LIO) systems to estimate the ego-motion of a lidar. The experiments show that Anderson acceleration can efficiently reduce the number of iterations in the ESKF and achieve a lower computational cost.
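For reference, here is generic Type-II Anderson acceleration for a Euclidean fixed-point problem x = g(x). The paper's contribution is applying this on-manifold (moving all iterates into a common tangent space first); the sketch below shows only the vanilla accelerator on a toy contraction.

```python
# Type-II Anderson acceleration with a sliding window of past residuals.
import numpy as np

def anderson(g, x0, m=5, iters=50, tol=1e-10):
    X, G = [x0], [g(x0)]
    x = G[-1]
    for _ in range(iters):
        X.append(x)
        G.append(g(x))
        X, G = X[-(m + 1):], G[-(m + 1):]                 # sliding window
        F = np.array([Gk - Xk for Gk, Xk in zip(G, X)])   # residuals g(x)-x
        if np.linalg.norm(F[-1]) < tol:
            break
        # Least-squares combination of past residuals (difference form).
        dF = (F[1:] - F[:-1]).T
        gamma, *_ = np.linalg.lstsq(dF, F[-1], rcond=None)
        # Convert gamma to barycentric weights alpha with sum(alpha) = 1.
        alpha = np.zeros(len(F))
        alpha[-1] = 1.0
        alpha[:-1] += gamma
        alpha[1:] -= gamma
        x = sum(a * Gk for a, Gk in zip(alpha, G))        # accelerated iterate
    return x

# Toy contraction: component-wise fixed point of cos(x) at ~0.7390851.
print(anderson(np.cos, np.zeros(3)))
```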
09:30-09:40, Paper TuAT1.7
Generalized LOAM: LiDAR Odometry Estimation with Trainable Local Geometric Features

Honda, Kohei | Nagoya University Graduate School
Koide, Kenji | National Institute of Advanced Industrial Science and Technology
Yokozuka, Masashi | National Institute of Advanced Industrial Science and Technology
Oishi, Shuji | National Institute of Advanced Industrial Science and Technology
Banno, Atsuhiko | National Institute of Advanced Industrial Science and Technology
Keywords: SLAM, Localization, Mapping
Abstract: This paper presents a LiDAR odometry estimation framework called Generalized LOAM. Our proposed method is generalized in that it can seamlessly fuse various local geometric shapes around points to improve the position estimation accuracy compared to the conventional LiDAR odometry and mapping (LOAM) method. To utilize continuous geometric features for LiDAR odometry estimation, we incorporate tiny neural networks into a generalized iterative closest point (GICP) algorithm. These neural networks improve the data association metric and the matching cost function using local geometric features. Experiments with the KITTI benchmark demonstrate that our proposed method reduces relative trajectory errors compared to the GICP and LOAM methods.
09:40-09:50, Paper TuAT1.8
BoW3D: Bag of Words for Real-Time Loop Closing in 3D LiDAR SLAM

Cui, Yunge | Shenyang Institute of Automation, Chinese Academy of Sciences
Chen, Xieyuanli | National University of Defense Technology
Zhang, Yinlong | Shenyang Institute of Automation, Chinese Academy of Sciences
Dong, Jiahua | Shenyang Institute of Automation, Chinese Academy of Sciences
Wu, Qingxiao | Shenyang Institute of Automation, Chinese Academy of Sciences
Zhu, Feng | Shenyang Institute of Automation, Chinese Academy of Sciences
Keywords: SLAM, Localization
Abstract: Loop closing is a fundamental part of simultaneous localization and mapping (SLAM) for autonomous mobile systems. In the field of visual SLAM, bag of words (BoW) has achieved great success in loop closure. The BoW features used for loop searching can also be used in the subsequent 6-DoF loop correction. However, for 3D LiDAR SLAM, the state-of-the-art methods may fail to effectively recognize the loop in real time, and usually cannot correct the full 6-DoF loop pose. To address this limitation, we present a novel bag of words for real-time loop closing in 3D LiDAR SLAM, called BoW3D. Our method not only efficiently recognizes revisited loop places, but also corrects the full 6-DoF loop pose in real time. BoW3D builds the bag of words on the 3D LiDAR feature LinK3D, which is efficient, pose-invariant, and can be used for accurate point-to-point matching. We furthermore embed our proposed method into a 3D LiDAR odometry system to evaluate loop closing performance. We test our method on public datasets and compare it against other state-of-the-art algorithms. BoW3D shows better performance in terms of F1 max and extended precision scores in most scenarios, with superior real-time performance. Notably, BoW3D takes an average of 50 ms to recognize and correct the loops on KITTI 00 (which includes 4K+ 64-ray LiDAR scans) when executed on a notebook with an Intel Core i7 @2.2 GHz processor. We release the implementation of our method here: https://github.com/YungeCui/
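A toy inverted-index bag of words for place recognition, illustrating the mechanism the abstract describes: quantized features vote for the frames they were seen in. Feature extraction and quantization (LinK3D in the paper) are abstracted away as integer word ids, and the class below is invented for illustration.

```python
# Minimal BoW place recognition: word id -> frames, votes by shared words.
from collections import defaultdict

class BagOfWords:
    def __init__(self):
        self.index = defaultdict(set)   # word id -> frames containing it
        self.frames = {}                # frame id -> set of word ids

    def add_frame(self, frame_id, words):
        self.frames[frame_id] = set(words)
        for w in words:
            self.index[w].add(frame_id)

    def query(self, words, min_score=3):
        votes = defaultdict(int)
        for w in set(words):
            for f in self.index[w]:
                votes[f] += 1           # one vote per shared word
        hits = [(f, v) for f, v in votes.items() if v >= min_score]
        return sorted(hits, key=lambda h: -h[1])

bow = BagOfWords()
bow.add_frame(0, [1, 2, 3, 4, 5])
bow.add_frame(1, [10, 11, 12])
print(bow.query([2, 3, 4, 99]))        # [(0, 3)] -> loop candidate: frame 0
```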
09:50-10:00, Paper TuAT1.9
Gaussian Mixture Midway-Merge for Object SLAM with Pose Ambiguity

Jung, Jae Hyung | Seoul National University
Park, Chan Gook | Seoul National University
Keywords: SLAM, Sensor Fusion, Object Detection, Segmentation and Categorization
Abstract: In this letter, we propose a novel method to merge a Gaussian mixture on matrix Lie groups and present its application for a simultaneous localization and mapping problem with symmetric objects. The key idea is to predetermine the weighted mean called a midway point and merge Gaussian mixture components at the associated tangent space. Through this rule, the covariance matrix captures the original density more accurately, and the need for the back-projection is spared when compared to the conventional merge. We highlight the midway-merge by numerically evaluating dissimilarity metrics of density functions before and after the merge on the rotational group. Furthermore, we experimentally discover that the rotational error of symmetric objects follows heavy-tailed behavior and formulate the Gaussian sum filter to model it by a Gaussian mixture noise. The effectiveness of our approach is validated through virtual and real-world datasets.
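A simplified sketch of merging two Gaussian mixture components on SO(3) in a shared tangent space, in the spirit of the midway-merge idea: pick a midway rotation between the component means, map both means into its tangent space via the log map, and moment-match there. This is a hedged illustration using scipy's Rotation class, not the paper's exact formulation.

```python
# Tangent-space merge of two weighted Gaussians on SO(3).
import numpy as np
from scipy.spatial.transform import Rotation as R

def merge_so3(w1, R1, P1, w2, R2, P2):
    w = w1 + w2
    # Midway point: geodesic interpolation between the two mean rotations.
    delta = (R1.inv() * R2).as_rotvec()
    Rm = R1 * R.from_rotvec(w2 / w * delta)
    # Map both means into the tangent space at the midway point.
    v1 = (Rm.inv() * R1).as_rotvec()
    v2 = (Rm.inv() * R2).as_rotvec()
    mu = (w1 * v1 + w2 * v2) / w
    # Moment-matched covariance in the shared tangent space.
    P = (w1 * (P1 + np.outer(v1 - mu, v1 - mu))
         + w2 * (P2 + np.outer(v2 - mu, v2 - mu))) / w
    return w, Rm * R.from_rotvec(mu), P

Ra = R.from_euler('z', 10, degrees=True)
Rb = R.from_euler('z', 30, degrees=True)
w, Rm, P = merge_so3(0.5, Ra, 0.01 * np.eye(3), 0.5, Rb, 0.01 * np.eye(3))
print(Rm.as_euler('zyx', degrees=True))   # merged mean near 20 deg about z
```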
TuAT2 Oral Session, Theatre 1
Soft Robot Applications

Chair: Prattichizzo, Domenico | Università Di Siena
Co-Chair: Sung, Cynthia | University of Pennsylvania
08:30-08:40, Paper TuAT2.1
Design and Characterization of a 3D-Printed Pneumatically-Driven Bistable Valve with Tunable Characteristics

Wang, Sihan | University of Oxford
He, Liang | University of Oxford
Maiolino, Perla | University of Oxford
Keywords: Soft Robot Applications, Additive Manufacturing, Hydraulic/Pneumatic Actuators
Abstract: Although research on pneumatic soft robots is developing rapidly, most pneumatic actuators are still controlled by rigid valves and conventional electronics. The existence of these rigid, electronic components sacrifices the compliance and adaptability of soft robots. Current electronics-free valve designs based on soft materials face challenges in behaviour consistency, design flexibility, and fabrication complexity. Taking advantage of soft-material 3D printing, this paper presents a new design of a bi-stable pneumatic valve, which utilises two soft, pneumatically-driven, and symmetrically-oriented conical shells with structural bistability to stabilise and regulate the airflow. The critical pressure required to operate the valve can be adjusted by changing the design features of the soft bi-stable structure. Multi-material printing simplifies the valve fabrication, enhances the flexibility in design feature optimisation, and improves the system repeatability. In this work, both a theoretical model and physical experiments are introduced to examine the relationships between the critical operating pressure and the key design features. The results show that tuning the valve characteristics via material stiffness is more effective than changing geometric design features (largest demonstrated tunable critical pressure range from 15.3 to 65.2 kPa and fastest response time <= 1.8 s).
08:40-08:50, Paper TuAT2.2
Design of Fully Controllable and Continuous Programmable Surface Based on Machine Learning

Wang, Jue | Purdue University
Suo, Jiaqi | Gensler Baltimore
Chortos, Alex | Purdue University
Keywords: Soft Robot Applications, AI-Based Methods, Machine Learning for Robot Control
Abstract: Programmable surfaces (PSs) consist of a 2D array of actuators that can deform in the third dimension, providing the ability to create continuous 3D profiles. Discrete PSs can be realized using an array of independent solid linear actuators. Continuous PSs consist of actuators that are mechanically coupled, providing deformation states that are more similar to real surfaces while reducing the complexity of the control electronics. However, continuous PSs have been limited in size by the lack of control systems that can account for the complex internal coupling between actuators in the array. In this work, we computationally explore the deformation of a fully continuous PS with 81 independent actuation pixels based on ionic bending actuators. We establish a control strategy using machine learning (ML) regression models. Both forward and inverse control are achieved based on training datasets derived from finite element analysis (FEA) of our PS. Forward control predicts the surface deformation with error under 1% and is 15,000 times faster than FEA. Real-time inverse control of continuous PSs, i.e., reproducing arbitrary pre-defined surfaces, which has high practical value for tactile displays and human-machine interactive devices, is proposed for the first time in this letter.
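A hedged sketch of the forward/inverse regression idea: learn a surrogate from actuation inputs to surface deflections on training data, then invert it by least squares to command a target surface. Linear ridge regression stands in for the paper's ML models; the data is synthetic, and only the 81-pixel count follows the abstract.

```python
# Forward model by ridge regression, inverse control by least squares.
import numpy as np

rng = np.random.default_rng(1)
n_pix, n_pts = 81, 200                        # 81 actuation pixels (per paper)
A_true = rng.standard_normal((n_pts, n_pix))  # unknown actuation->surface map

U = rng.standard_normal((500, n_pix))         # training actuations
Y = U @ A_true.T + 0.01 * rng.standard_normal((500, n_pts))

# Forward model: W maps an actuation vector u to a surface deflection y.
lam = 1e-3
W = np.linalg.solve(U.T @ U + lam * np.eye(n_pix), U.T @ Y).T

# Inverse control: least-squares actuation for a desired surface y_star.
y_star = A_true @ rng.standard_normal(n_pix)  # a reachable target surface
u_cmd, *_ = np.linalg.lstsq(W, y_star, rcond=None)
print(np.linalg.norm(W @ u_cmd - y_star))     # small tracking error
```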
08:50-09:00, Paper TuAT2.3
On the Use of Magnets to Robustify the Motion Control of Soft Hands

Marullo, Sara | University of Siena
Salvietti, Gionata | University of Siena
Prattichizzo, Domenico | Università Di Siena
Keywords: Soft Robot Applications, Soft Sensors and Actuators, Multifingered Hands
Abstract: In this letter, we propose a physics-based framework to exploit magnets in robotic manipulation. More specifically, we suggest equipping soft and underactuated hands with magnetic elements, which can generate a magnetic actuation able to synergistically interact with tendon-driven and pneumatic actuations, engendering a complementarity that enriches the capabilities of the actuation system. Magnetic elements can act as additional Degrees of Actuation (DoAs), robustifying the motion control of the device and augmenting the hand manipulation capabilities. We investigate the interaction of a soft hand with itself for enriching possible hand shaping, and the interaction of the hand with the environment for enriching possible grasping capabilities. Physics laws and notions reported in the manuscript can be used as a guidance for DoAs augmentation and can provide tools for the design of novel soft hands.
09:00-09:10, Paper TuAT2.4
Kinegami: Algorithmic Design of Compliant Kinematic Chains from Tubular Origami (I)

Chen, Wei-Hsi | University of Pennsylvania
Yang, Woohyeok | University of Pennsylvania
Peach, Lucien | University of Pennsylvania
Koditschek, Daniel | University of Pennsylvania
Sung, Cynthia | University of Pennsylvania
Keywords: Origami robot, Soft Robot Materials and Design, Kinematics, Compliant Joint/Mechanism
Abstract: Origami processes can generate both rigid and compliant structures from the same homogeneous sheet material. We advance the origami robotics literature by showing that it is possible to construct an arbitrary rigid kinematic chain with prescribed joint compliance from a single tubular sheet. Our "Kinegami" algorithm converts a Denavit-Hartenberg specification into a single-sheet crease pattern for an equivalent serial robot mechanism by composing origami modules from a catalogue. The algorithm arises from the key observation that tubular origami linkage design reduces to a Dubins path planning problem. The automatically generated structural connections and movable joints that realize the specified design can also be endowed with independent user-specified compliance. We apply the Kinegami algorithm to a number of common robot mechanisms and hand-fold their algorithmically generated single-sheet crease patterns into functioning kinematic chains. We believe this is the first completely automated end-to-end system for converting an abstract manipulator specification into a physically realizable origami design that requires no additional human input.
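The key observation above reduces tubular linkage routing to Dubins path planning. As a self-contained illustration of that underlying primitive (not part of the Kinegami pipeline itself), here is the classic closed-form LSL word of a Dubins path between two planar poses.

```python
# LSL (left-straight-left) Dubins word between poses (x, y, heading).
import math

def dubins_lsl(q0, q1, r):
    """Segment parameters (t, p, q) of the LSL path; arcs in radians."""
    dx, dy = q1[0] - q0[0], q1[1] - q0[1]
    d = math.hypot(dx, dy) / r                # distance normalized by radius
    theta = math.atan2(dy, dx)
    a = (q0[2] - theta) % (2 * math.pi)       # relative start heading
    b = (q1[2] - theta) % (2 * math.pi)       # relative goal heading
    tmp = math.atan2(math.cos(b) - math.cos(a),
                     d + math.sin(a) - math.sin(b))
    t = (-a + tmp) % (2 * math.pi)            # first left arc
    p = math.sqrt(max(0.0, 2 + d * d - 2 * math.cos(a - b)
                      + 2 * d * (math.sin(a) - math.sin(b))))  # straight leg
    q = (b - tmp) % (2 * math.pi)             # second left arc
    return t, p, q, r * (t + p + q)           # segments and total length

# Straight-ahead case: pure straight segment of length 10.
print(dubins_lsl((0, 0, 0), (10, 0, 0), r=1.0))   # (0.0, 10.0, 0.0, 10.0)
```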
09:10-09:20, Paper TuAT2.5
Entrainment During Human Locomotion Using a Lightweight Soft Robotic Hip Exosuit (SR-HExo)

Baye-Wallace, Lily C. | Southwest Research Institute; Arizona State University
Thalman, Carly | Arizona State University
Lee, Hyunglae | Arizona State University
Keywords: Rehabilitation Robotics, Wearable Robotics, Soft Robot Applications
Abstract: A gait entrainment study was conducted using a lightweight soft robotic hip exosuit (SR-HExo) that can apply perturbations at the hip joint during treadmill walking. Periodic perturbations were applied by flat fabric Pneumatic Artificial Muscle actuators starting at a subject's preferred gait frequency and increasing up to 15% higher in 3% increments. Anterior hip flexion perturbations and posterior hip extension perturbations were tested in two separate experiments. All 11 healthy participants showed successful entrainment in all 12 experimental conditions (i.e., from the preferred gait frequency to 15% higher in both flexion and extension perturbation directions). This study confirmed that there exists a single stable point attractor during gait entrainment to unilateral, unidirectional hip perturbations, which is consistent with previous ankle studies. Phase-locking was consistently observed around the toe-off phase of the gait cycle (GC). Group-averaged results showed that gait synchronization with extension perturbations occurred earlier in the gait cycle (around 50% GC, where the hip angle reaches maximum extension) than with flexion perturbations (just after 60% GC, where the transition from maximum hip extension towards hip flexion occurs). Other gait entrainment characteristics (success rate of entrainment, basin of entrainment, and transient response) observed in this study point to the potential of the SR-HExo for entrainment-based gait training in rehabilitation contexts.
09:20-09:30, Paper TuAT2.6
SOPHIE: SOft and Flexible Aerial Vehicle for PHysical Interaction with the Environment

Ruiz Vincueria, Fernando | Universidad De Sevilla
Arrue, Begoña C. | Universidad De Sevilla
Ollero, Anibal | University of Seville
Keywords: Soft Robot Materials and Design, Soft Robot Applications, Aerial Systems: Mechanics and Control
Abstract: This paper presents the first design of a soft, lightweight UAV, 3D-printed in flexible filament, capable of performing full-body perching using soft tendons, specifically landing and stabilizing on pipelines and irregular surfaces without the need for an auxiliary system. The flexibility of the UAV can be controlled during the additive manufacturing process by adjusting the infill rate distribution. However, the increase in flexibility brings difficulties in controlling the UAV, as well as structural, aerodynamic, and aeroelastic effects. This article provides insight into the dynamics of the system and validates the flyability of the vehicle for infill densities as low as 6%. Within this range, quasi-static arm deformations can be assumed, so the autopilot is fed back through a static arm deflection model. At lower densities, strong nonlinear elastic dynamics appear, which makes modeling complex, and switching to data-based approaches is suggested.
09:30-09:40, Paper TuAT2.7
A Tensegrity-Based Inchworm-Like Robot for Crawling in Pipes with Varying Diameters

Liu, Yixiang | Shandong University
Dai, Xiaolin | Shandong University
Wang, Zhe | Shandong University
Bi, Qing | Volvo Construction Equipment Technology (China) Co., Ltd
Song, Rui | Shandong University
Zhao, Jie | Harbin Institute of Technology
Li, Yibin | Shandong University
Keywords: Soft Robot Materials and Design, Biologically-Inspired Robots, Climbing Robots
Abstract: Most current in-pipe robots are designed for pipes of a specific size. In this paper, we propose a novel inchworm-like in-pipe robot based on the concept of tensegrity for moving in pipes with varying diameters. Firstly, a tensegrity-based robotic module capable of two kinds of shape change is designed. One kind is extension in the axial direction accompanied by contraction in the radial direction, which is the basis for the wave-like crawling movement of the in-pipe robot. The other kind is expansion in the radial direction while remaining unchanged in the axial direction, enabling the module to adapt to pipes with different diameters. Then, the geometrical equilibrium configuration of the tensegrity module is determined, followed by kinematic analysis using the force density method. By cascading three modules, the in-pipe crawling robot is developed. Finally, a series of experiments is performed to test the shape changeability and friction force of the tensegrity module, and the mobility, load capacity, and adaptability of the in-pipe robot. The results validate that the robot can crawl in horizontal pipes, vertical pipes, and elbow pipes under the control of a simple actuation sequence. Furthermore, the robot can adapt to pipes with diameters varying from 100 mm to 180 mm. It is suggested that the use of tensegrity structures brings higher adaptability, flexibility, and mobility to the in-pipe crawling robot.
09:40-09:50, Paper TuAT2.8
Untethered Robotic Millipede Driven by Low-Pressure Microfluidic Actuators for Multi-Terrain Exploration

Shao, Qi | Tsinghua University
Dong, Xuguang | Tsinghua University
Lin, Zhonghan | Tsinghua University
Tang, Chao | Tsinghua University
Sun, Hao | Tsinghua University
Liu, Xin-Jun | Tsinghua University
Zhao, Huichan | Tsinghua University
Keywords: Soft Robot Materials and Design, Biologically-Inspired Robots, Soft Robot Applications
Abstract: Mobile robots that can adapt to an extensive range of terrains play essential roles in many applications. Millipedes are among the most terrain-adaptive creatures in nature due to their multi-legged locomotion and flexible bodies. Inspired by natural millipedes, we report an untethered robotic millipede with a 6-segment soft-rigid hybrid body that can actively bend, and 24 legs driven by low-pressure microfluidic actuators. The 24 microfluidic actuators are driven by two independent low-pressure sources from miniature pumps, which allows untethered locomotion of the robotic millipede at small size (length, 23 cm; width, 5 cm; height, 4 cm) and light weight (150 g). Using a pre-defined gait for the multiple legs, the robotic millipede can locomote with a maximum speed of 30.96 cm/min (1.35 body lengths per minute) and a minimum turning radius of 15 cm (0.65 body length). Experiments also demonstrated that the robot was able to locomote effectively on various uneven terrains. Utilizing the passive or active mode of its flexible body, the robot could also achieve adaptive moves. The robotic millipede has the potential to perform a variety of environment exploration tasks under remote control while transmitting real-time images wirelessly.
09:50-10:00, Paper TuAT2.9
FEA-Based Soft Robotic Modeling: Simulating a Soft-Actuator in SOFA (I)

Ferrentino, Pasquale | Vrije Universiteit Brussel
Roels, Ellen | Vrije Universiteit Brussel
Brancart, Joost | Vrije Universiteit Brussel (VUB)
Terryn, Seppe | Vrije Universiteit Brussel (VUB)
Van Assche, Guy | Vrije Universiteit Brussel (VUB)
Vanderborght, Bram | Vrije Universiteit Brussel
Keywords: Soft Robot Materials and Design, Modeling, Control, and Learning for Soft Robots, Soft Robot Applications
Abstract: Soft robotics modeling is a research topic that is evolving fast. Many techniques are present in the literature, but most of them require analytical models with many equations that are time-consuming, hard to solve, and not easy to handle. For this reason, the help of a soft-mechanics simulator is essential in this field. This paper therefore presents a tutorial on how to build a soft-robot model using an open-source Finite Element Analysis (FEA) simulator called SOFA. This software is able to generate a simulation scene from code written in Python or XML, so it can be used by people with different fields of competence, such as mechanical knowledge, knowledge of material properties, and programming skills. As a case study, a Python simulation of a cable-driven soft actuator that makes contact with a rigid object is considered. The basic working principles of SOFA required to make a scene are explained step by step. In particular, the tutorial shows how to simulate the mechanics and animate the bending behavior of the actuator. Furthermore, it shows how to retrieve and save data from the simulation, demonstrating that SOFA can easily adapt to a multi-disciplinary subject such as soft-robotics research, and can also be useful for teaching simulation and programming principles to engineering students.
10:00-10:10, Paper TuAT2.10
Inflated Bendable Eversion Cantilever Mechanism with Inner Skeleton for Increased Stiffness

Takahashi, Tomoya | Tohoku University
Watanabe, Masahiro | Tohoku University
Abe, Kazuki | Tohoku University
Tadakuma, Kenjiro | Tohoku University
Saiki, Naoto | Tohoku University
Konyo, Masashi | Tohoku University
Tadokoro, Satoshi | Tohoku University
Keywords: Soft Robot Materials and Design, Mechanism Design, Compliant Joints and Mechanisms
Abstract: Inflatable structures used in soft robotics applications have unique characteristics. In particular, the tip-extension structure, which extends the structure from its tip, can grow without creating friction with the environment. However, these inflatable structures need high pressure to maintain their stiffness under various conditions. Excessive inner pressure limits their application in that it prevents the structure from maintaining its curved shape and from complying with specifications. This study aimed to simultaneously lower the pressure and increase the rigidity of the structure. Our work resulted in the proposal of a mechanism that combines a skeleton structure consisting of multi-joint links with functions to increase the rigidity. Insertion of this mechanism into an inflatable structure obviates the need for high inner pressure, yet enables the structure to bend and maintain the intended shape. We devised a design based on rigid articulated links and combined it with a membrane structure that utilizes the advantages of the tip-extension structure. The experimental results show that the payload of the structure designed to operate at low pressure increases compared to that of the membrane-only structure. The findings of this research can be applied to long robots that can be extended into open space without drooping and to mechanisms that enable structures to wrap around the human body.
TuAT3 Oral Session, ICC Cap Suite 2-4
Design of Mechanisms

Chair: Tadokoro, Satoshi | Tohoku University
Co-Chair: Kruusmaa, Maarja | Tallinn University of Technology (TalTech)
08:30-08:40, Paper TuAT3.1
Energy-Based Design Optimization of a Miniature Wave-Like Robot Inside Curved Compliant Tubes

Katz, Rotem | Ben Gurion University of the Negev
Shachaf, Dan | Ben Gurion University of the Negev
Zarrouk, David | Ben Gurion University of the Negev
Keywords: Mechanism Design, Biologically-Inspired Robots, Medical Robots and Systems
Abstract: This paper analyzes the crawling locomotion of a wave-like robot in curved tubes. We use an energy-based approach to determine the optimal crawling orientation of the robot that minimizes the surface energy while advancing. The results showed that the robot rotated its body along the roll direction so that the wave motion would be in the same plane as the curvature plane of the tube. The incorporation of a passive bending joint along the plane of the wave motion decreased the surface energy and enhanced the robot’s ability to advance in even tighter curves. Given these findings we designed and manufactured two new robots with either one or two passive bending joints. We molded custom flexible surfaces and tubes and experimentally tested our robots in them. These validating experiments indicated that the bending joints substantially improved the robots’ ability to traverse curved tubes (see video).
08:40-08:50, Paper TuAT3.2
A Palm-Sized Omnidirectional Mobile Robot Driven by 2-DOF Torus Wheels

Sato, Yunosuke | Toyohashi University of Technology
Kanada, Ayato | Kyushu University
Mashimo, Tomoaki | Okayama University
Keywords: Mechanism Design, Soft Robot Applications
Abstract: This paper proposes a palm-sized omnidirectional mobile robot with two torus wheels. A single torus wheel is made of an elastic elongated coil spring whose two ends are connected to each other, and it is driven by a piezoelectric actuator (stator) that can generate 2-degree-of-freedom (axial and angular) motions. The stator converts its thrust force and torque into longitudinal and meridian motions of the torus wheel, respectively, making the torus work as an omnidirectional wheel on a plane. In this paper, we build a control system for the piezo-driven 2-degree-of-freedom torus wheel and evaluate its performance measures, such as the transient characteristics, orientation accuracy, and payload capacity. An omnidirectional robot with two torus wheels is constructed, and feedback control for a desired planar motion is demonstrated. The design, inspired by a ring torus, points toward the creation of an unprecedentedly simple, light, and compact two-wheel omnidirectional robot.
08:50-09:00, Paper TuAT3.3
Flipper-Style Locomotion through Strong Expanding Modular Robots

Chin, Lillian | Massachusetts Institute of Technology
Burns, Max | MIT
Xie, Gregory | MIT
Rus, Daniela | MIT
Keywords: Cellular and Modular Robots, Actuation and Joint Mechanisms, Biologically-Inspired Robots
Abstract: Modular robotic units that can change their size at will present an exciting pathway for modular robotics. However, current attempts have been relatively limited, requiring tethers, complex fabrication, or slow cycle times. In this work, we present AuxBots: an auxetic-based approach to create high-force, fast-cycle-time, self-contained modules. By driving the auxetic shell's inherent mathematical expansion with a motor and leadscrew, these robots are capable of expanding their volume by 274% in 0.7 seconds with a maximum strength-to-weight ratio of 76x. These force and expansion properties enable us to use the modules in conjunction with flexible wire constraints to obtain shape-changing behavior and independent locomotion. We demonstrate the power of this modular system by using a limited number of AuxBots to mimic the flipper-style locomotion of mudskippers and sea turtles. These structures are entirely untethered and can still move forward even as some AuxBots stall out, achieving the key modular robotics goals of versatility and robustness.
09:00-09:10, Paper TuAT3.4
Simplified Configuration Design of Anthropomorphic Hand Imitating Specific Human Hand Grasps

Tian, Xinyang | Beihang University
Zhan, Qiang | Beihang University
Zhang, Yin | Beihang University
Zou, Junyi | Beihang University
Jiang, Lingxiao | Beihang University
Xu, Qinhuan | Beihang University
Keywords: Multifingered Hands, Methods and Tools for Robot System Design, Product Design, Development and Prototyping
Abstract: How to design an anthropomorphic hand that imitates specific human hand grasps with as few actuators as possible is still a challenge. This paper presents a method for obtaining a simplified configuration of an anthropomorphic hand imitating specific human hand grasps based on motion analyses of the human hand. A participation matrix, which characterizes a human hand grasp at the joint-motion level, is constructed according to the motion participation of each finger joint. By adding together the participation matrices of all expected human hand grasps, a total participation matrix can be derived, and through mathematical processing a simplified anthropomorphic hand configuration can be obtained. Following the proposed method, a simplified anthropomorphic hand configuration that imitates six basic human hand grasps was obtained. A series of grasp experiments with the anthropomorphic hand prototype was conducted to validate the grasping capability as well as the proposed simplified configuration design method. This method can help obtain a reasonably simplified configuration of an anthropomorphic hand when the expected human hand grasps are definite.
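A toy numeric illustration of the participation-matrix idea, with invented joint names and grasp encodings (not the paper's data): each grasp is a binary row over finger joints, the row sum over desired grasps gives total participation, and joints with identical participation columns are candidates to share one actuator.

```python
# Participation matrices: rows = grasps, columns = finger joints.
import numpy as np

joints = ['thumb_MCP', 'thumb_IP', 'index_MCP', 'index_PIP',
          'middle_MCP', 'middle_PIP']
grasps = np.array([
    [1, 1, 1, 1, 1, 1],   # power grasp: all joints participate
    [1, 1, 1, 1, 0, 0],   # precision pinch: thumb + index only
    [1, 0, 1, 1, 1, 1],   # tripod-like grasp
])

total = grasps.sum(axis=0)            # total participation per joint
print(dict(zip(joints, total)))

# Joints whose participation pattern is identical across all grasps can be
# mechanically coupled and driven by a single actuator.
groups = {}
for j, name in enumerate(joints):
    groups.setdefault(tuple(grasps[:, j]), []).append(name)
print(list(groups.values()))          # candidate joint groups to couple
```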
09:10-09:20, Paper TuAT3.5
Meta Reinforcement Learning for Optimal Design of Legged Robots

Belmonte-Baeza, Alvaro | University of Alicante
Lee, Joonho | ETH Zurich Robotic Systems Laboratory
Valsecchi, Giorgio | Robotic Systems Lab, ETH Zurich
Hutter, Marco | ETH Zurich
Keywords: Reinforcement Learning, Mechanism Design, Legged Robots
Abstract: The process of robot design is a complex task, and the majority of design decisions are still based on human intuition or tedious manual tuning. A more informed way of facing this task is computational design methods, where design parameters are concurrently optimized with corresponding controllers. Existing approaches, however, are strongly influenced by predefined control rules or motion templates and cannot provide end-to-end solutions. In this paper, we present a design optimization framework using model-free meta reinforcement learning and its application to optimizing the kinematics and actuator parameters of quadrupedal robots. We use meta reinforcement learning to train a locomotion policy that can quickly adapt to different designs. This policy is used to evaluate each design instance during the design optimization. We demonstrate that the policy can control robots of different designs to track random velocity commands over various rough terrains. With controlled experiments, we show that the meta policy achieves close-to-optimal performance for each design instance after adaptation. Lastly, we compare our results against a model-based baseline and show that our approach allows higher performance while not being constrained by predefined motions or gait patterns.
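A heavily simplified sketch of the outer design-optimization loop: each candidate design is scored by letting a (quickly adapting) policy control it, and the design parameters are improved from those scores. The evaluate_design stub, the parameter vector, and the greedy random search below are all stand-ins; the paper uses meta-RL adaptation and a proper optimizer, not this toy.

```python
# Toy design optimization loop with a stubbed policy-evaluation call.
import numpy as np

rng = np.random.default_rng(0)
optimum = np.array([0.25, 0.18, 1.4])      # pretend-best design parameters

def evaluate_design(theta):
    """Stub for 'adapt the meta-policy to design theta, return its reward'."""
    return -np.sum((theta - optimum)**2)   # invented reward landscape

theta = np.array([0.4, 0.3, 1.0])          # initial design parameters
sigma = 0.1
for _ in range(200):
    cand = theta + sigma * rng.standard_normal(3)
    if evaluate_design(cand) > evaluate_design(theta):
        theta = cand                        # greedy hill climbing
print(theta)                                # approaches the optimum
```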
09:20-09:30, Paper TuAT3.6
Advanced 2-DOF Counterbalance Mechanism Based on Gear Units and Springs to Minimize Required Torques of Robot Arm

Kim, Hwi-su | Korea Institute of Machinery & Materials
Park, Jongwoo | Korea Institute of Machinery & Materials
Bae, Myeongsu | Dyence Tech
Park, Dongil | Korea Institute of Machinery and Materials (KIMM)
Park, Chanhun | KIMM
Do, Hyun Min | Korea Institute of Machinery and Materials
Choi, Taeyong | KIMM
Kim, Doo-hyeong | Korea Institute of Machinery & Materials
Kyung, Jinho | Korea Institute of Machinery & Materials (KIMM)
Keywords: Robot Safety, Cooperating Robots, Mechanism Design
Abstract: In recent years, human-robot cooperation has enhanced productivity and achieved high payload, speed, and accuracy. Integrating typical industrial robots into human-robot cooperation is challenging because their arms may cause serious injuries to humans during a collision due to malfunctions or operator errors. Therefore, counterbalance robot arms, which counterbalance the gravitational torques due to the robot's mass, have been developed to decrease the required motor capacity and speeds of these robots. In this research, we propose an advanced counterbalance mechanism using gear units and springs to improve durability and reliability compared to the previously proposed wire-based counterbalance mechanism, which is difficult to apply to a commercialized product because the wire can easily break or stretch when an excessive force is applied for a long period. Moreover, our proposed method was extended to a multi-DOF system using a parallelogram mechanism based on a timing belt and pulleys to achieve multi-DOF robotic arms. A 2-DOF counterbalanced arm was designed to verify the effectiveness of the proposed mechanism. The simulation and experimental results showed that the proposed mechanism effectively reduced the gravitational torques at each joint of the multi-DOF arm.
09:30-09:40, Paper TuAT3.7
Permanent-Magnetically Amplified Robotic Gripper with Less Clamping Width Influence on Compensation Realized by a Stepless Width Adjustment Mechanism

Shimizu, Tori | Tohoku University
Tadakuma, Kenjiro | Tohoku University
Watanabe, Masahiro | Tohoku University
Abe, Kazuki | Tohoku University
Konyo, Masashi | Tohoku University
Tadokoro, Satoshi | Tohoku University
Keywords: Actuation and Joint Mechanisms, Force Control, Mechanism Design
Abstract: Machines such as robotic grippers use powerful actuators or gearboxes to exert large loads at the expense of energy consumption, volume, and mass. We propose a stepless force amplification mechanism that assists clamping with a pair of permanent magnets, in which the external control force required to adjust their distance, and thus the output force, is suppressed by compensation springs. For further sophistication, we invented a new width adjuster using a lever. By temporarily separating the actuation of the fingers and the compensated magnets, the adjuster eliminates the nonlinear influence of the object width on the clamping force. A proof-of-concept prototype gripper revealed that the adjuster successfully linearized the width-force characteristic with a slope of 0.15 N/mm, which is negligible compared to the main output force of approximately 50 N. The force amplification effect coexisted with this behavior, such that the clamping force was amplified to 137.5% while maintaining the energy consumption of the DC motor, and the force-energy efficiency was multiplied by 1.39. Thus, being drivable by a weaker, smaller, and lighter actuator, the gripper contributes to extending the operation time of robots with a limited power supply.
09:40-09:50, Paper TuAT3.8
Design of a New Bio-Inspired Dual-Axis Compliant Micromanipulator with Millimeter Strokes (I)

Lyu, Zekui | University of Macau
Xu, Qingsong | University of Macau
Keywords: Compliant Joint/Mechanism, Mechanism Design, Micro/Nano Robots, Biologically-Inspired Robots
Abstract: This paper proposes the concept design of a novel bio-inspired dual-axis compliant micromanipulator with millimeter working strokes dedicated to fiber alignment. It subtly mimics the gripping and rubbing function of the human hand, comprising the forefinger, purlicue, and thumb. Compared with traditional dual-axis grippers, its advantages lie in millimeter-level stroke, bi-directional rotation, less slippage, and comprehensive force sensing. To achieve dexterous and reliable manipulation, a two-degree-of-freedom (2-DOF) flexible decoupling mechanism and a displacement reversing mechanism based on the leaf-shaped flexible hinge are introduced. A prototype driven by two voice coil motors was fabricated for experimental testing. Three high-precision strain gauges with temperature compensation are glued onto the sensitive region to measure the gripping and rubbing forces. Experimental results show that the gripping and rubbing strokes of the manipulator are up to 2.3 mm and 2.1 mm, respectively. For a custom-made fiber flag with a diameter of 200 µm, a rotation stroke of more than 1000 degrees was achieved, which cannot be realized by previous works with a similarly compact mechanism design.
09:50-10:00, Paper TuAT3.9
Optimal Elastic Wing for Flapping-Wing Robots through Passive Morphing

Ruiz Paez, Cristina | Group of Robotics Vision and Control
Acosta, Jose Angel | University of Seville
Ollero, Anibal | University of Seville
Keywords: Biologically-Inspired Robots, Actuation and Joint Mechanisms, Aerial Systems: Mechanics and Control
Abstract: Flapping-wing robots show promise as platforms for safe and efficient flight in near-human operations, thanks to their ability to maneuver agilely or perch at low Reynolds numbers. The growing trend toward automation of these robots must go hand in hand with an increase in payload capacity. This work provides a new passive morphing wing prototype to increase the payload of this type of UAV. The prototype is based on a biased elastic joint, and the holistic research also includes the modelling, simulation, and optimization scheme, thus allowing the prototype to be adapted to any flapping-wing robot. The model has been validated through flight experiments on the available platform, and it has also been demonstrated that the morphing prototype can increase the lift of the robot under study by up to 16% in real flight, with an estimated 10% reduction in consumption.
TuAT4 Oral Session, South Gallery Rms 20-22
Planning

Chair: Simeon, Thierry | LAAS-CNRS
Co-Chair: Otte, Michael W. | University of Maryland
08:30-08:40, Paper TuAT4.1
Robust Multi-Robot Trajectory Optimization Using Alternating Direction Method of Multiplier

Ni, Ruiqi | Florida State University
Pan, Zherong | Tencent America
Gao, Xifeng | Tencent America
Keywords: Motion and Path Planning, Collision Avoidance, Multi-Robot Systems
Abstract: We propose a variant of the alternating direction method of multipliers (ADMM) to solve constrained trajectory optimization problems. Our ADMM framework breaks a joint optimization into small sub-problems, leading to a low iteration cost and decentralized parameter updates. Starting from a collision-free initial trajectory, our method inherits the theoretical properties of the primal interior point method (P-IPM), i.e., guaranteed collision avoidance and homotopy preservation throughout optimization, while being orders of magnitude faster. We have analyzed the convergence and evaluated our method on time-optimal multi-UAV trajectory optimization and simultaneous goal-reaching of multiple robot arms, where we take into consideration kinematic and dynamic limits as well as homotopy-preserving collision constraints. Our method achieves an order-of-magnitude speedup while generating trajectories of comparable quality to the state-of-the-art P-IPM solver.
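To show the machinery being referenced, here is generic ADMM on a small lasso problem: the same split-into-subproblems pattern the paper applies to trajectory optimization, where the coupled terms are collision and dynamics constraints rather than an l1 penalty. This is a textbook instance, not the paper's solver.

```python
# Scaled ADMM for lasso: min 0.5||Ax - b||^2 + lam*||z||_1  s.t.  x = z.
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=200):
    n = A.shape[1]
    x = z = u = np.zeros(n)
    # Pre-factor the x-subproblem's normal equations (constant per iteration).
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(iters):
        # x-step: quadratic subproblem, solved in closed form.
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-step: proximal operator of the l1 term (soft-thresholding).
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        # Dual update drives consensus x = z.
        u = u + x - z
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10); x_true[[1, 5]] = [2.0, -1.5]   # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(40)
print(np.round(admm_lasso(A, b), 2))                  # recovers the sparsity
```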
08:40-08:50, Paper TuAT4.2
Autonomous Exploration in a Cluttered Environment for a Mobile Robot with 2D-Map Segmentation and Object Detection

Kim, Hyung seok | Kyungpook National University
Kim, HyeongJin | Kyungpook National University
Lee, Seon-il | Kyungpook National University
Lee, Hyeonbeom | Kyungpook National University
Keywords: Planning under Uncertainty, Search and Rescue Robots, Object Detection, Segmentation and Categorization
Abstract: Frontier-based exploration is widely adopted for exploring an unknown region. The conventional frontier-based exploration for a mobile robot may collide with three-dimensional (3D) obstacles or can suffer from a slower exploration time because the robot may move to another place before completely exploring the current area. To solve this problem, in this paper, we propose a new exploration algorithm by considering a path traveled by a mobile robot and segmenting a two-dimensional (2D) map. The segmented 2D map is generated in real-time by using the position of the robot and the location of the detected frontiers. To apply our algorithm to the actual experiment, we develop an object detection-based exploration algorithm that can remarkably reduce the probability of collision with 3D obstacles. To verify the effectiveness of our proposed algorithm, we perform simulations (Gazebo) and experiments (in the real world) to compare the conventional approach and our algorithm in a cluttered environment. The simulation and experiment results show that our algorithm can satisfactorily shorten the exploration path and time.
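For context, frontier detection on a toy 2D occupancy grid is the primitive underlying the frontier-based exploration discussed above. Cell encoding and the helper below are invented for illustration: 0 = free, 1 = occupied, -1 = unknown, and a frontier cell is a free cell with at least one unknown 4-neighbor.

```python
# Frontier detection on a small occupancy grid.
import numpy as np

def find_frontiers(grid):
    frontiers = []
    rows, cols = grid.shape
    for r in range(rows):
        for c in range(cols):
            if grid[r, c] != 0:               # only free cells qualify
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr, nc] == -1:
                    frontiers.append((r, c))  # borders unknown space
                    break
    return frontiers

grid = np.array([[ 0,  0, -1],
                 [ 0,  1, -1],
                 [ 0,  0,  0]])
print(find_frontiers(grid))   # [(0, 1), (2, 2)]: free cells bordering unknown
```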
08:50-09:00, Paper TuAT4.3
Distributionally Safe Path Planning: Wasserstein Safe RRT

Lathrop, Paul | University of California, San Diego
Boardman, Beth | Los Alamos National Laboratory
Martinez, Sonia | UC San Diego
Keywords: Motion and Path Planning, Collision Avoidance, Robot Safety
Abstract: In this paper, we propose a Wasserstein metric-based random path planning algorithm. Wasserstein Safe RRT (W-Safe RRT) provides finite-sample probabilistic guarantees on the safety of a returned path in an uncertain obstacle environment. Vehicle and obstacle states are modeled as distributions based upon state and model observations. We define limits on distributional sampling error so the Wasserstein distance between a vehicle state distribution and obstacle distributions can be bounded. This enables the algorithm to return safe paths with a confidence bound through combining finite sampling error bounds with calculations of the Wasserstein distance between discrete distributions. W-Safe RRT is compared against a baseline minimum encompassing ball algorithm, which ensures balls that minimally encompass discrete state and obstacle distributions do not overlap. The improved performance is verified in a 3D environment using single, multi, and rotating non-convex obstacle cases, with and without forced obstacle error in adversarial directions, showing that W-Safe RRT can handle poorly modeled complex environments.
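An illustrative check in the spirit of the abstract: estimate the 1-Wasserstein distance between two discrete sampled distributions and flag an edge as unsafe when they are too close. The paper works with multi-dimensional state/obstacle distributions and formal finite-sample bounds; scipy's 1D helper and the threshold below are just a toy stand-in.

```python
# Distance between sampled vehicle and obstacle state distributions.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
vehicle_samples = rng.normal(loc=0.0, scale=0.2, size=500)    # vehicle state
obstacle_samples = rng.normal(loc=1.5, scale=0.3, size=500)   # obstacle state

d = wasserstein_distance(vehicle_samples, obstacle_samples)
safety_margin = 1.0   # hypothetical distributional safety threshold
print(f"W1 = {d:.3f} ->",
      "edge is safe" if d > safety_margin else "reject edge")
```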
09:00-09:10, Paper TuAT4.4
Sim2Real Learning of Obstacle Avoidance for Robotic Manipulators in Uncertain Environments

Zhang, Tan | Shenzhen Technology University
Zhang, Kefang | Shenzhen University
Lin, Jiatao | Shenzhen University
Louie, Wing-Yue Geoffrey | Oakland University
Huang, Hui | Shenzhen University
Keywords: Motion and Path Planning, Reinforcement Learning, Collision Avoidance
Abstract: Obstacle avoidance for robotic manipulators can be challenging when they operate in unstructured environments. This problem is probed with sim-to-real (sim2real) deep reinforcement learning, such that a motion policy for the robotic arm is learnt in a simulator and then adapted to the real world. However, the problem of sim2real adaptation is notoriously difficult. To this end, this work proposes (1) a unified representation of obstacles and targets that captures the underlying dynamics of the environment while allowing generalization to unseen goals, and (2) a flexible end-to-end model combining the unified representation with a deep reinforcement learning control module that can be trained by interacting with the environment. The representation is agnostic to the shape and appearance of the underlying objects, which simplifies and unifies the scene representation in both the simulated and real worlds. We implement this idea with a vision-based actor-critic framework by devising a bounding-box predictor module. The predictor estimates the 3D bounding boxes of obstacles and targets from RGB-D input. The features extracted by the predictor are fed into the policy network, and all the modules are jointly trained. Our experiments in simulated environments and the real world show that the end-to-end model with the unified representation achieves better sim2real adaptation and scene generalization than state-of-the-art techniques.
09:10-09:20, Paper TuAT4.5
Bidirectional Sampling-Based Motion Planning without Two-Point Boundary Value Solution (I)

Nayak, Sharan | University of Maryland, College Park
Otte, Michael W. | University of Maryland
Keywords: Motion and Path Planning, Autonomous Agents, Dynamics, Sampling-Based Motion Planning
Abstract: Bidirectional path and motion planning approaches decrease planning time, on average, compared to their unidirectional counterparts. In single-query feasible motion planning, using bidirectional search to find a continuous motion plan requires an edge connection between the forward and the reverse search tree. Such a tree–tree connection requires solving a two-point boundary value problem (BVP). However, obtaining a closed-form two-point BVP solution can be difficult or impossible for many systems. While numerical methods can provide a reasonable solution in many cases, they are often computationally expensive or numerically unstable for the purposes of single-query sampling-based motion planning. To overcome this challenge, we present a novel bidirectional search strategy that does not require solving the two-point BVP. Instead of connecting the forward and reverse trees directly, the reverse tree’s cost information is used as a guiding heuristic for forward search. This enables the forward search to quickly grow down the reverse tree—converging to a fully feasible solution without the solution to a two-point BVP. In this article, we propose two algorithms that use this strategy for single-query feasible motion planning for various dynamical systems, performing experiments in both simulation and hardware testbeds. We find that these algorithms perform better than or comparable to existing state-of-the-art methods with respect to quickly finding an initial feasible solution.
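A toy sketch of the guiding idea on a 4-connected grid: compute cost-to-go by searching backward from the goal, then let the forward search descend it, so the two searches never need an exact tree-to-tree connection (and hence no two-point BVP solution). The grid world stands in for the paper's kinodynamic systems.

```python
# Reverse-search cost-to-go as a guiding heuristic for forward search.
import heapq

def dijkstra_costs(grid, goal):
    """Backward pass: cost-to-go from every free cell to the goal."""
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0.0}
    pq = [(0.0, goal)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if d > dist.get((r, c), float('inf')):
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                if d + 1 < dist.get((nr, nc), float('inf')):
                    dist[(nr, nc)] = d + 1
                    heapq.heappush(pq, (d + 1, (nr, nc)))
    return dist

def guided_forward_search(grid, start, goal):
    """Forward pass greedily descends the reverse-search cost-to-go."""
    h = dijkstra_costs(grid, goal)
    path, cur = [start], start
    while cur != goal:
        r, c = cur
        nbrs = [(r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        cur = min((n for n in nbrs if n in h), key=h.get)  # steepest descent
        path.append(cur)
    return path

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(guided_forward_search(grid, (0, 0), (2, 0)))  # routes around the wall
```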
09:20-09:30, Paper TuAT4.6
Long-Horizon Multi-Robot Rearrangement Planning for Construction Assembly (I)

Hartmann, Valentin Noah | University of Stuttgart
Orthey, Andreas | TU Berlin
Driess, Danny | TU Berlin
Oguz, Ozgur S. | Bilkent University
Toussaint, Marc | TU Berlin
Keywords: Manipulation Planning, Task Planning, Robotics in Construction, Multi-Robot Systems
Abstract: Robotic construction assembly planning aims to find feasible assembly sequences as well as the corresponding robot paths and can be seen as a special case of task and motion planning (TAMP). As construction assembly can be parallelized well, it is desirable to plan for multiple robots acting concurrently. Solving TAMP instances with many robots over a long time horizon is challenging due to coordination constraints and the difficulty of choosing the right task assignment. We present a planning system that enables parallelization of complex task and motion planning problems by iteratively solving smaller subproblems. Combining optimization methods to jointly solve for manipulation constraints with a sampling-based bi-directional space-time path planner enables us to plan cooperative multi-robot manipulation with unknown arrival times. Thus, our solver allows for completing subproblems and tasks with differing timescales and synchronizes them effectively. We demonstrate the approach on multiple construction case studies to show the robustness of our algorithm over long planning horizons and its scalability to many objects and agents.
09:30-09:40, Paper TuAT4.7
A Reachability-Based Spatio-Temporal Sampling Strategy for Kinodynamic Motion Planning

Tang, Yongxing | Northwestern Polytechnical University
Zhu, Zhanxia | Northwestern Polytechnical University
Zhang, Hongwen | Zhejiang Lab
Keywords: Motion and Path Planning, Constrained Motion Planning, Optimization and Optimal Control
Abstract: By limiting the planning domain to the “L2 informed set”, some sampling-based motion planners (SBMPs) (e.g., Informed RRT*, BIT*) can solve geometric motion planning problems efficiently. However, the construction of the informed set (IS) becomes very challenging when differential constraints are further considered. For the time-optimal kinodynamic motion planning problem, this paper defines a modified time informed set (MTIS) to limit the planning domain. By drawing inspiration from Hamilton-Jacobi-Bellman (HJB) reachability analysis, the MTIS, compared with the original TIS, not only helps save the running time of an SBMP but also extends the applicable scope from linear systems to polynomial nonlinear systems with control constraints. On this basis, a spatio-temporal sampling strategy adapted to the MTIS is proposed. Firstly, the MTIS is used to estimate the optimal cost, and the valid tree structure is reused, so that we do not need to provide a solution trajectory in advance. Secondly, the strategy is generic, allowing it to be combined with common SBMPs (such as SST) to accelerate convergence and reduce memory complexity. Several simulation experiments demonstrate the effectiveness of the proposed method.
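For reference, the L2 informed set the abstract starts from is the prolate hyperspheroid {x : |x - x_s| + |x - x_g| <= c_best}, and the classic direct sampling of it proceeds by sampling a unit ball, scaling by the ellipsoid radii, and rotating the long axis onto the start-goal line. The sketch below shows that geometric construction only; the paper's MTIS generalizes the idea to time-optimal kinodynamic settings.

```python
# Direct uniform sampling of the L2 informed set (an ellipsoid).
import numpy as np

def sample_informed(x_s, x_g, c_best, rng):
    n = len(x_s)
    c_min = np.linalg.norm(x_g - x_s)
    # Semi-axes: c_best/2 transverse, sqrt(c_best^2 - c_min^2)/2 conjugate.
    r = np.array([c_best / 2]
                 + [np.sqrt(c_best**2 - c_min**2) / 2] * (n - 1))
    # Orthonormal frame whose first axis points from start to goal.
    a1 = (x_g - x_s) / c_min
    M = np.eye(n); M[:, 0] = a1
    Q, _ = np.linalg.qr(M)
    Q[:, 0] = a1                        # pin the first column to a1 exactly
    # Uniform sample in the unit n-ball: random direction, radius ~ U^(1/n).
    u = rng.standard_normal(n)
    u = u / np.linalg.norm(u) * rng.random() ** (1 / n)
    return (x_s + x_g) / 2 + Q @ (r * u)

rng = np.random.default_rng(0)
x = sample_informed(np.zeros(2), np.array([4.0, 0.0]), c_best=5.0, rng=rng)
print(x)   # lies inside the ellipse with foci (0,0) and (4,0), c_best = 5
```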
|
|
09:40-09:50, Paper TuAT4.8 | Add to My Program |
Efficient Speed Planning for Autonomous Driving in Dynamic Environment with Interaction Point Model |
|
Chen, Yingbing | The Hong Kong University of Science and Technology |
Xin, Ren | The Hong Kong University of Science and Technology |
Cheng, Jie | Hong Kong University of Science and Technology |
Zhang, Qingwen | KTH Royal Institute of Technology |
Mei, Xiaodong | HKUST |
Liu, Ming | Hong Kong University of Science and Technology |
Wang, Lujia | The Hong Kong University of Science and Technology |
Keywords: Autonomous Vehicle Navigation, Integrated Planning and Learning, Motion and Path Planning
Abstract: Safely interacting with other traffic participants is one of the core requirements for autonomous driving, especially at intersections and under occlusions. Most existing approaches are designed for particular scenarios and require significant human labor in parameter tuning to be applied to different situations. To solve this problem, we first propose a learning-based Interaction Point Model (IPM), which describes the interaction between agents via protection time and interaction priority in a unified manner. We further integrate the proposed IPM into a novel planning framework, demonstrating its effectiveness and robustness through comprehensive simulations in highly dynamic environments.
|
|
09:50-10:00, Paper TuAT4.9 | Add to My Program |
Efficient Anytime CLF Reactive Planning System for a Bipedal Robot on Undulating Terrain (I) |
|
Huang, Jiunn-Kai | University of Michigan |
Grizzle, J.W | University of Michigan |
Keywords: Motion and Path Planning, Reactive and Sensor-Based Planning, Nonholonomic Motion Planning, Field Robots
Abstract: We propose and experimentally demonstrate a reactive planning system for bipedal robots on unexplored, challenging terrain. The system includes: a multi-layer local map for assessing traversability; an anytime omnidirectional Control Lyapunov Function (CLF) for use with a Rapidly Exploring Random Tree Star (RRT*) that generates a vector field for specifying motion between nodes; a sub-goal finder when the final goal is outside of the current map; and a finite-state machine to handle high-level mission decisions. The system also includes a reactive thread that copes with robot deviations via a vector field, defined by a closed-loop feedback policy. The vector field provides real-time control commands to the robot's gait controller as a function of instantaneous robot pose. The system is evaluated on various challenging outdoor terrains and cluttered indoor scenes in both simulation and experiment on Cassie Blue, a bipedal robot with 20 degrees of freedom. All implementations are coded in C++ with the Robot Operating System (ROS) and are available at https://github.com/UMich-BipedLab/CLF_reactive_planning_system.
|
|
10:00-10:10, Paper TuAT4.10 | Add to My Program |
A Framework to Co-Optimize Robot Exploration and Task Planning in Unknown Environments |
|
Xu, Yuanfan | Tsinghua University |
Zhang, Zhaoliang | Tsinghua University |
Yu, Jincheng | Tsinghua University |
Shen, Yuan | Tsinghua University |
Wang, Yu | Tsinghua University |
Keywords: Task Planning, Reactive and Sensor-Based Planning, Planning, Scheduling and Coordination
Abstract: Robots often need to accomplish complex tasks in unknown environments, a challenging problem that involves autonomous exploration for acquiring the necessary scene knowledge as well as task planning. In traditional approaches, the agent first explores the environment to instantiate a complete planning domain and then invokes a symbolic planner to plan and perform high-level actions. However, task execution is inefficient since the two processes involve many repetitive states and actions. Hence, this paper proposes a framework to co-optimize robot exploration and task planning in unknown environments. So that robot exploration and symbolic planning are no longer independent and separated, we design a unified structure, the subtask, which is used to decompose the exploration and planning phases. To select the appropriate subtask each time, we develop a value function and a value-based scheduler to co-optimize exploration and task processing. Our framework is evaluated in a photo-realistic simulator with three complex household tasks, increasing task efficiency by 25%-29%.
|
|
TuAT5 Oral Session, ICC Cap Suite 10-12 |
Add to My Program |
Reinforcement Learning |
|
|
Chair: Hovakimyan, Naira | University of Illinois at Urbana-Champaign |
Co-Chair: Kumar, Vikash | Meta AI |
|
08:30-08:40, Paper TuAT5.1 | Add to My Program |
Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA |
|
Kadokawa, Yuki | Nara Institute of Science and Technology |
Tsurumine, Yoshihisa | Nara Institute of Science and Technology |
Matsubara, Takamitsu | Nara Institute of Science and Technology |
Keywords: Reinforcement Learning, Embedded Systems for Robotic and Automation, Hardware-Software Integration in Robotics
Abstract: This paper explores a Deep Reinforcement Learning (DRL) approach for designing image-based control for edge robots to be implemented on Field Programmable Gate Arrays (FPGAs). Although FPGAs are more power-efficient than CPUs and GPUs, a typical DRL method cannot be applied since they are composed of many Logic Blocks (LBs) suited to high-speed logical operations but only low-speed real-number operations. To cope with this problem, we propose a novel DRL algorithm called Binarized P-Network (BPN), which learns image-input control policies using Binarized Convolutional Neural Networks (BCNNs). To alleviate the instability of reinforcement learning caused by a BCNN with low function approximation accuracy, our BPN adopts a robust value update scheme called Conservative Value Iteration, which is tolerant of function approximation errors. We confirmed the BPN's effectiveness through applications to a visual tracking task in simulation and real-robot experiments with an FPGA.
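For context, Conservative Value Iteration (introduced, to our knowledge, by Kozuno et al., AISTATS 2019) interpolates between standard value iteration and gap-increasing updates, which is what makes it tolerant of approximation errors. Below is a tabular sketch of our understanding of that update; deterministic transitions and all constants are illustrative assumptions.

```python
# Hedged sketch of a CVI-style update on a tabular preference table
# psi[s, a]; not the BPN training loop itself.
import numpy as np

def boltzmann_softmax(q, beta):
    # Smooth maximum over actions; beta -> inf recovers the hard max.
    w = np.exp(beta * (q - q.max(axis=-1, keepdims=True)))
    p = w / w.sum(axis=-1, keepdims=True)
    return (p * q).sum(axis=-1)

def cvi_update(psi, r, s_next, gamma=0.99, alpha=0.9, beta=10.0):
    """One sweep. r[s, a]: rewards; s_next[s, a]: successor states
    (deterministic, for brevity). The alpha-weighted gap term reuses past
    preferences conservatively, tolerating approximation errors."""
    soft_v = boltzmann_softmax(psi, beta)          # shape [S]
    target = r + gamma * soft_v[s_next]            # shape [S, A]
    return target + alpha * (psi - soft_v[:, None])
```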
|
|
08:40-08:50, Paper TuAT5.2 | Add to My Program |
Automating Reinforcement Learning with Example-Based Resets |
|
Kim, Jigang | Seoul National University |
Park, J. hyeon | Seoul National University |
Cho, Daesol | Seoul National University |
Kim, H. Jin | Seoul National University |
Keywords: Reinforcement Learning, Incremental Learning, Autonomous Agents
Abstract: Deep reinforcement learning has enabled robots to learn motor skills from environmental interactions with minimal to no prior knowledge. However, existing reinforcement learning algorithms assume an episodic setting, in which the agent resets to a fixed initial state distribution at the end of each episode, to successfully train agents through repeated trials. Such a reset mechanism, while trivial for simulated tasks, can be challenging to provide for real-world robotics tasks. Resets in robotic systems often require extensive human supervision and task-specific workarounds, which contradicts the goal of autonomous robot learning. In this paper, we propose an extension to conventional reinforcement learning towards greater autonomy by introducing an additional agent that learns to reset in a self-supervised manner. The reset agent preemptively triggers a reset to prevent manual resets and implicitly imposes a curriculum for the forward agent. We apply our method to learn from scratch on a suite of simulated and real-world continuous control tasks and demonstrate that the reset agent successfully learns to reduce manual resets whilst also allowing the forward policy to improve gradually over time.
|
|
08:50-09:00, Paper TuAT5.3 | Add to My Program |
Improving the Robustness of Reinforcement Learning Policies with L1 Adaptive Control |
|
Cheng, Yikun | University of Illinois at Urbana-Champaign |
Zhao, Pan | University of Illinois Urbana-Champaign |
Wang, Fanxin | University of Illinois at Urbana-Champaign |
Block, Daniel | University of Illinois |
Hovakimyan, Naira | University of Illinois at Urbana-Champaign |
Keywords: Reinforcement Learning, Robust/Adaptive Control, Robot Safety
Abstract: A reinforcement learning (RL) control policy could fail in a new/perturbed environment that is different from the training environment, due to the presence of dynamics variations. For controlling systems with continuous state and action spaces, we propose an add-on approach to robustifying a pre-trained RL policy by augmenting it with an L1 adaptive controller (L1AC). Leveraging the capability of an L1AC for fast estimation and active compensation of dynamic variations, the proposed approach can improve the robustness of an RL policy that is trained either in a simulator or in the real world without consideration of a broad class of dynamics variations. Numerical and real-world experiments empirically demonstrate the efficacy of the proposed approach in robustifying RL policies trained using both model-free and model-based methods.
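A minimal scalar sketch of the add-on architecture follows, under our own simplifying assumptions (first-order nominal model, Euler integration, illustrative gains): the total command is the frozen RL action plus a low-pass-filtered L1 correction that cancels an estimated disturbance.

```python
# Illustrative scalar L1 augmentation of a pre-trained RL policy; the
# predictor pole a_s, adaptation gain, and filter cutoff are assumptions.
class L1Augmentation:
    def __init__(self, a_s=-10.0, adapt_gain=1000.0, lpf_cutoff=5.0, dt=0.01):
        self.a_s, self.gain, self.wc, self.dt = a_s, adapt_gain, lpf_cutoff, dt
        self.x_hat, self.sigma_hat, self.u_l1 = 0.0, 0.0, 0.0

    def step(self, x, u_rl, f_nominal):
        # State predictor: nominal model plus disturbance estimate.
        x_dot_hat = (f_nominal(x, u_rl + self.u_l1) + self.sigma_hat
                     + self.a_s * (self.x_hat - x))
        self.x_hat += self.dt * x_dot_hat
        # Adaptation driven by the prediction error (projection omitted).
        self.sigma_hat += self.dt * (-self.gain * (self.x_hat - x))
        # Low-pass-filtered compensation of the estimated disturbance.
        alpha = self.dt * self.wc / (1.0 + self.dt * self.wc)
        self.u_l1 += alpha * (-self.sigma_hat - self.u_l1)
        return u_rl + self.u_l1
```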
|
|
09:00-09:10, Paper TuAT5.4 | Add to My Program |
Developing Cooperative Policies for Multi-Stage Reinforcement Learning Tasks |
|
Erskine, Jordan | Queensland University of Technology |
Lehnert, Christopher | Queensland University of Technology |
Keywords: Reinforcement Learning
Abstract: Many hierarchical reinforcement learning algorithms utilise a series of independent skills as a basis to solve tasks at a higher level of reasoning. These algorithms do not consider the value of using skills that are cooperative instead of independent. This paper proposes the Cooperative Consecutive Policies (CCP) method, which enables consecutive agents to cooperatively solve long-time-horizon, multi-stage tasks. This is achieved by modifying the policy of each agent to maximise both the current and the next agent's critic. Cooperatively maximising critics allows each agent to take actions that are beneficial for its own task as well as subsequent tasks. Using this method in a multi-room maze domain and a peg-in-hole manipulation domain, the cooperative policies outperformed a set of naive policies, a single agent trained across the entire domain, and another sequential HRL algorithm.
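The core update can be illustrated in a few lines of PyTorch; the 50/50 weighting `w` and the exact critic interfaces are our assumptions, not the paper's.

```python
# Hedged sketch: each stage's actor maximizes a mix of its own critic and
# the next stage's critic, so actions end in states favorable to the
# subsequent policy.
import torch

def cooperative_actor_loss(actor_i, critic_i, critic_next, states, w=0.5):
    actions = actor_i(states)
    q_own = critic_i(states, actions)      # value for the current subtask
    q_next = critic_next(states, actions)  # value for the following subtask
    return -((1 - w) * q_own + w * q_next).mean()
```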
|
|
09:10-09:20, Paper TuAT5.5 | Add to My Program |
Learning Performance Graphs from Demonstrations Via Task-Based Evaluations |
|
Puranic, Aniruddh Gopinath | University of Southern California |
Deshmukh, Jyotirmoy | University of Southern California |
Nikolaidis, Stefanos | University of Southern California |
Keywords: Formal Methods in Robotics and Automation, Learning from Demonstration, Reinforcement Learning
Abstract: In the paradigm of robot learning-from-demonstrations (LfD), understanding and evaluating the demonstrated behaviors plays a critical role in extracting control policies for robots. Without this knowledge, a robot may infer incorrect reward functions that lead to undesirable or unsafe control policies. Prior work has used temporal logic specifications, manually ranked by human experts based on their importance, to learn reward functions from imperfect/suboptimal demonstrations. To overcome reliance on expert rankings, we propose a novel algorithm that learns, from demonstrations, a partial ordering of the provided specifications in the form of a performance graph. Through various experiments, including simulations of industrial mobile robots, we show that extracting reward functions with the learned graph results in robot policies similar to those generated with the manually specified orderings. We also show in a user study that the learned orderings match the orderings or rankings by participants for demonstrations in a simulated driving domain. These results show that we can accurately evaluate demonstrations with respect to provided task specifications from a small set of imperfect data with minimal expert input.
|
|
09:20-09:30, Paper TuAT5.6 | Add to My Program |
Tumbling Robot Control Using Reinforcement Learning (I) |
|
Schwartzwald, Andrew | CSE, UMN |
Tlachac, Matthew | CSE, UMN |
Guzman, Luis | CSE, University of Minnesota |
Bacharis, Athanasios | University of Minnesota |
Papanikolopoulos, Nikos | University of Minnesota |
Keywords: AI-Based Methods, Agent-Based Systems, Machine Learning for Robot Control
Abstract: Tumbling robots are simple platforms that are able to traverse large obstacles relative to their size, at the cost of being difficult to control. Existing control methods apply only a subset of possible robot motions and make the assumption of flat terrain. Reinforcement learning allows for the development of sophisticated control schemes that can adapt to diverse environments. By utilizing domain randomization while training in simulation, a robust control policy can be learned which transfers well to the real world. In this paper, we implement autonomous setpoint navigation on a tumbling robot prototype and evaluate it on flat and uneven terrain. The flexibility of our system demonstrates the viability of nontraditional robots for navigational tasks.
|
|
09:30-09:40, Paper TuAT5.7 | Add to My Program |
Guided Reinforcement Learning – a Review and Evaluation for Efficient and Effective Real-World Robotics (I) |
|
Eßer, Julian | Fraunhofer IML |
Bach, Nicolas | Fraunhofer IML |
Jestel, Christian | Fraunhofer IML |
Urbann, Oliver | Fraunhofer IML |
Kerner, Sören | Fraunhofer IML |
Keywords: Reinforcement Learning, AI-Enabled Robotics, Transfer Learning
Abstract: Recent successes aside, reinforcement learning still faces significant challenges in its application to the real-world robotics domain. Guiding the learning process with additional knowledge offers a potential solution, thus leveraging the strengths of data- and knowledge-driven approaches. However, this field of research encompasses several disciplines and hence would benefit from a structured overview. In this paper, we propose the concept of guided reinforcement learning that provides a systematic approach towards accelerating the training process and improving the performance for real-world robotic settings. We introduce a classification that structures guided reinforcement learning approaches and shows how different sources of knowledge can be integrated into the learning pipeline in a practical way. Based upon this, we describe available approaches in this field and evaluate their specific impact in terms of efficiency, effectiveness, and sim-to-real transfer within the robotics domain.
|
|
09:40-09:50, Paper TuAT5.8 | Add to My Program |
Robust Adaptive Ensemble Adversary Reinforcement Learning |
|
Zhai, Peng | Fudan University |
Hou, Taixian | Fudan University |
Ji, Xiaopeng | Zhejiang University |
Dong, Zhiyan | Fudan University |
Zhang, Lihua | Fudan University |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning must learn policies through trial and error. The unstable policies in the early stage of training make training directly in the real environment expensive and time-consuming, and may cause disastrous consequences. The popular solution is to train the policy in a simulator and deploy it in the real environment. However, modeling errors and external disturbances between the simulation and the real environment may cause the physical deployment to fail, resulting in the sim2real transfer problem. In this letter, we propose a novel robust adversarial reinforcement learning framework that uses ensemble training of multiple adversarial agents whose strength is adaptively adjusted to enhance the RL policy's robustness. More specifically, we take the accumulated reward as feedback and construct a PID controller that adjusts the adversary's output magnitude so that adversarial training proceeds effectively. Experiments in simulated and real environments show that our algorithm improves the generalization ability of the policy to modeling errors and uncertain disturbances simultaneously, outperforming the next best prior methods across all domains. The algorithm is further shown to be effective in a sim2real transfer task through a load experiment on a real racing drone, where the tracking performance is better than that of a PID-based flight controller.
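A hedged sketch of that feedback loop follows; the setpoint, gains, and clipping range are placeholders, and the real system would feed the returned strength into the adversary's action scaling.

```python
# Illustrative PID loop modulating adversary strength from the
# protagonist's episode return (constants are assumptions).
class AdversaryStrengthPID:
    def __init__(self, kp=0.01, ki=0.001, kd=0.005, target_return=200.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.target = target_return
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, episode_return):
        # If the agent does well (return above target), allow a stronger
        # adversary; if it struggles, back off so training stays stable.
        err = episode_return - self.target
        self.integral += err
        deriv = err - self.prev_err
        self.prev_err = err
        strength = self.kp * err + self.ki * self.integral + self.kd * deriv
        return max(0.0, min(1.0, strength))  # clip to a valid scale
```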
|
|
09:50-10:00, Paper TuAT5.9 | Add to My Program |
GIN: Graph-Based Interaction-Aware Constraint Policy Optimization for Autonomous Driving |
|
Yoo, Se-Wook | Seoul National University |
Kim, Chan | Seoul National University |
Choi, Jinwoo | Seoul National University |
Kim, Seong-Woo | Seoul National University |
Seo, Seung-Woo | Seoul National University |
Keywords: Reinforcement Learning, Integrated Planning and Learning, Robot Safety
Abstract: Applying reinforcement learning to autonomous driving entails particular challenges, primarily due to dynamically changing traffic flows. To address such challenges, it is necessary to quickly determine response strategies to the changing intentions of surrounding vehicles. This paper proposes a new policy optimization method for safe driving using graph-based interaction-aware constraints. In this framework, the motion prediction and control modules are trained simultaneously while sharing a latent representation that contains a social context. To reflect social interactions, we represent the movements of agents in graph form and filter the features with graph convolution networks, which helps preserve the spatiotemporal locality of adjacent nodes. Furthermore, we create feedback loops to combine these two modules effectively. As a result, this approach encourages the learned controller to be safe from dynamic risks and renders the motion prediction robust to abnormal movements. In the experiment, we set up a navigation scenario comprising various situations with CARLA, an urban driving simulator. The experiments show state-of-the-art performance in navigation strategy and motion prediction compared to the baselines. The code is available online.
|
|
10:00-10:10, Paper TuAT5.10 | Add to My Program |
Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning |
|
Dorka, Nicolai | University of Freiburg |
Welschehold, Tim | Albert-Ludwigs-Universität Freiburg |
Boedecker, Joschka | University of Freiburg |
Burgard, Wolfram | University of Technology Nuremberg |
Keywords: Reinforcement Learning
Abstract: Accurate value estimates are important for off-policy reinforcement learning. Algorithms based on temporal difference learning are typically prone to an over- or underestimation bias building up over time. In this paper, we propose a general method called Adaptively Calibrated Critics (ACC) that uses the most recent high-variance but unbiased on-policy rollouts to alleviate the bias of the low-variance temporal difference targets. We apply ACC to Truncated Quantile Critics [1], an algorithm for continuous control that allows regulation of the bias with a hyperparameter tuned per environment. The resulting algorithm adaptively adjusts the parameter during training, rendering hyperparameter search unnecessary, and sets a new state of the art on the OpenAI Gym continuous control benchmark among all algorithms that do not tune hyperparameters for each environment. ACC further achieves improved results on different tasks from the Meta-World robot benchmark. Additionally, we demonstrate the generality of ACC by applying it to TD3 [2], showing improved performance in this setting as well.
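The calibration idea can be summarized in a short sketch: compare critic estimates against unbiased Monte Carlo returns from recent on-policy rollouts, and nudge a bias-controlling parameter (for TQC, something like the number of dropped quantiles) in the direction that cancels the measured bias. The step size and bounds below are assumptions.

```python
# Illustrative ACC-style adjustment of a pessimism parameter beta; higher
# beta is read as "more pessimistic critic" in this sketch.
def acc_adjust(beta, critic_estimates, mc_returns, step=0.1,
               beta_min=0.0, beta_max=5.0):
    # Positive bias (overestimation) -> increase pessimism, and vice versa.
    bias = sum(q - g for q, g in zip(critic_estimates, mc_returns)) \
           / len(mc_returns)
    beta = beta + step * bias
    return max(beta_min, min(beta_max, beta))
```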
|
|
TuAT6 Oral Session, ICC Cap Suite 14-16 |
Add to My Program |
Marine and Field Robotics |
|
|
Chair: Stuart, Hannah | UC Berkeley |
Co-Chair: Forbes, James Richard | McGill University |
|
08:30-08:40, Paper TuAT6.1 | Add to My Program |
An Investigation on the Effect of Actuation Pattern on the Power Consumption of Legged Robots for Extraterrestrial Exploration (I) |
|
Hu, Yuan | University of Shanghai for Science and Technology |
Guo, Weizhong | Shanghai Jiao Tong University |
Lin, Rongfu | Shanghai Jiao Tong University |
Keywords: Parallel Robots, Legged Robots, Motion and Path Planning, Power consumption minimization
Abstract: Legged robots have great potential to be extraterrestrial exploration rovers of extraordinary versatility. Minimizing power consumption is of vital importance in extraterrestrial exploration scenarios. The actuation pattern, which refers to the combination of necessary actuators that output torque, has a significant influence on the power consumption of legged robots. This article investigates the effect of actuation patterns on the power consumption of legged robots that move in a quasi-static manner. A power consumption model for legged robots that accounts for actuation patterns is derived. Based on that, the effect of the actuation pattern on mechanical power and heat power, the main power-loss terms, is investigated, as is the lowest power consumption achievable by different actuation patterns under various conditions. Simulation results show that power consumption can be reduced by choosing the actuation pattern properly. Furthermore, principles for selecting the optimal actuation pattern from the perspective of power consumption are summarized, which are expected to facilitate minimal-power-consumption motion planning for legged robots.
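A minimal sketch of the standard decomposition such an analysis rests on follows; the copper-loss constant and the example torque/velocity values are assumptions, not the paper's model.

```python
# Illustrative per-leg power decomposition: mechanical output plus
# resistive heat loss, the latter growing with torque squared (the usual
# copper-loss model for electric actuators). An actuation pattern chooses
# which actuators carry torque (tau_i != 0).
def leg_power(torques, velocities, k_heat=0.05):
    """torques, velocities: per-actuator lists for one actuation pattern."""
    p_mech = sum(t * w for t, w in zip(torques, velocities))
    p_heat = sum(k_heat * t * t for t in torques)
    return p_mech + p_heat

# Comparing two patterns assumed to produce the same quasi-static motion:
pattern_a = leg_power([8.0, 0.0, 4.0], [0.5, 0.0, 0.2])
pattern_b = leg_power([4.0, 4.0, 4.0], [0.5, 0.3, 0.2])
```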
|
|
08:40-08:50, Paper TuAT6.2 | Add to My Program |
Intent Inference-Based Ship Collision Avoidance in Encounters with Rule-Violating Vessels |
|
Cho, Yonghoon | Agency for Defense Development |
Kim, Jonghwi | KAIST |
Kim, Jinwhan | KAIST |
Keywords: Marine Robotics, Autonomous Vehicle Navigation, Collision Avoidance
Abstract: All vessels operating in a marine environment are required to comply with the international regulations for preventing collisions at sea (COLREGs), which provide the guidelines and evasive procedures required to resolve potential conflicts between vessels. However, not all vessels strictly abide by COLREGs, often leading to dangerous situations. This paper presents a novel approach for robust collision avoidance in encounter situations involving COLREG-violating vessels. A probabilistic velocity obstacle algorithm based on intent inference is designed and implemented with consideration of the tradeoff between the adherence to traffic rules and the proactive evasive actions for safety. One-to-one and multi-ship encounter situations in the presence of rule-violating vessels are examined through Monte-Carlo simulations, and the results are discussed to demonstrate the feasibility and performance of the proposed approach.
|
|
08:50-09:00, Paper TuAT6.3 | Add to My Program |
Nezha-Mini: Design and Locomotion of a Miniature Low-Cost Hybrid Aerial Underwater Vehicle |
|
Bi, Yuanbo | Shanghai Jiao Tong University |
Jin, Yufei | Shanghai Jiao Tong University |
Lyu, Chenxin | Shanghai Jiao Tong University |
Zeng, Zheng | Shanghai Jiao Tong University |
Lian, Lian | Shanghai Jiao Tong University |
Keywords: Marine Robotics, Motion Control, Field Robots
Abstract: The distinct design concepts of vehicles operating in air and in water are among the major challenges constraining the development of hybrid aerial underwater vehicles (HAUVs). This incompatibility results in the growing volume and weight of existing prototypes, as well as mismatched maneuvering characteristics in the two domains. This letter presents a novel miniaturized and lightweight HAUV, "Nezha-mini", which weighs 953 g and has only an A4-sized footprint. Its low cost and high modularity also allow convenient repair and remanufacturing. Nezha-mini achieves complete multi-domain maneuverability within 50 m aerially and 6 m underwater while supporting rapid and stable cross-domain locomotion, which benefits from the selection and unique layout of the propulsion system as well as our proposed multi-modal control strategy and cross-domain triggering mechanism. The results of field experiments are in good agreement with the dynamics simulation, demonstrating multi-domain locomotion in real environments. The preliminary exploration in this letter provides a referential solution for miniaturizing highly maneuverable HAUVs for practical applications and creates a feasible platform for the future clustering and networking of HAUVs.
|
|
09:00-09:10, Paper TuAT6.4 | Add to My Program |
CPG-Based Motion Planning of Hybrid Underwater Hexapod Robot for Wall Climbing and Transition |
|
Ma, Feiyu | Northwestern Polytechnical University |
Yan, Weisheng | Northwestern Polytechnical University |
Chen, Lepeng | Northwestern Polytechnical University |
Cui, Rongxin | Northwestern Polytechnical University |
Keywords: Marine Robotics, Legged Robots, Motion Control
Abstract: Most existing underwater legged robots are capable of moving on small-angled slopes, but few can climb large-angled slopes or transition from one plane to another, such as from a horizontal plane to a vertical plane. In this paper, we propose a motion planning method for a hybrid underwater hexapod robot (HUHR) driven by six C-shaped legs and eight thrusters. By analyzing the relationship between the rotation and displacement of the hip joint, we establish a single-leg kinematic model. By analyzing the force at the touchpoint, we propose a locomotion mechanism that ensures no slip of the C-shaped leg. Based on the central pattern generator (CPG) and a tripod gait, we design an aperiodic mapping between the oscillator outputs and the desired rotation angles of the hip joints. Overall, a gait planning and control method is proposed to realize continuous legged locomotion from one plane to another, including directional climbing and the transition between planes. The effectiveness of the proposed method has been verified on the HUHR.
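For readers unfamiliar with CPGs, the sketch below shows a standard Hopf oscillator of the kind such gait generators build on; the paper's aperiodic mapping from oscillator output to hip angles is not reproduced, only the underlying rhythmic signal source.

```python
# Standard Hopf-oscillator CPG building block (illustrative).
import math

def hopf_step(x, y, dt, mu=1.0, omega=2.0 * math.pi):
    """One Euler step of a Hopf oscillator: converges to a stable limit
    cycle of radius sqrt(mu) at angular frequency omega."""
    r2 = x * x + y * y
    dx = (mu - r2) * x - omega * y
    dy = (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

# Example: one second of rhythmic output for a single hip joint.
x, y, dt = 0.1, 0.0, 0.001
trajectory = []
for _ in range(1000):
    x, y = hopf_step(x, y, dt)
    trajectory.append(x)  # a gait mapping would turn this into a hip angle
```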
|
|
09:10-09:20, Paper TuAT6.5 | Add to My Program |
Improving Self-Consistency in Underwater Mapping through Laser-Based Loop Closure (I) |
|
Hitchcox, Thomas | McGill University |
Forbes, James Richard | McGill University |
Keywords: Marine Robotics, Sensor Fusion, SLAM, Commercial off-the-shelf (COTS) systems
Abstract: Accurate, self-consistent bathymetric maps are needed to monitor changes in subsea environments and infrastructure. These maps are increasingly collected by underwater vehicles, and mapping requires an accurate vehicle navigation solution. Commercial off-the-shelf (COTS) navigation solutions for underwater vehicles often rely on external acoustic sensors for localization; however, survey-grade acoustic sensors are expensive to deploy and limit the range of the vehicle. Techniques from the field of simultaneous localization and mapping, particularly loop closures, can improve the quality of the navigation solution over dead-reckoning, but are difficult to integrate into COTS navigation systems. This work presents a method to improve the self-consistency of bathymetric maps by smoothly integrating loop-closure measurements into the state estimate produced by a commercial subsea navigation system. Integration is done using a white-noise-on-acceleration motion prior, without access to raw sensor measurements or proprietary models. Improvements in map self-consistency are shown for both simulated and experimental datasets, including a 3D scan of an underwater shipwreck in Wiarton, Ontario, Canada.
|
|
09:20-09:30, Paper TuAT6.6 | Add to My Program |
Passive Inverted Ultra-Short Baseline Positioning for a Disc-Shaped Autonomous Underwater Vehicle: Design and Field Experiments |
|
Wang, Yingqiang | Zhejiang University |
Hu, Ruoyu | Zhejiang University |
Huang, S. H. | Zhejiang University |
Wang, Zhikun | Zhejiang University |
Du, Peizhou | Zhejiang University |
Yang, Wencheng | Zhejiang University |
Chen, Ying | Zhejiang Univ., China |
Keywords: Marine Robotics, Localization, Autonomous Vehicle Navigation
Abstract: Underwater positioning is critical for autonomous underwater vehicles (AUVs) for navigation and geo-referencing. The rapid attenuation of electromagnetic waves in the underwater environment prevents the use of traditional positioning methods such as the Global Positioning System, whereupon acoustic methods like ultra-short baseline (USBL) positioning systems play an important role in AUV navigation. However, the high cost and complexity of classical USBL systems have stifled the democratization of these technologies, motivating a new method called passive inverted ultra-short baseline (piUSBL) positioning. In a typical piUSBL system, a single beacon is placed at a reference point, periodically broadcasting a positioning signal. A passive USBL receiver, time-synchronized to the beacon, is mounted on an AUV to obtain one-way travel-time (OWTT) slant range and azimuth estimates. The passive nature of the receiver makes the system inexpensive, low-power, and lightweight. In particular, the omnidirectionally broadcast signals offer a feasible solution for concurrent multi-AUV navigation. This letter demonstrates the full-stack design and development of a piUSBL positioning system and evaluates the system's accuracy and reliability through a series of experiments. More significantly, a successful sea trial of a disc-shaped AUV outfitted with our piUSBL was conducted in the South China Sea.
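The core OWTT geometry is simple enough to sketch directly; the sound speed, array baseline, and far-field plane-wave assumption below are illustrative, not the paper's calibration.

```python
# One-way travel-time range and a TDOA bearing, the two quantities a
# piUSBL receiver estimates (illustrative constants).
import math

SOUND_SPEED = 1500.0  # m/s, nominal seawater value

def owtt_slant_range(t_emit, t_arrival, c=SOUND_SPEED):
    # Clocks synchronized: range follows directly from arrival time.
    return c * (t_arrival - t_emit)

def azimuth_from_tdoa(delta_t, baseline, c=SOUND_SPEED):
    """Bearing from the time difference of arrival between two hydrophones
    separated by `baseline` meters (far-field plane-wave assumption)."""
    return math.asin(max(-1.0, min(1.0, c * delta_t / baseline)))
```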
|
|
09:30-09:40, Paper TuAT6.7 | Add to My Program |
The Robustness of Tether Friction in Non-Idealized Terrains |
|
Page, Justin | UC Berkeley Mechanical Engineering |
Treers, Laura | University of California Berkeley |
Jorgensen, Steven Jens | Apptronik |
Fearing, Ronald | University of California at Berkeley |
Stuart, Hannah | UC Berkeley |
Keywords: Field Robots, Tendon/Wire Mechanism, Cooperating Robots
Abstract: Reduced traction limits the ability of mobile robotic systems to resist or apply large external loads, such as tugging a massive payload. One simple and versatile solution is to wrap a tether around naturally occurring objects to leverage the capstan effect and create exponentially amplified holding forces. Experiments show that an idealized capstan model explains the force amplification experienced on common irregular outdoor objects such as trees, rocks, and posts. Robust to variable environmental conditions, this exponential amplification method can harness single or multiple capstan objects, either in series or in parallel with a team of robots. This adaptability allows for a range of potential configurations, which is especially useful when objects cannot be fully encircled or gripped. This versatility is demonstrated with teleoperated mobile platforms used to (1) control the lowering and arrest of a payload, (2) achieve planar control of a payload, and (3) act as an anchor point for a more massive platform to winch towards. We show that the simple addition of a tether, wrapped around shallow stones in sand, amplifies the holding force of a low-traction platform by up to 774x.
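The underlying capstan relation is compact enough to state directly; the friction coefficient and wrap count below are illustrative back-calculations, not the paper's measurements.

```python
# Idealized capstan relation: resistible load grows exponentially with
# wrap angle.
import math

def resistible_load(t_hold, mu, wrap_angle_rad):
    """T_load = T_hold * exp(mu * theta): the tension on the payload side
    that a platform holding tension t_hold can resist."""
    return t_hold * math.exp(mu * wrap_angle_rad)

# Example: a 774x-scale amplification is consistent with, e.g., mu ~ 0.53
# over two full wraps (illustrative numbers).
amplification = math.exp(0.53 * 2 * (2 * math.pi))   # ~779x
```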
|
|
TuPO1S Poster Session, Room T8 |
|
Poster Session 1 |
|
|
|
08:30-10:10, Subsession TuPO1S-01, Room T8 | |
Soft Robots I Poster Session, 8 papers |
|
08:30-10:10, Subsession TuPO1S-02, Room T8 | |
Soft and Flexible Sensors Poster Session, 8 papers |
|
08:30-10:10, Subsession TuPO1S-03, Room T8 | |
Soft Robots: Actuation Poster Session, 7 papers |
|
08:30-10:10, Subsession TuPO1S-04, Room T8 | |
Sensor Fusion I Poster Session, 8 papers |
|
08:30-10:10, Subsession TuPO1S-05, Room T8 | |
Visual Servoing Poster Session, 8 papers |
|
08:30-10:10, Subsession TuPO1S-06, Room T8 | |
Visual Tracking Poster Session, 8 papers |
|
08:30-10:10, Subsession TuPO1S-07, Room T8 | |
Robot Learning Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-08, Room T8 | |
Learning for Control I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-09, Room T8 | |
Marine Robotics I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-10, Room T8 | |
Biomimetic Systems Poster Session, 11 papers |
|
08:30-10:10, Subsession TuPO1S-11, Room T8 | |
Aerial Robotics I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-12, Room T8 | |
Aerial Robot Learning Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-13, Room T8 | |
Multi-Robot Systems I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-14, Room T8 | |
Intelligent Transportation Systems I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-15, Room T8 | |
Motion and Path Planning I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-16, Room T8 | |
Reactive and Sensor-Based Planning Poster Session, 8 papers |
|
08:30-10:10, Subsession TuPO1S-17, Room T8 | |
Collision Avoidance Poster Session, 4 papers |
|
08:30-10:10, Subsession TuPO1S-18, Room T8 | |
Perception for Grasping and Manipulation I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-19, Room T8 | |
Learning for Grasping and Manipulation I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-20, Room T8 | |
Localization I Poster Session, 11 papers |
|
08:30-10:10, Subsession TuPO1S-21, Room T8 | |
Vision-Based Navigation I Poster Session, 12 papers |
|
08:30-10:10, Subsession TuPO1S-22, Room T8 | |
Localization and Mapping I Poster Session, 12 papers |
|
TuPO1S-01 Poster Session, Room T8 |
Add to My Program |
Soft Robots I |
|
|
|
08:30-10:10, Paper TuPO1S-01.1 | Add to My Program |
Reconfigurable Inflated Soft Arms |
|
Kim, Nam Gyun | Korea Advanced Institute of Science and Technology |
Ryu, Jee-Hwan | Korea Advanced Institute of Science and Technology |
Keywords: Soft Robot Materials and Design, Soft Robot Applications, Mechanism Design
Abstract: Inflatable structures have attracted considerable research attention in many fields owing to their numerous advantages, such as being lightweight and safe to interact with. However, in most cases an inflatable structure can have only one stable configuration, which is undesirable for robotic arms. This study proposes a novel inflatable structure that can be easily reconfigured into multiple stable configurations, even with single-body inflation. In the proposed mechanism, the structure length can be freely adjusted, and its joints can be set in the desired directions to facilitate the reconfiguration of its pose. An additional advantage of the proposed mechanism is that it can withstand external forces as well as its own weight. This study analyzes and experimentally validates the shape-locking and load-carrying properties of the proposed mechanism, and presents its fabrication process and design guidelines. Through a demonstration, the proposed mechanism is shown to exhibit multiple stable configurations and to lock its poses.
|
|
08:30-10:10, Paper TuPO1S-01.2 | Add to My Program |
A Soft Hybrid-Actuated Continuum Robot Based on Dual Origami Structures |
|
Tao, Jian | University of Science and Technology of China |
Hu, Qiqiang | City University of Hong Kong |
Luo, Tianzhi | University of Science and Technology of China |
Dong, Erbao | University of Science and Technology of China |
Keywords: Soft Robot Materials and Design, Tendon/Wire Mechanism, Hydraulic/Pneumatic Actuators
Abstract: Soft continuum robots have shown tremendous potential for medical and industrial applications owing to their flexibility and continuous deformability. However, their telescopic and bending capabilities and variable stiffness are still limited. This study proposes a novel origami-inspired soft continuum robot that provides large telescopic and bending capabilities while improving stiffness based on the principle of antagonistic actuation. The soft robot consists of dual origami structures: the inner forms a pneumatically actuated air chamber, and the outer is driven by nine tendon-driven actuators. The proposed design exploits the advantages of hybrid actuation to achieve motion and stiffness control. The performance of the soft robot is studied experimentally on single-module and three-module configurations. Results show that the robot has an excellent stretch ratio and a maximum bending angle of 180°. The robot can also increase its stiffness to resist bending deformation induced by self-weight and loads.
|
|
08:30-10:10, Paper TuPO1S-01.3 | Add to My Program |
Direct and Inverse Modeling of Soft Robots by Learning a Condensed FEM Model |
|
Ménager, Etienne | Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 CRIStAL |
Navez, Tanguy | University of Lille - INRIA |
Goury, Olivier | Inria - Lille Nord Europe |
Duriez, Christian | INRIA |
Keywords: Modeling, Control, and Learning for Soft Robots
Abstract: The Finite Element Method (FEM) is a powerful modeling tool for predicting the behavior of soft robots. However, its use for control can be difficult for non-specialists in numerical computation: it requires optimizing the computation to make it real-time. In this paper, we propose a learning-based approach to obtain a compact but sufficiently rich mechanical representation. Our choice is based on nonlinear compliance data in the actuator/effector space provided by a condensation of the FEM model. We demonstrate that this compact model can be learned with a reasonable amount of data and, at the same time, be very efficient in terms of modeling, since we can deduce the direct and inverse kinematics of the robot. We also show how to couple individually learned models, in particular on the example of a gripper composed of two soft fingers. Further results compare the inverse model derived from the full FEM model with the one from the compact learned version. This work opens new perspectives, notably for the embedded control of soft robots but also for their design. These perspectives are also discussed in the paper.
|
|
08:30-10:10, Paper TuPO1S-01.4 | Add to My Program |
Limit Cycle Generation with Pneumatically Driven Physical Reservoir Computing |
|
Shinkawa, Hiroaki | The University of Tokyo |
Kawase, Toshihiro | Tokyo Denki University |
Miyazaki, Tetsuro | The University of Tokyo |
Kanno, Takahiro | Riverfield Inc |
Sogabe, Maina | The University of Tokyo |
Kawashima, Kenji | The University of Tokyo |
Keywords: Modeling, Control, and Learning for Soft Robots, Hydraulic/Pneumatic Actuators, Neural and Fuzzy Control
Abstract: One recent development in physical reservoir computing, which uses the complex dynamics of a physical system as a computational resource, is the use of a pneumatic pipeline system as the reservoir. Such a system computes with the dynamics of air and, being lightweight and power-saving, has been used for gait-assist control with a soft exoskeleton driven by pneumatic rubber artificial muscles. In this study, we verified that, by feeding the estimated information back into the pneumatic pipeline system, pneumatic physical reservoir computing can generate periodic pressure changes as a stable limit cycle, such as those seen in walking. A pneumatic reservoir with feedback loops was modeled to generate limit cycles in simulation, and it was confirmed that the system could generate limit cycles with high accuracy even from initial positions far from the target limit cycle. This system is expected to be applied to assisting walking movements with a soft exoskeleton using a lightweight computational device.
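To make the feedback idea concrete, here is a generic echo-state-style reservoir with output feedback in NumPy; it stands in for the pneumatic pipeline purely for illustration, and the readout here is random where the paper's would be trained.

```python
# Generic reservoir with output feedback (illustrative stand-in for the
# physical pneumatic reservoir).
import numpy as np

rng = np.random.default_rng(0)
N = 100                                     # reservoir size
W = rng.normal(0, 1, (N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # scale spectral radius below 1
w_fb = rng.normal(0, 0.5, N)                # feedback weights
w_out = rng.normal(0, 0.1, N)               # readout (would be trained)

x, y = np.zeros(N), 0.0
ys = []
for _ in range(2000):
    # Feeding the readout back closes the loop; with a trained readout the
    # closed-loop system can settle onto a stable limit cycle.
    x = np.tanh(W @ x + w_fb * y)
    y = w_out @ x
    ys.append(y)
```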
|
|
08:30-10:10, Paper TuPO1S-01.5 | Add to My Program |
Toward Zero-Shot Sim-To-Real Transfer Learning for Pneumatic Soft Robot 3D Proprioception Sensing |
|
Yoo, Uksang | Carnegie Mellon University |
Zhao, Hanwen | New York University |
Altamirano, Alvaro | New York University |
Yuan, Wenzhen | Carnegie Mellon University |
Feng, Chen | New York University |
Keywords: Modeling, Control, and Learning for Soft Robots, Deep Learning for Visual Perception, Soft Robot Materials and Design
Abstract: Pneumatic soft robots present many advantages in manipulation tasks. Notably, their inherent compliance makes them safe and reliable in unstructured and fragile environments. However, full-body shape sensing for pneumatic soft robots is challenging because of their high degrees of freedom and complex deformation behaviors. Vision-based proprioception sensing methods relying on embedded cameras and deep learning provide a good solution by extracting full-body shape information from high-dimensional sensing data. However, the current training data collection process makes such methods difficult to deploy in many applications. To address this challenge, we propose and demonstrate a robust sim-to-real pipeline that allows the collection of the soft robot's shape information in a high-fidelity point cloud representation. The model trained on simulated data was evaluated with real internal camera images. The results show that the model achieved an average Chamfer distance of 8.85 mm and a tip position error of 10.12 mm, even under external perturbation, for a pneumatic soft robot 100.0 mm in length. We also demonstrated the sim-to-real pipeline's potential for exploring different configurations of visual patterns to improve vision-based reconstruction results. The code and dataset are available at https://github.com/DeepSoRo/DeepSoRoSim2Real.
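The reported metric is the standard Chamfer distance between point clouds; a NumPy sketch follows (the authors' averaging convention may differ from the one used here).

```python
# Symmetric mean nearest-neighbor (Chamfer) distance between point clouds.
import numpy as np

def chamfer_distance(a, b):
    """a, b: (N, 3) and (M, 3) point clouds. Returns the symmetric mean
    nearest-neighbor distance in the clouds' units (e.g., mm)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```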
|
|
08:30-10:10, Paper TuPO1S-01.6 | Add to My Program |
Cross-Domain Transfer Learning and State Inference for Soft Robots Via a Semi-Supervised Sequential Variational Bayes Framework |
|
Sapai, Shageenderan | Monash University |
Loo, Junn Yong | Monash Malaysia |
Ding, Ze Yang | Monash University Malaysia |
Tan, Chee Pin | Monash University |
Phan, Raphael | Monash University |
Baskaran, Vishnu Monn | Monash University Malaysia |
Nurzaman, Surya G. | Monash University |
Keywords: Modeling, Control, and Learning for Soft Robots, Transfer Learning, Probabilistic Inference
Abstract: Recently, data-driven models such as deep neural networks have been shown to be promising tools for modelling and state inference in soft robots. However, voluminous amounts of data are necessary for deep models to perform effectively, which requires exhaustive and high-quality data collection, particularly of state labels. Consequently, obtaining labelled state data for soft robotic systems is challenging for various reasons, including the difficulty of sensorizing soft robots and the inconvenience of collecting data in unstructured environments. To address this challenge, in this paper we propose a semi-supervised sequential variational Bayes (DSVB) framework for transfer learning and state inference in soft robots with missing state labels on certain robot configurations. Considering that soft robots may exhibit distinct dynamics under different configurations, a feature-space transfer strategy is also incorporated to promote the adaptation of latent features across multiple configurations. Unlike existing transfer learning approaches, the proposed DSVB employs a recurrent neural network to model the nonlinear dynamics and temporal coherence in soft robot data. The framework is validated on multiple setup configurations of a pneumatic-based soft robot finger. Experimental results on four transfer scenarios demonstrate that DSVB performs effective transfer learning and accurate state inference amidst missing state labels.
|
|
08:30-10:10, Paper TuPO1S-01.7 | Add to My Program |
Image-Based Pose Estimation and Shape Reconstruction for Robot Manipulators and Soft, Continuum Robots Via Differentiable Rendering |
|
Lu, Jingpei | University of California San Diego |
Liu, Fei | UCSD |
Girerd, Cedric | University of California, San Diego |
Yip, Michael C. | University of California, San Diego |
Keywords: Computer Vision for Automation, Modeling, Control, and Learning for Soft Robots, Visual Tracking
Abstract: State estimation from measured data is crucial for robotic applications, as autonomous systems rely on sensors to capture motion and localize in the 3D world. Among sensors designed for measuring a robot's pose, or, for soft robots, their shape, vision sensors are favorable because they are information-rich, easy to set up, and cost-effective. With recent advancements in computer vision, deep learning-based methods no longer require markers for identifying feature points on the robot. However, learning-based methods are data-hungry and hence not suitable for soft and prototype robots, as building such benchmarking datasets is usually infeasible. In this work, we achieve image-based robot pose estimation and shape reconstruction from camera images. Our method requires no precise robot meshes, but rather utilizes a differentiable renderer and primitive shapes. It can hence be applied to robots for which CAD models might not be available or are crude. Our parameter estimation pipeline is fully differentiable: the robot shape and pose are estimated iteratively by back-propagating the image loss to update the parameters. We demonstrate that our method of using geometric shape primitives can achieve high accuracy in shape reconstruction for a soft continuum robot and in pose estimation for a robot manipulator.
|
|
08:30-10:10, Paper TuPO1S-01.8 | Add to My Program |
Discrete-Time Model Based Control of Soft Manipulator with FBG Sensing |
|
Franco, Enrico | Imperial College London |
Aktas, Ayhan | Imperial College London |
Treratanakulchai, Shen | Imperial College |
Garriga-Casanovas, Arnau | Imperial College London |
Donder, Abdulhamit | Imperial College London |
Rodriguez y Baena, Ferdinando | Imperial College, London, UK |
Keywords: Modeling, Control, and Learning for Soft Robots, Underactuated Robots, Robust/Adaptive Control
Abstract: In this article we investigate the discrete-time model based control of a planar soft continuum manipulator with proprioceptive sensing provided by fiber Bragg gratings. A control algorithm is designed with a discrete-time energy shaping approach which is extended to account for control-related lag of digital nature. A discrete-time nonlinear observer is employed to estimate the uncertain bending stiffness of the manipulator and to compensate constant matched disturbances. Simulations and experiments demonstrate the effectiveness of the controller compared to a continuous time implementation.
|
|
TuPO1S-02 Poster Session, Room T8 |
Add to My Program |
Soft and Flexible Sensors |
|
|
|
08:30-10:10, Paper TuPO1S-02.1 | Add to My Program |
A Soft Robot with Three Dimensional Shape Sensing and Contact Recognition Multi-Modal Sensing Via Tunable Soft Optical Sensors |
|
McCandless, Max | Boston University |
Juliá Wise, Frank | Boston University |
Russo, Sheila | Boston University |
Keywords: Soft Sensors and Actuators, Soft Robot Materials and Design, Soft Robot Applications
Abstract: Soft optical sensing strategies are rapidly developing for soft robotic systems as a means to increase the controllability of soft compliant robots. In this paper, we present a roughness-tuning strategy for the fabrication of soft optical sensors to achieve the dual functionality of shape sensing combined with contact recognition within a single multi-modal sensor. The molds used to fabricate the soft sensors are roughened via laser micromachining to achieve asymmetrical sensor responses when bent in opposite directions. We demonstrate the integration of these sensors into a fully soft robotic platform consisting of a multi-directional bending module with integrated 3D shape sensing and a gripper with tip position monitoring and contact force recognition. We show the accuracy of our sensing strategy in validation experiments, and a pick-and-place task is performed to demonstrate the robot's functionality.
|
|
08:30-10:10, Paper TuPO1S-02.2 | Add to My Program |
A Flexible 3D Force Sensor with Tunable Sensitivity |
|
Davies, James J. | University of New South Wales |
Thai, Mai Thanh | University of New South Wales |
Hoang, Trung Thien | University of New South Wales |
Chi Cong, Nguyen | University of New South Wales |
Phan, Phuoc Thien | University of New South Wales |
Zhu, Kefan | UNSW Sydney |
Tran, Dang Bao Nhi | RMIT |
Ho, Van | Japan Advanced Institute of Science and Technology |
La, Hung | University of Nevada at Reno |
Ha, Q P | University of Technology Sydney |
Lovell, Nigel Hamilton | University of New South Wales |
Do, Thanh Nho | University of New South Wales |
Keywords: Soft Sensors and Actuators, Surgical Robotics: Steerable Catheters/Needles, Medical Robots and Systems
Abstract: Following biology's lead, soft robotics has emerged as a strong candidate for actuation within complex environments. While soft actuation has been developed intensively over the last few decades, soft sensing has so far been slow to catch up. A largely unresearched area is changing soft material properties through prestress to achieve a degree of mechanical sensitivity tunability within soft sensors. Here, a new 3D force sensor is introduced that employs novel hydraulic filament artificial muscles capable of sensitivity tunability. Using a neural network (NN) model, the new soft 3D sensor can precisely detect external forces from the change in hydraulic pressures, with errors of ~1.0%, ~1.3%, and ~0.94% in the x-, y-, and z-axis directions, respectively. The sensor is also able to sense large force ranges, comparable to other similar sensors in the literature. The sensor is then integrated into a soft robotic surgical arm for monitoring tool-tissue interaction during the ablation process.
|
|
08:30-10:10, Paper TuPO1S-02.3 | Add to My Program |
STEV: Stretchable Triboelectric E-Skin Enabled Proprioceptive Vibration Sensing for Soft Robot |
|
Wang, Zihan | Tsinghua University |
Lei, Kai-Chong | Tsinghua University |
Tang, Huaze | Tsinghua University |
Li, Shoujie | Tsinghua Shenzhen International Graduate School |
Dai, Yuan | Tencent |
Ding, Wenbo | Tsinghua University |
Zhang, Xiao-Ping | Ryerson University |
Keywords: Soft Sensors and Actuators, Soft Robot Materials and Design, Soft Robot Applications
Abstract: Vibration perception is essential for robotic sensing and dynamic control. Nevertheless, due to the rigorous demands on sensor conformability and stretchability, equipping soft robots with proprioceptive vibration sensing remains challenging. This paper proposes a new liquid metal-based stretchable e-skin with a kirigami-inspired design to enable proprioceptive vibration sensing in soft robots. The e-skin is fabricated at an ultrathin 0.1 mm thickness, ensuring a negligible influence on the overall stiffness of the soft robot. Moreover, the working mechanism of the e-skin is based on the ubiquitous triboelectrification effect, which transduces mechanical stimuli without an external power supply. To demonstrate the practicability of the e-skin, we built a soft gripper consisting of three soft robotic fingers with proprioceptive vibration sensing. Our experiments show that the gripper can accurately distinguish the grain category (six grains with the same mass, 99.9% accuracy) and the packaging quality (100% accuracy) by simply shaking the gripped bottle. In summary, a proprioceptive vibration sensing solution for soft robots is proposed; it gives soft robots a more comprehensive awareness of their own state and may inspire further research on soft robotics.
|
|
08:30-10:10, Paper TuPO1S-02.4 | Add to My Program |
Design and Development of a Hydrogel-Based Soft Sensor for Multi-Axis Force Control |
|
Cai, Yichen | University of Cambridge |
Hardman, David | University of Cambridge |
Iida, Fumiya | University of Cambridge |
George Thuruthel, Thomas | University College London |
Keywords: Soft Sensors and Actuators, Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design
Abstract: As soft robotic systems become increasingly complex, there is a need to develop sensory systems which can provide rich state information to the robot for feedback control. Multi-axis force sensing and control is one of the less explored problems in this domain. There are numerous challenges in the development of a multi-axis soft sensor: from the design and fabrication to the data processing and modelling. This work presents the design and development of a novel multi-axis soft sensor using a gelatin-based ionic hydrogel and 3D printing technology. A learning-based modelling approach coupled with sensor redundancy is developed to model the environmentally dependent soft sensors. Numerous real-time experiments are conducted to test the performance of the sensor and its applicability in closed-loop control tasks. Our results indicate that the soft sensor can predict force values and orientation angle within 4% and 7% of their total range, respectively.
|
|
08:30-10:10, Paper TuPO1S-02.5 | Add to My Program |
Design and Characterization of a Low Mechanical Loss, High-Resolution Wearable Strain Gauge |
|
Liu, Addison | Harvard University |
Araromi, Oluwaseun Adelowo | Harvard University Science and Engineering Building |
Walsh, Conor James | Harvard University |
Wood, Robert | Harvard University |
Keywords: Soft Sensors and Actuators, Wearable Robotics
Abstract: Soft, wearable systems hold promise for a wide variety of new or enhanced applications in human-computer interaction, physiological monitoring, wearable robotics, and a host of other human-centric devices. Soft sensor systems have been developed concurrently to allow these wearable systems to respond intelligently to their surroundings. A recently reported sensing mechanism based on strain-mediated contact in anisotropically resistive structures (SCARS) is an attractive solution due to its high sensing resolution, low-profile nature, and high mechanical resilience. Furthermore, the resistance-based output provides a simple electronic readout, facilitating its use in a wide variety of applications. However, previous iterations of the sensing mechanism have exhibited stress relaxation and hysteretic behaviors that limit the scope of its use. Here, we report an iteration of the SCARS mechanism that uses silicone-based materials with low mechanical loss to improve sensor signal stability and bandwidth. A new fabrication approach is developed that permits the incorporation of a liquid elastomer adhesive layer while preserving the SCARS sensing functionality. The silicone-based SCARS sensors exhibited a fast stress relaxation response (< 1 s) and cyclic drift reduced by more than half compared to previously reported designs. A physiological monitoring demonstration is presented, validating that the new sensor design is mechanically resilient in such applications and has potential for real-world wearable use cases.
|
|
08:30-10:10, Paper TuPO1S-02.6 | Add to My Program |
Identifying Contact Distance Uncertainty in Whisker Sensing with Tapered, Flexible Whiskers |
|
Kent, Teresa | Carnegie Mellon University |
Emnett, Hannah | Northwestern University |
Babaei, Mahnoush | The University of Texas at Austin |
Hartmann, Mitra | Northwestern University |
Bergbreiter, Sarah | Carnegie Mellon University |
Keywords: Biologically-Inspired Robots, Force and Tactile Sensing, Soft Sensors and Actuators
Abstract: Whisker-based tactile sensors have the potential to perform fast and accurate 3D mappings of the environment, complementing vision-based methods under conditions of glare, reflection, proximity, and occlusion. However, current algorithms for mapping with whiskers make assumptions about the conditions of contact, and these assumptions are not always valid and can cause significant sensing errors. Here we introduce a new whisker sensing system with a tapered, flexible whisker. The system provides inputs to two separate algorithms for estimating radial contact distance on a whisker. Using a Gradient-Moment (GM) algorithm, we correctly detect contact distance in most cases (within 4% of the whisker length). We introduce the Z-Dissimilarity score as a new metric that quantifies uncertainty in the radial contact distance estimate using both the GM algorithm and a Moment-Force (MF) algorithm that exploits the tapered whisker design. Combining the two algorithms ultimately results in contact distance estimates more robust than either algorithm alone.
|
|
08:30-10:10, Paper TuPO1S-02.7 | Add to My Program |
Learning Decoupled Multi-Touch Force Estimation, Localization and Stretch for Soft Capacitive E-Skin |
|
Dawood, Abu Bakar | Queen Mary University of London |
Coppola, Claudio | Queen Mary University of London |
Althoefer, Kaspar | Queen Mary University of London |
Keywords: Soft Sensors and Actuators, Modeling, Control, and Learning for Soft Robots, Machine Learning for Robot Control
Abstract: Distributed sensor arrays capable of detecting multiple spatially distributed stimuli are considered an important element in the realisation of exteroceptive and proprioceptive soft robots. This paper expands upon the previously presented idea of decoupling the measurements of pressure and location of a local indentation from global deformation, using the overall stretch experienced by a soft capacitive e-skin. We employed machine learning methods to decouple and predict these highly coupled deformation stimuli, collecting data from a soft sensor e-skin which was then fed to a machine learning system comprising a linear regressor, a Gaussian process regressor, an SVM, and a random forest classifier for stretch, force, detection, and localisation, respectively. We also studied how localisation and force estimates are affected when two forces are applied simultaneously. Soft sensor arrays aided by appropriately chosen machine learning techniques can pave the way to e-skins capable of deciphering multi-modal stimuli in soft robots.
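A hedged scikit-learn sketch of the decoupled pipeline named in the abstract follows; the feature extraction from the capacitive array and the target encodings are our assumptions, not the authors' specification.

```python
# One estimator per decoupled quantity, as listed in the abstract.
from sklearn.linear_model import LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

def fit_eskin_models(X, y_stretch, y_force, y_touch, y_location):
    """X: capacitance readings (n_samples, n_taxels); targets assumed."""
    return {
        "stretch": LinearRegression().fit(X, y_stretch),
        "force": GaussianProcessRegressor().fit(X, y_force),
        "detection": SVC().fit(X, y_touch),            # touch / no-touch
        "localisation": RandomForestClassifier().fit(X, y_location),
    }

def predict_eskin(models, x):
    x = x.reshape(1, -1)  # single reading -> 2D array for sklearn
    return {name: m.predict(x)[0] for name, m in models.items()}
```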
|
|
08:30-10:10, Paper TuPO1S-02.8 | Add to My Program |
OptiGap: A Modular Optical Sensor System for Bend Localization |
|
Bupe, Jr., Paul | University of Louisville |
Harnett, Cindy | University of Louisville |
Keywords: Soft Sensors and Actuators, Software-Hardware Integration for Robot Systems, Soft Robot Applications
Abstract: This paper presents the novel use of air gaps in flexible optical light pipes to create coded segments for bend localization. The OptiGap sensor system allows the creation of extrinsic, intensity-modulated bend sensors that function as flexible absolute linear encoders. Coded segment patterns are identified by a Gaussian naive Bayes classifier running on an STM32 microcontroller. Fitting of the classifier is aided by a custom software suite that simplifies data collection and processing from the sensor. The sensor model is analyzed and verified through simulation and experiments, highlighting key properties and parameters that aid in the design of OptiGap sensors using different light pipe materials and for various applications. The system allows real-time, accurate bend localization in many robotics and automation applications, in both wet and dry conditions.
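The classification step lends itself to a brief sketch. The following is our own illustration (not the paper's code) of fitting a Gaussian naive Bayes model that maps light-pipe intensity features to coded bend segments; the feature dimensionality and segment signatures are hypothetical, and the paper's actual classifier runs on an STM32 after fitting with the custom software suite.
```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
n_segments, n_features = 4, 3
X, y = [], []
for seg in range(n_segments):
    # assume each coded segment yields a distinct mean intensity signature
    centre = rng.uniform(0.2, 0.9, size=n_features)
    X.append(centre + 0.02 * rng.normal(size=(100, n_features)))
    y.append(np.full(100, seg))
X, y = np.vstack(X), np.concatenate(y)

clf = GaussianNB().fit(X, y)
print("predicted bend segment:", clf.predict(X[:1]))
```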
|
|
TuPO1S-03 Poster Session, Room T8 |
Add to My Program |
Soft Robots: Actuation |
|
|
|
08:30-10:10, Paper TuPO1S-03.1 | Add to My Program |
A Silicone-Sponge-Based Variable-Stiffness Device |
|
Yue, Tianqi | University of Bristol |
You, Tsam Lung | University of Bristol |
Philamore, Hemma | Kyoto University |
Gadelha, Hermes | Department of Engineering, University of Bristol, UK |
Rossiter, Jonathan | University of Bristol |
Keywords: Soft Robot Materials and Design, Soft Robot Applications, Modeling, Control, and Learning for Soft Robots
Abstract: Soft devices employ variable stiffness to ensure safety and improve robustness in the interaction between robots and objects. Using soft materials is one of the most popular approaches to designing a variable-stiffness device, yet the use of silicone sponge remains under-explored in this field. Here we present a novel silicone-sponge-based variable-stiffness device (SVD). The SVD is easy to make and low-cost, fabricated from an air-tight bellows enclosing a silicone sponge core. This gives easy access to the hyper-elastic response of the porous sponge whilst allowing the stiffness of the device to be tuned via a pneumatic pressure difference. A detailed mathematical model of the SVD is proposed, by which the stiffness can be precisely controlled through the applied pressure difference. The stiffness of the SVD can be tuned in the range [1.55, 22.82]×10^3 N/m, an increase of up to 14.7 times. The high stiffness is easily triggered by a low pressure difference (ΔP < 12 kPa). The SVD is a versatile and compact module, with a small axial size (10 mm height) and light weight (14.3 g), making it highly suitable for integration in a wide range of robotic and industrial applications. This, in addition to its easy-to-fabricate and low-cost nature, may appeal to the robotics community at large. We further detail its working principle, fabrication process, mathematical model, and automated control methods to show its versatility.
|
|
08:30-10:10, Paper TuPO1S-03.2 | Add to My Program |
Design and Control of a Tunable-Stiffness Coiled-Spring Actuator |
|
Misra, Shivangi | University of Pennsylvania |
Mitchell, Mason | Worcester Polytechnic Institute |
Chen, Rongqian | University of Pennsylvania |
Sung, Cynthia | University of Pennsylvania |
Keywords: Soft Sensors and Actuators, Compliance and Impedance Control, Modeling, Control, and Learning for Soft Robots
Abstract: We propose a novel design for a lightweight and compact tunable stiffness actuator capable of stiffness changes up to 20x. The design is based on the concept of a coiled spring, where changes in the number of layers in the spring change the bulk stiffness in a near linear fashion. We present an elastica nested rings model for the deformation of the proposed actuator and empirically verify that the designed stiffness-changing spring abides by this model. Using the resulting model, we design a physical prototype of the tunable-stiffness coiled-spring actuator and discuss the effect of design choices on the resulting achievable stiffness range and resolution. In the future, this actuator design could be useful in a wide variety of soft robotics applications, where fast, controllable, and local stiffness change is required over a large range of stiffnesses.
|
|
08:30-10:10, Paper TuPO1S-03.3 | Add to My Program |
Wirelessly-Controlled Untethered Piezoelectric Planar Soft Robot Capable of Bidirectional Crawling and Rotation |
|
Zheng, Zhiwu | Princeton University |
Cheng, Hsin | Princeton University |
Kumar, Prakhar | Princeton University |
Wagner, Sigurd | Princeton University |
Chen, Minjie | Princeton University |
Verma, Naveen | Princeton University |
Sturm, James | Princeton University |
Keywords: Soft Robot Materials and Design, Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators
Abstract: Electrostatic actuators provide a promising approach to creating soft robotic sheets, due to their flexible form factor, modular integration, and fast response speed. However, their control requires kilovolt signals and an understanding of the complex dynamics arising from on-board force interactions and environmental effects. In this work, we demonstrate an untethered planar five-actuator piezoelectric robot powered by batteries and on-board high-voltage circuitry, and controlled through a wireless link. The scalable fabrication approach is based on bonding different functional layers on top of each other (steel foil substrate, actuators, flexible electronics). The robot exhibits a range of controllable motions, including bidirectional crawling (up to ~0.6 cm/s), turning, and in-place rotation (at ~1 degree/s). High-speed videos and control experiments show that the richness of the motion results from the interaction of an asymmetric mass distribution in the robot and the associated dependence of the dynamics on the driving frequency of the piezoelectrics. The robot's speed can reach 6 cm/s with a specific payload distribution.
|
|
08:30-10:10, Paper TuPO1S-03.4 | Add to My Program |
Origami Folding Enhances Modularity and Mechanical Efficiency of Soft Actuators |
|
Wang, Zheng | National University of Singapore |
Song, Yazhou | National University of Singapore |
Wang, Zhongkui | Ritsumeikan University |
Zhang, Hongying | National University of Singapore |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design, Soft Sensors and Actuators
Abstract: Soft robots have long been attractive to robotics engineers due to their remarkable dexterity; however, reports that standardize soft actuators into modularized off-the-shelf devices akin to rigid robots are still rare, and the mechanical efficiency of existing designs is limited. This work identifies origami folding as a way to enable the design of LEGO-like modularized soft actuators with high mechanical efficiency in terms of payload capability and workspace. Herein, three modularized origami actuators that generate translational, bending, and twisting motion are designed, prototyped, and tested. The translational actuator can contract to 40% of its original length, and the twisting and bending actuators can exert 31° and 52° angular motions, respectively. The translational actuator can exert a blocked force of about 821 times its self-weight. The motion of the origami soft actuators is accurately modeled using rigid-body kinematics, and complex systems built from them are captured by homogeneous transformations. Finally, the modularized design and efficient kinematic model are verified on a manipulator and a reconfigurable letter. Benefiting from their unprecedented modularity and mechanical efficiency, these LEGO-like origami actuators are promising for practical applications such as food handling and healthcare.
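The rigid-body kinematic modelling mentioned here can be illustrated with homogeneous transforms: each module contributes one transform, and a chain of modules composes by matrix multiplication. A minimal sketch of ours, with arbitrary example stroke and angles:
```python
import numpy as np

def translate_z(d):
    T = np.eye(4)
    T[2, 3] = d
    return T

def rotate_x(theta):
    T, c, s = np.eye(4), np.cos(theta), np.sin(theta)
    T[1:3, 1:3] = [[c, -s], [s, c]]
    return T

def rotate_z(theta):
    T, c, s = np.eye(4), np.cos(theta), np.sin(theta)
    T[0:2, 0:2] = [[c, -s], [s, c]]
    return T

# one translational, one bending, one twisting module in series
# (the stroke and angles below are arbitrary example values)
chain = translate_z(0.016) @ rotate_x(np.deg2rad(52)) @ rotate_z(np.deg2rad(31))
print("end pose of the three-module chain:\n", chain)
```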
|
|
08:30-10:10, Paper TuPO1S-03.5 | Add to My Program |
Characterisation of Antagonistically Actuated, Stiffness-Controllable Joint-Link Units for Cobots |
|
Gaozhang, Wenlong | University College London |
Shi, Jialei | University College London |
Li, Yue | Kings College London |
Stilli, Agostino | University College London |
Wurdemann, Helge Arne | University College London |
Keywords: Soft Robot Materials and Design, Soft Robot Applications, Soft Sensors and Actuators
Abstract: Soft robotic structures may play a major role in the 4th industrial revolution. Researchers have successfully demonstrated the advantages of soft robotics over traditional robots made of rigid links and joints in many application areas. Variable stiffness links (VSL) and joints (VSJ) have been investigated to achieve on-demand forces while remaining inherently safe in interactions with humans. However, a thorough characterisation of soft versus rigid robotic components is still required. This paper investigates the influence of antagonistically actuated, stiffness-controllable joint-link units (JLUs) on the performance of collaborative robots (i.e., stiffness, load capacity, and repetitive precision) and characterises the differences compared with rigid units. A JLU combines a VSL and a VSJ, or their rigid counterparts. Experimental results show that the VSL exhibits only minor differences from the rigid link in terms of stiffness (ratios of 0.62 ~ 0.95), output force (0.93 ~ 0.94), and repetitive precision. For the VSJ, our results show a significant gap compared with the servo motor with regard to maximum stiffness (0.14 ~ 0.21) and repetitive position precision (0.07 ~ 0.25). However, the VSJ achieves similar repetitive force precision and a higher maximum output force (1.54 ~ 1.55 times).
|
|
08:30-10:10, Paper TuPO1S-03.6 | Add to My Program |
A Fluidic Actuator with an Internal Stiffening Structure Inspired by Mammalian Erectile Tissue |
|
Fras, Jan | Queen Mary University of London |
Althoefer, Kaspar | Queen Mary University of London |
Keywords: Soft Sensors and Actuators, Soft Robot Materials and Design, Compliance and Impedance Control
Abstract: One of the biggest problems with soft robots is precisely the fact that they are soft: the softer they are, the less force they can exert on the environment. Researchers have proposed a number of stiffening methods, but all of them have drawbacks, such as locking the shape of the device in a way that precludes further adjustment. In this paper we propose a stiffening method inspired by the internal structure of the mammalian penis. The soft actuation chamber is divided into small compartments that trap the actuation fluid, leading to a locally amplified pressure increase under certain conditions. At the same time, the proposed solution does not affect the actuation mechanism: the actuator can be adjusted in one direction just as if it were in the non-stiffened mode, while offering a stiff response in the opposite direction. Our prototype achieves a stiffness increase of approximately a factor of two. The paper describes the concept, the mathematical justification of the working principle, the prototype design, its implementation, and our experimental results.
|
|
08:30-10:10, Paper TuPO1S-03.7 | Add to My Program |
On Tendon Driven Continuum Robots with Compressible Backbones |
|
Srivastava, Manu | Clemson University |
Walker, Ian | Clemson University |
Keywords: Modeling, Control, and Learning for Soft Robots, Tendon/Wire Mechanism
Abstract: This paper discusses the effect of axial backbone compression on tendon-driven continuum robots. A new mechanics model compensating for this effect, which requires neither tendon tension sensing nor knowledge of the manipulator's material properties/stiffnesses, is introduced and analyzed. In addition, we provide an analytical expression for the minimum preload on the tendons needed to achieve a given bend, a quantity that until now has been determined empirically. Our model is computationally efficient and achieves real-time control on low-cost hardware. The analysis is supported by experimental results demonstrating significant improvement over a purely kinematic model in the open-loop control of a tendon-driven continuum hose robot.
|
|
TuPO1S-04 Poster Session, Room T8 |
Add to My Program |
Sensor Fusion I |
|
|
|
08:30-10:10, Paper TuPO1S-04.1 | Add to My Program |
FourStr: When Multi-Sensor Fusion Meets Semi-Supervised Learning |
|
Xie, Bangquan | South China University of Technology |
Yang, Liang | Apple Inc |
Yang, Zongming | Clemson University |
Wei, Ailin | Clemson University |
Weng, Xiaoxiong | South China University of Technology |
Li, Bing | Clemson University |
Keywords: Computer Vision for Transportation, Deep Learning for Visual Perception, Sensor Fusion
Abstract: This article proposes FourStr, a novel semi-supervised learning framework (a Four-Stream model formed by two two-stream models) that improves fusion and labeling efficiency for 3D multi-sensor detectors. FourStr adopts a multi-sensor single-stage detector named adaptive fusion network (AFNet) as the backbone and trains it through the semi-supervised learning (SSL) strategy Stereo Fusion. The multi-sensor AFNet and SSL Stereo Fusion benefit each other. On the one hand, the four streams composed of two AFNets naturally provide the rich inputs and large models needed by SSL Stereo Fusion, whereas other SSL works must use massive augmentation to obtain rich inputs and must deepen and widen the network to obtain large models. On the other hand, through three novel fusion stages and Loss Pruning, Stereo Fusion improves the fusion and labeling efficiency of AFNet. Finally, extensive experiments demonstrate that FourStr performs excellently on outdoor datasets (KITTI and Waymo Open Dataset) and an indoor dataset (SUN RGB-D), especially for small-contour objects. Compared to fully-supervised methods, FourStr achieves similar accuracy with only 2% labeled data on KITTI (or 50% labeled data on SUN RGB-D).
|
|
08:30-10:10, Paper TuPO1S-04.2 | Add to My Program |
Combining Motion and Appearance for Robust Probabilistic Object Segmentation in Real Time |
|
Mengers, Vito | Technische Universität Berlin |
Battaje, Aravind | TU Berlin |
Baum, Manuel | TU Berlin |
Brock, Oliver | Technische Universität Berlin |
Keywords: Object Detection, Segmentation and Categorization, Sensor Fusion, RGB-D Perception
Abstract: We present a robust method to visually segment scenes into objects based on motion and appearance. Both these cues provide complementary information that we fuse using two interconnected recursive estimators: One estimates object segmentation from motion as a probabilistic clustering of tracked 3D points, and the other estimates object segmentation from appearance as a probabilistic image segmentation. The interconnected estimators provide a probabilistic and consistent object segmentation in real time, which makes them well suited for many downstream robotic tasks. We evaluate our method on one such task, kinematic structure estimation, on a dataset of interactions with articulated objects and show that our fusion improves object segmentation by 70% and in turn estimated kinematic joints by 26% over a purely motion-based approach. Furthermore, we show the necessity of probabilistic modeling for downstream robotic tasks, achieving 339% of the performance of a recent multimodal but deterministic RNN for object segmentation on the estimation of kinematic structure.
|
|
08:30-10:10, Paper TuPO1S-04.3 | Add to My Program |
Event-Based Real-Time Moving Object Detection Based on IMU Ego-Motion Compensation |
|
Zhao, Chunhui | Northwestern Polytechnical University |
Li, Yakun | Northwestern Polytechnical University |
Lyu, Yang | Northwestern Polytechnical University |
Keywords: Object Detection, Segmentation and Categorization, Sensor Fusion, Visual Tracking
Abstract: Accurate and timely onboard perception is a prerequisite for mobile robots to operate in highly dynamic scenarios. The bio-inspired event camera can capture more motion details than a traditional camera by triggering each pixel asynchronously, and is therefore better suited to such scenarios. Among the various perception tasks based on the event camera, ego-motion removal is a fundamental procedure for reducing perception ambiguities. Recent ego-motion removal methods are mainly based on optimization processes and may be computationally expensive for robot applications. In this paper, we consider the challenging perception task of detecting fast-moving objects from an aggressively operated platform equipped with an event camera, reducing computational cost by directly employing IMU motion measurements. First, we design a nonlinear warping function that captures rotation information from an IMU and compensates for the camera motion during an asynchronous event stream; the proposed nonlinear warping improves compensation accuracy by 10%-15%. Afterward, we segment the moving objects in the warped image through dynamic threshold segmentation, optical flow calculation, and clustering. Finally, we validate the proposed detection pipeline on public datasets and real-world data streams containing challenging lighting conditions and fast-moving objects.
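As a rough illustration of IMU-based rotational warping (our own sketch, not the paper's warping function), the snippet below rotates each event's pixel coordinates back to a reference time using a small-angle rotation accumulated from the gyro; the intrinsics and event data are hypothetical.
```python
import numpy as np

K = np.array([[320.0, 0.0, 320.0],
              [0.0, 320.0, 240.0],
              [0.0, 0.0, 1.0]])       # hypothetical pinhole intrinsics
K_inv = np.linalg.inv(K)

def rotation_from_gyro(omega, dt):
    """Small-angle rotation accumulated from gyro rate omega over dt."""
    wx, wy, wz = omega * dt
    return np.array([[1.0, -wz, wy],
                     [wz, 1.0, -wx],
                     [-wy, wx, 1.0]])

def warp_events(events, omega):
    """events: (N, 3) rows of (x, y, t), with t relative to the reference time."""
    warped = []
    for x, y, t in events:
        R = rotation_from_gyro(omega, t)
        p = K @ R.T @ K_inv @ np.array([x, y, 1.0])  # undo the camera rotation
        warped.append(p[:2] / p[2])
    return np.array(warped)

events = np.array([[100.0, 120.0, 0.002], [101.0, 121.0, 0.004]])
print(warp_events(events, omega=np.array([0.0, 0.0, 2.0])))  # 2 rad/s yaw
```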
|
|
08:30-10:10, Paper TuPO1S-04.4 | Add to My Program |
Estimating the Motion of Drawers from Sound |
|
Baum, Manuel | TU Berlin |
Froessl, Amelie | Technische Universitaet Berlin |
Battaje, Aravind | TU Berlin |
Brock, Oliver | Technische Universität Berlin |
Keywords: Robot Audition, Sensor Fusion
Abstract: Robots need to understand articulated objects, such as drawers. The state of articulated structures is commonly estimated using vision, but visual perception is limited when objects are occluded, have few salient features, or are not in the camera's field of view. Audio sensing does not face these challenges, since sound propagates in a fundamentally different way than light. Therefore we propose to fuse vision and audio sensing to overcome the challenges faced by vision alone. We estimate motion in several drawers and show that an audio-visual approach estimates drawer motion more reliably than only vision -- even in settings where the purely visual approach completely breaks down. Additionally, we perform an in-depth analysis of the regularities that govern how motion in drawers shapes their sound.
|
|
08:30-10:10, Paper TuPO1S-04.5 | Add to My Program |
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents That See and Hear |
|
Gao, Ruohan | Stanford University |
Li, Hao | Stanford University |
Dharan, Gokul | Stanford University |
Wang, Zhuzhu | Stanford University |
Li, Chengshu | Stanford University |
Xia, Fei | Google Inc |
Savarese, Silvio | Stanford University |
Fei-Fei, Li | Stanford University |
Wu, Jiajun | Stanford University |
Keywords: Robot Audition, Sensor Fusion, Multi-Modal Perception for HRI
Abstract: Developing embodied agents in simulation has been a key research topic in recent years. Exciting new tasks, algorithms, and benchmarks have been developed in various simulators. However, most of them assume deaf agents in silent environments, while we humans perceive the world with multiple senses. We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear. Sonicverse models realistic continuous audio rendering in 3D environments in real-time. Together with a new audio-visual VR interface that allows humans to interact with agents with audio, Sonicverse enables a series of embodied AI tasks that need audio-visual perception. For semantic audio-visual navigation in particular, we also propose a new multi-task learning model that achieves state-of-the-art performance. In addition, we demonstrate Sonicverse's realism via sim-to-real transfer, which has not been achieved by other simulators: an agent trained in Sonicverse can successfully perform audio-visual navigation in real-world environments. Sonicverse is available at: https://github.com/StanfordVL/Sonicverse.
|
|
08:30-10:10, Paper TuPO1S-04.6 | Add to My Program |
LAPTNet-FPN: Multi-Scale LiDAR-Aided Projective Transform Network for Real Time Semantic Grid Prediction |
|
Diaz-Zapata, Manuel | Inria Grenoble |
Sierra-Gonzalez, David | Inria Grenoble Rhône-Alpes |
Erkent, Ozgur | Hacettepe University |
Laugier, Christian | INRIA |
Dibangoye, Jilles | Univ Lyon |
Keywords: Semantic Scene Understanding, Sensor Fusion, Autonomous Agents
Abstract: Semantic grids can be useful representations of the scene around an autonomous system. By having information about the layout of the space around itself, a robot can leverage this type of representation for crucial tasks such as navigation or tracking. By fusing information from multiple sensors, robustness can be increased and the computational load for the task can be lowered, achieving real time performance. Our multi-scale LiDAR-Aided Perspective Transform network uses information available in point clouds to guide the projection of image features to a top-view representation, resulting in a relative improvement in the state of the art for semantic grid generation for human (+8.67%) and movable object (+49.07%) classes in the nuScenes dataset, as well as achieving results close to the state of the art for the vehicle, drivable area and walkway classes, while performing inference at 25 FPS.
|
|
08:30-10:10, Paper TuPO1S-04.7 | Add to My Program |
Collision-Aware In-Hand 6D Object Pose Estimation Using Multiple Vision-Based Tactile Sensors |
|
Caddeo, Gabriele Mario | Istituto Italiano Di Tecnologia |
Piga, Nicola Agostino | Istituto Italiano Di Tecnologia |
Bottarel, Fabrizio | Istituto Italiano Di Tecnologia |
Natale, Lorenzo | Istituto Italiano Di Tecnologia |
Keywords: Sensor Fusion
Abstract: In this paper, we address the problem of estimating the in-hand 6D pose of an object in contact with multiple vision-based tactile sensors. We reason about the possible spatial configurations of the sensors along the object surface. Specifically, we filter contact hypotheses using geometric reasoning and a Convolutional Neural Network (CNN), trained on simulated object-agnostic images, to promote those that better comply with the actual tactile images from the sensors. We use the selected sensor configurations to optimize over the space of 6D poses using a gradient-descent-based approach. We finally rank the obtained poses by penalizing those that are in collision with the sensors. We carry out experiments in simulation using the DIGIT vision-based sensor with several objects from the standard YCB model set. The results demonstrate that our approach estimates object poses that are compatible with actual object-sensor contacts in 87.5% of cases, while reaching an average positional error on the order of 2 centimeters. Our analysis also includes qualitative results from experiments with a real DIGIT sensor.
|
|
08:30-10:10, Paper TuPO1S-04.8 | Add to My Program |
CalibDepth: Unifying Depth Map Representation for Iterative LiDAR-Camera Online Calibration |
|
Zhu, Jiangtong | Xi'an Jiaotong University |
Xue, Jianru | Xi'an Jiaotong University |
Zhang, Pu | Xi'an Jiaotong University |
Keywords: Sensor Fusion
Abstract: LiDAR-camera online calibration is of great significance for building a stable autonomous driving perception system. For online calibration, a key challenge lies in constructing a unified and robust representation between multimodal sensor data. Most methods extract features either manually or implicitly with an end-to-end deep learning method. The former suffers from poor robustness, while the latter suffers from poor interpretability. In this paper, we propose CalibDepth, which uses depth maps as the unified representation for images and LiDAR point clouds. CalibDepth introduces a sub-network for monocular depth estimation to assist online calibration tasks. To further improve performance, we regard online calibration as a sequence prediction problem and introduce global and local losses to optimize the calibration results. CalibDepth shows excellent performance in different experimental setups.
|
|
TuPO1S-05 Poster Session, Room T8 |
Add to My Program |
Visual Servoing |
|
|
|
08:30-10:10, Paper TuPO1S-05.1 | Add to My Program |
Shape Visual Servoing of a Tether Cable from Parabolic Features |
|
Smolentsev, Lev | INRIA Rennes - Bretagne Atlantique |
Krupa, Alexandre | Centre Inria De l'Université De Rennes |
Chaumette, Francois | Inria Center at University of Rennes |
Keywords: Visual Servoing
Abstract: In this paper we propose a visual servoing approach that controls the deformation of a suspended tether cable subject to gravity, using visual data provided by an RGB-D camera. The cable shape is modelled with a parabolic curve together with the orientation of the plane containing the tether. The visual features considered are the parabolic coefficients and the yaw angle of that plane. We derive the analytical expression of the interaction matrix that relates the variation of the visual features to the velocities of the cable extremities. Singularities are shown to occur if and only if the cable is taut horizontally or vertically. An image processing algorithm is also developed to extract the current features in real time by fitting the parabola to the cable in the observed point cloud. Simulations and experimental results demonstrate the efficiency of our visual servoing approach in deforming the tether cable toward a desired shape configuration.
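The parabolic feature extraction can be sketched compactly. The snippet below is our illustration, assuming the tether plane's yaw is known (the paper estimates it from the point cloud): it least-squares fits a parabola to synthetic cable points.
```python
import numpy as np

yaw = 0.2                                  # assumed known plane yaw (rad)
s = np.linspace(-0.5, 0.5, 50)             # abscissa along the cable plane
pts = np.column_stack([s * np.cos(yaw),    # synthetic sagging cable points
                       s * np.sin(yaw),
                       0.8 * s**2 - 0.3])

u = pts[:, 0] * np.cos(yaw) + pts[:, 1] * np.sin(yaw)  # project into the plane
a, b, c = np.polyfit(u, pts[:, 2], deg=2)              # z = a*u^2 + b*u + c
print(f"parabolic features: a={a:.3f}, b={b:.3f}, c={c:.3f}")
```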
|
|
08:30-10:10, Paper TuPO1S-05.2 | Add to My Program |
Deep Metric Learning for Visual Servoing: When Pose and Image Meet in Latent Space |
|
Felton, Samuel | Université De Rennes 1, IRISA |
Fromont, Elisa | Université of Rennes 1 IRISA/Inria Rba |
Marchand, Eric | Univ Rennes, Inria, CNRS, IRISA |
Keywords: Visual Servoing
Abstract: We propose a new visual servoing method that controls a robot's motion in a latent space. We aim to extract the best properties of two previously proposed servoing methods: we seek to obtain the accuracy of photometric methods such as Direct Visual Servoing (DVS), as well as the behavior and convergence of pose-based visual servoing (PBVS). Photometric methods suffer from limited convergence area due to a highly non-linear cost function, while PBVS requires estimating the pose of the camera which may introduce some noise and incurs a loss of accuracy. Our approach relies on shaping (with metric learning) a latent space, in which the representations of camera poses and the embeddings of their respective images are tied together. By leveraging the multimodal aspect of this shared space, our control law minimizes the difference between latent image representations thanks to information obtained from a set of pose embeddings. Experiments in simulation and on a robot validate the strength of our approach, showing that the sought out benefits are effectively found.
|
|
08:30-10:10, Paper TuPO1S-05.3 | Add to My Program |
CNN-Based Visual Servoing for Simultaneous Positioning and Flattening of Soft Fabric Parts |
|
Tokuda, Fuyuki | Centre for Transformative Garment Production |
Seino, Akira | Tohoku University |
Kobayashi, Akinari | Centre for Transformative Garment Production |
Kosuge, Kazuhiro | The University of Hong Kong |
Keywords: Visual Servoing, Dual Arm Manipulation, Deep Learning in Grasping and Manipulation
Abstract: This paper proposes CNN-based visual servoing for simultaneous positioning and flattening of a soft fabric part placed on a table by a dual manipulator system. We propose a network for multimodal data processing of grayscale images captured by a camera and force/torque applied to force sensors. The training dataset is collected by moving the real manipulators, which enables the network to map the captured images and force/torque to the manipulator’s motion in Cartesian space. We apply structured lighting to emphasize the features of the surface of the fabric part since the surface shape of the non-textured fabric part is difficult to recognize by a single grayscale image. Through experiments, we show that the fabric part with unseen wrinkles can be positioned and flattened by the proposed visual servoing scheme.
|
|
08:30-10:10, Paper TuPO1S-05.4 | Add to My Program |
Dynamical System-Based Imitation Learning for Visual Servoing Using the Large Projection Formulation |
|
Paolillo, Antonio | IDSIA USI-SUPSI |
Robuffo Giordano, Paolo | Irisa Cnrs Umr6074 |
Saveriano, Matteo | University of Trento |
Keywords: Visual Servoing, Learning from Demonstration, Imitation Learning
Abstract: Nowadays, ubiquitous robots must be adaptive and easy to use. To this end, dynamical system-based imitation learning plays an important role. In fact, it makes it possible to realize stable and complex robotic tasks without explicitly coding them, thus facilitating the use of robots. However, the adaptation capabilities of dynamical systems have not been fully exploited due to the lack of closed-loop implementations making use of visual feedback. In this regard, the integration of visual information allows higher flexibility to cope with environmental changes. This work presents a dynamical system-based imitation learning scheme for visual servoing, based on the large projection task-priority formulation. The proposed scheme enables complex and stable visual tasks, as demonstrated by a simulation analysis and experiments with a robotic manipulator.
|
|
08:30-10:10, Paper TuPO1S-05.5 | Add to My Program |
Constant Distance and Orientation Following of an Unknown Surface with a Cable-Driven Parallel Robot |
|
Rousseau, Thomas | Nantes Université, LS2N, IRT Jules Verne |
Pedemonte, Nicolo | IRT Jules Verne |
Caro, Stéphane | CNRS/LS2N |
Chaumette, Francois | Inria Center at University of Rennes |
Keywords: Visual Servoing, Motion Control, Parallel Robots
Abstract: Cable-Driven Parallel Robots (CDPRs) are well adapted to large workspaces since they replace rigid links with cables. However, they lack positioning accuracy, and new control methods are necessary to achieve profile-following tasks. This paper presents a control scheme designed for these tasks, relying on a combination of accurate onboard distance sensors and a less accurate remote camera. The profile-following task is divided into two partially conflicting subtasks: maintaining a parallel orientation and a constant distance with respect to the surface to follow, and following a trajectory between two points on the surface. The redundancy between the two subtasks is resolved by data fusion based on the Gradient Projection Method. This control scheme is validated experimentally on a CDPR prototype and shown to provide the expected behaviour.
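The Gradient Projection Method the authors rely on is the classical null-space projection: the primary task fixes the main motion, and the secondary task acts only in the primary task's null space. A minimal sketch of ours with a toy Jacobian and hypothetical task values:
```python
import numpy as np

def gpm(J_primary, e_primary, grad_secondary, gain=1.0):
    """q_dot = gain * J+ e + (I - J+ J) g  (secondary task in the null space)."""
    J_pinv = np.linalg.pinv(J_primary)
    null_proj = np.eye(J_primary.shape[1]) - J_pinv @ J_primary
    return gain * (J_pinv @ e_primary) + null_proj @ grad_secondary

J = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])   # toy 2-row task Jacobian, 4 actuators
e = np.array([0.01, -0.02])            # primary (distance/orientation) error
g = np.array([0.0, 0.0, 0.05, 0.05])   # secondary (trajectory) gradient
print("commanded velocities:", gpm(J, e, g))
```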
|
|
08:30-10:10, Paper TuPO1S-05.6 | Add to My Program |
3D Spectral Domain Registration-Based Visual Servoing |
|
Adjigble, Komlan Jean Maxime | University of Birmingham |
Tamadazte, Brahim | CNRS |
de Farias, Cristiana | University of Birmingham |
Stolkin, Rustam | University of Birmingham |
Marturi, Naresh | University of Birmingham |
Keywords: Visual Servoing, Sensor-based Control, Optimization and Optimal Control
Abstract: This paper presents a spectral-domain registration-based visual servoing scheme that works on 3D point clouds. Specifically, we propose a 3D model/point cloud alignment method that finds a global transformation between reference and target point clouds using spectral analysis. A 3D Fast Fourier Transform (FFT) in R3 is used for the translation estimation, and real spherical harmonics in SO(3) are used for the rotation estimation. This approach allows us to derive a decoupled 6-degrees-of-freedom (DoF) controller, where gradient-based optimisation is used to minimise the translation and rotation costs. We then show how this methodology can be used to regulate a robot arm to perform a positioning task. In contrast to existing state-of-the-art depth-based visual servoing methods that require either dense depth maps or dense point clouds, our method works well with partial point clouds and can effectively handle larger transformations between the reference and the target positions. Furthermore, the use of spectral data (instead of spatial data) for transformation estimation makes our method robust to sensor-induced noise and partial occlusions. We validate our approach through experiments using point clouds acquired by a robot-mounted depth camera. The obtained results demonstrate the effectiveness of our visual servoing approach.
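The FFT-based translation step can be illustrated with standard phase correlation on voxelised volumes (our own sketch; the paper's method additionally handles rotation via spherical harmonics and refines the estimate):
```python
import numpy as np

ref = np.zeros((32, 32, 32))
ref[10:14, 8:12, 16:20] = 1.0                    # synthetic occupancy volume
tgt = np.roll(ref, shift=(3, -2, 5), axis=(0, 1, 2))

cross = np.conj(np.fft.fftn(ref)) * np.fft.fftn(tgt)
cross /= np.abs(cross) + 1e-12                   # normalised cross-power spectrum
corr = np.fft.ifftn(cross).real
peak = np.unravel_index(np.argmax(corr), corr.shape)
# unwrap shifts larger than half the volume size
shift = [p - n if p > n // 2 else p for p, n in zip(peak, corr.shape)]
print("estimated translation (voxels):", shift)  # expect [3, -2, 5]
```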
|
|
08:30-10:10, Paper TuPO1S-05.7 | Add to My Program |
Autonomous Endoscope Control Algorithm with Visibility and Joint Limits Avoidance Constraints for Da Vinci Research Kit Robot |
|
Moccia, Rocco | Università Degli Studi Di Napoli Federico II |
Ficuciello, Fanny | Università Di Napoli Federico II |
Keywords: Visual Servoing, Surgical Robotics: Laparoscopy, Medical Robots and Systems
Abstract: This paper presents a novel autonomous endoscope control method for the dVRK's Endoscopic Camera Manipulator (ECM), which allows the camera to track the surgical instruments on the Patient Side Manipulator (PSM). Image-based Visual Servoing (IBVS) is augmented with a visibility constraint that keeps the identified surgical tool within the camera's Field Of View (FOV), ensuring the continued availability of image feedback, and a joint-limit avoidance constraint that prevents the ECM from exceeding its joint limits. The work relies on an optimization approach, with the constraints enforced using Control Barrier Functions (CBFs). The goal is to minimize the surgeon's cognitive and physical workload by removing the time-consuming job of camera reorientation, offering a constraint-enforcing alternative to the traditional IBVS endoscopic camera controller.
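A minimal sketch of the CBF idea (our illustration, not the paper's controller): filter a nominal IBVS command so that a field-of-view margin h satisfies h_dot >= -alpha*h, here as a single linear constraint solved in closed form; the gradient, margin, and gain are all hypothetical values.
```python
import numpy as np

def cbf_filter(u_des, a, b):
    """min ||u - u_des||^2  s.t.  a @ u >= b  (single constraint, closed form)."""
    if a @ u_des >= b:
        return u_des
    return u_des + a * (b - a @ u_des) / (a @ a)

alpha = 2.0                     # class-K gain
h = 0.05                        # FOV margin of the tracked tool (hypothetical)
a = np.array([-1.0, 0.0])       # hypothetical gradient of h w.r.t. the command
u_ibvs = np.array([0.3, 0.1])   # nominal IBVS camera command
u_safe = cbf_filter(u_ibvs, a, b=-alpha * h)  # enforce h_dot >= -alpha * h
print("nominal:", u_ibvs, "filtered:", u_safe)
```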
|
|
08:30-10:10, Paper TuPO1S-05.8 | Add to My Program |
Safe Control Using Vision-Based Control Barrier Function (V-CBF) |
|
Abdi, Hossein | Sharif University of Technology |
Raja, Golnaz | Tampere University |
Ghabcheloo, Reza | Tampere University |
Keywords: Machine Learning for Robot Control, Vision-Based Navigation, RGB-D Perception
Abstract: Safe motion control in unknown environments, as required for instance in autonomous navigation, is one of the challenging tasks in robotics. The Control Barrier Function (CBF), a strong mathematical tool, has been widely used in many safety-critical systems to satisfy safety requirements. However, there are only a handful of recent studies on safety controllers with perception inputs, and most of these works assume that the CBF is already known and that obstacles have predefined shapes. In this work, we introduce a novel Vision-based Control Barrier Function (V-CBF), which generalizes to new environments and obstacles of arbitrary shapes. We then derive CBF safety conditions over RGB-D space and relate them to actual robot control inputs. To train the CBF, we introduce a method to generate ground truth with the desired CBF properties and a method that learns part of the CBF as an image-to-image translation problem. We finally demonstrate the efficacy of V-CBF on the safe control of an autonomous car in the CARLA simulator.
|
|
TuPO1S-06 Poster Session, Room T8 |
Add to My Program |
Visual Tracking |
|
|
|
08:30-10:10, Paper TuPO1S-06.1 | Add to My Program |
DC-MOT: Motion Deblurring and Compensation for Multi-Object Tracking in UAV Videos |
|
Cheng, Song | Jilin University |
Yao, Meibao | Jilin University |
Xiao, Xueming | Changchun University of Science and Technology |
Keywords: Visual Tracking, Aerial Systems: Applications, AI-Based Methods
Abstract: In this paper, we propose a multi-object tracking framework for videos captured by UAVs, addressing motion imperfections in two aspects: 1) motion blurring of objects due to the high-speed motion of the UAV and the objects, which deteriorates detector performance; and 2) coupling of the global movement of the UAV camera with object motion, which makes object trajectories in adjacent frames more difficult to predict. For motion blurring, we propose a hybrid deblurring module that deals with the blurred frames while retaining the clear frames, trading off video tracking performance against spatio-temporal consistency. For motion coupling, we propose a motion compensation module that aligns adjacent frames by feature matching and obtains the corrected target position in the next frame, alleviating the interference of camera movement with tracking. We evaluate the proposed methods on the VisDrone dataset and validate that our framework achieves new state-of-the-art performance for UAV-based MOT systems.
|
|
08:30-10:10, Paper TuPO1S-06.2 | Add to My Program |
Fast Event-Based Double Integral for Real-Time Robotics |
|
Lin, Shijie | The University of Hong Kong |
Zhang, Yinqiang | The University of Hong Kong |
Huang, Dongyue | The Chinese University of Hong Kong |
Zhou, Bin | Beihang University |
Luo, Xiaowei | City University of Hong Kong |
Pan, Jia | University of Hong Kong |
Keywords: Hardware-Software Integration in Robotics, Computer Vision for Automation, Visual-Inertial SLAM
Abstract: Motion deblurring is a critical ill-posed problem that is important in many vision-based robotics applications. The recently proposed event-based double integral (EDI) provides a theoretical framework for solving the deblurring problem with an event camera and generating clear images at high frame rates. However, the original EDI is mainly designed for offline computation and does not meet the real-time requirements of many robotics applications. In this paper, we propose fast EDI, an efficient implementation of EDI that achieves real-time online computation on single-core CPU devices, which are common on the physical robotic platforms used in practice. In experiments, our method can handle event rates as high as 13 million events per second in a wide variety of challenging lighting conditions. We demonstrate the benefit on multiple downstream real-time applications, including localization, visual tag detection, and feature matching.
|
|
08:30-10:10, Paper TuPO1S-06.3 | Add to My Program |
Continuous-Time Gaussian Process Motion-Compensation for Event-Vision Pattern Tracking with Distance Fields |
|
Le Gentil, Cedric | University of Technology Sydney |
Alzugaray, Ignacio | Imperial College London |
Vidal-Calleja, Teresa A. | University of Technology Sydney |
Keywords: Visual Tracking, Computer Vision for Automation, SLAM
Abstract: This work addresses the issue of motion compensation and pattern tracking in event camera data. An event camera generates asynchronous streams of events triggered independently by each of the pixels upon changes in the observed intensity. Providing great advantages in low-light and rapid-motion scenarios, such unconventional data present significant research challenges as traditional vision algorithms are not directly applicable to this sensing modality. The proposed method decomposes the tracking problem into a local SE(2) motion-compensation step followed by a homography registration of small motion-compensated event batches. The first component relies on Gaussian Process (GP) theory to model the continuous occupancy field of the events in the image plane and embed the camera trajectory in the covariance kernel function. In doing so, estimating the trajectory is done similarly to GP hyperparameter learning by maximising the log marginal likelihood of the data. The continuous occupancy fields are turned into distance fields and used as templates for homography-based registration. By benchmarking the proposed method against other state-of-the-art techniques, we show that our open-source implementation performs high-accuracy motion compensation and produces high-quality tracks in real-world scenarios.
|
|
08:30-10:10, Paper TuPO1S-06.4 | Add to My Program |
EXOT: Exit-Aware Object Tracker for Safe Robotic Manipulation of Moving Object |
|
Kim, Hyunseo | Seoul National University |
Yoon, Hye Jung | Seoul National University |
Kim, Minji | Seoul National University |
Han, Dong-Sig | Seoul National University |
Zhang, Byoung-Tak | Seoul National University |
Keywords: Visual Tracking, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: Current robotic hand manipulation operates narrowly, with objects in predictable positions in limited environments. Thus, when the location of the target object deviates severely from the expected location, a robot sometimes responds in an unexpected way, especially when it operates alongside a human. For safe robot operation, we propose the EXit-aware Object Tracker (EXOT), run on a robot hand camera, which recognizes an object's absence during manipulation. The robot decides whether to proceed by examining the tracker's bounding-box output containing the target object. We adopt an out-of-distribution classifier for more accurate object recognition, since trackers can mistrack a background as a target object. To the best of our knowledge, our method is the first to apply an out-of-distribution classification technique to tracker outputs. We evaluate our method on a first-person video benchmark dataset, TREK-150, and on a custom dataset, RMOT-223, that we collect with a UR5e robot. We then test our tracker on the UR5e robot in real time with a conveyor-belt sushi task, examining the tracker's ability to track target dishes and determine the exit status. Our tracker shows 38% higher exit-aware performance than a baseline method. The dataset and the code will be released at https://github.com/hskAlena/EXOT.
|
|
08:30-10:10, Paper TuPO1S-06.5 | Add to My Program |
Mono-STAR: Mono-Camera Scene-Level Tracking and Reconstruction |
|
Chang, Haonan | Rutgers University |
Metha Ramesh, Dhruv | Rutgers University |
Geng, Shijie | Rutgers University |
Gan, Yuqiu | Columbia University |
Boularias, Abdeslam | Rutgers University |
Keywords: Visual Tracking, RGB-D Perception
Abstract: We present Mono-STAR, the first real-time RGB-D 3D reconstruction system that simultaneously supports semantic fusion, fast motion tracking, non-rigid object deformation, and topological change under a unified framework. The proposed system solves a new optimization problem incorporating optical-flow-based 2D constraints to deal with fast motion and a novel semantic-aware deformation graph (SAD-graph) for handling topology change. We test the proposed system under various challenging scenes and demonstrate that it significantly outperforms existing state-of-the-art methods.
|
|
08:30-10:10, Paper TuPO1S-06.6 | Add to My Program |
DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor Fusion |
|
Nagy, Mohamed | Khalifa University Center for Autonomous Robotic Systems (KUCARS) |
Khonji, Majid | Khalifa University |
Dias, Jorge | Khalifa University |
Javed, Sajid | Khalifa University |
Keywords: Visual Tracking, Sensor Fusion, Localization
Abstract: Persistent multi-object tracking (MOT) allows autonomous vehicles to navigate safely in highly dynamic environments. One of the well-known challenges in MOT is object occlusion, when an object becomes unobservable in subsequent frames. Current MOT methods store object information, such as object trajectories, in internal memory to recover objects after occlusions. However, they retain only short-term memory to save computational time and avoid slowing down the MOT method. As a result, they lose track of objects in some occlusion scenarios, particularly long ones. In this paper, we propose DFR-FastMOT, a lightweight MOT method that uses data from camera and LiDAR sensors and relies on an algebraic formulation for object association and fusion. The formulation reduces computational time and permits long-term memory that tackles more occlusion scenarios. Our method shows outstanding tracking performance over recent learning and non-learning benchmarks, with margins of about 3% and 4% in MOTA, respectively. Also, we conduct extensive experiments that simulate the occlusion phenomenon by employing detectors with various distortion levels. The proposed solution enables superior performance under various detection distortion levels compared with current state-of-the-art methods. Our framework processes about 7,763 frames in 1.48 seconds, which is seven times faster than recent benchmarks. The framework will be available at https://github.com/MohamedNagyMostafa/DFR-FastMOT.
|
|
08:30-10:10, Paper TuPO1S-06.7 | Add to My Program |
Fusion of Events and Frames Using 8-DOF Warping Model for Robust Feature Tracking |
|
Lee, Min Seok | Seoul National University |
Kim, Ye Jun | Hyundai Motor Group |
Jung, Jae Hyung | Seoul National University |
Park, Chan Gook | Seoul National University |
Keywords: Visual Tracking, Vision-Based Navigation, Localization
Abstract: Event cameras are asynchronous neuromorphic vision sensors with high temporal resolution and no motion blur, offering advantages over standard frame-based cameras, especially under high-speed motion and high-dynamic-range conditions. However, event cameras cannot capture the overall context of the scene and produce different events for the same scenery depending on the direction of motion, creating a challenge in data association. A standard camera, on the other hand, provides frames at a fixed rate that are independent of the motion direction and rich in context. In this paper, we present a robust feature tracking method that employs an 8-DOF warping model to minimize the difference between brightness increment patches from events and frames, exploiting the complementary nature of the two data types. Unlike previous works, the proposed method enables tracking of features under complex motions accompanied by distortions. Extensive quantitative evaluation on publicly available datasets shows that our method improves on state-of-the-art methods in robustness, with greatly prolonged feature age, and in accuracy in challenging scenarios.
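An 8-DOF warp is a full projective homography. The snippet below (our own sketch with arbitrary parameter values) applies such a warp to a patch with OpenCV; the paper instead estimates the parameters by minimising the brightness-increment difference between events and frames.
```python
import numpy as np
import cv2

patch = np.random.rand(25, 25).astype(np.float32)     # stand-in image patch
H = np.array([[1.02, 0.01, -0.5],
              [-0.01, 0.98, 0.3],
              [1e-4, -2e-4, 1.0]], dtype=np.float32)   # 8 free parameters (H[2,2] fixed)
warped = cv2.warpPerspective(patch, H, (25, 25), flags=cv2.INTER_LINEAR)
print("warped patch shape:", warped.shape)
```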
|
|
08:30-10:10, Paper TuPO1S-06.8 | Add to My Program |
3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds |
|
Kini, Jyoti | University of Central Florida |
Mian, Ajmal | University of Western Australia |
Shah, Mubarak | University of Central Florida |
Keywords: Visual Tracking, Autonomous Vehicle Navigation, Human Detection and Tracking
Abstract: We propose a method for joint detection and tracking of multiple objects in 3D point clouds, a task conventionally treated as a two-step process comprising object detection followed by data association. Our method embeds both steps into a single end-to-end trainable network, eliminating the dependency on external object detectors. Our model exploits temporal information from multiple frames to detect objects and track them in a single network, making it a utilitarian formulation for real-world scenarios. Computing an affinity matrix from feature similarity across consecutive point cloud scans forms an integral part of visual tracking. We propose an attention-based refinement module that refines the affinity matrix by suppressing erroneous correspondences. The module is designed to capture the global context of the affinity matrix by employing self-attention within each affinity matrix and cross-attention across pairs of affinity matrices. Unlike competing approaches, our network does not require complex post-processing algorithms and processes raw LiDAR frames to directly output tracking results. We demonstrate the effectiveness of our method on three tracking benchmarks: JRDB, Waymo, and KITTI. Experimental evaluations indicate the ability of our model to generalize well across datasets.
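The affinity-refinement idea can be caricatured with plain scaled dot-product self-attention over the rows of an affinity matrix (a heavy simplification of the paper's module, for illustration only):
```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def refine_affinity(A):
    """Mix the rows of an N x M affinity matrix by row-to-row attention."""
    weights = softmax(A @ A.T / np.sqrt(A.shape[1]))  # row similarity scores
    return weights @ A                                 # context-mixed affinities

A = np.random.rand(5, 6)   # affinities: 5 tracked objects x 6 detections
print(refine_affinity(A).shape)
```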
|
|
TuPO1S-07 Poster Session, Room T8 |
Add to My Program |
Robot Learning |
|
|
|
08:30-10:10, Paper TuPO1S-07.1 | Add to My Program |
Inverse Reinforcement Learning Framework for Transferring Task Sequencing Policies from Humans to Robots in Manufacturing Applications |
|
Manyar, Omey Mohan | University of Southern California |
McNulty, Zachary | University of Southern California |
Nikolaidis, Stefanos | University of Southern California |
Gupta, Satyandra K. | University of Southern California |
Keywords: Learning from Demonstration, Intelligent and Flexible Manufacturing, Task Planning
Abstract: In this work, we present an inverse reinforcement learning approach for solving the problem of task sequencing for robots in complex manufacturing processes. Our proposed framework is adaptable to variations in process and can perform sequencing for entirely new parts. We prescribe an approach to capture feature interactions in a demonstration dataset based on a metric that computes feature interaction coverage. We then actively learn the expert's policy by keeping the expert in the loop. Our training and testing results reveal that our model can successfully learn the expert's policy. We demonstrate the performance of our method on a real-world manufacturing application where we transfer the policy for task sequencing to a manipulator. Our experiments show that the robot can perform these tasks to produce human-competitive performance. Code and video can be found at: https://sites.google.com/usc.edu/irlfortasksequencing
|
|
08:30-10:10, Paper TuPO1S-07.2 | Add to My Program |
Learning State Conditioned Linear Mappings for Low-Dimensional Control of Robotic Manipulators |
|
Przystupa, Michael | University of Alberta |
Johnstonbaugh, Kerrick | University of Alberta |
Zhang, Zichen | University of Alberta, Canada |
Petrich, Laura | University of Alberta |
Dehghan, Masood | University of Alberta |
Haghverd, Faezeh | University of Alberta |
Jagersand, Martin | University of Alberta |
Keywords: Representation Learning, Learning from Demonstration, Telerobotics and Teleoperation
Abstract: Identifying an appropriate task space can simplify solving robotic manipulation problems. One solution is deploying control algorithms in a learned low-dimensional action space. Linear and nonlinear action mapping methods have trade-offs between simplicity and the ability to express motor commands outside of a single low-dimensional subspace. We propose that learning local linear action representations can achieve both of these benefits. Our state-conditioned linear maps ensure that for any given state, the high-dimensional robotic actuation is linear in the low-dimensional actions. As the robot state evolves, so do the action mappings, so that necessary motions can be performed during a task. These local linear representations guarantee desirable theoretical properties by design. We validate these findings empirically through two user studies. Results suggest state-conditioned linear maps outperform conditional autoencoder and PCA baselines on a pick-and-place task and perform comparably to mode switching in a more complex pouring task.
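The core construction admits a compact sketch (ours, under stated assumptions): a map, here randomly initialised but learned in the paper, takes the state s to a matrix A(s), and the high-dimensional command is u = A(s) z, which is linear in the low-dimensional action z for any fixed state.
```python
import numpy as np

state_dim, act_dim, latent_dim = 10, 7, 2   # hypothetical dimensions
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(act_dim * latent_dim, state_dim))  # stand-in for a learned map

def action_map(s):
    """Return A(s): for a fixed state, u = A(s) @ z is linear in z."""
    return (W @ s).reshape(act_dim, latent_dim)

s = rng.normal(size=state_dim)   # current robot state
z = np.array([0.5, -0.2])        # low-dimensional user/latent action
print("7-DoF command:", action_map(s) @ z)
```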
|
|
08:30-10:10, Paper TuPO1S-07.3 | Add to My Program |
Decoupling Skill Learning from Robotic Control for Generalizable Object Manipulation |
|
Lu, Kai | University of Oxford |
Yang, Bo | The Hong Kong Polytechnic University |
Wang, Bing | University of Oxford |
Markham, Andrew | Oxford University |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Deep Learning Methods
Abstract: Recent works in robotic manipulation through reinforcement learning (RL) or imitation learning (IL) have shown potential for tackling a range of tasks e.g., opening a drawer or a cupboard. However, these techniques generalize poorly to unseen objects. We conjecture that this is due to the high-dimensional action space for joint control. In this paper, we take an alternative approach and separate the task of learning 'what to do' from 'how to do it' i.e., whole-body control. We pose the RL problem as one of determining the skill dynamics for a disembodied virtual manipulator interacting with articulated objects. The whole-body robotic kinematic control is optimized to execute the high-dimensional joint motion to reach the goals in the workspace. It does so by solving a quadratic programming (QP) model with robotic singularity and kinematic constraints. Our experiments on manipulating complex articulated objects show that the proposed approach is more generalizable to unseen objects with large intra-class variations, outperforming previous approaches. The evaluation results indicate that our approach generates more compliant robotic motion and outperforms the pure RL and IL baselines in task success rates. Additional information and videos are available at https://kl-research.github.io/decoupskill.
|
|
08:30-10:10, Paper TuPO1S-07.4 | Add to My Program |
Comparison of Model-Based and Model-Free Reinforcement Learning for Real-World Dexterous Robotic Manipulation Tasks |
|
Valencia Redrovan, David Patricio | The University of Auckland |
Jia, John | The University of Auckland |
Li, Raymond | The University of Auckland |
Hayashi, Alex | The University of Auckland |
Lecchi, Megan | The University of Auckland |
Terezakis, Reuel | University of Auckland |
Gee, Trevor | The University of Auckland |
Liarokapis, Minas | The University of Auckland |
MacDonald, Bruce | University of Auckland |
Williams, Henry | University of Auckland |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Model Free Reinforcement Learning (MFRL) has shown significant promise for learning dexterous robotic manipulation tasks, at least in simulation. However, the high number of samples, as well as the long training times, prevent MFRL from scaling to complex real-world tasks. Model-Based Reinforcement Learning (MBRL) emerges as a potential solution that, in theory, can improve the data efficiency of MFRL approaches. This could drastically reduce the training time of MFRL, and increase the application of RL for real-world robotic tasks. This article presents a study on the feasibility of using the state-of-the-art MBRL to improve the training time for two real-world dexterous manipulation tasks. The evaluation is conducted on a real low-cost robot gripper where the predictive model and the control policy are learned from scratch. The results indicate that MBRL is capable of learning accurate models of the world, but does not show clear improvements in learning the control policy in the real world as prior literature suggests should be expected.
|
|
08:30-10:10, Paper TuPO1S-07.5 | Add to My Program |
Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control |
|
Elnagdi, Murad | University of Bonn |
Dengler, Nils | University of Bonn |
de Heuvel, Jorge | University of Bonn |
Bennewitz, Maren | University of Bonn |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Optimization and Optimal Control
Abstract: Reinforcement learning (RL) has recently proven great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful training of the agent. In this paper, we therefore address the sparse reward problem in RL. Our goal is to find an effective alternative to reward shaping, without using costly human demonstrations, that would also be applicable to a wide range of domains. Hence, we propose to use model predictive control (MPC) as an experience source for training RL agents in sparse reward environments. Without the need for reward shaping, we successfully apply our approach in the field of mobile robot navigation, both in simulation and in real-world experiments with a Kobuki TurtleBot 2. We furthermore demonstrate great improvement over pure RL algorithms in terms of success rate as well as the number of collisions and timeouts. Our experiments show that MPC as an experience source improves the agent's learning process for a given task in the case of sparse rewards.
|
|
08:30-10:10, Paper TuPO1S-07.6 | Add to My Program |
Task-Driven Graph Attention for Hierarchical Relational Object Navigation |
|
Lingelbach, Michael | Stanford University |
Li, Chengshu | Stanford University |
Hwang, Minjune | Stanford University |
Kurenkov, Andrey | Stanford University |
Lou, Alan | Stanford University |
Martín-Martín, Roberto | University of Texas at Austin |
Zhang, Ruohan | Stanford University |
Fei-Fei, Li | Stanford University |
Wu, Jiajun | Stanford University |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Vision-Based Navigation
Abstract: Embodied AI agents in large scenes often need to navigate to find objects. In this work, we study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON), where the goal is to find objects specified by logical predicates organized in a hierarchical structure (objects related to furniture, and furniture to rooms), such as finding an apple on top of a table in the kitchen. Solving such a task requires an efficient representation to reason about object relations and to correlate the relations in the environment with those in the task goal. HRON in large scenes (e.g. homes) is particularly challenging due to its partial observability and long horizon, which invites solutions that can compactly store past information while effectively exploring the scene. We demonstrate experimentally that scene graphs are the best-suited representation compared to conventional representations such as images or 2D maps. We propose a solution that uses scene graphs as part of its input and integrates graph neural networks as its backbone, with an integrated task-driven attention mechanism, and demonstrate better scalability and learning efficiency than state-of-the-art baselines.
|
|
08:30-10:10, Paper TuPO1S-07.7 | Add to My Program |
Safety-Guaranteed Skill Discovery for Robot Manipulation Tasks |
|
Kim, Sunin | NAVER LABS |
Kwon, Jaewoon | NAVER LABS |
Lee, Taeyoon | Naver Labs |
Park, Younghyo | Seoul National University |
Perez, Julien | Naver Labs Europe |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Deep Learning Methods
Abstract: Programming manipulation behaviors can become increasingly difficult with a growing number and complexity of manipulation tasks, particularly in dynamic and unstructured environments. Recent progress in unsupervised skill discovery algorithms has shown great promise in learning an extensive collection of behaviors without extrinsic supervision. On the other hand, safety is one of the most critical factors for real-world robot applications. As skill discovery methods typically encourage exploratory and dynamic behaviors, it is often the case that a large portion of the learned skills remains unsafe. In this paper, we introduce the novel problem of Safety-Aware Skill Discovery, which aims to learn, in a task-agnostic fashion, a repertoire of reusable skills that are inherently safe to compose for solving downstream tasks. We present a computationally tractable algorithm that learns a latent-conditioned skill policy maximizing intrinsic rewards regularized with a safety critic that can model any user-defined safety constraints. Using the pretrained safe skill repertoire, hierarchical reinforcement learning can solve multiple downstream tasks without the need for explicit consideration of safety during training and testing. We evaluate our algorithm on a collection of force-controlled robotic manipulation tasks in simulation and show promising downstream task performance while satisfying safety constraints.
|
|
08:30-10:10, Paper TuPO1S-07.8 | Add to My Program |
A Framework for the Unsupervised Inference of Relations between Sensed Object Spatial Distributions and Robot Behaviors |
|
Morse, Christopher | University of Virginia |
Feng, Lu | University of Virginia |
Dwyer, Matthew | University of Virginia |
Elbaum, Sebastian | University of Virginia |
Keywords: Formal Methods in Robotics and Automation, Software, Middleware and Programming Environments, Software Tools for Robot Programming
Abstract: The spatial distribution of sensed objects strongly influences the behavior of mobile robots. Yet, as robots evolve in complexity to operate in increasingly rich environments, it becomes much more difficult to specify the underlying relations between sensed object spatial distributions and robot behaviors. We aim to address this challenge by leveraging system trace data to automatically infer relations that help to better characterize these spatial associations. In particular, we introduce SpRInG, a framework for the unsupervised inference of system specifications from traces that characterize the spatial relationships under which a robot operates. Our method builds on a parameterizable notion of reachability to encode relationships of spatial neighborship, which are used to instantiate a language of patterns. These patterns provide the structure to infer, from system traces, the connection between such relationships and robot behaviors. We show that SpRInG can automatically infer spatial relations over two distinct domains: autonomous vehicles in traffic and a surgical robot. Our results demonstrate the power and expressiveness of SpRInG, in its ability to learn existing specifications as machine-checkable first-order logic, uncover previously unstated specifications that are rich and insightful, and reveal contextual differences between executions.
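As a toy illustration of the kind of reachability-based spatial relation SpRInG mines (not the tool itself), the following sketch checks one candidate relation over a trace: objects within a parameterizable radius imply a speed bound. All names and thresholds are illustrative.

```python
# Illustrative sketch of mining a spatial pattern from system traces:
# "whenever an object is within reach distance d, speed stays below v_max".
def neighbors(robot_pos, objects, d):
    """Objects within a parameterizable reachability radius d."""
    return [o for o in objects
            if sum((a - b) ** 2 for a, b in zip(robot_pos, o)) ** 0.5 <= d]

def pattern_holds(trace, d=2.0, v_max=0.5):
    """Check the candidate relation on one trace; a miner would keep the
    tightest (d, v_max) for which all traces satisfy it."""
    return all(state["speed"] <= v_max
               for state in trace
               if neighbors(state["pos"], state["objects"], d))

trace = [{"pos": (0, 0), "objects": [(1, 1)], "speed": 0.3},
         {"pos": (1, 0), "objects": [(5, 5)], "speed": 1.2}]
print(pattern_holds(trace))  # True: no object within 2.0 at the fast step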
|
|
08:30-10:10, Paper TuPO1S-07.9 | Add to My Program |
Learning Video-Conditioned Policies for Unseen Manipulation Tasks |
|
Chane-Sane, Elliot | Inria PARIS |
Schmid, Cordelia | Inria |
Laptev, Ivan | INRIA |
Keywords: Computer Vision for Automation, Imitation Learning, Machine Learning for Robot Control
Abstract: The ability of a non-expert user to specify robot commands is critical for building generalist agents capable of solving a large variety of tasks. One convenient way to specify the intended robot goal is by a video of a person demonstrating the target task. While prior work typically aims to imitate human demonstrations performed in robot environments, here we focus on a more realistic and challenging setup with demonstrations recorded in natural and diverse human environments. We propose Video-conditioned Policy learning (ViP), a data-driven approach that maps human demonstrations of previously unseen tasks to robot manipulation skills. To this end, we train our policy to generate appropriate actions given current scene observations and a video of the target task. To encourage generalization to new tasks, we avoid particular tasks during training and learn our policy from unlabelled robot trajectories and corresponding robot videos. Both robot and human videos in our framework are represented by video embeddings pre-trained for human action recognition. At test time we first translate human videos to robot videos in the common video embedding space, and then use the resulting embeddings to condition our policies. Notably, our approach enables robot control from human demonstrations in a zero-shot manner, i.e., without using robot trajectories paired with human instructions during training. We validate our approach on a set of challenging multi-task robot manipulation environments and outperform the state of the art. Our method also demonstrates excellent performance in a new challenging zero-shot setup where no paired data is used during training.
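The test-time translation step can be sketched as a nearest-neighbour mapping in the shared embedding space; the actual translation mechanism in ViP may differ, and all names below are illustrative.

```python
# Hedged sketch: map a human video embedding into the robot-video
# embedding space via nearest neighbours over unlabelled robot videos,
# then condition the policy on the result.
import numpy as np

def translate_to_robot(human_emb, robot_embs, k=3):
    """Average of the k robot-video embeddings closest (cosine similarity)
    to the human demonstration; embeddings are assumed L2-normalized."""
    sims = robot_embs @ human_emb
    idx = np.argsort(sims)[-k:]
    v = robot_embs[idx].mean(axis=0)
    return v / np.linalg.norm(v)

robot_embs = np.random.randn(100, 64)
robot_embs /= np.linalg.norm(robot_embs, axis=1, keepdims=True)
human_emb = robot_embs[0]                 # stand-in demonstration embedding
goal = translate_to_robot(human_emb, robot_embs)
# a policy of the form pi(action | observation, goal) consumes `goal`
```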
|
|
08:30-10:10, Paper TuPO1S-07.10 | Add to My Program |
Learning Food Picking without Food: Fracture Anticipation by Breaking Reusable Fragile Objects |
|
Yagawa, Rinto | Keio University |
Ishikawa, Reina | Keio University |
Hamaya, Masashi | OMRON SINIC X Corporation |
Tanaka, Kazutoshi | OMRON SINIC X Corporation |
Hashimoto, Atsushi | Omron Sinic X |
Saito, Hideo | Keio University |
Keywords: Agricultural Automation, Perception for Grasping and Manipulation, Domestic Robotics
Abstract: Food picking is trivial for humans but not for robots, as foods are fragile. Presetting foods' physical properties does not help robots much due to the objects' inter- and intra-category diversity. A recent study proved that learning-based fracture anticipation with tactile sensors could overcome this problem; however, the method trains the model for each food to deal with intra-category differences, and tuning robots for each food leads to an undesirable amount of food consumption. This study proposes a novel framework for learning food-picking tasks without consuming foods. The key idea is to leverage the object-breaking experiences of several reusable fragile objects instead of consuming real foods while making the picking ability object-invariant with domain generalization (DG). In real-robot experiments, we trained a model with reusable objects (toy blocks, ping-pong balls, and jellies), which are selected by three typical fracture types (crack, rupture, and crush). We then tested the model with four real food objects (tofu, bananas, potato chips, and tomatoes). The results showed that the proposed combination of reusable objects' breaking experiences and DG is effective for the food-picking task.
|
|
08:30-10:10, Paper TuPO1S-07.11 | Add to My Program |
Learning Risk-Aware Costmaps Via Inverse Reinforcement Learning for Off-Road Navigation |
|
Triest, Samuel | Carnegie Mellon University |
Guaman Castro, Mateo | Carnegie Mellon University |
Maheshwari, Parv | Indian Institute of Technology Kharagpur |
Sivaprakasam, Matthew | Carnegie Mellon University |
Wang, Wenshan | Carnegie Mellon University |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Field Robots, Learning from Demonstration, Reinforcement Learning
Abstract: The process of designing costmaps for off-road driving tasks is often a challenging and engineering-intensive task. Recent work in costmap design for off-road driving focuses on training deep neural networks to predict costmaps from sensory observations using corpora of expert driving data. However, such approaches are generally subject to overconfident mis-predictions and are rarely evaluated in-the-loop on physical hardware. We present an inverse reinforcement learning-based method of efficiently training deep cost functions that are uncertainty-aware. We do so by leveraging recent advances in highly parallel model-predictive control and robotic risk estimation. In addition to demonstrating improvement at reproducing expert trajectories, we also evaluate the efficacy of these methods in challenging off-road navigation scenarios. We observe that our method significantly outperforms a geometric baseline, resulting in 44% improvement in expert path reconstruction and 57% fewer interventions in practice. We also observe that varying the risk tolerance of the vehicle results in qualitatively different navigation behaviors, especially with respect to higher-risk scenarios such as slopes and tall grass.
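One widely used highly parallel MPC method is MPPI; the sketch below shows a single MPPI update evaluating sampled rollouts under a learned cost, as a hedged illustration of the controller-in-the-loop evaluation described (the paper's exact controller may differ). `dynamics` and `costmap` are illustrative stand-ins.

```python
# A minimal MPPI-style control update under a learned, risk-aware cost.
import numpy as np

def mppi_step(x0, U, dynamics, costmap, n_samples=128, sigma=0.3, lam=1.0):
    """One update of the nominal control sequence U (shape H x u_dim)."""
    H = U.shape[0]
    noise = np.random.randn(n_samples, H, U.shape[1]) * sigma
    costs = np.zeros(n_samples)
    for i in range(n_samples):
        x = x0
        for t in range(H):
            x = dynamics(x, U[t] + noise[i, t])
            costs[i] += costmap(x)            # learned cost evaluated in-the-loop
    w = np.exp(-(costs - costs.min()) / lam)  # softmin weights over rollouts
    w /= w.sum()
    return U + np.einsum("i,ihk->hk", w, noise)  # reweighted control update
```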
|
|
08:30-10:10, Paper TuPO1S-07.12 | Add to My Program |
How Does It Feel? Self-Supervised Costmap Learning for Off-Road Vehicle Traversability |
|
Guaman Castro, Mateo | Carnegie Mellon University |
Triest, Samuel | Carnegie Mellon University |
Wang, Wenshan | Carnegie Mellon University |
Gregory, Jason M. | US Army Research Laboratory |
Sanchez, Felix | Booz Allen Hamilton |
Rogers III, John G. | US Army Research Laboratory |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Deep Learning Methods, Field Robots, Vision-Based Navigation
Abstract: Estimating terrain traversability in off-road environments requires reasoning about complex interaction dynamics between the robot and these terrains. However, it is challenging to create informative labels to learn a model in a supervised manner for these interactions. We propose a method that learns to predict traversability costmaps by combining exteroceptive environmental information with proprioceptive terrain interaction feedback in a self-supervised manner. Additionally, we propose a novel way of incorporating robot velocity into the costmap prediction pipeline. We validate our method in multiple short-scale and large-scale navigation tasks on challenging off-road terrains using two different large, all-terrain robots. Our short-scale navigation results show that using our learned costmaps leads to overall smoother navigation and provides the robot with a more fine-grained understanding of the robot-terrain interactions. Our large-scale navigation trials show that we can reduce the number of interventions by up to 57% compared to an occupancy-based navigation baseline in challenging off-road courses ranging from 400 m to 3150 m. The appendix and full experiment videos can be found on our website: https://mateoguaman.github.io/hdif.
|
|
TuPO1S-08 Poster Session, Room T8 |
Add to My Program |
Learning for Control I |
|
|
|
08:30-10:10, Paper TuPO1S-08.1 | Add to My Program |
Global and Reactive Motion Generation with Geometric Fabric Command Sequences |
|
Zhi, Weiming | University of Sydney |
Akinola, Iretiayo | Columbia University |
Van Wyk, Karl | NVIDIA |
Ratliff, Nathan | NVIDIA |
Ramos, Fabio | University of Sydney, NVIDIA |
Keywords: Machine Learning for Robot Control, Optimization and Optimal Control
Abstract: Motion generation seeks to produce safe and feasible robot motion from start to goal. Various tools at different levels of granularity have been developed. On one extreme, sampling-based motion planners focus on completeness -- a solution, if it exists, would eventually be found. However, produced paths are often of low quality and contain superfluous motion. On the other, reactive methods optimise the immediate cost to obtain the next controls, producing smooth and legible motion that can quickly adapt to perturbations, uncertainties, and changes in the environment. However, reactive methods are highly local and often produce motion that becomes trapped in non-convex regions of the environment. This paper contributes Geometric Fabric Command Sequences, a method that lies in the middle ground. It can produce globally optimal motion that is smooth and intuitive, while also being reactive. We model motion via a reactive Geometric Fabric policy that ingests a sequence of attractor states, or commands, and then apply global optimisation over the space of commands. We postulate that solutions for different problems and scenes are highly transferable when conditioned on environmental features. Therefore, an implicit generative model is trained on solutions from optimisation and environment features in a self-supervised manner. That is, faced with multiple motion generation problems, the learning and optimisation are contained within the same loop: the optimisation generates labels for learning, while the learning improves the optimisation for the next problem, which in turn provides higher quality labels. We empirically validate our method in both simulation and on a real-world 6-DOF JACO arm.
|
|
08:30-10:10, Paper TuPO1S-08.2 | Add to My Program |
Enforcing the Consensus between Trajectory Optimization and Policy Learning for Precise Robot Control |
|
Le Lidec, Quentin | INRIA-ENS-PSL |
Jallet, Wilson | LAAS-CNRS |
Laptev, Ivan | INRIA |
Schmid, Cordelia | Inria |
Carpentier, Justin | INRIA |
Keywords: Machine Learning for Robot Control, Optimization and Optimal Control
Abstract: Reinforcement learning (RL) and trajectory optimization (TO) present strong complementary advantages. On one hand, RL approaches are able to learn global control policies directly from data, but generally require large sample sizes to properly converge towards feasible policies. On the other hand, TO methods are able to exploit gradient-based information extracted from simulators to quickly converge towards a locally optimal control trajectory which is only valid within the vicinity of the solution. Over the past decade, several approaches have aimed to adequately combine the two classes of methods in order to obtain the best of both worlds. Following on from this line of research, we propose several improvements on top of these approaches to learn global control policies more quickly, notably by leveraging sensitivity information stemming from TO methods via Sobolev learning, and Augmented Lagrangian (AL) techniques to enforce the consensus between TO and policy learning. We evaluate the benefits of these improvements on various classical tasks in robotics through comparison with existing approaches in the literature.
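Sobolev learning, as referenced above, fits a network to both target values and target gradients. A minimal PyTorch sketch with illustrative shapes, where `u_star` and `dudx_star` stand in for TO's optimal controls and their sensitivities with respect to the state:

```python
# Sketch of a Sobolev loss: match TO's controls and their state gradients.
import torch

def sobolev_loss(policy, x, u_star, dudx_star, w=0.1):
    x = x.requires_grad_(True)
    u = policy(x)
    value_term = ((u - u_star) ** 2).mean()
    # gradient of each control dimension w.r.t. the state,
    # compared against TO's sensitivities
    grads = [torch.autograd.grad(u[:, i].sum(), x, create_graph=True)[0]
             for i in range(u.shape[1])]
    dudx = torch.stack(grads, dim=1)           # (batch, u_dim, x_dim)
    grad_term = ((dudx - dudx_star) ** 2).mean()
    return value_term + w * grad_term

policy = torch.nn.Sequential(torch.nn.Linear(4, 32), torch.nn.Tanh(),
                             torch.nn.Linear(32, 2))
x = torch.randn(8, 4)
loss = sobolev_loss(policy, x, torch.zeros(8, 2), torch.zeros(8, 2, 4))
loss.backward()
```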
|
|
08:30-10:10, Paper TuPO1S-08.3 | Add to My Program |
Neural Optimal Control Using Learned System Dynamics |
|
Engin, Kazim Selim | University of Minnesota |
Isler, Volkan | University of Minnesota |
Keywords: Machine Learning for Robot Control, Model Learning for Control, Optimization and Optimal Control
Abstract: We study the problem of generating control laws for systems with unknown dynamics. Our approach is to represent the controller and the value function with neural networks, and to train them using loss functions adapted from the Hamilton-Jacobi-Bellman (HJB) equations. In the absence of a known dynamics model, our method first learns the state transitions from data collected by interacting with the system in an offline process. The learned transition function is then integrated into the HJB equations and used to forward simulate the control signals produced by our controller in a feedback loop. In contrast to trajectory optimization methods that optimize the controller for a single initial state, our controller can generate near-optimal control signals for initial states from a large portion of the state space. Compared to recent model-based reinforcement learning algorithms, we show that our method is more sample efficient and trains faster by an order of magnitude. We demonstrate our method in a number of tasks, including the control of a quadrotor with 12 state variables.
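The HJB-based training signal can be illustrated as a residual penalty. A hedged sketch using one common continuous-time form of the condition, r(x, u) + dV/dx . f(x, u) = 0 along optimal controls, with a toy single-integrator standing in for the learned dynamics:

```python
# Sketch of an HJB-style residual loss with learned dynamics f,
# running cost r, value network V, and controller pi.
import torch

def hjb_residual_loss(V, pi, f, r, x):
    x = x.requires_grad_(True)
    u = pi(x)
    dVdx = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
    residual = r(x, u) + (dVdx * f(x, u)).sum(dim=-1)
    return (residual ** 2).mean()

# toy single-integrator with quadratic cost; f would be learned in practice
f = lambda x, u: u
r = lambda x, u: (x ** 2).sum(-1) + (u ** 2).sum(-1)
V = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                        torch.nn.Linear(32, 1))
pi = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                         torch.nn.Linear(32, 2))
loss = hjb_residual_loss(V, pi, f, r, torch.randn(16, 2))
loss.backward()
```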
|
|
08:30-10:10, Paper TuPO1S-08.4 | Add to My Program |
Learned Risk Metric Maps for Kinodynamic Systems |
|
Allen, Ross | MIT Lincoln Laboratory |
Xiao, Wei | MIT |
Rus, Daniela | MIT |
Keywords: Robot Safety, Machine Learning for Robot Control, Collision Avoidance
Abstract: We present Learned Risk Metric Maps (LRMM) for real-time estimation of coherent risk metrics of high-dimensional dynamical systems operating in unstructured, partially observed environments. LRMM models are simple to design and train---requiring only procedural generation of obstacle sets, state and control sampling, and supervised training of a function approximator---which makes them broadly applicable to arbitrary system dynamics and obstacle sets. In a parallel autonomy setting, we demonstrate the model's ability to rapidly infer collision probabilities of a fast-moving car-like robot driving recklessly in an obstructed environment; allowing the LRMM agent to intervene, take control of the vehicle, and avoid collisions. In this time-critical scenario, we show that LRMMs can evaluate risk metrics 20-100x faster than alternative safety algorithms based on control barrier functions (CBFs) and Hamilton-Jacobi reachability (HJ-reach), leading to 5-15% fewer obstacle collisions by the LRMM agent than CBFs and HJ-reach. This performance improvement comes in spite of the fact that the LRMM model only has access to local/partial observation of obstacles, whereas the CBF and HJ-reach agents are granted privileged/global information. We also show that our model can be equally well trained on a 12-dimensional quadrotor system operating in an obstructed indoor environment. All software for training and experiments is provided at https://github.com/mit-drl/pyrmm.
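The training recipe summarized above (procedural obstacle generation, state sampling, supervised risk labels) can be sketched as follows; the rollout, dimensions, and thresholds are illustrative stand-ins, not the paper's.

```python
# Sketch of LRMM-style data generation: procedurally generate obstacle
# sets, sample states, label each with an empirical collision risk,
# then fit a function approximator to (features, risk) pairs.
import numpy as np

def toy_rollout(state, obstacles, steps=10, dt=0.1):
    """Stand-in stochastic rollout; returns 1.0 on collision, else 0.0."""
    p = state[:2].copy()
    for _ in range(steps):
        p += np.random.randn(2) * dt
        if np.min(np.linalg.norm(obstacles - p, axis=1)) < 0.3:
            return 1.0
    return 0.0

def sample_risk_label(state, obstacles, n=32):
    """Monte-Carlo collision probability from short forward rollouts."""
    return np.mean([toy_rollout(state, obstacles) for _ in range(n)])

def make_dataset(n_envs=50, states_per_env=20):
    X, y = [], []
    for _ in range(n_envs):
        obstacles = np.random.uniform(-5, 5, size=(8, 2))  # procedural obstacles
        for _ in range(states_per_env):
            s = np.random.uniform(-5, 5, size=4)           # (x, y, v, heading)
            X.append(np.concatenate([s, obstacles.ravel()]))
            y.append(sample_risk_label(s, obstacles))
    return np.array(X), np.array(y)  # then: supervised regression on (X, y)
```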
|
|
08:30-10:10, Paper TuPO1S-08.5 | Add to My Program |
Autonomous Drifting with 3 Minutes of Data Via Learned Tire Models |
|
Djeumou, Franck | University of Texas, Austin |
Goh, Jon | Toyota Research Institute |
Topcu, Ufuk | The University of Texas at Austin |
Balachandran, Avinash | Toyota Research Institute |
Keywords: Model Learning for Control, Machine Learning for Robot Control, Intelligent Transportation Systems
Abstract: Near the limits of adhesion, the forces generated by a tire are nonlinear and intricately coupled. Efficient and accurate modelling in this region could improve safety, especially in emergency situations where high forces are required. To this end, we propose a novel family of tire force models based on neural ordinary differential equations and a neural-ExpTanh parameterization. These models are designed to satisfy physically insightful assumptions while also having sufficient fidelity to capture higher-order effects directly from vehicle state measurements. They are used as drop-in replacements for an analytical brush tire model in an existing nonlinear model predictive control framework. Experiments with a customized Toyota Supra show that a scarce amount of driving data -- less than three minutes -- is sufficient to achieve high-performance autonomous drifting on various trajectories with speeds up to 45 mph. Comparisons with the benchmark model show a fourfold improvement in tracking performance, smoother control inputs, and faster and more consistent computation times.
|
|
08:30-10:10, Paper TuPO1S-08.6 | Add to My Program |
DDK: A Deep Koopman Approach for Longitudinal and Lateral Control of Autonomous Ground Vehicles |
|
Xiao, Yongqian | National University of Defense Technology |
Zhang, Xinglong | National University of Defense Technology |
Xu, Xin | National University of Defense Technology |
Yang, Lu | National University of Defense Technology |
Li, Junxiang | National University of Defense Technology |
Keywords: Model Learning for Control, Deep Learning Methods, Motion Control
Abstract: Autonomous driving has attracted considerable attention in recent years. For some tasks, e.g., trajectory prediction, motion planning, and trajectory tracking, an accurate vehicle model can reduce the difficulty of the task and improve completion performance. Prior works focused on parameter estimation for physical models or on modeling nonlinear dynamics with neural networks. However, these methods either rely on vehicles' internal parameters or are ill-suited to control design due to the strong nonlinearity of the models. This paper proposes a data-driven method to approximate vehicle dynamics based on the Koopman operator. The resulting model is an interpretable linear time-invariant model, facilitating controller design and the solution of related optimization problems. In the proposed approach, the state transition matrix is constructed from the learned Koopman eigenvalues, while the input matrix is trained as a tensor. Based on the resulting model, a linear model predictive controller is designed to implement coupled longitudinal and lateral trajectory tracking. Simulations and experiments, including vehicle dynamics modeling and coupled longitudinal and lateral trajectory tracking, are performed in a high-fidelity CarSim environment and on a real vehicle platform. A fuel-powered D-class SUV is used in the simulation, while a real electric SUV is utilized in the experiment. The results illustrate that the nonlinear vehicle dynamics can be identified effectively via the proposed method, and that high-quality trajectory tracking performance can be obtained with the resulting model.
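The structure of such a model can be illustrated with a toy example: a linear rollout in lifted (Koopman) coordinates, with the state-transition matrix assembled from eigenvalue pairs. The damped-oscillator block construction below is an assumption for illustration, not necessarily the paper's parameterization.

```python
# Illustrative structure: a linear time-invariant system in lifted
# coordinates, z' = A z + B u, with A assembled from eigenvalue pairs
# and B a trained matrix.
import numpy as np

def koopman_A(eig_real, eig_imag, dt=0.02):
    """Block-diagonal transition matrix from complex eigenvalue pairs."""
    n = 2 * len(eig_real)
    A = np.zeros((n, n))
    for i, (a, b) in enumerate(zip(eig_real, eig_imag)):
        rho, theta = np.exp(a * dt), b * dt
        A[2*i:2*i+2, 2*i:2*i+2] = rho * np.array(
            [[np.cos(theta), -np.sin(theta)],
             [np.sin(theta),  np.cos(theta)]])
    return A

A = koopman_A([-0.5, -1.0], [3.0, 7.0])   # 4-dim lifted state
B = np.random.randn(4, 2) * 0.1           # trained from data in the real method
z = np.zeros(4); z[0] = 1.0
for u in np.zeros((10, 2)):               # linear rollout in lifted coordinates
    z = A @ z + B @ u
```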
|
|
08:30-10:10, Paper TuPO1S-08.7 | Add to My Program |
Meta-Learning-Based Optimal Control for Soft Robotic Manipulators to Interact with Unknown Environments |
|
Tang, Zhiqiang | National University of Singapore |
Wang, Peiyi | Beijing Jiaotong University |
Xin, Wenci | National University of Singapore |
Xie, Zhexin | National University of Singapore |
Kan, Longxin | National University of Singapore |
Mohanakrishnan, Muralidharan | National University of Singapore |
Laschi, Cecilia | National University of Singapore |
Keywords: Modeling, Control, and Learning for Soft Robots, Physical Human-Robot Interaction, Force Control
Abstract: Safe and efficient robot-environment interaction is a critical but challenging problem as robots are increasingly employed in unstructured and unpredictable environments. Soft robots are inherently compliant and can safely interact with environments, but their high nonlinearity exacerbates control difficulties. Meta-learning provides a powerful tool for fast online model adaptation because it can learn an efficient model from data across different environments. This work therefore applies meta-learning to the control of soft robots. In particular, a target-oriented proactive search strategy is first performed to efficiently collect environment-specific data when a new interaction environment occurs. Meta-learning then exploits past experience to train a data-driven probabilistic model prior, and the prior is updated online to adapt quickly to the new environment. Lastly, a model-based optimal control policy drives the robot to the desired performance. Our approach controls a soft robotic manipulator to achieve a desired position and contact force simultaneously when interacting with unknown, changing environments. Experimental results demonstrate tracking errors within 1 mm in position and 0.01 N in contact force. Overall, this work provides a viable control approach for soft robots interacting with unknown environments.
|
|
08:30-10:10, Paper TuPO1S-08.8 | Add to My Program |
Dealing with Sparse Rewards in Continuous Control Robotics Via Heavy-Tailed Policy Optimization |
|
Chakraborty, Souradip | University of Maryland |
Bedi, Amrit Singh | University of Maryland, College Park |
Kulathun Mudiyanselage, Kasun Weerakoon | University of Maryland, College Park |
Poddar, Prithvi | IISER Bhopal |
Koppel, Alec | JP Morgan Chase |
Tokekar, Pratap | University of Maryland |
Manocha, Dinesh | University of Maryland |
Keywords: AI-Based Methods, AI-Enabled Robotics
Abstract: In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-SPG) algorithm to deal with the challenges of sparse rewards in continuous control problems. Sparse rewards are common in continuous control robotics tasks such as manipulation and navigation, and make the learning problem hard due to the non-trivial estimation of value functions over the state space. This typically demands either reward shaping or expert demonstrations for the sparse reward environment. However, obtaining high-quality demonstrations is quite expensive and sometimes even impossible. We propose a heavy-tailed policy parametrization along with a modified momentum-based policy gradient tracking scheme (HT-SPG) to induce stable exploratory behavior in the algorithm. The proposed algorithm does not require access to expert demonstrations. We test the performance of HT-SPG on various benchmark tasks of continuous control with sparse rewards, such as 1D Mario, Pathological Mountain Car, Sparse Pendulum in OpenAI Gym, and Sparse MuJoCo environments (Hopper-v2, Half-Cheetah, Walker-2D). We show consistent performance improvements across all tasks in terms of high average cumulative reward without requiring access to expert demonstrations. We further demonstrate that a navigation policy trained using HT-SPG can be easily transferred to a Clearpath Husky robot to perform real-world navigation tasks.
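The exploration intuition can be seen directly by contrasting a Gaussian policy head with a heavy-tailed one; the Cauchy distribution below is a representative heavy-tailed choice for illustration, not necessarily the paper's exact parametrization.

```python
# Heavier tails yield occasional large exploratory actions, the intuition
# behind heavy-tailed policies in sparse-reward settings.
import torch

mu = torch.zeros(1000)
gaussian_actions = torch.distributions.Normal(mu, 1.0).sample()
cauchy_actions = torch.distributions.Cauchy(mu, 1.0).sample()
print(gaussian_actions.abs().max())   # rarely much above ~4
print(cauchy_actions.abs().max())     # routinely far larger: heavy tails
```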
|
|
08:30-10:10, Paper TuPO1S-08.9 | Add to My Program |
MPC with Sensor-Based Online Cost Adaptation |
|
Meduri, Avadesh | New York University |
Zhu, Huaijiang | New York University |
Jordana, Armand | NYU |
Righetti, Ludovic | New York University |
Keywords: Optimization and Optimal Control, Sensor-based Control, Machine Learning for Robot Control
Abstract: Model predictive control is a powerful tool to generate complex motions for robots. However, it often requires solving non-convex problems online to produce rich behaviors, which is computationally expensive and not always practical in real time. Additionally, direct integration of high dimensional sensor data (e.g. RGB-D images) in the feedback loop is challenging with current state-space methods. This paper aims to address both issues. It introduces a model predictive control scheme in which a neural network constantly updates the cost function of a quadratic program based on sensory inputs, aiming to minimize a general non-convex task loss without solving a non-convex problem online. By updating the cost, the robot is able to adapt to changes in the environment directly from sensor measurements without requiring a new cost design. Furthermore, since the quadratic program can be solved efficiently with hard constraints, safe deployment on the robot is ensured. Experiments with a wide variety of reaching tasks on an industrial robot manipulator demonstrate that our method can efficiently solve complex non-convex problems with high-dimensional visual sensory inputs, while still being robust to external disturbances.
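The control scheme can be sketched in miniature: a network maps sensor features to the cost of a strictly convex quadratic program, which is then minimized each cycle. The closed-form unconstrained solve below replaces the paper's constrained QP for brevity; `cost_network` is a hypothetical stand-in.

```python
# Sketch: a learned mapping from sensor input to the QP's linear cost
# term, minimized in closed form each control cycle.
import numpy as np

H = np.diag([2.0, 2.0])                 # fixed quadratic cost Hessian

def cost_network(sensor_feat):
    """Hypothetical stand-in for the learned sensor-to-cost mapping."""
    return np.tanh(sensor_feat[:2]) * 0.5

def control_step(sensor_feat):
    g = cost_network(sensor_feat)
    return -np.linalg.solve(H, g)       # argmin_u 0.5 u^T H u + g^T u

u = control_step(np.array([0.3, -1.2, 0.7]))
```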
|
|
08:30-10:10, Paper TuPO1S-08.10 | Add to My Program |
ReachLipBnB: A Branch-And-Bound Method for Reachability Analysis of Neural Network Autonomous Systems Using Lipschitz Bounds |
|
Entesari, Taha | Johns Hopkins University |
Sharifi, Sina | Johns Hopkins University |
Fazlyab, Mahyar | Johns Hopkins University |
Keywords: Formal Methods in Robotics and Automation, Optimization and Optimal Control, Machine Learning for Robot Control
Abstract: We propose a novel branch-and-bound method for reachability analysis of neural networks. Our idea is to first compute accurate bounds on the Lipschitz constant of the neural network in specific directions of interest offline using a convex program. We then use these computations to obtain an instantaneous but conservative polyhedral approximation of the reachable set online using Lipschitz continuity arguments. To reduce conservatism, we incorporate our bounding algorithm within a branching strategy to decrease the over-approximation error to an arbitrary accuracy. We then extend our method to reachability analysis of control systems with neural network controllers. Finally, to capture the shape of the reachable sets as accurately as possible, we use sample trajectories to inform the directions of the reachable set over-approximations using Principal Component Analysis (PCA). We evaluate the performance of the proposed method in several open-loop and closed-loop settings.
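The core bounding argument can be reproduced in a toy 1-D setting: if L bounds the Lipschitz constant of x -> c^T f(x), the value over an input interval can deviate from its midpoint value by at most L times the radius, and bisection tightens the bound. A sketch under these assumptions:

```python
# Toy 1-D version of Lipschitz-based bounding with interval bisection.
import numpy as np

def directional_bound(f, c, x_mid, radius, L):
    """Conservative interval for c^T f(x) over |x - x_mid| <= radius,
    given L >= Lipschitz constant of x -> c^T f(x)."""
    mid = float(c @ f(x_mid))
    return mid - L * radius, mid + L * radius

def branch_and_bound(f, c, x_mid, radius, L, depth=4):
    """Bisect the input interval to shrink the over-approximation."""
    if depth == 0:
        return directional_bound(f, c, x_mid, radius, L)
    l1, h1 = branch_and_bound(f, c, x_mid - radius / 2, radius / 2, L, depth - 1)
    l2, h2 = branch_and_bound(f, c, x_mid + radius / 2, radius / 2, L, depth - 1)
    return min(l1, l2), max(h1, h2)

f = lambda x: np.array([np.sin(x), np.cos(x)])   # stand-in network output
c = np.array([1.0, 0.0])
print(branch_and_bound(f, c, x_mid=0.0, radius=1.0, L=1.0))
```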
|
|
08:30-10:10, Paper TuPO1S-08.11 | Add to My Program |
Gradient-Based Trajectory Optimization with Learned Dynamics |
|
Sukhija, Bhavya | ETH Zürich |
Köhler, Nathanael | ETH Zürich |
Zamora Mora, Miguel Angel | ETH Zurich |
Zimmermann, Simon | ETH Zurich |
Curi, Sebastian | ETH Zürich |
Coros, Stelian | ETH Zurich |
Krause, Andreas | ETH Zurich |
Keywords: Machine Learning for Robot Control, Motion and Path Planning
Abstract: Trajectory optimization methods have achieved an exceptional level of performance on real-world robots in recent years. These methods heavily rely on accurate analytical models of the dynamics, yet some aspects of the physical world can only be captured to a limited extent. An alternative approach is to leverage machine learning techniques to learn a differentiable dynamics model of the system from data. In this work, we use trajectory optimization and model learning for performing highly dynamic and complex tasks with robotic systems in the absence of accurate analytical models of the dynamics. We show that a neural network can model highly nonlinear behaviors accurately over large time horizons, from data collected in only 25 minutes of interactions on two distinct robots: (i) the Boston Dynamics Spot and (ii) an RC car. Furthermore, we use the gradients of the neural network to perform gradient-based trajectory optimization. In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and the radio-controlled (RC) car, and gives good performance in combination with trajectory optimization methods.
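The optimization loop described, differentiating a task loss through a learned dynamics model with respect to the control sequence, can be sketched compactly in PyTorch (an untrained toy network stands in for the learned model):

```python
# Sketch: roll a control sequence through a differentiable learned model
# and improve it by gradient descent on a task loss.
import torch

dyn = torch.nn.Sequential(torch.nn.Linear(4 + 2, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 4))   # learned x' = f(x, u)
U = torch.zeros(30, 2, requires_grad=True)          # decision variables
opt = torch.optim.Adam([U], lr=0.05)
x0 = torch.zeros(4)
x_goal = torch.tensor([1.0, 1.0, 0.0, 0.0])

for _ in range(100):
    x = x0
    cost = 0.0
    for u in U:
        x = dyn(torch.cat([x, u]))                  # differentiable rollout
        cost = cost + 1e-3 * (u ** 2).sum()         # control effort
    cost = cost + ((x - x_goal) ** 2).sum()         # terminal goal cost
    opt.zero_grad(); cost.backward(); opt.step()
```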
|
|
08:30-10:10, Paper TuPO1S-08.12 | Add to My Program |
RAMP-Net: A Robust Adaptive MPC for Quadrotors Via Physics-Informed Neural Network |
|
Sanyal, Sourav | Purdue University |
Roy, Kaushik | Purdue University |
Keywords: Machine Learning for Robot Control, Deep Learning Methods, Model Learning for Control
Abstract: Model Predictive Control (MPC) is a state-of-the-art (SOTA) control technique which requires iteratively solving hard constrained optimization problems. In the event of uncertain dynamics (typically encountered in real life), analytical model-based MPC requires setting conservative bounds on disturbances to obtain robust controllers. This, however, increases the hardness of the problem, as more constraints must be satisfied. The problem is exacerbated in performance-critical applications, where more computation is required in less time. Data-driven regression methods such as neural networks have been proposed in the past to approximate system dynamics. However, such models rely on high volumes of labeled data in the absence of symbolic analytical priors, which incurs non-trivial training overheads. Physics-informed Neural Networks (PINNs) have gained traction for approximating non-linear systems of ordinary differential equations (ODEs) with reasonable accuracy. In this work, we propose a Robust Adaptive MPC framework via PINNs (RAMP-Net), which uses a neural network trained partly from simple ODEs and partly from data. A physics loss is used to learn simple ODEs representing ideal dynamics. Having access to analytical functions inside the loss function acts as a regularizer, enforcing robust behavior for parametric uncertainties. On the other hand, a regular data loss is used to adapt to residual disturbances (non-parametric uncertainties) unaccounted for during mathematical modelling. Experiments are performed in a simulated environment for trajectory tracking of a quadrotor. We report 7.8% to 43.2% and 8.04% to 61.5% reductions in tracking errors for speeds ranging from 0.5 to 1.75 m/s compared to two SOTA regression-based MPC methods.
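The two-term objective described above can be sketched with a toy ODE, dx/dt = -x, standing in for the quadrotor dynamics; the weighting and network sizes are illustrative.

```python
# Sketch of a PINN-style objective: a physics residual on an idealized
# ODE plus a data term on measured trajectories.
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))   # approximates x(t)

def physics_loss(t):
    t = t.requires_grad_(True)
    x = net(t)
    dxdt = torch.autograd.grad(x.sum(), t, create_graph=True)[0]
    return ((dxdt + x) ** 2).mean()                 # residual of dx/dt = -x

def data_loss(t_meas, x_meas):
    return ((net(t_meas) - x_meas) ** 2).mean()     # fits residual disturbances

t_col = torch.rand(64, 1)                           # collocation points
t_meas = torch.rand(16, 1)
x_meas = torch.exp(-t_meas) + 0.01 * torch.randn(16, 1)
loss = physics_loss(t_col) + 0.5 * data_loss(t_meas, x_meas)
loss.backward()
```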
|
|
TuPO1S-09 Poster Session, Room T8 |
Add to My Program |
Marine Robotics I |
|
|
|
08:30-10:10, Paper TuPO1S-09.1 | Add to My Program |
3-D Reconstruction Using Monocular Camera and Lights: Multi-View Photometric Stereo for Non-Stationary Robots |
|
Roznere, Monika | Dartmouth College |
Mordohai, Philippos | Stevens Institute of Technology |
Rekleitis, Ioannis | University of South Carolina |
Quattrini Li, Alberto | Dartmouth College |
Keywords: Marine Robotics, Field Robots, Mapping
Abstract: This paper proposes a novel underwater Multi-View Photometric Stereo (MVPS) framework for reconstructing scenes in 3-D with a non-stationary low-cost robot equipped with a monocular camera and fixed lights. The underwater realm is the primary focus of study here, due to the challenges in utilizing underwater camera imagery and the lack of low-cost, reliable localization systems. Previous underwater PS approaches provided accurate scene reconstruction results, but assumed that the robot was stationary at the bottom. This assumption is limiting, as many artifacts, reefs, and man-made structures are large and meters above the bottom. Our proposed MVPS framework relaxes the stationarity assumption by utilizing a monocular SLAM system to estimate small robot motions and extract an initial sparse feature map. To compensate for the scale inconsistency in monocular SLAM output, our MVPS optimization scheme collectively estimates a high-quality, dense 3-D reconstruction and corrects the camera pose estimates. We also present an attenuation and camera-light extrinsic parameter calibration method for non-stationary robots. Finally, validation experiments with a BlueROV2 demonstrated the low-cost capability of producing high-quality scene reconstructions. Overall, this work is the foundation of an active perception pipeline for robots (i.e., underwater, ground, and aerial) to explore and map complex structures with high accuracy and resolution using an inexpensive sensor-light configuration.
|
|
08:30-10:10, Paper TuPO1S-09.2 | Add to My Program |
GMM Registration: A Probabilistic Scan Matching Approach for Sonar-Based AUV Navigation |
|
Vial, Pau | Universitat De Girona |
Malagón Pedrosa, Miguel | Universitat De Girona |
Segura, Ricard | Universitat De Girona |
Palomeras, Narcis | Universitat De Girona |
Carreras, Marc | Universitat De Girona |
Keywords: Marine Robotics, Mapping
Abstract: Acoustic perception in underwater environments is challenging due to the low frequency of the acquisition system and numerous strong sources of noise. As a result, point clouds built by profiling sonars mounted on Autonomous Underwater Vehicles (AUVs) are sparse and noisy. To solve the mapping task, AUVs need a registration algorithm to keep maps free of inconsistencies. Many scan matching algorithms are available; however, few of them are specialized for acoustic data. In this paper, a probabilistic scan matching methodology based on Gaussian Mixture Models (GMM) is presented and, for the first time, the Bayesian-GMM algorithm is applied in this context to model acoustic data. The scan matching problem is properly formulated using Lie groups to define pose. In addition, this methodology can return an uncertainty measure for the matching result, which is fundamental in pose SLAM applications. This tool is implemented in a public C++ library that can process in real time 2D and 3D scans acquired by a profiling sonar. Theoretical justification and results with real data are provided to benchmark our method against the state-of-the-art Normal Distributions Transform (NDT) technique. The library repository can be found at https://bitbucket.org/gmmregistration/gmm_registration.
|
|
08:30-10:10, Paper TuPO1S-09.3 | Add to My Program |
Neural Implicit Surface Reconstruction Using Imaging Sonar |
|
Qadri, Mohamad | Carnegie Mellon University |
Kaess, Michael | Carnegie Mellon University |
Gkioulekas, Ioannis | Carnegie Mellon University |
Keywords: Marine Robotics, Mapping, Field Robots
Abstract: We present a technique for dense 3D reconstruction of objects using an imaging sonar, also known as forward-looking sonar (FLS). Compared to previous methods that model the scene geometry as point clouds or volumetric grids, we represent the geometry as a neural implicit function. Additionally, given such a representation, we use a differentiable volumetric renderer that models the propagation of acoustic waves to synthesize imaging sonar measurements. We perform experiments on real and synthetic datasets and show that our algorithm reconstructs high-fidelity surface geometry from multi-view FLS images at much higher quality than was possible with previous techniques and without suffering from their associated memory overhead.
|
|
08:30-10:10, Paper TuPO1S-09.4 | Add to My Program |
Conditional GANs for Sonar Image Filtering with Applications to Underwater Occupancy Mapping |
|
Lin, Tianxiang | Carnegie Mellon University |
Hinduja, Akshay | Carnegie Mellon University |
Qadri, Mohamad | Carnegie Mellon University |
Kaess, Michael | Carnegie Mellon University |
Keywords: Marine Robotics, Mapping, Field Robots
Abstract: Underwater robots typically rely on acoustic sensors like sonar to perceive their surroundings. However, these sensors are often inundated with multiple sources and types of noise, which makes using raw data for any meaningful inference with features, objects, or boundary returns very difficult. While several conventional methods of dealing with noise exist, their success rates are unsatisfactory. This paper presents a novel application of conditional Generative Adversarial Networks to train a model to produce noise-free sonar images, outperforming several conventional filtering methods. Estimating free space is crucial for autonomous robots performing active exploration and mapping. We therefore apply our approach to the task of underwater occupancy mapping and show superior free and occupied space inference compared to conventional methods.
|
|
08:30-10:10, Paper TuPO1S-09.5 | Add to My Program |
Stochastic Planning for ASV Navigation Using Satellite Images |
|
Huang, Yizhou | University of Toronto |
Dugmag, Hamza | University of Toronto |
Shkurti, Florian | University of Toronto |
Barfoot, Timothy | University of Toronto |
Keywords: Planning under Uncertainty, Marine Robotics, Field Robots
Abstract: Autonomous surface vessels (ASV) represent a promising technology to automate water-quality monitoring of lakes. In this work, we use satellite images as a coarse map and plan sampling routes for the robot. However, inconsistency between the satellite images and the actual lake, as well as environmental disturbances such as wind, aquatic vegetation, and changing water levels can make it difficult for robots to visit places suggested by the prior map. This paper presents a robust route-planning algorithm that minimizes the expected total travel distance given these environmental disturbances, which induce uncertainties in the map. We verify the efficacy of our algorithm in simulations of over a thousand Canadian lakes and demonstrate an application of our algorithm in a 3.7 km-long real-world robot experiment on a lake in Northern Ontario, Canada.
|
|
08:30-10:10, Paper TuPO1S-09.6 | Add to My Program |
Autonomous Underwater Docking Using Flow State Estimation and Model Predictive Control |
|
Vivekanandan, Rakesh | Oregon State University |
Hollinger, Geoffrey | Oregon State University |
Chang, Dongsik | Amazon |
Keywords: Marine Robotics, Motion and Path Planning, Optimization and Optimal Control
Abstract: We present a navigation framework to perform autonomous underwater docking to a wave energy converter (WEC) under various ocean conditions by incorporating flow state estimation into the design of model predictive control (MPC). Existing methods lack the ability to perform dynamic rendezvous and to dock autonomously in energetic conditions. The use of exteroceptive sensors or high-performing acoustic sensors to obtain or estimate the flow states has been investigated previously. However, such sensors increase the overall cost of the system and require the vehicle to navigate close to the seafloor or other landmarks. To overcome these limitations, our method couples an active perception framework with MPC to estimate the flow states while moving towards the dock. Our simulation results demonstrate the robustness and reliability of the proposed framework for autonomous docking under various ocean conditions. Furthermore, we conducted laboratory trials with a BlueROV2 docking with an oscillating dock and achieved a success rate greater than 70%.
|
|
08:30-10:10, Paper TuPO1S-09.7 | Add to My Program |
Real-Time Navigation for Autonomous Surface Vehicles in Ice-Covered Waters |
|
de Schaetzen, Rodrigue | University of Waterloo |
Botros, Alexander | University of Waterloo |
Gash, Robert | National Research Council of Canada |
Murrant, Kevin | National Research Council of Canada |
Smith, Stephen L. | University of Waterloo |
Keywords: Marine Robotics, Autonomous Vehicle Navigation, Motion and Path Planning
Abstract: Vessel transit in ice-covered waters poses unique challenges for safe and efficient motion planning. When the concentration of ice is high, it may not be possible to find collision-free paths. Instead, ice can be pushed out of the way if it is small or if contact occurs near the edge of the ice. In this work, we propose a real-time navigation framework that minimizes collisions with ice and the distance travelled by the vessel. We exploit a lattice-based planner with a cost that captures the ship's interaction with ice. To address the dynamic nature of the environment, we plan motion in a receding-horizon manner based on updated vessel and ice state information. Further, we present a novel planning heuristic for evaluating the cost-to-go, which is applicable to navigation in a channel without a fixed goal location. The performance of our planner is evaluated across several levels of ice concentration, both in simulated and in real-world experiments.
|
|
08:30-10:10, Paper TuPO1S-09.8 | Add to My Program |
Experiments in Underwater Feature Tracking with Performance Guarantees Using a Small AUV |
|
Biggs, Benjamin | Virginia Polytechnic Institute and State University |
He, Hans | Virginia Tech |
McMahon, James | The Naval Research Laboratory |
Stilwell, Daniel | Virginia Tech |
Keywords: Field Robots, Marine Robotics, Motion and Path Planning
Abstract: We present the results of experiments performed using a small autonomous underwater vehicle to determine the location of an isobath within a bounded area. The primary contribution of this work is to implement and integrate several recent developments in real-time planning for environmental mapping, and to demonstrate their utility in a challenging practical example. We model the bathymetry within the operational area using a Gaussian process and propose a reward function that represents the task of mapping a desired isobath. As is common in applications where plans must be continually updated based on real-time sensor measurements, we adopt a receding-horizon framework in which the vehicle continually computes near-optimal paths. The sequence of paths does not, in general, inherit the optimality properties of each individual path. Our real-time planning implementation incorporates recent results that lead to performance guarantees for receding-horizon planning.
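A small Gaussian-process regression sketch of the bathymetry model the abstract describes, with an assumed RBF kernel and synthetic depth measurements (the paper's kernel and reward function are not specified here):

```python
# GP regression over 2-D position: posterior depth mean and variance
# at candidate waypoints, which could drive an isobath-mapping reward.
import numpy as np

def rbf(A, B, ls=5.0, var=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ls ** 2)

X = np.random.uniform(0, 50, size=(40, 2))               # surveyed positions
y = np.sin(X[:, 0] / 10) + 0.05 * np.random.randn(40)    # measured depths
Xq = np.random.uniform(0, 50, size=(5, 2))               # candidate waypoints

K = rbf(X, X) + 1e-4 * np.eye(len(X))
Ks = rbf(Xq, X)
mu = Ks @ np.linalg.solve(K, y)                          # posterior depth mean
cov = rbf(Xq, Xq) - Ks @ np.linalg.solve(K, Ks.T)
var = np.diag(cov)           # uncertainty, usable in a mapping reward
```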
|
|
08:30-10:10, Paper TuPO1S-09.9 | Add to My Program |
Robust Imaging Sonar-Based Place Recognition and Localization in Underwater Environments |
|
Kim, Ho Gyun | Inha University |
Kang, Gilhwan | Inha University |
Jeong, Seokhwan | Inha University |
Ma, Seungjun | Inha University |
Cho, Younggun | Inha University |
Keywords: Marine Robotics, Localization, SLAM
Abstract: Place recognition using SOund Navigation and Ranging (SONAR) images is an important task for simultaneous localization and mapping (SLAM) in underwater environments. This paper proposes a robust and efficient imaging SONAR-based place recognition, SONAR context, and loop closure method. Unlike previous methods, our approach encodes geometric information based on the characteristics of raw SONAR measurements without prior knowledge or training. We also design a hierarchical searching procedure for fast retrieval of candidate SONAR frames and apply adaptive shifting and padding to achieve robust matching on rotation and translation changes. In addition, we can derive the initial pose through adaptive shifting and apply it to the iterative closest point (ICP)-based loop closure factor. We evaluate the SONAR context’s performance in the various underwater sequences such as simulated open water, real water tank, and real underwater environments. The proposed approach shows the robustness and improvements of place recognition on various datasets and evaluation metrics. Supplementary materials are available at https://github.com/sparolab/sonar_context.
|
|
08:30-10:10, Paper TuPO1S-09.10 | Add to My Program |
Deep Underwater Monocular Depth Estimation with Single-Beam Echosounder |
|
Liu, Haowen | Dartmouth College |
Roznere, Monika | Dartmouth College |
Quattrini Li, Alberto | Dartmouth College |
Keywords: Marine Robotics, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: Underwater depth estimation is essential for safe Autonomous Underwater Vehicle (AUV) navigation. While there have been recent advances in out-of-water monocular depth estimation, it is difficult to apply these methods to the underwater domain due to the lack of well-established datasets with labelled ground truth. In this paper, we propose a novel method for self-supervised underwater monocular depth estimation by leveraging a low-cost single-beam echosounder (SBES). We also present a synthetic dataset for underwater depth estimation to facilitate visual learning research in the underwater domain, available at https://github.com/hdacnw/sbes-depth. We evaluated our method on the proposed dataset, with results outperforming previous methods, and tested it on a dataset we collected with an inexpensive AUV. We further investigated the use of the SBES as an additional component in our self-supervised method for up-to-scale depth estimation, providing insights into future research directions.
|
|
08:30-10:10, Paper TuPO1S-09.11 | Add to My Program |
Self-Supervised Monocular Depth Underwater |
|
Amitai, Shlomi | University of Haifa |
Klein, Itzik | University of Haifa |
Treibitz, Tali | University of Haifa |
Keywords: Marine Robotics, RGB-D Perception, Vision-Based Navigation
Abstract: Depth estimation is critical for any robotic system. In the past years, estimation of depth from monocular images has shown great improvement; however, in the underwater environment, results are still lagging behind due to appearance changes caused by the medium. So far, little effort has been invested in overcoming this. Moreover, underwater there are additional limitations on using high-resolution depth sensors, which makes generating ground truth for learning methods another enormous obstacle. Unsupervised methods that have tried to solve this have achieved very limited success, as they relied on domain transfer from datasets in air. We suggest training using subsequent frames, self-supervised by a reprojection loss, as has been demonstrated successfully above water. We suggest several additions to the self-supervised framework to cope with the underwater environment and achieve state-of-the-art results on a challenging forward-looking underwater dataset.
|
|
08:30-10:10, Paper TuPO1S-09.12 | Add to My Program |
Performance Evaluation of 3D Keypoint Detectors and Descriptors on Coloured Point Clouds in Subsea Environments |
|
Jung, Kyungmin | McGill University |
Hitchcox, Thomas | McGill University |
Forbes, James Richard | McGill University |
Keywords: Marine Robotics, Vision-Based Navigation, SLAM
Abstract: The recent development of high-precision subsea optical scanners allows for 3D keypoint detectors and feature descriptors to be leveraged on point cloud scans from subsea environments. However, the literature lacks a comprehensive survey to identify the best combination of detectors and descriptors to be used in these challenging and novel environments. This paper aims to identify the best detector/descriptor pair using a challenging field dataset collected using a commercial underwater laser scanner. Furthermore, studies have shown that incorporating texture information to extend geometric features adds robustness to feature matching on synthetic datasets. This paper also proposes a novel method of fusing images with underwater laser scans to produce coloured point clouds, which are used to study the effectiveness of 6D point cloud descriptors.
|
|
TuPO1S-10 Poster Session, Room T8 |
Add to My Program |
Biomimetic Systems |
|
|
|
08:30-10:10, Paper TuPO1S-10.1 | Add to My Program |
Puppeteer and Marionette: Learning Anticipatory Quadrupedal Locomotion Based on Interactions of a Central Pattern Generator and Supraspinal Drive |
|
Shafiee, Milad | EPFL |
Bellegarda, Guillaume | EPFL |
Ijspeert, Auke | EPFL |
Keywords: Biologically-Inspired Robots, Legged Robots
Abstract: Quadruped animal locomotion emerges from the interactions between the spinal central pattern generator (CPG), sensory feedback, and supraspinal drive signals from the brain. Computational models of CPGs have been widely used for investigating the spinal cord contribution to animal locomotion control in computational neuroscience and in bio-inspired robotics. However, the contribution of supraspinal drive to anticipatory behavior, i.e. motor behavior that involves planning ahead of time (e.g. of footstep placements), is not yet properly understood. In particular, it is not clear whether the brain modulates CPG activity and/or directly modulates muscle activity (hence bypassing the CPG) for accurate foot placements. In this paper, we investigate the interaction of supraspinal drive and a CPG in an anticipatory locomotion scenario that involves stepping over gaps. By employing deep reinforcement learning (DRL), we train a neural network policy that replicates the supraspinal drive behavior. This policy can either modulate the CPG dynamics, or directly change actuation signals to bypass the CPG dynamics. Our results indicate that the direct supraspinal contribution to the actuation signal is a key component for a high gap crossing success rate. However, the CPG dynamics in the spinal cord are beneficial for gait smoothness and energy efficiency. Moreover, our investigation shows that sensing the front feet distances to the gap is the most important and sufficient sensory information for learning gap crossing. Our results support the biological hypothesis that cats and horses mainly control the front legs for obstacle avoidance, and that hind limbs follow an internal memory based on the front limbs' information. Our method enables the quadruped robot to cross gaps of up to 20 cm (50% of body-length) without any explicit dynamics modeling or Model Predictive Control (MPC).
|
|
08:30-10:10, Paper TuPO1S-10.2 | Add to My Program |
A Performance Optimization Strategy Based on Improved NSGA-II for a Flexible Robotic Fish |
|
Lu, Ben | Institute of Automation, Chinese Academy of Sciences |
Wang, Jian | Institute of Automation, Chinese Academy of Sciences |
Liao, Xiaocun | Institute of Automation, Chinese Academy of Sciences |
Zou, Qianqian | Institution of Automation, Chinese Academy of Sciences |
Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Zhou, Chao | Chinese Academy of Sciences |
Keywords: Biologically-Inspired Robots, Biomimetics
Abstract: High speed and low energy cost are two conflicting objectives in the motion optimization of bio-inspired underwater robots, yet both play a very important role. To this end, this paper proposes an optimization strategy for swimming speed and power cost using an improved NSGA-II for a flexible robotic fish. A dynamic model involving flexible deformation is established for speed prediction, with the hydrodynamic parameters identified. A back-propagation (BP) neural network is applied to compensate the power cost prediction, with the dynamic model's prediction as input. In particular, an NSGA-II-AMS method is developed to improve the efficiency of solving the two-objective optimization problem based on NSGA-II. Finally, extensive simulations and experimental results demonstrate the effectiveness of the proposed optimization strategy, which offers promising prospects for the flexible robotic fish performing aquatic tasks under different performance constraints.
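At the heart of NSGA-II is Pareto ranking of candidate solutions. A miniature sketch extracting the non-dominated front of (speed, power) candidates, where speed is maximized and power cost is minimized (crowding distance and the AMS modification are omitted):

```python
# Non-dominated front extraction, the Pareto-ranking step NSGA-II iterates.
import numpy as np

def pareto_front(speed, power):
    """Indices of candidates not dominated by any other candidate."""
    idx = []
    for i in range(len(speed)):
        dominated = any(speed[j] >= speed[i] and power[j] <= power[i] and
                        (speed[j] > speed[i] or power[j] < power[i])
                        for j in range(len(speed)))
        if not dominated:
            idx.append(i)
    return idx

speed = np.random.rand(20)
power = np.random.rand(20)
front = pareto_front(speed, power)  # NSGA-II alternates ranking + crowding
```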
|
|
08:30-10:10, Paper TuPO1S-10.3 | Add to My Program |
Swarm Robotics Search and Rescue: A Bee-Inspired Swarm Cooperation Approach without Information Exchange |
|
Li, Yue | Beihang University |
Gao, Yan | School of Automation Science and Electrical Engineering, Beihang |
Yang, Sijie | Beihang University |
Quan, Quan | Beihang University |
Keywords: Search and Rescue Robots, Biologically-Inspired Robots, Autonomous Agents
Abstract: Swarm robotics plays a significant role in practice because of its scalability and robustness. Beyond some specific studies, there is still a lack of an overall approach to solving the search and rescue problem in a communication-denied environment. This paper presents a bee-inspired swarm cooperation approach without information exchange, including a target grouping method suitable for multiple objectives and multiple robots, a finite behavior state machine, and the corresponding control law. Finally, the effectiveness of the proposed approach is shown via simulation. The overall approach proposed in this paper requires neither the global position of the swarm nor two-way information exchange, making swarm robotics search and rescue possible in a communication-denied environment.
|
|
08:30-10:10, Paper TuPO1S-10.4 | Add to My Program |
Achieving Extensive Trajectory Variation in Impulsive Robotic Systems |
|
Viornery, Luis | Carnegie Mellon University |
Goode, Chloe | University of Lincoln |
Sutton, Gregory | University of Lincoln |
Bergbreiter, Sarah | Carnegie Mellon University |
Keywords: Biologically-Inspired Robots, Mechanism Design
Abstract: Robots that use impulsive mechanisms to achieve high-speed and high-powered motion are becoming more common and better understood, but control of these systems remains relatively rudimentary. Among robots that use spring actuation to generate motion, robot actuation and mechanisms are usually not controlled intentionally in order to achieve variation in the system's behavior, or they are controlled only roughly via adjustments made to the amount of energy stored in the mechanism. We describe the development, construction, and test of an impulsive catapult mechanism whose design is inspired by the grasshopper leg and for which extensive variation in the projectile trajectory is achieved by force control of the actuator that restrains the spring. As a step toward future controlled jumping robots, we give a detailed model of this system, validate this model experimentally, and explain how the actuator dynamics are critical to our ability to vary the system's trajectory using this approach. This work represents a novel approach to the control of spring actuated robots and illustrates how they can be controlled even under highly limiting actuator constraints.
|
|
08:30-10:10, Paper TuPO1S-10.5 | Add to My Program |
Towards Safe Landing of Falling Quadruped Robots Using a 3-DoF Morphable Inertial Tail |
|
Tang, Yunxi | The Chinese University of Hong Kong |
An, Jiajun | The Chinese University of Hong Kong |
Chu, Xiangyu | The Chinese University of Hong Kong |
Wang, Shengzhi | The Chinese University of Hong Kong |
Wong, Ching Yan | The Chinese University of Hong Kong |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Biologically-Inspired Robots, Legged Robots, Motion Control
Abstract: The falling cat problem is well known: cats show a remarkable aerial reorientation capability and can land safely. For their robotic counterparts, the analogous falling quadruped robot problem has not been fully addressed, although achieving cat-like safe landing has been increasingly investigated. Rather than imposing the burden on landing control, we approach safe landing of falling quadruped robots through effective flight-phase control. Different from existing work such as swinging legs or attaching reaction wheels or simple tails, we propose to deploy a 3-DoF morphable inertial tail on a medium-size quadruped robot. In the flight phase, the tail at its maximum length can effectively self-right the body orientation in 3D; before touch-down, the tail can be retracted to about 1/4 of its maximum length to suppress the tail's side-effects on landing. To enable aerial reorientation for safe landing in quadruped robots, we design a control architecture, which is verified in a high-fidelity physics simulation environment under different initial conditions. Experimental results on a customized flight-phase test platform with comparable inertial properties are provided and show the tail's effectiveness for 3D body reorientation and its fast retractability before touch-down. An initial falling quadruped robot experiment is shown, where the robot Unitree A1 with the 3-DoF tail can land safely subject to non-negligible initial body angles.
|
|
08:30-10:10, Paper TuPO1S-10.6 | Add to My Program |
Bioinspired Tearing Manipulation with a Robotic Fish |
|
Wang, Stanley | University of California, Berkeley |
Romero, Juan | University of California, Berkeley |
Li, Monica | UC Berkeley |
Wainwright, Peter | University of California, Davis |
Stuart, Hannah | UC Berkeley |
Keywords: Soft Robot Applications, Biologically-Inspired Robots, Marine Robotics
Abstract: We present SunBot, a robotic system for the study and implementation of fish-inspired tearing manipulation. Various fish species -- such as the sunburst butterflyfish -- feed on prey fixed to substrates, a maneuver previously not demonstrated by robotic fish, which typically specialize in open-water swimming and surveillance. Biological studies indicate that a dynamic ``head flick'' behavior may play a role in tearing off soft prey during such feeding. In this work, we study whether the robotic tail is an effective means to generate such head motions for ungrounded tearing manipulation in water. We describe the function of SunBot and compare the forces that it applies to a fixed prey in the lab while varying tail speeds and ranges of motion. A simplified dynamic template model for the tail-driven head-flick maneuver matches peak force magnitudes from experiments, indicating that inertial effects of the fish's body play a substantial role. Finally, we demonstrate a tearing scenario and evaluate a free-swimming trial of SunBot; this is important to show that the actuator that enables swimming also serves the new dual purpose of forceful tearing manipulation.
|
|
08:30-10:10, Paper TuPO1S-10.7 | Add to My Program |
Learnable Tegotae-Based Feedback in CPGs with Sparse Observation Produces Efficient and Adaptive Locomotion |
|
Herneth, Christopher | Technical University Munich |
Hayashibe, Mitsuhiro | Tohoku University |
Owaki, Dai | Tohoku University |
Keywords: Bioinspired Robot Learning, Reinforcement Learning, Biologically-Inspired Robots
Abstract: Central Pattern Generators (CPGs) are a biologically inspired, decentralized control architecture that enables model-free, yet adaptively stable and computationally lightweight, locomotion capabilities on complex robots. Nevertheless, no unified design guidelines for closed-loop CPG controllers are available in the literature. Therefore, we propose a task-distributed, end-to-end trainable, closed-loop CPG control policy by generalizing and extending Tegotae control. The Tegotae approach modulates CPG activity by quantifying the discrepancy between internal belief states and environmental reactions. Spontaneous and adaptive gait formation toward situationally efficient locomotion patterns is an intrinsic property of Tegotae control. The Tegotae control policy is trained and benchmarked in simulation on a 1D hopping robot. We found that our approach can learn efficient and adaptive locomotion from minimal feedback information, while outperforming unstructured, classic reinforcement learning policies of equal complexity. To the best of our knowledge, this is the first study to fully generalize the Tegotae approach and construct unimpeded, end-to-end trainable Tegotae control policies.
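For readers unfamiliar with Tegotae feedback, the sketch below applies its commonly published phase-oscillator form (after Owaki et al.) to a single toy leg: the phase rate is modulated by the product of the local ground reaction force and the oscillator's intended loading. The gain, the stance-force model, and the sign convention are illustrative assumptions here, not the paper's trained policy.

```python
import numpy as np

# Minimal sketch of a Tegotae-style phase oscillator: the phase is sped up
# or slowed down according to how well the local sensory reaction (ground
# reaction force N) matches the oscillator's "intent" cos(phi).
# Gains and the toy stance model are assumptions for this illustration.
omega, sigma, dt = 2 * np.pi, 0.05, 1e-3   # intrinsic frequency, feedback gain

phi = 0.1
for step in range(5000):
    # Toy stance model (an assumption): the leg bears load while cos(phi) > 0.
    N = max(np.cos(phi), 0.0) * 10.0        # ground reaction force [N]
    # Tegotae feedback: phi_dot = omega - sigma * N * cos(phi)
    phi += (omega - sigma * N * np.cos(phi)) * dt

print(f"final phase: {phi % (2 * np.pi):.3f} rad")
```

The paper's contribution can be read as making the gain structure of this feedback learnable end-to-end rather than hand-set.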
|
|
08:30-10:10, Paper TuPO1S-10.8 | Add to My Program |
Multi-Segmented, Adaptive Feet for Versatile Legged Locomotion in Natural Terrain |
|
Chatterjee, Abhishek | Max Planck Institute for Intelligent Systems, Stuttgart |
Mo, An | MPI IS Stuttgart |
Kiss, Bernadett | Max Planck Institute for Intelligent Systems |
Gonen, Emre Cemal | Max Planck Institute for Intelligent Systems |
Badri-Spröwitz, Alexander | Max Planck Institute for Intelligent Systems |
Keywords: Legged Robots, Biologically-Inspired Robots, Biomimetics
Abstract: Most legged robots are built with leg structures of serially mounted links and actuators and are controlled through complex controllers and sensor feedback. In comparison, animals developed multi-segment legs, mechanical coupling between joints, and multi-segmented feet. They run agilely over all terrains, arguably with simpler locomotion control. Here we focus on developing foot mechanisms that resist slipping and sinking even in natural terrain. We present the first results of multi-segment feet mounted on a bird-inspired robot leg with multi-joint mechanical tendon coupling. Our one- and two-segment, mechanically adaptive feet show increased viable horizontal forces on multiple soft and hard substrates before starting to slip. We also observe that segmented feet reduce sinking on soft substrates compared to ball feet and cylinder feet. We report how multi-segmented feet provide a large range of viable centre-of-pressure points well suited for bipedal robots, but also for quadruped robots on slopes and natural terrain. Our results also offer a functional understanding of segmented feet in animals such as ratite birds.
|
|
08:30-10:10, Paper TuPO1S-10.9 | Add to My Program |
Burst Stimulation for Enhanced Locomotion Control of Terrestrial Cyborg Insects |
|
Nguyen, Huu Duoc | School of Mechanical & Aerospace Engineering, Nanyang Technologi |
Sato, Hirotaka | Nanyang Technological University |
Vo-Doan, T. Thang | University of Freiburg |
Keywords: Cyborgs, Motion Control, Autonomous Agents
Abstract: Terrestrial cyborg insects are biohybrid systems integrating living insects as mobile platforms. The insects' locomotion is controlled by electrical stimulation of their sensory, muscular, or neural systems, for which continuous pulse trains are usually chosen as the stimulation waveform. Although this waveform is easy to generate and can elicit graded responses from the insects, its locomotion control efficiency has been inconsistent across the existing literature. This study demonstrates improved locomotion control using a new stimulation protocol, named Burst Stimulation, applied to the antennae of a cyborg beetle (Zophobas morio). Modulating the continuous pulse train into multiple bursts enhanced the beetle's turning responses. At the same stimulation intensity (amplitude, pulse width, and active duration), Burst Stimulation improved the turning angle by up to 50% compared to the continuous waveform. Moreover, the beetle's graded response was preserved: increasing the stimulation frequency from 10 Hz to 40 Hz raised the turning rate by 40 deg/s. In addition, an initial implementation of this protocol in feedback control-based navigation achieved a success rate of 81%, suggesting its potential to further optimize the autonomous navigation of terrestrial cyborg insects.
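The difference between the two waveforms is easy to visualize: a burst-modulated train is simply the continuous pulse train gated by an on/off envelope. The sketch below shows this, with all rates, widths, and burst timings as hypothetical stand-ins for the paper's actual stimulation parameters.

```python
import numpy as np

# Sketch of a continuous pulse train vs. a burst-modulated one with the
# same pulse frequency and active duration. All numbers are illustrative;
# the paper's actual amplitudes and timings may differ.
fs = 100_000                       # waveform sample rate [Hz]
pulse_hz, pulse_width = 40, 1e-3   # pulse repetition rate [Hz] and width [s]
duration = 1.0                     # active stimulation duration [s]

t = np.arange(0, duration, 1 / fs)
pulse_phase = (t * pulse_hz) % 1.0
continuous = (pulse_phase < pulse_width * pulse_hz).astype(float)

# Burst modulation: gate the same train with an on/off envelope,
# e.g. 100 ms bursts separated by 100 ms gaps (hypothetical values).
burst_period, burst_on = 0.2, 0.1
envelope = ((t % burst_period) < burst_on).astype(float)
burst = continuous * envelope

print(f"pulses delivered: continuous ~ {continuous.sum() / (pulse_width * fs):.0f}, "
      f"burst ~ {burst.sum() / (pulse_width * fs):.0f}")
```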
|
|
08:30-10:10, Paper TuPO1S-10.10 | Add to My Program |
Twisting Spine or Rigid Torso: Exploring Quadrupedal Morphology Via Trajectory Optimization |
|
Caporale, J. Diego | University of Pennsylvania |
Feng, Zeyuan | University of Pennsylvania |
Rozen-Levy, Shane | University of Pennsylvania |
Carter, Aja | University of Pennsylvania |
Koditschek, Daniel | University of Pennsylvania |
Keywords: Legged Robots, Biologically-Inspired Robots, Optimization and Optimal Control
Abstract: Modern legged robot morphologies assign most of their actuated degrees of freedom (DoFs) to the limbs, and designs continue to converge to twelve-DoF quadrupeds with three actuators per leg and a rigid torso often modeled as a Single Rigid Body (SRB). This is in contrast to the animal kingdom, which provides tantalizing hints that core actuation of a jointed torso confers substantial benefit for efficient agility. Unfortunately, the limited specific power of available actuators continues to hamper roboticists' efforts to capitalize on this bio-inspiration. This paper presents the initial steps in a comparative study of the costs and benefits associated with a traditionally neglected torso degree of freedom: a twisting spine. We use trajectory optimization to explore how a one-DoF, axially twisting spine might help or hinder a set of axially active (twisting) behaviors: trots, sudden turns while bounding, and parkour-style wall jumps. By optimizing for minimum electrical energy or average power, both intuitive cost functions for robots, we avoid hand-tuning the behaviors and explore the activation of the spine. Initial evidence suggests that for lower-energy behaviors the spine increases the electrical energy required compared to the rigid torso, but for higher-energy runs the spine trends toward having no effect or reducing the electrical work. These results support future, more bio-inspired versions of the spine with inherent stiffness or damping built into their mechanical design.
|
|
08:30-10:10, Paper TuPO1S-10.11 | Add to My Program |
Dynamic Locomotion of a Quadruped Robot with Active Spine Via Model Predictive Control |
|
Li, Wanyue | Sun Yat-Sen University |
Zhou, Zida | Sun Yat-Sen University |
Cheng, Hui | Sun Yat-Sen University |
Keywords: Legged Robots, Motion Control, Biologically-Inspired Robots
Abstract: As an active spine introduces additional degrees of freedom (DoFs) as well as time-varying inertia, locomotion control of spined quadruped robots is challenging. Direct optimization on the full dynamics model incurs prohibitive computation time and is difficult to deploy on embedded platforms. Model predictive control (MPC) based on centroidal dynamics is a prevalent approach for ordinary quadruped robots, regarding the whole robot as a single rigid body (SRB). However, that approach ignores changes of the center of mass (CoM) and inertia, which seriously affect the robot's stability, so it cannot be applied directly to spined quadruped robots. To resolve this issue, this paper presents an MPC approach that accounts for the movements of the spine in the SRB model. Since the robot's mass is concentrated in its body, the whole robot is modelled as an unactuated 3D SRB with fully actuated internal spine joints. The MPC finds optimal ground reaction forces (GRFs) based on the SRB centroidal dynamics, in which the missing spine contribution is complemented by pre-defined spine joint states and the corresponding inertia sequence. From the GRFs, the full dynamic model computes precise joint torques. In addition, a quadruped robot with a 3-DoF active spine, Yat-sen Lion, is developed. With the presented approach, experimental results show that Yat-sen Lion achieves bending, arching, and turning behaviors while trotting at speeds of 3.8 m/s in simulation and 0.5 m/s in real-world experiments.
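At the core of SRB-based controllers like this sits a force-distribution step. The sketch below illustrates only a single-time-step version of it, solving for ground reaction forces that produce a desired body wrench; the paper's MPC optimizes over a horizon, uses friction cones rather than the crude box bounds here, and augments the model with the spine's inertia sequence. All masses, geometry, and bounds are hypothetical.

```python
import numpy as np
from scipy.optimize import lsq_linear   # assumes SciPy is available

# Simplified single-step SRB force balance (not the paper's full MPC):
# find per-foot GRFs f so that sum(f_i) and sum(r_i x f_i) match a
# desired net force/torque on the body. All numbers are illustrative.
m, g = 12.0, 9.81
r_feet = np.array([[ 0.2,  0.15, -0.3],    # foot positions relative to CoM
                   [ 0.2, -0.15, -0.3],
                   [-0.2,  0.15, -0.3],
                   [-0.2, -0.15, -0.3]])

def skew(r):
    """Matrix S(r) such that S(r) @ f = r x f."""
    return np.array([[0, -r[2], r[1]], [r[2], 0, -r[0]], [-r[1], r[0], 0]])

# Stack [force; torque] rows: sum f_i = F_des, sum r_i x f_i = tau_des.
A = np.vstack([np.hstack([np.eye(3)] * 4),
               np.hstack([skew(r) for r in r_feet])])
b = np.concatenate([[0, 0, m * g], [0, 0, 0]])   # support weight, zero net torque

# Box bounds as a crude friction-pyramid stand-in (fz in [0, 80] N).
lb = np.tile([-40, -40, 0.0], 4)
ub = np.tile([ 40,  40, 80.0], 4)
f = lsq_linear(A, b, bounds=(lb, ub)).x.reshape(4, 3)
print("per-foot GRFs [N]:\n", np.round(f, 2))
```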
|
|
TuPO1S-11 Poster Session, Room T8 |
Add to My Program |
Aerial Robotics I |
|
|
|
08:30-10:10, Paper TuPO1S-11.1 | Add to My Program |
Scalable Task-Driven Robotic Swarm Control Via Collision Avoidance and Learning Mean-Field Control |
|
Cui, Kai | Technische Universität Darmstadt |
Li, Mengguang | Technische Universität Darmstadt |
Fabian, Christian | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Swarm Robotics, Reinforcement Learning, Collision Avoidance
Abstract: In recent years, reinforcement learning and its multi-agent analogue have achieved great success in solving various complex control problems. However, multi-agent reinforcement learning remains challenging both in its theoretical analysis and empirical design of algorithms, especially for large swarms of embodied robotic agents where a definitive toolchain remains part of active research. We use emerging state-of-the-art mean-field control techniques in order to convert many-agent swarm control into more classical single-agent control of distributions. This allows profiting from advances in single-agent reinforcement learning at the cost of assuming weak interaction between agents. However, the mean-field model is violated by the nature of real systems with embodied, physically colliding agents. Thus, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior. On the theoretical side, we provide novel approximation guarantees for general mean-field control both in continuous spaces and with collision avoidance. On the practical side, we show that our approach outperforms multi-agent reinforcement learning and allows for decentralized open-loop application while avoiding collisions, both in simulation and real UAV swarms. Overall, we propose a framework for the design of swarm behavior that is both mathematically well-founded and practically useful, enabling the solution of otherwise intractable swarm problems.
|
|
08:30-10:10, Paper TuPO1S-11.2 | Add to My Program |
STD-Trees: Spatio-Temporal Deformable Trees for Multirotors Kinodynamic Planning |
|
Ye, Hongkai | Zhejiang University |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Constrained Motion Planning, Motion and Path Planning
Abstract: In constrained solution spaces with a huge number of homotopy classes, stand-alone sampling-based kinodynamic planners suffer from slow convergence. Local optimization is integrated to alleviate this problem. In this paper, we propose to improve trajectory tree growth by optimizing the tree in the form of deformation units, where each unit contains one tree node and all the edges connecting it. The deformation proceeds both spatially and temporally by efficiently optimizing the node state and the edge time durations. Deforming a unit changes the tree only locally, yet improves the overall quality of the corresponding subtree. Further, considering the computational burden and the level of optimization, patterns that deform different tree parts using combinations of deformation units are studied and compared, all showing much faster convergence. The proposed deformation can be easily integrated into different RRT-based kinodynamic planning methods, and numerical experiments show that integrating the spatio-temporal deformation greatly accelerates convergence and outperforms spatial-only deformation.
|
|
08:30-10:10, Paper TuPO1S-11.3 | Add to My Program |
PredRecon: A Prediction-Boosted Planning Framework for Fast and High-Quality Autonomous Aerial Reconstruction |
|
Feng, Chen | The Hong Kong University of Science and Technology |
Li, Haojia | The Hong Kong University of Science and Technology |
Gao, Fei | Zhejiang University |
Zhou, Boyu | Sun Yat-Sen University |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Aerial Systems: Perception and Autonomy, Integrated Planning and Learning, Motion and Path Planning
Abstract: Autonomous UAV path planning for 3D reconstruction has been actively studied in various applications for high-quality 3D models. However, most existing works adopt explore-then-exploit, prior-based, or exploration-based strategies, suffering from inefficiency due to repeated flights and low autonomy. In this paper, we propose PredRecon, a prediction-boosted planning framework that can autonomously generate paths for high-quality 3D reconstruction. We draw inspiration from the observation that humans can roughly infer a complete structure from partial observation. Hence, we devise a surface prediction module (SPM) to predict the coarse complete surfaces of the target from the current partial reconstruction. Then, the uncovered surfaces awaiting observation by the UAV are produced by online volumetric mapping. Lastly, a hierarchical planner plans motions for 3D reconstruction: it sequentially finds efficient global coverage paths, plans local paths that maximize the performance of Multi-View Stereo (MVS), and generates smooth trajectories for acquiring image-pose pairs. We conduct benchmarks in a realistic simulator, which validate the performance of PredRecon compared with classical and state-of-the-art methods. The open-source code is released at https://github.com/HKUST-Aerial-Robotics/PredRecon.
|
|
08:30-10:10, Paper TuPO1S-11.4 | Add to My Program |
Vision-Aided UAV Navigation and Dynamic Obstacle Avoidance Using Gradient-Based B-Spline Trajectory Optimization |
|
Xu, Zhefan | Carnegie Mellon University |
Xiu, Yumeng | Carnegie Mellon University |
Zhan, Xiaoyang | Carnegie Mellon University |
Chen, Baihan | Carnegie Mellon University |
Shimada, Kenji | Carnegie Mellon University |
Keywords: Aerial Systems: Perception and Autonomy, Collision Avoidance, Motion and Path Planning
Abstract: Navigating dynamic environments requires the robot to generate collision-free trajectories and actively avoid moving obstacles. Most previous works designed path planning algorithms based on a single map representation, such as a geometric, occupancy, or ESDF map. Although they have shown success in static environments, due to the limitations of a single map representation, those methods cannot reliably handle static and dynamic obstacles simultaneously. To address the problem, this paper proposes a gradient-based B-spline trajectory optimization algorithm utilizing the robot's onboard vision. Depth vision enables the robot to track and represent dynamic objects geometrically based on the voxel map. The proposed optimization first adopts a circle-based guide-point algorithm to approximate the costs and gradients for avoiding static obstacles. Then, with the vision-detected moving objects, our receding-horizon distance field is simultaneously used to prevent dynamic collisions. Finally, an iterative re-guide strategy is applied to generate the collision-free trajectory. Simulation and physical experiments show that our method can run in real time to navigate dynamic environments safely.
|
|
08:30-10:10, Paper TuPO1S-11.5 | Add to My Program |
Multi-Agent Spatial Predictive Control with Application to Drone Flocking |
|
Brandstätter, Andreas | Technische Universität Wien |
Smolka, Scott | Stony Brook University |
Stoller, Scott | Stony Brook University |
Tiwari, Ashish | Microsoft Corp |
Grosu, Radu | TU Wien |
Keywords: Swarm Robotics, Aerial Systems: Mechanics and Control
Abstract: We introduce Spatial Predictive Control (SPC), a technique for solving the following problem: given a collection of robotic agents with black-box positional low-level controllers (PLLCs) and a mission-specific distributed cost function, how can a distributed controller achieve and maintain cost-function minimization without a plant model and only positional observations of the environment? Our fully distributed SPC controller is based strictly on the position of the agent itself and on those of its neighboring agents. This information is used in every time step to compute the gradient of the cost function and to perform a spatial look-ahead to predict the best next target position for the PLLC. Using a simulation environment, we show that SPC outperforms Potential Field Controllers, a related class of controllers, on the drone flocking problem. We also show that SPC works on real hardware, and is therefore able to cope with the potential sim-to-real transfer gap. We demonstrate its performance using as many as 16 Crazyflie 2.1 drones in a number of scenarios, including obstacle avoidance.
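The "spatial look-ahead" at the heart of SPC is essentially a gradient step on the distributed cost, evaluated from positions alone and handed to the black-box PLLC as the next target. A minimal sketch follows, with a stand-in flocking cost and hypothetical step sizes; the paper's cost functions and prediction scheme are richer than this.

```python
import numpy as np

# Minimal sketch of the spatial look-ahead idea: an agent evaluates the
# distributed cost around its own position (numerical gradient) and sends
# the best next position to its low-level position controller (PLLC).
# The cost and all parameters are illustrative stand-ins.
def flock_cost(p, neighbors, d_ref=1.0):
    """Penalty for deviating from the preferred inter-agent distance."""
    return sum((np.linalg.norm(p - q) - d_ref) ** 2 for q in neighbors)

def spc_step(p, neighbors, step=0.05):
    # Numerical gradient of the cost at the agent's current position.
    grad = np.zeros(2)
    for i in range(2):
        e = np.zeros(2); e[i] = 1e-4
        grad[i] = (flock_cost(p + e, neighbors) - flock_cost(p - e, neighbors)) / 2e-4
    # Spatial look-ahead: move against the gradient; this point becomes
    # the next target handed to the PLLC.
    return p - step * grad / (np.linalg.norm(grad) + 1e-9)

p = np.array([0.0, 0.0])
neighbors = [np.array([2.0, 0.0]), np.array([0.0, 2.0])]
for _ in range(50):
    p = spc_step(p, neighbors)
print("target after 50 look-ahead steps:", np.round(p, 3))
```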
|
|
08:30-10:10, Paper TuPO1S-11.6 | Add to My Program |
Multimodal Image Registration for GPS-Denied UAV Navigation Based on Disentangled Representations |
|
Li, Huandong | Northwestern Polytechnical University |
Liu, Zhunga | Northwestern Polytechnical University |
Lyu, Yanyi | Northwestern Polytechnical University |
Wu, Feiyan | Northwestern Polytechnical University |
Keywords: Aerial Systems: Perception and Autonomy, Vision-Based Navigation, Deep Learning Methods
Abstract: Visual navigation plays an important role for Unmanned Aerial Vehicles (UAVs). In some applications, the landmark image and the real-time image may be heterogeneous, e.g., near-infrared versus visible images. In this work, we propose a multimodal image registration method for near-infrared and visible images so that it can be applied in visual navigation systems for the localization of UAVs in GPS-denied environments. First, a new feature extraction strategy is developed to embed different modalities of images into a common feature space based on disentangled representations. This common space is independent of the image modality, which eliminates the modality differences. Meanwhile, an intensity loss is introduced to measure the similarity of mono-modal images. The proposed method can directly predict the transformation parameters, thus accelerating the localization of UAVs. Extensive experiments on synthetic datasets are conducted to demonstrate the validity of our method, and the experimental results show that the proposed method can effectively improve the localization accuracy.
|
|
08:30-10:10, Paper TuPO1S-11.7 | Add to My Program |
SEER: Safe Efficient Exploration for Aerial Robots Using Learning to Predict Information Gain |
|
Tao, Yuezhan | University of Pennsylvania |
Wu, Yuwei | University of Pennsylvania |
Li, Beiming | University of Pennsylvania |
Cladera Ojeda, Fernando | University of Pennsylvania |
Zhou, Alex | University of Pennsylvania |
Thakur, Dinesh | University of Pennsylvania |
Kumar, Vijay | University of Pennsylvania |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy, Motion and Path Planning
Abstract: We address the problem of efficient 3-D exploration in indoor environments for micro aerial vehicles with limited sensing capabilities and payload/power constraints. We develop an indoor exploration framework that uses learning to predict the occupancy of unseen areas, extracts semantic features, samples viewpoints to predict information gains for different exploration goals, and plans informative trajectories to enable safe and smart exploration. Extensive experimentation in simulated and real-world environments shows that the proposed approach outperforms the state-of-the-art exploration framework by 24% in terms of total path length in a structured indoor environment, with a higher success rate during exploration.
|
|
08:30-10:10, Paper TuPO1S-11.8 | Add to My Program |
Trajectory Planning for the Bidirectional Quadrotor As a Differentially Flat Hybrid System |
|
Mao, Katherine | University of Pennsylvania |
Welde, Jake | University of Pennsylvania |
Hsieh, M. Ani | University of Pennsylvania |
Kumar, Vijay | University of Pennsylvania |
Keywords: Aerial Systems: Mechanics and Control, Motion and Path Planning, Underactuated Robots
Abstract: The use of bidirectional propellers provides quadrotors with greater maneuverability which is advantageous in constrained environments. This paper addresses the development of a trajectory planning algorithm for quadrotors with bidirectional motors. Previous work has shown that the property of differential flatness can be leveraged for efficient trajectory planning. However, planners that leverage flatness for quadrotors fail at points where the acceleration of the center of mass is equal to gravity, i.e., when the vehicle experiences free fall. The central contribution of this paper is a flatness-based trajectory planning method that allows quadrotors to use bidirectional propellers and pass through the so-called free-fall singularity. We model our system as a differentially flat hybrid system with the aid of coordinate charts derived from the Hopf fibration and develop an algorithm that computes forward and reverse thrusts for each propeller, resulting in smooth trajectories everywhere in SE(3). We demonstrate the planner’s versatility by planning knife-edge maneuvers and trajectories passing through the free-fall singularity, while transitioning from forward to reverse thrust.
|
|
08:30-10:10, Paper TuPO1S-11.9 | Add to My Program |
Fisher Information Based Active Planning for Aerial Photogrammetry |
|
Lim, Jaeyoung | ETH Zurich |
Lawrance, Nicholas | CSIRO Data61 |
Achermann, Florian | ETH Zurich, ASL |
Stastny, Thomas | Swiss Federal Institute of Technology (ETH Zurich) |
Bähnemann, Rik | ETH Zürich |
Siegwart, Roland | ETH Zurich |
Keywords: Aerial Systems: Perception and Autonomy, View Planning for SLAM, Environment Monitoring and Management
Abstract: Small uncrewed aerial systems are well suited to 3D reconstruction due to their speed, ease of use, and ability to access high-utility viewpoints. Today, most aerial survey approaches generate a preplanned coverage pattern assuming a planar target region. However, this is inefficient since it results in superfluous overlap and suboptimal viewing angles and does not utilize the entire flight envelope. In this work, we propose active path planning for photogrammetric reconstruction. Our main contribution is a view utility function based on Fisher information that approximates the offline reconstruction uncertainty. The metric enables online path planning to make in-flight decisions to collect geometrically informative image data in complex terrain. We evaluate our approach in a photorealistic simulation. A viewpoint selection study shows that our metric leads to faster and more precise reconstruction than state-of-the-art active planning metrics and adapts to different camera resolutions. Comparing our online planning approach to an ordinary fixed-wing aerial survey yields 3.2 times faster coverage of 16 ha of undulated terrain without sacrificing precision.
|
|
08:30-10:10, Paper TuPO1S-11.10 | Add to My Program |
Integrated Vector Field and Backstepping Control for Quadcopters |
|
Dias Nunes, Arthur Henrique | Universidade Federal De Minas Gerais |
Raffo, Guilherme V. | Universidade Federal De Minas Gerais |
Pimenta, Luciano | Universidade Federal De Minas Gerais |
Keywords: Aerial Systems: Mechanics and Control, Motion Control, Integrated Planning and Control
Abstract: In this work, we present an Integrated Guidance and Control (IGC) scheme to drive quadcopters in path-following tasks with obstacle avoidance and rejection of constant uncertainties. The scheme combines a time-varying artificial vector field with a backstepping controller with integral action. The vector field switches between two behaviors: (i) path following; and (ii) obstacle circumnavigation to allow collision avoidance. This vector field is then integrated into a nonlinear controller designed via backstepping with integral action to handle the quadcopter's vehicle dynamics and reject constant uncertainties. The considered vehicle model is based on quaternion algebra, and the control inputs are the total thrust and the torques. Stability is proved using Lyapunov theory and Matrosov's theorem. To illustrate the proposed solution, we present computational simulations.
|
|
08:30-10:10, Paper TuPO1S-11.11 | Add to My Program |
Learning a Single Near-Hover Position Controller for Vastly Different Quadcopters |
|
Zhang, Dingqi | University of California, Berkeley |
Loquercio, Antonio | UC Berkeley |
Wu, Xiangyu | University of California, Berkeley |
Kumar, Ashish | UC Berkeley |
Malik, Jitendra | UC Berkeley |
Mueller, Mark Wilfried | University of California, Berkeley |
Keywords: Aerial Systems: Applications, Machine Learning for Robot Control, AI-Enabled Robotics
Abstract: This paper proposes an adaptive near-hover position controller for quadcopters that can be deployed on quadcopters of very different mass, size, and motor constants, and that also adapts rapidly to unknown disturbances at runtime. The core algorithmic idea is to learn a single policy that can adapt online at test time not only to the disturbances applied to the drone, but also to the robot dynamics and hardware within the same framework. We achieve this by training a neural network to estimate a latent representation of the robot and environment parameters, which is used to condition the behaviour of the controller, itself also represented as a neural network. We train both networks exclusively in simulation with the goal of flying the quadcopters to goal positions while avoiding crashes into the ground. We directly deploy the same controller trained in simulation, without any modifications, on two real-world quadcopters that differ in mass, size, motors, and propellers, with masses differing by a factor of 4.5. In addition, we show rapid adaptation to sudden, large disturbances of up to one-third of the quadcopter's mass. We perform an extensive evaluation in both simulation and the physical world, where we outperform a state-of-the-art learning-based adaptive controller and a traditional PID controller tuned specifically to each platform.
|
|
08:30-10:10, Paper TuPO1S-11.12 | Add to My Program |
Forming and Controlling Hitches in Midair Using Aerial Robots |
|
S. D'Antonio, Diego | Lehigh University |
Bhattacharya, Subhrajit | Lehigh University |
Saldaña, David | Lehigh University |
Keywords: Aerial Systems: Applications, Cooperating Robots, Multi-Robot Systems
Abstract: The use of cables for aerial manipulation has been shown to be a lightweight and versatile way to interact with objects. However, fastening objects using cables remains a challenge, and human intervention is typically required. In this work, we propose a novel way to secure objects using hitches. The hitch can be formed and morphed in midair by a team of aerial robots with cables. The hitch's shape is modeled as a convex polygon, making it versatile and adaptable to a wide variety of objects. We propose an algorithm to form the hitch systematically; its steps can run in parallel, allowing hitches with a large number of robots to be formed in constant time. We also develop a set of actions to change the shape of the hitch. We demonstrate our methods with a team of aerial robots in both simulation and physical experiments.
|
|
TuPO1S-12 Poster Session, Room T8 |
Add to My Program |
Aerial Robot Learning |
|
|
|
08:30-10:10, Paper TuPO1S-12.1 | Add to My Program |
AirTrack: Onboard Deep Learning Framework for Long-Range Aircraft Detection and Tracking |
|
Ghosh, Sourish | Carnegie Mellon University |
Patrikar, Jay | Carnegie Mellon University |
Moon, Brady | Carnegie Mellon University |
Moghassem Hamidi, Milad | Carnegie Mellon University |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Aerial Systems: Perception and Autonomy, Object Detection, Segmentation and Categorization, Visual Tracking
Abstract: Detect-and-Avoid (DAA) capabilities are critical for safe operation of unmanned aircraft systems (UAS). This paper introduces AirTrack, a real-time vision-only detection and tracking framework that respects the size, weight, and power (SWaP) constraints of sUAS systems. Given the low signal-to-noise ratios (SNR) of faraway aircraft, we propose using full-resolution images in a deep learning framework that aligns successive images to remove ego-motion. The aligned images are then used downstream in cascaded primary and secondary classifiers to improve detection and tracking performance on multiple metrics. We show that AirTrack outperforms state-of-the-art baselines on the Amazon Airborne Object Tracking (AOT) dataset. Multiple real-world flight tests with a Cessna 172 interacting with general aviation traffic, and additional near-collision flight tests with a Bell helicopter flying towards a UAS in a controlled setting, show that the proposed approach satisfies the newly introduced ASTM F3442/F3442M standard for DAA. Empirical evaluations show that our system has a probability of track of more than 95% up to a range of 700 m.
|
|
08:30-10:10, Paper TuPO1S-12.2 | Add to My Program |
Towards a Reliable and Lightweight Onboard Fault Detection in Autonomous Unmanned Aerial Vehicles |
|
Katta, Sai Srinadhu | TII |
Viegas, Eduardo | Pontifícia Universidade Catolica Do Paraná (PUCPR), Brazil |
Keywords: Failure Detection and Recovery, Aerial Systems: Applications, Model Learning for Control
Abstract: This paper proposes a new model for onboard physical fault detection on autonomous unmanned aerial vehicles (UAVs) using machine learning (ML) techniques. The proposal performs the detection task with high accuracy and minimal processing requirements, while signaling an unreliable ML model to the operator, and is implemented in two main phases. First, wrapper-based feature selection is performed to decrease feature extraction computational costs, coupled with a classification assessment technique to identify ML model unreliability. Second, physical UAV faults are signaled through a multi-view rationale that evaluates a variety of UAV sensors, triggering alerts based on a sliding-window scheme. Experiments performed on a real quadcopter UAV with a broken-propeller use case show the proposal's feasibility. Our model can decrease false-positive rates to only 0.4%, while also decreasing computational costs by at least 43% compared to traditional techniques. Notwithstanding, it can identify ML model unreliability, signaling the UAV operator when model fine-tuning is needed.
|
|
08:30-10:10, Paper TuPO1S-12.3 | Add to My Program |
Variable Admittance Interaction Control of UAVs Via Deep Reinforcement Learning |
|
Feng, Yuting | Beijing Institute of Technology |
Shi, Chuanbeibei | Univeristy of Toronto |
Du, Jianrui | Beijing Institute of Technology |
Yu, Yushu | Beijing Institute of Technology |
Sun, Fuchun | Tsinghua University |
Song, Yixu | Tsinghua University |
Keywords: Reinforcement Learning, Contact Modeling
Abstract: A compliant control model based on reinforcement learning (RL) is proposed to allow robots to interact with the environment more effectively and to autonomously execute force control tasks. The admittance model learns an optimal adjustment policy for interactions with the external environment using RL algorithms. The model combines energy consumption and trajectory tracking of the agent state in a cost function. As a result, an Unmanned Aerial Vehicle (UAV) can operate stably in unknown environments where interaction forces exist. Furthermore, the model ensures that the interaction process is safe, comfortable, and flexible while protecting the external structures of the UAV from damage. To evaluate the model's performance, we verified the approach in a simulation environment using a UAV in three external force scenes. We also tested the model across different UAV platforms and various low-level control parameters, and the proposed approach provided the best results.
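For reference, the virtual admittance relation such controllers adapt is typically M x_dd + D x_d + K x = F_ext. The sketch below just integrates that model for fixed, illustrative gains; the paper's contribution, learning to adjust these parameters online with RL, is not shown.

```python
import numpy as np

# Sketch of a virtual admittance model: M x_dd + D x_d + K x = F_ext.
# An RL policy would adjust (M, D, K) online; here the gains are fixed
# and purely illustrative.
M, D, K, dt = 1.0, 8.0, 20.0, 1e-3

x, xd = 0.0, 0.0                          # compliant offset and its rate
trajectory = []
for step in range(3000):
    F_ext = 5.0 if step < 1500 else 0.0   # a 5 N push, then release
    xdd = (F_ext - D * xd - K * x) / M
    xd += xdd * dt
    x += xd * dt
    trajectory.append(x)

print(f"compliant offset under load: {max(trajectory):.3f} m "
      f"(steady state approx F/K = {5.0 / 20.0} m)")
```

Stiffer K makes the vehicle hold position against pushes; softer K makes it yield, which is exactly the trade-off the learned policy modulates.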
|
|
08:30-10:10, Paper TuPO1S-12.4 | Add to My Program |
Learning Tethered Perching for Aerial Robots |
|
Hauf, Fabian | Imperial College London |
Kocer, Basaran Bahadir | Imperial College London |
Nguyen, Hai-Nguyen (Hann) | CNRS |
Pang, Oscar Kwong Fai | Imperial College London |
Clark, Ronald | University of Oxford |
Johns, Edward | Imperial College London |
Kovac, Mirko | Imperial College London |
Keywords: Aerial Systems: Applications, Aerial Systems: Mechanics and Control
Abstract: Aerial robots have a wide range of applications, such as collecting data in hard-to-reach areas, which require the longest possible operation time. However, because currently available commercial batteries have a limited specific energy of roughly 300 Wh kg^{-1}, a drone's flight time is a bottleneck for sustainable long-term data collection. Inspired by birds in nature, a possible approach to tackle this challenge is to perch drones on trees and on environmental or man-made structures to save energy whilst in operation. In this paper, we propose an algorithm to automatically generate trajectories for a drone to perch on a tree branch, using the proposed tethered perching mechanism with a pendulum-like structure. This enables a drone to perform an energy-optimised, controlled 180-degree flip to disarm safely upside down. To fine-tune a set of reachable trajectories
|
|
08:30-10:10, Paper TuPO1S-12.5 | Add to My Program |
Credible Online Dynamics Learning for Hybrid UAVs |
|
Rohr, David | ETH Zurich |
Lawrance, Nicholas | CSIRO Data61 |
Andersson, Olov | ETH Zürich |
Siegwart, Roland | ETH Zurich |
Keywords: Aerial Systems: Mechanics and Control, Model Learning for Control, Machine Learning for Robot Control
Abstract: Hybrid unmanned aerial vehicles (H-UAVs) are highly versatile platforms with the ability to transition between rotary- and fixed-wing flight. However, their (aero)dynamics tend to be highly nonlinear, which increases the risk of introducing safety-critical modeling errors in a controller. Designing a safe, yet not overly cautious, controller requires a credible model that provides accurate dynamics uncertainty quantification. We present a data-efficient, probabilistic, semi-parametric dynamics modeling approach that allows for online, filter-based inference. The proposed model leverages prior knowledge through a nominal parametric model and combines it with residuals in the form of sparse Gaussian processes (GPs) to account for possibly unmodeled forces and moments. Uncertain nominal and residual parameters are jointly estimated using Bayesian filtering. The resulting model accuracy and the reliability of its predicted uncertainty are analyzed for both a simulated and a real example, where we learn the 6-DoF nonlinear dynamics of a tiltwing H-UAV from a few minutes of flight data. Compared to a residual-free nominal model, the proposed semi-parametric approach provides increased model accuracy in relevant parts of the flight envelope and substantially higher credibility overall.
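The semi-parametric idea can be illustrated compactly: fit a GP only to the residual between measured data and a nominal parametric model, so the GP captures unmodeled effects and reports predictive uncertainty. The sketch below uses synthetic data and scikit-learn's exact GP rather than the paper's sparse GPs with Bayesian filtering; the drag model and all coefficients are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Semi-parametric sketch (not the authors' implementation): GP on the
# residual between measured acceleration and a nominal model.
rng = np.random.default_rng(0)

def nominal_model(v):
    """Nominal drag model with an assumed coefficient."""
    return -0.8 * v * np.abs(v)

v = rng.uniform(-5, 5, size=(80, 1))                  # airspeed samples
true_accel = -0.8 * v * np.abs(v) - 0.3 * v + 0.05 * rng.normal(size=v.shape)
residual = true_accel - nominal_model(v)              # what the GP must learn

gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(0.01))
gp.fit(v, residual.ravel())

v_test = np.array([[3.0]])
res_mean, res_std = gp.predict(v_test, return_std=True)
pred = nominal_model(v_test).ravel() + res_mean
print(f"predicted accel at v=3: {pred[0]:.3f} +/- {res_std[0]:.3f} "
      f"(ground truth: {-0.8 * 9 - 0.9:.3f})")
```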
|
|
08:30-10:10, Paper TuPO1S-12.6 | Add to My Program |
AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning |
|
Wang, Xijun | University of Maryland, College Park |
Xian, Ruiqi | University of Maryland-College Park |
Guan, Tianrui | University of Maryland |
de Melo, Celso | CCDC US Army Research Laboratory |
Nogar, Stephen | CCDC U.S. Army Research Laboratory |
Bera, Aniket | Purdue University |
Manocha, Dinesh | University of Maryland |
Keywords: Aerial Systems: Applications, Aerial Systems: Perception and Autonomy, Surveillance Robotic Systems
Abstract: We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also present an efficient temporal reasoning algorithm to capture the action information along the spatial and temporal domains within a controllable computational cost. Our approach has been implemented and evaluated both on the desktop with high-end GPUs and on the low power Robotics RB5 Platform for robots and drones. In practice, we achieve 6.1-7.4% improvement over SOTA in Top-1 accuracy on the RoCoG-v2 dataset, 8.3-10.4% improvement on the UAV-Human dataset and 3.2% improvement on the Drone Action dataset.
|
|
08:30-10:10, Paper TuPO1S-12.7 | Add to My Program |
Follow the Rules: Online Signal Temporal Logic Tree Search for Guided Imitation Learning in Stochastic Domains |
|
Aloor, Jasmine Jerry | Massachusetts Institute of Technology |
Patrikar, Jay | Carnegie Mellon University |
Kapoor, Parv | Carnegie Mellon University |
Oh, Jean | Carnegie Mellon University |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Imitation Learning, Aerial Systems: Applications, Formal Methods in Robotics and Automation
Abstract: Seamlessly integrating rules into Learning-from-Demonstrations (LfD) policies is a critical requirement for enabling the real-world deployment of AI agents. Recently, Signal Temporal Logic (STL) has been shown to be an effective language for encoding rules as spatio-temporal constraints. This work uses Monte Carlo Tree Search (MCTS) as a means of integrating STL specifications into a vanilla LfD policy to improve constraint satisfaction. We propose augmenting the MCTS heuristic with STL robustness values to bias the tree search towards branches with higher constraint satisfaction. While the domain-independent method can be applied to integrate STL rules online into any pre-trained LfD algorithm, we choose goal-conditioned Generative Adversarial Imitation Learning as the offline LfD policy. We apply the proposed method to the domain of planning trajectories for General Aviation aircraft around a non-towered airfield. Results using a simulator trained on real-world data show 60% improved performance over baseline LfD methods that do not use STL heuristics.
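STL robustness is a signed margin by which a trajectory satisfies a rule, computed with standard min/max quantitative semantics. The sketch below shows the two most common templates and hints, in a comment, at how such values could bias a tree search; the exact weighting in the paper's MCTS heuristic is a design choice not reproduced here.

```python
import numpy as np

# Standard quantitative (robustness) semantics for two STL templates.
def rob_always(signal, threshold):
    """Robustness of G (x >= threshold): worst-case margin over the trace."""
    return float(np.min(np.asarray(signal) - threshold))

def rob_eventually(signal, threshold):
    """Robustness of F (x >= threshold): best-case margin over the trace."""
    return float(np.max(np.asarray(signal) - threshold))

altitudes = [120.0, 118.0, 115.0, 121.0]     # states along one tree branch
r = rob_always(altitudes, 110.0)             # rule: "always stay above 110 m"
print(f"robustness: {r:.1f} (positive = satisfied with margin)")

# In an STL-augmented MCTS, a branch's selection score could then be, e.g.,
#   score = value_estimate + exploration_bonus + w * robustness
# so branches with larger satisfaction margins are expanded first
# (the weight w and the exact combination are design choices).
```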
|
|
08:30-10:10, Paper TuPO1S-12.8 | Add to My Program |
Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking |
|
Fu, Changhong | Tongji University |
Cai, Mutian | Tongji University |
Li, Sihang | Tongji University |
Lu, Kunhan | Tongji University |
Zuo, Haobo | Tongji University |
Liu, Chongjun | Harbin Engineering University |
Keywords: Visual Learning, Aerial Systems: Perception and Autonomy, Aerial Systems: Applications
Abstract: Unmanned aerial vehicle (UAV) tracking is crucial for autonomous navigation and has broad applications in robotic automation. However, reliable UAV tracking remains a challenging task due to various difficulties such as frequent occlusion and aspect-ratio change. Additionally, most existing work focuses on explicit information to improve tracking performance, ignoring potential interframe connections. To address the above issues, this work proposes a novel framework with continuity-aware latent interframe information mining for reliable UAV tracking, i.e., ClimRT. Specifically, a new, efficient continuity-aware latent interframe information mining network (ClimNet) is proposed for UAV tracking, which can generate a highly effective latent frame between two adjacent frames. Besides, a novel location-continuity Transformer (LCT) is designed to fully explore continuity-aware spatial-temporal information, thereby markedly enhancing UAV tracking. Extensive qualitative and quantitative experiments on three authoritative aerial benchmarks strongly validate the robustness and reliability of ClimRT in UAV tracking. Furthermore, real-world tests on an aerial platform validate its practicability and effectiveness. The code and demo materials are released at https://github.com/vision4robotics/ClimRT.
|
|
08:30-10:10, Paper TuPO1S-12.9 | Add to My Program |
Weighted Maximum Likelihood for Controller Tuning |
|
Romero, Angel | University of Zurich |
Govil, Shreedhar | University of Zurich |
Yilmaz, Gonca | University of Zurich |
Song, Yunlong | University of Zurich |
Scaramuzza, Davide | University of Zurich |
Keywords: Machine Learning for Robot Control, Aerial Systems: Mechanics and Control
Abstract: Recently, Model Predictive Contouring Control (MPCC) has emerged as the state-of-the-art approach for model-based agile flight. MPCC benefits from great flexibility in trading off progress maximization against path following at runtime, without relying on globally optimized trajectories. However, finding the optimal set of tuning parameters for MPCC is challenging because (i) the full quadrotor dynamics are nonlinear, (ii) the cost function is highly non-convex, and (iii) the hyperparameter space is high-dimensional. This paper leverages a probabilistic policy search method, Weighted Maximum Likelihood (WML), to automatically learn the optimal objective for MPCC. WML is sample-efficient due to its closed-form solution for updating the learning parameters. Additionally, the data efficiency provided by the use of a model-based approach allows us to train directly in a high-fidelity simulator, which in turn makes our approach transfer zero-shot to the real world. We validate our approach in the real world, where we show that our method outperforms both the previous manually tuned controller and a state-of-the-art auto-tuning baseline, reaching speeds of 75 km/h.
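A generic weighted maximum likelihood update of this flavor is short enough to sketch: sample candidate parameters from a Gaussian, weight each by its exponentiated reward, and refit the Gaussian in closed form. The toy reward below stands in for "fly a lap with these MPCC weights in simulation"; the temperature, population size, and dimensions are illustrative assumptions.

```python
import numpy as np

# Sketch of a WML-style update for a Gaussian search distribution over
# controller parameters. The reward function is a stand-in; the paper
# evaluates MPCC laps in a high-fidelity simulator instead.
rng = np.random.default_rng(1)

def reward(theta):
    return -np.sum((theta - np.array([2.0, -1.0])) ** 2)  # unknown optimum

mu, cov = np.zeros(2), np.eye(2) * 4.0
for iteration in range(20):
    thetas = rng.multivariate_normal(mu, cov, size=64)
    R = np.array([reward(th) for th in thetas])
    # Exponentiated, normalized weights (temperature is a tuning knob).
    w = np.exp((R - R.max()) / 1.0)
    w /= w.sum()
    # Closed-form weighted maximum likelihood fit of the Gaussian.
    mu = w @ thetas
    diff = thetas - mu
    cov = (w[:, None] * diff).T @ diff + 1e-6 * np.eye(2)

print("learned parameters:", np.round(mu, 2))   # should approach [2, -1]
```

The closed-form mean/covariance refit is what makes the method sample-efficient compared with gradient-based hyperparameter search.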
|
|
08:30-10:10, Paper TuPO1S-12.10 | Add to My Program |
User-Conditioned Neural Control Policies for Mobile Robotics |
|
Bauersfeld, Leonard | University of Zurich (UZH), |
Kaufmann, Elia | University of Zurich |
Scaramuzza, Davide | University of Zurich |
Keywords: Machine Learning for Robot Control, Aerial Systems: Mechanics and Control, Reinforcement Learning
Abstract: Recently, learning-based controllers have been shown to push mobile robotic systems to their limits and provide the robustness needed for many real-world applications. However, only classical optimization-based control frameworks offer the inherent flexibility to be dynamically adjusted during execution by, for example, setting target speeds or actuator limits. We present a framework that overcomes this shortcoming of neural controllers by conditioning them on an auxiliary input. This advance is enabled by including a feature-wise linear modulation (FiLM) layer. We use model-free reinforcement learning to train quadrotor control policies for the task of navigating through a sequence of waypoints in minimum time. By conditioning the policy on the maximum available thrust or the viewing direction relative to the next waypoint, a user can regulate the aggressiveness of the quadrotor's flight during deployment. We demonstrate in simulation and in real-world experiments that a single control policy can achieve close to time-optimal flight performance across the entire performance envelope of the robot, reaching up to 60 km/h and 4.5 g in acceleration. The ability to guide a learned controller during task execution has implications beyond agile quadrotor flight, as conditioning the control policy on human intent helps safely bring learning-based systems out of the well-defined laboratory environment and into the wild.
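FiLM itself is a one-line transformation: an auxiliary input is mapped to per-feature scales and shifts that modulate a hidden layer. A minimal sketch follows, with random weights and a hypothetical thrust-limit input purely for illustration; the paper's networks are trained end-to-end with RL.

```python
import numpy as np

# Sketch of feature-wise linear modulation (FiLM): an auxiliary input
# (e.g. a user-set thrust limit) produces per-feature scales gamma and
# shifts beta that modulate hidden activations of the policy network.
# Weights here are random, purely for illustration.
rng = np.random.default_rng(0)
hidden_dim, cond_dim = 8, 1

W_gamma = rng.normal(size=(hidden_dim, cond_dim)) * 0.1
W_beta = rng.normal(size=(hidden_dim, cond_dim)) * 0.1

def film(features, condition):
    """FiLM: y = gamma(c) * x + beta(c), applied feature-wise."""
    gamma = 1.0 + W_gamma @ condition   # scale, centered at identity
    beta = W_beta @ condition           # shift
    return gamma * features + beta

x = rng.normal(size=hidden_dim)         # hidden activations of the policy
for thrust_limit in (0.3, 1.0):         # hypothetical auxiliary input
    y = film(x, np.array([thrust_limit]))
    print(f"thrust limit {thrust_limit}: modulated activation norm {np.linalg.norm(y):.3f}")
```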
|
|
08:30-10:10, Paper TuPO1S-12.11 | Add to My Program |
Training Efficient Controllers Via Analytic Policy Gradient |
|
Wiedemann, Nina | Robotics and Perception Group, University of Zürich |
Wüest, Valentin | EPFL |
Loquercio, Antonio | UC Berkeley |
Müller, Matthias | Intel |
Floreano, Dario | Ecole Polytechnique Federal, Lausanne |
Scaramuzza, Davide | University of Zurich |
Keywords: Machine Learning for Robot Control, Aerial Systems: Mechanics and Control, Robust/Adaptive Control
Abstract: Control design for robotic systems is complex and often requires solving an optimization to follow a trajectory accurately. Online optimization approaches like Model Predictive Control (MPC) have been shown to achieve great tracking performance, but require high computing power. Conversely, learning-based offline optimization approaches, such as Reinforcement Learning (RL), allow fast and efficient execution on the robot but hardly match the accuracy of MPC in trajectory tracking tasks. In systems with limited compute, such as aerial vehicles, an accurate controller that is efficient at execution time is imperative. We propose an Analytic Policy Gradient (APG) method to tackle this problem. APG exploits the availability of differentiable simulators by training a controller offline with gradient descent on the tracking error. We address training instabilities that frequently occur with APG through curriculum learning and experiment on a widely used controls benchmark, the CartPole, and two common aerial robots, a quadrotor and a fixed-wing drone. Our proposed method outperforms both model-based and model-free RL methods in terms of tracking error. Concurrently, it achieves similar performance to MPC while requiring more than an order of magnitude less computation time. Our work provides insights into the potential of APG as a promising control method for robotics. To facilitate the exploration of APG, we open-source our code and make it available at https://github.com/lis-epfl/apg_trajectory_tracking.
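The core APG mechanic, backpropagating a tracking loss through an unrolled differentiable simulator, fits in a few lines on a toy system. The sketch below uses a scalar single-integrator with a hand-coded adjoint pass, a deliberate simplification of the paper's neural policies and full quadrotor/fixed-wing simulators; the dynamics, horizon, and learning rate are all illustrative.

```python
import numpy as np

# Toy Analytic Policy Gradient sketch: differentiate a tracking loss
# through an unrolled "simulator" (a single-integrator point mass) and
# run gradient descent on a scalar feedback gain k, where u = k*(x - x_ref).
dt, T, x_ref = 0.05, 60, 1.0
k = 0.0

for epoch in range(300):
    # Forward rollout, storing states for the backward pass.
    xs = [0.0]
    for t in range(T):
        xs.append(xs[-1] + dt * k * (xs[-1] - x_ref))
    # Backward (adjoint) pass: lam = dL/dx for L = sum_t (x_t - x_ref)^2.
    grad_k, lam = 0.0, 2 * (xs[T] - x_ref)
    for t in reversed(range(T)):
        grad_k += lam * dt * (xs[t] - x_ref)   # via u_t = k * (x_t - x_ref)
        lam = 2 * (xs[t] - x_ref) + lam * (1 + dt * k)
    k -= 1e-3 * grad_k                         # analytic gradient step

print(f"learned gain k = {k:.2f}, final tracking error = {abs(xs[-1] - x_ref):.4f}")
```

Unlike model-free RL, every update here uses the exact gradient of the loss through the dynamics, which is where the method's efficiency comes from.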
|
|
08:30-10:10, Paper TuPO1S-12.12 | Add to My Program |
Parallel Reinforcement Learning Simulation for Visual Quadrotor Navigation |
|
Saunders, Jack | University of Bath |
Saeedi, Sajad | Toronto Metropolitan University |
Li, Wenbin | University of Bath |
Keywords: Reinforcement Learning, Collision Avoidance, Aerial Systems: Perception and Autonomy
Abstract: Reinforcement learning (RL) is an agent-based approach for teaching robots to navigate the physical world. Gathering data for RL is known to be a laborious task, and real-world experiments can be risky. Simulators facilitate the collection of training data in a quicker and more cost-effective manner. However, RL frequently requires a significant number of simulation steps for an agent to become skilful at simple tasks. This is a prevalent issue in RL-based visual quadrotor navigation, where state dimensions are typically very large and dynamic models are complex. Furthermore, rendering images and obtaining the physical properties of the agent can be computationally expensive. To address this, we present a simulation framework, built on AirSim, which provides efficient parallel training. Building on this framework, Ape-X is modified to incorporate parallel training of AirSim environments to make use of numerous networked computers. Through experiments, we were able to reduce training time from 3.9 hours to 11 minutes on a toy problem, using the aforementioned framework with a total of 74 agents and two networked computers.
|
|
TuPO1S-13 Poster Session, Room T8 |
Add to My Program |
Multi-Robot Systems I |
|
|
|
08:30-10:10, Paper TuPO1S-13.1 | Add to My Program |
Toward Efficient Physical and Algorithmic Design of Automated Garages |
|
Guo, Teng | Rutgers University |
Yu, Jingjin | Rutgers University |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Planning, Scheduling and Coordination, Multi-Robot Systems
Abstract: Parking in large metropolitan areas is often a time-consuming task with further implications for traffic patterns that affect urban landscaping. Reducing the premium space needed for parking has led to the development of automated mechanical parking systems. Compared to regular garages with one or two rows of vehicles per island, automated garages can have multiple rows of vehicles stacked together to support higher parking demands. Although this multi-row layout reduces parking space, it makes parking and retrieval more complicated. In this work, we propose an automated garage design that supports nearly 100% parking density. Modeling the problem of parking and retrieving multiple vehicles as a special class of multi-robot path planning problem, we propose associated algorithms for handling all common operations of the automated garage, including (1) an optimal algorithm and near-optimal methods that find feasible and efficient solutions for simultaneous parking/retrieval, and (2) a novel shuffling mechanism to rearrange vehicles to facilitate scheduled retrieval at rush hours. We conduct thorough simulation studies showing that the proposed methods are promising for large, high-density, real-world parking applications.
|
|
08:30-10:10, Paper TuPO1S-13.2 | Add to My Program |
Chronos and CRS: Design of a Miniature Car-Like Robot and a Software Framework for Single and Multi-Agent Robotics and Control |
|
Carron, Andrea | ETH Zurich |
Sabrina, Bodmer | ETH Zurich |
Vogel, Lukas | ETH Zurich |
Zurbruegg, René | ETH Zurich |
Helm, David | ETH Zürich |
Rickenbach, Rahel | ETH Zurich |
Muntwiler, Simon | ETH Zurich |
Sieber, Jerome | ETH Zurich |
Zeilinger, Melanie N. | ETH Zurich |
Keywords: Education Robotics, Hardware-Software Integration in Robotics, Wheeled Robots
Abstract: From both an educational and research point of view, experiments on hardware are a key aspect of robotics and control. In the last decade, many open-source hardware and software frameworks for wheeled robots have been presented, mainly in the form of unicycles and car-like robots, with the goal of making robotics accessible to a wider audience and to support control systems development. Unicycles are usually small and inexpensive, and therefore facilitate experiments in a larger fleet, but they are not suited for high-speed motion. Car-like robots are more agile, but they are usually larger and more expensive, thus requiring more resources in terms of space and money. In order to bridge this gap, we present Chronos, a new car-like 1/28th scale robot with customized open-source electronics, and CRS, an open-source software framework for control and robotics. The CRS software framework includes the implementation of various state-of-the-art algorithms for control, estimation, and multi-agent coordination. With this work, we aim to provide easier access to hardware and reduce the engineering time needed to start new educational and research projects.
|
|
08:30-10:10, Paper TuPO1S-13.3 | Add to My Program |
Multi-Agent Path Integral Control for Interaction-Aware Motion Planning in Urban Canals |
|
Streichenberg, Lucas Michael | ETH Zurich |
Trevisan, Elia | Delft University of Technology |
Chung, Jen Jen | The University of Queensland |
Siegwart, Roland | ETH Zurich |
Alonso-Mora, Javier | Delft University of Technology |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Collision Avoidance, Motion and Path Planning
Abstract: Autonomous vehicles that operate in urban environments must comply with existing rules and reason about their interactions with other decision-making agents. In this paper, we introduce a decentralized and communication-free interaction-aware motion planner and apply it to Autonomous Surface Vessels (ASVs) in urban canals. We build upon a sampling-based method, namely Model Predictive Path Integral control (MPPI), and employ it to compute, at each time instance, both a collision-free trajectory for the vehicle and a prediction of other agents' trajectories, thus modeling interactions. To improve the method's efficiency in multi-agent scenarios, we introduce a two-stage sample evaluation strategy and define an appropriate cost function to achieve rule compliance. We evaluate this decentralized approach in simulations with multiple vessels in real scenarios extracted from Amsterdam's canals, showing superior performance over a state-of-the-art trajectory optimization framework and robustness when encountering different types of agents.
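The sampling-based core the paper builds on is compact enough to sketch: sample noisy control sequences, roll them through the dynamics, and average them with softmax weights on cost. The vessel dynamics below are reduced to a 2D single integrator, and the cost, horizon, and temperature are stand-ins; the paper's interaction-aware costs and two-stage evaluation are not reproduced.

```python
import numpy as np

# Minimal receding-horizon MPPI sketch with toy 2D integrator dynamics.
# All constants are illustrative, not the paper's.
rng = np.random.default_rng(0)
H, K, lam, sigma, dt = 20, 256, 1.0, 0.5, 0.1
goal = np.array([5.0, 2.0])

def rollout_cost(x0, U):
    """Cost of one control sequence: distance to goal plus control effort."""
    x, cost = x0.copy(), 0.0
    for u in U:
        x = x + dt * u                            # toy dynamics
        cost += np.sum((x - goal) ** 2) + 0.01 * np.sum(u ** 2)
    return cost

x0 = np.zeros(2)
U_nom = np.zeros((H, 2))                          # nominal control sequence
for iteration in range(30):
    noise = rng.normal(0, sigma, size=(K, H, 2))
    costs = np.array([rollout_cost(x0, U_nom + eps) for eps in noise])
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    # Information-theoretic update: cost-weighted average of the samples.
    U_nom = U_nom + np.tensordot(w, noise, axes=1)
    x0 = x0 + dt * U_nom[0]                       # execute first control, re-plan
    U_nom = np.roll(U_nom, -1, axis=0); U_nom[-1] = 0.0

print("position after 30 replanning steps:", np.round(x0, 2))
```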
|
|
08:30-10:10, Paper TuPO1S-13.4 | Add to My Program |
Mixed Observable RRT: Multi-Agent Mission-Planning in Partially Observable Environments |
|
Johansson, Kasper | Stanford University |
Rosolia, Ugo | Caltech |
Ubellacker, Wyatt | California Institute of Technology |
Singletary, Andrew | California Institute of Technology |
Ames, Aaron | California Institute of Technology |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty, Multi-Robot Systems
Abstract: This paper considers centralized mission-planning for a heterogeneous multi-agent system with the aim of locating a hidden target. We propose a mixed observable setting, consisting of a fully observable state-space and a partially observable environment, using a hidden Markov model. First, we construct rapidly exploring random trees (RRTs) to introduce the mixed observable RRT for finding plausible mission plans giving way-points for each agent. Leveraging this construction, we present a path-selection strategy based on a dynamic programming approach, which accounts for the uncertainty from partial observations and minimizes the expected cost. Finally, we combine the high-level plan with model predictive control algorithms to evaluate the approach on an experimental setup consisting of a quadruped robot and a drone. It is shown that agents are able to make intelligent decisions to explore the area efficiently and to locate the target through collaborative actions.
|
|
08:30-10:10, Paper TuPO1S-13.5 | Add to My Program |
RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments |
|
Agrawal, Aakriti | University of Maryland, College Park |
Bedi, Amrit Singh | University of Maryland, College Park |
Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: We present a novel reinforcement learning based algorithm for the multi-robot task allocation problem in warehouse environments. We formulate it as a Markov Decision Process and solve it via a novel deep multi-agent reinforcement learning method (called RTAW) with an attention-inspired policy architecture. Hence, our proposed policy network uses global embeddings that are independent of the number of robots/tasks. We utilize a proximal policy optimization algorithm for training and use a carefully designed reward to obtain a converged policy. The converged policy ensures cooperation among different robots to minimize total travel delay (TTD), which ultimately improves the makespan for a sufficiently large task list. In our extensive experiments, we compare the performance of our RTAW algorithm to state-of-the-art methods such as myopic pickup distance minimization (greedy) and regret-based baselines on different navigation schemes. We show an improvement of up to 14% (25-1000 seconds) in TTD on scenarios with hundreds or thousands of tasks for different challenging warehouse layouts and task generation schemes. We also demonstrate the scalability of our approach by showing performance with up to 1000 robots in simulations.
|
|
08:30-10:10, Paper TuPO1S-13.6 | Add to My Program |
Hybrid SUSD-Based Task Allocation for Heterogeneous Multi-Robot Teams |
|
Chen, Shengkang | Georgia Tech |
Lin, Tony X. | Georgia Institute of Technology |
Al-Abri, Said | Georgia Institute of Technology |
Arkin, Ronald | Georgia Tech |
Zhang, Fumin | Georgia Institute of Technology |
Keywords: Multi-Robot Systems, Task and Motion Planning
Abstract: Effective task allocation is an essential component of the coordination of heterogeneous robots. This paper proposes a hybrid task allocation algorithm that improves upon given initial solutions, for example from the popular decentralized market-based allocation algorithm, via a derivative-free optimization strategy called Speeding-Up and Slowing-Down (SUSD). Starting from the initial solutions, SUSD performs a search to find an improved task assignment. Unique to our strategy is the ability to apply a gradient-like search to solve a classical integer-programming problem. The proposed strategy outperforms other state-of-the-art algorithms in terms of total task utility and can achieve near-optimal solutions in simulation. Experimental results using the Robotarium are also provided.
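For intuition, here is a heavily simplified, continuous-variable illustration of the SUSD principle; it assumes a low-dimensional scalar cost field and a small group of candidate solutions, whereas the paper applies the idea to a discrete assignment space. Each candidate moves along a shared direction with a speed that grows with its measured cost, which is what gives the search its gradient-like behavior without derivatives.

```python
import numpy as np

def susd_step(X, f, dt=0.05, a=0.1, b=1.0):
    """One simplified Speeding-Up/Slowing-Down (SUSD) step.
    X: (N, d) candidate positions; f: scalar cost to descend.
    Candidates at higher-cost locations move faster along a shared
    direction n, rotating the group toward descent over iterations."""
    z = np.array([f(x) for x in X])
    zn = (z - z.min()) / (z.max() - z.min() + 1e-9)  # normalized costs in [0, 1]
    # Shared direction: least-variance direction of the group's spread.
    # Note: the sign of n is ambiguous in this sketch; the full SUSD
    # dynamics resolve it through the coupled motion of the group.
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0))
    n = Vt[-1]
    return X + dt * (a + b * zn)[:, None] * n  # speed up where cost is high
```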
|
|
08:30-10:10, Paper TuPO1S-13.7 | Add to My Program |
Search Algorithms for Multi-Agent Teamwise Cooperative Path Finding |
|
Ren, Zhongqiang | Carnegie Mellon University |
Zhang, Chaoran | Carnegie Mellon University |
Rathinam, Sivakumar | TAMU |
Choset, Howie | Carnegie Mellon University |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Planning, Scheduling and Coordination, Motion and Path Planning
Abstract: Multi-Agent Path Finding (MA-PF) computes a set of collision-free paths for multiple agents from their respective starting locations to destinations. This paper considers a generalization of MA-PF called Multi-Agent Teamwise Cooperative Path Finding (MA-TC-PF), where agents are grouped as multiple teams and each team has its own objective to be minimized. For example, an objective can be the sum or max of the individual arrival times of the agents. In general, there is more than one team, and MA-TC-PF is thus a multi-objective planning problem with the goal of finding the entire Pareto-optimal front that represents all possible trade-offs among the objectives of the teams. To solve MA-TC-PF, we propose two algorithms, TC-CBS and TC-M*, which leverage the existing CBS and M* algorithms for conventional MA-PF. We discuss the conditions under which the proposed algorithms are complete and are guaranteed to find the Pareto-optimal front. We present numerical results for several types of MA-TC-PF problems.
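Since the goal is the entire Pareto-optimal front, a dominance test is the basic building block. The helper below is a generic post-hoc filter over cost vectors, shown only to make the Pareto-optimality criterion concrete; the paper's TC-CBS and TC-M* maintain such a front during search rather than filtering afterwards.

```python
def pareto_front(candidates):
    """Keep only non-dominated cost vectors: a vector is dropped if some
    other vector is <= in every objective and strictly < in at least one."""
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]
```

For example, `pareto_front([(3, 5), (4, 4), (5, 5)])` returns `[(3, 5), (4, 4)]`, since `(5, 5)` is dominated in both objectives by `(4, 4)`.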
|
|
08:30-10:10, Paper TuPO1S-13.8 | Add to My Program |
Collaborative Scheduling with Adaptation to Failure for Heterogeneous Robot Teams |
|
Gao, Peng | University of Maryland, College Park |
Siva, Sriram | Colorado School of Mines |
Micciche, Anthony | University of Massachusetts Amherst |
Zhang, Hao | Colorado School of Mines |
Keywords: Multi-Robot Systems, Imitation Learning
Abstract: Collaborative scheduling is an essential ability for a team of heterogeneous robots to collaboratively complete complex tasks, e.g., in a multi-robot assembly application. To enable collaborative scheduling, two key problems must be addressed: allocating tasks to heterogeneous robots, and adapting to robot failures in order to guarantee the completion of all tasks. In this paper, we introduce a novel approach that integrates deep bipartite graph matching and imitation learning for heterogeneous robots to complete complex tasks as a team. Specifically, we use a graph attention network to represent the attributes and relationships of the tasks. Then, we formulate collaborative scheduling with failure adaptation as a new deep-learning-based bipartite graph matching problem, which learns a policy by imitation to determine task scheduling based on the reward of potential task schedules. During normal execution, our approach generates robot-task pairs as potential allocations. When a robot fails, our approach identifies not only individual robots but also subteams to replace the failed robot. We conduct extensive experiments to evaluate our approach in scenarios of collaborative scheduling with robot failures. Experimental results show that our approach achieves promising, generalizable and scalable results on collaborative scheduling with robot failure adaptation.
|
|
08:30-10:10, Paper TuPO1S-13.9 | Add to My Program |
AMSwarm: An Alternating Minimization Approach for Safe Motion Planning of Quadrotor Swarms in Cluttered Environments |
|
Adajania, Vivek Kantilal | University of Toronto |
Zhou, Siqi | University of Toronto |
Singh, Arun Kumar | University of Tartu |
Schoellig, Angela P. | TU Munich |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Motion and Path Planning, Collision Avoidance
Abstract: This paper presents a scalable online algorithm to generate safe and kinematically feasible trajectories for quadrotor swarms. Existing approaches rely on linearizing Euclidean distance-based collision constraints and on axis-wise decoupling of kinematic constraints to reduce the trajectory optimization problem for each quadrotor to a quadratic program (QP). This conservative approximation often fails to find a solution in cluttered environments. We present a novel alternative that handles collision constraints without linearization and kinematic constraints in their quadratic form while still retaining the QP form. We achieve this by reformulating the constraints in a polar form and applying an Alternating Minimization algorithm to the resulting problem. Through extensive simulation results, we demonstrate that, compared to Sequential Convex Programming (SCP) baselines, our approach achieves, on average, a 72% improvement in success rate, a 36% reduction in mission time, and a 42-times faster per-agent computation time. We also show that collision constraints derived from discrete-time barrier functions (BF) can be incorporated, leading to different safety behaviours without significant computational overhead. Moreover, our optimizer outperforms the state-of-the-art optimal control solver ACADO in handling BF constraints, with a 31-times faster per-agent computation time and a 44% reduction in mission time on average. We experimentally validated our approach on a Crazyflie quadrotor swarm of up to 12 quadrotors. The code, together with supplementary material and a video, is released for reference.
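To make the polar reformulation concrete: a pairwise constraint ||p_i - p_j|| >= r can be rewritten as p_i - p_j = r d [cos a, sin a] with d >= 1, and alternating minimization then updates (a, d) in closed form while the trajectories are held fixed. The 2D sketch below shows only that closed-form update under these assumptions; it is not the paper's full quadrotor formulation.

```python
import numpy as np

def update_polar_vars(p_i, p_j, r):
    """With trajectories fixed, re-estimate the polar collision variables
    for one agent pair: the angle follows the actual separation direction,
    and the scale is clipped to d >= 1 so agents stay outside radius r.
    p_i, p_j: (T, 2) position sequences; r: safety radius."""
    rel = p_i - p_j
    alpha = np.arctan2(rel[:, 1], rel[:, 0])              # separation angles
    d = np.maximum(np.linalg.norm(rel, axis=1) / r, 1.0)  # clipped scale
    return alpha, d
```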
|
|
08:30-10:10, Paper TuPO1S-13.10 | Add to My Program |
Decentralized Deadlock-Free Trajectory Planning for Quadrotor Swarm in Obstacle-Rich Environments |
|
Park, Jungwon | Seoul National University |
Jang, Inkyu | Seoul National University |
Kim, H. Jin | Seoul National University |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Distributed Robot Systems, Collision Avoidance
Abstract: This paper presents a decentralized multi-agent trajectory planning (MATP) algorithm that is guaranteed to generate a safe, deadlock-free trajectory in an obstacle-rich environment under a limited communication range. The proposed algorithm utilizes a grid-based multi-agent path planning (MAPP) algorithm for deadlock resolution, and we introduce a subgoal optimization method that makes each agent converge to the waypoint generated from the MAPP without deadlock. In addition, the proposed algorithm ensures the feasibility of the optimization problem and collision avoidance by adopting a linear safe corridor (LSC). We verify that the proposed algorithm does not cause deadlock in either random forests or dense mazes, regardless of communication range, and that it outperforms our previous work in flight time and distance. We validate the proposed algorithm through a hardware demonstration with ten quadrotors.
|
|
08:30-10:10, Paper TuPO1S-13.11 | Add to My Program |
A Negative Imaginary Theory-Based Time-Varying Group Formation Tracking Scheme for Multi-Robot Systems: Applications to Quadcopters |
|
Su, Yu-Hsiang | The University of Manchester |
Bhowmick, Parijat | Indian Institute of Technology Guwahati |
Lanzon, Alexander | The University of Manchester |
Keywords: Multi-Robot Systems, Networked Robots, Swarm Robotics
Abstract: This paper proposes a new methodology to develop a time-varying group formation tracking scheme for a class of multi-agent systems (e.g. different types of multi-robot systems) utilising Negative Imaginary (NI) theory. It offers a two-loop control scheme in which the inner loop deploys an appropriate feedback linearising control law to transform the nonlinear dynamics of each agent into a double integrator system, while the outer loop applies an NI-based time-varying group formation control protocol on the linearised agents. In contrast to existing formation control schemes, this approach offers greater flexibility in choosing a controller, easy implementation and tuning, reduces the overall complexity of the scheme, and uses only output feedback (hence reduced sensing requirements) to achieve formation control. The paper also provides lab-based experimental validation results to demonstrate the feasibility and usefulness of the proposed scheme. Two experiments were conducted on a group of small-scale quadcopters connected via a network to test the time-varying group formation tracking performance.
|
|
08:30-10:10, Paper TuPO1S-13.12 | Add to My Program |
Data-Driven Risk-Sensitive Model Predictive Control for Safe Navigation in Multi-Robot Systems |
|
Navsalkar, Atharva | Indian Institute of Technology Kharagpur |
Hota, Ashish | Indian Institute of Technology (IIT) Kharagpur |
Keywords: Collision Avoidance, Optimization and Optimal Control, Multi-Robot Systems
Abstract: Safe navigation is a fundamental challenge in multi-robot systems due to the uncertainty surrounding the future trajectories of the robots that act as obstacles for each other. In this work, we propose a principled data-driven approach where each robot repeatedly solves a finite-horizon optimization problem subject to collision avoidance constraints, with the latter formulated as the distributionally robust conditional value-at-risk (CVaR) of the distance between the agent and a polyhedral obstacle geometry. Specifically, the CVaR constraints are required to hold for all distributions that are close to the empirical distribution constructed from observed samples of prediction error collected during execution. The generality of the approach allows us to robustify against prediction errors that arise under commonly imposed assumptions in both distributed and decentralized settings. We derive tractable finite-dimensional approximations of this class of constraints by leveraging convex and min-max duality results for Wasserstein distributionally robust optimization problems. The effectiveness of the proposed approach is illustrated in a multi-drone navigation setting implemented on the Gazebo platform.
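For reference, the nominal (non-robust) sample-average version of the constraint quantity is straightforward: CVaR at level alpha is the mean of the worst (1 - alpha) fraction of losses. The sketch below shows only this empirical estimator; the paper's contribution is robustifying it over a Wasserstein ball around the empirical distribution, which this snippet does not do.

```python
import numpy as np

def empirical_cvar(samples, alpha=0.9):
    """Empirical CVaR_alpha of a loss sample: the mean of the worst
    (1 - alpha) fraction of observed losses."""
    losses = np.sort(np.asarray(samples))
    k = max(1, int(np.ceil((1 - alpha) * len(losses))))  # size of the tail
    return losses[-k:].mean()
```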
|
|
TuPO1S-14 Poster Session, Room T8 |
Add to My Program |
Intelligent Transportation Systems I |
|
|
|
08:30-10:10, Paper TuPO1S-14.1 | Add to My Program |
Multi-Modal Hierarchical Transformer for Occupancy Flow Field Prediction in Autonomous Driving |
|
Liu, Haochen | Nanyang Technological University |
Huang, Zhiyu | Nanyang Technological University |
Lv, Chen | Nanyang Technological University |
Keywords: Intelligent Transportation Systems, Computer Vision for Transportation, Deep Learning Methods
Abstract: Forecasting the future states of surrounding traffic participants is a crucial capability for autonomous vehicles. The recently proposed occupancy flow field prediction introduces a scalable and effective representation to jointly predict surrounding agents' future motions in a scene. However, the challenging part is to model the underlying social interactions among traffic agents and the relations between occupancy and flow. Therefore, this paper proposes a novel Multi-modal Hierarchical Transformer network that fuses the vectorized (agent motion) and visual (scene flow, map, and occupancy) modalities and jointly predicts the flow and occupancy of the scene. Specifically, visual and vector features from sensory data are encoded through a multi-stage Transformer module and then a late-fusion Transformer module with temporal pixel-wise attention. Importantly, a flow-guided multi-head self-attention (FG-MSA) module is designed to better aggregate the information on occupancy and flow and model the mathematical relations between them. The proposed method is comprehensively validated on the Waymo Open Motion Dataset and compared against several state-of-the-art models. The results reveal that our model, despite a much more compact architecture and fewer data inputs than other methods, achieves comparable performance. We also demonstrate the effectiveness of incorporating vectorized agent motion features and the proposed FG-MSA module. Compared to the ablated model without the FG-MSA module, which won 2nd place in the 2022 Waymo Occupancy and Flow Prediction Challenge, the current model shows better separability for flow and occupancy and further performance improvements.
|
|
08:30-10:10, Paper TuPO1S-14.2 | Add to My Program |
Annotating Covert Hazardous Driving Scenarios Online: Utilizing the Driver's Electroencephalography (EEG) Signals |
|
Zheng, Chen | Institute for AI Industry Research, Tsinghua University |
Zi, Muxiao | Institute for AI Industry Research, Tsinghua University |
Jiang, Wenjie | Tsinghua University |
Chu, Mengdi | Tsinghua University |
Zhang, Yan | Tsinghua University |
Yuan, Jirui | Tsinghua University |
Zhou, Guyue | Tsinghua University |
Gong, Jiangtao | Tsinghua University |
Keywords: Brain-Machine Interfaces, Intelligent Transportation Systems
Abstract: As autonomous driving systems prevail, it is becoming increasingly critical that these systems learn from databases containing fine-grained driving scenarios. Most databases currently available are human-annotated; they are expensive, time-consuming, and subject to behavioral biases. In this paper, we provide initial evidence supporting a novel technique that utilizes drivers' electroencephalography (EEG) signals to implicitly label hazardous driving scenarios while the drivers passively view recordings of real-road driving, thus sparing the need for manual annotation and avoiding human annotators' behavioral biases during explicit reporting. We conducted an EEG experiment using real-life and animated recordings of driving scenarios and asked participants to report danger explicitly whenever necessary. Behavioral results showed the participants tended to report danger only when overt hazards (e.g., a vehicle or a pedestrian appearing unexpectedly from behind an occlusion) were in view. By contrast, their EEG signals were enhanced at the sight of both an overt hazard and a covert hazard (e.g., an occlusion signalling the possible appearance of a vehicle or a pedestrian from behind). Thus, EEG signals were more sensitive to driving hazards than explicit reports. Further, the Time-Series AI (TSAI, [1]) successfully classified EEG signals corresponding to overt and covert hazards. We discuss future steps necessary to materialize the technique in real life.
|
|
08:30-10:10, Paper TuPO1S-14.3 | Add to My Program |
Pedestrian Crossing Action Recognition and Trajectory Prediction with 3D Human Keypoints |
|
Li, Jiachen | Stanford University |
Shi, Xinwei | Waymo LLC |
Chen, Feiyu | Waymo LLC |
Stroud, Jonathan | Waymo |
Zhang, Zhishuai | Google |
Lan, Tian | Waymo |
Mao, Junhua | Waymo |
Kang, Jeonhyung | Waymo |
Refaat, Khaled | Waymo |
Yang, Weilong | Waymo |
Ie, Eugene | Waymo LLC |
Li, Congcong | Waymo Inc |
Keywords: Intelligent Transportation Systems
Abstract: Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas. In this work, we aim at identifying crossing pedestrians and predicting their future trajectories. To achieve these goals, we not only need the context information of road geometry and other traffic participants but also need fine-grained information of the human pose, motion and activity, which can be inferred from human keypoints. In this paper, we propose a novel multi-task learning framework for pedestrian crossing action recognition and trajectory prediction, which utilizes 3D human keypoints extracted from raw sensor data to capture rich information on human pose and activity. Moreover, we propose to apply two auxiliary tasks and contrastive learning to enable auxiliary supervisions to improve the learned keypoints representation, which further enhances the performance of major tasks. We validate our approach on a large-scale in-house dataset, as well as a public benchmark dataset, and show that our approach achieves state-of-the-art performance on a wide range of evaluation metrics. The effectiveness of each model component is validated in a detailed ablation study.
|
|
08:30-10:10, Paper TuPO1S-14.4 | Add to My Program |
Model-Agnostic Multi-Agent Perception Framework |
|
Xu, Runsheng | UCLA |
Chen, Weizhe | Indiana University Bloomington |
Xiang, Hao | University of California, Los Angeles |
Xia, Xin | University of California, Los Angeles |
Liu, Lantao | Indiana University |
Ma, Jiaqi | University of California, Los Angeles |
Keywords: Intelligent Transportation Systems, Computer Vision for Transportation, Cooperating Robots
Abstract: Existing multi-agent perception systems assume that every agent utilizes the same model with identical parameters and architecture. The performance can be degraded with different perception models due to the mismatch in their confidence scores. In this work, we propose a model-agnostic multi-agent perception framework to reduce the negative effect caused by the model discrepancies without sharing the model information. Specifically, we propose a confidence calibrator that can eliminate the prediction confidence score bias. Each agent performs such calibration independently on a standard public database to protect intellectual property. We also propose a corresponding bounding box aggregation algorithm that considers the confidence scores and the spatial agreement of neighboring boxes. Our experiments shed light on the necessity of model calibration across different agents, and the results show that the proposed framework improves the baseline 3D object detection performance of heterogeneous agents.
|
|
08:30-10:10, Paper TuPO1S-14.5 | Add to My Program |
Explainable Action Prediction through Self-Supervision on Scene Graphs |
|
Kochakarn, Pawit | University of Oxford |
De Martini, Daniele | University of Oxford |
Omeiza, Daniel | University of Oxford |
Kunze, Lars | University of Oxford |
Keywords: Intelligent Transportation Systems, Deep Learning Methods, Representation Learning
Abstract: This work explores scene graphs as a distilled representation of high-level information for autonomous driving, applied to future driver-action prediction. Given the scarcity and strong imbalance of data samples, we propose a self-supervision pipeline to infer representative and well-separated embeddings. Key aspects are interpretability and explainability; as such, we embed in our architecture attention mechanisms that can create spatial and temporal heatmaps on the scene graphs. We evaluate our system on the ROAD dataset against a fully-supervised approach, showing the superiority of our training regime.
|
|
08:30-10:10, Paper TuPO1S-14.6 | Add to My Program |
CueCAn: Cue-Driven Contextual Attention for Identifying Missing Traffic Signs on Unconstrained Roads |
|
Gupta, Varun | IIIT, Hyderabad |
Subramanian, Anbumani | Intel |
Jawahar, C.V. | IIIT, Hyderabad |
Saluja, Rohit | IIIT Hyderabad |
Keywords: Intelligent Transportation Systems, Deep Learning Methods, Data Sets for Robot Learning
Abstract: Unconstrained Asian roads often involve poor infrastructure, affecting overall road safety. Missing traffic signs are a regular part of such roads. Missing or non-existing object detection has been studied for locating missing curbs and estimating reasonable regions for pedestrians on road scene images. Such methods involve analyzing task-specific single-object cues. In this paper, we present the first and most challenging video dataset for missing objects, with multiple types of traffic signs for which the cues are visible without the signs in the scenes. We refer to it as the Missing Traffic Signs Video Dataset (MTSVD). MTSVD is challenging compared to previous works in two aspects: i) the traffic signs are generally not present in the vicinity of their cues, and ii) the traffic signs' cues are diverse and unique. MTSVD is also the first publicly available missing-object dataset. To train the models for identifying missing signs, we complement our dataset with 10K traffic sign tracks, with 40% of the traffic signs having cues visible in the scenes. For identifying missing signs, we propose Cue-driven Contextual Attention units (CueCAn), which we incorporate in our model's encoder. We first train the encoder to classify the presence of traffic sign cues and then train the entire segmentation model end-to-end to localize missing traffic signs. Quantitative and qualitative analysis shows that CueCAn significantly improves the performance of base models.
|
|
08:30-10:10, Paper TuPO1S-14.7 | Add to My Program |
Tackling Clutter in Radar Data - Label Generation and Detection Using PointNet++ |
|
Kopp, Johannes | Ulm University |
Kellner, Dominik | BMW AG |
Piroli, Aldi | Universität Ulm |
Dietmayer, Klaus | University of Ulm |
Keywords: Intelligent Transportation Systems, Object Detection, Segmentation and Categorization, Data Sets for Robot Learning
Abstract: Radar sensors employed for environment perception, e.g. in autonomous vehicles, output a significant amount of unwanted clutter. These points, for which no corresponding real objects exist, are a major source of errors in subsequent processing steps such as object detection or tracking. We therefore present two novel neural network setups for identifying clutter. The input data, network architectures and training configuration are adjusted specifically for this task. Special attention is paid to the downsampling of point clouds composed of multiple sensor scans. In an extensive evaluation, the new setups display substantially better performance than existing approaches. Because there is no suitable public data set in which clutter is annotated, we design a method to automatically generate the respective labels. By applying it to existing data with object annotations and releasing its code, we effectively create the first freely available radar clutter data set representing real-world driving scenarios. Code and instructions are accessible at www.github.com/kopp-j/clutter-ds.
|
|
08:30-10:10, Paper TuPO1S-14.8 | Add to My Program |
Effective Combination of Vertical, Longitudinal and Lateral Data for Vehicle Mass Estimation |
|
El Mrhasli, Younesse | ENSTA PARIS |
Monsuez, Bruno | ENSTA-ParisTech |
Mouton, Xavier | Groupe Renault |
Keywords: Intelligent Transportation Systems, Sensor Fusion, Dynamics
Abstract: Real-time knowledge of the vehicle mass is valuable for several applications, mainly active safety systems design and energy consumption optimization. This work describes a novel strategy for mass estimation in static and dynamic conditions. First, when the vehicle is powered up, an initial estimate is given by observing the variations of one suspension deflection sensor mounted on the rear. Then, the estimate is refined based on conditioned and filtered longitudinal and lateral motions. In this study, we suggest feeding these extracted events to two different algorithms, namely recursive least squares and prior-recursive Bayesian inference, so as to express the results in both a deterministic and a statistical sense. Both simulations and experimental tests show that our approach combines the benefits of various works in the literature, most notably robustness to resistive loads, fast convergence, and minimal instrumentation.
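As a concrete anchor for the first of the two estimators mentioned, below is a textbook scalar recursive-least-squares filter with a forgetting factor, of the kind that can regress vehicle mass from longitudinal-dynamics events (roughly F = m a). The initial values are placeholders; the paper's event conditioning and fusion with suspension and lateral data are not reproduced here.

```python
class RecursiveLeastSquares:
    """Scalar RLS with forgetting factor lam, tracking theta in y = theta * phi."""
    def __init__(self, theta0=1500.0, P0=1e3, lam=0.99):
        self.theta, self.P, self.lam = theta0, P0, lam  # estimate, covariance

    def update(self, phi, y):
        # phi: regressor (e.g., acceleration), y: measurement (e.g., net force)
        K = self.P * phi / (self.lam + phi * self.P * phi)  # gain
        self.theta += K * (y - phi * self.theta)            # correct estimate
        self.P = (self.P - K * phi * self.P) / self.lam     # update covariance
        return self.theta
```

A usage pattern would be `m_hat = rls.update(phi=a_x, y=F_x)` once per validated longitudinal event.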
|
|
08:30-10:10, Paper TuPO1S-14.9 | Add to My Program |
Receding Horizon Planning with Rule Hierarchies for Autonomous Vehicles |
|
Veer, Sushant | NVIDIA |
Leung, Karen | Stanford University, NVIDIA Research, University of Washington |
Cosner, Ryan | California Institute of Technology |
Chen, Yuxiao | California Institute of Technology |
Karkus, Peter | NVIDIA |
Pavone, Marco | Stanford University |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: Autonomous vehicles must often contend with conflicting planning requirements, e.g., safety and comfort could be at odds with each other if avoiding a collision calls for slamming the brakes. To resolve such conflicts, assigning an importance ranking to rules (i.e., imposing a rule hierarchy) has been proposed, which, in turn, induces rankings on trajectories based on the importance of the rules they satisfy. On the one hand, imposing rule hierarchies can enhance interpretability but introduces combinatorial complexity to planning; on the other hand, differentiable reward structures can be leveraged by modern gradient-based optimization tools but are less interpretable and unintuitive to tune. In this paper, we present an approach to equivalently express rule hierarchies as differentiable reward structures amenable to modern gradient-based optimizers, thereby achieving the best of both worlds. We achieve this by formulating rank-preserving reward functions that are monotonic in the rank of the trajectories induced by the rule hierarchy, i.e., higher-ranked trajectories receive higher reward. Equipped with a rule hierarchy and its corresponding rank-preserving reward function, we develop a two-stage planner that can efficiently resolve conflicting planning requirements. We demonstrate that our approach can generate motion plans at ~7-10 Hz in various challenging road navigation and intersection negotiation scenarios.
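A minimal way to see why a rank-preserving reward exists: give rule i (with 0 the most important) weight base^(n-1-i) for base >= 2, so satisfying a higher-priority rule always outweighs satisfying all lower-priority rules combined. The boolean sketch below illustrates only this ordering property; the paper works with differentiable rule-robustness values rather than booleans.

```python
def rank_preserving_reward(rule_satisfaction, base=2.0):
    """Reward monotone in the rank induced by a rule hierarchy.
    rule_satisfaction: booleans ordered from most to least important.
    With base >= 2, weight base**(n-1-i) for rule i guarantees that
    satisfying a higher-priority rule dominates all lower ones combined."""
    n = len(rule_satisfaction)
    return sum(base ** (n - 1 - i) * float(sat)
               for i, sat in enumerate(rule_satisfaction))
```

For instance, with three rules, satisfying only rule 0 scores 4, which exceeds the score of 3 obtained by satisfying rules 1 and 2 together, so trajectory rankings are preserved.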
|
|
08:30-10:10, Paper TuPO1S-14.10 | Add to My Program |
Active Probing and Influencing Human Behaviors Via Autonomous Agents |
|
Wang, Shuangge | University of Southern California |
Lyu, Yiwei | Carnegie Mellon University |
Dolan, John M. | Carnegie Mellon University |
Keywords: Intelligent Transportation Systems, Behavior-Based Systems
Abstract: Autonomous agents (robots) face tremendous challenges while interacting with heterogeneous human agents in close proximity. One of these challenges is that the autonomous agent does not have an accurate model tailored to the specific human it is interacting with, which can result in inefficient human-robot interaction and suboptimal system dynamics. Developing an online method that enables the autonomous agent to learn information about the human model is therefore an ongoing research goal. Existing approaches position the robot as a passive learner in the environment that observes the physical states and the associated human responses. This passive design, however, only allows the robot to obtain information that the human chooses to exhibit, which does not always capture the human's full intention. In this work, we present an online optimization-based probing procedure for the autonomous agent to clarify its belief about the human model in an active manner. By optimizing an information radius, the autonomous agent chooses the action that most challenges its current conviction. This procedure allows the autonomous agent to actively probe the human agents to reveal information that was previously unavailable to it. With this gathered information, the autonomous agent can interactively influence the human agent toward designated objectives. Our main contributions include a coherent theoretical framework that unifies the probing and influence procedures, and two case studies in autonomous driving that show how active probing can help create a better participant experience during influence, such as higher efficiency or fewer perturbations.
|
|
08:30-10:10, Paper TuPO1S-14.11 | Add to My Program |
TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction |
|
Zhang, Zhejun | ETH Zurich |
Liniger, Alexander | ETH Zurich |
Dai, Dengxin | ETH Zurich |
Yu, Fisher | ETH Zürich |
Van Gool, Luc | ETH Zurich |
Keywords: Intelligent Transportation Systems, Imitation Learning, Simulation and Animation
Abstract: Data-driven simulation has become a favorable way to train and test autonomous driving algorithms. The idea of replacing the actual environment with a learned simulator has also been explored in model-based reinforcement learning in the context of world models. In this work, we show that data-driven traffic simulation can be formulated as a world model. We present TrafficBots, a multi-agent policy built upon motion prediction and end-to-end driving, and based on TrafficBots we obtain a world model tailored for the planning module of autonomous vehicles. Existing data-driven traffic simulators lack configurability and scalability. To generate configurable behaviors, for each agent we introduce a destination as navigational information, and a time-invariant latent personality that specifies the behavioral style. To improve scalability, we present a new scheme of positional encoding for angles, allowing all agents to share the same vectorized context and enabling the use of an architecture based on dot-product attention. As a result, we can simulate all traffic participants seen in dense urban scenarios. Experiments on the Waymo open motion dataset show that TrafficBots can simulate realistic multi-agent behaviors and achieve good performance on the motion prediction task.
|
|
08:30-10:10, Paper TuPO1S-14.12 | Add to My Program |
SHAIL: Safety-Aware Hierarchical Adversarial Imitation Learning for Autonomous Driving in Urban Environments |
|
Jamgochian, Arec | Stanford University |
Buehrle, Etienne | Karlsruhe Institute of Technology |
Fischer, Johannes | Karlsruhe Institute of Technology |
Kochenderfer, Mykel | Stanford University |
Keywords: Intelligent Transportation Systems, Integrated Planning and Learning, Imitation Learning
Abstract: Designing a safe and human-like decision-making system for an autonomous vehicle is a challenging task. Generative imitation learning is one possible approach for automating policy-building by leveraging both real-world and simulated decisions. Previous work that applies generative imitation learning to autonomous driving policies focuses on learning a low-level controller for simple settings. However, to scale to complex settings, many autonomous driving systems combine fixed, safe, optimization-based low-level controllers with high-level decision-making logic that selects the appropriate task and associated controller. In this paper, we attempt to bridge this gap in complexity by employing Safety-Aware Hierarchical Adversarial Imitation Learning (SHAIL), a method for learning a high-level policy that selects from a set of low-level controller instances in a way that imitates low-level driving data on-policy. We introduce an urban roundabout simulator that controls non-ego vehicles using real data from the Interaction dataset. We then demonstrate empirically that even with simple controller options, our approach can produce better behavior than previous approaches in driver imitation that have difficulty scaling to complex environments. Our implementation is available at https://github.com/sisl/InteractionImitation.
|
|
TuPO1S-15 Poster Session, Room T8 |
Add to My Program |
Motion and Path Planning I |
|
|
|
08:30-10:10, Paper TuPO1S-15.1 | Add to My Program |
Reinforcement Learning-Based Optimal Multiple Waypoint Navigation |
|
Vlachos, Christos | National Technical University of Athens |
Rousseas, Panagiotis | National Technical University of Athens |
Bechlioulis, Charalampos | University of Patras |
Kyriakopoulos, Kostas | National Technical Univ. of Athens |
Keywords: Motion and Path Planning, Reinforcement Learning, Optimization and Optimal Control
Abstract: In this paper, a novel method based on Artificial Potential Field (APF) theory is presented for optimal motion planning in fully known, static workspaces with multiple final goal configurations. Optimization is achieved through a Reinforcement Learning (RL) framework. More specifically, the parameters of the underlying potential field are adjusted through a policy gradient algorithm in order to minimize a cost function. The main novelty of the proposed scheme lies in providing optimal policies for multiple final positions, in contrast to most existing methodologies, which consider a single final configuration. An assessment of the optimality of our results is conducted by comparing our novel motion planning scheme against an RRT* method.
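For background, the underlying APF motion that such a scheme optimizes is the classic attractive/repulsive gradient step sketched below. The gains k_att and k_rep stand in for the field parameters the paper adjusts via policy gradients; the learning loop itself is not shown.

```python
import numpy as np

def apf_step(q, goal, obstacles, k_att=1.0, k_rep=0.5, rho0=1.0, dt=0.05):
    """One gradient-descent step on a classic potential field:
    quadratic attraction to the goal plus repulsion from obstacles
    inside an influence radius rho0. q, goal, obstacles are 2D points."""
    grad = k_att * (q - goal)  # gradient of the attractive potential
    for o in obstacles:
        diff = q - o
        rho = np.linalg.norm(diff)
        if rho < rho0:  # gradient of 0.5*k_rep*(1/rho - 1/rho0)^2
            grad += -k_rep * (1.0 / rho - 1.0 / rho0) / rho**3 * diff
    return q - dt * grad  # descend the total potential
```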
|
|
08:30-10:10, Paper TuPO1S-15.2 | Add to My Program |
DriveIRL: Drive in Real Life with Inverse Reinforcement Learning |
|
Phan-Minh, Tung | Motional AD |
Howington, Forbes | Motional |
Chu, Ting-Sheng | University of Michigan |
Tomov, Momchil | Motional |
Beaudoin, Robert | Motional AD |
Lee, Sang Uk | Motional |
Li, Nanxiang | Bosch Research and Technology Center |
Dicle, Caglayan | Motional |
Findler, Samuel | Senior Software Engineer at Motional |
Suárez-Ruiz, Francisco | Nanyang Technological University |
Yang, Bo | Motional |
Omari, Sammy | ETH Zurich |
Wolff, Eric | California Institute of Technology |
Keywords: Integrated Planning and Learning, Motion and Path Planning, Learning from Demonstration
Abstract: In this paper, we introduce the first published planner to drive a car in dense, urban traffic using Inverse Reinforcement Learning (IRL). Our planner, DriveIRL, generates a diverse set of trajectory proposals and scores them with a learned model. The best trajectory is tracked by our self-driving vehicle's low-level controller. We train our trajectory scoring model on a 500+ hour real-world dataset of expert driving demonstrations in Las Vegas within the maximum entropy IRL framework. DriveIRL's benefits include: a simple design, since only the trajectory scoring function is learned; a flexible and relatively interpretable feature engineering approach; and strong real-world performance. We validated DriveIRL on the Las Vegas Strip and demonstrated fully autonomous driving in heavy traffic, including scenarios involving cut-ins, abrupt braking by the lead vehicle, and hotel pickup/dropoff zones. Our dataset is currently undergoing public release to help further research in this area.
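The propose-score-select pattern the abstract describes can be summarized in a few lines. In this sketch, `feature_fn` and the weight vector `w` are hypothetical stand-ins for the paper's engineered trajectory features and the scoring model learned under maximum-entropy IRL.

```python
import numpy as np

def select_trajectory(proposals, feature_fn, w):
    """Score each candidate trajectory with a learned linear model and
    return the best one; under max-entropy IRL, w is fit so that
    expert-like trajectories receive the highest scores."""
    scores = np.array([w @ feature_fn(traj) for traj in proposals])
    return proposals[int(np.argmax(scores))], scores
```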
|
|
08:30-10:10, Paper TuPO1S-15.3 | Add to My Program |
LES: Locally Exploitative Sampling for Robot Path Planning |
|
Joshi, Sagar | Aurora Innovation |
Hutchinson, Seth | Georgia Institute of Technology |
Tsiotras, Panagiotis | Georgia Tech |
Keywords: Motion and Path Planning, Manipulation Planning, Autonomous Agents
Abstract: Sampling-based algorithms solve the path planning problem by generating random samples in the search space and incrementally growing a connectivity graph or a tree. Conventionally, the sampling strategy used in these algorithms is biased towards exploration to acquire information about the search space. In contrast, this work proposes an optimization-based procedure that generates new samples so as to improve the cost-to-come value of vertices in a given neighborhood. The application of the proposed algorithm adds an exploitative bias to sampling and results in faster convergence to the optimal solution compared to other state-of-the-art sampling techniques. This is demonstrated in benchmarking experiments performed with 7-DOF Panda and 14-DOF Baxter robots.
|
|
08:30-10:10, Paper TuPO1S-15.4 | Add to My Program |
Boundary Conditions in Geodesic Motion Planning for Manipulators |
|
Laux, Mario | University of Tübingen |
Zell, Andreas | University of Tübingen |
Keywords: Constrained Motion Planning, Motion and Path Planning
Abstract: In dynamic environments, robotic manipulators, and especially cobots, must be able to react to changing circumstances while in motion. This substantiates the need for quick trajectory planning algorithms that are able to cope with arbitrary velocity and acceleration boundary conditions. Apart from dynamic re-planning, being able to seamlessly join trajectories together opens the door for divide-and-conquer-type algorithms that focus on the individual parts of a motion separately. While geodesic motion planning has proven that it can produce very smooth and efficient actuator movement, the problem of incorporating non-zero boundary conditions has not been addressed yet. We show how a set of generalized coordinates can be used to transition between boundary conditions and free movement in an optimal way while still retaining the known advantages of geodesic planners. We also outline how our approach can be combined with the family of time-scaling algorithms for further improvement of the generated trajectories.
|
|
08:30-10:10, Paper TuPO1S-15.5 | Add to My Program |
TOFG: A Unified and Fine-Grained Environment Representation in Autonomous Driving |
|
Wen, Zihao | City University of Hong Kong |
Zhang, Yifan | City University of Hong Kong |
Chen, Xinhong | City University of Hong Kong |
Wang, Jianping | City University of Hong Kong |
Keywords: Motion and Path Planning, Autonomous Agents, Imitation Learning
Abstract: In autonomous driving, an accurate understanding of the environment, e.g., the vehicle-to-vehicle and vehicle-to-lane interactions, plays a critical role in many driving tasks such as trajectory prediction and motion planning. Environment information comes from the high-definition (HD) map and the historical trajectories of vehicles. Due to the heterogeneity of the map data and trajectory data, many data-driven models for trajectory prediction and motion planning extract vehicle-to-vehicle and vehicle-to-lane interactions in a separate and sequential manner. However, such a manner may capture a biased interpretation of interactions, causing lower prediction and planning accuracy. Moreover, separate extraction leads to a complicated model structure, sacrificing overall efficiency and scalability. To address the above issues, we propose an environment representation, the Temporal Occupancy Flow Graph (TOFG). Specifically, the occupancy-flow-based representation unifies the map information and vehicle trajectories into a homogeneous data format and enables consistent prediction. The temporal dependencies among vehicles can help capture changes in occupancy flow in a timely manner, further improving model performance. To demonstrate that TOFG is capable of simplifying the model architecture, we incorporate TOFG with a simple graph attention (GAT) based neural network and propose TOFG-GAT, which can be used for both trajectory prediction and motion planning. Experimental results show that TOFG-GAT achieves better or competitive performance compared with all the SOTA baselines, with less training time.
|
|
08:30-10:10, Paper TuPO1S-15.6 | Add to My Program |
Unidirectional-Road-Network-Based Global Path Planning for Cleaning Robots in Semi-Structured Environments |
|
Li, Yong | Guangzhou Shiyuan Electronic Technology Co., Ltd |
Cheng, Hui | Sun Yat-Sen University |
Keywords: Motion and Path Planning, Service Robotics, Wheeled Robots
Abstract: Practical global path planning is critical for commercializing cleaning robots that work in semi-structured environments. In the literature, global path planning methods for free space usually focus on path length and neglect the traffic rule constraints of the environment, which leads to high-frequency re-planning and increases collision risks. In contrast, methods for structured environments are developed mainly by strictly complying with the road network representing the traffic rule constraints, which may result in an overlong path that hinders overall navigation efficiency. This article proposes a general and systematic approach to improve global path planning performance in semi-structured environments. A unidirectional road network is built to represent the traffic constraints in semi-structured environments, and a hybrid strategy is proposed to achieve a guaranteed planning result. Cutting across the road at the start and goal points is allowed to achieve a shorter path. In particular, a two-layer potential map is proposed to achieve guaranteed performance when the start and goal points are in complex intersections. Comparative experiments are carried out to validate the effectiveness of the proposed method. Quantitative experimental results show that, compared with the state-of-the-art, the proposed method guarantees a much better balance between path length and consistency with the road network.
|
|
08:30-10:10, Paper TuPO1S-15.7 | Add to My Program |
A Hierarchical Decoupling Approach for Fast Temporal Logic Motion Planning |
|
Chen, Ziyang | University of Science and Technology of China |
Zhou, Zhangli | University of Science and Technology of China |
Wang, Shaochen | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Motion and Path Planning, Formal Methods in Robotics and Automation
Abstract: Fast motion planning is of great significance, especially when a timely mission is desired. However, the complexity of motion planning can grow drastically with the increase of environment details and mission complexity. This challenge can be further exacerbated if the tasks are coupled with desired locations in the environment. To address these issues, this work aims at fast motion planning problems with temporal logic specifications. In particular, we develop a hierarchical decoupling framework that consists of three layers: the high-level task planner, the decoupling layer, and the low-level motion planner. The decoupling layer is designed to bridge the high and low layers by providing the necessary information exchange. Such a framework enables the decoupling of the task planner and path planner, so that they can run independently, which significantly reduces the search space and enables fast planning in continuous or high-dimensional discrete workspaces. In addition, the implicit constraint during task-level planning is taken into account, so that the low-level path planning is guaranteed to satisfy the mission requirements. Numerical simulations demonstrate at least one order of magnitude speed-up in computational time over existing methods.
|
|
08:30-10:10, Paper TuPO1S-15.8 | Add to My Program |
A Fast Two-Stage Approach for Multi-Goal Path Planning in a Fruit Tree |
|
Kroneman, Werner | University College Roosevelt |
Valente, João | Wageningen University & Research |
van der Stappen, Frank | Utrecht University |
Keywords: Motion and Path Planning, Robotics and Automation in Agriculture and Forestry, Optimization and Optimal Control
Abstract: We consider the problem of planning the motion of a drone equipped with a robotic arm, tasked with bringing its end-effector up to many (150+) targets in a fruit tree, for example to inspect every piece of fruit. The task is complicated by the intersection of a version of Neighborhood TSP (to find an optimal order and a pose from which to visit every target) and a robotic motion-planning problem through a planning space that features numerous cavities and narrow passages that confuse common techniques. In this contribution, we present a framework that decomposes the problem into two stages: planning approach paths for every target, and quickly planning between the start points of those approach paths. We then compare our approach in simulation to a more straightforward method based on multi-query planning, showing that our approach outperforms it in both time and solution cost.
|
|
08:30-10:10, Paper TuPO1S-15.9 | Add to My Program |
Online Whole-Body Motion Planning for Quadrotor Using Multi-Resolution Search |
|
Ren, Yunfan | The University of Hong Kong |
Liang, Siqi | Harbin Institute of Technology, Shenzhen |
Zhu, Fangcheng | The University of Hong Kong |
Lu, Guozheng | The University of Hong Kong |
Zhang, Fu | University of Hong Kong |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Aerial Systems: Applications
Abstract: In this paper, we address the problem of online quadrotor whole-body motion planning (SE(3) planning) in unknown and unstructured environments. We propose a novel multi-resolution search method, which discovers narrow areas requiring full pose planning and normal areas requiring only position planning. As a consequence, a quadrotor planning problem is decomposed into several SE(3) (if necessary) and R^3 sub-problems. To fly through the discovered narrow areas, a carefully designed corridor generation strategy for narrow areas is proposed, which significantly increases the planning success rate. The overall problem decomposition and hierarchical planning framework substantially accelerate the planning process, making it possible to work online with fully onboard sensing and computation in unknown environments. Extensive simulation benchmark comparisons show that the proposed method is one to several orders of magnitude faster than state-of-the-art methods in computation time while maintaining a high planning success rate. The proposed method is finally integrated into a LiDAR-based autonomous quadrotor, and various real-world experiments in unknown and unstructured environments are conducted to demonstrate its outstanding performance.
|
|
08:30-10:10, Paper TuPO1S-15.10 | Add to My Program |
Intermittent Diffusion Based Path Planning for Heterogeneous Groups of Mobile Sensors in Cluttered Environments |
|
Frederick, Christina | NJIT |
Zhou, Haomin | Georgia Institute of Technology |
Crosby, Frank | USNWC PC |
Keywords: Motion and Path Planning, Collision Avoidance, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper presents a method for task-oriented path planning and collision avoidance for a group of heterogeneous holonomic mobile sensors. It is a generalization of the authors' prior work on diffusion-based path planning. The proposed variant allows one to plan paths in environments cluttered with obstacles. The agents follow flow dynamics, i.e., the negative gradient of a function that is the sum of two functions: the first minimizes the distance from desired target regions and the second captures distance from other agents within a field of view. When it becomes necessary to steer around an obstacle, this function is augmented by a projection term that is carefully designed in terms of obstacle boundaries. More importantly, a diffusion term is added intermittently so that agents can exit local minima. In addition, the new approach skips the offline planning phase in the prior approach to improve computational performance and handle collision avoidance with a completely decentralized method. This approach also provably finds collision-free paths under certain conditions. Numerical simulations of three deployment missions further support the performance of ID-based diffusion.
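The core update the abstract describes, gradient flow on a task potential plus an intermittently activated diffusion term, can be written compactly. This is a generic sketch of the intermittent-diffusion idea under the assumption of a differentiable potential gradient `grad_V`; the obstacle projection term and field-of-view coupling from the paper are omitted.

```python
import numpy as np

def intermittent_diffusion_step(x, grad_V, dt=0.01, sigma=0.5, diffusing=False):
    """One step of gradient flow with intermittent diffusion: follow the
    negative gradient of the task potential V, and, during designated
    intervals (diffusing=True), add Brownian noise so an agent can
    escape local minima. x: agent position; grad_V: gradient callable."""
    x = x - dt * grad_V(x)  # deterministic descent on the potential
    if diffusing:
        x = x + sigma * np.sqrt(dt) * np.random.randn(*x.shape)  # diffusion kick
    return x
```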
|
|
08:30-10:10, Paper TuPO1S-15.11 | Add to My Program |
GANet: Goal Area Network for Motion Forecasting |
|
Wang, Mingkun | Peking University |
Zhu, Xinge | CUHK |
Yu, Changqian | Meituan |
Li, Wei | Inceptio |
Ma, Yuexin | ShanghaiTech University |
Jin, Ruochun | National University of Defense Technology |
Ren, Xiaoguang | Academy of Military Sciences |
Ren, Dongchun | Meituan |
Wang, Mingxu | Fudan University |
Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Keywords: Motion and Path Planning, Computer Vision for Transportation, AI-Based Methods
Abstract: Predicting the future motion of road participants is crucial for autonomous driving but is extremely challenging due to staggering motion uncertainty. Recently, most motion forecasting methods resort to the goal-based strategy, i.e., predicting endpoints of motion trajectories as conditions to regress the entire trajectories, so that the search space of solutions can be reduced. However, accurate goal coordinates are hard to predict and evaluate. In addition, the point representation of the destination limits the utilization of rich road context, leading to inaccurate prediction results in many cases. The goal area, i.e., the possible destination area, rather than the goal coordinate, can provide a softer constraint for searching potential trajectories by involving more tolerance and guidance. In view of this, we propose a new goal-area-based framework, named Goal Area Network (GANet), for motion forecasting, which models goal areas as preconditions for trajectory prediction, performing more robustly and accurately. Specifically, we propose a GoICrop (Goal Area of Interest) operator to effectively extract semantic lane features in goal areas and model actors' future interactions, which substantially benefits future trajectory estimation. GANet ranks 1st on the leaderboard of the Argoverse Challenge among all public literature (as of the paper submission), and its source code will be released.
|
|
08:30-10:10, Paper TuPO1S-15.12 | Add to My Program |
FlowMap: Path Generation for Automated Vehicles in Open Space Using Traffic Flow |
|
Ding, Wenchao | Fudan University |
Zhao, Jieru | Shanghai Jiao Tong University |
Chu, Yubin | Dalian University of Technology |
Huang, Haihui | Zhejiang University |
Qin, Tong | Huawei Techonology |
Xu, Chunjing | Huawei Technologies |
Guan, Yuxiang | Fudan University |
Gan, Zhongxue | Fudan University |
Keywords: Motion and Path Planning, Intelligent Transportation Systems
Abstract: There is extensive literature on perceiving road structures by fusing various sensor inputs, such as lidar point clouds and camera images, using deep neural nets. Leveraging the latest advances in neural architectures (such as transformers) and bird's-eye-view (BEV) representations, road cognition accuracy keeps improving. However, how to cognize the “road” for automated vehicles where there are no well-defined “roads” remains an open problem. For example, finding paths inside intersections without HD maps is hard, since there is neither an explicit definition of “roads” nor explicit features such as lane markings. The idea of this paper comes from a proverb: it becomes a way when people walk on it. Although there are no “roads” in the sensor readings, there are “roads” in the tracks of other vehicles. In this paper, we propose FlowMap, a path generation framework for automated vehicles based on traffic flows. FlowMap is built by extending our previous work RoadMap [1], a lightweight semantic map, with an additional traffic flow layer. A path generation algorithm on traffic flow fields (TFFs) is proposed to generate human-like paths. The proposed framework is validated using real-world driving data and is able to generate paths even for highly complicated intersections without using HD maps.
|
|
TuPO1S-16 Poster Session, Room T8 |
Add to My Program |
Reactive and Sensor-Based Planning |
|
|
|
08:30-10:10, Paper TuPO1S-16.1 | Add to My Program |
An Architecture for Reactive Mobile Manipulation On-The-Move |
|
Burgess-Limerick, Ben | Queensland University of Technology |
Lehnert, Christopher | Queensland University of Technology |
Leitner, Jurgen | LYRO Robotics & Monash University |
Corke, Peter | Queensland University of Technology |
Keywords: Mobile Manipulation, Control Architectures and Programming, Reactive and Sensor-Based Planning
Abstract: We present a generalised architecture for reactive mobile manipulation while a robot's base is in motion toward the next objective in a high-level task. By performing tasks on-the-move, overall cycle time is reduced compared to methods where the base pauses during manipulation. Reactive control of the manipulator enables grasping objects with unpredictable motion while improving robustness against perception errors, environmental disturbances, and inaccurate robot control compared to open-loop, trajectory-based planning approaches. We present an example implementation of the architecture and investigate the performance on a series of pick and place tasks with both static and dynamic objects and compare the performance to baseline methods. Our method demonstrated a real-world success rate of over 99%, failing in only a single trial from 120 attempts with a physical robot system. The architecture is further demonstrated on other mobile manipulator platforms in simulation. Our approach reduces task time by up to 48%, while also improving reliability, gracefulness, and predictability compared to existing architectures for mobile manipulation.
|
|
08:30-10:10, Paper TuPO1S-16.2 | Add to My Program |
Multi-Robot Mission Planning in Dynamic Semantic Environments |
|
Kalluraya, Samarth | Washington University in St. Louis |
Pappas, George J. | University of Pennsylvania |
Kantaros, Yiannis | Washington University in St. Louis |
Keywords: Reactive and Sensor-Based Planning, Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty
Abstract: This paper addresses a new semantic multi-robot planning problem in uncertain and dynamic environments. Particularly, the environment is occupied with mobile and uncertain semantic targets. These targets are governed by stochastic dynamics while their current and future positions as well as their semantic labels are uncertain. Our goal is to control mobile sensing robots so that they can accomplish collaborative semantic tasks defined over the uncertain current/future positions and semantic labels of these targets. We express these tasks using Linear Temporal Logic (LTL). We propose a sampling-based approach that explores the robot motion space, the mission specification space, as well as the future configurations of the semantic targets to design optimal paths. These paths are revised online to adapt to uncertain perceptual feedback. To the best of our knowledge, this is the first work that addresses semantic mission planning problems in uncertain and dynamic semantic environments. We provide extensive experiments that demonstrate the efficiency of the proposed method.
|
|
08:30-10:10, Paper TuPO1S-16.3 | Add to My Program |
A System for Generalized 3D Multi-Object Search |
|
Zheng, Kaiyu | Brown University |
Paul, Anirudha | Brown University |
Tellex, Stefanie | Brown |
Keywords: Reactive and Sensor-Based Planning, Search and Rescue Robots, Computer Architecture for Robotic and Automation
Abstract: Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, similar to e.g., object detection and SLAM. In contrast, however, no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploited the octree structure for representing belief in 3D, we present GenMOS (Generalized Multi-Object Search), the first general-purpose system for multi-object search (MOS) in a 3D region that is robot-independent and environment-agnostic. GenMOS takes as input point cloud observations of the local region, object detection results, and localization of the robot's view pose, and outputs a 6D viewpoint to move to through online planning. In particular, GenMOS uses point cloud observations in three ways: (1) to simulate occlusion; (2) to inform occupancy and initialize octree belief; and (3) to sample a belief-dependent graph of view positions that avoid obstacles. We evaluate our system both in simulation and on two real robot platforms. Our system enables, for example, a Boston Dynamics Spot robot to find a toy cat hidden underneath a couch in under one minute. We further integrate 3D local search with 2D global search to handle larger areas, demonstrating the resulting system in a 25m^2 lobby area.
|
|
08:30-10:10, Paper TuPO1S-16.4 | Add to My Program |
A General Class of Combinatorial Filters That Can Be Minimized Efficiently |
|
Zhang, Yulin | Amazon |
Shell, Dylan | Texas A&M University |
Keywords: Formal Methods in Robotics and Automation, Discrete Event Dynamic Automation Systems, Reactive and Sensor-Based Planning
Abstract: State minimization of combinatorial filters is a fundamental problem that arises, for example, in building cheap, resource-efficient robots. But exact minimization is known to be NP-hard. This paper conducts a more nuanced analysis of this hardness than available up to now, and uncovers two factors that contribute to this complexity. We show that each factor is a distinct source of the problem's hardness and are thereby able to shed some light on the roles played by (1) the structure of the graph that encodes compatibility relationships, and (2) determinism-enforcing constraints. Just as a line of prior work has sought to introduce additional assumptions and identify sub-classes that lead to practical state reduction, we next use this new, sharper understanding to explore special cases for which exact minimization is efficient. We introduce a new algorithm for constraint repair that applies to a large sub-class of filters, subsuming three distinct special cases for which the possibility of optimal minimization in polynomial time was known earlier. While the efficiency in each of these three cases previously appeared to stem from seemingly dissimilar properties, when seen through the lens of the present work, their commonality now becomes clear. We also provide entirely new families of filters that are efficiently reducible.
|
|
08:30-10:10, Paper TuPO1S-16.5 | Add to My Program |
Cautious Planning with Incremental Symbolic Perception: Designing Verified Reactive Driving Maneuvers |
|
Kamale, Disha | Lehigh University |
Haesaert, Sofie | Eindhoven University of Technology |
Vasile, Cristian Ioan | Lehigh University |
Keywords: Formal Methods in Robotics and Automation, Reactive and Sensor-Based Planning
Abstract: This work presents a step towards utilizing incrementally-improving symbolic perception knowledge of the robot’s surroundings for provably correct reactive control synthesis applied to an autonomous driving problem. Combining abstract models of motion control and information gathering, we show that assume-guarantee specifications (a subclass of Linear Temporal Logic) can be used to define and resolve traffic rules for cautious planning. We propose a novel representation called symbolic refinement tree for perception that captures the incremental knowledge about the environment and embodies the relationships between various symbolic perception inputs. The incremental knowledge is leveraged for synthesizing verified reactive plans for the robot. The case studies demonstrate the efficacy of the proposed approach in synthesizing control inputs even in partially occluded environments.
|
|
08:30-10:10, Paper TuPO1S-16.6 | Add to My Program |
Decision Diagrams As Plans: Answering Observation-Grounded Queries |
|
Shell, Dylan | Texas A&M University |
O'Kane, Jason | Texas A&M University |
Keywords: Reactive and Sensor-Based Planning, Formal Methods in Robotics and Automation, Planning under Uncertainty
Abstract: We consider a robot that answers questions about its environment by traveling to appropriate places and then sensing. Questions are posed as structured queries and may involve conditional or contingent relationships between observable properties. After formulating this problem, and emphasizing the advantages of exploiting deducible information, we describe how non-trivial knowledge of the world and queries can be given a convenient, concise, unified representation via reduced ordered binary decision diagrams (BDDs). To use these data structures directly for inference and planning, we introduce a new product operation and generalize the classic dynamic variable reordering techniques to solve planning problems. Finally, we evaluate optimizations that exploit locality.
|
|
08:30-10:10, Paper TuPO1S-16.7 | Add to My Program |
Obstacle Avoidance Using Raycasting and Riemannian Motion Policies at kHz Rates for MAVs |
|
Pantic, Michael | ETH Zürich |
Meijer, Isar | ETH Zurich |
Bähnemann, Rik | ETH Zürich |
Alatur, Nikhilesh | ETH Zurich |
Andersson, Olov | ETH Zürich |
Cadena Lerma, Cesar | ETH Zurich |
Siegwart, Roland | ETH Zurich |
Ott, Lionel | ETH Zurich |
Keywords: Reactive and Sensor-Based Planning, Collision Avoidance, Aerial Systems: Perception and Autonomy
Abstract: This paper presents a novel method for using Riemannian Motion Policies on volumetric maps, demonstrated on obstacle avoidance for Micro Aerial Vehicles (MAVs). Today, most robotic obstacle avoidance algorithms rely on sampling or optimization-based planners with volumetric maps. However, they are computationally expensive and often have inflexible monolithic architectures. Riemannian Motion Policies are a modular, parallelizable, and efficient navigation alternative but are challenging to use with the widely used voxel-based environment representations. We propose using GPU raycasting and tens of thousands of concurrent policies to provide direct obstacle avoidance using Riemannian Motion Policies in voxelized maps without needing map smoothing or pre-processing. Additionally, we show how the same method can directly plan on LiDAR scans without any intermediate map. We show how this reactive approach compares favorably to traditional planning methods and can evaluate up to 200 million rays per second. We demonstrate the planner successfully on a real MAV for static and dynamic obstacles. The presented planner is made available as an open-source package.
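The metric-weighted combination at the heart of Riemannian Motion Policies is compact enough to sketch; the repulsive policy below is an illustrative stand-in for the paper's raycast-based policies, not the authors' implementation:

```python
import numpy as np

def rmp_combine(accels, metrics):
    # Resolve many policies into one command: a = (sum_i A_i)^+ (sum_i A_i a_i)
    A_sum = sum(metrics)
    Aa_sum = sum(A @ a for A, a in zip(metrics, accels))
    return np.linalg.pinv(A_sum) @ Aa_sum

def obstacle_policy(x, hit, d_rep=2.0, eta=8.0):
    """Toy repulsive policy from a single raycast hit point (an assumption):
    push away along the hit direction, with importance decaying with distance."""
    diff = x - hit
    d = np.linalg.norm(diff)
    n = diff / d
    accel = eta * np.exp(-d / d_rep) * n
    weight = max(0.0, 1.0 - d / d_rep)
    metric = weight * np.outer(n, n)  # metric stretches along the repulsion direction
    return accel, metric
```

In the paper, tens of thousands of such per-ray policies are evaluated concurrently on the GPU before being combined.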
|
|
08:30-10:10, Paper TuPO1S-16.8 | Add to My Program |
Adaptive and Explainable Deployment of Navigation Skills Via Hierarchical Deep Reinforcement Learning |
|
Lee, Kyowoon | Ulsan National Institute of Science and Technology |
Kim, Seongun | Korea Advanced Institute of Science and Technology |
Choi, Jaesik | Korea Advanced Institute of Science and Technology |
Keywords: AI-Based Methods, Autonomous Vehicle Navigation, Collision Avoidance
Abstract: For robotic vehicles to navigate robustly and safely in unseen environments, it is crucial to decide the most suitable navigation policy. However, most existing deep reinforcement learning based navigation policies are trained with a hand-engineered curriculum and reward function, which are difficult to deploy in a wide range of real-world scenarios. In this paper, we propose a framework to learn a family of low-level navigation policies and a high-level policy for deploying them. The main idea is that, instead of learning a single navigation policy with a fixed reward function, we simultaneously learn a family of policies that exhibit different behaviors with a wide range of reward functions. We then train the high-level policy which adaptively deploys the most suitable navigation skill. We evaluate our approach in simulation and the real world and demonstrate that our method can learn diverse navigation skills and adaptively deploy them. We also illustrate that our proposed hierarchical learning framework presents explainability by providing semantics for the behavior of an autonomous agent.
|
|
TuPO1S-17 Poster Session, Room T8 |
Add to My Program |
Collision Avoidance |
|
|
|
08:30-10:10, Paper TuPO1S-17.1 | Add to My Program |
Learning Agile Flight Maneuvers: Deep SE(3) Motion Planning and Control for Quadrotors |
|
Wang, Yixiao | National University of Singapore |
Wang, Bingheng | National University of Singapore |
Zhang, Shenning | National University of Singapore |
Sia, Han Wei | ST Engineering |
Zhao, Lin | National University of Singapore |
Keywords: Collision Avoidance, Integrated Planning and Control, Motion and Path Planning
Abstract: Agile flights of autonomous quadrotors in cluttered environments require constrained motion planning and control subject to translational and rotational dynamics. Traditional model-based methods typically demand complicated design and heavy computation. In this paper, we develop a novel deep reinforcement learning-based method that tackles the challenging task of flying through a dynamic narrow gate. We design a model predictive controller with its adaptive tracking references parameterized by a deep neural network (DNN). These references include the traversal time and the quadrotor SE(3) traversal pose that encourage the robot to fly through the gate with maximum safety margins from various initial conditions. To cope with the difficulty of training in highly dynamic environments, we develop a reinforce-imitate learning framework to train the DNN efficiently that generalizes well to diverse settings. Furthermore, we propose a binary search algorithm that allows online adaptation of the SE(3) references to dynamic gates in real time. Finally, through extensive high-fidelity simulations, we show that our approach is adaptive to different gate trajectories, velocities, and orientations.
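The online adaptation step can be illustrated with a generic binary search over the traversal time; `is_safe` is a hypothetical predicate (e.g., checking the learned SE(3) reference against the gate's predicted motion), and monotonicity of safety in the search variable is an assumption of this sketch:

```python
def adapt_traversal_time(t_lo, t_hi, is_safe, tol=1e-3):
    """Binary search for the smallest traversal time deemed safe."""
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if is_safe(t_mid):
            t_hi = t_mid   # feasible: try an earlier traversal
        else:
            t_lo = t_mid   # infeasible: traverse later
    return t_hi
```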
|
|
08:30-10:10, Paper TuPO1S-17.2 | Add to My Program |
Robust MADER: Decentralized and Asynchronous Multiagent Trajectory Planner Robust to Communication Delay |
|
Kondo, Kota | Massachusetts Institute of Technology |
Tordesillas Torres, Jesus | Massachusetts Institute of Technology |
Figueroa, Reinaldo | Massachusetts Institute of Technology |
Rached, Juan | Massachusetts Institute of Technology |
Merkel, Joseph | MIT Aerospace Controls Lab |
Lusk, Parker C. | Massachusetts Institute of Technology |
How, Jonathan | Massachusetts Institute of Technology |
Keywords: Collision Avoidance, Motion and Path Planning, Distributed Robot Systems
Abstract: Although communication delays can disrupt multiagent systems, most of the existing multiagent trajectory planners lack a strategy to address this issue. State-of-the-art approaches typically assume perfect communication environments, which is hardly realistic in real-world experiments. This paper presents Robust MADER (RMADER), a decentralized and asynchronous multiagent trajectory planner that can handle communication delays among agents. By broadcasting both the newly optimized trajectory and the committed trajectory, and by performing a delay check step, RMADER is able to guarantee safety even under communication delay. RMADER was validated through extensive simulation and hardware flight experiments and achieved a 100% success rate of collision-free trajectory generation, outperforming state-of-the-art approaches.
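The delay-check step can be sketched as follows; this is only a simplified rendering of the idea, where `inbox.poll()` and `conflicts(...)` are assumed helpers rather than the authors' code:

```python
import time

def delay_check_commit(new_traj, committed_traj, inbox, delay_bound, conflicts):
    """After broadcasting `new_traj`, listen for other agents' trajectories for
    the known communication-delay bound; commit the new trajectory only if no
    conflicting trajectory arrives within that window."""
    deadline = time.monotonic() + delay_bound
    while time.monotonic() < deadline:
        for other_traj in inbox.poll():
            if conflicts(new_traj, other_traj):
                return committed_traj  # keep the previously committed safe trajectory
    return new_traj  # no conflict heard within the delay bound: safe to commit
```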
|
|
08:30-10:10, Paper TuPO1S-17.3 | Add to My Program |
Obstacle Identification and Ellipsoidal Decomposition for Fast Motion Planning in Unknown Dynamic Environments |
|
Kaymaz, Mehmetcan | Istanbul Technical University |
Ure, Nazim Kemal | Istanbul Technical University |
Keywords: Collision Avoidance, Constrained Motion Planning
Abstract: Collision avoidance in the presence of dynamic obstacles in unknown environments is one of the most critical challenges for unmanned systems. In this paper, we present a method that identifies obstacles in terms of ellipsoids to estimate linear and angular obstacle velocities. Our method is based on the idea that any object can be approximately represented by ellipsoids. To achieve this, we propose a method based on variational Bayesian estimation of a Gaussian mixture model, the Khachiyan algorithm, and a refinement algorithm. Our proposed method does not require knowledge of the number of clusters and can operate in real time, unlike existing optimization-based methods. In addition, we define an ellipsoid-based feature vector to match obstacles between two temporally close point cloud frames. Our method can be applied to any environment with static and dynamic obstacles, including ones with rotating obstacles. We compare our algorithm with other clustering methods and show that when coupled with a trajectory planner, the overall system can efficiently traverse unknown environments in the presence of dynamic obstacles.
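For the ellipsoid-fitting step, Khachiyan's algorithm computes the minimum-volume enclosing ellipsoid of a point cluster. Below is a standard NumPy implementation of that generic building block (independent of the paper's clustering and refinement steps):

```python
import numpy as np

def mvee(points, tol=1e-3):
    """Minimum-volume enclosing ellipsoid of an n x d point set via
    Khachiyan's algorithm. Returns (A, c) with the ellipsoid defined as
    {x : (x - c)^T A (x - c) <= 1}; axes and orientation follow from the
    eigendecomposition of A."""
    n, d = points.shape
    Q = np.vstack([points.T, np.ones(n)])  # lift points to (d+1) x n
    u = np.full(n, 1.0 / n)                # weights on the points
    err = tol + 1.0
    while err > tol:
        X = Q @ np.diag(u) @ Q.T
        M = np.einsum('ij,ji->i', Q.T @ np.linalg.inv(X), Q)  # Mahalanobis-like scores
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        err = np.linalg.norm(new_u - u)
        u = new_u
    c = points.T @ u
    A = np.linalg.inv(points.T @ np.diag(u) @ points - np.outer(c, c)) / d
    return A, c
```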
|
|
08:30-10:10, Paper TuPO1S-17.4 | Add to My Program |
Safe Operations of an Aerial Swarm Via a Cobot Human Swarm Interface |
|
Abdi, Sydrak | University of Maryland |
Paley, Derek | University of Maryland |
Keywords: Human-Robot Collaboration, Swarm Robotics, Collision Avoidance
Abstract: Command and control of an aerial swarm is a complex task. This task increases in difficulty when the flight volume is restricted and the swarm and operator inhabit the same workspace. This work presents a novel method for interacting with and controlling a swarm of quadrotors in a confined space. EMG-based gesture control is used to control the position, orientation, and density of the swarm. Inter-agent as well as agent-operator collisions are prevented through a velocity controller based on a distance-based potential function. State feedback is relayed to the operator via a vibrotactile haptic vest. This cobot human swarm interface prioritizes operator safety while reducing the cognitive load during control of a cobot swarm. This work demonstrates that an operator can safely and intuitively control a swarm of aerial robots in the same workspace.
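The collision-prevention term can be sketched with a classic distance-based repulsive potential (Khatib-style); the exact potential and gains used in the paper are not specified here, so treat this as an illustrative form:

```python
import numpy as np

def avoidance_velocity(p_i, others, d_safe=1.0, eta=0.5):
    """Repulsive velocity command for agent at p_i: active only within the
    safety distance d_safe, growing as the separation shrinks."""
    v = np.zeros_like(p_i, dtype=float)
    for p_j in others:  # neighboring agents and the operator
        diff = p_i - p_j
        d = np.linalg.norm(diff)
        if 0.0 < d < d_safe:
            v += eta * (1.0 / d - 1.0 / d_safe) * (diff / d) / d**2
    return v
```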
|
|
TuPO1S-18 Poster Session, Room T8 |
Add to My Program |
Perception for Grasping and Manipulation I |
|
|
|
08:30-10:10, Paper TuPO1S-18.1 | Add to My Program |
MonoGraspNet: 6-DoF Grasping with a Single RGB Image |
|
Zhai, Guangyao | Technical University of Munich |
Huang, Dianye | Technical University of Munich |
Wu, Shun-Cheng | Technical University of Munich |
Jung, HyunJun | Technical University of Munich |
Di, Yan | Technical University of Munich |
Manhardt, Fabian | Google |
Tombari, Federico | Technische Universität München |
Navab, Nassir | TU Munich |
Busam, Benjamin | Technical University of Munich |
Keywords: Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation, Grasping
Abstract: 6-DoF robotic grasping is a long-standing open problem. Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors, demonstrating superior accuracy on common objects but performing unsatisfactorily on photometrically challenging objects, e.g., objects made of transparent or reflective materials. The bottleneck is that the surfaces of these objects cannot yield accurate depth measurements due to the absorption or refraction of light. In this paper, in contrast to exploiting the inaccurate depth data, we propose the first RGB-only 6-DoF grasping pipeline called MonoGraspNet that utilizes stable 2D features to simultaneously handle arbitrary object grasping and overcome the problems induced by photometrically challenging objects. MonoGraspNet leverages a keypoint heatmap and a normal map to recover the 6-DoF grasping poses represented by our novel representation parameterized with 2D keypoints with corresponding depth, grasping direction, grasping width, and angle. Extensive experiments in real scenes demonstrate that our method can achieve competitive results in grasping common objects and surpass the depth-based competitor by a large margin in grasping photometrically challenging objects. To further stimulate robotic manipulation research, we annotate and open-source a multi-view and multi-scene grasping dataset in the real world containing 120 objects of mixed photometric complexity with 20M accurate grasping labels.
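Turning the 2D parameterization into a 3D grasp point requires back-projecting a keypoint with its predicted depth through the pinhole model; this generic step (not MonoGraspNet's exact code, and with illustrative names) looks like:

```python
import numpy as np

def keypoint_to_point3d(u, v, depth, K):
    """Back-project pixel (u, v) with predicted depth through the 3x3 camera
    intrinsic matrix K, yielding a 3D point in the camera frame."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return depth * ray
```

Combining such 3D points with the predicted grasping direction, width, and angle then assembles the full 6-DoF grasp.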
|
|
08:30-10:10, Paper TuPO1S-18.2 | Add to My Program |
USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation |
|
Xue, Zhengrong | Shanghai Jiao Tong University |
Yuan, Zhecheng | Tsinghua University |
Wang, Jiashun | Carnegie Mellon University |
Wang, Xueqian | Center for Artificial Intelligence and Robotics, Graduate School |
Gao, Yang | Tsinghua University |
Xu, Huazhe | Tsinghua University |
Keywords: Perception for Grasping and Manipulation, RGB-D Perception, Learning from Demonstration
Abstract: Can a robot manipulate intra-category unseen objects in arbitrary poses with the help of a mere demonstration of grasping pose on a single object instance? In this paper, we try to address this intriguing challenge by using USEEK, an unsupervised SE(3)-equivariant keypoints method that enjoys alignment across instances in a category, to perform generalizable manipulation. USEEK follows a teacher-student structure to decouple the unsupervised keypoint discovery and SE(3)-equivariant keypoint detection. With USEEK in hand, the robot can infer the category-level task-relevant object frames in an efficient and explainable manner, enabling manipulation of any intra-category objects from and to any poses. Through extensive experiments, we demonstrate that the keypoints produced by USEEK possess rich semantics, thus successfully transferring the functional knowledge from the demonstration object to the novel ones. Compared with other object representations for manipulation, USEEK is more adaptive in the face of large intra-category shape variance, more robust with limited demonstrations, and more efficient at inference time. Project website: https://sites.google.com/view/useek/.
|
|
08:30-10:10, Paper TuPO1S-18.3 | Add to My Program |
Semantic Mapping with Confidence Scores through Metric Embeddings and Gaussian Process Classification |
|
Hong, Jungseok | University of Minnesota |
Garg, Suveer | University of Pennsylvania |
Isler, Volkan | University of Minnesota |
Keywords: Perception for Grasping and Manipulation, Semantic Scene Understanding, Mapping
Abstract: Recent advances in robotic mapping enable robots to use both semantic and geometric understanding of their surroundings to perform complex tasks. Current methods are optimized for reconstruction quality, but they do not provide a measure of how certain they are of their outputs. Therefore, algorithms that use these maps do not have a way of assessing how much they can trust the outputs. We present a mapping approach that unifies semantic information and shape completion inferred from RGBD images and computes confidence scores for its predictions. We use a Gaussian Process (GP) classification model to merge confidence scores (if available) for the given information. A novel aspect of our method is that we lift the measurement to a learned metric space over which the GP parameters are learned. After training, we can evaluate the uncertainty of objects’ completed shapes with their semantic information. We show that our approach can achieve more accurate predictions than a classic GP model and provide robots with the flexibility to decide whether they can trust the estimate at a given location using the confidence scores.
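A toy example of GP classification producing class probabilities that can serve as confidence scores, using scikit-learn; the features here are random stand-ins for the paper's learned metric embeddings:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Toy stand-in data: 3D embeddings of measurements -> binary semantic label.
# In the paper, the embedding space itself is learned; here it is random.
rng = np.random.default_rng(0)
X = rng.random((200, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=0.5)).fit(X, y)
proba = gpc.predict_proba(X[:5])  # class probabilities double as confidence scores
```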
|
|
08:30-10:10, Paper TuPO1S-18.4 | Add to My Program |
The Third Generation (G3) Dual-Modal and Dual Sensing Mechanisms (DMDSM) Pretouch Sensor for Robotic Grasping |
|
Fang, Cheng | Texas A&M University |
Li, Shuangliang | Texas A&M University |
Wang, Di | Texas A&M University |
Guo, Fengzhi | Texas A&M University |
Song, Dezhen | Texas A&M University |
Zou, Jun | Texas A&M University |
Keywords: Perception for Grasping and Manipulation, Range Sensing, Grasping
Abstract: Fingertip-mounted pretouch sensors are very useful for robotic grasping. In this paper, we report a new (G3) dual-modal and dual sensing mechanisms (DMDSM) pretouch sensor for near-distance ranging and material sensing, which is based on pulse-echo ultrasound (US) and optoacoustics (OA). Different from previously reported versions, the G3 sensor utilizes a self-focused US/OA transceiver, thereby eliminating the need for a bulky parabolic reflective mirror to focus the ultrasound and laser beams. The self-focused laser and ultrasound beams can be easily steered by a (flat) scanning mirror, which extends the sensor from single-point ranging and detection to areal mapping or imaging. To verify the new design, a prototype G3 DMDSM sensor with a scanning mirror is fabricated. The US and OA ranging performances are tested in experiments. Together with the scanning mirror, thin wire targets made of the same or different materials at different positions are scanned and imaged. The ranging and imaging results show that the G3 DMDSM sensor can provide new and better pretouch mapping and imaging capabilities for robotic grasping than its predecessors.
|
|
08:30-10:10, Paper TuPO1S-18.5 | Add to My Program |
Learning Height for Top-Down Grasps with the DIGIT Sensor |
|
Bernardi, Thais | Inria |
Fleytoux, Yoann | Inria |
Mouret, Jean-Baptiste | Inria |
Ivaldi, Serena | INRIA |
Keywords: Perception for Grasping and Manipulation, Force and Tactile Sensing, Deep Learning in Grasping and Manipulation
Abstract: We address the problem of grasping unknown objects identified from top-down images with a parallel gripper. When no 3D object model is available, state-of-the-art grasp generators identify the best candidate locations for planar grasps using the RGBD image. However, while they generate the Cartesian location and orientation of the gripper, the height of the grasp center is often determined by heuristics based on the highest point in the depth map, which leads to unsuccessful grasps when the objects are not thick, or have transparencies or curved shapes. In this paper, we propose to learn a regressor that predicts the best grasp height directly from the image. We train this regressor with a dataset that is automatically acquired thanks to the DIGIT optical tactile sensors, which can evaluate grasp success and stability. Using our predictor, the grasping success is improved by 6% for all objects, by 16% on average on difficult objects, and by 40% for notably difficult objects (e.g., transparent, curved, or thin ones).
|
|
08:30-10:10, Paper TuPO1S-18.6 | Add to My Program |
Instance-Wise Grasp Synthesis for Robotic Grasping |
|
Xu, Yucheng | University of Edinburgh |
Kasaei, Mohammadreza | University of Edinburgh |
Kasaei, Hamidreza | University of Groningen |
Li, Zhibin | University College London |
Keywords: Perception for Grasping and Manipulation, Grasping, Deep Learning for Visual Perception
Abstract: Generating high-quality instance-wise grasp configurations provides critical information on how to grasp specific objects in a multi-object environment and is of high importance for robot manipulation tasks. This work proposes a novel Single-Stage Grasp (SSG) synthesis network, which performs high-quality instance-wise grasp synthesis in a single stage: instance masks and grasp configurations are generated for each object simultaneously. Our method outperforms the state of the art on robotic grasp prediction on the OCID-Grasp dataset, and performs competitively on the JACQUARD dataset. The benchmarking results show significant improvements over the baseline in the accuracy of generated grasp configurations. The performance of the proposed method has been validated through both extensive simulations and real robot experiments on three tasks: single-object pick-and-place, grasp synthesis in cluttered environments, and table cleaning.
|
|
08:30-10:10, Paper TuPO1S-18.7 | Add to My Program |
Joint Segmentation and Grasp Pose Detection with Multi-Modal Feature Fusion Network |
|
Liu, Xiaozheng | Northeastern University |
Zhang, Yunzhou | Northeastern University |
Cao, He | Northeastern University |
Dexing, Shan | Northeastern University |
Zhao, Jiaqi | Northeastern University |
Keywords: Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation, Object Detection, Segmentation and Categorization
Abstract: Efficient grasp pose detection is essential for robotic manipulation in cluttered scenes. However, most methods only utilize point clouds or images for prediction, ignoring the advantages of different features. In this paper, we present a multi-modal fusion network for joint segmentation and grasp pose detection. We design a point cloud and image co-guided feature fusion module that can be used to fuse features and adaptively estimate the importance of the point-pixel feature pairs. Moreover, we develop a seed point sampling algorithm that simultaneously considers the distance, semantics and attention scores. For selected seed points, we design a local feature aggregation module to fully utilize the local features in the grasp region. Experimental results on the GraspNet-1Billion Dataset show that our network outperforms several state-of-the-art methods. We also conduct real robot grasping experiments to demonstrate the effectiveness of our approach.
|
|
08:30-10:10, Paper TuPO1S-18.8 | Add to My Program |
GraspNeRF: Multiview-Based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF |
|
Dai, Qiyu | Peking University |
Zhu, Yan | Peking University |
Geng, Yiran | Peking University |
Ruan, Ciyu | National University of Defense Technology |
Zhang, Jiazhao | National University of Defense Technology |
Wang, He | Peking University |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation, AI-Based Methods
Abstract: In this work, we tackle 6-DoF grasp detection for transparent and specular objects, which is an important yet challenging problem in vision-based robotic systems, due to the failure of depth cameras in sensing their geometry. We, for the first time, propose a multiview RGB-based 6-DoF grasp detection network, GraspNeRF, that leverages the generalizable neural radiance field (NeRF) to achieve material-agnostic object grasping in clutter. Compared to the existing NeRF-based 3-DoF grasp detection methods that rely on densely captured input images and time-consuming per-scene optimization, our system can perform zero-shot NeRF construction with sparse RGB inputs and reliably detect 6-DoF grasps, both in real-time. The proposed framework jointly learns generalizable NeRF and grasp detection in an end-to-end manner, optimizing the scene representation construction for the grasping. For training data, we generate a large-scale photorealistic domain-randomized synthetic dataset of grasping in cluttered tabletop scenes that enables direct transfer to the real world. Our extensive experiments in synthetic and real-world environments demonstrate that our method significantly outperforms all the baselines in all the experiments while remaining in real-time. Project page can be found at https://pku-epic.github.io/GraspNeRF.
|
|
08:30-10:10, Paper TuPO1S-18.9 | Add to My Program |
Elastic Context: Encoding Elasticity for Data-Driven Models of Textiles |
|
Longhini, Alberta | KTH Royal Institute of Technology |
Moletta, Marco | KTH Royal Institute of Technology |
Reichlin, Alfredo | KTH Royal Institute of Technology |
Welle, Michael C. | KTH Royal Institute of Technology |
Kravberg, Alexander | KTH Royal Institute of Technology |
Wang, Yufei | Carnegie Mellon University |
Held, David | Carnegie Mellon University |
Erickson, Zackory | Carnegie Mellon University |
Kragic, Danica | KTH |
Keywords: AI-Enabled Robotics, Dual Arm Manipulation, Perception for Grasping and Manipulation
Abstract: Physical interaction with textiles, such as assistive dressing or household tasks, requires advanced dexterous skills. The complexity of textile behavior during stretching and pulling is influenced by the material properties of the yarn and by the textile's construction technique, which are often unknown in real-world settings. Moreover, identification of physical properties of textiles through sensing commonly available on robotic platforms remains an open problem. To address this, we introduce Elastic Context (EC), a method to encode the elasticity of textiles using stress-strain curves adapted from textile engineering for robotic applications. We employ EC to learn generalized elastic behaviors of textiles and examine the effect of EC dimension on accurate force modeling of real-world non-linear elastic behaviors.
|
|
08:30-10:10, Paper TuPO1S-18.10 | Add to My Program |
Vision-Based Six-Dimensional Peg-In-Hole for Practical Connector Insertion |
|
Zhang, Kun | Hong Kong University of Science and Technology |
Wang, Chen | The University of Hong Kong |
Chen, Hua | Southern University of Science and Technology |
Pan, Jia | University of Hong Kong |
Wang, Michael Yu | Monash University |
Zhang, Wei | Southern University of Science and Technology |
Keywords: Perception for Grasping and Manipulation, Manipulation Planning, Compliant Assembly
Abstract: In this paper, we study the six-dimensional (6D) perceptive peg-in-hole problem for practical connector insertion tasks. To enable the manipulator system to handle different types of pegs in complex environments, we develop a perceptive robotic assembly system that utilizes an in-hand RGB-D camera for peg-in-hole with multiple types of pegs. The proposed framework addresses the critical hole detection and pose estimation problem by combining learning-based detection with model-based pose estimation strategies. By exploiting the structure of the peg-in-hole task, we consider a rectangle-shape based characterization for modeling the candidate socket. Such a characterization allows us to design simple learning-based methods to detect and estimate the 6D pose of the target socket that balance processing speed and accuracy. To validate our method, we test the performance of the proposed perceptive peg-in-hole solution using a KUKA iiwa7 robotic arm to accomplish the socket insertion task with two types of practical sockets (RJ45/HDMI). Without the need for additional search, our method achieves an acceptable success rate in the connector insertion tasks. The results confirm the reliability of our method and show that it is suitable for real-world application.
|
|
08:30-10:10, Paper TuPO1S-18.11 | Add to My Program |
RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control |
|
Tang, Zhenggang | University of Illinois Urbana-Champaign |
Sundaralingam, Balakumar | NVIDIA Corporation |
Tremblay, Jonathan | Nvidia |
Wen, Bowen | NVIDIA |
Yuan, Ye | Carnegie Mellon University |
Tyree, Stephen | NVIDIA |
Loop, Charles | NVIDIA |
Schwing, Alexander | University of Illinois at Urbana-Champaign |
Birchfield, Stan | NVIDIA Corporation |
Keywords: Perception for Grasping and Manipulation, Deep Learning for Visual Perception, Machine Learning for Robot Control
Abstract: We present a system for collision-free control of a robot manipulator that uses only RGB views of the world. Perceptual input of a tabletop scene is provided by multiple images from an RGB camera (without depth) that is either handheld or mounted on the robot end effector. A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function (ESDF) is computed. A model predictive control algorithm is then used to control the manipulator to reach a desired pose while avoiding obstacles in the ESDF. We show results on a real dataset collected and annotated in our lab.
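Once the scene geometry is available, an ESDF can be computed with standard distance transforms; a minimal sketch on a voxelized occupancy grid (the paper derives its geometry from the NeRF-like reconstruction, and this generic grid version is only for illustration):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def esdf_from_occupancy(occ, voxel_size=0.01):
    """Euclidean signed distance field from a boolean occupancy grid:
    positive in free space, negative inside obstacles."""
    outside = distance_transform_edt(~occ) * voxel_size  # distance to nearest occupied voxel
    inside = distance_transform_edt(occ) * voxel_size    # penetration depth inside obstacles
    return outside - inside
```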
|
|
08:30-10:10, Paper TuPO1S-18.12 | Add to My Program |
Multi-View Object Pose Estimation from Correspondence Distributions and Epipolar Geometry |
|
Haugaard, Rasmus Laurvig | University of Southern Denmark |
Iversen, Thorbjørn Mosekjær | The Maersk Mc-Kinney Moller Institute, University of Southern Denmark |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Deep Learning for Visual Perception
Abstract: In many automation tasks involving manipulation of rigid objects, the poses of the objects must be acquired. Vision-based pose estimation using a single RGB or RGB-D sensor is especially popular due to its broad applicability. However, single-view pose estimation is inherently limited by depth ambiguity and by ambiguities imposed by various phenomena such as occlusion, self-occlusion, and reflections. Aggregation of information from multiple views can potentially resolve these ambiguities, but the current state-of-the-art multi-view pose estimation method only uses multiple views to aggregate single-view pose estimates, and thus relies on obtaining good single-view estimates. We present a multi-view pose estimation method which aggregates learned 2D-3D distributions from multiple views for both the initial estimate and optional refinement. Our method performs probabilistic sampling of 3D-3D correspondences under epipolar constraints using learned 2D-3D correspondence distributions which are implicitly trained to respect visual ambiguities such as symmetry. Evaluation on the T-LESS dataset shows that our method reduces pose estimation errors by 80-91% compared to the best single-view method, and we present state-of-the-art results on T-LESS with four views, even compared with methods using five and eight views.
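The geometric core of aggregating two calibrated views, triangulating a 3D point from matched 2D observations, can be sketched with OpenCV; note the paper samples correspondences from learned 2D-3D distributions under epipolar constraints rather than from classic feature matches, so this is only the underlying building block:

```python
import cv2
import numpy as np

def triangulate(P1, P2, pts1, pts2):
    """Two-view triangulation of matched 2D points. P1, P2 are 3x4 camera
    projection matrices; pts1, pts2 are 2xN arrays of corresponding pixels."""
    X_h = cv2.triangulatePoints(P1, P2, pts1.astype(np.float64), pts2.astype(np.float64))
    return (X_h[:3] / X_h[3]).T  # Nx3 Euclidean points
```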
|
|
TuPO1S-19 Poster Session, Room T8 |
Add to My Program |
Learning for Grasping and Manipulation I |
|
|
|
08:30-10:10, Paper TuPO1S-19.1 | Add to My Program |
FSG-Net: A Deep Learning Model for Semantic Robot Grasping through Few-Shot Learning |
|
Barcellona, Leonardo | University of Padova |
Bacchin, Alberto | University of Padua |
Gottardi, Alberto | University of Padova |
Menegatti, Emanuele | The University of Padua |
Ghidoni, Stefano | University of Padova |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation, Grasping
Abstract: Robot grasping has been widely studied in the last decade. Recently, deep learning has made it possible to achieve remarkable results in grasp pose estimation, using depth and RGB images. However, only a few works consider the choice of the object to grasp. Moreover, they require a huge amount of data to generalize to unseen object categories. For this reason, in this work, we define the Few-shot Semantic Grasping task, where the objective is inferring a correct grasp given only five labelled images of a target unseen object. We propose a new deep learning architecture able to solve the aforementioned problem, leveraging a Few-shot Semantic Segmentation module. We have evaluated the proposed model both on the GraspNet dataset and in a real scenario. On GraspNet, we achieve 40.95% accuracy in the Few-shot Semantic Grasping task, outperforming baseline approaches. In the real experiments, the results confirmed the generalization ability of the network.
|
|
08:30-10:10, Paper TuPO1S-19.2 | Add to My Program |
Learning Pre-Grasp Manipulation of Flat Objects in Cluttered Environments Using Sliding Primitives |
|
Wu, Jiaxi | Peking University |
Wu, Haoran | University of Science and Technology of China |
Zhong, Shanlin | Institute of Automation, Chinese Academy of Sciences |
Sun, Quqin | Wuhan Second Ship Design and Research Institute |
Li, Yinlin | Institute of Automation, Chinese Academy of Sciences |
Keywords: Deep Learning in Grasping and Manipulation, Manipulation Planning, Reinforcement Learning
Abstract: Flat objects with negligible thickness, such as books and disks, are challenging for a robot to grasp because of the width limit of the robot's gripper, especially when they lie in cluttered environments. Pre-grasp manipulation is conducive to rearranging objects on the table and moving flat objects to the table edge, making them graspable. In this paper, we formulate this task as a Parameterized Action Markov Decision Process, and a novel method based on deep reinforcement learning is proposed to address this problem by introducing sliding primitives as actions. A weight-sharing policy network is utilized to predict the sliding primitive's parameters for each object, and a Q-network is adopted to select the acted object among all the candidates on the table. Meanwhile, via integrating a curriculum learning scheme, our method can be scaled to cluttered environments with more objects. In both simulation and real-world experiments, our method surpasses the existing methods and achieves pre-grasp manipulation with higher task success rates and fewer action steps. Without fine-tuning, it can be generalized to novel shapes and household objects with more than 85% success rates in the real world. Videos and supplementary materials are available at https://sites.google.com/view/pre-grasp-sliding.
|
|
08:30-10:10, Paper TuPO1S-19.3 | Add to My Program |
Learning Category-Level Manipulation Tasks from Point Clouds with Dynamic Graph CNNs |
|
Liang, Junchi | Rutgers University |
Boularias, Abdeslam | Rutgers University |
Keywords: Deep Learning in Grasping and Manipulation, Imitation Learning, Learning from Demonstration
Abstract: This paper presents a new technique for learning category-level manipulation from raw RGB-D videos of task demonstrations, with no manual labels or annotations. Category-level learning aims to acquire skills that can be generalized to new objects, with geometries and textures that are different from the ones of the objects used in the demonstrations. We address this problem by first viewing both grasping and manipulation as special cases of tool use, where a tool object is moved to a sequence of key-poses defined in a frame of reference of a target object. Tool and target objects, along with their key-poses, are predicted using a dynamic graph convolutional neural network that takes as input an automatically segmented depth and color image of the entire scene. Empirical results on object manipulation tasks with a real robotic arm show that the proposed network can efficiently learn from real visual demonstrations to perform the tasks on novel objects within the same category, and outperforms alternative approaches.
|
|
08:30-10:10, Paper TuPO1S-19.4 | Add to My Program |
Neural Grasp Distance Fields for Robot Manipulation |
|
Weng, Thomas | Carnegie Mellon University |
Held, David | Carnegie Mellon University |
Meier, Franziska | Facebook |
Mukadam, Mustafa | Facebook AI Research |
Keywords: Deep Learning in Grasping and Manipulation, Representation Learning, Manipulation Planning
Abstract: We formulate grasp learning as a neural field and present Neural Grasp Distance Fields (NGDF). Here, the input is a 6D pose of a robot end effector and the output is a distance to a continuous manifold of valid grasps for an object. In contrast to current approaches that predict a set of discrete candidate grasps, the distance-based NGDF representation is easily interpreted as a cost, and minimizing this cost produces a successful grasp pose. This grasp distance cost can be incorporated directly into a trajectory optimizer for joint optimization with other costs such as trajectory smoothness and collision avoidance. During optimization, as the various costs are balanced and minimized, the grasp target is allowed to smoothly vary, as the learned grasp field is continuous. We evaluate NGDF on joint grasp and motion planning in simulation and the real world, outperforming baselines by 63% execution success while generalizing to unseen query poses and unseen object shapes. Project page: https://sites.google.com/view/neural-grasp-distance-fields.
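Because the output is a differentiable distance-to-grasp cost, a pose can be refined by plain gradient descent; a minimal PyTorch sketch, where `ngdf` is an assumed callable mapping a 6D pose tensor to a scalar distance (in the paper this cost is combined with smoothness and collision terms inside a trajectory optimizer):

```python
import torch

def refine_grasp(pose6d, ngdf, steps=100, lr=1e-2):
    """Gradient descent on a learned grasp distance field."""
    pose = pose6d.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        ngdf(pose).backward()  # the predicted distance acts directly as a cost
        opt.step()
    return pose.detach()
```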
|
|
08:30-10:10, Paper TuPO1S-19.5 | Add to My Program |
Planning for Multi-Object Manipulation with Graph Neural Network Relational Classifiers |
|
Huang, Yixuan | University of Utah |
Conkey, Adam | University of Utah |
Hermans, Tucker | University of Utah |
Keywords: Deep Learning in Grasping and Manipulation, Learning Categories and Concepts
Abstract: Objects rarely sit in isolation in human environments. As such, we’d like our robots to reason about how multiple objects relate to one another and how those relations may change as the robot interacts with the world. To this end, we propose a novel graph neural network framework for multi-object manipulation to predict how inter-object relations change given robot actions. Our model operates on partial-view point clouds and can reason about multiple objects dynamically interacting during the manipulation. By learning a dynamics model in a learned latent graph embedding space, our model enables multi-step planning to reach target goal relations. We show our model trained purely in simulation transfers well to the real world. Our planner enables the robot to rearrange a variable number of objects with a range of shapes and sizes using both pushing and pick-and-place skills.
|
|
08:30-10:10, Paper TuPO1S-19.6 | Add to My Program |
Local Neural Descriptor Fields: Locally Conditioned Object Representations for Manipulation |
|
Chun, Ethan | MIT |
Du, Yilun | MIT |
Simeonov, Anthony | Massachusetts Institute of Technology |
Lozano-Perez, Tomas | MIT |
Kaelbling, Leslie | MIT |
Keywords: Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: A robot operating in a household environment will see a wide range of unique and unfamiliar objects. While a system could train on many of these, it is infeasible to predict all the objects a robot will see. In this paper, we present a method to generalize object manipulation skills acquired from a limited number of demonstrations, to novel objects from unseen shape categories. Our approach, Local Neural Descriptor Fields (L-NDF), utilizes neural descriptors defined on the local geometry of the object to effectively transfer manipulation demonstrations to novel objects at test time. In doing so, we leverage the local geometry shared between objects to produce a more general manipulation framework. We illustrate the efficacy of our approach in manipulating novel objects in novel poses -- both in simulation and in the real world.
|
|
08:30-10:10, Paper TuPO1S-19.7 | Add to My Program |
Practical Visual Deep Imitation Learning Via Task-Level Domain Consistency |
|
Khansari, Mohi | Google X |
Ho, Daniel | Google X |
Du, Yuqing | UC Berkeley |
Fuentes, Armando | Everyday Robots |
Bennice, Matthew | Everyday Robots |
Sievers, Nicolas | Everyday Robots |
Kirmani, Sean | X, the Moonshot Factory |
Bai, Yunfei | Google X |
Jang, Eric | Halodi Robotics |
Keywords: Deep Learning in Grasping and Manipulation
Abstract: Recent work in visual end-to-end learning for robotics has shown the promise of imitation learning across a variety of tasks. Such approaches are, however, expensive: they require large amounts of real-world data and rely on time-consuming real-world evaluations to identify the best model for deployment. These challenges can be mitigated by using simulation evaluations to identify high-performing policies. However, this introduces the well-known "reality gap" problem, where simulator inaccuracies decorrelate performance in simulation from that of reality. In this paper, we build on top of prior work in GAN-based domain adaptation and introduce the notion of a Task Consistency Loss (TCL), a self-supervised loss that encourages sim and real alignment both at the feature and action-prediction levels. We demonstrate the effectiveness of our approach by teaching a 9-DoF mobile manipulator to perform the challenging task of latched door opening purely from visual inputs such as RGB and depth images. We achieve 69% success across twenty seen and unseen meeting rooms using only ~16.2 hours of teleoperated demonstrations in sim and real. To the best of our knowledge, this is the first work to tackle latched door opening from a purely end-to-end learning approach, where the tasks of navigation and manipulation are jointly modeled by a single neural network.
|
|
08:30-10:10, Paper TuPO1S-19.8 | Add to My Program |
SEIL: Simulation-Augmented Equivariant Imitation Learning |
|
Jia, Mingxi | Northeastern University |
Wang, Dian | Northeastern University |
Su, Guanang | Northeastern University |
Klee, David | Northeastern University |
Zhu, Xupeng | Northeastern University |
Walters, Robin | Northeastern University |
Platt, Robert | Northeastern University |
Keywords: Deep Learning in Grasping and Manipulation, Imitation Learning, Learning from Demonstration
Abstract: In robotic manipulation, acquiring samples is extremely expensive because it often requires interacting with the real world. Traditional image-level data augmentation has shown the potential to improve sample efficiency in various machine learning tasks. However, image-level data augmentation is insufficient for an imitation learning agent to learn good manipulation policies in a reasonable amount of demonstrations. We propose Simulation-augmented Equivariant Imitation Learning (SEIL), a method that combines a novel data augmentation strategy of supplementing expert trajectories with simulated transitions and an equivariant model that exploits the O(2) symmetry in robotic manipulation. Experimental evaluations demonstrate that our method can learn non-trivial manipulation tasks within ten demonstrations and outperform the baselines by a significant margin.
|
|
08:30-10:10, Paper TuPO1S-19.9 | Add to My Program |
Dextrous Tactile In-Hand Manipulation Using a Modular Reinforcement Learning Architecture |
|
Pitz, Johannes | German Aerospace Center |
Röstel, Lennart | German Aerospace Center (DLR) |
Sievers, Leon | German Aerospace Center |
Bäuml, Berthold | German Aerospace Center (DLR) |
Keywords: In-Hand Manipulation, Dexterous Manipulation, Multifingered Hands
Abstract: Dextrous in-hand manipulation with a multi-fingered robotic hand is a challenging task, especially when performed with the hand oriented upside down, demanding permanent force-closure, and when no external sensors are used. For the task of reorienting an object to a given goal orientation (vs. infinitely spinning it around an axis), the lack of external sensors is an additional fundamental challenge as the state of the object has to be estimated all the time, e.g., to detect when the goal is reached. In this paper, we show that the task of reorienting a cube to any of the 24 possible goal orientations in a π/2-raster using the torque-controlled DLR-Hand II is possible. The task is learned in simulation using a modular deep reinforcement learning architecture: the actual policy has only a small observation time window of 0.5s but gets the cube state as an explicit input which is estimated via a deep differentiable particle filter trained on data generated by running the policy. In simulation, we reach a success rate of 92% while applying significant domain randomization. Via zero-shot Sim2Real-transfer on the real robotic system, all 24 goal orientations can be reached with a high success rate.
|
|
08:30-10:10, Paper TuPO1S-19.10 | Add to My Program |
Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation |
|
Li, Mengxi | Stanford University |
Antonova, Rika | Stanford University |
Sadigh, Dorsa | Stanford University |
Bohg, Jeannette | Stanford University |
Keywords: Grippers and Other End-Effectors, Continual Learning, Simulation and Animation
Abstract: When humans perform contact-rich manipulation tasks, customized tools are often necessary to simplify the task. For instance, we use various utensils for handling food, such as knives, forks and spoons. Similarly, robots may benefit from specialized tools that enable them to more easily complete a variety of tasks. We present an end-to-end framework to automatically learn tool morphology for contact-rich manipulation tasks by leveraging differentiable physics simulators. Previous work relied on manually constructed priors requiring detailed specification of a 3D object model, grasp pose and task description to facilitate the search or optimization process. Our approach only requires defining the objective with respect to task performance and enables learning a robust morphology through randomizing variations of the task. We make this optimization tractable by casting it as a continual learning problem. We demonstrate the effectiveness of our method for designing new tools in several scenarios, such as winding ropes, flipping a box and pushing peas onto a scoop in simulation. Additionally, experiments with real robots show that the tool shapes discovered by our method help them succeed in these scenarios.
|
|
08:30-10:10, Paper TuPO1S-19.11 | Add to My Program |
CabiNet: Scaling Neural Collision Detection for Object Rearrangement with Procedural Scene Generation |
|
Murali, Adithyavairavan | Nvidia Corporation |
Mousavian, Arsalan | NVIDIA |
Eppner, Clemens | NVIDIA |
Fishman, Adam | University of Washington |
Fox, Dieter | University of Washington |
Keywords: Perception for Grasping and Manipulation, Manipulation Planning, Deep Learning in Grasping and Manipulation
Abstract: We address the important problem of generalizing robotic rearrangement to clutter without any explicit object models. We first generate over 650K cluttered scenes---orders of magnitude more than prior work---in diverse everyday environments, such as cabinets and shelves. We render synthetic partial point clouds from this data and use it to train our CabiNet model architecture. CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene. Our representation has a fast inference speed of 7 micro-seconds/query with nearly 20% higher performance than baseline approaches in challenging environments. We use this collision model in conjunction with a Model Predictive Path Integral (MPPI) planner to generate collision-free trajectories for picking and placing in clutter. CabiNet also predicts waypoints, computed from the scene’s signed distance field (SDF), that allows the robot to navigate tight spaces during rearrangement. This improves rearrangement performance by nearly 35% compared to baselines. We systematically evaluate our approach, procedurally generate simulated experiments, and demonstrate that our approach directly transfers to the real world, despite training exclusively in simulation. Supplementary material and videos of robot experiments in completely unknown scenes are available at: https://cabinet-object-rearrangement.github.io.
|
|
08:30-10:10, Paper TuPO1S-19.12 | Add to My Program |
NIFT: Neural Interaction Field and Template for Object Manipulation |
|
Huang, Zeyu | Shenzhen University |
Xu, Juzhan | Shenzhen University |
Dai, Sisi | National University of Defense Technology |
Xu, Kai | National University of Defense Technology |
Zhang, Hao | Simon Fraser University |
Huang, Hui | Shenzhen University |
Hu, Ruizhen | Shenzhen University |
Keywords: Representation Learning, Contact Modeling, Manipulation Planning
Abstract: We introduce NIFT, Neural Interaction Field and Template, a descriptive and robust interaction representation of object manipulations to facilitate imitation learning. Given a few object manipulation demos, NIFT guides the generation of the interaction imitation for a new object instance by matching the Neural Interaction Template (NIT) extracted from the demos in the target Neural Interaction Field (NIF) defined for the new object. Specifically, the NIF is a neural field that encodes the relationship between each spatial point and a given object, where the relative position is defined by a spherical distance function rather than occupancies or signed distances, which are commonly adopted by conventional neural fields but less informative. For a given demo interaction, the corresponding NIT is defined by a set of spatial points sampled in the demo NIF with associated neural features. To better capture the interaction, the points are sampled on the Interaction Bisector Surface (IBS), which consists of points that are equidistant to the two interacting objects and has been used extensively for interaction representation. With both point selection and pointwise features defined for better interaction encoding, NIT effectively guides the feature matching in the NIFs of the new object instances such that the relative poses are optimized to realize the manipulation while imitating the demo interactions. Experiments show that our NIFT solution outperforms state-of-the-art imitation learning methods for object manipulation and generalizes better to objects from new categories.
|
|
TuPO1S-20 Poster Session, Room T8 |
Add to My Program |
Localization I |
|
|
|
08:30-10:10, Paper TuPO1S-20.1 | Add to My Program |
Place Recognition under Occlusion and Changing Appearance Via Disentangled Representations |
|
Chen, Yue | Xi'an Jiaotong University |
Chen, Xingyu | Laboratory of Visual Cognitive Computing and Intelligent Vehicle |
Li, Yicen | McMaster University |
Keywords: Localization
Abstract: Place recognition is a critical and challenging task for mobile robots, aiming to retrieve an image captured at the same place as a query image from a database. Existing methods tend to fail while robots move autonomously under occlusion (e.g., by cars, buses, or trucks) and changing appearance (e.g., illumination changes, seasonal variation), because they encode the image into a single code, entangling place features with appearance and occlusion features. To overcome this limitation, we propose PROCA, an unsupervised approach to decompose the image representation into three codes: a place code used as a descriptor to retrieve images, an appearance code that captures appearance properties, and an occlusion code that encodes occlusion content. Extensive experiments show that our model outperforms the state-of-the-art methods.
|
|
08:30-10:10, Paper TuPO1S-20.2 | Add to My Program |
GIDP: Learning a Good Initialization and Inducing Descriptor Post-Enhancing for Large-Scale Place Recognition |
|
Fan, Zhaoxin | Renmin University of China |
Song, Zhenbo | Nanjing University of Science and Technology |
He, Jun | Renmin University of China |
Liu, Hongyan | Tsinghua University |
Keywords: Localization
Abstract: Large-scale place recognition is a fundamental but challenging task, which plays an increasingly important role in autonomous driving and robotics. Existing methods have achieved reasonably good performance; however, most of them concentrate on designing elaborate global descriptor learning network structures, while the importance of feature generalization and descriptor post-enhancing has long been neglected. In this work, we propose a novel method named GIDP to learn a Good Initialization and Inducing Descriptor Post-enhancing for large-scale place recognition. In particular, an unsupervised momentum contrast point cloud pretraining module and a reranking-based descriptor post-enhancing module are proposed respectively in GIDP. The former aims at learning a good initialization for the point cloud encoding network before training the place recognition model, while the latter aims at post-enhancing the predicted global descriptor through reranking at inference time. Extensive experiments on both indoor and outdoor datasets demonstrate that our method can achieve state-of-the-art performance using simple and general point cloud encoding backbones.
|
|
08:30-10:10, Paper TuPO1S-20.3 | Add to My Program |
STD: Stable Triangle Descriptor for 3D Place Recognition |
|
Yuan, Chongjian | The University of Hong Kong |
Lin, Jiarong | The University of Hong Kong |
Zou, Zuhao | Hong Kong University |
Hong, Xiaoping | Southern University of Science and Technology |
Zhang, Fu | University of Hong Kong |
Keywords: Localization, Mapping, Range Sensing
Abstract: In this work, we present a novel global descriptor termed stable triangle descriptor (STD) for 3D place recognition. The shape of a triangle is uniquely determined by the lengths of its sides (or, equivalently, its included angles), and it is completely invariant to rigid transformations. Based on this property, we first design an algorithm to efficiently extract local key points from the 3D point cloud and encode these key points into triangular descriptors. Then, place recognition is achieved by matching the side lengths (and some other information) of the descriptors between point clouds. The point correspondences obtained from the descriptor matching pairs can be further used in geometric verification, which greatly improves the accuracy of place recognition. In our experiments, we extensively compare our proposed system against other state-of-the-art systems (i.e., M2DP, Scan Context) on public datasets (i.e., KITTI, NCLT, and Complex-Urban) and our self-collected dataset (with a non-repetitive scanning solid-state LiDAR). All the quantitative results show that STD has stronger adaptability and a great improvement in precision over its counterparts. To share our findings and make contributions to the community, we open source our code on GitHub: https://github.com/hku-mars/STD
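The invariance STD exploits is easy to see in code: the sorted side lengths of any keypoint triangle are unchanged by a rigid transform of the cloud. A simplified sketch (the real system restricts triplets to nearby keypoints and stores additional attributes):

```python
import numpy as np
from itertools import combinations

def triangle_descriptors(keypoints, resolution=0.2):
    """Quantized sorted side lengths as rigid-transform-invariant keys;
    hashing the keys allows O(1) lookup of matching triangles between scans."""
    pts = np.asarray(keypoints, dtype=float)
    table = {}
    for i, j, k in combinations(range(len(pts)), 3):
        p, q, r = pts[i], pts[j], pts[k]
        sides = sorted((np.linalg.norm(p - q),
                        np.linalg.norm(q - r),
                        np.linalg.norm(r - p)))
        key = tuple(int(round(s / resolution)) for s in sides)
        table.setdefault(key, []).append((i, j, k))
    return table
```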
|
|
08:30-10:10, Paper TuPO1S-20.4 | Add to My Program |
DeepRING: Learning Roto-Translation Invariant Representation for LiDAR Based Place Recognition |
|
Lu, Sha | Zhejiang University |
Xu, Xuecheng | Zhejiang University |
Tang, Li | Zhejiang University |
Xiong, Rong | Zhejiang University |
Wang, Yue | Zhejiang University |
Keywords: Localization, Representation Learning, Range Sensing
Abstract: LiDAR based place recognition is popular for loop closure detection and re-localization. In recent years, deep learning has brought improvements to place recognition through learnable feature extraction. However, these methods degenerate when the robot re-visits previous places with a large perspective difference. To address this challenge, we propose DeepRING to learn a roto-translation invariant representation from the LiDAR scan, so that a robot revisiting the same place from a different perspective obtains a similar representation. There are two keys in DeepRING: the feature is extracted from the sinogram, and the feature is aggregated by its magnitude spectrum. The two steps endow the final representation with both discrimination and roto-translation invariance. Moreover, we cast place recognition as a one-shot learning problem with each place being a class, leveraging relation learning to build representation similarity. Extensive experiments are carried out on public datasets, validating the effectiveness of each proposed component, and showing that DeepRING outperforms the comparative methods, especially in dataset-level generalization.
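The sinogram/magnitude-spectrum idea can be illustrated with a hand-crafted analogue of the learned pipeline: a rotation of the scan becomes a circular shift of the Radon sinogram along the angle axis, and a translation becomes per-angle radial shifts, so the DFT magnitude discards both (up to boundary effects). A sketch using scikit-image, with a bird's-eye-view image of the scan as input:

```python
import numpy as np
from skimage.transform import radon

def ring_descriptor(bev, n_angles=120):
    """Roto-translation tolerant representation of a BEV scan image."""
    thetas = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sino = radon(bev, theta=thetas, circle=False)  # rows: radial bins, cols: angles
    return np.abs(np.fft.fft2(sino))               # shift-insensitive magnitude spectrum
```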
|
|
08:30-10:10, Paper TuPO1S-20.5 | Add to My Program |
Sensor Localization by Few Distance Measurements Via the Intersection of Implicit Manifolds |
|
Bilevich, Michael M. | Tel Aviv University |
LaValle, Steven M | University of Oulu |
Halperin, Dan | Tel Aviv University |
Keywords: Localization
Abstract: We present a general approach for determining the unknown (or uncertain) position and orientation of a sensor mounted on a robot in a known environment, using only a few distance measurements (between 2 and 6, typically), which is advantageous in terms of, among other things, sensor cost, storage, and information-communication resources. In-between the measurements, the robot can perform predetermined local motions in its workspace, which are useful for narrowing down the candidate poses of the sensor. We demonstrate our approach for planar workspaces, and show that, under mild transversality assumptions, already two measurements are sufficient to reduce the set of possible poses to a set of curves (one-dimensional objects) in the three-dimensional configuration space of the sensor $\mathbb{R}^2 \times \mathbb{S}^1$, and three or more measurements reduce the set of possible poses to a finite collection of points. However, analytically computing these potential poses for non-trivial intermediate motions between measurements raises substantial hardships, and thus we resort to numerical approximation. We reduce the localization problem to a carefully tailored procedure of intersecting two or more implicitly defined two-manifolds, which we carry out to any desired accuracy, proving guarantees on the quality of the approximation. We demonstrate the real-time effectiveness of our method even at high accuracy on various scenarios and different allowable intermediate motions. We also present experiments with a physical robot. Our open-source software and supplementary materials are available at https://bitbucket.org/taucgl/vb-fdml-public
|
|
08:30-10:10, Paper TuPO1S-20.6 | Add to My Program |
Boosting Performance of a Baseline Visual Place Recognition Technique by Predicting the Maximally Complementary Technique |
|
Malone, Connor | Queensland University of Technology |
Hausler, Stephen | CSIRO |
Fischer, Tobias | Queensland University of Technology |
Milford, Michael J | Queensland University of Technology |
Keywords: Localization
Abstract: One recent promising approach to the Visual Place Recognition (VPR) problem has been to fuse the place recognition estimates of multiple complementary VPR techniques using methods such as shared representative appearance learning (SRAL) and multi-process fusion. These approaches come with a substantial practical limitation: they require all potential VPR methods to be brute-force run before they are selectively fused. The obvious solution to this limitation is to predict the viable subset of methods ahead of time, but this is challenging because it requires a predictive signal within the imagery itself that is indicative of high-performance methods. Here we propose an alternative approach that instead starts with a known single base VPR technique and learns to predict the most complementary additional VPR technique to fuse with it, i.e., the one that results in the largest improvement in performance. The key innovation here is to use a dimensionally reduced difference vector between the query image and the top-retrieved reference image under this baseline technique as the predictive signal of the most complementary additional technique, both during training and inference. We demonstrate that our approach can train a single network to select performant, complementary technique pairs across datasets that span multiple modes of transportation (train, car, walking), as well as to generalise to unseen datasets, outperforming multiple baseline strategies for manually selecting the best technique pairs based on the same training data.
|
|
08:30-10:10, Paper TuPO1S-20.7 | Add to My Program |
Loosely-Coupled Localization Fusion System Based on Track-To-Track Fusion with Bias Alignment |
|
Kim, Soyeong | Konkuk University |
Jo, Kichun | Konkuk University |
Bradai, Benazouz | Valeo |
Resende, Paulo | Valeo |
Jo, Jaeyoung | Konkuk University, Smart Vehicle Engineering |
Keywords: Localization, Intelligent Transportation Systems, Sensor Fusion
Abstract: The localization system is an essential element in robotics, as it provides accurate position information. Multiple localization systems can be integrated for reliable localization operations because there are various methods for measuring position and various processing algorithms. Significantly, the track-to-track (T2T) fusion method can fuse multiple localization systems using each system's estimate, without accessing the sensors' raw data. However, most T2T fusion-based localization systems ignore slowly varying biases, such as drift errors, odometry errors, and offsets among multiple maps. This can degrade the localization performance because a slowly varying bias is directly reflected in the localization estimate. Therefore, a slowly varying bias must be considered in the fusion process to derive reliable estimates. This study proposes a T2T fusion-based localization system that considers a slowly varying bias. First, the slowly varying bias difference between the systems is estimated. Because each localization system can have a different bias, the estimated bias difference is used to align each system with the reference system. Second, a fused estimate is obtained by T2T fusion using the bias-aligned estimates. The proposed fusion system can also be used without limiting the number of input localization systems. The proposed system was compared with various T2T-based localization fusion algorithms for verification in a simulation environment, and it exhibited the best performance in the RMSE comparison.
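A minimal sketch of the two steps under simplifying assumptions (independent tracks, an additive bias, and a plain exponential smoother standing in for the paper's bias estimator; all names and values are illustrative):

    import numpy as np

    def update_bias(bias_hat, x_ref, x_b, alpha=0.01):
        # Slowly varying bias of system B relative to the reference system,
        # tracked with simple exponential smoothing (illustrative).
        return (1.0 - alpha) * bias_hat + alpha * (x_b - x_ref)

    def fuse_t2t(x_ref, P_ref, x_b, P_b, bias_hat):
        # Step 1: align system B to the reference using the estimated bias.
        x_b_aligned = x_b - bias_hat
        # Step 2: information-weighted track-to-track fusion of the estimates.
        W_ref, W_b = np.linalg.inv(P_ref), np.linalg.inv(P_b)
        P = np.linalg.inv(W_ref + W_b)
        return P @ (W_ref @ x_ref + W_b @ x_b_aligned), P

    x_ref, P_ref = np.array([10.0, 5.0]), 0.5 * np.eye(2)
    x_b, P_b = np.array([10.8, 5.7]), 0.3 * np.eye(2)
    bias = update_bias(np.zeros(2), x_ref, x_b, alpha=1.0)  # warm start
    x_fused, P_fused = fuse_t2t(x_ref, P_ref, x_b, P_b, bias)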
|
|
08:30-10:10, Paper TuPO1S-20.8 | Add to My Program |
Portable Multi-Hypothesis Monte Carlo Localization for Mobile Robots |
|
García, Alberto | Universidad Rey Juan Carlos |
Martin Rico, Francisco | Carnegie Mellon University |
Guerrero, Jose Miguel | Rey Juan Carlos University |
Rodríguez Lera, Francisco Javier | Universidad De León |
Matellan, Vicente | Universidad De Leon |
Keywords: Localization, Autonomous Agents, Autonomous Vehicle Navigation
Abstract: Self-localization is a fundamental capability that mobile robot navigation systems integrate to move from one point to another using a map. Thus, any enhancement in localization accuracy is crucial for performing delicate tasks. This paper describes a new localization algorithm that maintains several populations of particles using the Monte Carlo Localization (MCL) algorithm, always choosing the best one as the system's output. As novelties, our work includes a multi-scale map matching algorithm to create new MCL populations and a metric to determine the most reliable one. It also improves on state-of-the-art implementations, enhancing recovery times from erroneous estimates or unknown initial positions. The proposed method is evaluated in ROS2 in a module fully integrated with Nav2 and compared with the current state-of-the-art AMCL solution, obtaining good accuracy and recovery times.
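A compact sketch of the multi-population bookkeeping only, with placeholder interfaces for the observation model and the map-matching proposal; the reliability metric (mean raw observation likelihood) and the respawn threshold are illustrative, not the paper's.

    import numpy as np

    def mcl_cycle(populations, likelihood, propose_population):
        # populations: list of (N, 3) particle arrays [x, y, theta];
        # likelihood(pop) -> (N,) raw observation likelihoods for one population;
        # propose_population() -> fresh particles from multi-scale map matching.
        scores = [float(np.mean(likelihood(pop))) for pop in populations]
        best, worst = int(np.argmax(scores)), int(np.argmin(scores))
        if scores[worst] < 0.1 * scores[best]:    # illustrative respawn rule
            populations[worst] = propose_population()
        return populations[best]                  # system output: best hypothesis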
|
|
08:30-10:10, Paper TuPO1S-20.9 | Add to My Program |
CPnP: Consistent Pose Estimator for Perspective-N-Point Problem with Bias Elimination |
|
Zeng, Guangyang | The Chinese University of Hong Kong, Shenzhen |
Chen, Shiyu | The Chinese University of Hong Kong, Shenzhen |
Mu, Biqiang | Chinese Academy of Sciences |
Shi, Guodong | The University of Sydney |
Wu, Junfeng | The Chinese University of Hong Kong, Shenzhen |
Keywords: Localization, Probability and Statistical Methods, Optimization and Optimal Control
Abstract: The Perspective-n-Point (PnP) problem has been widely studied in both the computer vision and photogrammetry communities. With the development of feature extraction techniques, a large number of feature points might be available in a single shot. It is promising to devise a consistent estimator, i.e., one whose estimate converges to the true camera pose as the number of points increases. To this end, we propose a consistent PnP solver, named CPnP, with bias elimination. Specifically, linear equations are constructed from the original projection model via measurement model modification and variable elimination, based on which a closed-form least-squares solution is obtained. We then analyze and subtract the asymptotic bias of this solution, resulting in a consistent estimate. Additionally, Gauss-Newton (GN) iterations are executed to refine the consistent solution. Our proposed estimator is computationally efficient: it has O(n) time complexity. Simulations and real dataset tests show that our proposed estimator is superior to some well-known ones for images with dense visual features, in terms of estimation precision and computing time.
|
|
08:30-10:10, Paper TuPO1S-20.10 | Add to My Program |
LiDAR-Based Indoor Localization with Optimal Particle Filters Using Surface Normal Constraints |
|
Andradi, Heruka | Hochschule Bonn Rhein Sieg |
Blumenthal, Sebastian | Locomotec |
Prassler, Erwin | Bonn-Rhein-Sieg Univ. of Applied Sciences |
Plöger, Paul G. | Hochschule Bonn Rhein Sieg |
Keywords: Localization, Probability and Statistical Methods, Range Sensing
Abstract: Accurate and robust localization systems are often highly desired in autonomous mobile robots. Existing LiDAR-based localization systems generally use standard particle filters, which suffer from the well-known particle degeneracy problem. Furthermore, standard particle filters are ill-suited for handling discrepancies between maps and the actual operating environments. In this work, we present an effective LiDAR-based indoor localization system which addresses these two issues. The particle degeneracy problem is tackled with an efficient implementation of an optimal particle filter. Map discrepancies are then handled with the use of a high-fidelity observation model for accurate particle propagation and a separate low-fidelity observation model for robust weight update. Evaluations were carried out against a standard particle filter baseline on both real-world and simulated data from challenging indoor environments. The proposed system was found to show significantly better performance in terms of accuracy, robustness to ambiguity, and robustness to map discrepancies. These performance gains were observed even with more than ten times smaller particle set sizes than in the baseline, while the increase in the computation time per particle was only around 20%.
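The dual-model structure can be sketched abstractly (a generic particle filter step with the two observation models in the roles the abstract describes; `propagate_hifi` and `likelihood_lofi` are placeholder interfaces, not the paper's API):

    import numpy as np

    def pf_step(particles, propagate_hifi, likelihood_lofi, rng):
        # particles: (N, 3) poses. The high-fidelity observation model steers
        # particle propagation; the separate low-fidelity model scores the
        # weights, which is more forgiving of map discrepancies.
        particles = propagate_hifi(particles)
        w = likelihood_lofi(particles)
        w = w / np.sum(w)
        idx = rng.choice(len(particles), size=len(particles), p=w)
        return particles[idx]       # resampled according to robust weights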
|
|
08:30-10:10, Paper TuPO1S-20.11 | Add to My Program |
Efficient Planar Pose Estimation Via UWB Measurements |
|
Jiang, Haodong | The Chinese University of Hong Kong, Shenzhen |
Wang, Wentao | Zhejiang University |
Shen, Yuan | Nanjing University of Science and Technology |
Li, Xinghan | Zhejiang University |
Ren, Xiaoqiang | Shanghai University |
Mu, Biqiang | Chinese Academy of Sciences |
Wu, Junfeng | The Chinese University of Hong Kong, Shenzhen |
Keywords: Localization, Probability and Statistical Methods, SLAM
Abstract: State estimation is an essential part of autonomous systems. Integrating the Ultra-Wideband (UWB) technique has been shown to correct long-term estimation drift and bypass the complexity of loop closure detection. However, few works in robotics treat UWB as a stand-alone state estimation solution. The primary purpose of this work is to investigate planar pose estimation using only UWB range measurements. We prove a useful property of a two-step scheme: a consistent estimator can be refined to asymptotic efficiency with a single Gauss-Newton iteration. Grounded on this result, we design the GN-ULS estimator, which reduces the computation time significantly compared to previous methods and demonstrates the possibility of using only UWB for real-time state estimation.
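The refinement step is easy to state concretely. A sketch of one Gauss-Newton iteration for a planar pose [x, y, theta] from UWB ranges, assuming known anchor positions and one range per body-mounted tag (a simplification of the general measurement setup, not the paper's estimator):

    import numpy as np

    def gn_step(pose, tags, anchors, dists):
        # pose: [x, y, theta]; tags: (K, 2) tag offsets in the body frame;
        # anchors: (K, 2) known positions; dists: (K,) ranges, tag k to anchor k.
        # One Gauss-Newton step; needs K >= 3 and non-degenerate geometry.
        pose = np.asarray(pose, float)
        x, y, th = pose
        R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
        dR = np.array([[-np.sin(th), -np.cos(th)], [np.cos(th), -np.sin(th)]])
        res, J = [], []
        for t, a, d in zip(tags, anchors, dists):
            diff = pose[:2] + R @ np.asarray(t) - np.asarray(a)
            r = np.linalg.norm(diff)
            u = diff / r
            res.append(r - d)                       # range residual
            J.append([u[0], u[1], u @ (dR @ np.asarray(t))])
        res, J = np.asarray(res), np.asarray(J)
        return pose - np.linalg.solve(J.T @ J, J.T @ res)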
|
|
TuPO1S-21 Poster Session, Room T8 |
Add to My Program |
Vision-Based Navigation I |
|
|
|
08:30-10:10, Paper TuPO1S-21.1 | Add to My Program |
Visual Pitch and Roll Estimation for Inland Water Vessels |
|
Griesser, Dennis | University of Applied Sciences Konstanz, Institute for Optical S |
Umlauf, Georg | University of Applied Sciences Konstanz, Institute for Optical S |
Franz, Matthias | University of Applied Sciences Konstanz, Institute for Optical S |
Keywords: Vision-Based Navigation, Computer Vision for Automation, Data Sets for Robotic Vision
Abstract: Motion estimation is an essential element for autonomous vessels. It is used, e.g., for lidar motion compensation as well as for mapping and detection tasks in a maritime environment. Because gyroscopes alone are not reliable and high-performance inertial measurement units are quite expensive, we present an approach for visual pitch and roll estimation that utilizes a convolutional neural network for water segmentation, a stereo system for reconstruction, and simple geometry to estimate pitch and roll. The algorithm is validated on a novel, publicly available dataset recorded at Lake Constance. Our experiments show that the pitch and roll estimator provides accurate results compared with an Xsens IMU sensor. We can further improve the pitch and roll estimation by sensor fusion with a gyroscope. An implementation of the algorithm is available as a ROS node.
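The final geometric step can be sketched directly. Assuming a body frame with x forward, y left, z up, and the unit normal of the fitted water plane expressed in that frame (level vessel: n = (0, 0, 1)), the normal plays the same role as the gravity direction in accelerometer tilt estimation; the frame convention here is an assumption, not taken from the paper.

    import numpy as np

    def pitch_roll_from_normal(n):
        # n: water-plane normal in the body frame (x fwd, y left, z up).
        n = np.asarray(n, float)
        n = n / np.linalg.norm(n)
        pitch = np.arctan2(-n[0], np.hypot(n[1], n[2]))
        roll = np.arctan2(n[1], n[2])
        return pitch, roll

    # Bow pitched up by 5 degrees tilts the observed normal toward the stern.
    n = [np.sin(np.radians(-5.0)), 0.0, np.cos(np.radians(-5.0))]
    print(np.degrees(pitch_roll_from_normal(n)))  # approx. (5, 0)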
|
|
08:30-10:10, Paper TuPO1S-21.2 | Add to My Program |
GPF-BG: A Hierarchical Vision-Based Planning Framework for Safe Quadrupedal Navigation |
|
Feng, Shiyu | Georgia Institute of Technology |
Zhou, Ziyi | Georgia Institute of Technology |
Smith, Justin | Georgia Institute of Technology |
Asselmeier, Maxwell | Georgia Institute of Technology |
Zhao, Ye | Georgia Institute of Technology |
Vela, Patricio | Georgia Institute of Technology |
Keywords: Vision-Based Navigation, Legged Robots, Reactive and Sensor-Based Planning
Abstract: Safe quadrupedal navigation through unknown environments is a challenging problem. This paper proposes a hierarchical vision-based planning framework (GPF-BG) integrating our previous Global Path Follower (GPF) navigation system and a gap-based local planner using Bézier curves, the so-called Bézier Gap (BG). This BG-based trajectory synthesis can generate smooth trajectories and guarantee safety for point-mass robots. With a gap analysis extension based on non-point, rectangular geometry, safety is guaranteed for an idealized quadrupedal motion model and significantly improved for an actual quadrupedal robot model. A stabilized perception space improves performance under oscillatory internal body motions that impact sensing. Simulation-based and real experiments under different benchmarking configurations test safe navigation performance. GPF-BG has the best safety outcomes across all experiments.
|
|
08:30-10:10, Paper TuPO1S-21.3 | Add to My Program |
Direct Angular Rate Estimation without Event Motion-Compensation at High Angular Rates |
|
Ng, Matthew | Singapore University of Technology and Design |
Cai, Xinyu | Singapore University of Technology and Design |
Foong, Shaohui | Singapore University of Technology and Design |
Keywords: Vision-Based Navigation, Localization
Abstract: Feature-based methods are popular for camera state estimation with event cameras. Due to the spatiotemporal nature of events, all event images exhibit smearing of events analogous to motion blur for a camera under motion. As such, events must be motion compensated to derive a sharp event image. However, this presents a causality dilemma where a motion prior is required to unsmear the events, but a sharp event image is required to estimate motion. While it is possible to use the IMU to develop a motion prior, it has been shown that the limited dynamic range of ±2000 °/s is insufficient for high angular rate rotorcraft. Furthermore, smoothing of motion-compensated images due to actual event detection time latency in event cameras severely limits the performance of feature-based methods at high angular rates. This paper proposes a Fourier-based angular rate estimator capable of estimating angular rates directly on non-motion-compensated event images. This method circumvents the need for external motion priors in camera state estimation and sidesteps problematic smoothing of features in the spatial domain due to motion blur. Lastly, using an NVIDIA Jetson Xavier NX, the algorithm is demonstrated to be real-time performant up to 3960 °/s.
|
|
08:30-10:10, Paper TuPO1S-21.4 | Add to My Program |
StereoVAE: A Lightweight Stereo-Matching System Using Embedded GPUs |
|
Qiong, Chang | Tokyo Institute of Technology |
Xiang, Li | Nanjing University |
Xin, Xu | Nanjing University |
Liu, Xin | National Institute of Advanced Industrial Science and Technology |
Li, Yun | Nanjing University |
Miyazaki, Jun | Tokyo Institute of Technology School of Computing |
Keywords: Embedded Systems for Robotic and Automation, Vision-Based Navigation
Abstract: We propose a lightweight stereo-matching system for embedded graphics processing units (GPUs). The proposed system overcomes the trade-off between accuracy and processing speed in stereo matching, further improving the matching accuracy while ensuring real-time processing. The basic idea is to construct a tiny neural network based on a variational autoencoder (VAE) to upscale and refine a small coarse disparity map, which is initially generated by a traditional matching method. The proposed hybrid structure retains the low computational complexity of traditional methods while improving matching accuracy with the help of a neural network. Extensive experiments on the KITTI 2015 benchmark dataset demonstrate that our tiny system is highly robust in improving the accuracy of coarse disparity maps generated by different algorithms, while running in real time on embedded GPUs.
|
|
08:30-10:10, Paper TuPO1S-21.5 | Add to My Program |
Learning Perception-Aware Agile Flight in Cluttered Environments |
|
Song, Yunlong | University of Zurich |
Shi, Kexin | Universität Zürich |
Penicka, Robert | Czech Technical University in Prague |
Scaramuzza, Davide | University of Zurich |
Keywords: Vision-Based Navigation, Machine Learning for Robot Control, Aerial Systems: Perception and Autonomy
Abstract: Recently, neural control policies have outperformed existing model-based planning-and-control methods for autonomously navigating quadrotors through cluttered environments in minimum time. However, they are not perception aware, a crucial requirement in vision-based navigation due to the camera's limited field of view and the underactuated nature of a quadrotor. We propose a learning-based system that achieves perception-aware, agile flight in cluttered environments. Our method combines imitation learning with reinforcement learning (RL) by leveraging a privileged learning-by-cheating framework. Using RL, we first train a perception-aware teacher policy with full-state information to fly in minimum time through cluttered environments. Then, we use imitation learning to distill its knowledge into a vision-based student policy that only perceives the environment via a camera. Our approach tightly couples perception and control, showing a significant advantage in computation speed (10× faster) and success rate. We demonstrate the closed-loop control performance using hardware-in-the-loop simulation.
|
|
08:30-10:10, Paper TuPO1S-21.6 | Add to My Program |
NanoFlowNet: Real-Time Dense Optical Flow on a Nano Quadcopter |
|
Bouwmeester, Rik Jan | Delft University of Technology |
Paredes-Valles, Federico | Delft University of Technology |
de Croon, Guido | TU Delft |
Keywords: Vision-Based Navigation, Machine Learning for Robot Control, AI-Enabled Robotics
Abstract: Nano quadcopters are small, agile, and cheap platforms that are well suited for deployment in narrow, cluttered environments. Due to their limited payload, these vehicles are highly constrained in processing power, rendering conventional vision-based methods for safe and autonomous navigation incompatible. Recent machine learning developments promise high-performance perception at low latency, while dedicated edge computing hardware has the potential to augment the processing capabilities of these limited devices. In this work, we present NanoFlowNet, a lightweight convolutional neural network for real-time dense optical flow estimation on edge computing hardware. We draw inspiration from recent advances in semantic segmentation for the design of this network. Additionally, we guide the learning of optical flow using motion boundary ground truth data, which improves performance with no impact on latency. Validation results on the MPI-Sintel dataset show the high performance of the proposed network given its constrained architecture. Additionally, we successfully demonstrate the capabilities of NanoFlowNet by deploying it on the ultra-low power GAP8 microprocessor and by applying it to vision-based obstacle avoidance on board a Bitcraze Crazyflie, a 34 g nano quadcopter.
|
|
08:30-10:10, Paper TuPO1S-21.7 | Add to My Program |
Zero-Shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants |
|
Park, Jeongeun | Korea University |
Yoon, Taerim | Korea University |
Hong, Jejoon | Korea University |
Yu, Youngjae | Yonsei University |
Pan, Matthew | Queen's University |
Choi, Sungjoon | Korea University |
Keywords: Vision-Based Navigation, Object Detection, Segmentation and Categorization, Search and Rescue Robots
Abstract: In this paper, we focus on the problem of efficiently locating a target object described with free-form text using a mobile robot equipped with vision sensors (e.g., an RGBD camera). Conventional active visual search predefines a set of objects to search for, rendering these techniques restrictive in practice. To provide added flexibility in active visual searching, we propose a system where a user can enter target commands using free-form text; we call this system Zero-shot Active Visual Search (ZAVIS). ZAVIS detects and plans to search for a target object inputted by a user through a semantic grid map represented by static landmarks (e.g., desk or bed). For efficient planning of object search patterns, ZAVIS considers commonsense knowledge-based co-occurrence and predictive uncertainty while deciding which landmarks to visit first. We validate the proposed method with respect to SR (success rate) and SPL (success weighted by path length) in both simulated and real-world environments. The proposed method outperforms previous methods in terms of SPL in simulated scenarios, and we further demonstrate ZAVIS with a Pioneer-3AT robot in real-world studies.
|
|
08:30-10:10, Paper TuPO1S-21.8 | Add to My Program |
Memory-Based Exploration-Value Evaluation Model for Visual Navigation |
|
Feng, Yongquan | National University of Defense Technology |
Xu, Liyang | NUDT |
Li, Minglong | National University of Defense Technology |
Jin, Ruochun | National University of Defense Technology |
Huang, Da | The State Key Laboratory of High Performance Computing (HPCL) & |
Yang, Shaowu | National University of Defense Technology |
Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Keywords: Vision-Based Navigation, Reinforcement Learning, AI-Enabled Robotics
Abstract: We propose a hierarchical visual navigation solution, called the Memory-based Exploration-value Evaluation Model (MEEM), to improve the agent's navigation performance. MEEM employs a hierarchical policy to tackle the challenge of sparse rewards, holds an episodic memory to store the historical information of the agent, and applies an exploration-value evaluation model to calculate an exploration value for action planning at each location in the observable area. We experimentally verify MEEM by comparing navigation performance on two datasets, the grid-map dataset and the 3D-scenes Gibson dataset, where our approach achieves state-of-the-art performance on both. Specifically, the overall success rate of MEEM is 95% on the grid-map dataset, while the best competitor reaches only 68%. As for the Gibson dataset, the success rates of our method and the best competitor SemExp are 69.8% and 54.4%, respectively. Ablation analysis on the grid-map dataset indicates that all three components of MEEM have positive effects.
|
|
08:30-10:10, Paper TuPO1S-21.9 | Add to My Program |
ViNL: Visual Navigation and Locomotion Over Obstacles |
|
Kareer, Simar | Georgia Tech |
Yokoyama, Naoki | Georgia Institute of Technology |
Batra, Dhruv | Georgia Tech / Facebook AI Research |
Ha, Sehoon | Georgia Institute of Technology |
Truong, Joanne | The Georgia Institute of Technology |
Keywords: Vision-Based Navigation, Reinforcement Learning, Autonomous Agents
Abstract: We present Visual Navigation and Locomotion over obstacles (ViNL), which enables a quadrupedal robot to navigate unseen apartments while stepping over small obstacles that lie in its path (e.g., shoes, toys, cables), similar to how humans and pets lift their feet over objects as they walk. ViNL consists of: (1) a visual navigation policy that outputs linear and angular velocity commands that guide the robot to a goal coordinate in unfamiliar indoor environments; and (2) a visual locomotion policy that controls the robot's joints to avoid stepping on obstacles while following the provided velocity commands. Both policies are entirely "model-free", i.e., sensors-to-actions neural networks trained end-to-end. The two are trained independently in two entirely different simulators and then seamlessly co-deployed by feeding the velocity commands from the navigator to the locomotor, entirely "zero-shot" (without any co-training). While prior works have developed learning methods for visual navigation or visual locomotion, to the best of our knowledge, this is the first fully learned approach that leverages vision to accomplish both (1) intelligent navigation in new environments, and (2) intelligent visual locomotion that aims to traverse cluttered environments without disrupting obstacles. On the task of navigation to distant goals in unknown environments, ViNL using just egocentric vision significantly outperforms prior work on robust locomotion using privileged terrain maps (+32.8% success and -4.42 collisions per meter). Additionally, we ablate our locomotion policy to show that each aspect of our approach helps reduce obstacle collisions. Videos and code at http://www.joannetruong.com/projects/vinl.html.
|
|
08:30-10:10, Paper TuPO1S-21.10 | Add to My Program |
Zero-Shot Object Goal Visual Navigation |
|
Zhao, Qianfan | State Key Laboratory of Management and Control for Complex Syste |
Zhang, Lu | Institute of Automation, Chinese Academy of Science |
He, Bin | Tongji University |
Qiao, Hong | Institute of Automation, Chinese Academy of Sciences |
Liu, Zhiyong | Institute of Automation Chinese Academy of Sciences |
Keywords: Vision-Based Navigation, Reinforcement Learning, Deep Learning Methods
Abstract: Object goal visual navigation is a challenging task that aims to guide a robot to find the target object based on its visual observation, where the target is limited to classes pre-defined in the training stage. However, in real households, there may exist numerous object classes that the robot needs to deal with, and it is hard for all of these classes to be covered in the training stage. To address this challenge, we study the zero-shot object goal visual navigation task, which aims at guiding robots to find targets belonging to novel classes without any training samples. To this end, we propose a novel zero-shot object navigation framework called the semantic similarity network (SSNet). Our framework uses the detection results and the cosine similarity between semantic word embeddings as input. This type of input data is only weakly correlated with specific classes, and thus our framework can generalize the policy to novel classes. Extensive experiments on the AI2-THOR platform show that our model outperforms the baseline models in the zero-shot object navigation task, which proves the generalization ability of our model. Our code is available at: https://github.com/pioneer-innovation/Zero-Shot-Object-Navigation.
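The class-agnostic input construction is the part that is simple to illustrate: detection geometry is concatenated with the cosine similarity between each detected class embedding and the target class embedding. Random vectors stand in for real word embeddings here, and the feature layout is an assumption, not the paper's exact format.

    import numpy as np

    def cosine_sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def build_input(boxes, detected_embs, target_emb):
        # boxes: (N, 4) detection geometry; similarities replace class IDs,
        # which is what lets the policy generalize to unseen target classes.
        sims = [cosine_sim(e, target_emb) for e in detected_embs]
        return np.column_stack([boxes, sims])

    rng = np.random.default_rng(3)
    boxes = np.array([[0.1, 0.2, 0.3, 0.4], [0.5, 0.5, 0.2, 0.2]])
    embs = rng.standard_normal((2, 50))   # embeddings of two detected classes
    target = rng.standard_normal(50)      # embedding of a novel target class
    print(build_input(boxes, embs, target).shape)  # (2, 5)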
|
|
08:30-10:10, Paper TuPO1S-21.11 | Add to My Program |
Monocular Simultaneous Localization and Mapping Using Ground Textures |
|
Hart, Kyle | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
O'Shea, Ryan | Naval Air Warfare Center Aircraft Division |
Kelly, John | RISE Laboratory at Naval Air Warfare Center |
Martinez, David | Pennsylvania State University |
Keywords: Vision-Based Navigation, SLAM, Computer Vision for Automation
Abstract: Recent work has shown impressive localization performance using only images of ground textures taken with a downward facing monocular camera. This provides a reliable navigation method that is robust to feature sparse environments and challenging lighting conditions. However, these localization methods require an existing map for comparison. Our work aims to relax the need for a map by introducing a full simultaneous localization and mapping (SLAM) system. By not requiring an existing map, setup times are minimized and the system is more robust to changing environments. This SLAM system uses a combination of several techniques to accomplish this. Image keypoints are identified and projected into the ground plane. These keypoints, visual bags of words, and several threshold parameters are then used to identify overlapping images and revisited areas. The system then uses robust M-estimators to estimate the transform between robot poses with overlapping images and revisited areas. These optimized estimates make up the map used for navigation. We show, through experimental data, that this system performs reliably on many ground textures, but not all.
|
|
08:30-10:10, Paper TuPO1S-21.12 | Add to My Program |
WAVN: Wide Area Visual Navigation for Large-Scale, GPS-Denied Environments |
|
Lyons, Damian | Fordham University |
Rahouti, Mohamed | Fordham University |
Keywords: Vision-Based Navigation, Swarm Robotics, Multi-Robot Systems
Abstract: This paper introduces a novel approach to GPS-denied visual navigation of a robot team over a wide (i.e., out of line of sight) area which we call WAVN (Wide Area Visual Navigation). Application domains include small-scale precision agriculture as well as exploration and surveillance. The proposed approach requires no exploration or map generation, merging, and updating, some of the most computationally intensive aspects of multi-robot navigation, especially in dynamic environments and for long-term deployments. In contrast, we extend the visual homing paradigm to leverage visual information from the entire team to allow a robot to home to a distant location. Since it only employs the latest imagery, the approach can be resilient to the current state of the environment. WAVN requires three components: identification of common landmarks between robots, a communication infrastructure, and an algorithm to find a sequence of common landmarks to navigate to a goal. The principal contribution of this paper is the navigation algorithm in addition to simulation and physical robot results characterizing performance. The approach is also compared to more traditional map-based approaches.
|
|
TuPO1S-22 Poster Session, Room T8 |
Add to My Program |
Localization and Mapping I |
|
|
|
08:30-10:10, Paper TuPO1S-22.1 | Add to My Program |
ORORA: Outlier-Robust Radar Odometry |
|
Lim, Hyungtae | Korea Advanced Institute of Science and Technology |
Han, Kawon | Korea Advanced Institute of Science and Technology |
Shin, Gunhee | Inha University |
Kim, Giseop | NAVER LABS |
Hong, Songcheol | Korea Advanced Institute of Science and Technology |
Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: SLAM, Mapping, Range Sensing
Abstract: Radar sensors are emerging as solutions for perceiving surroundings and estimating ego-motion in extreme weather conditions. Unfortunately, radar measurements are noisy and suffer from mutual interference, which degrades the performance of feature extraction and matching, triggering imprecise matching pairs, which are referred to as outliers. To tackle the effect of outliers on radar odometry, a novel outlier-robust method called ORORA is proposed, which is an abbreviation of Outlier-RObust RAdar odometry. To this end, a novel decoupling-based method is proposed, which consists of graduated non-convexity (GNC)-based rotation estimation and anisotropic component-wise translation estimation (A-COTE). Furthermore, our method leverages the anisotropic characteristics of radar measurements, each of whose uncertainty along the azimuthal direction is somewhat larger than that along the radial direction. As verified on the public dataset, it was demonstrated that our proposed method yields robust ego-motion estimation performance compared with other state-of-the-art methods. Our code is available at https://github.com/url-kaist/outlier-robust-radar-odometry.
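A stripped-down sketch of the decoupling only: rotation is estimated first from centered correspondences (plain Kabsch/SVD here, where the paper uses a GNC-based robust estimator), then translation is estimated component-wise (a per-axis median here, standing in for the anisotropic A-COTE step).

    import numpy as np

    def decoupled_register(src, dst):
        # src, dst: (N, 2) putative correspondences, possibly with outliers.
        sc, dc = src - src.mean(axis=0), dst - dst.mean(axis=0)
        U, _, Vt = np.linalg.svd(sc.T @ dc)
        S = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ S @ U.T                      # rotation (non-robust stand-in)
        t = np.median(dst - src @ R.T, axis=0)  # component-wise translation
        return R, t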
|
|
08:30-10:10, Paper TuPO1S-22.2 | Add to My Program |
AdaSfM: From Coarse Global to Fine Incremental Adaptive Structure from Motion |
|
Chen, Yu | National University of Singapore |
Yu, Zihao | Beihang University |
Song, Shu | Nreal |
Li, Jianming | Segway Ninebot |
Yu, Tianning | Willand Company |
Lee, Gim Hee | National University of Singapore |
Keywords: Mapping, SLAM, Visual-Inertial SLAM
Abstract: Despite the impressive results achieved by many existing Structure from Motion (SfM) approaches, there is still a need to improve robustness, accuracy, and efficiency on large-scale scenes with many outlier matches and sparse view graphs. In this paper, we propose AdaSfM: a coarse-to-fine adaptive SfM approach that is scalable to large-scale and challenging datasets. Our approach first performs a coarse global SfM which improves the reliability of the view graph by leveraging measurements from low-cost sensors such as Inertial Measurement Units (IMUs) and wheel encoders. Subsequently, the view graph is divided into sub-scenes that are refined in parallel by a fine local incremental SfM regularised by the result from the coarse global SfM to improve the camera registration accuracy and alleviate scene drift. Finally, our approach uses a threshold-adaptive strategy to align all local reconstructions to the coordinate frame of the global SfM. Extensive experiments on large-scale benchmark datasets show that our approach achieves state-of-the-art accuracy and efficiency.
|
|
08:30-10:10, Paper TuPO1S-22.3 | Add to My Program |
Robust Map Fusion with Visual Attention Utilizing Multi-Agent Rendezvous |
|
Kim, Jaein | Seoul National University |
Han, Dong-Sig | Seoul National University |
Zhang, Byoung-Tak | Seoul National University |
Keywords: Multi-Robot SLAM, Deep Learning Methods, SLAM
Abstract: Map fusion for multi-robot simultaneous localization and mapping (SLAM) consistently combines robot maps built independently into a global map. An established approach to map fusion is utilizing rendezvous, which refers to an encounter between multiple agents, to calculate the transformation into the global map. However, previous works using rendezvous have a limitation in that they are unreliable in certain circumstances where the amount of agent observations or overlapping landmarks is limited. This work proposes a novel map fusion system which robustly fuses local maps in challenging rendezvous that lack shared information. Our system utilizes a single visual perception from the rendezvous and estimates the relative pose between agents with DOPE. Then our scheme transforms the local maps with the estimated relative pose and predicts the misalignment of the approximated maps by utilizing the attention mechanism of the vision transformer. Comparisons with a Hough transform-based method show that ours is significantly better when the overlap between local maps is insufficient. We also verify the robustness of our system in a similar real-world scenario.
|
|
08:30-10:10, Paper TuPO1S-22.4 | Add to My Program |
Wi-Closure: Reliable and Efficient Search of Inter-Robot Loop Closures Using Wireless Sensing |
|
Wang, Weiying | Harvard University |
Kemmeren, Anne | Delft University |
Son, Daniel | Harvard University |
Alonso-Mora, Javier | Delft University of Technology |
Gil, Stephanie | Harvard University |
Keywords: Multi-Robot SLAM, Multi-Robot Systems, Range Sensing
Abstract: In this paper we propose a novel algorithm, Wi-Closure, to improve the computational efficiency and robustness of loop closure detection in multi-robot SLAM. Our approach decreases the computational overhead of classical approaches by pruning the search space of potential loop closures, prior to evaluation by a typical multi-robot SLAM pipeline. Wi-Closure achieves this by identifying candidates that are spatially close to each other, measured via sensing over the wireless communication signal between robots, even when they are operating in non-line-of-sight or in remote areas of the environment from one another. We demonstrate the validity of our approach in simulation and hardware experiments. Our results show that using Wi-Closure greatly reduces computation time, by 54.1% in simulation and by 76.8% in hardware experiments, compared with a multi-robot SLAM baseline. Importantly, this is achieved without sacrificing accuracy. Using Wi-Closure reduces absolute trajectory estimation error by 98.0% in simulation and 89.2% in hardware experiments. This improvement is partly due to Wi-Closure's ability to avoid catastrophic optimization failure that typically occurs with classical approaches in challenging repetitive environments.
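The pruning idea reduces to a one-line filter once an inter-robot range is available from the radio. `radio_range` and the distance threshold are placeholder assumptions standing in for the paper's wireless-sensing interface.

    def prune_candidates(pairs, radio_range, max_dist=5.0):
        # Keep only keyframe pairs whose radio-derived distance makes a loop
        # closure plausible; only survivors enter the expensive SLAM pipeline.
        return [(i, j) for (i, j) in pairs if radio_range(i, j) < max_dist]

    # Example with a lookup table standing in for live radio measurements.
    ranges = {(0, 7): 2.1, (1, 9): 48.0, (3, 4): 4.4}
    print(prune_candidates(ranges.keys(), lambda i, j: ranges[(i, j)]))
    # -> [(0, 7), (3, 4)]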
|
|
08:30-10:10, Paper TuPO1S-22.5 | Add to My Program |
COVINS-G: A Generic Back-End for Collaborative Visual-Inertial SLAM |
|
Patel, Manthan | ETH Zurich |
Karrer, Marco | ETH Zurich |
Bänninger, Philipp | ETH Zurich |
Chli, Margarita | ETH Zurich |
Keywords: Multi-Robot SLAM, SLAM, Visual-Inertial SLAM
Abstract: Collaborative SLAM is at the core of perception in multi-robot systems as it enables the co-localization of the team of robots in a common reference frame, which is of vital importance for any coordination amongst them. The paradigm of a centralized architecture is well established, with the robots (i.e. agents) running Visual-Inertial Odometry (VIO) onboard while communicating relevant data, such as e.g. Keyframes (KFs), to a central back-end (i.e. server), which then merges and optimizes the joint maps of the agents. While these frameworks have proven to be successful, their capability and performance are highly dependent on the choice of the VIO front-end, thus limiting their flexibility. In this work, we present COVINS-G, a generalized back-end building upon the COVINS framework, enabling the compatibility of the server back-end with any arbitrary VIO front-end, including, for example, off-the-shelf cameras with odometry capabilities, such as the Realsense T265. The COVINS-G back-end deploys a multi-camera relative pose estimation algorithm for computing the loop-closure constraints, allowing the system to work purely on 2D image data. In the experimental evaluation, we show on-par accuracy with state-of-the-art multi-session and collaborative SLAM systems, while demonstrating the flexibility and generality of our approach by employing different front-ends onboard collaborating agents within the same mission. The COVINS-G codebase, along with a generalized front-end wrapper to allow any existing VIO front-end to be readily used in combination with the proposed collaborative back-end, is open-sourced. Video: https://youtu.be/FoJfXCfaYDw
|
|
08:30-10:10, Paper TuPO1S-22.6 | Add to My Program |
PIEKF-VIWO: Visual-Inertial-Wheel Odometry Using Partial Invariant Extended Kalman Filter |
|
Hua, Tong | Shanghai Jiao Tong University |
Li, Tao | Shanghai Jiao Tong University |
Pei, Ling | Shanghai Jiao Tong University |
Keywords: Sensor Fusion, SLAM
Abstract: The Invariant Extended Kalman Filter (IEKF) has been successfully applied in Visual-Inertial Odometry (VIO) as an advanced variant of the Kalman filter, showing great potential in sensor fusion. In this paper, we propose the partial IEKF (PIEKF), which incorporates only the rotation-velocity state into the Lie group structure, and apply it to Visual-Inertial-Wheel Odometry (VIWO) to improve positioning accuracy and consistency. Specifically, we derive the rotation-velocity measurement model, which combines wheel measurements with kinematic constraints. The model circumvents the wheel odometer's 3D integration and covariance propagation, which is essential for filter consistency. A plane constraint is also introduced to enhance the position accuracy. A dynamic outlier detection method is adopted, leveraging the velocity state output. Through simulation and real-world tests, we validate the effectiveness of our approach, which outperforms the standard Multi-State Constraint Kalman Filter (MSCKF) based VIWO in consistency and accuracy.
|
|
08:30-10:10, Paper TuPO1S-22.7 | Add to My Program |
Observability-Aware Active Extrinsic Calibration of Multiple Sensors |
|
Xu, Shida | Heriot-Watt University |
Scharff Willners, Jonatan | Heriot-Watt University |
Hong, Ziyang | Heriot-Watt University |
Zhang, Kaicheng | Heriot-Watt University |
Petillot, Yvan R. | Heriot-Watt University |
Wang, Sen | Imperial College London |
Keywords: Sensor Fusion, SLAM
Abstract: The extrinsic parameters play a crucial role in multi-sensor fusion, such as visual-inertial Simultaneous Localization and Mapping (SLAM), as they enable the accurate alignment and integration of measurements from different sensors. However, extrinsic calibration is challenging in scenarios such as underwater, where in-view structures are scarce and/or visibility is limited, causing incorrect extrinsic calibration due to insufficient motion in all degrees of freedom. In this paper, we propose an entropy-based active extrinsic calibration algorithm which leverages observability analysis and information entropy to enhance the accuracy and reliability of extrinsic calibration. It determines the system observability numerically using singular value decomposition (SVD) of the Fisher Information Matrix (FIM). Furthermore, when the calibration parameters are not fully observable, our method actively searches for the best next motion to recover the system's observability via entropy-based optimisation. Experimental results on synthetic data, in simulation, and on a real underwater vehicle verify that the proposed method is able to avoid or reduce calibration failure while improving calibration accuracy and reliability.
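The numerical observability test is straightforward to sketch. Given a stacked measurement Jacobian with respect to the calibration parameters, the Fisher Information Matrix is formed and its singular values inspected; the noise level and condition-number threshold below are illustrative assumptions.

    import numpy as np

    def observability_check(J, sigma=0.05, cond_tol=1e4):
        # J: (M, P) Jacobian of M measurements w.r.t. P extrinsic parameters.
        fim = J.T @ J / sigma**2                 # Fisher Information Matrix
        s = np.linalg.svd(fim, compute_uv=False)
        fully_observable = s[-1] > 0 and s[0] / s[-1] < cond_tol
        return fully_observable, s               # small s[-1]: degenerate motion

    # Degenerate example: a zero column makes one parameter unobservable.
    J = np.hstack([np.random.default_rng(4).standard_normal((30, 5)),
                   np.zeros((30, 1))])
    print(observability_check(J))                # (False, ...)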
|
|
08:30-10:10, Paper TuPO1S-22.8 | Add to My Program |
Learning Continuous Control Policies for Information-Theoretic Active Perception |
|
Yang, Pengzhi | University of Electronic Science and Technology of China |
Liu, Yuhan | University of California, San Diego |
Koga, Shumon | University of California San Diego |
Asgharivaskasi, Arash | University of California, San Diego |
Atanasov, Nikolay | University of California, San Diego |
Keywords: Sensor-based Control, View Planning for SLAM, Reinforcement Learning
Abstract: This paper proposes a method for learning continuous control policies for active landmark localization and exploration using an information-theoretic cost. We consider a mobile robot detecting landmarks within a limited sensing range and tackle the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations. We employ a Kalman filter to convert the partially observable problem over the landmark state into a Markov decision process (MDP), a differentiable field of view to shape the reward, and an attention-based neural network to represent the control policy. The approach is further unified with active volumetric mapping to promote exploration in addition to landmark localization. The performance is demonstrated in several simulated landmark localization tasks in comparison with benchmark methods.
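For Gaussian beliefs the information-theoretic reward has a closed form: the mutual information between the landmark state and an observation equals half the drop in covariance log-determinant across the Kalman update. A sketch with an identity observation model (H = I) for brevity; this is the standard Gaussian identity, not the paper's exact reward shaping.

    import numpy as np

    def kf_update_cov(P, R):
        # Covariance-only Kalman update with H = I and noise covariance R.
        K = P @ np.linalg.inv(P + R)
        return (np.eye(len(P)) - K) @ P

    def mi_reward(P_prior, P_post):
        # I(x; z) = 0.5 * (log det P_prior - log det P_post) for Gaussians.
        return 0.5 * (np.linalg.slogdet(P_prior)[1]
                      - np.linalg.slogdet(P_post)[1])

    P = 4.0 * np.eye(2)
    print(mi_reward(P, kf_update_cov(P, 1.0 * np.eye(2))))  # nats gained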
|
|
08:30-10:10, Paper TuPO1S-22.9 | Add to My Program |
Structure PLP-SLAM: Efficient Sparse Mapping and Localization Using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras |
|
Shu, Fangwen | DFKI |
Wang, Jiaxuan | DFKI |
Pagani, Alain | German Research Center for Artificial Intelligence |
Stricker, Didier | German Research Center for Artificial Intelligence |
Keywords: SLAM
Abstract: This paper presents a visual SLAM system that uses both points and lines for robust camera localization, and simultaneously performs a piece-wise planar reconstruction (PPR) of the environment to provide a structural map in real-time. One of the biggest challenges in parallel tracking and mapping with a monocular camera is to keep the scale consistent when reconstructing the geometric primitives. This further introduces difficulties in graph optimization of the bundle adjustment (BA) step. We solve these problems by proposing several run-time optimizations on the reconstructed lines and planes. Our system is able to run with depth and stereo sensors in addition to the monocular setting. Our proposed SLAM tightly incorporates the semantic and geometric features to boost both frontend pose tracking and backend map optimization. We evaluate our system exhaustively on various datasets, and show that we outperform state-of-the-art methods in terms of trajectory precision. The code of PLP-SLAM has been made available in open-source for the research community (https://github.com/PeterFWS/Structure-PLP-SLAM).
|
|
08:30-10:10, Paper TuPO1S-22.10 | Add to My Program |
Rotation Synchronization Via Deep Matrix Factorization |
|
Gk, Tejus | Indian Institute of Technology (ISM) Dhanbad |
Zara, Giacomo | University of Trento |
Rota, Paolo | University of Trento |
Fusiello, Andrea | University of Udine |
Ricci, Elisa | University of Trento |
Arrigoni, Federica | Politecnico Di Milano |
Keywords: SLAM
Abstract: In this paper we address the rotation synchronization problem, where the objective is to recover absolute rotations starting from pairwise ones; the unknowns and the measurements are represented as nodes and edges of a graph, respectively. This problem is an essential task for structure from motion and simultaneous localization and mapping. We focus on the formulation of synchronization via neural networks, which has only recently begun to be explored in the literature. Inspired by deep matrix completion, we express rotation synchronization in terms of matrix factorization with a deep neural network. Our formulation exhibits implicit regularization properties and, more importantly, is unsupervised, whereas previous deep approaches are supervised. Our experiments show that we achieve comparable accuracy to the closest competitors in most scenes, while working under weaker assumptions.
|
|
08:30-10:10, Paper TuPO1S-22.11 | Add to My Program |
Object-Based SLAM Utilizing Unambiguous Pose Parameters Considering General Symmetry Types |
|
Lee, Taekbeom | Seoul National University |
Jang, Youngseok | Seoul National University |
Kim, H. Jin | Seoul National University |
Keywords: SLAM, Semantic Scene Understanding, RGB-D Perception
Abstract: Existence of symmetric objects, whose observation at different viewpoints can be identical, can deteriorate the performance of simultaneous localization and mapping (SLAM). This work proposes a system for robustly optimizing the pose of cameras and objects even in the presence of symmetric objects. We classify objects into three categories depending on their symmetry characteristics, which is efficient and effective in that it allows to deal with general objects and the objects in the same category can be associated with the same type of ambiguity. Then we extract only the unambiguous parameters corresponding to each category and use them in data association and joint optimization of the camera and object pose. The proposed approach provides significant robustness to the SLAM performance by removing the ambiguous parameters and utilizing as much useful geometric information as possible. Comparison with baseline algorithms confirms the superior performance of the proposed system in terms of object tracking and pose estimation, even in challenging scenarios where the baseline fails.
|
|
08:30-10:10, Paper TuPO1S-22.12 | Add to My Program |
Towards View-Invariant and Accurate Loop Detection Based on Scene Graph |
|
Liu, Chuhao | Hong Kong University of Science and Technology |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: SLAM, Visual-Inertial SLAM
Abstract: Loop detection plays a key role in visual Simultaneous Localization and Mapping (SLAM) by correcting the accumulated pose drift. In indoor scenarios, the richly distributed semantic landmarks are viewpoint-invariant and hold strong descriptive power in loop detection. Current semantic-aided loop detection methods embed the topology between semantic instances to search for loops, but they struggle with ambiguous semantic instances and drastic viewpoint differences, which are not fully addressed in the literature. This paper introduces a novel loop detection method based on an incrementally created scene graph, targeting visual SLAM in indoor scenes. It jointly considers the macro-view topology, micro-view topology, and occupancy of semantic instances to find correct correspondences. Experiments using handheld RGB-D sequences show that our method is able to accurately detect loops under drastically changed viewpoints. It maintains high precision even when observing objects with similar topology and appearance. Our method is also robust to changed indoor scenes.
|
|
TuBT1 Oral Session, ICC Cap Suite 7-9 |
Add to My Program |
SLAM 2 |
|
|
Chair: Nieto, Juan | Microsoft |
Co-Chair: Solà, Joan | Institut De Robòtica I Informàtica Industrial |
|
15:00-15:10, Paper TuBT1.1 | Add to My Program |
ViViD++: Vision for Visibility Dataset |
|
Lee, Alex | Hyundai Motor Company |
Cho, Younggun | Inha University |
Shin, Young-Sik | KIMM |
Kim, Ayoung | Seoul National University |
Myung, Hyun | KAIST (Korea Advanced Institute of Science and Technology) |
Keywords: Data Sets for SLAM, Data Sets for Robotic Vision, Data Sets for Robot Learning
Abstract: In this paper, we present a dataset capturing diverse visual data formats that target varying luminance conditions. While RGB cameras provide rich and intuitive information, changes in lighting conditions can result in catastrophic failure for robotic applications based on vision sensors. Approaches to overcoming illumination problems have included developing more robust algorithms or other types of visual sensors, such as thermal and event cameras. Despite the alternative sensors' potential, there are still few datasets with alternative vision sensors. Thus, we provide a dataset recorded from alternative vision sensors, handheld or mounted on a car, repeatedly in the same space but under different conditions. We aim to acquire visible information from co-aligned alternative vision sensors. Our sensor system collects data more independently of visible light intensity by measuring the amount of infrared dissipation, depth by structured reflection, and instantaneous temporal changes in luminance. We provide these measurements along with inertial sensors and ground truth for developing robust visual SLAM under poor illumination. The full dataset is available at: https://visibilitydataset.github.io/
|
|
15:10-15:20, Paper TuBT1.2 | Add to My Program |
CamMap: Extrinsic Calibration of Non-Overlapping Cameras Based on SLAM Map Alignment |
|
Xu, Jie | Harbin Institute of Technology |
Li, Ruifeng | Harbin Institute of Technology |
Zhao, Lijun | Harbin Institute of Technology |
Yu, Wenlu | Harbin Institute of Technology |
Liu, Zhiheng | Harbin Institute of Technology |
Zhang, Bo | Harbin Institute of Technology |
Li, Yuchen | Harbin Institute of Technology |
Keywords: Calibration and Identification, SLAM
Abstract: Multiple cameras have emerged as a promising technology for robots and vehicles due to their broad fields of view (FoV) and high resolution. However, there are often limited or no overlapping FoVs among the cameras, which makes estimating extrinsic camera parameters challenging. To overcome this problem, we propose CamMap: a novel 6-degree-of-freedom (DoF) extrinsic calibration pipeline. Following three operating rules, we make a multi-camera rig capture some similar image sequences individually to create sparse feature-based maps with a SLAM system. A two-stage optimization problem is formulated to align the maps and obtain the transformations between them based on bidirectional reprojection. These transformations are exactly the extrinsic parameters. Supporting diverse camera types, the pipeline can be used in any texture-rich environment. It can calibrate any number of cameras simultaneously without requiring calibration patterns, synchronization, or identical resolutions and frame rates. The pipeline is evaluated on cameras with limited and no overlapping FoVs. In the experiments, we demonstrate our method's accuracy and efficiency. The absolute pose error (APE) between Kalibr and CamMap is less than 0.025. We make the source code public at github.com/jiejie567/SlamForCalib.
|
|
15:20-15:30, Paper TuBT1.3 | Add to My Program |
Hybrid Visual SLAM for Underwater Vehicle Manipulator Systems |
|
Billings, Gideon | University of Sydney, Australian Center for Field Robotics |
Camilli, Richard | Woods Hole Oceanographic Institution |
Johnson-Roberson, Matthew | University of Michigan |
Keywords: Marine Robotics, SLAM, Sensor Fusion
Abstract: This paper presents a novel visual feature based scene mapping method for underwater vehicle manipulator systems (UVMSs), with specific emphasis on robust mapping in natural seafloor environments. Our method uses GPU accelerated SIFT features in a graph optimization framework to build a feature map. The map scale is constrained by features from a vehicle mounted stereo camera, and we exploit the dynamic positioning capability of the manipulator system by fusing features from a wrist mounted fisheye camera into the map to extend it beyond the limited viewpoint of the vehicle mounted cameras. Our hybrid SLAM method is evaluated on challenging image sequences collected with a UVMS in natural deep seafloor environments of the Costa Rican continental shelf margin, and we also evaluate the stereo only mode on a shallow reef survey dataset. Results on these datasets demonstrate the high accuracy of our system and suitability for operating in diverse and natural seafloor environments. We also contribute these datasets for public use.
|
|
15:30-15:40, Paper TuBT1.4 | Add to My Program |
WOLF: A Modular Estimation Framework for Robotics Based on Factor Graphs |
|
Solà, Joan | Institut De Robòtica I Informàtica Industrial |
Vallvé, Joan | CSIC-UPC |
Casals, Joaquim | Institut De Robòtica I Informàtica Industrial |
Deray, Jeremie | Institut De Robòtica I Informàtica Industrial, CSIC-UPC |
Fourmy, Mederic | LAAS, CNRS |
Atchuthan, Dinesh | EasyMile |
Corominas-Murtra, Andreu | Beta Robots SL |
Andrade-Cetto, Juan | CSIC-UPC |
Keywords: Software, Middleware and Programming Environments, SLAM, Sensor Fusion
Abstract: This paper introduces WOLF, a C++ estimation framework based on factor graphs and targeted at mobile robotics. WOLF can be used beyond SLAM to handle self-calibration, model identification, or the observation of dynamic quantities other than localization. The architecture of WOLF allows for a modular yet tightly-coupled estimator. Modularity is enhanced via reusable plugins that are loaded at runtime depending on the application setup. This setup is achieved conveniently through YAML files, allowing users to configure a wide range of applications without the need of writing or compiling code. Most procedures are coded as abstract algorithms in base classes with varying levels of specialization. Overall, all these assets allow for coherent processing and favor code reusability and scalability. WOLF can be used with ROS and is made publicly available and open to collaboration.
|
|
15:40-15:50, Paper TuBT1.5 | Add to My Program |
Point Cloud Change Detection with Stereo V-SLAM: Dataset, Metrics and Baseline |
|
Lin, Zihan | Tsinghua University |
Jincheng, Yu | Tsinghua University |
Zhou, Lipu | MeiTuan |
Zhang, Xudong | Tsinghua Univ |
Wang, Jian | Tsinghua Univ |
Wang, Yu | Tsinghua University |
Keywords: Data Sets for SLAM, SLAM, Mapping
Abstract: Localization and navigation are basic robotic tasks that require an accurate and up-to-date map; using crowdsourced data to detect map changes is an appealing solution. Collecting and processing crowdsourced data requires low-cost sensors and algorithms, but existing methods rely on expensive sensors or computationally expensive algorithms. Additionally, there is no existing dataset to evaluate point cloud change detection. Thus, this paper proposes a novel framework using low-cost sensors, such as stereo cameras and an IMU, to detect changes in a point cloud map. Moreover, we create a dataset and corresponding metrics to evaluate point cloud change detection with the help of the high-fidelity simulator Unreal Engine 4. Experiments show that our visual-based framework can effectively detect the changes in our dataset.
|
|
15:50-16:00, Paper TuBT1.6 | Add to My Program |
Hilti-Oxford Dataset: A Millimeter-Accurate Benchmark for Simultaneous Localization and Mapping |
|
Zhang, Lintong | University of Oxford |
Helmberger, Michael | HILTI AG |
Fu, Lanke Frank Tarimo | University of Oxford |
Wisth, David | University of Oxford |
Camurri, Marco | Free University of Bozen-Bolzano |
Scaramuzza, Davide | University of Zurich |
Fallon, Maurice | University of Oxford |
Keywords: Data Sets for SLAM, SLAM, Mapping
Abstract: Simultaneous Localization and Mapping (SLAM) is being deployed in real-world applications, however many state-of-the-art solutions still struggle in many common scenarios. A key necessity in progressing SLAM research is the availability of high-quality datasets and fair and transparent benchmarking. To this end, we have created the Hilti-Oxford Dataset, to push state-of-the-art SLAM systems to their limits. The dataset has a variety of challenges ranging from sparse and regular construction sites to a 17th-century neoclassical building with fine details and curved surfaces. To encourage multi-modal SLAM approaches, we designed a data collection platform featuring a lidar, five cameras, and an IMU. With the goal of benchmarking SLAM algorithms for tasks where accuracy and robustness are paramount, we implemented a novel ground truth collection method that enables our dataset to accurately measure SLAM pose errors with millimeter accuracy. To further ensure accuracy, the extrinsics of our platform were verified with a micrometer-accurate scanner, and temporal calibration was managed online using hardware time synchronization. The multi-modality and diversity of our dataset attracted a large field of academic and industrial researchers to enter the second edition of the Hilti SLAM challenge. The results of the challenge show that while the top three teams could achieve an accuracy of 2 cm or better for some sequences, the performance dropped off in more difficult sequences.
|
|
16:00-16:10, Paper TuBT1.7 | Add to My Program |
Long-Term Visual SLAM with Bayesian Persistence Filter Based Global Map Prediction (I) |
|
Deng, Tianchen | Shanghai Jiao Tong University |
Xie, Hongle | Shanghai Jiao Tong University |
Wang, Jingchuan | Shanghai Jiao Tong University |
Chen, Weidong | Shanghai Jiao Tong University |
Keywords: SLAM, Localization, Long term Interaction
Abstract: With the rapidly growing demand for accurate localization in real-world environments, visual SLAM has received significant attention in recent years. However, existing methods still suffer from degraded localization accuracy in long-term changing environments. To address this problem, we propose a novel long-term SLAM system with map prediction and dynamics removal. First, a visual point cloud matching algorithm is designed to efficiently fuse 2D pixel information and 3D voxel information. Second, each map point is classified as static, semi-static, or dynamic based on a Bayesian persistence filter, and the dynamic map points are removed to eliminate their influence. A global predicted map is then obtained by modeling the time series of the semi-static map points. Finally, we incorporate the predicted global map into a state-of-the-art SLAM method, yielding an efficient visual SLAM system for long-term dynamic environments. Extensive experiments are carried out on a wheelchair robot in an indoor environment over several months. The results demonstrate that our method has better map prediction accuracy and achieves more robust localization performance.
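As a heavily simplified sketch of persistence-style reasoning (not the paper's actual Bayesian persistence filter), the snippet below tracks the probability that a single map point still exists, combining an exponential survival prediction with Bayes updates from noisy re-detections. The death rate and detection probabilities are invented parameters:

    # Simplified persistence update for one map point: a hidden "still exists"
    # state with an exponential death rate, corrected by noisy re-detections.
    import math

    def predict(p_exist, dt, death_rate=0.01):
        """Survival prediction: an existing point dies with rate `death_rate`."""
        return p_exist * math.exp(-death_rate * dt)

    def update(p_exist, detected, p_miss=0.2, p_false=0.05):
        """Bayes update with assumed detection/false-alarm likelihoods."""
        like_exist = (1 - p_miss) if detected else p_miss
        like_gone = p_false if detected else (1 - p_false)
        num = like_exist * p_exist
        return num / (num + like_gone * (1 - p_exist))

    p = 0.9
    for dt, seen in [(1.0, True), (5.0, False), (5.0, False)]:
        p = update(predict(p, dt), seen)
        print(f"P(exists) = {p:.3f}")
    # Points could then be binned as static / semi-static / dynamic by thresholding p.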
|
|
16:10-16:20, Paper TuBT1.8 | Add to My Program |
Wheel-SLAM: Simultaneous Localization and Terrain Mapping Using One Wheel-Mounted IMU |
|
Wu, Yibin | University of Bonn |
Kuang, Jian | Wuhan University |
Niu, Xiaoji | Wuhan University |
Behley, Jens | University of Bonn |
Klingbeil, Lasse | University of Bonn |
Kuhlmann, Heiner | University of Bonn |
Keywords: SLAM, Localization
Abstract: A pose estimator that is robust to environmental disturbances is desirable for mobile robots. To this end, inertial measurement units (IMUs) play an important role because they can perceive the full motion state of the vehicle independently. However, an IMU suffers from accumulated error due to inherent noise and bias instability, especially for low-cost sensors. In our previous work on Wheel-INS, we proposed to limit the error drift of the pure inertial navigation system (INS) by mounting an IMU on the wheel of the robot to take advantage of rotation modulation. The system still drifted over long periods, however, due to the lack of external correction signals. In this letter, we propose to exploit the environmental perception ability of Wheel-INS to achieve simultaneous localization and mapping (SLAM) with only one IMU. Specifically, we use the road bank angles (mirrored by the robot roll angles estimated by Wheel-INS) as terrain features to enable loop closure with a Rao-Blackwellized particle filter. The road bank angle is sampled and stored, indexed by robot position, in the grid maps maintained by the particles. The particle weights are updated according to the difference between the currently estimated roll sequence and the terrain map. Field experiments confirm the feasibility of performing SLAM in Wheel-INS using the robot roll angle estimates. In addition, positioning accuracy is improved significantly (by more than 30%) over Wheel-INS.
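The particle weighting idea can be sketched as follows: each particle retrieves the roll sequence stored in its terrain map at its estimated position and is re-weighted by how well that sequence matches the currently estimated one. The Gaussian likelihood and its parameters below are illustrative assumptions, not the paper's implementation:

    # Sketch of roll-angle-based particle re-weighting for terrain loop closure.
    import numpy as np

    def update_weights(weights, roll_obs, roll_maps, sigma=0.5):
        """roll_obs: (T,) current roll sequence from the wheel IMU [deg].
        roll_maps: (N, T) roll sequences retrieved from each particle's map."""
        err = roll_maps - roll_obs                      # (N, T) residuals
        loglik = -0.5 * np.sum((err / sigma) ** 2, 1)   # Gaussian log-likelihood
        w = weights * np.exp(loglik - loglik.max())     # numerically stabilized
        return w / w.sum()

    rng = np.random.default_rng(3)
    obs = rng.normal(0, 1, size=20)
    maps = obs + rng.normal(0, [[0.1], [1.0], [3.0]], size=(3, 20))  # 3 particles
    print(update_weights(np.ones(3) / 3, obs, maps))

The particle whose stored terrain best explains the observed roll sequence dominates after normalization, which is what closes the loop in this setting.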
|
|
16:20-16:30, Paper TuBT1.9 | Add to My Program |
Maplab 2.0 - A Modular and Multi-Modal Mapping Framework |
|
Cramariuc, Andrei | ETHZ |
Bernreiter, Lukas | ETH Zurich, Autonomous Systems Lab |
Tschopp, Florian | Arrival Ltd |
Fehr, Marius | Voliro AG |
Reijgwart, Victor | ETH Zurich |
Nieto, Juan | Microsoft |
Siegwart, Roland | ETH Zurich |
Cadena Lerma, Cesar | ETH Zurich |
Keywords: SLAM, Mapping, Multi-Robot SLAM
Abstract: The integration of multiple sensor modalities and of deep learning into Simultaneous Localization And Mapping (SLAM) systems is an area of significant interest in current research. Multi-modality is a stepping stone towards robustness in challenging environments and towards interoperability of heterogeneous multi-robot systems with varying sensor setups. With maplab 2.0, we provide a versatile open-source platform that facilitates developing, testing, and integrating new modules and features into a fully-fledged SLAM system. Through extensive experiments, we show that maplab 2.0's accuracy is comparable to the state of the art on the HILTI 2021 benchmark. Additionally, we showcase the flexibility of our system with three use cases: i) large-scale (approx. 10 km) multi-robot, multi-session (23 missions) mapping; ii) integration of non-visual landmarks; and iii) incorporation of a semantic object-based loop closure module into the mapping framework. The code is available open-source at https://github.com/ethz-asl/maplab.
|
|
TuBT2 Oral Session, Theatre 1 |
Add to My Program |
Modeling, Control, and Learning for Soft Robots |
|
|
Chair: De Luca, Alessandro | Sapienza University of Rome |
Co-Chair: Boyer, Frédéric | IMT Atlantique |
|
15:00-15:10, Paper TuBT2.1 | Add to My Program |
Simulation Data Driven Design Optimization for Reconfigurable Soft Gripper System |
|
Liu, Jun | Institute of High Performance Computing |
Low, Jin Huat | National University of Singapore |
Han, Qian Qian | National University of Singapore |
Lim, Marisa | National University of Singapore |
Lu, Dingjie | Institute of High Performance Computing, A*STAR |
Li, Yangfan | Institute of High Performance Computing, A*STAR |
Yeow, Chen-Hua | National University of Singapore |
Liu, ZhuangJian | Institute of High Performance Computing |
Keywords: Optimization and Optimal Control, Modeling, Control, and Learning for Soft Robots, Grasping
Abstract: In soft gripper design, most choices, such as the gripping width and the finger actuator design, are based purely on experience and repeated trial-and-error. In most scenarios, the resulting actuators do not achieve the best grasping performance attainable for a given design type. An optimized design is especially important for food grasping applications, where even a minor improvement in grasping capability helps increase the grasping success ratio, particularly during high-speed pick-and-place tasks. This motivates us to develop a design optimization framework focused on achieving optimized grasping performance through multi-objective design optimization. In this work, a simulation-aided, data-driven optimization framework for guiding the design of a reconfigurable soft gripper system is presented. To enable effective optimization, a simulation model is developed on the Simulation Open Framework Architecture (SOFA) platform; it predicts bending and grasping behavior under actuation and external loading. This model is then used in a data-driven framework for optimizing the actuator design. An artificial neural network (ANN) is trained on the simulation results and used as a surrogate model in a multi-objective optimization framework to achieve optimal grasping capability under design constraints. This simulation and optimization capability can significantly reduce the trial-and-error effort in the design process.
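A minimal sketch of the surrogate-based loop, under invented design variables and toy objectives (not the paper's SOFA model or gripper parameters): fit an ANN to simulation samples, then search candidate designs with a weighted-sum scalarization as a simple multi-objective proxy:

    # Surrogate-based design optimization sketch: ANN fit on "simulation" data,
    # then a weighted-sum search over candidate designs.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(4)
    X = rng.uniform(0, 1, size=(500, 2))             # design: [wall thickness, chamber length]
    grip_force = 3 * X[:, 0] + np.sin(3 * X[:, 1])   # toy simulation outputs
    bend_angle = 2 * X[:, 1] - X[:, 0] ** 2
    Y = np.stack([grip_force, bend_angle], axis=1)

    surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                             random_state=0).fit(X, Y)

    # Weighted-sum scalarization as a simple multi-objective stand-in.
    cand = rng.uniform(0, 1, size=(5000, 2))
    pred = surrogate.predict(cand)
    score = 0.7 * pred[:, 0] + 0.3 * pred[:, 1]
    best = cand[np.argmax(score)]
    print("best design (thickness, length):", np.round(best, 3))

Evaluating thousands of candidates against the cheap surrogate, rather than the full simulation, is what makes the search tractable; only promising designs would then be re-verified in SOFA.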
|
|
15:10-15:20, Paper TuBT2.2 | Add to My Program |
Research on Design and Experiment of a Wearable Hand Rehabilitation Device Driven by Fiber-Reinforced Soft Actuator |
|
Ma, Kaiwei | Nanjing University of Posts and Telecommunications |
Jiang, Zhenjiang | Nanjing University of Posts and Telecommunications |
Gao, Shuang | Nanjing University of Posts and Telecommunications |
Jiang, Guoping | Nanjing University of Posts and Telecommunications |
Xu, Fengyu | Southeast University |
Keywords: Rehabilitation Robotics, Soft Sensors and Actuators, Modeling, Control, and Learning for Soft Robots
Abstract: Fiber-reinforced soft actuators have great potential for the development of wearable technology. However, their complex structural design, nonlinear soft material bodies, fluid-driven dynamics, and high manufacturing costs pose huge challenges for system modeling, control, and application. To improve this situation, a novel fiber-reinforced soft actuator is designed and analyzed. First, a wearable hand rehabilitation device based on fiber-reinforced soft actuators with a three-air-chamber structure is designed. Next, using the Yeoh model and the principle of virtual work, we establish a bending model of the soft actuator whose input parameters are the air pressure P and the fiber winding number N, and whose output parameter is the bending angle b. Finally, through finite element analysis, the optimal N is obtained and the correctness of the model is verified. An experimental platform is constructed to validate this work. The results show that the relative error of the model is within an acceptable range. The device can imitate common gestures and easily grasp objects with a volume of 1.6 dm3 and a mass of 335.7 g, enabling hand rehabilitation training.
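For reference, the Yeoh strain energy density used for such hyperelastic soft bodies has the standard third-order form below; the accompanying virtual-work balance is only a schematic of how a pressure-to-bending relation b(P, N) can be obtained, not the paper's exact derivation:

    W = \sum_{i=1}^{3} C_{i0}\,\bigl(\bar{I}_1 - 3\bigr)^{i},
    \qquad
    P\,\frac{\partial V}{\partial b} = \frac{\partial U_{\mathrm{strain}}}{\partial b},

where \bar{I}_1 is the first deviatoric strain invariant, C_{i0} are material constants fitted to the elastomer, V is the pressurized chamber volume, and U_{\mathrm{strain}} is the Yeoh energy integrated over the actuator body (with the fiber winding entering through the kinematic constraint it imposes).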
|
|
15:20-15:30, Paper TuBT2.3 | Add to My Program |
DNN-Based Predictive Model for a Batoid-Inspired Soft Robot |
|
Li, Guangtong | Singapore University of Technology and Design |
Stalin, Thileepan | Singapore University of Technology and Design |
Van Tien, Truong | Singapore University of Technology and Design |
Valdivia y Alvarado, Pablo | Singapore University of Technology and Design, MIT |
Keywords: Modeling, Control, and Learning for Soft Robots, Machine Learning for Robot Control, Deep Learning Methods
Abstract: Soft robots have a unique potential to harness advanced functionalities through materials engineering, chemistry, and advanced fabrication. However, modeling and control of soft robot bodies is challenging due to nonlinearities and time dependence in the materials' physico-chemical properties. With the rapid development of artificial intelligence technologies, deep neural networks (DNNs) have become an essential tool for exploring the relationships between inputs and outputs of challenging systems under complex environmental conditions. In this work, rather than physically modeling a soft robotic system, we treat the entire system, including its environment, as a complex but deterministic input-output system and use DNNs to estimate these relationships. As an application example, our training results show that DNNs can accurately simulate the physical properties of an underwater bio-inspired soft robot. Validation experiments show that measured propulsive forces are in good agreement with the target values predicted by the DNN.
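A toy version of this black-box input-output treatment might look like the sketch below; the architecture, the input variables, and the synthetic force function are all assumptions for illustration, not the paper's setup:

    # Learn a black-box map from actuation parameters to propulsive force,
    # treating robot plus environment as one deterministic input-output system.
    import numpy as np
    import torch
    import torch.nn as nn

    rng = np.random.default_rng(5)
    X = rng.uniform(0, 1, size=(1024, 3)).astype(np.float32)  # [freq, amplitude, phase]
    y = (2 * X[:, 0] * X[:, 1] + 0.1 * np.sin(6 * X[:, 2])).astype(np.float32)  # toy force

    model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    xb, yb = torch.from_numpy(X), torch.from_numpy(y).unsqueeze(1)
    for _ in range(500):                      # simple full-batch training loop
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(xb), yb)
        loss.backward()
        opt.step()
    print(f"final MSE: {loss.item():.4f}")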
Abstract: Soft robots have a unique potential to harness advanced functionalities through materials engineering, chemistry, and advanced fabrication. However, modeling and control of soft robot bodies is challenging due to non-linearities and time-dependencies of materials physico-chemical properties. With the rapid development of artificial intelligence technologies, deep neural networks (DNN) have become an essential tool for exploring the relationships between inputs and outputs of challenging systems under complex environmental conditions. In this work, rather than physically modeling a soft robotic system, we treat the entire system, including its environment, as a complex but deterministic input-output system, and we use DNNs to estimate these relationships. As an application example, our training results show that DNNs can accurately simulate the physical properties of an underwater bio-inspired soft robot. Validation experiments show that measured propulsive forces are in good agreement with target values predicted by DNN | |