Last updated on October 19, 2022. This conference program is tentative and subject to change.
Technical Program for Wednesday, October 26, 2022
|
WeA-1
Rm1 (Room A)
Special Session: Computational Advances in Human-Robot Interaction 1 |
Regular session |
Chair: Bagchi, Shelly | National Institute of Standards and Technology |
Co-Chair: Han, Zhao | Colorado School of Mines |
|
10:00-10:10, Paper WeA-1.1
Coordination with Humans Via Strategy Matching |
|
Zhao, Michelle | Carnegie Mellon University |
Simmons, Reid | Carnegie Mellon University |
Admoni, Henny | Carnegie Mellon University |
Keywords: Human-Robot Collaboration, Human-Robot Teaming
Abstract: Human and robot partners increasingly need to work together to perform tasks as a team. Robots designed for such collaboration must reason about how their task-completion strategies interplay with the behavior and skills of their human team members as they coordinate on achieving joint goals. Our goal in this work is to develop a computational framework for robot adaptation to human partners in human-robot team collaborations. We first present an algorithm for autonomously recognizing available task-completion strategies by observing human-human teams performing a collaborative task. By transforming team actions into low-dimensional representations using hidden Markov models, we can identify strategies without prior knowledge. Robot policies are learned on each of the identified strategies to construct a Mixture-of-Experts model that adapts to the task strategies of unseen human partners. We evaluate our model on a collaborative cooking task using an Overcooked simulator. Results of an online user study with 125 participants demonstrate that our framework improves the task performance and collaborative fluency of human-agent teams, as compared to state-of-the-art reinforcement learning methods.
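For a concrete picture of the run-time half of this pipeline, here is a minimal Mixture-of-Experts gating sketch: per-strategy likelihood scores of the partner's observed history (e.g., from one HMM per identified strategy) weight a set of expert policies. All names, dimensions, and the toy experts below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the paper's components: one "expert" policy per
# identified strategy, each mapping a state to action logits.
def make_expert(seed, state_dim=6, n_actions=4):
    W = np.random.default_rng(seed).normal(size=(state_dim, n_actions))
    return lambda s: s @ W

experts = [make_expert(k) for k in range(3)]  # one expert per latent strategy

def gate_weights(loglik):
    """Softmax over per-strategy log-likelihoods of the partner's observed
    history (e.g., scores from per-strategy HMMs)."""
    z = loglik - loglik.max()
    w = np.exp(z)
    return w / w.sum()

def moe_action(state, loglik):
    """Mixture-of-Experts action: weight each expert's logits by the belief
    that the human partner is following the corresponding strategy."""
    w = gate_weights(loglik)
    logits = sum(wk * expert(state) for wk, expert in zip(w, experts))
    return int(np.argmax(logits)), w

state = rng.normal(size=6)
loglik = np.array([-12.3, -9.1, -15.8])  # toy per-strategy HMM scores
action, belief = moe_action(state, loglik)
print("belief over strategies:", belief.round(3), "-> action", action)
```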
|
|
10:10-10:20, Paper WeA-1.2
DULA and DEBA: Differentiable Ergonomic Risk Models for Postural Assessment and Optimization in Ergonomically Intelligent PHRI |
|
Yazdani, Amir | University of Utah |
Sabbagh Novin, Roya | University of Utah |
Merryweather, Andrew | University of Utah |
Hermans, Tucker | University of Utah |
Keywords: Physical Human-Robot Interaction, Safety in HRI, Human-Aware Motion Planning
Abstract: Ergonomics and human comfort are essential concerns in physical human-robot interaction applications. Defining an accurate and easy-to-use ergonomic assessment model is an important step in providing feedback for postural correction to improve operator health and comfort. Common practical methods in the area suffer from inaccurate ergonomics models when performing postural optimization. To retain assessment quality while improving computational efficiency, we propose a novel framework for postural assessment and optimization for ergonomically intelligent physical human-robot interaction. We introduce DULA and DEBA, differentiable and continuous ergonomics models learned to replicate the popular and scientifically validated RULA and REBA assessments with more than 99% accuracy. We show that DULA and DEBA provide assessments comparable to RULA and REBA while offering computational benefits when used in postural optimization. We evaluate our framework through human and simulation experiments. We highlight DULA and DEBA's strength in a demonstration of postural optimization for a simulated pHRI task.
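The payoff of a differentiable surrogate such as DULA is that posture can then be improved by gradient descent on the predicted risk score, which the table-based RULA does not permit. A minimal PyTorch sketch of that optimization step, assuming a surrogate network mapping joint angles to a scalar score; the architecture, dimensions, and the regularizer weight are illustrative, and a real surrogate would first be trained against RULA labels.

```python
import torch
import torch.nn as nn

# Illustrative surrogate: joint angles -> scalar ergonomic risk score.
# In the paper's setting this is trained to >99% agreement with RULA;
# here it is randomly initialized just to show the optimization mechanics.
score_net = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

posture = torch.zeros(1, 10, requires_grad=True)  # current joint angles (rad)
optimizer = torch.optim.Adam([posture], lr=0.05)

for step in range(100):
    optimizer.zero_grad()
    risk = score_net(posture).squeeze()
    # Penalize drifting far from the current posture so corrections stay small.
    loss = risk + 0.1 * (posture ** 2).sum()
    loss.backward()
    optimizer.step()

print("optimized posture:", posture.detach().numpy().round(3))
```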
|
|
10:20-10:30, Paper WeA-1.3
Towards Inclusive HRI: Using Sim2Real to Address Underrepresentation in Emotion Expression Recognition |
|
Akhyani, Saba | Simon Fraser University |
Abbasi Boroujeni, Mehryar | Simon Fraser University |
Chen, Mo | Simon Fraser University |
Lim, Angelica | Simon Fraser University |
Keywords: Gesture, Posture and Facial Expressions, Social HRI, Simulation and Animation
Abstract: Robots and artificial agents that interact with humans should be able to do so without bias and inequity, but facial perception systems have notoriously been found to work more poorly for certain groups of people than others. In our work, we aim to build a system that can perceive humans in a more transparent and inclusive manner. Specifically, we focus on dynamic expressions on the human face, which are difficult to collect for a broad set of people due to privacy concerns and the fact that faces are inherently identifiable. Furthermore, datasets collected from the Internet are not necessarily representative of the general population. We address this problem by offering a Sim2Real approach in which we use a suite of 3D simulated human models that enables us to create an auditable synthetic dataset covering 1) underrepresented facial expressions, outside of the six basic emotions, such as confusion; 2) ethnic or gender minority groups; and 3) a wide range of viewing angles at which a robot may encounter a human in the real world. By augmenting a small dynamic emotional expression dataset containing 123 samples with a synthetic dataset containing 4536 samples, we achieved an improvement in accuracy of 15% on our own dataset and 11% on an external benchmark dataset, compared to the performance of the same model architecture without synthetic training data. We also show that this additional step improves accuracy specifically for racial minorities when the architecture's feature extraction weights are trained from scratch.
|
|
10:30-10:40, Paper WeA-1.4
Reasoning about Counterfactuals to Improve Human Inverse Reinforcement Learning |
|
Lee, Michael | Carnegie Mellon University |
Admoni, Henny | Carnegie Mellon University |
Simmons, Reid | Carnegie Mellon University |
Keywords: Human-Robot Collaboration
Abstract: To collaborate well with robots, we must be able to understand their decision making. Humans naturally infer other agents' beliefs and desires by reasoning about their observable behavior in a way that resembles inverse reinforcement learning (IRL). Thus, robots can convey their beliefs and desires by providing demonstrations that are informative for a human learner's IRL. An informative demonstration is one that differs strongly from the learner's expectations of what the robot will do given their current understanding of the robot's decision making. However, standard IRL does not model the learner's existing expectations, and thus cannot do this counterfactual reasoning. We propose to incorporate the learner's current understanding of the robot's decision making into our model of human IRL, so that a robot can select demonstrations that maximize the human's understanding. We also propose a novel measure for estimating the difficulty for a human to predict instances of a robot's behavior in unseen environments. A user study finds that our test difficulty measure correlates well with human performance and confidence. Interestingly, considering human beliefs and counterfactuals when selecting demonstrations decreases human performance on easy tests, but increases performance on difficult tests, providing insight on how to best utilize such models.
|
|
10:40-10:50, Paper WeA-1.5
Proactive Robotic Assistance Via Theory of Mind |
|
Shvo, Maayan | University of Toronto |
Hari, Ruthrash | University of Toronto |
O'Reilly, Ziggy | Istituto Italiano Di Tecnologia; Universita Di Torino |
Abolore, Sophia | University of Toronto |
Wang, Sze Yuh Nina | University of Toronto |
McIlraith, Sheila | University of Toronto |
Keywords: AI-Based Methods, AI-Enabled Robotics, Human-Centered Robotics
Abstract: Advanced social cognitive skills enhance the effectiveness of human-robot interactions. Research shows that an important precursor to the development of these abilities in humans is Theory of Mind (ToM) -- the ability to attribute mental states to oneself and to others. In this work, we endow robots with ToM abilities and propose a ToM-based approach to proactive robotic assistance by appealing to epistemic planning techniques. Our evaluation shows that robots implementing our approach and demonstrating ToM are measurably more helpful and perceived by humans as more socially intelligent compared to robots with a deficit in ToM.
|
|
10:50-11:00, Paper WeA-1.6
A Novel Perceptive Robotic Cane with Haptic Navigation for Enabling Vision-Independent Participation in the Social Dynamics of Seat Choice |
|
Agrawal, Shivendra | University of Colorado Boulder |
West, Mary | University of Colorado Boulder |
Hayes, Bradley | University of Colorado Boulder |
Keywords: Multi-Modal Perception for HRI, Vision-Based Navigation
Abstract: Goal-based navigation in public places is critical for independent mobility and for breaking the boundaries that exist for blind or visually impaired (BVI) people in our sight-centric society. Through this work, we present a proof-of-concept system that can autonomously find socially preferred seats and safely guide its user towards them in unknown indoor environments. The robotic system includes a camera, an IMU, vibrational motors, and a white cane, powered via a backpack-mounted laptop. The system combines techniques from computer vision, robotics, and motion planning with insights from psychology to perform 1) SLAM and object detection, 2) goal disambiguation and scoring, and 3) path planning and guidance. We introduce a novel 2-motor haptic feedback system on the cane’s grip for navigation assistance. Through a pilot user study, we show that the system is successful in autonomously classifying and providing haptic navigation guidance to socially preferred seats, while optimizing for users’ convenience, privacy, and intimacy in addition to increasing their confidence in independent navigation. The implications are encouraging: this technology, with careful design guided by the BVI community, can be adopted and further developed for use with medical devices, enabling the BVI population to engage more independently in socially dynamic situations like seat choice.
|
|
11:00-11:10, Paper WeA-1.7
SESNO: Sample Efficient Social Navigation from Observation |
|
Hamed Baghi, Bobak | McGill University |
Konar, Abhisek | McGill University |
Hogan, Francois | Massachusetts Institute of Technology |
Jenkin, Michael | York University |
Dudek, Gregory | McGill University |
Keywords: Imitation Learning, Human-Aware Motion Planning, Learning from Experience
Abstract: In this paper, we present the Sample Efficient Social Navigation from Observation (SESNO) algorithm, which efficiently learns socially-compliant navigation policies from observations of human trajectories. SESNO is an inverse reinforcement learning (IRL)-based algorithm that learns from human trajectory observations without knowledge of their actions. We improve sample efficiency over previous IRL-based methods by introducing a shared experience replay buffer that allows reuse of past trajectory experiences to estimate the policy and the reward. We evaluate SESNO using publicly available pedestrian motion data sets and compare its performance to related baseline methods in the literature. We show that SESNO yields performance superior to existing baselines while dramatically improving sample complexity, using as little as one-hundredth of the samples required by existing baselines.
|
|
11:10-11:20, Paper WeA-1.8
Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots |
|
Saran, Akanksha | University of Texas at Austin |
Desai, Kush | University of Texas at Austin |
Chang, Mai Lee | University of Texas at Austin |
Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Thomaz, Andrea Lockerd | University of Texas at Austin |
Niekum, Scott | University of Texas at Austin |
Keywords: Human-Centered Robotics, Learning from Demonstration
Abstract: Humans use audio signals in the form of spoken language or verbal reactions effectively when teaching new skills or tasks to other humans. While demonstrations allow humans to teach robots in a natural way, learning from trajectories alone does not leverage other available modalities including audio from human teachers. To effectively utilize audio cues accompanying human demonstrations, first it is important to understand what kind of information is present and conveyed by such cues. This work characterizes audio from human teachers demonstrating multi-step manipulation tasks to a situated Sawyer robot along three dimensions: (1) duration of speech used, (2) expressiveness in speech or prosody, and (3) semantic content of speech. We analyze these features for four different independent variables and find that teachers convey similar semantic content via spoken words for different conditions of (1) demonstration types, (2) audio usage instructions, (3) subtasks, and (4) errors during demonstrations. However, differentiating properties of speech in terms of duration and expressiveness are present for the four independent variables, highlighting that human audio carries rich information, potentially beneficial for technological advancement of robot learning from demonstration methods.
|
|
11:20-11:30, Paper WeA-1.9
Gesture2Vec: Clustering Gestures Using Representation Learning Methods for Co-Speech Gesture Generation (Finalist for IROS Best Paper Award on Cognitive Robotics Sponsored by KROS) |
|
Jome Yazdian, Payam | Simon Fraser University |
Chen, Mo | Simon Fraser University |
Lim, Angelica | Simon Fraser University |
Keywords: Gesture, Posture and Facial Expressions, Social HRI, Representation Learning
Abstract: Co-speech gestures are a principal component in conveying messages and enhancing the interaction experience between humans, and they are a critical ingredient in human-agent interaction, including with virtual agents and robots. Existing machine learning approaches have yielded only marginal success in learning speech-to-motion at the frame level. Current methods generate repetitive gesture sequences that lack appropriateness with respect to the speech context. To tackle this challenge, we take inspiration from successes in natural language processing on context and long-term dependencies, and propose a new framework that views text-to-gesture as machine translation, where gestures are words in another (non-verbal) language. We propose a vector-quantized variational autoencoder structure as well as training techniques to learn a rigorous representation of gesture sequences. We then translate input text into a discrete sequence of associated gesture chunks in the learned gesture space. Ultimately, we use the translated gesture tokens from the input text as input to the autoencoder’s decoder to produce gesture sequences. Subjective and objective evaluations confirm the success of our approach in terms of appropriateness, human-likeness, and diversity. We also introduce new objective metrics using the quantized gesture representation.
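The core of such a vector-quantized representation is snapping each continuous gesture embedding to its nearest codebook entry, so a gesture sequence becomes a sequence of discrete tokens that a translation model can emit. A minimal sketch of that quantization step; the codebook size and embedding width are arbitrary stand-ins, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

codebook = rng.normal(size=(64, 32))     # 64 gesture "words", 32-dim each
embeddings = rng.normal(size=(10, 32))   # encoder output for 10 gesture chunks

# Nearest-codebook assignment: squared Euclidean distance to every entry.
d2 = ((embeddings[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
tokens = d2.argmin(axis=1)               # discrete gesture token ids
quantized = codebook[tokens]             # what the decoder would receive

print("gesture token sequence:", tokens)
```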
|
|
WeA-2
Rm2 (Room B-1)
Special Session: Robotics in Agriculture and Livestock Farming Systems |
Regular session |
Chair: Qiao, Yongliang | University of Sydney |
Co-Chair: Huang, Zichen | Kyoto University |
|
10:00-10:10, Paper WeA-2.1
Optical Flow-Based Branch Segmentation for Complex Orchard Environments |
|
You, Alexander | Oregon State University |
Grimm, Cindy | Oregon State University |
Davidson, Joseph | Oregon State University |
Keywords: Robotics and Automation in Agriculture and Forestry, Object Detection, Segmentation and Categorization
Abstract: Machine vision is a critical subsystem for enabling robots to perform a variety of tasks in orchard environments. However, orchards are highly visually complex environments, and computer vision algorithms operating in them must be able to contend with variable lighting conditions and background noise. Past work on enabling deep learning algorithms to operate in these environments has typically required either large amounts of hand-labeled data to train a deep neural network or physical control of the conditions under which the environment is perceived. In this paper, we train a neural network system in simulation, using only simulated RGB data and optical flow. The resulting neural network is able to perform foreground segmentation of branches in a busy orchard environment without additional real-world training or any special setup or equipment beyond a standard camera. Our results show that our system is highly accurate and, when compared to a network using manually labeled RGBD data, achieves significantly more consistent and robust performance across environments that differ from the training set.
|
|
10:10-10:20, Paper WeA-2.2
Near Real-Time Vineyard Downy Mildew Detection and Severity Estimation |
|
Liu, Ertai | Cornell University |
Gold, Kaitlin | Cornell University |
Cadle-Davidson, Lance | USDA ARS Grape Genetics Research Unit |
Combs, David | Cornell University |
Jiang, Yu | Cornell University |
Keywords: Robotics and Automation in Agriculture and Forestry, Environment Monitoring and Management, Agricultural Automation
Abstract: The global grape and wine industry has been considerably impacted by diseases such as downy mildew (DM). Agricultural robots have demonstrated great potential to accurately and rapidly map DM infection for precision applications. Although robots can autonomously acquire high-resolution images in the vineyard, data processing is mostly performed offline because of network infrastructure and onboard computing power constraints, limiting the use of agricultural robots for field operations. To address this issue, we developed a semantic segmentation model based on the modified DeepLabv3 network for near-real-time DM segmentation in high-resolution images. Compared with state-of-the-art real-time semantic segmentation models, the developed model achieved the best efficiency-accuracy balance on the DM dataset using embedded computing devices that can be easily integrated with commercial robotic platforms. A DM severity estimation pipeline based on the model also showed measurement accuracy and statistical power in differentiating fungicide treatments comparable to a pipeline based on offline semantic segmentation models. This enables the use of robotic perception systems for field operations.
|
|
10:20-10:30, Paper WeA-2.3
View Planning Using Discrete Optimization for 3D Reconstruction of Row Crops |
|
Bacharis, Thanasis | CSE, UMN |
Nelson, Henry | University of Minnesota |
Papanikolopoulos, Nikos | University of Minnesota |
Keywords: Robotics and Automation in Agriculture and Forestry, Agricultural Automation, Computer Vision for Automation
Abstract: In view planning, the position and orientation of the cameras are a major contributing factor to the quality of the resulting 3D model. In applications such as precision agriculture, a dense and accurate reconstruction must be obtained quickly while the data is still actionable. Instead of using an arbitrarily large number of images taken from every possible position and orientation in order to cover the desired area of study, a more optimal approach is required. We present an efficient and realistic pipeline, which aims to optimize the positioning of cameras and hence the quality of the 3D reconstruction of a field of row crops. This is achieved in four steps: an initial flight to obtain a sparse point cloud, the fitting of a simple mesh model, the planning of images via a discrete optimization process, and a second flight to obtain the final reconstruction. We demonstrate the effectiveness of our method by comparing it with baseline methods commonly used for agricultural data collection and processing.
|
|
10:30-10:40, Paper WeA-2.4
BonnBot-I: A Precise Weed Management and Crop Monitoring Platform |
|
Ahmadi, Alireza | University of Bonn |
Halstead, Michael Allan | Bonn University |
McCool, Christopher Steven | University of Bonn |
Keywords: Robotics and Automation in Agriculture and Forestry, Agricultural Automation, Field Robots
Abstract: Cultivation and weeding are two of the primary tasks performed by farmers today. A recent challenge for weeding is the desire to reduce herbicide and pesticide treatments while maintaining crop quality and quantity. In this paper we introduce BonnBot-I, a precise weed management platform which also performs field monitoring. Driven by crop monitoring approaches which can accurately locate and classify plants (weed and crop), we further improve their performance by fusing the platform's available GNSS and wheel odometry. This improves the tracking accuracy of our crop monitoring approach from a normalized average error of 8.3% to 3.5%, evaluated on a new publicly available corn dataset. We also present a novel arrangement of weeding tools mounted on linear actuators, evaluated in simulated environments. We replicate weed distributions from a real field, using the results from our monitoring approach, and show the validity of our work-space division techniques, which require significantly less movement (a 50% reduction) to achieve similar results. Overall, BonnBot-I is a significant step forward in precise weed management, with a novel method of selectively spraying and controlling weeds in an arable field.
|
|
10:40-10:50, Paper WeA-2.5
An Integrated Actuation-Perception Framework for Robotic Leaf Retrieval: Detection, Localization, and Cutting |
|
Campbell, Merrick | University of California, Riverside |
Dechemi, Amel | University of California, Riverside |
Karydis, Konstantinos | University of California, Riverside |
Keywords: Agricultural Automation, Software-Hardware Integration for Robot Systems, Robotics and Automation in Agriculture and Forestry
Abstract: Contemporary robots in precision agriculture focus primarily on automated harvesting or remote sensing to monitor crop health. Comparatively less work has been performed with respect to collecting physical leaf samples in the field and retaining them for further analysis. Typically, orchard growers manually collect sample leaves and utilize them for stem water potential measurements to analyze tree health and determine irrigation routines. While this technique benefits orchard management, the process of collecting, assessing, and interpreting measurements requires significant human labor and often leads to infrequent sampling. Automated sampling can provide highly accurate and timely information to growers. The first step in such automated in-situ leaf analysis is identifying and cutting a leaf from a tree. This retrieval process requires new methods for actuation and perception. We present a technique for detecting and localizing candidate leaves using point cloud data from a depth camera. This technique is tested on both indoor and outdoor point clouds from avocado trees. We then use a custom-built leaf-cutting end-effector on a 6-DOF robotic arm to test the proposed detection and localization technique by cutting leaves from an avocado tree. Experimental testing with a real avocado tree demonstrates our proposed approach can enable our mobile manipulator and custom end-effector system to successfully detect, localize, and cut leaves.
|
|
10:50-11:00, Paper WeA-2.6
Algorithm Design and Integration for a Robotic Apple Harvesting System |
|
Zhang, Kaixiang | Michigan State University |
Lammers, Kyle | Michigan State University |
Chu, Pengyu | Michigan State University |
Dickinson, Nathan | Michigan State University |
Li, Zhaojian | Michigan State University |
Lu, Renfu | United States Department of Agriculture, Agricultural Research Service
Keywords: Robotics and Automation in Agriculture and Forestry, Agricultural Automation, Software-Hardware Integration for Robot Systems
Abstract: Due to labor shortage and rising labor cost for the apple industry, there is an urgent need for the development of robotic systems to efficiently and autonomously harvest apples. In this paper, we present a system overview and algorithm design of our recently developed robotic apple harvester prototype. Our robotic system is enabled by the close integration of several core modules, including visual perception, planning, and control. This paper covers the main methods and advancements in deep learning-based multi-view fruit detection and localization, unified picking and dropping planning, and dexterous manipulation control. Indoor and field experiments were conducted to evaluate the performance of the developed system, which achieved an average picking rate of 3.6 seconds per apple. This is a significant improvement over other reported apple harvesting robots with a picking rate in the range of 7-10 seconds per apple. The current prototype shows promising performance towards further development of efficient and automated apple harvesting technology. Finally, limitations of the current system and future work are discussed.
|
|
11:00-11:10, Paper WeA-2.7
Predicting Fruit-Pick Success Using a Grasp Classifier Trained on a Physical Proxy |
|
Velasquez-Lopez, Alejandro | Oregon State University |
Swenson, Nigel | Oregon State University |
Cravetz, Miranda | Oregon State University |
Grimm, Cindy | Oregon State University |
Davidson, Joseph | Oregon State University |
Keywords: Robotics and Automation in Agriculture and Forestry, Grasping, Data Sets for Robot Learning
Abstract: Apple picking is a challenging manipulation task, but it is difficult to test solutions due to the limited window of time that apples are in season. Previous methods have built simulations of apple trees, but simulations rarely capture soft contact and deformation well, both of which are common in fruit picking. In this paper we present and validate a physical proxy that replicates the mechanics of a real-world apple pick. This proxy, in conjunction with a novel hand with multiple sensors, enables large-scale sensor data collection and testing. To validate our approach, we train a Long Short-Term Memory network to classify a pick as successful or failed based on sensor feedback from the robot hand. We show that a network trained on the proxy performs as well as or even better than a network trained solely on real apple trees, with accuracies up to 90%. We determine which sensors are most important for pick classification and also demonstrate that our proxy preserves the most important sensor feature data for pick classification. Specifically, for the implemented hand, the most informative feature group was the finger's servomotor effort.
|
|
11:10-11:20, Paper WeA-2.8
Contrastive 3D Shape Completion and Reconstruction for Agricultural Robots Using RGB-D Frames (Finalist for IROS Best Paper Award on Agri-Robotics Sponsored by YANMAR) |
|
Magistri, Federico | University of Bonn |
Marks, Elias Ariel | University of Bonn |
Nagulavancha, Sumanth | University of Bonn |
Vizzo, Ignacio | University of Bonn |
Läbe, Thomas | University of Bonn |
Behley, Jens | University of Bonn |
Halstead, Michael Allan | Bonn University |
McCool, Christopher Steven | University of Bonn |
Stachniss, Cyrill | University of Bonn |
|
|
11:20-11:30, Paper WeA-2.9
Beyond mAP: Towards Practical Object Detection for Weed Spraying in Precision Agriculture |
|
Salazar-Gomez, Adrian | University of Lincoln |
Darbyshire, Madeleine | University of Lincoln |
Gao, Junfeng | University of Lincoln |
Sklar, Elizabeth I. | University of Lincoln |
Parsons, Simon | University of Lincoln |
Keywords: Agricultural Automation, Computer Vision for Automation, Object Detection, Segmentation and Categorization
Abstract: The evolution of smaller and more powerful GPUs over the last 2 decades has vastly increased the opportunity to apply robust deep learning-based machine vision approaches to real-time use cases in practical environments. One exciting application domain for such technologies is precision agriculture, where the ability to integrate on-board machine vision with data-driven actuation means that farmers can make decisions about crop care and harvesting at the level of the individual plant rather than the whole field. This makes sense both economically and environmentally. This paper assesses the feasibility of precision spraying weeds via a comprehensive evaluation of weed detection accuracy and speed using two separate datasets, two types of GPU, and several state-of-the-art object detection algorithms. A simplified model of precision spraying is used to determine whether the weed detection accuracy achieved could result in a sufficiently high weed hit rate combined with a significant reduction in herbicide usage. The paper introduces two metrics to capture these aspects of the real-world deployment of precision weeding and demonstrates their utility through experimental results.
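The abstract does not define the two deployment metrics, so the sketch below uses plausible stand-ins: the fraction of ground-truth weed cells covered by at least one spray trigger (hit rate), and the fraction of field cells left unsprayed relative to blanket spraying (herbicide saving). Both formulas are assumptions for illustration, not the paper's definitions.

```python
def spraying_metrics(weed_cells, sprayed_cells, total_cells):
    """weed_cells: set of grid cells containing weeds (ground truth).
    sprayed_cells: set of grid cells the detector triggers spraying on.
    total_cells: number of cells in the field grid.
    NOTE: assumed metric definitions, not the paper's."""
    hit_rate = len(weed_cells & sprayed_cells) / max(len(weed_cells), 1)
    saving = 1.0 - len(sprayed_cells) / total_cells  # vs. blanket spraying
    return hit_rate, saving

weeds = {3, 7, 12, 20}
sprayed = {3, 7, 12, 14}   # one miss (20), one false trigger (14)
hit, saved = spraying_metrics(weeds, sprayed, total_cells=100)
print(f"weed hit rate: {hit:.0%}, herbicide reduction: {saved:.0%}")
```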
|
|
WeA-3
Rm3 (Room B-2)
Special Session: Robot Audition |
Regular session |
Chair: Kumon, Makoto | Kumamoto University |
Co-Chair: Itoyama, Katsutoshi | Tokyo Institute of Technology |
|
10:00-10:10, Paper WeA-3.1
Audio-Visual Depth and Material Estimation for Robot Navigation |
|
Wilson, Justin | University of North Carolina at Chapel Hill (UNC-CH) |
Rewkowski, Nicholas | UMD College Park |
Lin, Ming C. | University of Maryland at College Park |
Keywords: Semantic Scene Understanding, Vision-Based Navigation, Audio-Visual SLAM
Abstract: Reflective and textureless surfaces such as windows, mirrors, and walls can be a challenge for scene reconstruction, due to depth discontinuities and holes. We propose an audio-visual method that uses the reflections of sound to aid in depth estimation and material classification for 3D scene reconstruction in robot navigation and AR/VR applications. The mobile phone prototype emits pulsed audio while recording video for audio-visual classification for 3D scene reconstruction. Reflected sound and images from the video are input into our audio (EchoCNN-A) and audio-visual (EchoCNN-AV) convolutional neural networks for surface and sound source detection, depth estimation, and material classification. The inferences from these classifications enhance 3D scene reconstructions containing open spaces and reflective surfaces by depth filtering, inpainting, and placement of unmixed sound sources in the scene. Our prototype, demos, and experimental results from real-world scenes with challenging surfaces and sound, also validated with virtual scenes, indicate high success rates in material classification, depth estimation, and closed/open surface detection, leading to considerable improvement in 3D scene reconstruction for robot navigation.
|
|
10:10-10:20, Paper WeA-3.2
Design of a Low-Cost Passive Acoustic Monitoring System for Animal Localisation from Calls |
|
Yen, Benjamin | University of Auckland |
Prins, Jemima | University of Auckland |
Schmid, Gian | University of Auckland |
Hioka, Yusuke | University of Auckland |
Ellis, Susan | GNS Science |
Marsland, Stephen | Victoria University of Wellington |
Keywords: Robot Audition
Abstract: The field of bioacoustics is concerned with monitoring wild animals based on their vocalisations. Passive acoustic recorders are now commonly used to collect data of the soundscapes of our wild places. While the data they collect is extremely useful, the majority of the recorders use a single omnidirectional microphone, and thus cannot independently perform localisation of a calling animal. Localisation can be useful to differentiate between multiple calling animals, to improve statistical estimates of abundance, and to locate calling posts, which may be close to nests. In this paper, we consider the design of a low-cost, practical, passive directional acoustic recorder that will facilitate animal localisation, and present and evaluate a prototype system for this purpose.
|
|
10:20-10:30, Paper WeA-3.3
Spotforming by NMF Using Multiple Microphone Arrays |
|
Kagimoto, Yasuhiro | Tokyo Institute of Technology |
Itoyama, Katsutoshi | Tokyo Institute of Technology |
Nishida, Kenji | Tokyo Institute of Technology
Nakadai, Kazuhiro | Tokyo Institute of Technology |
Keywords: Robot Audition
Abstract: Sound source separation is a method to extract a target sound source from a mixture of various sound sources and noises. One of the typical sound source separation methods is beamforming, which can separate sound sources by direction based on the phase differences between the channels of a microphone array, a multi-channel recording system. However, beamforming is a direction-based method and cannot separate multiple sources in the same direction. In this paper, we propose a method for separating sources in the same direction using multiple microphone arrays. The proposed method performs beamforming with multiple microphone arrays and extracts only the target sound source from the separated sound using Non-negative Matrix Factorization (NMF), thus reducing the influence of other sources in the same direction. To investigate the effectiveness of the proposed method, experiments were conducted assuming the presence of another sound source in the same direction as an arbitrary microphone array. The results show that the proposed method outperforms the delay-and-sum method in a simulation environment. In addition, experiments were conducted in a real environment to verify the effect of reverberation.
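To make the NMF step concrete: the beamformer output's magnitude spectrogram V is factorized into nonnegative bases W and activations H, and if some basis columns are attributed to the target source, its estimate is read back out through a soft mask. A generic sketch using the classic multiplicative update rules for the Euclidean objective; this is not the paper's specific formulation, and all shapes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.abs(rng.normal(size=(257, 200)))   # magnitude spectrogram (freq x time)
K, K_target = 8, 4                        # total bases / bases for the target

W = np.abs(rng.normal(size=(257, K)))
H = np.abs(rng.normal(size=(K, 200)))
eps = 1e-9

for _ in range(200):                      # multiplicative updates for V ~= W @ H
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

# Keep only the components attributed to the target source, then build a
# soft mask so the reconstruction stays consistent with the mixture.
V_target = W[:, :K_target] @ H[:K_target, :]
mask = V_target / (W @ H + eps)
target_estimate = mask * V
print("relative reconstruction error:",
      np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```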
|
|
10:30-10:40, Paper WeA-3.4
Noisy Agents: Self-Supervised Exploration by Predicting Auditory Events |
|
Gan, Chuang | IBM |
Chen, Xiaoyu | Tsinghua University |
Isola, Phillip | MIT |
Torralba, Antonio | MIT |
Tenenbaum, Joshua | Massachusetts Institute of Technology |
Keywords: Reinforcement Learning, Robot Audition
Abstract: Humans integrate multiple sensory modalities (e.g., visual and audio) to build a causal understanding of the physical world. In this work, we propose a novel type of intrinsic motivation for Reinforcement Learning (RL) that encourages the agent to understand the causal effect of its actions through auditory event prediction. First, we allow the agent to collect a small amount of acoustic data and use K-means to discover underlying auditory event clusters. We then train a neural network to predict the auditory events and use the prediction errors as intrinsic rewards to guide RL exploration. We first conduct proof-of-concept experiments using a set of Atari games for an in-depth analysis of our module. We then apply our model to embodied audio-visual exploration using the Habitat simulator and active exploration with a rolling robot using the ThreeDWorld (TDW) simulator. Experimental results demonstrate the advantages of using audio signals over vision-based models as intrinsic rewards to guide RL explorations.
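A condensed sketch of the mechanism as described: discover auditory event clusters with K-means, then reward the agent for encountering events its predictor gets wrong. The feature dimensions and the `predict_proba` stand-in for a trained event-prediction network are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# 1) Discover auditory event clusters from a small batch of sound features.
sound_feats = rng.normal(size=(500, 16))
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(sound_feats)

def auditory_event(feat):
    """Cluster id of the observed sound: the 'auditory event' label."""
    return kmeans.predict(feat[None, :])[0]

# 2) Intrinsic reward: error of the agent's event prediction. A real
# implementation would use a trained network; `predict_proba` is a stand-in
# mapping a sound feature to a distribution over the 8 event clusters.
def intrinsic_reward(predict_proba, feat):
    probs = predict_proba(feat)            # predicted event distribution
    event = auditory_event(feat)           # what actually happened
    return -np.log(probs[event] + 1e-8)    # surprising events pay more

uniform = lambda feat: np.full(8, 1.0 / 8)  # untrained predictor baseline
print("exploration bonus:", intrinsic_reward(uniform, rng.normal(size=16)))
```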
|
|
10:40-10:50, Paper WeA-3.5
Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments |
|
Sekiguchi, Kouhei | RIKEN |
Nugraha, Aditya Arie | RIKEN |
Du, Yicheng | Kyoto University |
Bando, Yoshiaki | National Institute of Advanced Industrial Science and Technology |
Fontaine, Mathieu | LTCI, Telecom Paris, Institut Polytechnique De Paris |
Yoshii, Kazuyoshi | Kyoto University |
Keywords: Robot Audition, Deep Learning Methods, Human Performance Augmentation
Abstract: This paper describes the practical response- and performance-aware development of online speech enhancement for an augmented reality (AR) headset that helps a user understand conversations made in real noisy echoic environments (e.g., a cocktail party). One may use a state-of-the-art blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) that works well in various environments thanks to its unsupervised nature. Its heavy computational cost, however, prevents its application to real-time processing. In contrast, a supervised beamforming method that uses a deep neural network (DNN) for estimating spatial information of speech and noise readily fits real-time processing, but suffers from drastic performance degradation in mismatched conditions. Given such complementary characteristics, we propose a dual-layer robust online speech enhancement method based on DNN-based beamforming with FastMNMF-guided adaptation. FastMNMF (back end) is performed in a mini-batch style, and the noisy and enhanced speech pairs are used together with the original parallel training data for updating the direction-aware DNN (front end) with backpropagation at a computationally allowable interval. This method is combined with a blind dereverberation method called weighted prediction error (WPE) to transcribe, in a streaming manner, the noisy reverberant speech of a speaker, who can be detected from video or selected by a user's hand gesture or eye gaze, and to spatially display the transcriptions with an AR technique. Our experiment showed that the word error rate was improved by more than 10 points with the run-time adaptation using only twelve minutes of observation.
|
|
10:50-11:00, Paper WeA-3.6
Object Surface Recognition Using Microphone Array by Acoustic Standing Wave |
|
Manabe, Tomoya | Kumamoto University |
Fukunaga, Rikuto | Kumamoto University |
Nakatsuma, Kei | Kumamoto University |
Kumon, Makoto | Kumamoto University |
Keywords: Robot Audition, Range Sensing, Object Detection, Segmentation and Categorization
Abstract: This paper proposes a microphone array with a speaker to recognize the shape of the surface of a target object by using the standing wave between the transmitted and the reflected acoustic signals. Because the profile of the distance spectrum encodes both the distance to the target and the distance to the edges of the target’s surface, this paper proposes to fuse distance spectra from a microphone array to estimate the three-dimensional structure of the target surface. The proposed approach was verified through numerical simulations and outdoor field experiments. Results showed the effectiveness of the method, as it could extract the shape of a board located 2 m in front of the microphone array by using a chirp tone with 20 kHz bandwidth.
|
|
11:00-11:10, Paper WeA-3.7
Recognizing Object Surface Material from Impact Sounds for Robot Manipulation |
|
Dimiccoli, Mariella | Institut De Robòtica I Informàtica Industrial (CSIC-UPC) |
Patni, Shubhan | Czech Technical University in Prague |
Hoffmann, Matej | Czech Technical University in Prague, Faculty of Electrical Engineering
Moreno-Noguer, Francesc | CSIC |
Keywords: Perception for Grasping and Manipulation, Recognition, Robot Audition
Abstract: We investigated the use of impact sounds generated during exploratory behaviors in a robotic manipulation setup as cues for predicting object surface material and for recognizing individual objects. We collected and make available the YCB-impact sounds dataset which includes over 3,500 impact sounds for the YCB set of everyday objects lying on a table. Impact sounds were generated in three modes: (i) human holding a gripper and hitting, scratching, or dropping the object; (ii) gripper attached to a teleoperated robot hitting the object from the top; (iii) autonomously operated robot hitting the objects from the side with two different speeds. A convolutional neural network (ResNet34) is trained from scratch to recognize the object material (steel, aluminium, hard plastic, soft plastic, other plastic, ceramic, wood, paper/cardboard, foam, glass, rubber) from a single impact sound. On the manually collected dataset with more variability in the action, nearly 60% accuracy for the test set (unseen objects) was achieved. On a robot setup and a stereotypical poking action from top, accuracy of 85% was achieved. This performance drops to 79% if multiple exploratory actions are combined. Individual objects from the set of 75 objects can be recognized with a 79% accuracy. This work demonstrates promising results regarding the possibility of using sound for recognition in tasks like single-stream recycling where objects have to be sorted based on their material composition.
|
|
11:10-11:20, Paper WeA-3.8
Controlling the Impression of Robots Via GAN-Based Gesture Generation |
|
Wu, Bowen | Osaka University |
Shi, Jiaqi | Osaka University, RIKEN |
Liu, Chaoran | ATR |
Ishi, Carlos Toshinori | RIKEN |
Ishiguro, Hiroshi | Osaka University |
Keywords: Gesture, Posture and Facial Expressions, Social HRI, Emotional Robotics
Abstract: As a type of body language, gestures can largely affect the impressions of human-like robots perceived by users. Recent data-driven approaches to the generation of co-speech gestures have successfully promoted the naturalness of produced gestures. These approaches also possess greater generalizability to work under various contexts than rule-based methods. However, most have no direct control over the human impressions of robots. The main obstacle is that creating a dataset that covers various impression labels is not trivial. In this study, based on previous findings in cognitive science on robot impressions, we present a heuristic method to control them without manual labeling, and demonstrate its effectiveness on a virtual agent and partially on a humanoid robot through subjective experiments with 50 participants.
|
|
11:20-11:30, Paper WeA-3.9
Outdoor Evaluation of Sound Source Localization for Drone Groups Using Microphone Arrays |
|
Yamada, Taiki | Tokyo Institute of Technology |
Itoyama, Katsutoshi | Tokyo Institute of Technology |
Nishida, Kenji | Tokyo Institute of Technology
Nakadai, Kazuhiro | Tokyo Institute of Technology |
Keywords: Localization, Search and Rescue Robots, Aerial Systems: Applications
Abstract: For robot and drone audition, microphone arrays have been used to estimate sound source directions and locations. By using sound source localization techniques, drones can, for example, detect people calling for help even if the target person is not visible. Most sound source localization methods are based on estimated sound source directions and triangulation. However, in situations using drones, severe drone noise distorts the direction estimation results, which can badly worsen the localization results due to the discreteness of direction estimation. From this perspective, the authors have proposed a sound source localization method that can omit outlying triangulation points, which improves localization performance. In this paper, an outdoor experiment was conducted, and the proposed method is evaluated on whether it can localize a sound source even when real drone noise is added to the recordings. Experimental results show that the proposed method can localize a sound source up to 50 m away with 4.15 m of estimation error, suppress the impact of outliers, and use only plausible triangulation points.
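The underlying geometry is triangulation from per-array bearing estimates, with implausible intersection points discarded before averaging. A 2D least-squares sketch follows; the outlier rule here (distance from the median intersection point) is an assumption for illustration and may differ from the paper's criterion.

```python
import numpy as np

def intersect(p1, th1, p2, th2):
    """Intersection of two bearing rays: p_i + t_i * [cos th_i, sin th_i]."""
    d1 = np.array([np.cos(th1), np.sin(th1)])
    d2 = np.array([np.cos(th2), np.sin(th2)])
    A = np.column_stack([d1, -d2])
    if abs(np.linalg.det(A)) < 1e-6:          # near-parallel bearings
        return None
    t = np.linalg.solve(A, p2 - p1)
    return p1 + t[0] * d1

def localize(positions, bearings, reject_factor=2.0):
    """Average pairwise triangulation points, dropping outlying ones."""
    pts = []
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            q = intersect(positions[i], bearings[i], positions[j], bearings[j])
            if q is not None:
                pts.append(q)
    pts = np.array(pts)
    med = np.median(pts, axis=0)
    r = np.linalg.norm(pts - med, axis=1)
    keep = r <= reject_factor * np.median(r) + 1e-9   # assumed outlier rule
    return pts[keep].mean(axis=0)

# Three arrays, one noisy bearing each; the third is badly corrupted.
src = np.array([30.0, 40.0])
pos = [np.array([0.0, 0.0]), np.array([20.0, 0.0]), np.array([0.0, 20.0])]
brg = [np.arctan2(*(src - p)[::-1]) + e
       for p, e in zip(pos, [0.02, -0.03, 0.25])]
print("estimate:", localize(pos, brg).round(2))
```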
|
|
WeA-4
Rm4 (Room C-1)
Machine Learning for Robot Control 1 |
Regular session |
Chair: Halt, Lorenz | Fraunhofer Institute for Manufacturing Engineering and Automation IPA |
Co-Chair: Baumann, Cyrill | Ecole Polytechnique Fédérale De Lausanne |
|
10:00-10:10, Paper WeA-4.1
Reactive Stepping for Humanoid Robots Using Reinforcement Learning: Application to Standing Push Recovery on the Exoskeleton Atalante |
|
Duburcq, Alexis | Wandercraft |
Schramm, Fabian | Wandercraft |
Boeris, Guilhem | Wandercraft |
Bredeche, Nicolas | Université Pierre Et Marie Curie |
Chevaleyre, Yann | Univ. Paris Dauphine |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Humanoid Robot Systems
Abstract: State-of-the-art reinforcement learning is now able to learn versatile locomotion, balancing, and push-recovery capabilities for bipedal robots in simulation. Yet, the reality gap has mostly been overlooked, and the simulated results hardly transfer to real hardware. Either it is unsuccessful in practice because the physics is over-simplified and hardware limitations are ignored, or regularity is not guaranteed and unexpected hazardous motions can occur. This paper presents a reinforcement learning framework capable of learning robust standing push recovery for bipedal robots that transfers smoothly to reality, providing only instantaneous proprioceptive observations. By combining original termination conditions and policy smoothness conditioning, we achieve stable learning, sim-to-real transfer, and safety using a policy without memory or explicit history. Reward engineering is then used to give insights into how to keep balance. We demonstrate its performance in reality on the lower-limb medical exoskeleton Atalante.
|
|
10:10-10:20, Paper WeA-4.2
Hybrid Approach for Stabilizing Large Time Delays in Cooperative Adaptive Cruise Control with Reduced Performance Penalties |
|
Hsueh, Kuei-Fang (Albert) | University of Toronto |
Farnood, Ayleen | University of Toronto |
Al Janaideh, Mohammad | Memorial University & University of Toronto
Kundur, Deepa | University of Toronto |
Keywords: Machine Learning for Robot Control
Abstract: Cooperative adaptive cruise control (CACC) is a smart transportation solution that can mitigate traffic jams and improve road safety. CACC performance is heavily impacted by communication time delay; moreover, control theory solutions generally compromise control performance by tuning control gains in order to maintain plant stability. We propose a control-machine learning hybrid approach called the deep time delay filter (DTDF). DTDF predicts the present (un-delayed) car states given time-delayed versions. We successfully train a neural network for the DTDF method and use a physical testbed to show that DTDF can mitigate the effects of constant time delays as large as 5 s while maintaining superior control performance compared to that of a baseline control algorithm.
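The DTDF idea as described is a learned filter that maps a window of stale measurements to an estimate of the present state, which then feeds the existing controller unchanged. A minimal PyTorch training sketch under assumed shapes; the state dimension, window length, and network size are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

STATE_DIM, WINDOW = 4, 20   # delayed-state history fed to the predictor

# Predicts the present (un-delayed) state from the last WINDOW delayed states.
dtdf = nn.Sequential(
    nn.Linear(STATE_DIM * WINDOW, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, STATE_DIM),
)
opt = torch.optim.Adam(dtdf.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(delayed_hist, true_state):
    """delayed_hist: (batch, WINDOW, STATE_DIM) stale measurements.
    true_state: (batch, STATE_DIM) the state at the present instant,
    available offline from logged, time-aligned data."""
    opt.zero_grad()
    pred = dtdf(delayed_hist.flatten(1))
    loss = loss_fn(pred, true_state)
    loss.backward()
    opt.step()
    return loss.item()

# At run time the controller consumes the prediction instead of stale data,
# e.g. u = K @ dtdf(history.flatten(1)).detach().numpy(), gains K unchanged.
batch = torch.randn(32, WINDOW, STATE_DIM)
target = torch.randn(32, STATE_DIM)
print("toy training loss:", train_step(batch, target))
```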
|
|
10:20-10:30, Paper WeA-4.3
Leveraging Multi-Level Modelling to Automatically Design Behavioral Arbitrators in Robotic Controllers |
|
Baumann, Cyrill | Ecole Polytechnique Fédérale De Lausanne |
Birch, Hugo | EPFL |
Martinoli, Alcherio | EPFL |
Keywords: Machine Learning for Robot Control, Behavior-Based Systems, Swarm Robotics
Abstract: Automatic control design for robotic systems is becoming more and more popular. However, it usually involves a significant computational cost, due to the expensive and noisy evaluation of candidate solutions through high-fidelity simulation or even real hardware. This work aims at reducing the computational cost of the automatic design of behavioral arbitrators through the introduction of a two-step approach. In the first step, the structure of the finite state machine governing the behavioral arbitrator is optimized. For this purpose, a more abstracted model of the robotic system is leveraged in order to significantly reduce the computational cost. In the second step, the close-to-hardware behavioral parameters are fine-tuned using a high-fidelity model. We show that, for a scenario involving a single robot and multiple tasks to be solved sequentially, the proposed method results in a significant decrease in computational cost while reaching the same controller performance both in simulation and reality.
|
|
10:30-10:40, Paper WeA-4.4
Federated Learning from Demonstration for Active Assistance to Smart Wheelchair Users |
|
Casado, Fernando E. | Universidade De Santiago De Compostela |
Demiris, Yiannis | Imperial College London |
Keywords: Machine Learning for Robot Control, Service Robotics, AI-Based Methods
Abstract: Learning from Demonstration (LfD) is a very appealing approach to empower robots with autonomy. Given some demonstrations provided by a human teacher, the robot can learn a policy to solve the task without explicit programming. A promising use case is to endow smart robotic wheelchairs with active navigation assistance. By using LfD, it is possible to learn to infer short-term destinations anywhere, without the need to build a map of the environment beforehand. Nevertheless, it is difficult to generalize robot behaviors to environments other than those used for training. We believe that one possible solution is learning from crowds, involving a broad number of teachers (the end users themselves) who perform demonstrations in diverse and real environments. To this end, in this work we consider Federated Learning from Demonstration (FLfD), a distributed approach based on a Federated Learning architecture. Our proposal allows the training of a global deep neural network using sensitive local data (images and laser readings) with privacy guarantees. In our experiments we pose a scenario involving different clients working in heterogeneous domains. We show that the federated model is able to generalize and deal with non-Independent and Identically Distributed (non-IID) data.
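The federated step behind such an architecture is typically FedAvg-style weight averaging: each wheelchair trains locally on its own demonstrations, and only model weights, never images or laser scans, are sent to the server. A generic sketch (not necessarily the paper's exact aggregation rule), weighting clients by how many samples they contributed.

```python
import copy
import torch
import torch.nn as nn

def federated_average(client_models, client_sizes):
    """Average client weights, weighted by local dataset size.
    Only state_dicts cross the network, preserving data privacy."""
    total = sum(client_sizes)
    global_state = copy.deepcopy(client_models[0].state_dict())
    for key in global_state:
        global_state[key] = sum(
            m.state_dict()[key] * (n / total)
            for m, n in zip(client_models, client_sizes)
        )
    return global_state

# Toy policy network; the real model would consume images and laser readings.
make_net = lambda: nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
clients = [make_net() for _ in range(3)]       # e.g., three wheelchairs
sizes = [120, 450, 80]                         # local demonstration counts

global_model = make_net()
global_model.load_state_dict(federated_average(clients, sizes))
```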
|
|
10:40-10:50, Paper WeA-4.5
PourNet: Robust Robotic Pouring through Curriculum and Curiosity-Based Reinforcement Learning |
|
Babaians, Edwin | Technical University of Munich |
Sharma, Tapan | Technische Universität München |
Karimi, Mojtaba | Technical University of Munich |
Sharifzadeh, Sahand | LMU |
Steinbach, Eckehard | Technical University of Munich |
Keywords: Machine Learning for Robot Control, AI-Enabled Robotics, Manipulation Planning
Abstract: Pouring liquids accurately into containers is one of the most challenging tasks for robots, as they are unaware of the complex fluid dynamics and the behavior of liquids when pouring. Therefore, it is not possible to formulate a generic pouring policy for real-time applications. In this paper, we propose PourNet as a generalized solution to pouring different liquids into containers. PourNet is a hybrid planner that uses deep reinforcement learning for end-effector planning and Nonlinear Model Predictive Control for joint planning. In this work, we introduce a novel simulation environment using Unity3D and NVIDIA-Flex to train our agents. By effective choice of the state space, action space, and reward functions, we allow for a direct sim-to-real transfer of the learned skills without additional training. In simulation, PourNet outperforms the state of the art by an average of 4.9 g deviation for water-like and 9.2 g deviation for honey-like liquids. In a real-world scenario using the Kinova Movo platform, PourNet achieves an average pouring deviation of 2.3 g for dish soap when using a novel pouring container. The average pouring deviation measured for water was 5.5 g.
|
|
10:50-11:00, Paper WeA-4.6
Simulation-Based Learning of the Peg-In-Hole Process Using Robot-Skills |
|
Lämmle, Arik | Fraunhofer IPA |
Tenbrock, Philipp | Fraunhofer Institute for Manufacturing Engineering and Automation
Bálint, Balázs András | Fraunhofer IPA |
Nägele, Frank | Fraunhofer IPA |
Kraus, Werner | Fraunhofer IPA |
Váncza, József | SZTAKI |
Huber, Marco F. | University of Stuttgart |
Keywords: Machine Learning for Robot Control, Industrial Robots, Reinforcement Learning
Abstract: Increasingly volatile markets challenge companies and demand flexible production systems that can be quickly adapted to new conditions. Machine learning has shown significant potential in supporting the human operator during the time-consuming and complex task of robot programming by identifying relevant parameters of the underlying robot control program. We present a solution for learning these parameters for contact-rich, force-controlled assembly tasks from simulation using hardware-independent robot skills. We show that successful learning and real-world execution are possible even under process deviations and tolerances when utilizing the designed learning system. We present the learning of skill parameters as high-level robot control, an evaluation and comparison based on extensive simulations, and preliminary experiments on a physical robot test-bed. The developed solution approach is evaluated and discussed using the Peg-in-Hole process, a typical benchmark process in force-controlled assembly.
|
|
11:00-11:10, Paper WeA-4.7
Hybrid LMC: Hybrid Learning and Model-Based Control for Wheeled Humanoid Robot Via Ensemble Deep Reinforcement Learning |
|
Baek, DongHoon | University of Illinois Urbana-Champaign |
Purushottam, Amartya | University of Illinois, Urbana-Champaign |
Ramos, Joao | University of Illinois at Urbana-Champaign |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Humanoid Robot Systems
Abstract: Control of wheeled humanoid locomotion is a challenging problem due to the nonlinear dynamics and underactuated characteristics of these robots. Traditionally, feedback controllers have been utilized for stabilization and locomotion. However, these methods are often limited by the fidelity of the underlying model used, the choice of controller, and the environmental variables considered (surface type, ground inclination, etc.). Recent advances in reinforcement learning (RL) offer promising methods to tackle some of these conventional feedback controller issues, but require large amounts of interaction data to learn. Here, we propose a hybrid learning- and model-based controller, Hybrid LMC, that combines the strengths of a classical linear quadratic regulator (LQR) and ensemble deep reinforcement learning. The ensemble, composed of multiple Soft Actor-Critic (SAC) agents, is utilized to reduce the variance of the RL networks. By using a feedback controller in tandem, the network exhibits stable performance in the early stages of training. As a preliminary step, we explore the viability of Hybrid LMC in controlling the wheeled locomotion of a humanoid robot over a set of different physical parameters in the MuJoCo simulator. Our results show that Hybrid LMC achieves better performance compared to other existing techniques and has increased sample efficiency.
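One plausible reading of such a hybrid control law: compute the LQR feedback for a linearized model, average the ensemble's learned actions, and trust the learned term less when the ensemble members disagree. The toy model and the confidence-based blending rule below are assumptions for illustration; the paper's exact combination may differ.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy linearized balancing model for the LQR part (assumed dynamics).
A = np.array([[1.0, 0.02], [0.3, 1.0]])
B = np.array([[0.0], [0.02]])
Q, R = np.eye(2), np.eye(1) * 0.1

P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # u_lqr = -K x

def hybrid_action(x, ensemble_policies):
    """Blend LQR feedback with an ensemble of learned policies; shrink the
    learned term's weight when ensemble members disagree (assumed rule)."""
    u_lqr = -K @ x
    u_rl = np.array([pi(x) for pi in ensemble_policies])
    mean, std = u_rl.mean(axis=0), u_rl.std(axis=0)
    alpha = 1.0 / (1.0 + 10.0 * std)       # assumed confidence weighting
    return (1 - alpha) * u_lqr + alpha * mean

# Three toy stand-ins for SAC policies with slightly different gains.
ensemble = [lambda x, g=g: -(K + g) @ x for g in (0.00, 0.05, -0.05)]
x = np.array([0.1, -0.2])
print("hybrid action:", hybrid_action(x, ensemble))
```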
|
|
11:10-11:20, Paper WeA-4.8
Active Exploration for Robotic Manipulation |
|
Schneider, Tim | Technical University Darmstadt |
Belousov, Boris | Technische Universität Darmstadt |
Chalvatzaki, Georgia | Technische Universität Darmstadt, Intelligent Robotic Systems
Romeres, Diego | Mitsubishi Electric Research Laboratories |
Jha, Devesh | Mitsubishi Electric Research Laboratories |
Peters, Jan | Technische Universität Darmstadt |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Probabilistic Inference
Abstract: Robotic manipulation stands as a largely unsolved problem despite significant advances in robotics and machine learning in recent years. One of the key challenges in manipulation is the exploration of the dynamics of the environment when there is continuous contact between the objects being manipulated. This paper proposes a model-based active exploration approach that enables efficient learning in sparse-reward robotic manipulation tasks. The proposed method estimates an information gain objective using an ensemble of probabilistic models and deploys model predictive control (MPC) to plan actions online that maximize the expected reward while also performing directed exploration. We evaluate our proposed algorithm in simulation and on a real robot, trained from scratch with our method, on a challenging ball-pushing task on tilted tables, where the target ball position is not known to the agent a priori. Our real-world robot experiment serves as a fundamental application of active exploration in model-based reinforcement learning of complex robotic manipulation tasks. Project page: https://sites.google.com/view/aerm.
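A compact sketch of the described loop: score candidate action sequences by predicted reward plus an information-gain proxy, here ensemble disagreement on next-state predictions, and execute only the first action of the best sequence. The toy linear models, reward, and bonus weight are illustrative stand-ins, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACT_DIM, HORIZON, N_SEQ, N_MODELS = 4, 2, 10, 256, 5

# Ensemble of toy dynamics models; in practice these are learned networks.
models = [rng.normal(scale=0.1, size=(STATE_DIM + ACT_DIM, STATE_DIM))
          for _ in range(N_MODELS)]
step = lambda M, s, a: s + np.concatenate([s, a], axis=-1) @ M
reward = lambda s: -np.linalg.norm(s, axis=-1)       # toy task reward

def plan(state, beta=1.0):
    """Random-shooting MPC with an ensemble-disagreement exploration bonus."""
    actions = rng.uniform(-1, 1, size=(N_SEQ, HORIZON, ACT_DIM))
    scores = np.zeros(N_SEQ)
    s = np.tile(state, (N_MODELS, N_SEQ, 1))         # one rollout per model
    for t in range(HORIZON):
        s = np.stack([step(M, s[m], actions[:, t])
                      for m, M in enumerate(models)])
        disagreement = s.std(axis=0).sum(axis=-1)    # information-gain proxy
        scores += reward(s.mean(axis=0)) + beta * disagreement
    return actions[scores.argmax(), 0]               # execute first action only

print("chosen action:", plan(np.ones(STATE_DIM)).round(3))
```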
|
|
11:20-11:30, Paper WeA-4.9
Cloud-Edge Training Architecture for Sim-To-Real Deep Reinforcement Learning |
|
Cao, Hongpeng | Technical University of Munich |
Theile, Mirco | Technical University of Munich |
Wyrwal, Federico Gabriel | Technical University Munich |
Caccamo, Marco | Technical University of Munich |
Keywords: Control Architectures and Programming, Reinforcement Learning, Transfer Learning
Abstract: Deep reinforcement learning (DRL) is a promising approach to solve complex control tasks by learning policies through interactions with the environment. However, the training of DRL policies requires large amounts of training experiences, making it impractical to learn the policy directly on physical systems. Sim-to-real approaches leverage simulations to pretrain DRL policies and then deploy them in the real world. Unfortunately, the direct real-world deployment of pretrained policies usually suffers from performance deterioration due to the different dynamics, known as the reality gap. Recent sim-to-real methods, such as domain randomization and domain adaptation, focus on improving the robustness of the pretrained agents. Nevertheless, the simulation-trained policies often need to be tuned with real-world data to reach optimal performance, which is challenging due to the high cost of real-world samples. This work proposes a distributed cloud-edge architecture to train DRL agents in the real world in real-time. In the architecture, the inference and training are assigned to the edge and cloud, separating the real-time control loop from the computationally expensive training loop. To overcome the reality gap, our architecture exploits sim-to-real transfer strategies to continue the training of simulation-pretrained agents on a physical system. We demonstrate its applicability on a physical inverted-pendulum control system, analyzing critical parameters. The real-world experiments show that our architecture can adapt the pretrained DRL agents to unseen dynamics consistently and efficiently.
|
|
WeA-5 |
Rm5 (Room C-2) |
Soft Robot Modeling and Control 1 |
Regular session |
Chair: Hirai, Shinichi | Ritsumeikan University |
Co-Chair: Katzschmann, Robert Kevin | ETH Zurich |
|
10:00-10:10, Paper WeA-5.1 | |
Towards Accurate Modeling of Modular Soft Pneumatic Robots: From Volume FEM to Cosserat Rod |
|
Wiese, Mats | Leibniz Universität Hannover |
Cao, Benjamin-Hieu | Leibniz Universität Hannover |
Raatz, Annika | Leibniz Universität Hannover |
Keywords: Modeling, Control, and Learning for Soft Robots, Hydraulic/Pneumatic Actuators, Soft Sensors and Actuators
Abstract: Compared to their rigid counterparts, soft material robotic systems offer great advantages when it comes to flexibility and adaptability. Despite their advantages, modeling of soft systems is still a challenging task, due to the continuous and often highly nonlinear nature of deformation these systems exhibit. Tasks like motion planning or design optimization of soft robots require computationally cheap models of the system’s behavior. In this paper we address this need by deriving operational point dependent Cosserat rod models from detailed volume finite element models (FEM). While the latter offer detailed simulations, they generally come with high computational burden that hinders them from being used in time critical model-based methods like motion planning or control. Basic Cosserat rod models promise to provide computationally efficient mechanical models of soft continuum robots. By using a detailed FE model in an offline stage to identify operational point dependent Cosserat rod models, we bring together the accuracy of volumetric FEM with the efficiency of Cosserat rod models. We apply the approach to a fiber reinforced soft pneumatic bending actuator module (SPA module) and evaluate the model’s predictive capabilities for a single module as well as a two-module robot.
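For context, the statics that a Cosserat rod model encodes can be stated in the standard textbook form (the generic formulation, not reproduced from the paper): with centerline position p(s), orientation R(s), linear and angular strains v(s), u(s), internal force n(s) and moment m(s), and distributed loads f(s), l(s),

    \frac{d\mathbf{p}}{ds} = \mathbf{R}\,\mathbf{v}, \qquad \frac{d\mathbf{R}}{ds} = \mathbf{R}\,\hat{\mathbf{u}},
    \frac{d\mathbf{n}}{ds} + \mathbf{f} = \mathbf{0}, \qquad \frac{d\mathbf{m}}{ds} + \frac{d\mathbf{p}}{ds}\times\mathbf{n} + \mathbf{l} = \mathbf{0},
    \mathbf{n} = \mathbf{R}\,\mathbf{K}_{se}(\mathbf{v}-\mathbf{v}^{*}), \qquad \mathbf{m} = \mathbf{R}\,\mathbf{K}_{bt}(\mathbf{u}-\mathbf{u}^{*}).

The operational-point-dependent identification described above then presumably amounts to fitting the stiffness matrices K_se, K_bt (and the reference strains v*, u*) to volume-FEM solutions around each operating point.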
|
|
10:10-10:20, Paper WeA-5.2 | |
A Proprioceptive Method for Soft Robots Using Inertial Measurement Units |
|
Martin, Yves J. | Harvard University |
Bruder, Daniel | Harvard University |
Wood, Robert | Harvard University |
Keywords: Modeling, Control, and Learning for Soft Robots, Kinematics, Sensor Fusion
Abstract: Proprioception, or the perception of the configuration of one's body, is challenging to achieve with soft robots due to their infinite degrees of freedom and incompatibility with most off-the-shelf sensors. This work explores the use of inertial measurement units (IMUs), sensors that output orientation with respect to the direction of gravity, to achieve soft robot proprioception. A simple method for estimating the shape of a soft continuum robot arm from IMUs mounted along the arm is presented. The approach approximates a soft arm as a serial chain of rigid links, where the orientation of each link is given by the output of an IMU or by spherical linear interpolation of the output of adjacent IMUs. In experiments conducted on a 660 mm long real-world soft arm, this approach provided estimates of its end-effector position with a median error of less than 10% of the arm's length. This demonstrates the potential of IMUs to serve as inexpensive off-the-shelf sensors for soft robot proprioception.
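A minimal numpy sketch of the described approach: approximate the arm as a serial chain of rigid links whose orientations are slerped from the IMU outputs (quaternions given as (w, x, y, z)). Link count, lengths, and the toy IMU readings are illustrative assumptions.

    import numpy as np

    def quat_slerp(q0, q1, t):
        """Spherical linear interpolation between two unit quaternions."""
        q0, q1 = q0 / np.linalg.norm(q0), q1 / np.linalg.norm(q1)
        d = np.dot(q0, q1)
        if d < 0.0:                      # take the short path
            q1, d = -q1, -d
        if d > 0.9995:                   # nearly parallel: fall back to lerp
            q = q0 + t * (q1 - q0)
            return q / np.linalg.norm(q)
        th = np.arccos(d)
        return (np.sin((1 - t) * th) * q0 + np.sin(t * th) * q1) / np.sin(th)

    def quat_rotate(q, v):
        """Rotate vector v by unit quaternion q = (w, x, y, z)."""
        w, u = q[0], q[1:]
        return v + 2 * np.cross(u, np.cross(u, v) + w * v)

    def arm_shape(imu_quats, n_links, link_len):
        """Chain rigid links whose orientations are slerped from the IMUs
        mounted along the arm; returns the estimated link endpoints."""
        pts = [np.zeros(3)]
        for i in range(n_links):
            s = i / max(n_links - 1, 1) * (len(imu_quats) - 1)
            j = min(int(s), len(imu_quats) - 2)
            q = quat_slerp(imu_quats[j], imu_quats[j + 1], s - j)
            pts.append(pts[-1] + quat_rotate(q, np.array([0.0, 0.0, link_len])))
        return np.array(pts)

    # two IMUs: identity and a 90 deg rotation about x -> the arm bends smoothly
    imus = [np.array([1.0, 0, 0, 0]),
            np.array([np.cos(np.pi / 4), np.sin(np.pi / 4), 0, 0])]
    print(arm_shape(imus, n_links=10, link_len=0.066))  # 10 x 66 mm = 660 mm arm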
|
|
10:20-10:30, Paper WeA-5.3 | |
Model-Based Contact Detection and Accommodation for Soft Bending Actuators: An Integrated Direct/Indirect Adaptive Robust Approach |
|
Hu, Yu | Zhejiang University |
Chen, Cong | Zhejiang University |
Zou, Jun | Zhejiang University |
Keywords: Modeling, Control, and Learning for Soft Robots, Motion Control
Abstract: Soft robots have intrinsic advantages in interacting with humans or complex environments in real applications, during which various external disturbances (e.g., external contact or collision) are inevitable. They show remarkable abilities in complicated tasks due to their easily deformable bodies and compliant characteristics, while these also bring challenges to modeling, control, and trajectory planning in precise tasks. Perception of, and reaction to, external disturbances are therefore critical. In this paper, we focus on slowly varying external contact and propose a contact detection method for the fiber-reinforced soft bending actuator (FRSBA), which is based on the system's dynamical behavior. When contact is detected, a parameter extension method is introduced to modify the dynamic model. Then, a backstepping-based integrated direct/indirect adaptive robust controller with a contact detection and accommodation strategy (CDA-DIARC) is designed to deal with system nonlinearities, uncertainties, and parametric variations caused by the external contact. Theoretical proof and physical experiments validate the convergence and high trajectory tracking performance of the proposed methods under different contact environments.
|
|
10:30-10:40, Paper WeA-5.4 | |
A Unified and Modular Model Predictive Control Framework for Soft Continuum Manipulators under Internal and External Constraints |
|
Spinelli, Filippo Alberto | ETH Zürich |
Katzschmann, Robert Kevin | ETH Zurich |
Keywords: Modeling, Control, and Learning for Soft Robots, Optimization and Optimal Control, Robust/Adaptive Control
Abstract: Fluidically actuated soft robots have promising capabilities such as inherent compliance and user safety. The control of soft robots needs to properly handle nonlinear actuation dynamics, motion constraints, workspace limitations, and variable shape stiffness, so having a single algorithm for all these issues would be extremely beneficial. In this work, we adapt Model Predictive Control (MPC), popular for rigid robots, to a soft robotic arm called SoPrA. We address the challenges that current control methods face by proposing a framework that handles them in a modular manner. While previous work focused on Joint-Space formulations, we show through simulation and experimental results that Task-Space MPC can be successfully implemented for dynamic soft robotic control. We provide a way to couple the Piece-wise Constant Curvature and Augmented Rigid Body Model assumptions with internal and external constraints and actuation dynamics, delivering an algorithm that unites these aspects and optimizes over them. We believe that an MPC implementation based on our approach could be the way to address most model-based soft robotics control issues within a unified and modular framework, while allowing the inclusion of improvements that usually belong to other control domains, such as machine learning techniques.
|
|
10:40-10:50, Paper WeA-5.5 | |
Contact-Implicit Trajectory and Grasp Planning for Soft Continuum Manipulators |
|
Graule, Moritz A. | Harvard University |
Teeple, Clark | Harvard University |
Wood, Robert | Harvard University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications
Abstract: As robots begin to move from structured industrial environments to the real world, they must be equipped to not only safely interact with the environment, but also reason about how to leverage contact to perform tasks. In this work, we develop a modeling and motion planning framework for continuum robots that accounts for contact anywhere along the robot. We first present an analytical model for continuum manipulators under contact and discuss the ideal choice of generalized coordinates given properties of the manipulator and task specifications. We then demonstrate the utility of our model by developing a motion planning framework that can solve a diverse set of tasks. We apply our framework to end effector path planning for a soft arm in an obstacle-rich environment, and grasp planning for soft robotic grippers, where contact can happen anywhere on the arm or gripper. Finally, we verify the utility of our model and planning framework by planning a grasp with a desired contact force for a soft antipodal gripper and testing this grasp in a hardware demonstration. Overall, our model and planning approach further enhance soft and continuum robots where they already excel: utilizing contact with the world to achieve their goals with a gentle touch.
|
|
10:50-11:00, Paper WeA-5.6 | |
Analytical Modeling of a Membrane-Based Pneumatic Soft Gripper |
|
Sachin, Sachin | Ritsumeikan University |
Wang, Zhongkui | Ritsumeikan University |
Matsuno, Takahiro | Ritsumeikan University |
Hirai, Shinichi | Ritsumeikan University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Contact Modeling
Abstract: Finite deformation is the principal actuation basis of elastomer-based pneumatic soft actuators. Desired deformation behavior is the key design requirement for such actuators. The objective of the current study is to optimize the design of a flat shell gripper and to investigate its interaction with a cylindrical object. Herein, we propose an analytical model for a membrane-based flat shell gripper. The model is based on finite strain membrane theory and a neo-Hookean material. The proposed model considers the contact interaction of the actuator with flat and cylindrical rigid substrates. The model is developed for three different states of the actuator: (a) free-space; (b) contact with a flat substrate; and (c) contact with a cylindrical substrate. In application, the model was used to predict the relative position and air pressure required to grasp a cylindrical object by a parallel two-fingered shell gripper. Additionally, the frictional behavior of the actuator in contact with a cylindrical substrate is investigated. The model involves only solving nonlinear algebraic equations and is computationally efficient. The theoretically predicted deformation behavior of the actuator is experimentally validated via free-space deformation, force measurement, and grasping tests.
|
|
11:00-11:10, Paper WeA-5.7 | |
Planar Modeling and Sim-To-Real of a Tethered Multimaterial Soft Swimmer Driven by Peano-HASELs |
|
Gravert, Stephan-Daniel | ETH Zurich |
Michelis, Mike Yan | Technical University of Munich |
Rogler, Simon | ETH Zurich |
Tscholl, Dario | ETH Zurich |
Buchner, Thomas Jakob Konrad | ETH Zurich |
Katzschmann, Robert Kevin | ETH Zurich |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Materials and Design, Soft Robot Applications
Abstract: Soft robotics has the potential to revolutionize robotic locomotion; in particular, soft robotic swimmers offer a minimally invasive and adaptive solution to explore and preserve our oceans. Unfortunately, current soft robotic swimmers are vastly inferior to evolved biological swimmers, especially in terms of controllability, efficiency, maneuverability, and longevity. Additionally, the tedious iterative fabrication and empirical testing required to design soft robots have hindered their optimization. In this work, we tackle this challenge by providing an efficient and straightforward pipeline for designing and fabricating soft robotic swimmers equipped with electrostatic actuation. We streamline the process to allow for rapid additive manufacturing, and show how a differentiable simulation can be used to match a simplified model to the real deformation of a robotic swimmer. We perform several experiments with the fabricated swimmer by varying the voltage and actuation frequency of the swimmer's antagonistic muscles. We show how the voltage and frequency vary the locomotion speed of the swimmer while moving in liquid oil and observe a clear optimum in forward swimming speed. The differentiable simulation model we propose has various downstream applications, such as control and shape optimization of the swimmer; optimization results can be directly mapped back to the real robot through our sim-to-real matching.
|
|
11:10-11:20, Paper WeA-5.8 | |
Model-Based Disturbance Estimation for a Fiber-Reinforced Soft Manipulator Using Orientation Sensing |
|
Cangan, Barnabas Gavin | ETH Zurich |
Escaida Navarro, Stefan | Inria |
Bai, Yang | TU Berlin |
Zhang, Yu | ETH Zurich |
Duriez, Christian | INRIA |
Katzschmann, Robert Kevin | ETH Zurich |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators
Abstract: A soft robotic arm should ideally work efficiently, robustly, and safely in human-centered environments to provide assistance in real-world situations. For this goal, soft robots need to be able to estimate their state and external interactions based on (proprioceptive) sensors. Estimating disturbances allows a soft robot to perform desirable force control. Even in the case of rigid manipulators, force estimation at the end-effector is seen as a non-trivial problem. Indeed, current approaches to address this challenge have shortcomings that prevent their general application. They are often based on simplified soft dynamic models, such as those relying on a piece-wise constant curvature approximation or matched rigid-body models that do not represent enough details of the problem. Thus, the applications needed for complex human-robot interaction cannot be built. Finite element method (FEM)-based models allow for predictions of soft robot dynamics in a more generic fashion. Here, using the soft robot modeling capabilities of the framework SOFA, we build a detailed FEM model of a multi-segment soft continuum robotic arm composed of compliant deformable materials and fiber-reinforced pressurized actuation chambers. In addition, a model for sensors that provide orientation output is presented. This model is used to establish a state observer for the manipulator. The sensor model is adequate for representing the output of flexible bend sensors as well as orientations provided by IMUs or coming from tracking systems, all of which are popular choices in soft robotics. Model parameters were calibrated to match imperfections of the manual fabrication process using physical experiments. We then solve a quadratic programming inverse dynamics problem to compute the components of external force that explain the pose error. Our experiments show an average force estimation error of around 1.2%. As the methods proposed are generic, these results are encouraging for the task of building soft robots exhibiting complex, reactive, sensor-based behavior that can be deployed in human-centered environments.
|
|
11:20-11:30, Paper WeA-5.9 | |
Variable Stiffness Object Recognition with Bayesian Convolutional Neural Network on a Soft Gripper |
|
Cao, Jinyue | ShanghaiTech University |
Huang, Jingyi | ShanghaiTech University |
Rosendo, Andre | ShanghaiTech University |
Keywords: Modeling, Control, and Learning for Soft Robots, Deep Learning Methods
Abstract: From a medical standpoint, detecting the size and shape of hard inclusions hidden in soft three-dimensional objects is of great significance for the early detection of cancer through palpation. Soft robots, especially soft grippers, substantially broaden robots' palpation capabilities from soft to hard materials without the assistance of a camera. We recently introduced a CNN-Bayes approach, which added a Naive Bayes classifier to a convolutional neural network (CNN) architecture called SoftTactNet for variable stiffness object recognition on a three-finger FinRay soft gripper. SoftTactNet can reach a certain level of recognition accuracy but lacks uncertainty estimation. In this paper, we further improve the framework by merging the Bayes method directly into the CNN architecture and build a new Bayes-SoftTactNet for object recognition. The new approach, using a prior distribution instead of point estimation, allows the network to present results with uncertainty estimates. We conduct new experiments using the same soft gripper with tactile sensor arrays to grasp different variable stiffness objects surrounded by identical soft material and generate tactile images as a dataset. The results show that our new algorithm is more efficient than the previous approach and still able to achieve higher recognition accuracy than general deterministic CNNs.
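For illustration, the sketch below uses Monte Carlo dropout, one common cheap approximation to a Bayesian CNN, to attach uncertainty estimates to tactile-image classifications. The paper's Bayes-SoftTactNet merges the Bayes method into the architecture itself, so this is an analogy under stated assumptions, not the authors' network; all sizes are illustrative.

    import torch
    import torch.nn as nn

    class BayesLikeTactileNet(nn.Module):
        """Small CNN over tactile images; dropout kept active at test time
        (MC dropout) gives a cheap approximation to Bayesian inference."""
        def __init__(self, n_classes=4):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.2),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.2),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, n_classes)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    def predict_with_uncertainty(model, x, n_samples=30):
        """Sample the predictive distribution via repeated stochastic passes."""
        model.train()                        # keep dropout active
        with torch.no_grad():
            probs = torch.stack([model(x).softmax(-1) for _ in range(n_samples)])
        return probs.mean(0), probs.std(0)   # predictive mean and uncertainty

    model = BayesLikeTactileNet()
    mean, std = predict_with_uncertainty(model, torch.randn(1, 1, 16, 16))
    print(mean, std)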
|
|
WeA-6 |
Rm6 (Room D) |
SLAM 7 |
Regular session |
Chair: Vidal-Calleja, Teresa A. | University of Technology Sydney |
Co-Chair: Zhang, Fu | University of Hong Kong |
|
10:00-10:10, Paper WeA-6.1 | |
Simultaneous Localization and Mapping through the Lens of Nonlinear Optimization |
|
Saxena, Amay | University of California, Berkeley |
Chiu, Chih-Yuan | University of California, Berkeley |
Shrivastava, Ritika | University of California, Berkeley |
Menke, Joseph | University of California, Berkeley |
Sastry, Shankar | University of California, Berkeley |
Keywords: Visual-Inertial SLAM, Localization, Mapping
Abstract: Simultaneous Localization and Mapping (SLAM) algorithms perform visual-inertial estimation via filtering or batch optimization methods. Empirical evidence suggests that filtering algorithms are computationally faster, while optimization methods are more accurate. This work presents an optimization-based framework that unifies these approaches, and allows users to flexibly implement different design choices, e.g., the number and types of variables maintained in the algorithm at each time. We prove that filtering methods correspond to specific design choices in our generalized framework. We then reformulate the Multi-State Constrained Kalman Filter (MSCKF) and contrast its performance with that of sliding-window based filters. Our approach modularizes state-of-the-art SLAM algorithms to allow for adaptation to various scenarios. Experiments on the EuRoC MAV dataset verify that our implementations of these algorithms are competitive with the performance of off-the-shelf implementations in the literature. Using these results, we explain the relative performance characteristics of filtering and batch-optimization based algorithms in the context of our framework. We illustrate that under different design choices, our empirical performance interpolates between those of state-of-the-art approaches.
|
|
10:10-10:20, Paper WeA-6.2 | |
Fast and Safe Exploration Via Adaptive Semantic Perception in Outdoor Environments |
|
Wang, Zhihao | Harbin Institute of Technology, Shenzhen |
Chen, Lingxu | Harbin Institute of Technology, Shenzhen |
Chen, Hongjin | Harbin Institute of Technology, Shenzhen |
Chen, Haoyao | Harbin Institute of Technology, Shenzhen |
Jiang, Xin | Harbin Institute of Technology, Shenzhen |
Keywords: Perception-Action Coupling, Motion and Path Planning, Field Robots
Abstract: Autonomous exploration in unknown environments is a fundamental task for robots. Existing approaches mostly concentrate on exploration efficiency under the assumption of perfect state estimation, but pose-estimation drift occurs frequently in visual SLAM and is detrimental to the robot's localization safety and exploration performance. In this paper, a perception-aware exploration (PAE) method is proposed for rapid and safe autonomous exploration in outdoor environments. Adaptive semantic perception is proposed to improve the robustness of the perceptual ability, and based on the perception module, both the selection of the exploration goal, using a novel weighted information gain, and the path planning avoid areas with high localization uncertainty. In addition, thanks to the proposed pipeline, including scan-based frontier detection, kd-tree-based map prediction, and a suboptimal frontier buffer strategy, the PAE planner can explore the environment with a high success rate and high efficiency. Several simulations are performed to verify the effectiveness of our methods.
|
|
10:20-10:30, Paper WeA-6.3 | |
Towards Robust Visual-Inertial Odometry with Multiple Non-Overlapped Monocular Cameras |
|
He, Yao | The Chinese University of Hong Kong, Shenzhen |
Yu, Huai | Carnegie Mellon University; Wuhan University |
Yang, Wen | Wuhan University |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Visual-Inertial SLAM, Omnidirectional Vision
Abstract: We present a Visual-Inertial Odometry (VIO) algorithm with multiple non-overlapping monocular cameras aiming at improving the robustness of the VIO algorithm. An initialization scheme and tightly-coupled bundle adjustment for multiple non-overlapping monocular cameras are proposed. With more stable features captured by multiple cameras, VIO can maintain stable state estimation, especially when one of the cameras tracks unstable or limited features. We also address the high CPU usage rate brought by multiple cameras by proposing a GPU-accelerated frontend. Finally, we use our pedestrian-carried system to evaluate the robustness of the VIO algorithm in several challenging environments. The results show that the multi-camera setup yields significantly higher estimation robustness than a monocular system while not increasing the CPU usage rate (reducing the CPU resource usage rate and computational latency by 40.4% and 50.6% on each camera). A demo video can be found at https://youtu.be/r7QvPth1m10.
|
|
10:30-10:40, Paper WeA-6.4 | |
Scalable Probabilistic Gas Distribution Mapping Using Gaussian Belief Propagation |
|
Rhodes, Callum | Loughborough University |
Liu, Cunjia | Loughborough University |
Chen, Wen-Hua | Loughborough University |
Keywords: Robotics in Hazardous Fields, Environment Monitoring and Management, Probabilistic Inference
Abstract: This paper advocates the Gaussian belief propagation solver for factor graphs in the case of gas distribution mapping to support an olfactory sensing robot. The local message passing of belief propagation moves away from the standard Cholesky decomposition technique, which avoids solving the entire factor graph at once and allows only areas of interest to be updated more effectively. Implementing a local solver means that iterative updates to the distribution map can be achieved orders of magnitude quicker than with conventional direct solvers, which scale computationally with the size of the map. After defining the belief propagation algorithm for gas mapping, several state-of-the-art message scheduling algorithms are tested in simulation against the standard Cholesky solver for their ability to converge to the exact solution. Testing shows that, under the wildfire scheduling method for a large urban scenario, distribution maps can be iterated at least 10 times faster whilst still maintaining exact solutions. This move to an efficient local framework allows future works to consider 3D mapping, predictive utility and multi-robot distributed mapping.
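As a minimal illustration of the local message passing the abstract refers to, the sketch below runs Gaussian belief propagation in information form on a 1D chain of concentration cells (the paper addresses 2D maps with more elaborate schedules); measurement placement and precisions are illustrative assumptions. On a chain the messages converge to the exact posterior means.

    import numpy as np

    # 1D gas-concentration field on a chain of N cells; a few cells carry
    # sensor measurements, pairwise factors enforce smoothness.
    N, lam_s, lam_z = 50, 4.0, 1.0
    meas = {5: 2.0, 25: 5.0, 40: 1.0}       # cell -> measured concentration

    # unary information (precision, info vector) per node
    lam_u = np.array([lam_z if i in meas else 1e-6 for i in range(N)])
    eta_u = np.array([lam_z * meas.get(i, 0.0) for i in range(N)])

    # messages along both directions of the chain, information form
    lam_msg = {(i, j): 0.0 for i in range(N) for j in (i - 1, i + 1) if 0 <= j < N}
    eta_msg = dict.fromkeys(lam_msg, 0.0)

    for sweep in range(100):                # synchronous message passing
        new_lam, new_eta = {}, {}
        for (i, j) in lam_msg:
            # belief at i excluding the message coming back from j
            a = lam_u[i] + sum(lam_msg[(k, i)] for k in (i - 1, i + 1)
                               if 0 <= k < N and k != j)
            b = eta_u[i] + sum(eta_msg[(k, i)] for k in (i - 1, i + 1)
                               if 0 <= k < N and k != j)
            # marginalize node i out of the pairwise smoothness factor
            new_lam[(i, j)] = lam_s - lam_s**2 / (lam_s + a)
            new_eta[(i, j)] = lam_s * b / (lam_s + a)
        lam_msg, eta_msg = new_lam, new_eta

    lam_b = lam_u + np.array([sum(lam_msg[(k, i)] for k in (i - 1, i + 1)
                                  if 0 <= k < N) for i in range(N)])
    eta_b = eta_u + np.array([sum(eta_msg[(k, i)] for k in (i - 1, i + 1)
                                  if 0 <= k < N) for i in range(N)])
    print(eta_b / lam_b)                    # per-cell posterior mean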
|
|
10:40-10:50, Paper WeA-6.5 | |
WiSARD: A Labeled Visual and Thermal Image Dataset for Wilderness Search and Rescue |
|
Broyles, Daniel | University of Washington |
Hayner, Christopher | University of Washington |
Leung, Karen | Stanford University, NVIDIA Research, University of Washington |
Keywords: Search and Rescue Robots, Data Sets for Robotic Vision, Object Detection, Segmentation and Categorization
Abstract: Sensor-equipped unoccupied aerial vehicles (UAVs) have the potential to help reduce search times and alleviate safety risks for first responders carrying out Wilderness Search and Rescue (WiSAR) operations, the process of finding and rescuing person(s) lost in wilderness areas. Unfortunately, visual sensors alone do not address the need for robustness across all the possible terrains, weather, and lighting conditions that WiSAR operations can be conducted in. The use of multi-modal sensors, specifically visual-thermal cameras, is critical in enabling WiSAR UAVs to perform in diverse operating conditions. However, due to the unique challenges posed by the wilderness context, existing dataset benchmarks are inadequate for developing vision-based algorithms for autonomous WiSAR UAVs. To this end, we present WiSARD, a dataset with roughly 56,000 labeled visual and thermal images collected from UAV flights in various terrains, seasons, weather, and lighting conditions. To the best of our knowledge, WiSARD is the first large-scale dataset collected with multi-modal sensors for autonomous WiSAR operations. We envision that our dataset will provide researchers with a diverse and challenging benchmark that can test the robustness of their algorithms when applied to real-world (life-saving) applications. Link to dataset: https://sites.google.com/uw.edu/wisard/
|
|
10:50-11:00, Paper WeA-6.6 | |
A Tightly-Coupled Event-Inertial Odometry Using Exponential Decay and Linear Preintegrated Measurements |
|
Dai, Benny | University of Technology Sydney |
Le Gentil, Cedric | University of Technology Sydney |
Vidal-Calleja, Teresa A. | University of Technology Sydney |
Keywords: Visual-Inertial SLAM, Vision-Based Navigation
Abstract: In this paper, we introduce an event-based visual odometry and mapping framework that relies on decaying event-based corners. Event cameras, unlike conventional cameras, can provide sensor data during high-speed motions or in scenes with high dynamic ranges. Rather than providing intensity information at a global shutter rate, events are triggered asynchronously depending on whether there is a change in brightness at the pixel location. This novel sensing paradigm calls for unconventional ego-motion estimation techniques to address these new challenges. The key aspect of our framework is the use of a continuous representation of inertial measurements to characterise the system's motion, which accommodates the asynchronous nature of the event data while estimating a discrete state in an optimisation-based approach. The proposed method relies on corners extracted from events-only data and associates them with a spatio-temporal locality scheme based on exponential decay. Event tracks are then tightly coupled with temporally accurate preintegrated inertial measurements, allowing for the estimation of ego-motion and a sparse map. The proposed method is evaluated on the Event Camera Dataset, and its performance is compared against the state of the art in event-based visual-inertial odometry.
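A minimal sketch of a spatio-temporal locality score of the kind described, assuming exponential decay in time and Gaussian falloff in pixel space; the constants and data layout are illustrative assumptions, not the paper's values.

    import numpy as np

    def association_score(track, event, lam_t=50.0, sigma_px=3.0):
        """Score between an existing corner track and a new event corner:
        exponential decay in time, Gaussian falloff in image space."""
        dt = event["t"] - track["t"]        # seconds since the track was updated
        dp = np.linalg.norm(np.array(event["xy"]) - np.array(track["xy"]))
        return np.exp(-lam_t * dt) * np.exp(-0.5 * (dp / sigma_px) ** 2)

    track = {"xy": (120.0, 80.0), "t": 1.000}
    event = {"xy": (121.5, 79.0), "t": 1.004}
    score = association_score(track, event)
    print("associate" if score > 0.1 else "new track", score)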
|
|
11:00-11:10, Paper WeA-6.7 | |
Photometric Visual-Inertial Navigation with Uncertainty-Aware Ensembles (I) |
|
Jung, Jae Hyung | Seoul National University |
Choe, Yeongkwon | Korea Electronics Technology Institute (KETI), Seoul National University |
Park, Chan Gook | Seoul National University |
Keywords: Visual-Inertial SLAM, Vision-Based Navigation, Sensor Fusion
Abstract: In this article, we propose a visual-inertial navigation system that directly minimizes a photometric error without explicit data association. We focus on the photometric error parametrized by pose and structure parameters, which is highly nonconvex due to the nonlinearity of image intensity. The key idea is to introduce an optimal intensity gradient that accounts for the projective uncertainty of a pixel. Ensembles sampled from the state uncertainty contribute to the proposed gradient and yield a correct update direction even from a poor initialization point. We present two sets of experiments to demonstrate the strengths of our framework. First, a thorough Monte Carlo simulation on a virtual trajectory is designed to reveal robustness to large initial uncertainty. Second, we show that the proposed framework can achieve superior estimation accuracy with efficient computation time over state-of-the-art visual-inertial fusion methods in a real-world UAV flight test, where most scenes are composed of a featureless floor.
|
|
11:10-11:20, Paper WeA-6.8 | |
FAST-LIO2: Fast Direct LiDAR-Inertial Odometry (I) |
|
Xu, Wei | University of Hong Kong |
Cai, Yixi | University of Hong Kong |
He, Dongjiao | The University of Hong Kong |
Lin, Jiarong | The University of Hong Kong |
Zhang, Fu | University of Hong Kong |
Keywords: SLAM, Mapping, Autonomous Vehicle Navigation
Abstract: This article presents FAST-LIO2: a fast, robust, and versatile LiDAR-inertial odometry framework. Building on a highly efficient tightly coupled iterated Kalman filter, FAST-LIO2 has two key novelties that allow fast, robust, and accurate LiDAR navigation (and mapping). The first is directly registering raw points to the map (and subsequently updating the map, i.e., mapping) without extracting features. This enables the exploitation of subtle features in the environment and, hence, increases the accuracy. The elimination of a hand-engineered feature extraction module also makes it naturally adaptable to emerging LiDARs of different scanning patterns. The second main novelty is maintaining the map with an incremental k-dimensional (k-d) tree data structure, the ikd-Tree, that enables incremental updates (i.e., point insertion and deletion) and dynamic rebalancing. Compared with existing dynamic data structures (octree, R-tree, and nanoflann k-d tree), the ikd-Tree achieves superior overall performance while naturally supporting downsampling on the tree. We conduct an exhaustive benchmark comparison on 19 sequences from a variety of open LiDAR datasets. FAST-LIO2 achieves consistently higher accuracy at a much lower computation load than other state-of-the-art LiDAR-inertial navigation systems. Various real-world experiments on solid-state LiDARs with small fields of view are also conducted. Overall, FAST-LIO2 is computationally efficient (e.g., up to 100 Hz odometry and mapping in large outdoor environments), robust (e.g., reliable pose estimation in cluttered indoor environments with rotation up to 1000 deg/s), and versatile (i.e., applicable to both multiline spinning and solid-state LiDARs, unmanned aerial vehicle (UAV) and handheld platforms, and Intel- and ARM-based processors), while still achieving a higher accuracy than existing methods. Our implementation of the system FAST-LIO2 and the data structure ikd-Tree are both open-sourced on GitHub.
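A minimal sketch of the incremental-insertion idea behind an incremental k-d tree, using a toy Python node structure; the lazy deletion, on-tree downsampling, and subtree re-balancing that make the real ikd-Tree efficient are deliberately omitted here.

    class Node:
        __slots__ = ("pt", "axis", "left", "right", "size")
        def __init__(self, pt, axis):
            self.pt, self.axis = pt, axis
            self.left, self.right, self.size = None, None, 1

    def insert(node, pt, dim=3):
        """Insert one point without rebuilding the tree. The real ikd-Tree
        additionally deletes lazily, downsamples on the tree, and re-balances
        subtrees whose size ratios exceed a threshold."""
        if node is None:
            return Node(pt, 0)
        n = node
        while True:
            n.size += 1                      # bookkeeping used for re-balancing
            branch = "left" if pt[n.axis] < n.pt[n.axis] else "right"
            child = getattr(n, branch)
            if child is None:
                setattr(n, branch, Node(pt, (n.axis + 1) % dim))
                return node
            n = child

    root = None
    for p in [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (-1.0, 0.3, 2.0)]:
        root = insert(root, p)               # map grows point by point
    print(root.size)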
|
|
11:20-11:30, Paper WeA-6.9 | |
Rail Vehicle Localization and Mapping with LiDAR-Vision-Inertial-GNSS Fusion |
|
Wang, Yusheng | Wuhan University |
Song, Weiwei | Wuhan University |
Lou, Yidong | Wuhan University |
Zhang, Yi | Wuhan University |
Huang, Fei | Wuhan University |
Tu, Zhiyong | Wuhan University |
Liang, Qiangsheng | Suzhou Jingwei Co., Ltd. |
Keywords: Field Robots, Intelligent Transportation Systems, SLAM
Abstract: In this paper, we present a global navigation satellite system (GNSS)-aided LiDAR-visual-inertial scheme, RailLoMer-V, for accurate and robust rail vehicle localization and mapping. RailLoMer-V is formulated atop a factor graph and consists of two subsystems: an odometer-assisted LiDAR-inertial system (OLIS) and an odometer-integrated visual-inertial system (OVIS). Both subsystems exploit the typical geometric structure of railroads. The plane constraints from extracted rail tracks are used to compensate for the rotation and vertical errors in OLIS. In addition, line features and vanishing points are leveraged to constrain rotation drift in OVIS. The proposed framework is extensively evaluated on datasets over 800 km, gathered for more than a year on both general-speed and high-speed railways, day and night. Taking advantage of the tightly-coupled integration of all measurements from individual sensors, our framework is accurate over long-duration tasks and robust to severely degenerate scenarios (railway tunnels). Finally, real-time performance is achieved with an onboard computer.
|
|
WeA-7 |
Rm7 (Room E) |
Medical Robots and Systems 7 |
Regular session |
Chair: Fiorini, Paolo | University of Verona |
Co-Chair: Vander Poorten, Emmanuel B | KU Leuven |
|
10:00-10:10, Paper WeA-7.1 | |
A Metric for Finding Robust Start Positions for Medical Steerable Needle Automation |
|
Hoelscher, Janine | UNC Chapel Hill |
Fried, Inbar | University of North Carolina at Chapel Hill |
Fu, Mengyu | University of North Carolina at Chapel Hill |
Patwardhan, Mihir | University of North Carolina at Chapel Hill |
Christman, Max | University of North Carolina at Chapel Hill |
Akulian, Jason | University of North Carolina at Chapel Hill |
Webster III, Robert James | Vanderbilt University |
Alterovitz, Ron | University of North Carolina at Chapel Hill |
Keywords: Surgical Robotics: Steerable Catheters/Needles, Surgical Robotics: Planning, Medical Robots and Systems
Abstract: Steerable needles are medical devices with the ability to follow curvilinear paths to reach targets while circumventing obstacles. In the deployment process, a human operator typically places the steerable needle at its start position on a tissue surface and then hands off control to the automation that steers the needle to the target. Due to uncertainty in the placement of the needle by the human operator, choosing a start position that is robust to deviations is crucial since some start positions may make it impossible for the steerable needle to safely reach the target. We introduce a method to efficiently evaluate steerable needle motion plans such that they are safe to variation in the start position. This method can be applied to many steerable needle planners and requires that the needle’s orientation angle at insertion can be robotically controlled. Specifically, we introduce a method that builds a funnel around a given plan to determine a safe insertion surface corresponding to insertion points from which it is guaranteed that a collision-free motion plan to the goal can be computed. We use this technique to evaluate multiple feasible plans and select the one that maximizes the size of the safe insertion surface. We evaluate our method through simulation in a lung biopsy scenario and show that the method is able to quickly find needle plans with a large safe insertion surface.
|
|
10:10-10:20, Paper WeA-7.2 | |
Design and Development of a Lorentz Force-Based MRI-Driven Neuroendoscope |
|
Phelan, Martin | Max Planck Institute for Intelligent Systems |
Dogan, Nihal Olcay | Max Planck Institute |
Lazovic, Jelena | Max Planck Institute for Intelligent Systems |
Sitti, Metin | Max Planck Institute for Intelligent Systems |
Keywords: Surgical Robotics: Steerable Catheters/Needles
Abstract: The introduction of neuroendoscopy, microneurosurgery, neuronavigation, and intraoperative imaging for surgical operations has made significant improvements over other traditionally invasive surgical techniques. The integration of magnetic resonance imaging (MRI)-driven surgical devices with intraoperative imaging and endoscopy can enable further advancements in surgical treatments and outcomes. This work proposes the design and development of an MRI-driven endoscope leveraging the high (3-7 T), external magnetic field of an MR scanner for heat-mitigated steering within the ventricular system of the brain. It also demonstrates the effectiveness of a Lorentz force-based grasper for diseased tissue manipulation and ablation. Feasibility studies show the neuroendoscope can be steered precisely within the lateral ventricle to locate a tumor using both MRI and endoscopic guidance. Results also indicate grasping forces as high as 31 mN are possible and power inputs as low as 0.69 mW can cause cancerous tissue ablation. These findings enable further developments of steerable devices using MR imaging integrated with endoscopic guidance for improved outcomes.
|
|
10:20-10:30, Paper WeA-7.3 | |
GESRsim: Gastrointestinal Endoscopic Surgical Robot Simulator |
|
Gao, Huxin | National University of Singapore |
Zhang, Zedong | National University of Singapore |
Li, Changsheng | Beijing Institute of Technology |
Xiao, Xiao | Southern University of Science and Technology |
Qiu, Liang | National University of Singapore |
Yang, Xiaoxiao | Qilu Hospital of Shandong University |
Hao, Ruoyi | Chinese University of Hong Kong |
Zuo, Xiuli | Qilu Hospital of Shandong University |
Li, Yanqing | Qilu Hospital of Shandong University |
Ren, Hongliang | Chinese University of Hong Kong (CUHK) & National University of Singapore (NUS) |
Keywords: Medical Robots and Systems, Simulation and Animation, Visual Servoing
Abstract: Robot-assisted gastrointestinal endoscopic surgery (GES), a kind of natural orifice transluminal endoscopic surgery (NOTES), is the next generation of minimally invasive surgery (MIS). Moreover, rendering a degree of autonomy to a Gastrointestinal Endoscopic Surgical Robot (GESR) is currently promising but highly challenging. Therefore, to accelerate development and augment the autonomy of GESRs, we use CoppeliaSim to develop the first robotic simulator for the GESR system (GESRsim), based on our previous design. GESRsim provides several 3D models and the kinematics of our designed manipulators and endoscopic snake bone. Additionally, we build several scenes for robotic GES training and then utilize different programming interfaces to perform teleoperation. Furthermore, several advanced control algorithms, including visual servoing (VS) and deep reinforcement learning (DRL), are implemented to verify the performance of GESRsim.
|
|
10:30-10:40, Paper WeA-7.4 | |
A Dataset and Benchmark for Learning the Kinematics of Concentric Tube Continuum Robots |
|
Grassmann, Reinhard M. | University of Toronto Mississauga |
Chen, Ryan Zeyuan | University of Toronto |
Liang, Nan | University of Toronto |
Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Surgical Robotics: Steerable Catheters/Needles, Data Sets for Robot Learning, Flexible Robotics
Abstract: Establishing a physics-based model capturing the kinetostatic behavior of concentric tube continuum robots is challenging, as elastic interactions between the flexible tubes constituting the robot result in a highly non-linear problem. The gold-standard physics-based model, using the Cosserat theory of elastic rods, achieves reasonable approximation errors of 1.5-3% with respect to the robot's length, if well calibrated. Learning-based models of concentric tube continuum robots have been shown to outperform the gold-standard model with approximation errors below 1%. Yet, the merits of learning-based models remain largely unexplored as no common dataset and benchmark exist. In this paper, we present a dataset captured from a three-tube concentric tube continuum robot for use in learning-based kinematics research. The dataset consists of 100,000 joint configurations and the corresponding poses of four 6-DoF sensors in SE(3) measured with an electromagnetic tracking system (github.com/ContinuumRoboticsLab/CRL-Dataset-CTCR-Pose). With our dataset, we empower the continuum robotics and machine learning community to advance the field. We share our insights and lessons learned on joint space representation, shape representation in task space, and sampling strategies. Furthermore, we provide benchmark results for learning the forward kinematics using a simple, shallow feedforward neural network. The benchmark results for the tip error are 0.74 mm w.r.t. position (0.4% of the total robot length) and 6.49 grad w.r.t. orientation.
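A minimal PyTorch sketch of the kind of shallow feedforward benchmark described, assuming a sin/cos encoding of the rotary joints and a position-plus-quaternion output; layer sizes, the loss, and the random stand-in data are illustrative, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class FKNet(nn.Module):
        """Shallow network mapping CTCR joint values (3 rotations + 3
        translations) to tip pose; sin/cos encoding avoids the 2*pi wrap."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(9, 128), nn.Tanh(),   # 3*(sin,cos) + 3 translations
                nn.Linear(128, 7),              # xyz position + unit quaternion
            )

        def forward(self, alpha, beta):
            x = torch.cat([torch.sin(alpha), torch.cos(alpha), beta], dim=-1)
            out = self.net(x)
            pos, quat = out[..., :3], out[..., 3:]
            return pos, quat / quat.norm(dim=-1, keepdim=True)

    model = FKNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # random stand-in batch; a real run would load the published dataset
    alpha, beta = torch.rand(64, 3) * 6.28, torch.rand(64, 3) * 0.1
    pos_gt, quat_gt = torch.rand(64, 3), torch.randn(64, 4)
    quat_gt = quat_gt / quat_gt.norm(dim=-1, keepdim=True)

    pos, quat = model(alpha, beta)
    # position MSE plus a sign-invariant quaternion distance
    loss = ((pos - pos_gt) ** 2).mean() + (1 - (quat * quat_gt).sum(-1).abs()).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    print(float(loss))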
|
|
10:40-10:50, Paper WeA-7.5 | |
Intrinsic Force Sensing for Motion Estimation in a Parallel, Fluidic Soft Robot for Endoluminal Interventions |
|
Lindenroth, Lukas | University College London |
Merlin, Jeref | Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) |
Bano, Sophia | University College London |
Manjaly, Joseph G. | University College London Hospitals Biomedical Research Centre |
Mehta, Nishchay | University College London Hospitals Biomedical Research Centre |
Stoyanov, Danail | University College London |
Keywords: Soft Robot Applications, Medical Robots and Systems, Soft Sensors and Actuators
Abstract: Determining the externally-induced motion of a soft robot in minimally-invasive procedures is highly challenging and commonly demands specific tools and dedicated sensors. Intrinsic force sensing paired with a model describing the robot's compliance offers an alternative pathway which relies heavily on knowledge of the characteristic mechanical behaviour of the investigated system. In this work, we apply quasi-static intrinsic force sensing to a miniature, parallel soft robot designed for endoluminal ear interventions. We characterize the soft robot's nonlinear mechanical behaviour and devise methods for inferring forces applied to the actuators of the robot from fluid pressure and volume information of the working fluid. We demonstrate that it is possible to detect the presence of an external contact acting on the soft robot's actuators, infer the applied reaction force with an accuracy of 28.1 mN and extrapolate from individual actuator force sensing to determining forces acting on the combined parallel soft robot when it is deployed in a lumen, which can be achieved with an accuracy of 75.45 mN for external forces and 0.47 Nmm for external torques. The intrinsically-sensed external forces can be employed to estimate the induced motion of the soft robot in response to these forces with an accuracy of 0.11 mm in translation and 2.47 deg in rotational deflection. The derived methodologies could enable designs for more perceptive endoscopic systems and pave the way for developing sensing and control strategies in endoluminal and transluminal soft robots.
|
|
10:50-11:00, Paper WeA-7.6 | |
Contact Localization of Continuum and Flexible Robot Using Data-Driven Approach |
|
Ha, Xuan Thao | KU Leuven |
Wu, Di | KU Leuven |
Lai, Chun-Feng | Delft University of Technology |
Ourak, Mouloud | University of Leuven |
Borghesan, Gianni | KU Leuven |
Menciassi, Arianna | Scuola Superiore Sant'Anna - SSSA |
Vander Poorten, Emmanuel B | KU Leuven |
Keywords: Surgical Robotics: Steerable Catheters/Needles, Medical Robots and Systems
Abstract: Continuum robots such as robotic catheters are increasingly being used in minimally invasive surgery. Compliance contributes to enhanced safety during, e.g., catheter insertion; however, estimation of contact force and location may help clinicians avoid exerting excessive force. Ultimately this could lead to faster and safer interventions. In the past, researchers have proposed force sensors integrated in the catheter tip. However, such sensors add extra complexity to the catheter design. Also, tip force sensors do not provide insights into forces that act along the catheter length. This paper proposes a data-driven approach for localizing contact forces that appear over the length of the catheter. The proposed approach consists of a collision detection method and a contact localization method. The framework only requires the measurement of the catheter's shape, which can be done by an embedded multi-core Fiber Bragg Grating fiber. The method was validated experimentally with a 3D-printed continuum robot with an integrated multi-core fiber. A second contact localization method, based on identifying the discontinuity in the measured curvature, is also implemented and compared with the proposed method. The static and dynamic experiments show a mean localization error of 2.3 mm and 4.3 mm, which correspond to 3.3% and 6.1%, respectively, of a 70 mm long flexible robot. These findings demonstrate that the proposed framework outperforms the previous methods and yields promising results. The contact state estimation algorithm can detect collisions within approximately 1.08 s.
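One way to picture the curvature-discontinuity baseline mentioned in the abstract is the sketch below: scan the FBG-measured curvature profile for its largest jump and report the corresponding arc length. The threshold, profile, and sampling are illustrative assumptions, not the paper's values.

    import numpy as np

    def localize_contact(arc_len, curvature, jump_thresh=2.0):
        """Locate an external contact along a continuum robot as the
        arc-length position where the measured curvature profile is most
        discontinuous."""
        jumps = np.abs(np.diff(curvature))
        i = int(np.argmax(jumps))
        if jumps[i] < jump_thresh * (np.median(jumps) + 1e-9):
            return None                      # no significant discontinuity
        return 0.5 * (arc_len[i] + arc_len[i + 1])

    s = np.linspace(0.0, 0.070, 71)          # 70 mm robot, 1 mm samples
    kappa = np.full_like(s, 8.0)             # 1/m, uniform bending
    kappa[40:] = 20.0                        # contact changes curvature at 40 mm
    print(localize_contact(s, kappa))        # ~0.0395 m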
|
|
11:00-11:10, Paper WeA-7.7 | |
Deep-Learning-Based Compliant Motion Control of a Pneumatically-Driven Robotic Catheter |
|
Wu, Di | KU Leuven |
Ha, Xuan Thao | KU Leuven |
Zhang, Yao | KU Leuven |
Ourak, Mouloud | University of Leuven |
Borghesan, Gianni | KU Leuven |
Niu, Kenan | University of Twente |
Trauzettel, Fabian | TU Delft |
Dankelman, Jenny | TU Delft |
Menciassi, Arianna | Scuola Superiore Sant'Anna - SSSA |
Vander Poorten, Emmanuel B | KU Leuven |
Keywords: Surgical Robotics: Steerable Catheters/Needles, Medical Robots and Systems
Abstract: In cardiovascular interventions, when steering catheters and especially robotic catheters, great care must be taken not to apply excessive forces to the vessel walls, as this could dislodge calcifications, induce scars or even cause perforation. To address this challenge, this paper presents a novel compliant motion control algorithm that relies solely on position sensing of the catheter tip and knowledge of the catheter's behavior. The proposed algorithm features a data-driven tip position controller. The controller is trained based on a so-called control Long Short-Term Memory Network (control-LSTM). Trajectory-following experiments on four different trajectories are conducted to validate the quality of the proposed control-LSTM. Its performance was compared with that of a controller based on an analytical hysteresis model, i.e., the inverse Deadband Rate-Dependent Prandtl-Ishlinskii (IDRDPI) model. The results demonstrate the superior positioning capability of the new approach, with sub-degree precision, in the presence of severe rate-dependent hysteresis. Experiments in both a simplified setup and an aortic phantom further show that the proposed approach reduces the interaction forces with the environment by around 70%. This work shows how deep learning can be exploited advantageously to avoid the tedious modeling that would be needed to precisely steer continuum robots in constrained environments such as the patient's vasculature.
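A minimal sketch of what a control-LSTM can look like, assuming the network maps a desired tip-angle sequence (and its rate) to actuation commands so that the recurrent hidden state can capture rate-dependent hysteresis; the architecture and sizes are illustrative, not the paper's.

    import torch
    import torch.nn as nn

    class ControlLSTM(nn.Module):
        """Maps a desired tip-angle sequence to actuation commands; the
        recurrent state makes the inverse mapping depend on motion history,
        which is what rate-dependent hysteresis requires."""
        def __init__(self, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
            self.out = nn.Linear(hidden, 1)

        def forward(self, desired):           # desired: (B, T, 1) tip angles
            rate = torch.diff(desired, dim=1, prepend=desired[:, :1])
            h, _ = self.lstm(torch.cat([desired, rate], dim=-1))
            return self.out(h)                # (B, T, 1) actuation commands

    model = ControlLSTM()
    traj = torch.sin(torch.linspace(0, 6.28, 100)).view(1, 100, 1)
    cmd = model(traj)                         # commands to feed the actuators
    print(cmd.shape)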
|
|
11:10-11:20, Paper WeA-7.8 | |
Colonoscopy Navigation Using End-To-End Deep Visuomotor Control: A User Study |
|
Pore, Ameya | University of Verona |
Finocchiaro, Martina | Universitat Politècnica De Catalunya |
Dall'Alba, Diego | University of Verona |
Hernansanz, Albert | UPC (Universitat Politècnica De Catalunya) |
Ciuti, Gastone | Scuola Superiore Sant'Anna |
Arezzo, Alberto | University of Torino |
Menciassi, Arianna | Scuola Superiore Sant'Anna - SSSA |
Casals, Alicia | Universitat Politècnica De Catalunya, Barcelona Tech |
Fiorini, Paolo | University of Verona |
Keywords: Surgical Robotics: Planning, Medical Robots and Systems, Vision-Based Navigation
Abstract: Flexible Endoscopes (FEs) for colonoscopy present several limitations due to their inherent complexity, resulting in patient discomfort and lack of intuitiveness for clinicians. Robotic FEs with autonomous control represent a viable solution to reduce the workload of endoscopists and the training time while improving the procedure outcome. Prior works on autonomous FE control use heuristic policies that limit their generalisation to the unstructured and highly deformable colon environment and require frequent human intervention. This work proposes an image-based FE control using Deep Reinforcement Learning, called Deep Visuomotor Control (DVC), to exhibit adaptive behaviour in convoluted sections of the colon. DVC learns a mapping between the images and the FE control signal. A first user study of 20 expert gastrointestinal endoscopists was carried out to compare their navigation performance with DVC using a realistic virtual simulator. The results indicate that DVC shows equivalent performance on several assessment parameters while being safer. Moreover, a second user study with 20 novice users was performed to demonstrate easier human supervision compared to a state-of-the-art heuristic control policy. Seamless supervision of colonoscopy procedures would enable endoscopists to focus on the medical decision rather than on the control of the FE.
|
|
11:20-11:30, Paper WeA-7.9 | |
Shape Memory Polymer Variable Stiffness Magnetic Catheters with Hybrid Stiffness Control |
|
Mattmann, Michael | ETH Zurich, Multi Scale Robotics Laboratory |
Boehler, Quentin | ETH Zurich |
Chen, Xiang-Zhong | ETH Zurich |
Pané, Salvador | ETH Zurich |
Nelson, Bradley J. | ETH Zurich |
Keywords: Surgical Robotics: Steerable Catheters/Needles, Soft Robot Materials and Design, Medical Robots and Systems
Abstract: Variable stiffness catheters typically rely on thermally induced stiffness transitions with a transition temperature above body temperature. This imposes considerable safety limitations for medical applications. In this work, we present a variable stiffness catheter using a hybrid control strategy capable of actively heating and actively cooling the catheter material. The proposed catheter is made of a single biocompatible shape memory polymer, which significantly increases its manufacturability and scalability compared to existing designs. Increased safety is obtained by ensuring a lower-risk compliant state at body temperature while maintaining higher stiffness ranges in actively controlled states. Additionally, the combined use of variable stiffness and magnetic actuation increases the dexterity and steerability of the device compared to existing robotic tools.
|
|
WeA-8 |
Rm8 (Room F) |
Compliance and Impedance Control 1 |
Regular session |
Chair: Haddadin, Sami | Technical University of Munich |
Co-Chair: Colomé, Adrià | Institut De Robòtica I Informàtica Industrial (CSIC-UPC) |
|
10:00-10:10, Paper WeA-8.1 | |
Learning Variable Impedance Control for Aerial Sliding on Uneven Heterogeneous Surfaces through Proprioceptive and Tactile Sensing |
|
Zhang, Weixuan | ETH Zurich |
Ott, Lionel | ETH Zurich |
Tognon, Marco | ETH Zurich |
Siegwart, Roland | ETH Zurich |
Keywords: Compliance and Impedance Control, Aerial Systems: Applications, Machine Learning for Robot Control
Abstract: The recent development of novel aerial vehicles capable of physically interacting with the environment leads to new applications such as contact-based inspection. These tasks require the robotic system to exchange forces with partially-known environments, which may contain uncertainties including unknown spatially-varying friction properties and discontinuous variations of the surface geometry. Finding a solution that senses, adapts, and remains robust against these environmental uncertainties remains an open challenge. This paper presents a learning-based adaptive control strategy for aerial sliding tasks. In particular, the gains of a standard impedance controller are adjusted in real-time by a neural network policy based on proprioceptive and tactile sensing. This policy is trained in simulation with simplified actuator dynamics in a student-teacher learning setup. The real-world performance of the proposed approach is verified using a tilt-arm omnidirectional flying vehicle. The proposed controller structure combines data-driven and model-based control methods, enabling our approach to successfully transfer directly and without adaptation from simulation to the real platform. We attribute the success of the sim-to-real transfer to the inclusion of feedback control in the training and deployment. We achieved tracking performance and disturbance rejection that cannot be achieved using a fine-tuned state-of-the-art interaction control method.
|
|
10:10-10:20, Paper WeA-8.2 | |
Passivity-Based Skill Motion Learning in Stiffness-Adaptive Unified Force-Impedance Control |
|
Karacan, Kübra | Technical University of Munich |
Sadeghian, Hamid | Technical University of Munich |
Kirschner, Robin Jeanne | TU Munich, Institute for Robotics and Systems Intelligence |
Haddadin, Sami | Technical University of Munich |
Keywords: Compliance and Impedance Control, Energy and Environment-Aware Automation, Human-Robot Collaboration
Abstract: Tactile robots shall be deployed for dynamic task execution in production lines with small batch sizes. Therefore, these robots should have the ability to respond to changing conditions and be easy to (re-)program. Operating in uncertain environments requires unifying subsystems such as robot motion and force policy into one framework, referred to as tactile skills. In this paper, we propose the enhancement of these skills for passivity-based skill motion learning in stiffness-adaptive unified force-impedance control. To achieve this increased level of adaptability, we represent all tactile skills by three basic primitives: contact initiation, manipulation, and contact termination. To ensure passivity and stability, we develop an energy-based approach for unified force-impedance control that allows humans to teach the robot motion through physical interaction during the execution of a tactile task. We incorporate our proposed framework into a tactile robot to experimentally validate the motion adaptation through interaction and the stability of the control. While polishing is presented as our use case throughout the paper, the experiments can also be carried out with various tactile skills. Finally, the results show the novel controller's stability and passivity under contact loss and stiffness adaptation, leading to successful programming by interaction.
|
|
10:20-10:30, Paper WeA-8.3 | |
Perturbation-Based Stiffness Inference in Variable Impedance Control |
|
Caldarelli, Edoardo | Institut De Robòtica I Informàtica Industrial (CSIC-UPC) |
Colomé, Adrià | Institut De Robòtica I Informàtica Industrial (CSIC-UPC) |
Torras, Carme | Csic - Upc |
Keywords: Compliance and Impedance Control, Learning from Demonstration, Probabilistic Inference
Abstract: One of the major challenges in learning from demonstration is to teach the robot a wider set of task features than the plain trajectories to be followed. In this sense, one key parameter is stiffness, i.e., the rigidity that the manipulator should exhibit when performing a task. The required robot stiffness is often not known a priori and varies along the execution of the task, thus its profile needs to be inferred from the demonstrations. In this work, we propose a novel, force-based algorithm for inferring time-varying stiffness profiles, leveraging the relationship between stiffness and tracking error, and involving human-robot interaction. We begin by gathering a set of demonstrations with kinesthetic teaching. Then, the robot executes a perturbed reference, obtained from these demonstrations by means of Gaussian process regression, and the human intervenes if the perturbation makes the manipulator deviate from its expected behaviour. Human intervention is measured and used to infer the desired control stiffness. In the experiments section, we show that our algorithm can be combined with different types of force sensors, and we provide suitable processing algorithms. Our approach correctly infers the stiffness profiles from the force and electromyography sensors, their combination also permitting compliance with the physical constraints imposed by the environment. This is demonstrated in three experiments of increasing complexity: a motion in free Cartesian space, a rigid assembly task, and bed-making.
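A minimal sketch of the underlying idea, assuming the measured human corrective force during a perturbed execution is mapped monotonically to a desired stiffness profile; the linear map and all constants are illustrative stand-ins for the paper's inference scheme.

    import numpy as np

    def infer_stiffness(human_force, k_min=50.0, k_max=1200.0, f_ref=8.0):
        """Map measured human corrective force to desired stiffness: where
        the human pushes back hard, the task needs rigidity; where they let
        the perturbation pass, it does not."""
        w = np.clip(np.abs(human_force) / f_ref, 0.0, 1.0)
        return k_min + (k_max - k_min) * w

    t = np.linspace(0, 1, 200)
    force = 6.0 * np.exp(-((t - 0.5) / 0.05) ** 2)   # human corrects mid-motion
    profile = infer_stiffness(force)
    print(profile.min(), profile.max())              # compliant -> stiff -> compliant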
|
|
10:30-10:40, Paper WeA-8.4 | |
A Whole-Body Controller Based on a Simplified Template for Rendering Impedances in Quadruped Manipulators |
|
Risiglione, Mattia | Italian Institute of Technology |
Barasuol, Victor | Istituto Italiano Di Tecnologia |
Caldwell, Darwin G. | Istituto Italiano Di Tecnologia |
Semini, Claudio | Istituto Italiano Di Tecnologia |
Keywords: Compliance and Impedance Control, Legged Robots, Whole-Body Motion Planning and Control
Abstract: Quadrupedal manipulators are required to be compliant when dealing with external forces during autonomous manipulation, tele-operation or physical human-robot interaction. This paper presents a whole-body controller that allows for the implementation of Cartesian impedance control to coordinate tracking performance and desired compliance for the robot base and manipulator arm. The controller is formulated through an optimization problem using Quadratic Programming (QP) to impose a desired behavior for the system while satisfying friction cone constraints, unilateral force constraints, and joint and torque limits. The presented strategy decouples the arm and the base of the platform, enforcing the behavior of a linear double-mass spring-damper system, and allows their inertia, stiffness and damping properties to be tuned independently. The control architecture is validated through a set of simulations using the 90 kg HyQ robot equipped with a 7-DoF manipulator arm. Simulation results show the impedance rendering performance when external forces are applied at the arm's end-effector. The paper presents results for the full stance condition (all legs on the ground) and, for the first time, also shows how the impedance rendering is affected by the contact conditions during a dynamic gait.
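The Cartesian impedance behavior at the heart of such a controller can be sketched as below, assuming a simple (linearized) spring-damper law mapped to joint torques through the Jacobian transpose; the full whole-body QP with friction-cone, unilateral-force, and torque-limit constraints is only noted in comments, and all gains are illustrative.

    import numpy as np

    def impedance_wrench(x, xd, v, vd, K, D):
        """Desired 6D wrench of a Cartesian impedance: spring-damper between
        the actual and desired end-effector motion (linearized pose error)."""
        return K @ (xd - x) + D @ (vd - v)

    def arm_torques(J, wrench):
        """Map the task wrench to joint torques; a whole-body QP would solve
        for torques and contact forces subject to friction-cone, unilateral
        force, and joint/torque-limit constraints instead of this direct map."""
        return J.T @ wrench

    K = np.diag([800, 800, 800, 50, 50, 50])          # N/m and Nm/rad, illustrative
    D = np.diag([60, 60, 60, 5, 5, 5])
    J = np.random.default_rng(1).normal(size=(6, 7))  # dummy 7-DoF arm Jacobian
    x, xd = np.zeros(6), np.array([0.1, 0, 0, 0, 0, 0])
    tau = arm_torques(J, impedance_wrench(x, xd, np.zeros(6), np.zeros(6), K, D))
    print(tau.round(2))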
|
|
10:40-10:50, Paper WeA-8.5 | |
Electro-Adhesive Tubular Clutch for Variable-Stiffness Robots |
|
Sun, Yi | École Polytechnique Fédérale De Lausanne |
Digumarti, Krishna Manaswi | Ecole Polytechnique Federale De Lausanne |
Phan, Hoang-Vu | EPFL |
Aloui, Omar | LIS-EPFL |
Floreano, Dario | Ecole Polytechnique Federal, Lausanne |
Keywords: Compliance and Impedance Control, Mechanism Design, Soft Robot Applications
Abstract: Electro-adhesive clutches have become effective tools for variable stiffness functions in many robotic systems due to their light weight, high speed and strong brake force. In this paper, we present a novel, tubular design of an electro-adhesive clutch. Our clutch consists of flexible electrode sheets rolled into a tubular structure. This design allows encapsulating large electrode areas in a compact size for strong brake force. Additionally, the tubular structure acts as a guide for directional sliding without external guides. The structure also ensures that the electrode surfaces are encapsulated, preventing the accumulation of dust and thus leading to reliable performance. This structure is therefore an improvement over the commonly used planar designs. The characterization of the electro-adhesive tubular clutch shows that the frictional force increases with increasing electrode contact area and with decreasing roll diameter and dielectric layer thickness. A retractable tubular clutch is made by fixing an elastic cable along the clutch axis and achieves a stiffness change factor of up to 260. Applications of this retractable clutch in robotics to achieve variable stiffness are demonstrated in two systems: a tensegrity structure and a wing skeleton. Changes in stiffness by 13.2 and 30.2 times are achieved for the two systems, respectively. The proposed tubular clutch is an effective means of achieving variable stiffness, particularly in the case of robotic systems that transmit forces through tensioned cables.
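The reported trends (force grows with electrode area, shrinks with dielectric thickness) match the textbook parallel-plate model of electro-adhesion, sketched below. The formula and material constants are a first-order illustration, not the paper's characterization.

    import numpy as np

    EPS0 = 8.854e-12  # vacuum permittivity, F/m

    def clutch_holding_force(voltage, area, d_um, eps_r=3.0, mu=0.4):
        """Parallel-plate estimate of an electro-adhesive clutch's friction
        (holding) force: electrostatic pressure eps0*eps_r*V^2 / (2*d^2)
        times electrode area times friction coefficient."""
        d = d_um * 1e-6
        pressure = EPS0 * eps_r * voltage**2 / (2.0 * d**2)
        return mu * pressure * area

    # trends reported above: force grows with electrode area and shrinks
    # with dielectric thickness
    for area_cm2, d_um in [(10, 25), (40, 25), (40, 12)]:
        f = clutch_holding_force(300.0, area_cm2 * 1e-4, d_um)
        print(f"A={area_cm2} cm^2, d={d_um} um -> {f:.2f} N")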
|
|
10:50-11:00, Paper WeA-8.6 | |
An Observer-Based Responsive Variable Impedance Control for Dual-User Haptic Training System |
|
Rashvand, Ashkan | K. N. Toosi University of Technology |
Heidari, Reza | K. N. Toosi Univ. of Tech |
Motaharifar, Mohammad | University of Isfahan |
Hassani, Ali | Advanced Robotics and Automated Systems (ARAS), K. N. Toosi Univ |
Dindarloo, Mohammad Reza | K. N. Toosi University of Technology |
Ahmadi, Mohammad Javad | K. N. Toosi University of Technology |
Hashtrudi-Zaad, Keyvan | Queen's University |
Tavakoli, Mahdi | University of Alberta |
Taghirad, Hamid D. | K.N.Toosi University of Technology |
Keywords: Compliance and Impedance Control, Medical Robots and Systems, Physically Assistive Devices
Abstract: This paper proposes a variable impedance control architecture to facilitate eye surgery training in a dual-user haptic system. In this system, an expert surgeon (the trainer) and a novice surgeon (the trainee) collaborate on a surgical procedure using their own haptic devices. The mechanical impedance parameters of the trainer's haptic device remain constant during the operation, whereas those of the trainee vary with his/her proficiency level. The trainee's relative proficiency can be objectively quantified in real time from the position error between the trainer and the trainee. The proposed architecture enables the trainer to intervene in the training process as needed to ensure the trainee is following the right course of action and to keep the trainee from causing potential tissue injuries. The stability of the overall nonlinear closed-loop system is investigated using the input-to-state stability (ISS) criterion. A high-gain observer with unknown inputs is employed to estimate the interaction forces. Simulation and experimental results under different scenarios confirm the effectiveness of the proposed control methods.
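One way to picture the proficiency-based adaptation is a simple mapping from trainer-trainee position error to trainee-side impedance. The sketch below is a toy illustration of that general idea, not the authors' control law; the gains, saturation law, and damping heuristic are all assumptions.

```python
import numpy as np

def trainee_impedance(pos_error, k_min=50.0, k_max=800.0, err_ref=0.02):
    """Toy mapping: small trainer-trainee error (proficient trainee) gives
    low stiffness (more authority); large error stiffens the device toward
    the trainer's trajectory. All constants are made up for illustration."""
    e = min(abs(pos_error) / err_ref, 1.0)          # saturated normalized error
    k = k_min + (k_max - k_min) * e                 # stiffness, N/m
    b = 2.0 * 0.7 * np.sqrt(k)                      # keep a rough damping ratio
    return k, b

for err in (0.001, 0.01, 0.05):
    k, b = trainee_impedance(err)
    print(f"error {err * 1000:4.0f} mm -> K = {k:6.1f} N/m, B = {b:5.1f} Ns/m")
```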
|
|
11:00-11:10, Paper WeA-8.7 | |
Development of Low-Inertia Backdrivable Arm Focusing on Learning-Based Control |
|
Nishiura, Manabu | Toyota Motor Corporation |
Hatano, Akira | Toyota Motor Corporation |
Nishii, Kazutoshi | Toyota Motor Corporation |
Okumatsu, Yoshihiro | Toyota Motor Corporation |
Keywords: Compliance and Impedance Control, Compliant Joints and Mechanisms, Learning from Demonstration
Abstract: A robot designed to coexist and work with humans in the same workspace should be able to work at the same speed as humans and have safe contact with humans and with the environment. However, when a robot arm has been given flexibility through mechanisms and controls for the purpose of coexistence, it is difficult for it to perform tasks at the speed and accuracy desired by humans if it is moved simply by using conventional position-based controls. With such an arm, we consider that the use of learning-based control is necessary to achieve both safety and speed. Therefore, we prototyped a low-inertia, high-backdrivability arm as a platform for studying learning-based control and tested two types of learning-based control. This paper describes our design process, in which hardware suitable for learning-based control was developed according to the requirements of the specific task. It also presents the results of our evaluation experiments, in which tasks involving quick movements and motion requiring physical contact with an object were performed using learning-based control.
|
|
11:10-11:20, Paper WeA-8.8 | |
BEAR-H: An Intelligent Bilateral Exoskeletal Assistive Robot for Smart Rehabilitation (I) |
|
Li, Xiang | Tsinghua University |
Zhang, Xuan | Tsinghua University |
Li, Xiu | Tsinghua University |
Long, Jianjun | Shenzhen University |
Li, Jianan | Nanjing Medical University |
Xu, Lanshuai | Milebot |
Chen, Gong | Shenzhen MileBot Robotics |
Ye, Jing | Shenzhen MileBot Robotics Co. Ltd |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics, Medical Robots and Systems
Abstract: One typical application of a robotic exoskeleton is to automate rehabilitation, where the robot is worn by a stroke patient and provides assistance to help perform repetitive motions and regain motor functions. The deployment of exoskeletons can alleviate the shortage of experienced therapists and would also play a vital role in countries with aging populations. However, the intelligence level of existing exoskeletons is relatively low, wherein a robot cannot adapt to either the online change of a subject's motion (e.g., the gait pattern) or the variation of his/her body parameters (i.e., a new subject who is going to wear the robot), potentially resulting in conflict between human and robot and possibly even leading to physical damage. As such, the application of a robotic exoskeleton in clinical studies is limited. This article introduces a new bilateral exoskeletal assistive robot for rehabilitation (i.e., BEAR-H) in which the main novelty is the integration of multiple intelligent features, such as gait recognition and synchronization, cloud-computing diagnosis, and individualized gait generation. Such an integration helps the robot to better understand the patient's condition and hence provide effective assistance, in the end achieving smart rehabilitation. BEAR-H is a successfully commercialized product, and its performance has been validated in actual clinical studies with 30 patients; experimental results from different aspects are analyzed and presented.
|
|
11:20-11:30, Paper WeA-8.9 | |
Data-Driven Variable Impedance Control of a Powered Knee-Ankle Prosthesis for Sit, Stand, and Walk with Minimal Tuning |
|
Welker, Cara Gonzalez | University of Colorado Boulder |
Best, Thomas | University of Michigan |
Gregg, Robert D. | University of Michigan |
Keywords: Prosthetics and Exoskeletons, Compliance and Impedance Control, Optimization and Optimal Control
Abstract: Although the average healthy adult transitions from sit to stand over 60 times per day, the majority of robotic prosthesis control research has focused on walking. In this paper, we present a data-driven controller that enables sitting, standing, and walking with minimal tuning. Our controller comprises two high level modes of sit/stand and walking, and we develop heuristic biomechanical rules to control transitions between the two. We use a phase variable based on the user's thigh angle to parameterize both walking and sit/stand motions, where variable impedance control is used during ground contact and position control is used during the swing phase of walking. We extend previous work on data-driven optimization of continuous impedance parameter functions to design the sit/stand control mode using able-bodied data. We test our controller in experiments with a participant with an above-knee amputation, comparing clinical measures including loading symmetry and sit/stand transition time. The participant completed the sit/stand task 20% faster with approximately half of the loading asymmetry relative to his everyday passive prosthesis. The controller also facilitated a timed up and go test involving sitting, standing, walking, and turning, with only a mild (10%) decrease in speed compared to the everyday prosthesis. Our sit/stand/walk controller enables multiple activities of daily life with minimal tuning and mode switching.
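The phase-variable idea can be sketched as the angle of the thigh's phase portrait, unwrapped to [0, 1) so it sweeps monotonically through a stride. The constants and exact parameterization below are placeholders; the paper's phase variable handles stance/swing asymmetries that this toy version ignores.

```python
import numpy as np

def gait_phase(theta, theta_dot, theta_mid=0.1, scale=0.25):
    """Phase = angle of the (shifted, scaled) thigh phase portrait,
    unwrapped to [0, 1). Constants are illustrative placeholders."""
    phi = np.arctan2(-scale * theta_dot, theta - theta_mid)
    return (phi % (2 * np.pi)) / (2 * np.pi)

# A synthetic sinusoidal thigh trajectory sweeps the phase monotonically:
t = np.linspace(0.0, 1.0, 200, endpoint=False)
theta = 0.1 + 0.3 * np.cos(2 * np.pi * t)
theta_dot = -0.3 * 2 * np.pi * np.sin(2 * np.pi * t)
phase = gait_phase(theta, theta_dot)
print(bool(np.all(np.diff(phase) > 0)))   # True: phase increases through the stride
```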
|
|
WeA-9 |
Rm9 (Room G) |
Software, Middleware and Programming Environments 1 |
Regular session |
Chair: Kawashima, Hideyuki | Keio University |
Co-Chair: Bombieri, Nicola | University of Verona |
|
10:00-10:10, Paper WeA-9.1 | |
OHM: GPU Based Occupancy Map Generation |
|
Stepanas, Kazys | CSIRO Data61 |
Williams, Jason | CSIRO |
Hernandez, Emili | Emesent |
Ruetz, Fabio Adrian | Queensland University of Technology (QUT) |
Hines, Thomas | CSIRO |
Keywords: Software Tools for Robot Programming, Autonomous Vehicle Navigation, Field Robots
Abstract: Occupancy grid maps (OGMs) are fundamental to most systems for autonomous robotic navigation. However, CPU-based implementations struggle to keep up with data rates from modern 3D lidar sensors, and provide little capacity for modern extensions which maintain richer voxel representations. This paper presents OHM, our open source, GPU-based OGM framework. We show how the algorithms can be mapped to GPU resources, resolving difficulties with contention to obtain a successful implementation. The implementation supports many modern OGM algorithms including NDT-OM, NDT-TM, decay-rate and TSDF. A thorough performance evaluation is presented based on tracked and quadruped UGV platforms and UAVs, and data sets from both outdoor and subterranean environments. The results demonstrate excellent performance improvements both offline, and for online processing in embedded platforms. Finally, we describe how OHM was a key enabler for the UGV navigation solution for our entry in the DARPA Subterranean Challenge, which placed second at the Final Event.
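The core per-voxel operation that such a framework parallelizes is the standard log-odds occupancy update along each sensor ray. A CPU/NumPy sketch of that rule for a 2-D grid follows (increments and clamps are illustrative); OHM's contribution is doing this, plus richer voxel models like NDT and TSDF, efficiently on GPU.

```python
import numpy as np

L_HIT, L_MISS = 0.85, -0.4          # log-odds increments (illustrative)
L_MIN, L_MAX = -2.0, 3.5            # clamping bounds

grid = np.zeros((64, 64))           # log-odds map, 0 = unknown (p = 0.5)

def integrate_ray(grid, cells, hit):
    """cells: (row, col) indices traversed by the ray, endpoint last."""
    for i, (r, c) in enumerate(cells):
        is_end = (i == len(cells) - 1)
        grid[r, c] += L_HIT if (hit and is_end) else L_MISS
        grid[r, c] = np.clip(grid[r, c], L_MIN, L_MAX)

integrate_ray(grid, [(32, c) for c in range(10, 40)], hit=True)
prob = 1.0 / (1.0 + np.exp(-grid))  # back to occupancy probability
print(prob[32, 39], prob[32, 20])   # endpoint pushed toward occupied, ray body toward free
```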
|
|
10:10-10:20, Paper WeA-9.2 | |
IKFlow: Generating Diverse Inverse Kinematics Solutions |
|
Ames, Barrett | Duke University |
Morgan, Jeremy | Independent Researcher |
Konidaris, George | Brown University |
Keywords: Software Tools for Robot Programming, Deep Learning Methods, Redundant Robots
Abstract: Inverse kinematics—finding joint poses that reach a given Cartesian-space end-effector pose—is a fundamental operation in robotics, since goals and waypoints are typically defined in Cartesian space, but robots must be controlled in joint space. However, existing inverse kinematics solvers return a single solution; in contrast, systems with more than 6 degrees of freedom support infinitely many such solutions, which can be useful in the presence of constraints, pose preferences or obstacles. We introduce a method that uses a deep neural network to learn to generate a diverse set of samples from the solution space of such kinematic chains. The resulting samples can be generated quickly (2000 solutions in under 10 ms) and accurately (to within 10 millimeters and 2 degrees of an exact solution) and can be rapidly refined by classical methods if necessary.
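The redundancy being sampled can be seen without any learning: for a chain with more joints than task dimensions, a classical solver started from random seeds lands on many distinct joint configurations for the same end-effector point. The NumPy sketch below does this for a planar 3-link arm with damped least squares; it illustrates the solution space IKFlow learns to sample, not IKFlow itself.

```python
import numpy as np

L = np.array([0.5, 0.4, 0.3])            # link lengths of a planar 3R arm

def fk(q):                                # end-effector position
    a = np.cumsum(q)
    return np.array([np.sum(L * np.cos(a)), np.sum(L * np.sin(a))])

def jac(q):                               # 2x3 position Jacobian
    a = np.cumsum(q)
    x, y = L * np.cos(a), L * np.sin(a)
    return np.array([[-np.sum(y[j:]) for j in range(3)],
                     [np.sum(x[j:]) for j in range(3)]])

target = np.array([0.6, 0.4])
rng = np.random.default_rng(0)
for _ in range(5):                        # random seeds -> distinct solutions
    q = rng.uniform(-np.pi, np.pi, 3)
    for _ in range(300):                  # damped least-squares iterations
        e = target - fk(q)
        J = jac(q)
        q += 0.5 * (J.T @ np.linalg.solve(J @ J.T + 1e-4 * np.eye(2), e))
    print(np.round(q, 2), "residual:", np.linalg.norm(target - fk(q)))
```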
|
|
10:20-10:30, Paper WeA-9.3 | |
Ros2_tracing: Multipurpose Low-Overhead Framework for Real-Time Tracing of ROS 2 |
|
Bédard, Christophe | Polytechnique Montréal |
Lütkebohle, Ingo | Robert Bosch GmbH |
Dagenais, Michel | Polytechnique Montréal |
Keywords: Software Tools for Robot Programming, Distributed Robot Systems, Performance Evaluation and Benchmarking
Abstract: Testing and debugging have become major obstacles for robot software development, because of high system complexity and dynamic environments. Standard, middleware-based data recording does not provide sufficient information on internal computation and performance bottlenecks. Other existing methods also target very specific problems and thus cannot be used for multipurpose analysis. Moreover, they are not suitable for real-time applications. In this paper, we present ros2_tracing, a collection of flexible tracing tools and multipurpose instrumentation for ROS 2. It allows collecting runtime execution information on real-time distributed systems, using the low-overhead LTTng tracer. The tools also integrate tracing into the invaluable ROS 2 orchestration system and other usability tools. A message latency experiment shows that the end-to-end message latency overhead, when enabling all ROS 2 instrumentation, is on average 0.0033 ms, which we believe is suitable for production real-time systems. ROS 2 execution information obtained using ros2_tracing can be combined with trace data from the operating system, enabling a wider range of precise analyses that help to understand an application's execution and to find the cause of performance bottlenecks and other issues. The source code is available at: https://github.com/ros2/ros2_tracing.
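Tracing is typically enabled from a launch file via the package's Trace action. A minimal sketch follows; the argument names reflect our reading of the project README, so verify them against the repository before use.

```python
from launch import LaunchDescription
from launch_ros.actions import Node
from tracetools_launch.action import Trace

def generate_launch_description():
    return LaunchDescription([
        # Start an LTTng tracing session before the nodes come up.
        Trace(
            session_name='example-session',   # assumed argument names; see the
            events_ust=['ros2:*'],            # ros2_tracing README for the full list
        ),
        Node(package='demo_nodes_cpp', executable='talker'),
    ])
```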
|
|
10:30-10:40, Paper WeA-9.4 | |
RobotCore: An Open Architecture for Hardware Acceleration in ROS 2 |
|
Mayoral-Vilches, Victor | Klagenfurt University |
Neuman, Sabrina | MIT |
Plancher, Brian | Harvard University |
Janapa Reddi, Vijay | Harvard University |
Keywords: Software, Middleware and Programming Environments, Computer Architecture for Robotic and Automation, Hardware-Software Integration in Robotics
Abstract: Hardware acceleration can revolutionize robotics, enabling new applications by speeding up robot response times while remaining power-efficient. However, the diversity of acceleration options makes it difficult for roboticists to easily deploy accelerated systems without expertise in each specific hardware platform. In this work, we address this challenge with RobotCore, an architecture to integrate hardware acceleration in the widely-used ROS 2 robotics software framework. This architecture is target-agnostic (supports edge, workstation, data center, or cloud targets) and accelerator-agnostic (supports both FPGAs and GPUs). It builds on top of the common ROS 2 build system and tools and is easily portable across different research and commercial solutions through a new firmware layer. We also leverage the Linux Tracing Toolkit next generation (LTTng) to enable low-overhead real-time tracing and benchmarking of accelerated ROS 2 systems. To demonstrate the acceleration enabled by this architecture, we use it to deploy a ROS 2 perception computational graph on a CPU and FPGA. We also employ our integrated tracing and benchmarking to analyze bottlenecks, uncovering insights that guide us to improve FPGA communication efficiency. In particular, we design an intra-FPGA ROS 2 node communication queue template and use it in conjunction with FPGA-accelerated nodes to achieve a 24.42% speedup over a CPU.
|
|
10:40-10:50, Paper WeA-9.5 | |
Tasho: A Python Toolbox for Rapid Prototyping and Deployment of Optimal Control Problem-Based Complex Robot Motion Skills |
|
Sathya, Ajay Suresha | KU Leuven |
Astudillo Vigoya, Alejandro | KU Leuven |
Gillis, Joris | KU Leuven |
Decré, Wilm | Katholieke Universiteit Leuven |
Pipeleers, Goele | KU Leuven |
Swevers, Jan | KU Leuven |
Keywords: Software Tools for Robot Programming, Engineering for Robotic Systems, Optimization and Optimal Control
Abstract: We present Tasho (Task specification for receding horizon control), an open-source Python toolbox that facilitates systematic programming of optimal control problem (OCP)-based robot motion skills. Separation-of-concerns is followed while designing the components of a motion skill, which promotes their modularity and reusability. This allows us to program complex motion tasks by configuring and composing simpler tasks. We provide templates for several basic tasks like point-to-point and end-effector path-following tasks to speed up prototyping. Internally, the task's symbolic expressions are computed using CasADi and the resulting OCP is transcribed using Rockit. A wide and growing range of mature open-source optimization solvers are supported for solving the OCP. Monitor functions can be easily specified and are automatically deployed with the motion skill, so that the generated motion skills can be easily embedded in a larger control architecture involving higher-level discrete controllers. The motion skills thus programmed can be directly deployed on robot platforms using the C-code generation capabilities of CasADi. The toolbox has been validated through several experiments both in simulation and on physical robot systems. The open-source toolbox can be accessed at: https://gitlab.kuleuven.be/meco-software/tasho
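Underneath such a toolbox sits a transcribed OCP. The sketch below writes a small point-to-point problem directly in CasADi's Opti interface (a double-integrator joint with an effort objective and actuation bounds); it shows the kind of problem Tasho generates from a task specification, not Tasho's own API.

```python
import casadi as ca

N, dt = 50, 0.05
opti = ca.Opti()
X = opti.variable(2, N + 1)        # state per shooting node: [position, velocity]
U = opti.variable(1, N)            # control: acceleration

for k in range(N):                 # forward-Euler dynamics constraints
    opti.subject_to(X[0, k + 1] == X[0, k] + dt * X[1, k])
    opti.subject_to(X[1, k + 1] == X[1, k] + dt * U[0, k])

opti.subject_to(X[:, 0] == ca.vertcat(0.0, 0.0))   # start at rest at 0
opti.subject_to(X[:, N] == ca.vertcat(1.0, 0.0))   # arrive at rest at 1
opti.subject_to(opti.bounded(-2.0, U, 2.0))        # actuation limits
opti.minimize(ca.sumsqr(U))                        # minimum-effort objective

opti.solver('ipopt')
sol = opti.solve()
print(sol.value(X)[0, ::10])       # position samples along the motion
```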
|
|
10:50-11:00, Paper WeA-9.6 | |
Containerization and Orchestration of Software for Autonomous Mobile Robots: A Case Study of Mixed-Criticality Tasks across Edge-Cloud Computing Platforms |
|
Lumpp, Francesco | University of Verona |
Fummi, Franco | University of Verona |
Patel, Hiren | University of Waterloo |
Bombieri, Nicola | University of Verona |
Keywords: Software, Middleware and Programming Environments, Control Architectures and Programming, Autonomous Agents
Abstract: Containerization promises to strengthen platform-independent development, better resource utilization, and secure deployment of software. As these benefits come with negligible overhead in CPU and memory utilization, containerization is increasingly being adopted in mobile robotic applications. An open challenge is supporting software tasks that have mixed-criticality requirements. Even more challenging is the combination of real-time containers with orchestration, which is an emerging paradigm to automate the deployment, networking, scaling, and availability of containerized workloads and services. This paper addresses this challenge by presenting a framework that extends the de-facto reference standard for container orchestration, Kubernetes, to schedule tasks with mixed-criticality requirements. Quantitative experimental results on the software implementing the mission of a Robotnik RB-Kairos mobile robot demonstrate the effectiveness of the proposed approach. The source code is publicly available on GitHub.
|
|
11:00-11:10, Paper WeA-9.7 | |
GPU-Accelerated Incremental Euclidean Distance Transform for Online Motion Planning of Mobile Robots |
|
Chen, Yizhou | Chinese University of Hong Kong |
Lai, Shupeng | National University of Singapore |
Cui, Jinqiang | Peng Cheng Laboratory |
Wang, Biao | Nanjing University of Aeronautics and Astronautics |
Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Software Tools for Robot Programming, Mapping, Motion and Path Planning
Abstract: In this letter, we present a volumetric mapping system that effectively calculates Occupancy Grid Maps (OGMs) and Euclidean Distance Transforms (EDTs) with parallel computing. Unlike mappers aimed at high-precision structural reconstruction, our system incrementally constructs a global EDT and outputs high-frequency local distance information for online robot motion planning. The proposed system constructs the OGM from a massive amount of data from multiple types of sensor inputs. Using GPU programming techniques, the system quickly computes the EDT in parallel within a local volume. New observations are continuously integrated into the global EDT using the parallel wavefront algorithm while preserving the historical observations. Experiments with datasets have shown that our proposed approach outperforms existing state-of-the-art robot mapping systems and is particularly suitable for mapping unexplored areas. In its actual implementations on aerial and ground vehicles, the proposed system achieves real-time performance with limited onboard computational resources.
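The wavefront propagation at the heart of incremental EDT construction can be sketched sequentially: seed a queue at obstacle cells and relax neighbors with true Euclidean distances to their nearest obstacle. The CPU version below is the skeleton of what the paper runs in parallel on GPU; it is deliberately brute-force and 2-D.

```python
import numpy as np
from collections import deque

def wavefront_edt(occ):
    """BFS-style (4-connected) Euclidean distance transform of a boolean
    occupancy grid, propagating nearest-obstacle labels outward."""
    dist = np.full(occ.shape, np.inf)
    nearest = {}
    q = deque()
    for cell in zip(*np.nonzero(occ)):          # seed wavefront at obstacles
        dist[cell] = 0.0
        nearest[cell] = cell
        q.append(cell)
    while q:
        r, c = q.popleft()
        obs = nearest[(r, c)]
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < occ.shape[0] and 0 <= nc < occ.shape[1]:
                d = np.hypot(nr - obs[0], nc - obs[1])  # true Euclidean distance
                if d < dist[nr, nc]:
                    dist[nr, nc] = d
                    nearest[(nr, nc)] = obs
                    q.append((nr, nc))
    return dist

occ = np.zeros((8, 8), dtype=bool)
occ[3, 3] = True
print(np.round(wavefront_edt(occ), 1))
```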
|
|
11:10-11:20, Paper WeA-9.8 | |
Transactional Transform Library for ROS |
|
Ogiwara, Yushi | Keio University |
Yorozu, Ayanori | University of Tsukuba |
Ohya, Akihisa | University of Tsukuba |
Kawashima, Hideyuki | Keio University |
Keywords: Software Tools for Robot Programming, Software, Middleware and Programming Environments, Software Architecture for Robotic and Automation
Abstract: In the Robot Operating System (ROS), a major middleware for robots, the Transform Library (TF) is a mandatory package that manages transformation information between coordinate systems by using a single-rooted directed tree and providing methods for registering and computing the information. However, the tree has two fundamental problems. The first is its poor scalability: since it accepts only a single thread at a time due to using a single giant lock for mutual exclusion, access to the tree is sequential. The second is a lack of data freshness: it can return stale synthetic data when computing coordinate transformations because it prioritizes temporal consistency over data freshness. In this paper, we propose methods to solve these problems. First, we decentralize the giant lock to provide performance scalability and show that this results in a throughput 243 times higher than conventional TF on a read-only workload. Second, we design transactional methods based on serializable protocols that prevent anomalies, thus retrieving the freshest data. These transactional methods show freshness up to 1276 times higher than the conventional implementation on a read-write combined workload.
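The scalability fix can be pictured as replacing one tree-wide mutex with a lock per frame, so readers touching disjoint frames proceed in parallel. The toy sketch below shows only that decentralization; the paper's serializable read-write transactions for freshness are a separate mechanism not reproduced here.

```python
import threading

class Frame:
    """One TF-tree node with its own mutex (decentralized locking)."""
    def __init__(self, name):
        self.name = name
        self.transform = (0.0, 0.0, 0.0)   # toy (x, y, yaw)
        self.lock = threading.Lock()       # per-frame, not tree-wide

    def read(self):
        with self.lock:                    # readers of *other* frames proceed
            return self.transform

    def write(self, tf):
        with self.lock:
            self.transform = tf

tree = {name: Frame(name) for name in ("map", "odom", "base_link")}
tree["odom"].write((1.0, 2.0, 0.1))
print(tree["odom"].read())
```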
|
|
11:20-11:30, Paper WeA-9.9 | |
Arena-Bench: A Benchmarking Suite for Obstacle Avoidance Approaches in Highly Dynamic Environments |
|
Kästner, Linh | Technische Universität Berlin |
Buiyan, Teham | Technical University Berlin |
Le, Tuan Anh | Technical University of Berlin |
Treis, Elias | Technical University Berlin |
Kmiecik, Jacek | Technical University Berlin |
Carstens, Reyk | Technical University Berlin |
Cox, Johannes | Technical University of Berlin |
Pichel, Duc | Technical University Berlin |
Meinardus, Boris | Technical University Berlin |
Fatloun, Mohamad Bassel | Technische Universität Berlin |
Khorsandi, Niloufar | Technical University Berlin |
Lambrecht, Jens | Technische Universität Berlin |
Keywords: Software Tools for Benchmarking and Reproducibility, Motion and Path Planning, Collision Avoidance
Abstract: The ability to navigate safely and autonomously, especially within dynamic environments, is paramount for mobile robotics. In recent years, DRL approaches have shown superior performance in dynamic obstacle avoidance. However, these learning-based approaches are often developed in specially designed simulation environments and are hard to test against conventional planning approaches. Furthermore, the integration and deployment of these approaches into real robotic platforms are not yet completely solved. In this paper, we present Arena-Bench, a benchmark suite to train, test, and evaluate navigation planners on different robotic platforms within 3D environments. It provides tools to design and generate highly dynamic evaluation worlds, scenarios, and tasks for autonomous navigation and is fully integrated into the Robot Operating System. To demonstrate the functionalities of our suite, we trained a DRL agent on our platform and compared it against a range of existing model-based and learning-based navigation approaches on a variety of relevant metrics. Finally, we deployed the approaches onto real robots and demonstrated the reproducibility of the results. The code is publicly available at github.com/arena-rosnav-3D.
|
|
WeA-10 |
Rm10 (Room H) |
Wearable Robotics |
Regular session |
Chair: Kiguchi, Kazuo | Kyushu University |
Co-Chair: Funabora, Yuki | Nagoya University |
|
10:00-10:10, Paper WeA-10.1 | |
An Impedance-Controlled Testbed for Simulating Variations in the Mechanical Fit of Wearable Devices |
|
Ambrose, Alexander | Georgia Institute of Technology |
VanAtter, Chelse | Clemson University |
Hammond III, Frank L. | Georgia Institute of Technology |
Keywords: Wearable Robotics, Compliance and Impedance Control, Human Performance Augmentation
Abstract: The fit of a wearable device, such as a prosthesis, can be quantitatively characterized by the mechanical coupling at the user-device interface. It is thought that the mechanical impedance, specifically the stiffness and damping, of wearable device interfaces can significantly impact human performance while using them. To test this theory, we develop a forearm-mounted testbed with a motorized, two degree of freedom (2-DOF) gimbal to simulate variations in the mechanical fit of an upper-extremity wearable device during pointing and target tracking tasks. The two gimbal motors are impedance-controlled to vary the mechanical stiffness and damping between the user and the device’s laser pointer end-effector. In this paper, experiments are conducted to determine the torque constants of the motors before implementation in the testbed, and to validate the accuracy of the joint impedance controller. The completed impedance-controlled wearable interface testbed is validated further by comparing the gimbal joint displacements and torques, recorded during 2-DOF base excitation experiments, to MATLAB Simulink simulation data.
|
|
10:10-10:20, Paper WeA-10.2 | |
Human-Exoskeleton Cooperative Balance Strategy for a Human-Powered Augmentation Lower Exoskeleton |
|
Song, Guangkui | University of Electronic Science and Technology of China |
Huang, Rui | University of Electronic Science and Technology of China |
Peng, Zhinan | Unversity of Electronic Science and Tehcnology of China |
Shi, Kecheng | The School of Automation Engineering, University of Electronic S |
Zhang, Long | University of Electronic Science and Technology of China |
He, Rong | University of Electronic Science and Technology of China |
Qiu, Jing | University of Electronic Science and Technology of China |
Zhan, Huayi | Changhong AI Lab (CHAIR), Sichuan Changhong Electronics Holding |
Cheng, Hong | University of Electronic Science and Technology |
Keywords: Wearable Robotics, Physical Human-Robot Interaction, Rehabilitation Robotics
Abstract: Lower Limb Exoskeletons (LLEs) have received considerable interest in strength augmentation, rehabilitation, and walking assistance scenarios. For strength augmentation, the LLE is expected to be capable of reducing metabolic energy. However, the energy for adjusting the Center of Gravity (CoG) is the main part of the energy consumption during walking, especially when walking with loads. This paper proposes a novel Human-exoskeleton Cooperative Balance (HCB) strategy that endows the assistive torque with balance capability and combines it with the direction selected by the pilot to realize balanced walking of the human-exoskeleton system. In this strategy, a Dynamic Torque Primitive Model (DTPM) is designed to plan a bionic assistive torque, and a balance parameter obtained from an Inverted Pendulum Model (IPM) is superimposed on it. The improved balance performance breaks the limitations of traditional strategies and substantially increases the efficiency of assistance. We demonstrate the effectiveness of the proposed HCB strategy on the HUman-powered Augmentation Lower EXoskeleton (HUALEX) system. Experimental results indicate that the proposed HCB strategy is more efficient than traditional strategies.
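A standard quantity in inverted-pendulum balance reasoning is the capture point, the ground location at which the CoG would come to rest. Whether HCB uses exactly this term is our assumption; the snippet below only illustrates the kind of balance parameter an IPM yields.

```python
from math import sqrt

G = 9.81  # gravitational acceleration, m/s^2

def capture_point(x_com, xd_com, z_com):
    """Linear-inverted-pendulum capture point: where the support must be
    placed for the CoG to come to rest."""
    return x_com + xd_com * sqrt(z_com / G)

# CoG 1.0 m high, pushed forward at 0.4 m/s from above the stance foot:
print(f"required step offset: {capture_point(0.0, 0.4, 1.0):.3f} m")
```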
|
|
10:20-10:30, Paper WeA-10.3 | |
RANK - Robotic Ankle: Design and Testing on Irregular Terrains |
|
Taborri, Juri | University of Tuscia |
Mileti, Ilaria | University of Niccolò Cusano, Roma |
Mariani, Giovanni | University of Tuscia, Viterbo |
Mattioli, Luca | University of Sapienza, Roma |
Liguori, Lorenzo | University of Sapienza, Roma |
Salvatori, Stefano | University of Niccolò Cusano, Roma |
Palermo, Eduardo | Sapienza University of Rome |
Patanè, Fabrizio | University of Niccolò Cusano, Roma |
Rossi, Stefano | University of Tuscia |
Keywords: Wearable Robotics, Mechanism Design, Prosthetics and Exoskeletons
Abstract: Despite the large number of available exoskeletons, their use in daily life is still limited due to the absence of testing in real-life environments. This paper presents the design of a wearable ankle exoskeleton for walking assistance, tested on irregular terrains. Our RANK (Robotic ANKle) is equipped with a servomotor, and a four-bar linkage mechanism is used for torque transmission. An adjustable 3D-printed brace was realized in PC/ABS to connect the exoskeleton to the user's shank. The control system processes the output signals of three piezoresistive sensors placed on the insole. Assistive torque was provided during the swing phase in order to limit plantarflexion and avoid drop-foot. Two healthy male subjects were enrolled in the study. Experimental testing consisted of walking tasks performed on three different terrain conditions (flat, soft, and irregular) with and without the exoskeleton. Human kinematics was gathered via inertial measurement units (IMUs). The effects of the ankle exoskeleton on lower-limb joint angles were assessed in terms of range of motion (ROM). The statistical parametric mapping method was also applied to compare joint angle curves. As expected, a reduction of the ankle ROM in all terrain conditions was found between the trials performed with and without the exoskeleton. No effects on the hip and knee joints were observed. Moreover, no significant differences were observed over almost the entire gait cycle, independently of the terrain conditions. The results demonstrate the capability of the exoskeleton to work properly regardless of the type of walking surface.
|
|
10:30-10:40, Paper WeA-10.4 | |
A Wearable Smart Glove and Its Application of Pose and Gesture Detection to Sign Language Classification |
|
DelPreto, Joseph | Massachusetts Institute of Technology |
Hughes, Josie | EPFL |
D’Aria, Matteo | STMicroelectronics |
de Fazio, Marco | STMicroelectronics |
Rus, Daniela | MIT |
Keywords: Wearable Robotics, Gesture, Posture and Facial Expressions, Soft Sensors and Actuators
Abstract: Advances in soft sensors coupled with machine learning are enabling increasingly capable wearable systems. Since hand motion in particular can convey useful information for developing intuitive interfaces, glove-based systems can have a significant impact on many application areas. A key remaining challenge for wearables is to capture, process, and analyze data from the high-degree-of-freedom hand in real time. We propose using a commercially available conductive knit to create an unobtrusive network of resistive sensors that spans all hand joints, coupling this with an accelerometer, and deploying machine learning on a low-profile microcontroller to process and classify data. This yields a self-contained wearable device with rich sensing capabilities for hand pose and orientation, low fabrication time, and embedded activity prediction. To demonstrate its capabilities, we use it to detect static poses and dynamic gestures from American Sign Language (ASL). By pre-training a long short-term memory (LSTM) neural network and using tools to deploy it in an embedded context, the glove and an ST microcontroller can classify 12 ASL letters and 12 ASL words in real time. Using a leave-one-experiment-out cross validation methodology, networks successfully classify 96.3% of segmented examples and generate correct rolling predictions during 92.8% of real-time streaming trials.
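At the shape level, the pipeline maps windows of knit-resistance channels plus accelerometer axes to a class via an LSTM. A PyTorch sketch follows; channel counts, window length, and layer sizes are placeholders, not the paper's network.

```python
import torch
import torch.nn as nn

N_CHANNELS = 18   # e.g., 15 knit strain channels + 3 accelerometer axes (assumed)
N_CLASSES = 24    # 12 ASL letters + 12 ASL words
WINDOW = 50       # timesteps per classified segment (assumed)

class GloveLSTM(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(N_CHANNELS, hidden, batch_first=True)
        self.head = nn.Linear(hidden, N_CLASSES)

    def forward(self, x):            # x: (batch, WINDOW, N_CHANNELS)
        _, (h, _) = self.lstm(x)     # h: (num_layers, batch, hidden)
        return self.head(h[-1])      # class logits from the final hidden state

model = GloveLSTM()
logits = model(torch.randn(4, WINDOW, N_CHANNELS))
print(logits.shape)                  # torch.Size([4, 24])
```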
|
|
10:40-10:50, Paper WeA-10.5 | |
A Soft Fabric-Based Shrink-To-Fit Pneumatic Sleeve for Comfortable Limb Assistance |
|
Diteesawat, Richard Suphapol | University of Bristol |
Hoh, Sam | University of Bristol |
Pulvirenti, Emanuele | Bristol Robotics Laboratory |
Rahman, Nahian | Georgia Institute of Technology |
Morris, Leah | University of the West of England |
Turton, A.J. | Bristol Robotics Laboratory |
Cramp, Mary | Department of Allied Health Professions and Associate Professor, |
Rossiter, Jonathan | University of Bristol |
Keywords: Wearable Robotics, Prosthetics and Exoskeletons, Hydraulic/Pneumatic Actuators
Abstract: Upper limb impairments and weakness are common post-stroke and with advanced aging. Rigid exoskeletons have been developed as a potential solution, but have had limited impact. In addition to user concerns about safety, their weight and appearance, the rigid attachment and typical anchoring methods can result in skin damage. In this paper, we present a soft, fabric-based pneumatic sleeve, which can shrink from a loose fit to a tight fit in order to anchor to the limbs temporarily, thereby enabling the application of mechanical assistance only when needed. The sleeve is comfortable, ergonomic and can be embedded unobtrusively within clothing. A mathematical model is built to simulate and design sleeves with different geometric parameters. The best sleeve was capable of generating a friction force of 98 N on the limb when inflated to 25 kPa. This sleeve was used to create a wearable assistive device, integrated with a cable-driven actuator. This device was able to lift a 1.44 kg forearm rig up to 95 degrees at a low pressure of 20 kPa. The device was tested with six healthy participants, in terms of fit, comfort and assistive functionality. The average acceptable sleeve pressure was found to be 33±4.7 kPa. All participants liked the appearance of the sleeve, with a high average perceived assistance score of 7.33±1.6 (out of 10). The shrink-to-fit sleeve is expected to significantly increase the development and adoption of soft robotic assistive devices and emerging powered clothing.
|
|
10:50-11:00, Paper WeA-10.6 | |
Ring-Pull Type Soft Wearable Robotic Glove for Hand Strength Assistance |
|
Yang, Junmo | Daegu Gyeongbuk Institute of Science and Technology (DGIST) |
Kim, Donghyun | Daegu Gyeongbuk Institute of Science and Technology |
Yoon, Jingon | DGIST |
Kim, Jisu | Daegu Gyeongbuk Institute of Science and Technology (DGIST) |
Yun, Dongwon | Daegu Gyeongbuk Institute of Science and Technology (DGIST) |
Keywords: Wearable Robotics, Human Performance Augmentation, Rehabilitation Robotics
Abstract: This paper proposes and verifies a new method, the ring-pull mechanism, to overcome the disadvantages of existing wearable robotic gloves. By attaching a ring to the metacarpophalangeal (MCP) joint of the finger, the ring-pull mechanism supplements the grasping force of the user while reducing the weight of the entire wearable robotic glove system. Ring-pull mechanism experiments were conducted to determine which finger combinations had the most positive effect on muscle strength assistance, and through this, the Ring-Pull type Soft Glove (RPSG) was developed. The main body of the developed RPSG is composed of a single soft silicone polymer and uses tendon-driven actuation. The tendon path is secured through a tube attached to the palm that matches the direction of the flexor digitorum superficialis (FDS). The new type of wearable robotic glove was manufactured with the proposed mechanism, and excellent fit and strength-support effects were confirmed. The RPSG increased the subject's grasping force by 25.69% on average, and the %MVIC data analysis demonstrated that the activation of the FDS decreased by about 23.51%. As a result, it was confirmed that the user's muscle efficiency was increased due to the muscle support and muscle function improvement provided by the RPSG.
|
|
11:00-11:10, Paper WeA-10.7 | |
Kinematics-Based Adaptive Assistance of a Semi-Passive Upper-Limb Exoskeleton for Workers in Static and Dynamic Tasks |
|
Grazi, Lorenzo | Scuola Superiore Sant'Anna |
Trigili, Emilio | Scuola Superiore Sant'Anna |
Caloi, Noemi | University of Pisa and Scuola Superiore Sant'Anna |
Ramella, Giulia | Scuola Superiore Sant'Anna |
Giovacchini, Francesco | IUVO S.r.l |
Vitiello, Nicola | Scuola Superiore Sant Anna |
Crea, Simona | Scuola Superiore Sant'Anna, the BioRobotics Institute |
Keywords: Wearable Robotics, Physically Assistive Devices, Human Performance Augmentation
Abstract: Typical industrial work activities may include a variety of different gestures, entailing the execution of dynamic and static movements. Occupational upper-limb exoskeletons can assist the shoulder complex in both static and dynamic gestures, but the required assistance level may be different according to the tasks. This article presents the design, development, and experimental evaluation of a novel kinematics-based adaptive assistance algorithm for a semi-passive upper-limb exoskeleton. The algorithm uses kinematic signals gathered by onboard sensors to set the assistance amplitude according to the type of movement being executed. Experimental activities were performed to assess the algorithm’s performance. Results show that the algorithm can effectively provide different assistance levels according to the type of task being executed, such as the minimum level for more dynamic tasks and the maximum level for the most static activities. Additionally, compared to working without the exoskeleton, the exoskeleton controlled by the proposed adaptive algorithm can reduce the users’ flexor muscular activity in both dynamic and static tasks, respectively by 24 ± 6% and 42 ± 2%. Similar results were reported for extensor muscles, which reduced their activations by 7 ± 3%, and 40 ± 4% in dynamic and static tasks.
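The adaptation idea can be caricatured as: measure how fast the shoulder is moving over a recent window, then blend assistance between a maximum (static holding) and a minimum (dynamic gestures). The thresholds and blending law below are invented for illustration and are not the paper's algorithm.

```python
import numpy as np

def assistance_level(omega_window, a_min=0.2, a_max=1.0, w_ref=1.0):
    """Low shoulder angular speed (static holding) -> maximum assistance;
    fast gestures -> minimum. w_ref, a_min, a_max are made-up constants."""
    w = float(np.sqrt(np.mean(np.square(omega_window))))  # RMS speed, rad/s
    blend = min(w / w_ref, 1.0)
    return a_max - (a_max - a_min) * blend

print(assistance_level(np.full(100, 0.05)))   # static hold -> high assistance
print(assistance_level(np.full(100, 2.0)))    # dynamic task -> low assistance
```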
|
|
11:10-11:20, Paper WeA-10.8 | |
Reconfigurable Self-Sensing Pneumatic Artificial Muscle with Locking Ability Based on Modular Multi-Chamber Soft Actuator |
|
Liu, Jianbin | Key Laboratory of Mechanism Theory and Equipment Design, Ministr |
Ma, Zhuo | Tianjin University |
Wang, Yingxue | Tianjin University |
Zuo, Siyang | Tianjin University |
Keywords: Wearable Robotics, Soft Sensors and Actuators, Hydraulic/Pneumatic Actuators
Abstract: Traditional pneumatic artificial muscles (PAMs) cannot fully satisfy the requirements of wearable applications for safe and sufficient interaction with the human body. These requirements include a high contraction ratio, self-contained sensing, reconfiguration and locking abilities, and no squeezing force exerted on human tissue. In this paper, a reconfigurable self-sensing pneumatic artificial muscle (RSPAM) based on a modular multi-chamber soft actuator with locking ability is developed, which provides an alternative that can satisfy all of these requirements. The contraction principle of the RSPAM, which transforms the expansion of the soft actuator into contraction by fabric winding, offers an inherently high contraction ratio (detailed in the Appendix) and no squeezing of human tissue at all. Self-sensing of the contraction stroke based on liquid metal and a locking ability based on positive-pressure jamming are integrated. A driving force of 70.14 N and a contraction ratio of 71% were validated at 3 bar air pressure. The modular design of the actuator makes it possible to change the configuration by adjusting the number of actuators and the fabric length according to the application. In addition, the actuator unit can move along the fabric by itself, giving the muscle an interesting self-adjustment ability. The displacement self-sensing ability is verified through square-wave tracking experiments with closed-loop control. Finally, a preliminary test of assisting elbow movement verifies that the RSPAM can reduce muscle fatigue by 25.52%.
|
|
11:20-11:30, Paper WeA-10.9 | |
Design of a Wearable Mechanism with Shape Memory Alloy (SMA)-Based Artificial Muscle for Assisting with Shoulder Abduction |
|
Hyeon, Kyujin | KAIST |
Jeong, Jaeyeon | Korea Advanced Institute of Science and Technology |
Chung, Chongyoung | Korea Advanced Institute of Science and Technology (KAIST) |
Cho, Minjae | KAIST |
Hussain, Sajjad | Korea Advanced Institute of Science and Technology |
Kyung, Ki-Uk | Korea Advanced Institute of Science & Technology (KAIST) |
Keywords: Wearable Robotics, Rehabilitation Robotics, Soft Robot Applications
Abstract: This paper proposes a new mechanism, a four-bar linkage-based support hinge mechanism, for assisting with shoulder abduction with artificial muscle based on a shape memory alloy (SMA). An artificial muscle using the SMA coils is designed to lighten the entire system and support the wearer’s movement both actively and passively. It can generate up to 273 N and 180 N with and without energy input, respectively, while weighing only 0.04 kg. Furthermore, to consider the rotation axis shifts of the arm during shoulder abduction, the trajectory of the arm along the shoulder abduction is modeled using the scapulohumeral rhythm, a combined movement of the scapula and humerus. The mechanism is designed to follow the modeled arm trajectory and achieve the required torque to perform shoulder abduction based on a four-bar linkage mechanism. It can generate up to 10.1 Nm and 6.3 Nm of torque with and without energy input, respectively. To verify the assistive effect of the proposed mechanism, electromyography is measured while performing the same exercise requiring shoulder abduction with and without the support of the mechanism. The results show that the proposed mechanism reduces not only muscle load but also fatigue while performing the shoulder abduction.
|
|
WeA-11 |
Rm11 (Room I) |
Intention Recognition |
Regular session |
Chair: Antonello, Morris | Five AI Ltd |
Co-Chair: Belardinelli, Anna | Honda |
|
10:00-10:10, Paper WeA-11.1 | |
Intention Estimation from Gaze and Motion Features for Human-Robot Shared-Control Object Manipulation |
|
Belardinelli, Anna | Honda |
Kondapally-Reddy, Anirudh | Honda R&D Innovative Research Excellence |
Ruiken, Dirk | Honda Research Institute Europe |
Tanneberg, Daniel | Honda Research Institute |
Watabe, Tomoki | Honda R&D Co., Ltd |
Keywords: Intention Recognition, Telerobotics and Teleoperation, Human-Robot Collaboration
Abstract: Shared control can help in teleoperated object manipulation by assisting with the execution of the user’s intention. To this end, robust and prompt intention estimation is needed, which relies on behavioral observations. Here, an intention estimation framework is presented, which uses natural gaze and motion features to predict the current action and the target object. The system is trained and tested in a simulated environment with pick and place sequences produced in a relatively cluttered scene and with both hands, with possible hand-over to the other hand. Validation is conducted across different users and hands, achieving good accuracy and earliness of prediction. An analysis of the predictive power of single features shows the predominance of the grasping trigger and the gaze features in the early identification of the current action. In the current framework, the same probabilistic model can be used for the two hands working in parallel and independently, while a rule-based model is proposed to identify the resulting bimanual action. Finally, limitations and perspectives of this approach to more complex, full-bimanual manipulations are discussed.
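A minimal way to fuse gaze and motion cues into a target belief is a naive-Bayes product of per-object likelihoods. The sketch below is only a caricature of such fusion; the paper's probabilistic model and feature set (grasping trigger, gaze features, bimanual handling) are richer, and the likelihood forms here are assumptions.

```python
import numpy as np

def target_posterior(gaze_dwell, hand_dist, sigma=0.15):
    """Combine a gaze cue (fraction of recent fixations on each object)
    with a motion cue (hand-to-object distance, m) into a posterior over
    candidate target objects. Likelihood forms and sigma are assumed."""
    gaze_lik = np.asarray(gaze_dwell) + 1e-3                  # dwell as likelihood
    motion_lik = np.exp(-np.asarray(hand_dist) ** 2 / (2 * sigma ** 2))
    post = gaze_lik * motion_lik
    return post / post.sum()

# Three candidate objects: gaze mostly on object 0, hand nearest object 1.
print(target_posterior([0.7, 0.2, 0.1], [0.30, 0.12, 0.40]))
```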
|
|
10:10-10:20, Paper WeA-11.2 | |
Disentangled Sequence Clustering for Human Intention Inference |
|
Zolotas, Mark | Northeastern University |
Demiris, Yiannis | Imperial College London |
Keywords: Intention Recognition, Representation Learning, Deep Learning Methods
Abstract: Equipping robots with the ability to infer human intent is a vital precondition for effective collaboration. Most computational approaches towards this objective derive a probability distribution of "intent" conditioned on the robot's perceived state. However, these approaches typically assume task-specific labels of human intent are known a priori. To overcome this constraint, we propose the Disentangled Sequence Clustering Variational Autoencoder (DiSCVAE), a clustering framework capable of learning such a distribution of intent in an unsupervised manner. The proposed framework leverages recent advances in unsupervised learning to disentangle latent representations of sequence data, separating time-varying local features from time-invariant global attributes. As a novel extension, the DiSCVAE also infers a discrete variable to form a latent mixture model and thus enable clustering over these global sequence concepts, e.g. high-level intentions. We evaluate the DiSCVAE on a real-world human-robot interaction dataset collected using a robotic wheelchair. Our findings reveal that the inferred discrete variable coincides with human intent, holding promise for collaborative settings, such as shared control.
|
|
10:20-10:30, Paper WeA-11.3 | |
Personalized Estimation of Intended Gait Speed for Lower-Limb Exoskeleton Users Via Data Augmentation Using Mutual Information |
|
Karulkar, Roopak M. | University of Notre Dame |
Wensing, Patrick M. | University of Notre Dame |
Keywords: Intention Recognition, Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: This letter presents a method for data-driven user-specific gait speed estimation for people with Spinal Cord Injuries (SCIs) walking in lower-limb exoskeletons. The scarcity of training data for this population is addressed by leveraging common patterns across users that relate gait changes to speed changes. To bootstrap the process, widely available walking data from uninjured individuals was used as a base dataset. The distribution of this data was first transformed to match smaller user-specific training sets from walking trials of subjects with SCIs. User-specific trials were then selected based on the mutual information between gait speed and features for the combined dataset. The resulting selected data was finally used to build a model for estimating the user's intended gait speed. The performance of this approach was evaluated using data from two users with SCIs walking in an EksoGT exoskeleton with a walker or crutches. Estimation trials were compared when using the base data alone versus when providing personalization via the addition of novel data. The average successful estimation of speed-up and slow-down changes increased from 52% to 67% with personalization using only 8 to 12 steps' worth of user-specific data, with a best-case improvement of 32%, from 48% to 80%. Overall, the proposed method uses the mutual information between gait features and speed to provide a reliable alternative to manual data selection while pooling data from healthy and injured individuals.
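The selection signal is the mutual information between candidate gait features and gait speed. The snippet below shows that ranking on synthetic data with scikit-learn; the feature names and data are fabricated, and the paper applies MI to trial selection rather than to this toy regression.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 300
speed = rng.uniform(0.2, 0.8, n)                      # gait speed, m/s
features = np.column_stack([
    0.6 * speed + 0.05 * rng.normal(size=n),          # stride-length proxy
    1.2 - 0.5 * speed + 0.05 * rng.normal(size=n),    # stance-time proxy
    rng.normal(size=n),                               # uninformative channel
])
mi = mutual_info_regression(features, speed, random_state=0)
for name, score in zip(["stride_len", "stance_time", "noise"], mi):
    print(f"{name:12s} MI = {score:.3f}")             # informative features rank first
```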
|
|
10:30-10:40, Paper WeA-11.4 | |
Flash: Fast and Light Motion Prediction for Autonomous Driving with Bayesian Inverse Planning and Learned Motion Profiles |
|
Antonello, Morris | Five AI Ltd |
Dobre, Mihai | FiveAI |
Albrecht, Stefano V. | University of Edinburgh |
Redford, John | Five AI Ltd |
Ramamoorthy, Subramanian | The University of Edinburgh |
Keywords: Intention Recognition, Probabilistic Inference, Motion and Path Planning
Abstract: Motion prediction of road users in traffic scenes is critical for autonomous driving systems that must take safe and robust decisions in complex dynamic environments. We present a novel motion prediction system for autonomous driving. Our system is based on the Bayesian inverse planning framework, which efficiently orchestrates map-based goal extraction, a classical control-based trajectory generator and a mixture of experts collection of light-weight neural networks specialised in motion profile prediction. In contrast to many alternative methods, this modularity helps isolate performance factors and better interpret results, without compromising performance. This system addresses multiple aspects of interest, namely multi-modality, motion profile uncertainty and trajectory physical feasibility. We report on several experiments with the popular highway dataset NGSIM, demonstrating state-of-the-art performance in terms of trajectory error. We also perform a detailed analysis of our system's components, along with experiments that stratify the data based on behaviours, such as change-lane versus follow-lane, to provide insights into the challenges in this domain. Finally, we present a qualitative analysis to show other benefits of our approach, such as the ability to interpret the outputs.
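Bayesian inverse planning scores each candidate goal by how nearly optimal the observed partial trajectory is for it. A textbook form of that posterior is sketched below with made-up costs; it conveys the framework the system builds on, not its implementation.

```python
import numpy as np

def goal_posterior(costs_so_far, costs_to_go, full_costs, beta=1.0):
    """Approximately rational agent: P(goal | traj) is proportional to
    exp(-beta * (c(start->now) + c(now->goal))) / exp(-beta * c(start->goal)).
    beta sets how rational the driver is assumed to be."""
    loglik = -beta * (costs_so_far + costs_to_go - full_costs)
    post = np.exp(loglik - loglik.max())      # subtract max for stability
    return post / post.sum()

# Three candidate goals extracted from the map (all costs illustrative):
print(goal_posterior(np.array([2.0, 2.0, 2.0]),     # cost of observed prefix
                     np.array([1.0, 3.0, 6.0]),     # optimal cost to each goal
                     np.array([2.8, 4.2, 7.5])))    # optimal start-to-goal cost
```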
|
|
10:40-10:50, Paper WeA-11.5 | |
MPC-PF: Social Interaction Aware Trajectory Prediction of Dynamic Objects for Autonomous Driving Using Potential Fields |
|
Bhatt, Neel P. | University of Waterloo |
Khajepour, Amir | University of Waterloo |
Hashemi, Ehsan | University of Alberta |
Keywords: Intention Recognition, Intelligent Transportation Systems, Agent-Based Systems
Abstract: Predicting object motion behaviour is a challenging but crucial task for safe decision making and path planning for an autonomous vehicle. It is challenging in large part due to the uncertain, multi-modal, and practically intractable set of possible human-human and human-space interactions, especially in urban driving settings. Models based solely on constant velocity or social force have an inherent bias and may lead to inaccurate predictions across the prediction horizon, whereas purely data-driven approaches suffer from a lack of a holistic set of rules governing predictions. We tackle this problem by introducing MPC-PF: a novel potential field-based trajectory predictor that incorporates social interaction and is able to trade off between inherent model biases across the prediction horizon. Through evaluation on a variety of common urban driving scenarios, we show that our model is capable of producing accurate predictions for both short and long term timesteps. We also demonstrate the significance of our model architecture through an ablation study.
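The base ingredient is a classical attractive/repulsive potential field. The NumPy sketch below computes its gradient and rolls a point toward a goal past one obstacle; gains, influence radius, and step size are made up, and the paper's social-interaction terms and MPC coupling are not reproduced.

```python
import numpy as np

def pf_gradient(p, goal, obstacles, k_att=1.0, k_rep=0.5, d0=2.0):
    """Gradient of a quadratic attractive potential plus standard
    finite-range repulsive potentials (all gains illustrative)."""
    grad = k_att * (p - goal)                       # attractor toward goal
    for obs in obstacles:
        d = np.linalg.norm(p - obs)
        if 1e-6 < d < d0:                           # finite influence range
            grad += k_rep * (1 / d0 - 1 / d) / d ** 3 * (p - obs)
    return grad

p = np.array([0.0, 0.0])
goal = np.array([10.0, 0.0])
obstacles = [np.array([4.0, 0.5])]
for _ in range(5):                                  # gradient-descent steps
    p -= 0.2 * pf_gradient(p, goal, obstacles)
print(np.round(p, 2))                               # deflected around the obstacle
```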
|
|
10:50-11:00, Paper WeA-11.6 | |
Optimization of Forcemyography Sensor Placement for Arm Movement Recognition |
|
Xu, Xiaohao | Huazhong University of Science and Technology, State Key Laborat |
Du, Zihao | Huazhong University of Science and Technology |
Zhang, HuaXin | Huazhong University of Science and Technology |
Zhang, Ruichao | Huazhong University of Science and Technology |
Hong, Zihan | Huazhong University of Science and Technology |
Huang, Qin | Huazhong University of Science and Technology |
Han, Bin | Huazhong University of Science and Technology |
Keywords: Intention Recognition, Datasets for Human Motion, Human-Robot Collaboration
Abstract: How to design an optimal wearable device for human movement recognition is vital to reliable and accurate human-machine collaboration. Previous works mainly fabricate wearable devices heuristically. Instead, this paper raises an academic question: can we design an optimization algorithm to optimize the fabrication of wearable devices, such as figuring out the best sensor arrangement automatically? Specifically, this work focuses on optimizing the placement of Forcemyography (FMG) sensors for FMG armbands in the application of arm movement recognition. Firstly, based on graph theory, the armband is modeled considering sensors' signals and connectivity. Then, a Graph-based Armband Modeling Network (GAM-Net) is introduced for arm movement recognition. Afterward, the sensor placement optimization for FMG armbands is formulated and an optimization algorithm with greedy local search is proposed. To study the effectiveness of our optimization algorithm, a dataset for mechanical maintenance tasks using FMG armbands with 16 sensors is collected. Our experiments show that using only 4 sensors optimized with our algorithm can maintain a recognition accuracy comparable to using all sensors. Finally, the optimized sensor placement result is verified from a physiological view. We hope this work sheds light on the automatic fabrication of wearable devices in view of downstream tasks, such as human biological signal collection and movement recognition.
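The optimization can be caricatured as greedy subset selection: repeatedly add the sensor whose inclusion most improves a task score. The paper's method is a greedy local search over a graph model of the armband; the simpler forward-selection sketch below, with a fabricated scoring function, only conveys the greedy flavor.

```python
def greedy_sensor_selection(all_sensors, score_fn, budget=4):
    """Greedy forward selection: grow the chosen set one sensor at a time,
    always taking the sensor that maximizes score_fn (e.g., cross-validated
    movement-recognition accuracy; here a toy surrogate)."""
    chosen = []
    while len(chosen) < budget:
        best = max((s for s in all_sensors if s not in chosen),
                   key=lambda s: score_fn(chosen + [s]))
        chosen.append(best)
    return chosen

# Toy score: per-sensor utility with a mild penalty for larger subsets.
utility = {0: .30, 1: .05, 2: .25, 3: .05, 4: .28, 5: .04, 6: .22, 7: .03}
score = lambda subset: sum(utility[s] for s in subset) - 0.01 * len(subset) ** 2
print(greedy_sensor_selection(range(8), score, budget=4))
```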
|
|
11:00-11:10, Paper WeA-11.7 | |
Pedestrian Intention Prediction Based on Traffic-Aware Scene Graph Model |
|
Song, Xingchen | Xi'an Jiaotong University |
Kang, Miao | Xi’an Jiaotong University |
Zhou, Sanping | Xi’an JIaotong University |
Wang, Jianji | Xi'an Jiaotong University |
Mao, Yishu | Xi'an Jiaotong University |
Zheng, Nanning | Xi'an Jiaotong University |
Keywords: Intention Recognition, Computer Vision for Transportation, Intelligent Transportation Systems
Abstract: Anticipating the future behavior of pedestrians is a crucial part of deploying Automated Driving Systems (ADS) in urban traffic scenarios. Most recent works utilize a convolutional neural network (CNN) to extract visual information, which is then input to a recurrent neural network (RNN) along with pedestrian-specific features like location and speed to obtain temporal features. However, the majority of these approaches lack the ability to parse the relationships of the related objects in the specific traffic scene, leading them to omit both the interactions among pedestrians and the interactions between pedestrians and the traffic. To address this, we propose a graph-structured model which can dig out pedestrians' dynamic constraints by constructing a traffic-aware scene graph within each frame. In addition, to capture pedestrian movement more effectively, we also introduce a temporal feature representation model, which first uses inter-frame and intra-frame GRU (II-GRU) to mine inter-frame information and intra-frame information together, and then employs a novel attention mechanism to adaptively generate attention weights. Extensive experiments on the JAAD and PIE datasets demonstrate that our proposed model reaches and improves upon state-of-the-art performance.
|
|
11:10-11:20, Paper WeA-11.8 | |
Social-PatteRNN: Socially-Aware Trajectory Prediction Guided by Motion Patterns |
|
Navarro, Ingrid | Carnegie Mellon University |
Oh, Jean | Carnegie Mellon University |
Keywords: Intention Recognition, AI-Enabled Robotics, Human-Centered Robotics
Abstract: As robots across domains start collaborating with humans in shared environments, algorithms that enable them to reason over human intent are important to achieve safe interplay. In our work, we study human intent through the problem of predicting trajectories in dynamic environments. We explore domains where navigation guidelines are relatively strictly defined but not clearly marked in their physical environments. We hypothesize that within these domains, agents tend to exhibit short-term motion patterns that reveal context information related to the agent's general direction, intermediate goals and rules of motion, e.g., social behavior. From this intuition, we propose Social-PatteRNN, an algorithm for recurrent, multi-modal trajectory prediction that exploits motion patterns to encode the aforesaid contexts. Our approach guides long-term trajectory prediction by learning to predict short-term motion patterns. It then extracts sub-goal information from the patterns and aggregates it as social context. We assess our approach across three domains: human crowds, humans in sports and manned aircraft in terminal airspace, achieving state-of-the-art performance.
|
|
11:20-11:30, Paper WeA-11.9 | |
A Hierarchical Deliberative Architecture Framework Based on Goal Decomposition |
|
Lesire, Charles | ONERA/DTIS, University of Toulouse |
Bailon-Ruiz, Rafael | ONERA |
Barbier, Magali | ONERA |
Grand, Christophe | ONERA |
Keywords: Control Architectures and Programming, Planning, Scheduling and Coordination, Engineering for Robotic Systems
Abstract: Performing a complex autonomous mission with a multi-robot system requires to integrate several deliberative approaches to perform task allocation, optimization, and execution control. Implementing such a deliberative architecture is a complex task: it requires the developer to master the decision algorithms themselves (e.g., automated planning models), to have a good knowledge of the involved robotic platforms, and to think about how these elements will be assembled as a system architecture. We propose a framework to help designing such deliberative architectures. The framework relies on the concept of a hierarchical structure of actors, each actor managing goals with specific planning or optimization approaches, and delegating sub-goals to other actors.
|
|
WeA-12 |
Rm12 (Room J) |
Semantic Scene Understanding 1 |
Regular session |
Chair: Chen, Liming | Ecole Centrale De Lyon |
Co-Chair: Mae, Yasushi | Kansai University |
|
10:00-10:10, Paper WeA-12.1 | |
SynWoodScape: Synthetic Surround-View Fisheye Camera Dataset for Autonomous Driving |
|
Sekkat, Ahmed Rida | LITIS Lab, Université De Rouen Normandie |
Dupuis, Yohan | CESI |
Ravi Kumar, Varun | Qualcomm |
Rashed, Hazem | Valeo |
Yogamani, Senthil | Valeo Vision Systems |
Vasseur, Pascal | Université De Picardie Jules Verne |
Honeine, Paul | Université De Rouen Normandie |
Keywords: Omnidirectional Vision, Data Sets for Robotic Vision, Semantic Scene Understanding
Abstract: Surround-view cameras are a primary sensor for automated driving, used for near-field perception. They are among the most commonly used sensors in commercial vehicles, primarily for parking visualization and automated parking. Four fisheye cameras with a 190° field of view cover the 360° around the vehicle. Due to their high radial distortion, standard algorithms do not extend easily. Previously, we released the first public fisheye surround-view dataset named WoodScape. In this work, we release a synthetic version of the surround-view dataset, covering many of its weaknesses and extending it. Firstly, it is not possible to obtain ground truth for pixel-wise optical flow and depth. Secondly, WoodScape did not have all four cameras annotated simultaneously in order to sample diverse frames. However, this means that multi-camera algorithms cannot be designed to obtain a unified output in birds-eye space, which is enabled in the new dataset. We implemented surround-view fisheye geometric projections in the CARLA Simulator matching WoodScape’s configuration and created SynWoodScape. We release 80k images from the synthetic dataset with annotations for 10+ tasks. We also release the baseline code and supporting scripts.
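Fisheye geometry is the dataset's defining trait. For intuition, here is the simple equidistant projection model r = f·θ; WoodScape's real calibrations use a higher-order polynomial in θ, and the intrinsics below are placeholders.

```python
import numpy as np

def fisheye_project(points, f=320.0, cx=640.0, cy=480.0):
    """Equidistant fisheye model: image radius r = f * theta, where theta
    is the angle from the optical axis. Intrinsics are placeholders."""
    P = np.atleast_2d(np.asarray(points, dtype=float))
    theta = np.arccos(P[:, 2] / np.linalg.norm(P, axis=1))
    phi = np.arctan2(P[:, 1], P[:, 0])
    r = f * theta
    return np.column_stack([cx + r * np.cos(phi), cy + r * np.sin(phi)])

# A point near the axis and one almost 90 degrees off-axis both project:
print(fisheye_project([[0.5, 0.0, 1.0], [2.0, 0.0, 0.1]]))
```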
|
|
10:10-10:20, Paper WeA-12.2 | |
Accurate Instance-Level CAD Model Retrieval in a Large-Scale Database |
|
Wei, Jiaxin | ShanghaiTech University |
Hu, Lan | ShanghaiTech University |
Wang, Chenyu | ShanghaiTech University |
Kneip, Laurent | ShanghaiTech University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, RGB-D Perception
Abstract: We present a new solution to the fine-grained retrieval of clean CAD models from a large-scale database in order to recover detailed object shape geometries for RGBD scans. Unlike previous work simply indexing into a moderately small database using an object shape descriptor and accepting the top retrieval result, we argue that in the case of a large-scale database a more accurate model may be found within a neighborhood of the descriptor. More importantly, we propose that the distinctiveness deficiency of shape descriptors at the instance level can be compensated by a geometry-based re-ranking of its neighborhood. Our approach first leverages the discriminative power of learned representations to distinguish between different categories of models and then uses a novel robust point set distance metric to re-rank the CAD neighborhood, enabling fine-grained retrieval in a large shape database. Evaluation on a real-world dataset shows that our geometry-based re-ranking is a conceptually simple but highly effective method that can lead to a significant improvement in retrieval accuracy compared to the state-of-the-art.
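A hedged sketch of the two-stage retrieval idea follows, with a symmetric chamfer distance standing in for the paper's more elaborate robust point-set metric; all names are illustrative.

```python
# Step 1: coarse kNN in descriptor space; step 2: geometric re-ranking of
# the descriptor neighborhood against the query scan's point cloud.
import numpy as np
from scipy.spatial import cKDTree

def chamfer(a, b):
    """Symmetric chamfer distance between point sets a (N,3) and b (M,3)."""
    return cKDTree(b).query(a)[0].mean() + cKDTree(a).query(b)[0].mean()

def retrieve(query_desc, query_pts, db_descs, db_pts, k=10):
    # Coarse k-nearest-neighbour search in learned descriptor space.
    nbrs = np.argsort(np.linalg.norm(db_descs - query_desc, axis=1))[:k]
    # Re-rank the neighbourhood by geometric distance to the scan.
    return min(nbrs, key=lambda i: chamfer(query_pts, db_pts[i]))
```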
|
|
10:20-10:30, Paper WeA-12.3 | |
Low-Latency LiDAR Semantic Segmentation |
|
Hori, Takahiro | University of Tokyo |
Yairi, Takehisa | University of Tokyo |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Recognition
Abstract: Several methods of semantic segmentation using light detection and ranging (LiDAR) sensors have been proposed for the recognition of surrounding objects by autonomous cars. LiDAR is a sensor that compensates for the weaknesses of other sensors, such as cameras or radar systems, and semantic segmentation assigns a class label to each point in the LiDAR point cloud. Recently, real-time semantic segmentation methods that are capable of processing LiDAR point clouds at the sensor frame rate have been proposed. Real-time semantic segmentation is essential for autonomous driving systems because it can output class labels for LiDAR point clouds at high speed. However, such segmentation still suffers from a delay equal to the processing time. To address this challenge, we propose a novel method that combines SalsaNext, a real-time LiDAR semantic segmentation method, with semantic forecasting, which predicts the results of future semantic segmentation. We quantitatively evaluate our method on the SemanticKITTI dataset, which comprises point cloud data acquired from a LiDAR sensor in the real world, and compare the latency and accuracy of our method with other semantic segmentation methods. Our method is found to operate in real time with low latency, and it achieves performance similar to that of previously reported real-time semantic segmentation methods.
|
|
10:30-10:40, Paper WeA-12.4 | |
Implicit-Part Based Context Aggregation for Point Cloud Instance Segmentation |
|
Wu, Xiaodong | Institute of Computing Technology (ICT), Chinese Academy of Sciences |
Wang, Ruiping | Institute of Computing Technology, Chinese Academy of Sciences |
Chen, Xilin | Institute of Computing Technology, Chinese Academy of Sciences |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization
Abstract: Context information is important for instance segmentation on point clouds. Existing methods either use only local surroundings by stacking multiple convolution layers or use non-local methods to model long-range interactions. However, they usually operate directly on points, an unstructured and low-level representation that is highly context-dependent. To address this issue, we propose an effective framework named Implicit-Part Context Aggregation (IPCA), which adopts implicit parts as an intermediate representation and achieves context aggregation through message passing along the implicit part graph. Specifically, we first organize unstructured points into geometrically consistent implicit parts and construct the implicit part graph according to geometric adjacency. Then, an initial part embedding is extracted using the proposed Implicit Part Network (IPN), which can aggregate point features and capture the intrinsic geometric shape of the part. We further refine the part embedding with a graph reasoning module named Context Aggregation Network (CAN), which helps make a more precise prediction by exploiting the context information well. Instance proposals are then generated by grouping implicit parts. Finally, we propose an additional step that passes the entire instance proposal to a Semantic Criterion Net (SCN) to infer the semantics of the instance; the purpose is to correct semantic prediction errors caused by not knowing the boundary and overall shape of the object in the previous steps. Extensive experiments on two large datasets, ScanNet and 3RScan, demonstrate the effectiveness of our method. It outperforms all existing methods on the ScanNet test benchmark, and its AP@50 is 9.5 points higher than the baseline.
|
|
10:40-10:50, Paper WeA-12.5 | |
Unsupervised Domain Adaptation for Point Cloud Semantic Segmentation Via Graph Matching |
|
Bian, Yikai | Nanjing University of Science and Technology |
Hui, Le | Nanjing University of Science and Technology |
Qian, Jianjun | Nanjing University of Science and Technology |
Xie, Jin | Nanjing University of Science and Technology |
Keywords: Semantic Scene Understanding, Transfer Learning, Deep Learning for Visual Perception
Abstract: Unsupervised domain adaptation for point cloud semantic segmentation has attracted great attention due to its effectiveness in learning with unlabeled data. Most existing methods use global-level feature alignment to transfer knowledge from the source domain to the target domain, which may cause semantic ambiguity in the feature space. In this paper, we propose a graph-based framework to explore local-level feature alignment between the two domains, which can preserve semantic discrimination during adaptation. Specifically, in order to extract local-level features, we dynamically construct local feature graphs on both domains and then build a memory bank with the graphs from the source domain. In particular, we use optimal transport to generate the graph matching pairs. Then, based on the assignment matrix, we can meticulously align the feature distributions between the two domains with a graph-based local feature loss. Furthermore, we consider the correlation between the features of different categories and design a category-guided contrastive loss to guide the segmentation model to learn discriminative features on the target domain. Extensive experiments on different synthetic-to-real and real-to-real domain adaptation scenarios demonstrate that our method can achieve state-of-the-art performance.
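The optimal-transport matching step can be sketched with entropic regularization (Sinkhorn iterations); the cost matrix below is a placeholder for distances between local feature-graph embeddings of the two domains, and the uniform marginals are our assumption.

```python
# Sinkhorn iterations returning a soft assignment between source graphs
# (rows) and target graphs (columns) under uniform marginals.
import numpy as np

def sinkhorn(cost, eps=0.1, iters=200):
    n, m = cost.shape
    K = np.exp(-cost / eps)                  # Gibbs kernel of the cost
    r, c = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):                   # alternate marginal projections
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]       # soft matching matrix

P = sinkhorn(np.random.rand(5, 7))           # rows/columns sum to the marginals
```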
|
|
10:50-11:00, Paper WeA-12.6 | |
SectionKey: 3-D Semantic Point Cloud Descriptor for Place Recognition |
|
Jin, Shutong | Nanyang Technological University |
Wu, Zhenyu | Nanyang Technological University |
Zhao, Chunyang | Nanyang Technological University |
Zhang, Jun | Nanyang Technological University |
Peng, Guohao | Nanyang Technological University |
Wang, Danwei | Nanyang Technological University |
Keywords: Semantic Scene Understanding, Localization, Intelligent Transportation Systems
Abstract: Place recognition is seen as a crucial factor in correcting cumulative errors in Simultaneous Localization and Mapping (SLAM) applications. Most existing studies focus on visual place recognition, which is inherently sensitive to environmental changes such as illumination, weather and seasons. Considering these facts, more recent attention has turned to using 3-D Light Detection and Ranging (LiDAR) scans for place recognition, which is more credible owing to its accurate geometric information. Different from purely geometric studies, this paper proposes a novel global descriptor, named SectionKey, which leverages both semantic and geometric information to tackle the problem of place recognition in large-scale urban environments. The proposed descriptor is robust and invariant to viewpoint changes. Specifically, the encoded three-layer key serves as a pre-selection step and a 'candidate center' selection strategy is deployed before calculating the similarity score, thus improving accuracy and efficiency significantly. Then, a two-step semantic iterative closest point (ICP) algorithm is applied to acquire the 3-D pose (x, y, θ) that is used to align the candidate point clouds with the query frame and calculate the similarity score. Extensive experiments have been conducted on the public SemanticKITTI dataset to demonstrate the superior performance of our proposed system over state-of-the-art baselines.
|
|
11:00-11:10, Paper WeA-12.7 | |
Fisheye Object Detection Based on Standard Image Datasets with 24-Points Regression Strategy |
|
Xu, Xi | Beijing Institute of Technology |
Gao, Yu | Beijing Institute of Technology |
Liang, Hao | Beijing Institute of Technology |
Yang, Yi | Beijing Institute of Technology |
Fu, Mengyin | Beijing Institute of Technology |
Keywords: Semantic Scene Understanding
Abstract: Fisheye object detection is a difficult task in robotics and autonomous driving. One of the reasons is that fisheye datasets are inferior to standard image datasets in scale and quantity, which inspires the idea of using standard image datasets for fisheye object detection. However, models trained on standard image datasets do not perform well on fisheye data. In this work, we explore the effect of fisheye images on different stages of YOLOX with published weights generated from standard image datasets. We also propose a new regression strategy for the 24-points object representation method, which is insensitive to image distortion. The experiments show that the feature extraction part is robust to fisheye image features, while the regression part for location and category performs poorly. The strategy can recover the positions of discrete points without calculating the IoU of irregular-shaped boxes. Theoretically, the strategy can be widely adopted to regress irregular bounding boxes composed of discrete points. Source code is at https://github.com/IN2-ViAUn/Exploration-of-Potential
|
|
11:10-11:20, Paper WeA-12.8 | |
Real-Time Semantic 3D Reconstruction for High-Touch Surface Recognition for Robotic Disinfection |
|
Qiu, Ri-Zhao | University of Illinois at Urbana-Champaign |
Sun, Yixiao | Stanford University |
Correia Marques, Joao Marcos | University of Illinois at Urbana-Champaign |
Hauser, Kris | University of Illinois at Urbana-Champaign |
Keywords: Semantic Scene Understanding, RGB-D Perception, Motion and Path Planning
Abstract: Disinfection robots have applications in promoting public health and reducing hospital-acquired infections and have drawn considerable interest due to the COVID-19 pandemic. To disinfect a room quickly, motion planning can be used to plan robot disinfection trajectories on a reconstructed 3D map of the room's surfaces. However, existing approaches discard semantic information of the room and, thus, take a long time to perform thorough disinfection. Human cleaners, on the other hand, disinfect rooms more efficiently by prioritizing the cleaning of high-touch surfaces. To address this gap, we present a novel GPU-based volumetric semantic TSDF (Truncated Signed Distance Function) integration system for semantic 3D reconstruction. Our system produces 3D reconstructions that distinguish high-touch surfaces from non-high-touch surfaces at approximately 50 frames per second on a consumer-grade GPU, which is approximately 5 times faster than existing CPU-based TSDF semantic reconstruction methods. In addition, we extend a UV disinfection motion planning algorithm to incorporate semantic awareness for optimizing coverage of disinfection trajectories. Experiments show that our semantic-aware planning outperforms geometry-only planning by disinfecting up to 20% more high-touch surfaces under the same time budget. Further, the real-time nature of our semantic reconstruction pipeline enables future work on simultaneous disinfection and mapping.
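The per-voxel fusion rule underlying TSDF integration can be written in a few lines; the paper's contribution is running this (with per-class probabilities) massively in parallel on the GPU, so the scalar CPU version below is only illustrative, and the label-histogram fusion is our simplification.

```python
# Weighted running-average TSDF update plus a per-voxel semantic label
# histogram, e.g. counting 'high-touch' observations.
def update_voxel(tsdf, weight, label_hist, sdf, sem_label, trunc=0.05):
    d = max(-1.0, min(1.0, sdf / trunc))        # truncate and normalize the SDF
    new_weight = weight + 1.0
    new_tsdf = (tsdf * weight + d) / new_weight
    label_hist[sem_label] = label_hist.get(sem_label, 0) + 1
    return new_tsdf, new_weight, label_hist

state = update_voxel(0.0, 0.0, {}, sdf=0.02, sem_label="high_touch")
```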
|
|
11:20-11:30, Paper WeA-12.9 | |
Relationship Oriented Semantic Scene Understanding for Daily Manipulation Tasks |
|
Tang, Chao | Southern University of Science and Technology |
Yu, Jingwen | The Hong Kong University of Science and Technology, Southern University of Science and Technology |
Chen, Weinan | Southern University of Science and Technology |
Xia, Bingyi | Southern University of Science and Technology |
Zhang, Hong | SUSTech |
Keywords: Semantic Scene Understanding, Perception for Grasping and Manipulation, Task Planning
Abstract: Assistive robot systems have been developed to help people, especially those with disabilities, accomplish daily manipulation tasks, where scene understanding plays a crucial role in enabling robots to interpret their surroundings and behave accordingly. Most current systems approach scene understanding without considering the functional dependencies between objects. However, it is only valuable to interact with some objects when their function-relevant counterparts are considered. In this paper, we augment an assistive robotic arm system with an end-to-end semantic relationship reasoning model. It incorporates functional relationships between pairs of objects for semantic scene understanding. To ensure good generalization to unseen objects and relationships, the model works in a category-agnostic manner. We evaluate our design and three baseline methods on a self-collected benchmark with two levels of difficulty. To further demonstrate its effectiveness, the model is integrated with a symbolic planner for a goal-oriented, multi-step manipulation task on a real-world assistive robotic arm platform.
|
|
WeA-13 |
Rm13 (Room K) |
Multi-Robot Systems 1 |
Regular session |
Chair: Manocha, Dinesh | University of Maryland |
Co-Chair: Scherer, Jürgen | Silicon Austria Labs GmbH |
|
10:00-10:10, Paper WeA-13.1 | |
Multi-Robot Unknown Area Exploration Using Frontier Trees |
|
Soni, Ankit | Birla Institute of Technology and Science, Pilani Campus, Pilani |
Dasannacharya, Chirag | Birla Institute of Technology and Science, Pilani |
Gautam, Avinash | Birla Institute of Technology and Science |
Shekhawat, Virendra Singh | BITS Pilani |
Mohan, Sudeept | Birla Institute of Technology and Science |
Keywords: Multi-Robot Systems, Task Planning, Mapping
Abstract: This paper presents a novel approach for multi-robot unknown area exploration. Recently, the frontier tree data structure was used in single robot exploration to memorize frontiers, their positions, exploration state, and the map. This tree could be queried to decide on further exploration steps. In this paper, we take the concept further for multi-robot exploration by proposing a new abstraction called the ‘group,’ meant to share information through a common frontier tree, requisite operations at the group level, and a method to assign goals to multiple robots. A group is a set of robots, the union of whose explored regions forms a contiguous region (a single connected region in a topological sense). As a group has precisely one tree, the robots share a common state of the exploration task. We propose techniques to merge groups and their frontier trees once their maps overlap. Finally, we suggest a method to designate and assign exploration goals to the individual robots by choosing nodes from the frontier tree. The proposed approach outperforms seven state-of-the-art research works in simulation.
|
|
10:10-10:20, Paper WeA-13.2 | |
Min-Max Vertex Cycle Covers with Connectivity Constraints for Multi-Robot Patrolling |
|
Scherer, Jürgen | Silicon Austria Labs GmbH |
Schoellig, Angela P. | University of Toronto |
Rinner, Bernhard | Alpen-Adria-Universität Klagenfurt |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Surveillance Robotic Systems
Abstract: We consider a multi-robot patrolling scenario with intermittent connectivity constraints, which ensure that data from the robots eventually arrive at a base station. In particular, each robot traverses a closed tour periodically and meets with the robots on neighboring tours to exchange data. We model the problem as a variant of the min-max vertex cycle cover problem (MMCCP), which is the problem of covering all vertices with a given number of disjoint tours such that the largest tour length is minimal. In this work we introduce the minimum idleness connectivity-constrained multi-robot patrolling problem, show that it is NP-hard, and model it as a mixed integer linear program (MILP). The computational complexity of solving this problem exactly limits its practical application, and therefore we develop approximate algorithms that take a solution of MMCCP as input. Our simulation experiments on instances with 10 vertices and up to 3 robots compare the results of different solution approaches (including solving the MILP formulation) and show that our greedy algorithm can obtain an objective value close to that of the MILP formulation with much shorter computation times. Experiments on instances with up to 100 vertices and up to 10 robots indicate that the greedy approximation algorithm keeps the length of the longest tour small by extending smaller tours for data exchange.
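A rough reading of the greedy tour-extension strategy, not the authors' exact algorithm, can be sketched as follows; `dist` is any symmetric vertex distance, and tours become connected by duplicating vertices as shared meeting points.

```python
# Greedily insert meeting vertices until all tours form one connected
# component, each time choosing the insertion that keeps the longest tour
# as short as possible.
def tour_len(tour, dist):
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

def cheapest_insertion(tour, v, dist):
    """Best (extra_length, position) for inserting v into a closed tour."""
    return min((dist(tour[i], v) + dist(v, tour[(i + 1) % len(tour)])
                - dist(tour[i], tour[(i + 1) % len(tour)]), i + 1)
               for i in range(len(tour)))

def greedy_meetings(tours, dist):
    comp = list(range(len(tours)))                 # connectivity component per tour
    while len(set(comp)) > 1:
        best = None
        for i in range(len(tours)):
            for j in range(len(tours)):
                if comp[i] == comp[j]:
                    continue
                for v in tours[j]:                 # duplicate v of tour j into tour i
                    extra, pos = cheapest_insertion(tours[i], v, dist)
                    new_max = max(tour_len(tours[i], dist) + extra,
                                  max(tour_len(t, dist)
                                      for k, t in enumerate(tours) if k != i))
                    if best is None or new_max < best[0]:
                        best = (new_max, i, j, v, pos)
        _, i, j, v, pos = best
        tours[i].insert(pos, v)                    # v becomes a shared meeting point
        comp = [comp[i] if c == comp[j] else c for c in comp]
    return tours
```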
|
|
10:20-10:30, Paper WeA-13.3 | |
Efficient Range-Constrained Manifold Optimization with Application to Cooperative Navigation |
|
Zhang, Yetong | Georgia Tech |
Chen, Gerry | Georgia Institute of Technology |
Rutkowski, Adam | Air Force Research Laboratory |
Dellaert, Frank | Georgia Institute of Technology |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Swarm Robotics
Abstract: We present a manifold optimization approach to solve inference and planning problems with range constraints. The core of our approach is the definition of a manifold that represents points or poses with range constraints. We discover that the manifold of range-constrained points is homogeneous under the rigid transformation group action, and utilize the group action to derive the tangent space, retraction and topology of the manifold. We evaluate the performance of the manifold optimization approach on range-constrained inference problems against state-of-the-art constrained optimization methods, and the results show that manifold optimization with the range-constraint manifold achieves both faster speed and better constraint satisfaction. We further study the conditions under which range measurements can be treated as constraints in practice.
|
|
10:30-10:40, Paper WeA-13.4 | |
On Coverage Control for Limited Range Multi-Robot Systems |
|
Pratissoli, Federico | Università Degli Studi Di Modena E Reggio Emilia |
Capelli, Beatrice | University of Modena and Reggio Emilia |
Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Keywords: Multi-Robot Systems, Sensor Networks, Distributed Robot Systems
Abstract: This paper presents a coverage-based control algorithm to coordinate a group of autonomous robots. Most of the solutions presented in the literature rely on an exact Voronoi partitioning, whose computation requires complete knowledge of the environment to be covered. This can be achieved only by robots with unlimited sensing capabilities, or through communication among robots in a limited sensing scenario. To overcome these limitations, we present a distributed control strategy to cover an unknown environment with a group of robots with limited sensing capabilities and in the absence of reliable communication. The control law is based on a limited Voronoi partitioning of the sensing area, and we demonstrate that the group of robots can optimally cover the environment using only locally detected information (without communication). The proposed method is validated by means of simulations and experiments carried out on a group of mobile robots.
|
|
10:40-10:50, Paper WeA-13.5 | |
Multi-Goal Multi-Agent Pickup and Delivery |
|
Xu, Qinghong | Simon Fraser University |
Li, Jiaoyang | University of Southern California |
Koenig, Sven | University of Southern California |
Ma, Hang | Simon Fraser University |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Task and Motion Planning
Abstract: In this work, we consider the Multi-Agent Pickup and Delivery (MAPD) problem, where agents constantly engage with new tasks and need to plan collision-free paths to execute them. To execute a task, an agent needs to visit a pair of goal locations, consisting of a pickup location and a delivery location. We propose two variants of an algorithm that assigns a sequence of tasks to each agent using the anytime algorithm Large Neighborhood Search (LNS) and plans paths using the Multi-Agent Path Finding (MAPF) algorithm Priority-Based Search (PBS). LNS-PBS is complete for well-formed MAPD instances, a realistic subclass of MAPD instances, and empirically more effective than the existing complete MAPD algorithm CENTRAL. LNS-wPBS provides no completeness guarantee but is empirically more efficient and stable than LNS-PBS. It scales to thousands of agents and thousands of tasks in a large warehouse and is empirically more effective than the existing scalable MAPD algorithm HBH+MLA*. LNS-PBS and LNS-wPBS also apply to a more general variant of MAPD, namely the Multi-Goal MAPD (MG-MAPD) problem, where tasks can have different numbers of goal locations.
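The anytime destroy-and-repair loop of LNS can be sketched independently of the path planner; here PBS is abstracted into a `cost` callback that would return, e.g., the makespan of the planned collision-free paths (all names below are illustrative).

```python
# LNS skeleton: repeatedly remove a subset of tasks and greedily reinsert
# each one at the (agent, position) that increases the cost the least.
import random

def greedy_insert(cand, task, cost):
    best = None
    for agent, seq in cand.items():
        for pos in range(len(seq) + 1):
            seq.insert(pos, task)
            c = cost(cand)                          # e.g. makespan via a PBS call
            seq.pop(pos)
            if best is None or c < best[0]:
                best = (c, agent, pos)
    _, agent, pos = best
    cand[agent].insert(pos, task)

def lns(assignment, cost, destroy_frac=0.3, iters=200, seed=0):
    rng = random.Random(seed)
    best = {a: seq[:] for a, seq in assignment.items()}
    for _ in range(iters):
        cand = {a: seq[:] for a, seq in best.items()}
        tasks = [t for seq in cand.values() for t in seq]
        removed = rng.sample(tasks, max(1, int(destroy_frac * len(tasks))))
        for a in cand:                              # destroy: drop a task subset
            cand[a] = [t for t in cand[a] if t not in removed]
        for t in removed:                           # repair: cheapest reinsertion
            greedy_insert(cand, t, cost)
        if cost(cand) < cost(best):                 # anytime: keep the incumbent
            best = cand
    return best
```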
|
|
10:50-11:00, Paper WeA-13.6 | |
Asynchronous Real-Time Decentralized Multi-Robot Trajectory Planning |
|
Senbaslar, Baskın | University of Southern California |
Sukhatme, Gaurav | University of Southern California |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Networked Robots
Abstract: We present a novel overconstraining and constraint-discarding method for asynchronous, real-time, decentralized, multi-robot trajectory planning that ensures collision avoidance. Our approach utilizes communication between robots. The communication medium is best-effort: messages may be dropped, re-ordered or delayed. Robots conservatively constrain themselves against others assuming they may be working with outdated information, and discard constraints when they receive update messages from others. Our method can augment existing synchronized decentralized receding horizon planning algorithms that utilize separating hyperplanes for collision avoidance thereby making them applicable to asynchronous setups. As an example, we extend an existing model predictive control based, synchronized, decentralized multi-robot planner using our method. We show our method’s effectiveness under asynchronous planning and imperfect communication by comparing our extension to the base version. Our extension does not result in any collisions or synchronization-induced deadlocks to which the base version is prone.
|
|
11:00-11:10, Paper WeA-13.7 | |
Decentralized Learning with Limited Communications for Multi-Robot Coverage of Unknown Spatial Fields |
|
Nakamura, Kensuke | Princeton University |
Santos, María | Princeton University |
Leonard, Naomi | Princeton University |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Optimization and Optimal Control
Abstract: This paper presents an algorithm for a team of mobile robots to simultaneously learn a spatial field over a domain and spatially distribute themselves to optimally cover it. Drawing from previous approaches that estimate the spatial field through a centralized Gaussian process, this work leverages the spatial structure of the coverage problem and presents a decentralized strategy where samples are aggregated locally by establishing communications through the boundaries of a Voronoi partition. We present an algorithm whereby each robot runs a local Gaussian process calculated from its own measurements and those provided by its Voronoi neighbors, which are incorporated into the individual robot’s Gaussian process only if they provide sufficiently novel information. The performance of the algorithm is evaluated in simulation and compared with centralized approaches.
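The novelty test at the heart of the decentralized scheme can be sketched with an off-the-shelf Gaussian process; the kernel and threshold below are illustrative choices, not the paper's.

```python
# Per-robot GP that accepts a neighbor's sample only if the robot's own
# predictive uncertainty at that location is high (the sample is "novel").
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

class LocalGP:
    def __init__(self, novelty_thresh=0.1):
        self.gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
        self.X, self.y = [], []
        self.thresh = novelty_thresh

    def add(self, x, y):
        self.X.append(x); self.y.append(y)
        self.gp.fit(np.array(self.X), np.array(self.y))

    def maybe_add_neighbor(self, x, y):
        if not self.X:                      # nothing learned yet: all is novel
            return self.add(x, y)
        _, std = self.gp.predict(np.array([x]), return_std=True)
        if std[0] > self.thresh:            # sufficiently novel -> incorporate
            self.add(x, y)

robot = LocalGP()
robot.add([0.0, 0.0], 1.2)
robot.maybe_add_neighbor([5.0, 5.0], 0.4)   # far away -> high variance -> added
```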
|
|
11:10-11:20, Paper WeA-13.8 | |
Multi-Agent Path Planning Using Medial-Axis-Based Pebble-Graph Embedding |
|
Liang, He | University of North Carolina at Chapel Hill |
Pan, Zherong | Tencent America |
Solovey, Kiril | Technion--Israel Institute of Technology |
Jia, Biao | University of Maryland at College Park |
Manocha, Dinesh | University of Maryland |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Motion and Path Planning
Abstract: We present a centralized algorithm for labeled, disk-shaped Multi-Agent Path Planning (MPP) in a continuous workspace with polygonal boundaries. Our method automatically constructs a discrete pebble-graph out of the workspace and then routes the agents on the graph. To construct the pebble graph, we identify inscribed circles in the workspace via medial axis transform and organize agents into layers within each inscribed circle. We show that our layered pebble-graph allows the agents to perform both pebble and rotation motions, such that all the MPP instances restricted to the pebble-graph are feasible. MPP instances with continuous start and goal positions can then be solved via local navigations that route agents from and to graph vertices. We experiment with our method on a row of 5 environments with a high agent-packing density (up to 61.6% of the free space). Such density violates the well-separated assumptions made by state-of-the-art MPP planners, while our method achieves a high success rate.
|
|
11:20-11:30, Paper WeA-13.9 | |
Multi-Modal User Interface for Multi-Robot Control in Underground Environments |
|
Chen, Shengkang | Georgia Tech |
O'Brien, Matthew | Georgia Institute of Technology |
Talbot, Fletcher | CSIRO |
Williams, Jason | CSIRO |
Tidd, Brendan | CSIRO |
Pitt, Alex | CSIRO |
Arkin, Ronald | Georgia Tech |
Keywords: Multi-Robot Systems, Task Planning, Human Factors and Human-in-the-Loop
Abstract: Leveraging both the autonomy of robots and the expert knowledge of humans can enable a multi-robot system to complete missions in challenging environments with a high degree of adaptivity and robustness. This paper proposes a multimodal task-based graphical user interface for controlling a heterogeneous multi-robot team. The core of the interface is an integrated multi-robot task allocation system that allows the user to encode their intent to guide the heterogeneous multi-robot team. The design of the interface aims to provide the human operator with continuous situational awareness and effective control for rapid decision-making in time-critical missions. Team CSIRO Data61 came in second place utilizing this interface in the DARPA Subterranean (SubT) Challenge. The ideas behind this user interface can apply to other multi-robot applications.
|
|
WeA-14 |
Rm14 (Room 501) |
Soft Sensors and Actuators 1 |
Regular session |
Chair: Minor, Mark | University of Utah |
Co-Chair: Masuya, Ken | University of Miyazaki |
|
10:00-10:10, Paper WeA-14.1 | |
Slip Anticipation for Grasping Deformable Objects Using a Soft Force Sensor |
|
Judd, Euan | EPFL |
Aksoy, Bekir | EPFL |
Digumarti, Krishna Manaswi | Ecole Polytechnique Federale De Lausanne |
Shea, Herbert | EPFL |
Floreano, Dario | Ecole Polytechnique Federal, Lausanne |
Keywords: Soft Sensors and Actuators, AI-Based Methods, Soft Robot Materials and Design
Abstract: Robots using classical control have revolutionised assembly lines where the environment and manipulated objects are restricted and predictable. However, they have proven less effective when the manipulated objects are deformable, due to their complex and unpredictable behaviour. The use of tactile sensors and continuous monitoring of tactile feedback is therefore particularly important for pick-and-place tasks involving these materials. This is in part due to the need for multiple points of contact when manipulating deformable objects, which can result in slippage if coordination between manipulators is inadequate. In this paper, continuous monitoring of tactile feedback, using a liquid metal soft force sensor, for grasping deformable objects is presented. The trained data-driven model distinguishes between successful grasps, slippage and failure during a manipulation task for multiple deformable objects. Slippage could be anticipated before failure occurred, using data acquired over a 30 ms period, with greater than 95% accuracy using a random forest classifier. The results were achieved using a single sensor that can be mounted on the fingertips of existing grippers, and contribute to the development of an automated pick-and-place process for deformable objects.
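The classification setup can be sketched with scikit-learn; the data, window length, and label encoding below are placeholders rather than the paper's dataset.

```python
# Random forest over short windows of soft force-sensor readings.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 30))     # 30 samples per window (~30 ms at 1 kHz)
y = rng.integers(0, 3, size=300)   # 0 = successful grasp, 1 = slippage, 2 = failure

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict(X[:5]))          # anticipate slip from the latest window
```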
|
|
10:10-10:20, Paper WeA-14.2 | |
Estimation of Soft Robotic Bladder Compression for Smart Helmets Using IR Range Finding and Hall Effect Magnetic Sensing |
|
Pollard, Colin | University of Utah |
Aston, Jonathan | University of Utah |
Minor, Mark | University of Utah |
Keywords: Soft Sensors and Actuators, Soft Robot Applications
Abstract: This research focuses on soft robotic bladders that are used to monitor and control the interaction between a user’s head and the shell of a Smart Helmet. Compression of these bladders determines impact dissipation; hence the focus of this paper is sensing and estimation of bladder compression. An IR rangefinder-based solution is evaluated using regression techniques as well as a Neural Network to estimate bladder compression. A Hall-Effect (HE) magnetic sensing system is also examined, where HE sensors embedded in the base of the bladder sense the position of a magnet in the top of the bladder. The paper presents the HE sensor array, signal processing of HE voltage data, and a Neural Network (NN) for predicting bladder compression. The efficacy of different training data sets on NN performance is studied, and different NN configurations are examined to determine one that provides accurate estimates with as few nodes as possible. Different bladder compression profiles are evaluated to characterize the IR range finding and HE based techniques in application scenarios.
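As a stand-in for the compression-estimating network, a small MLP mapping Hall-effect voltages to compression can be sketched; the layer sizes and synthetic data are our assumptions, not the paper's trained configuration.

```python
# Illustrative regressor: HE sensor voltages in, bladder compression out.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
V = rng.uniform(0.0, 3.3, size=(500, 4))   # 4 HE sensor voltages per sample
compression = V.sum(axis=1) * 0.5          # fake monotone target (mm)

net = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0)
net.fit(V, compression)
print(net.predict(V[:3]))
```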
|
|
10:20-10:30, Paper WeA-14.3 | |
Kirigami Skin Based Flexible Whisker Sensor |
|
Liu, Bangyuan | Georgia Institute of Technology |
Herbert, Robert | Georgia Institute of Technology |
Yeo, Woon-Hong | Georgia Tech |
Hammond III, Frank L. | Georgia Institute of Technology |
Keywords: Soft Sensors and Actuators, Soft Robot Materials and Design
Abstract: Whiskers are widely used by animals for sensing physical interactions with their environments. By combining the Kirigami skin pop-up feature and flexible conducting layer, we designed a deployable Kirigami whisker sensor. The sensor can deploy from a flat state to a sensing state while whisker stiffness and initial pop-up angle can be tuned by adjusting the pre-stretch strain. Preliminary results show that the sensor works well both in air and underwater. The sensor is capable of measuring both externally applied forces and water flow.
|
|
10:30-10:40, Paper WeA-14.4 | |
Design and Characterisation of a Soft Barometric Sensing Skin for Robotic Manipulation |
|
Gilday, Kieran | University of Cambridge |
Relandeau, Louis | University of Cambridge |
Iida, Fumiya | University of Cambridge |
Keywords: Soft Sensors and Actuators, Modeling, Control, and Learning for Soft Robots, In-Hand Manipulation
Abstract: Soft sensorised skins are essential for improving robotic manipulation capabilities towards those of humans. Integration of sensors into existing robotic hands is challenging due to the rigidity of components, low packing density or poor sensor response. We propose a sensorised skin, based on barometric sensing, which can be molded over a skeletal robot hand. The skin connects air chambers embedded in the soft material to wrist-mounted pressure sensors, allowing a sensor spacing of 2-4 mm, force ranges from 23 mN to 5700 mN, and a bandwidth of 20 Hz. Integrating this with a skeletal hand allows us to showcase the potential of these sensors to aid robotic manipulation. We demonstrate 3-axis contact modelling, useful for in-hand manipulation and exploration. In addition, by grasping a chopstick and sensing forces transmitted from the environment, the system can remotely detect small environmental features, e.g., hole finding using tools.
|
|
10:40-10:50, Paper WeA-14.5 | |
A Virtual 2D Tactile Array for Soft Actuators Using Acoustic Sensing |
|
Wall, Vincent | TU Berlin |
Brock, Oliver | Technische Universität Berlin |
Keywords: Soft Sensors and Actuators, Force and Tactile Sensing
Abstract: We create a virtual 2D tactile array for soft pneumatic actuators using embedded audio components. We detect contact-specific changes in sound modulation to infer tactile information. We evaluate different sound representations and learning methods to detect even small contact variations. We demonstrate the acoustic tactile sensor array on a PneuFlex actuator, using a Braille display to individually control the contact of 29 x 4 pins with the actuator's 90 x 10 mm palmar surface. Evaluating the spatial resolution, the acoustic sensor localizes edges in the x- and y-directions with root-mean-square regression errors of 1.67 mm and 0.0 mm, respectively. Even light contacts of a single Braille pin with a lifting force of 0.17 N are measured with high accuracy. Finally, we demonstrate the sensor's sensitivity to complex contact shapes by successfully reading the 26 letters of the Braille alphabet from a single display cell with a classification rate of 88%.
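One plausible feature-extraction step for such acoustic sensing is a log-magnitude spectrogram; the sampling rate and window length below are illustrative assumptions, not the paper's chosen representation.

```python
# Turn one sensing window of the modulated sound into a feature vector for
# a downstream classifier or regressor.
import numpy as np
from scipy.signal import spectrogram

def sound_features(signal, fs=48000):
    """Flattened log-magnitude spectrogram of one sensing window."""
    _, _, S = spectrogram(signal, fs=fs, nperseg=1024)
    return np.log1p(S).ravel()

feats = sound_features(np.random.randn(48000))   # one second of audio
```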
|
|
10:50-11:00, Paper WeA-14.6 | |
Transferable Shape Estimation of Soft Pneumatic Actuators Based on Active Vibroacoustic Sensing |
|
Chandrasiri, Kazumi | Tokai University |
Takemura, Kentaro | Tokai University |
Keywords: Soft Sensors and Actuators, Soft Robot Applications
Abstract: A soft pneumatic actuator (SPA) is one of the most prominent components in a soft robotic system. Sensing SPAs is challenging owing to their elasticity and deformability, and requires sensors tailored to these properties. A flexible sensor is an important component for sensing the conditions of SPAs, such as shape and deformation due to contact events during applications. Developing versatile sensors with high flexibility and tolerability for SPAs is challenging. Data-driven sensing approaches typically require an individual machine learning model for each actuator. In contrast, it is an enormous advantage to have versatile sensors as hardware and machine learning models as software that can be employed on various actuators. Therefore, we propose a transferable shape estimation method based on active vibro-acoustic sensing to achieve tolerability and increase versatility. We created easily transferable sensing devices for SPAs. In addition, we employed a data-driven approach that utilizes a simple transfer learning technique on a two-dimensional convolutional neural network model. We confirm the feasibility and versatility of the proposed method through evaluation experiments. The transferable estimation method was used on SPAs to estimate the bending angle and length under various sensing and environmental conditions, with average errors of less than 3.5 degrees and 2.1 mm, respectively.
|
|
11:00-11:10, Paper WeA-14.7 | |
Shape Reconstruction of Soft Manipulators Using Vision and IMU Feedback |
|
Bezawada, Harish | The University of Alabama |
Vikas, Vishesh | University of Alabama |
Woods, Cole | The University of Alabama |
Keywords: Soft Sensors and Actuators, Soft Robot Applications, Modeling, Control, and Learning for Soft Robots
Abstract: In recent times, soft manipulators have garnered immense interest given their dexterous abilities. A critical aspect of their feedback control is the reconstruction of the manipulator shape. This research presents, for the first time, shape reconstruction of a soft manipulator through sensor fusion of information available from Inertial Measurement Units (IMUs) and visual tracking. The manipulator is modeled using multi-segment continuous-curvature Pythagorean Hodograph (PH) curves. PH curves are a class of continuous-curvature curves with an analytical expression for the hodograph (slope). The shape reconstruction is formulated as an optimization problem that minimizes the bending energy of the curve under a length constraint and the information from IMUs and/or visual markers. The paper experimentally investigates the robustness of shape reconstruction for scenarios where the positions of all visual markers, or the slopes at all knots (sensor placements), are known. Occlusion of manipulator segments is frequent; hence, this scenario is simulated by fusing information from available slopes (IMUs) at all knots and positions (vision) at some knots. The experiments are performed on a planar tensegrity manipulator with IMU feedback and visual tracking. The robustness study indicates the reliability of these models for real-world applications. Additionally, the proposed sensor fusion algorithm provides promising results where, for most cases, the shape estimates benefit from additional position information. Finally, the low dimensionality of the optimization problem argues for extending the approach to real-time applications.
|
|
11:10-11:20, Paper WeA-14.8 | |
FBG-Based Variable-Length Estimation for Shape Sensing of Extensible Soft Robotic Manipulators |
|
Lu, Yiang | The Chinese University of Hong Kong |
Chen, Wei | The Chinese University of Hong Kong |
Chen, Zhi | Hefei University |
Zhou, Jianshu | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Soft Sensors and Actuators, Soft Robot Applications, Sensor Fusion
Abstract: In this paper, we propose a novel variable-length estimation approach for shape sensing of extensible soft robots utilizing fiber Bragg gratings (FBGs). Shape reconstruction from FBG sensors has been increasingly developed for soft robots, but the narrow stretching range of FBG fiber makes it difficult to acquire accurate sensing results for extensible robots. To address this limitation, we introduce an FBG-based length sensor that leverages a rigid curved channel, through which FBGs are allowed to slide within the robot following its body extension/compression; hence we can search for and match the FBGs with a specific constant curvature in the fiber to determine the effective length. By fusing the above measurements, a model-free filtering technique is presented for simultaneous calibration of a variable-length model and temporally continuous length estimation of the robot, enabling accurate shape sensing using solely FBGs. The performance of the proposed method has been experimentally evaluated on an extensible soft robot equipped with an FBG fiber in both free and unstructured environments. The results concerning the dynamic accuracy and robustness of length estimation and shape sensing demonstrate the effectiveness of our approach.
|
|
11:20-11:30, Paper WeA-14.9 | |
Soft-Skin Actuator Capable of Seawater Propulsion Based on MagnetoHydroDynamics |
|
Matsumoto, Mutsuki | Shibaura Institute of Technology |
Kuwajima, Yu | Shibaura Institute Technology |
Shigemune, Hiroki | Shibaura Institute of Technology |
Keywords: Soft Sensors and Actuators, Soft Robot Materials and Design, Soft Robot Applications
Abstract: Underwater robots have a variety of potential uses, including marine resource research, ecological research, and disaster relief. Most underwater robots currently in practical use have screw propulsion systems, which suffer from noise, collision, and entrainment problems. There is a lot of research on underwater robots using soft actuators to solve these problems. However, current soft actuators have disadvantages, such as the need for special fluids, pressure sources, and high-voltage circuits. Therefore, we have developed a soft-skin actuator based on magnetohydrodynamics (MHD). The soft-skin MHD actuator is made of soft material and its structure is thin, which allows it to attach to the surface of an object, including curved surfaces, to provide the object with a propulsive function in the sea. Since it has no moving parts, it does not generate mechanical noise, and there is no danger of entrapment. Because it can pump seawater directly, it does not require a special working fluid, and its structure is simple and easy to miniaturize. This paper investigates the thrust and power consumption of the developed soft-skin MHD actuator when attached to a flat surface. As a result, we obtained a thrust of 1.37 mN from a single soft-skin MHD actuator with a maximum power of about 140 W. We also measured the thrust force when attaching it to a curved surface, obtaining a higher thrust than on a flat surface by adjusting the crossing of the magnetic field and the current. We developed an untethered robot that can remove oil from the sea using soft-skin MHD actuators, and demonstrated the adaptability of the soft-skin MHD actuator by attaching it to a commercial underwater camera weighing about 253.5 g and providing propulsion.
|
|
WeA-15 |
Rm15 (Room 509) |
Path Planning for Multiple Mobile Robots and Agents 1 |
Regular session |
Chair: De Martini, Daniele | University of Oxford |
Co-Chair: Yang, Yuan | Southeast University |
|
10:00-10:10, Paper WeA-15.1 | |
Collaborative Navigation-Aware Coverage in Feature-Poor Environments |
|
Ozkahraman, Ozer | KTH - Royal Institute of Technology |
Ogren, Petter | Royal Institute of Technology (KTH) |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Networked Robots
Abstract: Multi-agent coverage and robot navigation are two very important research fields within robotics. However, their intersection has received limited attention. In multi-agent coverage, perfect navigation is often assumed, and in robot navigation, the focus is often to minimize the localization error with the aid of stationary features from the environment. The need for integrating the two becomes clear in environments with very sparse features or landmarks, for example when a group of Autonomous Underwater Vehicles (AUVs) is to search a uniform seafloor for mines or other dangerous objects. In such environments, localization systems are often deprived of detectable features that could increase their accuracy. In this paper we propose an algorithm for navigation-aware multi-agent coverage in areas with no landmarks. Instead of using identical lawnmower patterns, we propose to mirror every other pattern so that the agents can meet up, make inter-agent measurements, and share information regularly. This improves performance in two ways: global drift in relation to the area to be covered is reduced, and local coverage gaps between adjacent patterns are reduced. Further, we show that this can be accomplished within the constraints of the very limited sensing, computing and communication resources that most AUVs have available. The effectiveness of our method is shown through statistically significant simulated experiments.
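The mirrored-pattern idea can be sketched in a few lines of geometry, simplified to a rectangular area; all parameters below are illustrative.

```python
# Boustrophedon waypoints; mirroring every other agent's pattern makes
# adjacent agents arrive at the shared boundary at matching ends, enabling
# periodic inter-agent measurements.
def lawnmower(x0, width, height, spacing, mirrored=False):
    xs = [x0 + i * spacing for i in range(int(width / spacing) + 1)]
    waypoints = []
    for i, x in enumerate(xs):
        y_pair = (0.0, height) if i % 2 == 0 else (height, 0.0)
        waypoints += [(x, y_pair[0]), (x, y_pair[1])]
    if mirrored:
        waypoints = [(x, height - y) for x, y in waypoints]
    return waypoints

agent_a = lawnmower(0.0, 20.0, 50.0, 5.0)                  # normal pattern
agent_b = lawnmower(20.0, 20.0, 50.0, 5.0, mirrored=True)  # mirrored neighbor
```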
|
|
10:10-10:20, Paper WeA-15.2 | |
Polynomial Time Near-Time-Optimal Multi-Robot Path Planning in Three Dimensions with Applications to Large-Scale UAV Coordination |
|
Guo, Teng | Rutgers University |
Feng, Si Wei | Rutgers University |
Yu, Jingjin | Rutgers University |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems, Planning, Scheduling and Coordination
Abstract: For enabling efficient, large-scale coordination of unmanned aerial vehicles (UAVs) under the labeled setting, in this work, we develop the first polynomial time algorithm for the reconfiguration of many moving bodies in three-dimensional spaces, with a provable 1.x asymptotic makespan optimality guarantee under high robot density. More precisely, on an m1 × m2 × m3 grid, m1 ≥ m2 ≥ m3, our method computes solutions for routing up to m1·m2·m3/3 uniquely labeled robots with uniformly randomly distributed start and goal configurations within a makespan of m1 + 2m2 + 2m3 + o(m1), with high probability. Because the makespan lower bound for such instances is m1 + m2 + m3 - o(m1), also with high probability, an optimality guarantee of (m1 + 2m2 + 2m3)/(m1 + m2 + m3) ∈ (1, 5/3] is achieved as m1 → ∞, yielding 1.x optimality. In contrast, it is well known that multi-robot path planning is NP-hard to solve optimally. In numerical evaluations, our method readily scales to support the motion planning of over 100,000 robots in 3D while simultaneously achieving 1.x optimality. We demonstrate the application of our method in coordinating many quadcopters in both simulation and hardware experiments.
|
|
10:20-10:30, Paper WeA-15.3 | |
Energy-Efficient Orienteering Problem in the Presence of Ocean Currents |
|
Mansfield, Ariella | University of Pennsylvania |
G. Macharet, Douglas | Universidade Federal De Minas Gerais |
Hsieh, M. Ani | University of Pennsylvania |
Keywords: Task and Motion Planning, Planning, Scheduling and Coordination, Environment Monitoring and Management
Abstract: In many environmental monitoring applications, robots are often tasked to visit various distinct locations to make observations and/or collect specific measurements. The problem of scheduling and assigning robots to the various tasks and planning feasible paths for the robots can be posed as an Orienteering Problem (OP). In the standard OP, routing and scheduling are achieved by maximizing an objective function through visiting the most rewarding locations while respecting a limited travel budget. However, traditional formulations of such problems usually neglect environmental features that can greatly impact the tour, e.g., flows such as wind or ocean currents. This is of particular importance for applications in marine and atmospheric environments, where vehicle motions can be significantly impacted by the environmental dynamics and the environment exerts a non-negligible force on the vehicles. In this paper, we tackle the OP in fluid environments where robots must operate in the presence of ocean and/or atmospheric currents. We introduce a novel multi-objective formulation that combines both task and path planning problems, and whose goals are to (i) maximize the collected reward, while (ii) minimizing the energy expenditure by leveraging the environmental dynamics wherever possible. We validate our strategy using simulated ocean model data and show that our approach can generate a diverse set of solutions that strike an adequate compromise between the two objectives.
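The paper keeps reward and energy as separate objectives; a scalarized trade-off such as the one below is a common simplification for comparing candidate tours, with edge energies reflecting travel with or against the flow. All names and values are illustrative.

```python
# Weighted reward-vs-energy score for a candidate tour; `energies` is keyed
# by directed edges, since moving with or against a current costs differently.
def tour_score(tour, rewards, energies, w=0.5):
    collected = sum(rewards[v] for v in tour)
    spent = sum(energies[(tour[i], tour[i + 1])] for i in range(len(tour) - 1))
    return w * collected - (1.0 - w) * spent

rewards = {"a": 3.0, "b": 5.0, "c": 1.0}
energies = {("a", "b"): 2.0, ("b", "c"): 0.5}   # asymmetric under currents
print(tour_score(["a", "b", "c"], rewards, energies))
```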
|
|
10:30-10:40, Paper WeA-15.4 | |
MAPFASTER: A Faster and Simpler Take on Multi-Agent Path Finding Algorithm Selection |
|
Alkazzi, Jean-Marc | IDEALworks GmbH |
Rizk, Anthony | Faculty of Engineering, Saint Joseph University of Beirut, Campu |
Salomon, Michel | University Bourgogne Franche-Comte |
Makhoul, Abdallah | University of Franche-Comté |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems, Deep Learning Methods
Abstract: Portfolio-based algorithm selection can help in choosing the best suited algorithm for a given task while leveraging the complementary strengths of the candidates. Solving the Multi-Agent Path Finding (MAPF) problem optimally has been proven to be NP-hard. Furthermore, no single optimal algorithm has been shown to have the fastest runtime for all MAPF problem instances, and there are no proven approaches for deciding when to use each algorithm. To address these challenges, we develop MAPFASTER, a smaller and more accurate deep learning based architecture aiming to be deployed in fleet management systems to select the fastest MAPF solver in a multi-robot setting. MAPF problem instances are encoded as images and passed to the model for classification into one of the portfolio's candidates. We evaluate our model against state-of-the-art optimal-MAPF-algorithm selectors, showing a +5.42% improvement in accuracy while being 7.1× faster to train. The dataset, code and analysis used in this research can be found at https://github.com/jeanmarcalkazzi/mapfaster.
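The instance-to-image encoding a CNN selector could consume might look like the following; the channel layout is our guess at a reasonable encoding, not necessarily MAPFASTER's exact one.

```python
# One channel each for obstacles, start cells, and goal cells.
import numpy as np

def encode_instance(grid, starts, goals):
    """grid: (H, W) binary obstacle map; starts/goals: lists of (row, col)."""
    img = np.zeros((3,) + grid.shape, dtype=np.float32)
    img[0] = grid
    for r, c in starts:
        img[1, r, c] = 1.0
    for r, c in goals:
        img[2, r, c] = 1.0
    return img

x = encode_instance(np.zeros((32, 32)), starts=[(0, 0)], goals=[(31, 31)])
```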
|
|
10:40-10:50, Paper WeA-15.5 | |
A Conflict-Driven Interface between Symbolic Planning and Nonlinear Constraint Solving |
|
Ortiz-Haro, Joaquim | University of Stuttgart |
Karpas, Erez | Technion |
Katz, Michael | IBM |
Toussaint, Marc | TU Berlin |
Keywords: Task and Motion Planning, Task Planning, Manipulation Planning
Abstract: Robotic planning in real-world scenarios typically requires joint optimization of logic and continuous variables. A core challenge to combine the strengths of logic planners and continuous solvers is the design of an efficient interface that informs the logical search about continuous infeasibilities. In this paper we present a novel iterative algorithm that connects logic planning with nonlinear optimization through a bidirectional interface, achieved by the detection of minimal subsets of nonlinear constraints that are infeasible. The algorithm continuously builds a database of graphs that represent (in)feasible subsets of continuous variables and constraints, and encodes this knowledge in the logical description. As a foundation for this algorithm, we introduce Planning with Nonlinear Transition Constraints (PNTC), a novel planning formulation that clarifies the exact assumptions our algorithm requires and can be applied to model Task and Motion Planning (TAMP) efficiently. Our experimental results show that our framework significantly outperforms alternative optimization-based approaches for TAMP. Webpage: https://quimortiz.github.io/graphnlp/
|
|
10:50-11:00, Paper WeA-15.6 | |
Scalable Online Coverage Path Planning for Multi-Robot Systems |
|
Mitra, Ratijit | IIT Kanpur |
Saha, Indranil | IIT Kanpur |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems, Swarm Robotics
Abstract: Online coverage path planning to explore an unknown workspace with multiple homogeneous robots could be either centralized or distributed. While distributed planners are computationally faster, centralized planners can produce more efficient paths, reducing the duration of completing a coverage mission significantly. To exploit the power of a centralized framework, we propose a receding horizon centralized online multi-robot planner. In each planning horizon, it generates collision-free paths that guide the robots to visit some obstacle-free locations (aka goals) not visited so far, which in turn help them explore some new regions with their laser rangefinders. We formally prove that, under reasonable conditions, it enables the robots to cover a workspace completely and subsequently analyze its time complexity. We evaluate our planner for ground and aerial robots by performing experiments with up to 128 robots on six 2D grid-based benchmark obstacle maps, establishing scalability. We also perform Gazebo simulations with 10 quadcopters and real experiments with 2 four-wheel ground robots, demonstrating its practical feasibility. Furthermore, a comparison with a state-of-the-art distributed planner establishes its superiority in coverage completion time.
|
|
11:00-11:10, Paper WeA-15.7 | |
DiMOpt: A Distributed Multi-Robot Trajectory Optimization Algorithm |
|
Salvado, João | Orebro University |
Mansouri, Masoumeh | Birmingham University |
Pecora, Federico | Örebro University |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Integrated Planning and Control, Multi-Robot Systems
Abstract: This paper deals with Multi-robot Trajectory Planning, that is, the problem of computing trajectories for multiple robots navigating in a shared space. Approaches based on trajectory optimization can solve this problem optimally. However, such methods are hampered by complex robot dynamics and collision constraints that couple the robots' decision variables. We propose a distributed multi-robot optimization algorithm (DiMOpt) which addresses these issues by exploiting (1) consensus optimization strategies to tackle coupling collision constraints, and (2) a single-robot sequential convex programming (SCP) method for efficiently handling non-convexities introduced by dynamics. We compare DiMOpt with a baseline sequential convex programming algorithm tailored to the multi-robot case (M-SCP). We empirically demonstrate that DiMOpt scales well for large fleets of robots, while computing solutions faster and with lower costs than M-SCP. Moreover, DiMOpt is an iterative algorithm that finds feasible trajectories before converging to an optimal solution, and results suggest the quality of such fast initial solutions is comparable to a converged solution computed via M-SCP. Finally, we also investigate how other factors, including path length, affect the performance of DiMOpt.
|
|
11:10-11:20, Paper WeA-15.8 | |
Non-Submodular Maximization Via the Greedy Algorithm and the Effects of Limited Information in Multi-Agent Execution |
|
Biggs, Benjamin | Virginia Polytechnic Institute and State University |
McMahon, James | The Naval Research Laboratory |
Baldoni, Philip | United States Naval Research Laboratory |
Stilwell, Daniel | Virginia Tech |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems, Planning under Uncertainty
Abstract: We provide theoretical bounds on the worst case performance of the greedy algorithm in seeking to maximize a normalized, monotone, but not necessarily submodular objective function under a simple partition matroid constraint. We also provide worst case bounds on the performance of the greedy algorithm in the case that limited information is available at each planning step. We specifically consider limited information as a result of unreliable communications during distributed execution of the greedy algorithm. We utilize notions of curvature for normalized, monotone set functions to develop the bounds provided in this work. To demonstrate the value of the bounds provided in this work, we analyze a variant of the benefit of search objective function and show, using real-world data collected by an autonomous underwater vehicle, that theoretical approximation guarantees are achieved despite non-submodularity of the objective function.
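The greedy algorithm under a simple partition matroid, which the bounds in this paper analyze, fits in a few lines; the toy coverage-style objective below is ours. Under limited communication, `f` would be evaluated on possibly outdated selections.

```python
# Greedy maximization of a set function f under a simple partition matroid:
# pick, from each part in turn, the element with the largest marginal gain.
def greedy_partition(parts, f):
    selected = []
    for part in parts:
        best = max(part, key=lambda e: f(selected + [e]) - f(selected))
        selected.append(best)
    return selected

# Toy coverage-style objective: size of the union of covered cells.
cover = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {1}}
f = lambda S: len(set().union(*(cover[e] for e in S))) if S else 0
print(greedy_partition([["a", "b"], ["c", "d"]], f))   # e.g. ['a', 'c']
```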
|
|
11:20-11:30, Paper WeA-15.9 | |
Gathering Physical Particles with a Global Magnetic Field Using Reinforcement Learning |
|
Konitzny, Matthias | Technische Universität Braunschweig |
Lu, Yitong | University of Houston |
Julien, Leclerc | University of Houston |
Fekete, Sándor | Technische Universität Braunschweig |
Becker, Aaron | University of Houston |
Keywords: Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning, Automation at Micro-Nano Scales
Abstract: For biomedical applications in targeted therapy delivery and interventions, a large swarm of micro-scale particles ("agents") has to be moved through a maze-like environment ("vascular system") to a target region ("tumor"). Due to limited on-board capabilities, these agents cannot move autonomously; instead, they are controlled by an external global force that acts uniformly on all particles. In this work, we demonstrate how to use a time-varying magnetic field to gather particles to a desired location. We use reinforcement learning to train networks to efficiently gather particles. Methods to overcome the simulation-to-reality gap are explained, and the trained networks are deployed on a set of mazes and goal locations. The hardware experiments demonstrate fast convergence, and robustness to both sensor and actuation noise. To encourage extensions and to serve as a benchmark for the reinforcement learning community, the code is available on GitHub.
|
|
WeA-16 |
Rm16 (Room 510) |
Transfer Learning |
Regular session |
Chair: Kim, H. Jin | Seoul National University |
Co-Chair: Gronauer, Sven | Technical University of Munich |
|
10:00-10:10, Paper WeA-16.1 | |
Contrastive Learning for Cross-Domain Open World Recognition |
|
Cappio Borlino, Francesco | Politecnico Di Torino |
Bucci, Silvia | Politecnico Di Torino |
Tommasi, Tatiana | Politecnico Di Torino |
Keywords: Incremental Learning, Transfer Learning, Recognition
Abstract: The ability to evolve is fundamental for any valuable autonomous agent whose knowledge cannot remain limited to that injected by the manufacturer. Consider for example a home assistant robot: it should be able to incrementally learn new object categories when requested, but also to recognize the same objects in different environments (rooms) and poses (hand-held/on the floor/above furniture), while rejecting unknown ones. Despite its importance, this scenario has only recently begun to attract interest in the robotics community, and the related research is still in its infancy, with existing experimental testbeds but no tailored methods. With this work, we propose the first learning approach that deals with all the previously mentioned challenges at once by exploiting a single contrastive objective. We show how it learns a feature space perfectly suitable to incrementally include new classes and is able to capture knowledge which generalizes across a variety of visual domains. Our method is endowed with a tailored effective stopping criterion for each learning episode and exploits a self-paced thresholding strategy that provides the classifier with a reliable rejection option. Both these novel contributions are based on the observation of the data statistics and do not need manual tuning. An extensive experimental analysis confirms the effectiveness of the proposed approach in establishing the new state-of-the-art. The code is available at https://github.com/FrancescoCappio/Contrastive_Open_World.
|
|
10:10-10:20, Paper WeA-16.2 | |
Efficient Multi-Task Learning Via Iterated Single-Task Transfer |
|
Zentner, K.R. | University of Southern California |
Puri, Ujjwal | University of Southern California |
Zhang, Yulun | University of Southern California |
Julian, Ryan | Google |
Sukhatme, Gaurav | University of Southern California |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Transfer Learning
Abstract: In order to be effective general-purpose machines in real-world environments, robots will not only need to adapt their existing manipulation skills to new circumstances, they will need to acquire entirely new skills on the fly. One approach to achieving this capability is via Multi-task Reinforcement Learning (MTRL). Most recent work in MTRL trains a single policy to solve all tasks at once. In this work, we investigate the feasibility of instead training separate policies for each task, and only transferring from a task once its policy has finished training. We describe a method of finding near-optimal sequences of transfers to perform in this setting, and use it to show that performing the optimal sequence of transfers is competitive with other MTRL methods on the MetaWorld MT10 benchmark. Lastly, we describe a method for finding nearly optimal transfer sequences during training that is able to improve on training each task from scratch.
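To make the sequencing idea concrete, here is a minimal sketch of one plausible greedy search over a transfer-performance matrix; the matrix `perf`, the `scratch` scores, and the greedy rule are hypothetical illustrations, not the authors' method for finding near-optimal sequences.

```python
import numpy as np

def greedy_transfer_sequence(scratch, perf):
    """scratch[j]: performance training task j from scratch;
    perf[i, j]: performance on task j when initialized from task i's policy.
    Greedily grows a transfer plan, always taking the best available transfer."""
    n = len(scratch)
    trained = [int(np.argmax(scratch))]        # best task to start from scratch
    plan = [(None, trained[0])]
    while len(trained) < n:
        remaining = [j for j in range(n) if j not in trained]
        src, tgt = max(((i, j) for i in trained for j in remaining),
                       key=lambda p: perf[p[0], p[1]])
        plan.append((src, tgt))
        trained.append(tgt)
    return plan                                 # [(source or None, target), ...]

scratch = np.array([0.6, 0.3, 0.4])             # hypothetical success rates
perf = np.array([[0.0, 0.7, 0.5],
                 [0.2, 0.0, 0.9],
                 [0.3, 0.8, 0.0]])
print(greedy_transfer_sequence(scratch, perf))  # [(None, 0), (0, 1), (1, 2)]
```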
|
|
10:20-10:30, Paper WeA-16.3 | |
MPR-RL: Multi-Prior Regularized Reinforcement Learning for Knowledge Transfer |
|
Yang, Quantao | Orebro University |
Stork, Johannes A. | Orebro University |
Stoyanov, Todor | Örebro University |
Keywords: Reinforcement Learning, Transfer Learning, Machine Learning for Robot Control
Abstract: In manufacturing, assembly tasks have been a challenge for learning algorithms due to the varying dynamics of different environments. Reinforcement learning (RL) is a promising framework to automatically learn these tasks, yet it is still not easy to apply a learned policy or skill, that is, the ability to solve a task, to a similar environment even if the deployment conditions are only slightly different. For safety and feasibility reasons, state-of-the-art methods require policy training in simulation to prevent undesired behavior, followed by domain transfer, or guided policy search for similar environments. In this paper, we address the challenge of transferring knowledge within a family of similar tasks by leveraging multiple skill priors. We propose to learn a prior distribution over the specific skill required to accomplish each task and to compose the family of skill priors to guide learning of the policy for a new task, by comparing the similarity between the target task and the prior ones. Our method learns a latent action space representing the skill embedding from demonstrated trajectories for each prior task. We have evaluated our method on a task in simulation and on a set of peg-in-hole insertion tasks, and demonstrate better generalization to new tasks that have never been encountered during training. Our Multi-Prior Regularized RL (MPR-RL) method is deployed directly on a real-world Franka Panda arm, requiring only a set of demonstration trajectories from similar, but crucially not identical, problem instances.
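One simple way to picture "composing a family of skill priors by task similarity" is a similarity-weighted Gaussian mixture over per-task priors; the following NumPy sketch is an illustrative guess at that mechanism (the embedding distance, exponential weighting, and moment matching are all assumptions, not the paper's formulation).

```python
import numpy as np

def mixture_prior(task_emb, prior_embs, prior_means, prior_stds, beta=5.0):
    """Weight each prior task's Gaussian action prior by the similarity of its
    task embedding to the target task, then moment-match the mixture."""
    w = np.exp(-beta * np.linalg.norm(prior_embs - task_emb, axis=1))
    w /= w.sum()
    mean = (w[:, None] * prior_means).sum(axis=0)
    var = (w[:, None] * (prior_stds**2 + (prior_means - mean)**2)).sum(axis=0)
    return mean, np.sqrt(var)

# Two hypothetical prior tasks with 1-D task embeddings and 2-D latent actions.
m, s = mixture_prior(np.array([0.2]),
                     np.array([[0.0], [1.0]]),
                     np.array([[0.5, -0.5], [1.5, 0.5]]),
                     np.array([[0.1, 0.1], [0.2, 0.2]]))
print(m, s)
```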
|
|
10:30-10:40, Paper WeA-16.4 | |
Unsupervised Reinforcement Learning for Transferable Manipulation Skill Discovery |
|
Cho, Daesol | Seoul National University |
Kim, Jigang | Seoul National University |
Kim, H. Jin | Seoul National University |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Transfer Learning
Abstract: Current reinforcement learning (RL) in robotics often experiences difficulty in generalizing to new downstream tasks due to the innately task-specific training paradigm. To alleviate this, unsupervised RL, a framework that pre-trains the agent in a task-agnostic manner without access to the task-specific reward, leverages active exploration to distill diverse experience into essential skills or reusable knowledge. To exploit these benefits in robotic manipulation as well, we propose an unsupervised method for transferable manipulation skill discovery that ties structured exploration toward interacting behavior to transferable skill learning. It not only enables the agent to learn interaction behavior, the key aspect of robotic manipulation learning, without access to the environment reward, but also to generalize to arbitrary downstream manipulation tasks with the learned task-agnostic skills. Through comparative experiments, we show that our approach achieves the most diverse interacting behavior and significantly improves sample efficiency in downstream tasks, including the extension to multi-object, multi-task problems.
|
|
10:40-10:50, Paper WeA-16.5 | |
Subspace-Based Feature Alignment for Unsupervised Domain Adaptation |
|
Yi, Eojindl | KAIST |
Kim, Junmo | KAIST |
Keywords: Transfer Learning, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: Autonomous agents need to perceive the world in a robust way, such that shifts in data distribution do not lead to faulty perception results. When agents cannot be trained with abundant data, they may need to operate in real-world environments while trained on simulated data, and thus suffer from domain shift. This paper proposes an effective and robust unsupervised domain adaptation (UDA) method that can resolve these situations. In the UDA setup, we are given a labeled source domain and an unlabeled target domain that share the same set of classes but are sampled from different distributions. This domain shift prevents agents which employ deep neural networks from generalizing well on the target domain. Recent methods adopt the strategy of self-training the networks with pseudo-labeled target samples. However, falsely labeled samples cause negative transfer and deteriorate the generalization of the network. To reduce negative transfer, we propose an algorithm that can filter the pseudo-labels and use the filtered labels to align the domains in the feature space. The samples whose labels have not passed the filtering process can be used as an index to tune the hyperparameters of our method. Across various benchmarks, we validate the performance of our method. In particular, our method achieves strong performance on the synthetic-to-real adaptation scenario.
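The core filtering step can be pictured as simple confidence thresholding; the sketch below shows that generic baseline. The paper's actual filter and the subspace-based alignment it feeds are more involved, so treat this as an assumption-laden illustration only.

```python
import numpy as np

def filter_pseudo_labels(probs, threshold=0.9):
    """probs: (N, C) softmax outputs of the source model on target samples.
    Returns confident pseudo-labels and a keep-mask; the rejected samples can
    serve as the hyperparameter-tuning index the abstract mentions."""
    confidence = probs.max(axis=1)
    keep = confidence >= threshold
    return probs.argmax(axis=1)[keep], keep

probs = np.array([[0.95, 0.03, 0.02],
                  [0.40, 0.35, 0.25],
                  [0.05, 0.91, 0.04]])
labels, keep = filter_pseudo_labels(probs)
print(labels, keep)   # [0 1] [ True False  True]
```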
|
|
10:50-11:00, Paper WeA-16.6 | |
Using Simulation Optimization to Improve Zero-Shot Policy Transfer of Quadrotors |
|
Gronauer, Sven | Technical University of Munich |
Kissel, Matthias | Technical University of Munich |
Sacchetto, Luca | Technical University of Munich |
Korte, Mathias | Technical University of Munich |
Diepold, Klaus | Technische Universität München |
Keywords: Transfer Learning, Machine Learning for Robot Control, Reinforcement Learning
Abstract: In this work, we propose a data-driven approach to optimize the parameters of a simulation such that control policies can be directly transferred from simulation to a real-world quadrotor. Our neural network-based policies take only onboard sensor data as input and run entirely on the embedded hardware. In real-world experiments, we compare low-level Pulse-Width Modulated control with higher-level control structures such as Attitude Rate and Attitude, which utilize Proportional-Integral-Derivative controllers to output motor commands. Our experiments show that low-level controllers trained with Reinforcement Learning require a more accurate simulation than higher-level control policies at the expense of being less robust towards parameter uncertainties.
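A minimal picture of data-driven simulation optimization: choose simulator parameters so simulated rollouts match logged real rollouts. The 1-D point-mass `sim_rollout` model, the parameter set (mass, drag), and the Nelder-Mead choice below are hypothetical stand-ins for the paper's quadrotor setup, not its actual pipeline.

```python
import numpy as np
from scipy.optimize import minimize

def sim_rollout(params, controls, dt=0.01):
    """Hypothetical 1-D point-mass 'quadrotor': params = (mass, drag)."""
    mass, drag = params
    v, states = 0.0, []
    for u in controls:
        v += (u / mass - drag * v) * dt       # crude 100 Hz Euler integration
        states.append(v)
    return np.array(states)

def fit_sim_params(real_states, controls, x0=(0.025, 0.08)):
    # Match simulated to recorded real trajectories (gradient-free).
    loss = lambda p: np.mean((sim_rollout(p, controls) - real_states) ** 2)
    return minimize(loss, x0, method="Nelder-Mead").x

controls = np.ones(200)
real = sim_rollout((0.032, 0.12), controls)   # pretend these are logged flights
print(fit_sim_params(real, controls))         # recovers roughly (0.032, 0.12)
```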
|
|
11:00-11:10, Paper WeA-16.7 | |
Bilateral Knowledge Distillation for Unsupervised Domain Adaptation of Semantic Segmentation |
|
Wang, Yunnan | Shanghai Jiao Tong University |
Li, Jianxun | Shanghai Jiao Tong University |
Keywords: Transfer Learning, Semantic Scene Understanding, Computer Vision for Transportation
Abstract: Unsupervised domain adaptation (UDA) aims to learn domain-invariant representations between the labeled source domain and the unlabeled target domain. Existing self-training-based UDA methods use ground truth and pseudo-labels to supervise source data and target data respectively. However, strong supervision in the source domain and pseudo-label noise in the target domain lead to some problems, such as biased predictions and over-fitting. To tackle these issues, we propose a novel Bilateral Knowledge Distillation (BKD) framework for UDA in semantic segmentation, which adopts different knowledge distillation strategies depending on the domain. Specifically, we first introduce a Source-Flow Distillation (SD) to smooth the labels of source images, which weakens the supervision in the source domain. Meanwhile, a Target-Flow Distillation (TD) is designed to extract the inter-class knowledge in the probability map output from the teacher model, which alleviates the influence of pseudo-label noise in the target domain. Considering the class imbalance in semantic segmentation, we further propose an Image-Wise Hard Pixel Mining (HPM) to address this issue without estimating class frequency in the unlabeled target domain. The effectiveness of our framework against existing state-of-the-art methods is demonstrated by extensive experiments on two benchmarks: GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes.
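Schematically, the two distillation flows can be written as a label-softening cross-entropy on the source and a teacher-to-student KL on the target; the NumPy sketch below is a generic per-sample form under those assumptions, not the exact BKD losses or the HPM mining step.

```python
import numpy as np

def softmax(x, T=1.0):
    e = np.exp(x / T - np.max(x / T, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def source_flow_loss(student_logits, onehot, teacher_logits, alpha=0.5, T=2.0):
    """Soften hard source labels by mixing in the teacher's prediction,
    weakening the strong source supervision."""
    target = (1 - alpha) * onehot + alpha * softmax(teacher_logits, T)
    return -(target * np.log(softmax(student_logits))).sum(axis=-1).mean()

def target_flow_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on target pixels, transferring inter-class
    knowledge instead of trusting a single noisy pseudo-label."""
    p, q = softmax(teacher_logits, T), softmax(student_logits, T)
    return (p * (np.log(p) - np.log(q))).sum(axis=-1).mean()

logits_s = np.array([[2.0, 0.5, -1.0]])
logits_t = np.array([[1.5, 1.0, -0.5]])
print(source_flow_loss(logits_s, np.array([[1.0, 0.0, 0.0]]), logits_t),
      target_flow_loss(logits_s, logits_t))
```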
|
|
11:10-11:20, Paper WeA-16.8 | |
Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation |
|
Chen, Weijie | Zhejiang University |
Lin, Luojun | Fuzhou University |
Yang, Shicai | Hikvision Research Institute |
Xie, Di | Hikvision Research Institute |
Pu, Shiliang | Hangzhou Hikvision Digital Technology Co. Ltd |
Zhuang, Yueting | Zhejiang University |
Keywords: Transfer Learning, Recognition, Deep Learning Methods
Abstract: Domain adaptation is an important property in robot vision, which enables neural networks pre-trained on source domains to adapt to target domains automatically without any annotation effort. During this process, source data is not always accessible due to the constraints of expensive storage overhead and data privacy protection. Therefore, the source-domain pre-trained model is expected to be optimized with only unlabeled target data, termed source-free unsupervised domain adaptation. In this paper, we view this problem as a special case of noisy label learning, since the given pre-trained model can generate noisy labels for unlabeled target data via network inference. The potential semantic cues for unsupervised domain adaptation lie exactly in these noisy labels. Inspired by this problem modeling, we propose a simple yet effective Self-Supervised Noisy Label Learning method, which injects self-supervised learning to impose the intrinsic data structure and facilitate label denoising. Extensive experiments have been conducted on diverse benchmarks to validate its effectiveness. Our method achieves state-of-the-art performance.
|
|
11:20-11:30, Paper WeA-16.9 | |
Analysis of Randomization Effects on Sim2Real Transfer in Reinforcement Learning for Robotic Manipulation Tasks |
|
Josifovski, Josip | Technical University of Munich |
Malmir, Mohammadhossein | Technical University of Munich |
Klarmann, Noah | Rosenheim University of Applied Sciences |
Zagar, Bare Luka | Technical University Munich |
Navarro-Guerrero, Nicolás | Deutsches Forschungszentrum Für Künstliche Intelligenz (DFKI) GmbH |
Knoll, Alois | Tech. Univ. Muenchen TUM |
Keywords: Transfer Learning, Incremental Learning, Deep Learning in Grasping and Manipulation
Abstract: Randomization is currently a widely used approach in Sim2Real transfer for data-driven learning algorithms in robotics. Still, most Sim2Real studies report results for a specific randomization technique and often on a highly customized robotic system, making it difficult to evaluate different randomization approaches systematically. To address this problem, we define an easy-to-reproduce experimental setup for a robotic reach-and-balance manipulator task, which can serve as a benchmark for comparison. We compare four randomization strategies with three randomized parameters both in simulation and on a real robot. Our results show that more randomization helps in Sim2Real transfer, yet it can also harm the ability of the algorithm to find a good policy in simulation. Fully randomized simulations and fine-tuning show differentiated results and translate better to the real robot than the other approaches tested.
|
|
WeA-17 |
Rm17 (Room 553) |
Assembly and Additive Manufacturing |
Regular session |
Chair: Xiao, Jing | Worcester Polytechnic Institute (WPI) |
Co-Chair: Wan, Weiwei | Osaka University |
|
10:00-10:10, Paper WeA-17.1 | |
Additive Manufacturing for Tissue Engineering Applications in a Temperature-Controlled Environment |
|
Tseng, Wei-Chih | National Central University |
Liao, Chao-Yaug | National Central University |
Chen, Bo-Ren | National Central University |
Chassagne, Luc | University of Versailles |
Cagneau, Barthélemy | Université De Versailles Saint-Quentin En Yvelines |
Keywords: Additive Manufacturing, Product Design, Development and Prototyping
Abstract: In recent years, with the combination of tissue engineering and additive manufacturing technologies, the possibility of fabricating scaffolds with porosity and complex structure has improved. Since the properties of most biomaterial inks are influenced by temperature, which thereby affects the quality of the scaffolds, a controlled printing environment is very important. This study focuses on temperature monitoring from the nozzle to the working platform. A compact heating jacket is developed to heat the needle and sense its temperature inside the nozzle, which makes it very different from common cartridge heating mechanisms. Moreover, a semi-closed printing environment composed of an air curtain and a temperature circulation device is developed to create a stable cooling environment. It improves the uniformity of the work platform and increases cooling time efficiency by 50%. To demonstrate robustness over a wide range of temperatures, this study presents two experiments printing two biomaterial inks at body and low temperatures, respectively.
|
|
10:10-10:20, Paper WeA-17.2 | |
On CAD Informed Adaptive Robotic Assembly |
|
Koga, Yotto | Autodesk |
Kerrick, Heather | Autodesk |
Chitta, Sachin | Autodesk Inc |
Keywords: Assembly, Dual Arm Manipulation, Computer Vision for Automation
Abstract: We introduce a robotic assembly system that streamlines the design-to-make workflow for going from a CAD model of a product assembly to a fully programmed and adaptive assembly process. Our system captures (in the CAD tool) the intent of the assembly process for a specific robotic workcell and generates a recipe of task-level instructions. By integrating visual sensing with deep-learned perception models, the robots infer the necessary actions to assemble the design from the generated recipe. The perception models are trained directly from simulation, allowing the system to identify various parts based on CAD information. We demonstrate the system with a workcell of two robots to assemble interlocking 3D part designs. We first build and tune the assembly process in simulation, verifying the generated recipe. Finally, the real robotic workcell assembles the design using the same behavior.
|
|
10:20-10:30, Paper WeA-17.3 | |
Graph-Based Reinforcement Learning Meets Mixed Integer Programs: An Application to 3D Robot Assembly Discovery |
|
Funk, Niklas Wilhelm | TU Darmstadt |
Menzenbach, Svenja | Technical University of Darmstadt |
Chalvatzaki, Georgia | Technische Universität Darmstadt, Intelligent Robotic Systems |
Peters, Jan | Technische Universität Darmstadt |
Keywords: Assembly, Reinforcement Learning, Task and Motion Planning
Abstract: Robot assembly discovery (RAD) is a challenging problem that lives at the intersection of resource allocation and motion planning. The goal is to combine a predefined set of objects to form something new while considering task execution with the robot-in-the-loop. In this work, we tackle the problem of building arbitrary, predefined target structures entirely from scratch using a set of Tetris-like building blocks and a robotic manipulator. Our novel hierarchical approach aims at efficiently decomposing the overall task into three feasible levels that benefit mutually from each other. On the high level, we run a classical mixed-integer program for global optimization of block-type selection and the blocks’ final poses to recreate the desired shape. Its output is then exploited to efficiently guide the exploration of an underlying reinforcement learning (RL) policy. This RL policy draws its generalization properties from a flexible graph-based representation that is learned through Q-learning and can be refined with search. Moreover, it accounts for the necessary conditions of structural stability and robotic feasibility that cannot be effectively reflected in the previous layer. Lastly, a grasp and motion planner transforms the desired assembly commands into robot joint movements. We demonstrate our proposed method’s performance on a set of competitive simulated RAD environments, showcase real-world transfer, and report performance and robustness gains compared to an unstructured end-to-end approach.
|
|
10:30-10:40, Paper WeA-17.4 | |
Assembly Planning from Observations under Physical Constraints |
|
Chabal, Thomas | Inria and Département d’Informatique De l’Ecole Normale Supérieure |
Strudel, Robin | INRIA Paris |
Arlaud, Etienne | INRIA |
Ponce, Jean | Ecole Normale Supérieure |
Schmid, Cordelia | Inria |
Keywords: Assembly, Manipulation Planning
Abstract: This paper addresses the problem of copying an unknown assembly of primitives with known shape and appearance using information extracted from a single photograph by an off-the-shelf procedure for object detection and pose estimation. The proposed algorithm uses a simple combination of physical stability constraints, convex optimization and Monte Carlo tree search to plan assemblies as sequences of pick-and-place operations represented by STRIPS operators. It is efficient and, most importantly, robust to the errors in object detection and pose estimation unavoidable in any real robotic system. The proposed approach is demonstrated with thorough experiments on a UR5 manipulator.
|
|
10:40-10:50, Paper WeA-17.5 | |
Coordinated Toolpath Planning for Multi-Extruder Additive Manufacturing |
|
Khatkar, Jayant | University of Technology Sydney |
Yoo, Chanyeol | University of Technology Sydney |
Fitch, Robert | University of Technology Sydney |
Clemon, Lee | University of Technology Sydney |
Mettu, Ramgopal | Tulane University |
Keywords: Additive Manufacturing, Task and Motion Planning, Cooperating Robots
Abstract: We present a new algorithm for coordinating the motion of multiple extruders to increase throughput in fused filament fabrication (FFF)/fused deposition modeling (FDM) additive manufacturing. Platforms based on FFF are commonly available and advantageous to several industries, but are limited by slow fabrication times and could be significantly improved through efficient use of multiple extruders. We propose the coordinated toolpath planning problem for systems of extruders mounted as end-effectors on robot arms, with the objective of maximizing utilization and avoiding collisions. Building on the idea of dependency graphs introduced in our earlier work, we develop a planning and control framework that precomputes a set of multi-layer toolpath segments from the input model and efficiently assigns them to individual extruders such that executed toolpaths are collision-free. Our method overcomes key limitations of existing methods, including utilization loss from workspace partitioning, precomputed toolpaths subject to collisions with the partially fabricated object, and wasted motion resulting from strict layer-by-layer fabrication. We report simulation results that show a major increase in utilization compared to single- and multi-extruder methods, and favorable fabrication results using commodity hardware that demonstrate the feasibility of our method in practice.
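In the spirit of the dependency-graph idea, here is a minimal scheduling sketch that assigns precomputed toolpath segments to the first free extruder while respecting precedence (a segment can print only after everything it rests on is done). It deliberately omits the collision checking that is central to the paper, so read it as an illustration of the assignment step under that assumption.

```python
import heapq
import networkx as nx

def schedule_segments(dep_graph, durations, n_extruders=2):
    """dep_graph: nx.DiGraph where edge (a, b) means segment a must finish
    before segment b starts (e.g. b is deposited on top of a)."""
    free = [(0.0, e) for e in range(n_extruders)]   # (time available, extruder id)
    heapq.heapify(free)
    done_at = {}
    for seg in nx.topological_sort(dep_graph):
        t_free, ext = heapq.heappop(free)           # earliest-available extruder
        t_deps = max((done_at[p] for p in dep_graph.predecessors(seg)), default=0.0)
        done_at[seg] = max(t_free, t_deps) + durations[seg]
        heapq.heappush(free, (done_at[seg], ext))
    return done_at                                   # completion time per segment

G = nx.DiGraph([("base", "wall1"), ("base", "wall2"),
                ("wall1", "roof"), ("wall2", "roof")])
print(schedule_segments(G, {"base": 3.0, "wall1": 2.0, "wall2": 2.0, "roof": 1.0}))
```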
|
|
10:50-11:00, Paper WeA-17.6 | |
A Hierarchical Finite-State Machine-Based Task Allocation Framework for Human-Robot Collaborative Assembly Tasks |
|
El Makrini, Ilias | Vrije Universiteit Brussel |
Omidi, Mohsen | Vrije Universiteit Brussel (VUB) |
Fusaro, Fabio | Fondazione Istituto Italiano Di Tecnologia |
Lamon, Edoardo | Istituto Italiano Di Tecnologia |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Vanderborght, Bram | Vrije Universiteit Brussel |
Keywords: Industrial Robots, Assembly, Human-Robot Collaboration
Abstract: Work-related musculoskeletal disorders (MSD) are one of the major causes of injuries and absenteeism at work, and they lead to significant costs in the manufacturing industry. Human-robot collaboration can help mitigate this issue by appropriately distributing tasks and decreasing the workload of the factory worker. This paper proposes a novel generic task allocation approach based on hierarchical finite-state machines for human-robot assembly tasks. The developed framework first decomposes the main task into sub-tasks modeled as state machines. Based on capability considerations, workload, and performance estimations, the task allocator assigns each sub-task to the human or robot agent. The algorithm was validated on the assembly of a crusher unit of a smoothie machine using the collaborative Franka Emika Panda robot and showed promising results in terms of productivity thanks to task parallelization, with an improvement of more than 30% in total assembly time with respect to a collaborative scenario where the agents perform the tasks sequentially.
|
|
11:00-11:10, Paper WeA-17.7 | |
Self-Stabilizing Self-Assembly |
|
Jilek, Martin | Czech Technical University in Prague |
Stránská, Kateřina | Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague |
Somr, Michael | Czech Technical University in Prague, Faculty of Civil Engineering |
Kulich, Miroslav | Czech Technical University in Prague |
Zeman, Jan | Czech Technical University in Prague, Faculty of Civil Engineering |
Preucil, Libor | Czech Technical University in Prague |
Keywords: Assembly, Swarm Robotics
Abstract: The emerging field of passive macro-scale tile-based self-assembly (TBSA) shows promise in enabling effective manufacturing processes by harnessing TBSA’s intrinsic parallelism. However, current TBSA methodologies still do not fulfill their potential, largely because such assemblies are often prone to errors, and the size of an individual assembly is limited due to insufficient mechanical stability. Moreover, the instability issue worsens as assemblies grow in size. Using a novel type of magnetically-bonded tiles carried by bristle-bot drives, we propose here a framework that reverses this tendency; i.e., as an assembly grows, it becomes more stable. Stability is achieved by introducing two sets of tiles that move in opposite directions, thus zeroing the assembly net force. Using physics-based computational experiments, we compare the performance of the proposed approach with the common orbital shaking method, proving that the proposed system of tiles indeed possesses self-stabilizing characteristics. Our approach enables assemblies containing hundreds of tiles to be built, while the shaking approach is inherently limited to a few tens of tiles. Our results indicate that one of the primary limitations of mechanical, agitation-based TBSA approaches, instability, might be overcome by employing a swarm of free-running, sensorless mobile robots, herein represented by passive tiles at the macroscopic scale.
|
|
11:10-11:20, Paper WeA-17.8 | |
Flexible and Precision Snap-Fit Peg-In-Hole Assembly Based on Multiple Sensations and Damping Identification |
|
Liu, Ruikai | Harbin Institute of Technology, Shenzhen |
Yang, Xiansheng | Harbin Institute of Technology, Shenzhen |
Li, Ajian | Harbin Institute of Technology, Shenzhen |
Lou, Yunjiang | Harbin Institute of Technology, Shenzhen |
Keywords: Compliant Assembly, Force and Tactile Sensing, Sensor Fusion
Abstract: Snap-fit peg-in-hole assembly exists widely in both industry and daily life, especially in consumer electronics. The buckle mechanism leads to a damping zone inside the port where the insertion force needs to be increased. It is very difficult to automate this process with robots, since the size and clearance of the components are always small, and the damping buckle must be perceived and distinguished from the solid inner walls of the port. End-effector position control may be invalid, since grasping errors make it difficult to locate the plug accurately. In this article, we undertake this assembly challenge by taking advantage of fingertip tactile perception combined with visual images and force feedback. Raw sensor data is collected, processed, and fused together to form the state input of a reinforcement learning network, which generates continuous action vectors. We also propose a novel damping zone predictor based on feature extraction and multimodal fusion, which is able to identify whether the plug has touched the buckle mechanism, so as to adjust the insertion force. The whole framework is implemented in a common USB Type-C insertion experiment on a Franka Panda robot platform, reaching a success rate of 88%. Furthermore, system robustness is verified, and comparisons of different modalities are also conducted.
|
|
11:20-11:30, Paper WeA-17.9 | |
A General Method for Autonomous Assembly of Arbitrary Parts in the Presence of Uncertainty |
|
Cao, Shichen | Worcester Polytechnic Institute |
Xiao, Jing | Worcester Polytechnic Institute (WPI) |
Keywords: Compliant Assembly, Perception-Action Coupling, Contact Modeling
Abstract: In this paper, we propose a novel and general method for autonomous robotic assembly of arbitrary and complex-shaped parts in the presence of 6-dimensional uncertainty. When a nominal assembly motion of the robot holding a part is stopped by contact due to uncertainty, our method finds the best estimate of the uncertainty and the contact configuration of the part based on sensed force/torque, and uses that information to find a more accurate estimate of the goal configuration to guide a recovery motion of the part. It is based on a general, surface-based sphere-tree representation of parts, a constrained optimization strategy to find the best estimate of the contact configuration under an uncertainty estimate, and a learned force/torque calibration model relating computed force/torque to the sensed real force/torque. The method is applied and evaluated on different complex-shaped multi-peg-in-hole tasks. The results show that our method can achieve successful assembly in the presence of realistic 6-D uncertainties more than 10 times the tight task clearances, in terms of orientation clearance (<0.015 rad) and position clearance (<1.5 mm), in all test cases.
|
|
WeA-18 |
Rm18 (Room 554) |
Motion and Path Planning 7 |
Regular session |
Chair: Sharf, Inna | McGill University |
Co-Chair: Fainekos, Georgios | Toyota Research Institute of North America |
|
10:00-10:10, Paper WeA-18.1 | |
Fast-Replanning Motion Control for Non-Holonomic Vehicles with Aborting A* |
|
Missura, Marcell | University of Bonn |
Roychoudhury, Arindam | University of Bonn |
Bennewitz, Maren | University of Bonn |
Keywords: Motion and Path Planning, Collision Avoidance, Autonomous Vehicle Navigation
Abstract: Autonomously driving vehicles must be able to navigate dynamic and unpredictable environments in a collision-free manner. So far, this has only been partially achieved in driverless cars and warehouse installations where marked structures such as roads, lanes, and traffic signs simplify the motion planning and collision avoidance problem. We present a new control approach for car-like vehicles that is based on an unprecedentedly fast-paced A* implementation that allows the control cycle to run at a frequency of 30 Hz. This frequency enables us to place our A* algorithm as a low-level replanning controller that is well suited for navigation and collision avoidance in virtually any dynamic environment. Due to an efficient heuristic consisting of rotate-translate-rotate motions laid out along the shortest path to the target, our Short-Term Aborting A* (STAA*) converges fast and can be aborted early in order to guarantee a high and steady control rate. While our STAA* expands states along the shortest path, it takes care of collision checking with the environment, including predicted states of moving obstacles, and returns the best solution found when the computation time runs out. Despite the bounded computation time, our STAA* does not get trapped in corners, because it follows the shortest path. In simulated and real-robot experiments, we demonstrate that our control approach eliminates collisions almost entirely and is superior to an improved version of the Dynamic Window Approach with predictive collision avoidance capabilities.
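The "abort early, return the best solution found" idea can be sketched independently of the vehicle kinematics; the Python below is a minimal time-budgeted best-first search that returns its most promising partial path on timeout. It omits the rotate-translate-rotate heuristic and moving-obstacle collision checks, so it illustrates the aborting mechanism only, not STAA* itself.

```python
import heapq
import itertools
import time

def aborting_astar(start, goal, neighbors, h, budget_s=0.03):
    """Best-first search that, instead of failing on timeout, returns the path
    to the most promising node seen so far (lowest heuristic value)."""
    t0, tie = time.monotonic(), itertools.count()
    open_set = [(h(start), next(tie), 0.0, start, [start])]
    best_h, best_path = h(start), [start]
    closed = set()
    while open_set and time.monotonic() - t0 < budget_s:
        f, _, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in closed:
            continue
        closed.add(node)
        if h(node) < best_h:                       # remember best partial result
            best_h, best_path = h(node), path
        for nxt, cost in neighbors(node):
            if nxt not in closed:
                heapq.heappush(open_set, (g + cost + h(nxt), next(tie),
                                          g + cost, nxt, path + [nxt]))
    return best_path                               # best solution found on abort

# Tiny 1-D demo: integer states, goal at 5, unit step costs.
print(aborting_astar(0, 5, lambda n: [(n - 1, 1.0), (n + 1, 1.0)],
                     h=lambda n: abs(5 - n)))
```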
|
|
10:10-10:20, Paper WeA-18.2 | |
Collision and Rollover-Free G² Path Planning for Mobile Manipulation |
|
Song, Jiazhi | McGill University |
Sharf, Inna | McGill University |
Keywords: Motion and Path Planning, Collision Avoidance, Robotics and Automation in Agriculture and Forestry
Abstract: This paper presents a path-planning refinement technique that allows efficient collision- and rollover-free motion planning for mobile manipulator robots working on rough terrain. First, the necessary theoretical background on a mobile manipulator's kinematics and dynamic stability measure is introduced. Then, after a brief introduction of the sampling-based path planning problem, the additional refinement stage and its problem formulation are introduced. Within the refinement stage, a novel Bézier control point addition method is introduced to allow for fast, collision- and rollover-free path smoothing using curvature-continuous parametrized curves. Analytical proofs and simulated comparisons are provided in the paper to show effectiveness. The beneficial effect of the refined path on trajectory planning is also demonstrated through simulation.
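For intuition, adding a control point to a Bézier segment without changing the curve is classically done by degree elevation, and the curve itself is evaluated with de Casteljau's algorithm; the sketch below shows those two standard building blocks (the paper's specific control point addition rule may differ).

```python
import numpy as np

def bezier(ctrl, t):
    """Evaluate a Bézier curve at parameter t via de Casteljau's algorithm."""
    pts = np.asarray(ctrl, dtype=float)
    while len(pts) > 1:
        pts = (1 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

def elevate_degree(ctrl):
    """Insert one control point without changing the curve (degree elevation),
    adding a local degree of freedom that a smoother can then move."""
    P = np.asarray(ctrl, dtype=float)
    n = len(P) - 1
    Q = [P[0]]
    for i in range(1, n + 1):
        Q.append(i / (n + 1) * P[i - 1] + (1 - i / (n + 1)) * P[i])
    Q.append(P[-1])
    return np.array(Q)

ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
print(bezier(ctrl, 0.5))                    # point on the cubic at t = 0.5
print(bezier(elevate_degree(ctrl), 0.5))    # same point from the quartic
```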
|
|
10:20-10:30, Paper WeA-18.3 | |
Fast 3D Sparse Topological Skeleton Graph Generation for Mobile Robot Global Planning |
|
Chen, Xinyi | The Hong Kong University of Science and Technology |
Zhou, Boyu | Hong Kong University of Science and Technology |
Lin, Jiarong | The University of Hong Kong |
Zhang, Yichen | The Hong Kong University of Science and Technology |
Zhang, Fu | University of Hong Kong |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Mapping
Abstract: In recent years, mobile robots are becoming increasingly ambitious and are deployed in large-scale scenarios. Serving as a high-level understanding of environments, a sparse skeleton graph is beneficial for more efficient global planning. Currently, existing solutions for skeleton graph generation suffer from several major limitations, including poor adaptiveness to different map representations, dependency on robot inspection trajectories, and high computational overhead. In this paper, we propose an efficient and flexible algorithm generating a trajectory-independent 3D sparse topological skeleton graph that captures the spatial structure of the free space. In our method, an efficient ray sampling and validating mechanism is adopted to find distinctive free-space regions, which contribute the skeleton graph vertices, with traversability between adjacent vertices as edges. A cycle formation scheme is also utilized to maintain skeleton graph compactness. Benchmark comparisons with state-of-the-art works demonstrate that our approach generates sparse graphs in a substantially shorter time, giving high-quality global planning paths. Experiments conducted on real-world maps further validate the capability of our method in real-world scenarios. Our method will be made open source to benefit the community.
|
|
10:30-10:40, Paper WeA-18.4 | |
Learning Enabled Fast Planning and Control in Dynamic Environments with Intermittent Information |
|
Cleaveland, Matthew | University of Pennsylvania |
Yel, Esen | Stanford University |
Kantaros, Yiannis | Washington University in St. Louis |
Lee, Insup | University of Pennsylvania |
Bezzo, Nicola | University of Virginia |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Collision Avoidance
Abstract: This paper addresses a safe planning and control problem for mobile robots operating in communication- and sensor-limited dynamic environments. In this case the robots cannot sense the objects around them and must instead rely on intermittent, external information about the environment, as, e.g., in underwater applications. The challenge in this case is that the robots must plan using only this stale data, while accounting for any noise in the data or uncertainty in the environment. To address this challenge, we propose a compositional technique which leverages neural networks to quickly plan and control a robot through crowded and dynamic environments using only intermittent information. Specifically, our tool uses reachability analysis and potential fields to train a neural network that is capable of generating safe control actions. We demonstrate our technique both in simulation, with an underwater vehicle crossing a crowded shipping channel, and in real experiments with ground vehicles in communication- and sensor-limited environments.
|
|
10:40-10:50, Paper WeA-18.5 | |
NMPC-LBF: Nonlinear MPC with Learned Barrier Function for Decentralized Safe Navigation of Multiple Robots in Unknown Environments |
|
Salimi Lafmejani, Amir | Arizona State University |
Berman, Spring | Arizona State University |
Fainekos, Georgios | Toyota Research Institute of North America |
Keywords: Motion and Path Planning, Multi-Robot Systems, Machine Learning for Robot Control
Abstract: In this paper, we present a decentralized control approach based on a Nonlinear Model Predictive Control (NMPC) method that employs barrier certificates for safe navigation of multiple nonholonomic wheeled mobile robots in unknown environments with static and/or dynamic obstacles. This method incorporates a Learned Barrier Function (LBF) into the NMPC design in order to guarantee safe robot navigation, i.e., prevent robot collisions with other robots and the obstacles. We refer to our proposed control approach as NMPC-LBF. Since each robot does not have a priori knowledge about the obstacles and other robots, we use a Deep Neural Network (DeepNN) running in real-time on each robot to learn the Barrier Function (BF) only from the robot's LiDAR and odometry measurements. The DeepNN is trained to learn the BF that separates safe and unsafe regions. We implemented our proposed method on simulated and actual Turtlebot3 Burger robot(s) in different scenarios. The implementation results show the effectiveness of the NMPC-LBF method at ensuring safe navigation of the robots.
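As background for how a learned barrier enters an NMPC, one common discrete-time form is the constraint h(x_{k+1}) >= (1 - gamma) h(x_k) imposed along the horizon. The sketch below encodes that margin with a hand-written stand-in for the DeepNN barrier; the single-integrator dynamics, gamma value, and h itself are assumptions for illustration, not the paper's trained LBF.

```python
import numpy as np

def dcbf_margin(h, step, x, u, gamma=0.2):
    """Discrete-time barrier condition: h(x_next) - (1 - gamma) * h(x) >= 0.
    The NMPC keeps this margin non-negative along its horizon; h would be the
    learned DeepNN barrier in the paper, a hand-written stand-in here."""
    return h(step(x, u)) - (1.0 - gamma) * h(x)

h = lambda x: np.linalg.norm(x - np.array([2.0, 2.0])) - 0.5   # stay 0.5 m away
step = lambda x, u: x + 0.1 * u                                # single integrator
print(dcbf_margin(h, step, np.zeros(2), np.array([1.0, 0.0])))  # >= 0: safe step
```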
|
|
10:50-11:00, Paper WeA-18.6 | |
FISS: A Trajectory Planning Framework Using Fast Iterative Search and Sampling Strategy for Autonomous Driving |
|
Sun, Shuo | National University of Singapore |
Liu, Zhiyang | National University of Singapore |
Yin, Huan | Hong Kong University of Science and Technology |
Ang Jr, Marcelo H | National University of Singapore |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Intelligent Transportation Systems
Abstract: Trajectory planning is a critical component of autonomous vehicles, directly responsible for driving safety and efficiency during deployment. The ability to find the optimal trajectory in real time is critical for autonomous driving. This paper presents a novel general framework using the Fast Iterative Search and Sampling (FISS) strategy for sampling-based trajectory planning, which can find the optimal trajectory from an enormous number of candidates with high efficiency in real time. Specifically, before generating any trajectories, the proposed method utilizes historical planning results as prior information in heuristics to estimate the cost distribution over the sampling space. On this basis, the Fast Iterative Search and Sampling strategy is employed to explore the sampling space for possible candidates and generate trajectories for verification during the search process. Experimental results show that our method can significantly outperform existing frameworks by an order of magnitude in planning efficiency while ensuring safety and maintaining high accuracy.
|
|
11:00-11:10, Paper WeA-18.7 | |
Reshaping Local Path Planner |
|
Sarvesh, Akshay | Texas A&M University |
Carroll, Austin | Texas A&M |
Gopalswamy, Swaminathan | Texas A&M University |
Keywords: Motion and Path Planning, Collision Avoidance, Constrained Motion Planning
Abstract: This paper proposes a path planner that reshapes a global path locally in response to sensor-based observations of obstacles in the environment. Two fundamental concepts enable the resultant algorithm: (a) a path-following synthetic vehicle whose steering actions are non-myopically optimized to result in a smooth traversable path that meets path curvature constraints, and (b) a path-aware turning moment field that enables obstacle avoidance while eluding the typical local-minimum-induced stagnation associated with potential field methods. The combination of these two concepts results in a reduced action space over which optimization needs to be performed to minimize path deviation subject to obstacle avoidance, and thus yields an efficient algorithm that can be implemented online. We demonstrate the algorithm in simulations as well as in field experiments, performing real-time local path planning and obstacle avoidance on two different vehicle platforms (a two-axled Ackermann-steering vehicle and a four-axled differential-steering vehicle) in unstructured off-road terrain.
|
|
11:10-11:20, Paper WeA-18.8 | |
T-PRM: Temporal Probabilistic Roadmap for Path Planning in Dynamic Environments |
|
Hüppi, Matthias | ETH Zurich |
Bartolomei, Luca | ETH Zurich |
Mascaro Palliser, Ruben | ETH Zurich |
Chli, Margarita | ETH Zurich |
Keywords: Motion and Path Planning, Collision Avoidance
Abstract: Sampling-based motion planners are widely used in robotics due to their simplicity, flexibility and computational efficiency. However, in their most basic form, these algorithms operate under the assumption of static scenes and lack the ability to avoid collisions with dynamic (i.e. moving) obstacles. This raises safety concerns, limiting the range of possible applications of mobile robots in the real world. Motivated by these challenges, in this work we present Temporal-PRM, a novel sampling-based path-planning algorithm that performs obstacle avoidance in dynamic environments. The proposed approach extends the original Probabilistic Roadmap (PRM) with the notion of time, generating an augmented graph-like structure that can be efficiently queried using a time-aware variant of the A* search algorithm, also introduced in this paper. Our design maintains all the properties of PRM, such as the ability to perform multiple queries and to find smooth paths, while circumventing its downside by enabling collision avoidance in highly dynamic scenes with a minor increase in the computational cost. Through a series of challenging experiments in highly cluttered and dynamic environments, we demonstrate that the proposed path planner outperforms other state-of-the-art sampling-based solvers. Moreover, we show that our algorithm can run onboard a flying robot, performing obstacle avoidance in real time.
|
|
11:20-11:30, Paper WeA-18.9 | |
Hierarchical Planning with Annotated Skeleton Guidance |
|
Uwacu, Diane | Texas A&M University |
Yammanuru, Ananya | University of Illinois at Urbana-Champaign |
Morales, Marco | University of Illinois at Urbana-Champaign & Instituto Tecnológico Autónomo De México |
Amato, Nancy | University of Illinois |
Keywords: Motion and Path Planning, Collision Avoidance, Computational Geometry
Abstract: We present a hierarchical skeleton-guided motion planning algorithm to guide mobile robots. A good skeleton maps the connectivity of the subspace of C-space containing significant degrees of freedom and is able to guide the planner to find the desired solutions fast. However, sometimes the skeleton does not closely represent the free C-space, which often misleads current skeleton-guided planners. The hierarchical skeleton-guided planning strategy gradually relaxes its reliance on the workspace skeleton as C-space is sampled, thereby incrementally returning a sub-optimal path, a feature that is not guaranteed in the standard skeleton-guided algorithm. Experimental comparisons to standard skeleton-guided planners and other lazy planning strategies show a significant improvement in roadmap construction run time while maintaining path quality for multi-query problems in cluttered environments.
|
|
WeA-19 |
Rm19 (Room 555) |
Legged Robots 1 |
Regular session |
Chair: Bousmalis, Konstantinos | DeepMind |
Co-Chair: Nguyen, Quan | University of Southern California |
|
10:00-10:10, Paper WeA-19.1 | |
Learning Coordinated Terrain-Adaptive Locomotion by Imitating a Centroidal Dynamics Planner |
|
Brakel, Philemon | Deepmind |
Bohez, Steven | DeepMind |
Hasenclever, Leonard | DeepMind |
Heess, Nicolas | Deepmind |
Bousmalis, Konstantinos | DeepMind |
Keywords: Legged Robots, Deep Learning Methods, Reinforcement Learning
Abstract: We propose a simple imitation learning procedure for learning locomotion controllers that can walk over very challenging terrains. We use trajectory optimization (TO) to produce a large dataset of trajectories over procedurally generated terrains and use Reinforcement Learning (RL) to imitate these trajectories. We demonstrate with a realistic model of the ANYmal robot that the learned controllers transfer to unseen terrains and provide an effective initialization for fine-tuning on challenging terrains that require exteroception and precise foot placements. Our setup combines TO and RL in a simple fashion that overcomes the computational limitations and need for a robust tracking controller of the former and the exploration and reward-tuning difficulties of the latter.
|
|
10:10-10:20, Paper WeA-19.2 | |
A Versatile Co-Design Approach for Dynamic Legged Robots |
|
Dinev, Traiko | The University of Edinburgh |
Mastalli, Carlos | Heriot-Watt University |
Ivan, Vladimir | Touchlab Limited |
Tonneau, Steve | The University of Edinburgh |
Vijayakumar, Sethu | University of Edinburgh |
Keywords: Legged Robots, Mechanism Design, Optimization and Optimal Control
Abstract: We present a versatile framework for the computational co-design of legged robots and dynamic maneuvers. Current state-of-the-art approaches are typically based on random sampling or concurrent optimization. We propose a novel bilevel optimization approach that exploits the derivatives of the motion planning sub-problem (i.e., the lower level). These motion-planning derivatives allow us to incorporate arbitrary design constraints and costs in a general-purpose nonlinear program (i.e., the upper level). Our approach thus allows the use of any differentiable motion planner in the lower level, paired with an upper level that captures arbitrary design constraints and costs. It efficiently optimizes the robot’s morphology, payload distribution and actuator parameters while considering its full dynamics, joint limits and physical constraints such as friction cones. We demonstrate these capabilities by designing quadruped robots that jump and trot. We show that our method is able to design a more energy-efficient Solo robot for these tasks.
|
|
10:20-10:30, Paper WeA-19.3 | |
Motion Planning for Agile Legged Locomotion Using Failure Margin Constraints |
|
Green, Kevin | Oregon State University |
Warila, John | Oregon State University |
Hatton, Ross | Oregon State University |
Hurst, Jonathan | Oregon State University |
Keywords: Legged Robots, Human and Humanoid Motion Analysis and Synthesis, Motion and Path Planning
Abstract: The complex dynamics of agile robotic legged locomotion requires motion planning to intelligently adjust footstep locations. Often, bipedal footstep and motion planning use mathematically simple models such as the linear inverted pendulum, instead of dynamically-rich models that do not have closed-form solutions. We propose a real-time optimization method to plan for dynamical models that do not have closed form solutions and experience irrecoverable failure. Our method uses a data-driven approximation of the step-to-step dynamics and of a failure margin function. This failure margin function is an oriented distance function in state-action space where it describes the signed distance to success or failure. The motion planning problem is formed as a nonlinear program with constraints that enforce the approximated forward dynamics and the validity of state-action pairs. For illustration, this method is applied to create a planner for an actuated spring-loaded inverted pendulum model. In an ablation study, the failure margin constraints decreased the number of invalid solutions by between 24 and 47 percentage points across different objectives and horizon lengths. While we demonstrate the method on a canonical model of locomotion, we also discuss how this can be applied to data-driven models and full-order robot models.
|
|
10:30-10:40, Paper WeA-19.4 | |
Energy-Based Legged Robots Terrain Traversability Modeling Via Deep Inverse Reinforcement Learning |
|
Gan, Lu | California Institute of Technology |
Grizzle, J.W | University of Michigan |
Eustice, Ryan | University of Michigan |
Ghaffari, Maani | University of Michigan |
Keywords: Legged Robots, Energy and Environment-Aware Automation, Learning from Demonstration
Abstract: This work reports on developing a deep inverse reinforcement learning method for legged robot terrain traversability modeling that incorporates both exteroceptive and proprioceptive sensory data. Existing works use robot-agnostic exteroceptive environmental features or handcrafted kinematic features; instead, we propose to also learn robot-specific inertial features from proprioceptive sensory data for reward approximation in a single deep neural network. Incorporating the inertial features can improve model fidelity and provide a reward that depends on the robot's state during deployment. We train the reward network using the Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) algorithm and propose simultaneously minimizing a trajectory ranking loss to deal with the suboptimality of legged robot demonstrations. The demonstrated trajectories are ranked by locomotion energy consumption, in order to learn an energy-aware reward function and a more energy-efficient policy than the demonstrations. We evaluate our method using a dataset collected by an MIT Mini-Cheetah robot and a Mini-Cheetah simulator. The code is publicly available at https://github.com/ganlumomo/minicheetah-traversability-irl.
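A trajectory ranking loss of the general kind described can be written as a pairwise hinge over energy-ranked demonstrations; the NumPy sketch below shows one common form, which should be read as an assumption about the family of losses rather than the paper's exact objective.

```python
import numpy as np

def trajectory_ranking_loss(returns, ranks, margin=1.0):
    """returns: (N,) predicted summed reward per demonstrated trajectory;
    ranks: (N,) ints, smaller = more energy-efficient demonstration.
    Hinge penalty whenever a better-ranked trajectory does not out-score a
    worse-ranked one by at least `margin`."""
    loss = 0.0
    for i in range(len(returns)):
        for j in range(len(returns)):
            if ranks[i] < ranks[j]:
                loss += max(0.0, margin - (returns[i] - returns[j]))
    return loss

print(trajectory_ranking_loss(np.array([3.0, 2.5, 0.2]), np.array([0, 1, 2])))
```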
|
|
10:40-10:50, Paper WeA-19.5 | |
Robust High-Speed Running for Quadruped Robots Via Deep Reinforcement Learning |
|
Bellegarda, Guillaume | EPFL |
Chen, Yiyu | University of Southern California |
Liu, Zhuochen | University of Southern California |
Nguyen, Quan | University of Southern California |
Keywords: Legged Robots, Machine Learning for Robot Control, Motion Control
Abstract: Deep reinforcement learning has emerged as a popular and powerful way to develop locomotion controllers for quadruped robots. Common approaches have largely focused on learning actions directly in joint space, or learning to modify and offset foot positions produced by trajectory generators. Both approaches typically require careful reward shaping and training for millions of time steps, and with trajectory generators introduce human bias into the resulting control policies. In this paper, we present a learning framework that leads to the natural emergence of fast and robust bounding policies for quadruped robots. The agent both selects and controls actions directly in task space to track desired velocity commands subject to environmental noise including model uncertainty and rough terrain. We observe that this framework improves sample efficiency, necessitates little reward shaping, leads to the emergence of natural gaits such as galloping and bounding, and eases the sim-to-real transfer at running speeds. Policies can be learned in only a few million time steps, even for challenging tasks of running over rough terrain with loads of over 100% of the nominal quadruped mass. Training occurs in PyBullet, and we perform a sim-to-sim transfer to Gazebo and sim-to-real transfer to the Unitree A1 hardware. For sim-to-sim, our results show the quadruped is able to run at over 4 m/s without a load, and 3.5 m/s with a 10 kg load, which is over 83% of the nominal quadruped mass. For sim-to-real, the Unitree A1 is able to bound at 2 m/s with a 5 kg load, representing 42% of the nominal quadruped mass.
|
|
10:50-11:00, Paper WeA-19.6 | |
Toward a Data-Driven Template Model for Quadrupedal Locomotion |
|
Fawcett, Randall | Virginia Polytechnic Institute and State University |
Afsari, Kereshmeh | Virginia Tech |
Ames, Aaron | Caltech |
Akbari Hamed, Kaveh | Virginia Tech |
Keywords: Legged Robots, Motion Control, Multi-Contact Whole-Body Motion Planning and Control
Abstract: This work investigates a data-driven template model for trajectory planning of dynamic quadrupedal robots. Many state-of-the-art approaches involve using a reduced-order model, primarily due to computational tractability. The spirit of the trajectory planning approach in this work draws on recent advancements in the area of behavioral systems theory. Here, we aim to capitalize on the knowledge of well-known template models to construct a data-driven model, enabling us to obtain an information-rich reduced-order model. In particular, this work considers input-output states similar to those of the single rigid body model and proceeds to develop a data-driven representation of the system, which is then used in a predictive control framework to plan a trajectory for quadrupeds. The optimal trajectory is passed to a low-level, nonlinear model-based controller to be tracked. Preliminary experimental results are provided to establish the efficacy of this hierarchical control approach for trotting and walking gaits of a high-dimensional quadrupedal robot on unknown terrains and in the presence of disturbances.
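In behavioral-systems-style methods, the data-driven model is typically a Hankel matrix built from recorded input-output trajectories, and prediction amounts to solving for a linear combination of data columns consistent with the recent past. The sketch below shows that generic, noise-free mechanism (a least-squares variant of data-driven simulation); the paper's predictive-control formulation and its particular choice of input-output states are not reproduced here.

```python
import numpy as np

def hankel(w, L):
    """Stack all length-L windows of a signal w with shape (T, d) as columns."""
    return np.hstack([w[i:i + L].reshape(-1, 1) for i in range(len(w) - L + 1)])

def predict(u_data, y_data, u_ini, y_ini, u_future):
    """Noise-free data-driven simulation: find a combination g of recorded
    trajectory windows matching the past (u_ini, y_ini) and the chosen future
    inputs, then read the predicted outputs off the data."""
    Tini, N = len(u_ini), len(u_future)
    Up, Uf = np.split(hankel(u_data, Tini + N), [Tini * u_data.shape[1]])
    Yp, Yf = np.split(hankel(y_data, Tini + N), [Tini * y_data.shape[1]])
    A = np.vstack([Up, Yp, Uf])
    b = np.concatenate([u_ini.ravel(), y_ini.ravel(), u_future.ravel()])
    g, *_ = np.linalg.lstsq(A, b, rcond=None)
    return (Yf @ g).reshape(N, y_data.shape[1])

# Toy first-order system y[t+1] = 0.9 y[t] + u[t] generates the template data.
rng = np.random.default_rng(0)
u = rng.standard_normal((40, 1))
y = np.zeros((40, 1))
for t in range(39):
    y[t + 1] = 0.9 * y[t] + u[t]
# Predicts y[4] = 0.9*y[3] + u[3], then decay by 0.9 per step under zero input.
print(predict(u, y, u[:4], y[:4], np.zeros((3, 1))))
```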
|
|
11:00-11:10, Paper WeA-19.7 | |
Planning of Obstacle-Aided Navigation for Multi-Legged Robots Using a Sampling-Based Method Over Directed Graphs |
|
Chakraborty, Kaustav | University of Southern California |
Hu, Haodi | University of Southern California |
Kvalheim, Matthew | University of Pennsylvania |
Qian, Feifei | University of Southern California |
Keywords: Legged Robots, Dynamics
Abstract: Existing work in legged robot navigation in cluttered environments often seeks collision-free paths that avoid obstacle interactions. Here we present a new approach for multi-legged robots to utilize leg-obstacle collisions to generate desired dynamics across obstacle fields. To predict the change of robot state (i.e., position and orientation) under repeated leg-obstacle collisions, we construct a discretized directed graph model: each node of the graph represents a different robot state, whereas the directed edges pointing from one node to another represent the transitions from one robot state to the next within one stride. These obstacle-modulated state transitions can depend on the robot gaits used. To capture this dependence, an empirical interaction model is used to compute the change of robot state based on initial contact positions between robot legs and obstacles. To validate the prediction of robot state transitions, we experimentally measure the state of a quadrupedal robot as it traverses a periodic obstacle field with three different gaits: bound, trot, and pace. We observed that the robot could passively converge to different steady state orientations, and these steady states corresponded well with the Strongly-Connected-Path-Components (SCPCs) within the directed graph. Searching over the graph for connected paths of SCPCs allows development of a gait planner that can generate gait switching strategies for a robot to achieve desired states by simply engaging with a sequence of leg-obstacle collisions. We demonstrate in experiments that, by using the gait sequences generated by our planner, an open-loop quadrupedal robot was able to successfully achieve desired orientations within a periodic obstacle field without any sensory input or active steering.
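The graph-theoretic step is easy to reproduce with standard tools: build one directed stride-transition graph per gait and look for strongly connected components, which correspond to the attracting steady states observed in the experiments. The toy graph below is hypothetical; only the use of strongly connected components mirrors the paper.

```python
import networkx as nx

# Hypothetical stride-transition graph for one gait: nodes are discretized robot
# states; an edge u -> v means one stride of leg-obstacle collisions takes the
# robot from state u to state v in the periodic obstacle field.
G = nx.DiGraph([(0, 1), (1, 2), (2, 1), (2, 3), (3, 3)])

def steady_states(gait_graph):
    """Strongly connected components with a cycle (size > 1 or a self-loop)
    are candidate attracting steady states of the gait."""
    return [c for c in nx.strongly_connected_components(gait_graph)
            if len(c) > 1 or any(gait_graph.has_edge(v, v) for v in c)]

print(steady_states(G))   # components such as {1, 2} and {3}
```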
|
|
11:10-11:20, Paper WeA-19.8 | |
Real-Time Digital Double Framework to Predict Collapsible Terrains for Legged Robots |
|
Haddeler, Garen | National University of Singapore, Agency for Science, Technology and Research (A*STAR) |
Palanivelu, Hari Prasanth | Insititute for Infocomm Research (I2R) |
Ng, Yung Chuen | Institute for Infocomm Research (I2R), A*STAR |
Colonnier, Fabien | Institute for Infocomm Research (I2R), A*STAR |
Adiwahono, Albertus Hendrawan | I2R A-STAR |
Li, Zhibin | University of Edinburgh |
Chew, Chee Meng | National University of Singapore |
Chuah, Meng Yee (Michael) | Agency for Science, Technology and Research (A*STAR) |
Keywords: Legged Robots, Hardware-Software Integration in Robotics, Sensorimotor Learning
Abstract: Inspired by digital twinning systems, a novel real-time digital double framework is developed to enhance robot perception of terrain conditions. Based on the very same physical model and motion control, this work exploits a simulated digital double synchronized with the real robot to capture and extract discrepancy information between the two systems, which provides high-dimensional cues in multiple physical quantities to represent differences between the modelled and the real world. Soft, non-rigid terrains cause common failures in legged locomotion, and visual perception alone is insufficient for estimating such physical properties of terrains. We used the digital double to estimate terrain collapsibility, addressing this issue through physical interactions during dynamic walking. The discrepancy in sensory measurements between the real robot and its digital double is used as the input to a learning-based algorithm for terrain collapsibility analysis. Although trained only in simulation, the learned model can perform collapsibility estimation successfully in both simulation and the real world. Our evaluation showed generalization to different scenarios and the advantages of the digital double in reliably detecting nuances in ground conditions.
|
|
11:20-11:30, Paper WeA-19.9 | |
Adaptive Feet for Quadrupedal Walkers (I) |
|
Catalano, Manuel Giuseppe | Istituto Italiano Di Tecnologia |
Pollayil, Mathew Jose | University of Pisa |
Grioli, Giorgio | Istituto Italiano Di Tecnologia |
Valsecchi, Giorgio | Robotic System Lab, ETH |
Kolvenbach, Hendrik | ETH Zurich |
Hutter, Marco | ETH Zurich |
Bicchi, Antonio | Fondazione Istituto Italiano Di Tecnologia |
Garabini, Manolo | Università Di Pisa |
Keywords: Compliant Joints and Mechanisms, Mechanism Design, Biologically-Inspired Robots
Abstract: The vast majority of state-of-the-art walking robots employ flat or ball feet for locomotion, which present limitations when stepping on obstacles, slopes, or unstructured terrain. Moreover, traditional feet for quadrupeds lack sensing systems that can provide information about the environment and about the foot's interaction with its surroundings, which further diminishes their value. Inspired by our previous work on soft feet for bipedal robots, we present the SoftFoot-Q, an articulated adaptive foot for quadrupeds. This device is conceived to be robust and to overcome the limitations of currently employed feet. The core idea behind our adaptive foot design is first introduced and validated through a simplified mathematical formulation of the problem. Subsequently, we present the chosen mechanical implementation for overcoming current limitations. The realized prototype of the adaptive foot is integrated and tested on the compliantly actuated quadrupedal robot ANYmal, together with ROS-based real-time foot pose reconstruction software. Both extensive field tests and indoor experiments show noticeable performance improvements, in terms of reduced slippage of the robot, with respect to both flat and ball feet.
|
|
WeA-20 |
Rm20 (Room 104) |
Art and Entertainment and Manipulation |
Regular session |
Chair: Erickson, Zackory | Carnegie Mellon University |
Co-Chair: Sewlia, Mayank | KTH Royal Institute of Technology |
|
10:00-10:10, Paper WeA-20.1 | |
Towards Learning to Play Piano with Dexterous Hands and Touch |
|
Xu, Huazhe | Stanford University |
Luo, Yuping | Princeton University |
Wang, Shaoxiong | MIT |
Darrell, Trevor | UC Berkeley |
Calandra, Roberto | Meta AI |
Keywords: Art and Entertainment Robotics, Force and Tactile Sensing, Sensorimotor Learning
Abstract: As Liszt once said, "(a virtuoso) must call up scent and blossom, and breathe the breath of life": a virtuoso plays the piano with passion, poetry, and extraordinary technical ability. Piano playing, being a quintessentially human task, has thus become a hallmark for roboticists and artificial intelligence researchers to pursue. In this paper, we advocate an end-to-end reinforcement learning (RL) paradigm to demonstrate how an agent can learn directly from machine-readable music scores to play a simulated piano with touch-augmented dexterous hands. To achieve the desired tasks, we design useful touch- and audio-based reward functions and a series of tasks. Empirical results show that the RL agent can not only find the correct key positions but also satisfy various rhythmic, volume, and fingering requirements. The agent successfully plays simple pieces with different musical requirements, which shows the potential of leveraging reinforcement learning for piano-playing tasks.
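As a hedged illustration only, a touch- and audio-based reward of the kind the abstract mentions might combine key correctness with a contact bonus; the paper's actual reward terms are richer and every name below is invented:

    def piano_reward(pressed_keys, target_keys, fingertip_contacts, w_touch=0.1):
        # +1 per correctly pressed key, -1 per wrong key (audio term),
        # plus a small bonus for fingertip-key contact (touch term).
        pressed, target = set(pressed_keys), set(target_keys)
        audio = len(pressed & target) - len(pressed - target)
        touch = w_touch * sum(fingertip_contacts)  # binary contact flags
        return audio + touch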
|
|
10:10-10:20, Paper WeA-20.2 | |
Consensus-Based Normalizing-Flow Control: A Case Study in Learning Dual-Arm Coordination |
|
Yin, Hang | KTH |
Verginis, Christos | Uppsala University |
Kragic, Danica | KTH |
Keywords: Dual Arm Manipulation, Multi-Robot Systems, Reinforcement Learning
Abstract: We develop two consensus-based learning algorithms for multi-robot systems applied to complex tasks involving collision constraints and force interactions, such as cooperative peg-in-hole placement. The proposed algorithms integrate multi-robot distributed consensus and normalizing-flow-based reinforcement learning. The algorithms guarantee the stability and consensus of the multi-robot system's generalized variables in a transformed space. This transformed space is obtained via a diffeomorphic transformation parameterized by normalizing-flow models, which the algorithms use to train the underlying task, hence learning the skillful, dexterous trajectories it requires. We validate the proposed algorithms by parameterizing reinforcement learning policies, demonstrating efficient cooperative learning and strong generalization of dual-arm assembly skills in a dynamics-engine simulator.
|
|
10:20-10:30, Paper WeA-20.3 | |
Toward Efficient Task Planning for Dual-Arm Tabletop Object Rearrangement |
|
Gao, Kai | Rutgers University |
Yu, Jingjin | Rutgers University |
Keywords: Dual Arm Manipulation, Task Planning, Cooperating Robots
Abstract: We investigate the problem of coordinating two robot arms to solve non-monotone tabletop multi-object rearrangement tasks. In a non-monotone rearrangement task, complex object-object dependencies exist that require moving some objects multiple times to solve an instance. When working with two arms in a large workspace, some objects must be handed off between the robots, which further complicates the planning process. For this challenging dual-arm tabletop rearrangement problem, we develop effective task planning algorithms for scheduling a pick-and-place sequence that is properly distributed between the two arms. We show that, even without a sophisticated motion planner, our method achieves significant time savings compared to greedy approaches and to naive parallelization of single-robot plans.
|
|
10:30-10:40, Paper WeA-20.4 | |
Simultaneous Depth Estimation and Localization for Cell Manipulation Based on Deep Learning |
|
Wang, Zengshuo | Nankai University |
Gong, Huiying | Nankai University |
Li, Ke | Nankai University |
Yang, Bin | Nankai University |
Du, Yue | Nankai University |
Liu, Yaowei | Nankai University |
Zhao, Xin | Nankai University |
Sun, Mingzhu | Nankai University |
Keywords: Biological Cell Manipulation, Computer Vision for Automation
Abstract: Visual localization, a key technology for automating cell manipulation, has been widely studied. Since the depth of field of a microscope is narrow, planar localization and depth estimation are usually coupled. Most current methods adopt a serial working mode of focusing first and then localizing in the plane, which usually lacks real-time performance and stability. In this paper, we develop a simultaneous depth estimation and localization network for cell manipulation. The network takes a focused image and a defocus-offset image as inputs and simultaneously outputs the defocus in the depth direction and the offset in the plane, after passing through defocus-offset information extraction, defocus classification mapping, and offset regression mapping. To train and test the network, we also create two datasets: an Adherent Cell dataset and an Injection Micropipette dataset. Experimental results demonstrate that the proposed method detects all test samples at a frame rate above 40 Hz, with maximum depth estimation and localization errors of 2.44 μm and 0.49 μm, respectively. The method is stable, as reflected mainly in its strong generalization and noise robustness.
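The dual-output structure can be sketched as follows; this is a PyTorch illustration with assumed layer sizes, not the published architecture. A shared encoder over the focused/defocus-offset image pair feeds a defocus classification head and a planar offset regression head:

    import torch
    import torch.nn as nn

    class DepthOffsetNet(nn.Module):
        def __init__(self, n_depth_classes=64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(2, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
            self.depth_head = nn.Linear(64, n_depth_classes)   # defocus class
            self.offset_head = nn.Linear(64, 2)                # (dx, dy) offset

        def forward(self, focused, defocus_offset):
            # Both inputs: (B, 1, H, W) grayscale microscope images.
            x = self.encoder(torch.cat([focused, defocus_offset], dim=1))
            return self.depth_head(x), self.offset_head(x)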
|
|
10:40-10:50, Paper WeA-20.5 | |
Cooperative Object Manipulation under Signal Temporal Logic Tasks and Uncertain Dynamics |
|
Sewlia, Mayank | KTH Royal Institute of Technology |
Verginis, Christos | Uppsala University |
Dimarogonas, Dimos V. | KTH Royal Institute of Technology |
Keywords: Dual Arm Manipulation, Multi-Robot Systems, Cooperating Robots
Abstract: We address the problem of cooperative manipulation of an object whose tasks are specified by a Signal Temporal Logic (STL) formula. We employ the Prescribed Performance Control (PPC) methodology to guarantee predefined transient and steady-state performance of the object trajectory so that the STL formula is satisfied. More specifically, we first translate the problem of satisfying an STL task into the problem of state evolution within a user-defined time-varying funnel. We then design a control strategy for the robotic agents that guarantees compliance with this funnel. The control strategy is decentralized, in the sense that each agent calculates its own control signal, and it does not use any information on the agents' and object's dynamic terms, which are assumed to be unknown. We experimentally verify the results on two manipulator arms cooperating to manipulate an object based on an STL formula.
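The funnel construction at the heart of PPC is compact enough to state directly. The sketch below uses the standard exponential performance function; the parameter values are placeholders, with the STL deadline and predicate bounds determining them in the paper:

    import numpy as np

    def funnel(t, rho0=1.0, rho_inf=0.05, decay=1.0):
        # Standard prescribed-performance bound rho(t).
        return (rho0 - rho_inf) * np.exp(-decay * t) + rho_inf

    def transformed_error(e, t):
        # Map the tracking error into an unconstrained space; keeping this
        # value finite keeps e(t) strictly inside the funnel.
        xi = e / funnel(t)                     # normalized error in (-1, 1)
        return np.log((1 + xi) / (1 - xi))     # diverges at the boundary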
|
|
10:50-11:00, Paper WeA-20.6 | |
DrozBot: Using Ergodic Control to Draw Portraits |
|
Löw, Tobias | Idiap Research Institute |
Maceiras, Jérémy | Idiap Research Institute |
Calinon, Sylvain | Idiap Research Institute |
Keywords: Art and Entertainment Robotics, Motion Control
Abstract: We present drozBot: le robot portraitiste, a robotic system that draws artistic portraits of people. The input images for the portrait are taken interactively by the robot itself. We formulate portrait drawing as a coverage problem, which is then solved by an ergodic control algorithm that computes the strokes; this ergodic computation gives the strokes their artistic look. The specific ergodic control algorithm we chose is inspired by the heat equation. We employed a 7-axis Franka Emika robot for the physical drawings and used an optimal control strategy to generate joint-angle commands. We explain the influence of the different hyperparameters and show the importance of the image processing steps. The attractiveness of the results was evaluated through a survey in which participants ranked the portraits produced by different algorithms.
|
|
11:00-11:10, Paper WeA-20.7 | |
Visual Haptic Reasoning: Estimating Contact Forces by Observing Deformable Object Interactions |
|
Wang, Yufei | Carnegie Mellon University |
Held, David | Carnegie Mellon University |
Erickson, Zackory | Carnegie Mellon University |
Keywords: Physically Assistive Devices, Deep Learning for Visual Perception, Perception for Grasping and Manipulation
Abstract: Robotic manipulation of highly deformable cloth presents a promising opportunity to assist people with several daily tasks, such as washing dishes; folding laundry; or dressing, bathing, and hygiene assistance for individuals with severe motor impairments. In this work, we introduce a formulation that enables a collaborative robot to perform visual haptic reasoning with cloth--the act of inferring the location and magnitude of applied forces during physical interaction. We present two distinct model representations, trained in physics simulation, that enable haptic reasoning using only visual and robot kinematic observations. We conducted quantitative evaluations of these models in simulation for robot-assisted dressing, bathing, and dish washing tasks, and demonstrate that the trained models can generalize across different tasks with varying interactions, human body sizes, and object shapes. We also present results with a real-world mobile manipulator, which used our simulation-trained models to estimate applied contact forces while performing physically assistive tasks with cloth. Videos can be found at our project webpage: https://sites.google.com/view/visualhapticreasoning/home
|
|
11:10-11:20, Paper WeA-20.8 | |
Tactile Feedback Enabling In-Hand Pivoting and Internal Force Control for Dual-Arm Cooperative Object Carrying |
|
Costanzo, Marco | Università Degli Studi Della Campania Luigi Vanvitelli |
De Maria, Giuseppe | Università Degli Studi Della Campania Luigi Vanvitelli |
Natale, Ciro | Università Degli Studi Della Campania "Luigi Vanvitelli" |
Keywords: Dual Arm Manipulation, In-Hand Manipulation, Cooperating Robots
Abstract: The main purpose of this paper is to demonstrate that smart exploitation of force/tactile feedback can enable successful physical cooperation of two robot manipulators to handle a common object with a high degree of dexterity. The novelty of the paper is that dexterity is provided not only by the degrees of freedom of the robot arms but also by the grasp controller of the sensorized parallel grippers, which allow the robots to manipulate the object either with a tight grasp or with a one-degree-of-freedom rolling contact. The coordinated motion of the robots depends on both the desired motion of the carried object and the control of the internal forces during transportation and in-hand manipulation. The solution exploits only kinematic models of the robots and a dynamic model of the distributed soft contact, which includes both linear force and torsional moment.
|
|
11:20-11:30, Paper WeA-20.9 | |
DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation |
|
Tchuiev, Vladimir | Bosch |
Miron, Yakov | Bosch |
Di Castro, Dotan | Bosch |
Keywords: Deep Learning in Grasping and Manipulation, Object Detection, Segmentation and Categorization, Grasping
Abstract: Object manipulation in cluttered scenes is a difficult and important problem in robotics. To manipulate objects efficiently, it is crucial to understand their surroundings, especially when multiple objects are stacked on top of one another, preventing effective grasping. Here we present DUQIM-Net, a decision-making approach for object manipulation in a setting of stacked objects. In DUQIM-Net, the hierarchical stacking relationship is assessed using Adj-Net, a model that leverages existing Transformer encoder-decoder object detectors by adding an adjacency head. The output of this head probabilistically infers the underlying hierarchical structure of the objects in the scene. We utilize the properties of the adjacency matrix in DUQIM-Net for decision making and to assist with object-grasping tasks. Our experimental results show that Adj-Net surpasses the state of the art in object-relationship inference on the Visual Manipulation Relationship Dataset (VMRD), and that DUQIM-Net outperforms comparable approaches in bin-clearing tasks.
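One (assumed) way such an adjacency head supports grasp ordering: with A[i, j] the predicted probability that object i rests on object j, objects with nothing on top of them are safe to grasp first. A minimal sketch:

    import numpy as np

    def graspable_objects(adj, threshold=0.5):
        # adj: (N, N) probabilistic adjacency matrix from the adjacency head.
        on_top = adj > threshold
        blocked = on_top.any(axis=0)      # j is blocked if some i rests on it
        return np.flatnonzero(~blocked)   # candidate objects to grasp first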
|
|
WeA-OL1 |
Rm21 (on-line) |
Aerial Systems 5 |
Regular session |
Chair: Zhu, Chi | Maebashi Institute of Technology |
Co-Chair: Pan, Jia | University of Hong Kong |
|
10:00-10:10, Paper WeA-OL1.1 | |
Downwash-Aware Control Allocation for Over-Actuated UAV Platforms |
|
Su, Yao | Beijing Institute for General Artificial Intelligence |
Chu, Chi | Tsinghua University |
Wang, Meng | Beijing Institute for General Artificial Intelligence |
Li, Jiarui | Peking University |
Yang, Liu | Tsinghua University |
Zhu, Yixin | Peking University |
Liu, Hangxin | Beijing Institute for General Artificial Intelligence (BIGAI) |
Keywords: Aerial Systems: Mechanics and Control, Motion Control
Abstract: Tracking position and orientation independently affords more agile maneuvers for over-actuated multirotor Unmanned Aerial Vehicles (UAVs) while introducing undesired downwash effects: downwash flows generated by thrust generators may counteract others in close proximity, which significantly threatens the stability of the platform. The complexity of modeling aerodynamic airflow makes it difficult for control algorithms to properly compensate for this side effect. Leveraging the input redundancy of over-actuated UAVs, we tackle this issue with a novel control allocation framework that considers downwash effects and explores the entire allocation space for an optimal solution. This optimal solution avoids downwash effects while providing high thrust efficiency within the hardware constraints. To the best of our knowledge, ours is the first formal derivation investigating downwash effects on over-actuated UAVs. We verify our framework on different hardware configurations in both simulation and experiment.
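Schematically, and only as an assumed sketch of the idea, downwash-aware allocation can be posed as a constrained optimization over the actuator redundancy; the penalty model and bounds here are placeholders, not the paper's derivation:

    import numpy as np
    from scipy.optimize import minimize

    def allocate(B, wrench_des, u0, downwash_penalty, u_min, u_max):
        # B: control effectiveness matrix mapping actuator commands u to the
        # body wrench; the equality constraint enforces B @ u = wrench_des.
        cons = {"type": "eq", "fun": lambda u: B @ u - wrench_des}
        cost = lambda u: np.sum(u**2) + downwash_penalty(u)
        res = minimize(cost, u0, constraints=[cons],
                       bounds=list(zip(u_min, u_max)))
        return res.x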
|
|
10:10-10:20, Paper WeA-OL1.2 | |
Siamese Object Tracking for Vision-Based UAM Approaching with Pairwise Scale-Channel Attention |
|
Zheng, Guangze | Tongji University |
Fu, Changhong | Tongji University |
Ye, Junjie | Tongji University |
Li, Bowen | Tongji University |
Lu, Geng | Department of Automation, Tsinghua University, Beijing, China |
Pan, Jia | University of Hong Kong |
Keywords: Aerial Systems: Applications, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: Although manipulation with unmanned aerial manipulators (UAMs) has been widely studied, vision-based UAM approaching, which is crucial to subsequent manipulation, generally lacks effective design. The key to visual UAM approaching lies in object tracking, yet current UAM tracking typically relies on costly model-based methods. Moreover, UAM approaching often confronts severe object scale variation, which makes it inappropriate to directly employ state-of-the-art model-free Siamese-based methods from the object tracking field. To address these problems, this work proposes a novel Siamese network with pairwise scale-channel attention (SiamSA) for vision-based UAM approaching. Specifically, SiamSA consists of a pairwise scale-channel attention network (PSAN) and a scale-aware anchor proposal network (SA-APN). PSAN acquires valuable scale information for feature processing, while SA-APN mainly attaches scale awareness to anchor proposing. Moreover, a new tracking benchmark for UAM approaching, namely UAMT100, is recorded with 35K frames on a flying UAM platform for evaluation. Exhaustive experiments on the benchmarks and real-world tests validate the efficiency and practicality of SiamSA with a promising speed. Both the code and the UAMT100 benchmark are available at https://github.com/vision4robotics/SiamSA.
|
|
10:20-10:30, Paper WeA-OL1.3 | |
Unsteady Aerodynamic Modeling of Aerobat Using Lifting Line Theory and Wagner's Function |
|
Sihite, Eric | California Institute of Technology |
Ghanem, Paul | Northeastern University |
Salagame, Adarsh | Northeastern University |
Ramezani, Alireza | Northeastern University |
Keywords: Aerial Systems: Mechanics and Control, Simulation and Animation, Modeling, Control, and Learning for Soft Robots
Abstract: Flying animals possess highly complex physical characteristics and can perform agile maneuvers using their wings. Flapping wings generate complex wake structures that influence the aerodynamic forces, which can be difficult to model. While it is possible to model these forces using fluid-structure interaction, doing so is computationally expensive and difficult to formulate. In this paper, we follow a simpler approach, deriving the aerodynamic forces using a relatively small number of states and presenting them in a simple state-space form. The formulation uses Prandtl's lifting line theory and Wagner's function to determine the unsteady aerodynamic forces acting on the wing in simulation, which are then compared to experimental data from the bat-inspired robot called the Aerobat. The simulated trailing-edge vortex shedding can be evaluated from this model and then analyzed in a wake-based gait design approach to improve the aerodynamic performance of the robot.
|
|
10:30-10:40, Paper WeA-OL1.4 | |
Design and Analysis of Truss Aerial Transportation System (TATS): The Lightweight Bar Spherical Joint Mechanism |
|
Zhang, Xiaozhen | Beijing Institute of Technology |
Yang, Qingkai | Beijing Institute of Technology |
Yu, Rui | Beijing Institute of Technology |
Wu, Delong | Beijing Institute of Technology |
Wei, Shaozhun | Beijing Institute of Technology |
Cui, Jinqiang | Peng Cheng Laboratory |
Fang, Hao | Beijing Institute of Technology |
Keywords: Aerial Systems: Applications, Intelligent Transportation Systems
Abstract: In aerial cooperative transportation missions, the cable-suspended framework is the preferred approach for small but heavy payloads. However, to maintain safe flight distances, the cables always stay inclined, which means the UAVs must generate horizontal force components, leaving only part of the thrust for gravity compensation. To overcome this drawback, this paper proposes a new cooperative transportation system named the Truss Aerial Transportation System (TATS), in which those horizontal forces are internally compensated by a bar spherical joint structure. In the TATS, rigid bars robustly maintain the desired distances among UAVs for safe flight, resulting in a more compact and effective transportation system. Thanks to the structural advantage of the truss, the rigid bars can be made lightweight to minimize the gravity burden they induce. The construction method of the proposed TATS is presented. The improvement in energy efficiency is analyzed and compared with the cable-suspended framework. Furthermore, the robustness of a TATS configuration is evaluated by computing its margin capacity. Finally, a load test experiment conducted on our prototype shows the effectiveness and feasibility of the proposed TATS.
|
|
10:40-10:50, Paper WeA-OL1.5 | |
SytaB: A Class of Smooth-Transition Hybrid Terrestrial/Aerial Bicopters |
|
Zhu, Yimin | Harbin Institute of Technology |
Yang, Jianan | Harbin Institute of Technology |
Zhang, Lixian | Harbin Institute of Technology |
Dong, Yifei | Harbin Institute of Technology |
Ding, Yihang | Harbin Institute of Technology |
Keywords: Aerial Systems: Mechanics and Control, Dynamics, Motion Control
Abstract: This work details the design, modeling, and control of SytaB, a vehicle capable of hybrid terrestrial/aerial mobility with smooth transitions, adopting for the first time a structure embedding a bicopter. In contrast to previous hybrid terrestrial/aerial vehicles with an embedded quadrotor, SytaB not only requires less energy for the same takeoff weight, but also regulates its attitude less frequently, restraining sensor vibration. Three modes are considered for the vehicle, terrestrial, aerial, and transitional, and the dynamics modeling and controller design of each mode are carried out. Compared with omitting the transitional mode, the transition between terrestrial and aerial locomotion is smooth, alleviating the bounce and shake of the vehicle. Energy efficiency is compared between the terrestrial and aerial modes, and between the energy-saving and high-maneuverability paradigms (a choice enabled in SytaB's terrestrial mode). Experimental results demonstrate the potential of SytaB, the effectiveness of the designed controllers, and the necessity of considering the transition process.
|
|
10:50-11:00, Paper WeA-OL1.6 | |
Real-Time Trajectory Planning for Aerial Perching |
|
Ji, Jialin | Zhejiang University |
Yang, Tiankai | Zhejiang University |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Whole-Body Motion Planning and Control, Motion and Path Planning
Abstract: This paper presents a novel trajectory planning method for aerial perching. Compared with existing work, the terminal states and trajectory durations are adjusted adaptively rather than determined in advance. Furthermore, our planner is able to minimize the tangential relative speed subject to safety and dynamic feasibility. This feature is especially notable on micro aerial robots with low maneuverability, or in scenarios where space is limited. Moreover, we design a flexible transformation strategy to eliminate terminal constraints while reducing the number of optimization variables. We also take precise SE(3) motion planning into account to ensure that the drone does not touch the landing platform until the last moment. The proposed method is validated onboard a palm-sized micro aerial robot with quite limited thrust and moment (thrust-to-weight ratio 1.7) perching on a mobile inclined surface. Extensive experimental results show that our planner generates an optimal trajectory within 20 ms and replans with a warm start in 2 ms.
|
|
11:00-11:10, Paper WeA-OL1.7 | |
Dynamic Free-Space Roadmap for Safe Quadrotor Motion Planning |
|
Guo, Junlong | Zhejiang University |
Xun, Zhiren | Zhejiang University |
Geng, Shuang | Zhejiang University |
Lin, Yi | Hong Kong University of Science and Technology |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: Free-space-oriented roadmaps typically generate a series of convex geometric primitives, which constitute the safe region for motion planning. However, a static environment is assumed for this kind of roadmap. This assumption makes it unable to deal with dynamic obstacles and limits its applications. In this paper, we present a dynamic free-space roadmap, which provides feasible spaces and a navigation graph for safe quadrotor motion planning. Our roadmap is constructed by continuously seeding and extracting free regions in the environment. In order to adapt our map to environments with dynamic obstacles, we incrementally decompose the polyhedra intersecting with obstacles into obstacle-free regions, while the graph is also updated by our well-designed mechanism. Extensive simulations and real-world experiments demonstrate that our method is practically applicable and efficient.
|
|
11:10-11:20, Paper WeA-OL1.8 | |
Obstacle Avoidance of Resilient UAV Swarm Formation with Active Sensing System in the Dense Environment |
|
Peng, Peng | Shanghai Jiao Tong University |
Dong, Wei | Shanghai Jiao Tong University |
Chen, Gang | Shanghai Jiaotong University |
Zhu, Xiangyang | Shanghai Jiao Tong University |
Keywords: Aerial Systems: Applications, Path Planning for Multiple Mobile Robots or Agents, Swarm Robotics
Abstract: This paper proposes a UAV formation motion planning framework that fuses shared perception with a swarm trajectory global optimal (STGO) algorithm, aided by an active sensing system. First, the point cloud received by each UAV is fitted with a Gaussian mixture model (GMM) and transmitted within the swarm. Resampling from the received GMMs yields a global map, which serves as the foundation for consensus. Second, to improve flight safety, an active sensing system plans the observation angle of each UAV considering the unknown field, field-of-view (FOV) overlap, velocity direction, and smoothness of yaw rotation; this planning problem is solved by a distributed particle swarm optimization (DPSO) algorithm. Last, for formation motion planning, the formation structure is allowed to undergo affine transformation to ensure obstacle avoidance and is treated as a soft constraint on the control points of the B-spline. In addition, STGO is introduced to avoid local minima. The combination of GMM communication and STGO guarantees safe and strict consensus between UAVs. Tests on different formations in simulation show that our algorithm achieves strict consensus and a success rate of at least 80% for obstacle avoidance in a dense environment. Moreover, the active sensing system increases the success rate of obstacle avoidance from 50% to 100% in some scenarios.
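The GMM compress-and-resample step is straightforward to sketch, here using scikit-learn's GaussianMixture; the component counts and sample sizes are assumptions:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def compress(points, n_components=16):
        # Each UAV fits a GMM to its local point cloud and broadcasts
        # only the mixture parameters instead of the raw points.
        return GaussianMixture(n_components=n_components).fit(points)

    def merge_maps(gmms, samples_per_map=2000):
        # Peers resample the received GMMs to rebuild a shared global map.
        return np.vstack([g.sample(samples_per_map)[0] for g in gmms])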
|
|
11:20-11:30, Paper WeA-OL1.9 | |
Autoexplorer: Autonomous Exploration of Unknown Environments Using Fast Frontier-Region Detection and Parallel Path Planning |
|
Han, Kyung Min | Ewha Womans University |
Kim, Young J. | Ewha Womans University |
Keywords: Autonomous Vehicle Navigation, Motion and Path Planning
Abstract: We propose a fully autonomous system for mobile robot exploration in unknown environments. Our system employs a novel frontier detection algorithm based on the fast front propagation (FFP) technique and uses parallel path planning to reach the detected frontier regions. Given a 2D occupancy grid map, possibly updated online, our algorithm finds all frontier points that allow mobile robots to visit unexplored regions and maximize exploratory coverage. Our FFP method is six to seven times faster than the state-of-the-art wavefront frontier detection algorithm at finding frontier points without compromising detection accuracy, and it can be further accelerated by simplifying the map without degrading accuracy. To expedite locating the optimal frontier point, we also eliminate spurious points with an obstacle filter and a novel boundary filter. In addition, we parallelize the global planning phase using branch-and-bound A*, where the search space of each thread is confined by the best knowledge discovered during the parallel search. As a result, our parallel path-planning algorithm running on 20 threads is about 30 times faster than a vanilla exploration system running on a single thread. Our method is validated through extensive experiments, including autonomous robot exploration in both synthetic and real-world scenarios. In the real-world experiment, we demonstrate an autonomous navigation system on a human-sized mobile manipulator equipped with a low-end embedded processor that fully integrates our FFP and parallel path-planning algorithms.
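The frontier definition itself is simple, even though FFP's contribution is finding such cells quickly; a naive vectorized reference, with an assumed grid encoding, looks like this:

    import numpy as np

    FREE, UNKNOWN = 0, -1   # assumed occupancy-grid encoding

    def frontier_cells(grid):
        # A frontier cell is a FREE cell with at least one UNKNOWN 4-neighbor.
        free, unknown = grid == FREE, grid == UNKNOWN
        near_unknown = np.zeros_like(unknown)
        near_unknown[1:, :] |= unknown[:-1, :]
        near_unknown[:-1, :] |= unknown[1:, :]
        near_unknown[:, 1:] |= unknown[:, :-1]
        near_unknown[:, :-1] |= unknown[:, 1:]
        return np.argwhere(free & near_unknown)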
|
|
WeA-OL2 |
Rm22 (on-line) |
Navigation Systems 6 |
Regular session |
Chair: Reshef, Roi | Nvidia |
Co-Chair: Qu, Chao | Skydio |
|
10:00-10:10, Paper WeA-OL2.1 | |
Real-Time Visual Inertial Odometry with a Resource-Efficient Harris Corner Detection Accelerator on FPGA Platform |
|
Gu, Pengfei | Tsinghua University |
Meng, Ziyang | Tsinghua University |
Zhou, Pengkun | Tsinghua University |
Keywords: Vision-Based Navigation, Software-Hardware Integration for Robot Systems
Abstract: Visual Inertial Odometry (VIO) is a widely studied localization technique in robotics. State-of-the-art VIO algorithms are composed of two parts: a frontend, which performs visual perception and inertial measurement pre-processing, and a backend, which fuses vision and inertial measurements to estimate the robot's pose. Both image processing in the frontend and sensor fusion in the backend are computationally expensive, making it very challenging to run a VIO algorithm, especially an optimization-based one, in real time on embedded platforms with a limited power budget. In this paper, a real-time optimization-based monocular VIO algorithm is proposed based on algorithm-and-hardware co-design and successfully implemented on an embedded platform with only 2.6 W processor power consumption. In particular, the time-consuming Harris corner detection (HCD) is accelerated on a Field Programmable Gate Array (FPGA), achieving an average 16x reduction in processing time compared with the ARM implementation. Compared with the state-of-the-art HCD accelerator provided by Xilinx, the hardware resources required by our accelerator are greatly reduced without any compromise in speed, thanks to the proposed dedicated pruning and parallelization techniques. Finally, experiments on a public dataset demonstrate that the proposed real-time VIO algorithm on the FPGA-based platform has accuracy comparable to the existing state-of-the-art VIO algorithm on the desktop, and 3x faster frontend processing than the ARM-based implementation.
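For reference, the software computation an HCD accelerator reproduces is the classic Harris response R = det(M) - k*trace(M)^2 over a local window; this is the textbook algorithm, not the paper's FPGA implementation:

    import numpy as np
    from scipy.ndimage import sobel, uniform_filter

    def harris_response(img, k=0.04, win=3):
        img = img.astype(float)
        ix, iy = sobel(img, axis=1), sobel(img, axis=0)   # image gradients
        ixx = uniform_filter(ix * ix, win)                # structure tensor M
        iyy = uniform_filter(iy * iy, win)
        ixy = uniform_filter(ix * iy, win)
        det = ixx * iyy - ixy * ixy
        trace = ixx + iyy
        return det - k * trace * trace                    # corner response R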
|
|
10:10-10:20, Paper WeA-OL2.2 | |
MPNP: Multi-Policy Neural Planner for Urban Driving |
|
Cheng, Jie | Hong Kong University of Science and Technology |
Xin, Ren | The Hong Kong University of Science and Technology |
Wang, Sheng | Hong Kong University of Science and Technology |
Liu, Ming | Hong Kong University of Science and Technology |
Keywords: Autonomous Vehicle Navigation, Imitation Learning
Abstract: Our goal is to train a neural planner that can capture the diverse driving behaviors in complex urban scenarios. We observe that even state-of-the-art neural planners struggle to perform common maneuvers such as lane changes, which are rather natural for human drivers. We propose to explore the multi-modality of the planning problem and force the neural planner to explicitly consider different policies. This is achieved by generating future trajectories conditioned on every possible reference line, which can simply be the centerline of a surrounding lane. We find that this simple strategy enables the planner to perform rich and complex behaviors. We train our model on real-world driving data and demonstrate the effectiveness of our method through both open-loop and closed-loop evaluations. Project website: https://jchengai.github.io/mpnp.
|
|
10:20-10:30, Paper WeA-OL2.3 | |
Contextual Tuning of Model Predictive Control for Autonomous Racing |
|
Froehlich, Lukas | ETH Zurich |
Kuettel, Christian | ETH Zurich |
Arcari, Elena | ETH Zurich |
Hewing, Lukas | ETH Zurich |
Zeilinger, Melanie N. | ETH Zurich |
Carron, Andrea | ETH Zurich |
Keywords: Autonomous Vehicle Navigation, Optimization and Optimal Control
Abstract: Learning-based model predictive control has been widely applied in autonomous racing to improve the closed-loop behaviour of vehicles in a data-driven manner. When environmental conditions change, e.g., due to rain, often only the predictive model is adapted while the controller parameters are kept constant, which can lead to suboptimal behaviour. In this paper, we address the problem of data-efficient controller tuning, adapting both the model and the objective simultaneously. The key novelty of the proposed approach is that we leverage a learned dynamics model to encode the environmental condition as a so-called context. This insight allows us to employ contextual Bayesian optimization to efficiently transfer knowledge across different environmental conditions, so that fewer laps are required to find the optimal controller configuration for each context. The proposed framework is extensively evaluated with more than 3,000 laps driven on an experimental platform with 1:28-scale RC race cars. Our approach successfully optimizes the lap time across different contexts, requiring fewer laps than other approaches based on standard Bayesian optimization.
|
|
10:30-10:40, Paper WeA-OL2.4 | |
Temporal Logic Path Planning under Localization Uncertainty |
|
Dhyani, Amit | IIT Kanpur |
Saha, Indranil | IIT Kanpur |
Keywords: Autonomous Vehicle Navigation, Formal Methods in Robotics and Automation, Motion and Path Planning
Abstract: We present a method to find the optimal control strategy for a robot that, using prior localization information, maximizes the probability of satisfying a temporal logic specification while considering uncertainty in both motion and sensing, the two major causes of localization uncertainty. Specifications are given as probabilistic computation tree logic (PCTL) formulas over a set of propositions that capture the presence of the robot at key locations in the environment. A computation model that can handle uncertainty in both motion and sensing is the Partially Observable Markov Decision Process (POMDP), which is computationally expensive. We approximate the underlying POMDP using an Augmented Markov Decision Process (AMDP) and present a control synthesis algorithm for the AMDP. We carry out numerous experiments on workspaces up to 100 x 100 in size and four different PCTL specifications to evaluate the efficacy of our technique. Experimental results show that our technique for computing robot control policies using a localization prior handles localization uncertainty effectively and scales to large environments.
|
|
10:40-10:50, Paper WeA-OL2.5 | |
Navigating to Objects in Unseen Environments by Distance Prediction |
|
Zhu, Minzhao | Bytedance |
Zhao, Binglei | Xi'an Jiaotong University |
Kong, Tao | ByteDance |
Keywords: Vision-Based Navigation, AI-Enabled Robotics, AI-Based Methods
Abstract: The Object Goal Navigation (ObjectNav) task requires navigating an agent to an object category in unseen environments without a pre-built map. In this paper, we solve this task by predicting the distance to the target using semantically related objects as cues. Based on the estimated distance to the target object, our method directly chooses optimal mid-term goals that are more likely to have a shorter path to the target. Specifically, based on learned knowledge, our model takes a bird's-eye-view semantic map as input and estimates the path length from frontier map cells to the target object. With the estimated distance map, the agent can simultaneously explore the environment and navigate to the target object using a simple human-designed strategy. Empirical results in visually realistic simulation environments show that the proposed method outperforms a wide range of baselines in success rate and efficiency. A real-robot experiment also demonstrates that our method generalizes well to the real world.
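The mid-term goal selection reduces to an argmin over frontier cells once the distance map is predicted; a minimal sketch with assumed names:

    import numpy as np

    def choose_midterm_goal(frontier_cells, distance_map):
        # distance_map: learned estimate of path length to the target object.
        dists = np.array([distance_map[tuple(c)] for c in frontier_cells])
        return frontier_cells[int(np.argmin(dists))]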
|
|
10:50-11:00, Paper WeA-OL2.6 | |
Depth-CUPRL: Depth-Imaged Contrastive Unsupervised Prioritized Representations in Reinforcement Learning for Mapless Navigation of Unmanned Aerial Vehicles |
|
Costa de Jesus, Junior | Universidade Federal Do Rio Grande |
Kich, Victor Augusto | Universidade Federal De Santa Maria |
Kolling, Alisson Henrique | Universidade Federal De Santa Maria |
Grando, Ricardo | Federal University of Rio Grande |
da Silva Guerra, Rodrigo | Universidade Federal De Santa Maria |
Drews-Jr, Paulo | Federal University of Rio Grande (FURG) |
Keywords: Autonomous Vehicle Navigation, Reinforcement Learning, Autonomous Agents
Abstract: Reinforcement Learning (RL) has shown impressive performance in video games through raw pixel imaging and in continuous control tasks. However, RL performs poorly with high-dimensional observations such as raw pixel images. It is generally accepted that physical-state-based RL policies, such as those using laser sensor measurements, are more sample-efficient than learning from pixels. This work presents a new approach that extracts information from a depth map estimate to teach an RL agent mapless navigation of an Unmanned Aerial Vehicle (UAV). We propose Depth-Imaged Contrastive Unsupervised Prioritized Representations in Reinforcement Learning (Depth-CUPRL), which estimates the depth of images using a prioritized replay memory. We use a combination of RL and contrastive learning to deal with the problem of image-based RL. From our analysis of the results with Unmanned Aerial Vehicles (UAVs), we conclude that our Depth-CUPRL approach is effective for decision-making and outperforms state-of-the-art pixel-based approaches in mapless navigation capability.
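Contrastive representation objectives of this family are typically InfoNCE-style; the generic form below is a sketch of the ingredient, not the authors' exact loss:

    import torch
    import torch.nn.functional as F

    def info_nce(anchor, positive, temperature=0.1):
        # Each depth-image embedding should match its augmented view
        # (diagonal) and repel the other samples in the batch.
        a = F.normalize(anchor, dim=1)
        p = F.normalize(positive, dim=1)
        logits = a @ p.t() / temperature
        labels = torch.arange(a.size(0), device=a.device)
        return F.cross_entropy(logits, labels)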
|
|
11:00-11:10, Paper WeA-OL2.7 | |
DSOL: A Fast Direct Sparse Odometry Scheme |
|
Qu, Chao | University of Pennsylvania |
Skandan, Shreyas | University of Pennsylvania |
Miller, Ian | University of Pennsylvania |
Taylor, Camillo Jose | University of Pennsylvania |
Keywords: Vision-Based Navigation
Abstract: In this paper, we describe Direct Sparse Odometry Lite (DSOL), an improved version of Direct Sparse Odometry (DSO). We propose several algorithmic and implementation enhancements which speed up computation by a significant factor (on average 5x) even on resource-constrained platforms. The increase in speed allows us to process images at higher frame rates, which in turn provides better results on rapid motions. Our open-source implementation is available at https://github.com/versatran01/dsol.
|
|
11:10-11:20, Paper WeA-OL2.8 | |
Planning for Negotiations in Autonomous Driving Using Reinforcement Learning |
|
Reshef, Roi | Nvidia |
Keywords: Autonomous Vehicle Navigation, Autonomous Agents, Reinforcement Learning
Abstract: Planning autonomous driving behaviors in dense traffic is challenging. Human drivers are able to influence their road environment to achieve (otherwise unachievable) goals, by communicating their intents to other drivers. An autonomous system that is required to drive in the presence of human traffic must thus possess this fundamental negotiation capability. This work presents a novel benchmark that includes a stochastic driver negotiation model and a framework for training policies to drive and negotiate based on reinforcement learning. It is shown that driving policies trained in this framework lead to greater safety, higher mission accomplishment rates and more driving comfort, and can generalize across scenarios.
|
|
11:20-11:30, Paper WeA-OL2.9 | |
Towards Specialized Hardware for Learning-Based Visual Odometry on the Edge |
|
Chen, Siyuan | Carnegie Mellon University |
Mai, Ken | Carnegie Mellon University |
Keywords: Computer Architecture for Robotic and Automation, Deep Learning for Visual Perception, Hardware-Software Integration in Robotics
Abstract: Learning-based visual odometry (VO) has gained increasing popularity for the autonomous navigation of small robots. However, most methods in this category require computational resources not normally available on edge systems. We contend that specialized hardware accelerators are an ideal solution to this problem because of their superior energy efficiency. In this paper, we first propose a model that derives compute specifications for VO from the physical characteristics of unmanned aerial vehicles (UAVs). These specifications guide our accelerator design process. Based on specifications derived from the DJI Mavic Air 2 and Crazyflie 2.0 UAVs, we explore the speed/flight-time design spaces of three target VO algorithms on two NVIDIA Jetson systems. We then propose a hardware accelerator architecture and present prototype implementations on FPGAs. Additionally, we illustrate the algorithm/hardware co-design approach with a series of hardware-aware algorithmic redesigns targeting the FPGA prototypes and quantify their throughput-accuracy tradeoffs. Our FPGA implementation of DFVO is 2.7x more energy efficient than off-the-shelf embedded computers.
|
|
WeA-OL3 |
Rm23 (on-line) |
Deep Learning for Visual Perception 1 |
Regular session |
Chair: Kobayashi, Taisuke | National Institute of Informatics |
Co-Chair: Lee, Dongheui | Technische Universität Wien (TU Wien) |
|
10:00-10:10, Paper WeA-OL3.1 | |
MPT-Net: Mask Point Transformer Network for Large Scale Point Cloud Semantic Segmentation |
|
Tang, Zhe Jun | SenseTime, Nanyang Technological University |
Cham, Tat-Jen | Nanyang Technological University |
Keywords: Deep Learning for Visual Perception, Computer Vision for Transportation, Object Detection, Segmentation and Categorization
Abstract: Point cloud semantic segmentation is important for road scene perception, a task driverless vehicles must master to achieve full-fledged autonomy. In this work, we introduce the Mask Point Transformer Network (MPT-Net), a novel architecture for point cloud segmentation that is simple to implement. MPT-Net consists of a local and global feature encoder and a transformer-based decoder: a 3D point-voxel convolution encoder backbone with voxel self-attention to encode features, and a Mask Point Transformer (MPT) module to decode point features and segment the point cloud. First, we introduce the novel MPT, designed specifically for point cloud segmentation. MPT offers two benefits: it attends to every point in the point cloud using mask tokens to extract class-specific features globally with cross-attention, and it provides inter-class feature information exchange using self-attention over the learned mask tokens. Second, we design a backbone that uses sparse point-voxel convolutional blocks and a transformer-based self-attention block to learn local and global contextual features. We evaluate MPT-Net on large-scale outdoor driving-scene point cloud datasets, SemanticKITTI and nuScenes. Our experiments show that by replacing the standard segmentation head with MPT, MPT-Net improves over our baseline approach by 3.8% on SemanticKITTI, achieving state-of-the-art performance, and is highly effective at detecting 'stuff' in point clouds.
|
|
10:10-10:20, Paper WeA-OL3.2 | |
Timestamp-Supervised Action Segmentation with Graph Convolutional Networks |
|
Khan, Hamza | Retrocausal |
Haresh, Sanjay | Retrocausal, Inc |
Ahmed, Awais | Retrocausal Inc |
Siddiqui, Shakeeb | Retrocausal |
Konin, Andrey | Retrocausal Inc |
Zia, M. Zeeshan | Retrocausal |
Tran, Quoc-Huy | Retrocausal, Inc |
Keywords: Deep Learning for Visual Perception, Human Detection and Tracking, Human-Robot Collaboration
Abstract: We introduce a novel approach for temporal activity segmentation with timestamp supervision. Our main contribution is a graph convolutional network, which is learned in an end-to-end manner to exploit both frame features and connections between neighboring frames to generate dense framewise labels from sparse timestamp labels. The generated dense framewise labels can then be used to train the segmentation model. In addition, we propose a framework for alternating learning of both the segmentation model and the graph convolutional model, which first initializes and then iteratively refines the learned models. Detailed experiments on four public datasets, including 50 Salads, GTEA, Breakfast, and Desktop Assembly, show that our method is superior to the multi-layer perceptron baseline, while performing on par with or better than the state of the art in temporal activity segmentation with timestamp supervision.
|
|
10:20-10:30, Paper WeA-OL3.3 | |
CA-SpaceNet: Counterfactual Analysis for 6D Pose Estimation in Space |
|
Wang, Shunli | Fudan University |
Wang, Shuaibing | Fudan University |
Jiao, Bo | Fudan University |
Yang, Dingkang | Fudan University |
Su, Liuzhen | Fudan University |
Zhai, Peng | Fudan University |
Chen, Chixiao | Fudan University |
Zhang, Lihua | Fudan University |
Keywords: Deep Learning for Visual Perception, Aerial Systems: Perception and Autonomy, Computer Vision for Automation
Abstract: Reliable and stable 6D pose estimation of uncooperative space objects plays an essential role in on-orbit servicing and debris removal missions. Considering that pose estimators are sensitive to background interference, this paper proposes a counterfactual analysis framework named CA-SpaceNet for robust 6D pose estimation of space-borne targets against complicated backgrounds. Specifically, conventional methods are adopted to extract features from the whole image in the factual case. In the counterfactual case, a non-existent image containing only the background, without the target, is imagined. The side effect of background interference is reduced by counterfactual analysis, leading to unbiased final predictions. In addition, we carry out low-bit-width quantization of CA-SpaceNet and deploy part of the framework to a Processing-In-Memory (PIM) accelerator on an FPGA. Qualitative and quantitative results demonstrate the effectiveness and efficiency of the proposed method. To the best of our knowledge, this paper is the first to apply causal inference and network quantization to the 6D pose estimation of space-borne targets. The code is available at https://github.com/Shunli-Wang/CA-SpaceNet.
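The factual-minus-counterfactual debiasing admits a one-line reading; the sketch below assumes a generic classifier and invented names, standing in for the paper's formulation:

    def counterfactual_logits(model, image, background_only):
        # Factual pass: target plus background. Counterfactual pass: the
        # imagined background-only image. Subtracting removes the bias the
        # background alone would induce in the prediction.
        return model(image) - model(background_only)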
|
|
10:30-10:40, Paper WeA-OL3.4 | |
3D Object Aided Self-Supervised Monocular Depth Estimation |
|
Wei, Songlin | Soochow University |
Chen, Guodong | Soochow University |
Chi, Wenzheng | Soochow University |
Wang, Zhenhua | Soochow University |
Sun, Lining | Harbin Institute of Technology |
Keywords: Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization, Semantic Scene Understanding
Abstract: Monocular depth estimation has been actively studied in fields such as robot vision, autonomous driving, and 3D scene understanding. Given a sequence of color images, unsupervised learning methods based on the Structure-from-Motion (SfM) framework simultaneously predict depth and relative camera pose. However, dynamically moving objects in the scene violate the static-world assumption, resulting in inaccurate depths for dynamic objects. In this work, we propose a new method that addresses such dynamic object motion through monocular 3D object detection. Specifically, we first detect 3D objects in the images and build per-pixel correspondences for the dynamic pixels with the detected object poses, while leaving the static pixels, corresponding to the rigid background, to be modeled with camera motion. In this way, the depth of every pixel can be learned via a meaningful geometric model. Moreover, objects are detected as cuboids with absolute scale, which is used to eliminate the scale ambiguity inherent in monocular vision. Experiments on the KITTI depth dataset show that our method achieves state-of-the-art performance for depth estimation. Furthermore, joint training of depth, camera motion, and object pose also improves monocular 3D object detection performance. To the best of our knowledge, this is the first work that allows a monocular 3D object detection network to be fine-tuned in a self-supervised manner.
|
|
10:40-10:50, Paper WeA-OL3.5 | |
DeepMLE: A Robust Deep Maximum Likelihood Estimator for Two-View Structure from Motion |
|
Xiao, Yuxi | Wuhan University |
Li, Li | Wuhan University |
Li, Xiaodi | Wuhan University |
Yao, Jian | Wuhan University |
Keywords: Deep Learning for Visual Perception, Audio-Visual SLAM, SLAM
Abstract: Two-view structure from motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM (vSLAM). Many existing end-to-end learning-based methods formulate it as a brute-force regression problem, but the inadequate use of traditional geometric models leaves such models brittle in unseen environments. To improve the generalization capability and robustness of end-to-end two-view SfM networks, we formulate the two-view SfM problem as a maximum likelihood estimation (MLE) and solve it with the proposed framework, denoted DeepMLE. First, we propose deep multi-scale correlation maps to depict the visual similarities of 2D image matches determined by ego-motion. In addition, to increase the robustness of our framework, we formulate the likelihood function of the correlations of 2D image matches as a Gaussian-uniform mixture distribution, which accounts for the uncertainty caused by illumination changes, image noise, and moving objects. Meanwhile, an uncertainty prediction module predicts the pixel-wise distribution parameters. Finally, we iteratively refine the depth and relative camera pose using gradient-like information to maximize the likelihood function of the correlations. Extensive experimental results on several datasets show that our method significantly outperforms state-of-the-art end-to-end two-view SfM approaches in accuracy and generalization capability.
|
|
10:50-11:00, Paper WeA-OL3.6 | |
Attention-Guided RGB-D Fusion Network for Category-Level 6D Object Pose Estimation |
|
Wang, Hao | Samsung |
Li, Weiming | Samsung Advanced Institute of Technology (SAIT) |
Kim, JiYeon | Samsung Advanced Institute of Technology |
Wang, Qiang | Samsung |
Keywords: Deep Learning for Visual Perception, RGB-D Perception
Abstract: This work focuses on estimating the 6D poses and sizes of category-level objects from a single RGB-D image. How to exploit complementary RGB and depth features plays an important role in this task yet remains an open question. Due to large intra-category texture and shape variations, an object instance at test time may have RGB and depth features different from those of the instances seen in training, which poses challenges for previous RGB-D fusion methods. To deal with this problem, we propose an Attention-guided RGB-D Fusion Network (ARF-Net). Our key design is an ARF module that learns to adaptively fuse RGB and depth features with guidance from both structure-aware and relation-aware attention. Specifically, the structure-aware attention captures spatial relationships among object parts, while the relation-aware attention captures RGB-to-depth correlations between appearance and geometric features. ARF-Net directly establishes canonical correspondences with a compact decoder based on the multi-modal features from the ARF module. Extensive experiments show that our method can effectively fuse RGB features into various popular point cloud encoders with consistent performance improvements. In particular, without reconstructing instance 3D models, our method with its relatively compact architecture outperforms all state-of-the-art models on the CAMERA25 and REAL275 benchmarks by a large margin.
|
|
11:00-11:10, Paper WeA-OL3.7 | |
Robust Sim2Real 3D Object Classification Using Graph Representations and a Deep Center Voting Scheme |
|
Weibel, Jean-Baptiste | TU Wien |
Patten, Timothy | University of Technology Sydney |
Vincze, Markus | Vienna University of Technology |
Keywords: Deep Learning for Visual Perception, Recognition, Visual Learning
Abstract: While object semantic understanding is essential for service robotic tasks, 3D object classification is still an open problem. Learning from artificial 3D models alleviates the cost of the annotation necessary to approach this problem, but today's methods still struggle with the differences between artificial and real 3D data. We conjecture that one cause of this issue is that today's methods learn directly from point coordinates, which makes them highly sensitive to scale changes. We propose instead to learn from a graph of reproducible object parts, whose scale is more reliable. In combination with a voting scheme, our approach achieves significantly more robust classification and improves upon the state of the art by up to 16% when transferring from artificial to real objects.
|
|
11:10-11:20, Paper WeA-OL3.8 | |
Weak6D: Weakly Supervised 6D Pose Estimation with Iterative Annotation Resolver |
|
Mu, Fengjun | University of Electronic Science and Technology of China |
Huang, Rui | University of Electronic Science and Technology of China |
Shi, Kecheng | The School of Automation Engineering, University of Electronic Science and Technology of China |
Li, Xin | The Group 42 |
Qiu, Jing | University of Electronic Science and Technology of China |
Cheng, Hong | University of Electronic Science and Technology |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, RGB-D Perception
Abstract: 6D object pose estimation is an essential task in vision-based robotic grasping and manipulation. Prior works typically train models with large numbers of pose-annotated images, limiting the efficiency of model transfer between scenarios. This paper presents an end-to-end model named Weak6D that can be learned from unannotated RGB-D data. The core of the proposed approach is a novel optimization method, the Iterative Annotation Resolver, which directly utilizes the captured RGB-D data throughout the training process. Furthermore, we employ a weak refinement loss to optimize the pose estimation network with refined object poses. We evaluated Weak6D on the YCB-Video dataset, and experimental results show that our model achieves practical results without annotated data. Our code is available at https://github.com/mufengjun260/Weak6D.
|
|
11:20-11:30, Paper WeA-OL3.9 | |
Robust Human Motion Forecasting Using Transformer-Based Model |
|
Valls Mascaro, Esteve | Technische Universität Wien |
Ma, Shuo | Technical University of Munich |
Ahn, Hyemin | Ulsan National Institute of Science and Technology |
Lee, Dongheui | Technische Universität Wien (TU Wien) |
Keywords: Deep Learning for Visual Perception, Human and Humanoid Motion Analysis and Synthesis, Human-Robot Collaboration
Abstract: Comprehending human motion is a fundamental challenge for developing human-robot collaborative applications. Computer vision researchers have addressed this field by focusing only on reducing prediction error, without taking into account the requirements for facilitating implementation on robots. In this paper, we propose a new Transformer-based model that simultaneously handles real-time 3D human motion forecasting in both the short and long term. Our 2-Channel Transformer (2CH-TR) efficiently exploits the spatio-temporal information of a short observed sequence (400 ms) and achieves accuracy competitive with the current state of the art. 2CH-TR stands out for the efficiency of its Transformer, being lighter and faster than its competitors. In addition, our model is tested under conditions where the human motion is severely occluded, demonstrating its robustness in reconstructing and predicting 3D human motion in highly noisy environments. Our experimental results show that the proposed 2CH-TR outperforms the ST-Transformer, another state-of-the-art Transformer-based model, in reconstruction and prediction under the same input-prefix conditions. On the Human3.6M dataset with a 400 ms input prefix, our model reduces the mean squared error of the ST-Transformer by 8.89% in short-term prediction and by 2.57% in long-term prediction.
|
|
WeB-1 |
Rm1 (Room A) |
Special Session: Computational Advances in Human-Robot Interaction 2 |
Regular session |
Chair: Lim, Angelica | Simon Fraser University |
Co-Chair: Bagchi, Shelly | National Institute of Standards and Technology |
|
14:40-14:50, Paper WeB-1.1 | |
Human-Robot Collaborative Carrying of Objects with Unknown Deformation Characteristics |
|
Sirintuna, Doganay | Italian Institute of Technology |
Giammarino, Alberto | Istituto Italiano Di Tecnologia |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Human-Robot Collaboration, Physical Human-Robot Interaction, Human-Centered Automation
Abstract: In this work, we introduce an adaptive control framework for human-robot collaborative transportation of objects with unknown deformation behaviour. The proposed framework takes as input the haptic information transmitted through the object and the kinematic information of the human body obtained from a motion capture system to generate reactive whole-body motions on a mobile collaborative robot. To validate our framework experimentally, we compared its performance against an admittance controller during a co-transportation task with a partially deformable object. We additionally demonstrate the potential of the framework while co-transporting rigid (aluminum rod) and highly deformable (rope) objects. A mobile manipulator consisting of an omni-directional mobile base, a collaborative robotic arm, and a robotic hand is used as the robotic partner in the experiments. Quantitative and qualitative results of a 12-subject experiment show that the proposed framework can effectively deal with objects of unknown deformability and provides intuitive assistance to human partners.
|
|
14:50-15:00, Paper WeB-1.2 | |
A Framework for Robot Self-Assessment of Expected Task Performance |
|
Frasca, Tyler | Tufts University |
Scheutz, Matthias | Tufts University |
Keywords: Human-Robot Teaming, Methods and Tools for Robot System Design, Simulation and Animation
Abstract: We propose a self-assessment framework which enables a robot to estimate how well it will be able to perform a known or possibly novel task. The robot simulates the task to generate a state distribution of possible outcomes and determines (1) the likelihood of overall success, (2) the most probable failure location, and (3) the expected time to task completion. We evaluate the framework on the ``FetchIt!'' mobile manipulation challenge, which requires the robot to fetch a variety of parts around a small enclosed arena. By comparing the simulated and actual resulting task state distributions, we show that the robot can effectively assess its expected performance, which can then be communicated to humans.
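The simulation-based assessment described above can be illustrated with a minimal Monte Carlo sketch. The function names (`simulate_task`, `assess_task`) are hypothetical stand-ins, not the paper's API:

```python
from collections import Counter

def assess_task(simulate_task, n_rollouts=200):
    """Estimate expected task performance from simulated rollouts.

    `simulate_task()` is assumed to return a tuple
    (success: bool, failure_location, duration_seconds)."""
    outcomes = [simulate_task() for _ in range(n_rollouts)]
    successes = [o for o in outcomes if o[0]]
    p_success = len(successes) / n_rollouts              # (1) likelihood of success
    failures = Counter(o[1] for o in outcomes if not o[0])
    likely_failure = failures.most_common(1)[0][0] if failures else None  # (2)
    expected_time = (sum(o[2] for o in successes) / len(successes)
                     if successes else float("inf"))     # (3) expected completion time
    return p_success, likely_failure, expected_time
```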
|
|
15:00-15:10, Paper WeB-1.3 | |
An Empirical Study of Reward Explanations with Human-Robot Interaction Applications |
|
Sanneman, Lindsay | Massachusetts Institute of Technology |
Shah, Julie A. | MIT |
Keywords: Human Factors and Human-in-the-Loop, Human-Centered Automation, Human-Robot Collaboration
Abstract: Explainable AI techniques that describe agent reward functions can enhance human-robot collaboration in a variety of settings. However, in order to effectively explain reward information to humans, it is important to understand the efficacy of different types of explanation techniques in scenarios of varying complexity. In this paper, we compare the performance of a broad range of explanation techniques in scenarios of differing reward function complexity through a set of human-subject experiments. To perform this analysis, we first introduce a categorization of reward explanation information types and then apply a suite of assessments to measure human reward understanding. Our findings indicate that increased reward complexity (in number of features) corresponded to higher workload and decreased reward understanding, while providing direct reward information was an effective approach across reward complexities. We also observed that providing full or near full reward information was associated with increased workload and that providing abstractions of the reward was more effective at supporting reward understanding than other approaches (besides direct information) and was associated with decreased workload and improved subjective assessment in high complexity settings.
|
|
15:10-15:20, Paper WeB-1.4 | |
The Predictive Kinematic Control Tree: Enhancing Teleoperation of Redundant Robots through Probabilistic User Models |
|
Brooks, Connor | University of Colorado Boulder |
Szafir, Daniel J. | University of North Carolina at Chapel Hill |
Keywords: Telerobotics and Teleoperation, Motion Control, Redundant Robots
Abstract: When teleoperating complex robotic manipulators, operators often find it most natural to issue commands that dictate end effector movements in task space. If the robot has redundant degrees of freedom, the translation of this command from task space into configuration space can affect the robot’s maneuverability, smoothness of motion, and the general precision of the teleoperated system. In this paper, we propose a novel method for performing this translation that predicts future operator commands in order to choose joint motions that maintain maneuverability in future timesteps. We introduce the Predictive Kinematic Control Tree (PrediKCT), which optimizes joint movement in the nullspace of the Jacobian over multiple future timesteps by reasoning over probabilistic models of the human operator. In essence, PrediKCT builds out and evaluates a tree of possible future commands. We implement this system on two simulated 7-degree-of-freedom robotic arms and one physical arm, and characterize performance by analyzing robot motions produced through multiple command trajectories with differing user model accuracies and tree parameters, demonstrating benefits to path accuracy over both a minimum-norm joint velocity solution and local optimization of joint movement.
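PrediKCT's tree search builds on the classic redundancy-resolution step it optimizes: mapping a task-space command through the Jacobian pseudoinverse while spending the remaining freedom in the nullspace. A minimal numpy sketch of that underlying step (the predictive tree itself is not shown, and the names are ours):

```python
import numpy as np

def resolve_velocity(J, x_dot_cmd, q_dot_secondary):
    """Map an operator's task-space velocity command to joint velocities.

    The nullspace projector N lets a secondary objective (e.g., one chosen
    by a tree search over predicted commands) act without disturbing the
    commanded end-effector motion."""
    J_pinv = np.linalg.pinv(J)
    N = np.eye(J.shape[1]) - J_pinv @ J   # projects onto the Jacobian nullspace
    return J_pinv @ x_dot_cmd + N @ q_dot_secondary
```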
|
|
15:20-15:30, Paper WeB-1.5 | |
Sociable and Ergonomic Human-Robot Collaboration through Action Recognition and Augmented Hierarchical Quadratic Programming |
|
Tassi, Francesco | Istituto Italiano Di Tecnologia |
Iodice, Francesco | Istituto Italiano Di Tecnologia |
De Momi, Elena | Politecnico Di Milano |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Human-Robot Collaboration, Optimization and Optimal Control, Behavior-Based Systems
Abstract: The recognition of actions performed by humans and the anticipation of their intentions are important enablers for sociable and successful collaboration in human-robot teams. Meanwhile, robots should have the capacity to deal with multiple objectives and constraints arising from the collaborative task or the human. In this regard, we propose vision techniques to perform human action recognition and image classification, which are integrated into an Augmented Hierarchical Quadratic Programming (AHQP) scheme to hierarchically optimize the robot’s reactive behavior and human ergonomics. The proposed framework allows one to intuitively command the robot in space while a task is being executed. The experiments confirm improved human ergonomics and usability, which are fundamental parameters for reducing musculoskeletal disorders and increasing trust in automation.
|
|
15:30-15:40, Paper WeB-1.6 | |
Learning on the Job: Long-Term Behavioural Adaptation in Human-Robot Interactions (Finalist for IROS Best Paper Award on Cognitive Robotics Sponsored by KROS) |
|
Del Duchetto, Francesco | University of Lincoln |
Hanheide, Marc | University of Lincoln |
|
|
15:40-15:50, Paper WeB-1.7 | |
Bounded Rational Game-Theoretical Modeling of Human Joint Actions with Incomplete Information |
|
Wang, Yiwei | Huazhong University of Science and Technology |
Shintre, Pallavi | Arizona State University |
Amatya, Sunny | Arizona State University |
Zhang, Wenlong | Arizona State University |
Keywords: Human-Robot Collaboration, Human-Robot Teaming, Physical Human-Robot Interaction
Abstract: As humans and robots start to collaborate in close proximity, robots are tasked to perceive, comprehend, and anticipate human partners' actions, which demands a predictive model that describes how humans collaborate with each other in joint actions. Previous studies either simplify the collaborative task as optimal control between two agents or do not consider the learning process of humans during repeated interaction. Such an idealized representation is thus unable to model human rationality and the learning process. In this paper, a bounded-rational, game-theoretical model of human cooperation is developed to describe the cooperative behaviors of a human dyad. An experiment on a collaborative joint object-pushing task was conducted with 30 human subjects using haptic interfaces in a virtual environment. The proposed model uses inverse optimal control (IOC) to model the reward parameters in the collaborative task. The collected data verify that the predicted human trajectories generated by the bounded-rational model are more accurate than those of a fully rational model. We further provide insight from the conducted experiments into the effects of leadership on the performance of human collaboration.
|
|
15:50-16:00, Paper WeB-1.8 | |
COSM2IC: Optimizing Real-Time Multi-Modal Instruction Comprehension |
|
Weerakoon Mudiyanselage, Dulanga Kaveesha Weerakoon | Singapore Management University |
Subbaraju, Vigneshwaran | Agency for Science Technology and Research (A*STAR) |
Tran, Tuan | Singapore Management University |
Misra, Archan | Singapore Management University |
Keywords: Human-Robot Collaboration, Virtual Reality and Interfaces, Multi-Modal Perception for HRI
Abstract: Supporting real-time, on-device execution of multi-modal referring instruction comprehension models is an important challenge to be tackled in embodied Human-Robot Interaction. However, state-of-the-art deep learning models are resource-intensive and unsuitable for real-time execution on embedded devices. While model compression can achieve a reduction in computational resources up to a certain point, further optimization results in a severe drop in accuracy (up to 50%). To minimize this loss in accuracy, we propose the COSM2IC framework, with a lightweight Task Complexity Predictor that uses multiple sensor inputs to assess the instructional complexity and thereby dynamically switch between a set of models of varying computational intensity, such that computationally less demanding models are invoked whenever possible. To demonstrate the benefits of COSM2IC, we utilize a representative human-robot collaborative “table-top target acquisition” task to curate a new multi-modal instruction dataset in which a human issues instructions in a natural manner using a combination of visual, verbal, and gestural (pointing) cues. We show that COSM2IC achieves a 3-fold reduction in comprehension latency compared to a baseline DNN model while suffering an accuracy loss of only ∼5%. Compared to state-of-the-art model compression methods, COSM2IC achieves a further 30% reduction in latency and energy consumption for comparable performance.
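The switching logic at the heart of this approach can be sketched in a few lines. All names below (`complexity_predictor`, the model list, the thresholds) are illustrative assumptions, not the framework's actual interface:

```python
def comprehend(instruction, sensors, complexity_predictor, models, thresholds):
    """Route an instruction to the lightest model judged adequate for it.

    `models` is ordered from least to most computationally demanding;
    `thresholds[i]` is the highest complexity score models[i] should handle."""
    score = complexity_predictor(instruction, sensors)
    for threshold, model in zip(thresholds, models):
        if score <= threshold:
            return model(instruction, sensors)
    return models[-1](instruction, sensors)  # fall back to the heaviest model
```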
|
|
16:00-16:10, Paper WeB-1.9 | |
Quantifying Changes in Kinematic Behavior of a Human-Exoskeleton Interactive System |
|
Ghonasgi, Keya | The University of Texas at Austin |
Mirsky, Reuth | University of Texas at Austin |
Haith, Adrian | Johns Hopkins University |
Stone, Peter | University of Texas at Austin |
Deshpande, Ashish | The University of Texas |
Keywords: Human-Robot Collaboration, Prosthetics and Exoskeletons, Human-Centered Robotics
Abstract: While human-robot interaction studies are becoming more common, quantification of the effects of repeated interaction with an exoskeleton remains unexplored. We draw upon existing literature in human skill assessment and present extrinsic and intrinsic performance metrics that quantify how the human-exoskeleton system's behavior changes over time. Specifically, in this paper, we present a new performance metric that provides insight into the system's kinematics associated with `successful' movements, resulting in a richer characterization of changes in the system's behavior. A human subject study is carried out wherein participants learn to play a challenging and dynamic reaching game over multiple attempts while donning an upper-body exoskeleton. The results demonstrate that repeated practice leads to learning over time, as identified through the improvement of extrinsic performance. Changes in the newly developed kinematics-based measure further illuminate how the participant's intrinsic behavior is altered over the training period. Thus, we are able to quantify the changes in the human-exoskeleton system's behavior observed in relation to learning.
|
|
WeB-2 |
Rm2 (Room B-1) |
Learning from Demonstration 3 |
Regular session |
Chair: Bhattacharjee, Tapomayukh | Cornell University |
Co-Chair: Ding, Tianli | Google |
|
14:40-14:50, Paper WeB-2.1 | |
Socially CompliAnt Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation |
|
Karnan, Haresh | The University of Texas at Austin |
Nair, Anirudh | The University of Texas at Austin |
Xiao, Xuesu | The University of Texas at Austin |
Warnell, Garrett | U.S. Army Research Laboratory |
Pirk, Soren | Robotics at Google |
Toshev, Alexander | Google |
Hart, Justin | University of Texas at Austin |
Biswas, Joydeep | University of Texas at Austin |
Stone, Peter | University of Texas at Austin |
Keywords: Data Sets for Robot Learning, Learning from Demonstration, Imitation Learning
Abstract: Social navigation is the capability of an autonomous agent, such as a robot, to navigate in a socially compliant manner in the presence of other intelligent agents such as humans. With the emergence of autonomously navigating mobile robots in human-populated environments (e.g., domestic service robots in homes and restaurants and food delivery robots on public sidewalks), incorporating socially compliant navigation behaviors on these robots becomes critical to ensuring safe and comfortable human-robot coexistence. To address this challenge, imitation learning is a promising framework, since it is easier for humans to demonstrate the task of social navigation than to formulate reward functions that accurately capture the complex multi-objective setting of social navigation. The application of imitation learning and inverse reinforcement learning to social navigation for mobile robots, however, is currently hindered by a lack of large-scale datasets that capture socially compliant robot navigation demonstrations in the wild. To fill this gap, we introduce the Socially CompliAnt Navigation Dataset (SCAND), a large-scale, first-person-view dataset of socially compliant navigation demonstrations. Our dataset contains 8.7 hours, 138 trajectories, and 25 miles of socially compliant, human-teleoperated driving demonstrations that comprise multi-modal data streams, including 3D lidar, joystick commands, odometry, and visual and inertial information, collected on two morphologically different mobile robots --- a Boston Dynamics Spot and a Clearpath Jackal --- by four different human demonstrators in both indoor and outdoor environments. We additionally perform preliminary analysis and validation through real-world robot experiments and show that navigation policies learned by imitation learning on SCAND generate socially compliant behaviors.
|
|
14:50-15:00, Paper WeB-2.2 | |
Learning Deformable Object Manipulation from Expert Demonstrations |
|
Salhotra, Gautam | University of Southern California |
Liu, I-Chun Arthur | University of Southern California |
Dominguez-Kuhne, Marcus | University of Southern California |
Sukhatme, Gaurav | University of Southern California |
Keywords: Deep Learning in Grasping and Manipulation, Learning from Demonstration, Reinforcement Learning
Abstract: We present a novel Learning from Demonstration (LfD) method, Deformable Manipulation from Demonstrations (DMfD), to solve deformable manipulation tasks using states or images as inputs, given expert demonstrations. Our method uses demonstrations in three different ways and balances the trade-off between exploring the environment online and using guidance from experts to explore high-dimensional spaces effectively. We test DMfD on a set of representative manipulation tasks for a 1-dimensional rope and a 2-dimensional cloth from the SoftGym suite of tasks, each with state and image observations. Our method exceeds baseline performance by up to 12.9% on state-based tasks and up to 33.44% on image-based tasks, with comparable or better robustness to randomness. Additionally, we create two challenging environments for folding a 2D cloth using image-based observations and set a performance benchmark for them. We deploy DMfD on a real robot with a minimal loss in normalized performance during real-world execution compared to simulation (~6%). Source code is available at github.com/uscresl/dmfd
|
|
15:00-15:10, Paper WeB-2.3 | |
Transporters with Visual Foresight for Solving Unseen Rearrangement Tasks |
|
Wu, Hongtao | Johns Hopkins University |
Ye, Jikai | National University of Singapore |
Meng, Xin | National University of Singapore |
Paxton, Chris | Meta AI |
Chirikjian, Gregory | Johns Hopkins University |
Keywords: Deep Learning in Grasping and Manipulation, Learning from Demonstration, Task and Motion Planning
Abstract: Rearrangement tasks have been identified as a crucial challenge for intelligent robotic manipulation, but few methods allow for the precise construction of unseen structures. We propose a visual foresight model for pick-and-place rearrangement manipulation which is able to learn efficiently. In addition, we develop a multi-modal action proposal module which builds on the Goal-Conditioned Transporter Network, a state-of-the-art imitation learning method. Our image-based task planning method, Transporters with Visual Foresight (TVF), is able to learn from only a handful of data and generalize to multiple unseen tasks in a zero-shot manner. TVF improves the performance of a state-of-the-art imitation learning method on unseen tasks in both simulation and real robot experiments. In particular, the average success rate on unseen tasks improves from 55.4% to 78.5% in simulation experiments and from 30% to 63.3% in real robot experiments when given only tens of expert demonstrations. Video and code are available on our project website: https://chirikjianlab.github.io/tvf/
|
|
15:10-15:20, Paper WeB-2.4 | |
Learning Perceptual Concepts by Bootstrapping from Human Queries |
|
Bobu, Andreea | University of California, Berkeley |
Paxton, Chris | Meta AI |
Yang, Wei | NVIDIA |
Sundaralingam, Balakumar | NVIDIA Corporation |
Chao, Yu-Wei | NVIDIA |
Cakmak, Maya | University of Washington |
Fox, Dieter | University of Washington |
Keywords: Learning Categories and Concepts, Learning from Demonstration, Visual Learning
Abstract: When robots operate in human environments, it is critical that humans can quickly teach them new concepts: object-centric properties of the environment that they care about (e.g., whether objects are near or upright). However, teaching a new perceptual concept from high-dimensional robot sensor data (e.g., point clouds) is demanding, requiring an unrealistic amount of human labels. To address this, we propose a framework called Perceptual Concept Bootstrapping (PCB). First, we leverage inherently lower-dimensional privileged information, e.g., object poses and bounding boxes, available from a simulator only at training time to rapidly learn a low-dimensional, geometric concept from minimal human input. Second, we treat this low-dimensional concept as an automatic labeler to synthesize a large-scale, high-dimensional dataset with the simulator. With these two key ideas, PCB alleviates the human labeling burden while still learning perceptual concepts that work with real sensor input, where no privileged information is available. We evaluate PCB for learning spatial concepts that describe object state or multi-object relationships and show that it achieves superior performance compared to baseline methods. We also demonstrate the utility of the learned concepts in motion planning tasks on a 7-DoF Franka Panda robot.
|
|
15:20-15:30, Paper WeB-2.5 | |
Extending Extrapolation Capabilities of Probabilistic Motion Models Learned from Human Demonstrations Using Shape-Preserving Virtual Demonstrations |
|
Burlizzi, Riccardo | KU Leuven |
Vochten, Maxim | KU Leuven |
De Schutter, Joris | KU Leuven |
Aertbelien, Erwin | KU Leuven |
Keywords: Learning from Demonstration, Imitation Learning, Motion and Path Planning
Abstract: Learning from Demonstration (LfD) requires methodologies able to generalize tasks to new situations. This paper studies the use of virtual demonstrations to extend the extrapolation capabilities of probabilistic motion models such as the traPPCA method. Like other LfD methods, traPPCA is able to calculate new trajectories very quickly, but it does not generalize well outside the area covered by the demonstrations. Another approach, the invariants method, shows outstanding generalization capabilities thanks to its shape-preserving properties, while being limited by long computation times. The proposed methodology combines the advantages of the two methods by learning traPPCA models using virtual demonstrations generated by the invariants method. The proposed approach is analyzed in three case studies. Furthermore, a comparison is made between learning with virtual demonstrations and learning with only real demonstrations. The results encourage the use of virtual demonstrations to extend the extrapolation capabilities of probabilistic motion models and hence reduce the required number of real demonstrations. The latter has the potential to reduce the cost of commissioning robot tasks.
|
|
15:30-15:40, Paper WeB-2.6 | |
Learning High Speed Precision Table Tennis on a Physical Robot |
|
Ding, Tianli | Google |
Graesser, Laura | Google |
Abeyruwan, Saminda Wishwajith | Google Inc |
D'Ambrosio, David | Google |
Shankar, Anish | Google |
Sermanet, Pierre | Google |
Sanketi, Pannag | Google |
Lynch, Corey | Google Brain |
Keywords: Machine Learning for Robot Control, Learning from Demonstration
Abstract: Learning goal-conditioned control in the real world is a challenging open problem in robotics. Reinforcement learning systems have the potential to learn autonomously via trial-and-error, but in practice the costs of manual reward design, ensuring safe exploration, and hyperparameter tuning are often enough to preclude real-world deployment. Imitation learning approaches, on the other hand, offer a simple way to learn control in the real world, but typically require costly curated demonstration data and lack a mechanism for continuous improvement. Recently, iterative imitation methods have been shown to be effective at relaxing both of these constraints, learning goal-directed control from undirected demonstration data and improving continuously via self-supervised goal reaching. These approaches, however, have not yet been shown to scale beyond simple simulated environments. In this work, we present the first evidence that simple iterative imitation learning can scale to goal-directed behavior on a real robot in a dynamic setting: high-speed, precision table tennis (e.g., ``land the ball on this particular target''). We find that this approach offers a straightforward way to do continuous on-robot learning, without complexities such as reward design, value function learning, or sim-to-real transfer. We also find that this approach is scalable --- sample-efficient enough to train on a physical robot in just a few hours. In real-world evaluations, we find that the resulting policy can perform on par with or better than amateur humans (with players sampled randomly from a robotics lab) at the task of returning the ball to specific targets on the table. Finally, we analyze the effect of the initial undirected bootstrap dataset size on performance, finding that a modest amount of unstructured demonstration data provided up front drastically speeds up the convergence of a general-purpose goal-reaching policy. See the supplementary video for examples of the policy on a physical robot.
|
|
15:40-15:50, Paper WeB-2.7 | |
Behaviour Learning with Adaptive Motif Discovery and Interacting Multiple Model |
|
Zhao, Hanqing | McGill University |
Manderson, Travis | McGill University |
Zhang, Hao | Department of Electronic Engineering, Tsinghua University |
Liu, Xue | McGill University |
Dudek, Gregory | McGill University |
Keywords: Learning from Demonstration, Behavior-Based Systems, Vision-Based Navigation
Abstract: We propose an approach that enables simultaneous, interpretable learning of a high-level discrete behaviour and its low-level rhythmic sub-behaviour. We do this through a unified reward function, in which a reward describing only the low-level behaviour, with little impact on the learning of other behaviours, is recovered from few-shot motion demonstrations. To this end, we first extract local behaviour motifs from state-only human demonstrations and random driving samples using an adaptive motif discovery approach derived from the Matrix Profile algorithm. We then optimize the parameters for motif discovery by maximizing the sum and entropy over motif sizes. Interacting Multiple Model (IMM) estimators are constructed on top of the linear-Gaussian dynamics of the discovered motifs, and the cumulative distributions over motifs estimated by the IMMs serve as the basis of the reward function. By combining the recovered reward with a terrain type signal gathered from the environment, we are able to train a dual-objective off-road vehicle controller that demonstrates both terrain selection and human-like driving behaviours. Compared with related approaches across 10 people, our rhythmic behaviour reward recovery approach enables the controller to achieve higher preference relative to the human driving demonstrations. It also performs more stably across different people, with 87% less variance than the best baseline in the rhythmic behaviour indicator, reducing negative effects on higher-level behaviour learning while maintaining high interpretability at all stages of the algorithm.
|
|
15:50-16:00, Paper WeB-2.8 | |
Learning from Demonstration Using a Curvature Regularized Variational Auto-Encoder (CurvVAE) |
|
Rhodes, Travers | Cornell University |
Bhattacharjee, Tapomayukh | Cornell University |
Lee, Daniel | Cornell Tech |
Keywords: Learning from Demonstration, Representation Learning, Physically Assistive Devices
Abstract: Learning intricate manipulation skills from human demonstrations requires good sample efficiency. We introduce a novel learning algorithm, the Curvature-regularized Variational Auto-Encoder (CurvVAE), to achieve this goal. The CurvVAE is able to model the natural variations in human-demonstrated trajectory data without overfitting. It does so by regularizing the curvature of the learned manifold. To showcase our algorithm, our robot learns an interpretable model of the variation in how humans acquire soft, slippery banana slices with a fork. We evaluate our learned trajectories on a physical robot system, resulting in banana slice acquisition performance better than the current state of the art.
|
|
16:00-16:10, Paper WeB-2.9 | |
Constrained Probabilistic Movement Primitives for Robot Trajectory Adaptation (I) |
|
Frank, Felix | Volkswagen Group |
Paraschos, Alexandros | Volkswagen Group |
van der Smagt, Patrick | Volkswagen Group |
Cseke, Botond | Volkswagen Machine Learning Research Lab |
Keywords: Learning from Demonstration, Probability and Statistical Methods, Collision Avoidance
Abstract: Placing robots outside controlled conditions requires versatile movement representations that allow robots to learn new tasks and adapt them to environmental changes. The introduction of obstacles, the placement of additional robots in the workspace, and the modification of the joint range due to faults or range-of-motion constraints are typical cases where adaptation capabilities play a key role in safely performing the robot's task. Probabilistic movement primitives (ProMPs) have been proposed for representing adaptable movement skills, which are modelled as Gaussian distributions over trajectories. These are analytically tractable and can be learned from a small number of demonstrations. However, both the original ProMP formulation and subsequent approaches only provide solutions to specific movement adaptation problems, e.g., obstacle avoidance, and a generic, unifying, probabilistic approach to adaptation is missing. In this paper we develop a generic probabilistic framework for adapting ProMPs. We unify previous adaptation techniques, for example, various types of obstacle avoidance, via-points, and mutual avoidance, in one single framework and combine them to solve complex robotic problems. Additionally, we derive novel adaptation techniques such as temporally unbound via-points and mutual avoidance. We formulate adaptation as a constrained optimisation problem in which we minimise the Kullback-Leibler divergence between the adapted distribution and the distribution of the original primitive, while constraining the probability mass associated with undesired trajectories to be low. We demonstrate our approach on several adaptation problems on simulated planar robot arms and 7-DOF Franka Emika robots in a dual robot arm setting.
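In generic notation, the constrained optimization stated in the abstract takes the following form, where p is the trajectory distribution of the original primitive, q the adapted distribution, and U the set of undesired trajectories (the symbols are ours; the paper's exact formulation may differ):

```latex
\begin{aligned}
\min_{q}\quad & \mathrm{KL}\left(q(\tau)\,\|\,p(\tau)\right)\\
\text{s.t.}\quad & \Pr_{\tau\sim q}\left[\tau\in U\right]\le\epsilon
\end{aligned}
```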
|
|
WeB-3 |
Rm3 (Room B-2) |
Deep Learning for Visual Perception 2 |
Regular session |
Chair: Ang Jr, Marcelo H | National University of Singapore |
Co-Chair: Triebel, Rudolph | German Aerospace Center (DLR) |
|
14:40-14:50, Paper WeB-3.1 | |
Bayesian Active Learning for Sim-To-Real Robotic Perception |
|
Feng, Jianxiang | Institute of Robotics and Mechatronics, German Aerospace Center |
Lee, Jongseok | German Aerospace Center |
Durner, Maximilian | German Aerospace Center DLR |
Triebel, Rudolph | German Aerospace Center (DLR) |
Keywords: Deep Learning for Visual Perception, Rehabilitation Robotics, Object Detection, Segmentation and Categorization
Abstract: While learning from synthetic training data has recently gained increased attention, in real-world robotic applications there are still performance deficiencies due to the so-called Sim-to-Real gap. Therefore, we focus on the efficient acquisition of real data within a Sim-to-Real learning pipeline. Concretely, we employ deep Bayesian active learning to minimize manual annotation efforts and devise an autonomous learning paradigm to select the data that is considered useful for the human expert to annotate. To achieve this, a Bayesian Neural Network (BNN) object detector providing reliable uncertainty estimates is adapted to infer the informativeness of the unlabeled data. Furthermore, to cope with misalignments of the label distribution in uncertainty-based sampling, we develop an effective randomized sampling strategy that performs favorably compared to other, more complex alternatives. In our experiments on object classification and detection, we show the benefits of our approach and provide evidence that labeling efforts can be reduced significantly. Finally, we demonstrate the practical effectiveness of this idea in a grasping task on an assistive robot.
|
|
14:50-15:00, Paper WeB-3.2 | |
DiffCloud: Real-To-Sim from Point Clouds with Differentiable Simulation and Rendering of Deformable Objects |
|
Sundaresan, Priya | Stanford University |
Antonova, Rika | Stanford University |
Bohg, Jeannette | Stanford University |
Keywords: Deep Learning for Visual Perception, Deep Learning in Grasping and Manipulation, Simulation and Animation
Abstract: Research in manipulation of deformable objects is typically conducted on a limited range of scenarios, because handling each scenario on hardware takes significant effort. Realistic simulators with support for various types of deformations and interactions have the potential to speed up experimentation with novel tasks and algorithms. However, for highly deformable objects it is challenging to align the output of a simulator with the behavior of real objects. Manual tuning is not intuitive, hence automated methods are needed. We view this alignment problem as a joint perception-inference challenge and demonstrate how to use recent neural network architectures to successfully perform simulation parameter inference from real point clouds. We analyze the performance of various architectures, comparing their data and training requirements. Furthermore, we propose to leverage differentiable point cloud sampling and differentiable simulation to significantly reduce the time to achieve the alignment. We employ an efficient way to propagate gradients from point clouds to simulated meshes and further through to the physical simulation parameters, such as mass and stiffness. Experiments with highly deformable objects show that our method can achieve comparable or better alignment with real object behavior, while reducing the time needed to achieve this by more than an order of magnitude. Videos and supplementary material are available at https://tinyurl.com/diffcloud.
|
|
15:00-15:10, Paper WeB-3.3 | |
Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments |
|
Toyungyernsub, Maneekwan | Stanford University |
Yel, Esen | Stanford University |
Li, Jiachen | Stanford University |
Kochenderfer, Mykel | Stanford University |
Keywords: Deep Learning for Visual Perception, Computer Vision for Transportation, Computer Vision for Automation
Abstract: Detection and segmentation of moving obstacles, along with prediction of the future occupancy states of the local environment, are essential for autonomous vehicles to proactively make safe and informed decisions. In this paper, we propose a framework that integrates the two capabilities using deep neural network architectures. Our method first detects and segments moving objects in the scene and uses this information to predict the spatiotemporal evolution of the environment around autonomous vehicles. To address the problem of directly integrating both the static-dynamic object segmentation and environment prediction models, we propose using occupancy-based environment representations across the whole framework. Our method is validated on the real-world Waymo Open Dataset and demonstrates higher prediction accuracy than baseline methods.
|
|
15:10-15:20, Paper WeB-3.4 | |
What's in the Black Box? the False Negative Mechanisms Inside Object Detectors |
|
Miller, Dimity | Queensland University of Technology |
Moghadam, Peyman | CSIRO |
Cox, Mark | CSIRO |
Wildie, Matt | CSIRO |
Jurdak, Raja | Queensland University of Technology |
Keywords: Deep Learning for Visual Perception, Recognition, Object Detection, Segmentation and Categorization
Abstract: In object detection, false negatives arise when a detector fails to detect a target object. To understand why object detectors produce false negatives, we identify five 'false negative mechanisms', where each mechanism describes how a specific component inside the detector architecture failed. Focusing on two-stage and one-stage anchor-box object detector architectures, we introduce a framework for quantifying these false negative mechanisms. Using this framework, we investigate why Faster R-CNN and RetinaNet fail to detect objects in benchmark vision datasets and robotics datasets. We show that a detector's false negative mechanisms differ significantly between computer vision benchmark datasets and robotics deployment scenarios. This has implications for the translation of object detectors developed for benchmark datasets to robotics applications.
|
|
15:20-15:30, Paper WeB-3.5 | |
RVMOS: Range-View Moving Object Segmentation Leveraged by Semantic and Motion Features |
|
Kim, Jaeyeul | DGIST |
Woo, Jungwan | DGIST |
Im, Sunghoon | DGIST |
Keywords: Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization, Semantic Scene Understanding
Abstract: Detecting traffic participants is an essential and age-old problem in autonomous driving. Recently, the recognition of moving objects has emerged as a major issue in this field for safe driving. In this paper, we present RVMOS, a LiDAR Range-View-based Moving Object Segmentation framework that segments moving objects given a sequence of range-view images. In contrast to conventional methods, our network incorporates both motion and semantic features, which encode the motion of objects and the surrounding circumstances of the objects, respectively. In addition, we design a new feature extraction module suited to range-view images. Lastly, we introduce simple yet effective data augmentation methods: time interval modulation and zero residual image synthesis. With these contributions, we achieve 19% higher performance (mIoU) with 10% faster computation (34 FPS on an RTX 3090) than the state-of-the-art method on the SemanticKITTI benchmark. Extensive experiments demonstrate the effectiveness of our network design and data augmentation scheme.
|
|
15:30-15:40, Paper WeB-3.6 | |
Pseudo-Label Guided Cross-Video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations |
|
Yu, Yang | The Chinese University of Hong Kong |
Zhao, Zixu | The Chinese University of Hong Kong |
Jin, Yueming | The Chinese University of Hong Kong |
Chen, Guangyong | Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences |
Dou, Qi | The Chinese University of Hong Kong |
Heng, Pheng Ann | The Chinese University of Hong Kong |
Keywords: Deep Learning for Visual Perception, Computer Vision for Medical Robotics, Surgical Robotics: Laparoscopy
Abstract: Surgical scene segmentation is fundamentally crucial for prompting cognitive assistance in robotic surgery. However, annotating surgical video pixel-wise in a frame-by-frame manner is expensive and time-consuming. To greatly reduce the labeling burden, in this work we study semi-supervised scene segmentation from robotic surgical video, which is practically essential yet has rarely been explored before. We consider a clinically suitable annotation situation under equidistant sampling. We then propose PGV-CL, a novel pseudo-label guided cross-video contrastive learning method to boost scene segmentation. It effectively leverages unlabeled data for trustworthy and global model regularization that produces more discriminative feature representations. Concretely, for trustworthy representation learning, we propose to incorporate pseudo labels to instruct the pair selection, obtaining more reliable representation pairs for pixel contrast. Moreover, we expand the representation learning space from the previous image level to cross-video, which can capture global semantics to benefit the learning process. We extensively evaluate our method on a public robotic surgery dataset, EndoVis18, and a public cataract dataset, CaDIS. Experimental results demonstrate the effectiveness of our method, consistently outperforming state-of-the-art semi-supervised methods under different labeling ratios, and even surpassing fully supervised training on EndoVis18 with only 10.1% of labels. Our code will be publicly available.
|
|
15:40-15:50, Paper WeB-3.7 | |
An Unsupervised Domain Adaptive Approach for Multimodal 2D Object Detection in Adverse Weather Conditions |
|
Eskandar, George | University of Stuttgart |
Marsden, Robert | Institute of Signal Processing and System Theory |
Pandiyan, Pavithran | University of Stuttgart |
Döbler, Mario | University of Stuttgart |
Guirguis, Karim | Robert Bosch Corporate Research |
Yang, Bin | University of Stuttgart |
Keywords: Deep Learning for Visual Perception, Transfer Learning, Sensor Fusion
Abstract: Integrating different representations from complementary sensing modalities is crucial for robust scene interpretation in autonomous driving. While deep learning architectures that fuse vision and range data for 2D object detection have thrived in recent years, the corresponding modalities can degrade in adverse weather or lighting conditions, ultimately leading to a drop in performance. Although domain adaptation methods attempt to bridge the domain gap between source and target domains, they do not readily extend to heterogeneous data distributions. In this work, we propose an unsupervised domain adaptation framework, which adapts a 2D object detector for RGB and lidar sensors to one or more target domains featuring adverse weather conditions. Our proposed approach consists of three components. First, a data augmentation scheme that simulates weather distortions is devised to add domain confusion and prevent overfitting on the source data. Second, to promote cross-domain foreground object alignment, we leverage the complementary features of multiple modalities through a multi-scale entropy-weighted domain discriminator. Finally, we use carefully designed pretext tasks to learn a more robust representation of the target domain data. Experiments performed on the DENSE dataset show that our method can substantially alleviate the domain gap under the single-target domain adaptation (STDA) setting and the less explored yet more general multi-target domain adaptation (MTDA) setting.
|
|
15:50-16:00, Paper WeB-3.8 | |
BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling |
|
Bai, Yechao | National University of Singapore |
Wang, Xiaogang | National University of Singapore |
Ang Jr, Marcelo H | National University of Singapore |
Rus, Daniela | MIT |
Keywords: Deep Learning for Visual Perception
Abstract: The learning and aggregation of multi-scale features are essential in empowering neural networks to capture fine-grained geometric details in the point cloud upsampling task. Most existing approaches extract multi-scale features from a point cloud of a fixed resolution and hence obtain only a limited level of detail. Though one existing approach aggregates a feature hierarchy of different resolutions from a cascade of upsampling sub-networks, its training is complex and computationally expensive. To address these issues, we construct a new point cloud upsampling pipeline called BIMS-PU that integrates a feature pyramid architecture with a bi-directional upsampling and downsampling path. Specifically, we decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors. The multi-scale features are naturally produced in a parallel manner and aggregated using a fast feature fusion method. A supervision signal is applied simultaneously to all upsampled point clouds of different scales. Moreover, we formulate a residual block to ease the training of our model. Extensive quantitative and qualitative experiments on different datasets show that our method achieves superior results compared to state-of-the-art approaches. Last but not least, we demonstrate that point cloud upsampling can improve robot perception by ameliorating the quality of 3D data.
|
|
16:00-16:10, Paper WeB-3.9 | |
LiCaS3: A Simple LiDAR–Camera Self-Supervised Synchronization Method (I) |
|
Yuan, Kaiwen | University of British Columbia |
Ding, Li | The University of British Columbia |
Abdelfattah, Mazen | University of British Columbia |
Wang, Z. Jane | UBC |
Keywords: Deep Learning Methods, Autonomous Vehicle Navigation, Calibration and Identification
Abstract: Recent advances in robotics and deep learning demonstrate promising 3-D perception performance achieved by fusing light detection and ranging (LiDAR) sensor and camera data, where both spatial calibration and temporal synchronization are generally required. While the LiDAR–camera calibration problem has been actively studied during the past few years, LiDAR–camera synchronization has been less studied and is mainly addressed by employing a conventional pipeline consisting of clock synchronization and temporal synchronization. The conventional pipeline has certain potential limitations, which have not been sufficiently addressed and could be a bottleneck for the potential wide adoption of low-cost LiDAR–camera platforms. Different from the conventional pipeline, in this article we propose LiCaS3, the first deep-learning-based framework for the LiDAR–camera synchronization task via self-supervised learning. The proposed LiCaS3 does not require hardware synchronization or extra annotations and can be deployed both online and offline. Evaluated on both the KITTI and Newer College datasets, the proposed method shows promising performance. The code will be publicly available at https://github.com/KleinYuan/LiCaS3.
|
|
WeB-4 |
Rm4 (Room C-1) |
Machine Learning for Robot Control 2 |
Regular session |
Chair: Singh, Sumeet | Google |
Co-Chair: Vela, Patricio | Georgia Institute of Technology |
|
14:40-14:50, Paper WeB-4.1 | |
Multiscale Sensor Fusion and Continuous Control with Neural CDEs |
|
Singh, Sumeet | Google |
McCann Ramirez, Francis | Google Brain |
Varley, Jacob | Google |
Zeng, Andy | Google |
Sindhwani, Vikas | Google Brain, NYC |
Keywords: Machine Learning for Robot Control, Sensor Fusion, Perception-Action Coupling
Abstract: Though robot learning is often formulated in terms of discrete-time Markov decision processes (MDPs), physical robots require near-continuous, multiscale feedback control. Machines operate on multiple asynchronous sensing modalities, each with a different frequency, e.g., video frames at 30Hz, proprioceptive state at 100Hz, force-torque data at 500Hz, etc. While the classic approach is to batch observations into fixed-time windows and then pass them through feed-forward encoders (e.g., with deep networks), we show that there exists a more elegant approach -- one that treats policy learning as modeling latent state dynamics in continuous time. Specifically, we present 'InFuser', a unified architecture that trains continuous-time policies with Neural Controlled Differential Equations (CDEs). InFuser evolves a single latent state representation over time by (In)tegrating and (Fus)ing multi-sensory observations (arriving at different frequencies) and inferring actions in continuous time. This enables policies that can react to multi-frequency, multi-sensory feedback for truly end-to-end visuomotor control, without discrete-time assumptions. Behavior cloning experiments demonstrate that InFuser learns robust policies for dynamic tasks (e.g., swinging a ball into a cup), notably outperforming several baselines in settings where observations from one sensing modality can arrive at much sparser intervals than others.
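The continuous-time idea behind this architecture can be illustrated with a bare-bones Euler integration of a controlled differential equation, dz = f(z) dX, where the latent state z is driven by the increments of an observation path X. This is only a conceptual sketch with illustrative names, far simpler than the actual InFuser architecture:

```python
import numpy as np

def rollout_latent(f, z0, path):
    """Evolve a latent state z along a sampled observation path.

    path: (T, obs_dim) samples of the control path X (time is conventionally
    appended as one channel of X). f(z) returns a (latent_dim, obs_dim)
    matrix mapping path increments dX to latent increments dz = f(z) dX,
    so irregularly timed observations are handled naturally."""
    z = np.array(z0, dtype=float)
    for k in range(1, len(path)):
        dX = path[k] - path[k - 1]   # increment of the control path
        z = z + f(z) @ dX            # Euler step of dz = f(z) dX
    return z
```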
|
|
14:50-15:00, Paper WeB-4.2 | |
SMS-MPC: Adversarial Learning-Based Simultaneous Prediction Control with Single Model for Mobile Robots |
|
Yang, Andong | Institute of Computing Technology, Chinese Academy of Sciences |
Li, Wei | Institute of Computing Technology, Chinese Academy of Sciences |
Hu, Yu | Institute of Computing Technology Chinese Academy of Sciences |
Keywords: Machine Learning for Robot Control, Model Learning for Control, Motion Control
Abstract: Model predictive control is a promising method for robot control tasks. How to design an effective model structure and an efficient prediction framework for model predictive control is still an open challenge. To reduce the time consumption and avoid the compounding error of the multi-step prediction process in model predictive control, we propose a single-model simultaneous framework, which uses a single dynamics model to predict the entire prediction horizon simultaneously by taking all control actions together with the current state as inputs. Based on this framework, we further propose an adversarial dynamics model that contains two parts. The generator provides a dynamics model for the prediction process, while the discriminator provides constraints that are hard to describe with manually defined losses. This adversarial dynamics model can accelerate training and improve model accuracy in unstructured environments. Experiments conducted in the Gazebo simulator and on a real mobile robot demonstrate the efficiency and accuracy of the single-model simultaneous framework with an adversarial dynamics model.
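The contrast between the usual recursive rollout and the proposed single-call prediction can be sketched as follows; `step_model` and `seq_model` are hypothetical learned models, not the paper's API:

```python
def recursive_rollout(step_model, state, actions):
    """Conventional multi-step prediction: H model calls, errors compound."""
    states = []
    for a in actions:
        state = step_model(state, a)   # each prediction feeds the next
        states.append(state)
    return states

def simultaneous_rollout(seq_model, state, actions):
    """Single-model simultaneous prediction: one call returns all H future
    states from the current state and the full action sequence."""
    return seq_model(state, actions)
```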
|
|
15:00-15:10, Paper WeB-4.3 | |
Dynamic Inference on Graphs Using Structured Transition Models |
|
Saxena, Saumya | Carnegie Mellon University |
Kroemer, Oliver | Carnegie Mellon University |
Keywords: Machine Learning for Robot Control, Model Learning for Control, Robust/Adaptive Control
Abstract: Enabling robots to perform complex dynamic tasks such as picking up an object in one sweeping motion or pushing off a wall to quickly turn a corner is a challenging problem. The dynamic interactions implicit in these tasks are critical towards the successful execution of such tasks. Graph neural networks (GNNs) provide a principled way of learning the dynamics of interactive systems but can suffer from scaling issues as the number of interactions increases. Furthermore, the problem of using learned GNN-based models for optimal control is insufficiently explored. In this work, we present a method for efficiently learning the dynamics of interacting systems by simultaneously learning a dynamic graph structure and a stable and locally linear forward model of the system. The dynamic graph structure encodes evolving contact modes along a trajectory by making probabilistic predictions over the edges of the graph. Additionally, we introduce a temporal dependence in the learned graph structure which allows us to incorporate contact measurement updates during execution thus enabling more accurate forward predictions. The learned stable and locally linear dynamics enable the use of optimal control algorithms such as iLQR for long-horizon planning and control for complex interactive tasks. Through experiments in simulation and in the real world, we evaluate the performance of our method by using the learned interaction dynamics for control and demonstrate generalization to more objects and interactions not seen during training. We introduce a control scheme that takes advantage of contact measurement updates and hence is robust to prediction inaccuracies during execution.
|
|
15:10-15:20, Paper WeB-4.4 | |
Grasp Planning for Occluded Objects in a Confined Space with Lateral View Using Monte Carlo Tree Search |
|
Kang, Minjae | Seoul National University (SNU) |
Kee, Hogun | Seoul National University |
Kim, Junseok | Seoul National University |
Oh, Songhwai | Seoul National University |
Keywords: Deep Learning in Grasping and Manipulation, Manipulation Planning
Abstract: In the lateral access environment, the robot behavior should be planned considering surrounding objects and obstacles because object observation directions and approach angles are limited. To safely retrieve a partially occluded target object in these environments, we have to relocate objects using prehensile actions to create a collision-free path for the target. We propose a learning-based method for object rearrangement planning applicable to objects of various types and sizes in the lateral environment. We plan the optimal rearrangement sequence by considering both collisions and approach angles at which objects can be grasped. The proposed method finds the grasping order through Monte Carlo tree search, significantly reducing the tree search cost using point cloud states. In the experiment, the proposed method shows the best and most stable performance in various scenarios compared to the existing TAMP methods. In addition, we confirm that the proposed method trained in simulation can be easily applied to a real robot without additional fine-tuning, showing the robustness of the proposed method.
|
|
15:20-15:30, Paper WeB-4.5 | |
Non-Blocking Asynchronous Training for Reinforcement Learning in Real-World Environments |
|
Bohm, Peter | The University of Queensland |
Pounds, Pauline | The University of Queensland |
Chapman, Archie | The University of Queensland |
Keywords: Machine Learning for Robot Control, Reinforcement Learning
Abstract: Deep Reinforcement Learning (DRL) faces challenges bridging the sim-to-real gap to enable real-world applications. In contrast to the simulated environments used in conventional DRL training, real-world systems are non-linear and evolve in an asynchronous fashion; sensors and actuators have limited precision; communication channels are noisy; and many components introduce variable delays. While these issues are known to many researchers, published methods for systematically tackling the problem of DRL training under these conditions without using simulation are sparse in the field. To this end, this paper proposes a non-blocking and asynchronous DRL training architecture for non-linear, real-time dynamical systems, tailored to handling variable delays. Compared to conventional DRL training, we: (i) decouple the RL loop into separate processes run independently at their own frequencies, (ii) further decouple the collection of transition tuples (s_t, a_t, s_{t+1}) via asynchronous and independent streaming of both actions and observations, and (iii) mitigate the effects of delays and increase sample efficiency by providing delay-length measurements to the training loop and regularly retraining the DRL network. This allows the action step time to be tuned to find an optimal control frequency for a given system, and handles streamed observations that arrive with random delays and independently of action timing. We demonstrate the efficacy of this architecture with physical implementations of a commodity-grade swing-up pendulum and a quadrupedal robot. Our architecture achieves the best results, balancing the pendulum for almost the entire length of the episode, compared to conventional blocking approaches, which fail to learn effective policies. Our results show that these techniques scale to more complex tasks such as quadrupedal locomotion.
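A minimal sketch of the decoupled, non-blocking loops in (i) and (ii), using Python threads and queues; the component names and rates are illustrative, not the authors' implementation:

```python
import queue
import threading
import time

obs_q, act_q = queue.Queue(), queue.Queue()

def observation_stream(sensor, hz=100.0):
    while True:
        obs_q.put((time.time(), sensor()))         # timestamp enables delay measurement
        time.sleep(1.0 / hz)

def action_stream(actuator, hz=20.0):
    while True:
        try:
            actuator(act_q.get(timeout=1.0 / hz))  # never block on a missing action
        except queue.Empty:
            pass

def policy_loop(policy, hz=20.0):
    latest = None
    while True:
        while not obs_q.empty():
            latest = obs_q.get()                   # drain to the freshest observation
        if latest is not None:
            stamp, obs = latest
            delay = time.time() - stamp            # delay length fed to the policy/trainer
            act_q.put(policy(obs, delay))
        time.sleep(1.0 / hz)

# Each loop runs in its own thread at its own frequency, e.g.:
# threading.Thread(target=policy_loop, args=(policy,), daemon=True).start()
```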
|
|
15:30-15:40, Paper WeB-4.6 | |
End-To-End Learning to Grasp Via Sampling from Object Point Clouds |
|
Alliegro, Antonio | Politecnico Di Torino |
Rudorfer, Martin | University of Birmingham |
Frattin, Fabio | Politecnico Di Torino |
Leonardis, Ales | University of Birmingham |
Tommasi, Tatiana | Politecnico Di Torino |
Keywords: Deep Learning in Grasping and Manipulation, Deep Learning for Visual Perception, Grasping
Abstract: The ability to grasp objects is an essential skill that enables many robotic manipulation tasks. Recent works have studied point cloud-based methods for object grasping by starting from simulated datasets and have shown promising performance in real-world scenarios. Nevertheless, many of them still rely on ad-hoc geometric heuristics to generate grasp candidates, which fail to generalize to objects with shapes significantly different from those observed during training. Several approaches exploit complex multi-stage learning strategies and local neighborhood feature extraction while ignoring global semantic information. Furthermore, they are inefficient in terms of the number of training samples and the time required for inference. In this paper, we propose an end-to-end learning solution to generate 6-DOF parallel-jaw grasps starting from a 3D partial view of the object. Our Learning to Grasp (L2G) method gathers information from the input point cloud through a new procedure that combines a differentiable sampling strategy to identify the visible contact points with a feature encoder that leverages local and global cues. Overall, L2G is guided by a multi-task objective that generates a diverse set of grasps by optimizing contact point sampling, grasp regression, and grasp classification. With a thorough experimental analysis, we show the effectiveness of L2G as well as its robustness and generalization abilities.
|
|
15:40-15:50, Paper WeB-4.7 | |
Robot Skill Learning with Identification of Preconditions and Postconditions Via Level Set Estimation |
|
Takano, Rin | NEC Corporation |
Oyama, Hiroyuki | NEC Corporation |
Taya, Yuki | NEC Corporation |
Keywords: Machine Learning for Robot Control, AI-Based Methods, Task and Motion Planning
Abstract: Hierarchical algorithms have often been used to plan and execute complicated robotic sequential manipulation tasks, where an abstract planner searches for a skill sequence in an abstract space, and each skill generates actual motions on the basis of the planned skill sequences. To generate executable plans, the abstract planner should know the pre-/postconditions of each skill and appropriately choose skills so that the generated plan satisfies their pre-/postconditions. For such hierarchical planning, this paper presents a novel method for robot skill learning that learns not only a control policy but also the learned skill's pre-/postconditions to complete a given task. Our method combines an optimal control method and an active learning approach called level set estimation (LSE) to effectively collect training data for learning control policies and pre-/postconditions. Although there exists a LSE-based policy learning algorithm that identifies preconditions, its performance is limited to cases where the dimension of the search space for pre-/postconditions is low. The main contribution of this paper is the proposal of a new learning method that can handle tasks having a high-dimensional search space for pre-/postconditions. We demonstrate our proposed method in two robotic tasks. The results show that our method can more effectively learn a control policy and its pre-/postconditions compared with the existing LSE-based method.
|
|
15:50-16:00, Paper WeB-4.8 | |
Sex Parity in Cognitive Fatigue Model Development for Effective Human-Robot Collaboration |
|
Kalatzis, Apostolos | Montana State University Bozeman |
Hopko, Sarah | Texas A&M University |
Mehta, Ranjana | Texas A&M University |
Stanley, Laura | Montana State University Bozeman |
Wittie, Mike | Montana State University Bozeman |
Keywords: Machine Learning for Robot Control, Human Factors and Human-in-the-Loop, Human-Robot Collaboration
Abstract: In recent years, robots have become vital to achieving manufacturing competitiveness. Especially in industrial environments, a strong level of interaction is reached when humans and robots form a dynamic system that works together towards achieving a common goal or accomplishing a task. However, human-robot collaboration can be cognitively demanding, potentially contributing to cognitive fatigue. Therefore, the consideration of cognitive fatigue becomes particularly important to ensure the efficiency and safety of the overall human-robot collaboration. Additionally, sex is an inevitable human factor that needs further investigation for machine learning model development, given the perceptual and physiological differences between the sexes in responding to fatigue. As such, this study explored sex differences and labeling strategies in the development of machine learning models for cognitive fatigue detection. Sixteen participants, balanced by sex, were recruited to perform a surface finishing task with a UR10 collaborative robot under fatigued and non-fatigued states. Fatigue perception and heart rate activity data were collected throughout to create a dataset for cognitive fatigue detection. Equitable machine learning models were developed based on perception (survey responses) and condition (fatigue manipulation). The labeling approach had a significant impact on accuracy and F1-score, where perception-based labels led to lower accuracy and F1-scores for females, likely due to sex differences in the reporting of fatigue. Additionally, we observed a relationship between heart rate, algorithm type, and labeling approach, where heart rate was the most significant predictor for the two labeling approaches and for all the algorithms utilized. Understanding the implications of label type, algorithm type, and sex on the design of fatigue detection algorithms is essential to designing equitable fatigue-adaptive human-robot collaborations across the sexes.
|
|
16:00-16:10, Paper WeB-4.9 | |
Online Adaptive Compensation for Model Uncertainty Using Extreme Learning Machine-Based Control Barrier Functions |
|
Munoz Panduro, Emanuel | Carnegie Mellon University |
Kalaria, Dvij | Indian Institute of Technology Kharagpur |
Lin, Qin | Cleveland State University |
Dolan, John M. | Carnegie Mellon University |
Keywords: Machine Learning for Robot Control, Robot Safety
Abstract: Control barrier function-based quadratic programming (CBF-QP) has emerged as a controller synthesis tool for assuring the safety of autonomous systems, owing to its appealing property of rendering a safe set forward invariant. However, the provable safety relies on a precisely described dynamic model, which is not always available in practice. Recent works leverage learning to compensate for model uncertainty in a CBF controller. However, these approaches based on reinforcement learning or episodic learning are limited to dealing with time-invariant uncertainty. Also, the reinforcement learning approach learns the uncertainty offline, while episodic learning only updates the controller after a batch of data becomes available at the end of an episode. Instead, we propose a novel tuning extreme learning machine (tELM)-based CBF controller that can compensate for both time-variant and time-invariant model uncertainty adaptively and online. We validate our approach's effectiveness in a simulation of an Adaptive Cruise Control (ACC) system.
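To illustrate the underlying mechanism (not the paper's tELM design), here is a sketch of a CBF-QP safety filter for a scalar-input system x_dot = f(x) + g(x)u, where a learned model-error term is added to the drift. For a single input, the QP argmin |u - u_ref|^2 subject to the barrier constraint reduces to a clamp, so no QP solver is needed. All function names and the class-K gain alpha are assumptions for illustration.

import numpy as np

def cbf_qp_filter(x, u_ref, f, g, h, grad_h, alpha=1.0, uncertainty=None):
    # Enforce h_dot(x, u) + alpha * h(x) >= 0 while staying close to u_ref.
    drift = f(x)
    if uncertainty is not None:
        drift = drift + uncertainty(x)   # learned compensation of model error
    Lfh = float(grad_h(x) @ drift)       # Lie derivative along the drift
    Lgh = float(grad_h(x) @ g(x))        # Lie derivative along the input
    if abs(Lgh) < 1e-9:
        return u_ref                     # constraint does not depend on u
    bound = -(Lfh + alpha * h(x)) / Lgh
    return max(u_ref, bound) if Lgh > 0 else min(u_ref, bound)

In the ACC setting, h would encode a safe headway distance, and the paper's online-updated tELM would play the role of the uncertainty term.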
|
|
WeB-5 |
Rm5 (Room C-2) |
Soft Robot Modeling and Control 2 |
Regular session |
Chair: Legrand, Julie | VUB |
Co-Chair: Boyer, Frédéric | Ecole Des Mines De Nantes |
|
14:40-14:50, Paper WeB-5.1 | |
Task-Space Control of Continuum Robots Using Underactuated Discrete Rod Models |
|
Rucker, Caleb | University of Tennessee |
Barth, Eric J. | Vanderbilt University |
Gaston, Joshua | The University of Tennessee, Knoxville |
Gallentine, James | Vanderbilt University |
Keywords: Modeling, Control, and Learning for Soft Robots, Dynamics
Abstract: Underactuation is a core challenge associated with controlling soft and continuum robots, which possess theoretically infinite degrees of freedom but few actuators. However, m actuators may still be used to control a dynamic soft robot in an m-dimensional output task space. In this paper, we develop a task-space control approach for planar continuum robots that is robust to modeling error and requires very little sensor information. The controller is based on a highly underactuated discrete rod mechanics model in maximal coordinates and does not require conversion to a classical robot dynamics model form. This promotes straightforward control design, implementation, and efficiency. We perform input-output feedback linearization on this model, apply sliding mode control to increase robustness, and formulate an observer to estimate the full state from sparse output measurements. Simulation results show that exact task-space reference tracking can be achieved even in the presence of significant modeling error, inaccurate initial conditions, and output-only sensing.
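As background for the control design above, the sketch below shows the generic combination of input-output feedback linearization and sliding mode control for a system whose linearized output dynamics are y_ddot = v; it is a textbook construction, not the paper's controller, and the gains are arbitrary.

import numpy as np

def sliding_mode_aux_input(e, e_dot, y_des_ddot, lam=5.0, eta=2.0, phi=0.05):
    # e = y - y_des; sliding variable s = e_dot + lam * e.
    s = e_dot + lam * e
    sat = np.clip(s / phi, -1.0, 1.0)  # boundary layer limits chattering
    # Choose v so that s_dot = -eta * sat(s) despite bounded model error;
    # the physical input is then recovered through the linearizing map.
    return y_des_ddot - lam * e_dot - eta * sat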
|
|
14:50-15:00, Paper WeB-5.2 | |
Nonlinear Dynamics Modeling and Fault Detection for a Soft Trunk Robot: An Adaptive NN-Based Approach |
|
Zhang, Jingting | University of Rhode Island |
Chen, Xiaotian | University of Rhode Island |
Stegagno, Paolo | University of Rhode Island |
Yuan, Chengzhi | University of Rhode Island |
Keywords: Modeling, Control, and Learning for Soft Robots, Failure Detection and Recovery, Model Learning for Control
Abstract: This paper presents a radial basis function neural network (RBF NN) based methodology to investigate the dynamics modeling and fault detection (FD) problems for soft robots. The finite element method (FEM) is first used to derive a mathematical model describing the dynamics of a soft trunk robot. An adaptive dynamics modeling approach is then designed based on this FEM model by incorporating model-reduction and RBF NN techniques. This approach is capable of achieving accurate identification of the soft robot's highly nonlinear dynamics, with the identified knowledge obtained and stored in constant RBF NN models. Finally, a model-based FD scheme is proposed based on the modeling results, which can achieve efficient FD for the soft robot whenever it encounters an unknown fault. Note that the proposed methods are generic and usable for general soft robots. Validation of these methods is performed through both computer simulation and physical experiments.
|
|
15:00-15:10, Paper WeB-5.3 | |
Shape Representation and Modeling of Tendon-Driven Continuum Robots Using Euler Arc Splines |
|
Rao, Priyanka | University of Toronto |
Peyron, Quentin | Inria Lille-Nord Europe and CRIStAL UMR CNRS 9189, University Of |
Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Modeling, Control, and Learning for Soft Robots, Flexible Robotics, Kinematics
Abstract: Due to the compliance of tendon-driven continuum robots, carrying a load or experiencing a tip force results in variations in backbone curvature. While the spatial robot configuration theoretically needs an infinite number of parameters for exact description, it can be well approximated using Euler arc splines, which use only six of them. In this letter, we first show the accuracy of this representation by fitting Euler arc splines directly to experimentally measured robot shapes. Additionally, we propose a 3D static model that can account for gravity, friction, and tip forces. We demonstrate the utility of this efficient parameterization by analyzing the computation time of the proposed model and then using it to propose a hybrid model that combines the physics-based model with observed data. The average tip error is 0.43% for the Euler arc spline representation and 3.25% for the proposed static model, w.r.t. robot length. The average computation time is 0.56 ms for nonplanar deformations for a robot with ten disks. The hybrid model reduces the maximum error predicted by the static model from 8.6% to 5.1% w.r.t. robot length, while using 30 observations for training.
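As a toy analogue of the six-parameter representation, the sketch below reconstructs a planar rod shape from a curvature that varies linearly with arc length, kappa(s) = k0 + k1*s (an Euler spiral segment); the letter's actual Euler arc splines are 3D and use six parameters, and the names here are ours.

import numpy as np

def integrate_shape(k0, k1, length, n=200):
    ds = length / n
    s = np.arange(n) * ds
    kappa = k0 + k1 * s                    # curvature linear in arc length
    theta = np.cumsum(kappa) * ds          # heading angle along the backbone
    x = np.concatenate([[0.0], np.cumsum(np.cos(theta) * ds)])
    y = np.concatenate([[0.0], np.cumsum(np.sin(theta) * ds)])
    return np.stack([x, y], axis=1)        # (n+1, 2) backbone positions

backbone = integrate_shape(k0=2.0, k1=-1.5, length=0.2)  # e.g., a 20 cm robot

Fitting such a low-dimensional curvature parameterization to measured shapes is what keeps both the representation error and the computation time small.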
|
|
15:10-15:20, Paper WeB-5.4 | |
Geometrically-Exact Inverse Kinematic Control of Soft Manipulators with General Threadlike Actuators’ Routing |
|
Renda, Federico | Khalifa University of Science and Technology |
Armanini, Costanza | Khalifa University |
Mathew, Anup Teejo | Khalifa University |
Boyer, Frédéric | Ecole Des Mines De Nantes |
Keywords: Modeling, Control, and Learning for Soft Robots, Kinematics, Flexible Robotics
Abstract: The inverse kinematic control of soft robots remains an open challenge that has been the subject of a number of papers over the last decade. Some solutions have been provided based on specific assumptions about the robot's shape or the actuation mechanism. Other, more generic approaches are characterized by a significant computational cost or by low accuracy under very large deformations. In an effort to overcome some of these limitations, here we present a Geometrically-Exact (GE) inverse kinematics controller, which can be applied to soft manipulators with general threadlike actuators' routing. Being GE, the approach is suitable for applications involving arbitrarily large bending and twisting while, on the other hand, relying on a reduced number of Degrees of Freedom (DOFs). We prove the feasibility of the proposed Jacobian-based inverse kinematic control in simulation for soft manipulators with complex and discontinuous actuators' routing.
|
|
15:20-15:30, Paper WeB-5.5 | |
Quasi-Static FEA Model for a Multi-Material Soft Pneumatic Actuator in SOFA |
|
Ferrentino, Pasquale | Vrije Universiteit Brussels |
López-Díaz, Antonio | Universidad De Castilla-La Mancha |
Terryn, Seppe | Vrije Universiteit Brussel (VUB) |
Legrand, Julie | VUB |
Brancart, Joost | Vrije Universiteit Brussel (VUB) |
Van Assche, Guy | Vrije Universiteit Brussel (VUB) |
Vázquez Fernández-Pacheco, Ester | Universidad De Castilla La Mancha |
Vazquez, Andres S. | Universidad De Castilla La Mancha |
Vanderborght, Bram | Vrije Universiteit Brussel |
Keywords: Modeling, Control, and Learning for Soft Robots, Simulation and Animation, Kinematics
Abstract: The increasing interest in soft robotics has led to new designs that exploit the combination of multiple materials, increasing robustness and enhancing performance. However, the combination of multiple non-linear materials makes the modeling, and eventually the control, of these highly flexible systems challenging. This article presents a methodology to model multi-material soft pneumatic actuators using finite element analysis (FEA), based on (hyper)elastic constitutive laws fitted to experimental material characterisation. The model, built in the FEA software SOFA, allows soft robotic structures to be modeled and controlled in real time. One of the novelties presented in this paper is the development of a new user-friendly technique for mesh partitioning in SOFA, using MATLAB algorithms, that allows the creation of uniform and more refined meshes and a mesh domain partitioning that can be adapted to any geometry. As a case study, a cylindrical multi-material soft pneumatic actuator is considered. It is composed of an internal chamber, made of an autonomous self-healing hydrogel modelled as a hyperelastic material, and an external elastic reinforcement, made of thermoplastic polyether-polyurethane elastomer (TPPU) modelled as a linear elastic material. The simulation of the combination of a hyperelastic and a linear elastic material in a single design is another contribution of this work to the scientific literature on SOFA simulations. Finally, the multi-material model obtained with the new mesh partitioning technique is simulated under quasi-static conditions and experimentally validated, demonstrating an accurate fit between simulation and reality.
|
|
15:30-15:40, Paper WeB-5.6 | |
Controlling Soft Fluidic Actuators Using Soft DEA-Based Valves |
|
Poccard-Saudart, Johan | Harvard School of Engineering & Applied Sciences |
Xu, Siyi | Harvard University |
Teeple, Clark | Harvard University |
Hyun, Nak-seung Patrick | Harvard University |
Becker, Kaitlyn | MIT |
Wood, Robert | Harvard University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Fluidic soft actuators have been widely used for applications where compliance is desirable -- such as in delicate robotic manipulation. As their operation is based on pressurization, fluidic actuators typically rely on bulky, rigid pumps, valves, and pressure regulators for control. This dependence on rigid components hinders the development of compact and fully-soft robots. Soft regulation systems designed for control of these actuators have been recently developed based on soft valves using dielectric elastomer actuators, but precise control has not been achieved. In this work, we leverage these valves to introduce a soft regulation system capable of precise closed-loop position control of a soft hydraulic actuator with multiple controllers. We also achieve open-loop trajectory tracking based on a data-driven model of the fluidic system. Finally, we combine the valve system with wearable strain sensors to create the first teleoperated fluidic circuit where the sensor, actuator, and regulation system are all soft. This work presents control strategies for fluid-driven actuators with soft sensors and regulation systems, showing the potential for future all-soft motion control of soft hydraulic robots.
|
|
15:40-15:50, Paper WeB-5.7 | |
Omnidirectional Walking of a Quadruped Robot Enabled by Compressible Tendon-Driven Soft Actuators |
|
Ji, Qinglei | KTH Royal Institute of Technology |
Fu, Shuo | KTH Royal Institute of Technology |
Feng, Lei | KTH Royal Institute of Technology |
Andrikopoulos, George | KTH Royal Institute of Technology |
Wang, Xi Vincent | KTH Royal Institute of Technology |
Wang, Lihui | KTH Royal Institute of Technology |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Motion Control
Abstract: Using soft actuators as legs, soft quadruped robots have shown great potential in traversing unstructured and complex terrains and environments. However, unlike rigid robots, whose gaits can be generated using foot pattern design and the kinematic models of the rigid legs, gait generation for soft quadruped robots remains challenging due to the high DoFs of the soft actuators and the uncertain deformations during their contact with the ground. This study is based on a quadruped robot using four Compressible Tendon-driven Soft Actuators (CTSAs) as the legs, with the actuators' compression motion being utilized to improve the walking performance of the robot. For the gait design, an inverse kinematics model considering the compression of the CTSA is developed and validated in simulation. Based on this model, walking gaits realizing different motion speeds and directions are generated. Closed-loop direction and speed controllers are developed to increase the robustness and precision of the robot's walking. Simulation and experimental results show that omnidirectional locomotion and complex walking tasks can be realized by tuning the gait parameters, and that the motions are resistant to external disturbances.
|
|
15:50-16:00, Paper WeB-5.8 | |
Estimating Forces Along Continuum Robots |
|
Aloi, Vincent | University of Tennessee |
Dang, Khoa | The University of Tennessee, Knoxville |
Barth, Eric J. | Vanderbilt University |
Rucker, Caleb | University of Tennessee |
Keywords: Modeling, Control, and Learning for Soft Robots, Surgical Robotics: Steerable Catheters/Needles
Abstract: Continuum robots can be slender and flexible to navigate through complex environments, such as passageways in the human body. In order to control the forces that continuum robots apply during navigation and manipulation, we would like to detect the location, direction, and magnitude of contact force distributions as they arise. In this paper, we present a model-based framework for sensing distributed loads along continuum robots. Using sensed positions along the robot, we use a nonlinear optimization algorithm to estimate the loading that fits the model-predicted robot shape to the data. We propose that Gaussian load distributions provide a seamless way to account for a wide range of loadings, including approximate point loads and uniform distributed loads, while avoiding the ill-conditioning associated with highly resolved force distributions. In addition, we gain computational efficiency by re-framing the problem as an unconstrained weighted least-squares minimization and by solving this problem in an extended Kalman filter framework. We validate the approach on two prototype tendon-driven continuum robots in multiple 3D loading scenarios, observing a mean error of 0.58 N in load magnitude and a 7% mean error in load location with respect to the length of the respective robot.
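The Gaussian parameterization idea can be made concrete with a short sketch: a distributed load along arc length s is described by just a magnitude, center, and width, and those three numbers are fitted so a model-predicted shape matches the sensed backbone positions. Here shape_model is a stand-in for a rod mechanics forward model, and the plain least-squares fit replaces the paper's extended Kalman filter formulation.

import numpy as np
from scipy.optimize import least_squares

def gaussian_load(s, magnitude, center, width):
    return magnitude * np.exp(-0.5 * ((s - center) / width) ** 2)

def fit_load(sensed_positions, s_grid, shape_model, x0=(1.0, 0.5, 0.05)):
    # shape_model(load_values) -> predicted backbone positions (assumed given).
    def residual(params):
        load = gaussian_load(s_grid, *params)
        return (shape_model(load) - sensed_positions).ravel()
    return least_squares(residual, x0).x   # fitted (magnitude, center, width)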
|
|
16:00-16:10, Paper WeB-5.9 | |
Learning Physics-Informed Simulation Models for Soft Robotic Manipulation: A Case Study with Dielectric Elastomer Actuators |
|
Lahariya, Manu | Ghent University |
Innes, Craig | University of Edinburgh |
Develder, Chris | Ghent University |
Ramamoorthy, Subramanian | The University of Edinburgh |
Keywords: Modeling, Control, and Learning for Soft Robots, Machine Learning for Robot Control, Reinforcement Learning
Abstract: Soft actuators offer a safe, adaptable approach to tasks like gentle grasping and dexterous manipulation. Creating accurate models to control such systems, however, is challenging due to the complex physics of deformable materials. Accurate Finite Element Method (FEM) models incur prohibitive computational complexity for closed-loop use. Using a differentiable simulator is an attractive alternative, but its applicability to soft actuators and deformable materials remains underexplored. This paper presents a framework that combines the advantages of both. We learn a differentiable model consisting of a material properties neural network and an analytical dynamics model of the remainder of the manipulation task. This physics-informed model is trained using data generated from FEM, and can be used for closed-loop control and inference. We evaluate our framework on a dielectric elastomer actuator (DEA) coin-pulling task. We simulate the task of using a DEA to pull a coin along a surface with frictional contact, using FEM, and evaluate the physics-informed model for simulation, control, and inference. Our model attains ≤5% simulation error compared to FEM, and we use it as the basis for an MPC controller that requires fewer iterations to converge than model-free actor-critic, PD, and heuristic policies.
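The hybrid structure described above — a learned material model feeding an analytical dynamics step — can be sketched as follows; the network size, the single-parameter material output, and the Coulomb friction dynamics are our illustrative assumptions, not the paper's model.

import torch
import torch.nn as nn

# Learned component: maps actuation voltage to an effective pulling force.
material_net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

def step(pos, vel, voltage, dt=1e-3, mass=0.008, mu=0.3, g=9.81):
    # pos, vel, voltage: scalar torch tensors.
    pull = material_net(voltage.view(1, 1)).squeeze()   # learned part
    friction = -mu * mass * g * torch.sign(vel)         # analytical part
    acc = (pull + friction) / mass
    return pos + vel * dt, vel + acc * dt               # explicit Euler step

Because the whole step is differentiable, material_net can be trained end-to-end on FEM-generated trajectories, and the same model can then be reused inside an MPC loop.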
|
|
WeB-6 |
Rm6 (Room D) |
SLAM 8 |
Regular session |
Chair: Biber, Peter | Robert Bosch GmbH |
Co-Chair: Xu, Binbin | Imperial College London |
|
14:40-14:50, Paper WeB-6.1 | |
Detecting Invalid Map Merges in Lifelong SLAM |
|
Holoch, Matthias | Robert Bosch GmbH |
Kurz, Gerhard | Robert Bosch GmbH |
Biber, Peter | Robert Bosch GmbH |
Keywords: SLAM, Mapping, Localization
Abstract: For Lifelong SLAM, one has to deal with temporary localization failures, e.g., induced by kidnapping. We achieve this by starting a new map and merging it with the previous map as soon as relocalization succeeds. Since relocalization methods are fallible, it can happen that such a merge is invalid, e.g., due to perceptual aliasing. To address this issue, we propose methods to detect and undo invalid merges. These methods compare incoming scans with scans that were previously merged into the current map and consider how well they agree with each other. We evaluate our methods using a dataset that consists of multiple flat and office environments, as well as the public MIT Stata Center dataset. We show that methods based on a change detection algorithm and on comparison of gridmaps perform well in both environments and can be run in real time with a reasonable computational cost.
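A minimal version of the gridmap-comparison idea reads as follows (thresholds are illustrative, not the paper's): two occupancy grids covering the same area are checked for cells where both maps are confident yet contradict each other, and the merge is rejected when the disagreement ratio is too high.

import numpy as np

def merge_is_valid(grid_a, grid_b, occ=0.65, free=0.35, max_disagreement=0.05):
    # grid_a, grid_b: aligned occupancy grids with cell values in [0, 1].
    known = (((grid_a > occ) | (grid_a < free))
             & ((grid_b > occ) | (grid_b < free)))
    contradiction = (((grid_a > occ) & (grid_b < free))
                     | ((grid_a < free) & (grid_b > occ)))
    if known.sum() == 0:
        return False             # no confident overlap: cannot validate merge
    return contradiction.sum() / known.sum() <= max_disagreement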
|
|
14:50-15:00, Paper WeB-6.2 | |
MD-SLAM: Multi-Cue Direct SLAM |
|
Di Giammarino, Luca | Sapienza Univ. of Rome |
Brizi, Leonardo | Sapienza University of Rome |
Guadagnino, Tiziano | Sapienza University of Rome |
Stachniss, Cyrill | University of Bonn |
Grisetti, Giorgio | Sapienza University of Rome |
Keywords: SLAM, Mapping
Abstract: Simultaneous Localization and Mapping (SLAM) systems are fundamental building blocks for any autonomous robot navigating in an unknown environment. The SLAM implementation heavily depends on the sensor modality employed on the mobile platform. For this reason, assumptions on the scene's structure are often made to maximize estimation accuracy. This paper presents a novel direct 3D SLAM pipeline that works independently for RGB-D and LiDAR sensors. Building upon prior work on multi-cue photometric frame-to-frame alignment, our proposed approach provides an easy-to-extend and generic SLAM system. Our pipeline requires only minor adaptations within the projection model to handle different sensor modalities. We couple a position tracking system with an appearance-based relocalization mechanism that handles large loop closures. Loop closures are validated by the same direct registration algorithm used for odometry estimation. We present comparative experiments with state-of-the-art approaches on publicly available benchmarks using RGB-D cameras and 3D LiDARs. Our system performs well in heterogeneous datasets compared to other sensor-specific methods while making no assumptions about the environment. Finally, we release an open-source C++ implementation of our system.
|
|
15:00-15:10, Paper WeB-6.3 | |
Visual-Inertial Multi-Instance Dynamic SLAM with Object-Level Relocalisation |
|
Ren, Yifei | Imperial College London |
Xu, Binbin | Imperial College London |
Choi, Christopher | Imperial College London |
Leutenegger, Stefan | Technical University of Munich |
Keywords: SLAM, Mapping, Visual-Inertial SLAM
Abstract: In this paper, we present a tightly-coupled visual-inertial object-level multi-instance dynamic SLAM system. Even in extremely dynamic scenes, it can robustly optimise the camera pose, velocity, and IMU biases and build a dense object-level 3D reconstruction map of the environment. Our system can robustly track and reconstruct the geometries of arbitrary objects, together with their semantics and motion, by incrementally fusing associated colour, depth, semantic, and foreground object probabilities into each object model, thanks to its robust sensor and object tracking. In addition, when an object is lost or moved outside the camera field of view, our system can reliably recover its pose upon re-observation. We demonstrate the robustness and accuracy of our method by quantitatively and qualitatively testing it on real-world data sequences.
|
|
15:10-15:20, Paper WeB-6.4 | |
ACEFusion - Accelerated and Energy-Efficient Semantic 3D Reconstruction of Dynamic Scenes |
|
Bujanca, Mihai | University of Manchester |
Lennox, Barry | The University of Manchester |
Luján, Mikel | University of Manchester |
Keywords: SLAM, Mapping, Embedded Systems for Robotic and Automation
Abstract: ACEFusion is the first 3D reconstruction system able to capture the geometry and semantics of dynamic scenes using an RGB-D camera in real time on a robotic computing platform. Harnessing the hardware accelerators of an Nvidia Jetson AGX Xavier, the system uses heterogeneous computing to achieve 30 FPS under a 30 W power budget. Using a data-parallel design, we perform most image computation on the dedicated hardware accelerators, freeing the general-purpose cores and GPU to process 3D geometry. To further increase efficiency, we employ a hybrid geometry representation with octrees for static-semantic reconstruction and surfels for dynamic reconstruction. ACEFusion achieves competitive results on standard benchmarks while efficiently performing a more complex overall task than existing SLAM techniques. Figure 1 shows the output of our system on a dynamic sequence.
|
|
15:20-15:30, Paper WeB-6.5 | |
A Spanning Tree-Based Multi-Resolution Approach for Pose-Graph Optimization |
|
Tazaki, Yuichi | Kobe University |
Keywords: SLAM, Mapping, Optimization and Optimal Control
Abstract: This paper proposes a computationally efficient method for pose-graph optimization that makes use of a multi-resolution representation of pose-graph transformation constructed on a spanning tree. It is shown that the proposed spanning tree-based hierarchy has a number of advantages over the previously known serial chain-based hierarchy in terms of preservation of sparsity and compatibility with parallel computation. It is demonstrated in numerical experiments using several public datasets that the proposed method outperforms a state-of-the-art solver for large-scale datasets.
|
|
15:30-15:40, Paper WeB-6.6 | |
Situational Graphs for Robot Navigation in Structured Indoor Environments |
|
Bavle, Hriday | Postdoctoral Research Associate |
Sanchez-Lopez, Jose Luis | Interdisciplinary Center for Security, Reliability and Trust (Sn |
Shaheer, Muhammad | University of Luxembourg |
Civera, Javier | Universidad De Zaragoza |
Voos, Holger | University of Luxembourg |
Keywords: SLAM, Mapping, Localization
Abstract: Mobile robots should be aware of their situation, comprising a deep understanding of their surrounding environment along with the estimation of their own state, to successfully make intelligent decisions and execute tasks autonomously in real environments. 3D scene graphs are an emerging field of research that proposes to represent the environment in a joint model comprising geometric, semantic and relational/topological dimensions. Although 3D scene graphs have already been combined with SLAM techniques to provide robots with situational understanding, further research is still required to effectively deploy them on-board mobile robots. To this end, we present in this paper a novel, real-time, online-built Situational Graph (S-Graph), which combines in a single optimizable graph the representation of the environment with the aforementioned three dimensions, together with the robot pose. Our method utilizes odometry readings and planar surfaces extracted from 3D LiDAR scans to construct and optimize in real time a three-layered S-Graph that includes (1) a robot tracking layer where the robot poses are registered, (2) a metric-semantic layer with features such as planar walls, and (3) our novel topological layer constraining the planar walls using higher-level features such as corridors and rooms. Our proposal not only demonstrates state-of-the-art results for pose estimation of the robot, but also contributes a metric-semantic-topological model of the environment.
|
|
15:40-15:50, Paper WeB-6.7 | |
PFilter: Building Persistent Maps through Feature Filtering for Fast and Accurate LiDAR-Based SLAM |
|
Duan, Yifan | University of Science and Technology of China |
Peng, Jie | University of Science and Technology of China |
Zhang, Yu | University of Science and Technology of China |
Ji, Jianmin | University of Science and Technology of China |
Zhang, Yanyong | University of Science and Technology of China |
Keywords: Mapping, SLAM
Abstract: Simultaneous localization and mapping (SLAM) based on laser sensors has been widely adopted by mobile robots and autonomous vehicles. These SLAM systems are required to support accurate localization with limited computational resources. In particular, point cloud registration, i.e., the process of matching and aligning multiple LiDAR scans collected at multiple locations in a global coordinate framework, has been deemed the bottleneck step in SLAM. In this paper, we propose a feature filtering algorithm, PFilter, that can filter out invalid features and can thus greatly alleviate this bottleneck. Meanwhile, the overall registration accuracy is also improved due to the carefully curated feature points. We integrate PFilter into the well-established scan-to-map LiDAR odometry framework, F-LOAM, and evaluate its performance on the KITTI dataset. The experimental results show that PFilter can remove about 48.4% of the points in the local feature map and reduce the feature points in each scan by 19.3% on average, saving 20.9% of the processing time per frame. At the same time, accuracy improves by 9.4%.
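The persistence idea behind such feature filtering can be illustrated in a few lines (the constants and the scoring rule are ours, not PFilter's actual weighting): each map feature keeps a score that grows when it is re-observed and decays when it is missed, and only persistent features are retained for registration.

def update_persistence(scores, observed_ids, gain=1.0, decay=0.8, keep=2.0):
    # scores: dict mapping feature id -> persistence score.
    for fid in list(scores):
        if fid in observed_ids:
            scores[fid] += gain        # re-observed: strengthen
        else:
            scores[fid] *= decay       # missed: decay toward removal
    return {fid: s for fid, s in scores.items() if s >= keep}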
|
|
15:50-16:00, Paper WeB-6.8 | |
Nested Sampling for Non-Gaussian Inference in SLAM Factor Graphs |
|
Huang, Qiangqiang | Massachusetts Institute of Technology |
Papalia, Alan | Massachusetts Institute of Technology |
Leonard, John | MIT |
Keywords: Localization, Mapping
Abstract: We present nested sampling for factor graphs (NSFG), a novel nested sampling approach to approximate inference for posterior distributions expressed over factor graphs. Performing such inference is a key step in simultaneous localization and mapping (SLAM). Although the Gaussian approximation often works well, in other more challenging SLAM situations the posterior distribution is non-Gaussian and cannot be explicitly represented with standard distributions. Our technique applies to settings where the posterior distribution is substantially non-Gaussian (e.g., multi-modal) and thus needs a more expressive representation. NSFG exploits nested sampling methods to directly sample the posterior and represent the distribution without parametric density models. While nested sampling methods are known for their powerful capability in sampling multi-modal distributions, applying them to SLAM factor graphs is not straightforward. NSFG leverages the structure of factor graphs to construct informative prior distributions, which are efficiently sampled and provide notable computational benefits for nested sampling methods. We compare NSFG to state-of-the-art sampling approaches and Gaussian/non-Gaussian SLAM techniques in experiments. NSFG performs most robustly in describing non-Gaussian posteriors and computes solutions over an order of magnitude faster than other sampling approaches. We believe the primary value of NSFG is as a reference solution for posterior distributions, aiding offline accuracy evaluation of approximate distributions found by other SLAM algorithms.
|
|
16:00-16:10, Paper WeB-6.9 | |
City-Wide Street-To-Satellite Image Geolocalization of a Mobile Ground Agent |
|
Downes, Lena | Massachusetts Institute of Technology |
Kim, Dong Ki | Massachusetts Institute of Tech |
Steiner, Ted | Draper |
How, Jonathan | Massachusetts Institute of Technology |
Keywords: Localization, Vision-Based Navigation, Deep Learning for Visual Perception
Abstract: Cross-view image geolocalization provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image without the need for GPS. It is challenging to reliably match a ground image to the correct satellite image since the images have significant viewpoint differences. Existing works have demonstrated localization in constrained scenarios over small areas but have not demonstrated wider-scale localization. Our approach, called Wide-Area Geolocalization (WAG), combines a neural network with a particle filter to achieve global position estimates for agents moving in GPS-denied environments, scaling efficiently to city-scale regions. WAG introduces a trinomial loss function for a Siamese network to robustly match non-centered image pairs and thus enables the generation of a smaller satellite image database by coarsely discretizing the search area. A modified particle filter weighting scheme is also presented to improve localization accuracy and convergence. Taken together, WAG's network training and particle filter weighting approach achieves city-scale position estimation accuracies on the order of 20 meters, a 98% reduction compared to a baseline training and weighting approach. Applied to a smaller-scale testing area, WAG reduces the final position estimation error by 64% compared to a state-of-the-art baseline from the literature. WAG’s search space discretization additionally significantly reduces storage and processing requirements. We include in our submission a video demonstrating particle filter convergence results for WAG compared to the baseline for the Chicago test area.
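The particle-weighting step can be pictured with a short sketch — a simplified stand-in for WAG's modified weighting scheme, with tile_emb_fn denoting a hypothetical pretrained embedding of the satellite tile under each particle: weights follow a softened exponential of the embedding distance, so a single bad match cannot collapse the filter.

import numpy as np

def reweight(particles, weights, ground_emb, tile_emb_fn, temperature=5.0):
    dists = np.array([np.linalg.norm(ground_emb - tile_emb_fn(p))
                      for p in particles])
    # Shift by the best match and soften with a temperature before weighting.
    likelihood = np.exp(-(dists - dists.min()) / temperature)
    weights = weights * likelihood
    return weights / weights.sum()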
|
|
WeB-7 |
Rm7 (Room E) |
Rehabilitation Robotics |
Regular session |
Chair: Yoon, Jungwon | Gwangju Institutue of Science and Technology |
Co-Chair: Zhu, Chi | Maebashi Institute of Technology |
|
14:40-14:50, Paper WeB-7.1 | |
Soft Actuators for Facial Reanimation |
|
Konstantinidi, Stefania | Ecole Polytechnique Fédérale De Lausanne (EPFL) |
Martinez, Thomas | Ecole Polytechnique Fédérale De Lausanne (EPFL) |
Benouhiba, Amine | École Polytechnique Fédérale De Lausanne (EPFL) |
Civet, Yoan | EPFL |
Perriard, Yves | Ecole Polytechnique Fédérale De Lausanne (EPFL) |
Keywords: Soft Robot Applications, Soft Robot Materials and Design, Medical Robots and Systems
Abstract: Facial paralysis is a challenging condition that alters a patient's ability to express emotion and communicate. Restoring facial movements thus has crucial implications for the patients' quality of life. This publication introduces an approach to artificial muscle implementation targeting facial reanimation, as well as the challenges and limitations of the proposed strategy. The aim is to develop a Dielectric Elastomer Actuator (DEA) prosthesis for patients suffering from facial paralysis. DEAs are chosen as they are soft and exhibit large strain (up to 200%) and high dynamic behaviour (up to 20 kHz), making them promising actuators for the application. Myoelectric signals are extracted using electromyography sensors from the Zygomaticus Major muscle and processed in order to emphasize the activation phases. The resulting actuating signal is used to control a high voltage power supply to operate the DEA in an open loop. The resulting induced movement qualitatively matches the myoelectric signal, showing great potential of the proposed approach for facial paralysis reanimation.
|
|
14:50-15:00, Paper WeB-7.2 | |
Development and Experimental Evaluation of a Novel Portable Haptic Robotic Exoskeleton Glove System for Patients with Brachial Plexus Injuries |
|
Xu, Wenda | Virginia Tech |
Guo, Yunfei | Virginia Tech |
Bravo, Cesar | Carilion Clinic Institute of Orthopaedics and Neurosciences |
Ben-Tzvi, Pinhas | Virginia Tech |
Keywords: Rehabilitation Robotics, Wearable Robotics, Grasping
Abstract: This paper presents the development and experimental evaluation of a portable haptic exoskeleton glove system designed to restore lost grasping functionality for people who suffer from brachial plexus injuries. The proposed glove system combines force perception, a linkage-driven finger mechanism, and personalized voice control to achieve various grasping functionality requirements. The fully integrated system makes our wearable device lightweight, portable, and comfortable for grasping objects used in daily activities. Rigid articulated linkages powered by Series Elastic Actuators (SEAs) with slip detection on the fingertips provide stable and robust grasps of multiple objects. The passive abduction-adduction motion of each finger is also considered to provide better grasping flexibility for the user. Continuous voice control with bio-authentication also provides a hands-free user interface. Experiments with different objects verify the functionalities and capabilities of the proposed exoskeleton glove system in grasping objects with various shapes and weights used in activities of daily living (ADLs).
|
|
15:00-15:10, Paper WeB-7.3 | |
Development of a Novel Low-Profile Robotic Exoskeleton Glove for Patients with Brachial Plexus Injuries |
|
Xu, Wenda | Virginia Tech |
Liu, Yujiong | Virginia Tech |
Ben-Tzvi, Pinhas | Virginia Tech |
Keywords: Rehabilitation Robotics, Mechanism Design, Prosthetics and Exoskeletons
Abstract: This paper presents the design and development of a novel, low-profile exoskeleton robotic glove aimed at restoring lost grasping functionality for people who suffer from brachial plexus injuries. The key idea of this new glove lies in its new finger mechanism, which takes advantage of the rigid coupling hybrid mechanism (RCHM) concept. This mechanism concept couples the motions of adjacent finger links using rigid coupling mechanisms so that the overall mechanism motion (e.g., bending, extension, etc.) can be achieved using fewer actuators. The finger mechanism utilizes the single degree-of-freedom case of the RCHM, which uses a rack-and-pinion mechanism as the rigid coupling mechanism. This special arrangement enables each finger mechanism of the glove to be designed as thin as possible while maintaining mechanical robustness. Based on this novel finger mechanism, a two-finger low-profile robotic glove was developed. Remote center of motion mechanisms were used for the metacarpophalangeal (MCP) joints. Kinematic analysis and optimization-based kinematic synthesis were conducted to determine the design parameters of the new glove. Passive abduction/adduction joints were considered to improve grasping flexibility. A proof-of-concept prototype was built, and pinch grasping experiments with various objects were conducted. The results validated the mechanism and the mechanical design of the new robotic glove and demonstrated its functionalities and capabilities in grasping objects with various shapes and weights that are used in activities of daily living (ADLs).
|
|
15:10-15:20, Paper WeB-7.4 | |
A Novel Wheelchair-Exoskeleton Hybrid Robot to Assist Movement and Aid Rehabilitation |
|
Song, Zhibin | Tianjin University |
Ju, Wenjie | Tianjin University |
Chen, Dechao | Tianjin University |
Gong, Hexi | Tianjin University |
Kang, Rongjie | Tianjin University |
Dario, Paolo | Scuola Superiore Sant'Anna |
Keywords: Rehabilitation Robotics, Physical Human-Robot Interaction, Wearable Robotics
Abstract: As a traditional movement-assist device for people with lower-limb dysfunction, the wheelchair can support and carry users over long distances both indoors and outdoors; however, prolonged inactivity can lead to muscle atrophy and deteriorated motor function. As a promising solution, the lower-limb exoskeleton gives people the ability to stand and walk, avoiding these problems. However, the exoskeleton has inevitable shortcomings in long-distance movement and balance, which do not exist in a wheelchair. To integrate the advantages of both devices, in this paper we propose a wheelchair-exoskeleton hybrid robot (WeHR) that not only provides users with long-duration support and long-distance movement but also provides walking training and maintains self-balance. Moreover, motion transitions such as sit-to-stand and stand-to-sit can also be implemented by the newly proposed device without help from caregivers. We have developed a prototype that implements the above functions. In this paper, we emphasize the motion transition strategy, including two trajectory planning methods for the Sit-To-Stand (STS) process, as well as the mechanism design that implements it. Furthermore, preliminary motion transition and walking experiments are conducted, and the results show that our device can support users in sitting, standing, walking, and transitioning between these motions.
|
|
15:20-15:30, Paper WeB-7.5 | |
Facial Expressions-Controlled Flight Game with Haptic Feedback for Stroke Rehabilitation: A Proof-Of-Concept Study |
|
Li, Min | Xi'an Jiaotong University |
Wu, Zonglin | Xi'an Jiaotong University |
Zhao, Chen-Guang | Fourth Military Medical University |
Yuan, Hua | Fourth Military Medical University |
Wang, Tianci | Xi'an Jiaotong University |
Xie, Jun | Xi'an Jiaotong University |
Xu, Guanghua | School of Mechanical Engineering, Xi'an Jiaotong University |
Luo, Shan | King's College London |
Keywords: Rehabilitation Robotics
Abstract: Most stroke patients suffer from a combination of motor and sensory dysfunction and central facial paralysis. Specific rehabilitation training is required to restore those functions. Current research focuses on developing stimulating and straightforward rehabilitation training processes so that patients adhere to the training at home after hospital release. This study proposes enhancing patients' enthusiasm to participate in facial muscle exercises, and improving their postural perception and balance, by having them control virtual objects with different facial expressions, assisted by haptic feedback, to complete assigned tasks in virtual reality games. The different rehabilitation exercises for motor, sensory, and facial dysfunctions were combined in one virtual reality game for the first time. The proposed haptic feedback device was modeled, simulated, and characterized. A user study was conducted to validate the proposed system. The experimental results show that all the designed functions of the comprehensive stroke rehabilitation virtual reality game can be achieved. The added haptic feedback enhances the performance of the aircraft control with facial expressions by lowering the trajectory deviation by 22.57%. This implies that the proposed game may improve users' performance, thus attracting them to conduct more training.
|
|
15:30-15:40, Paper WeB-7.6 | |
An Intention Prediction Based Shared Control System for Point-To-Point Navigation of a Robotic Wheelchair |
|
Lei, Zhen | Nanyang Technological University |
Tan, Bang Yi | Nanyang Technological University |
Garg, Neha Priyadarshini | NUS |
Li, Lei | Nanyang Technological University |
Sidarta, Ananda | Nanyang Technological University |
Ang, Wei Tech | Nanyang Technological University |
Keywords: Rehabilitation Robotics, Human-Robot Collaboration, Intention Recognition
Abstract: Shared control approaches for robotic wheelchairs aim to provide navigation assistance to humans by utilizing the robot's intelligence in environment perception and motion planning. They can be broadly classified into two categories based on human intention prediction. Without human intention prediction, control authority lies with humans and assistance is provided only to avoid collisions. This can cause difficulty in cases where fine motor control is required, such as when entering narrow doorways, especially for users with severe upper limb disability. Intention prediction based approaches are able to better assist with such tasks but do not give enough control authority to the user, as possible user intentions are pre-defined. In this work, we present an intention prediction based shared control system for point-to-point navigation of a wheelchair which gives control authority to the user and also assists in fine motor control tasks. We compute various possible user intentions online using a generalized Voronoi diagram, link them across time steps using their homotopy class, and are thus able to calculate their probability given the user input history. A shared local path planner then steers the user towards the most likely path. This allows the user to follow any path. Our simulation experiments with 18 healthy subjects, and both simulation and real wheelchair experiments with 2 Cerebral Palsy (CP) subjects, show that our system can improve navigation outcomes for people with disability and in general leads to around 10% faster completion of the task even for healthy people, as compared to a local obstacle avoidance system, while allowing users to follow their desired path with similar accuracy.
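A stripped-down version of the intention update reads as follows (the alignment likelihood is our own illustrative choice, not the paper's formulation): each candidate path accumulates probability according to how well the user's joystick input aligns with the direction that path would command next.

import numpy as np

def update_intentions(priors, path_directions, user_input, kappa=4.0):
    # priors: (K,) probabilities; path_directions: K unit vectors toward the
    # next waypoint of each candidate path; user_input: 2D joystick vector.
    u = user_input / (np.linalg.norm(user_input) + 1e-9)
    likelihood = np.array([np.exp(kappa * float(u @ d)) for d in path_directions])
    posterior = priors * likelihood
    return posterior / posterior.sum()

Linking candidates across time steps by homotopy class, as the paper does, is what makes such recursive accumulation of evidence meaningful.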
|
|
15:40-15:50, Paper WeB-7.7 | |
A Wearable System with Harmonic Oscillations to Assess Finger Biomechanics |
|
Yu, Hao | The University of Edinburgh |
Sena, Aran | Imperial College London |
Burdet, Etienne | Imperial College London |
Keywords: Rehabilitation Robotics, Wearable Robotics
Abstract: This paper presents a wearable device for finger assessment that can identify finger joint impedance parameters through harmonic oscillation perturbations. This device is designed to help assess motor impairments related to hypertonic soft-tissue changes, that can arise from a number of conditions such as stroke. By measuring the ratio of the applied torque and resulting velocities, the impedance values for any bending direction of a metacarpophalangeal (MCP) joint can be estimated. The ability of this device to effectively estimate finger parameters was tested in experiments with six participants. The experimental result was validated through comparison to prior works on finger impedance estimation. The user experience of the presented system was also analysed, indicating that the device design is comfortable and acceptable for participants.
|
|
15:50-16:00, Paper WeB-7.8 | |
Evaluation of TENS Based Biofeedback and Warning for Improvement of Seated Balance on a Trunk Rehabilitation Robot |
|
Eizad, Amre | Gwangju Institute of Science and Technology |
Lee, Hosu | Gwangju Institute of Science and Technology |
Lee, Junyeong | Gwangju Institute of Science and Technology |
Yoon, Jungwon | Gwangju Institutue of Science and Technology |
Keywords: Rehabilitation Robotics, Haptics and Haptic Interfaces, Human-Centered Robotics
Abstract: Provision of visual feedback under unstable and forcefully perturbed seat conditions can help improve seated balance performance. However, due to visual limitations, some patients may require use of a different modality. Additionally, warning about an upcoming perturbation may improve balance reactions, and use of a system that provides such warnings and balance biofeedback through different modalities may result in further improvement. Transcutaneous electrical nerve stimulation (TENS), which can generate electro-tactile stimulation, may be useful in this regard. Therefore, in this study with 21 healthy subjects, we have used our recently developed trunk rehabilitation robot to evaluate the performance of TENS as a feedback and perturbation warning modality against visual and vibrotactile stimulation. The center of pressure (COP) and trunk acceleration results show that both TENS and vibrotactile stimulation may serve as viable alternatives for visual feedback under unstable condition. Under perturbation condition, provision of warning improves balancing performance, and TENS shows the overall best performance as a warning modality, with vibrotactile showing the lowest performance. Thus, TENS may be considered a feasible feedback/warning modality for use during seated balance rehabilitation.
|
|
16:00-16:10, Paper WeB-7.9 | |
Soft Robotic Fabric Actuator with Elastic Bands for High Force & Bending Performance in Hand Exoskeletons |
|
Suulker, Cem | Queen Mary University of London |
Skach, Sophie | Queen Mary University of London |
Althoefer, Kaspar | Queen Mary University of London |
Keywords: Soft Sensors and Actuators, Prosthetics and Exoskeletons, Soft Robot Applications
Abstract: In current designs of soft robotic bending actuators, the need for high force capabilities is not adequately addressed. In this paper, we present a new inflatable actuator that exploits textile manufacturing techniques, using an elastic band to improve both bending and force performance. At a pressure of 102 kPa, the index-finger-sized actuator exerts 24.8 N of force on the environment. It is also capable of a full 360-degree bending angle for pressures between 30 kPa and 102 kPa, with a maximum bending stiffness of 288.4 N/m. We further demonstrate the feasibility of this new actuator in a case study of an entirely fabric-based soft robotic hand exoskeleton that increases robustness and user comfort. Our results suggest that textile robotics could provide an attractive solution for the development of user-friendly hand exoskeletons.
|
|
WeB-8 |
Rm8 (Room F) |
Compliance and Impedance Control 2 |
Regular session |
Chair: Karayiannidis, Yiannis | Lund University |
Co-Chair: Tsuji, Toshiaki | Saitama University |
|
14:40-14:50, Paper WeB-8.1 | |
Integrating Impedance Control and Nonlinear Disturbance Observer for Robot-Assisted Arthroscope Control in Elbow Arthroscopic Surgery |
|
Li, Teng | University of Alberta |
Badre, Armin | University of Alberta |
Taghirad, Hamid D. | K.N.Toosi University of Technology |
Tavakoli, Mahdi | University of Alberta |
Keywords: Compliance and Impedance Control, Physical Human-Robot Interaction, Medical Robots and Systems
Abstract: Robot-assisted arthroscopic surgery is transforming the tradition in orthopaedic surgery. Compliance and stability are essential features that a surgical robot must have for safe physical human-robot interaction (pHRI). Surgical tools attached to the robot end-effector and human-robot interaction inevitably affect the robot dynamics. This can undermine the utility and stability of the robotic system if the varying robot dynamics are not identified and updated in the robot control law. In this paper, an integrated framework for robot impedance control and nonlinear disturbance observer (NDOB)-based compensation of uncertain dynamics is proposed, where the former ensures compliant robot behavior and the latter compensates for dynamic uncertainties when necessary. The combination of the impedance controller and the NDOB is analyzed theoretically in three scenarios. Comprehensive simulation and experimental studies involving three common conditions are then conducted to evaluate the theoretical analyses. A preliminary pHRI application in arthroscopic surgery is designed to implement the proposed framework on a robotic surgeon-assist system and evaluate its effectiveness experimentally. By integrating the impedance controller with the NDOB, the proposed framework allows accurate impedance control when dynamic model inaccuracy and external disturbances exist.
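Structurally (not in the paper's exact equations), the combination amounts to adding a disturbance estimate to a standard impedance law; the joint-space observer below is a first-order sketch that neglects Coriolis and gravity terms for brevity, and the gain L is an assumed tuning constant.

import numpy as np

def impedance_command(x, xd, x_des, xd_des, K, D, d_hat):
    # Cartesian impedance: F = K (x_des - x) + D (xd_des - xd) + d_hat,
    # where d_hat is supplied by the disturbance observer.
    return K @ (x_des - x) + D @ (xd_des - xd) + d_hat

def ndob_update(d_hat, tau_cmd, qdd_meas, M, L, dt):
    # Track the residual between the torque explained by measured motion and
    # the commanded torque (Coriolis/gravity omitted in this sketch).
    residual = M @ qdd_meas - tau_cmd
    return d_hat + L * dt * (residual - d_hat)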
|
|
14:50-15:00, Paper WeB-8.2 | |
Reinforcement Learning of Impedance Policies for Peg-In-Hole Tasks: Role of Asymmetric Matrices |
|
Kozlovsky, Shir | Technion – Israel Institute of Technology |
Newman, Elad | Technion - Israel Institute of Technology |
Zacksenhouse, Miriam | Technion-Israel Institute of Technology |
Keywords: Compliance and Impedance Control, Machine Learning for Robot Control, Reinforcement Learning
Abstract: Robotic manipulators are playing an increasing role in a wide range of industries. However, their application to assembly tasks is hampered by the need for precise control over the environment and for task-specific coding. Cartesian impedance control is a well-established method for interacting with the environment and handling uncertainties. With the advance of Reinforcement Learning (RL), it has been suggested to learn the impedance matrices. However, most of the current work is limited to learning diagonal impedance matrices in addition to the trajectory itself. We argue that asymmetric impedance matrices enhance the ability to properly correct reference trajectories generated by a baseline planner, alleviating the need for learning the trajectory. Moreover, a task-specific set of asymmetric impedance matrices can be sufficient for simple tasks, alleviating the need for learning variable impedance control. We learn impedance policies for small (few mm) peg-in-hole tasks using model-free RL, and investigate the advantage of using asymmetric impedance matrices and their space-invariance. Finally, we demonstrate zero-shot policy transfer from simulation to a real robot, and generalization to new real-world environments, with larger parts and semi-flexible pegs.
|
|
15:00-15:10, Paper WeB-8.3 | |
A Self-Tuning Impedance-Based Interaction Planner for Robotic Haptic Exploration |
|
Kato, Yasuhiro | Saitama University |
Balatti, Pietro | Istituto Italiano Di Tecnologia |
Gandarias, Juan M. | Istituto Italiano Di Tecnologia |
Leonori, Mattia | Istituto Italiano Di Tecnologia |
Tsuji, Toshiaki | Saitama University |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Compliance and Impedance Control, Integrated Planning and Control, Planning under Uncertainty
Abstract: This paper presents a novel interaction planning method that exploits impedance tuning techniques in response to environmental uncertainties and unpredictable conditions using haptic information only. The proposed algorithm plans the robot's trajectory based on the haptic interaction with the environment and adapts planning strategies as needed. Two approaches are considered: Exploration and Bouncing strategies. The Exploration strategy takes the actual motion of the robot into account in planning, while the Bouncing strategy exploits the forces and the motion vector of the robot. Moreover, self-tuning impedance is performed according to the planned trajectory to ensure compliant contact and low contact forces. In order to show the performance of the proposed methodology, two experiments with a torque-controlled robotic arm are carried out. The first considers a maze exploration without obstacles, whereas the second includes obstacles. The proposed method's performance is analyzed and compared against previously proposed solutions in both cases. Experimental results demonstrate that: i) the robot can successfully plan its trajectory autonomously in the most feasible direction according to the interaction with the environment, and ii) compliant interaction with an unknown environment is achieved despite the uncertainties. Finally, a scalability demonstration is carried out to show the potential of the proposed method under multiple scenarios.
|
|
15:10-15:20, Paper WeB-8.4 | |
Probabilistic Approach to Online Stiffness Estimation for Robotic Tasks |
|
Tsuji, Toshiaki | Saitama University |
Kusakabe, Tsukasa | Saitama University |
Keywords: Compliance and Impedance Control, Contact Modeling, Force and Tactile Sensing
Abstract: The stiffness of the environment is useful information for robotic tasks involving interactions with unstructured and unknown environments. However, its online estimation remains challenging. Owing to the nature of the calculation algorithm, a large amount of noise may be generated depending on the measured force and position responses. In this paper, we propose a variable gain filter that predicts the degree of such noise using a probabilistic approach and reflects only reliable data in the estimation. We experimentally show that the proposed method improves the accuracy of stiffness estimation without degrading the estimation time constant.
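A toy version of the variable-gain idea: the raw stiffness sample k = dF/dx is blended into the running estimate with a gain that shrinks when the sample is predicted to be noisy — here crudely proxied by the size of the displacement increment, whereas the paper derives the reliability probabilistically.

def update_stiffness(k_est, dF, dx, min_dx=1e-4, max_gain=0.2):
    if abs(dx) < min_dx:
        return k_est                         # increment too small: unreliable
    k_raw = dF / dx                          # raw, potentially noisy sample
    gain = max_gain * min(1.0, abs(dx) / (10 * min_dx))  # trust grows with dx
    return k_est + gain * (k_raw - k_est)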
|
|
15:20-15:30, Paper WeB-8.5 | |
Efficient Learning of Inverse Dynamics Models for Adaptive Computed Torque Control |
|
Jorge, David | University of Edinburgh |
Pizzuto, Gabriella | University of Liverpool |
Mistry, Michael | University of Edinburgh |
Keywords: Dynamics, Compliance and Impedance Control
Abstract: Modelling robot dynamics accurately is essential for control, motion optimisation and safe human-robot collaboration. Given the complexity of modern robotic systems, dynamics modelling remains non-trivial, mostly in the presence of compliant actuators, mechanical inaccuracies, friction and sensor noise. Recent efforts have focused on utilising data-driven methods such as Gaussian processes and neural networks to overcome these challenges, as they are capable of capturing these dynamics without requiring extensive knowledge beforehand. While Gaussian processes have been shown to be an effective method for learning robotic dynamics, with the added ability to represent the uncertainty in the learned model through their variance, they come at the cost of cubic time complexity rather than the linear complexity of deep neural networks. In this work, we leverage deep kernel models, which combine the computational efficiency of deep learning with the non-parametric flexibility of kernel methods (Gaussian processes), with the overarching goal of realising an accurate probabilistic framework for uncertainty quantification. Through using the predicted variance, we adapt the feedback gains as more accurate models are learned, leading to low-gain control without compromising tracking accuracy. Using simulated and real data recorded from a seven degree-of-freedom robotic manipulator, we illustrate how using stochastic variational inference with deep kernel models increases compliance in the computed torque controller, and retains tracking accuracy. We empirically show how our model outperforms current state-of-the-art methods with prediction uncertainty for online inverse dynamics model learning, and solidify its adaptation and generalisation capabilities across different setups.
|
|
15:30-15:40, Paper WeB-8.6 | |
On the Performance and Passivity of Admittance Control with Feed-Forward Input |
|
Ko, Dongwoo | Postech |
Lee, Donghyeon | Pohang University of Science and Technology(POSTECH) |
Chung, Wan Kyun | POSTECH |
Kim, Keehoon | POSTECH, Pohang University of Science and Technology |
Keywords: Compliance and Impedance Control
Abstract: This paper analyzes how the control parameters of the feed-forward term and the inner-loop velocity controller in an admittance control scheme affect performance and passivity. Interaction force, inertia, and damping compensation were considered as the feed-forward input. Sufficient conditions and guidelines for each parameter are provided to enable the implementation of a wide range of desired admittances while satisfying passivity. The proposed guidelines were verified through experiments.
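For orientation, a minimal admittance outer loop with an inner velocity controller — the structure analyzed in the paper — looks like this; the virtual inertia and damping values and the feed-forward term f_ff are placeholders.

def admittance_step(v, f_meas, M_d=2.0, D_d=10.0, dt=0.001, f_ff=0.0):
    # Desired admittance dynamics: M_d * v_dot + D_d * v = f_meas + f_ff.
    v_dot = (f_meas + f_ff - D_d * v) / M_d
    return v + v_dot * dt   # velocity reference passed to the inner loop

The paper's analysis concerns precisely how choices such as f_ff and the inner-loop gains constrain the admittance parameters that can be rendered while preserving passivity.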
|
|
15:40-15:50, Paper WeB-8.7 | |
Feel the Tension: Manipulation of Deformable Linear Objects in Environments with Fixtures Using Force Information |
|
Süberkrüb, Finn | Chalmers University of Technology |
Laezza, Rita | Chalmers University of Technology |
Karayiannidis, Yiannis | Lund University |
Keywords: Force Control, Perception for Grasping and Manipulation, Compliant Assembly
Abstract: Humans are able to manipulate Deformable Linear Objects (DLOs) such as cables and wires, with little or no visual information, relying mostly on force sensing. In this work, we propose a reduced DLO model which enables such blind manipulation by keeping the object under tension. Further, an online model estimation procedure is also proposed. A set of elementary sliding and clipping manipulation primitives are defined based on our model. The combination of these primitives allows for more complex motions such as winding of a DLO. The model estimation and manipulation primitives are tested individually but also together in a real-world cable harness production task, using a dual-arm YuMi, thus demonstrating that force-based perception can be sufficient even for such a complex scenario.
|
|
15:50-16:00, Paper WeB-8.8 | |
A Comparative Study of Force Observers for Accurate Force Control of Multisensor-Based Force Controlled Motion Systems |
|
Kangwagye, Samuel | DGIST |
Oh, Sehoon | DGIST |
Keywords: Force Control, Motion Control, Sensor Fusion
Abstract: This paper presents a comprehensive comparative study of multisensor-based force observers for accurate force control. A force-controlled system containing a force sensor, which measures the force transmitted to the load by the motor, and an encoder, which measures the motor position, is considered as the general multisensor-based motion system in this study. Even though such multisensor-based motion systems are emerging as the demand for collaborative robots increases, there have been few studies investigating their advantages and limitations. To address this issue, three types of observer-based force controllers that utilize the multiple sensors are designed and implemented. These controllers exploit the force sensor, motor encoder, and motor torque information available from the multisensor-based motion system to estimate the force accurately, and the estimate is then used to close the feedback loop. Mathematical and quantitative analyses are conducted to compare the performance of the proposed observer-based force controllers, and through this their advantages and limitations are pointed out. Finally, simulations and an experimental case study with an actual robot are conducted to validate the force tracking performance of the designed force control systems.
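The observers compared in the paper are more elaborate, but the underlying fusion idea can be sketched as a first-order complementary filter that trusts the force sensor at low frequency and the motor-side estimate (torque minus inertial force) at high frequency; the names and constants here are illustrative, not the paper's designs.

```python
import math

class ComplementaryForceObserver:
    """Fuse a force sensor (reliable at low frequency) with a motor-side
    estimate f = tau - J*ddq (reliable at high frequency)."""

    def __init__(self, J_m, fc=20.0, dt=0.001):
        self.J_m = J_m                                  # motor-side inertia
        self.beta = math.exp(-2.0 * math.pi * fc * dt)  # low-pass pole
        self.f_lp = 0.0                                 # filtered sensor-model gap

    def update(self, f_sensor, tau_motor, ddq):
        f_model = tau_motor - self.J_m * ddq            # encoder/torque estimate
        self.f_lp = self.beta * self.f_lp + (1.0 - self.beta) * (f_sensor - f_model)
        return f_model + self.f_lp  # = highpass(f_model) + lowpass(f_sensor)
```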
|
|
16:00-16:10, Paper WeB-8.9 | |
Bio-Inspired Grasping Controller for Sensorized 2-DoF Grippers |
|
Lach, Luca | Bielefeld University |
Lemaignan, Séverin | PAL Robotics |
Ferro, Francesco | PAL Robotics |
Ritter, Helge Joachim | Bielefeld University |
Haschke, Robert | Bielefeld University |
Keywords: Force Control, Grasping, Mobile Manipulation
Abstract: We present a holistic grasping controller, combining free-space position control and in-contact force control, for reliable grasping given uncertain object pose estimates. Employing tactile fingertip sensors, undesired object displacement during grasping is minimized by pausing the closing motion of individual finger joints on first contact until force closure is established. While holding an object, the controller is compliant with external forces to avoid high internal object forces and prevent object damage. Gravity as an external force is explicitly considered and compensated for, thus preventing gravity-induced object drift. We evaluate the controller in two experiments on the TIAGo robot and its parallel-jaw gripper, demonstrating the effectiveness of the approach for robust grasping and minimal object displacement. In a series of ablation studies, we demonstrate the utility of the individual controller components.
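The closing phase of such a controller can be pictured with the following sketch; the joint and tactile-sensor interfaces (`command_velocity`, `command_torque`, `.force`) are hypothetical stand-ins, and all thresholds are illustrative rather than the paper's values.

```python
def close_step(joints, tactile, f_target, v_close=0.02, k_f=0.5, f_contact=0.1):
    """One control cycle of grasping: each finger closes until its fingertip
    senses contact, then pauses; once all fingers are in contact, switch to
    force control toward the target grip force (force closure)."""
    in_contact = [s.force > f_contact for s in tactile]
    if all(in_contact):
        for joint, sensor in zip(joints, tactile):      # in-contact force control
            joint.command_torque(k_f * (f_target - sensor.force))
    else:
        for joint, touched in zip(joints, in_contact):  # pause touched fingers
            joint.command_velocity(0.0 if touched else v_close)
```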
|
|
WeB-9 |
Rm9 (Room G) |
Software, Middleware and Programming Environments 2 |
Regular session |
Chair: Angelidis, Emmanouil | Huawei Technologies Munich Research Center |
Co-Chair: Biswas, Joydeep | University of Texas at Austin |
|
14:40-14:50, Paper WeB-9.1 | |
Gazebo Fluids: SPH-Based Simulation of Fluid Interaction with Articulated Rigid Body Dynamics |
|
Angelidis, Emmanouil | Huawei Technologies Munich Research Center |
Bender, Jan | RWTH Aachen University |
Arreguit, Jonathan | École Polytechnique Fédérale De Lausanne |
Gleim, Lars Christoph | Huawei Technologies Munich Research Center |
Wang, Wei | Huawei Technologies Munich Research Center |
Axenie, Cristian | Huawei Technologies Munich Research Center |
Knoll, Alois | Technical University of Munich |
Ijspeert, Auke | EPFL |
Keywords: Software, Middleware and Programming Environments, Dynamics, Biologically-Inspired Robots
Abstract: Physical simulation is an indispensable component of robotics simulation platforms and serves as the basis for a plethora of research directions. Within robotics, the common characteristic of the most popular physics engines, such as ODE, DART, MuJoCo, Bullet, SimBody, PhysX, and RaiSim, is that they focus on solving articulated rigid-body problems with collisions and contacts, while paying less attention to other physical phenomena. This restriction limits the range of addressable simulation problems, rendering applications such as soft robotics, cloth simulation, simulation of viscoelastic materials, and fluid dynamics, especially surface swimming, infeasible. In this work, we present Gazebo Fluids, an open-source extension of the popular Gazebo robotics simulator that enables the interaction of articulated rigid-body dynamics with particle-based fluid and deformable-solid simulation. We implement fluid dynamics and highly viscous and elastic material simulation based on the Smoothed Particle Hydrodynamics (SPH) method. We demonstrate the practical impact of this extension for previously infeasible application scenarios in a series of experiments, showcasing one of the first self-propelled robot swimming simulations with SPH in a robotics simulator.
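For context, the core of any SPH solver is the smoothed density estimate; a brute-force NumPy sketch with the standard poly6 kernel (Müller et al., 2003) is shown below. Production engines, including the one described here, use spatial hashing rather than the O(n²) pairwise computation.

```python
import numpy as np

def poly6(r, h):
    """Standard poly6 smoothing kernel: W = 315/(64*pi*h^9) * (h^2 - r^2)^3."""
    w = np.zeros_like(r)
    inside = r < h
    w[inside] = 315.0 / (64.0 * np.pi * h**9) * (h**2 - r[inside] ** 2) ** 3
    return w

def sph_density(positions, masses, h=0.05):
    """Per-particle density rho_i = sum_j m_j * W(|x_i - x_j|, h)."""
    diff = positions[:, None, :] - positions[None, :, :]  # all pairwise offsets
    r = np.linalg.norm(diff, axis=-1)
    return (masses[None, :] * poly6(r, h)).sum(axis=1)
```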
|
|
14:50-15:00, Paper WeB-9.2 | |
SOCIALGYM: A Framework for Benchmarking Social Robot Navigation |
|
Holtz, Jarrett | University of Texas at Austin |
Biswas, Joydeep | University of Texas at Austin |
Keywords: Software Tools for Benchmarking and Reproducibility, Human-Aware Motion Planning, Data Sets for Robot Learning
Abstract: Safe and socially compliant robot navigation in dynamic human environments is an essential benchmark for long-term robot autonomy. However, it is not feasible to learn and benchmark social navigation behaviors entirely in the real world, as learning is data-intensive and it is challenging to make safety guarantees during training. Therefore, simulation-based benchmarks that provide abstractions for social navigation are required. A framework for such benchmarks would need to support a wide variety of learning approaches, be extensible to the broad range of social navigation scenarios, and abstract away the perception problem to focus explicitly on social navigation. While there have been many proposed solutions, including high-fidelity 3D simulators and grid-world approximations, no existing solution satisfies all of these properties for learning and evaluating social navigation behaviors. In this work, we propose SOCIALGYM, a lightweight 2D simulation environment for robot social navigation designed with extensibility in mind, and a benchmark scenario built on SOCIALGYM. Further, we present benchmark results that compare and contrast human-engineered and model-based learning approaches with a suite of off-the-shelf Learning from Demonstration (LfD) and Reinforcement Learning (RL) approaches applied to social robot navigation. These results demonstrate the data efficiency, task performance, social compliance, and environment-transfer capabilities of each of the policies evaluated, providing a solid grounding for future social navigation research.
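To give a flavor of what a lightweight 2D social-navigation benchmark involves, here is a self-contained Gym-style skeleton. It is not SOCIALGYM's actual API: the observation layout, pedestrian dynamics, and reward are simplified placeholders.

```python
import numpy as np
import gym
from gym import spaces

class SocialNavEnv(gym.Env):
    """Skeleton of a 2D social-navigation environment. Perception is
    abstracted away: the agent observes its goal offset plus the relative
    position and velocity of each simulated pedestrian."""

    def __init__(self, n_humans=4, dt=0.1):
        self.n_humans, self.dt = n_humans, dt
        obs_dim = 2 + 4 * n_humans  # goal (2) + per-human (x, y, vx, vy)
        self.observation_space = spaces.Box(-np.inf, np.inf, (obs_dim,), np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, (2,), np.float32)  # (v, omega)

    def reset(self):
        self.robot = np.zeros(3)  # x, y, heading
        self.goal = np.random.uniform(-5, 5, size=2)
        self.humans = np.concatenate(
            [np.random.uniform(-5, 5, (self.n_humans, 2)),       # positions
             np.random.uniform(-0.5, 0.5, (self.n_humans, 2))],  # velocities
            axis=1)
        return self._obs()

    def step(self, action):
        v, omega = action
        self.robot[2] += omega * self.dt
        self.robot[:2] += v * self.dt * np.array(
            [np.cos(self.robot[2]), np.sin(self.robot[2])])
        self.humans[:, :2] += self.humans[:, 2:] * self.dt  # constant-velocity walkers
        dist_goal = np.linalg.norm(self.goal - self.robot[:2])
        collided = (np.linalg.norm(self.humans[:, :2] - self.robot[:2], axis=1) < 0.3).any()
        reward = -0.01 * dist_goal - (1.0 if collided else 0.0)
        done = collided or dist_goal < 0.2
        return self._obs(), reward, done, {}

    def _obs(self):
        rel = self.humans.copy()
        rel[:, :2] -= self.robot[:2]
        return np.concatenate([self.goal - self.robot[:2], rel.ravel()]).astype(np.float32)
```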
|
|
15:00-15:10, Paper WeB-9.3 | |
SROS2: Usable Cyber Security Tools for ROS 2 |
|
Mayoral-Vilches, Victor | Klagenfurt University |
White, Ruffin | University of California San Diego |
Caiazza, Gianluca | Ca Foscari University of Venice |
Arguedas, Mikael | Open Source Robotics Foundation |
Keywords: Software, Middleware and Programming Environments, Software Tools for Robot Programming, Software Architecture for Robotic and Automation
Abstract: ROS 2 is rapidly becoming a standard in the robotics industry. Because it is built upon DDS as its default communication middleware and is used in safety-critical scenarios, adding security to robots and ROS computational graphs is an increasing concern. The present work introduces SROS2, a series of developer tools and libraries that facilitate adding security to ROS 2 graphs. Taking a usability-centric approach in SROS2, we present a methodology for securing graphs systematically while following the DevSecOps model. We also demonstrate the use of our security tools in an application case study that secures a graph using the popular Navigation2 and SLAM Toolbox stacks on a TurtleBot3 robot. We analyse the current capabilities of SROS2 and discuss its shortcomings, providing insights for future contributions and extensions. Ultimately, we present SROS2 as a set of usable security tools for ROS 2 and argue that, without usability, security in robotics will be greatly impaired.
|
|
15:10-15:20, Paper WeB-9.4 | |
Automatic Co-Design of Aerial Robots Using a Graph Grammar |
|
Zhao, Allan | Massachusetts Institute of Technology |
Du, Tao | MIT |
Xu, Jie | Massachusetts Institute of Technology |
Hughes, Josie | EPFL |
Salazar, Juan | MIT |
Ma, Pingchuan | MIT CSAIL |
Wang, Wei | Massachusetts Institute of Technology |
Rus, Daniela | MIT |
Matusik, Wojciech | MIT |
Keywords: Methods and Tools for Robot System Design
Abstract: Unmanned aerial vehicles (UAVs) have broad applications including disaster response, transportation, photography, and mapping. A significant bottleneck in the development of UAVs is the limited availability of automatic tools for task-specific co-design of a UAV's shape and controller. The development of such tools is particularly challenging as UAVs can take many forms, including fixed-wing planes, radial copters, and hybrid topologies, with each class of topology showing different advantages. In this work, we present a computational design pipeline for UAVs based on a graph grammar that can search across a wide range of topologies. Graphs generated by the grammar encode different topologies and component selections, while continuous parameters encode the dimensions and properties of each component. We further augment the shape representation with deformation cages, which allow expressing a variety of wing shapes. Each UAV design is associated with an LQR controller with tunable continuous parameters. To search over this complex discrete and continuous design space, we develop a hybrid algorithm that combines discrete graph search strategies and gradient-based continuous optimization methods using a differentiable UAV simulator. We evaluate our pipeline on a set of simulated flight tasks requiring dynamic motions, showing that it discovers novel UAV designs that outperform canonical UAVs typically made by engineers.
|
|
15:20-15:30, Paper WeB-9.5 | |
ARviz – an Augmented Reality-Enabled Visualization Platform for ROS Applications (I) |
|
Hoang, Khoa Cong | Monash University |
Chan, Wesley Patrick | Monash University |
Lay, Steven | Monash University |
Cosgun, Akansel | Monash University |
Croft, Elizabeth | Monash University |
Keywords: Virtual Reality and Interfaces, Human-Robot Collaboration, Telerobotics and Teleoperation
Abstract: Current robot interfaces such as teach pendants and 2D screen displays used for task visualization and interaction often seem unintuitive and limited in terms of information flow. This compromises task efficiency as interacting with the interface can distract the user from the task at hand. Augmented Reality (AR) technology offers the capability to create visually rich displays and intuitive interaction elements in situ. In recent years, AR has shown promising potential to enable effective human-robot interaction. We introduce ARviz - a versatile, extendable AR visualization platform built for robot applications developed with the widely used Robot Operating System (ROS) framework. ARviz aims to provide both a universal visualization platform with the capability of displaying any ROS message data type in AR, as well as a multimodal user interface for interacting with robots over ROS. ARviz is built as a platform incorporating a collection of plugins that provide visualization and/or interaction components. Users can also extend the platform by implementing new plugins to suit their needs. We present three use cases as well as two potential use cases to showcase the capabilities and benefits of the ARviz platform for human-robot interaction applications. The open access source code for our ARviz platform is available at: https://github.com/hri-group/arviz.
|
|
15:30-15:40, Paper WeB-9.6 | |
A RoboStack Tutorial: Using the Robot Operating System Alongside the Conda and Jupyter Data Science Ecosystems (I) |
|
Fischer, Tobias | Queensland University of Technology |
Vollprecht, Wolf Kristian | QuantStack |
Traversaro, Silvio | Istituto Italiano Di Tecnologia |
Yen, Sean | Microsoft |
Herrero, Carlos | QuantStack |
Milford, Michael J | Queensland University of Technology |
Keywords: Software Tools for Robot Programming, Software Tools for Benchmarking and Reproducibility, Software Architecture for Robotic and Automation
Abstract: RoboStack tightly couples the widely used Robot Operating System (ROS) with Conda, a cross-platform, language-agnostic package manager, and Jupyter, a web-based interactive computational environment for scientific computing. RoboStack provides new ROS packages for Conda, enabling the easy installation of ROS alongside data-science and machine-learning packages. Multiple ROS versions (currently ROS1 Melodic and Noetic as well as ROS2 Foxy, Galactic and Humble) can run simultaneously on one machine, with pre-compiled binaries available for Linux, Windows and OSX, and for the ARM architecture (e.g. the Raspberry Pi and the new Apple Silicon). To deal with the large size of the ROS ecosystem, we significantly improved the speed of the Conda solver and build system by rewriting the crucial parts in C++. We further contribute a collection of JupyterLab extensions for ROS, including plugins for live plotting, debugging and robot control, as well as tight integration with Zethus, an RViz-like visualization tool. Taken together, RoboStack combines the best of the data-science and robotics worlds to help researchers and developers build custom solutions for their academic and industrial projects.
|
|
15:40-15:50, Paper WeB-9.7 | |
Safe-Control-Gym: A Unified Benchmark Suite for Safe Learning-Based Control and Reinforcement Learning in Robotics |
|
Yuan, Zhaocong | University of Toronto |
Hall, Adam W. | University of Toronto |
Zhou, Siqi | University of Toronto |
Brunke, Lukas | University of Toronto |
Greeff, Melissa | University of Toronto |
Panerati, Jacopo | University of Toronto |
Schoellig, Angela P. | University of Toronto |
Keywords: Software Tools for Benchmarking and Reproducibility, Machine Learning for Robot Control, Reinforcement Learning
Abstract: In recent years, both reinforcement learning and learning-based control---as well as the study of their safety, which is crucial for deployment in real-world robots---have gained significant traction. However, to adequately gauge the progress and applicability of new results, we need the tools to equitably compare the approaches proposed by the controls and reinforcement learning communities. Here, we propose a new open-source benchmark suite, called safe-control-gym, supporting both model-based and data-based control techniques. We provide implementations for three dynamic systems---the cart-pole and the 1D and 2D quadrotors---and two control tasks---stabilization and trajectory tracking. We propose to extend OpenAI's Gym API---the de facto standard in reinforcement learning research---with (i) the ability to specify (and query) symbolic dynamics and constraints, and (ii) the ability to (repeatably) inject simulated disturbances into the control inputs, state measurements, and inertial properties. To demonstrate our proposal, and in an attempt to bring research communities closer together, we show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches from the fields of traditional control, learning-based control, and reinforcement learning.
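The disturbance-injection idea (item (ii) above) can be illustrated in a few lines. This toy class is not the safe-control-gym API; it is only a sketch of seeded, repeatable disturbances applied to the inputs and measurements of a known dynamics function.

```python
import numpy as np

class DisturbedEnv:
    """Toy illustration of repeatable disturbance injection on the control
    input and the state measurement of a known dynamics model."""

    def __init__(self, dynamics, noise_std=0.02, seed=0):
        self.f = dynamics                       # x_next = f(x, u)
        self.rng = np.random.default_rng(seed)  # seeded -> repeatable disturbances
        self.noise_std = noise_std
        self.x = np.zeros(2)

    def step(self, u):
        u_dist = u + self.rng.normal(0, self.noise_std)     # input disturbance
        self.x = self.f(self.x, u_dist)
        y = self.x + self.rng.normal(0, self.noise_std, 2)  # measurement noise
        return y

# Example: a discretized double integrator standing in for a real task
env = DisturbedEnv(lambda x, u: x + 0.02 * np.array([x[1], u]))
y = env.step(0.5)
```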
|
|
15:50-16:00, Paper WeB-9.8 | |
On-Device CPU Scheduling for Robot Systems |
|
Partap, Aditi | Stanford University |
Grayson, Samuel | University of Illinois at Urbana-Champaign |
Huzaifa, Muhammad | University of Illinois at Urbana-Champaign |
Adve, Sarita | University of Illinois at Urbana-Champaign |
Godfrey, Brighten | University of Illinois at Urbana-Champaign |
Gupta, Saurabh | UIUC |
Hauser, Kris | University of Illinois at Urbana-Champaign |
Mittal, Radhika | University of Illinois at Urbana Champaign |
Keywords: Software Tools for Robot Programming, Engineering for Robotic Systems, Methods and Tools for Robot System Design
Abstract: Robots have to take highly responsive real-time actions, driven by complex decisions involving a pipeline of sensing, perception, planning, and reaction tasks. These tasks must be scheduled on resource-constrained devices such that the performance goals and requirements of the application are met. This is a difficult problem that requires handling multiple scheduling dimensions as well as variations in computational resource usage and availability. In practice, system designers manually tune parameters for their specific hardware and application, which results in poor generalization and increases the development burden. In this work, we highlight the emerging need for scheduling CPU resources at runtime in robot systems. We use robot navigation as a case study to understand the key scheduling requirements for such systems. Armed with this understanding, we develop a CPU scheduling framework, Catan, that dynamically schedules compute resources across the different components of an application so as to meet the specified application requirements. Through experiments with a prototype implemented on ROS, we show the impact of system scheduling on meeting the application's performance goals and how Catan dynamically adapts to runtime variations.
|
|
WeB-10 |
Rm10 (Room H) |
Whole-Body Motion Planning and Control 1 |
Regular session |
Chair: Katyal, Kapil | Johns Hopkins University Applied Physics Lab |
Co-Chair: Lai, Tin | University of Sydney |
|
14:40-14:50, Paper WeB-10.1 | |
Whole-Body Control with Motion/Force Transmissibility for Parallel-Legged Robot |
|
Wang, Jiajun | UBTECH Robotics |
Han, Gang | UBTECH Robotics |
Ju, Xiaozhu | UBTech Robotics |
Zhao, Mingguo | Tsinghua University |
Keywords: Whole-Body Motion Planning and Control, Parallel Robots, Humanoid and Bipedal Locomotion
|
|
14:50-15:00, Paper WeB-10.2 | |
Recursive Hierarchical Projection for Whole-Body Control with Task Priority Transition |
|
Han, Gang | UBTECH Robotics |
Wang, Jiajun | UBTECH Robotics |
Ju, Xiaozhu | UBTech Robotics |
Zhao, Mingguo | Tsinghua University |
Keywords: Whole-Body Motion Planning and Control, Optimization and Optimal Control, Collision Avoidance
Abstract: Whole-body control (WBC) with task priority transition is an important technology for robots to switch among multiple behaviors, change objectives, and adapt to various environments. Many methods have addressed control continuity during the priority transition process; however, in practical applications they either increase the computational cost or sacrifice task accuracy. In this work, we propose a Recursive Hierarchical Projection (RHP) matrix and introduce it into Hierarchical Quadratic Programming (HQP). The resulting RHP-HQP scheme forms a continuously changing hierarchical projection and casts the WBC problem with task priority transition as a unified formulation, which can be transitioned smoothly without increasing computational cost and solved without losing task accuracy. Comparative simulations of reactive collision avoidance verify that this priority transition scheme guarantees high computational efficiency and task accuracy.
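The fixed-priority baseline that such schemes build on is the classic recursive nullspace projection (Siciliano and Slotine, 1991), sketched below. Per the abstract, RHP's contribution is making the hierarchy continuously varying inside HQP, which this fixed-priority version does not capture.

```python
import numpy as np

def strict_priority_ik(jacobians, task_rates):
    """Strict-priority differential IK: each task is solved in the nullspace
    of all higher-priority tasks via the recursion
        dq_k = dq_{k-1} + pinv(J_k N_{k-1}) (xd_k - J_k dq_{k-1})
        N_k  = N_{k-1} - pinv(J_k N_{k-1}) (J_k N_{k-1})
    where N_0 = I and tasks are ordered from highest to lowest priority."""
    n = jacobians[0].shape[1]
    dq, N = np.zeros(n), np.eye(n)
    for J, xd in zip(jacobians, task_rates):
        JN = J @ N
        JN_pinv = np.linalg.pinv(JN)
        dq = dq + JN_pinv @ (xd - J @ dq)  # lower task acts only in remaining freedom
        N = N - JN_pinv @ JN               # shrink the available nullspace
    return dq
```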
|
|
15:00-15:10, Paper WeB-10.3 | |
Multimodal Generation of Novel Action Appearances for Synthetic-To-Real Recognition of Activities of Daily Living |
|
Marinov, Zdravko | Karlsruhe Institute of Technology |
Schneider, David | Karlsruhe Institute of Technology |
Roitberg, Alina | Karlsruhe Institute of Technology (KIT) |
Stiefelhagen, Rainer | Karlsruhe Institute of Technology |
Keywords: Multi-Modal Perception for HRI, Transfer Learning, Modeling and Simulating Humans
Abstract: Domain shifts, such as appearance changes, are a key challenge in real-world applications of activity recognition models, which range from assistive robotics and smart homes to driver observation in intelligent vehicles. For example, while simulation is an excellent way of economical data collection, a Synthetic->Real domain shift leads to a >60% drop in accuracy when recognizing Activities of Daily Living (ADLs). We tackle this challenge and introduce an activity domain generation framework which creates novel ADL appearances (novel domains) from different existing activity modalities (source domains) inferred from video training data. Our framework computes human poses, heatmaps of body joints, and optical flow maps and uses them alongside the original RGB videos to learn the essence of the source domains in order to generate completely new ADL domains. The model is optimized by maximizing the distance between the existing source appearances and the generated novel appearances, while an additional classification loss ensures that the semantics of an activity is preserved. While source-data multimodality is an important concept in this design, our setup does not rely on multi-sensor setups (i.e., all source modalities are inferred from a single video only). The newly created activity domains are then integrated into the training of the ADL classification networks, resulting in models far less susceptible to changes in data distributions. Extensive experiments on the Synthetic->Real benchmark Sims4Action demonstrate the potential of the domain generation paradigm for cross-domain ADL recognition, setting new state-of-the-art results. We will make our code publicly available to the community.
|
|
15:10-15:20, Paper WeB-10.4 | |
Learning a Group-Aware Policy for Robot Navigation |
|
Katyal, Kapil | Johns Hopkins University Applied Physics Lab |
Gao, Yuxiang | Johns Hopkins University |
Markowitz, Jared | Johns Hopkins University Applied Physics Lab |
Pohland, Sara | University of California, Berkeley |
Rivera, Corban | Johns Hopkins University Applied Physics Lab |
Wang, I-Jeng | Johns Hopkins University Applied Physics Lab |
Huang, Chien-Ming | Johns Hopkins University |
Keywords: Human-Aware Motion Planning, Social HRI, Modeling and Simulating Humans
Abstract: Human-aware robot navigation promises a range of applications in which mobile robots bring versatile assistance to people in common human environments. While prior research has mostly focused on modeling pedestrians as independent, intentional individuals, people move in groups; consequently, it is imperative for mobile robots to respect human groups when navigating around people. This paper explores learning group-aware navigation policies based on dynamic group formation using deep reinforcement learning. Through simulation experiments, we show that group-aware policies, compared to baseline policies that neglect human groups, achieve greater robot navigation performance (e.g., fewer collisions), minimize violation of social norms and discomfort, and reduce the robot’s movement impact on pedestrians. Our results contribute to the development of social navigation and the integration of mobile robots into human environments.
|
|
15:20-15:30, Paper WeB-10.5 | |
Feedback-Efficient Active Preference Learning for Socially Aware Robot Navigation |
|
Wang, Ruiqi | Purdue University |
Wang, Weizheng | Beijing University of Chemical Technology |
Min, Byung-Cheol | Purdue University |
Keywords: Human-Aware Motion Planning, Reinforcement Learning, AI-Based Methods
Abstract: Socially aware robot navigation, where a robot is required to optimize its trajectory to maintain comfortable and compliant spatial interactions with humans in addition to reaching its goal without collisions, is a fundamental yet challenging task in the context of human-robot interaction. While existing learning-based methods have achieved better performance than the preceding model-based ones, they still have drawbacks: reinforcement learning depends on a handcrafted reward that is unlikely to effectively quantify broad social compliance and can lead to reward-exploitation problems, while inverse reinforcement learning suffers from the need for expensive human demonstrations. In this paper, we propose a feedback-efficient active preference learning approach, FAPL, that distills human comfort and expectation into a reward model to guide the robot agent to explore latent aspects of social compliance. We further introduce hybrid experience learning to improve the efficiency of human feedback and samples, and evaluate the benefits of robot behaviors learned from FAPL through extensive simulation experiments and a user study (N=10) employing a physical robot to navigate with human subjects in real-world scenarios. Source code and experiment videos for this work are available at: https://sites.google.com/view/san-fapl.
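The reward-model component relates to the standard Bradley-Terry preference loss used in preference-based RL (Christiano et al., 2017). The sketch below shows that generic loss, not FAPL's exact training procedure, with illustrative dimensions.

```python
import torch
import torch.nn as nn

OBS_ACT_DIM = 10  # illustrative state-action feature size
reward_net = nn.Sequential(nn.Linear(OBS_ACT_DIM, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_net.parameters(), lr=3e-4)

def preference_loss(seg_a, seg_b, pref):
    """Bradley-Terry loss on one labeled pair of trajectory segments.

    seg_a, seg_b: tensors of shape (T, OBS_ACT_DIM)
    pref: 1.0 if the human preferred segment A, 0.0 if segment B
    """
    r_a = reward_net(seg_a).sum()   # predicted return of each segment
    r_b = reward_net(seg_b).sum()
    p_a = torch.sigmoid(r_a - r_b)  # model's P(A preferred over B)
    return -(pref * torch.log(p_a) + (1 - pref) * torch.log(1 - p_a))

# one gradient step from a single (here random) feedback query
seg_a, seg_b = torch.randn(50, OBS_ACT_DIM), torch.randn(50, OBS_ACT_DIM)
loss = preference_loss(seg_a, seg_b, pref=1.0)
opt.zero_grad(); loss.backward(); opt.step()
```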
|
|
15:30-15:40, Paper WeB-10.6 | |
Watch Out! There May Be a Human. Addressing Invisible Humans in Social Navigation |
|
Singamaneni, Phani Teja | LAAS-CNRS |
Favier, Anthony | LAAS-CNRS |
Alami, Rachid | CNRS |
Keywords: Human-Aware Motion Planning, Human Detection and Tracking, Human-Robot Collaboration
Abstract: Current approaches in human-aware or social robot navigation address the humans that are visible to the robot. However, it is also important to account for humans who may appear suddenly, both to avoid startling or surprising them and to prevent erratic behavior of the robot's planner. In this paper, we propose a novel approach to detect and address these human emergences, called `invisible humans'. We determine the places from which a human, currently not visible to the robot, can appear suddenly, and then adapt the path and speed of the robot in anticipation of potential collisions, while still considering and adapting to the humans present in the robot's field of view. We also show how this detection can be exploited to identify and handle doorways and narrow passages. Finally, the effectiveness of the proposed methodology is shown through several simulated and real-world experiments.
|
|
15:40-15:50, Paper WeB-10.7 | |
Momentum-Aware Trajectory Optimization and Control for Agile Quadrupedal Locomotion |
|
Zhou, Ziyi | Georgia Institute of Technology |
Wingo, Bruce | Georgia Institute of Technology |
Boyd, Nathan | Georgia Institute of Technology |
Hutchinson, Seth | Georgia Institute of Technology |
Zhao, Ye | Georgia Institute of Technology |
Keywords: Whole-Body Motion Planning and Control, Legged Robots, Optimization and Optimal Control
Abstract: In this paper, we present a versatile hierarchical offline planning algorithm, along with an online control pipeline, for agile quadrupedal locomotion. Our offline planner alternates between optimizing the centroidal dynamics of a reduced-order model and whole-body trajectory optimization, with the aim of achieving dynamics consensus. Our novel momentum-inertia-aware centroidal optimization, which uses an equimomental ellipsoid parameterization, is able to generate highly acrobatic motions via "inertia shaping". Our whole-body optimization approach significantly improves upon the quality of standard DDP-based approaches by iteratively exploiting feedback from the centroidal level. For online control, we have developed a novel linearization of the full centroidal dynamics and incorporated it into a convex model predictive control (MPC) scheme. Our controller can efficiently optimize both contact forces and joint accelerations in a single optimization, enabling more straightforward tracking of momentum-rich motions compared to existing quadrupedal MPC controllers. We demonstrate the capability and generality of our trajectory planner on four different dynamic maneuvers. We then present hardware experiments on the MIT Mini Cheetah platform to demonstrate the performance of the entire planning and control pipeline on a twisting jump maneuver.
|
|
15:50-16:00, Paper WeB-10.8 | |
Discover Life Skills for Planning As Bandits Via Observing and Learning How the World Works |
|
Lai, Tin | University of Sydney |
Keywords: Hybrid Logical/Dynamical Planning and Verification, Planning, Scheduling and Coordination, Task Planning
Abstract: We propose a novel approach for planning agents to compose abstract skills by observing and learning from historical interactions with the world. Our framework operates in a Markov state-space model via a set of actions with unknown pre-conditions. We formulate skills as high-level abstract policies that propose action plans based on the current state. Each policy learns new plans by observing state transitions while the agent interacts with the world. Such an approach automatically learns new plans to achieve specific intended effects, but the success of those plans is often dependent on the states in which they are applicable. Therefore, we formulate the evaluation of such plans as infinitely many multi-armed bandit problems, balancing the allocation of resources between evaluating the success probability of existing arms and exploring new options. The result is a planner capable of automatically learning robust high-level skills in a noisy environment; such skills implicitly capture action pre-conditions without explicit knowledge. We show that this planning approach is experimentally very competitive in high-dimensional state-space domains.
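The bandit formulation maps naturally onto UCB-style allocation; here is a minimal illustration (UCB1 over a growing plan set), with the caveat that the paper addresses infinitely many arms, which plain UCB1 does not handle.

```python
import math

class PlanBandit:
    """UCB1 allocation over candidate plans: balance re-evaluating plans with
    known success rates against trying rarely tested (or new) ones."""

    def __init__(self):
        self.successes, self.trials = {}, {}

    def add_plan(self, plan):
        self.successes[plan], self.trials[plan] = 0, 0

    def select(self):
        total = sum(self.trials.values()) + 1
        def ucb(p):
            if self.trials[p] == 0:
                return float("inf")  # force at least one trial of every new plan
            mean = self.successes[p] / self.trials[p]
            return mean + math.sqrt(2 * math.log(total) / self.trials[p])
        return max(self.trials, key=ucb)

    def update(self, plan, succeeded):
        self.trials[plan] += 1
        self.successes[plan] += int(succeeded)

# usage: pick a plan, execute it, report the outcome
bandit = PlanBandit()
for p in ["plan_a", "plan_b"]:
    bandit.add_plan(p)
plan = bandit.select()
bandit.update(plan, succeeded=True)
```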
|
|
16:00-16:10, Paper WeB-10.9 | |
An Optimal Motion Planning Framework for Quadruped Jumping |
|
Song, Zhitao | The Chinese University of Hong Kong |
Yue, Linzhu | The Chinese University of Hong Kong |
Sun, Guangli | The Chinese University of Hong Kong |
Ling, Yihu | The Chinese University of Hong Kong |
Wei, Hongshuo | HKCLR |
Gui, Linhai | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Whole-Body Motion Planning and Control, Optimization and Optimal Control, Legged Robots
Abstract: This paper presents an optimal motion planning framework that automatically generates versatile, energy-optimal quadrupedal jumping motions (e.g., flips, spins). Jumping motions via the centroidal dynamics are formulated as a 12-dimensional black-box optimization problem subject to the robot's kino-dynamic constraints. Gradient-based approaches have had great success in trajectory optimization (TO), yet they require prior knowledge (e.g., a reference motion, a contact schedule) and can yield sub-optimal solutions. The proposed framework instead employs a heuristics-based optimization method to avoid these problems. Moreover, a prioritized fitness function is created for heuristics-based algorithms in robot ground reaction force (GRF) planning, considerably enhancing convergence and search performance. Since heuristics-based algorithms often require significant time, motions are planned offline and stored as a pre-motion library, and a selector is designed to automatically choose motions, taking user-specified or perception information as input. The proposed framework has been successfully validated, with only a simple continuous-tracking PD controller, on an open-source Mini Cheetah performing several challenging jumping motions, including jumping over a 30 cm high window-shaped obstacle and left-flipping over a 27 cm high rectangular obstacle.
|
|
WeB-11 |
Rm11 (Room I) |
Intelligent Transportation Systems 1 |
Regular session |
Chair: Misu, Teruhisa | Honda Research Institute USA, Inc |
Co-Chair: Koga, Shumon | University of California San Diego |
|
14:40-14:50, Paper WeB-11.1 | |
Trajectory Prediction with Graph-Based Dual-Scale Context Fusion |
|
Zhang, Lu | Hong Kong University of Science and Technology |
Li, Peiliang | HKUST, Robotics Institute |
Chen, Jing | Hong Kong University of Science and Technology |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Intelligent Transportation Systems, Deep Learning Methods, Computer Vision for Transportation
Abstract: Motion prediction for traffic participants is essential for a safe and robust automated driving system, especially in cluttered urban environments. However, it is highly challenging due to the complex road topology as well as the uncertain intentions of the other agents. In this paper, we present a graph-based trajectory prediction network named the Dual Scale Predictor (DSP), which encodes both the static and dynamical driving context in a hierarchical manner. Different from methods based on a rasterized map or sparse lane graph, we consider the driving context as a graph with two layers, focusing on both geometrical and topological features. Graph neural networks (GNNs) are applied to extract features with different levels of granularity, and features are subsequently aggregated with attention-based inter-layer networks, realizing better local-global feature fusion. Following the recent goal-driven trajectory prediction pipeline, goal candidates with high likelihood for the target agent are extracted, and predicted trajectories are generated conditioned on these goals. Thanks to the proposed dual-scale context fusion network, our DSP is able to generate accurate and human-like multi-modal trajectories. We evaluate the proposed method on the large-scale Argoverse motion forecasting benchmark, and it achieves promising results, outperforming the recent state-of-the-art methods. We release the code on our project website.
|
|
14:50-15:00, Paper WeB-11.2 | |
IMU Dead-Reckoning Localization with RNN-IEKF Algorithm |
|
Hang, Zhou | Harbin Institute of Technology, Shenzhen |
Yibo, Zhao | Harbin Institute of Technology Shenzhen |
Xiong, Xiaogang | Harbin Institute of Technology, Shenzhen |
Kamal, Shyam | Indian Institute of Technology (Banaras Hindu University) Varanasi |
Lou, Yunjiang | Harbin Institute of Technology, Shenzhen |
Keywords: Intelligent Transportation Systems, Autonomous Vehicle Navigation, AI-Enabled Robotics
Abstract: In complex urban environments, the Inertial Navigation System (INS) is important for navigating unmanned ground vehicles (UGVs) because of its environment independence and reliable real-time localization. It is usually employed as the baseline in the case of other sensor failures, such as GPS, Lidar, or cameras. However, one problem with the INS is that its localization error accumulates over time, so the estimated trajectories of the UGVs continue to drift away from their ground truth. To solve this problem, this paper proposes an improved algorithm based on the Invariant Extended Kalman Filter (IEKF) for dead-reckoning of autonomous vehicles, which dynamically adjusts the process-noise and observation-noise covariance matrices through an attention mechanism and a Recurrent Neural Network (RNN). The algorithm achieves more robust and accurate dead-reckoning localization in experiments conducted on the KITTI dataset, reducing the translational error by about 45% compared to the baseline.
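To make the covariance-adaptation idea concrete, here is a linear-Kalman sketch in which scale factors (produced in the paper by an attention-equipped RNN; here simply passed in as arguments) modulate the nominal noise covariances. The IEKF's group-theoretic state representation is omitted for brevity.

```python
import numpy as np

def adaptive_kf_step(x, P, z, F, H, Q0, R0, q_scale=1.0, r_scale=1.0):
    """One Kalman predict/correct cycle with dynamically scaled covariances.

    q_scale, r_scale: adaptation factors for the process and measurement
    noise, e.g. predicted by a learned model from recent residuals.
    """
    Q, R = q_scale * Q0, r_scale * R0        # dynamically adapted covariances
    x_pred = F @ x                           # predict
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)    # correct with measurement z
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```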
|
|
15:00-15:10, Paper WeB-11.3 | |
A Value-Based Dynamic Learning Approach for Vehicle Dispatch in Ride-Sharing |
|
Li, Cheng | University of Birmingham |
Parker, David | University of Birmingham |
Hao, Qi | Southern University of Science and Technology |
Keywords: Intelligent Transportation Systems, Planning, Scheduling and Coordination
Abstract: To ensure real-time response to passengers, existing solutions to the vehicle dispatch problem typically optimize dispatch policies using small batch windows and ignore the spatial-temporal dynamics over the long-term horizon. In this paper, we focus on improving the long-term performance of ride-sharing services and propose a deep reinforcement learning based approach for the ride-sharing dispatch problem. In particular, this work includes: (1) an offline policy evaluatio | |