Last updated on May 1, 2024. This conference program is tentative and subject to change.
Technical Program for Wednesday May 15, 2024
|
WeAA1-CC Award Session, CC-Main Hall |
Robot Manipulation |
|
|
Chair: Harada, Kensuke | Osaka University |
Co-Chair: Dogar, Mehmet R | University of Leeds |
|
10:30-12:00, Paper WeAA1-CC.1 |
Open X-Embodiment: Robotic Learning Datasets and RT-X Models |
|
Levine, Sergey | UC Berkeley |
Finn, Chelsea | Stanford University |
Goldberg, Ken | UC Berkeley |
Chen, Lawrence Yunliang | UC Berkeley |
Sukhatme, Gaurav | University of Southern California |
Dass, Shivin | UT Austin |
Pinto, Lerrel | New York University |
Zhu, Yuke | The University of Texas at Austin |
Zhu, Yifeng | The University of Texas at Austin |
Song, Shuran | Columbia University |
Mees, Oier | University of California, Berkeley |
Pathak, Deepak | Carnegie Mellon University |
Fang, Hao-Shu | Shanghai Jiao Tong University |
Christensen, Henrik Iskov | UC San Diego |
Ding, Mingyu | UC Berkeley |
Lee, Youngwoon | University of California, Berkeley |
Sadigh, Dorsa | Stanford University |
Radosavovic, Ilija | UC Berkeley |
Bohg, Jeannette | Stanford University |
Wang, Xiaolong | UC San Diego |
Li, Xuanlin | UC San Diego |
Rana, Krishan | Queensland University of Technology |
Kawaharazuka, Kento | The University of Tokyo |
Matsushima, Tatsuya | The University of Tokyo |
Oh, Jihoon | The University of Tokyo |
Osa, Takayuki | University of Tokyo |
Kroemer, Oliver | Carnegie Mellon University |
Kim, Beomjoon | Korea Advanced Institute of Science and Technology |
Johns, Edward | Imperial College London |
Stulp, Freek | DLR - Deutsches Zentrum Für Luft Und Raumfahrt E.V |
Schneider, Jan | Max Planck Institute for Intelligent Systems |
Wu, Jiajun | Stanford University |
Li, Yunzhu | University of Illinois Urbana-Champaign |
Ben Amor, Heni | Arizona State University |
Ott, Lionel | ETH Zurich |
Martín-Martín, Roberto | University of Texas at Austin |
Hausman, Karol | Google Brain |
Vuong, Quan | UC San Diego |
Sanketi, Pannag | Google |
Heess, Nicolas | Google Deepmind |
Vanhoucke, Vincent | Google |
Pertsch, Karl | UC Berkeley & Stanford University |
Schaal, Stefan | Google X |
Chi, Cheng | Columbia University |
Pan, Chuer | Stanford University |
Bewley, Alex | Google |
Keywords: Data Sets for Robot Learning, Imitation Learning, Deep Learning Methods
Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" cross-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective cross-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.
|
|
10:30-12:00, Paper WeAA1-CC.2 |
Towards Generalizable Zero-Shot Manipulation Via Translating Human Interaction Plans |
|
Bharadhwaj, Homanga | Carnegie Mellon University |
Gupta, Abhinav | Carnegie Mellon University |
Kumar, Vikash | Meta AI |
Tulsiani, Shubham | Carnegie Mellon University |
Keywords: Machine Learning for Robot Control, Learning from Demonstration, Big Data in Robotics and Automation
Abstract: We pursue the goal of developing robots that can interact zero-shot with generic unseen objects via a diverse repertoire of manipulation skills, and we show how passive human videos can serve as a rich source of data for learning such generalist robots. Unlike typical robot learning approaches, which directly learn how a robot should act from interaction data, we adopt a factorized approach that can leverage large-scale human videos to learn how a human would accomplish a desired task (a human "plan"), followed by "translating" this plan to the robot's embodiment. Specifically, we learn a human "plan predictor" that, given a current image of a scene and a goal image, predicts the future hand and object configurations. We combine this with a "translation" module that learns a plan-conditioned robot manipulation policy and allows following human plans for generic manipulation tasks in a zero-shot manner with no deployment-time training. Importantly, while the plan predictor can leverage large-scale human videos for learning, the translation module only requires a small amount of in-domain data and can generalize to tasks not seen during training. We show that our learned system can perform over 16 manipulation skills that generalize to 40 objects, encompassing 100 real-world tasks for table-top manipulation and diverse in-the-wild manipulation. https://homangab.github.io/hopman/
|
|
10:30-12:00, Paper WeAA1-CC.3 |
Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation |
|
Mejia, Jared | Carnegie Mellon University |
Dean, Victoria | Carnegie Mellon University |
Hellebrekers, Tess | Meta AI Research |
Gupta, Abhinav | Carnegie Mellon University |
Keywords: Representation Learning, Sensorimotor Learning, Robot Audition
Abstract: Although pre-training on a large amount of data is beneficial for robot learning, current paradigms only perform large-scale pretraining for visual representations, whereas representations for other modalities are trained from scratch. In contrast to the abundance of visual data, it is unclear what relevant internet-scale data may be used for pretraining other modalities such as tactile sensing. Such pretraining becomes increasingly crucial in the low-data regimes common in robotics applications. In this paper, we address this gap by using contact microphones as an alternative tactile sensor. Our key insight is that contact microphones capture inherently audio-based information, allowing us to leverage large-scale audio-visual pretraining to obtain representations that boost the performance of robotic manipulation. To the best of our knowledge, our method is the first approach leveraging large-scale multisensory pre-training for robotic manipulation.
|
|
10:30-12:00, Paper WeAA1-CC.4 |
SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention |
|
Leal, Isabel | Google Deepmind |
Choromanski, Krzysztof | Google DeepMind Robotics |
Jain, Deepali | Robotics at Google |
Dubey, Avinava | Google |
Varley, Jacob | Google |
Ryoo, Michael S. | Google, Stony Brook University |
Lu, Yao | Google |
Liu, Frederick | Google |
Sindhwani, Vikas | Google Brain, NYC |
Sarlos, Tamas | Google Research |
Oslund, Kenneth | Google |
Hausman, Karol | Google Brain |
Vuong, Quan | UC San Diego |
Rao, Kanishka | Google |
Keywords: Deep Learning Methods, Deep Learning in Grasping and Manipulation
Abstract: We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment. SARA-RT relies on a new fine-tuning method that we propose, called up-training. It converts pre-trained or already fine-tuned Transformer-based robotic policies of quadratic time complexity (including massive billion-parameter vision-language-action models, or VLAs) into their efficient linear-attention counterparts while maintaining high quality. We demonstrate the effectiveness of SARA-RT by speeding up: (a) the class of recently introduced RT-2 models, the first VLA robotic policies pre-trained on internet-scale data, as well as (b) Point Cloud Transformer (PCT) robotic policies operating on large point clouds. We complement our results with a rigorous mathematical analysis providing deeper insight into the phenomenon of SARA.
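For intuition, the quadratic-to-linear attention swap that SARA-RT-style policies rely on can be pictured with a toy example. The sketch below is an editor's illustration of generic kernelized (linear) attention, not the paper's up-training procedure; the feature map phi and all variable names are assumptions.
```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: cost grows as O(N^2) in the sequence length N."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def linear_attention(Q, K, V, phi=lambda x: np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))):
    """Kernelized attention: O(N) by reordering the matrix products."""
    Qf, Kf = phi(Q), phi(K)          # non-negative feature maps (assumed kernel)
    kv = Kf.T @ V                    # (d_f, d_v) summary, independent of N
    z = Qf @ Kf.sum(axis=0)          # per-query normalizer
    return (Qf @ kv) / z[:, None]
```
Reordering the products removes the N x N attention matrix; up-training, as described above, concerns converting an already-trained quadratic policy into such a linear-attention form without losing quality.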
|
|
10:30-12:00, Paper WeAA1-CC.5 |
DenseTact-Mini: An Optical Tactile Sensor for Grasping Multi-Scale Objects from Flat Surfaces |
|
Do, Won Kyung | Stanford University |
Dhawan, Ankush | Stanford University |
Kitzmann, Mathilda | Stanford University |
Kennedy, Monroe | Stanford University |
Keywords: Grasping, Force and Tactile Sensing, Grippers and Other End-Effectors
Abstract: Dexterous manipulation, especially of small daily objects, continues to pose complex challenges in robotics. This paper introduces the DenseTact-Mini, an optical tactile sensor with a soft, rounded, smooth gel surface and compact design equipped with a synthetic fingernail. We propose three distinct grasping strategies: tap grasping using adhesion forces such as electrostatic and van der Waals, fingernail grasping leveraging rolling/sliding contact between the object and fingernail, and fingertip grasping with two soft fingertips. Through comprehensive evaluations, the DenseTact-Mini demonstrates a lifting success rate exceeding 90.2% when grasping various objects, including items such as 1mm basil seeds, thin paperclips, and items larger than 15mm such as bearings. This work demonstrates the potential of soft optical tactile sensors for dexterous manipulation and grasping.
|
|
10:30-12:00, Paper WeAA1-CC.6 |
Constrained Bimanual Planning with Analytic Inverse Kinematics |
|
Cohn, Thomas | Massachusetts Institute of Technology |
Shaw, Seiji | Massachusetts Institute of Technology |
Simchowitz, Max | MIT |
Tedrake, Russ | Massachusetts Institute of Technology |
Keywords: Bimanual Manipulation, Constrained Motion Planning, Kinematics
Abstract: In order for a bimanual robot to manipulate an object that is held by both hands, it must construct motion plans such that the transformation between its end effectors remains fixed. This amounts to complicated nonlinear equality constraints in the configuration space, which are difficult for trajectory optimizers. In addition, the set of feasible configurations becomes a measure zero set, which presents a challenge to sampling-based motion planners. We leverage an analytic solution to the inverse kinematics problem to parametrize the configuration space, resulting in a lower-dimensional representation where the set of valid configurations has positive measure. We describe how to use this parametrization with existing motion planning algorithms, including sampling-based approaches, trajectory optimizers, and techniques that plan through convex inner-approximations of collision-free space.
|
|
WeAA2-CC Award Session, CC-301 |
Robot Vision |
|
|
Chair: Chaumette, Francois | Inria Center at University of Rennes |
Co-Chair: Hashimoto, Koichi | Tohoku University |
|
10:30-12:00, Paper WeAA2-CC.1 |
Deep Evidential Uncertainty Estimation for Semantic Segmentation under Out-Of-Distribution Obstacles |
|
Ancha, Siddharth | Massachusetts Institute of Technology |
Osteen, Philip | U.S. Army Research Laboratory |
Roy, Nicholas | Massachusetts Institute of Technology |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, Visual Learning
Abstract: In order to navigate safely and reliably in novel environments, robots must estimate perceptual uncertainty when confronted with out-of-distribution (OOD) obstacles not seen in training data. We present a method to accurately estimate pixel-wise uncertainty in semantic segmentation without requiring real or synthetic OOD examples at training time. From a shared per-pixel latent feature representation, a classification network predicts a categorical distribution over semantic labels, while a normalizing flow estimates the probability density of features under the training distribution. The label distribution and density estimates are combined in a Dirichlet-based evidential uncertainty framework that efficiently computes epistemic and aleatoric uncertainty in a single neural network forward pass. Our method is enabled by three key contributions. First, we simplify the problem of learning a transformation to the training data density by starting from a fitted Gaussian mixture model instead of the conventional standard normal distribution. Second, we learn a richer and more expressive latent pixel representation to aid OOD detection by training a decoder to reconstruct input image patches. Third, we perform theoretical analysis of the loss function used in the evidential uncertainty framework and propose a principled objective that more accurately balances training the classification and density estimation networks. We demonstrate the accuracy of our uncertainty estimation approach under long-tail OOD obstacle classes for semantic segmentation in both off-road and urban driving environments.
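As background on the Dirichlet-based framework mentioned above, the following is a minimal sketch of how aleatoric and epistemic uncertainty are commonly read off Dirichlet concentration parameters in a single forward pass. It is an editor's illustration using standard evidential-learning formulas; the paper's exact construction of the concentrations from the classifier output and the normalizing-flow density may differ.
```python
import numpy as np

def dirichlet_uncertainties(alpha):
    """alpha: (..., K) Dirichlet concentration parameters (alpha_k > 0)."""
    alpha0 = alpha.sum(axis=-1, keepdims=True)          # total evidence per pixel
    p = alpha / alpha0                                   # expected class probabilities
    aleatoric = -(p * np.log(p + 1e-12)).sum(axis=-1)    # entropy of the mean prediction
    epistemic = alpha.shape[-1] / alpha0[..., 0]         # vacuity: little evidence -> high value
    return p, aleatoric, epistemic
```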
|
|
10:30-12:00, Paper WeAA2-CC.2 |
NGEL-SLAM: Neural Implicit Representation-Based Global Consistent Low-Latency SLAM System |
|
Mao, Yunxuan | Zhejiang University |
Yu, Xuan | Zhejiang University |
Zhang, Zhuqing | Zhejiang University |
Wang, Kai | HuaWei |
Wang, Yue | Zhejiang University |
Xiong, Rong | Zhejiang University |
Liao, Yiyi | Zhejiang University |
Keywords: SLAM
Abstract: Neural implicit representations have emerged as a promising solution for addressing the challenges of Simultaneous Localization and Mapping (SLAM) problems in indoor scenes. This paper presents NGEL-SLAM, a low-latency, globally consistent SLAM system that utilizes neural implicit scene representation. To ensure global consistency, our system incorporates loop closure in the tracking module and maintains a globally consistent map by representing the scene using multiple neural implicit fields and performing a quick adjustment upon loop closure. The fast convergence and rapid response to loop closure make our system a truly low-latency system that achieves global consistency. The neural implicit representation enables the rendering of high-fidelity RGB-D images and the extraction of explicit, dense, and interactive surfaces. Experiments were conducted on both synthetic and real-world datasets to evaluate the effectiveness of the proposed approach. The results demonstrate the tracking and mapping accuracy and the low-latency performance of our system.
|
|
10:30-12:00, Paper WeAA2-CC.3 |
SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking |
|
Lin, Yu | Northeastern University |
Li, Zhiheng | Northeastern University |
Cui, Yubo | Northeastern University |
Fang, Zheng | Northeastern University |
Keywords: Visual Tracking, Deep Learning for Visual Perception, Computer Vision for Transportation
Abstract: 3D single object tracking (SOT) is an important and challenging task for autonomous driving and mobile robotics. Most existing methods perform tracking between two consecutive frames while ignoring the motion patterns of the target over a series of frames, which causes performance degradation in scenes with sparse points. To break through this limitation, we introduce a Sequence-to-Sequence tracking paradigm and a tracker named SeqTrack3D to capture target motion across continuous frames. Unlike previous methods, which primarily adopted one of three strategies (matching two consecutive point clouds, predicting relative motion, or utilizing sequential point clouds to address feature degradation), our SeqTrack3D combines both historical point clouds and bounding box sequences. This novel approach ensures robust tracking by leveraging location priors from historical boxes, even in scenes with sparse points. Extensive experiments conducted on large-scale datasets show that SeqTrack3D achieves new state-of-the-art performance, improving by 6.00% on NuScenes and 14.13% on the Waymo dataset.
|
|
10:30-12:00, Paper WeAA2-CC.4 |
Ultrafast Square-Root Filter-Based VINS |
|
Peng, Yuxiang | University of Delaware |
Chen, Chuchu | University of Delaware |
Huang, Guoquan | University of Delaware |
Keywords: Localization, Visual-Inertial SLAM, SLAM
Abstract: In this paper, we strongly advocate square-root covariance (instead of information) filtering for visual-inertial navigation, in particular on resource-constrained edge devices, because of its superior efficiency and numerical stability. Although Visual-Inertial Navigation Systems (VINS) have made tremendous progress in recent years, they still face resource stringency and numerical instability on embedded systems when a limited word length is imposed. To overcome these challenges, we develop an ultrafast and numerically-stable square-root filter (SRF)-based VINS algorithm (i.e., SR-VINS). The numerical stability of the proposed SR-VINS is inherited from the adoption of the square-root covariance, while its efficiency is largely enabled by a novel SRF update method based on our new permuted-QR (P-QR) factorization, which fully utilizes and properly maintains the upper triangular structure of the square-root covariance matrix. Furthermore, we choose a special ordering of the state variables that is amenable to (P-)QR operations in the SRF propagation and update and prevents unnecessary computation. The proposed SR-VINS is validated extensively through numerical studies, demonstrating that when state-of-the-art (SOTA) filters have numerical difficulties, our SR-VINS has superior numerical stability and, remarkably, achieves efficient and robust performance with 32-bit single-precision floats at a speed nearly twice as fast as the SOTA methods. We also conduct comprehensive real-world experiments to validate the efficiency, accuracy, and robustness of the proposed SR-VINS.
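To see why square-root filtering is numerically friendlier than propagating the covariance itself, here is a generic QR-based square-root propagation step. This is an editor's sketch of the textbook operation, not the paper's permuted-QR (P-QR) update; Phi and W are assumed inputs.
```python
import numpy as np

def srf_propagate(S, Phi, W):
    """Propagate an upper-triangular factor S with P = S.T @ S.

    Phi is the state-transition matrix and W a square root of the process
    noise (Q = W.T @ W). QR on the stacked matrix yields the new factor
    without ever forming P, which keeps the conditioning mild enough for
    single-precision arithmetic.
    """
    A = np.vstack([S @ Phi.T, W])
    _, S_new = np.linalg.qr(A)   # A = Q_orth @ S_new, so S_new.T @ S_new = Phi P Phi.T + Q
    return S_new
```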
|
|
10:30-12:00, Paper WeAA2-CC.5 |
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy |
|
Zhang, Zichen | Allen Institute for AI |
Li, Yunshuang | University of Pennsylvania |
Bastani, Osbert | University of Pennsylvania |
Gupta, Abhishek | University of Washington |
Jayaraman, Dinesh | University of Pennsylvania |
Ma, Yecheng Jason | University of Pennsylvania |
Weihs, Luca | Allen Institute for AI |
Keywords: Learning from Demonstration, Imitation Learning, Reinforcement Learning
Abstract: Real-world robotic tasks stretch over extended horizons and encompass multiple stages. Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks. Prior task decomposition methods require task-specific knowledge, are computationally intensive, and cannot readily be applied to new tasks. To address these shortcomings, we propose Universal Visual Decomposer (UVD), an off-the-shelf task decomposition method for visual long-horizon manipulation using pre-trained visual representations for robotic control. At a high level, UVD discovers subgoals by detecting phase shifts in the embedding space of the pre-trained representation. Operating purely on visual demonstrations without auxiliary information, UVD can effectively extract visual subgoals embedded in the videos, while incurring zero additional training cost on top of standard visuomotor policy training. Goal-conditioned policies learned with UVD-discovered subgoals exhibit significantly improved compositional generalization at test time to unseen tasks. Furthermore, UVD-discovered subgoals can be used to construct goal-based reward shaping that jump-starts temporally extended exploration for reinforcement learning. We extensively evaluate UVD on both simulation and real-world tasks, and in all cases, UVD substantially outperforms baselines across imitation and reinforcement learning settings on in-domain and out-of-domain task sequences alike, validating the clear advantage of automated visual task decomposition within the simple, compact UVD framework.
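The phase-shift idea can be pictured with a short sketch: walk backwards through the frame embeddings and declare a new subgoal whenever monotone progress toward the current goal embedding stalls. This is an editor's simplification of the criterion described above; the function, the threshold eps, and the choice of encoder are assumptions, not the authors' implementation.
```python
import numpy as np

def discover_subgoals(emb, eps=1e-3):
    """emb: (T, D) per-frame features from a frozen visual encoder (assumes T >= 2)."""
    T = emb.shape[0]
    subgoals = [T - 1]                    # the last frame is always a goal
    goal = emb[-1]
    prev_d = np.linalg.norm(emb[T - 2] - goal)
    for t in range(T - 3, -1, -1):
        d = np.linalg.norm(emb[t] - goal)
        if d < prev_d - eps:              # still making monotone progress toward the goal
            prev_d = d
        else:                             # progress stalls: frame t+1 closes a phase
            subgoals.append(t + 1)
            goal = emb[t + 1]
            prev_d = np.linalg.norm(emb[t] - goal)
    return sorted(subgoals)
```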
|
|
10:30-12:00, Paper WeAA2-CC.6 |
HEGN: Hierarchical Equivariant Graph Neural Network for 9DoF Point Cloud Registration |
|
Misik, Adam | Siemens Technology, Technical University Munich |
Salihu, Driton | Technical University Munich |
Su, Xin | Technical University of Munich |
Brock, Heike | Siemens AG |
Steinbach, Eckehard | Technical University of Munich |
Keywords: Deep Learning for Visual Perception, Visual Learning, Computer Vision for Automation
Abstract: Given its wide application in robotics, point cloud registration is a widely researched topic. Conventional methods aim to find a rotation and translation that align two point clouds in 6 degrees of freedom (DoF). However, certain tasks in robotics, such as category-level pose estimation, involve non-uniformly scaled point clouds, requiring a 9DoF transform for accurate alignment. We propose HEGN, a novel equivariant graph neural network for 9DoF point cloud registration. HEGN utilizes equivariance to rotation, translation, and scaling to estimate the transformation without relying on point correspondences. Based on graph representations for both point clouds, we extract equivariant node features aggregated in their local, cross-, and global context. In addition, we introduce a novel node pooling mechanism that leverages the cross-context importance of nodes to pool the graph representation. By repeating the feature extraction and node pooling, we obtain a graph hierarchy. Finally, we determine rotation and translation by aligning equivariant features aggregated over the graph hierarchy. To estimate scaling, we leverage scale information contained in the vector norm of the equivariant features. We evaluate the effectiveness of HEGN through experiments with the synthetic ModelNet40 dataset and the real-world ScanObjectNN dataset. The results show the superior performance of HEGN in 9DoF point cloud registration and its competitive performance in conventional 6DoF point cloud registration.
|
|
WeAT1-CC Oral Session, CC-303 |
Motion and Path Planning I |
|
|
Chair: Tsagarakis, Nikos | Istituto Italiano Di Tecnologia |
Co-Chair: Okuda, Hiroyuki | Nagoya University |
|
10:30-12:00, Paper WeAT1-CC.1 |
Autonomous Navigation with Online Replanning and Recovery Behaviors for Wheeled-Legged Robots Using Behavior Trees |
|
De Luca, Alessio | Istituto Italiano Di Tecnologia |
Muratore, Luca | Istituto Italiano Di Tecnologia |
Tsagarakis, Nikos | Istituto Italiano Di Tecnologia |
Keywords: Motion and Path Planning, Reactive and Sensor-Based Planning, Field Robots
Abstract: Performing autonomous navigation in cluttered and unstructured terrains remains a challenging task for legged and wheeled mobile robots. To accomplish such a task, online planners must incorporate new terrain information perceived while the robot is moving within its environment. While hybrid-mobility robots offer high flexibility in traversing challenging terrains by leveraging the advantages of both wheeled and legged locomotion, effective hybrid planning of mobility actions that transparently combines both modes of locomotion has not been extensively explored. In this work, we present a hierarchical online hybrid primitive-based planner for autonomous navigation with wheeled-legged robots. The framework is handled by a Behavior Tree (BT) and takes into account recovery methods to deal with possible failures during the execution of the navigation/mobility plan. The framework was evaluated in multiple randomly generated, irregular, and heavily cluttered simulated environments and in real-world trials using the CENTAURO robot platform. With these experiments, we demonstrated autonomous capabilities without any human intervention, even in the case of collisions or planner failures.
|
|
10:30-12:00, Paper WeAT1-CC.2 |
Signal Temporal Logic Neural Predictive Control |
|
Meng, Yue | Massachusetts Institute of Technology |
Fan, Chuchu | Massachusetts Institute of Technology |
Keywords: Motion and Path Planning, Machine Learning for Robot Control, AI-Based Methods
Abstract: Ensuring safety and meeting temporal specifications are critical challenges for long-term robotic tasks. Signal temporal logic (STL) has been widely used to systematically and rigorously specify these requirements. However, traditional methods of finding the control policy under those STL requirements are computationally complex and do not scale to high-dimensional systems or systems with complex nonlinear dynamics. Reinforcement learning (RL) methods can learn the policy to satisfy the STL specifications via hand-crafted or STL-inspired rewards, but might encounter unexpected behaviors due to ambiguity and sparsity in the reward. In this paper, we propose a method to directly learn a neural network controller to satisfy the requirements specified in STL. Our controller learns to roll out trajectories to maximize the STL robustness score in training. In testing, similar to Model Predictive Control (MPC), the learned controller predicts a trajectory within a planning horizon to ensure the satisfaction of the STL requirement in deployment. A backup policy is designed to ensure safety when our controller fails. Our approach can adapt to various initial conditions and environmental parameters. We conduct experiments on six tasks, where our method with the backup policy outperforms the classical methods (MPC, STL-solver) and model-free and model-based RL methods in STL satisfaction rate, especially on tasks with complex STL specifications, while being 10x-100x faster than the classical methods.
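For readers unfamiliar with STL robustness scores, the quantity being maximized has a simple min/max structure. Below is an editor's toy example for the specification 'always stay outside an obstacle and eventually reach a goal region' evaluated on a discrete trajectory; it is not the paper's differentiable implementation, and all names are illustrative.
```python
import numpy as np

def stl_robustness(traj, obs_c, obs_r, goal_c, goal_r):
    """traj: (T, 2) positions; rho > 0 iff G(avoid obstacle) AND F(reach goal) holds."""
    d_obs = np.linalg.norm(traj - obs_c, axis=1) - obs_r      # margin outside the obstacle
    d_goal = goal_r - np.linalg.norm(traj - goal_c, axis=1)   # margin inside the goal region
    rho_always = d_obs.min()         # "always": worst margin over time
    rho_eventually = d_goal.max()    # "eventually": best margin over time
    return min(rho_always, rho_eventually)   # conjunction: the weaker of the two
```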
|
|
10:30-12:00, Paper WeAT1-CC.3 |
Multi-Query TDSP for Path Planning in Time-Varying Flow Fields |
|
Lee, James Ju Heon | University of Technology Sydney |
Yoo, Chanyeol | University of Technology Sydney |
Anstee, Stuart David | Defence Science and Technology Group |
Fitch, Robert | University of Technology Sydney |
Keywords: Motion and Path Planning, Marine Robotics
Abstract: Many applications of path planning in time-varying flow fields, particularly in areas such as marine robotics and ship routing, can be modelled as instances of the time-dependent shortest path (TDSP) problem. Although there are no known polynomial-time solutions to TDSP in general, our recent work has identified a tractable case where the flow is modelled as piecewise constant. Extending this method to allow for computational reuse in larger multi-query problems, however, requires additional thought. This paper shows that the piecewise-linear form of the cost function employed in previous work can be used to build an analogue of a shortest-path tree, thereby enabling optimal concatenation of sub-problem solutions in the absence of an optimal substructure, and without uniform time discretisation. We present a framework for multi-query TDSP that finds an optimal path passing through a defined sequence of waypoints and is computationally efficient. A performance comparison in simulation shows a large (up to 100x) speedup compared to a naive approach. This result is significant for applications such as ship routing, where route evaluation is a desirable capability.
|
|
10:30-12:00, Paper WeAT1-CC.4 |
CTopPRM: Clustering Topological PRM for Planning Multiple Distinct Paths in 3D Environments |
|
Novosad, Matej | Faculty of Electrical Engineering, Czech Technical University in Prague |
Penicka, Robert | Czech Technical University in Prague |
Vonasek, Vojtech | Czech Technical University in Prague |
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination
Abstract: We propose a new method called Clustering Topological PRM (CTopPRM) for finding multiple distinct paths in 3D cluttered environments. Finding such distinct paths is useful in many applications. Among others, using multiple distinct paths is necessary for optimization-based trajectory planners, where found trajectories are restricted to only a single homotopy class of a given path. Distinct paths can also be used to guide sampling-based motion planning and thus increase the effectiveness of planning in environments with narrow passages. A graph-based representation called a roadmap is a common representation for path planning and also for finding multiple distinct paths. Yet, challenging environments with multiple narrow passages require a densely sampled roadmap to capture the connectivity of the environment. Searching such a dense roadmap for multiple paths is computationally too expensive. The majority of existing methods construct only a sparse roadmap which, however, struggles to find all distinct paths in challenging environments. To this end, we propose CTopPRM, which creates a sparse graph by clustering an initially sampled dense roadmap. Such a reduced roadmap allows fast identification of homotopically distinct paths captured in the dense roadmap. We show that, compared to existing methods, CTopPRM improves the probability of finding all distinct paths by almost 20% within the same run-time. The source code of our method is released as an open-source package.
|
|
10:30-12:00, Paper WeAT1-CC.5 |
Stein Variational Guided Model Predictive Path Integral Control: Proposal and Experiments with Fast Maneuvering Vehicles |
|
Honda, Kohei | Nagoya University |
Akai, Naoki | Nagoya University |
Suzuki, Kosuke | Nagoya University |
Aoki, Mizuho | Nagoya University |
Hosogaya, Hirotaka | Nagoya University |
Okuda, Hiroyuki | Nagoya University |
Suzuki, Tatsuya | Nagoya University |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Collision Avoidance
Abstract: This paper presents a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI), named Stein Variational Guided MPPI (SVG-MPPI), designed to handle rapidly shifting multimodal optimal action distributions. While MPPI can find a Gaussian-approximated optimal action distribution in closed form, i.e., without iterative solution updates, it struggles with the multimodality of the optimal distributions. This is due to the less representative nature of the Gaussian. To overcome this limitation, our method aims to identify a target mode of the optimal distribution and guide the solution to converge to fit it. In the proposed method, the target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method and embedded into the MPPI algorithm to find a closed-form ``mode-seeking'' solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in terms of path-tracking and obstacle-avoidance capabilities.
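For context, the sketch below shows the vanilla MPPI importance-weighted update that SVG-MPPI builds on; the Stein-variational mode-seeking guidance that is the paper's contribution is not shown, and the variable names are assumptions.
```python
import numpy as np

def mppi_update(u_nominal, noise, costs, lam=1.0):
    """u_nominal: (H, m) controls; noise: (K, H, m) sampled perturbations; costs: (K,) rollout costs."""
    w = np.exp(-(costs - costs.min()) / lam)             # softmin weights over rollouts
    w /= w.sum()
    return u_nominal + np.tensordot(w, noise, axes=1)    # importance-weighted mean perturbation
```
When the optimal action distribution is multimodal, this Gaussian-weighted average blurs across modes, which is exactly the failure case the guided variant above targets.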
|
|
10:30-12:00, Paper WeAT1-CC.6 |
An Efficient Solution to the 2D Visibility Problem in Cartesian Grid Maps and Its Application in Heuristic Path Planning |
|
Ibrahim, Ibrahim | KU Leuven |
Gillis, Joris | KU Leuven |
Decré, Wilm | Katholieke Universiteit Leuven |
Swevers, Jan | KU Leuven |
Keywords: Computational Geometry, Simulation and Animation, Motion and Path Planning
Abstract: This paper introduces a novel, lightweight method to solve the visibility problem for 2D grids. The proposed method evaluates the existence of lines-of-sight from a source point to all other grid cells in a single pass with no preprocessing and independently of the number and shape of obstacles. It has a compute and memory complexity of $\mathcal{O}(n)$, where $n = n_x \times n_y$ is the size of the grid, and requires at most ten arithmetic operations per grid cell. In the proposed approach, we use a linear first-order hyperbolic partial differential equation to transport the visibility quantity in all directions. In order to accomplish that, we use an entropy-satisfying upwind scheme that converges to the true visibility polygon as the step size goes to zero. This dynamic-programming approach allows the evaluation of visibility for an entire grid much faster than typical algorithms. We provide a practical application of our proposed algorithm by posing the visibility quantity as a heuristic and implementing a deterministic, local-minima-free path planner, setting apart the proposed planner from traditional methods. Lastly, we provide necessary algorithms and an open-source implementation of the proposed methods.
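As a rough picture of how a single-pass, O(n) visibility sweep can work on a grid, here is an editor's dynamic-programming sketch for one quadrant: each cell blends the visibility of its two upwind neighbours with weights set by the ray direction from the source. The paper's entropy-satisfying scheme and exact weights may differ.
```python
import numpy as np

def visibility_quadrant(occupied, src):
    """Soft visibility from src over the +x/+y quadrant of a boolean obstacle grid."""
    nx, ny = occupied.shape
    sx, sy = src
    v = np.zeros((nx, ny))
    v[sx, sy] = 1.0
    for i in range(sx, nx):
        for j in range(sy, ny):
            if (i, j) == (sx, sy):
                continue
            dx, dy = i - sx, j - sy
            wx, wy = dx / (dx + dy), dy / (dx + dy)        # upwind weights along the ray
            from_x = v[i - 1, j] if i > sx else 0.0
            from_y = v[i, j - 1] if j > sy else 0.0
            v[i, j] = 0.0 if occupied[i, j] else wx * from_x + wy * from_y
    return v
```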
|
|
10:30-12:00, Paper WeAT1-CC.7 |
Efficient Clothoid Tree-Based Local Path Planning for Self-Driving Robots |
|
Lee, Minhyeong | Seoul National University |
Lee, Dongjun | Seoul National University |
Keywords: Motion and Path Planning, Wheeled Robots
Abstract: In this paper, we propose a real-time clothoid tree-based path planning method for self-driving robots. Clothoids, curves that exhibit linear curvature profiles, play an important role in road design and path planning due to their appealing properties. Nevertheless, their real-time applications face considerable challenges, primarily stemming from the lack of a closed-form clothoid expression. To address these challenges, we introduce two innovative techniques: 1) an efficient and precise clothoid approximation using Gauss-Legendre quadrature; and 2) a data-efficient decoder for interpolating clothoid splines that leverages the symmetry and similarity of clothoids. These techniques are demonstrated with numerical examples. The clothoid approximation ensures an accurate and smooth representation of the curve, and the clothoid spline decoder effectively accelerates the clothoid tree exploration by relaxing the problem constraints and reducing the problem size. Both techniques are integrated into our path planning algorithm and evaluated in various driving scenarios.
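To illustrate the kind of quadrature-based evaluation mentioned above, here is an editor's sketch of computing a clothoid point with Gauss-Legendre nodes; theta0, kappa0, and the sharpness c are the standard clothoid parameters, and the node count n is an assumption rather than the paper's choice.
```python
import numpy as np

def clothoid_point(s, theta0=0.0, kappa0=0.0, c=1.0, n=8):
    """Approximate (x(s), y(s)) of a clothoid with an n-point Gauss-Legendre rule."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    t = 0.5 * s * (nodes + 1.0)                      # map [-1, 1] to [0, s]
    theta = theta0 + kappa0 * t + 0.5 * c * t**2     # heading along the curve
    x = 0.5 * s * np.sum(weights * np.cos(theta))
    y = 0.5 * s * np.sum(weights * np.sin(theta))
    return x, y
```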
|
|
10:30-12:00, Paper WeAT1-CC.8 |
Decentralized Lifelong Path Planning for Multiple Ackerman Car-Like Robots |
|
Guo, Teng | Rutgers University |
Yu, Jingjin | Rutgers University |
Keywords: Motion and Path Planning, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: Path planning for multiple non-holonomic robots in continuous domains constitutes a difficult robotics challenge with many applications. Despite significant recent progress on the topic, computationally efficient and high-quality solutions are lacking, especially in lifelong settings where robots must continuously take on new tasks. In this work, we make it possible to extend key ideas enabling state-of-the-art (SOTA) methods for multi-robot planning in discrete domains to the motion planning of multiple Ackerman (car-like) robots in lifelong settings, yielding high-performance centralized and decentralized planners. Our planners compute trajectories that allow the robots to reach precise SE(2) goal poses. The effectiveness of our methods is thoroughly evaluated and confirmed using both simulation and real-world experiments.
|
|
10:30-12:00, Paper WeAT1-CC.9 |
Energy-Aware Ergodic Search: Continuous Exploration for Multi-Agent Systems with Battery Constraints |
|
Seewald, Adam | Yale University |
Lerch, Cameron | Yale University |
Chancán, Marvin | Yale University |
Dollar, Aaron | Yale University |
Abraham, Ian | Yale University |
Keywords: Motion and Path Planning, Energy and Environment-Aware Automation
Abstract: Continuous exploration without interruption is important in scenarios such as search and rescue and precision agriculture, where consistent presence is needed to detect events over large areas. Ergodic search already derives continuous trajectories in these scenarios so that a robot spends more time in areas with high information density. However, existing literature on ergodic search does not consider the robot's energy constraints, limiting how long a robot can explore. In fact, if the robots are battery-powered, it is physically not possible to continuously explore on a single battery charge. Our paper tackles this challenge, integrating ergodic search methods with energy-aware coverage. We trade off battery usage and coverage quality, maintaining uninterrupted exploration by at least one agent. Our approach derives an abstract battery model for future state-of-charge estimation and extends canonical ergodic search to ergodic search under battery constraints. Empirical data from simulations and real-world experiments demonstrate the effectiveness of our energy-aware ergodic search, which ensures continuous exploration and guarantees spatial coverage.
|
|
WeAT2-CC Oral Session, CC-311 |
Actuation |
|
|
Chair: Thomas, Ulrike | Chemnitz University of Technology |
Co-Chair: Haddadin, Sami | Technical University of Munich |
|
10:30-12:00, Paper WeAT2-CC.1 |
Development of Variable Transmission Series Elastic Actuator for Hip Exoskeletons |
|
Wang, Tianci | City University of Hong Kong |
Wen, Hao | City University of Hong Kong |
Song, Zaixin | City University of Hong Kong |
Dong, Zhiping | City University of Hong Kong |
Liu, Chunhua | City University of Hong Kong |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Compliance and Impedance Control
Abstract: Series elastic actuator (SEA)-based exoskeletons can offer precise torque control and transparency when interacting with human wearers. Accurate control of SEA-produced torques ensures the wearer's voluntary motion and supports the implementation of multiple assistive paradigms. In this paper, a novel variable transmission series elastic actuator (VTSEA) is developed to meet torque-speed requirements in different exoskeleton-assisted locomotion modes, such as running, walking, sit-to-stand, and stand-to-sit. The VTSEA features a SEA-coupled variable-transmission-ratio adjusting mechanism and works between three discrete levels of transmission ratio depending on the user's initiative. The proposed prototype can also improve transparency in human-robot interaction. In addition, an accurate torque controller with inertial compensation is developed for the VTSEA via singular perturbation theory, and its stability is proved. The feasibility of the proposed VTSEA prototype and its precise output torque performance are verified by experiments.
|
|
10:30-12:00, Paper WeAT2-CC.2 |
Optimization of Mono and Bi-Articular Parallel Elastic Elements for a Robotic Arm Performing a Pick-And-Place Task |
|
Marchal, Maxime | Vrije Universiteit Brussel |
Furnémont, Raphaël | Vrije Universiteit Brussel |
Vanderborght, Bram | Vrije Universiteit Brussel |
Mostafaoui, Ghiles | CNRS, University of CergyPontoise, ENSEA |
Verstraten, Tom | Vrije Universiteit Brussel |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Actuation concepts such as Series Elastic Actuation (SEA), Parallel Elastic Actuation (PEA), and Biarticular Actuation (BA), which introduce elastic elements into the structure, have the potential to reduce the electrical energy consumption of a robot. This letter presents an optimization of the arrangement of springs for a 3 degrees of freedom robotic arm, with the aim of decreasing the electrical energy consumption for a given pick-and-place task. Through simulations and experimental validation, we show that the optimal configuration in terms of electrical energy consumption and complexity consists of rigid actuation on joint 1 and PEAs on joints 2 and 3. With this configuration, root mean square (RMS) and peak load torques for a specific pick-and-place task can be reduced respectively by up to 43% and 44% for joint 2, and by 15% and 21% for joint 3 compared to the configuration without springs.
|
|
10:30-12:00, Paper WeAT2-CC.3 |
A Novel Compact Design of a Lever-Cam-Based Variable Stiffness Actuator: LC-VSA |
|
Zhu, Hongxi | Chemnitz University of Technology |
Thomas, Ulrike | Chemnitz University of Technology |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Safe interaction between humans and robots is one of the key challenges in robotics. To protect humans and robots from impacts, researchers have developed many different soft robots that incorporate mechanical springs into their joints. The forthcoming generation of soft robots necessitates adaptable joint stiffness to accommodate various tasks. Consequently, the development of variable stiffness actuators (VSAs) has become crucial. Among the prevalent approaches for stiffness adjustment, lever mechanisms have been implemented in numerous variable stiffness joints. Nonetheless, the integration of lever technology into a VSA often faces challenges in achieving a compact design. This paper introduces a mechanically compact design for a novel lever-cam-based variable stiffness joint.
|
|
10:30-12:00, Paper WeAT2-CC.4 |
Design and Modeling of a Compact Serial Variable Stiffness Actuator (SVSA-III) with Linear Stiffness Profile |
|
Yi, Shuowen | Wuhan University |
Liu, Siyu | The School of Power and Mechanical Engineering, Wuhan University |
Liao, Junbei | Wuhan University |
Guo, Zhao | Wuhan University |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Variable stiffness actuators (VSAs) can imitate natural muscles in their compliance capability, providing flexible adaptability for robots and improving the safety of robots interacting with the environment or humans. This paper presents a new compact serial variable stiffness actuator (SVSA-III) with a linear stiffness profile based on a symmetrical variable lever arm mechanism. The stiffness motor is used to regulate the position of the pivot located on the Archimedean Spiral Relocation Mechanism (ASRM), so that the stiffness of the actuator can be adjusted (softening or hardening). By designing the lever length, the range of stiffness adjustment can change from 0.3 Nm/degree to theoretical infinity. Moreover, the continuous linear stiffness profile of the actuator can be customized by solving the transcendental equation relating the actuator stiffness to the rotation angle of the stiffness motor. SVSA-III has the advantages of a compact structure, wide-range stiffness regulation, reduced control difficulty, and a linear stiffness profile. Two experiments, on step response and stiffness tracking, have demonstrated high accuracy and fast response for both theoretical stiffness and position adjustment.
|
|
10:30-12:00, Paper WeAT2-CC.5 |
Optimally Controlling the Timing of Energy Transfer in Elastic Joints: Experimental Validation of the Bi-Stiffness Actuation Concept |
|
Pozo Fortunić, Edmundo | Technical University of Munich |
Yildirim, Mehmet Can | Technical University of Munich |
Ossadnik, Dennis | Technical University of Munich |
Swikir, Abdalla | Technical University of Munich |
Abdolshah, Saeed | KUKA Deutschland GmbH |
Haddadin, Sami | Technical University of Munich |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Optimization and Optimal Control
Abstract: Elastic actuation taps into elastic elements' energy storage for dynamic motions beyond rigid actuation. While Series Elastic Actuators (SEA) and Variable Stiffness Actuators (VSA) are highly sophisticated, they do not fully provide control over energy transfer timing. To overcome this problem on the basic system level, the Bi-Stiffness Actuation (BSA) concept was recently proposed. Theoretically, it allows for full link decoupling, while simultaneously being able to lock the spring in the drive train via a switch-and-hold mechanism. Thus, the user would be in full control of the potential energy storage and release timing. In this work, we introduce an initial proof-of-concept of Bi-Stiffness-Actuation in the form of a 1-DoF physical prototype, which is implemented using a modular testbed. We present a hybrid system model, as well as the mechatronic implementation of the actuator. We corroborate the feasibility of the concept by conducting a series of hardware experiments using an open-loop control signal obtained by trajectory optimization. Here, we compare the performance of the prototype with a comparable SEA implementation. We show that BSA outperforms SEA 1) in terms of maximum velocity at low final times and 2) in terms of the movement strategy itself: The clutch mechanism allows the BSA to generate consistent launch sequences while the SEA has to rely on lengthy and possibly dangerous oscillatory swing-up motions.
|
|
10:30-12:00, Paper WeAT2-CC.6 |
Experimental Comparison of Pinwheel and Non-Pinwheel Designs of 3D-Printed Cycloidal Gearing for Robotics |
|
Roozing, Wesley | University of Twente |
Roozing, Glenn | Auto Elect B.V |
Keywords: Actuation and Joint Mechanisms, Mechanism Design
Abstract: Recent trends in robotic actuation have highlighted the need for low-cost, high-performance, and efficient gearing. We present an experimental study comparing pinwheel and non-pinwheel designs of cycloidal gearing. The open-source designs are 3D-printable and combined with off-the-shelf components, achieving a high performance-to-cost ratio. Extensive experimental data are presented that compare the two prototypes on run-in behaviour and a number of quantitative metrics including transmission error, play, friction, and stiffness. Furthermore, we assess overall actuator performance through position control experiments and a 10-hour endurance test. The results show strong performance characteristics and, crucially, suggest that non-pinwheel designs of cycloidal gearing can be a lower-complexity and lower-cost alternative to classical pinwheel designs, while offering similar performance.
|
|
10:30-12:00, Paper WeAT2-CC.7 |
Design and Optimization of an Origami-Inspired Foldable Pneumatic Actuator |
|
Chen, Huaiyuan | Shanghai Jiao Tong University |
Ma, Yiyuan | Shanghai Jiao Tong University |
Chen, Weidong | Shanghai Jiao Tong University |
Keywords: Hydraulic/Pneumatic Actuators, Actuation and Joint Mechanisms, Modeling, Control, and Learning for Soft Robots
Abstract: A novel origami-inspired foldable pneumatic actuator is proposed in this letter to satisfy the comprehensive requirements of wearable assistive applications. The pneumatic actuator combines an origami structure with the designed Quadrangular-Expand pattern and a foldable pneumatic bellows. The integrated origami structure regulates the motion of the actuator with a high contraction ratio and enables accurate modeling. The origami framework also improves the strength for bearing negative pressure and thus can provide bidirectional actuation. The workflow, including the design, fabrication, and mathematical modeling of the pneumatic actuator, is presented in detail. Based on the actuator model, a multi-objective optimization of the parameters using a Genetic Algorithm is then conducted to obtain a trade-off design. The static characteristics of the output torque, as well as the dynamic characteristics of power density, mechanical efficiency, and frequency response, have been verified experimentally. In summary, the proposed actuator is powerful and energy-efficient.
|
|
10:30-12:00, Paper WeAT2-CC.8 |
A Non-Magnetic Dual-Mode Linear Pneumatic Actuator: Initial Design and Assessment |
|
Portha, Timothée | University of Strasbourg |
Barbé, Laurent | University of Strasbourg, ICube CNRS |
Geiskopf, Francois | INSA De Strasbourg |
Vappou, Jonathan | CNRS, Universite De Strasbourg |
Renaud, Pierre | ICube |
Keywords: Hydraulic/Pneumatic Actuators
Abstract: A pneumatic linear actuator is presented and evaluated. Designed to operate in demanding environments such as MRI, it is developed to be used with two motion control modes: 1) a step-by-step mode with tooth-based gripping to ensure precision, 2) a continuous mode available locally for fine positioning. The actuator can also be disengaged to enable direct handling by an operator, for example for comanipulation. The design is presented. A prototype, developed in the medical context, is implemented and characterized. A specific step-by-step control sequence is then elaborated based on its characterization. Testing of the dual-mode actuation is finally described. The complementarity between the two motion modes and possible adaptations of the original design are discussed.
|
|
10:30-12:00, Paper WeAT2-CC.9 |
Variable Stiffness Floating Spring Leg: Performing Net-Zero Energy Cost Tasks Not Achievable Using Fixed Stiffness Springs |
|
Kim, Sung | Vanderbilt University |
Braun, David | Vanderbilt University |
Keywords: Compliant Joints and Mechanisms, Actuation and Joint Mechanisms, Legged Robots
Abstract: Sitting down and standing up from a chair and, similarly, moving heavy objects up and down between factory lines are examples of cyclic tasks that require large forces but little to no net mechanical energy. Motor-driven artificial limbs and industrial robots can help humans do these tasks, but motors require energy to provide force even if they supply no net mechanical energy. Springs are energetically conservative mechanical elements useful for building robots that require no energy when performing cyclic tasks. However, conventional springs can be limited by their non-customizable force-deflection behavior -- for example, when they cannot meet the force demand despite storing enough energy to perform a cyclic task. Variable stiffness springs are a special type of spring with customizable force-deflection behavior, but most typical variable stiffness springs require energy to amplify force similar to motors. In this paper, we introduce a new type of variable stiffness spring design which is energetically conservative despite having a customizable force-deflection behavior. We present the theory of these springs and demonstrate their utility in performing a net-zero mechanical energy cost lifting task that requires force amplification and as such is not realizable using conventional springs.
|
|
WeAT3-CC Oral Session, CC-313 |
Kinematics |
|
|
Chair: Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Co-Chair: Chirikjian, Gregory | National University of Singapore |
|
10:30-12:00, Paper WeAT3-CC.1 |
Accurate Kinematic Modeling Using Autoencoders on Differentiable Joints |
|
Wilhelm, Nikolas Jakob | Technical University of Munich |
Haddadin, Sami | Technical University of Munich |
Burgkart, Rainer | Technische Universität München |
van der Smagt, Patrick | Volkswagen Group |
Karl, Maximilian | Volkswagen AG |
Keywords: Deep Learning Methods, Kinematics
Abstract: In robotics and biomechanics, accurately determining joint parameters and computing the corresponding forward and inverse kinematics are critical yet often challenging tasks, especially when dealing with highly individualized and partly unknown systems. This paper unveils a cutting-edge kinematic optimizer, underpinned by an autoencoder-based architecture, to address these challenges. Utilizing a neural network, our approach simulates inverse kinematics, converting measurement data into joint-specific parameters during encoding, enabling a stable optimization process. These parameters are subsequently processed through a predefined, differentiable forward kinematics model, resulting in a decoded representation of the original data. Beyond offering a comprehensive solution to kinematics challenges, our method also unveils previously unidentified joint parameters. Real experimental data from knee and hand joints validate the optimizer's efficacy. Additionally, our optimizer is multifunctional: it streamlines the modeling and automation of kinematics and enables a nuanced evaluation of diverse modeling techniques. By assessing the differences in reconstruction losses, we illuminate the merits of each approach. Collectively, this preliminary study signifies advancements in kinematic optimization, with potential applications spanning both biomechanics and robotics.
|
|
10:30-12:00, Paper WeAT3-CC.2 |
A Miniature Water Jumping Robot Based on Accurate Interaction Force Analysis |
|
Yan, Jihong | Harbin Institute of Technology |
Zhang, Xin | Harbin Institute of Technology |
Yang, Kai | Harbin Institute of Technology |
Zhao, Jie | Harbin Institute of Technology |
Keywords: Dynamics, Mechanism Design, Kinematics, Trajectory Optimization
Abstract: Water jumping motion extends a robot's movement space and flexibility. However, the jumping performance is influenced by multiple factors such as the driving force, rowing trajectory, and robot structure. The interaction force between the robot and the water surface is complicated due to water deformation, and the difficulty of water jumping increases with the robot's scale. This paper designs a miniature water jumping robot with rowing driving legs. The hydrodynamic model between the driving legs and the water is established based on the modified Wagner theory, with consideration of water surface deformation. In particular, the dynamic model of the robot for the whole jumping process is also developed, accounting for multiple factors. The jumping performance is then improved by optimizing the energy storage modality, rowing trajectory, and supporting leg shapes through theoretical analysis and experiments. The fabricated robot weighs 91 g, and its length, width, and height are 220 mm, 410 mm, and 95 mm, respectively. The maximum water jumping height and distance are 241 mm and 965 mm.
|
|
10:30-12:00, Paper WeAT3-CC.3 |
Jerk-Limited Traversal of One-Dimensional Paths and Its Application to Multi-Dimensional Path Tracking |
|
Kiemel, Jonas | Karlsruhe Institute of Technology |
Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Keywords: Kinematics, Constrained Motion Planning
Abstract: In this paper, we present an iterative method to quickly traverse multi-dimensional paths considering jerk constraints. As a first step, we analyze the traversal of each individual path dimension. We derive a range of feasible target accelerations for each intermediate waypoint of a one-dimensional path using a binary search algorithm. Computing a trajectory from waypoint to waypoint leads to the fastest progress on the path when selecting the highest feasible target acceleration. Similarly, it is possible to calculate a trajectory that leads to minimum progress along the path. This insight allows us to control the traversal of a one-dimensional path in such a way that a reference path length of a multi-dimensional path is approximately tracked over time. In order to improve the tracking accuracy, we propose an iterative scheme to adjust the temporal course of the selected reference path length. More precisely, the temporal region causing the largest position deviation is identified and updated at each iteration. In our evaluation, we thoroughly analyze the performance of our method using seven-dimensional reference paths with different path characteristics. We show that our method manages to quickly traverse the reference paths and compare the required traversing time and the resulting path accuracy with other state-of-the-art approaches.
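The binary search over target accelerations described above can be pictured with a small sketch. The callback is_feasible stands in for generating a jerk-limited trajectory to the next waypoint and checking the constraints; it is a hypothetical helper, and the sketch assumes feasibility is monotone in the target acceleration.
```python
def highest_feasible_acceleration(a_lo, a_hi, is_feasible, tol=1e-4):
    """Largest feasible target acceleration in [a_lo, a_hi] under a monotone feasibility test."""
    if not is_feasible(a_lo):
        return None                   # even the smallest candidate fails
    while a_hi - a_lo > tol:
        mid = 0.5 * (a_lo + a_hi)
        if is_feasible(mid):
            a_lo = mid                # push the lower bound up
        else:
            a_hi = mid                # shrink the upper bound
    return a_lo
```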
|
|
10:30-12:00, Paper WeAT3-CC.4 |
The Kinematics of Constant Curvature Continuum Robots through Three Segments |
|
Li, Yucheng | University of Dayton |
Myszka, David H. | University of Dayton |
Murray, Andrew | University of Dayton |
Keywords: Kinematics, Formal Methods in Robotics and Automation, Soft Robot Applications
Abstract: This letter presents an investigation into the mathematical relationships between the positions and orientations at the segment tips of a piecewise constant curvature (PCC) continuum robot with up to three segments. For one segment, a reachability criterion is proposed, which simplifies the calculation of the neighboring orientation. For two segments, a reachability criterion is proposed and the redundancy of the inverse kinematics solution is found, establishing a circle of tip locations. For three segments, the redundancy of the inverse kinematics includes tips that lie on a sphere, providing a closed-form solution to the inverse kinematics problem. These relationships are derived from the unique characteristics of the bisecting plane of a single segment. The degenerate cases for the solutions are also addressed. These outcomes stem from a specific PCC parametrization, with implications extending to the general PCC model. Note that this study is grounded solely in simulation.
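For reference, the forward kinematics of a single constant-curvature segment under one common parametrization looks as follows; this is an editor's sketch using the standard arc model (curvature kappa, bending-plane angle phi, arc length ell), not necessarily the specific parametrization the letter builds on.
```python
import numpy as np

def pcc_tip_pose(kappa, phi, ell):
    """4x4 homogeneous transform of a single constant-curvature segment tip."""
    T = np.eye(4)
    if abs(kappa) < 1e-9:                 # straight segment
        T[2, 3] = ell
        return T
    theta = kappa * ell                   # total bending angle
    c_p, s_p = np.cos(phi), np.sin(phi)
    c_t, s_t = np.cos(theta), np.sin(theta)
    p = np.array([c_p * (1.0 - c_t), s_p * (1.0 - c_t), s_t]) / kappa
    Rz = np.array([[c_p, -s_p, 0.0], [s_p, c_p, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[c_t, 0.0, s_t], [0.0, 1.0, 0.0], [-s_t, 0.0, c_t]])
    T[:3, :3] = Rz @ Ry @ Rz.T            # rotate into the bending plane, bend, rotate back
    T[:3, 3] = p
    return T
```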
|
|
10:30-12:00, Paper WeAT3-CC.5 | Add to My Program |
An Analytic Solution to the 3D CSC Dubins Path Problem |
|
Montano, Victor | University of Houston |
Navkar, Nikhil | Hamad Medical Corporation |
Becker, Aaron | University of Houston |
Keywords: Kinematics, Nonholonomic Motion Planning, Motion and Path Planning
Abstract: We present an analytic solution to the 3D Dubins path problem for paths composed of an initial circular arc, a straight component, and a final circular arc. These are commonly called CSC paths. By modeling the start and goal configurations of the path as the base frame and final frame of an RRPRR manipulator, we treat this as an inverse kinematics problem. The kinematic features of the 3D Dubins path are built into the constraints of our manipulator model. Furthermore, we show that the number of solutions is not constant, with up to seven valid CSC path solutions even in non-singular regions. An implementation of the solution is available at https://github.com/aabecker/dubins3D.
|
|
10:30-12:00, Paper WeAT3-CC.6 | Add to My Program |
Polytope-Based Continuous Scalar Performance Measure with Analytical Gradient for Effective Robot Manipulation |
|
Somenedi Nageswara Rao, Keerthi Sagar | Irish Manufacturing Research Limited, Ireland |
Caro, Stéphane | CNRS/LS2N |
Padir, Taskin | Northeastern University |
Long, Philip | Atlantic Technological University |
Keywords: Kinematics, Optimization and Optimal Control, Parallel Robots
Abstract: Performance measures are essential to characterize a robot's ability to carry out manipulation tasks. Generally, these measures examine the system's kinematic transformations from configuration to task space, but the capacity margin, a polytope-based kinetostatic index, additionally provides an accurate evaluation of both the twist and wrench capacities of a robotic manipulator. However, this index is the minimum of a discontinuous scalar function, leading to difficulties when computing gradients and thereby rendering it unsuitable for online numerical optimization. In this letter, we propose a novel performance index using an approximation of the capacity margin. The proposed index is continuous and differentiable, characteristics that are essential for modelling smooth and predictable system behavior. We demonstrate its effectiveness both as a constraint and as an objective function for inverse kinematics optimization. Moreover, to show its practical use, two opposing robot architectures are chosen to validate the results through both simulation and experiments: (i) serial robots - the Universal Robots UR5 (6-dof) and the Rethink Robotics Sawyer (7-dof); and (ii) a parallel manipulator - a cable-driven parallel robot. A visual representation of the performance index is also presented.
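One generic way to replace the minimum over polytope facet distances with a smooth, differentiable surrogate is a log-sum-exp soft-min; the sketch below illustrates that general idea only and is not the specific approximation proposed in the letter.

    import numpy as np

    def soft_min(distances, beta=50.0):
        """Smooth, differentiable lower bound on min(distances).
        Larger beta -> tighter approximation (illustrative, not the paper's index)."""
        d = np.asarray(distances, dtype=float)
        return -np.log(np.sum(np.exp(-beta * d))) / beta

    # Example: distances from the current wrench to each facet of a capacity polytope
    print(soft_min([0.8, 0.3, 1.2]))   # slightly below the true minimum 0.3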
|
|
10:30-12:00, Paper WeAT3-CC.7 | Add to My Program |
Kinematic Optimization of a Robotic Arm for Automation Tasks with Human Demonstration |
|
Meir, Inbar | Tel Aviv University |
Bechar, Avital | Agricultural Research Organization |
Sintov, Avishai | Tel-Aviv University |
Keywords: Kinematics, Industrial Robots
Abstract: Robotic arms are highly common in various automation processes such as manufacturing lines. However, these highly capable robots are usually degraded to simple repetitive tasks such as pick-and-place. On the other hand, designing an optimal robot for one specific task consumes large resources of engineering time and costs. In this paper, we propose a novel concept for optimizing the fitness of a robotic arm to perform a specific task based on human demonstration. Fitness of a robot arm is a measure of its ability to follow recorded human arm and hand paths. The optimization is conducted using a modified variant of the Particle Swarm Optimization for the robot design problem. In the proposed approach, we generate an optimal robot design along with the required path to complete the task. The approach could reduce the time-to-market of robotic arms and enable the standardization of modular robotic parts. Novice users could easily apply a minimal robot arm to various tasks. Two test cases of common manufacturing tasks are presented yielding optimal designs and reduced computational effort by up to 92%.
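As a reference for the optimizer mentioned above, here is a minimal generic Particle Swarm Optimization loop over a vector of design parameters (e.g., link lengths); the fitness function and parameter names are placeholders, and the paper's modified variant adds problem-specific steps not shown here.

    import numpy as np

    def pso(fitness, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0)):
        """Minimize 'fitness' over a 'dim'-dimensional design vector (plain PSO sketch)."""
        lo, hi = bounds
        x = np.random.uniform(lo, hi, (n_particles, dim))   # particle positions (candidate designs)
        v = np.zeros_like(x)                                 # particle velocities
        pbest, pbest_val = x.copy(), np.array([fitness(p) for p in x])
        gbest = pbest[np.argmin(pbest_val)].copy()
        for _ in range(iters):
            r1, r2 = np.random.rand(*x.shape), np.random.rand(*x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            x = np.clip(x + v, lo, hi)
            vals = np.array([fitness(p) for p in x])
            improved = vals < pbest_val
            pbest[improved], pbest_val[improved] = x[improved], vals[improved]
            gbest = pbest[np.argmin(pbest_val)].copy()
        return gbest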
|
|
10:30-12:00, Paper WeAT3-CC.8 | Add to My Program |
Enhancing Motion Trajectory Segmentation of Rigid Bodies Using a Novel Screw-Based Trajectory-Shape Representation |
|
Verduyn, Arno | KU Leuven |
Vochten, Maxim | KU Leuven |
De Schutter, Joris | KU Leuven |
Keywords: Kinematics, Learning from Demonstration
Abstract: Trajectory segmentation refers to dividing a trajectory into meaningful consecutive sub-trajectories. This paper focuses on trajectory segmentation for 3D rigid-body motions. Most segmentation approaches in the literature represent the body’s trajectory as a point trajectory, considering only its translation and neglecting its rotation. We propose a novel trajectory representation for rigid-body motions that incorporates both translation and rotation, and additionally exhibits several invariant properties. This representation consists of a geometric progress rate and a third-order trajectory-shape descriptor. Concepts from screw theory were used to make this representation time-invariant and also invariant to the choice of body reference point. This new representation is validated for a self-supervised segmentation approach, both in simulation and using real recordings of human-demonstrated pouring motions. The results show a more robust detection of consecutive sub-motions with distinct features and a more consistent segmentation compared to conventional representations. We believe that other existing segmentation methods may benefit from using this trajectory representation to improve their invariance.
|
|
10:30-12:00, Paper WeAT3-CC.9 | Add to My Program |
Model Reduction in Soft Robotics Using Locally Volume-Preserving Primitives |
|
Xu, Yi | National University of Singapore |
Chirikjian, Gregory | National University of Singapore |
Keywords: Kinematics, Modeling, Control, and Learning for Soft Robots
Abstract: A new, and extremely efficient, computational modeling paradigm is introduced here for specific finite elasticity problems that arise in the context of soft robotics. Whereas continuum mechanics is a very classical area of study that is broadly applicable throughout engineering, and significant effort has been devoted to the development of intricate constitutive models for finite elasticity, we show that for the most part, the isochoric (locally volume-preserving) constraint dominates behavior, and this can be built into closed-form kinematic deformation fields before even considering other aspects of constitutive modeling. We therefore focus on developing and applying primitive deformations that each observe this constraint. By composing a wide enough variety of such deformations, many of the most common behaviors observed in soft robots can be replicated. Case studies include isotropic objects subjected to different boundary conditions, a non-isotropic helically-reinforced tube, and a not-purely-kinematic scenario with gravity loading. We show that this method is at least 50 times faster than the ABAQUS implementation of the finite element method (FEM), and has speed comparable with the real-time FEM framework SOFA. Experiments show that both our method and ABAQUS have approximately 10% error relative to experimentally measured displacements, as well as to each other. Moreover, our method outperforms SOFA when the deformation is highly nonlinear.
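The isochoric constraint referred to above requires the deformation gradient F to satisfy det(F) = 1 everywhere; as a tiny illustration (not taken from the paper), a simple shear is locally volume-preserving:

    import numpy as np

    # Deformation gradient of a simple shear x' = x + gamma*y, y' = y, z' = z
    gamma = 0.4
    F = np.array([[1.0, gamma, 0.0],
                  [0.0, 1.0,   0.0],
                  [0.0, 0.0,   1.0]])
    print(np.isclose(np.linalg.det(F), 1.0))   # True: simple shear preserves local volume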
|
|
WeAT4-CC Oral Session, CC-315 |
Add to My Program |
Multi-Robot Systems IV |
|
|
Chair: Lam, Tin Lun | The Chinese University of Hong Kong, Shenzhen |
Co-Chair: Best, Graeme | University of Technology Sydney |
|
10:30-12:00, Paper WeAT4-CC.1 | Add to My Program |
Automatic Configuration of Multi-Agent Model Predictive Controllers Based on Semantic Graph World Models |
|
de Vos, Koen | Eindhoven University of Technology |
Torta, Elena | Eindhoven University of Technology |
Bruyninckx, Herman | KU Leuven |
López Martínez, César Augusto | Eindhoven University of Technology |
van de Molengraft, Marinus Jacobus Gerardus | University of Technology Eindhoven |
Keywords: Multi-Robot Systems, Constrained Motion Planning, Cooperating Robots
Abstract: We propose a shared semantic map architecture to dynamically construct and configure Model Predictive Controllers (MPCs) that solve navigation problems for multiple robotic agents sharing parts of the same environment. The navigation task is represented as a sequence of semantically labeled areas in the map that must be traversed in order, i.e., a route. Each semantic label represents one or more constraints on the robots' motion behaviour in that area. The advantages of this approach are: (i) an MPC-based motion controller in each individual robot can be (re-)configured, at runtime, with the locally and temporally relevant parameters; (ii) the application can influence, also at runtime, the navigation behaviour of the robots, just by adapting the semantic labels; and (iii) the robots can reason about their need for coordination by analyzing over which horizon in time and space their routes overlap. The paper provides simulations of various representative situations, showing that runtime configuration of the MPC drastically decreases computation time, while retaining task execution performance similar to an approach in which each robot always includes all other robots in its MPC computations.
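A minimal sketch of the runtime-configuration idea, assuming a hypothetical mapping from semantic labels to constraint parameters; the labels, fields and numbers below are illustrative and not the paper's actual schema.

    # Hypothetical label -> constraint parameters used to (re-)configure an MPC at runtime.
    LABEL_CONSTRAINTS = {
        "corridor":  {"v_max": 1.0, "keep_right": True},
        "crossing":  {"v_max": 0.5, "yield_to_others": True},
        "open_area": {"v_max": 1.5},
    }

    def configure_mpc(route_labels, horizon_areas=2):
        """Collect only the locally and temporally relevant constraints:
        those of the areas the robot will traverse within the planning horizon."""
        active = {}
        for label in route_labels[:horizon_areas]:
            active.update(LABEL_CONSTRAINTS.get(label, {}))
        return active

    print(configure_mpc(["corridor", "crossing", "open_area"]))  # constraints for the next two areas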
|
|
10:30-12:00, Paper WeAT4-CC.2 | Add to My Program |
Meta-Reinforcement Learning Based Cooperative Surface Inspection of 3D Uncertain Structures Using Multi-Robot Systems |
|
Chen, Junfeng | Peking University |
Gao, Yuan | Shenzhen Institute of Artificial Intelligence and Robotics for S |
Hu, Junjie | The Chinese University of Hong Kong, Shenzhen |
Deng, Fuqin | Shenzhen Institute of Artificial Intelligence and Robotics for S |
Lam, Tin Lun | The Chinese University of Hong Kong, Shenzhen |
Keywords: Multi-Robot Systems, Constrained Motion Planning, Reinforcement Learning
Abstract: This paper presents a decentralized cooperative motion planning approach for surface inspection of 3D structures with uncertain size, number, shape, and position, using multi-robot systems (MRS). Given that most existing works focus on surface inspection of single and fully known 3D structures, our motivation is two-fold: first, 3D structures separately distributed in 3D environments are complex, so the use of an MRS can facilitate inspection by fully exploiting sensors with different capabilities. Second, performing such tasks under uncertainty is a complicated and time-consuming process, because the robots need to explore, determine the size and shape of the 3D structures, and then plan a surface-inspection path. To overcome these challenges, we present a meta-learning approach that provides a decentralized planner for each robot to improve the exploration and surface inspection capabilities. The experimental results demonstrate that our method outperforms other methods by approximately 10.5%-27% in success rate and 70%-75% in inspection speed.
|
|
10:30-12:00, Paper WeAT4-CC.3 | Add to My Program |
Decentralized Multi-Agent Trajectory Planning in Dynamic Environments with Spatiotemporal Occupancy Grid Maps |
|
Wu, Siyuan | Delft University of Technology |
Chen, Gang | Delft University of Technology |
Shi, Moji | Delft University of Technology |
Alonso-Mora, Javier | Delft University of Technology |
Keywords: Multi-Robot Systems, Motion and Path Planning, Distributed Robot Systems
Abstract: This paper proposes a decentralized trajectory planning framework for the collision avoidance problem of multiple micro aerial vehicles (MAVs) in environments with static and dynamic obstacles. The framework utilizes spatiotemporal occupancy grid maps (SOGM), which forecast the occupancy status of neighboring space in the near future, as the environment representation. Based on this representation, we extend the kinodynamic A* and the corridor-constrained trajectory optimization algorithms to efficiently tackle static and dynamic obstacles with arbitrary shapes. Collision avoidance between communicating robots is integrated by sharing planned trajectories and projecting them onto the SOGM. The simulation results show that our method achieves competitive performance against state-of-the-art methods in dynamic environments with different numbers and shapes of obstacles. Finally, the proposed method is validated in real experiments.
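To make the SOGM idea concrete, here is a minimal sketch of checking a candidate trajectory against a spatiotemporal occupancy grid, represented as a boolean array indexed by time slice and 2D cell; the array layout, resolution and names are assumptions, not the paper's data structures.

    import numpy as np

    def trajectory_in_collision(traj_xy_t, sogm, origin, resolution, dt):
        """traj_xy_t: list of (x, y, t); sogm: bool array [time_step, ix, iy] (assumed layout)."""
        for x, y, t in traj_xy_t:
            k = int(round(t / dt))                              # time slice of the forecast
            ix = int((x - origin[0]) / resolution)
            iy = int((y - origin[1]) / resolution)
            if k >= sogm.shape[0] or not (0 <= ix < sogm.shape[1] and 0 <= iy < sogm.shape[2]):
                return True                                     # outside the forecast horizon / map
            if sogm[k, ix, iy]:
                return True                                     # predicted occupied at that time
        return False

Planned trajectories received from communicating robots can be handled the same way, by rasterizing them into the relevant time slices of the grid before planning.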
|
|
10:30-12:00, Paper WeAT4-CC.4 | Add to My Program |
Communicating Intent As Behaviour Trees for Decentralised Multi-Robot Coordination |
|
Hull, Rhett | University of Technology Sydney |
Moratuwage, Diluka Prasanjith | University of Technology Sydney |
Scheide, Emily | Oregon State University |
Fitch, Robert | University of Technology Sydney |
Best, Graeme | University of Technology Sydney |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: We propose a decentralised multi-robot coordination algorithm that features a rich representation for encoding and communicating each robot’s intent. This representation for “intent messages” enables improved coordination behaviour and communication efficiency in difficult scenarios, such as those where there are unknown points of contention that require negotiation between robots. Each intent message is an adaptive policy that conditions on identified points of contention that conflict with the intentions of other robots. These policies are concisely expressed as behaviour trees via algebraic logic simplification, and are interpretable by robot teammates and human operators. We propose this intent representation in the context of the Dec-MCTS online planning algorithm for decentralised coordination. We present results for a generalised multi-robot orienteering domain that show improved plan convergence and coordination performance over standard Dec-MCTS enabled by the intent representation’s ability to encode and facilitate negotiation over points of contention.
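Since intent messages are expressed as behaviour trees, a minimal, generic tick of sequence and fallback (selector) nodes is sketched below; this shows standard behaviour-tree semantics only, not the Dec-MCTS intent encoding itself, and the example tree is invented.

    # Generic behaviour-tree tick (standard semantics, illustrative only).
    def tick(node, blackboard):
        kind, children = node[0], node[1:]
        if kind in ("condition", "action"):
            return children[0](blackboard)              # leaf: a callable returning True/False
        if kind == "sequence":                          # succeeds only if every child succeeds
            return all(tick(c, blackboard) for c in children)
        if kind == "fallback":                          # succeeds on the first child that succeeds
            return any(tick(c, blackboard) for c in children)
        raise ValueError(kind)

    # Example: "if the corridor is contended, wait; otherwise traverse it"
    tree = ("fallback",
            ("sequence", ("condition", lambda bb: bb["corridor_contended"]),
                         ("action",    lambda bb: bb.update(state="wait") or True)),
            ("action",    lambda bb: bb.update(state="traverse") or True))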
|
|
10:30-12:00, Paper WeAT4-CC.5 | Add to My Program |
Partial Belief Space Planning for Scaling Stochastic Dynamic Games |
|
Vakil, Kamran | Boston University |
Coffey, Mela | Boston University |
Pierson, Alyssa | Boston University |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty
Abstract: This paper presents a method to reduce the computation required for stochastic dynamic games with game-theoretic belief space planning through partially propagating beliefs. Complex interactions in scenarios such as surveillance, herding, and racing can be modeled using game-theoretic frameworks in the belief space. Stochastic dynamic games can be solved to a local Nash Equilibrium using a game-theoretic belief space variant of the iterative Linear Quadratic Gaussian (iLQG) method. However, the scalability of this method suffers due to the large dimensionality of the beliefs which the iLQG must propagate. We examine the utility of partial belief space propagation, which decreases the polynomial runtime. We validate our findings through simulations and hardware implementation.
|
|
10:30-12:00, Paper WeAT4-CC.6 | Add to My Program |
Decentralized Multi-Agent Active Search and Tracking When Targets Outnumber Agents |
|
Banerjee, Arundhati | Carnegie Mellon University |
Schneider, Jeff | Carnegie Mellon University |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Localization
Abstract: Multi-agent multi-target tracking has a wide range of applications, including wildlife patrolling, security surveillance or environment monitoring. Such algorithms often make restrictive assumptions: the number of targets and/or their initial locations may be assumed known, or agents may be pre-assigned to monitor disjoint partitions of the environment, reducing the burden of exploration. This also limits applicability when there are fewer agents than targets, since agents are unable to continuously follow the targets in their fields of view. Multi-agent tracking algorithms additionally assume inter-agent synchronization of observations, or the presence of a central controller to coordinate joint actions. Instead, we focus on the setting of decentralized multi-agent, multi-target, simultaneous active search-and-tracking with asynchronous inter-agent communication. Our proposed algorithm DecSTER uses a sequential Monte Carlo implementation of the Probability Hypothesis Density filter for posterior inference combined with Thompson sampling for decentralized multi-agent decision making. We compare different action selection policies, focusing on scenarios where targets outnumber agents. In simulation, we demonstrate that DecSTER is robust to unreliable inter-agent communication and outperforms information-greedy baselines in terms of the Optimal Sub-Pattern Assignment (OSPA) metric for different numbers of targets and varying team sizes.
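A minimal sketch of the Thompson-sampling decision step used for decentralized action selection, with a particle-based posterior over target locations; the particle format and reward function are placeholder assumptions, not DecSTER's actual implementation.

    import numpy as np

    def thompson_select_action(particles, weights, candidate_actions, expected_reward, rng):
        """Sample one hypothesis of the target state from the posterior particles,
        then act greedily with respect to that single sample (Thompson sampling)."""
        idx = rng.choice(len(particles), p=weights / np.sum(weights))
        sampled_targets = particles[idx]              # one hypothesised set of target locations
        scores = [expected_reward(a, sampled_targets) for a in candidate_actions]
        return candidate_actions[int(np.argmax(scores))]

Because each agent samples its own hypothesis, exploration is randomized per agent, which is what lets the team spread out without a central controller or synchronized observations.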
|
|
10:30-12:00, Paper WeAT4-CC.7 | Add to My Program |
Multi-Robot Autonomous Exploration and Mapping under Localization Uncertainty with Expectation-Maximization |
|
Huang, Yewei | Stevens Institute of Technology |
Lin, Xi | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Reactive and Sensor-Based Planning
Abstract: We propose an autonomous exploration algorithm designed for decentralized multi-robot teams, which takes into account map and localization uncertainties of range-sensing mobile robots. Virtual landmarks are used to quantify the combined impact of process noise and sensor noise on map uncertainty. Additionally, we employ an iterative expectation-maximization inspired algorithm to assess the potential outcomes of both a local robot’s and its neighbors’ next-step actions. To evaluate the effectiveness of our framework, we conduct a comparative analysis with state-of-the-art algorithms. The results of our experiments show the proposed algorithm’s capacity to strike a balance between curbing map uncertainty and achieving efficient task allocation among robots.
|
|
10:30-12:00, Paper WeAT4-CC.8 | Add to My Program |
Optimal Task Allocation for Heterogeneous Multi-Robot Teams with Battery Constraints |
|
Calvo, Álvaro | University of Seville |
Capitan, Jesus | University of Seville |
Keywords: Multi-Robot Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: This paper presents a novel approach to optimal multi-robot task allocation in heterogeneous teams of robots. When robots have heterogeneous capabilities and there are diverse objectives and constraints to comply with, computing optimal plans can become especially hard. Moreover, we increase the problem complexity by: 1) considering battery-limited robots that need to schedule recharges; 2) tasks that can be decomposed into multiple fragments; and 3) multi-robot tasks that need to be executed by a coalition synchronously. We define a new problem for heterogeneous multi-robot task allocation and formulate it as a Mixed-Integer Linear Program that includes all the aforementioned features. Then we use an off-the-shelf solver to show the type of optimal solutions that our planner can produce and assess its performance in random scenarios. Our method, which is released as open-source code, represents a first step to formalize and analyze a complex problem that has not been solved in the state of the art.
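As a toy illustration of formulating task allocation as a Mixed-Integer Linear Program, the sketch below uses the PuLP modelling library for a plain one-robot-per-task assignment with a crude battery budget; it omits recharging, task fragments and coalitions, and all names and numbers are invented, so it should be read as a minimal example of the formulation style rather than the paper's model.

    import pulp

    robots, tasks = ["r1", "r2"], ["t1", "t2", "t3"]
    cost = {("r1", "t1"): 4, ("r1", "t2"): 2, ("r1", "t3"): 5,
            ("r2", "t1"): 3, ("r2", "t2"): 6, ("r2", "t3"): 1}   # e.g. energy to execute a task
    battery = {"r1": 7, "r2": 6}                                  # crude per-robot energy budget

    prob = pulp.LpProblem("toy_task_allocation", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", (robots, tasks), cat="Binary")  # x[r][t] = robot r does task t

    prob += pulp.lpSum(cost[(r, t)] * x[r][t] for r in robots for t in tasks)   # total cost
    for t in tasks:                                               # every task assigned exactly once
        prob += pulp.lpSum(x[r][t] for r in robots) == 1
    for r in robots:                                              # respect each robot's battery budget
        prob += pulp.lpSum(cost[(r, t)] * x[r][t] for t in tasks) <= battery[r]

    prob.solve()
    print({(r, t): int(x[r][t].value()) for r in robots for t in tasks})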
|
|
10:30-12:00, Paper WeAT4-CC.9 | Add to My Program |
Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot Task Allocation |
|
Paul, Steve | University of Connecticut |
Maurer, Nathan | University at Buffalo |
Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Multi-Robot Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: Most real-world Multi-Robot Task Allocation (MRTA) problems require fast and efficient decision-making, which is often achieved using heuristics-aided methods such as genetic algorithms, auction-based methods, and bipartite graph matching methods. These methods often assume a form that lends better explainability compared to an end-to-end (learnt) neural network based policy for MRTA. However, deriving suitable heuristics can be tedious, risky and in some cases impractical if problems are too complex. This raises the question: can these heuristics be learned? To this end, this paper develops a Graph Reinforcement Learning (GRL) framework to learn the heuristics or incentives for a bipartite graph matching approach to MRTA. Specifically, a Capsule Attention policy model is used to learn how to weight task/robot pairings (edges) in the bipartite graph that connects the set of tasks to the set of robots. The original capsule attention network architecture is fundamentally modified by adding an encoding of the robots' state graph, and two Multihead Attention based decoders whose outputs are used to construct a LogNormal distribution matrix from which positive bigraph weights can be drawn. The performance of this new bigraph matching approach augmented with a GRL-derived incentive is found to be on par with the original bigraph matching approach that used expert-specified heuristics, with the former offering notable robustness benefits. During training, the learned incentive policy is found to initially approach the expert-specified incentive and then deviate slightly from its trend.
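The final matching step can be illustrated with SciPy's assignment solver: given a matrix of (learned) positive bigraph weights between robots and tasks, a maximum-weight matching is recovered by negating the weights; the weight values below are random placeholders standing in for the policy's output.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Rows = robots, columns = tasks; entries are positive incentives (here random placeholders
    # standing in for weights drawn from the learnt LogNormal distribution matrix).
    weights = np.random.lognormal(mean=0.0, sigma=0.5, size=(3, 5))

    rows, cols = linear_sum_assignment(-weights)           # maximize total weight
    assignment = list(zip(rows.tolist(), cols.tolist()))   # (robot, task) pairs
    print(assignment)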
|
|
WeAT5-CC Oral Session, CC-411 |
Add to My Program |
Visual Perception and Learning I |
|
|
Chair: Najjaran, Homayoun | University of Victoria |
Co-Chair: Ravendran, Ahalya | The Commonwealth Scientific and Industrial Research Organisation |
|
10:30-12:00, Paper WeAT5-CC.1 | Add to My Program |
Bag of Views: An Appearance-Based Approach to Next-Best-View Planning for 3D Reconstruction |
|
Hatami Gazani, Sara | University of Victoria |
Tucsok, Matthew | University of British Columbia |
Mantegh, Iraj | National Research Council Canada |
Najjaran, Homayoun | University of Victoria |
Keywords: Computer Vision for Automation, Aerial Systems: Perception and Autonomy, Reactive and Sensor-Based Planning
Abstract: UAV-based intelligent data acquisition for 3D reconstruction and monitoring of infrastructure has experienced an increasing surge of interest due to recent advancements in image processing and deep learning-based techniques. View planning is an essential part of this task that dictates the information capture strategy and heavily impacts the quality of the 3D model generated from the captured data. Recent methods have used prior knowledge or partial reconstruction of the target to accomplish view planning for active reconstruction; the former approach poses a challenge for complex or newly identified targets while the latter is computationally expensive. In this work, we present Bag-of-Views (BoV), a fully appearance-based model used to assign utility to the captured views for both offline dataset refinement and online next-best-view (NBV) planning applications targeting the task of 3D reconstruction. With this contribution, we also developed the View Planning Toolbox (VPT), a lightweight package for training and testing machine learning-based view planning frameworks, custom view dataset generation of arbitrary 3D scenes, and 3D reconstruction. Through experiments which pair a BoV-based reinforcement learning model with VPT, we demonstrate the efficacy of our model in reducing the number of required views for high-quality reconstructions in dataset refinement and NBV planning.
|
|
10:30-12:00, Paper WeAT5-CC.2 | Add to My Program |
See through the Real World Haze Scenes: Navigating the Synthetic-To-Real Gap in Challenging Image Dehazing |
|
Chen, Shijie | Fudan University |
Mahdizadeh, Mohammad | Fudan University |
Yu, Chong | Fudan University & NVIDIA |
Fan, Jiayuan | Fudan University |
Chen, Tao | Fudan University |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Computer Vision for Transportation
Abstract: Dehazing and enhancing visibility in real-world hazy images pose significant challenges due to the physical complexity of haze, the variability in haze conditions, the tendency to capture more details in noisy scenes, and the risk of overexposure. Many existing single RGB image dehazing methods tend to perform well in synthetic hazy scenarios but struggle in real-world situations. This stems from the fact that these methods often rely solely on deep learning techniques or on classical approaches. In addition, they neglect overall image quality improvement. To partially address these challenges, we introduce an innovative approach that harnesses the strengths of both modalities to dehaze and enhance visibility in a single real-world hazy RGB image. First, both low-level and deep features are extracted, and then a pre-trained vector quantization GAN is employed to create a discrete codebook of well-detailed data patches. A decoder component, enhanced with a normalized module, effectively utilizes these high-quality features to produce clear results. Additionally, a controllable operation is introduced to improve feature matching. To further enhance dehazing and generalizability, the decoder's output undergoes a sequence of gamma-correction operations, generating multi-exposure images that are combined to create a haze-free, visually pleasing, and higher-quality final image. The method effectively reduces haziness, enhances sharpness, preserves natural colors, and minimizes artifacts in challenging real-world scenarios. The proposed approach surpasses five SOTA methods in both qualitative and quantitative evaluations across three key metrics, utilizing two real-world and three synthetic hazy image datasets. Notably, it achieves a substantial improvement in real-world datasets over the second-best method, with gains of 0.5702 and 0.129 in FADE metrics for the RTTS and Fattal datasets, respectively.
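The final gamma-correction/multi-exposure step can be sketched in a few lines; here the synthetic exposures are simply averaged, whereas the paper combines them with a dedicated fusion, so treat this as an illustration of the idea only.

    import numpy as np

    def gamma_exposure_fusion(img, gammas=(0.5, 0.8, 1.0, 1.5, 2.2)):
        """img: float array in [0, 1]. Build a gamma-corrected exposure stack and fuse it
        (plain averaging here; the paper's fusion is more elaborate)."""
        img = np.clip(img, 0.0, 1.0)
        stack = [np.power(img, g) for g in gammas]   # each gamma simulates a different exposure
        return np.mean(stack, axis=0)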
|
|
10:30-12:00, Paper WeAT5-CC.3 | Add to My Program |
CopperTag: A Real-Time Occlusion-Resilient Fiducial Marker |
|
Bian, Xu | Xi’an Jiaotong University |
Chen, Wenzhao | Youibot Robotics Co., Ltd |
Tian, Xiaoyu | Carnegie Mellon University |
Ran, Donglai | Youibot Robotics Co., Ltd |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Industrial Robots
Abstract: Fiducial markers, like AprilTag and ArUco, are extensively utilized in robotics applications within industrial environments, encompassing navigation, docking, and object grasping tasks. However, in contrast to controlled laboratory conditions, markers installed in factory grounds or equipment surfaces, often face challenges like damage or contamination. These issues can lead to compromised marker integrity, resulting in reduced detection reliability. To address this challenge, we propose a novel fiducial marker called CopperTag, which incorporates circular and square elements to create a robust occlusion-resistant pattern. The CopperTag detection process relies on three fundamental steps: firstly, extracting all lines from the image; secondly, identifying corners; and lastly, searching for quadrilateral candidate regions using ellipses and nearby corners. The Reed-Solomon (RS) algorithm is utilized for both encoding and decoding the information content. This algorithm possesses the ability to recover corrupted messages in situations where CopperTag data is incomplete. The experimental results illustrate that CopperTag exhibits superior robustness and accuracy in detection when compared to other state-of-the-art fiducial markers, even in scenarios with heavy occlusion. Moreover, CopperTag maintains an average processing time of 10ms per frame on a standard laptop, effectively meeting the real-time demands of robotics applications.
|
|
10:30-12:00, Paper WeAT5-CC.4 | Add to My Program |
Robust Collaborative Perception without External Localization and Clock Devices |
|
Lei, Zixing | Shanghai Jiao Tong University |
Ni, Zhenyang | Shanghai Jiao Tong University |
Han, Ruize | Chinese Academy of Sciences |
Tang, Shuo | Shanghai Jiao Tong University |
Feng, Chen | New York University |
Chen, Siheng | Shanghai Jiao Tong University |
Wang, Yanfeng | Shanghai Jiao Tong University |
Keywords: Computer Vision for Automation, Computer Vision for Transportation, Deep Learning for Visual Perception
Abstract: A consistent spatial-temporal coordination across multiple agents is fundamental for collaborative perception, which seeks to improve perception abilities through information exchange among agents. To achieve this spatial-temporal alignment, traditional methods depend on external devices to provide localization and clock signals. However, hardware-generated signals could be vulnerable to noise and potentially malicious attacks, jeopardizing the precision of spatial-temporal alignment. Rather than relying on external hardware, this work proposes a novel approach: aligning by recognizing the inherent geometric patterns within the perceptual data of various agents. Following this spirit, we propose a robust collaborative perception system that operates independently of external localization and clock devices. The key module of our system, FreeAlign, constructs a salient object graph for each agent based on its detected boxes and uses a graph neural network to identify common subgraphs between agents, leading to accurate relative pose and time. We validate FreeAlign on both real-world and simulated datasets. The results show that the FreeAlign-empowered robust collaborative perception system performs comparably to systems relying on precise localization and clock devices. We will release code related to this work.
|
|
10:30-12:00, Paper WeAT5-CC.5 | Add to My Program |
DerainNeRF: 3D Scene Estimation with Adhesive Waterdrop Removal |
|
Li, Yunhao | Westlake University |
Wu, Jing | Westlake University |
Zhao, Lingzhe | Westlake University |
Liu, Peidong | Westlake University |
Keywords: Computer Vision for Automation, Computer Vision for Transportation, Visual Learning
Abstract: When capturing images through glass during rainy or snowy weather conditions, the resulting images often contain waterdrops adhered to the glass surface, and these waterdrops significantly degrade the image quality and the performance of many computer vision algorithms. To tackle these limitations, we propose a method to reconstruct a clear 3D scene implicitly from multi-view images degraded by waterdrops. Our method exploits an attention network to predict the location of waterdrops and then trains a Neural Radiance Field (NeRF) to recover the 3D scene implicitly. By leveraging the strong scene representation capabilities of NeRF, our method can render high-quality novel-view images with waterdrops removed. Extensive experimental results on both synthetic and real datasets show that our method is able to generate clear 3D scenes and outperforms existing state-of-the-art (SOTA) image adhesive waterdrop removal methods.
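The core training idea, supervising the radiance field only on pixels the attention network does not flag as waterdrops, can be sketched as a masked photometric loss; the tensor names below are assumptions and the sketch is not the paper's code.

    import torch

    def masked_photometric_loss(rendered, target, waterdrop_mask):
        """rendered, target: (N, 3) ray colours; waterdrop_mask: (N,) with 1 where a drop is predicted.
        Only clean pixels supervise the radiance field (illustrative sketch)."""
        keep = (1.0 - waterdrop_mask).unsqueeze(-1)          # 1 for clean pixels, 0 for drops
        se = (rendered - target) ** 2 * keep
        return se.sum() / keep.sum().clamp(min=1.0)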
|
|
10:30-12:00, Paper WeAT5-CC.6 | Add to My Program |
Learning Interaction Regions and Motion Trajectories Simultaneously from Egocentric Demonstration Videos |
|
Xin, Jianjia | Beijing University of Technology |
Wang, Lichun | Beijing University of Technology |
Xu, Kai | Beijing University of Technology |
Yang, Chao | Beijing University of Technology |
Yin, Baocai | Beijing University of Technology |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: Learning to interact with objects is significant for robots to integrate into human environments. When the interaction semantic is definite, manually guiding the manipulator is a commonly used method to teach robots how to interact with objects. However, the learning results are robot-dependent because the mechanical parameters are different for different robots, which means the learning process must be executed again. Moreover, during the manual guiding process, operators are responsible for recognizing the region being contacted and providing expert motion programming, which limits the robot's intelligence. To improve the degree of automation for robots interacting with objects, this paper proposes IRMT-Net (Interaction Region and Motion Trajectory prediction Network) to predict the interaction region and motion trajectory simultaneously based on images. IRMT-Net achieves state-of-the-art interaction region prediction results on Epic-kitchens dataset, generates reasonable motion trajectories and can support robot interaction in actual situations.
|
|
10:30-12:00, Paper WeAT5-CC.7 | Add to My Program |
Marrying NeRF with Feature Matching for One-Step Pose Estimation |
|
Chen, Ronghan | Sheyang Institute of Automation, Chinese Academy of Sciences |
Cong, Yang | Chinese Academy of Science, China |
Ren, Yu | Shenyang Institute of Automation Chinese Academy of Sciences |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Localization
Abstract: Given the image collection of an object, we aim at building a real-time image-based pose estimation method which requires neither its CAD model nor hours of object-specific training. Recent NeRF-based methods provide a promising solution by directly optimizing the pose from the pixel loss between rendered and target images. However, during inference, they require a long converging time and suffer from local minima, making them impractical for real-time robot applications. We aim at solving this problem by marrying image matching with NeRF. With 2D matches and depth rendered by NeRF, we directly solve the pose in one step by building 2D-3D correspondences between the target and initial view, thus allowing for real-time prediction. Moreover, to improve the accuracy of the 2D-3D correspondences, we propose a 3D consistent point mining strategy, which effectively discards unfaithful points reconstructed by NeRF. Furthermore, current NeRF-based methods that naively optimize the pixel loss fail on occluded images. Thus, we further propose a 2D-matches-based sampling strategy to preclude the occluded area. Experimental results on representative datasets prove that our method outperforms state-of-the-art methods and improves inference efficiency by 90x, achieving real-time prediction at 6 FPS.
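The one-step pose solve from 2D-3D correspondences corresponds to a standard PnP problem; a minimal OpenCV sketch is given below, assuming the matched 2D keypoints, their NeRF-rendered 3D points, and the camera intrinsics are already available, and without the paper's point-mining or occlusion-aware sampling.

    import cv2
    import numpy as np

    def pose_from_matches(points_3d, points_2d, K):
        """points_3d: (N, 3) from NeRF-rendered depth; points_2d: (N, 2) matched target keypoints;
        K: 3x3 intrinsics. Returns a rotation matrix and translation (world -> camera)."""
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            points_3d.astype(np.float32), points_2d.astype(np.float32), K, None)
        if not ok:
            return None
        R, _ = cv2.Rodrigues(rvec)      # axis-angle to rotation matrix
        return R, tvec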
|
|
10:30-12:00, Paper WeAT5-CC.8 | Add to My Program |
Occluded Part-Aware Graph Convolutional Networks for Skeleton-Based Action Recognition |
|
Kim, Min Hyuk | Chonnam National University |
Kim, Min Ju | Chonnam National University |
Yoo, Seok Bong | Chonnam National University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Recognition
Abstract: Recognizing human action is one of the most critical factors in the visual perception of robots. Specifically, skeleton-based action recognition has been actively researched to enhance recognition performance at a lower cost. However, action recognition in occlusion situations, where body parts are not visible, is still challenging. We propose an occluded part-aware graph convolutional network (OP-GCN) to address this challenge using the optimal occluded body parts. The proposed model uses an occluded part detector to identify occluded body parts within a human skeleton. It is based on an autoencoder trained on a nonoccluded human skeleton and exploits the symmetry and angular information of the skeleton. Then, we select an optimal group constructed considering the occluded body parts. Each group comprises five sets of joint nodes, focusing on the body parts, excluding the occluded ones. Finally, to enhance interaction within the selected groups, we apply an interpart association module, considering the fusion of global and local elements. The experimental results reveal that the proposed model outperforms others on the occluded datasets. These comparative experiments demonstrate the effectiveness of the study in addressing the challenge of action recognition in occlusion situations. Our code is publicly available at https://github.com/MJ-Kor/OP-GCN.
|
|
10:30-12:00, Paper WeAT5-CC.9 | Add to My Program |
MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation |
|
Dong, Yuejiang | Tsinghua University |
Zhang, Fang-Lue | Victoria University of Wellington |
Zhang, Song-Hai | Tsinghua University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, RGB-D Perception
Abstract: Depth perception is crucial for a wide range of robotic applications. Multi-frame self-supervised depth estimation methods have gained research interest due to their ability to leverage large-scale, unlabeled real-world data. However, the self-supervised methods often rely on the assumption of a static scene and their performance tends to degrade in dynamic environments. To address this issue, we present Motion-Aware Loss, which leverages the temporal relation among consecutive input frames and a novel distillation scheme between the teacher and student networks in the multi-frame self-supervised depth estimation methods. Specifically, we associate the spatial locations of moving objects with the temporal order of input frames to eliminate errors induced by object motion. Meanwhile, we enhance the original distillation scheme in multi-frame methods to better exploit the knowledge from a teacher network. MAL is a novel, plug-and-play module designed for seamless integration into multi-frame self-supervised monocular depth estimation methods. Adding MAL into previous state-of-the-art methods leads to a reduction in depth estimation errors by up to 4.2% and 10.8% on KITTI and CityScapes benchmarks, respectively.
|
|
WeAT6-CC Oral Session, CC-414 |
Add to My Program |
Visual Servoing |
|
|
Chair: Valada, Abhinav | University of Freiburg |
Co-Chair: Loianno, Giuseppe | New York University |
|
10:30-12:00, Paper WeAT6-CC.1 | Add to My Program |
Stereo Image-Based Visual Servoing towards Feature-Based Grasping |
|
Enyedy, Albert | Worcester Polytechnic Institute |
Aswale, Ashay | Worcester Polytechnic Institute |
Calli, Berk | Worcester Polytechnic Institute |
Gennert, Michael | Worcester Polytechnic Institute |
Keywords: Visual Servoing, Grasping, Humanoid Robot Systems
Abstract: This paper presents an image-based visual servoing scheme that can control robotic manipulators in 3D space using 2D stereo images without needing to perform stereo reconstruction. We use a stereo camera in an eye-to-hand configuration for controlling the robot to reach target positions by directly mapping image space errors to joint space actuation. We achieve convergence without a-priori knowledge of the target object, a reference 2D image, or 3D data. By doing so, we can reach targets in unstructured environments using high-resolution RGB images instead of utilizing relatively noisy depth data. We conduct several experiments on two different physical robots. The Panda 7DOF arm grasps a static target in 3D space, grasps a pitcher handle, and picks and places a box by determining the approach angle using 2D image features, demonstrating that this algorithm can be used for grasping practical objects in 3D space using only 2D image features for feedback. Our second platform, the Atlas humanoid robot, reaches a target from an unknown starting configuration, demonstrating that this controller achieves convergence to a target, even with the uncertainties introduced by walking to a new location. We believe that this algorithm is a step towards enabling intuitive interfaces that allow a user to initiate a grasp on an object by specifying a grasping point in a 2D image.
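The underlying control law of classical image-based visual servoing, mapping stacked image-feature errors directly to joint velocities through a combined image/robot Jacobian, can be sketched as follows; the variable names are generic and this is not the authors' code.

    import numpy as np

    def ibvs_joint_velocities(s, s_star, L, J_robot, gain=0.5):
        """s, s_star: stacked current/desired image features (e.g. from both stereo views);
        L: interaction matrix (d features x 6); J_robot: robot Jacobian (6 x n joints)."""
        error = s - s_star
        J_img = L @ J_robot                        # image-space feature motion per unit joint velocity
        return -gain * np.linalg.pinv(J_img) @ error

Stacking features from both stereo views, as in this paper, simply lengthens s and L; no stereo reconstruction is needed because depth only enters through the interaction matrix.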
|
|
10:30-12:00, Paper WeAT6-CC.2 | Add to My Program |
Visual Feedback Control of an Underactuated Hand for Grasping Brittle and Soft Foods |
|
Kai, Ryogo | Chuo University |
Isobe, Yuzuka | Chuo University |
Pathak, Sarthak | Chuo University |
Umeda, Kazunori | Chuo University |
Keywords: Visual Servoing, Grasping, Underactuated Robots
Abstract: This paper presents a novel method to control an underactuated hand using only a monocular camera, without any internal sensors. In food factories, robots are required to handle a wide variety of foods without damaging them. To accomplish this, the use of underactuated hands is effective because they can adapt to various food shapes. However, if internal sensors such as tactile sensors and force sensors are used in underactuated hands, they may cause hygiene problems and require complicated calibration. Moreover, if external sensors such as cameras are used, it is necessary to grasp foods without damaging them by using external information such as images. In our method, to tackle these problems, a camera is used as an external sensor. First, contact between the hand and the object is detected using the contours of both, obtained from a camera image. Then, to avoid damaging the object, the following information is extracted from camera images and observed: the centroid of both the hand and the object, the deformation of the object, and the occlusion rate of the hand. Furthermore, to prevent the object from dropping while the robotic arm is in motion, the distance between the centroid of the hand and the object is calculated. The experiments were conducted using twelve different food items.
|
|
10:30-12:00, Paper WeAT6-CC.3 | Add to My Program |
Compositional Servoing by Recombining Demonstrations |
|
Argus, Maximilian | University of Freiburg |
Nayak, Abhijeet | University of Freiburg |
Büchner, Martin | University of Freiburg |
Galesso, Silvio | University of Freiburg |
Valada, Abhinav | University of Freiburg |
Brox, Thomas | University of Freiburg |
Keywords: Visual Servoing, Manipulation Planning, Learning from Demonstration
Abstract: Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, but also enables multitask capability based on a few task-specific demonstrations. We construct demonstration graphs by splitting existing demonstrations and recombining them. In order to traverse the demonstration graph in the inference case, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher task-respective success. We present extensive simulation and real-world experimental results that demonstrate the efficacy of our approach.
|
|
10:30-12:00, Paper WeAT6-CC.4 | Add to My Program |
Second-Order Position-Based Visual Servoing of a Robot Manipulator |
|
Godinho Ribeiro, Eduardo | University of São Paulo |
de Queiroz Mendes, Raul | Eindhoven University of Technology (TU/e) |
Terra, Marco Henrique | University of Sao Paulo |
Grassi Junior, Valdir | University of São Paulo |
Keywords: Visual Servoing, Motion Control
Abstract: Visual Servoing is an established approach for controlling robots using visual feedback. Most controllers in this domain generate velocity control signals to guide the cameras to desired positions and orientations. However, the dynamic characteristics of conventional visual servoing controllers may be unsatisfactory, and the velocity signal itself hinders the connection between the feature velocity model and the robot's dynamics. Consequently, research has explored models incorporating the second-order derivative of features and the robot's acceleration. The current state-of-the-art techniques mainly focus on image-based visual servoing, which deals with feature errors in the image domain. In this work, we propose an acceleration-based controller for the position-based visual servoing framework, which models the error in Cartesian space. Our approach involves extracting an acceleration control signal from the traditional velocity-based controller. To achieve this, we redefine the camera orientation using quaternions, generate new interaction matrices, and conduct comprehensive comparative experiments in simulated and real robot scenarios. We show that our method provides better dynamic properties in both image and Cartesian spaces, superior tracking performance, and less sensitivity to noise compared to velocity controllers.
|
|
10:30-12:00, Paper WeAT6-CC.5 | Add to My Program |
Event-Triggered Image Moments Predictive Control for Tracking Evolving Features Using UAVs |
|
Aspragkathos, Sotiris | NTUA |
Karras, George | University of Thessaly |
Kyriakopoulos, Kostas | New York University - Abu Dhabi |
Keywords: Visual Servoing, Visual Tracking, Sensor-based Control
Abstract: This paper presents a novel approach for tracking deformable contour targets using Unmanned Aerial Vehicles (UAVs). The proposed scheme combines image moments descriptor and event-triggered Nonlinear Model Predictive Control (NMPC) for efficient and accurate tracking. The deformable contour model allows adaptation to the evolving target's shape, while the proposed event-triggered scheme achieves improved computational efficiency and extended flight duration while generating new control sequences for the UAV. Real-world experiments validate the scheme, showcasing its robustness in handling complex scenarios. This approach holds promise for various applications, such as surveillance and autonomous navigation.
|
|
10:30-12:00, Paper WeAT6-CC.6 | Add to My Program |
Lattice-Based Shape Tracking and Servoing of Elastic Objects |
|
Shetab-Bushehri, Mohammadreza | Université Clermont Auvergne, Institut Pascal |
Aranda, Miguel | Universidad De Zaragoza |
Mezouar, Youcef | Clermont Auvergne INP - SIGMA Clermont |
Ozgur, Erol | SIGMA-Clermont / Institut Pascal |
Keywords: Visual Servoing, Visual Tracking, Sensor-based Control, Manipulation of Deformable Objects
Abstract: In this paper, we propose a general unified tracking-servoing approach for controlling the shape of elastic deformable objects using robotic arms. Our approach works by forming a lattice around the object, binding the object to the lattice, and tracking and servoing the lattice instead of the object. This makes our approach have full control over the deformation of elastic deformable objects of any general form (linear, thin-shell, volumetric) in 3D space. Furthermore, it decouples the runtime complexity of the approach from the objects’ geometric complexity. Our approach is based on the As-Rigid-As-Possible (ARAP) deformation model. It requires no mechanical parameter of the object to be known and can drive the object toward desired shapes through large deformations. The inputs to our approach are the point cloud of the object’s surface in its rest shape and the point cloud captured by a 3D camera in each frame. Overall, our approach is more broadly applicable than existing approaches. We validate the efficiency of our approach through numerous experiments with elastic deformable objects of various shapes and materials (paper, rubber, plastic, foam).
|
|
10:30-12:00, Paper WeAT6-CC.7 | Add to My Program |
DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs |
|
Zhu, Jiawen | Dalian University of Technology |
Tang, Huayi | Dalian University of Technology |
Cheng, Zhi-Qi | Carnegie Mellon University |
He, Jun-yan | Alibaba Group |
Luo, Bin | Alibaba Group |
Qiu, Shihao | Dalian University of Technology |
Li, Shengming | Dalian University of Technology |
Lu, Huchuan | Dalian University of Technology |
Keywords: Visual Tracking
Abstract: Existing nighttime unmanned aerial vehicle (UAV) trackers follow an “Enhance-then-Track” architecture - first using a light enhancer to brighten the nighttime video, then employing a daytime tracker to locate the object. This separate enhancement and tracking fails to build an end-to-end trainable vision system. To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts. Without a separate enhancer, DCPT directly encodes anti-dark capabilities into prompts using a darkness clue prompter (DCP). Specifically, DCP iteratively learns emphasizing and undermining projections for darkness clues. It then injects these learned visual prompts into a daytime tracker with fixed parameters across transformer layers. Moreover, a gated feature aggregation mechanism enables adaptive fusion between prompts and between prompts and the base model. Extensive experiments show state-of-the-art performance for DCPT on multiple dark scenario benchmarks. The unified end-to-end learning of enhancement and tracking in DCPT enables a more trainable system. The darkness clue prompting efficiently injects anti-dark knowledge without extra modules. Code is available at https://github.com/bearyi26/DCPT.
|
|
10:30-12:00, Paper WeAT6-CC.8 | Add to My Program |
Unifying Foundation Models with Quadrotor Control for Visual Tracking Beyond Object Categories |
|
Saviolo, Alessandro | New York University |
Rao, Pratyaksh | New York University |
Radhakrishnan, Vivek | Technology Innovation Institute, New York University |
Xiao, Jiuhong | New York University |
Loianno, Giuseppe | New York University |
Keywords: Visual Tracking, Aerial Systems: Applications, Vision-Based Navigation
Abstract: Visual control enables quadrotors to adaptively navigate using real-time sensory data, bridging perception with action. Yet, challenges persist, including generalization across scenarios, maintaining reliability, and ensuring real-time responsiveness. This paper introduces a perception framework grounded in foundational models for universal object detection and tracking, moving beyond specific training categories. Integral to our approach is a multi-layered tracker integrated with the foundational detector, ensuring continuous target visibility, even when faced with motion blur, abrupt light shifts, and occlusions. Complementing this, we introduce a model-free controller tailored for resilient quadrotor visual tracking. Our system operates efficiently on limited hardware, relying solely on an onboard camera and an inertial measurement unit. Through extensive validation in diverse challenging indoor and outdoor environments, we demonstrate our system's effectiveness and adaptability. In conclusion, our research represents a step forward in quadrotor visual tracking, moving from task-specific methods to more versatile and adaptable operations.
|
|
10:30-12:00, Paper WeAT6-CC.9 | Add to My Program |
DroneMOT: Drone-Based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects |
|
Wang, Peng | Renmin University of China |
Wang, Yongcai | Renmin University of China |
Li, Deying | Renmin University of China |
Keywords: Visual Tracking, Aerial Systems: Perception and Autonomy, Human Detection and Tracking
Abstract: Multi-object tracking (MOT) on static platforms, such as by surveillance cameras, has achieved significant progress, with various paradigms providing attractive performances. However, the effectiveness of traditional MOT methods is significantly reduced when it comes to dynamic platforms like drones. This decrease is attributed to the distinctive challenges in the MOT-on-drone scenario: (1) objects are generally small in the image plane, often blurred, and frequently occluded, making them challenging to detect and recognize; (2) drones move and see objects from different angles, causing the unreliability of the predicted positions and feature embeddings of the objects. This paper proposes DroneMOT, which firstly proposes a Dual-Domain integrated Attention (DIA) module that considers the fast movements of drones to enhance the drone-based object detection and feature embedding for small-sized, blurred, and occluded objects. Then, an innovative Motion-Driven Association (MDA) scheme is introduced, considering the concurrent movements of both the drone and the objects. Within MDA, an Adaptive Feature Synchronization (AFS) technique is presented to update the object features seen from different angles. Additionally, a Dual Motion-based Prediction (DMP) method is employed to forecast the object positions. Finally, both the refined feature embeddings and the predicted positions are integrated to enhance the object association. Comprehensive evaluations on VisDrone2019-MOT and UAVDT datasets show that DroneMOT provides substantial performance improvements over the state-of-the-art in the domain of MOT on drones.
|
|
WeAT7-CC Oral Session, CC-416 |
Add to My Program |
Learning in Planning |
|
|
Chair: Zhao, Ding | Carnegie Mellon University |
Co-Chair: Hamaya, Masashi | OMRON SINIC X Corporation |
|
10:30-12:00, Paper WeAT7-CC.1 | Add to My Program |
Human-Robot Gym: Benchmarking Reinforcement Learning in Human-Robot Collaboration |
|
Thumm, Jakob | Technical University of Munich |
Trost, Felix | Technical University of Munich |
Althoff, Matthias | Technische Universität München |
Keywords: Reinforcement Learning, Human-Robot Collaboration, Safety in HRI
Abstract: Deep reinforcement learning (RL) has shown promising results in robot motion planning with first attempts in human-robot collaboration (HRC). However, a fair comparison of RL approaches in HRC under the constraint of guaranteed safety is yet to be made. We, therefore, present human-robot gym, a benchmark suite for safe RL in HRC. We provide challenging, realistic HRC tasks in a modular simulation framework. Most importantly, human-robot gym is the first benchmark suite that includes a safety shield to provably guarantee human safety. This bridges a critical gap between theoretic RL research and its real-world deployment. Our evaluation of six tasks led to three key results: (a) the diverse nature of the tasks offered by human-robot gym creates a challenging benchmark for state-of-the-art RL methods, (b) by leveraging expert knowledge in form of an action imitation reward, the RL agent can outperform the expert, and (c) our agents negligibly overfit to training data.
|
|
10:30-12:00, Paper WeAT7-CC.2 | Add to My Program |
Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning Based Local Motion Planners |
|
Ng, Wen Zheng Terence | Nanyang Technological University |
Chen, Jianda | Nanyang Technological University |
Pan, Sinno Jialin | The Chinese University of Hong Kong |
Zhang, Tianwei | Nanyang Technological University |
Keywords: Reinforcement Learning, Collision Avoidance, Machine Learning for Robot Control
Abstract: Deploying a safe mobile robot policy in scenarios with human pedestrians is challenging due to their unpredictable movements. Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements and could suffer from the over-fitting issue. Alternatively, framing the collision avoidance problem as a multi-agent framework, where agents generate dynamic movements while learning to reach their goals, can lead to conflicts with human pedestrians due to their homogeneity. To tackle this problem, we introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. This diversity enriches each agent's experiences, improving its adaptability to unseen crowd behaviors. In assessing an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors. Our behavior-conditioned policies outperform existing works in these challenging scenes, reducing potential collisions without additional time or travel.
|
|
10:30-12:00, Paper WeAT7-CC.3 | Add to My Program |
Human-Aligned Longitudinal Control for Occluded Pedestrian Crossing with Visual Attention |
|
Asodia, Vinal | University of Surrey |
Feng, Zhenhua | University of Surrey |
Fallah, Saber | University of Surrey |
Keywords: Reinforcement Learning, Collision Avoidance, Human-Centered Automation
Abstract: Reinforcement Learning (RL) has been widely used to create generalizable autonomous vehicles. However, they rely on fixed reward functions that struggle to balance values like safety and efficiency. How can autonomous vehicles balance different driving objectives and human values in a constantly changing environment? To bridge this gap, we propose an adaptive reward function that utilizes visual attention maps to detect pedestrians in the driving scene and dynamically switch between prioritizing safety or efficiency depending on the current observation. The visual attention map is used to provide spatial attention to the RL agent to boost the training efficiency of the pipeline. We evaluate the pipeline against variants of an occluded pedestrian crossing scenario in the CARLA Urban Driving simulator. Specifically, the proposed pipeline is compared against a modular setup that combines the well-established object detection model, YOLO, with a Proximal Policy Optimization (PPO) agent. The results indicate that the proposed approach can compete with the modular setup while yielding greater training efficiency. The trajectories collected with the approach confirm the effectiveness of the proposed adaptive reward function.
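A minimal sketch of the adaptive-reward idea, switching between a safety-weighted and an efficiency-weighted combination depending on whether the visual attention map suggests a pedestrian; the threshold and weights are invented for illustration and are not the paper's values.

    def adaptive_reward(attention_map, safety_term, efficiency_term, ped_threshold=0.6):
        """attention_map: 2D array of saliency values; the terms are precomputed scalar rewards.
        Prioritize safety when the attention map suggests a pedestrian is present (sketch only)."""
        pedestrian_likely = attention_map.max() > ped_threshold
        if pedestrian_likely:
            return 0.9 * safety_term + 0.1 * efficiency_term
        return 0.3 * safety_term + 0.7 * efficiency_term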
|
|
10:30-12:00, Paper WeAT7-CC.4 | Add to My Program |
Projection-Based Fast and Safe Policy Optimization for Reinforcement Learning |
|
Lin, Shijun | University of Science and Technology of China |
Wang, Hao | University of Science and Technology of China |
Chen, Ziyang | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Reinforcement Learning, Task and Motion Planning
Abstract: While reinforcement learning (RL) attracts increasing research attention, maximizing the return while keeping the agent safe at the same time remains an open problem. Motivated to address this challenge, this work proposes a new Fast and Safe Policy Optimization (FSPO) algorithm, which consists of three steps: the first step performs a reward-improvement update, the second step projects the policy to the neighborhood of the baseline policy to accelerate the optimization process, and the third step addresses constraint violation by projecting the policy back onto the constraint set. Such projection-based optimization improves convergence and learning performance. Unlike many existing works that require convex approximations of the objectives and constraints, this work exploits a first-order method to avoid expensive computations and high-dimensional issues, enabling fast and safe policy optimization, especially for challenging tasks. Numerical simulations and physical experiments demonstrate that FSPO outperforms existing methods in terms of safety guarantees and task completion rate.
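The two projection steps can be illustrated with standard Euclidean projections; a minimal numpy sketch with made-up parameter vectors (FSPO itself works on policy parameters with first-order approximations of the objective and cost, so this is only the geometric idea):

```python
import numpy as np

def project_to_ball(theta, center, radius):
    """Project theta onto an L2 ball around a baseline policy (trust region)."""
    d = theta - center
    n = np.linalg.norm(d)
    return theta if n <= radius else center + radius * d / n

def project_to_halfspace(theta, g, c):
    """Project theta onto the linearized constraint set {x : g @ x + c <= 0}."""
    violation = g @ theta + c
    if violation <= 0:
        return theta
    return theta - violation * g / (g @ g)

theta_after_reward_step = np.array([1.2, -0.4, 0.9])   # hypothetical parameters
theta_baseline = np.zeros(3)
theta = project_to_ball(theta_after_reward_step, theta_baseline, radius=1.0)
theta = project_to_halfspace(theta, g=np.array([1.0, 0.0, 1.0]), c=-1.5)
```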
|
|
10:30-12:00, Paper WeAT7-CC.5 | Add to My Program |
Symmetry Considerations for Learning Task Symmetric Robot Policies |
|
Mittal, Mayank | ETH Zurich |
Rudin, Nikita | ETH Zurich, NVIDIA |
Klemm, Victor | ETH Zurich |
Allshire, Arthur | University of Toronto |
Hutter, Marco | ETH Zurich |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Symmetry is a fundamental aspect of many real-world robotic tasks. However, current deep reinforcement learning (DRL) approaches can seldom harness and exploit symmetry effectively. Often, the learned behaviors fail to achieve the desired transformation invariances and suffer from motion artifacts. For instance, a quadruped may exhibit different gaits when commanded to move forward or backward, even though it is symmetrical about its torso. This issue becomes further pronounced in high-dimensional or complex environments, where DRL methods are prone to local optima and fail to explore regions of the state space equally. Past methods for encouraging symmetry in robotic tasks have studied this topic mainly in a single-task setting, where symmetry usually refers to symmetry in the motion, such as the gait patterns. In this paper, we revisit this topic for goal-conditioned tasks in robotics, where symmetry lies mainly in task execution and not necessarily in the learned motions themselves. In particular, we investigate two approaches to incorporate symmetry invariance into DRL: data augmentation and a mirror loss function. We provide a theoretical foundation for using augmented samples in an on-policy setting. Based on this, we show that the corresponding approach achieves faster convergence and improves the learned behaviors in various challenging robotic tasks, from climbing boxes with a quadruped to dexterous manipulation.
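For intuition, symmetry-based data augmentation simply mirrors each collected transition and adds it to the training batch; a minimal sketch assuming a left-right symmetric robot whose mirrored state and action are obtained by flipping the sign of selected dimensions (the index sets are robot-specific and hypothetical here):

```python
import numpy as np

def mirror(x, flip_idx):
    """Return the mirrored counterpart of a state or action vector."""
    y = x.copy()
    y[flip_idx] *= -1.0
    return y

def augment_batch(states, actions, flip_state_idx, flip_action_idx):
    """Append mirrored copies of every transition to the on-policy batch."""
    mirrored_s = np.stack([mirror(s, flip_state_idx) for s in states])
    mirrored_a = np.stack([mirror(a, flip_action_idx) for a in actions])
    return np.concatenate([states, mirrored_s]), np.concatenate([actions, mirrored_a])

states = np.random.randn(4, 6)
actions = np.random.randn(4, 2)
aug_s, aug_a = augment_batch(states, actions, flip_state_idx=[1, 3], flip_action_idx=[1])
```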
|
|
10:30-12:00, Paper WeAT7-CC.6 | Add to My Program |
Learning Dual-Arm Object Rearrangement for Cartesian Robots |
|
Zhang, Shishun | National University of Defense Technology |
She, Qijin | National University of Defense Technology |
Li, Wenhao | National University of Defense Technology |
Zhu, Chenyang | National University of Defense Technology |
Wang, Yongjun | National University of Defense Technology |
Hu, Ruizhen | Shenzhen University |
Xu, Kai | National University of Defense Technology |
Keywords: Reinforcement Learning, Task and Motion Planning
Abstract: This work focuses on the dual-arm object rearrangement problem abstracted from a realistic industrial scenario involving Cartesian robots. The goal is to transfer all objects from sources to targets with the minimum total completion time. To achieve this, the core idea is to develop an effective object-to-arm task assignment strategy that minimizes the cumulative task execution time and maximizes dual-arm cooperation efficiency. One of the difficulties in task assignment is scalability: as the number of objects increases, the computation time of traditional offline-search-based methods grows rapidly due to their computational complexity. Encouraged by the adaptability of reinforcement learning (RL) in long-sequence task decisions, we propose an online RL-based task assignment method whose computation time increases only linearly with the number of objects. Further, we design an attention-based network to model the dependencies between the input states during the whole task execution process, helping to find the most reasonable object-to-arm correspondence in each task assignment round. In the experiments, we adapt several search-based methods to this specific setting and compare our method against them. The results show that our approach outperforms search-based methods in total execution time and computational efficiency, and they also verify the generalization of our method to different numbers of objects. In addition, we show the effectiveness of our method deployed on a real robot in the supplementary video.
|
|
10:30-12:00, Paper WeAT7-CC.7 | Add to My Program |
Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration |
|
Li, Jinning | University of California, Berkeley |
Liu, Xinyi | University of Michigan |
Zhu, Banghua | University of California, Berkeley |
Jiao, Jiantao | University of California, Berkeley |
Tomizuka, Masayoshi | University of California |
Tang, Chen | University of California Berkeley |
Zhan, Wei | University of California, Berkeley |
Keywords: Reinforcement Learning, Learning from Demonstration, Robot Safety
Abstract: Safe Reinforcement Learning (RL) aims to find a policy that achieves high rewards while satisfying cost constraints. When learning from scratch, safe RL agents tend to be overly conservative, which impedes exploration and restrains the overall performance. In many realistic tasks, e.g., autonomous driving, large-scale expert demonstration data are available. We argue that extracting an expert policy from offline data to guide online exploration is a promising solution to mitigate this conservativeness. Large-capacity models, e.g., decision transformers (DT), have been proven to be competent in offline policy learning. However, data collected in real-world scenarios rarely contain dangerous cases (e.g., collisions), which makes it difficult for such policies to learn safety concepts. Moreover, these bulky policy networks cannot meet the computation speed requirements at inference time for real-world tasks such as autonomous driving. To this end, we propose Guided Online Distillation (GOLD), an offline-to-online safe RL framework. GOLD distills an offline DT policy into a lightweight policy network through guided online safe RL training, which outperforms both the offline DT policy and online safe RL algorithms. Experiments on both benchmark safe RL tasks and real-world driving tasks based on the Waymo Open Motion Dataset (WOMD) demonstrate that GOLD can successfully distill lightweight policies and solve decision-making problems in challenging safety-critical scenarios.
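Conceptually, the distillation step amounts to regularizing the lightweight student policy toward the frozen offline teacher while the student is trained online; a minimal sketch with hypothetical teacher and student actions (GOLD's actual objective also includes the safe-RL cost terms, which are omitted here):

```python
import numpy as np

def distillation_loss(student_action, teacher_action):
    """Mean-squared imitation loss pulling the student toward the teacher."""
    return float(np.mean((student_action - teacher_action) ** 2))

def guided_loss(rl_loss, student_action, teacher_action, beta=0.5):
    """Online RL loss plus a teacher-guidance term weighted by beta."""
    return rl_loss + beta * distillation_loss(student_action, teacher_action)

loss = guided_loss(rl_loss=0.8,
                   student_action=np.array([0.2, -0.1]),
                   teacher_action=np.array([0.3, -0.2]))
```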
|
|
10:30-12:00, Paper WeAT7-CC.8 | Add to My Program |
Sample-Efficient Learning to Solve a Real-World Labyrinth Game Using Data-Augmented Model-Based Reinforcement Learning |
|
Bi, Thomas | ETH Zurich |
D'Andrea, Raffaello | ETHZ |
Keywords: Reinforcement Learning, Engineering for Robotic Systems, Visual Learning
Abstract: Motivated by the challenge of achieving rapid learning in physical environments, this paper presents the development and training of a robotic system designed to navigate and solve a labyrinth game using model-based reinforcement learning techniques. The method involves extracting low-dimensional observations from camera images, along with a cropped and rectified image patch centered on the current position within the labyrinth, providing valuable information about the labyrinth layout. The learning of a control policy is performed purely on the physical system using model-based reinforcement learning, where the progress along the labyrinth's path serves as a reward signal. Additionally, we exploit the system's inherent symmetries to augment the training data. Consequently, our approach learns to successfully solve a popular real-world labyrinth game in record time, with only 5 hours of real-world training data.
|
|
10:30-12:00, Paper WeAT7-CC.9 | Add to My Program |
Active Neural Topological Mapping for Multi-Agent Exploration |
|
Yang, Xinyi | Tsinghua University |
Yang, Yuxiang | Tsinghua University |
Yu, Chao | Tsinghua University |
Chen, Jiayu | Tsinghua University |
Jincheng, Yu | Tsinghua University |
Ren, Haibing | Meituan Inc |
Yang, Huazhong | Tsinghua University |
Wang, Yu | Tsinghua University |
Keywords: Reinforcement Learning, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper investigates the multi-agent cooperative exploration problem, which requires multiple agents to explore an unseen environment via sensory signals in a limited time. A popular approach to exploration tasks is to combine active mapping with planning. Metric maps capture the details of the spatial representation, but are memory intensive and may vary significantly between scenarios, resulting in inferior generalization. Topological maps are a promising alternative as they consist only of nodes and edges with abstract but essential information and are less influenced by the scene structures. However, most existing topology-based exploration methods utilize classical planning approaches, which are time-consuming and sub-optimal due to their handcrafted design. Deep reinforcement learning (DRL) has shown great potential for learning (near-)optimal policies through fast end-to-end inference. In this paper, we propose Multi-Agent Neural Topological Mapping (MANTM) to improve exploration efficiency and generalization for multi-agent exploration tasks. MANTM mainly comprises a Topological Mapper and a novel RL-based Hierarchical Topological Planner (HTP). The Topological Mapper employs a visual encoder and distance-based heuristics to construct a graph containing main nodes and their corresponding ghost nodes. The HTP leverages graph neural networks to capture correlations between agents and graph nodes in a coarse-to-fine manner for effective global goal selection. Extensi
|
|
WeAT8-CC Oral Session, CC-418 |
Add to My Program |
Learning in Grasping and Manipulation I |
|
|
Chair: Tahara, Kenji | Kyushu University |
Co-Chair: Zhang, Haichao | Horizon Robotics |
|
10:30-12:00, Paper WeAT8-CC.1 | Add to My Program |
Efficient Multi-Task and Transfer Reinforcement Learning with Parameter-Compositional Framework |
|
Sun, Lingfeng | University of California, Berkeley |
Zhang, Haichao | Horizon Robotics |
Xu, Wei | Horizon Robotics |
Tomizuka, Masayoshi | University of California |
Keywords: Reinforcement Learning, Transfer Learning, Deep Learning in Grasping and Manipulation
Abstract: In this work, we investigate the potential of improving multi-task training and leveraging it for transfer in the reinforcement learning setting. We identify several challenges towards this goal and propose a transfer approach with a parameter-compositional formulation. We investigate ways to improve the training of multi-task reinforcement learning, which serves as the foundation for transfer. Then we conduct a number of transfer experiments on various manipulation tasks. Experimental results demonstrate that the proposed approach improves performance in the multi-task training stage and further shows effective transfer in terms of both sample efficiency and performance.
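One way to picture a parameter-compositional formulation is to represent each task's policy parameters as a learned combination of a shared parameter basis; a minimal numpy sketch in which the basis size, task weights, and parameter dimension are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
num_params, num_basis, num_tasks = 128, 4, 10

Phi = rng.normal(size=(num_params, num_basis))   # shared parameter basis
W = rng.normal(size=(num_basis, num_tasks))      # per-task composition weights

def task_parameters(task_id):
    """Compose the policy parameter vector for one task."""
    return Phi @ W[:, task_id]

# Transfer to a new task: keep the shared basis Phi fixed and learn only
# a new low-dimensional weight vector for the new task.
w_new = rng.normal(size=num_basis)
theta_new_task = Phi @ w_new
```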
|
|
10:30-12:00, Paper WeAT8-CC.2 | Add to My Program |
Goal-Conditioned Reinforcement Learning with Disentanglement-Based Reachability Planning |
|
Qian, Zhifeng | Tongji University |
Mingyu, You | Tongji |
Hongjun, Zhou | Tongji University |
Xu, Xuanhui | TongJi University |
He, Bin | Tongji University |
Keywords: Reinforcement Learning, Representation Learning, Manipulation Planning
Abstract: Goal-Conditioned Reinforcement Learning (GCRL) can enable agents to spontaneously set diverse goals to learn a set of skills. Despite the excellent works proposed in various fields, reaching distant goals in temporally extended tasks remains a challenge for GCRL. Current works tackle this problem by leveraging planning algorithms to plan intermediate subgoals that augment GCRL. These methods rely on two crucial components: (i) a state representation space in which to search for valid subgoals, and (ii) a distance function to measure the reachability of subgoals. However, they struggle to scale to high-dimensional state spaces due to their non-compact representations. Moreover, they cannot collect high-quality training data through standard GC policies, which results in an inaccurate distance function. Both issues affect the efficiency and performance of planning and policy learning. In this paper, we propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks. In REPlan, a Disentangled Representation Module (DRM) is proposed to learn compact representations that disentangle robot poses and object positions from high-dimensional observations in a self-supervised manner. A simple REachability discrimination Module (REM) is also designed to determine the temporal distance of subgoals. Moreover, REM computes intrinsic bonuses to encourage the collection of novel states for training. We evaluate our REPlan in three vision-
|
|
10:30-12:00, Paper WeAT8-CC.3 | Add to My Program |
KINet: Unsupervised Forward Models for Robotic Pushing Manipulation |
|
Rezazadeh, Alireza | University of Minnesota |
Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Representation Learning, Deep Learning Methods, Manipulation Planning
Abstract: Object-centric representation is an essential abstraction for forward prediction. Most existing forward models learn this representation through extensive supervision (e.g., object class and bounding box) although such ground-truth information is not readily accessible in reality. To address this, we introduce KINet (Keypoint Interaction Network), an end-to-end unsupervised framework to reason about object interactions based on a keypoint representation. Using visual observations, our model learns to associate objects with keypoint coordinates and discovers a graph representation of the system as a set of keypoint embeddings and their relations. It then learns an action-conditioned forward model using contrastive estimation to predict future keypoint states. By learning to perform physical reasoning in the keypoint space, our model automatically generalizes to scenarios with a different number of objects, novel backgrounds, and unseen object geometries. Experiments demonstrate the effectiveness of our model in accurately performing forward prediction and learning plannable object-centric representations for downstream robotic pushing manipulation tasks.
|
|
10:30-12:00, Paper WeAT8-CC.4 | Add to My Program |
Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic Manipulation Tasks |
|
Triantafyllidis, Eleftherios | The University of Edinburgh |
Christianos, Filippos | University of Edinburgh |
Li, Zhibin (Alex) | University College London |
Keywords: Deep Learning Methods, Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Current reinforcement learning algorithms struggle in sparse and complex environments, most notably in long-horizon manipulation tasks entailing a plethora of different sequences. In this work, we propose the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework. By leveraging LLMs as an assistive intrinsic reward, IGE-LLMs guides the exploratory process in reinforcement learning to address intricate, long-horizon robotic manipulation tasks with sparse rewards. We evaluate our framework and related intrinsic learning methods in an environment challenged by exploration, and in a complex robotic manipulation task challenged by both exploration and long horizons. Results show that IGE-LLMs (i) exhibit notably higher performance over related intrinsic methods and the direct use of LLMs in decision-making, (ii) can be combined with and complement existing learning methods, highlighting its modularity, (iii) are fairly insensitive to different intrinsic scaling parameters, and (iv) maintain robustness against increased levels of uncertainty and longer horizons.
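The core idea of an assistive intrinsic reward can be sketched as adding a scaled LLM-derived score to the environment reward; the `llm_progress_score` function below is a hypothetical placeholder, and how IGE-LLMs actually queries the LLM and scales its output is specific to the paper:

```python
def llm_progress_score(observation_text: str) -> float:
    """Placeholder for an LLM query that rates task progress in [0, 1]."""
    return 0.0  # hypothetical; a real system would call a language model here

def shaped_reward(extrinsic: float, observation_text: str,
                  weight: float = 0.1) -> float:
    """Extrinsic (possibly sparse) reward plus an LLM-guided intrinsic bonus."""
    return extrinsic + weight * llm_progress_score(observation_text)

r = shaped_reward(extrinsic=0.0, observation_text="gripper above the red block")
```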
|
|
10:30-12:00, Paper WeAT8-CC.5 | Add to My Program |
Touch-Based Manipulation with Multi-Fingered Robot Using Off-Policy RL and Temporal Contrastive Learning |
|
Morihira, Naoki | Honda R&D, Ltd |
Deo, Pranav | Honda R&D Co. Ltd |
Bhadu, Manoj | Honda R&D Co. Ltd |
Hayashi, Akinobu | Honda R&D Co., Ltd |
Hasegawa, Tadaaki | Honda R&D Co., Ltd |
Otsubo, Satoshi | Honda R&D |
Osa, Takayuki | University of Tokyo |
Keywords: Reinforcement Learning, In-Hand Manipulation, Dexterous Manipulation
Abstract: Tactile information holds promise for enhancing the manipulation capabilities of multi-fingered robots. In tasks such as in-hand manipulation, where robots frequently switch between contact and non-contact states, it is important to address the partial observability of tactile sensors and to properly consider the history of observations and actions. Previous studies have shown that Recurrent Neural Networks (RNNs) can be used to learn latent representations for handling observation and action histories. However, this approach is usually combined with on-policy reinforcement learning (RL) and suffers from low sample efficiency. Integrating RNNs with off-policy RL could enhance sample efficiency, but this often compromises stability and robustness, especially as the dimensions of observation and action increase. This paper presents a temporal contrastive learning approach tailored for off-policy RL. Our method incorporates a temporal contrastive model and introduces a surrogate loss to extract task-related latent representations, enhancing the pursuit of the optimal policy. Simulations and real-robot experiments demonstrate that our proposed method outperforms RNN-based approaches.
|
|
10:30-12:00, Paper WeAT8-CC.6 | Add to My Program |
Learning Language-Conditioned Deformable Object Manipulation with Graph Dynamics |
|
Deng, Yuhong | National University of Singapore |
Mo, Kai | Tsinghua University, Shenzhen International Graduate School |
Xia, Chongkun | Tsinghua University |
Wang, Xueqian | Center for Artificial Intelligence and Robotics, Graduate School |
Keywords: Manipulation Planning, Deep Learning in Grasping and Manipulation, Dexterous Manipulation
Abstract: Multi-task learning of deformable object manipulation is a challenging problem in robot manipulation. Most previous works address this problem in a goal-conditioned way and adopt goal images to specify different tasks, which limits multi-task learning performance and cannot generalize to new tasks. Thus, we adopt language instructions to specify deformable object manipulation tasks and propose a learning framework. We first design a unified Transformer-based architecture to understand multi-modal data and output picking and placing actions. In addition, we apply a visible connectivity graph to tackle the nonlinear dynamics and complex configurations of deformable objects. Both simulated and real experiments demonstrate that the proposed method is effective and can generalize to unseen instructions and tasks. Compared with the state-of-the-art method, our method achieves higher success rates (87.2% on average) and has a 75.6% shorter inference time. We also demonstrate that our method performs well in real-world experiments. Supplementary videos can be found at https://sites.google.com/view/language-deformable.
|
|
10:30-12:00, Paper WeAT8-CC.7 | Add to My Program |
Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects |
|
Mosbach, Malte | University of Bonn |
Behnke, Sven | University of Bonn |
Keywords: Reinforcement Learning, Grasping, Sensorimotor Learning
Abstract: Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement learning and policy distillation. After training a teacher policy to master motor control based on object pose information, TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy based on object segmentation. We zero-shot transfer from simulation to a real robot by using the Segment Anything Model for promptable object segmentation. Our trained policies adeptly grasp a wide variety of objects from cluttered scenarios in simulation and the real world based on human-understandable prompts. Furthermore, we show robust zero-shot transfer to novel objects. Videos of our experiments are available at https://maltemosbach.github.io/grasp_anything.
|
|
10:30-12:00, Paper WeAT8-CC.8 | Add to My Program |
Composable Interaction Primitives: A Structured Policy Class for Efficiently Learning Sustained-Contact Manipulation Skills |
|
Abbatematteo, Ben | University of Texas at Austin |
Rosen, Eric | Brown University |
Thompson, Skye | MIT |
Akbulut, Mete Tuluhan | Bogazici University |
Rammohan, Sreehari | Brown University |
Konidaris, George | Brown University |
Keywords: Reinforcement Learning, Integrated Planning and Learning, Deep Learning in Grasping and Manipulation
Abstract: We propose a new policy class, Composable Interaction Primitives (CIPs), specialized for learning sustained-contact manipulation skills like opening a drawer, pulling a lever, turning a wheel, or shifting gears. CIPs have two primary design goals: to minimize what must be learned by exploiting structure present in the world and the robot, and to support sequential composition by construction, so that learned skills can be used by a task-level planner. Using an ablation experiment in four simulated manipulation tasks, we show that the structure included in CIPs substantially improves the efficiency of motor skill learning. We then show that CIPs can be used for plan execution in a zero-shot fashion by sequencing learned skills. We validate our approach on real robot hardware by learning and sequencing two manipulation skills.
|
|
WeAT9-CC Oral Session, CC-419 |
Add to My Program |
Collision Avoidance I |
|
|
Chair: Wang, Zhuping | Tongji University |
Co-Chair: Albu-Schäffer, Alin | DLR - German Aerospace Center |
|
10:30-12:00, Paper WeAT9-CC.1 | Add to My Program |
CollisionGP: Gaussian Process-Based Collision Checking for Robot Motion Planning |
|
Muñoz Mendi, Javier | Universidad Carlos III De Madrid |
Lehner, Peter | German Aerospace Center (DLR) |
Moreno, Luis | Carlos III University |
Albu-Schäffer, Alin | DLR - German Aerospace Center |
Roa, Maximo A. | German Aerospace Center (DLR) |
Keywords: Motion and Path Planning, Collision Avoidance, Probabilistic Inference
Abstract: Collision checking is the primitive operation of motion planning that consumes the most time. Machine learning algorithms have been shown to accelerate collision checking. We propose CollisionGP, a Gaussian process-based algorithm for modeling a robot's configuration space and querying collision checks. CollisionGP introduces a Pólya-Gamma auxiliary variable for each data point in the training set to allow classification inference to be done exactly with a closed-form expression. Gaussian processes provide a distribution as the output, yielding a mean and variance for each collision check. The obtained variance is processed to reduce false negatives (FN). We demonstrate that CollisionGP can use GPU acceleration to process collision checks for thousands of configurations much faster than traditional collision detection libraries. Furthermore, we obtain better accuracy, TPR, and TNR results than state-of-the-art learning-based algorithms while using fewer support points, making our proposed method more sparse.
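The variance-aware decision rule can be illustrated with a standard GP regressor on ±1 collision labels: a configuration is declared collision-free only when the predicted mean minus a multiple of the predicted standard deviation stays positive. This is a simplified stand-in for the paper's exact Pólya-Gamma classification inference, and the data here is synthetic:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
q_train = rng.uniform(-1, 1, size=(200, 2))                             # robot configurations
y_train = np.where(np.linalg.norm(q_train, axis=1) > 0.5, 1.0, -1.0)    # +1 = collision-free

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-2)
gp.fit(q_train, y_train)

def is_collision_free(q, kappa=2.0):
    """Conservative check: penalize predictive uncertainty to reduce false negatives."""
    mean, std = gp.predict(q.reshape(1, -1), return_std=True)
    return bool(mean[0] - kappa * std[0] > 0.0)

print(is_collision_free(np.array([0.9, 0.9])))
```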
|
|
10:30-12:00, Paper WeAT9-CC.2 | Add to My Program |
Probabilistic Motion Planning and Prediction Via Partitioned Scenario Replay |
|
de Groot, Oscar | Delft University of Technology |
Sridharan, Anish | Starnus Technology |
Alonso-Mora, Javier | Delft University of Technology |
Ferranti, Laura | Delft University of Technology |
Keywords: Collision Avoidance, Planning under Uncertainty, Optimization and Optimal Control
Abstract: Autonomous mobile robots require predictions of human motion to plan a safe trajectory that avoids them. Because human motion cannot be predicted exactly, future trajectories are typically inferred from real-world data via learning-based approximations. These approximations provide useful information on the pedestrian’s behavior, but may deviate from the data, which can lead to collisions during planning. In this work, we introduce a joint prediction and planning framework, Partitioned Scenario Replay (PSR), that stores and partitions previously observed human trajectories, referred to as scenarios. During planning, scenarios observed in similar situations are reintroduced (or replayed) as motion predictions. By sampling real data and by building on scenario optimization and predictive control, the planner provides probabilistic collision avoidance guarantees in the real-world. Relying on this guarantee to remain safe, PSR can incrementally improve its prediction and planning performance online. We demonstrate our approach on a mobile robot navigating around pedestrians.
|
|
10:30-12:00, Paper WeAT9-CC.3 | Add to My Program |
Prescient Collision-Free Navigation of Mobile Robots with Iterative Multimodal Motion Prediction of Dynamic Obstacles |
|
Zhang, Ze | Chalmers University of Technology |
Hajieghrary, Hadi | Magna International |
Dean, Emmanuel | Chalmers University of Technology |
Akesson, Knut | Chalmers University of Technology |
Keywords: Collision Avoidance, Deep Learning Methods, AI-Based Methods
Abstract: To explore safe interactions between a mobile robot and dynamic obstacles, this paper presents a comprehensive approach to collision-free navigation in dynamic indoor environments. The approach integrates multimodal motion predictions of dynamic obstacles with predictive control for obstacle avoidance. Multimodal Motion Prediction (MMP) is achieved by a deep-learning method that predicts multiple plausible future positions. By repeating the MMP for each time offset in the future, multi-time-step multimodal motion predictions are obtained. A nonlinear Model Predictive Control (MPC) solver utilizes the prediction outcomes to achieve collision-free trajectory tracking for the mobile robot. The proposed integration of multimodal motion prediction and trajectory tracking outperforms other non-deep-learning methods in complex scenarios. The approach enables safe interaction between the mobile robot and stochastic dynamic obstacles.
|
|
10:30-12:00, Paper WeAT9-CC.4 | Add to My Program |
GPU-Accelerated Optimization-Based Collision Avoidance |
|
Wu, Zeming | Tongji University |
Wang, Zhuping | Tongji University |
Zhang, Hao | Tongji University |
Keywords: Motion and Path Planning, Collision Avoidance, Constrained Motion Planning
Abstract: This paper proposes a GPU-accelerated optimization framework for collision avoidance problems in which the controlled objects and the obstacles can be modeled as finite unions of convex polyhedra. A novel collision avoidance constraint is proposed based on scale-based collision detection and the strong duality of convex optimization. Under this constraint, the high-dimensional non-convex optimization problems of collision avoidance can be decomposed into several low-dimensional quadratic programs (QPs) following the alternating direction method of multipliers (ADMM) paradigm. Furthermore, these low-dimensional QPs can be solved in parallel on GPUs, significantly reducing computation time. High-fidelity simulations are conducted to validate the proposed method's effectiveness and practicality.
|
|
10:30-12:00, Paper WeAT9-CC.5 | Add to My Program |
Learn to Navigate in Dynamic Environments with Normalized LiDAR Scans |
|
Zhu, Wei | Tohoku University |
Hayashibe, Mitsuhiro | Tohoku University |
Keywords: Collision Avoidance, Human-Aware Motion Planning, Reinforcement Learning
Abstract: The latest robot navigation methods for dynamic environments assume that the states of obstacles, including their geometries and trajectories, are fully observable. While it is easy to obtain these states accurately in simulation, it is exceedingly challenging in the real world. Therefore, a viable alternative is to directly map raw sensor observations to robot actions. However, acquiring skills from high-dimensional raw observations demands massive neural networks and extended training periods. Furthermore, there are discrepancies between simulated and real environments that impede real-world implementations. To overcome these limitations, we propose a Learning framework for robot Navigation in Dynamic environments that uses sequential Normalized LiDAR (LNDNL) scans. We employ long short-term memory (LSTM) to propagate historical environmental information from the sequential LiDAR observations. Additionally, we customize a LiDAR-integrated simulator to speed up sampling and normalize the geometry of real-world obstacles to match that of simulated objects, thereby bridging the sim-to-real gap. Our extensive comparisons with state-of-the-art baselines and real-world implementations demonstrate the potential of learning to navigate in dynamic environments using raw sensor observations and sim-to-real transfer.
|
|
10:30-12:00, Paper WeAT9-CC.6 | Add to My Program |
Learning Terminal State of the Trajectory Planner: Application for Collision Scenarios of Autonomous Vehicles |
|
Lim, Joonhee | KAIST |
Lee, Kibeom | Gachon University |
Shin, Jangho | Hyundai Motor Company |
Kum, Dongsuk | KAIST |
Keywords: Collision Avoidance, Integrated Planning and Learning, Motion and Path Planning
Abstract: A Collision Avoidance/Mitigation System (CAMS) for autonomous vehicles is a crucial technology that ensures the safety and reliability of autonomous driving systems. Conventional collision avoidance approaches, which rely on rules designed for specific collision scenarios, struggle in complex and varied situations. This has led to learning-based methods that use neural networks for adaptive collision avoidance. However, approaches that directly output control inputs through neural networks have drawbacks in interpretability and stability. To address these limitations, we propose a trajectory planning method for CAMS that combines deep reinforcement learning (DRL) and quintic polynomial (QP) trajectory planning. The proposed method determines the terminal state and confidence of the trajectory using DRL and plans a quintic-polynomial trajectory based on them. By using the terminal state and confidence of the trajectory, rather than direct control inputs, as the output of the neural network, the method generates a more realistic and continuous path. Moreover, this approach considers collision avoidance and mitigation in an integrated manner through the RL reward function. Our experimental results demonstrate that the proposed method not only improves interpretability and stability compared to existing learning-based methods but also maintains performance in complex and varied collision scenarios.
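The quintic-polynomial step is standard: given the current state and a terminal state (here the DRL output), the six coefficients per axis follow from the boundary conditions on position, velocity, and acceleration. A minimal sketch with placeholder terminal values:

```python
import numpy as np

def quintic_coeffs(p0, v0, a0, pT, vT, aT, T):
    """Solve for x(t) = c0 + c1 t + ... + c5 t^5 matching the boundary conditions."""
    A = np.array([
        [1, 0,    0,      0,       0,        0],
        [0, 1,    0,      0,       0,        0],
        [0, 0,    2,      0,       0,        0],
        [1, T,    T**2,   T**3,    T**4,     T**5],
        [0, 1,    2*T,    3*T**2,  4*T**3,   5*T**4],
        [0, 0,    2,      6*T,     12*T**2,  20*T**3],
    ], dtype=float)
    b = np.array([p0, v0, a0, pT, vT, aT], dtype=float)
    return np.linalg.solve(A, b)

# The terminal state (pT, vT, aT) would come from the learned policy.
c = quintic_coeffs(p0=0.0, v0=5.0, a0=0.0, pT=3.5, vT=4.0, aT=0.0, T=2.0)
t = np.linspace(0.0, 2.0, 50)
trajectory = sum(c[i] * t**i for i in range(6))
```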
|
|
10:30-12:00, Paper WeAT9-CC.7 | Add to My Program |
History-Aware Planning for Risk-Free Autonomous Navigation on Unknown Uneven Terrain |
|
Wang, Yinchuan | Shandong University |
Du, Nianfei | Shandong University |
Qin, Yongsen | Shandong University |
Zhang, Xiang | School of Control Science and Engineering, Shandong University |
Song, Rui | Shandong University |
Wang, Chaoqun | Shandong University |
Keywords: Collision Avoidance, Planning under Uncertainty, Autonomous Vehicle Navigation
Abstract: It is challenging for a mobile robot to achieve autonomous, mapless navigation in unknown environments with uneven terrain. In this study, we present a layered and systematic pipeline. At the local level, we maintain a tree structure that is dynamically extended during navigation. This structure unifies planning with terrain identification. Moreover, it helps explicitly identify hazardous areas on uneven terrain. In particular, certain nodes of the tree are consistently kept to form a sparse graph at the global level, which records the history of the exploration. A series of subgoals obtained from the tree and the graph is used to guide the navigation. To determine a subgoal, we develop an evaluation method whose input elements can be efficiently obtained on the layered structure. We conduct both simulation and real-world experiments to evaluate the developed method and its key modules. The experimental results demonstrate the effectiveness and efficiency of our method. The robot can travel through unknown uneven regions safely and reach the target rapidly without a preconstructed map.
|
|
WeAT10-CC Oral Session, CC-501 |
Add to My Program |
Soft Sensors and Actuators II |
|
|
Chair: Hughes, Josie | EPFL |
Co-Chair: Shi, Chaoyang | Tianjin University |
|
10:30-12:00, Paper WeAT10-CC.1 | Add to My Program |
Multi-Tap Resistive Sensing and FEM Modeling Enables Shape and Force Estimation in Soft Robots |
|
Cangan, Barnabas Gavin | ETH Zurich |
Tian, Sizhe | Inria, Université De Lille |
Escaida Navarro, Stefan | Universidad De O'Higgins |
Beger, Artem | Festo SE & Co. KG |
Duriez, Christian | INRIA |
Katzschmann, Robert Kevin | ETH Zurich |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Grasping
Abstract: We tackle the problem of proprioception in soft robots, specifically soft grippers with tight packaging constraints, relying only on intrinsic sensors. While various sensing approaches have been studied towards curvature estimation, we look into being able to sense local deformations. To accomplish this, we use a widely available, off-the-shelf resistive sensor and multi-tap this sensor, i.e. make multiple electrical connections onto the resistive layer of the sensor. This allows us to measure changes in resistance at multiple segments throughout the length of the sensor, providing improved resolution of local deformations in the soft body. These measurements inform a finite-element-method (FEM) based model to then estimate the shape of the soft body and the magnitude of an external force acting at a known arbitrary location. Our model-based approach estimates soft body deformation with approximately 3% average relative error and taking into account internal fluidic actuation, our estimate of external force disturbance has 11% relative error within a 5 N range. The combined sensing and modeling approach can be integrated into soft manipulation platforms to enable features such as identifying shape and material properties of an object being grasped. Such manipulators can benefit from the softness and compliance while being proprioceptive relying only on embedded sensing and not on external systems such as motion capture, which is essential for deployment in real-world scena
|
|
10:30-12:00, Paper WeAT10-CC.2 | Add to My Program |
Learning Motion Reconstruction from Demonstration Via Multi-Modal Soft Tactile Sensing |
|
Pan, Cheng | Swiss Federal Institute of Technology Lausanne (EPFL) |
Gilday, Kieran | EPFL |
Sologuren, Emily | MIT |
Junge, Kai | École Polytechnique Fédérale De Lausanne |
Hughes, Josie | EPFL |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Learning from Demonstration
Abstract: Learning manipulation from demonstration is a key way for humans to teach complex tasks. However, this domain mainly focuses on kinesthetic teaching and does not consider imitation of interaction forces, which is essential for more contact-rich tasks. We propose a framework that enables robotic imitation of contact from human demonstration using a wearable fingertip sensor. By developing a multi-modal sensor (providing both force and contact location) and robotically collecting simple training data of different motion primitives (tapping, rotation, and translation), an LSTM-based model can be used to replicate motion from tactile demonstration only. To evaluate this approach, we explore the performance on increasingly complex testing data generated by a robot, and we also demonstrate the full pipeline from human demonstration via the sensor used as a wearable device. This approach of using tactile sensing as a means of inferring the required robot motion paves the way for imitation of more contact-rich tasks, and enables imitation of tasks where the demonstration and imitation are performed with different body schemas.
|
|
10:30-12:00, Paper WeAT10-CC.3 | Add to My Program |
A Generalized Motion Control Framework of Dielectric Elastomer Actuators: Dynamic Modeling, Sliding-Mode Control and Experimental Evaluation |
|
Zou, Jiang | Shanghai Jiao Tong University |
Kassim, Shakiru Olajide | School of Engineering, University of Aberdeen, Scotland |
Ren, Jieji | Shanghai Jiao Tong University |
Vaziri, Vahid | University of Aberdeen |
Aphale, Sumeet S. | University of Aberdeen |
Gu, Guoying | Shanghai Jiao Tong University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Motion Control, Dielectric elastomer actuators
Abstract: The continuous electromechanical deformation of dielectric elastomer actuators (DEAs) suffers from rate-dependent viscoelasticity, mechanical vibration, and configuration dependency, making generalized dynamic modeling and precise control elusive. In this work, we present a generalized motion control framework for DEAs capable of accommodating different configurations, materials, and degrees of freedom (DOFs). First, a generalized, control-enabling dynamic model is developed for DEAs by taking nonlinear electromechanical coupling, mechanical vibration, and rate-dependent viscoelasticity into consideration. Further, a state observer is introduced to predict the unobservable viscoelasticity. Then, an Enhanced Exponential Reaching Law based Sliding-Mode Controller (EERLSMC) is proposed to mitigate the viscoelastic effects of DEAs, and its stability is proved mathematically. Experimental results obtained for different DEAs (four configurations, two materials, and multiple DOFs) demonstrate that our dynamic model can precisely describe their complex dynamic responses and that the EERLSMC can achieve precise tracking control, verifying the generality of our framework.
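For readers unfamiliar with reaching-law sliding-mode control, the classic exponential reaching law drives the sliding variable s toward zero via ds/dt = -eps*sgn(s) - k*s; the paper's enhanced variant modifies this law, so the sketch below only illustrates the baseline idea with made-up gains:

```python
import numpy as np

def exponential_reaching_law(s, eps=0.5, k=4.0):
    """Classic reaching law ds/dt = -eps*sign(s) - k*s (baseline, not EERLSMC)."""
    return -eps * np.sign(s) - k * s

# Simulate the sliding variable converging to the sliding surface s = 0.
dt, s = 1e-3, 1.0
for _ in range(2000):
    s += dt * exponential_reaching_law(s)
print(f"final |s| = {abs(s):.4f}")
```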
|
|
10:30-12:00, Paper WeAT10-CC.4 | Add to My Program |
Vision-Based Tip Force Estimation on a Soft Continuum Robot |
|
Chen, Xingyu | University College London |
Shi, Jialei | University College London |
Wurdemann, Helge Arne | University College London |
George Thuruthel, Thomas | University College London |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Soft continuum robots, fabricated from elastomeric materials, offer unparalleled flexibility and adaptability, making them ideal for applications such as minimally invasive surgery and inspections in constrained environments. With the miniaturization of imaging technologies and the development of novel control algorithms, these devices provide exceptional opportunities to visualize the internal structures of the human body. However, there are still challenges in accurately estimating external forces applied to these systems using current technologies. Adding additional sensors is challenging without compromising the softness of the device. This work presents a visual deformation-based force sensing framework for soft continuum robots. The core idea behind this work is that point loads lead to unique deformation profiles in an actuated soft-bodied robot. We introduce a Convolutional Neural Network-based tip force estimation method that utilizes arbitrarily placed camera images and actuation inputs to predict applied tip forces. Experimental validation was performed using the STIFF-FLOP robot, a pneumatically actuated soft robot developed for minimally invasive surgery. Our vision-based force estimation model demonstrated a sensing precision of 0.05 N in the XY plane during testing, with data collection and training taking only 70 minutes.
|
|
10:30-12:00, Paper WeAT10-CC.5 | Add to My Program |
Soft Bending Actuator with Fiber-Jamming Variable Stiffness and Fiber-Optic Proprioception |
|
Kang, Joonwon | Seoul National University |
Lee, Sudong | EPFL (École Polytechnique Fédérale De Lausanne) |
Park, Yong-Lae | Seoul National University |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Compliant Joints and Mechanisms
Abstract: Soft actuators with variable stiffness improve the adaptability of robots, expanding their application areas and environments. We propose a tendon-driven soft bending actuator that can change its stiffness using fiber jamming. The actuator is made of an elastomer tube filled with three types of fibers, which respectively maintain the structure, provide variable stiffness by jamming, and enable fiber-optic shape sensing, while sharing the same structure and materials for a compact form factor. The stiffness of the actuator can be increased by jamming to more than three times its original value. In addition to jamming, the proposed actuator has the special function of shape sensing, estimating the tip location of the actuator based on image sensing from optical fibers packaged with the jamming fibers. Using feature extraction and a deep neural network, the tip position sensing shows errors of 3.1%, 3.0%, and 6.7% for the x, y, and z axes, respectively. The proposed actuator has two degrees of freedom (i.e., bending on two orthogonal planes) and is controlled by two tendons. When connected in series, multiple actuators form a soft robotic manipulator (i.e., arm) that is physically compliant or capable of delivering a relatively high force to target objects.
|
|
10:30-12:00, Paper WeAT10-CC.6 | Add to My Program |
A Light and Heat-Seeking Vine-Inspired Robot with Material-Level Responsiveness |
|
Deglurkar, Shivani | University of California, San Diego |
Xiao, Charles | University of California, Santa Barbara |
Gockowski, Luke | University of California Santa Barbara |
Valentine, Megan | University of California, Santa Barbara |
Hawkes, Elliot Wright | University of California, Santa Barbara |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Mechanism Design
Abstract: The fields of soft and bio-inspired robotics promise to imbue synthetic systems with capabilities found in the natural world. However, many of these biological capabilities are yet to be realized. For example, vines in nature direct growth via localized responses embedded in the cells of the vine body, allowing an organism without a central brain to successfully search for resources (e.g., light). To date, however, vine-inspired robots have not shown such localized embedded responsiveness. Here we present a vine-inspired robotic device with material-level responses embedded in its skin, capable of “growing” and steering toward either a light or heat stimulus. We present basic modeling of the concept, design details, and experimental results showing its behavior in response to infrared (IR) and visible light. Our simple design concept advances the capabilities of bio-inspired robots and lays the foundation for future “growing” robots that are capable of seeking light or heat, yet are extremely simple and low-cost. Potential applications include solar tracking and, in the future, fighting smoldering fires. We envision using similar robots to find hot spots in hard-to-access environments, allowing us to put out potentially long-burning fires faster.
|
|
10:30-12:00, Paper WeAT10-CC.7 | Add to My Program |
Morphological Design for Pneumatic Soft Actuators and Robots with Desired Deformation Behavior |
|
Chen, Feifei | Shanghai Jiao Tong University |
Song, Zenan | Shanghai Jiao Tong University |
Chen, Shitong | Shanghai Jiaotong University |
Gu, Guoying | Shanghai Jiao Tong University |
Zhu, Xiangyang | Shanghai Jiao Tong University |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Shape Optimization, Soft Robot Applications
Abstract: A homogeneous pneumatic soft robot may generate complex output motions from a simple input pressure, owing to a morphological shape that locally deforms the soft material to different degrees by simultaneously tailoring the structural characteristics and orienting the input pressure. To date, the design of the morphological shape (the inverse problem) has not been fully addressed. This article outlines a geometry-mechanics-optimization integrated approach to automatically shaping a pneumatic soft actuator or robot that achieves a desired deformation behavior. Instead of constraining the robot's geometry to any predefined regular shape, we employ B-splines to allow the generation of freeform boundary surfaces, and we use nonlinear mechanical modeling and shape-derivative-based optimization to navigate the high-dimensional design space. Our design framework can readily regulate surface quality during the morphological evolution by imposing geometric constraints, in terms of the principal curvatures and the minimal distance between surfaces, as penalty functions. The effect of external forces, including gravity and the interaction force at the end-effector, is also taken int
|
|
10:30-12:00, Paper WeAT10-CC.8 | Add to My Program |
Thermally-Activated Biochemically-Sustained Reactor for Soft Fluidic Actuation |
|
Liu, Jialun | The University of Sheffield |
Soliman, MennaAllah | University of Sheffield |
Damian, Dana | University of Sheffield |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Soft robots have shown remarkable and distinct capabilities due to their high deformability. Recently, increasing attention has been dedicated to developing fully soft robots to exploit their full potential, with the recognition that electronic powering may limit this goal. Alternative powering sources compatible with soft robots have been identified, such as combustion and chemical reactions. A further milestone for such systems would be to increase the controllability and responsiveness of their underlying reactions in order to achieve more complex behaviors for soft robots. In this paper, we present a thermally-activated reactor incorporating a biocompatible hydrogel valve that enables control of the biochemical reaction of sugar and yeast. The biochemical reaction is utilized to generate contained pressure, which in turn powers a fluidic soft actuator. Experiments were conducted to evaluate the response time of the hydrogel valves with three different crosslinker concentrations. Among the tested concentrations, we found that the lowest crosslinker concentration yielded the fastest valve response time at an ambient temperature of 50°C. We also evaluated the pressure generation capacity of the reactor, which can reach up to 0.22 bar, and demonstrated the thermo-responsive behavior of the reactor to trigger a biochemical reaction for powering a fluidic soft actuator. This work opens up the possibility of powering and controlling tetherless and fully soft robots.
|
|
10:30-12:00, Paper WeAT10-CC.9 | Add to My Program |
Pulsating Fluidic Sensor for Sensing of Location, Pressure and Contact Area |
|
Jones, Joanna | University of Sheffield |
Pontin, Marco | University of Sheffield |
Damian, Dana | University of Sheffield |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Designing information-rich and space-efficient sensors is a key challenge for soft robotics, and crucial for the development of safe soft robots. Sensing and understanding the environmental interactions with a minimal footprint is especially important in the medical context, where portability and unhindered patient/user movement is a priority, to move towards personalized and decentralized healthcare solutions. In this work, a pulsating fluidic soft sensor (PFS) capable of determining location, pressure and contact area of press events is shown. The sensor relies on spatio-temporal resistance changes driven by a pulsating conductive fluid. The sensor demonstrates good repeatability and distinction of single and multiple press events, detecting single indents of sizes greater than 1 cm, forces larger than 2 N, and various locations across the sensor, as well as multiple indents spaced 2 cm apart. Furthermore, the sensor is demonstrated in two applications to detect foot placement and grip location. Overall, the sensor represents an improvement towards minimizing electronic hardware, and cost of the sensing solution, without sacrificing the richness of the sensing information in the field of soft fluidic sensors.
|
|
WeAT11-CC Oral Session, CC-502 |
Add to My Program |
Semantic Scene Understanding I |
|
|
Chair: Fujii, Hiromitsu | Chiba Institute of Technology |
Co-Chair: Beetz, Michael | University of Bremen |
|
10:30-12:00, Paper WeAT11-CC.1 | Add to My Program |
Perception through Cognitive Emulation: “A Second Iteration of NaivPhys4RP for Learningless and Safe Recognition and 6D-Pose Estimation of (Transparent) Objects” |
|
Kenghagho Kenfack, Franklin | University of Bremen |
Neumann, Michael | Uni Bremen |
Mania, Patrick | University of Bremen |
Beetz, Michael | University of Bremen |
Keywords: Semantic Scene Understanding, Cognitive Modeling, Perception for Grasping and Manipulation
Abstract: In our previous work, we designed NaivPhys4RP, a human-like, white-box, causal generative model of perception, essentially based on cognitive emulation, to understand the past, present, and future states of complex worlds from poor observations. In this paper, as recommended in that previous work, we first refine the theoretical model of NaivPhys4RP in terms of the integration of variables as well as the perceptual inference tasks to solve. Intuitively, the system is closed under the injection, update, and dependency of variables. Then, we present a first implementation of NaivPhys4RP that demonstrates learningless and safe recognition and 6D-pose estimation of objects from poor sensor data (e.g., occlusion, transparency, poor depth, in-hand). This not only makes a substantial step forward compared to classical perception systems in perceiving objects in these scenarios, but also escapes the burden of data-intensive learning and operates safely (transparency and causality: we fit sensor data into mentally constructed, meaningful worlds). With respect to ChatGPT's ambitions, the system can imagine physico-realistic socio-physical scenes from text and demonstrate understanding of that text, all without data- and resource-intensive learning.
|
|
10:30-12:00, Paper WeAT11-CC.2 | Add to My Program |
Mapping High-Level Semantic Regions in Indoor Environments without Object Recognition |
|
Bigazzi, Roberto | University of Modena and Reggio Emilia |
Baraldi, Lorenzo | Università Degli Studi Di Modena E Reggio Emilia |
Kousik, Shreyas | Georgia Institute of Technology |
Cucchiara, Rita | Università Degli Studi Di Modena E Reggio Emilia |
Pavone, Marco | Stanford University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Visual Learning
Abstract: Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments. In the literature, there has been an extensive focus on object labeling and exhaustive scene graph generation; less effort has been focused on the task of purely identifying and mapping large semantic regions. The present work proposes a method for semantic region mapping via embodied navigation in indoor environments, generating a high-level representation of the knowledge of the agent. To enable region identification, the method uses a vision-to-language model to provide scene information for mapping. By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location. This mapping procedure is paired with a trained navigation policy to enable autonomous map generation. The proposed method significantly outperforms a variety of baselines, including an object-based system and a pretrained scene classifier, in experiments in a photorealistic simulator.
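The map representation described here can be pictured as a grid that accumulates per-cell scores for each candidate region label as egocentric predictions are projected into the global frame; a toy numpy sketch in which the label set, grid size, and projection are hypothetical:

```python
import numpy as np

LABELS = ["kitchen", "bedroom", "bathroom", "living room"]  # hypothetical region set
grid = np.zeros((64, 64, len(LABELS)))                       # accumulated label scores

def update_region_map(grid, cells, label_probs):
    """Add an egocentric label distribution to the global cells it projects onto."""
    for (i, j) in cells:
        grid[i, j] += label_probs
    return grid

def region_distribution(grid, i, j):
    """Normalized distribution over region labels at one map location."""
    scores = grid[i, j]
    total = scores.sum()
    return scores / total if total > 0 else np.full(len(LABELS), 1 / len(LABELS))

grid = update_region_map(grid, cells=[(10, 12), (10, 13)],
                         label_probs=np.array([0.7, 0.1, 0.1, 0.1]))
print(region_distribution(grid, 10, 12))
```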
|
|
10:30-12:00, Paper WeAT11-CC.3 | Add to My Program |
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model As an Agent |
|
Yang, Jianing | University of Michigan |
Chen, Xuweiyi | University of Michigan |
Qian, Shengyi | University of Michigan |
Madaan, Nikhil | Bloomberg |
Iyengar, Madhavan | University of Michigan |
Fouhey, David | University of Michigan |
Chai, Joyce | University of Michigan |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, RGB-D Perception
Abstract: 3D visual grounding is a critical skill for household robots, enabling them to navigate, manipulate objects, and answer questions based on their environment. While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline. LLM-Grounder utilizes an LLM to decompose complex natural language queries into semantic constituents and employs a visual grounding tool, such as OpenScene or LERF, to identify objects in a 3D scene. The LLM then evaluates the spatial and commonsense relations among the proposed objects to make a final grounding decision. Our method does not require any labeled training data and can generalize to novel 3D scenes and arbitrary text queries. We evaluate LLM-Grounder on the ScanRefer benchmark and demonstrate state-of-the-art zero-shot grounding accuracy. Our findings indicate that LLMs significantly improve the grounding capability, especially for complex language queries, making LLM-Grounder an effective approach for 3D vision-language tasks in robotics.
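At a high level, the agentic pipeline alternates between an LLM that decomposes the query and a 3D grounding tool that returns candidate objects. The schematic sketch below uses `call_llm` and `grounding_tool` as stand-ins for the actual LLM API and for tools such as OpenScene or LERF; their signatures are hypothetical:

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a large language model query."""
    raise NotImplementedError  # hypothetical; would call an LLM API

def grounding_tool(noun_phrase: str, scene) -> list:
    """Placeholder for an open-vocabulary 3D grounder (e.g., OpenScene, LERF)."""
    raise NotImplementedError  # returns candidate 3D boxes for the phrase

def ground_query(query: str, scene):
    """Decompose the query, ground each constituent, and let the LLM decide."""
    constituents = call_llm(f"List the object noun phrases in: {query}").split(";")
    candidates = {c.strip(): grounding_tool(c.strip(), scene) for c in constituents}
    decision_prompt = (f"Query: {query}\nCandidates: {candidates}\n"
                       "Pick the candidate box that best satisfies the spatial relations.")
    return call_llm(decision_prompt)
```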
|
|
10:30-12:00, Paper WeAT11-CC.4 | Add to My Program |
Learning Off-Road Terrain Traversability with Self-Supervisions Only |
|
Seo, Junwon | Agency for Defense Development |
Sim, Sungdae | Agency for Defense Development |
Shim, Inwook | Inha University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Estimating the traversability of terrain should be reliable and accurate in diverse conditions for autonomous driving in off-road environments. However, learning-based approaches often yield unreliable results when confronted with unfamiliar contexts, and it is challenging to obtain manual annotations frequently for new circumstances. In this paper, we introduce a method for learning traversability from images that utilizes only self-supervision and no manual labels, enabling it to easily learn traversability in new circumstances. To this end, we first generate self-supervised traversability labels from past driving trajectories by labeling regions traversed by the vehicle as highly traversable. Using the self-supervised labels, we then train a neural network that identifies terrains that are safe to traverse from an image using a one-class classification algorithm. Additionally, we supplement the limitations of self-supervised labels by incorporating methods of self-supervised learning of visual representations. To conduct a comprehensive evaluation, we collect data in a variety of driving environments and perceptual conditions and show that our method produces reliable estimations in various environments. In addition, the experimental results validate that our method outperforms other self-supervised traversability estimation methods and achieves comparable performances with supervised learning methods trained on manually labeled data.
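The self-supervised label generation can be sketched as projecting the footprint of past driving trajectories into each camera image and marking those pixels as traversable, leaving everything else unlabeled for a one-class classifier; the projection function and image geometry below are hypothetical:

```python
import numpy as np

H, W = 240, 320
UNLABELED, TRAVERSABLE = 0, 1

def project_to_image(points_world):
    """Hypothetical camera projection; returns integer pixel coordinates."""
    pts = np.asarray(points_world)
    u = np.clip((pts[:, 0] * 40 + W / 2).astype(int), 0, W - 1)
    v = np.clip((H - 1 - pts[:, 1] * 40).astype(int), 0, H - 1)
    return u, v

def make_self_supervised_label(trajectory_points):
    """Pixels touched by the driven trajectory become positive labels."""
    label = np.full((H, W), UNLABELED, dtype=np.uint8)
    u, v = project_to_image(trajectory_points)
    label[v, u] = TRAVERSABLE
    return label

label_mask = make_self_supervised_label(np.array([[0.0, 0.5], [0.2, 1.0], [0.4, 1.5]]))
print(label_mask.sum(), "pixels labeled traversable")
```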
|
|
10:30-12:00, Paper WeAT11-CC.5 | Add to My Program |
Improving Radial Imbalances with Hybrid Voxelization and RadialMix for LiDAR 3D Semantic Segmentation |
|
Li, Jiale | Zhejiang University |
Dai, Hang | University of Glasgow |
Wang, Yu | YUNJI Technology Co. Ltd |
Cao, Guangzhi | Pegasus Technology |
Luo, Chun | YUNJI Technology Co. Ltd |
Ding, Yong | Zhejiang University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Huge progress has been made in LiDAR 3D semantic segmentation, but there are still two under-explored imbalances on the radial axis: points are unevenly concentrated on the near side, and the distribution of foreground object instances is skewed to the near side. This leads the training of the model to favor semantics at the near side with the majority of points and object instances. Both the popular cylindrical and the neglected spherical voxelizations aim to address the problem of imbalanced point distribution by increasing the volume of voxels along the radial distance to include fewer near-side points in a smaller voxel and more far-side points in a bigger voxel. However, this causes a problem of the receptive field enlarging along the radial distance, which is not desirable in LiDAR point clouds since the size of an object is distance-independent. This can be addressed in cubic voxelization, which has a fixed volume of voxels. Thus, we propose a new LiDAR 3D semantic segmentation network (Hi-VoxelNet) with Hybrid Voxelization that leverages the advantages of cubic, cylindrical, and spherical voxelizations for hybrid voxel feature learning. To address the radial imbalance of object instances, we propose a novel data augmentation technique termed RadialMix that uses radial sample duplication to increase the number of distant foreground object instances and mixes the radial duplication with another point cloud for enriching the training samples. With the joint improvements of the radial imbalances, our method achieves state-of-the-art performance on nuScenes and SemanticKITTI datasets, and consistently shows significant improvements along the radial distances. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
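The contrast between cubic, cylindrical, and spherical voxelization comes down to the coordinate system used for binning; a small sketch with assumed voxel sizes (not the paper's configuration):

```python
# Coordinate conversion for the three voxelization schemes; bin sizes are illustrative.
import numpy as np

points = np.array([[1.0, 2.0, 0.5],
                   [20.0, 5.0, 1.0]])          # (x, y, z) LiDAR points
x, y, z = points[:, 0], points[:, 1], points[:, 2]

# Cubic: fixed-size voxels, so the receptive field is independent of range.
cubic_idx = np.floor(points / 0.2).astype(int)

# Cylindrical: (rho, phi, z); angular bins widen with distance.
rho, phi = np.hypot(x, y), np.arctan2(y, x)
cyl_idx = np.floor(np.stack([rho / 0.2, phi / np.deg2rad(1.0), z / 0.2], axis=1)).astype(int)

# Spherical: (r, phi, theta); both angular dimensions widen with distance.
r = np.linalg.norm(points, axis=1)
theta = np.arccos(z / r)
sph_idx = np.floor(np.stack([r / 0.2, phi / np.deg2rad(1.0), theta / np.deg2rad(1.0)], axis=1)).astype(int)

print(cubic_idx, cyl_idx, sph_idx, sep="\n")
```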
|
|
10:30-12:00, Paper WeAT11-CC.6 | Add to My Program |
Few-Shot Panoptic Segmentation with Foundation Models |
|
Käppeler, Markus | University of Freiburg |
Petek, Kürsat | University of Freiburg |
Vödisch, Niclas | University of Freiburg |
Burgard, Wolfram | University of Technology Nuremberg |
Valada, Abhinav | University of Freiburg |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization
Abstract: Current state-of-the-art methods for panoptic segmentation require an immense amount of annotated training data that is both arduous and expensive to obtain, posing a significant challenge for their widespread adoption. Concurrently, recent breakthroughs in visual representation learning have sparked a paradigm shift, leading to the advent of large foundation models that can be trained with completely unlabeled images. In this work, we propose to leverage such task-agnostic image features to enable few-shot panoptic segmentation by presenting Segmenting Panoptic Information with Nearly 0 labels (SPINO). In detail, our method combines a DINOv2 backbone with lightweight network heads for semantic segmentation and boundary estimation. We show that our approach, albeit being trained with only ten annotated images, predicts high-quality pseudo-labels that can be used with any existing panoptic segmentation method. Notably, we demonstrate that SPINO achieves competitive results compared to fully supervised baselines while using less than 0.3% of the ground truth labels, paving the way for learning complex visual recognition tasks leveraging foundation models. To illustrate its general applicability, we further deploy SPINO on real-world robotic vision systems for both outdoor and indoor environments. To foster future research, we make the code and trained models publicly available at http://spino.cs.uni-freiburg.de.
|
|
10:30-12:00, Paper WeAT11-CC.7 | Add to My Program |
End-To-End Semantic Segmentation Network for Low-Light Scenes |
|
Mu, Hongmin | Beijing University of Chemical Technology |
Zhang, Gang | Beijing University of Chemical Technology |
Zhou, MengChu | New Jersey Institute of Technology |
Cao, Zhengcai | Harbin Institute of Technology |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Visual Learning
Abstract: In the fields of robotic perception and computer vision, achieving accurate semantic segmentation of low-light or nighttime scenes is challenging. This is primarily due to the limited visibility of objects and the reduced texture and color contrasts among them. To address the issue of limited visibility, we propose a hierarchical gated convolution unit, which simultaneously expands the receptive field and restores edge texture. To address the issue of reduced texture among objects, we propose a dual closed-loop bipartite matching algorithm to establish a total loss function consisting of the unsupervised illumination enhancement loss and supervised intersection-over-union loss, thus enabling the joint minimization of both losses via the Hungarian algorithm. We thus achieve end-to-end training for a semantic segmentation network especially suitable for handling low-light scenes. Experimental results demonstrate that the proposed network surpasses existing methods on the Cityscapes dataset and notably outperforms state-of-the-art methods on both Dark Zurich and Nighttime Driving datasets.
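The bipartite matching step mentioned in the abstract is typically solved with the Hungarian algorithm; a generic, minimal example of that step (the cost values below are made up, not the paper's losses):

```python
# Minimum-cost bipartite matching with the Hungarian algorithm.
import numpy as np
from scipy.optimize import linear_sum_assignment

# cost[i, j]: cost of matching prediction i to ground-truth segment j,
# e.g. 1 - IoU plus an illumination-consistency term in a setting like the paper's.
cost = np.array([[0.1, 0.9, 0.8],
                 [0.7, 0.2, 0.9],
                 [0.8, 0.85, 0.3]])

rows, cols = linear_sum_assignment(cost)   # Hungarian / Kuhn-Munkres
print(list(zip(rows, cols)), cost[rows, cols].sum())  # [(0,0),(1,1),(2,2)] 0.6
```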
|
|
10:30-12:00, Paper WeAT11-CC.8 | Add to My Program |
DefFusion: Deformable Multimodal Representation Fusion for 3D Semantic Segmentation |
|
Xu, Rongtao | Institute of Automation, Chinese Academy of Sciences, Beijing, C |
Wang, Changwei | Casia |
Zhang, Duzhen | Institute of Automation, Chinese Academy of Sciences |
Zhang, Man | Beijing University of Posts and Telecommunications |
Xu, Shibiao | Beijing University of Posts and Telecommunications |
Meng, Weiliang | Institute of Automation, Chinese Academy of Sciences |
Zhang, Xiaopeng | National Laboratory of Pattern Recognition, Institute of Automat |
Keywords: Semantic Scene Understanding, Autonomous Agents, Sensor Fusion
Abstract: The complementarity between camera and LiDAR data makes fusion methods a promising approach to improve 3D semantic segmentation performance. Recent transformer-based methods have also demonstrated superiority in segmentation. However, multimodal solutions incorporating transformers are underexplored and face two key inherent difficulties: over-attention and noise from different modal data. To overcome these challenges, we propose a Deformable Multimodal Representation Fusion (DefFusion) framework consisting mainly of a Deformable Representation Fusion Transformer and Dynamic Representation Augmentation Modules. The Deformable Representation Fusion Transformer introduces the deformable mechanism in multimodal fusion, avoiding over-attention and improving efficiency by adaptively modeling a 2D key/value set for a given 3D query, thus enabling multimodal fusion with higher flexibility. To enhance the 2D representation and 3D representation, the Dynamic Representation Enhancement Module is proposed to dynamically remove noise in the input representation via Dynamic Grouped Representation Generation and Dynamic Mask Generation. Extensive experiments validate that our model achieves the best 3D semantic segmentation performance on SemanticKITTI and NuScenes benchmarks.
|
|
10:30-12:00, Paper WeAT11-CC.9 | Add to My Program |
Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2 |
|
Rashid, Adam | UC Berkeley |
Kim, Chung Min | University of California, Berkeley |
Kerr, Justin | University of California, Berkeley |
Fu, Letian | UC Berkeley |
Hari, Kush | UC Berkeley |
Ahmad, Ayah | University of California, Berkeley |
Chen, Kaiyuan | University of California, Berkeley |
Huang, Huang | University of California at Berkeley |
Gualtieri, Marcus | Bosch Research |
Wang, Michael | Bosch |
Juette, Christian | Bosch Research |
Tian, Nan | University of California, Berkeley |
Ren, Liu | Robert Bosch North America Research Technology Center |
Goldberg, Ken | UC Berkeley |
Keywords: Semantic Scene Understanding, Continual Learning, SLAM
Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes and selectively updating these regions of the environment, avoiding the need to exhaustively remap. Human users can query inventory by providing natural language queries and receiving a 3D heatmap of potential object locations. To manage the computational load, we use FogROS2, a cloud robotics platform, to offload resource-intensive tasks. Lifelong LERF obtains poses from a monocular RGBD SLAM backend, and uses these poses to progressively optimize a Language Embedded Radiance Field (LERF) for semantic monitoring. Experiments with 3-5 objects arranged on a tabletop and a Turtlebot with a RealSense camera suggest that Lifelong LERF can persistently adapt to changes in objects with up to 91% accuracy.
|
|
WeAT12-CC Oral Session, CC-503 |
Add to My Program |
Deep Learning in Grasping and Manipulation IV |
|
|
Chair: Yamazaki, Kimitoshi | Shinshu University |
Co-Chair: Dijkman, Daniel | Qualcomm |
|
10:30-12:00, Paper WeAT12-CC.1 | Add to My Program |
RGBManip: Monocular Image-Based Robotic Manipulation through Active Object Pose Estimation |
|
An, Boshi | Peking University |
Geng, Yiran | Peking University |
Chen, Kai | The Chinese University of Hong Kong |
Li, Xiaoqi | Peking University |
Dou, Qi | The Chinese University of Hong Kong |
Dong, Hao | Peking University |
Keywords: AI-Based Methods, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: Robotic manipulation requires accurate perception of the environment, which poses a significant challenge due to its inherent complexity and constantly changing nature. In this context, RGB image and point-cloud observations are two commonly used modalities in visual-based robotic manipulation, but each of these modalities has its own limitations. Commercial point-cloud observations often suffer from issues like sparse sampling and noisy output due to the limits of the emission-reception imaging principle. On the other hand, RGB images, while rich in texture information, lack essential depth and 3D information crucial for robotic manipulation. To mitigate these challenges, we propose an image-only robotic manipulation framework that leverages an eye-on-hand monocular camera installed on the robot's parallel gripper. By moving with the robot gripper, this camera gains the ability to actively perceive the object from multiple perspectives during the manipulation process. This enables the estimation of 6D object poses, which can be utilized for manipulation. While obtaining images from more diverse viewpoints typically improves pose estimation, it also increases the manipulation time. To address this trade-off, we employ a reinforcement learning policy to synchronize the manipulation strategy with active perception, achieving a balance between 6D pose accuracy and manipulation efficiency. Our experimental results in both simulated and real-world environments showcase the state-of-the-art effectiveness of our approach. We believe that our method will inspire further research on real-world-oriented robotic manipulation.
|
|
10:30-12:00, Paper WeAT12-CC.2 | Add to My Program |
Part-Guided 3D RL for Sim2Real Articulated Object Manipulation |
|
Xie, Pengwei | Tsinghua University |
Chen, Rui | Tsinghua University |
Chen, Siang | Tsinghua University |
Qin, Yuzhe | UC San Diego |
Xiang, Fanbo | University of California San Diego |
Sun, Tianyu | Tsinghua University |
Xu, Jing | Tsinghua University |
Wang, Guijin | Tsinghua University |
Su, Hao | UCSD |
Keywords: Deep Learning in Grasping and Manipulation, RGB-D Perception, Reinforcement Learning
Abstract: Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on visual affordance learning or other pre-trained visual models to guide manipulation policies, which face challenges for novel instances in real-world scenarios. In this paper, we propose a novel part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of RL policy training. To improve the stability of the policy on real robots, we design a Frame-consistent Uncertainty-aware Sampling (FUS) strategy to get a condensed and hierarchical 3D representation. In addition, a single versatile RL policy can be trained on multiple articulated object manipulation tasks simultaneously in simulation and shows great generalizability to novel categories and instances. Experimental results demonstrate the effectiveness of our framework in both simulation and real-world settings.
|
|
10:30-12:00, Paper WeAT12-CC.3 | Add to My Program |
MORPH: Design Co-Optimization with Reinforcement Learning Via a Differentiable Hardware Model Proxy |
|
He, Zhanpeng | Columbia University |
Ciocarlie, Matei | Columbia University |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Mechanism Design
Abstract: We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning. Like most co-optimization methods, MORPH relies on a model of the hardware being optimized, usually simulated based on the laws of physics. However, such a model is often difficult to integrate into an effective optimization routine. To address this, we introduce a proxy hardware model, which is always differentiable and enables efficient co-optimization alongside a long-horizon control policy using RL. MORPH is designed to ensure that the optimized hardware proxy remains as close as possible to its realistic counterpart, while still enabling task completion. We demonstrate our approach on simulated 2D reaching and 3D multi-fingered manipulation tasks.
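The co-optimization idea (a differentiable hardware proxy kept close to its realistic counterpart while a task objective is minimized) can be sketched roughly as follows; the parameters, losses, and weighting are illustrative assumptions, not MORPH itself.

```python
# Conceptual sketch: optimise a differentiable proxy of hardware parameters
# under a task loss plus a penalty pulling it towards the realistic design.
import torch

proxy_params = torch.nn.Parameter(torch.tensor([0.10, 0.30]))  # e.g. link lengths (proxy)
realistic_params = torch.tensor([0.12, 0.25])                   # current buildable design

opt = torch.optim.Adam([proxy_params], lr=1e-2)

def task_loss(params):
    # Placeholder for the control objective evaluated through the proxy model;
    # a smooth stand-in so the example runs.
    return (params.sum() - 0.5) ** 2

for _ in range(100):
    loss = task_loss(proxy_params) + 10.0 * torch.sum((proxy_params - realistic_params) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(proxy_params.detach())  # stays near the realistic design while reducing task loss
```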
|
|
10:30-12:00, Paper WeAT12-CC.4 | Add to My Program |
Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots |
|
Lampe, Thomas | Google UK Ltd |
Abdolmaleki, Abbas | DeepMind |
Huang, Sandy H. | Google DeepMind |
Bechtle, Sarah | Google DeepMind |
Springenberg, Jost Tobias | Albert-Ludwigs Universitaet Freiburg |
Bloesch, Michael | Google |
Groth, Oliver | University of Oxford |
Hafner, Roland | Google DeepMind |
Hertweck, Tim | DeepMind |
Neunert, Michael | Google |
Wulfmeier, Markus | Google DeepMind |
Zhang, Jingwei | DeepMind |
Nori, Francesco | Google DeepMind |
Heess, Nicolas | Google Deepmind |
Riedmiller, Martin | DeepMind |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient through re-using previously collected sub-optimal data. In this paper we demonstrate how the increased understanding of off-policy learning methods and their embedding in an iterative online/offline scheme ("collect and infer") can drastically improve data-efficiency by using all the collected experience, which empowers learning from real robot experience only. Moreover, the resulting policy improves significantly over the state of the art on a recently proposed real robot manipulation benchmark. Our approach learns end-to-end, directly from pixels, and does not rely on additional human domain knowledge such as a simulator or demonstrations.
|
|
10:30-12:00, Paper WeAT12-CC.5 | Add to My Program |
Information-Driven Affordance Discovery for Efficient Robotic Manipulation |
|
Mazzaglia, Pietro | University of Gent |
Cohen, Taco | Qualcomm AI Research |
Dijkman, Daniel | Qualcomm |
Keywords: AI-Based Methods, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: Robotic affordances, providing information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning about affordances requires expensive large annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem and propose an information-based measure to augment the agent's objective and accelerate the affordance discovery process. We provide a theoretical justification of our approach and we empirically validate the approach both in simulation and real-world tasks. Our method, which we dub IDA, enables the efficient discovery of visual affordances for several action primitives, such as grasping, stacking objects, or opening drawers, strongly improving data efficiency in simulation, and it allows us to learn grasping affordances in a small number of interactions, on a real-world setup with a UFACTORY xArm 6 robot arm.
|
|
10:30-12:00, Paper WeAT12-CC.6 | Add to My Program |
HybGrasp: A Hybrid Learning-To-Adapt Architecture for Efficient Robot Grasping |
|
Mun, Jungwook | Korea Advanced Institute of Science and Technology |
Truong Giang, Khang | KAIST |
Lee, Yunghee | Korea Advanced Institute of Science and Technology |
Oh, Nayoung | KAIST |
Huh, Sejoon | Korea Advanced Institute of Science and Technology |
Kim, Min | KAIST |
Jo, Sungho | Korea Advanced Institute of Science and Technology (KAIST) |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Multifingered Hands
Abstract: Despite the prevalence of robotic manipulation tasks in various real-world applications of different requirements and needs, there has been a lack of focus on enhancing the adaptability of robotic grasping systems. Most of the current literature constructs models around a single gripper, succumbing to a tradeoff between gripper complexity and generalizability. Adapting such models pre-trained on one type of gripper to another to work around the tradeoff is inefficient and not scalable, as it would require tremendous effort and computational cost to generate new datasets and relearn the grasping task. In this letter, we propose a novel hybrid architecture for robot grasping that efficiently learns to adapt to different gripper designs. Our approach involves a three-step process that first obtains a rough grasp pose prediction from a parallel gripper model, then predicts an adaptive action using a convolutional neural network, and finally refines the predicted action with reinforcement learning. The proposed method shows significant improvements in grasping performance compared to existing methods for both generated datasets and real-world scenarios, presenting a promising direction for improving the adaptability and flexibility of robotic manipulation systems.
|
|
10:30-12:00, Paper WeAT12-CC.7 | Add to My Program |
Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes |
|
Chen, Siang | Tsinghua University |
Tang, Wei | Tsinghua University |
Xie, Pengwei | Tsinghua University |
Yang, Wenming | Tsinghua University |
Wang, Guijin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, RGB-D Perception, Grasping
Abstract: Fast and robust object grasping in clutter is a crucial component of robotics. Most current works resort to the whole observed point cloud for 6-Dof grasp generation, ignoring the guidance information excavated from global semantics, thus limiting high-quality grasp generation and real-time performance. In this work, we show that the widely used heatmaps are underestimated in the efficiency of 6-Dof grasp generation. Therefore, we propose an effective local grasp generator combined with grasp heatmaps as guidance, which infers in a global-to-local semantic-to-point way. Specifically, Gaussian encoding and the grid-based strategy are applied to predict grasp heatmaps as guidance to aggregate local points into graspable regions and provide global semantic information. Further, a novel non-uniform anchor sampling mechanism is designed to improve grasp accuracy and diversity. Benefiting from the high-efficiency encoding in the image space and focusing on points in local graspable regions, our framework can perform high-quality grasp detection in real-time and achieve state-of-the-art results. In addition, real robot experiments demonstrate the effectiveness of our method with a success rate of 94% and a clutter completion rate of 100%.
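Gaussian-encoded grasp heatmaps of the kind the abstract mentions are commonly built by placing a 2D Gaussian at each grasp centre; a minimal sketch under that assumption (the paper's exact encoding may differ):

```python
# Per-pixel "graspability" heatmap from Gaussian bumps at grasp centres.
import numpy as np

H, W, sigma = 64, 64, 2.0
ys, xs = np.mgrid[0:H, 0:W]

def grasp_heatmap(centers):
    heat = np.zeros((H, W))
    for cx, cy in centers:
        bump = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, bump)      # keep the strongest response per pixel
    return heat

heat = grasp_heatmap([(20, 30), (45, 10)])
print(heat.shape, heat.max(), np.unravel_index(heat.argmax(), heat.shape))
```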
|
|
10:30-12:00, Paper WeAT12-CC.8 | Add to My Program |
A Hyper-Network Based End-To-End Visual Servoing with Arbitrary Desired Poses |
|
Yu, Hongxiang | Zhejiang University |
Chen, Anzhe | Zhejiang University |
Xu, Kechun | Zhejiang University |
Zhou, Zhongxiang | Zhejiang University |
Jing, Wei | Alibaba |
Wang, Yue | Zhejiang University |
Xiong, Rong | Zhejiang University |
Keywords: Deep Learning in Grasping and Manipulation, Transfer Learning, Visual Servoing
Abstract: Recently, several works achieve end-to-end visual servoing (VS) for robotic manipulation by replacing the traditional controller with differentiable neural networks, but lose the ability to servo arbitrary desired poses. This letter proposes a differentiable architecture for arbitrary pose servoing: a hyper-network based neural controller (HPN-NC). To achieve this, HPN-NC consists of a hyper net and a low-level controller, where the hyper net learns to generate the parameters of the low-level controller and the controller uses the 2D keypoint error for control, like traditional image-based visual servoing (IBVS). HPN-NC can complete 6-degree-of-freedom visual servoing with large initial offsets. Taking advantage of the fully differentiable nature of HPN-NC, we provide a three-stage training procedure to servo real-world objects. With self-supervised end-to-end training, the performance of the integrated model can be further improved in unseen scenes and the amount of manual annotations can be significantly reduced.
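The hyper-network pattern the abstract describes (a hyper net emitting the parameters of a low-level controller) can be sketched as follows; the dimensions and architecture are illustrative assumptions, not HPN-NC's.

```python
# A hyper net maps a desired-pose embedding to the weights of a small
# low-level controller that turns keypoint errors into velocity commands.
import torch
import torch.nn as nn

KPT_ERR_DIM, CMD_DIM, POSE_DIM = 8, 6, 16       # 4 keypoints x 2, 6-DoF twist

class HyperController(nn.Module):
    def __init__(self):
        super().__init__()
        n_weights = CMD_DIM * KPT_ERR_DIM + CMD_DIM  # weight matrix + bias
        self.hyper = nn.Sequential(nn.Linear(POSE_DIM, 64), nn.ReLU(),
                                   nn.Linear(64, n_weights))

    def forward(self, pose_embedding, keypoint_error):
        params = self.hyper(pose_embedding)
        W = params[: CMD_DIM * KPT_ERR_DIM].reshape(CMD_DIM, KPT_ERR_DIM)
        b = params[CMD_DIM * KPT_ERR_DIM:]
        return W @ keypoint_error + b               # velocity command

ctrl = HyperController()
cmd = ctrl(torch.randn(POSE_DIM), torch.randn(KPT_ERR_DIM))
print(cmd.shape)  # torch.Size([6])
```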
|
|
10:30-12:00, Paper WeAT12-CC.9 | Add to My Program |
6-DoF Closed-Loop Grasping with Reinforcement Learning |
|
Herland, Sverre | Norwegian University of Science and Technology |
Bach, Kerstin | Norwegian University of Science and Technology |
Misimi, Ekrem | SINTEF Ocean |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Perception for Grasping and Manipulation
Abstract: We present a novel vision-based, 6-DoF grasping framework based on Deep Reinforcement Learning (DRL) that is capable of directly synthesizing continuous 6-DoF actions in Cartesian space. Our proposed approach uses visual observations from an eye-in-hand RGB-D camera, and we mitigate the sim-to-real gap with a combination of domain randomization, image augmentation, and segmentation tools. Our method consists of an off-policy, maximum-entropy, Actor-Critic algorithm that learns a policy from a binary reward and a few simulated example grasps. It does not need any real-world grasping examples, is trained completely in simulation, and is deployed directly to the real world without any fine-tuning. The efficacy of our approach is demonstrated in simulation and experimentally validated in the real world on 6-DoF grasping tasks, achieving state-of-the-art results of an 86% mean zero-shot success rate on previously unseen objects, an 85% mean zero-shot success rate on a class of previously unseen adversarial objects, and a 74.3% mean zero-shot success rate on a class of previously unseen, challenging "6-DoF" objects. Raw footage of real-world validation can be found at https://youtu.be/bwPf8Imvook
|
|
WeAT13-AX Oral Session, AX-201 |
Add to My Program |
Human-Robot Collaboration I |
|
|
Chair: Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
Co-Chair: Zhang, Yunbo | Rochester Institute of Technology |
|
10:30-12:00, Paper WeAT13-AX.1 | Add to My Program |
Self-Supervised 6-DoF Robot Grasping by Demonstration Via Augmented Reality Teleoperation System |
|
Dengxiong, Xiwen | Rochester Institute of Technology |
Wang, Xueting | Rochester Institute of Technology |
Bai, Shi | Wing |
Zhang, Yunbo | Rochester Institute of Technology |
Keywords: Human-Centered Automation, Telerobotics and Teleoperation, Learning from Demonstration
Abstract: Most existing 6-DoF robot grasping solutions depend on strong supervision on grasp pose to ensure satisfactory performance, which could be laborious and impractical when the robot works in some restricted area. To this end, we propose a self-supervised 6-DoF grasp pose detection framework via an Augmented Reality (AR) teleoperation system that can efficiently learn human demonstrations and provide 6-DoF grasp poses without grasp pose annotations. Specifically, the system collects the human demonstration from the AR environment and contrastively learns the grasping strategy from the demonstration. For the real-world experiment, the proposed system leads to satisfactory grasping abilities and learning to grasp unknown objects within three demonstrations.
|
|
10:30-12:00, Paper WeAT13-AX.2 | Add to My Program |
Trust Recognition in Human-Robot Cooperation Using EEG |
|
Xu, Caiyue | Tongji University |
Zhang, Changming | Tongji University |
Zhou, Yanmin | Tongji University |
Wang, Zhipeng | Tongji University |
Lu, Ping | Tongji University |
He, Bin | Tongji University |
Keywords: Acceptability and Trust, Human-Robot Collaboration
Abstract: Collaboration between humans and robots is becoming increasingly crucial in our daily life. In order to accomplish efficient cooperation, trust recognition is vital, empowering robots to predict human behaviors and make trust-aware decisions. Consequently, there is an urgent need for a generalized approach to recognize human-robot trust. This study addresses this need by introducing an EEG-based method for trust recognition during human-robot cooperation. A human-robot cooperation game scenario is used to stimulate various human trust levels when working with robots. To enhance recognition performance, the study proposes an EEG Vision Transformer model coupled with a 3-D spatial representation to capture the spatial information of EEG, taking into account the topological relationship among electrodes. To validate this approach, a public EEG-based human trust dataset called EEGTrust is constructed. Experimental results indicate the effectiveness of the proposed approach, achieving an accuracy of 74.99% in slice-wise cross-validation and 62.00% in trial-wise cross-validation. This outperforms baseline models in both recognition accuracy and generalization. Furthermore, an ablation study demonstrates a significant improvement in trust recognition performance of the spatial representation. The source code and EEGTrust dataset are available at https://github.com/CaiyueXu/EEGTrust.
|
|
10:30-12:00, Paper WeAT13-AX.3 | Add to My Program |
Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test (I) |
|
Khojasteh, Behnam | Max Planck Institute for Intelligent Systems |
Solowjow, Friedrich | RWTH Aachen University |
Trimpe, Sebastian | RWTH Aachen University |
Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
Keywords: Human-Centered Automation, Force and Tactile Sensing
Abstract: Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns.
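The statistic behind kernel two-sample testing is the maximum mean discrepancy (MMD); below is a generic, biased MMD² estimate with an RBF kernel as a toy illustration, not the paper's multimodal pipeline.

```python
# Biased MMD^2 estimate between two sample sets with an RBF kernel.
import numpy as np

def mmd2_rbf(X, Y, sigma=1.0):
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
same   = mmd2_rbf(rng.normal(0, 1, (100, 3)), rng.normal(0, 1, (100, 3)))
differ = mmd2_rbf(rng.normal(0, 1, (100, 3)), rng.normal(2, 1, (100, 3)))
print(f"{same:.4f} << {differ:.4f}")   # small when distributions match, large otherwise
```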
|
|
10:30-12:00, Paper WeAT13-AX.4 | Add to My Program |
Learning User Preferences for Complex Cobotic Tasks: Meta-Behaviors and Human Groups |
|
Vella, Elena | The University of Melbourne |
Chapman, Airlie | University of Melbourne |
Lipovetzky, Nir | The University of Melbourne |
Keywords: Acceptability and Trust, Human-Robot Teaming, Human-Robot Collaboration
Abstract: In complex tasks (beyond a single targeted controller) requiring robots to collaborate with multiple human users, two challenges arise: complex tasks are often composed of multiple behaviors which can only be evaluated as a collective (a meta-behavior) and user preferences often differ between individuals, yet successful interactions are expected across groups. To address these challenges, we formulate a set-wise preference learning problem, and validate a cost function that captures human group preferences for complex collaborative robotic tasks (cobotics). We develop a sparse optimization formulation to introduce a distinctiveness metric that aggregates individuals with similar preference profiles. Analysis of anonymized unlabelled preferences provides further insight into group preferences. Identification of the mode average most-preferred meta-behavior and minimum covariance bound allows us to analyze group cohesion. A user study with 43 participants is used to validate group preference profiles.
|
|
10:30-12:00, Paper WeAT13-AX.5 | Add to My Program |
Learning Self-Confidence from Semantic Action Embeddings for Improved Trust in Human-Robot Interaction |
|
Goubard, Cedric | Imperial College London |
Demiris, Yiannis | Imperial College London |
Keywords: Acceptability and Trust, Human-Centered Robotics, Human-Robot Collaboration
Abstract: In Human-Robot Interaction scenarios, human factors like trust can greatly impact task performance and interaction quality. Recent research has confirmed that perceived robot proficiency is a major antecedent of trust. By making robots aware of their capabilities, we can allow them to choose when to perform low-confidence actions, thus actively controlling the risk of trust reduction. In this paper, we propose Self-Confidence through Observed Novel Experiences (SCONE), a policy to learn self-confidence from experience using semantic action embeddings. Using an assistive cooking setting, we show that the semantic aspect allows SCONE to learn self-confidence faster than existing approaches, while also achieving promising performance in simple instruction following. Finally, we share results from a pilot study with 31 participants, showing that such a self-confidence-aware policy increases capability-based human trust.
|
|
10:30-12:00, Paper WeAT13-AX.6 | Add to My Program |
Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models |
|
Zhang, Zhen | The Chinese University of Hong Kong |
Lin, Anran | The Chinese University of Hong Kong |
Wong, Chun Wai | The Chinese University of Hong Kong |
Chu, Xiangyu | The Chinese University of Hong Kong |
Dou, Qi | The Chinese University of Hong Kong |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Human-Centered Robotics, AI-Based Methods, Reactive and Sensor-Based Planning
Abstract: This paper proposes an interactive navigation framework by using large language and vision-language models, allowing robots to navigate in environments with traversable obstacles. We utilize the large language model (GPT-3.5) and the open-set Vision-language Model (Grounding DINO) to create an action-aware costmap to perform effective path planning without fine-tuning. With the large models, we can achieve an end-to-end system from textual instructions like “Can you pass through the curtains to deliver medicines to me?”, to bounding boxes (e.g., curtains) with action-aware attributes. They can be used to segment LiDAR point clouds into two parts: traversable and untraversable parts, and then an action-aware costmap is constructed for generating a feasible path. The pre-trained large models have great generalization ability and do not require additional annotated data for training, allowing fast deployment in the interactive navigation tasks. We choose to use multiple traversable objects such as curtains and grasses for verification by instructing the robot to traverse them. Besides, traversing curtains in a medical scenario was tested. All experimental results demonstrated the proposed framework’s effectiveness and adaptability to diverse environments.
|
|
10:30-12:00, Paper WeAT13-AX.7 | Add to My Program |
From Unstable Electrode Contacts to Reliable Control: A Deep Learning Approach for HD-sEMG in Neurorobotics |
|
Tyacke, Eion | New York University |
Gupta, Kunal | New York University |
Patel, Jay | New York University |
Katoch, Raghav | New York University |
Atashzar, S. Farokh | New York University (NYU), US |
Keywords: Human-Centered Robotics, Brain-Machine Interfaces, Gesture, Posture and Facial Expressions
Abstract: In the past decade, there has been significant advancement in designing wearable neural interfaces for controlling neurorobotic systems, particularly bionic limbs. These interfaces function by decoding signals captured non-invasively from the skin's surface. Portable high-density surface electromyography (HD-sEMG) modules combined with deep learning decoding have attracted interest by achieving excellent gesture prediction and myoelectric control of prosthetic systems and neurorobots. However, factors like small electrode size and unstable electrode-skin contacts make HD-sEMG susceptible to pixel electrode drops. The sparse electrode-skin disconnections rooted in issues such as low adhesion, sweating, hair blockage, and skin stretch challenge the reliability and scalability of these modules as the perception unit for neurorobotic systems. This paper proposes a novel deep-learning model providing resiliency for HD-sEMG modules, which can be used in the wearable interfaces of neurorobots. The proposed 3D Dilated Efficient CapsNet model trains on an augmented input space to computationally 'force' the network to learn channel dropout variations and thus learn robustness to channel dropout. The proposed framework maintained high performance in the sensor dropout reliability study we conducted. Results show that conventional models' performance significantly degrades with dropout and is recovered using the proposed architecture and the training paradigm.
|
|
10:30-12:00, Paper WeAT13-AX.8 | Add to My Program |
Enhanced Human-Robot Collaboration with Intent Prediction Using Deep Inverse Reinforcement Learning |
|
Mitra, Mukund | IISc Bangalore |
Kumar, Gyanig | Indian Institute of Sciences, India |
Chakrabarti, Partha Pratim | Indian Institute of Technology, Kharagpur, India |
Biswas, Pradipta | Indian Institute of Science |
Keywords: Human-Centered Automation, Intention Recognition, Human-Robot Collaboration
Abstract: In shared autonomy, human-robot handover for object delivery is crucial. Accurate robot predictions of human hand motion and intentions enhance collaboration efficiency. However, low prediction accuracy increases mental and physical demands on the user. In this work, we propose a system for predicting hand motion and intended target during human-robot handover using Inverse Reinforcement Learning (IRL). A set of feature functions was designed to explicitly capture users’ preferences during the task. The proposed approach was experimentally validated through user studies. Results indicate that the proposed method outperformed other state-of-the-art methods (PI-IRL, BP-HMT, RNNIK-MKF and CMk=5), with users feeling comfortable reaching up to 60% of the total distance to the target for handover with 90% target prediction accuracy. The target prediction accuracy reaches 99.9% when less than 20% of the task remains.
|
|
10:30-12:00, Paper WeAT13-AX.9 | Add to My Program |
ToP-ToM: Trust-Aware Robot Policy with Theory of Mind |
|
Yu, Chuang | University College London |
Serhan, Baris | The University of Manchester |
Cangelosi, Angelo | University of Manchester |
Keywords: Cognitive Control Architectures, Acceptability and Trust, Human Factors and Human-in-the-Loop
Abstract: Theory of Mind (ToM) is a fundamental cognitive architecture that endows humans with the ability to attribute mental states to others. Humans infer the desires, beliefs, and intentions of others by observing their behavior and, in turn, adjust their actions to facilitate better interpersonal communication and team collaboration. In this paper, we investigated a trust-aware robot policy with the theory of mind in a multiagent setting where a human collaborates with a robot against another human opponent. We show that by only focusing on team performance, the robot may resort to the reverse psychology trick, which poses a significant threat to trust maintenance. The human's trust in the robot will collapse when they discover deceptive behavior by the robot. To mitigate this problem, we adopt the robot theory of mind model to infer the human's trust beliefs, including true belief and false belief (an essential element of ToM). We designed a dynamic trust-aware reward function based on different trust beliefs to guide the robot policy learning, which aims to balance between avoiding human trust collapse due to robot reverse psychology and leveraging its potential to boost team performance. The experimental results demonstrate the importance of the ToM-based robot policy for human-robot trust and the effectiveness of our ToM-based robot policy in multiagent interaction settings.
|
|
WeAT15-AX Oral Session, AX-203 |
Add to My Program |
Human Factors and Human-In-The-Loop I |
|
|
Chair: Ciocarlie, Matei | Columbia University |
Co-Chair: De Momi, Elena | Politecnico Di Milano |
|
10:30-12:00, Paper WeAT15-AX.1 | Add to My Program |
VIDAR: Data Quality Improvement for Monocular 3D Reconstruction through In-Situ Visual Interaction |
|
Gao, Han | National Key Lab for Novel Software Technology, Nanjing Universi |
Liu, Yating | Nanjing University |
Cao, Fang | Nanjing University |
Wu, Hao | Nanjing University |
Xu, Fengyuan | National Key Lab for Novel Software Technology, Nanjing Universi |
Zhong, Sheng | Nanjing University |
Keywords: Human Factors and Human-in-the-Loop
Abstract: 3D reconstruction based on monocular videos has attracted wide attention, and existing reconstruction methods usually work in a reconstruction-after-scanning manner. However, these methods suffer from insufficient data collection problems due to the lack of effective guidance for users during the scanning process, which affects reconstruction quality. We propose VIDAR, which visually guides users with the streaming incremental reconstructed mesh in data collection for monocular 3D reconstruction. We propose an incremental mesh extraction algorithm to achieve lossless fusion of streaming incremental mesh data via slice-style management for guidance quality. We also design an incremental mesh rendering algorithm to achieve precise memory reallocation by updating the buffer in a fill-in-the-blank pattern for guidance efficiency. Besides, we introduce several optimizations on data transmission and human-computer interaction to improve the overall system performance. The experiment results on real-world scenes show that VIDAR efficiently delivers high-quality visual guidance and outperforms the non-interactive data collection methods for scene reconstruction.
|
|
10:30-12:00, Paper WeAT15-AX.2 | Add to My Program |
Transparency Control of a 1-DoF Knee Exoskeleton Via Human-In-The-Loop Velocity Optimisation |
|
Cha, Lukas | Technical University of Munich |
Guez, Annika | Imperial College London |
Chen, Chih-Yu | Technical University of Munich |
Kim, Sion | Imperial College London |
Yu, Zhenhua | Imperial College London |
Xiao, Bo | Imperial College London |
Vaidyanathan, Ravi | Imperial College London |
Keywords: Human Factors and Human-in-the-Loop, Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Rehabilitative robotics, particularly lower-limb exoskeletons (LLEs), have gained increasing importance in aiding patients regain ambulatory functions. One of the challenges in making these systems effective is the implementation of an assist-as-needed (AAN) control strategy that intervenes only when the patient deviates from the correct movement pattern. Equally crucial is the need for the LLE to exhibit "transparency" — minimising its interaction forces with the wearer to feel as natural as possible. This paper introduces a novel approach to transparency control based on a human-in-the-loop velocity optimisation framework. The proposed method employs torque data captured from past steps through a Series Elastic Actuator (SEA) to approximate the wearer's intended future movements and computes a corresponding transparent velocity trajectory. The velocity commands are complemented by an Adaptive Frequency Oscillator (AFO) based position controller that leverages the periodic nature of human gait and is modified with a force sensor for increased reactiveness to human gait variations. This approach is experimentally evaluated against a standard zero-torque controller with a stationary single-degree-of-freedom knee exoskeleton test platform in a proof-of-concept study. Preliminary results indicate that combining adaptive oscillators with interaction force sensing can improve transparency compared to the conventional zero-torque controller, using force readings for position control and torque measurements for velocity optimisation and control.
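Adaptive Frequency Oscillators of the kind the controller builds on adapt their phase and frequency from a measured periodic signal; below is a minimal error-driven AFO sketch with illustrative gains and signals, not the paper's controller.

```python
# Error-driven adaptive frequency oscillator locking onto a periodic gait signal.
import numpy as np

dt, k = 0.001, 20.0
t = np.arange(0, 20, dt)
gait = np.sin(2 * np.pi * 1.2 * t)          # stand-in for a measured gait signal

phi, omega = 0.0, 2 * np.pi * 0.8           # initial phase and frequency guess
for sample in gait:
    e = sample - np.sin(phi)                # tracking error drives adaptation
    phi += dt * (omega + k * e * np.cos(phi))
    omega += dt * (k * e * np.cos(phi))

print(omega / (2 * np.pi))                  # adapts towards the ~1.2 Hz input frequency
```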
|
|
10:30-12:00, Paper WeAT15-AX.3 | Add to My Program |
Towards Enhanced Human Activity Recognition for Real-World Human-Robot Collaboration |
|
Yalcinkaya, Beril | Ingeniarius Lda |
Couceiro, Micael | University of Coimbra |
Pina, Lucas | Ingeniarius Lda |
Soares, Salviano | UTAD |
Valente, António | University of Trás Os Montes and Alto Douro |
Remondino, Fabio | FBK |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Robotics and Automation in Agriculture and Forestry
Abstract: This research contributes to the field of Human-Robot Collaboration (HRC) within dynamic and unstructured environments by extending the previously proposed Fuzzy State-Long Short-Term Memory (FS-LSTM) architecture to handle the uncertainty and irregularity inherent in real-world sensor data. Recognising the challenges posed by low-cost sensors, which are highly susceptible to environmental conditions and often fail to provide regular periodic readings, this paper introduces additional pre-processing blocks. These include two indirect Kalman filters and an additional LSTM network, which together enhance the input variables for the fuzzification process. The enhanced FS-LSTM approach is evaluated using real-world data, demonstrating its effectiveness in extracting meaningful information and accurately recognising human activities. This work underscores the potential of robotics in addressing global challenges, particularly in labour-intensive and hazardous tasks. By improving the integration of humans and robots in unstructured environments, this research contributes to the broader exploration of robotics in new societal applications, fostering connections and collaborations across diverse fields.
|
|
10:30-12:00, Paper WeAT15-AX.4 | Add to My Program |
Self-Supervised Regression of sEMG Signals Combining Non-Negative Matrix Factorization with Deep Neural Networks for Robot Hand Multiple Grasping Motion Control |
|
Meattini, Roberto | University of Bologna |
Caporali, Alessio | University of Bologna |
Bernardini, Alessandra | University of Bologna |
Palli, Gianluca | University of Bologna |
Melchiorri, Claudio | University of Bologna |
Keywords: Human Factors and Human-in-the-Loop, Intention Recognition
Abstract: Advanced Human-In-The-Loop (HITL) control strategies for robot hands based on surface electromyography (sEMG) are among major research questions in robotics. Due to the intrinsic complexity and inaccuracy of labeling procedures, unsupervised regression of sEMG signals has been employed in the literature; however, it shows several limitations in realizing multiple grasping motion control. In this work, we propose a novel Human-Robot interface (HRi) based on self-supervised regression of sEMG signals, combining Non-Negative Matrix Factorization (NMF) with Deep Neural Networks (DNN) in order to both avoid explicit labeling procedures and have powerful nonlinear fitting capabilities. Experiments involving 10 healthy subjects were carried out, consisting of an offline session for systematic evaluations and comparisons with traditional unsupervised approaches, and an online session for assessing real-time control of a wearable anthropomorphic robot hand. The offline results demonstrate that the proposed self-supervised regression approach outperformed traditional unsupervised methods, even considering different robot hands with dissimilar kinematic structures. Furthermore, the subjects were able to successfully perform online control of multiple grasping motions of a real wearable robot hand, reporting high reliability over repeated grasp-transportation-release tasks with different objects. Statistical support is provided along with experimental outcomes.
|
|
10:30-12:00, Paper WeAT15-AX.5 | Add to My Program |
Maximising Coefficiency of Human-Robot Handovers through Reinforcement Learning |
|
Lagomarsino, Marta | Istituto Italiano Di Tecnologia |
Lorenzini, Marta | Istituto Italiano Di Tecnologia |
Constable, Merryn Dale | Northumbria University |
De Momi, Elena | Politecnico Di Milano |
Becchio, Cristina | University Medical Center Hamburg-Eppendorf |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Human Factors and Human-in-the-Loop, Physical Human-Robot Interaction, Human-Centered Robotics
Abstract: Handing objects to humans is an essential capability for collaborative robots. Previous research works on human-robot handovers focus on facilitating the performance of the human partner and possibly minimising the physical effort needed to grasp the object. However, altruistic robot behaviours may result in protracted and awkward robot motions, contributing to unpleasant sensations by the human partner and affecting perceived safety and social acceptance. This paper investigates whether transferring the psychological principle that "humans act coefficiently as a group" (i.e. simultaneously maximising the benefits of all agents involved) to human-robot cooperative tasks promotes a more seamless and natural interaction. Human-robot coefficiency is first modelled by identifying implicit indicators of human comfort and discomfort as well as calculating the robot energy consumption in performing the desired trajectory. We then present a reinforcement learning approach that uses the human-robot coefficiency score as reward to adapt and learn online the combination of robot interaction parameters that maximises such coefficiency. Results proved that by acting coefficiently the robot could meet the individual preferences of most subjects involved in the experiments, improve the human perceived comfort, and foster trust in the robotic partner.
|
|
10:30-12:00, Paper WeAT15-AX.6 | Add to My Program |
Jacquard V2: Refining Datasets Using the Human in the Loop Data Correction Method |
|
Li, Qiuhao | Northeastern University |
Yuan, Shenghai | Nanyang Technological University |
Keywords: Human Factors and Human-in-the-Loop, Learning Categories and Concepts, Data Sets for Robotic Vision
Abstract: In the context of rapid advancements in industrial automation, vision-based robotic grasping plays an increasingly crucial role. In order to enhance visual recognition accuracy, the utilization of large-scale datasets is imperative for training models to acquire implicit knowledge related to the handling of various objects. Creating datasets from scratch is a time- and labor-intensive process. Moreover, existing datasets often contain errors due to automated annotations aimed at expediency, making the improvement of these datasets a substantial research challenge. Consequently, several issues have been identified in the annotation of grasp bounding boxes within the popular Jacquard Grasp dataset. We propose utilizing a Human-In-The-Loop (HIL) method to enhance dataset quality. This approach relies on backbone deep learning networks to predict object positions and orientations for robotic grasping. Predictions with Intersection over Union (IOU) values below 0.2 undergo an assessment by human operators. After their evaluation, the data is categorized into False Negatives (FN) and True Negatives (TN). FN are then subcategorized into either missing annotations or catastrophic labeling errors. Images lacking labels are augmented with valid grasp bounding box information, whereas images afflicted by catastrophic labeling errors are completely removed. The open-source tool Labelbee was employed for 53,026 iterations of HIL dataset enhancement, leading to the removal of 2,884 images and the incorporation of ground truth information for 30,292 images. The enhanced dataset, named the Jacquard V2 Grasping Dataset, served as the training data for a range of neural networks. We have empirically demonstrated that these dataset improvements significantly enhance the training and prediction performance of the same network, resulting in an increase of 7.1% across most popular detection architectures for ten iterations. This refined dataset will be accessible on Google Drive and Baidu Netdisk.
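The IoU-based triage described above can be sketched with a plain axis-aligned IoU check; the box coordinates below are illustrative, and only the 0.2 threshold comes from the abstract.

```python
# Flag predictions whose IoU against the existing annotation falls below 0.2.
def iou(box_a, box_b):
    """Axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter + 1e-9)

predicted = (10, 10, 50, 50)
annotated = (12, 11, 48, 52)
needs_review = iou(predicted, annotated) < 0.2
print(iou(predicted, annotated), needs_review)   # high IoU -> no human review needed
```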
|
|
10:30-12:00, Paper WeAT15-AX.7 | Add to My Program |
Decision Making for Human-In-The-Loop Robotic Agents Via Uncertainty-Aware Reinforcement Learning |
|
Singi, Siddharth | Columbia University |
He, Zhanpeng | Columbia University |
Pan, Alvin | Columbia University |
Patel, Sandipkumar | Columbia University |
Sigurdsson, Gunnar | Amazon |
Piramuthu, Robinson | Amazon |
Song, Shuran | Columbia University |
Ciocarlie, Matei | Columbia University |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Human-Centered Robotics
Abstract: In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed. However, knowing when to request such assistance is critical: too few requests can lead to the robot making mistakes, but too many requests can overload the expert. In this paper, we present a Reinforcement learning-based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task. The confidence level is computed by estimating the variance of the return from the current state. We iteratively improve this estimate during training using a Bellman-like recursion. On discrete navigation problems with both fully- and partially-observable state information, we show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time. To the best of our knowledge, this is the first instance of using the variance of the return computed in an RL framework as a guidance measure for a Human-in-the-Loop agent.
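Estimating the variance of the return alongside the value via a Bellman-like recursion on the second moment can be illustrated on a toy chain MDP; the tiny MDP below is made up for the example and is not the paper's environment.

```python
# Track the second moment M of the return next to the value V, then Var = M - V^2.
import numpy as np

gamma = 0.9
# Two non-terminal states: s0 -> s1 -> terminal; s1 ends with a coin-flip reward.
rewards = {0: [(1.0, 0.0)], 1: [(0.5, 1.0), (0.5, -1.0)]}   # state -> [(prob, reward)]
nxt = {0: 1, 1: None}                                        # None = terminal

V = np.zeros(2)   # expected return
M = np.zeros(2)   # second moment of the return
for _ in range(50):
    for s in (1, 0):
        v_n = V[nxt[s]] if nxt[s] is not None else 0.0
        m_n = M[nxt[s]] if nxt[s] is not None else 0.0
        V[s] = sum(p * (r + gamma * v_n) for p, r in rewards[s])
        M[s] = sum(p * (r ** 2 + 2 * gamma * r * v_n + gamma ** 2 * m_n)
                   for p, r in rewards[s])

variance = M - V ** 2
print(V, variance)   # V = [0, 0]; Var = [0.81, 1.0] -> high variance could trigger a request for help
```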
|
|
10:30-12:00, Paper WeAT15-AX.8 | Add to My Program |
Building User Proficiency in Piloting Small Unmanned Aerial Vehicles (sUAV) |
|
Kunde, Siya | University of Nebraska |
Duncan, Brittany | University of Nebraska, Lincoln |
Keywords: Human Factors and Human-in-the-Loop, Design and Human Factors, Long term Interaction
Abstract: Assessing the proficiency of small unmanned aerial vehicle (sUAV) pilots is complex and not well understood, but increasingly important for employing these vehicles in serious jobs such as wildland firefighting and infrastructure inspection. The limited prior work with UAVs has focused on user training using modalities like simulators and VR, with no performance assessments for line-of-sight UAVs. This paper presents a training methodology for novice pilots of sUAVs. We presented two studies: the Baseline study (21 participants) and the Training study (16 participants). Our work is of interest to sUAV operators, regulators, and companies developing these technologies to produce a workforce capable of consistent, safe operations. We successfully utilized the method developed in [kunde2022recognizing] to assess user proficiency in flying UAVs. We presented a UAV pilot training schedule for novice users (in the Training study), and were able to determine the minimum training time necessary to observe performance gains and mitigate damage. Results indicate that task completions noticeably improved and crashes were minimized by day 10 of training, with a training plateau observed by day 15.
|
|
10:30-12:00, Paper WeAT15-AX.9 | Add to My Program |
A Probabilistic Model for Cobot Decision Making to Mitigate Human Fatigue in Repetitive Co-Manipulation Tasks |
|
Yaacoub, Aya | LORIA-CNRS |
Thomas, Vincent | LORIA - Universite De Lorraine |
Colas, Francis | Inria Nancy Grand Est |
Maurice, Pauline | Cnrs - Loria |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Planning under Uncertainty
Abstract: Work-related musculoskeletal disorders (WMSDs) are very common. Repetitive motion, which is often present in industrial work, is one of the main physical causes of WMSDs. It uses the same set of human joints repeatedly, which leads to localized joint fatigue. In this work, we present a framework to plan a policy of a collaborative robot that reduces human fatigue in the long term, in highly repetitive co-manipulation tasks, while taking into account the uncertainty in the human postural reaction to the robot motion and the partial observability of the human fatigue state. We model the problem using a continuous-state Partially Observable Markov Decision Process (POMDP), and use a physics-based digital human simulator to predict the fatigue cost of the possible robot actions. We then use an online planning algorithm to compute the optimal robot policy. We demonstrate our approach on a simulated experiment in which a robot repeatedly carries an object for the human to work on, and the object Cartesian pose needs to be optimized. We compare the policy generated with our approach with a random, a cyclic and a greedy (short-term optimization) policy, for different user profiles. We show that our approach outperforms the other policies on all tested scenarios.
|
|
WeAT16-AX Oral Session, AX-204 |
Add to My Program |
Force and Tactile Sensing IV |
|
|
Chair: Konyo, Masashi | Tohoku University |
Co-Chair: Chen, Chao | Monash University |
|
10:30-12:00, Paper WeAT16-AX.1 | Add to My Program |
GelRoller: A Rolling Vision-Based Tactile Sensor for Large Surface Reconstruction Using Self-Supervised Photometric Stereo Method |
|
Zhang, Zhiyuan | Huazhong University of Science and Technology |
Ma, Huan | Huazhong University of Science and Technology |
Zhou, Yulin | Huazhong University of Science and Technology |
Ji, Jingjing | Huazhong University of Science and Technology |
Yang, Hua | Huazhong University of Science and Technology |
Keywords: Force and Tactile Sensing, Deep Learning in Grasping and Manipulation, Product Design, Development and Prototyping
Abstract: Accurate perception of the surrounding environment stands as a primary objective for robots. Through tactile interaction, vision-based tactile sensors provide the capability to capture high-resolution and multi-modal surface information of objects, thereby facilitating robots in achieving more dexterous manipulations. However, the prevailing GelSight sensors entail intricate calibration procedures, posing challenges in their application on curved surfaces and requiring the maintenance of stable lighting conditions throughout experimentation. Additionally, constrained by shape and structure, current vision-based tactile sensors are predominantly applied to measurements within a limited area. In this study, we design a novel cylindrical vision-based tactile sensor that enables continuous and swift perception of large-scale object surfaces through rolling. To tackle the challenges posed by laborious calibration processes, we propose a self-supervised photometric stereo method based on deep learning, which eliminates pre-calibration requirements and enables the derivation of surface normals from a single image without relying on stable lighting conditions. Finally, we perform surface reconstruction from normals and point cloud registration on the multiple frames of images obtained by rolling the cylindrical sensor, resulting in large-scale surface reconstruction. We compare our method with the representative lookup table method used in GelSight sensors. The results show that the proposed method enhances both reconstruction accuracy and robustness, thereby demonstrating the potential of the proposed sensor in large-scale surface reconstruction. Codes and mechanical structures are available at: https://github.com/ZhangZhiyuanZhang/GelRoller
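For context, classical calibrated photometric stereo recovers a surface normal from per-pixel intensities under known lights by least squares; the paper replaces this calibrated setup with a self-supervised, single-image method, so the sketch below is background rather than their approach.

```python
# Calibrated Lambertian photometric stereo at one pixel: solve I = L (albedo * n).
import numpy as np

L = np.array([[0.0, 0.0, 1.0],        # known light directions (unit vectors)
              [0.7, 0.0, 0.714],
              [0.0, 0.7, 0.714]])

true_normal = np.array([0.3, -0.2, 1.0])
true_normal /= np.linalg.norm(true_normal)
albedo = 0.8
I = albedo * L @ true_normal           # synthetic image intensities at one pixel

g, *_ = np.linalg.lstsq(L, I, rcond=None)        # g = albedo * n
print(np.linalg.norm(g), g / np.linalg.norm(g))  # recovered albedo and unit normal
```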
|
|
10:30-12:00, Paper WeAT16-AX.2 | Add to My Program |
Marker-Embedded Tactile Image Generation Via Generative Adversarial Networks |
|
Kim, Won Dong | Korea Advanced Institute of Science & Technology (KAIST) |
Yang, Sanghoon | KAIST |
Kim, Woojong | KAIST |
Kim, Jeong-Jung | Korea Institute of Machinery & Materials (KIMM) |
Kim, Chang-Hyun | Korea Institute of Machinery and Materials (KIMM) |
Kim, Jung | KAIST |
Keywords: Force and Tactile Sensing, Deep Learning Methods, Simulation and Animation
Abstract: Data-driven methods have been successfully applied to images from vision-based tactile sensors to fulfill various manipulation tasks. Nevertheless, these methods remain inefficient because of the lack of methods for simulating the sensors. Relevant research on simulating vision-based tactile sensors generally focuses on generating images without markers, owing to the challenges in accurately generating marker motions caused by elastomer deformation. This disallows access to tactile information deducible from markers. In this work, we propose a generative adversarial network (GAN)-based method to generate realistic marker-embedded tactile images in GelSight-like vision-based tactile sensors. We trained the proposed GAN model with an aligned real tactile and simulated depth image dataset obtained from deforming the sensor against various objects. This allows the model to translate simulated depth image sequences into RGB tactile images with markers. Furthermore, the generator in the proposed GAN allows the network to integrate the history of deformations from the depth image sequences to generate realistic marker motions during the normal and lateral sensor deformations. We evaluated and compared the positional accuracy of the markers and image similarity metrics of the images generated via our method with those from prior methods. The generated tactile images from the proposed model show a 28.3 % decrease in marker positional error and a 93.5 % decrease in the image similarity metric error compared with prior methods.
|
|
10:30-12:00, Paper WeAT16-AX.3 | Add to My Program |
TEXterity: Tactile Extrinsic DeXterity |
|
Bronars, Antonia | MIT |
Kim, Sangwoon | Massachusetts Institute of Technology |
Patre, Parag | Magna International |
Rodriguez, Alberto | Massachusetts Institute of Technology |
Keywords: Force and Tactile Sensing, In-Hand Manipulation, Perception for Grasping and Manipulation
Abstract: We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans in a receding horizon fashion to control the pose of a grasped object. This approach consists of a discrete pose estimator that tracks the most likely sequence of object poses in a coarsely discretized grid, and a continuous pose estimator-controller to refine the pose estimate and accurately manipulate the pose of the grasped object. Our method is tested on diverse objects and configurations, achieving desired manipulation objectives and outperforming single-shot methods in estimation accuracy. The proposed approach holds potential for tasks requiring precise manipulation and limited intrinsic in-hand dexterity under visual occlusion, laying the foundation for closed-loop behavior in applications such as regrasping, insertion, and tool use. Please see https://sites.google.com/view/texterity for videos of real-world demonstrations.
|
|
10:30-12:00, Paper WeAT16-AX.4 | Add to My Program |
Optimization of Flexible Bronchoscopy Shape Sensing Using Fiber Optic Sensors |
|
Liu, Xinran | University of Chinese Academy of Sciences |
Chen, Hao | University of Chinese Academy of Sciences |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Force and Tactile Sensing, Intelligent and Flexible Manufacturing
Abstract: This work presents a novel shape evaluation and optimization approach for shape sensing, specifically targeting the constrained, irregular, and intricate spatial shapes of flexible bronchoscopes (FB) in the human bronchial tree. The proposed evaluation criteria and optimization methods combine clinical significance related to bronchial anatomical structures and address issues related to singular points and discontinuities in traditional shape reconstruction models. Three-dimensional experiments were conducted within eight spatially complex configurations printed from a proportional bronchial model. The 3D experiment results demonstrate an average reduction of approximately 34.1% in shape reconstruction errors across all eight airway models compared to the traditional model, validating the effectiveness and feasibility of the proposed approach.
|
|
10:30-12:00, Paper WeAT16-AX.5 | Add to My Program |
Tactile-Informed Action Primitives Mitigate Jamming in Dense Clutter |
|
Brouwer, Dane | Stanford University |
Citron, Joshua | Stanford University |
Choi, Hojung | Stanford University |
Lepert, Marion | Stanford University |
Lin, Michael A. | Stanford University |
Bohg, Jeannette | Stanford University |
Cutkosky, Mark | Stanford University |
Keywords: Force and Tactile Sensing, Multi-Contact Whole-Body Motion Planning and Control
Abstract: It is difficult for robots to retrieve objects in densely cluttered lateral access scenes with movable objects as jamming against adjacent objects and walls can inhibit progress. We propose the use of two action primitives---burrowing and excavating---that can fluidize the scene to un-jam obstacles and enable continued progress. Even when these primitives are implemented in an open loop manner at clock-driven intervals, we observe a decrease in the final distance to the target location. Furthermore, we combine the primitives into a closed loop hybrid control strategy using tactile and proprioceptive information to leverage the advantages of both primitives without being overly disruptive. In doing so, we achieve a 10-fold increase in success rate above the baseline control strategy and significantly improve completion times as compared to the primitives alone or a naive combination of them.
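A hedged sketch of the kind of tactile-informed switching logic the abstract describes: primitives are triggered only when contact force and end-effector progress indicate jamming, rather than at fixed clock intervals. The thresholds and the rule for choosing between burrowing and excavating are illustrative assumptions, not the authors' policy.

    # Toy primitive switcher (hypothetical thresholds and names).
    def select_primitive(tactile_force, ee_progress, force_thresh=8.0, progress_thresh=1e-3):
        jammed = tactile_force > force_thresh and ee_progress < progress_thresh
        if not jammed:
            return "advance"    # keep reaching toward the target object
        # heuristic: very high contact force -> excavate above, otherwise burrow beneath
        return "excavate" if tactile_force > 2 * force_thresh else "burrow"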
|
|
10:30-12:00, Paper WeAT16-AX.6 | Add to My Program |
Crosstalk-Free Impedance-Separating Array Measurement for Iontronic Tactile Sensors |
|
Hou, Funing | Fudan University |
Li, Gang | Hebei University of Technology |
Mu, Chenxing | Hebei University of Technology |
Shi, Mengqi | Hebei University of Technology |
Liu, Jixiao | Hebei University of Technology |
Guo, Shijie | Hebei University of Technology |
Keywords: Force and Tactile Sensing, Physical Human-Robot Interaction
Abstract: Iontronic tactile sensors are promising for measuring spatial-temporal contact information with high performance. However, no suitable measuring method has been presented, due to issues with crosstalk and non-negligible equivalent resistance. Hence, this study presents an impedance-separating method, which does not require complex analog components. A general Quadri-Terminal Impedance Network (QTIN) model is introduced to reduce crosstalk, which has specific compatibility with the impedance-separating method. The precise sensing ranges are measured, showing non-rectangular shapes suitable for the response of iontronic tactile sensors. A simple denoising method is provided that noticeably reduces the initial array noise. This work could benefit various scenarios, such as human-robot interaction and physiological information monitoring.
|
|
10:30-12:00, Paper WeAT16-AX.7 | Add to My Program |
Visual-Tactile Learning of Garment Unfolding for Robot-Assisted Dressing |
|
Zhang, Fan | Honda Research Institute EU |
Demiris, Yiannis | Imperial College London |
Keywords: Force and Tactile Sensing, Physical Human-Robot Interaction, Manipulation Planning
Abstract: Assistive robots have the potential to support disabled and elderly people in daily dressing activities. An intermediate stage of dressing is to manipulate the garment from a crumpled initial state to an unfolded configuration that facilitates robust dressing. Applying quasi-static grasping actions with vision feedback on garment unfolding usually suffers from occluded grasping points. In this work, we propose a dynamic manipulation strategy: tracing the garment edge until the hidden corner is revealed. We introduce a model-based approach, where a deep visual-tactile predictive model iteratively learns to perform servoing from raw sensor data. The predictive model is formalized as a Conditional Variational Autoencoder with contrastive optimization, which jointly learns underlying visual-tactile latent representations, a latent garment dynamics model, and future predictions of garment states. Two cost functions are explored: the visual cost, defined by garment corner positions, drives the gripper towards the corner, while the tactile cost, defined by garment edge poses, prevents the garment from slipping out of the gripper. The experimental results demonstrate the improvement of our contrastive visual-tactile model predictive control over single-modality sensing and baseline model-learning techniques. The proposed method enables a robot to unfold back-opening hospital gowns and perform upper-body dressing.
|
|
10:30-12:00, Paper WeAT16-AX.8 | Add to My Program |
Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training |
|
Dave, Vedant | Montanuniversität Leoben |
Lygerakis, Fotios | University of Leoben |
Rueckert, Elmar | Montanuniversitaet Leoben |
Keywords: Force and Tactile Sensing, Representation Learning
Abstract: The rapidly evolving field of robotics necessitates methods that can facilitate the fusion of multiple modalities. Specifically, when it comes to interacting with tangible objects, effectively combining visual and tactile sensory data is key to understanding and navigating the complex dynamics of the physical world, enabling a more nuanced and adaptable response to changing environments. Nevertheless, much of the earlier work in merging these two sensory modalities has relied on supervised methods utilizing datasets labeled by humans. This paper introduces MViTac, a novel methodology that leverages contrastive learning to integrate vision and touch sensations in a self-supervised fashion. By availing both sensory inputs, MViTac leverages intra and inter-modality losses for learning representations, resulting in enhanced material property classification and more adept grasping prediction. Through a series of experiments, we showcase the effectiveness of our method and its superiority over existing state-of-the-art self-supervised and supervised techniques. In evaluating our methodology, we focus on two distinct tasks: material classification and grasping success prediction. Our results indicate that MViTac facilitates the development of improved modality encoders, yielding more robust representations as evidenced by linear probing assessments.
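For readers unfamiliar with intra- and inter-modality contrastive objectives, the following is a minimal InfoNCE-style sketch over paired visual and tactile embeddings. The temperature, batch pairing, and function names are assumptions and do not reproduce the MViTac implementation.

    # Minimal contrastive-loss sketch (illustrative, not the paper's code).
    import torch
    import torch.nn.functional as F

    def info_nce(z_a, z_b, temperature=0.07):
        z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
        logits = z_a @ z_b.t() / temperature                    # (N, N) similarity matrix
        targets = torch.arange(z_a.size(0), device=z_a.device)  # positives on the diagonal
        return F.cross_entropy(logits, targets)

    def multimodal_contrastive_loss(vis_1, vis_2, tac_1, tac_2):
        intra = info_nce(vis_1, vis_2) + info_nce(tac_1, tac_2)  # within-modality views
        inter = info_nce(vis_1, tac_1)                           # across modalities
        return intra + inter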
|
|
10:30-12:00, Paper WeAT16-AX.9 | Add to My Program |
A Hierarchical Framework for Robot Safety Using Whole-Body Tactile Sensors |
|
Jiang, Shuo | Northeastern University |
Wong, Lawson L.S. | Northeastern University |
Keywords: Force and Tactile Sensing, Robot Safety, Multi-Contact Whole-Body Motion Planning and Control
Abstract: Using tactile signals is a natural way to perceive potential dangers and safeguard robots. One possible method is to use full-body tactile sensors on the robot and perform safety maneuvers when dangerous stimuli are detected. In this work, we propose a method based on full-body tactile sensors that operates at three different levels of granularity to ensure that the robot interacts with the environment safely. The results show that our system dramatically reduces the overall collision chance compared with several baselines and intelligently handles ongoing collisions. Our proposed framework is generalizable to a wide variety of robots, enabling them to predict and avoid dangerous collisions and reactively handle accidental tactile stimuli.
|
|
WeAT17-AX Oral Session, AX-205 |
Add to My Program |
Legged Robots IV |
|
|
Chair: Zhao, Ye | Georgia Institute of Technology |
Co-Chair: Kober, Jens | TU Delft |
|
10:30-12:00, Paper WeAT17-AX.1 | Add to My Program |
Robust Jumping with an Articulated Soft Quadruped Via Trajectory Optimization and Iterative Learning |
|
Ding, Jiatao | Delft University of Technology |
van Löben Sels, Mees Alexander | TU Delft |
Angelini, Franco | University of Pisa |
Kober, Jens | TU Delft |
Della Santina, Cosimo | TU Delft |
Keywords: Legged Robots, Optimization and Optimal Control, Modeling, Control, and Learning for Soft Robots
Abstract: Quadrupeds deployed in real-world scenarios need to be robust to unmodelled dynamic effects. In this work, we aim to increase the robustness of quadrupedal periodic forward jumping (i.e., pronking) by unifying cutting-edge model-based trajectory optimization and iterative learning control. Using a reduced-order soft anchor model, the optimization-based motion planner generates the periodic reference trajectory. The controller then iteratively learns the feedforward control signal in a repetition process, without requiring an accurate full-body model. When enhanced by a continuous learning mechanism, the proposed controller can learn the control input without resetting the system at the end of each iteration. Simulations and experiments on a quadruped with parallel springs demonstrate that continuous jumping can be learned in a matter of minutes, with high robustness against various types of terrain.
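The core iterative learning control idea, refining a feedforward signal from the previous repetition's tracking error, can be sketched as follows. The learning gain and the simulate_cycle stand-in are illustrative assumptions, not the paper's exact update law.

    # Illustrative ILC update for a periodic jump (not the authors' implementation).
    import numpy as np

    def ilc_update(u_ff, error, gain=0.3):
        """u_ff, error: arrays of length T (one jump period); returns the next feedforward."""
        return u_ff + gain * error

    # usage over repeated jumps; simulate_cycle is a hypothetical robot/simulator rollout
    # u = np.zeros(T)
    # for k in range(num_iterations):
    #     error = simulate_cycle(u)       # reference minus measured trajectory
    #     u = ilc_update(u, error)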
|
|
10:30-12:00, Paper WeAT17-AX.2 | Add to My Program |
Unlocking Versatile Locomotion: A Novel Quadrupedal Robot with 4-DoFs Legs for Roller Skating |
|
Chen, Jiawei | Beihang University |
Qin, Ripeng | Inner Mongolia University of Science and Technology |
Huang, Longfei | Beihang University, Beijing Institute of Spacecraft System |
He, Zongbo | Beijing Institute of Spacecraft System Engineering |
Xu, Kun | Beijing University |
Ding, Xilun | Beijing Univerisity of Aeronautics and Astronautics |
Keywords: Legged Robots, Mechanism Design, Motion Control
Abstract: Roller skating with passive wheels on a quadrupedal robot is more efficient than traditional walking. However, the typical mammalian quadruped robot with 3-DoFs legs can only perform one dynamic roller skating gait and has difficulty achieving turning motion. To address this limitation, we designed a novel quadrupedal robot with each leg having 4-DoFs to enable various roller skating locomotion including Swizzling, Stroking, and trot-like gaits while easily achieving turning motions. We considered the geometrical characteristics of the passive wheel and used the Levenberg-Marquardt method in robot kinematics to improve precision for both roller skating kinematics and contact point position for the dynamics controller. The position of the robot foot and the yaw angle of the passive wheel are decoupled for motion planning of all proposed gaits. Our proposed kinematics with wheeled geometry was verified through experiments to have higher precision, while the feasibility of all proposed roller-skating gaits was confirmed during straight motion and turning motion with a small radius on our prototype robot. Finally, we discussed the mobility efficiency of different roller skating gaits which were found to be more efficient than walking.
|
|
10:30-12:00, Paper WeAT17-AX.3 | Add to My Program |
Efficient Terrain Map Using Planar Regions for Footstep Planning on Humanoid Robots |
|
Mishra, Bhavyansh | Institute of Human and Machine Cognition, University of West Flo |
Calvert, Duncan | IHMC, UWF |
Bertrand, Sylvain | Institute for Human and Machine Cognition |
Pratt, Jerry | Inst. for Human and Machine Cognition |
Sevil, Hakki Erhan | University of West Florida |
Griffin, Robert J. | Institute for Human and Machine Cognition (IHMC) |
Keywords: Humanoid and Bipedal Locomotion, Legged Robots, Mapping
Abstract: Humanoid robots possess the ability to perform complex tasks in challenging environments. However, they require a model of the surroundings in a representation that is sufficient for downstream tasks such as footstep planning. The maps generated by existing mapping algorithms are either sparse, insufficient for footstep planning, memory intensive, or too slow for dynamic humanoid behaviors. In this work, we develop a mapping algorithm that combines planar region measurements with kinematic-inertial state estimates to build a dense but efficient map of bounded planar surfaces. We present novel algorithms for plane feature matching, tracking, and registration for mapping within a factor graph framework. The generated map is not only memory efficient, but also offers higher reliability and speed in bipedal footstep planning than was previously possible. The complete algorithm is also demonstrated using a full-scale humanoid robot, Nadia, walking over both flat ground and rough terrain utilizing the generated terrain map.
|
|
10:30-12:00, Paper WeAT17-AX.4 | Add to My Program |
Convergent iLQR for Safe Trajectory Planning and Control of Legged Robots |
|
Zhu, James | Carnegie Mellon University |
Payne, J. Joe | Carnegie Mellon University |
Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Optimization and Optimal Control, Robust/Adaptive Control
Abstract: In order to perform highly dynamic and agile maneuvers, legged robots typically spend time in underactuated domains (e.g. with feet off the ground) where the system has limited command of its acceleration and a constrained amount of time before transitioning to a new domain (e.g. foot touchdown). Meanwhile, these transitions can instantaneously change the system’s state, possibly causing perturbations to be mapped arbitrarily far away from the target trajectory. These properties make it difficult for local feedback controllers to effectively recover from disturbances as the system evolves through underactuated domains and hybrid impact events. To address this, we utilize the fundamental solution matrix that characterizes the evolution of perturbations through a hybrid trajectory and its 2-norm, which represents the worst-case growth of perturbations. In this paper, the worst-case perturbation analysis is used to explicitly reason about the tracking performance of a hybrid trajectory and is incorporated in an iLQR framework to optimize a trajectory while taking into account the closed-loop convergence of the trajectory under an LQR tracking controller. The generated convergent trajectories recover more effectively from perturbations, are more robust to large disturbances, and use less feedback control effort than trajectories generated with traditional methods.
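A minimal sketch of the worst-case perturbation measure described above, assuming the linearized closed-loop flow and impact (saltation-type) matrices along the hybrid trajectory are available; it illustrates the idea rather than the authors' implementation.

    # Worst-case perturbation growth over a hybrid trajectory (illustrative sketch).
    import numpy as np

    def worst_case_growth(transition_matrices):
        """transition_matrices: linearized closed-loop flow and impact matrices in time
        order. Returns the induced 2-norm of their product, i.e. the worst-case gain
        applied to an initial perturbation over the trajectory."""
        Phi = np.eye(transition_matrices[0].shape[0])
        for M in transition_matrices:
            Phi = M @ Phi
        return np.linalg.norm(Phi, 2)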
|
|
10:30-12:00, Paper WeAT17-AX.5 | Add to My Program |
Optimization Based Dynamic Skateboarding of Quadrupedal Robot |
|
Xu, Zhe | Beijing Institute of Technology |
Al-Khulaqui, Mohamed | Xiaomi Inc |
Ma, Hanxin | Beihang University |
Wang, Jiajun | UBTECH Robotics |
Xin, Quanbin | Beijing Xiaomi Robot Technology Co., Ltd |
You, Yangwei | Institute for Infocomm Research |
Zhou, Mingliang | Beijing Xiaomi Mobile Software Co., Ltd |
Xiang, Diyun | XIAOMI |
Zhang, Shiwu | University of Science and Technology of China |
Keywords: Legged Robots, Optimization and Optimal Control, Whole-Body Motion Planning and Control
Abstract: Robot skateboarding is an unexplored and challenging task for legged robots. Accurately modeling the dynamics of dual floating bases and developing effective planning and control methods present significant complexities in accomplishing skateboarding behavior. This paper focuses on enabling the quadrupedal platform CyberDog2 to achieve dynamic balancing and acceleration on a skateboard. An optimization-based control pipeline is developed through careful derivation of the system's equations of motion, considering both the robot and skateboard dynamics. By accounting for system physical constraints, an advanced offline trajectory optimization method is employed to generate various acceleration trajectories, creating a motion library for the system. An online linear model predictive control with whole-body control framework is used to track the generated trajectories and stabilize the system in real time. To validate its effectiveness, we conducted experiments in various scenarios. The quadrupedal robot successfully performed acceleration from a static state to various velocities and demonstrated the ability to balance and steer the skateboard.
|
|
10:30-12:00, Paper WeAT17-AX.6 | Add to My Program |
Hierarchical Experience-Informed Navigation for Multi-Modal Quadrupedal Rebar Grid Traversal |
|
Asselmeier, Maxwell | Georgia Institute of Technology |
Ivanova, Evgeniia | SkyMul |
Zhou, Ziyi | Georgia Institute of Technology |
Vela, Patricio | Georgia Institute of Technology |
Zhao, Ye | Georgia Institute of Technology |
Keywords: Legged Robots, Robotics and Automation in Construction, Constrained Motion Planning
Abstract: This study focuses on a layered, experience-based, multi-modal contact planning framework for agile quadrupedal locomotion over a constrained rebar environment. To this end, our hierarchical planner incorporates locomotion-specific modules into the high-level contact sequence planner and solves kinodynamically-aware trajectory optimization as the low-level motion planner. Through quantitative analysis of the experience accumulation process and experimental validation of the kinodynamic feasibility of the generated locomotion trajectories, we demonstrate that the experience planning heuristic offers an effective way of providing candidate footholds for a legged contact planner. Additionally, we introduce a guiding torso path heuristic at the global planning level to enhance the navigation success rate in the presence of environmental obstacles. Our results indicate that the torso-path guided experience accumulation requires significantly fewer offline trials to successfully reach the goal compared to regular experience accumulation. Finally, our planning framework is validated in both dynamics simulations and real hardware implementations on a quadrupedal robot provided by Skymul Inc.
|
|
10:30-12:00, Paper WeAT17-AX.7 | Add to My Program |
Learning-Based Propulsion Control for Amphibious Quadruped Robots with Dynamic Adaptation to Changing Environment |
|
Yao, Qingfeng | Shenyang Institute of Automation, Chinese Academy of Sciences |
Meng, Linghan | Shenyang Institute of Automation |
Zhang, Qifeng | Shenyang Institute of Automation, CAS |
Zhao, Jing | Shenyang Institute of Automation (SIA), Chinese Academy of Scien |
Pajarinen, Joni | Aalto University |
Wang, Xiaohui | Heriot-Watt University |
Li, Zhibin (Alex) | University College London |
Wang, Cong | Delft University of Technology (TU Delft) |
Keywords: Legged Robots, Robust/Adaptive Control, Reinforcement Learning
Abstract: This paper proposes a learning-based adaptive propulsion control (APC) method for a quadruped robot integrated with thrusters in amphibious environments, allowing it to move efficiently in water while maintaining its ground locomotion capabilities. We designed the specific reinforcement learning method to train the neural network to perform the vector propulsion control. Our approach coordinates the legs and propeller, enabling the robot to achieve speed and trajectory tracking tasks in the presence of actuator failures and unknown disturbances. Our simulated validations of the robot in water demonstrate the effectiveness of the trained neural network to predict the disturbances and actuator failures based on historical information, showing that the framework is adaptable to changing environments and is suitable for use in dynamically changing situations. Our proposed approach is suited to the hardware augmentation of quadruped robots to create avenues in the field of amphibious robotics and expand the use of quadruped robots in various applications.
|
|
10:30-12:00, Paper WeAT17-AX.8 | Add to My Program |
Reinforcement Learning for Blind Stair Climbing with Legged and Wheeled-Legged Robots |
|
Chamorro, Simon | Université De Sherbrooke |
Klemm, Victor | ETH Zurich |
de la Iglesia Valls, Miguel | ETH Zürich |
Pal, Chris | Polytechnique Montreal |
Siegwart, Roland | ETH Zurich |
Keywords: Machine Learning for Robot Control, Legged Robots, Humanoid and Bipedal Locomotion
Abstract: In recent years, legged and wheeled-legged robots have gained prominence for tasks in environments predominantly created for humans across various domains. One significant challenge faced by many of these robots is their limited capability to navigate stairs, which hampers their functionality in multi-story environments. This study proposes a method aimed at addressing this limitation, employing reinforcement learning to develop a versatile controller applicable to a wide range of robots. In contrast to the conventional velocity-based controllers, our approach builds upon a position-based formulation of the RL task, which we show to be vital for stair climbing. Furthermore, the methodology leverages an asymmetric actor-critic structure, enabling the utilization of privileged information from simulated environments during training while eliminating the reliance on exteroceptive sensors during real-world deployment. Another key feature of the proposed approach is the incorporation of a boolean observation within the controller, enabling the activation or deactivation of a stair-climbing mode. We present our results on different quadrupeds and bipedal robots in simulation and showcase how our method allows the balancing robot Ascento to climb 15cm stairs in the real world, a task that was previously impossible for this robot.
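The asymmetric actor-critic observation split and the boolean stair-climbing flag can be sketched as below. The specific observation contents and dimensions are illustrative assumptions, not the paper's configuration.

    # Illustrative observation construction for an asymmetric actor-critic setup.
    import numpy as np

    def actor_observation(joint_pos, joint_vel, base_imu, target_position, stair_mode: bool):
        # proprioception plus position command plus the boolean stair-climbing flag;
        # no exteroceptive sensing is needed at deployment time
        return np.concatenate([joint_pos, joint_vel, base_imu, target_position,
                               [1.0 if stair_mode else 0.0]])

    def critic_observation(actor_obs, heightmap, contact_states):
        # privileged, simulation-only information used during training
        return np.concatenate([actor_obs, heightmap.ravel(), contact_states])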
|
|
10:30-12:00, Paper WeAT17-AX.9 | Add to My Program |
Modeling and Analysis of Combined Rimless Wheel with Tensegrity Spine |
|
Xiang, Yuxuan | Japan Advanced Institute of Science and Technology |
Zheng, Yanqiu | Ritsumeikan University |
Asano, Fumihiko | Japan Advanced Institute of Science and Technology |
Keywords: Passive Walking, Legged Robots
Abstract: In the natural world, quadrupeds benefit from the advantages of the spine and exhibit extraordinary flexibility, which allows them to move efficiently over variable terrain. Previous research has indicated that legged robots which efficiently utilize their spine can achieve rapid and stable locomotion. However, within the field of legged robot dynamics, how to design the spine and how it positively influences locomotion remain unclear, even though this understanding is essential for quadruped robots to achieve efficient and stable walking. In this study, we propose a model composed of a tensegrity spine and a rimless wheel to represent quadrupeds, and we use passive dynamic walking, a well-established method for observing inherent dynamic characteristics, to examine the locomotion characteristics of the proposed model. Through numerical simulation, we observed how locomotion performance changes with the configuration of the spine's shape and identified spine design directions that have a positive impact on walking. These findings contribute to the design of spine structures in quadruped robots.
|
|
WeAT18-AX Oral Session, AX-206 |
Add to My Program |
Force Control and Sensing |
|
|
Chair: Tsuji, Toshiaki | Saitama University |
Co-Chair: Huang, Guoquan | University of Delaware |
|
10:30-12:00, Paper WeAT18-AX.1 | Add to My Program |
Robot-Camera Calibration in Tightly Constrained Environment Using Interactive Perception |
|
Zhong, Fangxun | The Chinese University of Hong Kong |
Li, Bin | The Chinese University of Hong Kong |
Chen, Wei | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Calibration and Identification, Sensor-based Control, Motion Control of Manipulators, Surgical Robotics: Laparoscopy
Abstract: Manipulation in tight environments is challenging but increasingly common in vision-guided robotic applications. The significantly reduced amount of available feedback (limited visual cues, field of view, robot motion space, etc.) hinders solving the hand-eye relationship accurately. In this paper, we propose a new generic approach for online camera-robot calibration that can deal with the least feedback input available in tight environments: an arbitrarily restricted motion space and a single feature point with unknown position for the robot end-effector. We introduce interactive perception to generate prescribed but tunable robot motions that reveal high-dimensional sensory feedback, which is not obtainable from static images. We then define the interactive feature plane (IFP), whose spatial property corresponds to the robot-actuating trajectories. A depth-free adaptive controller is proposed based on image feedback, where the converged orientation of the IFP directly harvests the data for solving the hand-eye relationship. Our algorithm requires neither external calibration sensors/objects nor a large-scale data acquisition process. Simulations demonstrate the validity of the proposed approach.
|
|
10:30-12:00, Paper WeAT18-AX.2 | Add to My Program |
Degenerate Motions of Multisensor Fusion-Based Navigation |
|
Lee, Woosik | University of Delaware |
Chen, Chuchu | University of Delaware |
Huang, Guoquan | University of Delaware |
Keywords: Calibration and Identification, SLAM, Sensor Fusion
Abstract: The system observability analysis is of practical importance, for example, due to its ability to identify the unobservable directions of the estimated state, which can influence estimation accuracy and help develop consistent and robust estimators. Recent studies have focused on analyzing the observability of the state of various multisensor systems, with a particular interest in unobservable directions induced by degenerate motions. However, those studies mostly remain within a specific sensor domain and do not extend the understanding to other heterogeneous systems. To this end, in this work, we provide a degenerate motion analysis for general local and global sensor-paired systems, offering insights applicable to a wide range of existing navigation systems. Our analysis identifies 9 degenerate motions, including 5 already identified in the literature and 4 new motions, covering both synchronous and asynchronous sensor-pair cases. Comprehensive numerical studies are conducted to verify the identified motions, show the effect of degenerate motion on state estimation, and demonstrate the generalizability of our analysis to various multisensor systems.
|
|
10:30-12:00, Paper WeAT18-AX.3 | Add to My Program |
Interaction Control for Tool Manipulation on Deformable Objects Using Tactile Feedback |
|
Zhang, Hanwen | Institute of Optics and Electronics, CAS |
Lu, Zeyu | National University of Singapore |
Liang, Wenyu | Institute for Infocomm Research, A*STAR |
Yu, Haoyong | National University of Singapore |
Mao, Yao | Institute of Optics and Electronics, CAS |
Wu, Yan | A*STAR Institute for Infocomm Research |
Keywords: Force and Tactile Sensing, Force Control, Contact Modeling
Abstract: The sense of touch enables humans to perform many delicate tasks on deformable objects and/or in a vision-denied environment. For a robot to achieve similar desirable interactions, such as administering a swab test, tactile information sensed beyond the tool-in-hand is correspondingly crucial for contact state estimation and contact tracking control. In this paper, we propose a tactile-guided planning and control framework using GTac, a hetero-Geneous Tactile sensor tailored for interaction with deformable objects beyond the immediate contact area. The biomimetic GTac in use is an improved version optimised for readout linearity which provides reliability in contact state estimation and force tracking. While a tactile-based classification and manipulation process is designed to estimate and align to the contact angle between the tool and the environment, a Koopman operator-based optimal control scheme is proposed to address challenges in the nonlinear control arising from the interaction with the deformable object. Several experiments are conducted to verify the effectiveness of the proposed framework. The experimental results demonstrate that the proposed framework can achieve accurate contact angle estimation as well as excellent tracking performance and strong robustness in force control.
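As one plausible reading of the Koopman-based control component, the sketch below fits a linear lifted-space model from interaction data in EDMD fashion; the resulting model can then be paired with a standard linear optimal controller. The lifting function, data shapes, and names are assumptions, and the paper's actual formulation may differ.

    # EDMD-style Koopman model fitting (illustrative, not the paper's formulation).
    import numpy as np

    def lift(x):
        return np.concatenate([x, x**2, np.sin(x)])          # hypothetical observables

    def fit_koopman(X, X_next, U):
        """X, X_next: (N, n) states before/after one step; U: (N, m) inputs.
        Returns A, B with z_next ~ A z + B u in the lifted space."""
        Z = np.array([lift(x) for x in X])
        Z_next = np.array([lift(x) for x in X_next])
        G = np.hstack([Z, U])                                 # regressors [z, u]
        K, *_ = np.linalg.lstsq(G, Z_next, rcond=None)        # least-squares fit
        A = K[:Z.shape[1]].T
        B = K[Z.shape[1]:].T
        return A, B                                           # usable with LQR/MPC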
|
|
10:30-12:00, Paper WeAT18-AX.4 | Add to My Program |
Development of an Easy-To-Cut Six-Axis Force Sensor |
|
Kawahara, Takamasa | Saitama University |
Tsuji, Toshiaki | Saitama University |
Keywords: Force and Tactile Sensing, Force Control, Robotics and Automation in Agriculture and Forestry
Abstract: Although the potential demand for force sensors in both robotics and automation is high, the complexity of their structure increases the number of manufacturing processes. As a result, the rising cost of sensors has hindered the practical application of force measurement and force control. In this study, a flexure element with a structure that is easier to cut and machine than conventional ones, consisting of holes through the sides of a cuboid, is proposed to simplify the manufacturing of force sensors. To ensure the safety of the proposed sensor design, an approximate equation is derived to predict the maximum von Mises stress on the flexure element from the design parameters. Subsequently, we clarify how to attach the strain gauge in a position that improves sensitivity. The results of an actual prototype sensor based on the proposed method show that the maximum nonlinearity error and decoupling error in the other axes are 0.442 %R.O. and 0.660 %R.O., respectively, and the performance is comparable to that of conventional force sensors. Because the prototype has a difference in resolution between the axes, a method for improving the resolution isotropy without changing the difficulty of machining is also proposed. In addition, the validity of the proposed method is demonstrated through experiments. Consequently, a force sensor with the same level of performance was developed using the proposed method, and the cutting process was made easier compared to that of conventional sensors.
|
|
10:30-12:00, Paper WeAT18-AX.5 | Add to My Program |
An Ultra-Fast Intrinsic Contact Sensing Method for Medical Instruments with Arbitrary Shape |
|
Cao, Guanglin | Institute of Automation, Chinese Academy of Sciences |
Chen, Mingcong | City University of Hong Kong |
Hu, Jian | Institute of Automation, Chinese Academy of Sciences |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Force Control, Force and Tactile Sensing, Medical Robots and Systems
Abstract: Intraoperative contact sensing has the potential to reduce the risk of surgical errors and enhance manipulation capabilities for medical robots, particularly in contact force control. Current intrinsic force sensing (IFS) methods are limited in application to medical instruments with arbitrary shape, due to high computational time and reliance on precise surface equations. This study presents an ultra-fast IFS method that uses multiple planes to establish surface geometry descriptions. The method can reduce high-order contact mechanical models that need to be solved iteratively to a set of linear equations, and calculate contact location analytically. In addition, a robot motion control approach based on the contact sensing method is proposed to maintain stable contact force and regulate the probe's orientation for robotic ultrasound systems (RUSS). Experimental results show that the contact sensing method is robust to friction and can achieve a mean (±SD) displacement error of 1.04±0.43 mm in contact location with computational time less than 1 ms. The system has been evaluated on a phantom with sinusoidal motion. To the best of our knowledge, this is the first study to validate adaptiveness of RUSS under dynamic conditions. The results demonstrated that the system exhibits comparable manipulation capabilities to human operators with only force sensing, indicating a high level of adaptiveness.
|
|
10:30-12:00, Paper WeAT18-AX.6 | Add to My Program |
Proprioceptive-Based Whole-Body Disturbance Rejection Control for Dynamic Motions in Legged Robots |
|
Zhu, Zhengguo | Shandong University |
Zhang, Guoteng | Shandong University |
Sun, Zhongkai | Shandong University |
Chen, Teng | Shandong University |
Rong, Xuewen | Shandong University |
Xie, Anhuan | Zhejiang University |
Li, Yibin | Shandong University |
Keywords: Force Control, Motion Control, Robust/Adaptive Control
Abstract: This paper presents a control framework for legged robots that enables self-perception and resistance to external disturbances. First, a novel proprioceptive-based disturbance estimator is proposed. Compared with other disturbance estimators, this estimator possesses notable advantages in terms of filtering foot-ground interaction noise and suppressing the accumulation of estimation errors. Additionally, our estimator is a fully proprioceptive-based estimator, eliminating the need for any exteroceptive devices or observers. Second, we present a hierarchical optimized whole-body controller (WBC), which takes into account the full body dynamics, the actuation limits, the external disturbances, and the interactive constraints. Finally, extensive experimental trials conducted on the point-foot biped robot BRAVER validate the capabilities of the proposed estimator and controller under various disturbance conditions.
|
|
10:30-12:00, Paper WeAT18-AX.7 | Add to My Program |
Contact Force Estimation of Robot Manipulators with Imperfect Dynamic Model: On Gaussian Process Adaptive Disturbance Kalman Filter (I) |
|
Wei, Yanran | Beihang University |
Lyu, Shangke | Nanyang Technological University |
Li, Wenshuo | Beihang University |
Yu, Xiang | Beihang University |
Guo, Lei | Beihang University |
Keywords: Industrial Robots, Force and Tactile Sensing, Calibration and Identification
Abstract: This paper is concerned with the contact force estimation problem of robot manipulators based on imperfect dynamic models of the manipulator and the contact force. To handle the imperfect dynamic information of the manipulator, a hybrid model, consisting of the nominal model and the residual dynamics, is established for the manipulator, and the Gaussian process regression (GPR) technique is employed to learn the mean and covariance of the residual dynamics. On this basis, a virtual measurement equation is established for contact force estimation and a Gaussian process adaptive disturbance Kalman filter (GPADKF) is developed, where the variational Bayes technique is employed to achieve online identification of the noise statistics in the force dynamics. The GPADKF is capable of decoupling the contact force from the residual dynamics and system noises, thereby reducing the dependence on accurate dynamic models of the manipulator and the contact force. Simulation and experimental results demonstrate that the proposed scheme outperforms state-of-the-art methods.
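A conceptual sketch of combining a learned GP residual with a disturbance-augmented Kalman filter is given below; it omits the variational Bayes noise adaptation, and all shapes and the gp_residual interface are assumptions rather than the paper's GPADKF.

    # Kalman filter with a GP-predicted residual in the process model (illustrative).
    import numpy as np

    def predict(x, P, u, A, B, Q, gp_residual):
        mu_r, var_r = gp_residual(x, u)             # learned residual dynamics (mean, variance)
        x_pred = A @ x + B @ u + mu_r
        P_pred = A @ P @ A.T + Q + np.diag(var_r)   # inflate covariance by GP uncertainty
        return x_pred, P_pred

    def update(x_pred, P_pred, y, H, R):
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        x = x_pred + K @ (y - H @ x_pred)
        P = (np.eye(len(x)) - K @ H) @ P_pred
        # the disturbance components of the augmented state x carry the force estimate
        return x, P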
|
|
WeAT19-NT Oral Session, NT-G301 |
Add to My Program |
Medical Robots IV |
|
|
Chair: Mylonas, George | Imperial College London |
Co-Chair: Navab, Nassir | TU Munich |
|
10:30-12:00, Paper WeAT19-NT.1 | Add to My Program |
On the Disentanglement of Tube Inequalities in Concentric Tube Continuum Robots |
|
Grassmann, Reinhard M. | University of Toronto |
Senyk, Anastasiia | Ukrainian Catholic University |
Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Medical Robots and Systems, Modeling, Control, and Learning for Soft Robots, Kinematics
Abstract: Concentric tube continuum robots utilize nested tubes, which are subject to a set of inequalities. Current approaches to accounting for these inequalities rely on branching methods such as if-else statements. Such branching can introduce discontinuities, may result in a complicated decision tree, has a high wall-clock time, and cannot be vectorized. This affects the behavior and result of downstream methods in control, learning, workspace estimation, and path planning, among others. In this paper, we investigate a mapping to mitigate branching methods. We derive a lower triangular transformation matrix to disentangle the inequalities and provide a proof of its unique existence. It transforms the interdependent inequalities into independent box constraints. Further investigations are made for sampling, control, and workspace estimation. Approaches utilizing the proposed mapping are at least 14 times faster (up to 176 times faster), always generate valid joint configurations, are more interpretable, and are easier to extend.
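To illustrate the general principle of replacing branching with a fixed lower-triangular map, the toy sketch below turns independent box-constrained variables into values that satisfy ordering-type inequalities by construction. The actual tube inequalities and the transformation derived in the paper differ; this is only an analogue.

    # Toy analogue: a lower-triangular (cumulative-sum) map makes ordering constraints
    # hold by construction, so no if-else branching is needed.
    import numpy as np

    def box_to_ordered(q_box):
        """q_box: independent values in [0, 1]. Returns a monotonically ordered vector,
        the image of a lower-triangular map applied to box-constrained inputs."""
        n = len(q_box)
        T = np.tril(np.ones((n, n)))      # lower-triangular transformation matrix
        return T @ q_box                   # cumulative sums are always non-decreasing

    sample = box_to_ordered(np.random.rand(3))   # valid (ordered) by construction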
|
|
10:30-12:00, Paper WeAT19-NT.2 | Add to My Program |
3D Navigation of a Magnetic Swimmer Using a 2D Ultrasonography Probe Manipulated by a Robotic Arm for Position Feedback |
|
Gorroochurn, Premal | Columbia University |
Hong, Charles | Georgia Institute of Technology |
Klebuc, Carter | University of Houston |
Lu, Yitong | University of Houston |
Phan, Ngoc Tu Khue | Kerr High School |
Garcia Gonzalez, Javier | University of Houston |
Becker, Aaron | University of Houston |
Julien, Leclerc | University of Houston |
Keywords: Medical Robots and Systems, Motion Control, Control Architectures and Programming
Abstract: Millimeter-scale magnetic rotating swimmers have multiple potential medical applications. They could, for example, navigate inside the bloodstream of a patient toward an occlusion and remove it. Magnetic rotating swimmers have internal magnets and propeller fins with a helical shape. A rotating magnetic field applies torque on the swimmer and makes it rotate. The shape of the swimmer, combined with the rotational movement, generates a propulsive force. Visual feedback is suitable for in-vitro closed-loop control. However, in-vivo procedures will require different feedback modalities due to the opacity of the human body. In this paper, we provide new methods and tools that enable the 3D control of a magnetic swimmer using a 2D ultrasonography device attached to a robotic arm to sense the swimmer's position. We also provide an algorithm that computes the placement of the robotic arm and a controller that keeps the swimmer within the ultrasound imaging slice. The position measurement and closed-loop control were tested experimentally.
|
|
10:30-12:00, Paper WeAT19-NT.3 | Add to My Program |
An Intelligent Robotic Endoscope Control System Based on Fusing Natural Language Processing and Vision Models |
|
Dong, Beili | Imperial College London |
Chen, Junhong | Imperial College London |
Wang, Zeyu | Imperial College London |
Deng, Kaizhong | Imperial College London |
Li, Yiping | Imperial College London |
Lo, Benny Ping Lai | Imperial College London |
Mylonas, George | Imperial College London |
Keywords: Medical Robots and Systems, Motion Control
Abstract: In recent years, Robot-Assisted Minimally Invasive Surgery (RAMIS) has been standing on the verge of a new wave of innovation. However, autonomy in RAMIS is still in a primitive stage. Therefore, most surgeries still require manual control of the endoscope and the robotic instruments, resulting in surgeons needing to switch attention between performing surgical procedures and moving the endoscope camera. Automation may reduce the complexity of surgical operations and consequently reduce the cognitive load on the surgeon while speeding up the surgical process. In this paper, a hybrid robotic endoscope control system is proposed, based on a fusion of a natural language processing (NLP) model and a modified YOLO-V8 vision model. The proposed system can analyze the current surgical workflow and generate logs to summarize the procedure for teaching and providing feedback to junior surgeons. A user study of this system indicated a significant reduction in the number of clutching actions and in mean task time, which effectively enhanced surgical training.
|
|
10:30-12:00, Paper WeAT19-NT.4 | Add to My Program |
AiAReSeg: Catheter Detection and Segmentation in Interventional Ultrasound Using Transformers |
|
Ranne, Alex | Imperial College London |
Velikova, Yordanka | TU Munich |
Navab, Nassir | TU Munich |
Rodriguez y Baena, Ferdinando | Imperial College, London, UK |
Keywords: Medical Robots and Systems, Object Detection, Segmentation and Categorization, Simulation and Animation
Abstract: To date, endovascular surgeries are performed using the gold standard of fluoroscopy, which uses ionising radiation to visualise catheters and vasculature. Prolonged fluoroscopic exposure is harmful for the patient and the clinician, and may lead to severe post-operative sequelae such as the development of cancer. Meanwhile, the use of interventional ultrasound has gained popularity, due to its well-known benefits of a small spatial footprint, fast data acquisition, and higher tissue contrast. However, ultrasound images are hard to interpret, and it is difficult to localise vessels, catheters, and guidewires within them. This work proposes a solution using an adaptation of a state-of-the-art machine learning architecture (Transformers) to detect and segment catheters in axial interventional ultrasound image sequences. The network architecture is inspired by the Attention-in-Attention mechanism and temporal tracking networks, and introduces a novel 3D segmentation head that performs 3D deconvolution across time. In order to facilitate training of such deep learning networks, we introduce a new data synthesis pipeline that uses physics-based catheter insertion simulations, along with a convolutional ray-casting ultrasound simulator, to produce synthetic ultrasound images of endovascular interventions. The proposed method was validated on a hold-out validation dataset, demonstrating robustness to ultrasound noise and a wide range of scanning angles. It was also tested on data collected from silicone-based aorta phantoms, demonstrating its potential for sim-to-real translation. This work represents a significant step towards safer and more efficient endovascular surgery using interventional ultrasound.
|
|
10:30-12:00, Paper WeAT19-NT.5 | Add to My Program |
Hybrid Robot for Percutaneous Needle Intervention Procedures: Mechanism Design and Experiment Verification |
|
Zhang, Hanyi | Imperial College London |
Yao, Guocai | Tsinghua University |
Zhang, Feifan | University College London |
Lin, Fanchuan | Beihang University |
Sun, Fuchun | Tsinghua University |
Keywords: Medical Robots and Systems, Parallel Robots, Mechanism Design
Abstract: This paper presents a 6-DOF hybrid robot for percutaneous needle intervention procedures. The new robot combines the advantages of both serial robots and parallel robots, featuring compactness, high accuracy, and a small footprint while overcoming the high cost of serial robots and the small workspace and singularity issues of parallel robots. In addition, by analyzing the workspace of the robot, an equation relating the structural parameters to the workspace is derived, so that the robot parameters can be adjusted to suit different working scenarios. According to the experiments, the accuracy of the robot is related to the position, distance, and insertion angle. The results show that performance is better when working near the center of the workspace and away from the servos, and that the average error of the robot is 1.39 mm. A phantom experiment of lumbar puncture validates its feasibility.
|
|
10:30-12:00, Paper WeAT19-NT.6 | Add to My Program |
Envibroscope: Real-Time Monitoring and Prediction of Environmental Motion for Enhancing Safety in Robot-Assisted Microsurgery |
|
Alikhani, Alireza | Augen Klinik Und Poliklinik, Klinikum Rechts Der Isar Der Techn |
Inagaki, Satoshi | NSK.Ltd |
Dehghani, Shervin | TUM |
Maier, Mathias | Klinikum Rechts Der Isar Der TU München |
Navab, Nassir | TU Munich |
Nasseri, M. Ali | Technische Universitaet Muenchen |
Keywords: Medical Robots and Systems, Robot Safety, Machine Learning for Robot Control
Abstract: Several robotic systems have emerged in recent years to enhance the precision of microsurgeries such as retinal procedures. Significant advancements have recently been achieved to increase the precision of such systems beyond surgeon capabilities. However, little attention has been paid to the impact of non-predicted and sudden movements of the patient and the environment. Therefore, analyzing environmental motion and vibrations is crucial to ensuring the optimal performance and reliability of medical systems that require micron-level precision, especially in real-life scenarios. To address this challenge, this paper introduces a novel environmental motion analysis system that employs a grid layout with distributed sensing nodes throughout the environment. This system effectively tracks undesired motions at designated locations and predicts upcoming motions using neural network-based approaches. The outcomes of our experiments exhibit promising prospects for real-time motion monitoring and prediction, which has the potential to form a solid basis for enhancing the automation, safety, integration, and overall efficiency of robot-assisted microsurgeries.
|
|
10:30-12:00, Paper WeAT19-NT.7 | Add to My Program |
Cooperative vs. Teleoperation Control of the Steady Hand Eye Robot with Adaptive Sclera Force Control: A Comparative Study |
|
Esfandiari, Mojtaba | Johns Hopkins University |
Kim, Ji Woong | Johns Hopkins University |
Zhao, Botao | Johns Hopkins University |
Amirkhani, Golchehr | Johns Hopkins University |
Hadi, Muhammad | Johns Hopkins University |
Gehlbach, Peter | Johns Hopkins Medical Institute |
Taylor, Russell H. | The Johns Hopkins University |
Iordachita, Ioan Iulian | Johns Hopkins University |
Keywords: Medical Robots and Systems, Robust/Adaptive Control, Force Control
Abstract: A surgeon's physiological hand tremor can significantly impact the outcome of delicate and precise retinal surgery, such as retinal vein cannulation (RVC) and epiretinal membrane peeling. Robot-assisted eye surgery technology provides ophthalmologists with advanced capabilities such as hand tremor cancellation, hand motion scaling, and safety constraints that enable them to perform these otherwise challenging and high-risk surgeries with high precision and safety. Steady-Hand Eye Robot (SHER) with cooperative control mode can filter out surgeon's hand tremor, yet another important safety feature, that is, minimizing the contact force between the surgical instrument and sclera surface for avoiding tissue damage cannot be met in this control mode. Also, other capabilities, such as hand motion scaling and haptic feedback, require a teleoperation control framework. In this work, for the first time, we implemented a teleoperation control mode incorporated with an adaptive sclera force control algorithm using a PHANTOM Omni haptic device and a force-sensing surgical instrument equipped with Fiber Bragg Grating (FBG) sensors attached to the SHER 2.1 end-effector. This adaptive sclera force control algorithm allows the robot to dynamically minimize the tool-sclera contact force. Moreover, for the first time, we compared the performance of the proposed adaptive teleoperation mode with the cooperative mode by conducting a vessel-following experiment inside an eye phantom under a microscope.
|
|
10:30-12:00, Paper WeAT19-NT.8 | Add to My Program |
Adaptive Motion Scaling for Robot-Assisted Microsurgery Based on Hybrid Offline Reinforcement Learning and Damping Control |
|
Jiang, Peiyang | University of Bristol |
Li, Wei | Imperial College London |
Li, Yifan | University of Bristol |
Zhang, Dandan | Imperial College London |
Keywords: Medical Robots and Systems, Robust/Adaptive Control
Abstract: Motion scaling is essential to empower users to conduct precise manipulation during teleoperation for robot-assisted microsurgery (RAMS). A constant, small motion scaling ratio can enhance the precision of teleoperation but hinder the operator from quickly reaching distant targets. The concept of self-adaptive motion scaling has been proposed in previous work. However, previous frameworks required extensive manual tuning of core parameters, which significantly depends on prior knowledge and may potentially lead to non-optimal solutions. This paper presents a hybrid offline reinforcement learning and damping control approach to regulate the motion scaling ratio for different operations during offline training. This method can take user-specific characteristics into consideration and help them achieve better teleoperation performance. Comparisons are made with and without using the adaptive motion-scaling algorithm. Detailed user studies indicate that a suitable motion-scaling ratio can be obtained and adjusted online. The overall performance of the operators in terms of time cost for task completion is significantly improved, while the variance of average speed and the total distance for robot operation is reduced.
|
|
10:30-12:00, Paper WeAT19-NT.9 | Add to My Program |
Chained Flexible Capsule Endoscope: Unraveling the Conundrum of Size Limitations and Functional Integration for Gastrointestinal Transitivity |
|
Yuan, Sishen | The Chinese University of Hong Kong |
Li, Guang | Chinese University of Hong Kong |
Liang, Baijia | Chinese University of Hong Kong |
Li, Lailu | The Chinese University of Hong Kong |
Zheng, Qingzhuo | The Chinese University of Hong Kong |
Song, Shuang | Harbin Institute of Technology (Shenzhen) |
Li, Zhen | Qilu Hospital of Shandong University |
Ren, Hongliang | Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS) |
Keywords: Medical Robots and Systems, Soft Robot Applications
Abstract: Capsule endoscopes, predominantly serving diagnostic functions, provide lucid internal imagery but are devoid of surgical or therapeutic capabilities. Consequently, despite lesion detection, physicians frequently resort to traditional endoscopic or open surgical procedures for treatment, resulting in more complex, potentially risky interventions. To surmount these limitations, this study introduces a flexible capsule endoscope (FCE) design concept, specifically conceived to navigate the inherent volume constraints of capsule endoscopes whilst augmenting their therapeutic functionalities. The FCE’s distinctive flexibility originates from a conventional rotating joint design and the incision pattern in the flexible material. In vitro experiments validated the passive navigation ability of the FCE in rugged intestinal tracts. Further, the FCE demonstrates consistent reptile-like peristalsis under the influence of an external magnetic field, and possesses the capability for film expansion and disintegration under high-frequency electromagnetic stimulation. These findings illuminate a promising path toward amplifying the therapeutic capacities of capsule endoscopes without necessitating a size compromise.
|
|
WeAT20-NT Oral Session, NT-G302 |
Add to My Program |
Performance Evaluation and Benchmarking |
|
|
Chair: Roa, Maximo A. | German Aerospace Center (DLR) |
Co-Chair: Alenyà, Guillem | Institut De Robòtica I Informàtica Industrial CSIC-UPC |
|
10:30-12:00, Paper WeAT20-NT.1 | Add to My Program |
A Group Theoretic Metric for Robot State Estimation Leveraging Chebyshev Interpolation |
|
Agrawal, Varun | Georgia Institute of Technology |
Dellaert, Frank | Verdant Robotics/Georgia Tech |
Keywords: Performance Evaluation and Benchmarking
Abstract: We propose a new metric for robot state estimation based on the recently introduced SE_2(3) Lie group definition. Our metric is related to prior metrics for SLAM but explicitly takes into account the linear velocity of the state estimate, improving over current pose-based trajectory analysis. This has the benefit of providing a single, quantitative metric to evaluate state estimation algorithms against, while being compatible with existing tools and libraries. Since ground truth data generally consists of pose data from motion capture systems, we also propose an approach to compute the ground truth linear velocity based on polynomial interpolation. Using Chebyshev interpolation and a pseudospectral parameterization, we can accurately estimate the ground truth linear velocity of the trajectory in an optimal fashion with best approximation error. We demonstrate how this approach performs on multiple robotic platforms where accurate state estimation is vital, and compare it to alternative approaches such as finite differences. The pseudospectral parameterization also provides a means of trajectory data compression as an additional benefit. Experimental results show our method provides a valid and accurate means of comparing state estimation systems, which is also easy to interpret and report.
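The velocity-from-poses step can be sketched with NumPy's Chebyshev utilities as below; this is an illustrative stand-in for the paper's pseudospectral parameterization, and the polynomial degree and function names are assumptions.

    # Ground-truth linear velocity from mocap positions via Chebyshev fitting (sketch).
    import numpy as np
    from numpy.polynomial import chebyshev as C

    def groundtruth_velocity(t, positions, degree=12):
        """t: (N,) timestamps; positions: (N, 3) mocap positions.
        Fits a Chebyshev polynomial per axis and differentiates it analytically."""
        coeffs = [C.chebfit(t, positions[:, k], degree) for k in range(positions.shape[1])]
        dcoeffs = [C.chebder(c) for c in coeffs]
        return np.stack([C.chebval(t, dc) for dc in dcoeffs], axis=1)   # (N, 3) velocities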
|
|
10:30-12:00, Paper WeAT20-NT.2 | Add to My Program |
AD4RL: Autonomous Driving Benchmarks for Offline Reinforcement Learning with Value-Based Dataset |
|
Lee, Dongsu | Soongsil University |
Eom, Chanin | Soongsil University |
Kwon, Minhae | Soongsil University |
Keywords: Performance Evaluation and Benchmarking, Data Sets for Robot Learning, Reinforcement Learning
Abstract: Offline reinforcement learning has emerged as a promising technology, made more practical by the use of large pre-collected datasets. Despite its practical benefits, most algorithm development research in offline reinforcement learning still relies on game tasks with synthetic datasets. To address such limitations, this paper provides autonomous driving datasets and benchmarks for offline reinforcement learning research. We provide 19 datasets, including real-world human driver datasets, and seven popular offline reinforcement learning algorithms in three realistic driving scenarios. We also provide a unified decision-making process model that can operate effectively across different scenarios, serving as a reference framework in algorithm design. Our research lays the groundwork for further collaborations in the community to explore practical aspects of existing reinforcement learning methods. Datasets and code can be found at https://sites.google.com/view/ad4rl.
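A small sketch of the normalized-score convention commonly used when comparing offline RL algorithms across datasets; the scenario names and reference returns below are invented for illustration only and are not values from AD4RL.

```python
def normalized_score(policy_return, random_return, expert_return):
    """Scale a policy's episodic return so that 0 corresponds to the random
    policy and 100 to the expert (e.g. human driver) policy."""
    return 100.0 * (policy_return - random_return) / (expert_return - random_return)

# illustrative numbers only: three driving scenarios with assumed reference returns
references = {"highway": (-50.0, 400.0), "lane_change": (-80.0, 350.0), "cut_in": (-120.0, 300.0)}
evaluated = {"highway": 310.0, "lane_change": 155.0, "cut_in": -20.0}
for scenario, ret in evaluated.items():
    rnd, exp = references[scenario]
    print(scenario, round(normalized_score(ret, rnd, exp), 1))
```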
|
|
10:30-12:00, Paper WeAT20-NT.3 | Add to My Program |
The Cluttered Environment Picking Benchmark (CEPB) for Advanced Warehouse Automation |
|
D'Avella, Salvatore | Sant'Anna School of Advanced Studies |
Bianchi, Matteo | University of Pisa |
Sundaram, Ashok M. | German Aerospace Center (DLR) |
Avizzano, Carlo Alberto | Scuola Superiore Sant'Anna |
Roa, Maximo A. | German Aerospace Center (DLR) |
Tripicchio, Paolo | Scuola Superiore Sant'Anna |
Keywords: Performance Evaluation and Benchmarking, Grasping, Dexterous Manipulation
Abstract: Autonomous and reliable robotic grasping is a desirable functionality in robotic manipulation and is still an open problem. Standardized benchmarks are important tools for evaluating and comparing robotic grasping and manipulation systems among different research groups, while also sharing with the community the best practices for learning from errors. An ideal benchmarking protocol should encompass the different aspects underpinning grasp execution, including the mechatronic design of grippers, planning, perception, and control, to give information on each aspect and on the overall problem. The proposed work gives an overview of the benchmarks, datasets, and competitions that have been proposed and adopted in the last few years and presents a novel benchmark with protocols for different tasks that evaluate both the individual components of the system and the system as a whole, introducing an evaluation metric that allows for a fair comparison in highly cluttered scenes while taking into account the difficulty of the clutter. A website dedicated to the benchmark containing information on the different tasks, maintaining the leaderboards, and serving as a contact point for the community is also provided.
|
|
10:30-12:00, Paper WeAT20-NT.4 | Add to My Program |
SceneReplica: Benchmarking Real-World Robot Manipulation by Creating Replicable Scenes |
|
Khargonkar, Ninad | University of Texas at Dallas |
Allu, Sai Haneesh | The University of Texas at Dallas |
Lu, Yangxiao | The University of Texas at Dallas |
P, Jishnu Jaykumar | The University of Texas at Dallas |
Prabhakaran, B | University of Texas at Dallas |
Xiang, Yu | University of Texas at Dallas |
Keywords: Performance Evaluation and Benchmarking, Grasping, Perception for Grasping and Manipulation
Abstract: We present a new reproducible benchmark for evaluating robot manipulation in the real world, specifically focusing on a pick-and-place task. Our benchmark uses the YCB object set, a commonly used dataset in the robotics community, to ensure that our results are comparable to other studies. Additionally, the benchmark is designed to be easily reproducible in the real world, making it accessible to researchers and practitioners. We also provide our experimental results and analyses for model-based and model-free 6D robotic grasping on the benchmark, where representative algorithms are evaluated for object perception, grasp planning, and motion planning. We believe that our benchmark will be a valuable tool for advancing the field of robot manipulation. By providing a standardized evaluation framework, researchers can more easily compare different techniques and algorithms, leading to faster progress in developing robot manipulation methods. Appendix, code and videos for the project are available at https://irvlutd.github.io/SceneReplica.
|
|
10:30-12:00, Paper WeAT20-NT.5 | Add to My Program |
CRITERIA: A New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving |
|
Chen, Changhe | University of Toronto |
Pourkeshavarz, Mozhgan | Research Scientist at Huawei |
Rasouli, Amir | Huawei Technologies Canada |
Keywords: Performance Evaluation and Benchmarking, Intelligent Transportation Systems, Intention Recognition
Abstract: Benchmarking is a common method for evaluating trajectory prediction models for autonomous driving. Existing benchmarks rely on datasets, which are biased towards more common scenarios, such as cruising, and distance-based metrics that are computed by averaging over all scenarios. Following such a regimen provides little insight into the properties of the models, both in terms of how well they can handle different scenarios and how admissible and diverse their outputs are. There exist a number of complementary metrics designed to measure the admissibility and diversity of trajectories; however, they suffer from biases, such as the length of trajectories. In this paper, we propose a new benChmarking paRadIgm for evaluaTing trajEctoRy predIction Approaches (CRITERIA). In particular, we propose 1) a method for extracting driving scenarios at varying levels of specificity according to the structure of the roads, models' performance, and data properties for fine-grained ranking of prediction models; 2) a set of new bias-free metrics for measuring diversity, by incorporating the characteristics of a given scenario, and admissibility, by considering the structure of roads and kinematic compliance, motivated by real-world driving constraints; 3) using the proposed benchmark, we conduct extensive experimentation on a representative set of the prediction models using the large-scale Argoverse dataset. We show that the proposed benchmark can produce a more accurate ranking of the models and serve as a means of characterizing their behavior. We further present ablation studies to highlight contributions of different elements that are used to compute the proposed metrics.
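For illustration only, two simple ingredients in this style of benchmarking: a minimum average displacement error over a predicted trajectory set, and a diversity measure normalized by ground-truth path length to reduce the trajectory-length bias; these formulas are assumptions, not the CRITERIA metrics themselves.

```python
import numpy as np

def min_ade(pred_set, gt):
    """Minimum average displacement error over K predicted trajectories.
    pred_set: (K, T, 2), gt: (T, 2)"""
    return float(np.linalg.norm(pred_set - gt, axis=-1).mean(axis=1).min())

def length_normalized_diversity(pred_set, gt):
    """Mean pairwise endpoint spread of the predictions, divided by the
    ground-truth path length to reduce the trajectory-length bias."""
    ends = pred_set[:, -1, :]
    d = np.linalg.norm(ends[:, None, :] - ends[None, :, :], axis=-1)
    spread = d[np.triu_indices(len(ends), k=1)].mean()
    gt_length = np.linalg.norm(np.diff(gt, axis=0), axis=1).sum()
    return float(spread / max(gt_length, 1e-6))

# toy example: a straight ground-truth path and six noisy predictions
gt = np.column_stack([np.linspace(0, 30, 30), np.zeros(30)])
preds = gt[None] + np.random.randn(6, 30, 2) * np.linspace(0, 1.5, 30)[None, :, None]
print(min_ade(preds, gt), length_normalized_diversity(preds, gt))
```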
|
|
10:30-12:00, Paper WeAT20-NT.6 | Add to My Program |
LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection |
|
Hung, Wei-Chih | Waymo |
Casser, Vincent Michael | Waymo |
Kretzschmar, Henrik | Waymo |
Hwang, Jyh-Jing | Waymo |
Anguelov, Dragomir | Waymo |
Keywords: Performance Evaluation and Benchmarking, Object Detection, Segmentation and Categorization, Data Sets for Robotic Vision
Abstract: The 3D Average Precision (3D AP) metric relies on the intersection over union between predictions and ground truth objects. However, camera-only detectors have limited depth accuracy, which may cause otherwise reasonable predictions that suffer from such longitudinal localization errors to be treated as false positives. We therefore propose variants of the 3D AP metric that are more permissive with respect to depth estimation errors. Specifically, our novel longitudinal error tolerant metrics, LET-3D-AP and LET-3D-APL, allow longitudinal localization errors of the prediction boxes up to a given tolerance. To evaluate the proposed metrics, we also construct a new test set for the Waymo Open Dataset, tailored to camera-only 3D detection methods. Surprisingly, we find that state-of-the-art camera-based detectors can outperform popular LiDAR-based detectors under our new metrics at a 10% depth error tolerance, suggesting that existing camera-based detectors already have the potential to surpass LiDAR-based detectors in downstream applications. We believe the proposed metrics and the new benchmark dataset will facilitate advances in the field of camera-only 3D detection by providing more informative signals that can better indicate the system-level performance.
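A hedged sketch of the core matching idea: decompose the prediction-to-ground-truth center error into a component along the camera's line of sight and accept matches whose longitudinal error stays within a relative tolerance; the lateral gate and affinity formula below are illustrative, not the exact LET-3D-AP/LET-3D-APL definitions.

```python
import numpy as np

def longitudinally_tolerant_match(pred_center, gt_center, tol=0.10):
    """Return (is_match, longitudinal_affinity) under a relative
    longitudinal (depth) error tolerance, with the camera at the origin.

    tol: maximum longitudinal error as a fraction of ground-truth range
         (e.g. 0.10 corresponds to a 10% depth error tolerance).
    """
    gt_range = np.linalg.norm(gt_center)
    line_of_sight = gt_center / gt_range
    err = pred_center - gt_center
    longitudinal_err = abs(np.dot(err, line_of_sight))          # along the ray
    lateral_err = np.linalg.norm(err - np.dot(err, line_of_sight) * line_of_sight)

    is_match = longitudinal_err <= tol * gt_range and lateral_err <= 2.0  # lateral gate (assumed)
    affinity = max(0.0, 1.0 - longitudinal_err / (tol * gt_range))
    return is_match, affinity

matched, aff = longitudinally_tolerant_match(
    pred_center=np.array([0.2, 0.0, 48.0]),   # predicted box center (m)
    gt_center=np.array([0.0, 0.0, 50.0]))     # ground-truth center (m)
print(matched, round(aff, 2))
```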
|
|
10:30-12:00, Paper WeAT20-NT.7 | Add to My Program |
HuNavSim: A ROS 2 Human Navigation Simulator for Benchmarking Human-Aware Robot Navigation |
|
Perez-Higueras, Noe | University Pablo De Olavide |
Otero, Roberto | University Pablo De Olavide |
Caballero, Fernando | Universidad De Sevilla |
Merino, Luis | Universidad Pablo De Olavide |
Keywords: Performance Evaluation and Benchmarking, Simulation and Animation, Human-Aware Motion Planning
Abstract: This work presents the Human Navigation Simulator (HuNavSim), a novel open-source tool for the simulation of different human-agent navigation behaviors in scenarios with mobile robots. The tool, the first of its kind programmed under the ROS 2 framework, can be used together with different well-known robotics simulators such as Gazebo. The main goal is to facilitate the development and evaluation of human-aware robot navigation systems in simulation. In addition to a general human-navigation model, HuNavSim includes, as a novelty, a rich set of individual and varied human navigation behaviors and a comprehensive set of metrics for social navigation benchmarking.
|
|
10:30-12:00, Paper WeAT20-NT.8 | Add to My Program |
RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance |
|
Mayoral-Vilches, Victor | Klagenfurt University |
Jabbour, Jason | Harvard University |
Hsiao, Yu-Shun | Harvard University |
Wan, Zishen | Georgia Institute of Technology |
Crespo-Álvarez, Martiño | Acceleration Robotics |
Stewart, Matthew | Harvard University |
Reina-Muñoz, Juan Manuel | Acceleration Robotics |
Nagras, Prateek | Acceleration Robotics |
Vikhe, Gaurav | Acceleration Robotics |
Bakhshalipour, Mohammad | Carnegie Mellon University |
Pinzger, Martin | Universität Klagenfurt |
Rass, Stefan | Alpen-Adria Universität Klagenfurt |
Panigrahi, Smruti | Ford Motor Company |
Corradi, Giulio | AMD |
Roy, Niladri | Intel |
Gibbons, Phillip | Carnegie Mellon University |
Neuman, Sabrina | Boston University |
Plancher, Brian | Barnard College, Columbia University |
Janapa Reddi, Vijay | Harvard University |
Keywords: Performance Evaluation and Benchmarking, Software Tools for Benchmarking and Reproducibility, Computer Architecture for Robotic and Automation
Abstract: We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and replacing them with a test application, and grey-box testing, an application-specific measure that observes internal system states with minimal interference. Our benchmarking framework provides ready-to-use tools and is easily adaptable for the assessment of custom ROS 2 computational graphs. Drawing from the knowledge of leading robot architects and system architecture experts, RobotPerf establishes a standardized approach to robotics benchmarking. As an open-source initiative, RobotPerf remains committed to evolving with community input to advance the future of hardware-accelerated robotics.
|
|
10:30-12:00, Paper WeAT20-NT.9 | Add to My Program |
Standardization of Cloth Objects and Its Relevance in Robotic Manipulation |
|
Garcia-Camacho, Irene | Institut De Robòtica I Informàtica Industrial CSIC-UPC |
Longhini, Alberta | KTH Royal Institute of Technology |
Welle, Michael C. | KTH Royal Institute of Technology |
Alenyà, Guillem | CSIC-UPC |
Kragic, Danica | KTH |
Borràs Sol, Júlia | Institut De Robòtica I Informàtica Industrial (CSIC-UPC) |
Keywords: Performance Evaluation and Benchmarking, Software Tools for Benchmarking and Reproducibility, Grasping
Abstract: The field of robotics faces inherent challenges in manipulating deformable objects, particularly in understanding and standardising fabric properties like elasticity, stiffness, and friction. While the significance of these properties is evident in the realm of cloth manipulation, accurately categorising and comprehending them in real-world applications remains elusive. This study sets out to address two primary objectives: (1) to provide a framework suitable for robotics applications to characterise cloth objects, and (2) to study how these properties influence robotic manipulation tasks. Our preliminary results validate the framework's ability to characterise cloth properties and compare cloth sets, and reveal the influence that different properties have on the outcome of five manipulation primitives. We believe that, in general, results on the manipulation of clothes should be reported along with a better description of the garments used in the evaluation. This paper proposes a set of these measures.
|
|
WeAT22-NT Oral Session, NT-G304 |
Add to My Program |
Marine Robotics IV |
|
|
Chair: Hollinger, Geoffrey | Oregon State University |
|
10:30-12:00, Paper WeAT22-NT.1 | Add to My Program |
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization |
|
Paine, Tyler | Massachusetts Institute of Technology |
Benjamin, Michael | Massachusetts Institute of Technology |
Keywords: Marine Robotics, Multi-Robot Systems, Cooperating Robots
Abstract: This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a non-linear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS by the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized, and the communication cost scales with the number of group options, not the number of agents. We demonstrated the effectiveness of this approach using a hypothetical 'explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
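A toy sketch of a saturated nonlinear opinion-dynamics update of the kind the group-choice layer could use; the saturation function, gains, and ring network are assumptions for illustration, not the authors' model or parameters.

```python
import numpy as np

def opinion_step(z, A, dt=0.05, d=1.0, u=2.0, alpha=1.2, gamma=0.8, bias=None):
    """One Euler step of a saturated nonlinear opinion-dynamics model.

    z    : (N, M) opinions of N agents over M options
    A    : (N, N) adjacency matrix of the communication graph
    bias : (N, M) optional per-agent input (e.g. local sensing)
    """
    if bias is None:
        bias = np.zeros_like(z)
    neighbor_term = A @ np.tanh(z)                 # saturated social influence
    dz = -d * z + u * np.tanh(alpha * z + gamma * neighbor_term) + bias
    return z + dt * dz

# eight agents, three options ('explore', 'exploit', 'migrate'), ring network
N, M = 8, 3
A = np.roll(np.eye(N), 1, axis=1) + np.roll(np.eye(N), -1, axis=1)
z = 0.01 * np.random.randn(N, M)
for _ in range(400):
    z = opinion_step(z, A)
print(z.argmax(axis=1))   # each agent's chosen option once opinions settle
```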
|
|
10:30-12:00, Paper WeAT22-NT.2 | Add to My Program |
Convex Geometric Trajectory Tracking Using Lie Algebraic MPC for Autonomous Marine Vehicles |
|
Jang, Junwoo | University of Michigan |
Teng, Sangli | University of Michigan, Ann Arbor |
Ghaffari, Maani | University of Michigan |
Keywords: Marine Robotics, Optimization and Optimal Control, Motion Control
Abstract: Controlling marine vehicles in challenging environments is a complex task due to the presence of nonlinear hydrodynamics and uncertain external disturbances. Despite nonlinear model predictive control (MPC) showing potential in addressing these issues, its practical implementation is often constrained by computational limitations. In this paper, we propose an efficient controller for trajectory tracking of marine vehicles by employing a convex error-state MPC on the Lie group. By leveraging the inherent geometric properties of the Lie group, we can construct globally valid error dynamics and formulate a quadratic programming-based optimization problem. Our proposed MPC demonstrates effectiveness in trajectory tracking through extensive numerical simulations, including scenarios involving ocean currents. Notably, our method substantially reduces computation time compared to nonlinear MPC, making it well-suited for real-time control applications with long prediction horizons or involving small marine vehicles.
|
|
10:30-12:00, Paper WeAT22-NT.3 | Add to My Program |
Mission Planning for Multiple Autonomous Underwater Vehicles with Constrained in Situ Recharging |
|
Singh, Priti | Oregon State University |
Hollinger, Geoffrey | Oregon State University |
Keywords: Marine Robotics, Path Planning for Multiple Mobile Robots or Agents, Energy and Environment-Aware Automation
Abstract: Persistent operation of Autonomous Underwater Vehicles (AUVs) without manual interruption for recharging saves time and total cost for offshore monitoring and data collection applications. In order to enable AUVs to run long mission durations without ship support, they can be equipped with docking capabilities to recharge in situ at Wave Energy Converter (WEC) recharging stations fitted with docks. However, the power generated at the recharging stations may be constrained depending on the sea conditions. Therefore, a robust mission planning framework is proposed using a centralized Evolutionary Algorithm (EA) and a decentralized Monte Carlo Tree Search (MCTS) method. Both methods incorporate the charge availability constraint at the recharging station in addition to the maximum charge capacity of each AUV. The planner utilizes a time-varying power profile of irregular waves incident at WECs for dock charging and generates efficient mission plans for AUVs by optimizing their time to visit the dock based on the imposed constraint. The effects of increasing the number of AUVs, increasing the number of points of interest in the mission area, and varying sea state on the mission duration are also analyzed.
|
|
10:30-12:00, Paper WeAT22-NT.4 | Add to My Program |
Decentralized Multi-Robot Navigation for Autonomous Surface Vehicles with Distributional Reinforcement Learning |
|
Lin, Xi | Stevens Institute of Technology |
Huang, Yewei | Stevens Institute of Technology |
Chen, Fanfei | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Marine Robotics, Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning
Abstract: Collision avoidance algorithms for Autonomous Surface Vehicles (ASV) that follow the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs) have been proposed in recent years. However, it may be difficult and unsafe to follow COLREGs in congested waters, where multiple ASVs are navigating in the presence of static obstacles and strong currents, due to the complex interactions. To address this problem, we propose a decentralized multi-ASV collision avoidance policy based on Distributional Reinforcement Learning, which considers the interactions among ASVs as well as with static obstacles and current flows. We evaluate the performance of the proposed Distributional RL based policy against a traditional RL-based policy and two classical methods, Artificial Potential Fields (APF) and Reciprocal Velocity Obstacles (RVO), in simulation experiments, which show that the proposed policy achieves superior performance in navigation safety, while requiring minimal travel time and energy. A variant of our framework that automatically adapts its risk sensitivity is also demonstrated to improve ASV safety in highly congested environments.
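As background, a compact NumPy sketch of the quantile-regression (Huber) loss commonly used to train distributional value estimates in distributional RL; this is a generic ingredient of such methods, not the authors' training code or network.

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target_samples, kappa=1.0):
    """Quantile Huber loss between predicted return quantiles and
    sampled Bellman targets (both 1-D arrays)."""
    n = len(pred_quantiles)
    taus = (np.arange(n) + 0.5) / n                      # quantile midpoints
    # pairwise TD errors: target (rows) minus prediction (cols)
    u = target_samples[:, None] - pred_quantiles[None, :]
    huber = np.where(np.abs(u) <= kappa, 0.5 * u**2, kappa * (np.abs(u) - 0.5 * kappa))
    weight = np.abs(taus[None, :] - (u < 0).astype(float))
    return (weight * huber).mean()

pred = np.linspace(-1.0, 1.0, 8)          # current quantile estimates of the return
target = np.random.normal(0.5, 0.2, 32)   # sampled Bellman targets
print(quantile_huber_loss(pred, target))
```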
|
|
10:30-12:00, Paper WeAT22-NT.5 | Add to My Program |
Real-Time Planning under Uncertainty for AUVs Using Virtual Maps |
|
Collado-Gonzalez, Ivana | Stevens Institute of Technology |
McConnell, John | Stevens Institute of Technology |
Wang, Jinkun | Stevens Institute of Technology |
Szenher, Paul | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Marine Robotics, Planning under Uncertainty, Reactive and Sensor-Based Planning
Abstract: Reliable localization is an essential capability for marine robots navigating in GPS-denied environments. SLAM, commonly used to mitigate dead reckoning errors, still fails in feature-sparse environments or with limited-range sensors. Pose estimation can be improved by incorporating the uncertainty prediction of future poses into the planning process and choosing actions that reduce uncertainty. However, performing belief propagation is computationally costly, especially when operating in large-scale environments. This work proposes a computationally efficient planning under uncertainty framework suitable for large-scale, feature-sparse environments. Our strategy leverages SLAM graph and occupancy map data obtained from a prior exploration phase to create a virtual map, describing the uncertainty of each map cell using a multivariate Gaussian. The virtual map is then used as a cost map in the planning phase, and performing belief propagation at each step is avoided. A receding horizon planning strategy is implemented, managing a goal-reaching and uncertainty-reduction tradeoff. Simulation experiments in a realistic underwater environment validate this approach. Experimental comparisons against a full belief propagation approach and a standard shortest-distance approach are conducted.
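A minimal sketch of how a virtual map could be turned into a planning cost: each cell stores a covariance, its trace serves as an uncertainty cost, and a candidate path is scored by travelled distance plus accumulated uncertainty; the weighting below is an assumed choice, not the paper's cost function.

```python
import numpy as np

def cell_uncertainty_cost(cov):
    """Scalar uncertainty cost of a map cell whose estimate has
    covariance `cov` (2x2 or 3x3); here the trace is used."""
    return float(np.trace(cov))

def path_cost(path_cells, cell_covs, step_length=1.0, w_uncertainty=0.5):
    """Cost of a candidate path over virtual-map cells: travelled
    distance plus a weighted sum of per-cell uncertainty."""
    distance = step_length * (len(path_cells) - 1)
    uncertainty = sum(cell_uncertainty_cost(cell_covs[c]) for c in path_cells)
    return distance + w_uncertainty * uncertainty

# a well-localized route vs. one through a highly uncertain cell
covs = {0: 0.1 * np.eye(2), 1: 0.5 * np.eye(2), 2: 2.0 * np.eye(2)}
print(path_cost([0, 1], covs), path_cost([0, 2], covs))
```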
|
|
10:30-12:00, Paper WeAT22-NT.6 | Add to My Program |
Sea-U-Foil: A Hydrofoil Marine Vehicle with Multi-Modal Locomotion |
|
Zhao, Zuoquan | The Chinese University of Hong Kong |
Zhai, Yu | The Chinese University of Hong Kong |
Gao, Chuanxiang | The Chinese University of Hong Kong |
Ding, Wendi | The Chinese University of Hong Kong |
Yan, Ruixin | The Chinese University of Hong Kong |
Gao, Songqun | Chinese University of Hong Kong |
Han, Bingxin | The Chinese University of Hong Kong |
Liu, Xuchen | The Chinese University of Hong Kong |
Guo, Zixuan | The Chinese University of Hong Kong |
Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Marine Robotics, Product Design, Development and Prototyping, Search and Rescue Robots
Abstract: Autonomous Marine Vehicles (AMVs) have been widely used in many critical tasks such as surveillance, patrolling, marine environment monitoring, and hydrographic surveying. However, most typical AMVs cannot meet the diverse demands of different marine tasks. In this article, we design a new type of remote-controlled hydrofoil marine vehicle, named Sea-U-Foil, which is suitable for different marine scenarios. Sea-U-Foil features three distinct locomotion modes (displacement mode, foilborne mode, and submarine mode), which give the platform flexible mobility, high-speed and high-load capacities, and superior concealment. Specifically, the submarine mode makes Sea-U-Foil unique among previous studies. In addition, the performance of Sea-U-Foil in foilborne mode outperforms that of most current unmanned surface vehicles (USVs) in terms of speed and payload. To the best of our knowledge, we are the first to introduce a new type of AMV that can work in displacement mode, foilborne mode, and submarine mode. We elaborate on the design principles and methodologies of Sea-U-Foil first, then validate the effectiveness of its tri-modal locomotion through extensive experiments.
|
|
10:30-12:00, Paper WeAT22-NT.7 | Add to My Program |
Learning Which Side to Scan: Multi-View Informed Active Perception with Side Scan Sonar for Autonomous Underwater Vehicles |
|
Venkatramanan Sethuraman, Advaith | University of Michigan |
Baldoni, Philip | United States Naval Research Laboratory |
Skinner, Katherine | University of Michigan |
McMahon, James | The Naval Research Laboratory |
Keywords: Marine Robotics, Reactive and Sensor-Based Planning, Object Detection, Segmentation and Categorization
Abstract: Autonomous underwater vehicles often perform surveys that capture multiple views of targets in order to provide more information for human operators or automatic target recognition algorithms. In this work, we address the problem of choosing the most informative views that minimize survey time while maximizing classifier accuracy. We introduce a novel active perception framework for multi-view adaptive surveying and reacquisition using side scan sonar imagery. Our framework addresses this challenge by using a graph formulation for the adaptive survey task. We then use Graph Neural Networks (GNNs) to classify acquired sonar views and reinforcement learning to choose the next best view to capture based on the collected data. We evaluate our method using simulated surveys in a high-fidelity side scan sonar simulator. Our results demonstrate that our approach is able to surpass the state-of-the-art in classification accuracy and efficiency. This framework is a promising approach for more efficient autonomous missions involving side scan sonar, such as underwater exploration, marine archaeology, and environmental monitoring.
|
|
10:30-12:00, Paper WeAT22-NT.8 | Add to My Program |
Development of a Lightweight Underwater Manipulator for Delicate Structural Repair Operations |
|
Mao, Juzheng | Southeast University |
Song, Guangming | Southeast University |
Hao, Shuang | Southeast University |
Zhang, Mingquan | Southeast University |
Song, Aiguo | Southeast University |
Keywords: Marine Robotics, Robotics and Automation in Construction, Engineering for Robotic Systems
Abstract: In recent years, underwater robots have been increasingly used in the maintenance of hydraulic structures. Underwater manipulators are essential devices that are used to carry out such maintenance tasks. For delicate repair operations such as fixing tiny cracks, most existing underwater manipulators face limitations in terms of size, accuracy, and scalability. Therefore, in this letter, we present a novel electric underwater manipulator, named SEU-4. This four-degree-of-freedom manipulator weighs 8.91 kg and has a maximum payload of 9 kg. It has a rapid-switching interface that supports convenient mechanical and electrical connections for end-effectors. To compensate for the disturbances that are present in the complex underwater environment, a trajectory-tracking control strategy based on a disturbance observer and sliding-mode control (DOB-SMC) is proposed. A prototype of the proposed underwater manipulator was created, and a flowing-water experimental platform was constructed to test its trajectory-tracking performance in fast-flowing water. The experimental results show that the manipulator achieves a trajectory-tracking error of 1.03 mm in static water and 2.91 mm in flowing water at 1.2 m/s, which satisfies the requirements of delicate repair operations.
|
|
WeAT23-NT Oral Session, NT-G401 |
Add to My Program |
Aerial Systems: Mechanics and Control IV |
|
|
Chair: Ott, Christian | TU Wien |
Co-Chair: Perez-Arancibia, Nestor O | Washington State University (WSU) |
|
10:30-12:00, Paper WeAT23-NT.1 | Add to My Program |
MPS: A New Method for Selecting the Stable Closed-Loop Equilibrium Attitude-Error Quaternion of a UAV During Flight |
|
Gonçalves, Francisco | Washington State University |
Bena, Ryan | University of Southern California |
Matveev, Konstantin | Washington State University |
Perez-Arancibia, Nestor O | Washington State University (WSU) |
Keywords: Aerial Systems: Mechanics and Control, Motion Control, Space Robotics and Automation
Abstract: We present model predictive selection (MPS), a new method for selecting the stable closed-loop (CL) equilibrium attitude-error quaternion (AEQ) of an uncrewed aerial vehicle (UAV) during the execution of high-speed yaw maneuvers. In this approach, we minimize the cost of yawing measured with a performance figure of merit (PFM) that takes into account both the aerodynamic-torque control input and attitude-error state of the UAV. Specifically, this method uses a control law with a term whose sign is dynamically switched in real time to select, between two options, the torque associated with the lesser cost of rotation as predicted by a dynamical model of the UAV derived from first principles. This problem is relevant because the selection of the stable CL equilibrium AEQ significantly impacts the performance of a UAV during high-speed rotational flight, from both the power and control-error perspectives. To test and demonstrate the functionality and performance of the proposed method, we present data collected during one hundred real-time high-speed yaw-tracking flight experiments. These results highlight the superior capabilities of the proposed MPS-based scheme when compared to a benchmark controller commonly used in aerial robotics, as the PFM used to quantify the cost of flight is reduced by 60.30% on average. To the best of our knowledge, these are the first flight-test results that thoroughly demonstrate, evaluate, and compare the performance of a real-time controller capable of selecting the stable CL equilibrium AEQ during operation.
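A simplified, yaw-only sketch of the selection idea: roll out a crude closed-loop model for both candidate rotation directions and keep the one with the lower predicted figure of merit; the first-order model, gains, and cost weights are illustrative assumptions, not the paper's quaternion dynamics or PFM.

```python
import numpy as np

def predicted_cost(err0, k=4.0, dt=0.01, horizon=100, w_err=1.0, w_torque=0.05):
    """Roll out a first-order yaw-error model e_dot = -k*e and accumulate
    a figure of merit mixing attitude error and torque effort."""
    e, cost = err0, 0.0
    for _ in range(horizon):
        torque = -k * e                       # proportional yaw torque (assumed model)
        cost += dt * (w_err * e**2 + w_torque * torque**2)
        e += dt * (-k * e)
    return cost

def select_rotation_direction(yaw_error):
    """Choose between the two closed-loop equilibria: rotate through
    `yaw_error` or through the complementary angle the other way around."""
    short_way = yaw_error
    long_way = yaw_error - np.sign(yaw_error) * 2.0 * np.pi
    return short_way if predicted_cost(short_way) <= predicted_cost(long_way) else long_way

print(select_rotation_direction(np.deg2rad(170.0)))   # picks the cheaper direction
```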
|
|
10:30-12:00, Paper WeAT23-NT.2 | Add to My Program |
Realtime Brain-Inspired Adaptive Learning Control for Nonlinear Systems with Configuration Uncertainties (I) |
|
Zhang, Yanhui | Zhejiang University |
Tong, Zheyu | Zhejiang University |
Zhang, YiFan | Zhejiang University |
Chen, Song | Zhejiang University |
Yang, Junyuan | Zhejiang University |
Chen, Weifang | Zhejiang University |
Keywords: Aerial Systems: Mechanics and Control, Reinforcement Learning, Imitation Learning
Abstract: This paper investigates the problem of adaptive tracking control for a quadcopter in the presence of nonlinear configuration uncertainties. It utilizes a real-time brain-inspired learning control (RBiLC) method to address the challenges posed by nonlinear, time-varying, and uncertain commands. To address the issue of flight control law reconfiguration caused by unknown changes in the fuselage configuration (e.g., propellers or motors), this paper introduces an online learning-evaluation-optimization reconstruction mechanism based on RBiLC. The proposed adaptive learning controller mitigates the need for extensive human resources and reduces the time required for flight controller design. The Lyapunov-Krasovskii function is introduced as a compensatory measure to address the impact of parameter uncertainty on system stability. Furthermore, this paper proposes a signed sinusoidal function perturbation estimate to guide the direction and magnitude throughout the online learning process. The approach includes a theoretical stability analysis of a quadcopter vehicle considering uncertainties in UAV dynamics modeling. The results demonstrate that the proposed scheme achieves superior control and faster adaptation, enabling the system to ultimately converge to a compact set within a limited time domain. Finally, software-in-the-loop (SITL) simulations and flight verification results are presented to validate the proposed control strategy.
|
|
10:30-12:00, Paper WeAT23-NT.3 | Add to My Program |
Safety-Conscious Pushing on Diverse Oriented Surfaces with Underactuated Aerial Vehicles |
|
Hui, Tong | Technical University of Denmark |
Fernández González, Manuel Jesús | Automation and Control, Technical University of Denmark |
Fumagalli, Matteo | Danish Technical University |
Keywords: Aerial Systems: Mechanics and Control, Robot Safety, Underactuated Robots
Abstract: Pushing tasks performed by aerial manipulators can be used for contact-based industrial inspections. Underactuated aerial vehicles are widely employed in aerial manipulation due to their widespread availability and relatively low cost. Industrial infrastructures often consist of diverse oriented work surfaces. When interacting with such surfaces, the coupled gravity compensation and interaction force generation of underactuated aerial vehicles can present the potential challenge of near-saturation operations. The blind utilization of these platforms for such tasks can lead to instability and accidents, creating unsafe operating conditions and potentially damaging the platform. In order to ensure safe pushing on these surfaces while managing platform saturation, this work establishes a safety assessment process. This process involves the prediction of the saturation level of each actuator during pushing across variable surface orientations. Furthermore, the assessment results are used to plan and execute physical experiments, ensuring safe operations and preventing platform damage.
|
|
10:30-12:00, Paper WeAT23-NT.4 | Add to My Program |
Geranos: A Novel Tilted-Rotors Aerial Robot for the Transportation of Poles |
|
Gorlo, Nicolas | ETH Zurich |
Müller, Mario Sven | ETH Zürich |
Bamert, Samuel | ETH Zürich |
Reinhart, Tim | ETH Zurich |
Stadler, Henriette | ETH Zürich |
Cathomen, Rafael | ETH Zurich |
Käppeli, Gabriel | ETH Zürich |
Shen, Hua | ETH Zürich |
Cuniato, Eugenio | ETH Zurich |
Tognon, Marco | Inria Rennes |
Siegwart, Roland | ETH Zurich |
Keywords: Aerial Systems: Applications, Robotics and Automation in Construction, Grippers and Other End-Effectors
Abstract: Building and maintaining structures like antennas and cable-car masts in challenging terrain often involves hazardous and expensive sling-loaded helicopter operations. In this work, we challenge this paradigm by proposing Geranos: a multicopter unmanned aerial vehicle (UAV) adept at precisely placing vertical poles, blending load transport with precision. Geranos minimizes the effects of the poles' large moment of inertia by adopting a ring design that accommodates the pole in its center. To grasp the load, we developed a two-part grasping mechanism, creating a near-rigid connection between the UAV and the load. This lightweight construction is reliable and robust while not relying on active forces to maintain the grasp. The UAV utilizes four main propellers to counteract gravity and four auxiliary ones for enhanced lateral positional accuracy, ensuring full actuation around hovering. In a demonstration mimicking the installation of antennas or cable-car masts, we show the capability of Geranos to assemble poles (mass of 3 kg and length of 2 m) on top of each other. In this scenario, Geranos demonstrates an impressive load-placement accuracy of less than 5 cm.
|
|
10:30-12:00, Paper WeAT23-NT.5 | Add to My Program |
Robust and Energy-Efficient Control for Multi-Task Aerial Manipulation with Automatic Arm-Switching |
|
Wu, Ying | Sun Yat-Sen University |
Zhou, Zida | Sun Yat-Sen University |
Wei, Mingxin | Sun Yat-Sen University |
Cheng, Hui | Sun Yat-Sen University |
Keywords: Aerial Systems: Applications, Robust/Adaptive Control, Model Learning for Control
Abstract: Aerial manipulation has received increasing research interest with the widespread application of drones. To perform specific tasks, robotic arms with various mechanical structures are mounted on the drone. This results in sudden disturbances to the aerial manipulator when switching the robotic arm or interacting with the environment. Hence, it is challenging to design a generic and robust control strategy that adapts to various robotic arms when achieving multi-task aerial manipulation. In this paper, we present a learning-based control algorithm that allows online trajectory optimization and tracking to accomplish various aerial interaction tasks without manual adjustment. The proposed energy-saving trajectory planning approach integrates a coupled dynamics model with a single rigid body to generate energy-efficient trajectories for the aerial manipulator. Addressing the challenges of precise control when performing aerial manipulation tasks, this paper presents a controller based on deep neural networks that classifies and learns the accurate forces and moments caused by different robotic arms and interactions. Moreover, the forces arising from robotic arm motions are delicately used as part of the drone's power to save energy. Extensive real-world experiments demonstrate that the proposed method can adapt to various robotic arms and interactions when performing multi-task aerial manipulation.
|
|
10:30-12:00, Paper WeAT23-NT.6 | Add to My Program |
Optimal Collaborative Transportation for Under-Capacitated Vehicle Routing Problems Using Aerial Drone Swarms |
|
Kopparam Sreedhara, Akash | Vellore Institute of Technology, Vellore |
Padala, Deepesh | Vellore Institute of Technology, Vellore |
Mahesh, Shashank | Vellore Institute of Technology, Vellore |
Cui, Kai | Technische Universität Darmstadt |
Li, Mengguang | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Aerial Systems: Applications, Optimization and Optimal Control, Swarm Robotics
Abstract: Swarms of aerial drones have recently been considered for last-mile deliveries in urban logistics or automated construction. At the same time, collaborative transportation of payloads by multiple drones is another important area of recent research. However, efficient coordination algorithms for collaborative transportation of many payloads by many drones remain to be considered. In this work, we formulate the collaborative transportation of payloads by a swarm of drones as a novel, under-capacitated generalization of vehicle routing problems (VRP), which may also be of separate interest. In contrast to standard VRP and capacitated VRP, we must additionally consider waiting times for payloads lifted cooperatively by multiple drones, and the corresponding coordination. Algorithmically, we provide a solution encoding that avoids deadlocks and formulate an appropriate alternating minimization scheme to solve the problem. On the hardware side, we integrate our algorithms with collision avoidance and drone controllers. The approach and the impact of the system integration are successfully verified empirically, both on a swarm of real nano-quadcopters and for large swarms in simulation. Overall, we provide a framework for collaborative transportation with aerial drone swarms, that uses only as many drones as necessary for the transportation of any single payload.
|
|
10:30-12:00, Paper WeAT23-NT.7 | Add to My Program |
A Modular Aerial System Based on Homogeneous Quadrotors with Fault-Tolerant Control |
|
Li, Mengguang | Technische Universität Darmstadt |
Cui, Kai | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Aerial Systems: Mechanics and Control, Swarm Robotics, Aerial Systems: Applications
Abstract: The standard quadrotor is one of the most popular and widely used aerial vehicles of recent decades, offering great maneuverability with mechanical simplicity. However, its under-actuation limits its applications, especially when it comes to generating a desired wrench with six degrees of freedom (DOF). Therefore, existing work often compromises between mechanical complexity and the controllable DOF of the aerial system. To take advantage of the mechanical simplicity of a standard quadrotor, we propose a modular aerial system, IdentiQuad, that combines only homogeneous quadrotor-based modules. Each IdentiQuad can be operated alone like a standard quadrotor, but at the same time allows task-specific assembly, increasing the controllable DOF of the system. Each module is interchangeable within its assembly. We also propose a general controller for different configurations of assemblies, capable of tolerating rotor failures and balancing the energy consumption of each module. The functionality and robustness of the system and its controller are validated using physics-based simulations for different assembly configurations.
|
|
10:30-12:00, Paper WeAT23-NT.8 | Add to My Program |
Observer-Based Controller Design for Oscillation Damping of a Novel Suspended Underactuated Aerial Platform |
|
Das, Hemjyoti | Technical University of Vienna |
Vu, Minh Nhat | TU Wien, Austria |
Egle, Tobias | TU Wien |
Ott, Christian | TU Wien |
Keywords: Aerial Systems: Mechanics and Control, Underactuated Robots
Abstract: In this work, we present a novel actuation strategy for a suspended aerial platform. By utilizing an underactuation approach, we demonstrate the successful oscillation damping of the proposed platform, modeled as a spherical double pendulum. A state estimator that uses only onboard IMU measurements is designed to obtain the deflection angles of the platform. The state estimator is an extended Kalman filter (EKF) with intermittent measurements obtained at different frequencies. An optimal state feedback controller and a PD+ controller are designed to dampen the oscillations of the platform in the joint space and the task space, respectively. The proposed underactuated platform is found to be more energy-efficient than an omnidirectional platform and requires fewer actuators. The effectiveness of our proposed system is validated using both simulations and experimental studies.
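A minimal 1-D sketch of the predict-at-IMU-rate / update-when-available filtering pattern described above; the constant-velocity model, rates, and noise values are placeholders, not the platform's actual EKF.

```python
import numpy as np

class IntermittentKF:
    """Constant-velocity 1-D Kalman filter: predict every IMU step,
    update only when a measurement actually arrives."""
    def __init__(self, dt):
        self.x = np.zeros(2)                         # [angle, rate]
        self.P = np.eye(2)
        self.F = np.array([[1.0, dt], [0.0, 1.0]])
        self.Q = 1e-4 * np.eye(2)
        self.H = np.array([[1.0, 0.0]])
        self.R = np.array([[1e-2]])

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, z):
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P

kf = IntermittentKF(dt=0.005)                        # 200 Hz prediction
for step in range(1000):
    kf.predict()
    if step % 20 == 0:                               # 10 Hz, intermittent measurement
        kf.update(np.array([np.sin(0.005 * step)]) + 0.05 * np.random.randn(1))
print(kf.x)
```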
|
|
10:30-12:00, Paper WeAT23-NT.9 | Add to My Program |
MOAR Planner: Multi-Objective and Adaptive Risk-Aware Path Planning for Infrastructure Inspection with a UAV |
|
Petit, Louis | McGill University |
Lussier Desbiens, Alexis | Université De Sherbrooke |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Task and Motion Planning
Abstract: The problem of autonomous navigation for UAV inspection remains challenging as it requires effectively navigating in close proximity to obstacles, while accounting for dynamic risk factors such as weather conditions, communication reliability, and battery autonomy. This paper introduces the MOAR path planner which addresses the complexities of evolving risks during missions. It offers real-time trajectory adaptation while concurrently optimizing safety, time, and energy. The planner employs a risk-aware cost function that integrates pre-computed cost maps, the new concepts of damage and insertion costs, and an adaptive speed planning framework. With that, the optimal path is searched in a graph using a discrete representation of the state and action spaces. The method is evaluated through simulations and real-world flight tests. The results show the capability to generate real-time trajectories spanning a broad range of evaluation metrics, around 90% of the range occupied by popular algorithms. The proposed framework contributes by enabling UAVs to navigate more autonomously and reliably in critical missions.
|
|
WeAT24-NT Oral Session, NT-G402 |
Add to My Program |
Field Robotics and Automation |
|
|
Chair: Liarokapis, Minas | The University of Auckland |
Co-Chair: Chowdhary, Girish | University of Illinois at Urbana Champaign |
|
10:30-12:00, Paper WeAT24-NT.1 | Add to My Program |
SOL: A Compact, Portable, Telescopic, Soft-Robotic Sun-Tracking Mechanism for Improved Solar Power Production |
|
Busby, Bryan | The University of Auckland |
Duan, Shifei | University of Auckland |
Thompson, Marcus | Whanauka Limited |
Liarokapis, Minas | The University of Auckland |
Keywords: Energy and Environment-Aware Automation
Abstract: Solar power is becoming an increasingly popular option for energy production in commercial and private applications. While installing solar panels (photovoltaic cells) in a stationary configuration is simple and inexpensive, such a setup fails to maximise their potential solar energy production. Single- and dual-axis sun trackers automatically adjust the tilt angle of photovoltaic cells so that they directly face the sun, but these also come with their own drawbacks, such as increased cost and weight. This paper presents SOL, a soft-robotic, dual-axis, sun-tracking mechanism for improved solar panel efficiency. The proposed design was built to be compact, portable, and lightweight, and it utilises closed-loop control for the intelligent actuation of a set of soft telescopic structures that raise and tilt the solar panels in the direction of the sun. The performance of the proposed solar tracking platform was experimentally validated in terms of its maximum elevation at different azimuths and its ability to balance different loads. The result is a device that provides solar panel users with an accessible, affordable, and convenient means of increasing the efficiency of their solar energy system.
|
|
10:30-12:00, Paper WeAT24-NT.2 | Add to My Program |
Measuring Ball Joint Faults in Parabolic-Trough Solar Plants with Data Augmentation and Deep Learning |
|
Pérez Cutiño, Miguel Angel | Universidad De Sevilla |
Capitan, Jesus | University of Seville |
Díaz-Báñez, José-Miguel | Universidad Sevilla |
Valverde, Juan | University of Seville |
Keywords: Energy and Environment-Aware Automation, Deep Learning for Visual Perception, Aerial Systems: Applications
Abstract: Automatic inspection of parabolic-trough solar plants is key to preventing failures that can harm the environment and the production of green energy. In this work, we propose a novel methodology to inspect ball joints in parabolic trough collectors, a relevant problem that is not adequately covered in the literature. Images collected by an Unmanned Aerial Vehicle are segmented using deep learning to extract ball joint components. In order to generate rich training datasets, we develop a novel data augmentation technique that rotates joints and adds synthetic image backgrounds, and demonstrate its impact on object detection accuracy. Then two types of faults are analyzed: fluid leaks, by means of image color filtering; and geometric shape anomalies, by measuring the joint angles of the robotic arms. We propose metrics to quantify these faults and evaluate the damage of the inspected components. Our experimental results with images from operating commercial plants show that we can automatically detect leaks and anomalous angular geometry with a low failure rate compared to human labeling.
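A rough sketch of the described augmentation pattern: rotate a cropped joint patch and paste it onto a synthetic background, reusing the paste region as the detection label; the naive compositing (no alpha blending) and all sizes are assumptions for brevity, not the authors' pipeline.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_joint_patch(patch, background, angle_deg, top_left):
    """Rotate an object patch and paste it onto a synthetic background.

    patch      : (h, w, 3) cropped ball-joint image
    background : (H, W, 3) synthetic background image
    top_left   : (row, col) paste position on the background
    """
    rotated = rotate(patch, angle_deg, reshape=True, order=1, mode='nearest')
    h, w = rotated.shape[:2]
    y, x = top_left
    out = background.copy()
    out[y:y + h, x:x + w] = rotated                 # naive paste (no blending)
    bbox = (x, y, x + w, y + h)                     # pasted box doubles as the label
    return out, bbox

patch = (255 * np.random.rand(40, 60, 3)).astype(np.uint8)
bg = np.full((480, 640, 3), 90, dtype=np.uint8)
img, bbox = augment_joint_patch(patch, bg, angle_deg=30.0, top_left=(100, 200))
print(img.shape, bbox)
```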
|
|
10:30-12:00, Paper WeAT24-NT.3 | Add to My Program |
ECDP: Energy Consumption Disaggregation Pipeline for Energy Optimization in Lightweight Robots |
|
Heredia, Juan | University of Southern Denmark |
Kirschner, Robin Jeanne | TU Munich, Institute for Robotics and Systems Intelligence |
Schlette, Christian | University of Southern Denmark (SDU) |
Abdolshah, Saeed | KUKA Deutschland GmbH |
Haddadin, Sami | Technical University of Munich |
Mikkel, Kjærgaard | University of Southern Denmark |
Keywords: Energy and Environment-Aware Automation, Engineering for Robotic Systems
Abstract: Limited resources and the resulting energy crises occurring all over the world highlight the importance of energy efficiency in technological developments such as robotic manipulators. Efficient energy consumption of manipulators is necessary to make them affordable and to spread their application in future industry. Previously, the power consumption of robot motion was the main factor considered in the evaluation of energy efficiency. Lately, the paradigm in industrial robotics has shifted towards lightweight robot manipulators, which require a new investigation into the disaggregation of robot energy consumption. In this paper, we propose a novel pipeline to identify and disaggregate the energy use of mechatronic devices and apply it to lightweight industrial robots. The proposed method allows the identification of the electronic components' consumption, mechanical losses, electrical losses, and the mechanical energy required for robot motion. We evaluate the pipeline and analyze the distribution of energy consumption using four different manipulators, namely Universal Robots' UR5e and UR10e, Franka Emika's FR3, and Kinova's Gen3. The experimental results show that most of the energy (60-90%) is consumed by the electronic components of the robot control box. Using this knowledge, approaches to further optimize energy consumption need to shift towards efficient robot electronics design instead of efficient robot mass distribution or motion control.
|
|
10:30-12:00, Paper WeAT24-NT.4 | Add to My Program |
Autonomous UAV Mission Cycling: A Mobile Hub Approach for Precise Landings and Continuous Operations in Challenging Environments |
|
Moortgat-Pick, Alexander | Technical University of Munich (TUM) |
Schwahn, Marie | Technical University of Munich |
Adamczyk, Anna | Technical University of Munich (TUM) |
Duecker, Daniel Andre | Technical University of Munich (TUM) |
Haddadin, Sami | Technical University of Munich |
Keywords: Environment Monitoring and Management, Aerial Systems: Applications, Field Robots
Abstract: Environmental monitoring via UAVs offers unprecedented aerial observation capabilities. However, the limited flight durations of typical multirotors and the demands on human attention in outdoor missions call for more autonomous solutions. Addressing the specific challenges of precise UAV landings—especially amidst wind disturbances, obstacles, and unreliable global localization—we introduce a mobile hub concept. This hub facilitates continuous mission cycling for unmodified off-the-shelf UAVs. Our approach centers on a small landing platform affixed to a robotic arm, adeptly correcting UAV pose errors in windy conditions. Compact enough for installation in an economy car, the system emphasizes two novel strategies. Firstly, external visual tracking of the UAV informs the landing controls for both the drone and the robotic arm. The arm compensates for UAV positioning errors and aligns the platform's attitude with the UAV for stable landings, even on small platforms under windy conditions. Secondly, the robotic arm can transport the UAV inside the hub, perform maintenance tasks like battery replacements, and then facilitate direct relaunches. Importantly, our design places all operational responsibility on the hub, ensuring the UAV remains unaltered. This ensures broad compatibility with standard UAVs, only necessitating an API for attitude setpoints. Experimental results underscore the efficiency of our model, achieving safe landings with minimal errors (≤ 7 cm) in winds up to 5 Beaufort (8.1 m/s). In essence, our mobile hub concept significantly boosts UAV mission availability, allowing for autonomous operations even under challenging conditions.
|
|
10:30-12:00, Paper WeAT24-NT.5 | Add to My Program |
Low-To-High Resolution Path Planner for Robotic Gas Distribution Mapping |
|
Nanavati, Rohit Vishwajit | Loughborough University |
Rhodes, Callum | Imperial College London |
Coombes, Matthew | Loughborough University |
Liu, Cunjia | Loughborough University |
Keywords: Environment Monitoring and Management, Robotics in Hazardous Fields, Reactive and Sensor-Based Planning
Abstract: Robotic gas distribution mapping improves the understanding of a hazardous gas dispersion while keeping the human operator out of danger. Generating an accurate gas distribution map quickly is of utmost importance in situations such as gas leaks and industrial incidents, so that the efficient use of resources in response to incidents can be facilitated. In this paper, to incorporate the operational requirement on map granularity, we propose a low-to-high resolution path planner that first guides a single robot to quickly and sparsely sample the region of interest to generate a low resolution gas distribution map, followed by high resolution sampling informed by the low resolution map as a prior. The low resolution prior acts as a coverage survey, allowing the algorithm to perform a relatively exploitative search of high-concentration regions, resulting in overall shorter mission times. The proposed framework is designed to iteratively identify the next best T locations to sample, prioritising potentially high-reward locations while ensuring that the robot can travel to and sample the chosen locations within a user-specified map update cycle. We present a simulation study to demonstrate the alternating exploration-exploitation behaviour, along with benchmarking its performance against traditional sampling path planners and various reward functions.
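A greedy sketch of picking the next best T sampling locations under a travel-time budget, scoring candidates by reward per unit of travel-plus-sampling time; the reward values, speed, and budget are illustrative assumptions, not the proposed planner.

```python
import numpy as np

def next_best_locations(candidates, reward, start, T=5, budget=300.0, speed=1.0, t_sample=10.0):
    """Greedily pick up to T candidate locations maximising reward per
    travel+sampling time, keeping the whole tour inside `budget` seconds.

    candidates : (N, 2) candidate positions
    reward     : (N,)   e.g. predicted concentration plus exploration bonus
    """
    chosen, pos, elapsed = [], np.asarray(start, float), 0.0
    remaining = list(range(len(candidates)))
    while remaining and len(chosen) < T:
        costs = [np.linalg.norm(candidates[i] - pos) / speed + t_sample for i in remaining]
        scores = [reward[i] / c for i, c in zip(remaining, costs)]
        best = int(np.argmax(scores))
        i, cost = remaining[best], costs[best]
        if elapsed + cost > budget:
            break
        chosen.append(i)
        elapsed += cost
        pos = candidates[i]
        remaining.pop(best)
    return chosen

cands = np.random.rand(40, 2) * 100.0     # candidate sample points in a 100 m square
rew = np.random.rand(40)                  # stand-in reward map values
print(next_best_locations(cands, rew, start=(0.0, 0.0)))
```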
|
|
10:30-12:00, Paper WeAT24-NT.6 | Add to My Program |
Persistent Monitoring of Large Environments with Robot Deployment Scheduling in between Remote Sensing Cycles |
|
Masaba, Kizito | Dartmouth College |
Roznere, Monika | Dartmouth College |
Jeong, Mingi | Dartmouth College |
Quattrini Li, Alberto | Dartmouth College |
Keywords: Environment Monitoring and Management, Robotics in Under-Resourced Settings, Planning under Uncertainty
Abstract: This paper proposes a novel decision-making framework for planning “when” and “where” to deploy robots based on prior data, with the goal of persistently monitoring a spatio-temporal phenomenon in an environment. We specifically focus on large lake monitoring, where remote sensors, such as satellites, can provide a snapshot of the target phenomenon at regular cycles. Between these cycles, Autonomous Surface Vehicles (ASVs) can be deployed to maintain an up-to-date model of the phenomenon. However, deploying ASVs has a significant logistical overhead in terms of time and cost. It requires a team of people to go on site and typically spend a day monitoring the deployment. It is vital not only to be intentional about where to sample in the environment on a given day, but also to determine whether deploying the ASVs that day is worthwhile at all. Therefore, we propose a persistent monitoring strategy that provides the days and locations of when and where to sample with the robots by leveraging Gaussian Process model estimates of future trends based on collected remote sensing and point measurement data. Our approach minimizes the number of days and locations for sampling, while preserving the quality of estimates. Through simulation experiments using realistic spatio-temporal datasets, we demonstrate the benefits of our approach over traditional deployment strategies, including significant savings on the effort and operational cost of deploying the ASVs.
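A sketch, using scikit-learn, of the kind of Gaussian Process reasoning described: fit past spatio-temporal observations, predict the field on a future day, and trigger an ASV deployment only if the predictive uncertainty is large enough; the kernel, threshold, and synthetic data are assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# training data: (x, y, day) -> measured value (e.g. a water-quality proxy)
X = np.random.rand(120, 3) * [10.0, 10.0, 30.0]
y = np.sin(X[:, 0]) + 0.1 * X[:, 2] + 0.05 * np.random.randn(len(X))

gp = GaussianProcessRegressor(kernel=RBF([2.0, 2.0, 5.0]) + WhiteKernel(0.01),
                              normalize_y=True).fit(X, y)

# candidate sampling locations on a future day (day 35)
grid = np.array([[gx, gy, 35.0] for gx in range(0, 11, 2) for gy in range(0, 11, 2)], float)
mean, std = gp.predict(grid, return_std=True)

DEPLOY_STD = 0.3                                  # assumed uncertainty threshold
deploy = std.max() > DEPLOY_STD                   # is a deployment worth it that day?
targets = grid[np.argsort(-std)[:5]]              # where to send the ASV if so
print(deploy, targets[:, :2])
```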
|
|
10:30-12:00, Paper WeAT24-NT.7 | Add to My Program |
System Calibration of a Field Phenotyping Robot with Multiple High-Precision Profile Laser Scanners |
|
Esser, Felix | University of Bonn |
Tombrink, Gereon | University of Bonn |
Cornelißen, André | University of Bonn |
Klingbeil, Lasse | University of Bonn |
Kuhlmann, Heiner | University of Bonn |
Keywords: Field Robots, Robotics and Automation in Agriculture and Forestry, Agricultural Automation
Abstract: The creation of precise and high-resolution crop point clouds in agricultural fields has become a key challenge for high-throughput phenotyping applications. This work implements a novel calibration method for the laser scanning system of an agricultural field robot, consisting of two industrial-grade laser scanners used for high-precision 3D crop point cloud creation. The calibration method optimizes the transformation between the scanner origins and the robot pose by minimizing 3D point omnivariances within the point cloud. Moreover, we present a novel factor graph-based pose estimation method that fuses total station prism measurements with IMU and GNSS heading information for high-precision pose determination during calibration. The root-mean-square error of the distances to a georeferenced ground truth point cloud is 0.8 cm after parameter optimization. Furthermore, our results show the importance of a reference point cloud in the calibration method, needed to estimate the vertical translation of the calibration. Challenges arise due to non-static parameters while the robot moves, indicated by systematic deviations from a ground truth terrestrial laser scan.
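For reference, the omnivariance of a 3D point neighbourhood is the cube root of the product of the covariance eigenvalues; the sketch below computes it and compares a crisp planar patch against a blurred one standing in for a mis-calibrated scan. The blur model and the idea of using this value as an inner calibration cost are illustrative assumptions.

```python
import numpy as np

def omnivariance(points):
    """Omnivariance of a 3-D point neighbourhood: cube root of the
    product of the covariance eigenvalues. Low values indicate points
    collapsing onto a surface, as desired after good calibration."""
    cov = np.cov(points.T)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return float(np.cbrt(np.prod(eig)))

# a crisp (planar) patch vs. the same patch blurred along the normal
plane = np.column_stack([np.random.rand(500, 2), 1e-3 * np.random.randn(500)])
blurred = plane + np.random.randn(500, 3) * [0.0, 0.0, 0.02]
print(omnivariance(plane), omnivariance(blurred))
```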
|
|
10:30-12:00, Paper WeAT24-NT.8 | Add to My Program |
Atmospheric Aerosol Diagnostics with UAV-Based Holographic Imaging and Computer Vision |
|
Bristow, Nathaniel | University of Minnesota |
Pardoe, Nikolas | University of Minnesota |
Hong, Jiarong | ME, UMN |
Keywords: Field Robots, Vision-Based Navigation, Aerial Systems: Applications
Abstract: Emissions of particulate matter into the atmosphere are essential to characterize, in terms of properties such as particle size, morphology, and composition, to better understand impacts on public health and the climate. However, there is no currently available technology capable of measuring individual particles with such high detail over the extensive domains associated with events such as wildfires or volcanic eruptions. To solve this problem, we present an autonomous measurement system involving an unmanned aerial vehicle (UAV) coupled with a digital inline holographic microscope for in situ particle diagnostics. The flight control uses computer vision to localize and then trace the movements of particle-laden flows while sampling particles to determine their properties as they are transported away from their source. We demonstrate this system applied to measuring particulate matter in smoke plumes and discuss broader implications for this type of system in similar applications.
|
|
10:30-12:00, Paper WeAT24-NT.9 | Add to My Program |
WayFASTER: A Self-Supervised Traversability Prediction for Increased Navigation Awareness |
|
Valverde Gasparino, Mateus | University of Illinois at Urbana-Champaign |
Sivakumar, Arun Narenthiran | University of Illinois at Urbana Champaign |
Chowdhary, Girish | University of Illinois at Urbana Champaign |
Keywords: Field Robots, Vision-Based Navigation, Robotics and Automation in Agriculture and Forestry
Abstract: Accurate and robust navigation in unstructured environments requires fusing data from multiple sensors. Such fusion ensures that the robot is better aware of its surroundings, including areas of the environment that are not immediately visible but were visible at a different time. To solve this problem, we propose a method for traversability prediction in challenging outdoor environments using a sequence of RGB and depth images fused with pose estimations. Our method, termed WayFASTER (Waypoints-Free Autonomous System for Traversability with Enhanced Robustness), uses experience data recorded from a receding horizon estimator to train a self-supervised neural network for traversability prediction, eliminating the need for heuristics. Our experiments demonstrate that our method excels at avoiding obstacles and correctly detects that terrains such as tall grass can be traversable. By using a sequence of images, WayFASTER significantly enhances the robot’s awareness of its surroundings, enabling it to predict the traversability of terrains that are not immediately visible. This enhanced awareness contributes to better navigation performance in environments where such predictive capabilities are essential.
|
|
WeAT25-NT Oral Session, NT-G403 |
Add to My Program |
Localization IV |
|
|
Chair: Chen, Changhao | National University of Defense Technology |
Co-Chair: Tan, U-Xuan | Singapore University of Technology and Design |
|
10:30-12:00, Paper WeAT25-NT.1 | Add to My Program |
A Coarse-To-Fine Place Recognition Approach Using Attention-Guided Descriptors and Overlap Estimation |
|
Fu, Chencan | Zhejiang University |
Li, Lin | Zhejiang University |
Mei, Jianbiao | Zhejiang University |
Ma, Yukai | Zhejiang University |
Peng, Linpeng | Zhejiang University |
Zhao, Xiangrui | Zhejiang University |
Liu, Yong | Zhejiang University |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Place recognition is a challenging but crucial task in robotics. Current description-based methods may be limited by representation capabilities, while pairwise similarity-based methods require exhaustive searches, which are time-consuming. In this paper, we present a novel coarse-to-fine approach to address these problems, which combines BEV (Bird's Eye View) feature extraction, coarse-grained matching and fine-grained verification. In the coarse stage, our approach utilizes an attention-guided network to generate attention-guided descriptors. We then employ a fast affinity-based candidate selection process to identify the Top-K most similar candidates. In the fine stage, we estimate pairwise overlap among the narrowed-down place candidates to determine the final match. Experimental results on the KITTI and KITTI-360 datasets demonstrate that our approach outperforms state-of-the-art methods. The code will be released publicly soon.
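The coarse matching stage described above can be pictured as a simple affinity ranking over global descriptors: compute similarities between the query descriptor and all database descriptors, then keep the Top-K most similar candidates for fine-grained overlap verification. A minimal, hedged sketch follows; cosine affinity is an assumption, and the paper's actual affinity function may differ.

```python
import numpy as np

def top_k_candidates(query_desc, db_descs, k=5):
    """Return indices of the Top-K most similar database descriptors (cosine affinity)."""
    q = query_desc / np.linalg.norm(query_desc)
    db = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    affinity = db @ q                         # (N,) cosine similarities
    return np.argsort(-affinity)[:k]          # candidate indices for overlap estimation
```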
|
|
10:30-12:00, Paper WeAT25-NT.2 | Add to My Program |
LHMap-Loc: Cross-Modal Monocular Localization Using LiDAR Point Cloud Heat Map |
|
Wu, Xinrui | Shanghai Jiao Tong University |
Xu, Jianbo | SJTU |
Hu, Puyuan | Shanghai Jiao Tong University |
Wang, Guangming | University of Cambridge |
Wang, Hesheng | Shanghai Jiao Tong University |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Localization using a monocular camera in a pre-built LiDAR point cloud map has drawn increasing attention in the fields of autonomous driving and mobile robotics. However, there are still many challenges (e.g. difficulties of map storage, poor localization robustness in large scenes) in accurately and efficiently implementing cross-modal localization. To solve these problems, a novel pipeline termed LHMap-Loc is proposed, which achieves accurate and efficient monocular localization in LiDAR maps. Firstly, feature encoding is carried out on the original LiDAR point cloud map by generating offline heat point clouds, which compresses the size of the original LiDAR map. Then, an end-to-end online pose regression network is designed based on optical flow estimation and spatial attention to achieve real-time monocular visual localization in a pre-built map. In addition, a series of experiments has been conducted to prove the effectiveness of the proposed method. Our code is available at: https://github.com/IRMVLab/LHMap-loc.
|
|
10:30-12:00, Paper WeAT25-NT.3 | Add to My Program |
LocNDF: Neural Distance Field Mapping for Robot Localization |
|
Wiesmann, Louis | University of Bonn |
Guadagnino, Tiziano | University of Bonn |
Vizzo, Ignacio | Dexory |
Zimmerman, Nicky | University of Lugano |
Pan, Yue | University of Bonn |
Kuang, Haofei | University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Mapping an environment is essential for several robotic tasks, particularly for localization. In this paper, we address the problem of mapping the environment using LiDAR point clouds with the goal to obtain a map representation that is well suited for robot localization. To this end, we utilize a neural network to learn a discretization-free distance field of a given scene for localization. In contrast to prior approaches, we directly work on the sensor data and do not assume a perfect model of the environment or rely on normals. Inspired by the recently proposed NeRF representations, we supervise the network by points sampled along the measured beams, and our loss is designed to learn a valid distance field. Additionally, we show how to perform scan registration and global localization directly within the neural distance field. We illustrate the capabilities to globally localize within an indoor environment utilizing a particle filter as well as to perform scan registration by tracking the pose of a car based on matching LiDAR scans to the neural distance field.
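The beam-based supervision described here can be sketched as follows: for each LiDAR beam, points are sampled between the sensor origin and a small margin behind the measured endpoint, and each sample is labeled with its distance to that endpoint along the ray, which a network can then regress. This is an illustrative approximation of that idea under assumed names, not the authors' training code.

```python
import numpy as np

def sample_beam_supervision(origin, endpoints, n_samples=8, margin=0.5):
    """Sample training points along each beam with approximate distance labels.

    origin: (3,) sensor position; endpoints: (M, 3) measured LiDAR returns.
    Returns (points, labels): label = distance from sample to the beam endpoint
    (positive in free space before the surface, negative slightly behind it).
    """
    dirs = endpoints - origin
    ranges = np.linalg.norm(dirs, axis=1, keepdims=True)
    dirs = dirs / ranges

    pts, labels = [], []
    for d, r in zip(dirs, ranges[:, 0]):
        # sample depths from the origin up to a small margin behind the surface
        depths = np.linspace(0.0, r + margin, n_samples)
        pts.append(origin + depths[:, None] * d)
        labels.append(r - depths)            # signed distance along the ray
    return np.concatenate(pts), np.concatenate(labels)
```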
|
|
10:30-12:00, Paper WeAT25-NT.4 | Add to My Program |
Looking beneath More: A Sequence-Based Localizing Ground Penetrating Radar Framework |
|
Zhang, Pengyu | National University of Defense Technology |
Zhi, Shuaifeng | National University of Defense Technology |
Yuan, Yuelin | Hikauto |
Bi, Beizhen | National University of Defense Technology |
Xin, Qin | National University of Defense Technology |
Huang, Xiaotao | National University of Defense Technology |
Shen, Liang | National University of Defense Technology |
Keywords: Localization, Mapping, Transfer Learning
Abstract: Localizing ground penetrating radar (LGPR) has been proven to be a promising technology for robot localization in various dynamic environments. However, the extreme scarcity of underground features introduces false candidate matches and brings unique challenges to this task. In this paper, we propose a sequence-based framework for LGPR to address the aforementioned issues. Specifically, we first introduce a trainable strategy to extract robust underground features in multi-weather conditions. By further using sequential information, our LGPR system can observe richer underground scene contexts, and the associated multi-frame scans could also improve the performance of underground place recognition. We demonstrate the superiority of our proposed method by comparing it against several recent state-of-the-art baseline methods applied to GPR image tasks. Experimental results on large public and self-collected datasets show that our proposed framework significantly improves the performance of various baselines in different scenarios.
|
|
10:30-12:00, Paper WeAT25-NT.5 | Add to My Program |
Increasing SLAM Pose Accuracy by Ground-To-Satellite Image Registration |
|
Zhang, Yanhao | University of Technology Sydney |
Shi, Yujiao | The Australian National University |
Wang, Shan | The Australian National University |
Vora, Ankit | Ford Motor Company |
Perincherry, Akhil | Ford Motor Company |
Chen, Yongbo | Australian National University |
Li, Hongdong | Australian National University and NICTA |
Keywords: Localization, SLAM, Deep Learning for Visual Perception
Abstract: Vision-based localization for autonomous driving has been of great interest among researchers. When a pre-built 3D map is not available, the techniques of visual simultaneous localization and mapping (SLAM) are typically adopted. Due to error accumulation, visual SLAM (vSLAM) usually suffers from long-term drift. This paper proposes a framework to increase the localization accuracy by fusing the vSLAM with a deep-learning based ground-to-satellite (G2S) image registration method. In this framework, a coarse (spatial correlation bound check) to fine (visual odometry consistency check) method is designed to select the valid G2S prediction. The selected prediction is then fused with the SLAM measurement by solving a scaled pose graph problem. To further increase the localization accuracy, we provide an iterative trajectory fusion pipeline. The proposed framework is evaluated on two well-known autonomous driving datasets, and the results demonstrate its accuracy and robustness in vehicle localization.
|
|
10:30-12:00, Paper WeAT25-NT.6 | Add to My Program |
EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization |
|
Xiao, Zhendong | South China University of Technology |
Chen, Changhao | National University of Defense Technology |
Shan, Yang | South China University of Technology |
Wei, Wu | School of Automation Science and Engineering, South China University of Technology |
Keywords: Localization, SLAM, Deep Learning Methods
Abstract: Camera relocalization is pivotal in computer vision, with applications in AR, drones, robotics, and autonomous driving. It estimates 3D camera position and orientation (6-DoF) from images. Unlike traditional methods like SLAM, recent strides use deep learning for direct end-to-end pose estimation. We propose EffLoc, a novel efficient Vision Transformer for single-image camera relocalization. EffLoc's hierarchical layout, memory-bound self-attention, and feed-forward layers boost memory efficiency and inter-channel communication. Our introduced sequential group attention (SGA) module enhances computational efficiency by diversifying input features, reducing redundancy, and expanding model capacity. EffLoc excels in efficiency and accuracy, outperforming prior methods such as AtLoc and MapNet. It thrives on large-scale outdoor car-driving scenarios, ensuring simplicity and end-to-end trainability, and eliminates handcrafted loss functions.
|
|
10:30-12:00, Paper WeAT25-NT.7 | Add to My Program |
SAGE-ICP: Semantic Information-Assisted ICP |
|
Cui, Jiaming | Zhejiang University |
Chen, Jiming | Zhejiang University |
Li, Liang | Zhejiang University |
Keywords: Localization, SLAM, Semantic Scene Understanding
Abstract: Robust and accurate pose estimation in unknown environments is an essential part of robotic applications. We focus on LiDAR-based point-to-point ICP combined with effective semantic information. This paper proposes a novel semantic information-assisted ICP method named SAGE-ICP, which leverages semantics in odometry. The semantic information for the whole scan is timely and efficiently extracted by a 3D convolution network, and these point-wise labels are deeply involved in every part of the registration, including semantic voxel downsampling, data association, adaptive local map, and dynamic vehicle removal. Unlike previous semantic-aided approaches, the proposed method can improve localization accuracy in large-scale scenes even if the semantic information has certain errors. Experimental evaluations on KITTI and KITTI-360 show that our method outperforms the baseline methods, and improves accuracy while maintaining real-time performance, i.e., runs faster than the sensor frame rate.
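One of the components listed above, semantic voxel downsampling, can be pictured as keeping at most one point per (voxel, label) cell so that thin but semantically important structures survive downsampling. A minimal sketch under assumed inputs, not the SAGE-ICP implementation:

```python
import numpy as np

def semantic_voxel_downsample(points, labels, voxel_size=0.5):
    """Keep one representative point per (voxel, semantic label) pair.

    points: (N, 3) LiDAR points; labels: (N,) per-point semantic class ids.
    """
    voxel_idx = np.floor(points / voxel_size).astype(np.int64)
    # A cell is identified by its voxel index *and* its semantic label,
    # so different classes inside one voxel are preserved separately.
    keys = np.concatenate([voxel_idx, labels[:, None].astype(np.int64)], axis=1)
    _, keep = np.unique(keys, axis=0, return_index=True)
    return points[keep], labels[keep]
```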
|
|
10:30-12:00, Paper WeAT25-NT.8 | Add to My Program |
HR-APR: APR-Agnostic Framework with Uncertainty Estimation and Hierarchical Refinement for Camera Relocalisation |
|
Liu, Changkun | The Hong Kong University of Science and Technology |
Chen, Shuai | University of Oxford |
Zhao, Yukun | Hong Kong University of Science and Technology |
Huang, Huajian | The Hong Kong University of Science and Technology |
Prisacariu, Victor | University of Oxford |
Braud, Tristan | HKUST |
Keywords: Localization, Visual Learning, Probabilistic Inference
Abstract: Absolute Pose Regressors (APRs) directly estimate camera poses from monocular images, but their accuracy is unstable for different queries. Uncertainty-aware APRs provide uncertainty information on the estimated pose, alleviating the impact of these unreliable predictions. However, existing uncertainty modelling techniques are often coupled with a specific APR architecture, resulting in suboptimal performance compared to state-of-the-art (SOTA) APR methods. This work introduces a novel APR-agnostic framework, HR-APR, that formulates uncertainty estimation as cosine similarity estimation between the query and database features. It does not rely on or affect the APR network architecture, making it flexible and computationally efficient. In addition, we take advantage of the uncertainty for pose refinement to enhance the performance of APR. Extensive experiments demonstrate the effectiveness of our framework, reducing computational overhead by 27.4% and 15.2% on the 7Scenes and Cambridge Landmarks datasets, respectively, while maintaining SOTA accuracy in single-image APRs.
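The uncertainty measure described above, formulated as cosine similarity between query and database features, can be sketched independently of any particular APR backbone: a low maximum similarity to the database features flags an unreliable pose estimate that should be passed to the refinement stage. A hedged sketch with assumed names and threshold:

```python
import numpy as np

def pose_uncertainty(query_feat, db_feats, sim_threshold=0.8):
    """Flag an APR prediction as uncertain if the query feature is far from the database.

    query_feat: (D,) feature of the query image; db_feats: (N, D) database features.
    Returns (max_similarity, needs_refinement).
    """
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    max_sim = float(np.max(db @ q))
    return max_sim, max_sim < sim_threshold   # low similarity -> refine the pose
```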
|
|
10:30-12:00, Paper WeAT25-NT.9 | Add to My Program |
Implicit Learning of Scene Geometry from Poses for Global Localization |
|
Altillawi, Mohammad | Huawei, Autonomous University of Barcelona |
Li, Shile | Algolux Germany |
Prakhya, Sai Manoj | Huawei Technologies Deutschland GmbH |
Liu, Ziyuan | Huawei Group |
Serrat, Joan | Computer Vision Center and Computer Science Department, Universitat Autònoma de Barcelona |
Keywords: Localization, Visual Learning, Virtual Reality and Interfaces
Abstract: Global visual localization estimates the absolute pose of a camera using a single image, in a previously mapped area. Obtaining the pose from a single image enables many robotics and augmented/virtual reality applications. Inspired by the latest advances in deep learning, many existing approaches directly learn and regress 6 DoF pose from an input image. However, these methods do not fully utilize the underlying scene geometry for pose regression. The challenge in monocular relocalization is the minimal availability of supervised training data, which is just the corresponding 6 DoF poses of the images. In this paper, we propose to utilize these minimal available labels (i.e., poses) to learn the underlying 3D geometry of the scene and use the geometry, in return, to estimate a 6 DoF pose in a geometric manner. We present a learning method that uses these pose labels and rigid alignment to learn two 3D geometric representations (X, Y, Z coordinates) of the scene, one in the camera coordinate frame and the other in the global coordinate frame. Given a single image, it estimates these two 3D scene representations, which are then aligned to estimate a pose that matches the pose label. This formulation allows for the active inclusion of additional learning constraints to minimize 3D alignment errors between the two 3D scene representations and 2D re-projection errors between the 3D global scene representation and 2D image pixels, which improves localization accuracy. At inference time, our mo
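The rigid alignment step referred to in this abstract, aligning per-pixel 3D coordinates predicted in the camera frame with those predicted in the global frame to recover a 6-DoF pose, is essentially an orthogonal Procrustes / Kabsch problem. A generic sketch of that alignment under assumed names, not the authors' code:

```python
import numpy as np

def rigid_align(cam_xyz, global_xyz):
    """Estimate R, t such that R @ cam_xyz + t ~= global_xyz (Kabsch algorithm).

    cam_xyz, global_xyz: (N, 3) corresponding 3D points in camera / global frames.
    """
    mu_c, mu_g = cam_xyz.mean(axis=0), global_xyz.mean(axis=0)
    H = (cam_xyz - mu_c).T @ (global_xyz - mu_g)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = mu_g - R @ mu_c
    return R, t
```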
|
|
WeAT26-NT Oral Session, NT-G404 |
Add to My Program |
SLAM I |
|
|
Chair: Wang, Sen | Imperial College London |
|
10:30-12:00, Paper WeAT26-NT.1 | Add to My Program |
KDD-LOAM: Jointly Learned Keypoint Detector and Descriptors Assisted LiDAR Odometry and Mapping |
|
Huang, Renlang | Zhejiang University |
Zhao, Minglei | Zhejiang University |
Chen, Jiming | Zhejiang University |
Li, Liang | Zhejiang University |
Keywords: SLAM, Deep Learning Methods, Localization
Abstract: Sparse keypoint matching based on distinct 3D feature representations can improve the efficiency and robustness of point cloud registration. Existing learning-based 3D descriptors and keypoint detectors are either independent or loosely coupled, so they cannot fully adapt to each other. In this work, we propose a tightly coupled keypoint detector and descriptor (TCKDD) based on a multi-task fully convolutional network with a probabilistic detection loss. In particular, this self-supervised detection loss fully adapts the keypoint detector to any jointly learned descriptors and benefits the self-supervised learning of descriptors. Extensive experiments on both indoor and outdoor datasets show that our TCKDD achieves state-of-the-art performance in point cloud registration. Furthermore, we design a keypoint detector and descriptors-assisted LiDAR odometry and mapping framework (KDD-LOAM), whose real-time odometry relies on keypoint descriptor matching-based RANSAC. The sparse keypoints are further used for efficient scan-to-map registration and mapping. Experiments on the KITTI dataset demonstrate that KDD-LOAM significantly surpasses LOAM and shows competitive performance in odometry.
|
|
10:30-12:00, Paper WeAT26-NT.2 | Add to My Program |
Campus Map: A Large-Scale Dataset to Support Multi-View VO, SLAM and BEV Estimation |
|
Ross, James | University of Surrey |
Kaygusuz, Nimet | University of Surrey |
Mendez, Oscar | University of Surrey |
Bowden, Richard | University of Surrey |
Keywords: SLAM
Abstract: Significant advances in robotics and machine learning have resulted in many datasets designed to support research into autonomous vehicle technology. However, these datasets are rarely suitable for a wide variety of navigation tasks. For example, datasets that include multiple cameras often have short trajectories without loops that are unsuitable for the evaluation of longer-range SLAM or odometry systems, and datasets with a single camera often lack other sensors, making them unsuitable for sensor fusion approaches. Furthermore, alternative environmental representations such as semantic Bird's Eye View (BEV) maps are growing in popularity, but datasets often lack accurate ground truth and are not flexible enough to adapt to new research trends. To address this gap, we introduce Campus Map, a novel large-scale multi-camera dataset with 2M images from 6 mounted cameras that includes GPS data and 64-beam, 125k point LiDAR scans totalling 8M points (raw packets also provided). The dataset consists of 16 sequences in a large car park and 6 long-term trajectories around a university campus that provide data to support research into a variety of autonomous driving and parking tasks. Long trajectories (average 10 min) and many loops make the dataset ideal for the evaluation of SLAM, odometry and loop closure algorithms, and we provide several state-of-the-art baselines. We also include 40k semantic BEV maps rendered from a digital twin. This novel approach to ground truth generation allows us to produce more accurate and crisp semantic maps than are currently available. We make the simulation environment available to allow researchers to adapt the dataset to their specific needs.
|
|
10:30-12:00, Paper WeAT26-NT.3 | Add to My Program |
DISO: Direct Imaging Sonar Odometry |
|
Xu, Shida | Imperial College London |
Zhang, Kaicheng | Heriot-Watt University |
Hong, Ziyang | Heriot-Watt University |
Liu, Yuanchang | University College London |
Wang, Sen | Imperial College London |
Keywords: SLAM
Abstract: This paper introduces a novel sonar odometry system that estimates the relative spatial transformation between two sonar image frames. Considering the unique challenges, such as low resolution and high noise, of sonar imagery for odometry and Simultaneous Localization and Mapping (SLAM), the proposed Direct Imaging Sonar Odometry (DISO) system is designed to estimate the relative transformation between two sonar frames by minimizing the aggregated sonar intensity errors of points with high intensity gradients. Moreover, DISO is implemented to incorporate a multi-sensor window optimization technique, a data association strategy and an acoustic intensity outlier rejection algorithm for reliability and accuracy. The effectiveness of DISO is evaluated using both simulated and real-world sonar datasets, showing that it outperforms the existing geometric-only method on localization accuracy and achieves state-of-the-art sonar odometry performance. The source code is available at https://github.com/SenseRoboticsLab/DISO.
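The direct objective described above, an aggregated intensity error over high-gradient points, can be pictured as follows: select pixels with strong intensity gradients in the reference sonar frame, warp them into the second frame under a candidate relative transform, and sum the intensity differences. The sketch below assumes a user-supplied `warp` function and a simple squared-error cost, so it only illustrates the shape of the objective, not DISO itself.

```python
import numpy as np

def sonar_intensity_cost(img_ref, img_cur, pose, warp, grad_percentile=90):
    """Aggregated intensity error over high-gradient pixels of a reference sonar image.

    warp(u, v, pose) -> (u2, v2): maps reference pixels into the current frame
    under the candidate relative pose (assumed, sensor-model dependent).
    """
    gy, gx = np.gradient(img_ref.astype(np.float64))
    grad_mag = np.hypot(gx, gy)
    v, u = np.nonzero(grad_mag >= np.percentile(grad_mag, grad_percentile))

    u2, v2 = warp(u, v, pose)
    inside = (u2 >= 0) & (u2 < img_cur.shape[1]) & (v2 >= 0) & (v2 < img_cur.shape[0])
    u, v = u[inside], v[inside]
    u2, v2 = u2[inside].astype(int), v2[inside].astype(int)

    residuals = img_ref[v, u].astype(np.float64) - img_cur[v2, u2].astype(np.float64)
    return float(np.sum(residuals ** 2))
```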
|
|
10:30-12:00, Paper WeAT26-NT.4 | Add to My Program |
CURL-MAP: Continuous Mapping and Positioning with CURL Representation |
|
Zhang, Kaicheng | Heriot-Watt University |
Ding, Yining | Heriot-Watt University |
Xu, Shida | Imperial College London |
Hong, Ziyang | Heriot-Watt University |
Kong, Xianwen | Heriot-Watt University |
Wang, Sen | Imperial College London |
Keywords: SLAM
Abstract: Maps of LiDAR Simultaneous Localisation and Mapping (SLAM) are often represented as point clouds. They usually take up a huge amount of storage space for large-scale environments; otherwise, much structural detail may be lost. In this paper, a novel paradigm of LiDAR mapping and odometry is designed by leveraging the Continuous and Ultra-compact Representation of LiDAR (CURL). Termed CURL-MAP (Mapping and Positioning), the proposed approach can not only reconstruct 3D maps with a continuously varying density but also efficiently reduce map storage space by using CURL's spherical harmonics implicit encoding. Different from the popular Iterative Closest Point (ICP) based LiDAR odometry techniques, CURL-MAP formulates LiDAR pose estimation as a unique optimisation problem tailored for CURL. Experimental evaluation shows that CURL-MAP achieves state-of-the-art 3D mapping results and competitive LiDAR odometry accuracy. We will release the CURL-MAP code for the community.
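The spherical-harmonics encoding mentioned here can be illustrated at a very small scale: project each LiDAR point's direction onto a real spherical-harmonic basis and fit coefficients that reproduce the measured ranges by least squares; ranges at arbitrary directions (hence arbitrary density) can then be decoded from the compact coefficient vector. The sketch builds a real basis from scipy's complex `sph_harm`; it is an illustration of the general idea, not the CURL encoding itself.

```python
import numpy as np
from scipy.special import sph_harm

def real_sh_basis(theta, phi, l_max=4):
    """Real spherical-harmonic basis at azimuth theta, polar angle phi (arrays of shape (N,))."""
    cols = []
    for l in range(l_max + 1):
        cols.append(sph_harm(0, l, theta, phi).real)
        for m in range(1, l + 1):
            y = sph_harm(m, l, theta, phi)
            cols.append(np.sqrt(2.0) * y.real)
            cols.append(np.sqrt(2.0) * y.imag)
    return np.stack(cols, axis=1)                      # (N, num_coeffs)

def encode_ranges(theta, phi, ranges, l_max=4):
    """Fit SH coefficients c so that basis @ c approximates the measured ranges."""
    B = real_sh_basis(theta, phi, l_max)
    coeffs, *_ = np.linalg.lstsq(B, ranges, rcond=None)
    return coeffs                                      # compact map representation

def decode_ranges(theta, phi, coeffs, l_max=4):
    """Reconstruct ranges at arbitrary directions (arbitrary density) from coefficients."""
    return real_sh_basis(theta, phi, l_max) @ coeffs
```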
|
|
10:30-12:00, Paper WeAT26-NT.5 | Add to My Program |
Degradation Resilient LiDAR-Radar-Inertial Odometry |
|
Nissov, Morten | NTNU |
Khedekar, Nikhil Vijay | NTNU |
Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
Keywords: SLAM, Aerial Systems: Perception and Autonomy, Field Robots
Abstract: Enabling autonomous robots to operate robustly in challenging environments is necessary in a future with increased autonomy. For many autonomous systems, estimation and odometry remain a single point of failure, from which it can often be difficult, if not impossible, to recover. As such, robust odometry solutions are of key importance. In this work, a method for tightly-coupled LiDAR-Radar-Inertial fusion for odometry is proposed, enabling the mitigation of the effects of LiDAR degeneracy by leveraging a complementary perception modality while preserving the accuracy of LiDAR in well-conditioned environments. The proposed approach combines modalities in a factor graph-based windowed smoother with sensor information-specific factor formulations which enable, in the case of degeneracy, partial information to be conveyed to the graph along the non-degenerate axes. The proposed method is evaluated in real-world tests on a flying robot experiencing degraded conditions, including geometric self-similarity as well as obscurant occlusion. For the benefit of the community we release the datasets presented: https://github.com/ntnu-arl/lidar_degeneracy_datasets.
|
|
10:30-12:00, Paper WeAT26-NT.6 | Add to My Program |
Design and Evaluation of a Generic Visual SLAM Framework for Multi Camera Systems |
|
Kaveti, Pushyami | Northeastern University |
Vaidyanathan, Shankara Narayanan | Northeastern University |
Thamil Chelvan, Arvind | Northeastern University |
Singh, Hanumant | Northeastern University |
Keywords: SLAM, Data Sets for SLAM, Field Robots
Abstract: Multi-camera systems have been shown to improve the accuracy and robustness of SLAM estimates, yet state-of-the-art SLAM systems predominantly support monocular or stereo setups. This paper presents a generic sparse visual SLAM framework capable of running on any number of cameras and in any arrangement. Our SLAM system uses the generalized camera model, which allows us to represent an arbitrary multi-camera system as a single imaging device. Additionally, it takes advantage of the overlapping fields of view (FoV) by extracting cross-matched features across cameras in the rig. This limits the linear rise in the number of features with the number of cameras and keeps the computational load in check while enabling an accurate representation of the scene. We evaluate our method in terms of accuracy, robustness, and run time on indoor and outdoor datasets that include challenging real-world scenarios such as narrow corridors, featureless spaces, and dynamic objects. We show that our system can adapt to different camera configurations and allows real-time execution for typical robotic applications. Finally, we benchmark the impact of the critical design parameters - the number of cameras and the overlap between their FoV that define the camera configuration for SLAM. All our software and datasets are freely available for further research.
|
|
10:30-12:00, Paper WeAT26-NT.7 | Add to My Program |
Ground-Fusion: A Low-Cost Ground SLAM System Robust to Corner Cases |
|
Yin, Jie | Shanghai Jiao Tong University |
Li, Ang | Shanghai Jiao Tong University |
Xi, Wei | Nankai University |
Yu, Wenxian | Shanghai Jiao Tong University |
Zou, Danping | Shanghai Jiao Tong University |
Keywords: SLAM, Data Sets for SLAM, Sensor Fusion
Abstract: We introduce Ground-Fusion, a low-cost sensor fusion simultaneous localization and mapping (SLAM) system for ground vehicles. Our system features efficient initialization, effective sensor anomaly detection and handling, real-time dense color mapping, and robust localization in diverse environments. We tightly integrate RGB-D images, inertial measurements, wheel odometer and GNSS signals within a factor graph to achieve accurate and reliable localization both indoors and outdoors. To ensure successful initialization, we propose an efficient strategy that comprises three different methods: stationary, visual, and dynamic, tailored to handle diverse cases. Furthermore, we develop mechanisms to detect sensor anomalies and degradation, handling them adeptly to maintain system accuracy. Our experimental results on both public and self-collected datasets demonstrate that Ground-Fusion outperforms existing low-cost SLAM systems in corner cases. We release the code and datasets at https://github.com/SJTU-ViSYS/Ground-Fusion.
|
|
10:30-12:00, Paper WeAT26-NT.8 | Add to My Program |
HERO-SLAM: Hybrid Enhanced Robust Optimization of Neural SLAM |
|
Xin, Zhe | Meituan |
Yue, Yufeng | Beijing Institute of Technology |
Zhang, Liangjun | Baidu |
Wu, Chenming | Baidu Research |
Keywords: SLAM, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Simultaneous Localization and Mapping (SLAM) is a fundamental task in robotics, driving numerous applications such as autonomous driving and virtual reality. Recent progress on neural implicit SLAM has shown encouraging and impressive results. However, the robustness of neural SLAM, particularly in challenging or data-limited situations, remains an unresolved issue. This paper presents HERO-SLAM, a Hybrid Enhanced Robust Optimization method for neural SLAM, which combines the benefits of neural implicit field and feature-metric optimization. This hybrid method optimizes a multi-resolution implicit field and enhances robustness in challenging environments with sudden viewpoint changes or sparse data collection. Our comprehensive experimental results on benchmarking datasets validate the effectiveness of our hybrid approach, demonstrating its superior performance over existing implicit field-based methods in challenging scenarios. HERO-SLAM provides a new pathway to enhance the stability, performance, and applicability of neural SLAM in real-world scenarios. Project page: https://hero-slam.github.io.
|
|
WeAL-EX Poster Session, Exhibition Hall |
Add to My Program |
Late Breaking Results Poster IV |
|
|
|
10:30-12:00, Paper WeAL-EX.1 | Add to My Program |
Development of Real-Time Motion Mapping for Surgical Robot |
|
Peuchpen, Pantita | The Hong Kong University of Science and Technology (Guangzhou) |
Liu, Haichao | The Hong Kong University of Science and Technology |
Ma, Jun | The Hong Kong University of Science and Technology |
Keywords: Telerobotics and Teleoperation, Mapping, Haptics and Haptic Interfaces
Abstract: Enhancing healthcare equity is a key global policy objective under the United Nations' Sustainable Development Goals (SDGs). Geographical hurdles to healthcare access present substantial challenges, resulting in decreased service utilization, lower uptake of preventive care, and diminished survival rates, particularly among individuals residing far from healthcare facilities. Teleoperation technology has therefore been introduced in medicine and surgery to address this issue; however, it requires a high level of precision and control. This paper presents a method for mapping the surgeon's hand motion to the robot arm in real time and providing haptic feedback that conveys forces from the instrument tips back to the surgeon.
|
|
10:30-12:00, Paper WeAL-EX.2 | Add to My Program |
Micromanipulation Assistance Via Motion Guidanceto a Spatiotemporal Ideal Trajectory Using GMM and LSTM |
|
Mori, Ryoya | Nagoya University |
Aoyama, Tadayoshi | Nagoya University |
Kobayashi, Taisuke | National Institute of Informatics |
Sakamoto, Kazuya | Nagoya University |
Takeuchi, Masaru | Nagoya University |
Hasegawa, Yasuhisa | Nagoya University |
Keywords: Imitation Learning, Biological Cell Manipulation, Human-Centered Robotics
Abstract: Intracytoplasmic sperm injection (ICSI) requires high skill to rotate the oocyte without causing damage using micropipettes. The oocyte rotation process is challenging due to the need for delicate and fast manipulations, as well as limited depth information. To address these difficulties, we propose a micromanipulation assistance system that utilizes a Gaussian Mixture Model (GMM) and Long Short-Term Memory (LSTM) to guide operators along an ideal spatiotemporal trajectory. The system learns static ideal trajectory points from data on an expert's pipette manipulations using the GMM, and an LSTM is trained to learn the expert's manipulations by inferring the expected pipette manipulation at each time step. During assistance for novice operators, the system provides real-time haptic and visual guidance by combining spatial guidance toward the static ideal trajectory with time-series-aware inference from the trained LSTM. Experiments conducted on novice operators demonstrated that the GMM+LSTM assistance system significantly improved operational efficiency and reduced cell damage compared to both the conventional system and an assistance system with LSTM alone. These results demonstrate the effectiveness of the spatiotemporal guidance approach for assisting complex micromanipulation tasks.
|
|
10:30-12:00, Paper WeAL-EX.3 | Add to My Program |
Feedforward Macro-Mini Dynamics Compensation Toward Dynamically Transparent Exoskeletons |
|
Shimoyama, Takuma | Graduate School of Informatics and Engineering, the University of Electro-Communications |
Noda, Tomoyuki | ATR Computational Neuroscience Laboratories |
Teramae, Tatsuya | ATR Computational Neuroscience Laboratories |
Nakata, Yoshihiro | The University of Electro-Communications |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Mechanical transparency in exoskeletons, i.e., ensuring the robot does not adversely affect the patient's body dynamics, is essential for robot rehabilitation. Conventional transparency has only aimed at zero interaction force, while transparency during robot-assisted movements has been overlooked. We focus on the ability to keep interaction forces at a non-zero target value during assistance, which we call force-based dynamic transparency. As the interaction force increases, the robot's mechanical losses also increase, making it more challenging to achieve force-based dynamic transparency. We aim to achieve force-based dynamic transparency by extending the concept of the distributed macro-mini actuation approach: a pneumatic-electromagnetic hybrid actuator effectively compensates for each mechanical loss by distributing these losses to the appropriate actuators. We have already demonstrated robust torque generation by assigning the kinetic friction force to the electromagnetic actuator of the pneumatic-electromagnetic hybrid actuator in our ICRA 2024 contributed paper. This research proposes a more general control design policy that distributes the physical properties of the elements constituting the robot to each actuator and compensates for them, realizing force-based dynamic transparency through the distributed macro-mini actuation approach with hybrid actuators.
|
|
10:30-12:00, Paper WeAL-EX.4 | Add to My Program |
Research on Planning and Control Methods for Refined Operations of Excavation Robot |
|
Lu, Liang | Tongji University |
Zhu, Minyan | Tongji University |
Tang, Chengzong | Tongji University |
Wang, Zhipeng | Tongji University |
He, Bin | Tongji University |
Keywords: Robotics and Automation in Construction, Task Planning, Motion Control
Abstract: A set of planning and control methods is designed to improve the precision operation ability of the excavation robot. The main achievements are summarized as follows: (1) A comprehensive trajectory optimization strategy for refined work was proposed, which improves the excavation robot's ability to perform refined work at the planning level. (2) A joint trajectory optimization method based on MABC and SQP was proposed, which improves the efficiency of trajectory optimization. (3) A variable universe fuzzy PID control strategy was designed to further reduce trajectory tracking errors.
|
|
10:30-12:00, Paper WeAL-EX.5 | Add to My Program |
Design of a Modular Supernumerary Mechanical Limb Actuated by a Foot Interface |
|
Chao, Elizabeth Ting | The Chinese University of Hong Kong |
Chan, Sheung Yan | The Chinese University of Hong Kong |
Huang, Yanpei | Imperial College London |
Eden, Jonathan | University of Melbourne |
Burdet, Etienne | Imperial College London |
Lau, Darwin | The Chinese University of Hong Kong |
Keywords: Wearable Robotics, Prosthetics and Exoskeletons, Tendon/Wire Mechanism
Abstract: Supernumerary limbs are wearable devices that can act both as a prosthetic and as a mechanism of human augmentation, providing additional extremities rather than replacing missing ones. Compared to typical manipulators, supernumerary limbs are unique as they are not fixed to a single base. The high redundancy of the human body can be taken advantage of, since the supernumerary limb can be controlled directly by moving its base. Addressing these considerations, the development of a task-specific, cable-driven Superlimb would allow for the practicality of such a wearable device. For intuitive control of the supernumerary limb, a foot interface was implemented. The end effector is actuated mechanically by wheel motion translated into the displacement of inner steel cables within an outer housing. The result is a wearable device in which haptic feedback from the end effector can be felt by the user at the foot. Experiments demonstrated that the workload of an additional supernumerary limb attached to one's body was not significantly higher than that of one on a fixed base.
|
|
10:30-12:00, Paper WeAL-EX.6 | Add to My Program |
Drone-Enabled Last Mile Delivery for Energy Management in UGV Teams |
|
Singh, Gaurav | Iowa State University |
Mandal, Shashwata | Iowa State University |
Bhattacharya, Sourabh | Iowa State University |
Keywords: Energy and Environment-Aware Automation, Multi-Robot Systems, Planning, Scheduling and Coordination
Abstract: Autonomous multi-robot systems deployed outdoors experience efficiency bottlenecks due to recharge operations. Many challenges arise when attempting to place static recharge stations outdoors, owing to factors such as distance from the robots, navigation, charge time, and recharge sequence. To overcome these challenges, we implement a recharge methodology that uses UAVs to deliver secondary batteries that recharge a robot's primary battery. We propose a framework and algorithms for finding efficient delivery sequences for recharging the robots, in which we explore the use of a nearest-first approach. The combined impact of the algorithms on the efficiency of UAV-UGV collaboration is studied. A testbed is set up to evaluate the feasibility and scalability of the system in the real world, using Crazyflies and Boe-Bots.
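The nearest-first idea mentioned in this abstract can be illustrated by a greedy sequencing sketch: starting from the UAV depot, repeatedly deliver a battery to the closest UGV that still needs a recharge. This is a generic greedy routine under assumed inputs, not the authors' framework.

```python
import numpy as np

def nearest_first_sequence(depot, ugv_positions):
    """Greedy nearest-first delivery order for UAV battery drops.

    depot: (2,) UAV start position; ugv_positions: (N, 2) UGVs awaiting recharge.
    Returns the visiting order as a list of UGV indices.
    """
    ugv_positions = np.asarray(ugv_positions, dtype=float)
    remaining = list(range(len(ugv_positions)))
    current = np.asarray(depot, dtype=float)
    order = []
    while remaining:
        dists = [np.linalg.norm(ugv_positions[i] - current) for i in remaining]
        nxt = remaining[int(np.argmin(dists))]   # closest UGV still waiting
        order.append(nxt)
        remaining.remove(nxt)
        current = ugv_positions[nxt]
    return order
```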
|
|
10:30-12:00, Paper WeAL-EX.7 | Add to My Program |
Vision-Driven Robotic System for Autonomous Sewing of Elastic Fabrics |
|
Marchello, Gabriele | Istituto Italiano Di Tecnologia |
Abidi, Syed Haider Jawad | Istituto Italiano Di Tecnologia |
Lahoud, Marcel | Italian Institute of Technology |
Fontana, Eleonora | Istituto Italiano Di Tecnologia |
Meddahi, Amal | Italian Institute of Technology |
Baizid, Khelifa | Italian Institute of Technology |
Farajtabar, Mohammad | University of Calgary |
D'Imperio, Mariapaola | Istituto Italiano Di Tecnologia |
Cannella, Ferdinando | Istituto Italiano Di Tecnologia |
Keywords: Industrial Robots, Grippers and Other End-Effectors, Soft Robot Applications
Abstract: Manipulating soft materials has always been one of the most difficult problems in robotics, due to the non-linear mechanical behaviour of fabrics. Therefore, the automation of systems based on the manipulation of soft materials (such as the clothing industry) has been very limited. We present a robotic cell that supports workers by automating the production of cyclist garments, composed of an elastic cloth and a foam pad to sew together. The robotic cell is comprised of two robotic arms equipped with a two-finger parallel gripper and a pneumatic needle gripper to flatten the cloth and pick the foam pad, respectively. Moreover, a Cartesian robot is employed to drive the two fabrics under the needle of a sewing machine. This project aims to improve the productivity of garments and the working conditions of the operators. The results obtained by the robotic cell are comparable with conventional ones both in quality and production time. In addition, the modularity underlying the design of this structure ensures a high degree of flexibility. Therefore, the system can be used to make all types of garments.
|
|
10:30-12:00, Paper WeAL-EX.8 | Add to My Program |
EAIK: A Toolbox for Efficient Analytical Inverse Kinematics by Subproblem Decomposition |
|
Ostermeier, Daniel | Technical University of Munich |
Külz, Jonathan | Technical University of Munich |
Keywords: Kinematics, Industrial Robots, Software Tools for Robot Programming
Abstract: Current methods for general closed-form inverse kinematics (IK) suffer from slow derivation speed and complex setup procedures. Our IK toolbox provides high usability by encapsulating all its functionalities in a Python package. It automatically derives a robot’s kinematic structure from either a URDF File or a set of Denavit–Hartenberg (DH) Parameters. We achieve millisecond derivation speeds for the subproblem decomposition, numeric stability for the IK solutions, and microsecond IK computation times that surpass current numerical methods whilst providing an analytical complete solution set.
|
|
10:30-12:00, Paper WeAL-EX.9 | Add to My Program |
3D Actuation and Trajectory Control of Ferrofluidic Droplet Robot Swarms for Targeted Drug Delivery |
|
Fan, Xinjian | Soochow University |
Zhang, Yunfei | Soochow University |
Yang, Zhan | Soochow University |
Keywords: Automation at Micro-Nano Scales, Biologically-Inspired Robots, Micro/Nano Robots
Abstract: Research on microrobot swarms points to exciting applications, but handling these swarms is much more complex than dealing with individual robots. The challenge starts with making these tiny robots efficiently and in large numbers, which current methods can't always do well. Additionally, orchestrating the collective operation of these swarms within the human body, especially beyond mere planar movement, poses a substantial challenge. Most existing research doesn't fully tackle how to control these swarms in three-dimensional spaces within the body, which is crucial for delivering medicine right where it's needed. To address the aforementioned problems and challenges, this work presents an innovative method based on microfluidic technology for tackling the issue of mass production of microrobots. Furthermore, this work constructs an 8-axis distributed electromagnetic coil to realize a decoupled three-dimensional (3D) spatial control method for microrobot swarms based on magnetic force and torque, solving the challenges related to three-dimensional actuation and trajectory control. By using 3D printing and animal tissue, we finally create environments that mimic human tissues for 3D locomotion tests of microrobot swarms. This research endeavors to advance our capabilities in manipulating these minuscule robotic swarms, paving the way for novel disease treatment methods that promise greater precision and reduced invasiveness.
|
|
10:30-12:00, Paper WeAL-EX.10 | Add to My Program |
Autonomous Loose Fruit Collection for Oil Palm Plantation |
|
Ismail, Muhamad Khuzaifah | Sime Darby Plantation Research |
Keywords: Robotics and Automation in Agriculture and Forestry, Agricultural Automation
Abstract: The Autonomous Loose Fruit Robot (ALFRo) project, spearheaded by SD Plantation Research, presents a groundbreaking solution to address labor shortages and operational inefficiencies in the oil palm industry. Leveraging advanced technologies including robotics, artificial intelligence (AI), and automation, ALFRo is able to carry out labor-intensive tasks that ease the burden of labor shortage in the plantation industry. The ALFRo project involved research and development activities focused on integrating advanced sensors with image processing algorithms to detect palm fruitlets left on the ground and to evacuate the valuable consignment in a timely manner. ALFRo features scalability and agility for application in oil palm plantations. Operational sustainability is demonstrated via minimal operational losses and minimal modification to soil conditions. With a planned timeline of one year for development and extensive testing, ALFRo aims to set new standards for loose fruit collection in the palm oil industry. By enhancing efficiency, productivity, and safety while minimizing environmental impact, ALFRo represents a transformative shift in oil palm estate management, paving the way for a more sustainable and prosperous future in the industry and beyond.
|
|
10:30-12:00, Paper WeAL-EX.11 | Add to My Program |
Balance Recovery Via Whole-Body Model Predictive Control for Wheeled Bipedal Robots |
|
Lee, Young Hun | Korea Institute of Machinery & Materials |
Kang, Woosong | DGIST |
Park, Jongwoo | Korea Institute of Machinery & Materials |
Ahn, Jeongdo | Korea Institute of Machinery and Materials |
Park, Dongil | Korea Institute of Machinery and Materials (KIMM) |
Park, Chanhun | KIMM |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Optimization and Optimal Control
Abstract: This poster presents a whole-body controller based on model predictive control (MPC), which enables a wheeled bipedal robot to demonstrate dynamic locomotion over various terrains, including slopes and stairs, as well as under various types of external disturbances. To stabilize the robot's balance, optimal torques for each joint are generated through the MPC method. The proposed whole-body controller was tested on the wheeled bipedal robot, and its locomotive abilities were evaluated in the Gazebo simulator.
|
|
10:30-12:00, Paper WeAL-EX.12 | Add to My Program |
An Origami-Inspired Approach to Height Adjustment of Wind Assisted Ship Propulsion |
|
Kim, Chan | Seoul National University |
Jung, Sun-Pill | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Mechanism Design, Actuation and Joint Mechanisms, Tendon/Wire Mechanism
Abstract: The International Maritime Organization (IMO) is pushing for cleaner seas, setting targets to reduce CO2 emissions by 40% by 2030 and 70% by 2050 relative to 2008 levels. Ships are now required to adopt more eco-friendly technologies, i.e., to become green ships. One approach to creating a green ship is to reduce fuel consumption by using an auxiliary propulsion device that utilizes wind power; this method can also be applied to existing ships. We plan to manufacture a rotor sail, illustrated in the figure, in which a rotating actuator on the inner tower spins the outer panel. The rotor sail operates on the principle of the Magnus effect: when the rotating sail encounters wind, it generates lift. The rotor sail needs to be tall because wind strength increases with altitude at sea, enhancing the Magnus effect; our target rotor sail height is 35 m. If the sail is divided into layers, the mean diameter is not constant and flow separation occurs, leading to efficiency issues. The rotor sail's height can also create navigational issues, such as when passing under bridges or docking, so the sail's height must occasionally be reduced. What if the rotor sail could fold to the required height? By folding the upper part, the sail can be kept at a height at which it operates efficiently. Therefore, the selection or additional design of a folding method that can maintain a constant diameter is necessary.
|
|
10:30-12:00, Paper WeAL-EX.13 | Add to My Program |
2 DoF Prosthetic Wrist with Concave-Convex Rolling Contact Joint |
|
Jeong, Inchul | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Actuation and Joint Mechanisms, Prosthetics and Exoskeletons, Tendon/Wire Mechanism
Abstract: The wrist, with its 2 degrees of freedom (DoF), plays a crucial role in orienting the hand to grasp objects with the desired posture. The absence of a wrist simplifies the kinematics of the upper extremity: prosthetic hand users without a wrist rely on other joints or the intact limb and suffer discomfort from compensatory motion, which can lead to residual limb pain, secondary musculoskeletal disease, and overuse syndrome. The wrist is involved not only in orienting the hand but also in manipulating the hand or grasped tools, and people actively use their wrists in activities of daily living (ADL). Coupled-DoF motion, known as the dart-throwing motion, is mainly used in ADL with various coupling ratios. Given these characteristics of the wrist, prosthetic users prefer a hand with a wrist over a hand without one. To fulfill the needs of prosthetic users, an artificial limb needs to be lighter and to have better usability and functionality at an anthropomorphic size, and a prosthetic wrist needs to meet those conditions to restore missing functions for amputees. In this paper, we propose the design of a 3D-printed prosthetic wrist with compact size, low inertia, and 2-DoF actuation. To enlarge the load capacity with lightweight material and mimic the concave shape of the wrist row, a rolling contact joint with a concave-convex shape was used to lower contact stress. A tendon-driven actuating system enables proximal placement of the motors with low inertia, and the joint surface and tendon routing are designed to enable decoupling of the 2 DoF.
|
|
10:30-12:00, Paper WeAL-EX.14 | Add to My Program |
Size-Adaptive Robotic Gripper with Constant Gripping Force Using Electromagnetic Fuse Switch Mechanism |
|
Kim, Tae Hwan | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Grippers and Other End-Effectors, Grasping, Mechanism Design
Abstract: The demand for customized and personalized products has recently been increasing, and it would be advantageous to have the capability to manufacture products of various sizes on a single production line, necessitating adaptive robotic grippers for different objects. We propose a size-adaptive robotic gripper capable of grasping objects of various sizes without using any sensors. The proposed mechanism employs an electromagnetic fuse that delivers gripping force below a threshold level and disconnects the actuation circuit when overloaded, resulting in a constant gripping force applied to objects of different sizes. In this system, while the gripper holds an object, the gripping force can be controlled by adjusting the amount of electric current supplied to the electromagnetic fuse. Since force transmission is determined by the geometry and motion of the mechanism, the transfer function of force transmission for the proposed mechanism is modeled and optimized to generate a constant gripping force regardless of the size of the object. Experimental results confirm that the gripper is capable of grasping various objects without closed-loop control.
|
|
10:30-12:00, Paper WeAL-EX.15 | Add to My Program |
SEMG-Based Hand Gesture Recognition by Time-Frequency Domain Multifeature Coupling Network |
|
Wang, Peiyao | Shenyang University of Technology |
Li, Yazhou | Shenyang University of Technology |
Li, Kairu | Shenyang University of Technology |
Keywords: Gesture, Posture and Facial Expressions, Prosthetics and Exoskeletons, Datasets for Human Motion
Abstract: Surface electromyography (sEMG), which enables tracking of electrical activity within muscles, is widely applied to human-machine interaction (HMI), such as gesture recognition and prosthetic control. However, electrode displacement can seriously affect sEMG-based motion recognition accuracy, so in practice users have to retrain the system each time they re-wear the sEMG electrodes, which increases their training burden and degrades the user experience. We therefore propose a Global-Local Time-Frequency Coupling Network (GL-TF Coupling Network) for sEMG-based gesture recognition. The network innovatively adopts a compact convolution-transformer structure, where the convolutional module is responsible for learning dual-channel signals in the time and frequency domains to extract low-level local features of gesture actions. Combined with a self-attention module, it captures global correlations within local time-frequency features. Additionally, a simple classifier module composed of fully connected layers predicts the gesture categories of sEMG signals. To enhance the multi-channel information fusion capability among sEMG signals, a “conical flask” structure for the convolutional fusion channel is introduced, coupling information across different channels. Experiment results demonstrate an average gesture recognition accuracy of 90% on the public "EMG data for gestures" dataset and 90.69% on our "ED-sEMG" dataset, which includes scenarios of electrode displacement.
|
|
10:30-12:00, Paper WeAL-EX.16 | Add to My Program |
ImitationBT: Imitation Learning for Behavior Tree Generation from DRL Agents |
|
Bathula, Shailendra Sekhar | University of Georgia |
Parasuraman, Ramviyas | University of Georgia |
Keywords: Imitation Learning, Behavior-Based Systems, Reinforcement Learning
Abstract: Behavior Trees (BT) stand as a favored control architecture among game designers and robotics experts, prized for their modularity, reactivity, and hierarchical structure. These properties enable BTs to offer scalable and clear-cut solutions to a wide array of decision-making challenges. In contrast, Deep Reinforcement Learning (DRL) has demonstrated exceptional performance but faces hesitancy in high-stakes domains due to its reliance on neural networks, which present challenges in verifiability and explainability. In this context, we introduce a novel framework designed to bridge the gap between the high performance of DRL and the desirable transparency and verifiability of BTs. By employing imitation learning to capture and transfer the expertise of a reinforcement learning model, we pave the way for generating BTs that are not only effective but also transparent, interpretable, and readily verifiable for real world problems.
|
|
10:30-12:00, Paper WeAL-EX.18 | Add to My Program |
Accurate Loop Closure with Panoptic Information and Scan Context++ for LiDAR-Based SLAM |
|
Tan, Louise | Kumoh National Institute of Technology |
Lee, Heoncheol | Kumoh National Institute of Technology |
Keywords: SLAM, Semantic Scene Understanding
Abstract: Loop closing is crucial in a SLAM system to reduce drift accumulation. Most SLAM systems only leverage low-level geometric features, leaving high-level information unused. We propose incorporating panoptic information into the Scan Context++ algorithm to improve loop closure detection accuracy. The proposed approach is able to exploit LiDAR odometry and panoptic information to perform loop closure detection as well as pose estimation and mapping. Experimental results show improvements in loop closure detection with the incorporation of panoptic information.
|
|
10:30-12:00, Paper WeAL-EX.19 | Add to My Program |
Flexure Hinge-Based Miniature Parallel Manipulator for Eye-Box Expansion of AR-HUD System |
|
Park, Yong-Min | Seoul National University |
Jung, Sun-Pill | Seoul National University |
You, Jang-Woo | Samsung |
Koh, Je-Sung | Ajou University |
Lee, Hong-Seok | Pukyong National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Parallel Robots, Industrial Robots, Multi-Robot Systems
Abstract: Augmented reality head-up displays (AR-HUD) implement augmented information on the road to provide users with a diverse experience. Maxwellian view displays and holographic displays were devised to implement realistic AR; however, their small eye-box limited their potential for HUD applications. In this paper, we developed a manipulator to mechanically expand the eye-box into 3D space and maximize the utilization of light, unlike previous solutions. A modified delta robot mechanism, which consists of three legs arranged at 90 degrees, is used to manipulate two adjacent projectors with discrete 3-DOF movements that deliver images to each eye in real time. We utilized the origami fabrication method for its light weight and simple fabrication and proposed a triangular-prism-shaped parallelogram linkage, which can be used as a linkage for delta robots. The linkage has the stiffness and zero backlash needed to move the projector at the required speed and precision within the workspace. The eye-box was expanded to 140 mm × 110.6 mm × 140 mm, and the positioning error of the AR-HUD system was evaluated to be less than 1 mm.
|
|
10:30-12:00, Paper WeAL-EX.20 | Add to My Program |
Model-Based Real-Time Simulator for Robotic Electromagnetic Actuation |
|
Ko, Yeongoh | Chonnam National University |
Lee, Han-Sol | Chonnam National University |
Kim, Chang-Sei | Chonnam National University |
Keywords: Medical Robots and Systems, Simulation and Animation
Abstract: This study introduces a real-time simulator designed for a robotic electromagnetic actuator (EMA) system. To address the complexities of electromagnetic field computations, a simplified magnetic field model based on the Biot-Savart law is proposed. The proposed model reduces calculation time from 48 seconds using the Finite Element Method (FEM) to 204 milliseconds and shows less than 4% error compared to FEM simulations and real measurements for the principal axis. Within the ROS Gazebo environment, the simulator provides visualization of the robot EMA system and its magnetic fields. It operates by receiving joystick commands for robot pose, computing currents based on the desired fields and posture, and transmitting these currents simultaneously to both the real system and the simulator. Experimental results exhibit 5° errors in capsule movements and a 2.29 mm root mean square error (RMSE) for guidewire navigation.
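The simplified field model mentioned above is based on the Biot-Savart law, which can be evaluated numerically by summing the contributions of short current segments along each coil. A generic numerical sketch follows; the polyline discretization and variable names are assumptions, not the paper's model.

```python
import numpy as np

MU0 = 4 * np.pi * 1e-7  # vacuum permeability [T*m/A]

def biot_savart_field(coil_points, current, query):
    """Magnetic flux density at `query` from a coil given as an ordered polyline.

    coil_points: (N, 3) points along the coil (closed loop assumed);
    current: coil current [A]; query: (3,) evaluation point [m].
    """
    B = np.zeros(3)
    for p0, p1 in zip(coil_points, np.roll(coil_points, -1, axis=0)):
        dl = p1 - p0                         # current segment vector
        r = query - 0.5 * (p0 + p1)          # segment midpoint to query point
        norm_r = np.linalg.norm(r)
        if norm_r < 1e-9:
            continue
        B += MU0 * current / (4 * np.pi) * np.cross(dl, r) / norm_r**3
    return B                                  # Tesla
```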
|
|
10:30-12:00, Paper WeAL-EX.21 | Add to My Program |
ILPSR: Imitation Learning with Predictable Skill Representation for Long-Horizon Manipulation Tasks |
|
Wang, Hao | University of Science and Technology of China |
Zhang, Hao | University of Science and Technology of China |
Li, Lin | University of Science and Technology of China |
Qian, Tangyu | University of Science and Technology of China |
Zhou, Zhangli | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Deep Learning in Grasping and Manipulation, AI-Based Methods, Learning from Experience
Abstract: Robots rely heavily on prior experience when learning new tasks. However, traditional supervised learning-based methods are limited by the need for large-scale, high-quality datasets as well as generalization fragility, resulting in poor scalability. To address these problems, this work proposes Imitation Learning with Predictable Skill Representation (ILPSR) to enable robots to learn downstream tasks robustly and efficiently with prior data from previous tasks. To better utilize the prior experience, a Predictable Skill Representation Learning model (PSRL) is first developed to extract predictable skill embeddings and skill priors from the prior data. Subsequently, a skill-based behavioral cloning method is employed to apply the learned skill embeddings to policy learning and generalization in downstream target tasks. Experiments demonstrate that ILPSR can more effectively perform challenging long-horizon, complex manipulation skills, with learning performance outperforming baselines.
|
|
10:30-12:00, Paper WeAL-EX.22 | Add to My Program |
Servo Integrated Nonlinear Model Predictive Control for Overactuated Tiltable-Quadrotors |
|
Li, Jinjie | The University of Tokyo |
Sugihara, Junichiro | The University of Tokyo |
Zhao, Moju | The University of Tokyo |
Keywords: Aerial Systems: Mechanics and Control, Motion Control
Abstract: Quadrotors are widely employed across various domains, yet conventional models face limitations due to underactuation, where attitude control is closely tied to positional adjustments. In contrast, quadrotors equipped with tiltable rotors offer overactuation, empowering them to track both position and attitude references. However, the nonlinear dynamics of the drone body and the sluggish response of tilting servos pose challenges for conventional cascade controllers. In this study, we propose a control methodology for tilting-rotor quadrotors leveraging nonlinear model predictive control (NMPC). Unlike conventional approaches, our method preserves the full dynamics without simplification and utilizes actuator commands directly as control inputs. Notably, we incorporate a first-order servo model within the NMPC framework. Through simulation, we observe that integrating the servo dynamics not only enhances control performance but also accelerates convergence. To assess the efficacy of our approach, we fabricate a tiltable-quadrotor and deploy the algorithm onboard at a frequency of 100Hz. Extensive real-world experiments demonstrate smooth and rapid pose tracking performance.
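The abstract's key modeling choice, a first-order servo model inside the NMPC prediction, can be pictured with a simple augmented-state rollout. The sketch below (time constant, step size, and the body_dynamics callable are placeholders, not the authors' values) shows the idea with forward-Euler discretization.

import numpy as np

def servo_step(alpha, alpha_cmd, tau=0.1, dt=0.01):
    # First-order lag: d(alpha)/dt = (alpha_cmd - alpha) / tau, discretized with forward Euler.
    return alpha + dt * (alpha_cmd - alpha) / tau

def rollout(x0, alpha0, thrust_seq, alpha_cmd_seq, body_dynamics, tau=0.1, dt=0.01):
    # Predict the augmented state (body state + tilt angles) over the NMPC horizon.
    # body_dynamics(x, thrusts, alpha, dt) must return the next body state (user-supplied).
    x, alpha = x0, np.asarray(alpha0, dtype=float)
    traj = []
    for thrusts, alpha_cmd in zip(thrust_seq, alpha_cmd_seq):
        alpha = servo_step(alpha, alpha_cmd, tau, dt)   # servos respond slowly, not instantly
        x = body_dynamics(x, thrusts, alpha, dt)
        traj.append((x, alpha.copy()))
    return traj

Because the predicted tilt angles lag their commands, the optimizer naturally issues anticipatory servo commands, which is consistent with the improved convergence the authors report.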
|
|
10:30-12:00, Paper WeAL-EX.23 | Add to My Program |
Design of Multi-Functional and Deployable Small-Scale Modular Robot Using Origami-Based Compliant Structure |
|
Kim, Junhyung | Seoul National University |
Jung, Mincheol | Seoul National University |
Kim, Jaehoon | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Cellular and Modular Robots, Multi-Robot Systems, Sensor-based Control
Abstract: Manipulation tasks in confined spaces are challenging for human workers, and modular robots, characterized by deployable and multifunctional capabilities, have recently gained attention in these applications. They can be dexterous even with spatial limitations and can alter their form factors based on their modularity, offering a wide range of motion and functionality. However, achieving lightweight and compact designs remains a challenge due to power sources and electric motors. Moreover, oversimplified designs and lightweight structures may degrade the performance of precise control. To address these issues, we aim to develop a versatile, dexterous, and controllable modular robot at a compact centimeter scale. The actuator modules, made of shape memory alloy (SMA) springs and smart composite microstructure (SCM) technology, enable linear and bending motions. The sensor module, directly integrated into the actuator module, detects the actuation states by measuring the capacitance change. Multiple actuator-sensor modules can be combined for diverse applications. The module's performance is experimentally characterized by comparing its mechanical responses with analytical models based on the relationship between the temperature of the SMA and the generated force. Closed-loop control performance is evaluated using root-mean-square error (RMSE). Lastly, the application demonstrates various module combinations for manipulators with different target motions.
|
|
10:30-12:00, Paper WeAL-EX.24 | Add to My Program |
Neuro-Symbolic Task Replanning Using Large Language Models |
|
Kwon, Minseo | Ewha Womans University |
Kim, Young J. | Ewha Womans University |
Keywords: Task Planning
Abstract: We propose a novel task replanning pipeline for executing complicated robotic tasks on physical robots utilizing a combination of a symbolic task planner and a multi-modal Large Language Model (LLM). Our pipeline begins by obtaining the semantic and spatial relationships of target objects in the environment using a multimodal LLM and an open-vocabulary object detection model. Then, the LLM specifies a planning problem based on scene and user-provided goal descriptions, which a symbolic planner then uses to plan tasks. These plans are translated into low-level programming languages for execution on the robot, with syntax and semantic checking by the LLM to ensure correctness and to trigger replanning on failure. We demonstrate the implementation of our pipeline on dual UR5e robots across various benchmark tasks, including pick and place, stacking blocks, and block rearrangement, to verify its effectiveness.
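As a simplified picture of the middle step (turning detected object relations and a goal description into a problem the symbolic planner can consume), a PDDL problem string can be assembled from grounded facts as below. Object names, predicates, and the domain name are placeholders, not the paper's actual schema.

def make_pddl_problem(objects, init_facts, goal_facts, domain="tabletop", name="scene-1"):
    # objects    = {"red_block": "block", "blue_block": "block", "table": "surface"}
    # init_facts = [("on", "red_block", "table"), ("clear", "red_block")]
    # goal_facts = [("on", "red_block", "blue_block")]
    objs = "\n    ".join(f"{o} - {t}" for o, t in objects.items())
    init = "\n    ".join("(" + " ".join(f) + ")" for f in init_facts)
    goal = "\n      ".join("(" + " ".join(f) + ")" for f in goal_facts)
    return (
        f"(define (problem {name})\n"
        f"  (:domain {domain})\n"
        f"  (:objects\n    {objs})\n"
        f"  (:init\n    {init})\n"
        f"  (:goal (and\n      {goal}))\n"
        f")"
    )

In the pipeline described above, the LLM would emit or correct content of this form, an off-the-shelf planner would solve it, and the resulting plan would be translated into robot commands.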
|
|
10:30-12:00, Paper WeAL-EX.25 | Add to My Program |
Lifting 2D Pretrained Knowledge to 3D for Object Grounding |
|
S, Ashwin | Indian Institute of Science |
Bannur, Ganesh | Indian Institute of Science, RV College of Engineering |
Amrutur, Bharadwaj | Indian Institute of Science |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: We propose GAP (Ground And Project), a method for leveraging pretrained 2D models and NeRF for 3D grounding. Recently 3D grounding has seen the emergence of techniques such as LERF, which distil CLIP’s knowledge of scenes by training a separate network. However, it is prohibitive to train such a network to fully extract the capabilities of 2D models. An alternate method for connecting 2D models to 3D is by lifting their 2D mask outputs to 3D. This enables the use of pre-existing 2D models for 3D grounding. GAP adopts this approach and projects 2D grounding masks into 3D using depth information from NeRF. We demonstrate GAP with a 2D grounding pipeline consisting of two models, visual grounding and text spotting. Incorporating text spotting increases the accuracy of grounding by disambiguating between multiple instances of an object (for example an HP laptop vs a Lenovo laptop). GAP demonstrates stronger 3D grounding capabilities when compared to LERF especially in such multi-instance scenes. It is also able to transfer precise masks predicted by 2D models into 3D. GAP has wide utility in robotics such as guiding object manipulation and identifying navigation goals. Critical for mobile robots, GAP enables adapting to new scenes rapidly since it uses pretrained models and only trains NeRF. Finally, while we give a specific pipeline, our technique is generic and can incorporate any model/pipeline which takes image-query pairs as input and gives masks as output.
|
|
10:30-12:00, Paper WeAL-EX.26 | Add to My Program |
Upper-Limb Motion Intention Estimation Using Surface EMG and Soft Strain Sensors for Soft Wearable Robots |
|
Kim, Jaehyeon | Seoul National University |
Lee, Minhee | Seoul National University |
Hwang, Sungjae | Seoul National University |
Choi, YeongJin | Seoul National University |
Kim, Jeongnam | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Human Detection and Tracking, Soft Sensors and Actuators, Deep Learning Methods
Abstract: Motion estimation plays an important role in human-assistive robotic systems, since it provides information on the motion intention of the user and the states of the system. Motion estimation with surface electromyography (sEMG) is a promising method in that the sEMG signals provide information on the muscle activation of the user. Researchers have studied different methods of estimating body motions using sEMG, especially via data-driven approaches. Deep learning, one of the most commonly used techniques, has shown reasonable performance on gesture recognition, but estimating accurate joint motions is still a challenge since it is difficult to extract the intermediate states of the body from sEMG signals alone. To address this issue, we propose a method of estimating upper-limb motions using both sEMG and soft strain sensor data. A soft strain sensor, made of a highly stretchable elastomer embedded with a liquid-metal conductor, is able to detect the strain on the joint where the sensor is placed. We use the output from the CNN-RNN models to find the angular displacement of the joint in this work. Using the current state of muscle activation detected by the sEMG and the strain on the elbow joint measured by the soft sensor, the system is able to reliably estimate the joint angle in real time. The average root-mean-square error of the estimated joint angle displacement from the model is 1.7 deg, while the maximum angle displacement of the sample dataset is 12.2 deg.
|
|
WeBA1-CC Award Session, CC-Main Hall |
Add to My Program |
Service Robotics |
|
|
Chair: Barfoot, Timothy | University of Toronto |
Co-Chair: Cavallo, Filippo | University of Florence |
|
13:30-15:00, Paper WeBA1-CC.1 | Add to My Program |
Censible: A Robust and Practical Global Localization Framework for Planetary Surface Missions |
|
Nash, Jeremy | Jet Propulsion Laboratory |
Dwight, Quintin | University of Michigan |
Saldyt, Lucas | Jet Propulsion Laboratory |
Wang, Haoda | Jet Propulsion Laboratory, California Institute of Technology |
Myint, Steven | Jet Propulsion Laboratory |
Ansar, Adnan | NASA Jet Propulsion Laboratory |
Verma, Vandi | NASA Jet Propulsion Laboratory, California Institute Of |
Keywords: Space Robotics and Automation, Field Robots, Localization
Abstract: To achieve longer driving distances, planetary robotics missions require accurate localization to counteract position uncertainty. Freedom and precision in driving allow scientists to reach and study sites of interest. Typically, rover global localization has been performed manually by humans, which is accurate but time-consuming as data is relayed between planets. This paper describes a global localization algorithm that runs onboard the Perseverance Mars rover. Our approach matches rover images to orbital maps using a modified census transform to achieve sub-meter accurate, near-human localization performance on a real dataset of 264 Mars rover panoramas. The proposed solution has also been successfully executed on the Perseverance Mars rover, demonstrating the practicality of our approach.
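The census transform at the heart of the matching step is a standard, illumination-robust local descriptor; a minimal version is sketched below. Window size and the bit-counting details are illustrative choices, not the mission's parameters.

import numpy as np

def census_transform(img, window=5):
    # Encode each pixel as a bit string of brightness comparisons against its neighborhood center.
    # With a 5x5 window there are 24 comparison bits, which fit in a uint64.
    h, w = img.shape
    r = window // 2
    out = np.zeros((h, w), dtype=np.uint64)
    center = img[r:h - r, r:w - r]
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = img[r + dy:h - r + dy, r + dx:w - r + dx]
            bit = (shifted < center).astype(np.uint64)
            out[r:h - r, r:w - r] = (out[r:h - r, r:w - r] << np.uint64(1)) | bit
    return out

def hamming_cost(c1, c2, n_bits=24):
    # Total count of differing bits between two census images (lower = better match).
    x = np.bitwise_xor(c1, c2)
    return sum(int(np.count_nonzero((x >> np.uint64(k)) & np.uint64(1))) for k in range(n_bits))

Matching a rover panorama against an orbital map then reduces to sliding (and warping) the rover-derived census image over the orbital one and keeping the offset with the lowest Hamming cost; the paper's modifications to this basic scheme are not detailed in the abstract.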
|
|
13:30-15:00, Paper WeBA1-CC.2 | Add to My Program |
Learning to Walk in Confined Spaces Using 3D Representation |
|
Miki, Takahiro | ETH Zurich |
Lee, Joonho | ETH Zurich |
Wellhausen, Lorenz | ETH Zürich |
Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Robotics in Hazardous Fields, Reinforcement Learning
Abstract: Legged robots have the potential to traverse complex terrain and access confined spaces beyond the reach of traditional platforms thanks to their ability to carefully select footholds and flexibly adapt their body posture while walking. However, robust deployment in real-world applications is still an open challenge. In this paper, we present a method for legged locomotion control using reinforcement learning and 3D volumetric representations to enable robust and versatile locomotion in confined and unstructured environments. By employing a two-layer hierarchical policy structure, we exploit the capabilities of a highly robust low-level policy to follow 6D commands and a high-level policy to enable three-dimensional spatial awareness for navigating under overhanging obstacles. Our study includes the development of a procedural terrain generator to create diverse training environments. We present a series of experimental evaluations in both simulation and real-world settings, demonstrating the effectiveness of our approach in controlling a quadruped robot in confined, rough terrain. By achieving this, our work extends the applicability of legged robots to a broader range of scenarios.
|
|
13:30-15:00, Paper WeBA1-CC.3 | Add to My Program |
Efficient and Accurate Transformer-Based 3D Shape Completion and Reconstruction of Fruits for Agricultural Robots |
|
Magistri, Federico | University of Bonn |
Marcuzzi, Rodrigo | University of Bonn |
Marks, Elias Ariel | University of Bonn |
Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Robotics and Automation in Agriculture and Forestry, RGB-D Perception
Abstract: Robots that operate in agricultural environments need a robust perception system that can deal with occlusions, which are naturally present in agricultural scenarios. In this paper, we address the problem of estimating 3D shapes of fruits when only partial observations are available. Generally speaking, such a shape completion can be realized by exploiting prior knowledge about the geometry of the fruit. This is typically done by template matching using traditional optimization algorithms, which are slow but accurate, or by encoding such knowledge into the weights of a neural network, leading to faster but often less accurate estimates. Our approach combines the best of both worlds. It exploits the benefit of having a template representing our object of interest with the advantages of using a neural network to learn how to deform a template. Our experimental evaluation demonstrates that our approach yields accurate estimation at a competitively low inference time in challenging greenhouse environments.
|
|
13:30-15:00, Paper WeBA1-CC.4 | Add to My Program |
CoPAL: Corrective Planning of Robot Actions with Large Language Models |
|
Joublin, Frank | Honda Research Institute Europe |
Ceravola, Antonello | Honda Research Institute Europe GmbH |
Smirnov, Pavel | Honda Research Institute Europe |
Ocker, Felix | Honda |
Deigmoeller, Joerg | Honda Research Institute Europe GmbH |
Belardinelli, Anna | Honda Research Institute Europe |
Wang, Chao | Honda Research Institute Europe GmbH |
Hasler, Stephan | Honda Research Institute Europe |
Tanneberg, Daniel | Honda Research Institute |
Gienger, Michael | Honda Research Institute Europe |
Keywords: AI-Enabled Robotics, Software Architecture for Robotic and Automation, Task and Motion Planning
Abstract: In the pursuit of fully autonomous robotic systems capable of taking over tasks traditionally performed by humans, the complexity of open-world environments poses a considerable challenge. Addressing this imperative, this study contributes to the field of Large Language Models (LLMs) applied to task and motion planning for robots. We propose a system architecture that orchestrates a seamless interplay between multiple cognitive levels, encompassing reasoning, planning, and motion generation. At its core lies a novel replanning strategy that handles physically grounded, logical, and semantic errors in the generated plans. We demonstrate the efficacy of the proposed feedback architecture, particularly its impact on executability, correctness, and time complexity via empirical evaluation in the context of a simulation and two intricate real-world scenarios: blocks world, barman and pizza preparation.
|
|
13:30-15:00, Paper WeBA1-CC.5 | Add to My Program |
CalliRewrite: Recovering Handwriting Behaviors from Calligraphy Images without Supervision |
|
Luo, Yuxuan | Peking University |
Wu, Zekun | Peking University |
Lian, Zhouhui | Peking University |
Keywords: Art and Entertainment Robotics, AI-Enabled Robotics
Abstract: Human-like planning skills and dexterous manipulation have long posed challenges in the fields of robotics and artificial intelligence (AI). The task of reinterpreting calligraphy presents a formidable challenge, as it involves the decomposition of strokes and dexterous utensil control. Previous efforts have primarily focused on supervised learning of a single instrument, limiting the performance of robots in the realm of cross-domain text replication. To address these challenges, we propose CalliRewrite: a coarse-to-fine approach for robot arms to discover and recover plausible writing orders from diverse calligraphy images without requiring labeled demonstrations. Our model achieves fine-grained control of various writing utensils. Specifically, an unsupervised image-to-sequence model decomposes a given calligraphy glyph to obtain a coarse stroke sequence. Using an RL algorithm, a simulated brush is fine-tuned to generate stylized trajectories for robotic arm control. Evaluation in simulation and physical robot scenarios reveals that our method successfully replicates unseen fonts and styles while achieving integrity in unknown characters. To access our code and supplementary materials, please visit our project page: https://luoprojectpage.github.io/callirewrite/.
|
|
WeBA2-CC Award Session, CC-301 |
Add to My Program |
Unmanned Aerial Vehicles |
|
|
Chair: Scaramuzza, Davide | University of Zurich |
Co-Chair: Schoellig, Angela P. | TU Munich |
|
13:30-15:00, Paper WeBA2-CC.1 | Add to My Program |
Co-Design Optimisation of Morphing Topology and Control of Winged Drones |
|
Bergonti, Fabio | Istituto Italiano Di Tecnologia |
Nava, Gabriele | Istituto Italiano Di Tecnologia |
Wüest, Valentin | EPFL |
Paolino, Antonello | Istituto Italiano Di Tecnologia |
L'Erario, Giuseppe | Istituto Italiano Di Tecnologia |
Pucci, Daniele | Italian Institute of Technology |
Floreano, Dario | Ecole Polytechnique Fédérale De Lausanne (EPFL) |
Keywords: Aerial Systems: Mechanics and Control, Methods and Tools for Robot System Design, Optimization and Optimal Control
Abstract: The design and control of winged aircraft and drones is an iterative process aimed at identifying a compromise of mission-specific costs and constraints. When agility is required, shape-shifting (morphing) drones represent an efficient solution. However, morphing drones require the addition of actuated joints that increase the topology and control coupling, making the design process more complex. We propose a co-design optimisation method that assists the engineers by proposing a morphing drone’s conceptual design that includes topology, actuation, morphing strategy, and controller parameters. The method consists of applying multi-objective constraint-based optimisation to a multi-body winged drone with trajectory optimisation to solve the motion intelligence problem under diverse flight mission requirements, such as energy consumption and mission completion time. We show that co-designed morphing drones outperform fixed-winged drones in terms of energy efficiency and mission time, suggesting that the proposed co-design method could be a useful addition to the aircraft engineering toolbox.
|
|
13:30-15:00, Paper WeBA2-CC.2 | Add to My Program |
FC-Planner: A Skeleton-Guided Planning Framework for Fast Aerial Coverage of Complex 3D Scenes |
|
Feng, Chen | Hong Kong University of Science and Technology |
Li, Haojia | The Hong Kong University of Science and Technology |
Zhang, Mingjie | Northwestern Polytechnical University |
Chen, Xinyi | The Hong Kong University of Science and Technology |
Zhou, Boyu | Sun Yat-Sen University |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Aerial Systems: Perception and Autonomy, Motion and Path Planning, Aerial Systems: Applications
Abstract: 3D coverage path planning for UAVs is a crucial problem in diverse practical applications. However, existing methods have shown unsatisfactory system simplicity, computation efficiency, and path quality in large and complex scenes. To address these challenges, we propose FC-Planner, a skeleton-guided planning framework that can achieve fast aerial coverage of complex 3D scenes without pre-processing. We decompose the scene into several simple subspaces by a skeleton-based space decomposition (SSD). Additionally, the skeleton guides us to effortlessly determine free space. We utilize the skeleton to efficiently generate a minimal set of specialized and informative viewpoints for complete coverage. Based on SSD, a hierarchical planner effectively divides the large planning problem into independent sub-problems, enabling parallel planning for each subspace. The carefully designed global and local planning strategies are then incorporated to guarantee both high quality and efficiency in path generation. We conduct extensive benchmark and real-world tests, where FC-Planner computes over 10 times faster compared to state-of-the-art methods with shorter path and more complete coverage. The source code will be made publicly available to benefit the community. Project page: https://hkust-aerial-robotics.github.io/FC-Planner.
|
|
13:30-15:00, Paper WeBA2-CC.3 | Add to My Program |
Time-Optimal Gate-Traversing Planner for Autonomous Drone Racing |
|
Qin, Chao | University of Toronto |
Michet, Maxime Simon Joseph | University of Toronto |
Chen, Jingxiang | University of Toronto |
Liu, Hugh H.-T. | University of Toronto |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Art and Entertainment Robotics
Abstract: In drone racing, the time-minimum trajectory is affected by the drone's capabilities, the layout of the race track, and the configurations of the gates (e.g., their shapes and sizes). However, previous studies neglect the configuration of the gates, simply rendering drone racing a waypoint-passing task. This formulation often leads to a conservative choice of paths through the gates, as the spatial potential of the gates is not fully utilized. To address this issue, we present a time-optimal planner that can faithfully model gate constraints with various configurations and thereby generate the most time-efficient trajectory while considering the single-rotor-thrust limits. Our approach excels in computational efficiency which only takes a few seconds to compute the full state and control trajectories of the drone through tracks with dozens of different gates. Extensive simulations and experiments confirm the effectiveness of the proposed methodology, showing that the lap time can be further reduced by taking into account the gate's configuration. We validate our planner in real-world flights and demonstrate super-extreme flight trajectory through race tracks.
|
|
13:30-15:00, Paper WeBA2-CC.4 | Add to My Program |
Sequential Trajectory Optimization for Externally-Actuated Modular Manipulators with Joint Locking |
|
Choe, Jaeu | Seoul National University |
Lee, Jeongseob | Seoul National University |
Yang, Hyunsoo | Seoul National University |
Nguyen, Hai-Nguyen (Hann) | CNRS |
Lee, Dongjun | Seoul National University |
Keywords: Aerial Systems: Applications, Aerial Systems: Mechanics and Control
Abstract: In this paper, we present a novel trajectory planning method for externally-actuated modular manipulators (EAMMs), consisting of multiple rotor-actuated links with joints that can be either locked or unlocked. This joint-locking feature allows effective balancing of the payload capacity and dexterity of the robot but significantly complicates the planning problem by introducing binary decision variables. To address this challenge, we leverage the problem's intrinsic structure, i.e., the payload at the end-effector being enhanced by merely locking its immediate connected links; this allows us to break down the complex planning problem into a series of manageable subproblems and solve them sequentially. Our approach significantly reduces the problem's complexity: in a serial n-link EAMM with m joint-lock mechanisms, where there could potentially be 2^m distinct configurational dynamics, we require solving only n+1 trajectory optimization problems for single rigid body dynamics sequentially, thereby rendering the problem tractable. We substantiate the efficacy of our method through various simulation and experimental studies, covering ground-free and ground-bound configurations as well as both motion-only and manipulation tasks.
|
|
13:30-15:00, Paper WeBA2-CC.5 | Add to My Program |
Spatial Assisted Human-Drone Collaborative Navigation and Interaction through Immersive Mixed Reality |
|
Morando, Luca | New York University |
Loianno, Giuseppe | New York University |
Keywords: Aerial Systems: Applications
Abstract: Aerial robots have the potential to play a crucial role in assisting humans with complex and dangerous tasks. Nevertheless, the future industry demands innovative solutions to streamline the interaction process between humans and drones to enable seamless collaboration and efficient co-working. In this paper, we present a novel tele-immersive framework that facilitates cognitive and physical collaboration between humans and robots through Mixed Reality (MR). This includes novel bi-directional spatial awareness and multi-modal virtual-physical interaction approaches. The former seamlessly integrates the physical and virtual worlds, providing bidirectional egocentric and exocentric environment representations. The latter, leveraging the proposed spatial representation, further enhances the collaboration by combining a robot planning algorithm for obstacle avoidance with variable admittance control. This enables the user to generate commands based on virtual forces while ensuring compatibility with the environment map. We validate the proposed approach by conducting several collaborative planning and exploration tasks involving a drone and a user equipped with an MR headset.
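The variable admittance control mentioned above maps the user's virtual interaction force into drone motion through a second-order admittance model; a minimal, hedged sketch (gains and time step are placeholders, not the authors' values) is:

def admittance_step(x, xd, f_ext, M=1.0, D=4.0, K=0.0, dt=0.01):
    # One Euler step of M*xdd + D*xd + K*x = f_ext, mapping the virtual force f_ext
    # from the MR interface into a velocity/position command for the drone planner.
    xdd = (f_ext - D * xd - K * x) / M
    xd = xd + dt * xdd
    x = x + dt * xd
    return x, xd

Making M, D, or K vary with context (for instance, stiffer near obstacles reported by the planner) is what turns this into a variable admittance scheme; the specific scheduling law used by the authors is not given in the abstract.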
|
|
13:30-15:00, Paper WeBA2-CC.6 | Add to My Program |
A Trajectory-Based Flight Assistive System for Novice Pilots in Drone Racing Scenario |
|
Zhong, Yuhang | Zhejiang University |
Zhao, Guangyu | Zhejiang University |
Wang, Qianhao | Zhejiang University |
Xu, Guangtong | Zhejiang University |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Human Factors and Human-in-the-Loop, Telerobotics and Teleoperation, Art and Entertainment Robotics
Abstract: Drone racing has become a popular international competition and has attracted wide attention in recent years. However, the high-level piloting skill it requires keeps novice pilots from participating. This paper presents a trajectory-based flight assistive system that enables various operators to fly a drone through a racing scene at high speed. The whole system is structured hierarchically, consisting of both offline and online components. In the offline part, a global time-optimal trajectory is generated as the expert reference, and a dense flight corridor is constructed to provide a sufficiently large safe region. In the online part, a remote-control-mapped primitive is designed to quickly encapsulate pilots' inputs, and a time-mapping-based trajectory progress is customized to further capture intention. Then, a trajectory planner is proposed to efficiently generate intention-aligned, smooth, feasible, and safe trajectories periodically. Additionally, a yaw planning strategy that provides the pilot with the most suitable view angle is employed to further alleviate the operation difficulty. Simulations and real-world experiments were conducted to verify the performance of our system. The maximum velocity reached 6.0 m/s for a novice drone pilot in a real racing scene. We will open-source our code later.
|
|
WeBT1-CC Oral Session, CC-303 |
Add to My Program |
Motion and Path Planning II |
|
|
Chair: Yong, Sze Zheng | Northeastern University |
Co-Chair: Liu, Sicong | Southern University of Science and Technology |
|
13:30-15:00, Paper WeBT1-CC.1 | Add to My Program |
RBI-RRT*: Efficient Sampling-Based Path Planning for High-Dimensional State Space |
|
Chen, Fang | Southern University of Science and Technology |
Zheng, Yu | Tencent |
Wang, Zheng | Southern University of Science and Technology |
Chi, Wanchao | Tencent |
Liu, Sicong | Southern University of Science and Technology |
Keywords: Motion and Path Planning
Abstract: Sampling-based planning algorithms such as RRT have proven efficient in solving path planning problems for robotic systems. Various improvements to the RRT algorithm, such as Informed RRT*, have been presented to improve the extension and convergence of the random trees. However, as the spatial dimension grows, the time spent randomly sampling the entire state space and incrementally rewiring the random trees rises drastically before a feasible solution is found. In this paper, to enhance the convergence to optimal solutions, we present the Reconstructed Bi-directional Informed RRT* (RBI-RRT*) path planning algorithm. The algorithm acts as RRT-Connect to rapidly find a feasible solution, which helps compress the sampling space as Informed RRT* does. After the random trees are transformed into an RRT* structure by the reconstruction process in RBI-RRT*, the algorithm continues to search for a near-optimal path. A series of simulations and real-world robot experiments were conducted to evaluate the algorithm against existing planning algorithms. Compared to Informed RRT* Connect, RBI-RRT* reduced the computation time to reach a specific cost by 22.1% on average in simulations and 11.2% in the real-world robotic arm experiments. The results show that RBI-RRT* is more efficient in high-dimensional planning problems.
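RBI-RRT* inherits Informed RRT*'s idea of restricting sampling to the prolate hyperspheroid whose foci are the start and goal and whose transverse diameter equals the current best path cost. A generic version of that sampler (not the paper's implementation) looks like this:

import numpy as np

def sample_informed(x_start, x_goal, c_best, rng=None):
    # Uniformly sample the n-dimensional ellipsoid of states that could still improve a
    # path of length c_best between x_start and x_goal (Gammell et al., Informed RRT*).
    if rng is None:
        rng = np.random.default_rng()
    n = x_start.size
    c_min = np.linalg.norm(x_goal - x_start)
    center = 0.5 * (x_start + x_goal)
    a1 = (x_goal - x_start) / c_min
    # Rotation-to-world frame aligning the first axis with the start-goal direction.
    U, _, Vt = np.linalg.svd(np.outer(a1, np.eye(n)[0]))
    C = U @ np.diag(np.r_[np.ones(n - 1), np.linalg.det(U) * np.linalg.det(Vt)]) @ Vt
    # Semi-axis lengths of the prolate hyperspheroid.
    r = np.r_[c_best / 2.0, np.full(n - 1, np.sqrt(max(c_best**2 - c_min**2, 0.0)) / 2.0)]
    # Uniform sample from the unit n-ball, then scale, rotate, and translate.
    x_ball = rng.normal(size=n)
    x_ball = x_ball / np.linalg.norm(x_ball) * rng.uniform() ** (1.0 / n)
    return C @ (r * x_ball) + center

The bidirectional search and the tree-reconstruction step that distinguish RBI-RRT* sit on top of this sampler and are not reproduced here.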
|
|
13:30-15:00, Paper WeBT1-CC.2 | Add to My Program |
Quasi-Static Path Planning for Continuum Robots by Sampling on Implicit Manifold |
|
Wang, Yifan | Georgia Institute of Technology |
Chen, Yue | Georgia Institute of Technology |
Keywords: Motion and Path Planning, Flexible Robotics
Abstract: Continuum robots (CR) offer excellent dexterity and compliance in contrast to rigid-link robots, making them suitable for navigating through, and interacting with, confined environments. However, the study of path planning for CRs while considering external elastic contact is limited. The challenge lies in the fact that CRs can have multiple possible configurations when in contact, rendering the forward kinematics not well-defined, and characterizing the set of feasible robot configurations is non-trivial. In this paper, we propose to perform quasi-static path planning on an implicit manifold. We model elastic obstacles as external potential fields and formulate the robot statics in the potential field as the extremal trajectory of an optimal control problem. We show that the set of stable robot configurations is a smooth manifold diffeomorphic to a submanifold embedded in the product space of the CR actuation and base internal wrench. We then propose to perform path planning on this manifold using AtlasRRT*, a sampling-based planner dedicated to planning on implicit manifolds. Simulations in different operation scenarios were conducted and the results show that the proposed planner outperforms Euclidean space planners in terms of success rate and computational efficiency.
|
|
13:30-15:00, Paper WeBT1-CC.3 | Add to My Program |
Reconfiguration of a 2D Structure Using Spatio-Temporal Planning and Load Transferring |
|
Garcia Gonzalez, Javier | University of Houston |
Yannuzzi, Michael | University of Houston |
Kramer, Peter | TU Braunschweig |
Rieck, Christian | Technische Universität Braunschweig |
Fekete, Sándor | Technische Universität Braunschweig |
Becker, Aaron | University of Houston |
Keywords: Building Automation, Motion and Path Planning, Swarm Robotics
Abstract: We present progress on the problem of reconfiguring a 2D arrangement of building material by a cooperative group of robots. These robots must avoid collisions and deadlocks, and are subject to the constraint of maintaining connectivity of the structure. We develop two reconfiguration methods, one based on spatio-temporal planning and one based on target swapping, to increase building efficiency. The first method can significantly reduce planning times compared to other multi-robot planners. The second method helps to reduce the amount of time robots spend waiting for paths to be cleared, and the overall distance traveled by the robots.
|
|
13:30-15:00, Paper WeBT1-CC.4 | Add to My Program |
Neural Informed RRT*: Learning-Based Path Planning with Point Cloud State Representations under Admissible Ellipsoidal Constraints |
|
Huang, Zhe | University of Illinois at Urbana-Champaign |
Chen, Hongyu | University of Illinois at Urbana-Champaign |
Pohovey, John | University of Illinois Urbana-Champaign |
Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Keywords: Motion and Path Planning, AI-Based Methods
Abstract: Sampling-based planning algorithms like Rapidly-exploring Random Tree (RRT) are versatile in solving path planning problems. RRT* offers asymptotic optimality but requires growing the tree uniformly over the free space, which leaves room for efficiency improvement. To accelerate convergence, rule-based informed approaches sample states in an admissible ellipsoidal subset of the space determined by the current path cost. Learning-based alternatives model the topology of the free space and infer the states close to the optimal path to guide planning. We propose Neural Informed RRT* to combine the strengths from both sides. We define point cloud representations of free states. We perform Neural Focus, which constrains the point cloud within the admissible ellipsoidal subset from Informed RRT*, and feeds into PointNet++ for refined guidance state inference. In addition, we introduce Neural Connect to build connectivity of the guidance state set and further boost performance in challenging planning problems. Our method surpasses previous works in path planning benchmarks while preserving probabilistic completeness and asymptotic optimality. We deploy our method on a mobile robot and demonstrate real world navigation around static obstacles and dynamic humans. Code is available at https://github.com/tedhuang96/nirrt_star.
|
|
13:30-15:00, Paper WeBT1-CC.5 | Add to My Program |
Motions in Microseconds Via Vectorized Sampling-Based Planning |
|
Thomason, Wil | Rice University |
Kingston, Zachary | Rice University |
Kavraki, Lydia | Rice University |
Keywords: Motion and Path Planning
Abstract: Modern sampling-based motion planning algorithms typically take between hundreds of milliseconds to dozens of seconds to find collision-free motions for high degree-of-freedom problems. This paper presents performance improvements of more than 500x over the state-of-the-art, bringing planning times into the range of microseconds and solution rates into the range of kilohertz, without specialized hardware. Our key insight is how to exploit fine-grained parallelism within planning, providing generality-preserving algorithmic improvements to any such planner and significantly accelerating critical subroutines, such as forward kinematics and collision checking. We demonstrate our approach over a diverse set of challenging, realistic problems for complex robots ranging from 7 to 14 degrees-of-freedom. Moreover, we show that our approach does not require high-power hardware by also evaluating on a low-power single-board computer. The planning speeds demonstrated are fast enough to reside in the range of control frequencies and open up new avenues of motion planning research.
|
|
13:30-15:00, Paper WeBT1-CC.6 | Add to My Program |
Gathering Data from Risky Situations with Pareto-Optimal Trajectories |
|
Brodt, Brennan | Boston University |
Pierson, Alyssa | Boston University |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Autonomous Agents
Abstract: This paper proposes a formulation for the risk-aware path planning problem which utilizes multi-objective optimization to dynamically plan trajectories that satisfy multiple complex mission specifications. In the setting of persistent monitoring, we develop a method for representing environmental information and risk in a way that allows for local sampling to generate Pareto-dominant solutions over a receding horizon. We propose two algorithms capable of solving these problems: a dense sampling approach and an improved method utilizing noisy gradient descent. Simulation results demonstrate the efficacy of our methods at persistently gathering information while avoiding risk, robust to randomly-generated environments.
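The Pareto-dominance test used to keep non-dominated candidate trajectories is simple to state; a generic filter over a matrix of per-trajectory objective values (all to be minimized, for example risk and negated information gain) might look like the following sketch.

import numpy as np

def pareto_front(costs):
    # Return the indices of Pareto-optimal rows of an (N x M) cost matrix.
    # Row i is dominated if some other row is no worse in every objective and
    # strictly better in at least one.
    costs = np.asarray(costs, dtype=float)
    keep = []
    for i, c in enumerate(costs):
        dominated = np.any(np.all(costs <= c, axis=1) & np.any(costs < c, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Example: two objectives (risk, -information); the second candidate is dominated.
idx = pareto_front([[0.2, -3.0], [0.5, -2.0], [0.1, -1.0]])   # -> [0, 2]

How the paper's dense-sampling and noisy-gradient-descent variants generate those candidates over the receding horizon is beyond this illustration.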
|
|
13:30-15:00, Paper WeBT1-CC.7 | Add to My Program |
RETRO: Reactive Trajectory Optimization for Real-Time Robot Motion Planning in Dynamic Environments |
|
Dastider, Apan | University of Central Florida |
Fang, Hao | University of Central Florida |
Mingjie, Lin | University of Central Florida |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Planning under Uncertainty
Abstract: Reactive trajectory optimization for robotics presents formidable challenges, demanding the rapid generation of purposeful robot motion in complex and swiftly changing dynamic environments. While much existing research predominantly addresses robotic motion planning with predefined objectives, emerging problems in robotic trajectory optimization frequently involve dynamically evolving objectives and stochastic motion dynamics. However, effectively addressing such reactive trajectory optimization challenges for robot manipulators proves difficult due to inefficient, high-dimensional trajectory representations and a lack of consideration for time optimization. In response, we introduce a novel trajectory optimization framework called RETRO. RETRO employs adaptive optimization techniques that span both spatial and temporal dimensions. As a result, it achieves a remarkable computing complexity of O(T^ 2.4 ) +O(T n^2 ), a significant improvement over the naive application of DDP, which leads to a complexity of O(n^ 4 ) when reasonable time step sizes are used. To evaluate RETRO’s performance in terms of error, we conducted a comprehensive analysis of its regret bounds, comparing it to an Oracle value function obtained through an Oracle trajectory optimization algorithm. Our analytical findings demonstrate that RETRO’s total regret can be upper-bounded by a function of the chosen time step size. Moreover, our approach delivers smoothly optimized robot trajectories within the joint-space, offering flexibility and adaptability for various tasks. It seamlessly integrates task-specific requirements such as collision avoidance while maintaining real-time control rates. We validate the effectiveness of our framework through extensive simulations and real-world robot experiments in closed-loop manipulation scenarios. For further details and supplementary materials, please visit: https://sites.google.com/view/retro-optimal-control/home
|
|
13:30-15:00, Paper WeBT1-CC.8 | Add to My Program |
WiTHy A*: Winding-Constrained Motion Planning for Tethered Robot Using Hybrid A* |
|
Chipade, Vishnu S. | University of Michigan |
Kumar, Rahul | Northeastern University |
Yong, Sze Zheng | Northeastern University |
Keywords: Motion and Path Planning, Constrained Motion Planning, Nonholonomic Motion Planning
Abstract: In this paper, a variant of hybrid A* is developed to find the shortest path for a curvature-constrained robot, that is tethered at its start position, such that the tether satisfies user-defined winding angle constraints. A variant of tangent graphs is used as an underlying graph for searching a path using A* in order to reduce the overall computation and define appropriate cost metrics to ensure winding angle constraints are satisfied. Conditions are provided under which the proposed algorithm is guaranteed to find a winding angle-constrained path. The effectiveness and performance of the proposed algorithm are studied in simulation.
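The winding-angle constraint can be evaluated by accumulating the signed angle swept by the anchor-to-robot bearing along a candidate path; a small helper like the one below (illustrative only, in 2D) is enough to check it.

import numpy as np

def winding_angle(path, anchor):
    # Total signed angle (rad) swept around the tether anchor by a 2D path (K x 2 array);
    # positive means counter-clockwise winding.
    rel = np.asarray(path, dtype=float) - np.asarray(anchor, dtype=float)
    ang = np.arctan2(rel[:, 1], rel[:, 0])
    steps = np.diff(ang)
    steps = (steps + np.pi) % (2 * np.pi) - np.pi   # wrap each increment to [-pi, pi)
    return float(steps.sum())

def satisfies_winding(path, anchor, w_min, w_max):
    return w_min <= winding_angle(path, anchor) <= w_max

In the paper this bookkeeping is embedded in the cost and node metrics of the tangent-graph-based hybrid A* search rather than applied as a post-hoc check, so the sketch above only conveys the constraint itself.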
|
|
13:30-15:00, Paper WeBT1-CC.9 | Add to My Program |
Differentiable Boustrophedon Paths That Enable Optimization Via Gradient Descent |
|
Manzini, Thomas | Texas A&M |
Murphy, Robin | Texas A&M |
Keywords: Motion and Path Planning, Optimization and Optimal Control
Abstract: This paper introduces a differentiable representation for the optimization of boustrophedon path plans in convex polygons, explores an additional parameter of these path plans that can be optimized, discusses the properties of this representation that can be leveraged during the optimization process, and shows that the previously published attempt at optimizing these path plans was too coarse to be practically useful. Experiments were conducted to show that this differentiable representation can reproduce scores from traditional discrete representations of boustrophedon path plans with high fidelity. Finally, optimization via gradient descent was attempted but found to fail because the search space is far more non-convex than was previously considered in the literature. The wide range of applications for boustrophedon path plans means that this work has the potential to improve path planning efficiency in numerous areas of robotics, including mapping and search tasks using uncrewed aerial systems, environmental sampling tasks using uncrewed marine vehicles, and agricultural tasks using ground vehicles, among numerous other applications.
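For context, a conventional (discrete) boustrophedon plan over a convex polygon is generated by sweeping equally spaced lines at a chosen heading and alternating the traversal direction; the differentiable representation in the paper relaxes exactly this construction. The baseline sketch below is generic and not the authors' parameterization.

import numpy as np

def boustrophedon(vertices, spacing, heading=0.0):
    # Back-and-forth coverage waypoints over a convex polygon (M x 2 vertices).
    # 'heading' rotates the sweep direction; 'spacing' is the distance between passes.
    c, s = np.cos(-heading), np.sin(-heading)
    R = np.array([[c, -s], [s, c]])
    V = np.asarray(vertices, dtype=float) @ R.T         # rotate so sweep lines are horizontal
    y_lines = np.arange(V[:, 1].min() + spacing / 2, V[:, 1].max(), spacing)
    edges = list(zip(V, np.roll(V, -1, axis=0)))
    waypoints, flip = [], False
    for y in y_lines:
        xs = []
        for p, q in edges:                              # intersect the sweep line with each edge
            if (p[1] - y) * (q[1] - y) < 0:
                t = (y - p[1]) / (q[1] - p[1])
                xs.append(p[0] + t * (q[0] - p[0]))
        if len(xs) < 2:
            continue
        seg = [[min(xs), y], [max(xs), y]]
        waypoints += seg[::-1] if flip else seg
        flip = not flip
    return np.array(waypoints) @ R                      # rotate back to the world frame

The heading and the placement offset of the sweep lines are the kinds of parameters the differentiable representation exposes to gradient-based optimization; the paper's finding is that the resulting score landscape is far less convex than earlier work assumed.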
|
|
WeBT2-CC Oral Session, CC-311 |
Add to My Program |
Robot Design |
|
|
Chair: Bonev, Ilian | École De Technologie Supérieure |
Co-Chair: Yeshmukhametov, Azamat | Nazarbayev University |
|
13:30-15:00, Paper WeBT2-CC.1 | Add to My Program |
Torsion-Induced Compliant Joints and Its Application to Flat-Foldable and Self-Assembling Robotic Arm |
|
Yang, Dong-Wook | Korea Advanced Institute of Science and Technology (KAIST) |
Park, Hyun-Su | KAIST |
Jang, Keon-Ik | Korea Advanced Institute of Science and Technology |
Han, Jae-Hung | KAIST |
Lee, Dae-Young | Korea Advanced Institute of Science and Technology |
Keywords: Compliant Joints and Mechanisms, Soft Robot Applications, Soft Robot Materials and Design
Abstract: The joint design of origami-inspired robots is one of the most distinctive features that distinguishes them from conventional robots. A joint design using a material's compliance enables origami robots to implement complex transformational movements in a lightweight and simple manner. However, utilizing the continuum bending mode of materials brings critical problems, including undesired movements and joint radius. This study introduces a solution to these problems through a torsion-based compliant joint (T-C joint) design, which utilizes the torsional deformation of materials. The potential of the T-C joint is demonstrated in a flat-foldable and self-assembling robotic arm, showing its applicability in environments with form-factor limitations and minimal human intervention. The robotic arm—comprising links, joints, and a gripper—can fold into a flat state, deploy with precision and minimal weight, and effectively manipulate target objects. This demonstration shows the real-world applicability of the proposed joint design.
|
|
13:30-15:00, Paper WeBT2-CC.2 | Add to My Program |
OriTrack: A Small, 3 Degree of Freedom, Origami Solar Tracker |
|
Winston, Crystal | Stanford University |
Casey, Leo | Google, Inc |
Keywords: Energy and Environment-Aware Automation, Soft Robot Applications, Soft Robot Materials and Design
Abstract: In response to the need for sustainable energy solutions, solar panels have gained significant traction. One way to increase the energy capture of solar systems is through solar tracking, a means of reorienting solar panels throughout the day in order to face the sun. The energy capture increase that comes with solar tracking often far outweighs the amount of energy required to move the panel, which makes it a compelling strategy for improving solar systems. Unfortunately, while solar trackers are commonly used in large solar farms, they are rarely used on rooftops, an area where solar panels are commonly installed. This is for two primary reasons: (1) most commercially available solar trackers are too large to be installed on roofs, and (2) even if traditional solar trackers were made in a more compact form factor, it would be difficult to densely lay them out on a roof without the trackers substantially shading each other. In order to address these issues, we introduce OriTrack, a small three-degree-of-freedom (3-DOF) solar tracker which reduces the area of its shadow by reducing its height as it tracks the sun. In this paper we discuss the design, manufacturing, and control of OriTrack. We then compare OriTrack to a flat reference panel, the solar energy solution commonly used on roofs today, and find that OriTrack demonstrates 23% increased energy production. This result suggests OriTrack could be used as a future solution for solar tracking on rooftops.
|
|
13:30-15:00, Paper WeBT2-CC.3 | Add to My Program |
Reinforcement Learning for Freeform Robot Design |
|
Li, Muhan | Northwestern University |
Matthews, David | Northwestern University |
Kriegman, Sam | Northwestern University |
Keywords: Evolutionary Robotics
Abstract: Inspired by the necessity of morphological adaptation in animals, a growing body of work has attempted to expand robot training to encompass physical aspects of a robot's design. However, reinforcement learning methods capable of optimizing the 3D morphology of a robot have been restricted to reorienting or resizing the limbs of a predetermined and static topological genus. Here we show policy gradients for designing freeform robots with arbitrary external and internal structure. This is achieved through actions that deposit or remove bundles of atomic building blocks to form higher-level nonparametric macrostructures such as appendages, organs and cavities. Although results are provided for open loop control only, we discuss how this method could be adapted for closed loop control and sim2real transfer to physical machines in future.
|
|
13:30-15:00, Paper WeBT2-CC.4 | Add to My Program |
A Helical Bistable Soft Gripper Enabled by Pneumatic Actuation |
|
Yin, Xuanchun | South China Agricultural University |
Xie, Junliang | South China Agricultural University |
Zhou, Pengyu | South China Agricultural University |
Wen, Sheng | South China Agricultural University |
Zhang, Jiantao | South China Agricultural University |
Keywords: Grippers and Other End-Effectors, Biologically-Inspired Robots, Soft Robot Applications
Abstract: There are many instances of helical mechanisms in nature that are used to efficiently grasp objects of various shapes and sizes. Inspired by helical grasping in nature, we propose a helical bistable soft gripper with high load capacity and energy savings. An off-the-shelf bistable steel shell (BSS), acting as the stiff element, was inserted into a 3D-printed soft helical exoskeleton to achieve coiling around and holding objects without energy consumption. Two air pouches were designed as the actuator to control the transition between the two stable states. To facilitate gripper design, a simplified model of the gripper was developed, and the geometric parameters of the gripper are listed in a table for reference. The transition pressures between the two stable states were experimentally characterized. Moreover, we conducted experiments to demonstrate the capability of the gripper in two working modes. The gripper exhibits coiling diameters ranging between 40 mm and 60 mm and successfully attaches to various slender objects of different geometries with a maximum holding force of 92.67 N (up to 135.1 times its own mass) in hanging mode. Finally, the gripper was integrated into a robot arm and successfully grasped different objects, with a maximum grasping weight of 221.6 g in grasping mode.
|
|
13:30-15:00, Paper WeBT2-CC.5 | Add to My Program |
Singularity Analysis of Kinova's Link 6 Robot Arm Via Grassmann Line Geometry |
|
Asgari, Milad | École De Technologie Supérieure |
Bonev, Ilian | École De Technologie Supérieure |
Gosselin, Clement | Université Laval |
Keywords: Kinematics, Mechanism Design, Actuation and Joint Mechanisms
Abstract: Unlike parallel robots, for which hundreds of different architectures have been proposed, the vast majority of six-degree-of-freedom (DOF) serial robots have one of two simple architectures. In both architectures, the inverse kinematics can be solved in closed form and the singularities described by trivial geometric and algebraic conditions. These conditions can be readily obtained by analyzing the determinant of the robot's Jacobian matrix, and provide an in-depth understanding of the robot's singularities, which is essential for its optimal use. However, for various reasons, robot arms with unorthodox architectures are occasionally designed. Such arms do not have closed-form inverse kinematics and little insight into their singularities can be gained by analyzing the determinant of their Jacobian. One such robot arm for which the conventional singularity analysis approach fails is the new Link 6 collaborative robot by Kinova. In this paper, we study the complex singularities of Link 6 by investigating all possibilities for screw dependencies, deriving a simple equation for each case, and then describing each singularity type using Grassmann line geometry. Twelve different singularity configurations are identified and described with seven relatively simple geometric conditions. Our approach is general and can be applied to other robot arms.
|
|
13:30-15:00, Paper WeBT2-CC.6 | Add to My Program |
Design and Testing of a Multi-Module, Tetherless, Soft Robotic Eel |
|
Hall, Robin | Worcester Polytechnic Institute |
Espinosa, Gabriel | Worcester Polytechnic Institute |
Chiang, Shou-Shan | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Marine Robotics, Soft Robot Materials and Design, Biologically-Inspired Robots
Abstract: This paper presents a free-swimming, tetherless, cable-driven modular soft robotic fish. The body comprises a series of 3D-printed wave spring structures that create a flexible, biologically inspired shape capable of an anguilliform swimming gait. A three-module soft robotic fish was designed, fabricated, and evaluated. The motion of the robot was characterized, and different combinations of actuation amplitude, frequency, and phase shift were tested experimentally to determine the parameters that maximized speed and minimized the cost of transport (COT). The maximum speed recorded was 0.20 BL/s (body lengths per second) with a COT of 15.82. These results were compared against other robotic and biological fish. We operated the robot, untethered, in a variety of environments to test how well it functions outside of laboratory settings.
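The two performance figures quoted (speed in body lengths per second and cost of transport) follow standard definitions; a tiny helper makes them explicit. Whether the paper uses electrical or mechanical power in the COT is not stated in the abstract, so the formula below is the generic one.

def cost_of_transport(power_w, mass_kg, speed_m_s, g=9.81):
    # Dimensionless cost of transport: COT = P / (m * g * v); lower is better.
    return power_w / (mass_kg * g * speed_m_s)

def body_lengths_per_second(speed_m_s, body_length_m):
    return speed_m_s / body_length_m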
|
|
13:30-15:00, Paper WeBT2-CC.7 | Add to My Program |
Untethered Underwater Soft Robot with Thrust Vectoring |
|
Hall, Robin | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Marine Robotics, Soft Robot Materials and Design, Soft Robot Applications
Abstract: This paper introduces DRAGON: Deformable Robot for Agile Guided Observation and Navigation, a free-swimming deformable impeller-powered vectored underwater vehicle (VUV). A 3D-printed wave spring structure directs the water drawn through the center of the robot by an impeller, enabling it to move smoothly in different directions. The robot is designed to have a narrow cylindrical profile to lower drag and improve agility. It has a maximum recorded speed of 2.1 BL/s (body lengths per second) and a minimum cost of transport (COT) of 2.9. The robot has two degrees of freedom (DoFs) and is capable of performing a variety of maneuvers including a full circle with a radius of 0.23 m (1.4 BL) and a figure eight, which it completed in 4.98 s (72.3 degree/s) and 10.74 s respectively. We operated the robot, untethered, in various environments to test the robustness of the design and analyze its motion and performance.
|
|
13:30-15:00, Paper WeBT2-CC.8 | Add to My Program |
A Backdrivable Axisymmetric Kinematically Redundant (6+3)-Degree-Of-Freedom Hybrid Parallel Manipulator |
|
Kim, Jehyeok | Université Laval |
Gosselin, Clement | Université Laval |
Keywords: Redundant Robots, Mechanism Design, Physical Human-Robot Interaction
Abstract: A kinematically redundant (6+3)-degree-of-freedom (DOF) hybrid parallel robot with an axisymmetric workspace is proposed. By arranging the first revolute joint of each leg such that they have the same rotation axis, this robot can achieve an axisymmetric workspace, resulting in a large reachable workspace. In addition, type II singularities, which critically limit the orientational workspace, can be fully avoided by utilizing kinematic redundancy. A gripper mechanism is developed to increase the orientational workspace by exploiting the redundant DOFs. Moreover, the orientational workspace can be further increased by introducing a redundant DOF with a constant angle. As a result, the proposed hybrid parallel robot achieves a high workspace-to-footprint ratio comparable to that of serial robots. A CAD model of the robot and computer animations are provided to demonstrate the large workspaces and the gripper mechanism. A significant advantage of the proposed robot over serial architectures is that the robot is backdrivable since it uses direct-drive or quasi-direct-drive actuators.
|
|
13:30-15:00, Paper WeBT2-CC.9 | Add to My Program |
Design of a Fully Pulley-Guided Wire-Driven Prismatic Tensegrity Robot: Friction Impact to Robot Payload Capacity |
|
Yeshmukhametov, Azamat | Nazarbayev University |
Koganezawa, Koichi | Tokai University |
Keywords: Redundant Robots, Tendon/Wire Mechanism, Mechanism Design
Abstract: The tensegrity structure was initially created as a static structure, but it has gained significant attention among robotics researchers due to its benefits, including high payload capability, shock resistance, and resiliency. However, implementing tensegrity structures in robotics presents new technical challenges, primarily related to their wire-driven structure, such as wire-routing and wire-friction problems. Therefore, this research letter proposes a technical solution for the aforementioned problems. The main contribution of this research is the design of frictionless pulley-guided nodes. To validate the proposed concept, we conducted comparative experiments between a common tensegrity prototype and a pulley-guided prototype, evaluating wire tension distribution and payload capacity.
|
|
WeBT3-CC Oral Session, CC-313 |
Add to My Program |
Kinematics and Dynamics |
|
|
Chair: Yi, Jingang | Rutgers University |
Co-Chair: Lau, Darwin | The Chinese University of Hong Kong |
|
13:30-15:00, Paper WeBT3-CC.1 | Add to My Program |
Motion Planning and Inertia Based Control for Impact Aware Manipulation |
|
Khurana, Harshit | EPFL |
Billard, Aude | EPFL |
Keywords: Impact Aware Manipulation, Motion Control of Manipulators, Motion and Path Planning, Factory Automation
Abstract: In this paper, we propose a metric called hitting flux, which is used in motion generation and control for a robot manipulator interacting with the environment through a hitting or striking motion. Given the task of placing a known object outside of the workspace of the robot, the robot needs to come into contact with it at a nonzero relative speed. The configuration of the robot and the speed at contact matter because they affect the motion of the object. The physical quantity called hitting flux depends on the robot's configuration, the robot's speed, and the properties of the environment. An approach to achieve the desired directional pre-impact flux for the robot through a combination of a dynamical system (DS) for motion generation and a control system that regulates the directional inertia of the robot is presented. Furthermore, a Quadratic Program (QP) formulation for achieving a desired inertia matrix at a desired position while following a motion plan constrained to the robot's limits is presented. The system is tested for different scenarios in simulation, showing the repeatability of the procedure, and in real scenarios with a KUKA LBR iiwa 7 robot.
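The abstract does not define hitting flux in closed form, but one of its ingredients, the robot's effective inertia along the hitting direction, follows from standard task-space dynamics and is the quantity the QP-based controller shapes. The sketch below computes it from the joint-space mass matrix and Jacobian; the final combination with the object and environment properties into the flux itself is intentionally left out because it is specific to the paper.

import numpy as np

def directional_inertia(M, J, d):
    # Effective translational inertia of the end-effector along the unit direction d:
    # Lambda = (J M^-1 J^T)^-1 is the task-space inertia, and m_d = 1 / (d^T Lambda^-1 d).
    Lambda_inv = J @ np.linalg.solve(M, J.T)        # J M^-1 J^T
    return 1.0 / float(d @ Lambda_inv @ d)

def pre_impact_quantities(M, J, qdot, d):
    # Directional inertia and the contact-point speed along d just before impact;
    # the paper's hitting flux combines quantities of this kind.
    v_d = float(d @ (J @ qdot))
    m_d = directional_inertia(M, J, d)
    return m_d, v_d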
|
|
13:30-15:00, Paper WeBT3-CC.2 | Add to My Program |
RASCAL: A Scalable, High-Redundancy Robot for Automated Storage and Retrieval Systems |
|
Black, Richard | Microsoft |
Caballero, Marco | Microsoft Research |
Chatzieleftheriou, Andromachi | Microsoft |
Deegan, Tim | Microsoft Research |
Heard, Philip | Microsoft Research, Cambridge, UK |
Hong, Freddie | Microsoft Research |
Joyce, Russell | Microsoft Research |
Legtchenko, Sergey | Microsoft |
Rowstron, Antony | Microsoft Research |
Smith, Adam | Microsoft |
Sweeney, David | Microsoft Research |
Williams, Hugh | Microsoft |
Keywords: Industrial Robots, Climbing Robots, Mechanism Design
Abstract: Automated storage and retrieval systems (ASRS) are a key component of the modern storage industry, and are used in a wide range of applications, carrying anything from lightweight tape cartridges to entire pallets of goods. Many of these systems are under pressure to maximise the use of space by growing in height and density, but this can create challenges for the robots that service them. In this context, we present RASCAL, a novel ASRS robot for small payload items in structured environments, with a focus on system-level scalability and redundancy. We describe the design objectives of RASCAL and how they address some of the limitations of existing robotic systems in this area, such as scalability and redundancy. We then demonstrate the viability of our design with a proof-of-concept implementation of a data centre storage media robot, and show through a series of experiments that its design, speed, accuracy, and energy efficiency are appropriate for this application.
|
|
13:30-15:00, Paper WeBT3-CC.3 | Add to My Program |
Virtual Passive-Joint Space Based Time-Optimal Trajectory Planning for a 4-DOF Parallel Manipulator |
|
Zhao, Jie | Chinese Academy of Sciences |
Yang, Guilin | Ningbo Institute of Material Technology and Engineering, Chines |
Shi, Haoyu | University of Nottingham, Ningbo China |
Chen, Silu | Ningbo Institute of Materials Technology and Engineering, CAS |
Chen, Chin-Yin | Ningbo Institute of Material Technology and Engineering, CAS |
Zhang, Chi | Ningbo Institute of Material Technology and Engineering, CAS |
Keywords: Parallel Robots, Kinematics, Motion and Path Planning
Abstract: The 4-DOF (3T1R) 4PPa-2PaR parallel manipulator is developed for high-speed pick-and-place operations. However, conventional trajectory planning methods in either active joint space or Cartesian space have some shortcomings due to its highly nonlinear kinematics. Owing to its unique four-to-two leg structure, the middle link that connects to the two proximal parallelogram four-bar linkages on each side only generates 2-DOF translational motions in a vertical plane. By treating each middle link as a 2-DOF virtual passive joint, a new trajectory planning method in the 4-DOF virtual passive-joint space is proposed, which not only simplifies the kinematic analysis but also decreases the kinematic nonlinearity. By introducing the virtual passive joints, both displacement and velocity analyses are readily investigated. The Lagrangian method is employed to formulate the closed-form dynamic model. The quintic B-spline is utilized to generate trajectories in the virtual passive-joint space, while the Genetic Algorithm is implemented to search for the time-optimal trajectory. The simulation results indicate that the optimal time planned in the virtual passive-joint space is decreased by 2.8% and 8.1% compared with the active-joint space method and Cartesian space method, respectively. The average and peak jerks of the moving platform are decreased by 14.6% and 37.6% compared with the active-joint space method.
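To make the quintic B-spline representation concrete, here is a generic scipy sketch of one passive-joint coordinate parameterized by a clamped quintic spline; the control points are hypothetical, and in the paper a Genetic Algorithm would search over such parameters (and the time scaling) for the time-optimal trajectory.

    import numpy as np
    from scipy.interpolate import BSpline

    k = 5                                                        # quintic degree
    ctrl = np.array([0.0, 0.0, 0.1, 0.4, 0.7, 0.9, 1.0, 1.0])    # 8 control points, one coordinate
    knots = np.r_[np.zeros(k + 1), [1/3, 2/3], np.ones(k + 1)]   # clamped knots, len = 8 + 5 + 1
    spline = BSpline(knots, ctrl, k)

    s = np.linspace(0.0, 1.0, 50)
    pos = spline(s)                  # position profile along the normalized time
    vel = spline.derivative(1)(s)    # velocity
    jerk = spline.derivative(3)(s)   # jerk, useful for the smoothness metrics reported above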
|
|
13:30-15:00, Paper WeBT3-CC.4 | Add to My Program |
Direct Kinematic Singularities and Stability Analysis of Sagging Cable-Driven Parallel Robots |
|
Briot, Sébastien | LS2N |
Merlet, Jean-Pierre | INRIA |
Keywords: Parallel Robots, Kinematics, stability, Industrial Robots
Abstract: Sagging cable-driven parallel robots (CDPRs) are often modelled using Irvine's model. We will show that their configurations may be unstable, and moreover, that assessing the stability of the robot with Irvine's model cannot be done by checking the spectrum of a stiffness matrix associated with the platform motions. In the present paper, we show that the static configurations of the sagging CDPRs are local extrema of the functional describing the robot potential energy. For assessing the stability, it is then necessary to check two conditions: the Legendre-Clebsch and the Jacobi conditions, both well known in optimal control theory. We will also (i) prove that there is a link between some singularities of the CDPRs and the limits of stability and (ii) show that singularities of the platform wrench system are not singularities of the geometric model of the sagging CDPRs, contrary to what happens in rigid-link parallel robotics. The stability prediction results are validated in simulation by cross-validating them against a lumped model, for which the stability can be assessed by analyzing the spectrum of a reduced Hessian matrix of the potential energy.
|
|
13:30-15:00, Paper WeBT3-CC.5 | Add to My Program |
Towards Solving Cable-Driven Parallel Robot Inaccuracy Due to Cable Elasticity |
|
Suarez Roos, Adolfo | IRT Jules Verne |
Zake, Zane | IRT Jules Verne |
Rasheed, Tahir | IRCCyN - ECN - IRT JV |
Pedemonte, Nicolo | IRT Jules Verne |
Caro, Stéphane | CNRS/LS2N |
Keywords: Parallel Robots, Tendon/Wire Mechanism, Kinematics
Abstract: Cable elasticity can significantly impact the accuracy of Cable-Driven Parallel Robots (CDPRs). However, it is frequently disregarded as negligible in CDPR simulations and designs. In this paper, we propose a numerical approach, referred to as SEECR, which is designed to estimate the behavior of a CDPR featuring elastic cables while ensuring the Static Equilibrium (SE) of the Moving-Platform (MP). By modeling the cables as elastic springs, the proposed approach correctly predicts which cables become slack, estimates the tension distribution among cables, and computes unwanted MP motions, allowing the impact of design choices to be predicted. The results have been validated experimentally on two cable types and configurations.
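A minimal sketch of the spring-cable idea referenced above (a cable stretched beyond its free length pulls like a linear spring, while a slack cable carries no tension); this is generic background, not the SEECR solver, and all geometry and stiffness numbers are hypothetical.

    import numpy as np

    def cable_tension(attach_pts, anchor_pts, free_lengths, EA):
        """Tension in each cable modelled as a linear spring that cannot push.

        attach_pts, anchor_pts : (n, 3) cable end points on platform / frame
        free_lengths           : (n,)   unstretched cable lengths
        EA                     : (n,)   axial stiffness (E * A) of each cable
        """
        lengths = np.linalg.norm(attach_pts - anchor_pts, axis=1)
        stretch = lengths - free_lengths
        k = EA / free_lengths                                # spring constant per cable
        return np.where(stretch > 0.0, k * stretch, 0.0)     # slack cables carry no tension

    attach = np.array([[0.1, 0.1, 1.0], [0.9, 0.1, 1.0]])
    anchor = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]])
    print(cable_tension(attach, anchor,
                        free_lengths=np.array([1.0, 1.05]),
                        EA=np.array([1e5, 1e5])))            # second cable comes out slack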
|
|
13:30-15:00, Paper WeBT3-CC.6 | Add to My Program |
Wrench and Twist Capability Analysis for Cable-Driven Parallel Robots with Consideration of the Actuator Torque-Speed Relationship |
|
Chan, Ngo Foon | The Chinese University of Hong Kong |
Lam, Wai Yi | The Chinese University of Hong Kong |
Lau, Darwin | The Chinese University of Hong Kong |
Keywords: Parallel Robots, Tendon/Wire Mechanism, Manipulation Planning, Wrench-twist Feasibility
Abstract: The wrench and twist feasibility are the workspace conditions that indicate whether the mobile-platform (MP) of the cable-driven parallel robots (CDPRs) can provide a sufficient amount of wrench and twist. Traditionally, these two quantities are evaluated independently from the actuator's torque and speed limits, which are assumed to be fixed in the literature, but they are indeed coupled. This results in a conservative usage of the actuator capability and hence hinders the robot's actual feasibility. In this study, new approaches to analyzing and commanding CDPRs by considering the coupling effect are proposed. First, the required wrench of the MP is mapped into the twist space by the motors' torque-speed relationship and becomes the wrench-dependent available twist set. Then a new workspace condition and a new metric are introduced based on the available twist set. The metric shows the maximum allowable MP speed map of the workspace. Finally, a varying speed trajectory is designed based on the metric to optimize the total MP traveling time. This study shows the potential of robot wrench-twist capability and enhances the robot hardware effectiveness without any ha
|
|
13:30-15:00, Paper WeBT3-CC.7 | Add to My Program |
RicMonk: A Three-Link Brachiation Robot with Passive Grippers for Energy-Efficient Brachiation |
|
Grama Srinivas Shourie, Grama Srinivas Shourie | Deutsches Forschungszentrum Für Künstliche Intelligenz, Bremen |
Javadi, Mahdi | German Research Center for Artificial Intelligence Robotics Inn |
Kumar, Shivesh | DFKI GmbH |
Zamani Boroujeni, Hossein | DFKI-Robotics Innovation Center |
Kirchner, Frank | University of Bremen |
Keywords: Underactuated Robots, Biologically-Inspired Robots, Education Robotics
Abstract: This paper presents the design, analysis, and performance evaluation of RicMonk, a novel three-link brachiation robot equipped with passive hook-shaped grippers. Brachiation, an agile and energy-efficient mode of locomotion observed in primates, has inspired the development of RicMonk to explore versatile locomotion and maneuvers on ladder-like structures. The robot’s anatomical resemblance to gibbons and the integration of a tail mechanism for energy injection contribute to its unique capabilities. The paper discusses the use of the Direct Collocation methodology for optimizing trajectories for the robot’s dynamic behaviors and the stabilization of these trajectories using a Time-varying Linear Quadratic Regulator. With RicMonk, we demonstrate bidirectional brachiation and provide a comparative analysis with its predecessor, AcroMonk, a two-link brachiation robot, to demonstrate that the presence of a passive tail helps improve energy efficiency. The system design, controllers, and software implementation are publicly available on GitHub at https://github.com/dfki-ric-underactuated-lab/ricmonk and the video demonstration of the experiments can be viewed at https://youtu.be/hOuDQI7CD8w.
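The time-varying LQR used to stabilize the optimized trajectories follows the standard discrete-time backward Riccati recursion; the numpy sketch below is that generic recursion, not the RicMonk code, and the linearizations A_seq/B_seq would come from the robot model (the toy matrices here are placeholders).

    import numpy as np

    def tvlqr(A_seq, B_seq, Q, R, Qf):
        """Finite-horizon, time-varying LQR gains via the backward Riccati recursion."""
        P = Qf
        gains = []
        for A, B in zip(reversed(A_seq), reversed(B_seq)):
            K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # feedback gain at this step
            P = Q + A.T @ P @ (A - B @ K)                       # cost-to-go update
            gains.append(K)
        return gains[::-1]                                      # K_0 ... K_{N-1}

    # toy double-integrator linearization repeated along the horizon
    A_seq = [np.array([[1.0, 0.1], [0.0, 1.0]])] * 50
    B_seq = [np.array([[0.0], [0.1]])] * 50
    K = tvlqr(A_seq, B_seq, Q=np.eye(2), R=np.eye(1), Qf=10 * np.eye(2))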
|
|
13:30-15:00, Paper WeBT3-CC.8 | Add to My Program |
Gaussian Process-Enhanced, External and Internal Convertible Form-Based Control of Underactuated Balance Robots |
|
Han, Feng | Rutgers University |
Yi, Jingang | Rutgers University |
Keywords: Underactuated Robots, Dynamics, Machine Learning for Robot Control
Abstract: External and internal convertible (EIC) form-based motion control (i.e., EIC-based control) is one of the effective approaches for underactuated balance robots. Through sequential controller design, trajectory tracking of the actuated subsystem and balance of the unactuated subsystem can be achieved simultaneously. However, under certain conditions, uncontrolled robot motion arises under the EIC-based control. We first identify these conditions and then propose an enhanced EIC-based control with a Gaussian process data-driven robot dynamic model. Under the new enhanced EIC-based control, the stability and performance of the closed-loop system are guaranteed. We demonstrate the GP-enhanced control experimentally using two examples of underactuated balance robots.
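The Gaussian process component typically serves as a data-driven residual model of the robot dynamics. The scikit-learn sketch below shows that generic recipe only; the feature choice, toy data, and kernel are assumptions, not the paper's model.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    # training pairs: state-input features -> observed dynamics residual (toy data)
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(200, 3))            # e.g. [angle, angular rate, input]
    y = 0.5 * np.sin(X[:, 0]) * X[:, 2] + 0.01 * rng.standard_normal(200)

    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5) + WhiteKernel(1e-3),
                                  normalize_y=True).fit(X, y)

    # predicted residual and its uncertainty at a query state; a GP-enhanced
    # controller would add this correction to the nominal model
    mu, std = gp.predict(np.array([[0.1, 0.0, 0.2]]), return_std=True)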
|
|
WeBT4-CC Oral Session, CC-315 |
Add to My Program |
Multi-Robot Systems V |
|
|
Chair: Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Co-Chair: Garcia de Marina, Hector | Universidad De Granada |
|
13:30-15:00, Paper WeBT4-CC.1 | Add to My Program |
Automation and Artificial Intelligence Technology in Surface Mining: State of the Art, Challenges and Opportunities |
|
Leung, Raymond | The University of Sydney |
Hill, Andrew John | University of Sydney |
Melkumyan, Arman | The University of Sydney |
Keywords: Mining Robotics, Planning, Scheduling and Coordination, Probability and Statistical Methods
Abstract: This survey article provides a synopsis of some of the engineering problems, technological innovations, robotic development and automation efforts encountered in the mining industry, particularly in the Pilbara iron-ore region of Western Australia. The goal is to paint the technology landscape and highlight issues relevant to an engineering audience to raise awareness of AI and automation trends in mining. It assumes the reader has no prior knowledge of mining and builds context gradually through focused discussion and short summaries of common open-pit mining operations. The principal activities that take place may be categorized in terms of resource development, mine-, rail- and port operations. From mineral exploration to ore shipment, there are roughly nine steps in between. These include: geological assessment, mine planning and development, production drilling and assaying, blasting and excavation, transportation of ore and waste, crush and screen, stockpile and load-out, rail network distribution, and ore-car dumping. The objective is to describe these processes and provide insights on some of the challenges/opportunities from the perspective of a decade-long industry-university R&D partnership.
|
|
13:30-15:00, Paper WeBT4-CC.2 | Add to My Program |
Hierarchical Traffic Management of Multi-AGV Systems with Deadlock Prevention Applied to Industrial Environments (I) |
|
Pratissoli, Federico | Università Degli Studi Di Modena E Reggio Emilia |
Brugioni, Riccardo | RSEngineering Srl |
Battilani, Nicola | University of Modena and Reggio Emilia |
Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Keywords: Multi-Robot Systems, Factory Automation, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper concerns the coordination and the traffic management of a group of Automated Guided Vehicles (AGVs) moving in a real industrial scenario, such as an automated factory or warehouse. The proposed methodology is based on a three-layer control architecture, which is described as follows: 1) the Top Layer (or Topological Layer) models the traffic of vehicles among the different areas of the environment; 2) the Middle Layer allows the path planner to compute a traffic-sensitive path for each vehicle; 3) the Bottom Layer (or Roadmap Layer) defines the final routes to be followed by each vehicle and coordinates the AGVs over time. In the paper we describe the coordination strategy we propose, which is executed once the routes are computed and aims to prevent congestion, collisions and deadlocks. The coordination algorithm exploits a novel deadlock prevention approach based on time-expanded graphs. Moreover, the presented control architecture aims at grounding theoretical methods in an industrial application by facing the typical practical issues such as graph difficulties (load/unload locations, weak connections, etc.), a predefined roadmap (constrained by the plant layout), vehicle errors, dynamic obstacles, etc. In this paper we propose a flexible and robust methodology for multi-AGV traffic-aware management. Moreover, we propose a coordination algorithm, which does not rely on ad hoc assumptions or rules, to prevent collisions and deadlocks and to deal
|
|
13:30-15:00, Paper WeBT4-CC.3 | Add to My Program |
Task Allocation in Heterogeneous Multi-Robot Systems Based on Preference-Driven Hedonic Game |
|
Zhang, Liwang | National University of Defense Technology |
Li, Minglong | National University of Defense Technology |
Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Yang, Shaowu | National University of Defense Technology |
Keywords: Multi-Robot Systems, Search and Rescue Robots, Cooperating Robots
Abstract: Multiple preferences between robots and tasks have been largely overlooked in previous research on Multi-Robot Task Allocation (MRTA) problems. In this paper, we propose a preference-driven approach based on a hedonic game to address the task allocation problem of multi-robot systems in emergency rescue scenarios. We present a distributed framework considering various preferences between robots and tasks to determine the division of coalitions in such problems and evaluate the scalability and adaptability of our algorithm through relevant experiments. Furthermore, considering the strict communication limitations in emergency rescue scenarios, we have verified that our algorithm can efficiently converge to a Nash-stable coalition partition even under conditions of insufficient communication distance.
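To give a feel for Nash-stable coalition formation, here is a deliberately simplified best-response loop in which each robot's utility depends only on its own task preference and on how crowded the coalition is; this is a toy stand-in, not the paper's preference model or its distributed protocol.

    import numpy as np

    def nash_stable_partition(pref, max_rounds=100):
        """Greedy best-response coalition formation.

        pref[i, j] : preference of robot i for rescue task j.
        Returns an assignment robot -> task that no robot wants to change unilaterally
        (or the state reached after max_rounds).
        """
        n_robots, n_tasks = pref.shape
        assign = np.zeros(n_robots, dtype=int)
        for _ in range(max_rounds):
            changed = False
            for i in range(n_robots):
                counts = np.bincount(np.delete(assign, i), minlength=n_tasks)
                utility = pref[i] / (1.0 + counts)        # crowded coalitions are less attractive
                best = int(np.argmax(utility))
                if best != assign[i]:
                    assign[i] = best
                    changed = True
            if not changed:                               # no robot benefits from deviating
                break
        return assign

    pref = np.random.default_rng(0).random((6, 3))        # 6 robots, 3 rescue tasks
    print(nash_stable_partition(pref))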
|
|
13:30-15:00, Paper WeBT4-CC.4 | Add to My Program |
Persistent Monitoring of Multiple Moving Targets Using High Order Control Barrier Functions |
|
Balandi, Lorenzo | Centre Inria De l'Université De Rennes |
De Carli, Nicola | CNRS |
Robuffo Giordano, Paolo | Irisa Cnrs Umr6074 |
Keywords: Multi-Robot Systems, Sensor Networks, Cooperating Robots
Abstract: This paper considers the problem of persistently monitoring a set of moving targets using a team of aerial vehicles. Each agent in the network is assumed to be equipped with a camera with limited range and Field of View (FoV) providing bearing measurements, and it implements an Information Consensus Filter (ICF) to estimate the state of the target(s). The ICF can be proven to be uniformly globally exponentially stable under a Persistency of Excitation (PE) condition. We then propose a distributed control scheme that allows maintaining a prescribed minimum PE level so as to ensure filter convergence. At the same time, the agents in the group are also allowed to perform additional tasks of interest while maintaining a collective observability of the target(s). In order to enforce satisfaction of the observability constraint, we leverage two main tools: (i) the weighted Observability Gramian with a forgetting factor as a measure of the cumulative acquired information, and (ii) the use of High Order Control Barrier Functions (HOCBF) as a means to maintain a minimum level of observability for the targets. Simulation results are reported to prove the effectiveness of this approach.
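Tool (i), the weighted observability Gramian with a forgetting factor, amounts to a discounted accumulation of measurement information; the sketch below shows that generic update (H is a placeholder measurement Jacobian), and its smallest eigenvalue is the kind of scalar a HOCBF-style constraint could keep above a threshold.

    import numpy as np

    def update_gramian(W, H, lam=0.95):
        """Discounted information/observability Gramian update.

        W   : (n, n) current Gramian
        H   : (m, n) measurement Jacobian at the current step (e.g. a bearing Jacobian)
        lam : forgetting factor in (0, 1]; older information decays geometrically
        """
        return lam * W + H.T @ H

    W = np.eye(3) * 1e-3
    H = np.array([[1.0, 0.0, -0.2],
                  [0.0, 1.0, -0.1]])
    for _ in range(20):
        W = update_gramian(W, H)
    print(np.linalg.eigvalsh(W).min())     # scalar observability measure to keep bounded away from zero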
|
|
13:30-15:00, Paper WeBT4-CC.5 | Add to My Program |
Singularity Analysis of Rigid Directed Bearing Graphs for Quadrotor Formations |
|
Erskine, Julian | Ecole Centrale De Nantes |
Briot, Sébastien | LS2N |
Fantoni, Isabelle | CNRS |
Chriette, Abdelhamid | Ecole Centrale De Nantes |
Keywords: Multi-Robot Systems, Sensor-based Control, Kinematics, graph rigidity
Abstract: The decentralization of formations using onboard sensing is important for multi-robot systems, improving the robustness and independence of fleet operations. Bearing measurements (obtainable from embedded cameras) are an attractive choice for use in decentralized formation control; however, this requires that the formation framework be bearing rigid. Rigidity may be checked numerically for a given formation framework; however, it remains difficult to determine the geometric conditions under which otherwise rigid formations become flexible. This paper models the sensor and robot constraints in bearing formations of quadrotors as a kinematic mechanism with analogous properties to find geometric conditions for the degeneration of bearing rigidity (singularities) and the resulting uncontrollable motions. A classification of singularities based on graph substructures is developed, and it is shown that arbitrarily large formations may be designed for which all singularities lie within a known set of geometric conditions. An application showing how knowledge of all singularity cases in a formation can be used for singularity-free control maintenance is provided.
|
|
13:30-15:00, Paper WeBT4-CC.6 | Add to My Program |
EM-Patroller: Entropy Maximized Multi-Robot Patrolling with Steady State Distribution Approximation |
|
Guo, Hongliang | Agency for Science Technology and Research |
Kang, Qi | National University of Singapore |
Yau, Wei-Yun | I2R |
Ang Jr, Marcelo H | National University of Singapore |
Rus, Daniela | MIT |
Keywords: Multi-Robot Systems, Surveillance Robotic Systems
Abstract: This paper investigates the multi-robot patrolling (MuRP) problem in a discrete environment with the objective of approaching the uniform node coverage probability distribution by the robot team. Prevailing MuRP solutions for uniform node coverage either incur high (non-polynomial) computational complexity operations for the global optimal solution, or resort to simple yet effective heuristics for approximate solutions without any performance guarantee. In this paper, we bridge the gap by proposing an efficient iterative algorithm, namely Entropy Maximized Patroller (EM-Patroller), with a per-iteration performance improvement guarantee and polynomial computational complexity. We reformulate the multi-robot patrolling problem in discrete environments as an 'unnormalized' joint steady state distribution entropy maximization problem, and employ a multi-layer perceptron (MLP) to model the relationship between each robot's patrolling strategy and the individual steady state distribution. Then, we derive a multi-agent model-based policy gradient method to gradually update the robots' patrolling strategies towards the optimum. Complexity analysis indicates the polynomial computational complexity of EM-Patroller, and we also show that EM-Patroller has additional benefits of catering to miscellaneous user-defined joint steady state distributions and incorporating other objectives, e.g., entropy maximization of individual steady state distribution, into the objective. We compare E
|
|
13:30-15:00, Paper WeBT4-CC.7 | Add to My Program |
Behavioral-Based Circular Formation Control for Robot Swarms |
|
Bautista, Jesús | Universidad De Granada |
Garcia de Marina, Hector | Universidad De Granada |
Keywords: Multi-Robot Systems, Swarm Robotics
Abstract: This paper focuses on coordinating a robot swarm orbiting a convex path without collisions among the individuals. The individual robots lack braking capabilities and can only adjust their courses while maintaining their constant but different speeds. Instead of controlling the spatial relations between the robots, our formation control algorithm aims to deploy a dense robot swarm that mimics the behavior of tornado schooling fish. To achieve this objective safely, we employ a combination of a scalable overtaking rule, a guiding vector field, and a control barrier function with an adaptive radius to facilitate smooth overtakes. The decision-making process of the robots is distributed, relying only on local information. Practical applications include defensive structures or escorting missions with the added resiliency of a swarm without a centralized command. We provide a rigorous analysis of the proposed strategy and validate its effectiveness through numerical simulations involving a high density of unicycles.
|
|
13:30-15:00, Paper WeBT4-CC.8 | Add to My Program |
Optimization and Evaluation of a Multi Robot Surface Inspection Task through Particle Swarm Optimization |
|
Chiu, Darren | University of Southern California |
Nagpal, Radhika | Harvard University |
Haghighat, Bahar | University of Groningen |
Keywords: Multi-Robot Systems, Swarm Robotics, Sensor Networks
Abstract: Robot swarms can be tasked with a variety of automated sensing and inspection applications in aerial, aquatic, and surface environments. In this paper, we study a simplified two-outcome surface inspection task. We task a group of robots to inspect and collectively classify a 2D surface section based on a binary pattern projected on the surface. We use a decentralized Bayesian decision-making algorithm and deploy a swarm of 3-cm sized wheeled robots to inspect a randomized black and white tiled surface section of size 1m x 1m in simulation. We first describe the model parameters that characterize our simulated environment, the robot swarm, and the inspection algorithm. We then employ a noise-resistant heuristic optimization scheme based on Particle Swarm Optimization (PSO) with a fitness evaluation that combines the swarm's classification decision accuracy and decision time. We use our fitness measure definition to assess the optimized parameters through 100 randomized simulations that vary the surface pattern and initial robot poses. The optimized algorithm parameters show up to a 55% improvement in the median fitness evaluation against an empirically chosen parameter set.
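To make the optimization step concrete, here is a minimal PSO loop with a toy fitness that trades off a noisy accuracy term against a decision-time term; the particle and iteration counts, the weighting, and the fitness itself are illustrative stand-ins for the swarm simulator used in the paper.

    import numpy as np

    def pso(fitness, bounds, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
        """Minimal particle swarm optimiser (maximisation)."""
        rng = np.random.default_rng(seed)
        lo, hi = bounds
        x = rng.uniform(lo, hi, size=(n_particles, lo.size))
        v = np.zeros_like(x)
        pbest, pbest_f = x.copy(), np.array([fitness(p) for p in x])
        g = pbest[pbest_f.argmax()].copy()
        for _ in range(iters):
            r1, r2 = rng.random((2, n_particles, lo.size))
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)   # inertia + cognitive + social
            x = np.clip(x + v, lo, hi)
            f = np.array([fitness(p) for p in x])                   # re-evaluate: noise resistance
            better = f > pbest_f
            pbest[better], pbest_f[better] = x[better], f[better]
            g = pbest[pbest_f.argmax()].copy()
        return g

    def fitness(params):
        # toy surrogate: noisy "accuracy" minus a penalty on "decision time"
        accuracy = 1.0 - np.abs(params - 0.5).mean() + np.random.default_rng().normal(0, 0.01)
        decision_time = params.sum()
        return accuracy - 0.1 * decision_time

    best = pso(fitness, (np.zeros(3), np.ones(3)))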
|
|
13:30-15:00, Paper WeBT4-CC.9 | Add to My Program |
Hierarchical Planning for Long-Horizon Multi-Agent Collective Construction |
|
Singh, Shambhavi | Carnegie Mellon University |
Huang, Zejian | Carnegie Mellon University |
Kesarimangalam Srinivasan, Akshaya | Carnegie Mellon University |
Gutow, Geordan | Carnegie Mellon University |
Vundurthy, Bhaskar | Carnegie Mellon University |
Choset, Howie | Carnegie Mellon University |
Keywords: Multi-Robot Systems, Task and Motion Planning, Robotics and Automation in Construction
Abstract: We develop a planner that directs robots to construct a 3D target structure composed of blocks. The robots themselves are cubes of the same size as the blocks, and they may place, carry, or remove one block at a time. When moving, robots are also allowed to climb or descend a block. A construction plan may thus build a staircase-like scaffolding of blocks to reach other blocks at higher levels. The order of block placement is important; for example, a block that sits atop other blocks must be placed after the blocks below it, and a block that needs scaffolding cannot be placed until after the scaffolding is. Prior works focus on end-to-end approaches that simultaneously plan for block placement order and inter-robot collisions. Larger structures are either intractable or yield high-cost solutions. A prior approach mitigates this by decomposing the structure into smaller components that can be planned for independently, but the computational challenge remains. We present a hierarchical approach that first 1) uses A* to determine a sequence of block placements and removals while ignoring inter-robot collisions, then 2) identifies ordering constraints between block placement and removal actions, and finally 3) computes collision-free paths for multiple robots to perform said actions. Compared to an optimization approach that minimizes the number of timesteps to complete the structure, we observe a 100x reduction in computation time for comparable solutions.
|
|
WeBT5-CC Oral Session, CC-411 |
Add to My Program |
Sensor Fusion I |
|
|
Chair: Pomerleau, Francois | Université Laval |
Co-Chair: Funabora, Yuki | Nagoya University |
|
13:30-15:00, Paper WeBT5-CC.1 | Add to My Program |
Enhancing mmWave Radar Point Cloud Via Visual-Inertial Supervision |
|
Fan, Cong | Wuhan University of Technology |
Zhang, Shengkai | Wuhan University of Technology |
Liu, Kezhong | Wuhan University of Technology |
Wang, Shuai | Southeast University |
Yang, Zheng | Tsinghua University |
Wang, Wei | Huazhong University of Science and Technology |
Keywords: Sensor Fusion, AI-Based Methods, SLAM
Abstract: Complementary to prevalent LiDAR and camera systems, millimeter-wave (mmWave) radar is robust to adverse weather conditions like fog, rainstorms, and blizzards but offers sparse point clouds. Current techniques enhance the point cloud under the supervision of LiDAR data. However, high-performance LiDAR is notably expensive and is not commonly available on vehicles. This paper presents mmEMP, a supervised learning approach that enhances radar point clouds using a low-cost camera and an inertial measurement unit (IMU), enabling crowdsourcing training data from commercial vehicles. Bringing in visual-inertial (VI) supervision is challenging due to the spatially agnostic nature of dynamic objects. Moreover, spurious radar points from the curse of RF multipath make robots misunderstand the scene. mmEMP first devises a dynamic 3D reconstruction algorithm that restores the 3D positions of dynamic features. Then, we design a neural network that densifies radar data and eliminates spurious radar points. We build a new dataset in the real world. Extensive experiments show that mmEMP achieves competitive performance compared with the SOTA approach trained with LiDAR data. In addition, we use the enhanced point cloud to perform object detection, localization, and mapping to demonstrate mmEMP's effectiveness.
|
|
13:30-15:00, Paper WeBT5-CC.2 | Add to My Program |
Influence of Camera-LiDAR Configuration on 3D Object Detection for Autonomous Driving |
|
Li, Ye | University of Michigan |
Hu, Hanjiang | Carnegie Mellon University |
Liu, Zuxin | Carnegie Mellon University |
Xu, Xiaohao | University of Michigan |
Huang, Xiaonan | University of Michigan |
Zhao, Ding | Carnegie Mellon University |
Keywords: Sensor Fusion, Computer Vision for Transportation, AI-Enabled Robotics
Abstract: Cameras and LiDARs are both important sensors for autonomous driving, playing critical roles in 3D object detection. Camera-LiDAR fusion has been a prevalent solution for robust and accurate driving perception. In contrast to the vast majority of existing works that focus on how to improve the performance of 3D object detection through cross-modal schemes, deep learning algorithms, and training tricks, we devote attention to the impact of sensor configurations on the performance of learning-based methods. To achieve this, we propose a unified information-theoretic surrogate metric for camera and LiDAR evaluation based on the proposed sensor perception model. We also design an accelerated high-quality framework for data acquisition, model training, and performance evaluation that functions with the CARLA simulator. To show the correlation between detection performance and our surrogate metrics, we conduct experiments using several camera-LiDAR placements and parameters inspired by self-driving companies and research institutions. Extensive experimental results of representative algorithms on the nuScenes dataset validate the effectiveness of our surrogate metric, demonstrating that sensor configurations significantly impact point-cloud-image fusion based detection models, contributing up to a 30% discrepancy in terms of average precision.
|
|
13:30-15:00, Paper WeBT5-CC.3 | Add to My Program |
Chasing Day and Night: Towards Robust and Efficient All-Day Object Detection Guided by an Event Camera |
|
Cao, Jiahang | The Hong Kong University of Science and Technology (Guangzhou) |
Zheng, Xu | The Hong Kong University of Science and Technology |
Lyu, Yuanhuiyi | The Hong Kong University of Science and Technology (Guangzhou) |
Wang, Jiaxu | Hong Kong University of Science and Technology (Guangzhou) |
Xu, Renjing | The Hong Kong University of Science and Technology (Guangzhou) |
Wang, Lin | HKUST |
Keywords: Sensor Fusion, Data Sets for Robotic Vision, Deep Learning for Visual Perception
Abstract: The ability to detect objects in all lighting (i.e., normal-, over-, and under-exposed) conditions is crucial for real-world applications, such as self-driving. Traditional RGB-based detectors often fail under such varying lighting conditions. Therefore, recent works utilize novel event cameras to supplement or guide the RGB modality; however, these methods typically adopt asymmetric network structures that rely predominantly on the RGB modality, resulting in limited robustness for all-day detection. In this paper, we propose EOLO, a novel object detection framework that achieves robust and efficient all-day detection by fusing both RGB and event modalities. Our EOLO framework is built based on a lightweight spiking neural network (SNN) to efficiently leverage the asynchronous property of events. Buttressed by it, we first introduce an Event Temporal Attention (ETA) module to learn the high temporal information from events while preserving crucial edge information. Secondly, as different modalities exhibit varying levels of importance under diverse lighting conditions, we propose a novel Symmetric RGB-Event Fusion (SREF) module to effectively fuse RGB-Event features without relying on a specific modality, thus ensuring a balanced and adaptive fusion for all-day detection. In addition, to compensate for the lack of paired RGB-Event datasets for all-day training and evaluation, we propose an event synthesis approach based on randomized optical flow that allows for directly generating the event frame from a single exposure image. We further build two new datasets, E-MSCOCO and E-VOC, based on the popular benchmarks MSCOCO and PASCAL VOC. Extensive experiments demonstrate that our EOLO outperforms the state-of-the-art detectors, e.g., RENet, by a substantial margin (+3.74 mAP50) in all lighting conditions. Our code and datasets will be available at https://vlislab22.github.io/EOLO/.
|
|
13:30-15:00, Paper WeBT5-CC.4 | Add to My Program |
CMDFusion: Bidirectional Fusion Network with Cross-Modality Knowledge Distillation for LIDAR Semantic Segmentation |
|
Cen, Jun | The Hong Kong University of Science and Technology |
Zhang, Shiwei | Alibaba Group |
Pei, Yixuan | Xi'an Jiaotong University |
Li, Kun | Alibaba Group |
Zheng, Hang | Alibaba Group |
Luo, Maochun | Alibaba Group |
Zhang, Yingya | Alibaba Group |
Chen, Qifeng | HKUST |
Keywords: Sensor Fusion, Deep Learning for Visual Perception, Recognition
Abstract: 2D RGB images and 3D LIDAR point clouds provide complementary knowledge for the perception system of autonomous vehicles. Several 2D and 3D fusion methods have been explored for the LIDAR semantic segmentation task, but they suffer from different problems. 2D-to-3D fusion methods require strictly paired data during inference, which may not be available in real-world scenarios, while 3D-to-2D fusion methods cannot explicitly make full use of the 2D information. Therefore, we propose a Bidirectional Fusion Network with Cross-Modality Knowledge Distillation (CMDFusion) in this work. Our method has two contributions. First, our bidirectional fusion scheme explicitly and implicitly enhances the 3D feature via 2D-to-3D fusion and 3D-to-2D fusion, respectively, which surpasses either one of the single fusion schemes. Second, we distill the 2D knowledge from a 2D network (Camera branch) to a 3D network (2D knowledge branch) so that the 3D network can generate 2D information even for those points not in the FOV (field of view) of the camera. In this way, RGB images are no longer required during inference since the 2D knowledge branch provides 2D information according to the 3D LIDAR input. We show that our CMDFusion achieves the best performance among all fusion-based methods on the SemanticKITTI and nuScenes datasets. The code will be released at https://github.com/Jun-CEN/CMDFusion.
|
|
13:30-15:00, Paper WeBT5-CC.5 | Add to My Program |
SK-Net: Spectral-Based Knowledge Distillation in Low-Light Thermal Imagery for Robotic Perception |
|
Sikdar, Aniruddh | Indian Institute of Science, Bangalore |
Teotia, Jayant | Indian Institute of Science, Bangalore |
Sundaram, Suresh | Indian Institute of Science |
Keywords: Sensor Fusion, Deep Learning for Visual Perception, Transfer Learning
Abstract: Enhancing the generalization capacity of robotic perception systems for safety-critical applications is vital, especially for environments with low-light and adverse conditions. Multi-spectral fusion techniques aim to maintain the merits of optical (EO) and infrared (IR) modalities, e.g., retaining functional highlights and capturing detailed textures from both modalities. However, these techniques encounter limitations when faced with scenarios involving missing modalities, especially during inference when only IR images are available. To address this issue, this paper proposes a novel contrastive learning-based spectral knowledge distillation technique known as SK-Net to improve the performance of deep learning models for missing modality scenarios for semantic segmentation tasks. Gated Spectral Unit (GSU) is also proposed to combine information from both modalities. SK-Net aims to extract valuable semantic information from optical images while preserving spectral knowledge from the IR modality within the feature space. The model retains the style information in the shallow layers while simultaneously fusing the high-level semantic context obtained from optical (EO) and IR modalities to improve the feature generation capacity when dealing with only IR images during inference. SK-Net outperforms state-of-the-art multi-modal fusion and distillation models in scenarios with missing modalities when using only IR data during inference in two public benchmarking datasets. This performance increase is achieved without additional computational costs compared to the baseline segmentation models.
|
|
13:30-15:00, Paper WeBT5-CC.6 | Add to My Program |
Use Your Imagination: A Detector-Independent Approach for LiDAR Quality Booster |
|
Zhang, Zeping | University of Ottawa |
Liu, Tianran | University of Ottawa |
Laganiere, Robert | University of Ottawa |
Keywords: Sensor Fusion, Deep Learning Methods, AI-Based Methods
Abstract: Features from LiDAR and cameras are considered to be complementary. However, due to the sparsity of the LiDAR point clouds, a dense and accurate RGB/3D projective relationship is difficult to establish, especially for distant scene points. Recent works try to solve this problem by designing a network that learns missing points or dense point density distributions to compensate for the sparsity of the LiDAR point cloud. In this work, we propose to use an imagine-and-locate process, called UYI. The objective of this module is to improve the point cloud quality and is independent of the detection network used for inference. We accomplish this task through a GAN-based cross-modality module which uses an image as input to infer a dense LiDAR shape. Boosted by our UYI block, our experiments show a significant performance improvement in all tested baseline models. In fact, benefiting from the plug-and-play characteristics of our module, we were able to push the performance of an existing state-of-the-art model to a new height. Code will be made available.
|
|
13:30-15:00, Paper WeBT5-CC.7 | Add to My Program |
SuperFusion: Multilevel LiDAR-Camera Fusion for Long-Range HD Map Generation |
|
Dong, Hao | ETH Zürich |
Gu, Weihao | HAOMO.AI Technology Co., Ltd |
Zhang, Xianjing | HAOMO.AI Technology Co., Ltd |
Xu, Jintao | HAOMO.AI Technology Co., Ltd |
Ai, Rui | HAOMO.AI Technology Co., Ltd |
Lu, Huimin | National University of Defense Technology |
Kannala, Juho | Aalto University |
Chen, Xieyuanli | National University of Defense Technology |
Keywords: Sensor Fusion, Deep Learning Methods, Computer Vision for Automation
Abstract: High-definition (HD) semantic map generation of the environment is an essential component of autonomous driving. Existing methods have achieved good performance in this task by fusing different sensor modalities, such as LiDAR and camera. However, current works are based on raw data or network feature-level fusion and only consider short-range HD map generation, limiting their deployment to realistic autonomous driving applications. In this paper, we focus on the task of building the HD maps in both short ranges, i.e., within 30 m, and also predicting long-range HD maps up to 90 m, which is required by downstream path planning and control tasks to improve the smoothness and safety of autonomous driving. To this end, we propose a novel network named SuperFusion, exploiting the fusion of LiDAR and camera data at multiple levels. We use LiDAR depth to improve image depth estimation and use image features to guide long-range LiDAR feature prediction. We benchmark our SuperFusion on the nuScenes dataset and a self-recorded dataset and show that it outperforms the state-of-the-art baseline methods with large margins on all intervals. Additionally, we apply the generated HD map to a downstream path planning task, demonstrating that the long-range HD maps predicted by our method can lead to better path planning for autonomous vehicles. Our code and self-recorded dataset have been released at https://github.com/haomo-ai/SuperFusion.
|
|
13:30-15:00, Paper WeBT5-CC.8 | Add to My Program |
Continuous-Time Ultra-Wideband-Inertial Fusion |
|
Li, Kailai | Linköping University |
Cao, Ziyu | Linköping University |
Hanebeck, Uwe D. | Karlsruhe Institute of Technology (KIT) |
Keywords: Sensor Fusion, Localization
Abstract: We introduce a novel framework of continuous-time ultra-wideband-inertial sensor fusion for online motion estimation. Quaternion-based cubic cumulative B-splines are exploited for parameterizing motion states continuously over time. Systematic derivations of analytic kinematic interpolations and spatial differentiations are further provided. Based thereon, a new sliding-window spline fitting scheme is established for asynchronous multi-sensor fusion and online calibration. We conduct a dedicated validation of the quaternion spline fitting method, and evaluate the proposed system, SFUISE (spline fusion-based ultra-wideband-inertial state estimation), in real-world scenarios using public data set and experiments. The proposed sensor fusion system is real-time capable and delivers superior performance over state-of-the-art discrete-time schemes. We release the source code and own experimental data at https://github.com/KIT-ISAS/SFUISE.
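The quaternion-based cumulative cubic B-spline parameterization mentioned above can be sketched as follows for a uniform knot spacing; this is a generic formulation of the standard cumulative-basis interpolation, not SFUISE code, and the control rotations are toy values.

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    def cumulative_basis(u):
        """Cumulative uniform cubic B-spline basis values B~_1..B~_3 at u in [0, 1)."""
        return np.array([(u**3 - 3*u**2 + 3*u + 5) / 6.0,
                         (-2*u**3 + 3*u**2 + 3*u + 1) / 6.0,
                         u**3 / 6.0])

    def spline_rotation(q, u):
        """Interpolated rotation from four control rotations q[0..3] at u in [0, 1)."""
        b = cumulative_basis(u)
        out = q[0]
        for j in range(3):
            delta = (q[j].inv() * q[j + 1]).as_rotvec()     # log of the relative rotation
            out = out * R.from_rotvec(b[j] * delta)         # re-scaled exponential increment
        return out

    ctrl = [R.from_euler('z', a, degrees=True) for a in (0, 10, 25, 40)]
    print(spline_rotation(ctrl, 0.5).as_euler('zyx', degrees=True))

Analytic time derivatives of this expression give the angular velocity and acceleration used to couple the spline with the IMU and UWB measurement models.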
|
|
13:30-15:00, Paper WeBT5-CC.9 | Add to My Program |
GICI-LIB: A GNSS/INS/Camera Integrated Navigation Library |
|
Chi, Cheng | Shanghai Jiao Tong University |
Zhang, Xin | Shanghai Jiao Tong University |
Liu, Jiahui | Shanghai Jiao Tong University |
Sun, Yulong | Shanghai Jiao Tong University |
Zhang, Zihao | Shanghai Jiao Tong University |
Zhan, Xingqun | Shanghai Jiao Tong University |
Keywords: Sensor Fusion, Localization, Data Sets for SLAM
Abstract: Accurate navigation is essential for autonomous robots and vehicles. In recent years, the integration of the Global Navigation Satellite System (GNSS), Inertial Navigation System (INS), and camera has garnered considerable attention due to its robustness and high accuracy in diverse environments. However, leveraging the full capacity of GNSS is cumbersome because of the diverse choices of formulations, error models, satellite constellations, signal frequencies, and service types, which lead to different precision, robustness, and usage dependencies. To clarify the capacity of GNSS algorithms and accelerate the development efficiency of employing GNSS in multi-sensor fusion algorithms, we open-source the GNSS/INS/Camera Integration Library (GICI-LIB), together with detailed documentation and a comprehensive land vehicle dataset. A factor graph optimization-based multi-sensor fusion framework is established, which combines almost all GNSS measurement error sources by fully considering temporal and spatial correlations between measurements. The graph structure is designed for flexibility, making it easy to form any kind of integration algorithm. For illustration, Real-Time Kinematic (RTK), Precise Point Positioning (PPP), and four RTK-based algorithms from GICI-LIB are evaluated using our dataset and public datasets. Results confirm the potential of the GICI system to provide continuous precise navigation solutions in a wide spectrum of urban environments.
|
|
WeBT6-CC Oral Session, CC-414 |
Add to My Program |
Visual Perception and Learning II |
|
|
Chair: Dansereau, Donald | University of Sydney |
Co-Chair: Choi, Jun Won | Seoul National University |
|
13:30-15:00, Paper WeBT6-CC.1 | Add to My Program |
Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation |
|
Zhu, Junyu | Zhejiang University |
Liu, Lina | Zhejiang University |
Tang, Yu | HuaWei |
Wen, Feng | Huawei Technologies Co., Ltd |
Li, Wanlong | Beijing Huawei Digital Technologies Co., Ltd |
Liu, Yong | Zhejiang University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Visual Learning
Abstract: Visual bird's eye view (BEV) semantic segmentation helps autonomous vehicles understand the surrounding environment only from front-view (FV) images, including static elements (e.g., roads) and dynamic elements (e.g., vehicles, pedestrians). However, the high cost of the annotation procedures of fully supervised methods limits the capability of visual BEV semantic segmentation, which usually needs HD maps, 3D object bounding boxes, and camera extrinsic matrices. In this paper, we present a novel semi-supervised framework for visual BEV semantic segmentation to boost performance by exploiting unlabeled images during training. A consistency loss that makes full use of unlabeled data is then proposed to constrain the model not only on the semantic prediction but also on the BEV feature. Furthermore, we propose a novel and effective data augmentation method named conjoint rotation which reasonably augments the dataset while maintaining the geometric relationship between the FV images and the BEV semantic segmentation. Extensive experiments on the nuScenes dataset show that our semi-supervised framework can effectively improve prediction accuracy. To the best of our knowledge, this is the first work that explores improving visual BEV semantic segmentation performance using unlabeled data. The code is available at https://github.com/Junyu-Z/Semi-BEVseg.
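The consistency idea can be sketched as penalizing disagreement between predictions on two views of the same unlabeled image, at both the feature and the semantic level; the PyTorch stub below is an assumption about that general recipe (the model interface, augmentation, and equal weighting are placeholders), not the paper's loss.

    import torch
    import torch.nn.functional as F

    def consistency_loss(model, unlabeled_imgs, augment):
        """Penalise disagreement between predictions on two views of unlabeled images.

        model   : callable returning (bev_feature, bev_logits) for a batch of FV images
        augment : photometric augmentation that leaves the BEV geometry unchanged
        """
        feat_a, logits_a = model(unlabeled_imgs)
        feat_b, logits_b = model(augment(unlabeled_imgs))
        sem = F.mse_loss(logits_a.softmax(dim=1), logits_b.softmax(dim=1))   # semantic consistency
        feat = F.mse_loss(feat_a, feat_b)                                    # BEV-feature consistency
        return sem + feat

    # toy usage with a stand-in "model" and a noise augmentation
    toy_model = lambda x: (x.mean(dim=1, keepdim=True), x)
    imgs = torch.rand(2, 3, 8, 8)
    print(consistency_loss(toy_model, imgs, lambda x: x + 0.01 * torch.randn_like(x)))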
|
|
13:30-15:00, Paper WeBT6-CC.2 | Add to My Program |
OpenAnnotate3D: Open-Vocabulary Auto-Labeling System for Multi-Modal 3D Data |
|
Zhou, Yijie | Fudan University |
Cai, Likun | University of Toronto |
Cheng, Xianhui | Fudan University |
Gan, Zhongxue | Fudan University |
Xue, Xiangyang | Fudan University |
Ding, Wenchao | Fudan University |
Keywords: Computer Vision for Automation, Multi-Modal Perception for HRI, Autonomous Vehicle Navigation
Abstract: In the era of big data and large models, automatic annotation functions for multi-modal data are of great significance for real-world AI-driven applications, such as autonomous driving and embodied AI. Unlike traditional closed-set annotation, open-vocabulary annotation is essential to achieve human-level cognition capability. However, there are few open-vocabulary auto-labeling systems for multi-modal 3D data. In this paper, we introduce OpenAnnotate3D, an open-source open-vocabulary auto-labeling system that can automatically generate 2D masks, 3D masks, and 3D bounding box annotations for vision and point cloud data. Our system integrates the chain-of-thought capabilities of Large Language Models (LLMs) and the cross-modality capabilities of vision-language models (VLMs). To the best of our knowledge, OpenAnnotate3D is one of the pioneering works for open-vocabulary multi-modal 3D auto-labeling. We conduct comprehensive evaluations on both public and in-house real-world datasets, which demonstrate that the system significantly improves annotation efficiency compared to manual annotation while providing accurate open-vocabulary auto-annotating results.
|
|
13:30-15:00, Paper WeBT6-CC.3 | Add to My Program |
SAM-Event-Adapter: Adapting Segment Anything Model for Event-RGB Semantic Segmentation |
|
Yao, Bowen | Beijing University of Technology |
Deng, Yongjian | Beijing University of Technology |
Liu, Yuhan | Beijing University of Technology |
Chen, Hao | City University of Hong Kong |
Li, You-Fu | City University of Hong Kong |
Yang, Zhen | Beijing University of Technology |
Keywords: Computer Vision for Automation, Sensor Fusion, Deep Learning for Visual Perception
Abstract: Semantic segmentation, a fundamental visual task ubiquitously employed in sectors ranging from transportation and robotics to healthcare, has always captivated the research community. In the wake of rapid advancements in large model research, the foundation model for semantic segmentation tasks, termed the Segment Anything Model (SAM), has been introduced. This model substantially addresses the dilemma of the poor generalizability of previous segmentation models and the disadvantage of having to retrain the whole model on different datasets. Nonetheless, segmentation models developed on SAM remain constrained by the inherent limitations of RGB sensors, particularly in scenarios characterized by complex lighting conditions and high-speed motion. Motivated by these observations, a natural recourse is to adapt SAM to additional visual modalities without compromising its robust generalizability. To achieve this, we introduce a lightweight SAM-Event-Adapter (SE-Adapter) module, which incorporates event camera data into a cross-modal learning architecture based on SAM, with only a limited increment in tunable parameters. Capitalizing on the high dynamic range and temporal resolution afforded by event cameras, our proposed multi-modal Event-RGB learning architecture effectively augments the performance of semantic segmentation tasks. In addition, we propose a novel paradigm for representing event data in a patch format compatible with transformer-based models, employing multi-spatiotemporal scale encoding to efficiently extract motion and semantic correlations from event representations. Exhaustive empirical evaluations conducted on the DSEC-Semantic and DDD17 datasets validate the effectiveness and rationality of our proposed approach.
|
|
13:30-15:00, Paper WeBT6-CC.4 | Add to My Program |
BuFF: Burst Feature Finder for Light-Constrained 3D Reconstruction |
|
Ravendran, Ahalya | The Commonwealth Scientific and Industrial Research Organisation |
Bryson, Mitch | University of Sydney |
Dansereau, Donald | University of Sydney |
Keywords: Computer Vision for Automation, SLAM
Abstract: Robots operating in low-light conditions with conventional cameras face significant challenges due to the low signal-to-noise ratio in the images. Previous work has demonstrated the use of burst-imaging techniques to partially overcome this issue. This study proposes a novel feature finder that enhances vision-based reconstruction under extremely low light conditions. The approach locates features with well-defined scale and apparent motion within each burst by jointly searching in a scale-slope space. We demonstrate improved performance in feature detection, camera pose estimation and reconstruction compared to state-of-the-art feature extractors on conventional and burst-merged images. This work opens avenues for robotic applications where low-light conditions often pose difficulties such as disaster recovery and drone delivery at night.
|
|
13:30-15:00, Paper WeBT6-CC.5 | Add to My Program |
Unsupervised Spike Depth Estimation Via Cross-Modality Cross-Domain Knowledge Transfer |
|
Liu, Jiaming | Peking University |
Zhang, Qizhe | Peking University |
Li, Xiaoqi | Peking University |
Li, Jianing | Nanjing University |
Wang, Guanqun | Peking University |
Lu, Ming | Intel Labs |
Huang, Tiejun | Peking University |
Zhang, Shanghang | Peking University |
Keywords: Computer Vision for Automation, Transfer Learning, Deep Learning for Visual Perception
Abstract: Neuromorphic spike data, an upcoming modality with high temporal resolution, has shown promising potential in autonomous driving by mitigating the challenges posed by high-velocity motion blur. However, training the spike depth estimation network holds significant challenges in two aspects: sparse spatial information for pixel-wise tasks and difficulties in achieving paired depth labels for temporally intensive spike streams. Therefore, we introduce open-source RGB data to support spike depth estimation, leveraging its annotations and spatial information. The inherent differences in modalities and data distribution make it challenging to directly apply transfer learning from open-source RGB to target spike data. To this end, we propose a cross-modality cross-domain (BiCross) framework to realize unsupervised spike depth estimation by introducing simulated mediate source spike data. Specifically, we design a Coarse-to-Fine Knowledge Distillation (CFKD) approach to facilitate comprehensive cross-modality knowledge transfer while preserving the unique strengths of both modalities, utilizing a spike-oriented uncertainty scheme. Then, we propose a Self-Correcting Teacher-Student (SCTS) mechanism to screen out reliable pixel-wise pseudo labels and ease the domain shift of the student model, which avoids error accumulation in target spike data. To verify the effectiveness of BiCross, we conduct extensive experiments on four scenarios, including Synthetic to Real, Extreme Weather, Scene Changing, and Real Spike. Our method achieves state-of-the-art (SOTA) performances, compared with RGB-oriented unsupervised depth estimation methods. The code and dataset are available at https://github.com/anonymous-4869/BiCross.
|
|
13:30-15:00, Paper WeBT6-CC.6 | Add to My Program |
PillarGen: Enhancing Radar Point Cloud Density and Quality Via Pillar-Based Point Generation Network |
|
Kim, Jisong | Hanyang University |
Bang, Geonho | Hanyang University |
Choi, KwangJin | Hanyang University |
Seong, Minjae | Hanyang University |
Yoo, Jae Chang | HL Klemove |
Pyo, Eunjong | HL Klemove |
Choi, Jun Won | Seoul National University |
Keywords: Computer Vision for Automation, Visual Learning, Deep Learning for Visual Perception
Abstract: In this paper, we present a novel point generation model, referred to as Pillar-based Point Generation Network (PillarGen), which facilitates the transformation of point clouds from one domain into another. PillarGen can produce synthetic point clouds with enhanced density and quality based on the provided input point clouds. The PillarGen model performs the following three steps: 1) pillar encoding, 2) Occupied Pillar Prediction (OPP), and 3) Pillar to Point Generation (PPG). The input point clouds are encoded using a pillar grid structure to generate pillar features. Then, OPP determines the active pillars used for point generation and predicts the center of points and the number of points to be generated for each active pillar. PPG generates the synthetic points for each active pillar based on the information provided by OPP. We evaluate the performance of PillarGen using our proprietary radar dataset, focusing on enhancing the density and quality of short-range radar data using the long-range radar data as supervision. Our experiments demonstrate that PillarGen outperforms traditional point upsampling methods in quantitative and qualitative measures. We also confirm that when PillarGen is incorporated into bird’s eye view object detection, a significant improvement in detection accuracy is achieved.
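The first step, pillar encoding, groups points into vertical cells of a BEV grid before any feature learning; a minimal index-building sketch follows (the grid ranges and resolution are made-up values, and the OPP and PPG stages of the paper are not shown).

    import numpy as np

    def pillarize(points, x_range=(0.0, 69.12), y_range=(-39.68, 39.68), pillar=0.16):
        """Group radar/LiDAR points into vertical pillars on a BEV grid.

        points : (N, 3) x, y, z coordinates
        Returns a dict mapping (ix, iy) grid cell -> indices of the points inside it.
        """
        ix = np.floor((points[:, 0] - x_range[0]) / pillar).astype(int)
        iy = np.floor((points[:, 1] - y_range[0]) / pillar).astype(int)
        nx = int((x_range[1] - x_range[0]) / pillar)
        ny = int((y_range[1] - y_range[0]) / pillar)
        keep = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
        pillars = {}
        for idx in np.flatnonzero(keep):
            pillars.setdefault((ix[idx], iy[idx]), []).append(idx)
        return pillars

    pts = np.random.default_rng(0).uniform([0, -40, -1], [70, 40, 1], size=(1000, 3))
    print(len(pillarize(pts)), "occupied pillars")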
|
|
13:30-15:00, Paper WeBT6-CC.7 | Add to My Program |
Deep Dynamic Layout Optimisation of Photogrammetry Camera Position Based on Digital Twin (I) |
|
Wang, Likun | Hunan University, University of Nottingham |
Wang, Zi | University of Nottingham |
Kendall, Peter | University of Nottingham |
Gumma, Kevin | KGTec Ltd |
Turner, Alison | University of Nottingham |
Ratchev, Svetan | The University of Nottingham |
Keywords: Computer Vision for Manufacturing, Intelligent and Flexible Manufacturing, Methods and Tools for Robot System Design
Abstract: Photogrammetry systems have been widely used in industrial manufacturing applications, such as high-precision assembly, reverse engineering and additive manufacturing. To meet the demands of product variety and short product lifecycles, factory facilities, including photogrammetry devices, should be relocated in response to rapid changes in mechanical structure and hardware integration. Nevertheless, the camera position of the photogrammetry system is difficult to select so as to guarantee optimal field of view (FoV) coverage of retro-reflective targets over the whole production horizon. Especially in a reconfigurable manufacturing work cell, scaling and calibration of a photogrammetry system require professional skills and would cost tremendous labour each time a rapid configuration is needed. In this paper, we propose a novel deep optimisation framework for the photogrammetry camera position for dynamic layout design based on a digital twin. The optimisation framework follows an effective coarse-to-fine procedure to evaluate the FoV visibility over the target frame. In addition, the deep Q-learning algorithm is utilised to find the maximum FoV coverage and avoid collision. Three experiments are implemented to verify the application feasibility of the proposed deep camera position optimisation framework.
|
|
13:30-15:00, Paper WeBT6-CC.8 | Add to My Program |
MMA-Net: Multiple Morphology-Aware Network for Automated Cobb Angle Measurement |
|
Qiu, Zhengxuan | Southern University of Science and Technology |
Yang, Jie | Southern University of Science and Technology |
Wang, Jiankun | Southern University of Science and Technology |
Keywords: Computer Vision for Medical Robotics, Medical Robots and Systems, Computer Vision for Automation
Abstract: Scoliosis diagnosis and assessment depend largely on the measurement of the Cobb angle in spine X-ray images. With the emergence of deep learning techniques that employ landmark detection, tilt prediction, and spine segmentation, automated Cobb angle measurement has become increasingly popular. However, these methods encounter difficulties such as high noise sensitivity, intricate computational procedures, and exclusive reliance on a single type of morphological information. In this paper, we introduce the Multiple Morphology-Aware Network (MMA-Net), a novel framework that improves Cobb angle measurement accuracy by integrating multiple spine morphologies as attention information. In the MMA-Net, we first feed spine X-ray images into the segmentation network to produce multiple types of morphological information (spine region, centerline, and boundary) and then concatenate the original X-ray image with the resulting segmentation maps as input for the regression module to perform precise Cobb angle measurement. Furthermore, we devise joint loss functions for our segmentation and regression network training, respectively. We evaluate our method on the AASCE challenge dataset and achieve superior performance with an SMAPE of 7.28% and an MAE of 3.18°, indicating strong competitiveness compared to other outstanding methods. Consequently, we can offer clinicians automated, efficient, and reliable Cobb angle measurement.
|
|
13:30-15:00, Paper WeBT6-CC.9 | Add to My Program |
Synset Boulevard: A Synthetic Image Dataset for VMMR |
|
Sielemann, Anne | Fraunhofer IOSB |
Wolf, Stefan | Fraunhofer IOSB |
Roschani, Masoud | Fraunhofer IOSB |
Ziehn, Jens | Fraunhofer IOSB |
Beyerer, Jürgen | Fraunhofer Gesellschaft |
Keywords: Computer Vision for Transportation, Deep Learning for Visual Perception, Simulation and Animation
Abstract: We present and discuss the Synset Boulevard dataset, designed for the task of surveillance-nature vehicle make and model recognition (VMMR)—to the best of our knowledge the first entirely synthetically generated large-scale VMMR image dataset. Through the simulation of image data rather than the manual annotation of real data, we intend to mitigate common challenges in state-of-the-art VMMR datasets, namely bias, human error, privacy, and the challenge of providing systematic updates. On the other hand, the provision and use of synthetic data introduce individual challenges, such as potential domain gaps and a less pronounced intra-class variance. Our approach to address these challenges, using path tracing and physically-based, data-driven models, is evaluated on an existing large real-world dataset. Overall, our synthetic dataset contains 32,400 independent images (each with different imaging simulations and with/without masked license plates, leading to a total of 259,200 images) from 162 different vehicle models of 43 makes depicted in front view. It is split into 8 subdatasets to investigate the influence of optical/imaging effects on the classification ability.
|
|
WeBT7-CC Oral Session, CC-416 |
Add to My Program |
Learning in Control I |
|
|
Chair: Chitnis, Rohan | Meta AI |
Co-Chair: Zhao, Ding | Carnegie Mellon University |
|
13:30-15:00, Paper WeBT7-CC.1 | Add to My Program |
IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control |
|
Chitnis, Rohan | Massachusetts Institute of Technology |
Xu, Yingchen | Meta AI |
Hashemi, Bobak | Meta AI |
Lehnert, Lucas | Meta AI |
Dogan, Ürün | Ruhr-University Bochum, Institut Für Neuroinformatik |
Zhu, Zheqing | Stanford University |
Delalleau, Olivier | NVIDIA |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Imitation Learning
Abstract: Model-based reinforcement learning (RL) has shown great promise due to its sample efficiency, but still struggles with long-horizon sparse-reward tasks, especially in offline settings where the agent learns from a fixed dataset. We hypothesize that model-based RL agents struggle in these environments due to a lack of long-term planning capabilities, and that planning in a temporally abstract model of the environment can alleviate this issue. In this paper, we make two key contributions: 1) we introduce an offline model-based RL algorithm, IQL-TD-MPC, that extends the state-of-the-art Temporal Difference Learning for Model Predictive Control (TD-MPC) with Implicit Q-Learning (IQL); and 2) we propose to use IQL-TD-MPC as a Manager in a hierarchical setting with any off-the-shelf offline RL algorithm as a Worker. More specifically, we pre-train a temporally abstract IQL-TD-MPC Manager to predict "intent embeddings", which roughly correspond to subgoals, via planning. We show that augmenting state representations with intent embeddings generated by an IQL-TD-MPC manager significantly improves off-the-shelf offline RL agents' performance on some of the most challenging D4RL benchmark tasks. For instance, the offline RL algorithms AWAC, TD3-BC, DT, and CQL all get zero or near-zero normalized evaluation scores on the medium and large antmaze tasks, while our modification gives an average score over 40.
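As background only (not the authors' code), the expectile regression loss that Implicit Q-Learning uses to fit the value function, and which IQL-TD-MPC grafts onto TD-MPC, can be sketched as:

import numpy as np

def expectile_loss(q_minus_v, tau=0.7):
    # Asymmetric squared loss: upside errors (Q - V > 0) are weighted by tau,
    # downside errors by (1 - tau), so V tracks an upper expectile of Q.
    w = np.where(q_minus_v > 0, tau, 1.0 - tau)
    return float(np.mean(w * q_minus_v ** 2))

q = np.array([1.2, -0.3, 0.5])   # Q(s, a) estimates over a batch
v = np.array([0.8,  0.1, 0.4])   # V(s) estimates over the same batch
loss = expectile_loss(q - v, tau=0.7)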
|
|
13:30-15:00, Paper WeBT7-CC.2 | Add to My Program |
SLIM: Skill Learning with Multiple Critics |
|
Emukpere, David | Naver Labs Europe |
Wu, Bingbing | Naver Labs Europe |
Perez, Julien | Naver Labs Europe |
Renders, Jean-Michel | Naver Labs Europe |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Deep Learning Methods
Abstract: Self-supervised skill learning aims to acquire useful behaviors that leverage the underlying dynamics of the environment. Latent variable models based on mutual information maximization have been successful in this task but still struggle in the context of robotic manipulation. Because manipulation requires affecting a possibly large set of degrees of freedom in the environment, mutual information maximization alone fails to produce useful and safe manipulation behaviors. Furthermore, naively augmenting skill discovery rewards with additional rewards may also fail to produce the desired behaviors. To address this limitation, we introduce SLIM, a multi-critic learning approach for skill discovery with a particular focus on robotic manipulation. Our main insight is that using multiple critics in an actor-critic framework to gracefully combine multiple reward functions leads to a significant improvement in latent-variable skill discovery for robotic manipulation, while overcoming the interference among rewards that hinders convergence to useful skills. Furthermore, in the context of tabletop manipulation, we demonstrate the applicability of our novel skill discovery approach to acquire safe and efficient motor primitives in a hierarchical reinforcement learning fashion and leverage them through planning, significantly surpassing baseline approaches for skill discovery.
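One plausible (hypothetical) reading of the multi-critic idea is that each reward stream, e.g. the skill-discovery reward and a manipulation reward, gets its own critic, and the actor is updated on an aggregate of the per-critic values. The sketch below illustrates only that aggregation and is not SLIM's actual implementation; the names and weights are made up.

import numpy as np

def actor_objective(q_values_per_critic, weights=None):
    # q_values_per_critic: list of per-critic Q(s, pi(s)) arrays over a batch.
    # The actor ascends a (weighted) sum of the critics' values.
    q = np.stack(q_values_per_critic, axis=0)          # (n_critics, batch)
    w = np.ones(q.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    return float(np.mean(w[:, None] * q))

q_skill = np.array([0.4, 0.7, 0.2])    # critic trained on the skill-discovery reward
q_manip = np.array([1.1, 0.3, 0.9])    # critic trained on the manipulation reward
objective = actor_objective([q_skill, q_manip], weights=[1.0, 0.5])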
|
|
13:30-15:00, Paper WeBT7-CC.3 | Add to My Program |
SPRINT: Scalable Policy Pre-Training Via Language Instruction Relabeling |
|
Zhang, Jesse | University of Southern California |
Pertsch, Karl | University of Southern California |
Zhang, Jiahui | University of Southern California |
Lim, Joseph | Korea Advanced Institute of Science and Technology |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Learning from Demonstration
Abstract: Pre-training robots with a rich set of skills can substantially accelerate the learning of downstream tasks. Prior works have defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions. Thus, we propose SPRINT, a scalable offline policy pre-training approach which substantially reduces the human effort needed for pre-training a diverse set of skills. Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining with offline reinforcement learning. As a result, SPRINT pre-training equips robots with a richer repertoire of skills that can help an agent generalize to new tasks. Experiments in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster learning of new long-horizon tasks than previous pre-training approaches. Website at https://clvrai.com/sprint.
|
|
13:30-15:00, Paper WeBT7-CC.4 | Add to My Program |
Effective Representation Learning Is More Effective in Reinforcement Learning Than You Think |
|
Zheng, Jiawei | Xi'an Jiaotong University |
Song, Yonghong | Xi'an Jiaotong University |
Keywords: Reinforcement Learning, Representation Learning, Model Learning for Control
Abstract: In reinforcement learning (RL), learning directly from pixels is commonly known as vision-based RL. Effective state representations are crucial for high performance in vision-based RL. However, to learn effective state representations, most current vision-based RL methods based on contrastive unsupervised learning use auxiliary tasks similar to those in computer vision, which does not guarantee effective information exchange between representation learning and RL. To learn more effective state representations, we propose a simple and effective vision-based RL method. It leverages the representations acquired through contrastive learning by the Teacher Encoder and the Student Encoder to collaboratively estimate the Q-function. This cooperative process uses the TD error to steer updates to the Teacher Encoder, thereby ensuring effective information exchange between representation learning and RL. We refer to this approach as Reinforcement Learning with Teacher-Student Collaboration (RLTSC). RLTSC incorporates recent advancements in contrastive unsupervised learning, endowing it with potent representation learning capabilities. It provides a robust estimate of the Q-function with minimal variance and effectively guides the Teacher Encoder to acquire a more effective representation. RLTSC substantially enhances data efficiency in vision-based RL, surpassing state-of-the-art methods on various continuous and discrete control benchmarks. Remarkably, RLTSC even outperforms RL methods based on physical state features in terms of data efficiency on continuous control benchmarks. This may enlighten us: effective representation learning is more effective in reinforcement learning than you think!
|
|
13:30-15:00, Paper WeBT7-CC.5 | Add to My Program |
Learning Highly Dynamic Behaviors for Quadrupedal Robots |
|
Zhang, Chong | Tencent Robotics X |
Sheng, Jiapeng | Shandong University |
Li, Tingguang | The Chinese University of Hong Kong |
Zhang, He | Tencent |
Zhou, Cheng | Tencent |
Zhu, Qingxu | Tencent |
Zhao, Rui | Tencent |
Zhang, Yizheng | Tencent |
Han, Lei | Tencent Robotics X |
Keywords: Reinforcement Learning, Imitation Learning, Learning from Demonstration
Abstract: Learning highly dynamic behaviors for robots has been a longstanding challenge. Traditional approaches have demonstrated robust locomotion, but the exhibited behaviors lack diversity and agility. They employ approximate models, which lead to compromises in performance. Data-driven approaches have been shown to reproduce agile behaviors of animals, but typically have not been able to learn highly dynamic behaviors. In this paper, we propose a learning-based approach to enable robots to learn highly dynamic behaviors from animal motion data. The learned controller is deployed on a quadrupedal robot and the results show that the controller is able to reproduce highly dynamic behaviors including sprinting, jumping and sharp turning. Various behaviors can be activated through human interaction using a stick with markers attached to it. Based on the motion pattern of the stick, the robot exhibits walking, running, sitting and jumping, much like the way humans interact with a pet.
|
|
13:30-15:00, Paper WeBT7-CC.6 | Add to My Program |
TWIST: Teacher-Student World Model Distillation for Efficient Sim-To-Real Transfer |
|
Yamada, Jun | University of Oxford |
Rigter, Marc | University of Oxford |
Collins, Jack | University of Oxford |
Posner, Ingmar | Oxford University |
Keywords: Reinforcement Learning, Transfer Learning, Machine Learning for Robot Control
Abstract: Model-based RL is a promising approach for real-world robotics due to its improved sample efficiency and generalization capabilities compared to model-free RL. However, effective model-based RL solutions for vision-based real-world applications require bridging the sim-to-real gap for any world model learnt. Due to its significant computational cost, standard domain randomisation does not provide an effective solution to this problem. This paper proposes TWIST (Teacher-Student World Model Distillation for Sim-to-Real Transfer) to achieve efficient sim-to-real transfer of vision-based model-based RL using distillation. TWIST leverages state observations as readily accessible, privileged information commonly garnered from a simulator to significantly accelerate sim-to-real transfer. Specifically, a teacher world model is trained efficiently on state information. At the same time, a matching dataset is collected of domain-randomised image observations. The teacher world model then supervises a student world model that takes the domain-randomised image observations as input. By distilling the learned latent dynamics model from the teacher to the student model, TWIST achieves efficient and effective sim-to-real transfer for vision-based model-based RL tasks. Experiments in simulated and real robotics tasks demonstrate that our approach outperforms naive domain randomisation and model-free methods in terms of sample efficiency and task performance of sim-to-real transfer.
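Illustration only: the distillation described above can be read as regressing the student's latent, computed from domain-randomised images, onto the teacher's latent, computed from privileged state. The linear "encoders" and dimensions below are stand-ins for the authors' world models, not their architecture.

import numpy as np

rng = np.random.default_rng(0)
W_teacher = rng.normal(size=(16, 8))    # privileged state -> latent (frozen teacher)
W_student = rng.normal(size=(16, 64))   # image features -> latent (to be trained)

def distillation_loss(state, image_features):
    z_teacher = W_teacher @ state            # target latent (no gradient in practice)
    z_student = W_student @ image_features   # student latent from randomised images
    return float(np.mean((z_student - z_teacher) ** 2))

loss = distillation_loss(rng.normal(size=8), rng.normal(size=64))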
|
|
13:30-15:00, Paper WeBT7-CC.7 | Add to My Program |
Learning Vision-Based Pursuit-Evasion Robot Policies |
|
Bajcsy, Andrea | Carnegie Mellon University |
Loquercio, Antonio | UC Berkeley |
Kumar, Ashish | UC Berkeley |
Malik, Jitendra | UC Berkeley |
Keywords: Machine Learning for Robot Control, Visual Learning, Human-Aware Motion Planning
Abstract: Learning strategic robot behavior---like that required in pursuit-evasion interactions---under real-world constraints is extremely challenging. It requires exploiting the dynamics of the interaction, and planning through both physical state and latent intent uncertainty. In this paper, we transform this intractable problem into a supervised learning problem, where a fully-observable robot policy generates supervision for a partially-observable one. We find that the quality of the supervision signal for the partially-observable pursuer policy depends on two key factors: the balance of diversity and optimality of the evader’s behavior, and the strength of the modeling assumptions in the fully-observable policy. We deploy our policy on a physical quadruped robot with an RGB-D camera on pursuit-evasion interactions in the wild. Despite all the challenges, the sensing constraints bring about creativity: the robot is pushed to gather information when uncertain, predict intent from noisy measurements, and anticipate in order to intercept.
|
|
WeBT8-CC Oral Session, CC-418 |
Add to My Program |
Learning in Grasping and Manipulation II |
|
|
Chair: Namiki, Akio | Chiba University |
Co-Chair: Posa, Michael | University of Pennsylvania |
|
13:30-15:00, Paper WeBT8-CC.1 | Add to My Program |
Learning to Catch Reactive Objects with a Behavior Predictor |
|
Lu, Kai | University of Oxford |
Zhong, Jia-Xing | University of Oxford |
Yang, Bo | The Hong Kong Polytechnic University |
Wang, Bing | University of Oxford |
Markham, Andrew | Oxford University |
Keywords: Deep Learning Methods, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Tracking and catching moving objects is an important ability for robots in a dynamic world. Whilst some objects have highly predictable state evolution e.g., the ballistic trajectory of a tennis ball, reactive targets alter their behavior in response to the motion of the manipulator. Reactive applications range from gently capturing living animals such as snakes or fish for biological investigations, to smoothly interacting with and assisting a person. Existing works for dynamic catching usually perform target prediction followed by planning, but seldom account for highly non-linear reactive behaviors. Alternatively, Reinforcement Learning (RL) based methods simply treat the target and its motion as part of the observation of the world-state, but perform poorly due to the weak reward signal. In this work, we blend the approach of an explicit, yet learned, target state predictor with RL. We further show how a tightly coupled predictor which `observes' the state of the robot leads to significantly improved anticipatory action, especially with targets that seek to evade the robot following a simple policy. Experiments show that our method achieves an 86.4% (open plane area) and a 73.8% (room) success rate on evasive objects, outperforming monolithic reinforcement learning and other techniques. We also demonstrate the efficacy of our approach across varied targets and trajectories. All code, data, and additional videos: https://kl-research.github.io/dyncatch.
|
|
13:30-15:00, Paper WeBT8-CC.2 | Add to My Program |
Enhancing Task Performance of Learned Simplified Models Via Reinforcement Learning |
|
Bui, Hien | University of Pennsylvania |
Posa, Michael | University of Pennsylvania |
Keywords: Reinforcement Learning, Model Learning for Control, Dexterous Manipulation
Abstract: In contact-rich tasks, the hybrid, multi-modal nature of contact dynamics poses great challenges in model representation, planning, and control. Recent efforts have attempted to address these challenges via data-driven methods, learning dynamical models in combination with model predictive control. Those methods, while effective, rely solely on minimizing forward prediction errors in the hope of better task performance with MPC controllers. This weak correlation can result in data inefficiency as well as limitations to overall performance. In response, we propose a novel strategy: using a policy gradient algorithm to find a simplified dynamics model that explicitly maximizes task performance. Specifically, we parameterize the stochastic policy as the perturbed output of the MPC controller, so that the learned model representation can directly associate with the policy and task performance. We apply the proposed method to contact-rich tasks where a three-fingered robotic hand manipulates previously unknown objects. Our method significantly enhances task success rate, by up to 15% in manipulating diverse objects, compared to the existing method while sustaining data efficiency. Our method can solve some tasks with success rates of 70% or higher using under 30 minutes of data. All videos and codes are available at https://sites.google.com/view/lcs-rl.
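Schematically, and in notation chosen here rather than the paper's, "parameterizing the stochastic policy as the perturbed output of the MPC controller" can be written as a Gaussian policy centred on the MPC output, trained with the standard score-function policy gradient:

\pi_\theta(u \mid x) = \mathcal{N}\big(u;\ u_{\mathrm{MPC}}(x;\theta),\ \Sigma\big), \qquad \nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\big[\nabla_\theta \log \pi_\theta(u \mid x)\,\hat{R}\big],

where \theta denotes the parameters of the learned simplified dynamics model, u_{\mathrm{MPC}}(x;\theta) is the MPC action computed under that model, and \hat{R} is the task return.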
|
|
13:30-15:00, Paper WeBT8-CC.3 | Add to My Program |
Leveraging the Efficiency of Multi-Task Robot Manipulation Via Task-Evoked Planner and Reinforcement Learning |
|
Qian, Haofu | Zhejiang University |
Zhang, Haoyang | Zhejiang University |
Shao, Jun | Zhejiang University |
Zhang, Jiatao | Zhejiang University |
Gu, Jason | Dalhousie University |
Song, Wei | Zhejiang Lab |
Zhu, Shiqiang | Zhejiang University |
Keywords: Reinforcement Learning, Manipulation Planning
Abstract: Multi-task learning has expanded the boundaries of robotic manipulation, enabling the execution of increasingly complex tasks. However, policies learned through reinforcement learning frequently exhibit limited generalization and narrow distributions, which restricts their effectiveness in multi-task training. Obtaining policies that are both general and stable is a non-trivial problem. To tackle this issue, we propose a planning-guided reinforcement learning method. It leverages a task-evoked planner (TEP) and a reinforcement learning approach with the planner's guidance. TEP uses reusable samples as its source, with the aim of learning reachability information across different task scenarios. In reinforcement learning, TEP then assesses and guides the Actor towards better outputs, smoothly enhancing performance on multi-task benchmarks. We evaluate this approach within the Meta-World framework and compare it with other typical multi-task algorithms in terms of learning efficiency and effectiveness. Experimental results show that our method is more efficient, achieves higher success rates, and demonstrates more realistic behavior.
|
|
13:30-15:00, Paper WeBT8-CC.4 | Add to My Program |
Generalize by Touching: Tactile Ensemble Skill Transfer for Robotic Furniture Assembly |
|
Lin, Haohong | Carnegie Mellon University |
Corcodel, Radu | Mitsubishi Electric Research Laboratories |
Zhao, Ding | Carnegie Mellon University |
Keywords: Reinforcement Learning, Force and Tactile Sensing, Transfer Learning
Abstract: Furniture assembly remains an unsolved problem in robotic manipulation due to its long task horizon and non-generalizable operation plans. This paper presents the Tactile Ensemble Skill Transfer (TEST) framework, a pioneering offline reinforcement learning (RL) approach that incorporates tactile feedback in the control loop. TEST's core design is to learn a skill transition model for high-level planning, along with a set of adaptive intra-skill goal-reaching policies. This design aims to solve the robotic furniture assembly problem in a more generalizable way, facilitating seamless chaining of skills for this long-horizon task. We first sample demonstrations from a set of heuristic policies, with trajectories consisting of randomized sub-skill segments, enabling the acquisition of rich robot trajectories that capture skill stages, robot states, visual indicators, and, crucially, tactile signals. Leveraging these trajectories, our offline RL method discerns skill termination conditions and coordinates skill transitions. Our evaluations highlight the proficiency of TEST on in-distribution furniture assemblies, its adaptability to unseen furniture configurations, and its robustness against visual disturbances. Ablation studies further accentuate the pivotal role of two algorithmic components: the skill transition model and the tactile ensemble policies. Results indicate that TEST can achieve a success rate of 90% and is over 4 times more efficient than the heuristic policy in both in-distribution and generalization settings, suggesting a scalable skill transfer approach for contact-rich manipulation.
|
|
13:30-15:00, Paper WeBT8-CC.5 | Add to My Program |
Sim2Real Manipulation on Unknown Objects with Tactile-Based Reinforcement Learning |
|
Su, Entong | University of California San Diego |
Jia, Chengzhe | University of California San Diego |
Qin, Yuzhe | UC San Diego |
Zhou, Wenxuan | Carnegie Mellon University |
Macaluso, Annabella | University of California, San Diego |
Huang, Binghao | University of California, San Diego |
Wang, Xiaolong | UC San Diego |
Keywords: Reinforcement Learning, Force and Tactile Sensing, Sensor-based Control
Abstract: Using tactile sensors for manipulation remains one of the most challenging problems in robotics. At the heart of these challenges is generalization: How can we train a tactile-based policy that can manipulate unseen and diverse objects? In this paper, we propose to perform Reinforcement Learning with only visual tactile sensing inputs on diverse objects in a physical simulator. By training with diverse objects in simulation, it enables the policy to generalize to unseen objects. However, leveraging simulation introduces the Sim2Real transfer problem. To mitigate this problem, we study different tactile representations and evaluate how each affects real-robot manipulation results after transfer. We conduct our experiments on diverse real-world objects and show significant improvements over baselines. Our project page is available at: https://tactilerl.github.io
|
|
13:30-15:00, Paper WeBT8-CC.6 | Add to My Program |
Synchronized Dual-Arm Rearrangement Via Cooperative MTSP |
|
Li, Wenhao | National University of Defense Technology |
Zhang, Shishun | National University of Defense Technology |
Dai, Sisi | National University of Defense Technology |
Huang, Hui | Shenzhen University |
Hu, Ruizhen | Shenzhen University |
Chen, Xiaohong | Hunan University of Technology and Business |
Xu, Kai | National University of Defense Technology |
Keywords: Reinforcement Learning, Task Planning, Dual Arm Manipulation
Abstract: Synchronized dual-arm rearrangement is widely studied as a common scenario in industrial applications. It often faces scalability challenges due to the computational complexity of robotic arm rearrangement and the high-dimensional nature of dual-arm planning. To address these challenges, we formulated the problem as cooperative mTSP, a variant of mTSP where agents share cooperative costs, and utilized reinforcement learning for its solution. Our approach involved representing rearrangement tasks using a task state graph that captured spatial relationships and a cooperative cost matrix that provided details about action costs. Taking these representations as observations, we designed an attention-based network to effectively combine them and provide rational task scheduling. Furthermore, a cost predictor is also introduced to directly evaluate actions during both training and planning, significantly expediting the planning process. Our experimental results demonstrate that our approach outperforms existing methods in terms of both performance and planning efficiency.
|
|
13:30-15:00, Paper WeBT8-CC.7 | Add to My Program |
EquivAct: SIM(3)-Equivariant Visuomotor Policies Beyond Rigid Object Manipulation |
|
Yang, Jingyun | Stanford University |
Deng, Congyue | Stanford |
Wu, Jimmy | Princeton University |
Antonova, Rika | Stanford University |
Guibas, Leonidas | Stanford University |
Bohg, Jeannette | Stanford University |
Keywords: Representation Learning, Deep Learning in Grasping and Manipulation, Mobile Manipulation
Abstract: If a robot masters folding a kitchen towel, we would also expect it to master folding a beach towel. However, existing works for policy learning that rely on data augmentation are still limited in achieving this level of generalization. Our insight is to add equivariance to both the visual object representation and policy architecture. We propose EquivAct which utilizes SIM(3)-equivariant network structures that guarantee generalization across all possible object translations, 3D rotations, and scales by construction. EquivAct is trained in two phases. We first pre-train a SIM(3)-equivariant visual representation on simulated scene point clouds. Then, we learn a SIM(3)-equivariant visuomotor policy on top of the pre-trained visual representation using a small amount of source task demonstrations. We show that the learned policy directly transfers to objects that substantially differ in scale, position, and orientation from the source demonstrations. In simulation, we evaluate our method in three manipulation tasks involving deformable and articulated objects that go beyond typical rigid object manipulation tasks that prior work considered. We show that our method outperforms prior works that do not use equivariant architectures or do not use our contrastive pre-training procedure. We also show real robot experiments where the robot watches 20 demonstrations of a tabletop task and transfers zero-shot to a mobile manipulation task in a much larger setup. Project website: https://equivact.github.io
|
|
13:30-15:00, Paper WeBT8-CC.8 | Add to My Program |
Multi Actor-Critic DDPG for Robot Action Space Decomposition: A Framework to Control Large 3D Deformation of Soft Linear Objects |
|
Daniel, Mélodie | LaBRI - Université De Bordeaux |
Magassouba, Aly | INP Clermont |
Aranda, Miguel | Universidad De Zaragoza |
Lequievre, Laurent | Université Clermont Auvergne - CNRS |
Corrales Ramon, Juan Antonio | Universidade De Santiago De Compostela |
Iglesias, Roberto | Univ of Santiago De Compostela |
Mezouar, Youcef | Clermont Auvergne INP - SIGMA Clermont |
Keywords: Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Robotic manipulation of deformable linear objects (DLOs) has great potential for applications in diverse fields such as agriculture or industry. However, a major challenge lies in acquiring accurate deformation models that describe the relationship between robot motion and DLO deformations. Such models are difficult to calculate analytically and vary among DLOs. Consequently, manipulating DLOs poses significant challenges, particularly in achieving large deformations that require highly accurate global models. To address these challenges, this paper presents MultiAC6: a new multi Actor-Critic framework for robot action space decomposition to control large 3D deformations of DLOs. In our approach, two deep reinforcement learning (DRL) agents orient and position a robot gripper to deform a DLO into the desired shape. Unlike previous DRL-based studies, MultiAC6 is able to solve the sim-to-real gap, achieving large 3D deformations up to 40 cm in real-world settings. Experimental results also show that MultiAC6 has a 66% higher success rate than a single-agent approach. Further experimental studies demonstrate that MultiAC6 generalizes well, without retraining, to DLOs with different lengths or materials.
|
|
WeBT9-CC Oral Session, CC-419 |
Add to My Program |
Collision Avoidance II |
|
|
Chair: Zhu, Chi | Maebashi Institute of Technology |
Co-Chair: Kanoulas, Dimitrios | University College London |
|
13:30-15:00, Paper WeBT9-CC.1 | Add to My Program |
DiPPeR: Diffusion-Based 2D Path Planner Applied on Legged Robots |
|
Liu, Jianwei | University College London |
Stamatopoulou, Maria | University College London |
Kanoulas, Dimitrios | University College London |
Keywords: Motion and Path Planning, Collision Avoidance, Legged Robots
Abstract: In this work, we present DiPPeR, a novel and fast 2D path planning framework for quadrupedal locomotion, leveraging diffusion-driven techniques. Our contributions include a scalable dataset generator for map images and corresponding trajectories, an image-conditioned diffusion planner for mobile robots, and a training/inference pipeline employing CNNs. We validate our approach in several mazes, as well as in real-world deployment scenarios on Boston Dynamics' Spot and Unitree's Go1 robots. DiPPeR generates trajectories on average 23 times faster than both search-based and data-driven path planning algorithms, with an average of 87% consistency in producing feasible paths of various lengths in maps of variable size and obstacle structure.
|
|
13:30-15:00, Paper WeBT9-CC.2 | Add to My Program |
Efficient Polynomial Sum-Of-Squares Programming for Planar Robotic Arms |
|
Keren, Daniel | University of Haifa |
Shahar, Amit | University of Haifa |
Poranne, Roi | University of Haifa |
Keywords: Collision Avoidance, Optimization and Optimal Control, Motion and Path Planning
Abstract: Collision-avoiding motion planning for articulated robotic arms is one of the major challenges in robotics. The difficulty of the problem arises from its high dimensionality and the intricate geometry of the feasible space. Our goal is to seek large convex domains in configuration space which contain no obstacles. In these domains, simple linear trajectories are guaranteed to be collision-free and can be leveraged for further optimization. To find such domains, practitioners have harnessed a methodology known as Sum-Of-Squares (SOS) Programming. SOS programs, however, are notorious for their poor scaling properties, which makes it challenging to employ them for complex problems. In this paper, we explore a simple formulation for a two-dimensional arm, which results in smaller SOS programs than previously suggested ones. We show that this formulation can express a variety of scenarios in a unified manner.
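For readers unfamiliar with the machinery, the standard SOS background (not specific to this paper) is: a polynomial p(x) admits an SOS decomposition if

p(x) = z(x)^{\top} Q\, z(x), \qquad Q \succeq 0,

for a vector of monomials z(x), which can be checked by semidefinite programming; and a sufficient certificate that p(x) \ge 0 on a semialgebraic set \{x : g_i(x) \ge 0\} is

p(x) = \sigma_0(x) + \textstyle\sum_i \sigma_i(x)\, g_i(x), \qquad \sigma_i \ \text{SOS}.

The size of the resulting semidefinite program grows quickly with the number of variables and the polynomial degree, which is the scaling issue the formulation above targets.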
|
|
13:30-15:00, Paper WeBT9-CC.3 | Add to My Program |
PathRL: An End-To-End Path Generation Method for Collision Avoidance Via Deep Reinforcement Learning |
|
Yu, Wenhao | University of Science and Technology of China |
Peng, Jie | University of Science and Technology of China |
Qiu, Quecheng | School of Data Science, USTC, Hefei 230026, China |
Wang, Hanyu | University of Science and Technology of China |
Zhang, Lu | Institute of Artificial Intelligence, Hefei Comprehensive Nation |
Ji, Jianmin | University of Science and Technology of China |
Keywords: Reinforcement Learning, Collision Avoidance, Motion and Path Planning
Abstract: Robot navigation using deep reinforcement learning (DRL) has shown great potential in improving the performance of mobile robots. Nevertheless, most existing DRL-based navigation methods primarily focus on training a policy that directly commands the robot with low-level controls, like linear and angular velocities, which leads to unstable speeds and unsmooth trajectories of the robot during long-term execution. An alternative method is to train a DRL policy that outputs the navigation path directly. The robot can then follow the generated path smoothly using sophisticated velocity-planning and path-following controllers, whose parameters are specified according to the hardware platform. However, two roadblocks arise for training a DRL policy that outputs paths: (1) the action space for potential paths often has higher dimensionality compared to low-level commands, which increases the difficulty of training; and (2) it takes multiple time steps to track a path instead of a single time step, which requires the path to predict the interactions of the robot with the dynamic environment over multiple time steps. This, in turn, amplifies the challenges associated with training. In response to these challenges, we propose PathRL, a novel DRL method that trains the policy to generate the navigation path for the robot. Specifically, we employ specific action space discretization techniques and tailored state space representation methods to address the associated challenges. Curriculum learning is employed to expedite the training process, while the reward function also takes into account the smooth transition between adjacent paths. In our experiments, PathRL achieves better success rates and reduces angular rotation variability compared to other DRL navigation methods, facilitating stable and smooth robot movement. We demonstrate the competitive edge of PathRL in both real-world scenarios and multiple challenging simulation environments.
|
|
13:30-15:00, Paper WeBT9-CC.4 | Add to My Program |
ZAPP! Zonotope Agreement of Prediction and Planning for Continuous-Time Collision Avoidance with Discrete-Time Dynamics |
|
Paparusso, Luca | Politecnico Di Milano |
Kousik, Shreyas | Georgia Institute of Technology |
Schmerling, Edward | Stanford University |
Braghin, Francesco | Politecnico Di Milano |
Pavone, Marco | Stanford University |
Keywords: Collision Avoidance, Planning under Uncertainty
Abstract: The past few years have seen immense progress on two fronts that are critical to safe, widespread mobile robot deployment: predicting uncertain motion of multiple agents, and planning robot motion under uncertainty. However, the numerical methods required on each front have resulted in a mismatch of representation for prediction and planning. In prediction, numerical tractability is usually achieved by coarsely discretizing time, and by representing multimodal multi-agent interactions as distributions with infinite support. On the other hand, safe planning typically requires very fine time discretization, paired with distributions with compact support, to reduce conservativeness and ensure numerical tractability. The result is, when existing predictors are coupled with planning and control, one may often find unsafe motion plans. This paper proposes ZAPP (Zonotope Agreement of Prediction and Planning) to resolve the representation mismatch. ZAPP unites a prediction-friendly coarse time discretization and a planning-friendly zonotope uncertainty representation; the method also enables differentiating through a zonotope collision check, allowing one to integrate prediction and planning within a gradient-based optimization framework. Numerical examples show how ZAPP can produce safer trajectories compared to baselines in interactive scenes.
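For context, a zonotope is the compact-support set representation referred to above; in standard (not paper-specific) notation,

\mathcal{Z} = \{\, c + G\beta \;:\; \lVert \beta \rVert_\infty \le 1 \,\}, \qquad c \in \mathbb{R}^n,\ G \in \mathbb{R}^{n \times p},

and the Minkowski sum of two zonotopes is again a zonotope with centre c_1 + c_2 and generator matrix [G_1\ G_2], which is what makes propagating set-valued predictions over a planning horizon computationally cheap.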
|
|
13:30-15:00, Paper WeBT9-CC.5 | Add to My Program |
Certifying Bimanual RRT Motion Plans in a Second |
|
Amice, Alexandre | MIT |
Werner, Peter | Massachusetts Institute of Technology |
Tedrake, Russ | Massachusetts Institute of Technology |
Keywords: Collision Avoidance, Motion and Path Planning, Computational Geometry
Abstract: We present an efficient method for certifying non-collision for piecewise-polynomial motion plans in algebraic reparametrizations of configuration space. Such motion plans include those generated by popular randomized methods including RRTs and PRMs, as well as those generated by many methods in trajectory optimization. Based on Sums-of-Squares optimization, our method provides exact, rigorous certificates of non-collision; it can never falsely claim that a motion plan containing collisions is collision-free. We demonstrate that our formulation is practical for real world deployment, certifying the safety of a twelve degree of freedom motion plan in just over a second. Moreover, the method is capable of discriminating the safety or lack thereof of two motion plans which differ by only millimeters.
|
|
13:30-15:00, Paper WeBT9-CC.6 | Add to My Program |
Cross View Capture for Distributed Image Compression with Decoder Side Information |
|
Yin, Yankai | Nankai University |
Sun, Zhe | RIKEN |
Ruan, Peiying | NVIDIA |
Duan, Feng | Nankai University |
Li, Ruidong | Kanazawa University |
Zhu, Chi | Maebashi Institute of Technology |
Keywords: Collision Avoidance
Abstract: Image compression is increasingly important in applications like intelligent driving and smart surveillance systems. This study presents a novel cross view capture distributed image compression network (CVCDIC) to improve the compression quality by using decoder side information. The CVCDIC’s decoder utilizes feature extraction networks to extract features from both the primary image and the side information. Furthermore, a multi-level cross view attention module is designed to capture interrelated details between images at multiple hierarchical levels. Finally, a spatial refinement module, constructed on the foundation of information distillation networks, is designed to further refine the quality of reconstructed images. The results show that CVCDIC can achieve an MS-SSIM of 0.978 at 0.15 bpp, surpassing DSIN (0.925), NDIC (0.956), and ATN (0.955) on the KITTI Stereo dataset.
|
|
13:30-15:00, Paper WeBT9-CC.7 | Add to My Program |
Planning with Learned Subgoals Selected by Temporal Information |
|
Huang, Xi | Karlsruhe Institute of Technology |
Sóti, Gergely | Karlsruhe University of Applied Sciences |
Ledermann, Christoph | Karlsruhe Institute of Technology |
Hein, Björn | Karlsruhe University of Applied Sciences |
Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Keywords: Collision Avoidance, Motion and Path Planning, Integrated Planning and Learning
Abstract: Path planning in a changing environment is a challenging task in robotics, as moving objects impose time-dependent constraints. Recent planning methods focus primarily on the spatial aspects, lacking the capability to directly incorporate time constraints. In this paper, we propose a method that leverages a generative model to decompose a complex planning problem into small manageable ones by incrementally outputting subgoals given the current planning context. Then, we take into account the temporal information and use learned time estimators based on different statistic distributions to examine and select the generated subgoal candidates. Experiments show that planning from the current robot state to the selected subgoal can satisfy the given time-dependent constraints while being goal-oriented.
|
|
13:30-15:00, Paper WeBT9-CC.8 | Add to My Program |
Neural Potential Field for Obstacle-Aware Local Motion Planning |
|
Alhaddad, Muhammad | Moscow Institute of Physics and Technology |
Mironov, Konstantin | Moscow Institute of Physics and Technology |
Staroverov, Aleksei | MIPT |
Panov, Aleksandr | AIRI |
Keywords: Collision Avoidance, Machine Learning for Robot Control, Motion and Path Planning
Abstract: Model predictive control (MPC) may provide local motion planning for mobile robotic platforms. The challenging aspect is the analytic representation of collision cost for the case when both the obstacle map and robot footprint are arbitrary. We propose a Neural Potential Field: a neural network model that returns a differentiable collision cost based on robot pose, obstacle map, and robot footprint. The differentiability of our model allows its usage within the MPC solver. It is computationally hard to solve problems with a very high number of parameters. Therefore, our architecture includes neural image encoders, which transform obstacle maps and robot footprints into embeddings, which reduce problem dimensionality by two orders of magnitude. The reference data for network training are generated based on algorithmic calculation of a signed distance function. Comparative experiments showed that the proposed approach is comparable with existing local planners: it provides trajectories with outperforming smoothness, comparable path length, and safe distance from obstacles.
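As a small illustration of the "algorithmic calculation of a signed distance function" used to generate reference data, the sketch below labels a grid of planar points against a single circular obstacle; the obstacle shape and grid are assumptions made for the example, not the authors' setup.

import numpy as np

def sdf_circle(points, center, radius):
    # Signed distance to a disc: negative inside, positive outside, zero on the boundary.
    return np.linalg.norm(points - center, axis=-1) - radius

grid = np.stack(np.meshgrid(np.linspace(-2.0, 2.0, 64),
                            np.linspace(-2.0, 2.0, 64)), axis=-1).reshape(-1, 2)
labels = sdf_circle(grid, center=np.array([0.5, 0.0]), radius=0.4)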
|
|
13:30-15:00, Paper WeBT9-CC.9 | Add to My Program |
Unconstrained Model Predictive Control for Robot Navigation under Uncertainty |
|
Arul, Senthil Hariharan | University of Maryland, College Park |
Park, Jong Jin | Amazon Lab126 |
Prem, Vishnu | Amazon |
Zhang, Yang | Latitude Ai |
Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Collision Avoidance, Planning under Uncertainty
Abstract: In this paper, we present a probabilistic and unconstrained model predictive control formulation for robot navigation under uncertainty. We present (1) a closed-form approximation of the probability of collision that naturally models the propagation of uncertainty over the planning horizon and is computationally cheap to evaluate, and (2) a collision-cost formulation which provably preserves forward invariance (i.e., keeps the robot away from obstacles) when combined with the probability formulation. Notably, our formulation avoids hard constraints by construction, which in turn avoids abrupt transitions in robot behavior around the constraint boundaries leading to graceful navigation. Further, we present proof for the forward invariance and the stability of the approach. We compare the efficacy of our method with the baseline [1], which the proposed approach builds on. We demonstrate that the approach results in confident and safe robot navigation in tight spaces by smoothly slowing down the robot in low survivability environments (e.g., tight corridors), but also allows it to move away from obstacles safely when needed.
|
|
WeBT10-CC Oral Session, CC-501 |
Add to My Program |
Soft Robot Applications I |
|
|
Chair: Wen, Li | Beihang University |
Co-Chair: Okamura, Allison M. | Stanford University |
|
13:30-15:00, Paper WeBT10-CC.1 | Add to My Program |
Efficient RRT*-Based Safety-Constrained Motion Planning for Continuum Robots in Dynamic Environments |
|
Luo, Peiyu | Southern University of Science and Technology |
Yao, Shilong | City University of Hong Kong/Southern University of Science And |
Yue, Yiyao | Southern University of Science and Technology |
Yan, Hong | City University of Hong Kong |
Wang, Jiankun | Southern University of Science and Technology |
Meng, Max Q.-H. | The Chinese University of Hong Kong |
Keywords: Soft Robot Applications, Constrained Motion Planning, Medical Robots and Systems
Abstract: Continuum robots, characterized by their high flexibility and infinite degrees of freedom (DoFs), have gained prominence in applications such as minimally invasive surgery and hazardous environment exploration. However, the intrinsic complexity of continuum robots requires a significant amount of time for their motion planning, posing a hurdle to their practical implementation. To tackle these challenges, efficient motion planning methods such as Rapidly Exploring Random Trees (RRT) and its variant, RRT*, have been employed. This paper introduces a unique RRT*-based motion control method tailored for continuum robots. Our approach embeds safety constraints derived from the robots' posture states, facilitating autonomous navigation and obstacle avoidance in rapidly changing environments. Simulation results show efficient trajectory planning amidst multiple dynamic obstacles and provide a robust performance evaluation based on the generated postures. Finally, preliminary tests were conducted on a two-segment cable-driven continuum robot prototype, confirming the effectiveness of the proposed planning approach. This method is versatile and can be adapted and deployed for various types of continuum robots through parameter adjustments.
|
|
13:30-15:00, Paper WeBT10-CC.2 | Add to My Program |
Ultrafast Capturing In-Flight Objects with Reprogrammable Working Speed Ranges |
|
Jiang, Yongkang | Tongji University |
Tong, Xin | Shenzhen Institute of Advanced Technology, CAS |
Sun, Zhongqing | Shenzhen Institute of Advanced Technology,Chinese Academy |
Zhou, Yanmin | Tongji University |
Wang, Zhipeng | Tongji University |
Jiang, Shuo | Tongji University |
Yin, Zhen | Tongji University |
Ding, Yulong | Tongji University |
He, Bin | Tongji University |
Li, Yingtian | Shenzhen Institutes of Advanced Technology, Chinese Academy of S |
Keywords: Soft Robot Applications, Grasping, Mechanism Design
Abstract: In-flight high-speed object capturing is crucial in nature to improve survival and adaptation to the environment, as in the predation of frogs, leopards, and eagles. Despite its ubiquity in nature, capturing fast-moving objects is extremely challenging in engineering implementations. In this paper, we report an ultrafast gripper based on tunable bistable structures. Unlike current designs, which are only suitable for objects within a certain speed range once the gripper is fabricated, the working speed range of the proposed gripper can be reprogrammed by controlling the sensitivity of the structures. We present the design and fabrication of the proposed gripper in detail. A theoretical model is introduced to construct the energy landscape of the structures and the force response of the gripper when programmed to different states. The results show that in the original state, the gripper is capable of capturing a flying table tennis ball at a speed of 15 m/s in only 6 ms. When the proposed gripper is switched to the ultra-sensitive state, a flying ball at only 1 m/s can also be captured. This work broadens the frontiers of in-flight capturing design, and we envision broader promising applications.
|
|
13:30-15:00, Paper WeBT10-CC.3 | Add to My Program |
A Restorable, Variable Stiffness Pneumatic Soft Gripper Based on Jamming of Strings of Beads |
|
Han, Fenglin | Central South University |
Fei, Lei | Central South University |
Zou, Run | Central South University |
Li, Weijian | Central South University |
Zhou, JingHao | Central South University |
Zhao, Haiming | Central South University |
Keywords: Soft Robot Applications, Grippers and Other End-Effectors, Soft Robot Materials and Design, particle jamming
Abstract: Soft robots based on particle jamming cannot return to their initial position and initial mechanical state after the jamming is removed, due to the accumulation of particles, which means poor restorability; moreover, the compliance of such robots during deformation is reduced by the jamming effect. Here, we present the design, fabrication, and testing of a novel soft actuator with good restorability and compliance. To improve the restorability of the actuator, we used cotton threads to connect spherical acrylic beads into strings instead of using discrete beads. The beads can be pulled back to their initial positions by the threads, so the actuator also returns to its initial state. To avoid the jamming effect during deformation of the actuator, we used compressed air to drive the actuator and injected the beads into the actuator after the active deformation. To reduce the driving pressure and facilitate the flow of the beads, an initially non-contacting, frame-type strain constraint structure was designed for the soft actuator. Experimental data show that the actuator is flexible during bending and its stiffness can increase more than 12-fold to resist external loads. By pulling the threads, the actuator can be restored to its initial state with an error of less than 3% of the actuator length after an operation cycle. The soft gripper based on the actuator can grasp repeatedly or laterally, and can grasp soft objects such as a piece of tofu and a balloon of water.
|
|
13:30-15:00, Paper WeBT10-CC.4 | Add to My Program |
Hard Shell, Soft Core: Binary Actuators for Deep-Sea Applications |
|
Sourkounis, Cora Maria | Leibniz University Hannover |
García Morales, Ditzia Susana | Leibniz Universität Hannover |
Kwasnitschka, Tom | GEOMAR Helmholtz Centre for Ocean Research Kiel |
Raatz, Annika | Leibniz Universität Hannover |
Keywords: Soft Robot Applications, Marine Robotics, Mechanism Design
Abstract: Deep-sea research offers invaluable opportunities to unravel hidden ecosystems, uncover unknown biodiversity, and provide critical insights into the Earth's history and the impacts of climate change. Due to the extreme conditions, exploring the deep sea traditionally requires costly equipment, such as specialised diving robots engineered to withstand the high pressure. Our research aims to reduce the costs of deep-sea sediment sampling by introducing a novel actuation system for suction samplers that capitalises on the advantages of soft material actuators. At first glance, soft material actuators may not appear suitable for the harsh conditions that prevail in the deep sea, but when combined with a rigid, bistable mechanism there is great potential for improving the accessibility of sampling and research in this challenging environment. The binary actuation system that results from this combination is modular, scalable, lightweight, and low cost in comparison to existing solutions.
|
|
13:30-15:00, Paper WeBT10-CC.5 | Add to My Program |
Tip-Clutching Winch for High Tensile Force Application with Soft Growing Robots |
|
Osele, Obumneme Godson | Stanford University |
Barhydt, Kentaro | Massachusetts Institute of Technology |
Cerone, Nicholas | Massachusetts Institute of Technology |
Okamura, Allison M. | Stanford University |
Asada, Harry | MIT |
Keywords: Soft Robot Applications, Mechanism Design, Soft Robot Materials and Design
Abstract: The navigational abilities of tip-everting soft growing robots, known as vine robots, are compromised when tip-mount devices are added to enable carrying of payloads. We present a new method for securing a vine robot to objects or its environment that exploits the unique eversion-based growth mechanism and flexibility of vine robots, while keeping the tip of the vine robot free of encumbrance. Our implementation is a tip-clutching winch, into which vine robots can insert themselves and anchor to via powerful overlapping belt friction. The device enables passive, high-strength, and reversible fastening, and can easily release the vine robot. This approach enables carrying of loads of at least 28 kg (limited by the tensile strength of the vine robot body material and winch actuator torque capacity), as well as novel material transport and locomotion capabilities.
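The "powerful overlapping belt friction" invoked above is in the spirit of the classic capstan (belt friction) relation, given here only as background and not as the authors' model:

T_{\mathrm{hold}} = T_{\mathrm{load}}\, e^{-\mu \phi},

so the tension that must be actively held decays exponentially with the wrap angle \phi and friction coefficient \mu, which is why a modest winch torque can anchor a heavily loaded vine robot body.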
|
|
13:30-15:00, Paper WeBT10-CC.6 | Add to My Program |
Symmetry-Aware Reinforcement Learning for Robotic Assembly under Partial Observability with a Soft Wrist |
|
Hai, Nguyen | Northeastern University |
Kozuno, Tadashi | Omron Sinic X |
Beltran-Hernandez, Cristian Camilo | Omron Sinic X |
Hamaya, Masashi | OMRON SINIC X Corporation |
Keywords: Soft Robot Applications, Modeling, Control, and Learning for Soft Robots, Compliant Assembly
Abstract: This study tackles the representative yet challenging contact-rich peg-in-hole task of robotic assembly, using a soft wrist that can operate more safely and tolerate lower-frequency control signals than a rigid one. Previous studies often use a fully observable formulation, requiring external setups or estimators for the peg-to-hole pose. In contrast, we use a partially observable formulation and deep reinforcement learning from demonstrations to learn a memory-based agent that acts purely on haptic and proprioceptive signals. Moreover, previous works do not incorporate potential domain symmetry and thus must search for solutions in a bigger space. Instead, we propose to leverage the symmetry for sample efficiency by augmenting the training data and constructing auxiliary losses to force the agent to adhere to the symmetry. Results in simulation with five different symmetric peg shapes show that our proposed agent can be comparable to or even outperform a state-based agent. In particular, the sample efficiency also allows us to learn directly on the hardware within 3 hours.
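Hypothetical sketch of symmetry-based data augmentation: a rotation about an assumed vertical peg axis is applied jointly to the haptic observation (a wrench) and the translational action, producing an additional, equally valid transition. The observation layout and symmetry axis are assumptions for illustration, not the authors' exact formulation.

import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def augment(wrench, action_xyz, theta):
    # Rotate force, torque, and translational action by the same symmetry transform.
    R = rot_z(theta)
    force, torque = wrench[:3], wrench[3:]
    return np.concatenate([R @ force, R @ torque]), R @ action_xyz

wrench = np.array([1.0, 0.2, -0.5, 0.01, 0.0, 0.03])   # fx fy fz tx ty tz
action = np.array([0.002, -0.001, -0.004])             # Cartesian displacement
aug_wrench, aug_action = augment(wrench, action, theta=np.pi / 3)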
|
|
13:30-15:00, Paper WeBT10-CC.7 | Add to My Program |
Force Estimation at the Bionic Soft Arm’s Tool-Center-Point During the Interaction with the Environment |
|
Pilch, Samuel | University of Stuttgart |
Müller, Daniel | University of Stuttgart |
Sawodny, Oliver | University of Stuttgart |
Keywords: Soft Robot Applications, Modeling, Control, and Learning for Soft Robots, Force Control
Abstract: Soft continuum robots enable new application areas in contrast to standard rigid robots, such as interaction with a varying environment. Due to their compliant continuous structure, they are inherently safe and adaptive to environmental conditions. In this paper, interaction with the environment is performed at the tool-center-point of a soft continuum manipulator and is realized by a hybrid force-position control. For this, a force estimation model is derived to substitute the force sensor at the tool-center-point. The force estimation is probabilistic and relies on normal distributions accounting for model parameters and deviations from the model identification of the soft continuum robot. It also provides a qualitative measure for the contact estimation. This paper first presents the probabilistic force estimation model and then shows the hybrid force-position control using the presented model. The results indicate that the force sensor can be replaced by the proposed estimation for environment interaction.
|
|
13:30-15:00, Paper WeBT10-CC.8 | Add to My Program |
Field-Evaluated Closed Structure Soft Gripper Enhances the Shelf Life of Harvested Blackberries |
|
Johnson, Philip. H | University of Lincoln |
Junge, Kai | École Polytechnique Fédérale De Lausanne |
Whitfield, Charles | National Institute of Agricultural Botany |
Hughes, Josie | EPFL |
Calisti, Marcello | The University of Lincoln |
Keywords: Soft Robot Applications, Robotics and Automation in Agriculture and Forestry, Grippers and Other End-Effectors
Abstract: Soft robotic grippers are intrinsically delicate while grasping objects and can rely on mechanical deformation to adapt to different shapes without explicit control. These characteristics are particularly appealing for agriculture, where items of produce from the same crop can vary significantly in shape and size, and delicate harvesting is among the first concerns for fruit quality. Various soft robotic grippers have been proposed for harvesting different produce types; however, their employment in field testing has been extremely limited. In this paper we developed the first closed-structure soft gripper for the harvest of blackberries. We adapted an existing gripper concept, initially testing it on a sensorised raspberry physical twin. We then followed grower-guided protocols to pick blackberries in farm polytunnels and evaluated shelf life in comparison with berries picked by professional human pickers. Our results with ten experimental varieties showed a picking success rate of 95.4%, demonstrating the capability of a closed-structure gripper to adapt mechanically to fruit-shape variability. Moreover, a shelf-life assessment on seven measured traits reported shelf life improved by between 30% and 150% across all traits for gripper-harvested blackberries. Our study demonstrates the potential of soft grippers for delicate fruit harvesting and indicates how to increase the impact of robotics in agriculture.
|
|
WeBT11-CC Oral Session, CC-502 |
Add to My Program |
Semantic Scene Understanding II |
|
|
Chair: Beetz, Michael | University of Bremen |
Co-Chair: Nikolakopoulos, George | Luleå University of Technology |
|
13:30-15:00, Paper WeBT11-CC.1 | Add to My Program |
Translating Universal Scene Descriptions into Knowledge Graphs for Robotic Environment |
|
Nguyen, Giang | University of Bremen |
Beßler, Daniel | Universität Bremen |
Stelter, Simon | Universität Bremen |
Pomarlan, Mihai | Universitatea Politehnica Timisoara |
Beetz, Michael | University of Bremen |
Keywords: Semantic Scene Understanding, Embodied Cognitive Science, Simulation and Animation
Abstract: Robots performing human-scale manipulation tasks require an extensive amount of knowledge about their surroundings in order to perform their actions competently and human-like. In this work, we investigate the use of virtual reality technology as an implementation for robot environment modeling, and present a technique for translating scene graphs into knowledge bases. To this end, we take advantage of the Universal Scene Description (USD) format which is an emerging standard for the authoring, visualization and simulation of complex environments. We investigate the conversion of USD-based environment models into Knowledge Graph (KG) representations that facilitate semantic querying and integration with additional knowledge sources.
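A minimal sketch of walking a USD stage and emitting knowledge-graph triples (assumptions: the pxr/OpenUSD and rdflib Python packages, a placeholder file name, and a made-up ontology namespace; this is not the authors' pipeline):

from pxr import Usd
from rdflib import Graph, Namespace, RDF, URIRef

NS = Namespace("http://example.org/scene#")   # hypothetical ontology namespace
graph = Graph()

stage = Usd.Stage.Open("kitchen_scene.usda")  # placeholder USD file
for prim in stage.Traverse():
    subject = URIRef(NS + str(prim.GetPath()).strip("/").replace("/", "_"))
    if prim.GetTypeName():
        # Record the prim's schema type as an RDF class assertion.
        graph.add((subject, RDF.type, URIRef(NS + str(prim.GetTypeName()))))
    parent = prim.GetParent()
    if parent and str(parent.GetPath()) != "/":
        # Preserve the scene-graph hierarchy as partOf relations.
        parent_ref = URIRef(NS + str(parent.GetPath()).strip("/").replace("/", "_"))
        graph.add((subject, NS.partOf, parent_ref))

graph.serialize(destination="scene.ttl", format="turtle")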
|
|
13:30-15:00, Paper WeBT11-CC.2 | Add to My Program |
SeMLaPS: Real-Time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation |
|
Wang, Jingwen | University College London |
Tarrio, Juan Jose | SLAMCore |
Agapito, Lourdes | University College London |
Fernández Alcantarilla, Pablo | SLAMcore Ltd |
Vakhitov, Alexander | Slamcore |
Keywords: Semantic Scene Understanding, Mapping, SLAM
Abstract: The availability of real-time semantics greatly improves the core geometric functionality of SLAM systems, enabling numerous robotic and AR/VR applications. We present a new methodology for real-time semantic mapping from RGB-D sequences that combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. When segmenting a new frame, we perform latent feature re-projection from previous frames based on differentiable rendering. Fusing re-projected feature maps from previous frames with current-frame features greatly improves image segmentation quality, compared to a baseline that processes images independently. For 3D map processing, we propose a novel geometric quasi-planar over-segmentation method that groups 3D map elements likely to belong to the same semantic classes, relying on surface normals. We also describe a novel neural network design for lightweight semantic map post-processing. Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems and matches the performance of 3D convolutional networks on three real indoor datasets, while working in real-time. Moreover, it shows better cross-sensor generalization abilities compared to 3D CNNs, enabling training and inference with different depth sensors. Code and data will be available upon paper acceptance.
|
|
13:30-15:00, Paper WeBT11-CC.3 | Add to My Program |
Robotic Exploration through Semantic Topometric Mapping |
|
Fredriksson, Scott | Luleå University of Technology |
Saradagi, Akshit | Luleå University of Technology, Luleå, Sweden |
Nikolakopoulos, George | Luleå University of Technology |
Keywords: Semantic Scene Understanding, Motion and Path Planning
Abstract: In this article, we introduce a novel strategy for robotic exploration in unknown environments using a semantic topometric map. As it will be presented, the semantic topometric map is generated by segmenting the grid map of the currently explored parts of the environment into regions, such as intersections, pathways, dead-ends, and unexplored frontiers, which constitute the structural semantics of an environment. The proposed exploration strategy leverages metric information of the frontier, such as distance and angle to the frontier, similar to existing frameworks, with the key difference being the additional utilization of structural semantic information, such as properties of the intersections leading to frontiers. The algorithm for generating semantic topometric mapping utilized by the proposed method is lightweight, resulting in the method's online execution being both rapid and computationally efficient. Moreover, the proposed framework can be applied to both structured and unstructured indoor and outdoor environments, which enhances the versatility of the proposed exploration algorithm. We validate our exploration strategy and demonstrate the utility of structural semantics in exploration in two complex indoor environments by utilizing a Turtlebot3 as the robotic agent. Compared to traditional frontier-based methods, our findings indicate that the proposed approach leads to faster exploration and requires less computation time.
|
|
13:30-15:00, Paper WeBT11-CC.4 | Add to My Program |
Open-Fusion: Real-Time Open-Vocabulary 3D Mapping and Queryable Scene Representation |
|
Yamazaki, Kashu | University of Arkansas |
Hanyu, Taisei | University of Arkansas |
Vo, Khoa | University of Arkansas |
Pham, Trong Thang | University of Arkansas |
Minh, Tran | University of Arkansas |
Doretto, Gianfranco | West Virginia University |
Nguyen, Anh | University of Liverpool |
Le, Ngan | University of Arkansas |
Keywords: Semantic Scene Understanding, Mapping, Localization
Abstract: Precise 3D environmental mapping with semantics is essential in robotics. Existing methods often rely on predefined concepts during training or are time-intensive when generating semantic maps. This paper presents Open-Fusion, an approach for real-time open-vocabulary 3D mapping and queryable scene representation using RGB-D data. Open-Fusion harnesses the power of a pretrained vision-language foundation model (VLFM) for open-set semantic comprehension and employs the Truncated Signed Distance Function (TSDF) for swift 3D scene reconstruction. By leveraging the VLFM, we extract region-based embeddings and their associated confidence maps. These are then integrated with the 3D knowledge from TSDF using an enhanced Hungarian-based feature matching mechanism. In particular, Open-Fusion delivers outstanding annotation-free 3D segmentation for open-vocabulary query without the need for additional 3D training. Benchmark tests on the ScanNet dataset against leading zero-shot methods highlight Open-Fusion's superiority. Furthermore, it seamlessly combines the strengths of region-based VLFM and TSDF, facilitating real-time 3D scene comprehension that includes object concepts and open-world semantics. We encourage the readers to view the demos on our project page: https://uark-aicv.github.io/OpenFusion
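As a generic illustration of the Hungarian-style association step (a sketch, not Open-Fusion's implementation), per-frame region embeddings can be matched to stored map embeddings by minimizing a cosine-distance cost:

# Hedged sketch: associate current-frame region embeddings with stored map embeddings
# via the Hungarian algorithm over a cosine-distance cost matrix (illustrative only).
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_regions(frame_emb, map_emb):
    """frame_emb: (N, D) region embeddings; map_emb: (M, D) stored embeddings."""
    f = frame_emb / (np.linalg.norm(frame_emb, axis=1, keepdims=True) + 1e-8)
    m = map_emb / (np.linalg.norm(map_emb, axis=1, keepdims=True) + 1e-8)
    cost = 1.0 - f @ m.T                       # cosine distance per (frame, map) pair
    rows, cols = linear_sum_assignment(cost)   # optimal one-to-one assignment
    return list(zip(rows.tolist(), cols.tolist())), cost[rows, cols]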
|
|
13:30-15:00, Paper WeBT11-CC.5 | Add to My Program |
Mask4Former: Mask Transformer for 4D Panoptic Segmentation |
|
Yilmaz, Kadir | RWTH Aachen University |
Schult, Jonas | RWTH Aachen University |
Nekrasov, Alexey | RWTH Aachen University |
Leibe, Bastian | RWTH Aachen University |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization
Abstract: Accurately perceiving and tracking instances over time is vital for the decision-making processes of autonomous agents safely interacting in dynamic environments. With this intention, we propose Mask4Former for the challenging task of 4D panoptic segmentation of LiDAR point clouds. Mask4Former is the first transformer-based approach unifying semantic instance segmentation and tracking of sparse and irregular sequences of 3D point clouds into a single joint model. Our model directly predicts semantic instances and their temporal associations without relying on any hand-engineered non-learned association strategies such as probabilistic clustering or voting-based center predictions. Instead, Mask4Former introduces spatio-temporal instance queries which encode the semantic and geometric properties of each semantic tracklet in the sequence. In an in-depth study, we discover that it is critical to promote spatially compact instance predictions as spatio-temporal instance queries tend to merge multiple semantically similar instances, even if they are spatially distant. To this end, we regress 6-DOF bounding box parameters from spatio-temporal instance queries, which is used as an auxiliary task to foster spatially compact predictions. Mask4Former achieves a new state-of-the-art on SemanticKITTI test with a score of 68.4 LSTQ.
|
|
13:30-15:00, Paper WeBT11-CC.6 | Add to My Program |
Mask4D: End-To-End Mask-Based 4D Panoptic Segmentation for LiDAR Sequences |
|
Marcuzzi, Rodrigo | University of Bonn |
Nunes, Lucas | University of Bonn |
Wiesmann, Louis | University of Bonn |
Marks, Elias Ariel | University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Semantic Scene Understanding, Deep Learning Methods
Abstract: Scene understanding is crucial for autonomous systems to reliably navigate in the real world. Panoptic segmentation of 3D LiDAR scans allows us to semantically describe a vehicle's environment by predicting semantic classes for each 3D point and to identify individual instances through different instance IDs. To describe the dynamics of the surroundings, 4D panoptic segmentation further extends this information with temporally consistent instance IDs to identify the different instances in the scans consistently over whole sequences. Previous approaches for 4D panoptic segmentation rely on post-processing steps and are often not end-to-end trainable. In this paper, we propose a novel approach that can be trained end-to-end and directly predicts a set of non-overlapping masks along with their semantic classes and instance IDs that are consistent over time without any post-processing like clustering or associations between predictions. We extend a mask-based 3D panoptic segmentation model to 4D by reusing queries that decoded instances in previous scans. This way, each query decodes the same instance over time, carries its ID, and the tracking is performed implicitly. This enables us to jointly optimize segmentation and tracking and directly supervise for 4D panoptic segmentation. We plan to provide the code and pre-trained models in case of paper acceptance.
|
|
13:30-15:00, Paper WeBT11-CC.7 | Add to My Program |
HSPNav: Hierarchical Scene Prior Learning for Visual Semantic Navigation towards Real Settings |
|
Kang, Jiaxu | Central South University |
Chen, Bolei | Central South University |
Zhong, Ping | Central South University |
Yang, Haonan | Central South University |
Yu, Sheng | Central South University |
Wang, Jianxin | Central South University |
Keywords: Semantic Scene Understanding, Embodied Cognitive Science, Vision-Based Navigation
Abstract: Visual Semantic Navigation (VSN) aims at navigating a robot to a given target object in a previously unseen scene. To tackle this task, the robot must learn a nimble navigation policy by utilizing spatial patterns and semantic co-occurrence relations among objects in the scene. Prevailing approaches extract scene priors from the instant visual observations and solidify them in neural episodic memory to achieve flexible navigation. However, due to the forgetting and underuse of the scene priors, these methods are plagued by repeated exploration, effective-knowledge sparsity, and wrong decisions. To alleviate these issues, we propose a novel VSN policy, HSPNav, based on Hierarchical Scene Priors (HSP) and Deep Reinforcement Learning (DRL). The HSP contains two components, i.e., the egocentric semantic map-based Local Scene Priors (LSP) and the commonsense relational graph-based Global Scene Priors (GSP). Then, efficient semantic navigation is achieved by employing an immediate LSP to retrieve conducive contextual memories from the GSP. By utilizing the MP3D dataset, the experimental results in the Habitat simulator demonstrate that our HSP brings a significant boost over the baselines. Furthermore, we take an essential step from simulation to reality by bridging the gap from Habitat to ROS. The migration evaluations show that HSPNav can generalize to realistic settings well and achieve promising performance.
|
|
13:30-15:00, Paper WeBT11-CC.8 | Add to My Program |
Belief Scene Graphs: Expanding Partial Scenes with Objects through Computation of Expectation |
|
Valdes Saucedo, Mario Alberto | Luleå University of Technology |
Patel, Akash | Luleå University of Technology |
Saradagi, Akshit | Luleå University of Technology, Luleå, Sweden |
Kanellakis, Christoforos | LTU |
Nikolakopoulos, George | Luleå University of Technology |
Keywords: Semantic Scene Understanding, Learning Categories and Concepts, AI-Enabled Robotics
Abstract: In this article, we propose the novel concept of Belief Scene Graphs, which are utility-driven extensions of partial 3D scene graphs, that enable efficient high-level task planning with partial information. We propose a graph-based learning methodology for the computation of belief (also referred to as expectation) on any given 3D scene graph, which is then used to strategically add new nodes (referred to as blind nodes) that are relevant to a robotic mission. We propose the method of Computation of Expectation based on Correlation Information (CECI), to reasonably approximate real Belief/Expectation, by learning histograms from available training data. A novel Graph Convolutional Neural Network (GCN) model is developed, to learn CECI from a repository of 3D scene graphs. As no database of 3D scene graphs exists for the training of the novel CECI model, we present a novel methodology for generating a 3D scene graph dataset based on semantically annotated real-life 3D spaces. The generated dataset is then utilized to train the proposed CECI model and for extensive validation of the proposed method. We establish the novel concept of Belief Scene Graphs (BSG) as a core component to integrate expectations into abstract representations. This new concept is an evolution of the classical 3D scene graph concept and aims to enable high-level reasoning for task planning and optimization of a variety of robotics missions. The efficacy of the overall framework has been evaluated in an object search scenario, and has also been tested in a real-life experiment to emulate human common sense of unseen objects. For a video of the article, showcasing the experimental demonstration, please refer to the following link: https://youtu.be/hsGlSCa12iY
|
|
13:30-15:00, Paper WeBT11-CC.9 | Add to My Program |
A Guided Gaussian-Dirichlet Random Field for Scientist-In-The-Loop Inference in Underwater Robotics |
|
Samuelson, Chad | University |
Mangelson, Joshua | Brigham Young University |
Keywords: Semantic Scene Understanding, Human Factors and Human-in-the-Loop, Marine Robotics
Abstract: Visual topic modeling (VTM) provides key insight into data sets based on learned semantic topic models. The Gaussian-Dirichlet Random Field (GDRF), a state-of-the-art VTM technique, models these semantic topics in continuous space as densities. However, ambiguity in learned topics is a disadvantage of such Dirichlet-based VTM algorithms. We propose the Guided Gaussian-Dirichlet Random Field (GGDRF). Our method applies Dirichlet Forest priors from natural language processing (NLP) to the vision domain as a way to embed visual scientific knowledge into the estimation process. This modification and addition to the GDRF provides a key shift from unsupervised machine learning to semi-supervised machine learning in the robotic VTM domain. We show through simulation and real-world underwater data that the proposed GGDRF outperforms the previous GDRF method both quantitatively and qualitatively by improving alignment between estimated topics and scientific interests.
|
|
WeBT12-CC Oral Session, CC-503 |
Add to My Program |
Transfer Learning |
|
|
Chair: Yang, Jianfei | Nanyang Technological University |
Co-Chair: Sóti, Gergely | Karlsruhe University of Applied Sciences |
|
13:30-15:00, Paper WeBT12-CC.1 | Add to My Program |
Fine-Tuning Point Cloud Transformers with Dynamic Aggregation |
|
Fei, Jiajun | Tsinghua University |
Deng, Zhidong | Tsinghua University |
Keywords: Transfer Learning, Representation Learning, Object Detection, Segmentation and Categorization
Abstract: Point clouds play an important role in 3D analysis, which has broad applications in robotics and autonomous driving. The pre-training fine-tuning paradigm has shown great potential in the point cloud domain. Full fine-tuning is generally effective but leads to a heavy storage and computational burden, which becomes inefficient and unacceptable as the size of pretrained models scales. Although efficient fine-tuning approaches have made significant progress in other domains, they generally perform worse for point clouds. To overcome this dilemma, we revisit the official Point-MAE implementation and find the critical role of aggregation in fine-tuning performance. Inspired by such discoveries, we propose a novel dynamic aggregation (DA) method to replace previous static aggregation like mean or max pooling for pre-trained point cloud Transformers. Besides standard metrics such as accuracy or mIoU, we evaluate the number of tunable parameters and additional FLOPs for a fair comparison of our method to different fine-tuning approaches. We construct several DA variants and validate them through extensive experiments. Experimental results demonstrate that DA has competitive performance against full fine-tuning and other efficient fine-tuning approaches. The code is publicly available at https://github.com/JaronTHU/DynamicAggregation.
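For intuition only, the sketch below shows one way a learned, attention-weighted pooling could replace static max/mean pooling over point tokens; it is a generic PyTorch illustration under that assumption, not one of the paper's DA variants.

# Hedged sketch: attention-weighted pooling over point tokens as a stand-in for
# static max/mean aggregation (generic; the feature dimension 384 is illustrative).
import torch
import torch.nn as nn

class LearnedAggregation(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)             # one scalar score per token

    def forward(self, tokens):
        # tokens: (B, N, D) point-token features from a pre-trained backbone
        w = torch.softmax(self.score(tokens), dim=1)   # (B, N, 1) attention weights
        return (w * tokens).sum(dim=1)                 # (B, D) aggregated feature

# Usage: feat = LearnedAggregation(384)(torch.randn(2, 1024, 384))
# compared against static pooling such as tokens.max(dim=1).values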
|
|
13:30-15:00, Paper WeBT12-CC.2 | Add to My Program |
MoPA: Multi-Modal Prior Aided Domain Adaptation for 3D Semantic Segmentation |
|
Cao, Haozhi | Nanyang Technological University |
Xu, Yuecong | National University of Singapore |
Yang, Jianfei | Nanyang Technological University |
Yin, Pengyu | Nanyang Technological University |
Yuan, Shenghai | Nanyang Technological University |
Xie, Lihua | Nanyang Technological University |
Keywords: Transfer Learning, Object Detection, Segmentation and Categorization, Deep Learning Methods
Abstract: Multi-modal unsupervised domain adaptation (MM-UDA) for 3D semantic segmentation is a practical solution to embed semantic understanding in autonomous systems without expensive point-wise annotations. While previous MM-UDA methods can achieve overall improvement, they suffer from significant class-imbalanced performance, restricting their adoption in real applications. This imbalanced performance is mainly caused by: 1) self-training with imbalanced data and 2) the lack of pixel-wise 2D supervision signals. In this work, we propose Multi-modal Prior Aided (MoPA) domain adaptation to improve the performance of rare objects. Specifically, we develop Valid Ground-based Insertion (VGI) to rectify the imbalanced supervision signals by inserting prior rare objects collected from the wild while avoiding introducing artificial artifacts that lead to trivial solutions. Meanwhile, our SAM consistency loss leverages the 2D prior semantic masks from SAM as pixel-wise supervision signals to encourage consistent predictions for each object in the semantic mask. The knowledge learned from modal-specific priors is then shared across modalities to achieve better rare object segmentation. Extensive experiments show that our method achieves state-of-the-art performance on the challenging MM-UDA benchmark. Code will be available at https://github.com/AronCao49/MoPA.
|
|
13:30-15:00, Paper WeBT12-CC.3 | Add to My Program |
Cross Domain Policy Transfer with Effect Cycle-Consistency |
|
Zhu, Ruiqi | King's College London |
Dai, Tianhong | Imperial College London |
Celiktutan, Oya | King's College London |
Keywords: Transfer Learning, Machine Learning for Robot Control
Abstract: Training a robotic policy from scratch using deep reinforcement learning methods can be prohibitively expensive due to sample inefficiency. To address this challenge, transferring a pre-trained policy from the source domain to the target domain becomes an attractive solution. Previous research has typically focused on domains with similar state and action spaces but differing in other aspects. In this paper, our primary focus lies in domains with different state and action spaces, which has broader practical implications, i.e., transferring the policy from robot A to robot B. Unlike prior methods that rely on paired data, we propose a novel approach for learning the mapping functions between state and action spaces across domains with unpaired data. We propose effect cycle-consistency, which aligns the effects of transitions across two domains through a symmetrical optimization structure for learning these mapping functions. Once the mapping functions are learned, we can seamlessly transfer the policy from the source domain to the target domain without the need for additional fine-tuning. Our approach has been tested through experiments conducted on three locomotion tasks and two robotic manipulation tasks. The empirical results demonstrate that our method not only achieves better performance for the transferred policies but also reduces alignment errors significantly compared to the baselines.
|
|
13:30-15:00, Paper WeBT12-CC.4 | Add to My Program |
Parameter-Efficient Prompt Learning for 3D Point Cloud Understanding |
|
Sun, Hongyu | Renmin University of China |
Wang, Yongcai | Renmin University of China |
Chen, Wang | Renmin University of China |
Deng, Haoran | Renmin University of China |
Li, Deying | Renmin University of China |
Keywords: Transfer Learning
Abstract: This paper presents a parameter-efficient prompt tuning method, named PPT, to adapt a large multi-modal model for 3D point cloud understanding. Existing strategies are quite expensive in computation and storage, and depend on time-consuming prompt engineering. We address the problems from three aspects. Firstly, a PromptLearner module is devised to replace hand-crafted prompts with learnable contexts to automate the prompt tuning process. Then, we lock the pre-trained backbone instead of adopting the full fine-tuning paradigm to substantially improve the parameter efficiency. Finally, a lightweight PointAdapter module is arranged near target tasks to enhance prompt tuning for 3D point cloud understanding. Comprehensive experiments are conducted to demonstrate the superior parameter and data efficiency of the proposed method. Meanwhile, we obtain new records on 4 public datasets and multiple 3D tasks, i.e., point cloud recognition, few-shot learning, and part segmentation. The implementation is available at https://github.com/auniquesun/PPT.
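The sketch below illustrates the general pattern of learnable prompt contexts with a locked backbone (generic parameter-efficient tuning in PyTorch, not the released PPT code); the prompt length and initialization scale are illustrative assumptions.

# Hedged sketch: learnable prompt vectors prepended to token embeddings while the
# pre-trained backbone stays frozen (generic pattern; n_prompts and 0.02 are placeholders).
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    def __init__(self, backbone, dim, n_prompts=8):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():        # lock the pre-trained weights
            p.requires_grad_(False)
        self.prompts = nn.Parameter(0.02 * torch.randn(n_prompts, dim))

    def forward(self, token_emb):
        # token_emb: (B, N, D) embeddings of the input tokens (e.g., point patches)
        b = token_emb.shape[0]
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        return self.backbone(torch.cat([prompts, token_emb], dim=1))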
|
|
13:30-15:00, Paper WeBT12-CC.5 | Add to My Program |
BEVUDA: Multi-Geometric Space Alignments for Domain Adaptive BEV 3D Object Detection |
|
Liu, Jiaming | Peking University |
Zhang, Rongyu | Nanjing University |
Li, Xiaoqi | Peking University |
Chi, Xiaowei | Hong Kong University of Science and Technology |
Chen, Zehui | University of Science and Technology of China |
Lu, Ming | Intel Labs |
Guo, Yandong | OPPO Research Institute |
Zhang, Shanghang | Peking University |
Keywords: Transfer Learning, Deep Learning for Visual Perception
Abstract: Vision-centric bird-eye-view (BEV) perception has shown promising potential in autonomous driving. Recent works mainly focus on improving efficiency or accuracy but neglect the challenges posed by changing environments, resulting in severe degradation of transfer performance. For BEV perception, we identify the significant domain gaps existing in typical real-world cross-domain scenarios and make the first attempt to solve the Domain Adaptation (DA) problem for multi-view 3D object detection. Since BEV perception approaches are complicated and contain several components, the domain shift accumulation over multiple geometric spaces (i.e., 2D, 3D Voxel, BEV) makes BEV DA even more challenging. In this paper, we propose a Multi-space Alignment Teacher-Student (MATS) framework to ease the domain shift accumulation, which consists of a Depth-Aware Teacher (DAT) and a Geometric-space Aligned Student (GAS) model. DAT tactfully combines target lidar and reliable depth prediction to construct depth-aware information, extracting target domain-specific knowledge in Voxel and BEV feature spaces. It then transfers the sufficient domain knowledge of multiple spaces to the student model. In order to jointly alleviate the domain shift, GAS projects multi-geometric space features to a shared geometric embedding space and decreases the data distribution distance between the two domains. To verify the effectiveness of our method, we conduct BEV 3D object detection experiments on three cross-domain scenarios and achieve state-of-the-art performance. The code will be released at https://github.com/liujiaming1996/BEV_UDA.
|
|
13:30-15:00, Paper WeBT12-CC.6 | Add to My Program |
6-DOF Grasp Pose Evaluation and Optimization Via Transfer Learning from NeRFs |
|
Sóti, Gergely | Karlsruhe University of Applied Sciences |
Huang, Xi | Karlsruhe Institute of Technology |
Wurll, Christian | Karlsruhe University of Applied Sciences |
Hein, Björn | Karlsruhe University of Applied Sciences |
Keywords: Transfer Learning, Representation Learning, Grasping
Abstract: We address the problem of robotic grasping of known and unknown objects using implicit behavior cloning. We train a grasp evaluation model from a small number of demonstrations that outputs higher values for grasp candidates that are more likely to succeed in grasping. This evaluation model serves as an objective function, that we maximize to identify successful grasps. Key to our approach is the utilization of learned implicit representations of visual and geometric features derived from a pre-trained NeRF. Though trained exclusively in a simulated environment with simplified objects and 4-DoF top-down grasps, our evaluation model and optimization procedure demonstrate generalization to 6-DoF grasps and novel objects both in simulation and in real-world settings, without the need for additional data. Supplementary material is available at: https://gergely-soti.github.io/grasp
|
|
13:30-15:00, Paper WeBT12-CC.7 | Add to My Program |
Multi-Level Progressive Reinforcement Learning for Control Policy in Physical Simulations |
|
Wu, Kefei | ShanghaiTech University |
He, Xuming | ShanghaiTech University |
Wang, Yang | ShanghaiTech University |
Liu, Xiaopei | ShanghaiTech University |
Keywords: Transfer Learning, Reinforcement Learning, Simulation and Animation
Abstract: Training model-free intelligent agents in complex real-world scenarios using reinforcement learning (RL) often necessitates simulation-based environments due to high physical expenses. However, when simulation takes a long time, e.g., in an unsteady 3D fluid simulation with interactions with controllable solids, existing RL algorithms have difficulty accomplishing training within a reasonable timeframe. In this paper, we propose a novel multi-level framework for RL to accelerate convergence as a first attempt to address this difficulty. Motivated by the idea of multi-grid solvers, the control policy of a virtual agent over time can be decomposed into different frequency bands, which can be progressively learned via a set of simulations in a coarse-to-fine manner. It is expected that most RL trials are performed in coarse simulations to learn lower frequency bands with efficient convergence, while higher frequency levels require far fewer RL trials, thus significantly accelerating the learning process. To implement our idea, we designed a novel multi-level residual network with a filter module attached, where each level of the network is learned by performing RL for a given simulation resolution. The proposed framework is evaluated by conducting policy learning experiments on virtual aerial (2D) and underwater (3D) robots, both requiring time-consuming physical simulations. Our results demonstrate a 50% decrease in learning time compared to a direct RL approach, while achieving similar control performance.
|
|
13:30-15:00, Paper WeBT12-CC.8 | Add to My Program |
Kalman Filter-Based One-Shot Sim-To-Real Transfer Learning |
|
Dongqingwei, Dongqingwei | Shenyang Institute of Automation, Chinese Academy of Sciences |
Zeng, Peng | Shenyang Institute of Automation Chinese Academy of Sciences |
Wan, Guangxi | Shenyang Institute of Automation, Chinese Academy of Sciences |
He, Yunpeng | Shenyang Institute of Automation, Chinese Academy of Sciences |
Dong, Xiaoting | Shenyang Institute of Automation, Chinese Academy of Sciences |
Keywords: Transfer Learning, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Deep reinforcement learning algorithms offer a promising method for industrial robots to tackle unstructured and complex scenarios that are difficult to model. However, due to constraints related to equipment lifespan and safety requirements, acquiring a sufficient number of samples directly from the physical environment is often infeasible. With the development of increasingly realistic simulators, it has become feasible for industrial robots to acquire complex motion skills within simulated environments. Nonetheless, the "reality gap" frequently results in performance degradation when transferring policies trained in simulators to physical systems. In this paper, we treat the reality gap between a physical environment (target domain) and a simulated environment (source domain) as a Gaussian perturbation and utilize Kalman filtering to reduce the discrepancy between source and target domain data. We refine the source domain controller using target domain data to enhance the controller's adaptability to the target domain. The efficacy of the proposed method is demonstrated in reaching tasks and peg-in-hole tasks conducted on PR2 and UR5 robotic platforms.
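To make the Kalman-filtering idea concrete under the paper's Gaussian-perturbation assumption, the following scalar sketch (illustrative, not the authors' implementation) treats each simulated state as a prediction that is corrected by a noisy real-world measurement; the noise variances q and r are placeholders.

# Hedged sketch: scalar Kalman update fusing a simulated (source-domain) state with
# a noisy real-world (target-domain) measurement; q and r are illustrative variances.
import numpy as np

def kalman_fuse(sim_states, real_meas, q=1e-3, r=1e-2):
    p = 1.0                                     # initial estimate covariance
    fused = []
    for x_sim, z in zip(sim_states, real_meas):
        x, p = x_sim, p + q                     # predict: trust the simulator's rollout
        k = p / (p + r)                         # Kalman gain
        x = x + k * (z - x)                     # correct with the real measurement
        p = (1.0 - k) * p
        fused.append(x)
    return np.asarray(fused)

# Usage with synthetic data:
# fused = kalman_fuse(np.linspace(0, 1, 50), np.linspace(0, 1, 50) + 0.05 * np.random.randn(50))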
|
|
WeBT13-AX Oral Session, AX-201 |
Add to My Program |
Human-Robot Collaboration II |
|
|
Chair: Hirata, Yasuhisa | Tohoku University |
Co-Chair: Taguchi, Shun | Toyota Central R&D Labs., Inc |
|
13:30-15:00, Paper WeBT13-AX.1 | Add to My Program |
Shaping Social Robot to Play Games with Human Demonstrations and Evaluative Feedback |
|
Zheng, Chuanxiong | Ocean University of China |
Zhang, Lei | Ocean University of China |
Wang, Hui | Ocean University of China |
Gomez, Randy | Honda Research Institute Japan Co., Ltd |
Nichols, Eric | Honda Research Institute Japan |
Li, Guangliang | Ocean University of China |
Keywords: Human-Centered Robotics, Imitation Learning, Robot Companions
Abstract: In this paper, building on recent advances in the fields of gaming AI and social robotics, we present a new approach to facilitate the social robot Haru to imitate game strategies from human players' demonstrated trajectories and evaluative feedback in a real-time two-player game. Our research shows that Haru is able to learn and imitate different game strategies from human players on a human time scale. In addition, our results show that human evaluative feedback plays an important role in allowing Haru to obtain better performance via our method than from human players' demonstrations alone. Finally, results of our user study indicate that Haru imitating human players' game strategies via our method is perceived to be more human-like and to have better game performance and experience than self-learning from pre-defined reward functions via a traditional deep reinforcement learning method.
|
|
13:30-15:00, Paper WeBT13-AX.2 | Add to My Program |
Running Guidance for Visually Impaired People Using Sensory Augmentation Technology Based Robotic System |
|
Liao, Zhenyu | Tohoku University |
Salazar Luces, Jose Victorio | Tohoku University |
Ravankar, Ankit A. | Tohoku University |
Hirata, Yasuhisa | Tohoku University |
Keywords: Human-Centered Robotics, Physically Assistive Devices, Wearable Robotics
Abstract: Participating in sports is of great significance to people's physical and mental well-being. While physical activity is commonplace for healthy individuals, it presents challenges for those with visual impairments, as it is difficult for them to rely on visual cues to perceive essential information related to sports participation, such as their surroundings. Many related studies, including our previous work on assisting users in doing sports using sensory augmentation technology, which couples haptic feedback with people's desired movements, have been proposed to address this challenge. On the basis of these studies, we propose a system for guiding visually impaired users running outdoors, using a drone-based robotic system to locate a user and a track, calculate desired moving directions, and provide haptic feedback to the user. We conduct an experiment to explore how accurately people can recognize the directions conveyed by the proposed guidance method. Subjects were asked to select their felt directions on a tablet while running on a treadmill at 6.5 km/h and 7.5 km/h. The results show subjects could recognize the cued directions with an average resolution of 19.8° and 19.6° at the two speeds, respectively, and no significant difference exists between the two speeds. In addition, we guide users in realistic running scenarios on sports tracks. Subjects in this experiment wore an eye mask to simulate visual impairment. They were instructed to run by following the perc
|
|
13:30-15:00, Paper WeBT13-AX.3 | Add to My Program |
Rider Cooperative Control of Rear-Wheel-Swing Motorcycle Based on Divergent Component of Motion |
|
Sumioka, Tadashi | Honda R&D Co., Ltd |
Akimoto, Kazushi | Honda R & D Co., Ltd |
Tsujimura, Takuya | Honda R&D Co., Ltd |
Takayanagi, Sho | Honda R&D Co., Ltd |
Fukushima, Katsuhiko | Honda R&D Co., Ltd |
Nose, Tsubasa | Honda Motor Co., Ltd |
Keywords: Human-Centered Robotics, Human-Robot Collaboration, Physically Assistive Devices
Abstract: We previously proposed a motorcycle with a balance assist function that can generate a self-balancing moment by changing the front wheel fork slant angle. Although this method reduces the risk of falling over, it makes it difficult for the rider to go in the desired direction due to interference by the front wheel assist. In order to solve this problem, this paper proposes a rear-wheel-swing assist mechanism that minimizes the influence on the front wheel steering operation. A method for realizing cooperative control with the rider using the divergent component of motion is also proposed. The results of an extremely low-speed U-turn test are used to show that the proposed methods provide stability while maintaining drivability.
|
|
13:30-15:00, Paper WeBT13-AX.4 | Add to My Program |
MORPHeus: A Multimodal One-Armed Robot-Assisted Peeling System with Human Users In-The-Loop |
|
Ye, Ruolin | Cornell University |
Hu, Yifei | Cornell University |
Bian, Yuhan (Anjelica) | Cornell University |
Kulm, Luke | Cornell University |
Bhattacharjee, Tapomayukh | Cornell University |
Keywords: Human-Centered Robotics, Perception for Grasping and Manipulation, Physically Assistive Devices
Abstract: Meal preparation is an important instrumental activity of daily living (IADL). While existing research has explored robotic assistance in meal preparation tasks like cutting and cooking, the crucial task of peeling has received less attention. Peeling, conventionally a bimanual task, is challenging for care recipients using one robot arm mounted on their wheelchair due to ergonomic and transferring challenges. This paper introduces a real-world robot-assisted peeling system utilizing a single robotic arm and an assistive cutting board, inspired by how individuals with one functional hand do meal preparation. Our system incorporates a multimodal active perception module, a human-in-the-loop long-horizon planning through a natural language interface, and a compliant controller for adaptive motion. Our robot-assisted peeling system uses visual, haptic, and vibration sensing modalities to peel a diverse range of food items with varying physical properties and can successfully adapt to different environments featuring multiple specialized cutting boards. Videos and supplementary materials are available at https://emprise.cs.cornell.edu/morpheus/.
|
|
13:30-15:00, Paper WeBT13-AX.5 | Add to My Program |
MIntNet: Rapid Motion Intention Forecasting of Coupled Human-Robot Systems with Simulation-To-Real Autoregressive Neural Networks |
|
Atkins, John | Arizona State University |
Lee, Hyunglae | Arizona State University |
Keywords: Human-Centered Robotics, Physical Human-Robot Interaction
Abstract: This paper describes the use of a simulation-to-real training pipeline using autoregressive neural networks (MIntNet) for coupled human-robot motion intention prediction. Using only general prior knowledge about the interaction task, a large simulation dataset was generated and used to train a multi-output variation of the classic autoregressive model. The network used an encoding-decoding method to construct condensed representations of the coupled system kinematics over a sequence of time windows and generated their condensed latent representations to predict multiple sequences of the future system states. This method was then tested on 10 real human subjects' data for the interaction task, and the simulation-to-real generalization performance was evaluated for the proposed network along with alternative implementations of standard multilayered perceptron, convolutional, and long-short term memory based networks. Results show the proposed network has better generalization performance compared to the alternatives, capable of closely predicting positions during fast motion along non-constant curvatures subject to low-frequency disturbances. The MIntNet was able to accurately predict future positions in a 200 ms window with errors of 3.1 ± 4.8 mm averaged over the prediction window with inference times of 0.26 ± 0.44 ms. Performance was higher for short range predictions, with errors over the time window growing as 2.3 ± 3.4 mm at 50 ms, 2.4 ± 4.4 mm at 100 ms, and 5.5 ± 6.7 mm.
|
|
13:30-15:00, Paper WeBT13-AX.6 | Add to My Program |
Language to Map: Topological Map Generation from Natural Language Path Instructions |
|
Deguchi, Hideki | Toyota Central R&D Labs., Inc |
Shibata, Kazuki | Toyota Central R&D Labs., INC |
Taguchi, Shun | Toyota Central R&D Labs., Inc |
Keywords: Human-Centered Robotics, Mapping, Human-Robot Collaboration
Abstract: In this paper, a method for generating a map from path information described using natural language (a textual path) is proposed. In recent years, robotics research has mainly focused on vision-and-language navigation (VLN), a navigation task based on images and textual paths. Although VLN is expected to facilitate user instructions to robots, its current implementation requires users to explain the details of the path for each navigation session, which results in high explanation costs for users. To solve this problem, we propose a method that creates a topological map from a textual path and automatically creates a new path using this map. We believe that large language models (LLMs) can be used to understand textual paths. Therefore, we propose and evaluate two methods, one for storing implicit maps in LLMs, and the other for generating explicit maps using LLMs. The implicit map is in the LLM's memory. It is created using prompts. In the explicit map, a topological map composed of nodes and edges is constructed and the actions at each node are stored. This makes it possible to estimate the path and actions at waypoints on an undescribed path, if enough information is available. Experimental results on path instructions generated in a real environment demonstrate that generating explicit maps achieves significantly higher accuracy than storing implicit maps in the LLMs.
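As a rough illustration of what an explicit topological map could look like (hypothetical node names and actions, not the paper's representation), waypoints parsed from a textual path can be stored as a directed graph whose nodes carry the action to take there:

# Hedged sketch: an explicit topological map as a directed graph; node names and
# actions below are made-up examples for illustration only.
import networkx as nx

G = nx.DiGraph()
G.add_node("entrance", action="go straight")
G.add_node("corridor_junction", action="turn left")
G.add_node("kitchen_door", action="stop")
G.add_edge("entrance", "corridor_junction")
G.add_edge("corridor_junction", "kitchen_door")

# A new path can be read off the map instead of being re-described in language:
path = nx.shortest_path(G, "entrance", "kitchen_door")
print([(n, G.nodes[n]["action"]) for n in path])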
|
|
13:30-15:00, Paper WeBT13-AX.7 | Add to My Program |
Human-Centered Autonomy for UAS Target Search |
|
Ray, Hunter | University of Colorado Boulder |
Laouar, Zakariya | University of Colorado |
Sunberg, Zachary | University of Colorado |
Ahmed, Nisar | University of Colorado Boulder |
Keywords: Human-Centered Robotics, Planning under Uncertainty, Aerial Systems: Applications
Abstract: Current methods of deploying robots that operate in dynamic, uncertain environments, such as Uncrewed Aerial Systems in search & rescue missions, require nearly continuous human supervision for vehicle guidance and operation. These methods do not consider high-level mission context, resulting in cumbersome manual operation or inefficient exhaustive search patterns. We present a human-centered autonomous framework that infers geospatial mission context through dynamic feature sets, which then guides a probabilistic target search planner. Operators provide a set of diverse inputs, including priority definition, spatial semantic information about ad-hoc geographical areas, and reference waypoints, which are probabilistically fused with geographical database information and condensed into a geospatial distribution representing an operator's preferences over an area. An online, POMDP-based planner, optimized for target searching, is augmented with this reward map to generate an operator-constrained policy. Our results, simulated based on input from five professional rescuers, display effective task mental model alignment, 18% more victim finds, and 15 times more efficient guidance plans than current operational methods.
|
|
13:30-15:00, Paper WeBT13-AX.8 | Add to My Program |
Gaze-Based Human-Robot Interaction System for Infrastructure Inspections |
|
Choi, Sunwoong | University of California, Los Angeles |
Al-sabbag, Zaid Abbas | University of Waterloo |
Narasimhan, Sriram | University of California, Los Angeles |
Yeum, Chul Min | University of Waterloo |
Keywords: Human-Centered Robotics, Intention Recognition, Robotics and Automation in Construction
Abstract: Routine inspections for critical infrastructures such as bridges are required in most jurisdictions worldwide. Such routine inspections are largely visual in nature, which are qualitative, subjective, and not repeatable. Although robotic infrastructure inspections address such limitations, they cannot replace the superior ability of experts to make decisions in complex situations, thus making human-robot interaction systems a promising technology. This study presents a novel gaze-based human-robot interaction system, designed to augment the visual inspection performance through mixed reality. Through holograms from a mixed reality device, gaze can be utilized effectively to estimate the properties of the defect in real-time. Additionally, inspectors can monitor the inspection progress online, which enhances the speed of the entire inspection process. Limited controlled experiments demonstrate its effectiveness across various users and defect types. To our knowledge, this is the first demonstration of the real-time application of eye gaze in civil infrastructure inspections.
|
|
13:30-15:00, Paper WeBT13-AX.9 | Add to My Program |
Facile Integration of Robots into Experimental Orchestration at Scientific User Facilities |
|
Fernando, Warnakulasuriya Chandima | Brookhaven National Lab |
Campbell, Stuart | Brookhaven National Laboratory |
Olds, Daniel | Brookhaven National Laboratory |
Maffettone, Phillip | Brookhaven National Laboratory |
Keywords: Software-Hardware Integration for Robot Systems, Software Architecture for Robotic and Automation, Robotics and Automation in Life Sciences
Abstract: Integration of robots into scientific user facilities, such as the National Synchrotron Light Source II, improves their efficiency and capacity. Many such facilities use the open-source Bluesky project for experimental control and orchestration. However, there remains an open challenge in deploying robotic solutions at these facilities that are reconfigurable, extensible, and compatible with pre-existing software infrastructure. Herein, we introduce a framework that uses the Robot Operating System 2 (ROS 2) and Bluesky to provide extensible robotic applications, while working under the operational constraints of a large-scale user facility. We demonstrated this framework by integrating a robotic arm to pick and place a sample holder at a beamline, recording a 90% repeatability rate. This provides the groundwork for further new robotics applications at large-scale scientific user facilities that depend on Bluesky.
|
|
WeBT15-AX Oral Session, AX-203 |
Add to My Program |
Human Factors and Human-In-The-Loop II |
|
|
Chair: Maruyama, Hisataka | Nagoya University |
Co-Chair: Chrysostomou, Dimitrios | Aalborg University |
|
13:30-15:00, Paper WeBT15-AX.1 | Add to My Program |
SEQUEL: Semi-Supervised Preference-Based RL with Query Synthesis Via Latent Interpolation |
|
Marta, Daniel | KTH Royal Institute of Technology |
Holk, Simon | KTH Royal Institute of Technology |
Pek, Christian | Delft University of Technology |
Leite, Iolanda | KTH Royal Institute of Technology |
Keywords: Human Factors and Human-in-the-Loop, Reinforcement Learning, Representation Learning
Abstract: Preference-based reinforcement learning (RL) has emerged as a recent research direction in robot learning, allowing humans to teach robots through preferences on pairs of desired behaviours. Nonetheless, to obtain realistic robot policies, an arbitrarily large number of queries must be answered by humans. In this work, we approach the sample-efficiency challenge by presenting a technique which synthesizes queries from a semi-supervised learning perspective. To achieve this, we leverage latent variational autoencoder (VAE) representations of trajectory segments (sequences of state-action pairs). Our approach manages to produce queries which are closely aligned with those labeled by humans, while avoiding excessive uncertainty according to the human preference predictions as determined by reward estimations. Additionally, by introducing variation without deviating from the original human's intents, more robust reward function representations are achieved. We compare our approach to recent state-of-the-art preference-based RL semi-supervised learning techniques. Our experimental findings reveal that we can enhance the generalization of the estimated reward function without requiring additional human intervention. Lastly, to confirm the practical applicability of our approach, we conduct experiments involving actual human users in a simulated social navigation setting. Videos of the experiments can be found at https://sites.google.com/view/rl-sequel
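A minimal sketch of query synthesis via latent interpolation, assuming a generic VAE encoder/decoder interface that returns a latent mean and log-variance (an assumption about the interface, not the paper's API):

# Hedged sketch: synthesize a candidate preference query by interpolating between the
# latent codes of two trajectory segments; encoder/decoder signatures are assumed.
import torch

def synthesize_segment(encoder, decoder, seg_a, seg_b, t=0.5):
    # seg_a, seg_b: (1, T, state_dim + action_dim) trajectory segments
    mu_a, _ = encoder(seg_a)            # assumed to return (mean, logvar)
    mu_b, _ = encoder(seg_b)
    z = (1.0 - t) * mu_a + t * mu_b     # linear interpolation in latent space
    return decoder(z)                   # decoded synthetic segment to query about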
|
|
13:30-15:00, Paper WeBT15-AX.2 | Add to My Program |
Learning When to Ask for Help: Efficient Interactive Navigation Via Implicit Uncertainty Estimation |
|
Igbinedion, Ifueko | Massachusetts Institute of Technology |
Karaman, Sertac | Massachusetts Institute of Technology |
Keywords: Human Factors and Human-in-the-Loop, Vision-Based Navigation, Reinforcement Learning
Abstract: Robots operating alongside humans often encounter unfamiliar environments that make autonomous task completion challenging. Though improving models and increasing dataset size can enhance a robot's performance in unseen environments, data collection and model refinement may be impractical in every environment. Approaches that utilize human demonstrations through manual operation can aid in refinement and generalization, but often require significant data collection efforts to generate enough demonstration data to achieve satisfactory task performance. Interactive approaches allow for humans to provide correction to robot action in real time, but intervention policies are often based on explicit factors related to state and task understanding that may be difficult to generalize. Addressing these challenges, we train a lightweight interaction policy that allows robots to decide when to proceed autonomously or request expert assistance at estimated times of uncertainty. An implicit estimate of uncertainty is learned via evaluating the feature extraction capabilities of the robot's visual navigation policy. By incorporating part-time human interaction, robots recover quickly from their mistakes, significantly improving the odds of task completion. Incorporating part-time interaction yields an increase in success of 0.38 with only a 0.3 expert interaction rate within the Habitat simulation environment using a simulated human expert. We further show success transferring this approach to a new domain with a real human expert, improving success from less than 0.1 with an autonomous agent to 0.92 with a 0.23 human interaction rate. This approach provides a practical means for robots to interact and learn from humans in real-world settings.
|
|
13:30-15:00, Paper WeBT15-AX.3 | Add to My Program |
JaywalkerVR: A VR System for Collecting Safety-Critical Pedestrian-Vehicle Interactions |
|
Mukoya, Kenta | Carnegie Mellon University |
Weng, Erica | Carnegie Mellon University |
Choudhury, Rohan | Carnegie Mellon University |
Kitani, Kris | Carnegie Mellon University |
Keywords: Virtual Reality and Interfaces, Motion and Path Planning, Human Factors and Human-in-the-Loop
Abstract: Developing autonomous vehicles that can safely interact with pedestrians requires large amounts of pedestrian and vehicle data in order to learn accurate pedestrian-vehicle interaction models. However, gathering data that include crucial but rare scenarios - such as pedestrians jaywalking into heavy traffic - can be costly and unsafe to collect. We propose a virtual reality human-in-the-loop simulator, JaywalkerVR, to obtain vehicle-pedestrian interaction data to address these challenges. Our system enables efficient, affordable, and safe collection of long-tail pedestrian-vehicle interaction data. Using our proposed simulator, we create a high-quality dataset with vehicle-pedestrian interaction data from safety critical scenarios called CARLA-VR. The CARLA-VR dataset addresses the lack of long-tail data samples in commonly used real world autonomous driving datasets. We demonstrate that models trained with CARLA-VR improve displacement error and collision rate by 10.7% and 4.9%, respectively, and are more robust in rare vehicle-pedestrian scenarios.
|
|
13:30-15:00, Paper WeBT15-AX.4 | Add to My Program |
Human Preference-Aware Rebalancing and Charging for Shared Electric Micromobility Vehicles |
|
Tan, Heng | Lehigh University |
Yuan, Yukun | University of Tennessee at Chattanooga |
Yan, Hua | LEHIGH UNIVERSITY |
Zhong, Shuxin | Rutgers University |
Yang, Yu | Lehigh University |
Keywords: Intelligent Transportation Systems, Human Factors and Human-in-the-Loop, Reinforcement Learning
Abstract: Shared electric micromobility has surged in popularity as a model of urban transportation due to its efficiency in short-distance trips and its environmentally friendly characteristics compared to traditional automobiles. However, managing thousands of shared electric micromobility vehicles, including rebalancing and charging them to meet users' travel demands, remains a challenge. Existing methods generally ignore human preferences in vehicle selection and assume all nearby vehicles have an equal chance of being selected, which is unrealistic based on our findings. To address this problem, we design PERCEIVE, a human preference-aware rebalancing and charging framework for shared electric micromobility vehicles. Specifically, we model human preferences in vehicle selection based on vehicle usage history and current status (e.g., energy level) and incorporate the vehicle selection model into a robust adversarial reinforcement learning framework. We further utilize conformal prediction to quantify human preference uncertainty and fuse it with the reinforcement learning framework. We evaluate our framework using two months of real-world electric micromobility operation data in a city. Experimental results show that our method achieves a performance gain of at least 4.02% in net revenue and offers more robust performance in worst-case scenarios compared to state-of-the-art baselines.
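For readers unfamiliar with the conformal step, a generic split-conformal sketch (not PERCEIVE's code) turns point predictions into intervals with approximately the desired coverage using a held-out calibration set; alpha and the residual-based score are illustrative choices:

# Hedged sketch: split conformal prediction with absolute-residual nonconformity scores.
import numpy as np

def conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    scores = np.abs(cal_true - cal_pred)                 # nonconformity on calibration set
    n = len(scores)
    q_level = np.ceil((n + 1) * (1 - alpha)) / n         # finite-sample correction
    qhat = np.quantile(scores, min(q_level, 1.0), method="higher")
    return test_pred - qhat, test_pred + qhat            # lower / upper interval bounds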
|
|
13:30-15:00, Paper WeBT15-AX.5 | Add to My Program |
A Power-Aware Control Strategy for an Elbow Effort-Compensation Device |
|
Mobedi, Emir | Istituto Italiano Di Tecnologia |
Hjorth, Sebastian | Istituto Italiano Di Technologia |
Kim, Wansoo | Hanyang University ERICA |
De Momi, Elena | Politecnico Di Milano |
Tsagarakis, Nikos | Istituto Italiano Di Tecnologia |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Physically Assistive Devices, Wearable Robotics, Human Factors and Human-in-the-Loop
Abstract: This work presents a reactive control strategy for loading and sudden unloading of an elbow effort-compensation device controlled in force. Through this control strategy, in addition to an individual's forearm weight, an external load can be detected and adaptively compensated via a feed-forward force reference, facilitating the execution of arbitrary movements by the wearer. In case of a sudden contact/load loss, a power-aware strategy is implemented to immediately eliminate the portion of external loading in the force reference. The adaptive compensation of the external loads is achieved through an electromyography interface. To react to sudden load releases, we instead set a power limit on the tendon and continuously measure it through an encoder and a load cell connected to the cable. Two sets of experiments are designed to test the proposed load-releasing method, on a bench-top setup with 2 kg and 3.9 kg loads and on a human subject with 0.5 kg and 1 kg loads. Next, the overall scenario, including load compensation and load releasing, is carried out on eight human subjects with 0.5 kg and 1 kg loads to evaluate the release and compensation time, and the effort reduction with respect to the non-powered exoskeleton case. Results show that the average compensation/release time (payload) among subjects is 0.98/0.91 seconds (0.5 kg) and 1/0.86 seconds (1 kg). The average effort reduction among the subjects is 66.4% and 67.11% for 0.5 kg and 1 kg, respectively.
|
|
13:30-15:00, Paper WeBT15-AX.6 | Add to My Program |
A Planar Compliant Contact Control Applied to Multi-Dimensional Elastic Gripper for Unexpected Contact |
|
Huang, Junnan | Tsinghua University |
Wang, Xuefeng | Peking University |
Xia, Chongkun | Tsinghua University |
Liu, Houde | Shenzhen Graduate School, Tsinghua University |
Shao, Mingqi | Tsinghua Shenzhen International Graduate School |
Liang, Bin | Tsinghua University |
Keywords: Human Factors and Human-in-the-Loop, Safety in HRI, Human-Robot Collaboration
Abstract: It is difficult to guarantee an empty living environment, so unexpected contact between the object being manipulated by the robot and unplanned obstacles cannot always be prevented. In this paper, we propose a planar compliant contact control method for planar manipulation to cope with unexpected contact. We first use sheet gel as a multi-dimensional passive elastic element and combine it with a two-finger gripper to design an elastic gripper. Subsequently, we explore a lumped parameter model for the force-displacement relationship of gel deformation and combine the model with the high-impedance motion of robots to design an elastic interaction controller. The controller not only actively adjusts the deformation of the gel to provide the desired contact force and torque depending on contact, but also performs avoidance by following the surface of obstacles. Finally, we design and deploy several planar compliant contact experiments to validate the proposed method and demonstrate the unexpected contact response in human-robot co-packing. The results show that our method enables the robot to remain compliant in the face of unexpected contact caused by unplanned obstacles, which provides a guarantee for safe manipulation. Physics experiments can be viewed in the attached video.
|
|
13:30-15:00, Paper WeBT15-AX.7 | Add to My Program |
Enabling Passivity for Cartesian Workspace Restrictions |
|
Hjorth, Sebastian | Istituto Italiano Di Technologia |
Lachner, Johannes | Massachusetts Institute of Technology |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Chrysostomou, Dimitrios | Aalborg University |
Keywords: Human-Robot Collaboration, Disassembly, Compliance and Impedance Control
Abstract: An emerging trend in the field of human-robot collaboration is the disassembly of end-of-life products. Safety is a crucial requirement of the disassembly process since worn-out or damaged products could break, possibly resulting in dangerous behavior of the robot. To protect the user from such behavior, this work addresses this challenge through the implementation of an energy-aware Cartesian impedance controller combined with virtual workspace restrictions, thereby ensuring the passivity of the robotic system. The paper proposes two approaches to ensure the passivity of the system when it is subjected to workspace restrictions under unplanned interactions and contact loss. The first approach employs an augmented energy tank with restricted energy flow. The second approach monitors the overall energy flow, regulating and separating non-passive behavior caused by workspace restrictions. The approaches are evaluated and compared with each other using a KUKA LBR iiwa robot. The results highlight the potential of virtual workspace restrictions in human-robot collaborative disassembly tasks.
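To illustrate the first approach in spirit, a minimal discrete-time energy-tank bookkeeping rule (a generic sketch with placeholder limits, not the paper's controller) restricts the power a controller may inject so the tank never empties:

# Hedged sketch: energy-tank update gating controller power to preserve passivity.
# E_min, E_max and the power signal are illustrative placeholders.
def tank_step(E, p_ctrl, dt, E_min=0.1, E_max=4.0):
    """E: current tank energy [J]; p_ctrl: power the controller wants to inject [W]."""
    if p_ctrl > 0.0 and E - p_ctrl * dt < E_min:
        p_ctrl = max(0.0, (E - E_min) / dt)   # limit injection so the tank never empties
    E = min(E_max, E - p_ctrl * dt)           # energy drawn from (or returned to) the tank
    return E, p_ctrl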
|
|
WeBT16-AX Oral Session, AX-204 |
Add to My Program |
Haptics and Haptic Interfaces |
|
|
Chair: De Momi, Elena | Politecnico Di Milano |
Co-Chair: Morimoto, Tania K. | University of California San Diego |
|
13:30-15:00, Paper WeBT16-AX.1 | Add to My Program |
X-Tacformer: Spatio-Temporal Attention Model for Tactile Recognition |
|
Hu, Jiarui | Tongji University |
Zhou, Yanmin | Tongji University |
Wang, Zhipeng | Tongji University |
Li, Xin | Tongji University |
Jiang, Yongkang | Tongji University |
He, Bin | Tongji University |
Keywords: Haptics and Haptic Interfaces, Soft Sensors and Actuators
Abstract: Recently, tactile sensing has attracted great interest in robotics, especially for exploring unstructured objects. Sensor arrays play an important role in this exploration, generating rich spatio-temporal information. In this work, we propose an efficient tactile recognition model, X-Tacformer. This model attends to both spatial and temporal features of tactile sequences from sensor arrays, which is verified on four public datasets: Ev-Objects, Ev-Containers, Augment8000 and BioTac-Dos. Comparative studies show that our model improves recognition accuracy by 0.0223, 0.1416, 0.2735 and 0.1592 on these datasets. To verify its performance on a dataset with rich spatio-temporal features, a self-designed dataset, ALU-Textures, was constructed with 10 fabrics from everyday textiles, aiming to extend the data-collection action modes of current datasets by simulating human rubbing movements with the thumb and index fingers of an Allegro hand. Our model also demonstrates efficient salient-feature learning capabilities on ALU-Textures, which is further augmented by tactile data augmentation methods.
|
|
13:30-15:00, Paper WeBT16-AX.2 | Add to My Program |
The Joint-Space Reconstruction of Human Fingers by Using a Highly Under-Actuated Exoskeleton |
|
Su, Yuan | Northeastern University |
Li, Gaofeng | Zhejiang University |
Deng, Yongsheng | Northeastern University, China |
Sarakoglou, Ioannis | Fondazione Istituto Italiano Di Tecnologia |
Tsagarakis, Nikos | Istituto Italiano Di Tecnologia |
Chen, Jiming | Zhejiang University |
Keywords: Haptics and Haptic Interfaces, Human-Centered Robotics, Wearable Robotics
Abstract: Hand motion tracking is essential in many fields, e.g., immersive virtual reality, teleoperation of robotic hands, and hand rehabilitation of stroke patients, as the human hand plays a crucial role in our daily life. The highly under-actuated hand exoskeleton, which can track the 6-DoF motions of each fingertip via a highly under-actuated kinematic chain, exhibits many benefits in wearability and portability over other solutions. However, due to the non-anthropomorphic linkage, this hand exoskeleton also encounters difficulties in measuring the human finger's joint angles, even though the joint space is important in many scenarios, such as teleoperating a robotic hand with anthropomorphic kinematics but a different size from the human hand. Here we propose a new method to reconstruct the human finger joints by using a highly under-actuated hand exoskeleton. Our key contribution is the arc-fitting algorithm, which is able to calibrate the misalignment between the exoskeleton's and the human finger's base frames and estimate the lengths of the human phalanxes by using the fingertip's circular motions. With this information, the joint angles can be reconstructed with high precision based on the inverse kinematics models of human fingers. Furthermore, our proposed method is compared with a baseline method, in which the joint angles obtained by a motion capture system serve as the ground truth. The results demonstrate that our proposed method exhibits excellent performance in reconstructing the fingers' joint configurations.
|
|
13:30-15:00, Paper WeBT16-AX.3 | Add to My Program |
Prosthetic Upper-Limb Sensory Enhancement (PULSE): A Dual Haptic Feedback Device in a Prosthetic Socket |
|
Ivani, Alessia Silvia | Fondazione Istituto Italiano Di Tecnologia |
Barontini, Federica | University of Pisa |
Catalano, Manuel Giuseppe | Istituto Italiano Di Tecnologia |
Grioli, Giorgio | Istituto Italiano Di Tecnologia |
Bianchi, Matteo | University of Pisa |
Bicchi, Antonio | Fondazione Istituto Italiano Di Tecnologia |
Keywords: Haptics and Haptic Interfaces, Prosthetics and Exoskeletons, Wearable Robotics
Abstract: This study presents the Prosthetic Upper-Limb Sensory Enhancement (PULSE), a novel dual feedback device completely integrated into a prosthetic socket. The core of the system includes two compact vibrotactile actuators and two silicone chambers in contact with the user's skin. These components provide high-frequency tactile cues for initial contact and surface information (e.g. texture) as well as pressure stimuli related to grasping force. Ten able-bodied participants and one subject with limb loss validated the system, accomplishing an object discrimination task in two different modalities (with and without the feedback). Standardized questionnaires evaluate users' satisfaction and workload, enabling a systematic and robust device assessment. The results show that the PULSE device enhanced performance compared to providing no feedback, without causing discomfort for the prosthetic user or the able-bodied participants. The findings highlight the potential of dual haptic feedback to enhance sensory perception in prosthetic applications and offer valuable insights for future prosthetic design.
|
|
13:30-15:00, Paper WeBT16-AX.4 | Add to My Program |
Point-Wise Vibration Pattern Production Via a Sparse Actuator Array for Surface Tactile Feedback |
|
Li, Xiaosa | Tsinghua University |
Zhao, Runze | Tsinghua University |
Lu, Chengyue | Tsinghua University |
Xiao, Xiao | Tsinghua University Shenzhen Graduate School |
Ding, Wenbo | Tsinghua University |
Keywords: Haptics and Haptic Interfaces, Physical Human-Robot Interaction, Touch in HRI
Abstract: Surface vibration tactile feedback is capable of conveying various semantic information to humans via handheld electronic devices, such as smartphones, touch panels, and game controllers. However, covering the entire contacting surface of the device with a dense arrangement of actuators can affect its normal use. Determining how to produce desired vibration patterns at any contact point with only a few sparse actuators deployed on the surface of the handheld device remains a significant challenge. In this work, we develop a tactile feedback board in the size of a smartphone with only five actuators, and achieve the precise production of vibration patterns that can focus at any desired position on the board. Specifically, we investigate the vibration characteristics of a single passive coil actuator and construct its vibration pattern model for any position on the feedback board surface. Optimal phase and amplitude modulation, determined using the simulated annealing algorithm, is employed with five actuators in a sparse array. The vibration patterns from all actuators are superimposed linearly to synthetically generate different onboard vibration energy distributions for tactile sensing. Experiments demonstrated that point-wise vibration pattern production on our tactile board achieved an average level of about 0.9 in the Structural Similarity Index Measure (SSIM) evaluation, when compared to the ideal single-point-focused target vibration pattern. Four point-wise patterns focused on the top, bottom, left, and right parts of the tactile board were applied, to guide continuous directional movements without visual assistance, which shows significant implications for machine-assisted cognition based on vibration tactile feedback.
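The phase and amplitude optimization described above can be illustrated with a toy simulated-annealing loop over five actuators whose fields superimpose linearly. The plate geometry, wave model, annealing schedule, and cost function below are invented for illustration and are not the authors' model or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# illustrative geometry: 5 actuator positions on a phone-sized plate (metres)
actuators = np.array([[0.01, 0.02], [0.06, 0.02], [0.035, 0.07],
                      [0.01, 0.12], [0.06, 0.12]])
grid = np.stack(np.meshgrid(np.linspace(0, 0.07, 40),
                            np.linspace(0, 0.14, 80)), axis=-1)
target_point = np.array([0.035, 0.10])
k = 2 * np.pi / 0.02   # assumed flexural wavenumber (illustration only)

def pattern(amplitudes, phases):
    """Linear superposition of single-actuator waves (simplified model)."""
    field = np.zeros(grid.shape[:2], dtype=complex)
    for (ax, ay), a, p in zip(actuators, amplitudes, phases):
        r = np.linalg.norm(grid - np.array([ax, ay]), axis=-1) + 1e-4
        field += a * np.exp(1j * (k * r + p)) / np.sqrt(r)
    return np.abs(field) ** 2   # vibration energy distribution

# target: energy concentrated at the desired contact point
target = np.exp(-np.linalg.norm(grid - target_point, axis=-1) ** 2 / 1e-4)

def cost(x):
    amps, phases = x[:5], x[5:]
    p = pattern(amps, phases)
    return np.mean((p / (p.max() + 1e-9) - target) ** 2)

# simulated annealing over amplitudes and phases
x = np.concatenate([np.ones(5), np.zeros(5)])
x_cost = cost(x)
best, best_cost, temp = x.copy(), x_cost, 1.0
for it in range(2000):
    cand = x + rng.normal(scale=0.1, size=10)
    cand[:5] = np.clip(cand[:5], 0.0, 1.0)
    c = cost(cand)
    if c < x_cost or rng.random() < np.exp((x_cost - c) / temp):
        x, x_cost = cand, c
        if c < best_cost:
            best, best_cost = cand.copy(), c
    temp *= 0.995
print("final cost:", best_cost)
```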
|
|
13:30-15:00, Paper WeBT16-AX.5 | Add to My Program |
Implementation and Assessment of an Augmented Training Curriculum for Surgical Robotics |
|
Rota, Alberto | Politecnico Di Milano |
Fan, Ke | Politecnico Di Milano |
De Momi, Elena | Politecnico Di Milano |
Keywords: Haptics and Haptic Interfaces, Surgical Robotics: Laparoscopy
Abstract: The integration of high-level assistance algorithms in surgical robotics training curricula may be beneficial in establishing a more comprehensive and robust skillset for aspiring surgeons, improving their clinical performance as a consequence. This work presents the development and validation of a haptic-enhanced Virtual Reality simulator for surgical robotics training, featuring 8 surgical tasks that the trainee can interact with thanks to the embedded physics engine. This virtual simulated environment is augmented by the introduction of high-level haptic interfaces for robotic assistance that aim at re-directing the motion of the trainee's hands and wrists toward targets or away from obstacles, and providing a quantitative performance score after the execution of each training exercise. An experimental study shows that the introduction of enhanced robotic assistance into a surgical robotics training curriculum improves performance during the training process and, crucially, promotes the transfer of the acquired skills to an unassisted surgical scenario, like the clinical one.
|
|
13:30-15:00, Paper WeBT16-AX.6 | Add to My Program |
Hapstick: A Soft Flexible Joystick for Stiffness Rendering Via Fiber Jamming |
|
Giri, Ayush | University of California San Diego |
Bloom, Robert | University of California San Diego |
Morimoto, Tania K. | University of California San Diego |
Keywords: Haptics and Haptic Interfaces, Medical Robots and Systems, Telerobotics and Teleoperation
Abstract: Continuum robots are well-suited for applications in delicate and constrained environments, such as minimally invasive surgery, due to their inherent compliance and ability to conform to highly curved paths. Yet the kinematic dissimilarity between continuum robots and conventional, off-the-shelf input devices, along with the general lack of haptic feedback available with such devices, can lead to non-intuitive control. In this work, we present Hapstick --- a soft, flexible haptic joystick that uses fiber jamming to modulate its stiffness and provide feedback to users during teleoperation tasks. We characterize the performance of Hapstick, showing that the bending stiffness increases linearly with the increase in applied vacuum load. A psychophysical study is also conducted to obtain the just noticeable difference in stiffness that users can perceive using Hapstick. Lastly, we perform a study in which participants use Hapstick to teleoperate a physical tendon-driven continuum robot in a simulated colorectal cancer screening task. Users correctly identify the position and development stages of cancerous tissues in 25 out of 27 trials, illustrating the potential of jamming-based mechanisms as bidirectional interfaces capable of providing effective haptic feedback.
|
|
13:30-15:00, Paper WeBT16-AX.7 | Add to My Program |
Fingertip Ultrasonic Array for Tactile Rendering |
|
Rozsa, Jace | Carnegie Mellon University |
Costrell, Sarah | Carnegie Mellon University |
Orta Martinez, Melisa | Carnegie Mellon University |
Fedder, Gary K. | Carnegie Mellon University |
Keywords: Haptics and Haptic Interfaces, Mechanism Design, Touch in HRI
Abstract: A miniature haptic stimulation device utilizes focused ultrasound to deliver a tactile haptic sensation to the finger. The 1-3 piezocomposite device has a 1 cm^2 footprint, which is an order of magnitude smaller than other ultrasonic haptic devices and is a good candidate for wearable tactile rendering systems. The device focuses energy to a 1 mm^3 voxel. The current prototype was validated with a small, preliminary human subject study and requires an average input voltage of 68.8 V to elicit tactile sensation. The sensory drive voltage threshold will decrease with future refinement of mechanical impedance matching and focusing.
|
|
13:30-15:00, Paper WeBT16-AX.8 | Add to My Program |
Active Exploration for Real-Time Haptic Training |
|
Ketchum, Jake | Northwestern University |
Prabhakar, Ahalya | Yale University |
Murphey, Todd | Northwestern University |
Keywords: Haptics and Haptic Interfaces, Force and Tactile Sensing, AI-Based Methods
Abstract: Tactile perception is important for robotic systems that interact with the world through touch. Touch is an active sense in which tactile measurements depend on the contact properties of an interaction---e.g., velocity, force, acceleration---as well as properties of the sensor and object under test. These dependencies make training tactile perceptual models challenging. Additionally, the effects of limited sensor life and the near-field nature of tactile sensors preclude the practical collection of exhaustive data sets even for fairly simple objects. Active learning provides a mechanism for focusing on only the most informative aspects of an object during data collection. Here we use an active learning approach that takes a data-driven model's entropy as an uncertainty measure and explores relative to that entropy, conditioned on the sensor state variables. Using a coverage-based ergodic controller, we train perceptual models in near-real time. We demonstrate our approach using a biomimetic sensor, exploring "tactile scenes" composed of shapes, textures, and objects. Each learned representation provides a perceptual sensor model for a particular tactile scene. Models trained on actively collected data outperform their randomly collected counterparts in real-time training tests. Additionally, we find that the resulting network entropy maps can be used to identify high-salience portions of a tactile scene.
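As a rough illustration of entropy-driven active data collection (not the ergodic controller used in the paper), the snippet below computes a predictive-entropy map from a hypothetical model ensemble and picks the most informative probe point; the ensemble size, grid, and class counts are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def predictive_entropy(probs):
    """Entropy of the mean predictive distribution over the ensemble."""
    mean_p = probs.mean(axis=0)                       # average over ensemble members
    return -np.sum(mean_p * np.log(mean_p + 1e-12), axis=-1)

# hypothetical ensemble predictions over candidate probe points:
# shape (n_models, n_points, n_classes)
n_models, n_points, n_classes = 5, 400, 3
probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_points))

entropy_map = predictive_entropy(probs)               # (n_points,)
next_probe = int(np.argmax(entropy_map))              # most informative contact point
print("probe candidate index:", next_probe, "entropy:", entropy_map[next_probe])
```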
|
|
13:30-15:00, Paper WeBT16-AX.9 | Add to My Program |
A Multi-Stable Curved Line Shape Display |
|
Law, Wing-Sum Adrienne | Stanford University |
Wyetzner, Sofia | Stanford University |
Zhen, Raymond | Stanford University |
Follmer, Sean | Stanford University |
Keywords: Haptics and Haptic Interfaces, Soft Robot Applications
Abstract: Shape-changing displays enable real-time visualization and haptic exploration of 3D surfaces. However, many shape-changing displays are composed of individually actuated rigid bodies, which makes them both mechanically complex and unable to form smooth surfaces. In this work, we build a multi-stable curved line display inspired by physical splines. By using circular splines to initialize a discrete elastic rods simulator, we can model multiple stable shapes that fit specific boundary conditions. We then generate actuation instructions based on the circular spline initialization to drive the physical display. We demonstrate our display's ability to create 16 shapes with 8 different boundary conditions. Our display is consistent in shape output, with an average standard deviation in height of 0.75 mm or 0.47% of the display's maximum vertical range. We also show that our model is consistent with our display, with a mean RMSE of 6.68 mm or 3.85% of the display's maximum vertical range for shapes we could stably simulate. We then demonstrate potential scalability by simulating a multi-segment version of the system and show the display's ability to withstand loads during contour following in haptic exploration.
|
|
WeBT17-AX Oral Session, AX-205 |
Add to My Program |
Legged Robots and Learning I |
|
|
Chair: Della Santina, Cosimo | TU Delft |
Co-Chair: Hutter, Marco | ETH Zurich |
|
13:30-15:00, Paper WeBT17-AX.1 | Add to My Program |
Seeing through the Grass: Semantic Pointcloud Filter for Support Surface Learning |
|
Li, Anqiao | ETH Zurich |
Yang, Chenyu | ETH Zurich |
Frey, Jonas | ETH Zurich |
Lee, Joonho | ETH Zurich |
Cadena Lerma, Cesar | ETH Zurich |
Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Deep Learning for Visual Perception, Field Robots
Abstract: Mobile ground robots require perceiving and understanding their surrounding support surface to move around autonomously and safely. The support surface is commonly estimated based on exteroceptive depth measurements, e.g., from LiDARs. However, the measured depth fails to align with the true support surface in the presence of high grass or other penetrable vegetation. In this work, we present the semantic pointcloud filter (SPF), a convolutional neural network (CNN) that learns to adjust LiDAR measurements to align with the underlying support surface. The SPF is trained in a semi-self-supervised manner and takes as input a LiDAR pointcloud and an RGB image. The network predicts a binary segmentation mask that identifies the specific points requiring adjustment, along with estimating their corresponding depth values. To train the segmentation task, 464 distinct images are manually labeled into rigid and non-rigid terrain. The depth estimation task is trained in a self-supervised manner by utilizing the future footholds of the robot to estimate the support surface based on a Gaussian process. Our method can correctly adjust the support surface prior to interacting with the terrain and is extensively tested on the quadruped robot ANYmal. We show the qualitative benefits of SPF in natural environments for elevation mapping and traversability estimation compared to using raw sensor measurements and existing smoothing methods. Quantitative analysis is performed in various natural environments.
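A toy two-head network in the spirit of the described mask-plus-depth prediction is sketched below; the layer sizes, fusion scheme, and input resolution are assumptions for illustration and are not the SPF architecture.

```python
import torch
import torch.nn as nn

class SemanticPointcloudFilterSketch(nn.Module):
    """Toy two-head CNN: a segmentation mask (rigid vs. penetrable) and a
    corrected depth map from an RGB image plus a projected LiDAR depth image.

    Purely illustrative architecture; layer sizes and the fusion scheme are
    assumptions, not the SPF network from the paper.
    """
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),   # 3 RGB + 1 depth channel
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.mask_head = nn.Conv2d(32, 1, 1)             # which points need adjustment
        self.depth_head = nn.Conv2d(32, 1, 1)            # corrected support-surface depth

    def forward(self, rgb, depth):
        x = self.backbone(torch.cat([rgb, depth], dim=1))
        return torch.sigmoid(self.mask_head(x)), self.depth_head(x)

# usage on dummy data
net = SemanticPointcloudFilterSketch()
rgb = torch.rand(1, 3, 64, 64)
depth = torch.rand(1, 1, 64, 64)
mask, corrected_depth = net(rgb, depth)
# apply the correction only where the mask flags penetrable vegetation
adjusted = torch.where(mask > 0.5, corrected_depth, depth)
```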
|
|
13:30-15:00, Paper WeBT17-AX.2 | Add to My Program |
Manipulator As a Tail: Promoting Dynamic Stability for Legged Locomotion |
|
Huang, Huang | University of California at Berkeley |
Loquercio, Antonio | UC Berkeley |
Kumar, Ashish | UC Berkeley |
Thakkar, Neerja | UC Berkeley |
Goldberg, Ken | UC Berkeley |
Malik, Jitendra | UC Berkeley |
Keywords: Legged Robots, Incremental Learning, Whole-Body Motion Planning and Control
Abstract: Is an arm on a legged robot a liability or an asset for locomotion? Biological systems evolved additional limbs beyond legs that facilitate postural control. This work shows how a manipulator can be an asset for legged locomotion at high speeds or under external perturbations, where the arm serves purposes beyond manipulation. Since the system has 15 degrees of freedom (twelve for the legged robot and three for the arm), off-the-shelf reinforcement learning (RL) algorithms struggle to learn effective locomotion policies. Inspired by Bernstein's neurophysiological theory of animal motor learning, we develop an incremental training procedure that initially freezes some degrees of freedom and gradually releases them, using behaviour cloning (BC) from an earlier learning stage to guide optimization in later learning. Simulation experiments show that our policy increases the success rate by up to 61 percentage points over the baselines. Simulation and real robot experiments suggest that our policy learns to use the arm as a "tail" to initiate robot turning at high speeds and to stabilize the quadruped under external perturbations. Quantitatively, in simulation experiments, we cut the failure rate by up to 43.6% during high-speed turning and by up to 31.8% for the quadruped under external forces, compared to using a locked arm.
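A minimal sketch of the freeze-then-release idea follows, assuming torque-style actions and a simple mean-squared behaviour-cloning regularizer toward the earlier-stage policy; the stage schedule, joint indices, and weights are illustrative, not the training procedure of the paper.

```python
import numpy as np

class FrozenDofActionWrapper:
    """Masks selected action dimensions to a default value during early
    training stages, then releases them in later stages.

    Illustrative sketch: the schedule, defaults, and indices are assumptions.
    """
    def __init__(self, n_act=15, frozen_idx=(12, 13, 14), default=0.0):
        self.n_act = n_act
        self.frozen = set(frozen_idx)        # e.g. freeze the 3 arm joints first
        self.default = default

    def release(self, idx):
        self.frozen.discard(idx)

    def __call__(self, action):
        a = np.asarray(action, dtype=float).copy()
        for i in self.frozen:
            a[i] = self.default
        return a

def bc_regularized_loss(rl_loss, policy_actions, teacher_actions, weight=0.1):
    """RL objective plus a behaviour-cloning term toward the earlier-stage policy."""
    bc = float(np.mean((policy_actions - teacher_actions) ** 2))
    return rl_loss + weight * bc

wrapper = FrozenDofActionWrapper()
masked = wrapper(np.random.randn(15))                            # stage 1: legs only
for j in (12, 13, 14):
    wrapper.release(j)                                           # stage 2: release the arm
```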
|
|
13:30-15:00, Paper WeBT17-AX.3 | Add to My Program |
Two-Stage Learning of Highly Dynamic Motions with Rigid and Articulated Soft Quadrupeds |
|
Vezzi, Francesco | Technical University of Delft |
Ding, Jiatao | Delft University of Technology |
Raffin, Antonin | DLR |
Kober, Jens | TU Delft |
Della Santina, Cosimo | TU Delft |
Keywords: Legged Robots, Machine Learning for Robot Control, Motion Control
Abstract: Controlled execution of dynamic motions in quadrupedal robots, especially those with articulated soft bodies, presents a unique set of challenges that traditional methods struggle to address efficiently. In this study, we tackle these issues by relying on a simple yet effective two-stage learning framework to generate dynamic motions for quadrupedal robots. First, a gradient-free evolution strategy is employed to discover simply represented control policies, eliminating the need for a predefined reference motion. Then, we refine these policies using deep reinforcement learning. Our approach enables the acquisition of complex motions like pronking and back-flipping, effectively from scratch. Additionally, our method simplifies the traditionally labour-intensive task of reward shaping, boosting the efficiency of the learning process. Importantly, our framework proves particularly effective for articulated soft quadrupeds, whose inherent compliance and adaptability make them ideal for dynamic tasks but also introduce unique control challenges.
|
|
13:30-15:00, Paper WeBT17-AX.4 | Add to My Program |
High-Dimensional Controller Tuning through Latent Representations |
|
Sarmadi, Alireza | New York University |
Khorrami, Farshad | New York University Tandon School of Engineering |
Krishnamurthy, Prashanth | New York University Tandon School of Engineering |
Keywords: Legged Robots, Machine Learning for Robot Control, Optimization and Optimal Control
Abstract: In this paper, we propose a method to automatically and efficiently tune high-dimensional vectors of controller parameters. The proposed method first learns a mapping from the high-dimensional controller parameter space to a lower dimensional space using a machine learning-based algorithm. This mapping is then utilized in an actor-critic framework using Bayesian optimization (BO). The proposed approach is applicable to complex systems (such as quadruped robots). In addition, the proposed approach enables efficient generalization to different control tasks while reducing the number of evaluations required to tune the controller parameters. We evaluate our method on a legged locomotion application. We show the efficacy of the algorithm in tuning the high-dimensional controller parameters while reducing the number of evaluations required for the tuning. Moreover, the method is shown to generalize to new tasks and to transfer to other robot dynamics.
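To make the latent-space tuning loop concrete, here is a hedged sketch that substitutes PCA for the learned dimensionality reduction and plain expected-improvement BO for the actor-critic framework; the cost function, dimensions, and candidate sampling are all assumptions, not the authors' method.

```python
import numpy as np
from scipy.stats import norm
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# stand-in "high-dimensional controller parameters" previously collected
theta_bank = rng.normal(size=(200, 50))                  # 50-D controller parameters
pca = PCA(n_components=3).fit(theta_bank)                # 3-D latent space (stand-in mapping)

def rollout_cost(theta):
    """Hypothetical black-box evaluation of one controller (e.g. tracking error)."""
    return float(np.sum((theta - 0.5) ** 2))

def expected_improvement(gp, Z, best):
    mu, std = gp.predict(Z, return_std=True)
    std = np.maximum(std, 1e-9)
    gamma = (best - mu) / std
    return (best - mu) * norm.cdf(gamma) + std * norm.pdf(gamma)

# Bayesian optimization loop in the latent space
Z = rng.normal(size=(5, 3))                              # initial latent samples
y = np.array([rollout_cost(pca.inverse_transform(z[None, :])[0]) for z in Z])
for it in range(20):
    gp = GaussianProcessRegressor(normalize_y=True).fit(Z, y)
    cand = rng.normal(size=(500, 3))                     # random candidate latents
    z_next = cand[np.argmax(expected_improvement(gp, cand, y.min()))]
    y = np.append(y, rollout_cost(pca.inverse_transform(z_next[None, :])[0]))
    Z = np.vstack([Z, z_next])
print("best cost found:", y.min())
```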
|
|
13:30-15:00, Paper WeBT17-AX.5 | Add to My Program |
Expert Composer Policy: Scalable Skill Repertoire for Quadruped Robots |
|
Galelli Christmann, Guilherme Henrique | Inventec Corporation |
Luo, Ying-Sheng | Inventec Corporation |
Chen, Wei-Chao | Inventec Inc |
Keywords: Legged Robots, Reinforcement Learning
Abstract: We propose the expert composer policy, a framework to reliably expand the skill repertoire of quadruped agents. The composer policy links pairs of experts via transitions to a sampled target state, allowing experts to be composed sequentially. Each expert specializes in a single skill, such as a locomotion gait or a jumping motion. Instead of a hierarchical or mixture-of-experts architecture, we train a single composer policy in an independent process that is not conditioned on the other expert policies. By reusing the same composer policy, our approach enables adding new experts without affecting existing ones, enabling incremental repertoire expansion and preserving original motion quality. We measured the transition success rate of 72 transition pairs and achieved an average success rate of 99.99%, which is over 10% higher than the baseline random approach and outperforms other state-of-the-art methods. Using domain randomization during training, we ensure a successful transfer to the real world, where we achieve an average transition success rate of 97.22% (N=360) in our experiments.
|
|
13:30-15:00, Paper WeBT17-AX.6 | Add to My Program |
Learning Agile Bipedal Motions on a Quadrupedal Robot |
|
Li, Yunfei | Tsinghua University |
Li, Jinhan | Tsinghua University |
Fu, Wei | Tsinghua University |
Wu, Yi | Tsinghua University |
Keywords: Legged Robots, Reinforcement Learning
Abstract: Can a quadrupedal robot perform bipedal motions like humans? Although developing human-like behaviors is more often studied on costly bipedal robot platforms, we present a solution on a lightweight quadrupedal robot that unlocks the agility of the quadruped in an upright standing pose and is capable of a variety of human-like motions. Our framework has a hierarchical structure. At the low level is a motion-conditioned control policy that allows the quadrupedal robot to track desired base and front-limb movements while balancing on two hind feet. The policy is commanded by a high-level motion generator that provides trajectories of parameterized human-like motions to the robot from multiple modalities of human input. We demonstrate, for the first time, various bipedal motions on a quadrupedal robot and showcase interesting human-robot interaction modes including mimicking human videos, following natural language instructions, and physical interaction. The video is available at https://sites.google.com/view/bipedal-motions-quadruped.
|
|
13:30-15:00, Paper WeBT17-AX.7 | Add to My Program |
LAGOON: Language-Guided Motion Control |
|
Xu, Shusheng | Tsinghua University |
Wang, Huaijie | Tsinghua University |
Ouyang, Yutao | Xiamen University |
Gao, Jiaxuan | Tsinghua University |
Mei, Zhiyu | Tsinghua University |
Yu, Chao | Tsinghua University |
Wu, Yi | Tsinghua University |
Keywords: Legged Robots, Reinforcement Learning
Abstract: We aim to control a robot to physically behave in the real world following any high-level language command like "cartwheel" or "kick". Although human motion datasets exist, this task remains particularly challenging since generative models can produce physically unrealistic motions, which will be more severe for robots due to different body structures and physical properties. Deploying such a motion to a physical robot can cause even greater difficulties due to the sim2real gap. We develop LAnguage-Guided mOtion cONtrol (LAGOON), a multi-phase reinforcement learning (RL) method to generate physically realistic robot motions under language commands. LAGOON first leverages a pre-trained model to generate a human motion from a language command. Then an RL phase trains a control policy in simulation to mimic the generated human motion. Finally, with domain randomization, our learned policy can be deployed to a quadrupedal robot, leading to a quadrupedal robot that can take diverse behaviors in the real world under natural language commands.
|
|
13:30-15:00, Paper WeBT17-AX.8 | Add to My Program |
Learning Quadrupedal Locomotion with Impaired Joints Using Random Joint Masking |
|
Kim, Mincheol | Seoul National University of Science and Technology |
Shin, Ukcheol | CMU (Carnegie Mellon University) |
Kim, Jung-Yup | Seoul National University of Science & Technology |
Keywords: Legged Robots, Reinforcement Learning
Abstract: Quadrupedal robots have played a crucial role in various environments, from structured environments to complex harsh terrains, thanks to their agile locomotion ability. However, these robots can easily lose their locomotion functionality if damaged by external accidents or internal malfunctions. In this paper, we propose a novel deep reinforcement learning framework to enable a quadrupedal robot to walk with impaired joints. The proposed framework consists of three components: 1) a random joint masking strategy for simulating impaired joint scenarios, 2) a joint state estimator to predict the implicit status of the current joint condition based on the past observation history, and 3) progressive curriculum learning to allow a single network to conduct both the normal gait and various joint-impaired gaits. We verify that our framework enables the Unitree Go1 robot to walk under various impaired joint conditions in real-world indoor and outdoor environments.
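The random joint masking strategy could look roughly like the sketch below, which marks joints as healthy, power-loss, or locked and overrides the policy's torque command accordingly; the probabilities, gains, and the integer encoding are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_joint_mask(n_joints=12, p_impair=0.2):
    """Randomly mark joints as healthy (0), power-loss (1) or locked (2)."""
    mask = np.zeros(n_joints, dtype=int)
    impaired = rng.random(n_joints) < p_impair
    mask[impaired] = rng.integers(1, 3, size=impaired.sum())
    return mask

def apply_mask(torque_cmd, q, q_locked, mask, kp=30.0, kd=0.5, qd=None):
    """Override the policy's torque command on impaired joints (simulation only)."""
    qd = np.zeros_like(q) if qd is None else qd
    tau = torque_cmd.copy()
    tau[mask == 1] = 0.0                                       # power loss: no actuation
    locked = mask == 2                                         # locking: stiff hold at the locked angle
    tau[locked] = kp * (q_locked[locked] - q[locked]) - kd * qd[locked]
    return tau

mask = sample_joint_mask()
tau = apply_mask(rng.normal(size=12), rng.normal(size=12), np.zeros(12), mask)
```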
|
|
13:30-15:00, Paper WeBT17-AX.9 | Add to My Program |
Multi-Task Learning of Active Fault-Tolerant Controller for Leg Failures in Quadruped Robots |
|
Hou, Taixian | Fudan University |
Tu, Jiaxin | Fudan University |
Gao, Xiaofei | Beijing Zhitong Robot Technology Co., Ltd |
Dong, Zhiyan | Fudan University |
Zhai, Peng | Fudan University |
Zhang, Lihua | Fudan University |
Keywords: Legged Robots, Reinforcement Learning, Body Balancing
Abstract: Electric quadruped robots used in outdoor exploration are susceptible to leg-related electrical or mechanical failures. Unexpected joint power loss and joint locking can immediately pose a falling threat. Typically, controllers lack the capability to actively sense the condition of their own joints and take proactive actions. Maintaining the original motion patterns could lead to disastrous consequences, as the controller may produce irrational output within a short period of time, further creating the risk of serious physical injuries. This paper presents a hierarchical fault-tolerant control scheme employing a multi-task training architecture capable of actively perceiving and overcoming two types of leg joint faults. The architecture trains three joint-task policies in parallel for the healthy, power-loss, and locking scenarios, introducing a symmetric reflection initialization technique to ensure rapid and stable gait skill transformations. Experiments demonstrate that the control scheme is robust in unexpected scenarios where a single leg experiences concurrent joint faults in two joints. Furthermore, the policy retains the robot's planar mobility, enabling rough velocity tracking. Finally, zero-shot Sim2Real transfer is achieved on the real-world SOLO8 robot, countering both electrical and mechanical failures.
|
|
WeBT18-AX Oral Session, AX-206 |
Add to My Program |
Optimization and Optimal Control I |
|
|
Chair: Okuda, Hiroyuki | Nagoya University |
Co-Chair: Ajoudani, Arash | Istituto Italiano Di Tecnologia |
|
13:30-15:00, Paper WeBT18-AX.1 | Add to My Program |
Motion Planning for 4WS Vehicle with Autonomous Selection of Steering Modes Via an MIQP-MPC Controller |
|
Nguyen, Ngoc Thinh | University of Luebeck |
Gangavarapu, Pranav Tej | University of Luebeck |
Mandel, Nicolas | University of Luebeck |
Bruder, Ralf | University of Lübeck |
Ernst, Floris | University of Lübeck |
Keywords: Optimization and Optimal Control, Constrained Motion Planning, Field Robots
Abstract: Navigation in agricultural fields imposes various constraints on manoeuvrability, which can be tackled by using four-wheel steering (4WS) vehicles capable of switching between multiple steering mechanisms with distinct kinematic properties. For example, parallel positive steering (PPS), with all four wheels parallel to each other, can maintain the vehicle's heading when moving along a curve. Symmetric negative steering (SNS), with the two wheels on each side sharing the same steering angle, can turn with a small radius. This paper presents a controller capable of selecting and switching between the two aforementioned modes autonomously for better trajectory tracking performance under the special heading requirements of agricultural applications. The controller is implemented as a Model Predictive Control (MPC) controller formulated as a mixed-integer quadratic programming (MIQP) problem for the 4WS vehicle. Practical constraints, such as limits on wheel velocities, steering angles, and their rates of change, are taken into account. A Python implementation confirms the real-time execution capability of the controller, and simulation results highlight its effectiveness.
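A compact illustration of the mixed-integer mode-selection pattern (a Boolean variable gating mode-specific constraints through big-M bounds) is given below using cvxpy. The simplified one-state kinematics, horizon, bounds, and weights are assumptions for illustration; the real controller's 4WS model and constraints are not reproduced.

```python
# Minimal sketch of Boolean mode selection inside a quadratic MPC problem.
import cvxpy as cp
import numpy as np

N, M = 8, 100.0                        # horizon and big-M constant
heading = cp.Variable(N + 1)           # vehicle heading (abstracted single state)
steer = cp.Variable(N)                 # heading-changing steering action (abstracted)
mode = cp.Variable(N, boolean=True)    # 0 = parallel mode (heading fixed), 1 = symmetric mode

heading_ref = np.zeros(N + 1)
constraints = [heading[0] == 0.3]
for k in range(N):
    constraints += [heading[k + 1] == heading[k] + 0.1 * steer[k]]
    # parallel mode (mode=0): heading must not change -> |steer| <= M * mode
    constraints += [steer[k] <= M * mode[k], steer[k] >= -M * mode[k]]
    # physical steering-rate bound applies in both modes
    constraints += [steer[k] <= 0.5, steer[k] >= -0.5]

# track the heading reference, with a small penalty for engaging the turning mode
cost = cp.sum_squares(heading - heading_ref) + 0.1 * cp.sum(mode)
prob = cp.Problem(cp.Minimize(cost), constraints)
prob.solve()   # requires a mixed-integer QP capable solver to be installed
print(mode.value, heading.value)
```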
|
|
13:30-15:00, Paper WeBT18-AX.2 | Add to My Program |
On the Performance of Jerk-Constrained Time-Optimal Trajectory Planning for Industrial Manipulators |
|
Lee, Jee-eun | The University of Texas at Austin |
Bylard, Andrew | Stanford University |
Sun, Zhouwen | Dexterity Inc |
Sentis, Luis | The University of Texas at Austin |
Keywords: Optimization and Optimal Control, Constrained Motion Planning, Industrial Robots
Abstract: Jerk-constrained trajectories offer a wide range of advantages that collectively improve the performance of robotic systems, including increased energy efficiency, durability, and safety. In this paper, we present a novel approach to jerk-constrained time-optimal trajectory planning (TOTP), which follows a specified path while satisfying up to third-order constraints to ensure safety and smooth motion. One significant challenge in jerk-constrained TOTP is a non-convex formulation arising from the inclusion of third-order constraints. Approximating inequality constraints can be particularly challenging because the resulting solutions may violate the real constraints. We address this by leveraging convexity within the proposed formulation to form conservative inequality constraints. We then obtain the desired trajectory by solving an n-dimensional Sequential Linear Program (SLP) iteratively until convergence. Lastly, we evaluate the performance of trajectories generated with and without jerk limits in terms of peak power, torque efficiency, and tracking capability.
|
|
13:30-15:00, Paper WeBT18-AX.3 | Add to My Program |
Symmetric Stair Preconditioning of Linear Systems for Parallel Trajectory Optimization |
|
Bu, Xueyi | Fu Foundation School of Engineering and Applied Science, Columbi |
Plancher, Brian | Barnard College, Columbia University |
Keywords: Optimization and Optimal Control, Control Architectures and Programming, Motion Control
Abstract: There has been a growing interest in parallel strategies for solving trajectory optimization problems. One key step in many algorithmic approaches to trajectory optimization is the solution of moderately-large and sparse linear systems. Iterative methods are particularly well-suited for parallel solves of such systems. However, fast and stable convergence of iterative methods is reliant on the application of a high-quality preconditioner that reduces the spread and increases the clustering of the eigenvalues of the target matrix. To improve the performance of these approaches, we present a new parallel-friendly symmetric stair preconditioner. We prove that our preconditioner has advantageous theoretical properties when used in conjunction with iterative methods for trajectory optimization, such as a more clustered eigenvalue spectrum. Numerical experiments with typical trajectory optimization problems reveal that, as compared to the best alternative parallel preconditioner from the literature, our symmetric stair preconditioner provides up to a 34% reduction in condition number and up to a 25% reduction in the number of resulting linear system solver iterations.
|
|
13:30-15:00, Paper WeBT18-AX.4 | Add to My Program |
MPCGPU: Real-Time Nonlinear Model Predictive Control through Preconditioned Conjugate Gradient on the GPU |
|
Adabag, Emre | Columbia University |
Atal, Miloni | Columbia University |
Gerard, William | Columbia University |
Plancher, Brian | Barnard College, Columbia University |
Keywords: Optimization and Optimal Control, Control Architectures and Programming, Motion Control
Abstract: Nonlinear Model Predictive Control (NMPC) is a state-of-the-art approach for locomotion and manipulation which leverages trajectory optimization at each control step. While the performance of this approach is computationally bounded, implementations of direct trajectory optimization that use iterative methods to solve the underlying moderately-large and sparse linear systems, are a natural fit for parallel hardware acceleration. In this work, we introduce MPCGPU, a GPU-accelerated, real-time NMPC solver that leverages an accelerated preconditioned conjugate gradient (PCG) linear system solver at its core. We show that MPCGPU increases the scalability and real-time performance of NMPC, solving larger problems, at faster rates. In particular, for tracking tasks using the Kuka IIWA manipulator, MPCGPU is able to scale to kilohertz control rates with trajectories as long as 512 knot points. This is driven by a custom PCG solver which outperforms state-of-the-art, CPU-based, linear system solvers by at least 10x for a majority of solves and 3.6x on average.
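For reference, the algorithmic core of a preconditioned conjugate gradient solve is sketched below in plain NumPy with a simple Jacobi (diagonal) preconditioner; MPCGPU's GPU kernels, Schur-complement system, and preconditioner construction are not reproduced here.

```python
import numpy as np

def pcg(A, b, M_inv, x0=None, tol=1e-8, max_iter=200):
    """Textbook preconditioned conjugate gradient for a symmetric positive-definite A."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = M_inv @ r
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv @ r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, k + 1

# usage with a random s.p.d. system and a Jacobi preconditioner
rng = np.random.default_rng(0)
Q = rng.normal(size=(50, 50))
A = Q @ Q.T + 50 * np.eye(50)
b = rng.normal(size=50)
M_inv = np.diag(1.0 / np.diag(A))
x, iters = pcg(A, b, M_inv)
print("iterations:", iters, "residual:", np.linalg.norm(A @ x - b))
```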
|
|
13:30-15:00, Paper WeBT18-AX.5 | Add to My Program |
Efficient and Robust Time-Optimal Trajectory Planning and Control for Agile Quadrotor Flight |
|
Zhou, Ziyu | Beijing Institute of Technology |
Wang, Gang | Beijing Institute of Technology |
Jian, Sun | Beijing Institute of Technology |
Wang, Jikai | Beijing Institute of Technology |
Chen, Jie | Tongji University |
Keywords: Optimization and Optimal Control, Integrated Planning and Control, Control Architectures and Programming
Abstract: Agile quadrotor flight relies on rapidly planning and accurately tracking time-optimal trajectories, a technology critical to their application in the wild. However, the computational burden of computing time-optimal trajectories based on the full quadrotor dynamics (typically on the order of minutes or even hours) can hinder its ability to respond quickly to changing scenarios. Additionally, modeling errors and external disturbances can lead to deviations from the desired trajectory during tracking in real time. This paper proposes a novel approach to computing time-optimal trajectories, by fixing the nodes with waypoint constraints and adopting separate sampling intervals for trajectories between waypoints, which significantly accelerates trajectory planning. Furthermore, the planned paths are tracked via a time-adaptive model predictive control scheme whose allocated tracking time can be adaptively adjusted on-the-fly, therefore enhancing the tracking accuracy and robustness. We evaluate our approach through simulations and experimentally validate its performance in dynamic waypoint scenarios for time-optimal trajectory replanning and trajectory tracking.
|
|
13:30-15:00, Paper WeBT18-AX.6 | Add to My Program |
Invariant Descriptors of Motion and Force Trajectories for Interpreting Object Manipulation Tasks in Contact |
|
Vochten, Maxim | KU Leuven |
Mousavi Mohammadi, Ali | Department of Mechanical Engineering, KU Leuven |
Verduyn, Arno | KU Leuven |
De Laet, Tinne | University of Leuven |
Aertbelien, Erwin | KU Leuven |
De Schutter, Joris | KU Leuven |
Keywords: Optimization and Optimal Control, Kinematics, Learning from Demonstration, Screw Theory
Abstract: Invariant descriptors of point and rigid-body motion trajectories have been proposed in the past as representative task models for motion recognition and generalization. Currently, no invariant descriptor exists for representing force trajectories, which appear in contact tasks. This paper introduces invariant descriptors for force trajectories by exploiting the duality between motion and force. Two types of invariant descriptors are presented depending on whether the trajectories consist of screw or vector coordinates. Methods and software are provided for robustly calculating the invariant descriptors from noisy measurements using optimal control. Using experimental human demonstrations of 3D contour following and peg-on-hole alignment tasks, invariant descriptors are shown to result in task representations that do not depend on the calibration of reference frames or sensor locations. The tuning process for the optimal control problems is shown to be fast and intuitive. Similar to motions in free space, the proposed invariant descriptors for motion and force trajectories may prove useful for the recognition and generalization of constrained motions, such as during object manipulation in contact.
|
|
13:30-15:00, Paper WeBT18-AX.7 | Add to My Program |
Generalizing Trajectory Retiming to Quadratic Objective Functions |
|
Chen, Gerry | Georgia Institute of Technology |
Dellaert, Frank | Verdant Robotics/Georgia Tech |
Hutchinson, Seth | Georgia Institute of Technology |
Keywords: Optimization and Optimal Control, Motion and Path Planning, Constrained Motion Planning
Abstract: Trajectory retiming is the task of computing a feasible time parameterization to traverse a path. It is commonly used in the decoupled approach to trajectory optimization whereby a path is first found, then a retiming algorithm computes a speed profile that satisfies kino-dynamic and other constraints. While trajectory retiming is most often formulated with the minimum-time objective (i.e. traverse the path as fast as possible), it is not always the most desirable objective, particularly when we seek to balance multiple objectives or when bang-bang control is unsuitable. In this paper, we present a novel algorithm based on factor graph variable elimination that can solve for the global optimum of the retiming problem with quadratic objectives as well (e.g. minimize control effort or match a nominal speed by minimizing squared error), which may extend to arbitrary objectives with iteration. Our work extends prior works, which find only solutions on the boundary of the feasible region, while maintaining the same linear time complexity from a single forward-backward pass. We experimentally demonstrate that (1) we achieve better real-world robot performance by using quadratic objectives in place of the minimum-time objective, and (2) our implementation is comparable or faster than state-of-the-art retiming algorithms.
|
|
13:30-15:00, Paper WeBT18-AX.8 | Add to My Program |
A Distributed Processing Approach for Smooth Task Transitioning in Strict Hierarchical Control |
|
Tassi, Francesco | Istituto Italiano Di Tecnologia |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Optimization and Optimal Control, Multi-Robot Systems, Whole-Body Motion Planning and Control
Abstract: To enhance robots' applicability in real-world scenarios, it is essential to establish complex, multi-tasking behaviour, inspired by human nature. To this end, from a hardware perspective, a high number of degrees of freedom is necessary, as is the case for humanoids and collaborative mobile manipulators. From a software standpoint, complex hierarchical strategies are often used to define a set of behaviours that the robot should reflect in strict hierarchical order. Their main issue, however, is the lack of continuity when the stack of tasks changes. Existing works that address this issue present a clear trade-off between optimality assurance during transition and computational cost. Here, we employ a distributed processing approach that enables not only the minimization of computational costs, but also continuous optimality and constraint feasibility even under sharp transitions. The approach is tested during three task transitions, for different tasks such as constrained trajectory tracking, obstacle avoidance, and postural optimization. Two mobile manipulators are used, each having 10 DoF, and the results confirm the smoothness of the generated solutions.
|
|
13:30-15:00, Paper WeBT18-AX.9 | Add to My Program |
Risk-Averse Trajectory Optimization Via Sample Average Approximation |
|
Lew, Thomas | Stanford University |
Bonalli, Riccardo | Laboratoire Des Signaux Et Systèmes |
Pavone, Marco | Stanford University |
Keywords: Optimization and Optimal Control, Planning under Uncertainty, Probability and Statistical Methods
Abstract: Trajectory optimization under uncertainty underpins a wide range of applications in robotics. However, existing methods are limited in terms of reasoning about sources of epistemic and aleatoric uncertainty, space and time correlations, nonlinear dynamics, and non-convex constraints. In this work, we first introduce a continuous-time planning formulation with an average-value-at-risk constraint over the entire planning horizon. Then, we propose a sample-based approximation that unlocks an efficient and general-purpose algorithm for risk-averse trajectory optimization. We prove that the method is asymptotically optimal and derive finite-sample error bounds. Simulations demonstrate the high speed and reliability of the approach on problems with stochasticity in nonlinear dynamics, obstacle fields, interactions, and terrain parameters.
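The sample average approximation of an average-value-at-risk constraint can be illustrated as below, using the Rockafellar-Uryasev form evaluated on sampled costs; the data are synthetic and the embedding inside a trajectory optimizer is omitted, so this is only a sketch of the risk measure itself.

```python
import numpy as np

def cvar_sample_average(costs, alpha=0.1):
    """Sample-average estimate of the average value-at-risk (CVaR) at level alpha.

    Uses the Rockafellar-Uryasev form CVaR_a(Z) = min_t { t + E[(Z - t)^+] / a },
    whose sample minimizer is an empirical (1 - alpha)-quantile of the costs.
    """
    costs = np.asarray(costs, dtype=float)
    t = np.quantile(costs, 1.0 - alpha)
    return t + np.mean(np.maximum(costs - t, 0.0)) / alpha

# usage: check a risk constraint CVaR_alpha(constraint violation) <= 0 over sampled
# uncertainty realizations of a candidate trajectory (values here are synthetic)
rng = np.random.default_rng(0)
violations = rng.normal(loc=-0.5, scale=0.3, size=1000)   # negative = satisfied
print("CVaR_0.1 =", cvar_sample_average(violations, alpha=0.1))
```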
|
|
WeBT19-NT Oral Session, NT-G301 |
Add to My Program |
Medical Robots V |
|
|
Chair: Krieger, Axel | Johns Hopkins University |
Co-Chair: Alambeigi, Farshid | University of Texas at Austin |
|
13:30-15:00, Paper WeBT19-NT.1 | Add to My Program |
Towards a Novel Soft Magnetic Laparoscope for Single Incision Laparoscopic Surgery |
|
Liu, Hui | University of Tennessee Knoxville |
Li, Ning | The University of Tennessee |
Li, Shuai | University of Tennessee Knoxville |
Mancini, Gregory | The University of Tennessee Graduate School of Medicine |
Tan, Jindong | University of Tennessee, Knoxville |
Keywords: Medical Robots and Systems, Soft Robot Materials and Design, Surgical Robotics: Laparoscopy
Abstract: In single-incision laparoscopic surgery (SILS), a magnetic anchoring and guidance system (MAGS) is a promising technique to prevent clutter in the surgical workspace and provide a larger field of vision. Existing camera designs mainly rely on rigid structures, resulting in risks of losing magnetic coupling and impacting tissue during the insertion and coupling procedure. In this paper, we propose a wireless MAGS based on a soft material and structure design. The camera can bend at the exit of the trocar and maintain strong coupling with the external actuator. The operation principle and modeling were established to investigate the parameter design. An easier insertion procedure was introduced and demonstrated in experiments. The bendability was tested, showing that the camera could reach a 20-degree bending angle and a 16.4 mm displacement. The insertion and deployment took less than 2 minutes on average.
|
|
13:30-15:00, Paper WeBT19-NT.2 | Add to My Program |
Magnetic-Guided Flexible Origami Robot Toward Long-Term Phototherapy of H. Pylori in the Stomach |
|
Yuan, Sishen | The Chinese University of Hong Kong |
Liang, Baijia | Chinese University of Hong Kong |
Wong, Po Wa | The Chinese University of Hong Kong |
Xu, Mingjing | The Chinese University of Hong Kong |
Li, Chi Hsuan | The Chinese University of Hong Kong |
Li, Zhen | Qilu Hospital of Shandong University |
Ren, Hongliang | Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS) |
Keywords: Medical Robots and Systems, Soft Robot Materials and Design
Abstract: Helicobacter pylori, a pervasive bacterial infection associated with gastrointestinal disorders such as gastritis, peptic ulcer disease, and gastric cancer, impacts approximately 50% of the global population. The efficacy of standard clinical eradication therapies is diminishing due to the rise of antibiotic-resistant strains, necessitating alternative treatment strategies. Photodynamic therapy (PDT) emerges as a promising prospect in this context. This study presents the development and implementation of a magnetically-guided origami robot, incorporating flexible printed circuit units for sustained and stable phototherapy of Helicobacter pylori. Each integrated unit is equipped with wireless charging capabilities, producing an optimal power output that can concurrently illuminate up to 15 LEDs at their maximum intensity. Crucially, these units can be remotely manipulated via a magnetic field, facilitating both translational and rotational movements. We propose an open-loop manual control sequence that allows the formation of a stable, compliant triangular structure through the interaction of internal magnets. This adaptable configuration is uniquely designed to withstand the dynamic squeezing environment prevalent in real-world gastric applications. The research herein represents a significant stride in leveraging technology for innovative medical solutions, particularly in the management of antibiotic-resistant Helicobacter pylori infections.
|
|
13:30-15:00, Paper WeBT19-NT.3 | Add to My Program |
Flexible Tactile-Sensing Gripper Design and Excessive Force Protection Function for Endovascular Surgery Robots |
|
Lyu, Chuqiao | Beijing Institute of Technology |
Guo, Shuxiang | Kagawa University |
Yan, Yonggan | Beijing Institute of Technology |
Zhang, Yongxin | Changhai Hospital |
Zhang, Yongwei | Changhai Hospital |
Yang, Pengfei | Changhai Hospital |
Liu, Jianmin | Changhai Hospital |
Keywords: Medical Robots and Systems, Soft Sensors and Actuators, Force and Tactile Sensing
Abstract: Research on endovascular surgery robots (ESRs) is developing continuously, because ESRs can protect surgeons from radiation exposure. For designing an ESR manipulator, the main challenge is controlling the soft surgical tools and measuring the endovascular stress simultaneously. To solve these problems, a flexible tactile-sensing gripper (FTG) is designed in this study. Firstly, a catheter grasping model is constructed, and the factors affecting the force measurement are quantitatively analyzed. Secondly, simulation experiments based on FTG models of three different sizes are conducted. When the catheter force is too large, shrinking the grasping distance of the FTG can avoid surgical risk. This method preserves the surgeon's behavior and controls the catheter force at the same time, and is named the excessive force protection function (EFPF). Thirdly, an FTG prototype that meets the surgical requirements is fabricated and integrated into the ESR manipulator. This manipulator can measure the catheter forces by detecting the coordinates of marks on the FTG surface. The calibrated FTG achieves average and maximum force-sensing errors of approximately 37 mN and 223 mN, respectively. Finally, in the experiment of carotid artery catheterization, EFPF can control the catheter force within 393 mN, which is far less than the control group's 1351 mN.
|
|
13:30-15:00, Paper WeBT19-NT.4 | Add to My Program |
Simultaneous Estimation of Shape and Force Along Highly Deformable Surgical Manipulators Using Sparse FBG Measurement |
|
Lu, Yiang | The Chinese University of Hong Kong |
Li, Bin | The Chinese University of Hong Kong |
Chen, Wei | The Chinese University of Hong Kong |
Yan, Junyan | The Chinese University of Hong Kong |
Cheng, Shing Shin | The Chinese University of Hong Kong |
Wang, Jiangliu | The Chinese University of Hong Kong |
Zhou, Jianshu | The Chinese University of Hong Kong |
Dou, Qi | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Medical Robots and Systems, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Recently, fiber optic sensors such as fiber Bragg gratings (FBGs) have been widely investigated for shape reconstruction and force estimation of flexible surgical robots. However, most existing approaches need precise model parameters of the FBGs inside the fiber and their alignments with the flexible robots for accurate sensing results. Another challenge lies in acquiring, online, the external forces at arbitrary locations along the flexible robots, which is highly required when large deflections occur in robotic surgery. In this paper, we propose a novel data-driven paradigm for simultaneous estimation of shape and force along highly deformable flexible robots by using sparse strain measurements from a single-core FBG fiber. A thin-walled soft sensing tube helically embedded with FBG sensors is designed for a robot-assisted flexible ureteroscope with large deflection up to 270 degrees and a bend radius under 10 mm. We introduce and study three learning models by incorporating spatial strain encoders, and compare their performances both in free space without interactions and in constrained environments with contact forces at different locations. The experimental results in terms of dynamic shape-force sensing accuracy demonstrate the effectiveness and superiority of the proposed methods.
|
|
13:30-15:00, Paper WeBT19-NT.5 | Add to My Program |
Autonomous System for Tumor Resection (ASTR)-Dual-Arm Robotic Midline Partial Glossectomy |
|
Ge, Jiawei | Johns Hopkins University |
Kam, Michael | Johns Hopkins University |
Opfermann, Justin | Johns Hopkins University |
Saeidi, Hamed | University of North Carolina Wilmington |
Leonard, Simon | The Johns Hopkins University |
Mady, Leila | Johns Hopkins University |
Schnermann, Martin | National Cancer Institute |
Krieger, Axel | Johns Hopkins University |
Keywords: Medical Robots and Systems, Software Architecture for Robotic and Automation, Control Architectures and Programming
Abstract: Head and neck cancer is the seventh most common cancer worldwide, with squamous cell carcinoma being the most prevalent histologic type. Surgical resection is a primary treatment modality, and precisely identifying tumor edges and ensuring adequate resection margins are critical for optimizing oncologic outcomes. This letter presents an innovative autonomous system for tumor resection (ASTR) and conducts a feasibility study by performing autonomous midline partial glossectomy for pseudotumor with millimeter accuracy. ASTR consists of a dual-camera vision system, an electrosurgical tool, a vacuum grasping tool, two 6-DOF manipulators, and an autonomous control system. The letter introduces an ontology-based research framework for creating and implementing a complex autonomous surgical workflow, using the glossectomy as a case study. Porcine tongues are used in this study, and marked using color inks and near-infrared fluorescent (NIRF) markers to indicate the pseudotumor. ASTR monitors the NIRF markers and gathers spatial and color data from the samples, enabling planning and execution of robot trajectories in accordance with the proposed glossectomy workflow. The system successfully performs six consecutive supervised autonomous pseudotumor resections on porcine samples. The average surface and depth resection errors measure 0.73±0.60mm and 1.89±0.54mm, respectively. The resection accuracy is demonstrated to be on par with manual glossectomy performed by an otolaryngologist.
|
|
13:30-15:00, Paper WeBT19-NT.6 | Add to My Program |
A Semi-Autonomous Data Driven Shared Control Framework for Robotic Manipulation and Cutting of an Unknown Deformable Tissue |
|
Strohmeyer, Nicholas | University of Texas at Austin |
Park, Ji Hwan | The University of Texas at Austin |
Murphy, Braden | The University of Texas at Austin |
Alambeigi, Farshid | University of Texas at Austin |
Keywords: Medical Robots and Systems, Surgical Robotics: Laparoscopy, Human-Robot Teaming
Abstract: In this work, we propose a semi-autonomous scheme to synergistically share the complicated task of manipulation and cutting of an unknown deformable tissue (U-DT) between a remote surgeon and a surgical robot. Particularly, utilizing the da Vinci Research Kit (dVRK) platform, we have designed and successfully demonstrated a fully functional shared control scheme for an autonomous tensioning and tele-cutting of a U-DT. We have shown the system's ability to cooperate with a remote surgeon by leveraging an online data-driven learning and adaptive control method coupled with a reduced-order trajectory planning module that depends on just two parameters. By performing 25 experiments on custom-designed silicon phantoms and defining a set of success/failure metrics, we have put forward findings that establish a causal relationship between these two important parameters and the success or failure of the performed experiments.
|
|
13:30-15:00, Paper WeBT19-NT.7 | Add to My Program |
Design and Evaluation of a Modular Robotic System for Microsurgery |
|
Torrealba Molina, Jenireth | Imperial College London |
AbuBaker, Toqa | Imperial College London |
Huang, Yanpei | Imperial College London |
Cheng, Xiaoxiao | Imperial College of Science, Technology and Medicine, London UK |
Devillard, Alexis | Imperial College London |
Burdet, Etienne | Imperial College London |
Keywords: Medical Robots and Systems, Surgical Robotics: Laparoscopy, Mechanism Design
Abstract: The manipulation of instruments under a microscope suffers from physiological tremor and human errors, which are inevitable in long microsurgery interventions. Robotic systems developed in recent years for microsurgery are expensive and not flexible, as they cannot use standard instruments, and need the surgeon to modify their operative skills and strategies. In this paper, we introduce a modular robotic system for microsurgery enabling the surgeon to operate using conventional instruments. Our system was implemented using a commercial Kinova robot and a dedicated modular end-effector that uses standard microsurgery instruments. An initial teleoperation validation was carried out by eleven participants, who could successfully control the microsurgery tools to perform basic surgical movements. Furthermore, participants performed a simple anastomosis task with the robot and compared it to manual control. The results showed that robotic control is superior to manual control in simple surgical tasks and the converse in complex tasks. Participants preferred the proposed robotic system due to its user-friendliness and effort reduction.
|
|
13:30-15:00, Paper WeBT19-NT.8 | Add to My Program |
Analyzing Accessibility in Robot-Assisted Vitreoretinal Surgery: Integrating Eye Posture and Robot Position |
|
Inagaki, Satoshi | NSK.Ltd |
Alikhani, Alireza | Augen Klinik Und Poliklinik, Klinikum Rechts Der Isar Der Techn |
Navab, Nassir | TU Munich |
Maier, Mathias | Klinikum Rechts Der Isar Der TU München |
Nasseri, M. Ali | Technische Universitaet Muenchen |
Keywords: Medical Robots and Systems, Surgical Robotics: Planning, Optimization and Optimal Control
Abstract: Several robotic frameworks have been recently developed to assist ophthalmic surgeons in performing complex vitreoretinal procedures such as subretinal injection. However, in order to intuitively integrate robots into the surgical workflow, an accessibility analysis framework for vitreoretinal surgery must be considered an essential component. Such a framework ideally considers the comprehensive factors of the eye anatomy and its positioning, the insertion point, and the initial pose and position of the robot. By combining the mobilization of the eyeball and adjusting the pose and position of the robot, the accessibility of such systems is significantly optimized. At the same time, the accessible-visible area is better and faster matched to the working volume of the robot. This paper presents an analysis of an expansion strategy for the robot's accessibility and visibility area. The outcomes of this method demonstrate the promising potential to enhance the robot's accessibility, as evidenced by our analytical and experimental findings, which show an increase from 22.4% to 99.0% of the required working area on an adjustable phantom model.
|
|
WeBT20-NT Oral Session, NT-G302 |
Add to My Program |
Robot Safety I |
|
|
Chair: Khorrami, Farshad | New York University Tandon School of Engineering |
Co-Chair: Tumova, Jana | KTH Royal Institute of Technology |
|
13:30-15:00, Paper WeBT20-NT.1 | Add to My Program |
Fault Tolerant Neural Control Barrier Functions for Robotic Systems under Sensor Faults and Attacks |
|
Zhang, Hongchao | Washington University in St. Louis |
Niu, Luyao | University of Washington |
Clark, Andrew | Washington University in St. Louis |
Poovendran, Radha | University of Washington |
Keywords: Robot Safety, AI-Based Methods, Robust/Adaptive Control
Abstract: Safety is a fundamental requirement of many robotic systems. Control barrier function (CBF)-based approaches have been proposed to guarantee the safety of robotic systems. However, the effectiveness of these approaches highly relies on the choice of CBFs. Inspired by the universal approximation power of neural networks, there is a growing trend toward representing CBFs using neural networks, leading to the notion of neural CBFs (NCBFs). Current NCBFs, however, are trained and deployed in benign environments, making them ineffective for scenarios where robotic systems experience sensor faults and attacks. In this paper, we study safety-critical control synthesis for robotic systems under sensor faults and attacks. Our main contribution is the development and synthesis of a new class of CBFs that we term fault tolerant neural control barrier function (FT-NCBF). We derive the necessary and sufficient conditions for FT-NCBFs to guarantee safety, and develop a data-driven method to learn FT-NCBFs by minimizing a loss function constructed using the derived conditions. Using the learned FT-NCBF, we synthesize a control input and formally prove the safety guarantee provided by our approach. We demonstrate our proposed approach using two case studies: the obstacle avoidance problem for an autonomous mobile robot and the spacecraft rendezvous problem.
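[Editor's note] A minimal sketch of how a neural CBF can be trained from sampled states by penalizing violations of the barrier conditions. The network architecture, sampling scheme, loss weights, and the placeholder data are illustrative assumptions; the fault-tolerance terms that distinguish the paper's FT-NCBF are not reproduced here.

```python
# Hypothetical sketch: train a neural barrier h_theta(x) to be positive on safe
# samples, negative on unsafe samples, and to satisfy a decrease condition
# dh/dt + alpha*h >= 0 along sampled dynamics. Not the authors' implementation.
import torch
import torch.nn as nn

class NCBF(nn.Module):
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def ncbf_loss(h, x_safe, x_unsafe, x_dyn, xdot, alpha=1.0, margin=0.1):
    # Classification terms: h > 0 on safe states, h < 0 on unsafe states.
    l_safe = torch.relu(margin - h(x_safe)).mean()
    l_unsafe = torch.relu(margin + h(x_unsafe)).mean()
    # Forward-invariance term along sampled state derivatives xdot.
    x_dyn = x_dyn.requires_grad_(True)
    hx = h(x_dyn)
    grad_h = torch.autograd.grad(hx.sum(), x_dyn, create_graph=True)[0]
    hdot = (grad_h * xdot).sum(dim=-1)
    l_inv = torch.relu(-(hdot + alpha * hx)).mean()
    return l_safe + l_unsafe + l_inv

# Toy training loop on placeholder data.
h = NCBF(state_dim=4)
opt = torch.optim.Adam(h.parameters(), lr=1e-3)
x_safe, x_unsafe = torch.randn(256, 4), torch.randn(256, 4) + 3.0
x_dyn, xdot = torch.randn(256, 4), torch.randn(256, 4)
for _ in range(100):
    opt.zero_grad()
    loss = ncbf_loss(h, x_safe, x_unsafe, x_dyn, xdot)
    loss.backward()
    opt.step()
```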
|
|
13:30-15:00, Paper WeBT20-NT.2 | Add to My Program |
Belief Control Barrier Functions for Risk-Aware Control |
|
Vahs, Matti | KTH Royal Institute of Technology, Stockholm |
Pek, Christian | Delft University of Technology |
Tumova, Jana | KTH Royal Institute of Technology |
Keywords: Robot Safety, Autonomous Agents, Formal Methods in Robotics and Automation
Abstract: Ensuring safety in real-world robotic systems is often challenging due to unmodeled disturbances and noisy sensor measurements. To account for such stochastic uncertainties, many robotic systems leverage probabilistic state estimators such as Kalman filters to obtain a robot's belief, i.e., a probability distribution over possible states. We propose belief control barrier functions (BCBFs) to enable risk-aware control synthesis, leveraging all information provided by state estimators. This allows robots to stay in predefined safety regions with desired confidence under these stochastic uncertainties. BCBFs are general and can be applied to a variety of robotic systems that use extended Kalman filters as state estimators. We demonstrate BCBFs on a quadrotor that is exposed to external disturbances and varying sensing conditions. Our results show improved safety compared to traditional state-based approaches while allowing control frequencies of up to 1 kHz.
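[Editor's note] One common way to build a belief-space safety margin from an EKF estimate, shown as an illustrative sketch rather than the paper's exact BCBF: for a linear safety constraint a^T x <= b under a Gaussian belief N(mu, Sigma), require the constraint to hold with a chosen confidence. The confidence level and example numbers are assumptions.

```python
# Illustrative belief barrier for a half-space constraint a^T x <= b under a
# Gaussian belief N(mu, Sigma) from an EKF (standard chance-constraint form).
import numpy as np
from scipy.stats import norm

def belief_barrier(mu, Sigma, a, b, confidence=0.95):
    # h_B >= 0  <=>  Pr(a^T x <= b) >= confidence for x ~ N(mu, Sigma).
    c = norm.ppf(confidence)          # e.g. ~1.645 for 95 % confidence
    std = np.sqrt(a @ Sigma @ a)      # standard deviation of a^T x
    return b - a @ mu - c * std

# Example: stay left of x1 = 2.0 with 95 % confidence.
mu = np.array([1.2, 0.4])
Sigma = np.diag([0.04, 0.09])
a, b = np.array([1.0, 0.0]), 2.0
print(belief_barrier(mu, Sigma, a, b))   # > 0 means the belief is safe
```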
|
|
13:30-15:00, Paper WeBT20-NT.3 | Add to My Program |
A Novel Algorithmic Approach to Obtaining Maneuverable Control-Invariant Sets |
|
Solanki, Prashant | Delft University of Technology |
van Beers, Jasper | Delft University of Technology |
Jamshidnejad, A. | Delft University of Technology |
de Visser, Coen | TU Delft |
Keywords: Robot Safety, Collision Avoidance, Autonomous Vehicle Navigation
Abstract: Much effort in the field of reachability analysis has been devoted to obtaining control-invariant sets, which ensure that a system starting inside them never has to leave them and are thus essential for guaranteeing a system's safety. However, control invariance does not imply that a system can move from any state in the control-invariant set to any other state within a given time horizon. In this paper we develop an algorithm to obtain a control-invariant set that allows a given system to move from any state in the set to any other state in the set within a given time horizon without having to leave the set. We call this set the 'maneuver set' M. We substantiate the algorithm's efficacy through mathematical proof, affirming that the maneuver set obtained through the application of the algorithm is indeed control-invariant. Furthermore, we prove that the system is indeed able to move from any state within this set to any other state. To illustrate the use of our algorithm, we provide the numerical example of a Dubins car, utilising HJB reachability analysis along with the algorithm to obtain the maneuver set.
|
|
13:30-15:00, Paper WeBT20-NT.4 | Add to My Program |
Safe Navigation and Obstacle Avoidance Using Differentiable Optimization Based Control Barrier Functions |
|
Dai, Bolun | New York University |
Khorrambakht, Rooholla | New York University |
Krishnamurthy, Prashanth | New York University Tandon School of Engineering |
Gonçalves, Vinicius Mariano | New York University Abu Dhabi, United Arab Emirates |
Tzes, Anthony | New York University Abu Dhabi |
Khorrami, Farshad | New York University Tandon School of Engineering |
Keywords: Robot Safety, Collision Avoidance
Abstract: Control barrier functions (CBFs) have been widely applied to safety-critical robotic applications. However, the construction of control barrier functions for robotic systems remains a challenging task. Recently, collision detection using differentiable optimization has provided a way to compute the minimum uniform scaling factor that results in an intersection between two convex shapes and to also compute the Jacobian of the scaling factor. In this paper, we propose a framework that uses this scaling factor, with an offset, to systematically define a CBF for obstacle avoidance tasks. We provide theoretical analyses of the continuity and continuous differentiability of the proposed CBF. We empirically evaluate the proposed CBF's behavior and show that the resulting optimal control problem is computationally efficient, which makes it applicable for real-time robotic control. We validate our approach, first using a 2D mobile robot example, then on the Franka Emika Research 3 (FR3) robot manipulator both in simulation and experiment.
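[Editor's note] A hedged sketch of the downstream safety filter such a CBF enables: given a barrier value h(x) (standing in for "scaling factor minus one minus offset") and its gradient, a small QP modifies the nominal control as little as possible while enforcing the CBF condition. The dynamics, gain, and the placeholder barrier are illustrative; the differentiable collision detection that produces h in the paper is not reproduced.

```python
# Generic CBF-QP filter: min ||u - u_nom||^2  s.t.  grad_h*(f + g u) >= -gamma*h(x).
import numpy as np
import cvxpy as cp

def cbf_qp_filter(u_nom, h_val, grad_h, f, g, gamma=2.0):
    """u_nom: nominal control, h_val: h(x), grad_h: dh/dx,
    f, g: control-affine dynamics x_dot = f(x) + g(x) u."""
    u = cp.Variable(u_nom.shape[0])
    constraint = [grad_h @ f + grad_h @ g @ u >= -gamma * h_val]
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)), constraint)
    prob.solve()
    return u.value

# Toy example: single integrator avoiding a unit disc around the origin.
x = np.array([1.5, 0.2])
h_val = x @ x - 1.0                  # placeholder barrier (not the paper's)
grad_h = 2.0 * x
f, g = np.zeros(2), np.eye(2)
u_nom = np.array([-1.0, 0.0])        # nominal control pushes toward the obstacle
print(cbf_qp_filter(u_nom, h_val, grad_h, f, g))
```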
|
|
13:30-15:00, Paper WeBT20-NT.5 | Add to My Program |
Achieving Autonomous Cloth Manipulation with Optimal Control Via Differentiable Physics-Aware Regularization and Safety Constraints |
|
Zhang, Yutong | University of California San Diego |
Liu, Fei | UCSD |
Liang, Xiao | University of California San Diego |
Yip, Michael C. | University of California, San Diego |
Keywords: Robot Safety, Computational Geometry, Simulation and Animation
Abstract: Cloth manipulation is a category of deformable object manipulation of great interest to the robotics community, from applications of automated laundry-folding and home organizing to textiles and flexible manufacturing. Despite the desire for automated cloth manipulation, the thin-shell dynamics and under-actuated nature of cloth present significant challenges for robots to effectively interact with them. Many recent works omit explicit modeling in favor of learning-based methods that may yield control policies directly. However, these methods require large training sets that must be collected and curated. In this regard, we create a framework for differentiable modeling of cloth dynamics leveraging an Extended Position-based Dynamics (XPBD) algorithm. Together with the desired control objective, physics-aware regularization terms are designed for better results, including trajectory smoothness and elastic potential energy. In addition, safety constraints, such as avoiding obstacles, can be specified using signed distance functions (SDFs). We formulate the cloth manipulation task with safety constraints as a constrained optimization problem, which can be effectively solved by mainstream gradient-based optimizers thanks to the end-to-end differentiability of our framework. Finally, we assess the framework with various safety thresholds and demonstrate the feasibility of the resulting trajectories on a surgical robot. The effects of the regularization terms are analyzed in an additional ablation study.
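[Editor's note] For reference, a minimal sketch of the core XPBD update for a single distance constraint, the basic building block of such cloth solvers. The compliance value, time step, and two-particle setup are illustrative; the paper's full cloth model, regularizers, and SDF constraints are not shown.

```python
# Minimal XPBD step for one distance constraint C = |x1 - x2| - rest_len.
# alpha_tilde = compliance / dt^2; lam is the accumulated constraint multiplier.
import numpy as np

def xpbd_distance_step(x1, x2, w1, w2, rest_len, lam, compliance, dt):
    n = x1 - x2
    dist = np.linalg.norm(n)
    n = n / dist
    C = dist - rest_len
    alpha_t = compliance / dt**2
    dlam = (-C - alpha_t * lam) / (w1 + w2 + alpha_t)
    # Positional corrections along the constraint gradient.
    x1 = x1 + w1 * dlam * n
    x2 = x2 - w2 * dlam * n
    return x1, x2, lam + dlam

# Two unit-inverse-mass particles stretched 20 % beyond rest length.
x1, x2 = np.array([0.0, 0.0, 0.0]), np.array([1.2, 0.0, 0.0])
lam = 0.0
for _ in range(10):   # a few solver iterations within one substep
    x1, x2, lam = xpbd_distance_step(x1, x2, 1.0, 1.0, 1.0, lam, 1e-6, 1e-2)
print(np.linalg.norm(x1 - x2))   # approaches the rest length 1.0
```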
|
|
13:30-15:00, Paper WeBT20-NT.6 | Add to My Program |
Online Data-Driven Safety Certification for Systems Subject to Unknown Disturbances |
|
Rober, Nicholas | Massachusetts Institute of Technology |
Mahesh, Karan | Aurora Flight Sciences |
Paine, Tyler | Massachusetts Institute of Technology |
Greene, Max | University of Florida |
Lee, Steven | Carnegie Mellon University |
Monteiro, Sildomar | Boeing Research and Technology |
Benjamin, Michael | Massachusetts Institute of Technology |
How, Jonathan | Massachusetts Institute of Technology |
Keywords: Robot Safety, Formal Methods in Robotics and Automation, Failure Detection and Recovery
Abstract: Deploying autonomous systems in safety critical settings necessitates methods to verify their safety properties. This is challenging because real-world systems may be subject to disturbances that affect their performance, but are unknown a priori. This work develops a safety-verification strategy wherein data is collected online and incorporated into a reachability analysis approach to check in real-time that the system avoids dangerous regions of the state space. Specifically, we employ an optimization-based moving horizon estimator (MHE) to characterize the disturbance affecting the system, which is incorporated into an online reachability calculation. Reachable sets are calculated using a computational graph analysis tool to predict the possible future states of the system and verify that they satisfy safety constraints. We include theoretical arguments proving our approach generates reachable sets that bound the future states of the system, as well as numerical results demonstrating how it can be used for safety verification. Finally, we present results from hardware experiments demonstrating our approach's ability to perform online reachability calculations for an unmanned surface vehicle subject to currents and actuator failures.
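[Editor's note] A hedged sketch of the reachability side only: once an online estimator has bounded the disturbance, axis-aligned reachable boxes for a discrete-time linear model can be propagated forward and checked against an unsafe region. The linear model, bounds, and box representation are simplifying assumptions; the paper's moving horizon estimator and computational-graph reachability tool are not reproduced.

```python
# Forward reachable boxes for x_{k+1} = A x_k + B u_k + w_k with |w_k| <= w_bound
# (element-wise). Boxes are represented by (center, half-width).
import numpy as np

def propagate_box(A, B, u, center, radius, w_bound):
    c_next = A @ center + B @ u
    r_next = np.abs(A) @ radius + w_bound   # interval image of a box plus disturbance
    return c_next, r_next

def intersects_unsafe(center, radius, lo, hi):
    # Box [center-radius, center+radius] vs axis-aligned unsafe box [lo, hi].
    return np.all(center + radius >= lo) and np.all(center - radius <= hi)

# Toy double-integrator example; w_bound plays the role of the estimator's output.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
c, r = np.array([0.0, 1.0]), np.array([0.05, 0.05])
w_bound = np.array([0.0, 0.02])
for k in range(20):
    c, r = propagate_box(A, B, np.array([0.0]), c, r, w_bound)
    if intersects_unsafe(c, r, lo=np.array([2.0, -np.inf]), hi=np.array([np.inf, np.inf])):
        print(f"possible unsafe set entry at step {k}")
        break
```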
|
|
13:30-15:00, Paper WeBT20-NT.7 | Add to My Program |
Towards Standardized Disturbance Rejection Testing of Legged Robot Locomotion with Linear Impactor: A Preliminary Study, Observations, and Implications |
|
Weng, Bowen | Iowa State University |
Castillo, Guillermo A. | The Ohio State University |
Kang, Yun-Seok | The Ohio State University |
Hereid, Ayonga | Ohio State University |
Keywords: Robot Safety, Legged Robots, Humanoid and Bipedal Locomotion
Abstract: Dynamic locomotion in legged robots is close to industrial collaboration, but a lack of standardized testing obstructs commercialization. The issues are not merely political, theoretical, or algorithmic but also physical, indicating that studies and understanding of standard testing infrastructure and equipment remain limited. For decades, the approaches used to test legged robots (hand-pushing, foot-kicking, rope-dragging, stick-poking, and ball-swinging) have rarely been standardizable. This paper aims to bridge the gap by proposing the linear impactor, a well-established tool in other standardized testing disciplines, as adaptive, repeatable, and fair disturbance-rejection testing equipment for legged robots. A pneumatic linear impactor is adopted for a case study involving the humanoid robot Digit. Three locomotion controllers are examined, including a commercial one, using a walking-in-place task against frontal impacts. The statistically best controller was able to withstand an impact momentum (26.376 kg⋅m/s) on par with the reported average effective momentum of straight punches by Olympic boxers (26.506 kg⋅m/s). Moreover, the case study highlights other counter-intuitive observations, demonstrations, and implications that, to the best of the authors' knowledge, are revealed for the first time in real-world testing of legged robots.
|
|
13:30-15:00, Paper WeBT20-NT.8 | Add to My Program |
Detecting and Mitigating System-Level Anomalies of Vision-Based Controllers |
|
Gupta, Aryaman | Indian Institute of Technology (BHU), Varanasi |
Chakraborty, Kaustav | University of Southern California |
Bansal, Somil | University of Southern California |
Keywords: Robot Safety, Machine Learning for Robot Control, Vision-Based Navigation
Abstract: Autonomous systems, such as self-driving cars and drones, have made significant strides in recent years by leveraging visual inputs and machine learning for decision-making and control. Despite their impressive performance, these vision-based controllers can make erroneous predictions when faced with novel or out-of-distribution inputs. Such errors can cascade to catastrophic system failures and compromise system safety. In this work, we introduce a run-time anomaly monitor to detect and mitigate such closed-loop, system-level failures. Specifically, we leverage a reachability-based framework to stress-test the vision-based controller offline and mine its system-level failures. This data is then used to train a classifier that is leveraged online to flag inputs that might cause system breakdowns. The anomaly detector highlights issues that transcend individual modules and pertain to the safety of the overall system. We also design a fallback controller that robustly handles these detected anomalies to preserve system safety. We validate the proposed approach on an autonomous aircraft taxiing system that uses a vision-based controller for taxiing. Our results show the efficacy of the proposed approach in identifying and handling system-level anomalies, outperforming methods such as prediction error-based detection and ensembling, thereby enhancing the overall safety and robustness of autonomous systems.
|
|
13:30-15:00, Paper WeBT20-NT.9 | Add to My Program |
Generative Modeling of Residuals for Real-Time Risk-Sensitive Safety with Discrete-Time Control Barrier Functions |
|
Cosner, Ryan | California Institute of Technology |
Sadalski, Igor | California Institute of Technology |
Woo, Jana | California Institute of Technology |
Culbertson, Preston | Stanford University |
Ames, Aaron | California Institute of Technology |
Keywords: Robot Safety, Machine Learning for Robot Control
Abstract: A key source of brittleness for robotic systems is the presence of model uncertainty and external disturbances. Existing approaches to robust control either seek to bound the worst-case disturbance (which results in conservative behavior), or to learn a deterministic dynamics model (which is unable to capture uncertain dynamics or disturbances). This work proposes a different approach: training a state-conditioned generative model to represent the distribution of errors between the nominal dynamics and the actual system. This learned disturbance model can be used in conjunction with probabilistic safety methods such as Discrete-Time Control Barrier Functions (DTCBFs) to ensure the safety of the system up to some level of risk. For this work, we focus on learning the dynamics uncertainties of a quadrotor drone, which is subject to complex aerodynamic interactions between the aircraft and the environment or which is carrying an unmodeled payload. We use a conditional variational autoencoder (CVAE) to learn a state-conditioned disturbance distribution, and find the resulting probabilistic safety controller exhibits less conservative behavior while retaining theoretical safety properties.
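[Editor's note] A compact, illustrative CVAE sketch for a state-conditioned disturbance model (condition: state; target: residual between nominal and observed dynamics). Layer sizes, latent dimension, the Gaussian decoder, and the placeholder data are assumptions; this is not the paper's implementation.

```python
# Minimal conditional VAE: learn p(residual | state). Sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    def __init__(self, state_dim, res_dim, latent_dim=4, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(state_dim + res_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * latent_dim))
        self.dec = nn.Sequential(nn.Linear(state_dim + latent_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, res_dim))
        self.latent_dim = latent_dim

    def forward(self, state, residual):
        mu, logvar = self.enc(torch.cat([state, residual], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        recon = self.dec(torch.cat([state, z], -1))
        return recon, mu, logvar

    def sample(self, state, n=100):
        # Draw disturbance samples conditioned on the current state (1-D tensor).
        z = torch.randn(n, self.latent_dim)
        s = state.expand(n, -1)
        return self.dec(torch.cat([s, z], -1))

def cvae_loss(recon, target, mu, logvar, beta=1e-3):
    rec = F.mse_loss(recon, target)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kld

# Training sketch on (state, residual) pairs collected from flight data.
model = CVAE(state_dim=10, res_dim=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
states, residuals = torch.randn(1024, 10), torch.randn(1024, 3)   # placeholder data
for _ in range(200):
    recon, mu, logvar = model(states, residuals)
    loss = cvae_loss(recon, residuals, mu, logvar)
    opt.zero_grad()
    loss.backward()
    opt.step()
```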
|
|
WeBT22-NT Oral Session, NT-G304 |
Add to My Program |
Marine Robotics V |
|
|
Chair: Kim, Jinwhan | KAIST |
Co-Chair: Sattar, Junaed | University of Minnesota |
|
13:30-15:00, Paper WeBT22-NT.1 | Add to My Program |
An Open-Source Solution for Fast and Accurate Underwater Mapping with a Low-Cost Mechanical Scanning Sonar |
|
Hansen, Tim | Constructor University |
Birk, Andreas | Constructor University |
Keywords: Marine Robotics, SLAM, Hardware-Software Integration in Robotics
Abstract: An open-source software framework is presented that allows real-time underwater mapping with popular marine robotics components, namely a BlueRobotics BlueROV2 with its standard Ping360 Mechanical Scanning Sonar (MSS) and an A50 Doppler Velocity Log (DVL), which are among the lowest-cost devices of their respective types, if not the most affordable on the market. The software runs with low computational power on a Raspberry Pi 4. The framework builds upon Synthetic Scan Formation (SSF), where single MSS beams or scan-lines are embedded into a pose-graph. The rendering of scans is based not only on navigation but also on the graph itself. Scans formed from scan-lines can be optimized by online Simultaneous Localization and Mapping (SLAM), resulting in improved scans based on the current state of the graph. In subsequent steps this leads to improved registration results. To this end, a combination of two types of loop closures is presented: a consecutive loop closure and a proximity-based loop closure, which reduces the overall drift. The framework is validated in three different test environments: a pool, a test tank with a gantry for ground-truth motion, and the flooded basement of a WWII submarine bunker. Among other results, it is shown that accuracy increases compared to conventional SLAM and that the software is usable in real time during a mission with the low-cost hardware.
|
|
13:30-15:00, Paper WeBT22-NT.2 | Add to My Program |
Boundary Factors for Seamless State Estimation between Autonomous Underwater Docking Phases |
|
Terán Espinoza, Aldo | KTH Royal Institute of Technology |
Teran Espinoza, Antonio | Massachusetts Institute of Technology |
Folkesson, John | KTH |
Sigray, Peter | KTH Royal Institute of Technology |
Kuttenkeuler, Jakob | KTH Royal Institute of Technology |
Keywords: Marine Robotics, SLAM, Sensor Fusion
Abstract: Autonomous underwater docking is of the utmost importance for expanding the capabilities of Autonomous Underwater Vehicles (AUVs). Due to a historical focus on underwater docking to only static targets, the research gap in underwater docking to dynamically active targets has been left relatively untouched. We address the state estimation problem that arises when trying to rendezvous a chaser AUV with a dynamic target by modeling the scenario as a factor graph optimization-based Simultaneous Localization and Mapping problem. We present a set of boundary factors that aid the inference process by seamlessly transitioning the target's state between the different observability stages, intrinsic to any dynamic docking scenario. We benchmark the performance of our approach using the Stonefish simulated environment.
|
|
13:30-15:00, Paper WeBT22-NT.3 | Add to My Program |
Vision-Based Water Clearance Determination in Maritime Environment |
|
Schiller, Carl | ABB Corporate Research, Baden-Dättwil, Switzerland |
Maas, Deran | ABB Corporate Research, Baden-Dättwil, Switzerland |
Arsenali, Bruno | ABB |
Peltola, Jukka | ABB Marine and Ports |
Tervo, Kalevi | Aalto University School of Science and Technology |
Maranò, Stefano | ABB Corporate Research |
Keywords: Marine Robotics, Vision-Based Navigation, Autonomous Vehicle Navigation
Abstract: Determining the distances from the hull of the own ship to obstacles or land, i.e., water clearance, is a fundamental task in navigation. This is particularly relevant during maneuvering in the harbor or navigating in confined waters. We introduce the concepts of area water clearance and line water clearance. Area water clearance is important especially for path planning and obstacle avoidance. Line water clearance is critical for maneuvering when approaching the quay. In this work, we present a vision-based approach to determine the water clearance. A single calibrated camera together with a semantic segmentation network is used to detect the water region in an image, and back-projection is used to determine the water clearance on the sea surface in world units. We validate the proposed approach on real data collected from two distinct vessels, where the proposed method is able to produce reliable water clearance for distances beyond one kilometer. During harbor maneuvering, 90% of the relative water clearance errors were found to be between −2.3% and 3%.
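[Editor's note] A minimal sketch of the back-projection step, assuming a calibrated pinhole camera with known pose and a locally flat sea surface at z = 0; the segmentation network and the clearance definitions themselves are not shown. The intrinsics, camera height, and depression angle in the example are assumptions.

```python
# Back-project a pixel on the detected water/obstacle boundary onto the sea plane
# z = 0, given intrinsics K and pose (R, t) with the convention x_cam = R x_world + t.
import numpy as np

def pixel_to_sea_plane(u, v, K, R, t):
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing ray in camera frame
    ray_world = R.T @ ray_cam                            # rotate ray into world frame
    cam_center = -R.T @ t                                # camera center in world frame
    s = -cam_center[2] / ray_world[2]                    # intersect the plane z = 0
    return cam_center + s * ray_world

# Example: camera 20 m above the water, looking along +y, pitched 10 degrees down.
pitch = np.deg2rad(10.0)
c, s = np.cos(pitch), np.sin(pitch)
R = np.array([[1.0, 0.0, 0.0],     # rows of R are the camera axes in world coords
              [0.0,  -s,  -c],
              [0.0,   c,  -s]])
C = np.array([0.0, 0.0, 20.0])
t = -R @ C
K = np.array([[1200.0, 0.0, 960.0],
              [0.0, 1200.0, 540.0],
              [0.0, 0.0, 1.0]])
p = pixel_to_sea_plane(960.0, 540.0, K, R, t)
print(np.linalg.norm(p[:2] - C[:2]))   # ~113 m, i.e. 20 / tan(10 deg)
```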
|
|
13:30-15:00, Paper WeBT22-NT.4 | Add to My Program |
Adaptive Landmark Color for AUV Docking in Visually Dynamic Environments |
|
Knutson, Corey | University of Minnesota - Twin Cities |
Cao, Zhipeng | University of Minnesota - Twin Cities |
Sattar, Junaed | University of Minnesota |
Keywords: Marine Robotics, Vision-Based Navigation, Engineering for Robotic Systems
Abstract: Autonomous Underwater Vehicles (AUVs) conduct missions underwater without the need for human intervention. A docking station (DS) can extend mission times of an AUV by providing a location for the AUV to recharge its batteries and receive updated mission information. Various methods for locating and tracking a DS exist, but most rely on expensive acoustic sensors, or are vision-based, which is significantly affected by water quality. In this paper, we present a vision-based method that utilizes adaptive color LED markers and dynamic color filtering to maximize landmark visibility in varying water conditions. Both AUV and DS utilize cameras to determine the water background color in order to calculate the desired marker color. No communication between AUV and DS is needed to determine marker color. Experiments conducted in a pool and lake show our method performs 10 times better than static color thresholding methods as background color varies. DS detection is possible at a range of 5 meters in clear water with minimal false positives.
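[Editor's note] An illustrative rule (not necessarily the authors' exact one) for the marker-color choice: estimate the mean background hue from the camera image and command the LEDs to the complementary hue, so the landmark stays distinguishable as the water color shifts. The hue rule, image, and saturation/value choices are assumptions.

```python
# Illustrative marker-color rule: pick the hue opposite the mean background hue.
import colorsys
import numpy as np

def marker_color_from_background(image_rgb):
    """image_rgb: HxWx3 uint8 array of the water background."""
    mean_rgb = image_rgb.reshape(-1, 3).mean(axis=0) / 255.0
    h, s, v = colorsys.rgb_to_hsv(*mean_rgb)
    marker_hue = (h + 0.5) % 1.0                          # complementary hue
    r, g, b = colorsys.hsv_to_rgb(marker_hue, 1.0, 1.0)   # fully saturated, bright
    return np.array([r, g, b]) * 255.0

# Greenish turbid water -> the rule returns a magenta-ish marker color.
water = np.full((64, 64, 3), (40, 140, 110), dtype=np.uint8)
print(marker_color_from_background(water).round())
```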
|
|
13:30-15:00, Paper WeBT22-NT.5 | Add to My Program |
Navigable Area Detection and Perception-Guided Model Predictive Control for Autonomous Navigation in Narrow Waterways |
|
Kim, Jonghwi | KAIST |
Lee, Changyu | KAIST |
Chung, Dongha | KAIST |
Kim, Jinwhan | KAIST |
Keywords: Marine Robotics, Vision-Based Navigation, Semantic Scene Understanding
Abstract: This paper presents an integrated navigation and control strategy for an autonomous surface vehicle (ASV) to operate in narrow waterways without relying on GPS. The proposed method uses a camera and a light detection and ranging (LiDAR) sensor to detect navigable regions in the waterway. A deep learning-based semantic segmentation algorithm is applied to detect the navigable region in camera images, and the segmented region is projected onto the water surface using planar homography. A line-detection algorithm is also introduced to improve the reliability of detecting navigable regions from LiDAR measurements. A safe collision-free path for the ASV is generated within the navigable regions using model predictive control-based local path planning and control algorithms. The performance and practical utility of the proposed method were demonstrated through field experiments using a small cruise boat, modified as an autonomous surface vehicle.
|
|
13:30-15:00, Paper WeBT22-NT.6 | Add to My Program |
An Online Self-Calibrating Refractive Camera Model with Application to Underwater Odometry |
|
Singh, Mohit | NTNU: Norwegian University of Science and Technology |
Dharmadhikari, Mihir Rahul | NTNU - Norwegian University of Science and Technology |
Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
Keywords: Marine Robotics, Visual-Inertial SLAM
Abstract: This work presents a camera model for refractive media such as water and its application in underwater visual-inertial odometry. The model is self-calibrating in real-time and is free of known correspondences or calibration targets. It is separable as a distortion model (dependent on refractive index n and radial pixel coordinate) and a virtual pinhole model (as a function of n). We derive the self-calibration formulation leveraging epipolar constraints to estimate the refractive index and subsequently correct for distortion. Through experimental studies using an underwater robot integrating cameras and inertial sensing, the model is validated regarding the accurate estimation of the refractive index and its benefits for robust odometry estimation in an extended envelope of conditions. Lastly, we show the transition between media and the estimation of the varying refractive index online, thus allowing computer vision tasks across refractive media.
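[Editor's note] A hedged sketch of the flat-interface refraction such a model accounts for: under the thin-flat-port-at-the-camera-center approximation, an observed radial pixel distance can be corrected given the refractive index n via Snell's law. The focal length and example numbers are assumptions; the paper's separable model and online self-calibration of n via epipolar constraints are not reproduced.

```python
# Flat-port refraction correction (single thin interface at the camera center):
#   sin(theta_air) = n * sin(theta_water),   n ~= 1.33 for water.
import numpy as np

def undistort_radius(r_obs, f, n=1.33):
    """Map an observed radial pixel distance to the radius a pinhole camera
    would see for the same in-water ray direction."""
    theta_a = np.arctan(r_obs / f)            # ray angle in air, inside the housing
    theta_w = np.arcsin(np.sin(theta_a) / n)  # corresponding ray angle in water
    return f * np.tan(theta_w)

# A point imaged 400 px off-center with a 600 px focal length.
print(undistort_radius(400.0, 600.0))   # smaller radius: refraction magnifies the image
```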
|
|
13:30-15:00, Paper WeBT22-NT.7 | Add to My Program |
Enhancing Visual Inertial SLAM with Magnetic Measurements |
|
Joshi, Bharat | University of South Carolina |
Rekleitis, Ioannis | University of South Carolina |
Keywords: Marine Robotics, Visual-Inertial SLAM, SLAM
Abstract: This paper presents an extension to visual inertial odometry (VIO) by introducing tightly-coupled fusion of magnetometer measurements. A sliding window of keyframes is optimized by minimizing re-projection errors, relative inertial errors, and relative magnetometer orientation errors. The results of IMU orientation propagation are used to efficiently transform magnetometer measurements between frames producing relative orientation constraints between consecutive frames. The soft and hard iron effects are calibrated using an ellipsoid fitting algorithm. The introduction of magnetometer data results in significant reductions in the orientation error and also in recovery of the true yaw orientation with respect to the magnetic north. The proposed framework operates in all environments with slow-varying magnetic fields, mainly outdoors and underwater. We have focused our work on the underwater domain, especially in underwater caves, as the narrow passage and turbulent flow make it difficult to perform loop closures and reset the localization drift. The underwater caves present challenges to VIO due to the absence of ambient light and the confined nature of the environment, while also being a crucial source of fresh water and providing valuable historical records. Experimental results from underwater caves demonstrate the improvements in accuracy and robustness introduced by the proposed VIO extension.
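[Editor's note] A simplified sketch of hard/soft-iron calibration by ellipsoid fitting, restricted to an axis-aligned ellipsoid solved by linear least squares; the full general ellipsoid fit and the tightly-coupled VIO integration described in the paper are not shown, and the synthetic distortion values are assumptions.

```python
# Axis-aligned ellipsoid fit for magnetometer calibration:
# A x^2 + B y^2 + C z^2 + D x + E y + F z = 1, solved by linear least squares.
# Hard-iron offset = ellipsoid center; soft-iron (diagonal only) = per-axis scale.
import numpy as np

def calibrate_magnetometer(m):
    """m: Nx3 raw magnetometer samples covering many orientations."""
    x, y, z = m[:, 0], m[:, 1], m[:, 2]
    design = np.column_stack([x**2, y**2, z**2, x, y, z])
    p, *_ = np.linalg.lstsq(design, np.ones(len(m)), rcond=None)
    A, B, C, D, E, F = p
    center = np.array([-D / (2 * A), -E / (2 * B), -F / (2 * C)])
    G = 1 + D**2 / (4 * A) + E**2 / (4 * B) + F**2 / (4 * C)
    radii = np.sqrt(G / np.array([A, B, C]))
    scale = radii.mean() / radii          # map the ellipsoid back to a sphere
    return center, scale                  # m_cal = scale * (m_raw - center)

# Synthetic test: sphere samples distorted by an offset (hard iron) and scale (soft iron).
rng = np.random.default_rng(0)
v = rng.normal(size=(2000, 3))
v = 50.0 * v / np.linalg.norm(v, axis=1, keepdims=True)   # true field ~50 uT
raw = v * np.array([1.1, 0.9, 1.0]) + np.array([12.0, -7.0, 3.0])
center, scale = calibrate_magnetometer(raw)
print(center.round(2), scale.round(3))
```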
|
|
13:30-15:00, Paper WeBT22-NT.8 | Add to My Program |
Underwater Volumetric Mapping Using Imaging Sonar and Free-Space Modeling Approach |
|
Oliveira, António José | INESC TEC |
Ferreira, Bruno | INESC TEC |
Cruz, Nuno | University of Porto |
Keywords: Marine Robotics, Mapping, SLAM
Abstract: Lack of information and perceptual ambiguity are key problems in sonar-based mapping applications. We propose a technique for mapping of underwater environments, building on the finite, positive, sonar beamwidth. Our approach models the free-space covered by each emitted acoustic pulse, employing volumetric techniques to create grid-based submaps of the unoccupied water volumes through images collected from imaging sonars. A representation of the occupied space is obtained by exploration of the free-space frontier. Special attention is given to acoustic image preparation and segmentation. Experimental results are provided based on real data collected from a dam shaft scenario.
|
|
WeBT23-NT Oral Session, NT-G401 |
Add to My Program |
Aerial Systems: Motion Control and Planning |
|
|
Chair: Gasteratos, Antonios | Democritus University of Thrace |
Co-Chair: Nikolakopoulos, George | Luleå University of Technology |
|
13:30-15:00, Paper WeBT23-NT.1 | Add to My Program |
Light-Weight Approach for Safe Landing in Populated Areas |
|
Mitroudas, Tilemahos | Democritus University of Thrace |
Balaska, Vasiliki | Democritus University of Thrace |
Psomoulis, Athanasios | Democritus University of Thrace |
Gasteratos, Antonios | Democritus University of Thrace |
Keywords: Aerial Systems: Applications, Mapping, Localization
Abstract: Landing safety is a challenge that has heavily engaged the research community recently, due to the increasing interest in applications enabled by aerial vehicles. In this paper, we propose a landing safety pipeline based on state-of-the-art object detectors and OctoMap. First, a point cloud of surface obstacles is generated, which is then inserted into an OctoMap. The unoccupied areas are identified, resulting in a set of safe landing points. Due to the low processing time achieved by state-of-the-art object detectors and the efficient point cloud manipulation using OctoMap, our approach can be deployed on lightweight embedded systems. The proposed pipeline has been evaluated in many simulation scenarios, varying in people density, number, and movement. Simulations were executed with an Nvidia Jetson Nano in the loop to confirm the pipeline's performance and robustness on low-computing-power hardware. The experiments yielded promising results with an 87% success rate.
|
|
13:30-15:00, Paper WeBT23-NT.2 | Add to My Program |
Design and Evaluation of Motion Planners for Quadrotors in Environments with Varying Complexities |
|
Shao, Yifei | University of Pennsylvania |
Wu, Yuwei | University of Pennsylvania |
Jarin-Lipschitz, Laura | University of Pennsylvania |
Chaudhari, Pratik | University of Pennsylvania |
Kumar, Vijay | University of Pennsylvania |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Performance Evaluation and Benchmarking
Abstract: Motion planning techniques for quadrotors have advanced significantly over the past decade. Most successful planners have two stages: a front-end that determines a path that incorporates geometric (or kinematic or input) constraints and specifies the homotopy class of the trajectory, and a back-end that optimizes this path to respect dynamics and input constraints. While there are many different choices for each stage, the eventual performance depends critically not only on these choices, but also on the environment. Given a new environment, it is difficult to decide a priori how one should design a motion planner. In this work, we develop (i) a procedure to construct parametrized environments, (ii) metrics that characterize the difficulty of motion planning in these environments, and (iii) an open-source software stack that can be used to combine a wide variety of two-stage planners seamlessly. We perform experiments in simulations and a real platform. We find, somewhat conveniently, that geometric front-ends are sufficient for environments with varying complexities if combined with dynamics-aware backends. The metrics we designed faithfully capture the planning difficulty in a given environment. All code is available at https://github.com/KumarRobotics/kr_mp_design.
|
|
13:30-15:00, Paper WeBT23-NT.3 | Add to My Program |
AutoTrans: A Complete Planning and Control Framework for Autonomous UAV Payload Transportation |
|
Li, Haojia | The Hong Kong University of Science and Technology |
Wang, Haokun | The Hong Kong University of Science and Technology |
Feng, Chen | Hong Kong University of Science and Technology |
Gao, Fei | Zhejiang University |
Zhou, Boyu | Sun Yat-Sen University |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Robust/Adaptive Control
Abstract: The robotics community is increasingly interested in autonomous aerial transportation. Unmanned aerial vehicles with suspended payloads have advantages over other systems, including mechanical simplicity and agility, but pose great challenges in planning and control. To realize fully autonomous aerial transportation, this paper presents a systematic solution to address these difficulties. First, we present a real-time planning method that generates smooth trajectories considering the time-varying shape and non-linear dynamics of the system, ensuring whole-body safety and dynamic feasibility. Additionally, an adaptive NMPC with a hierarchical disturbance compensation strategy is designed to overcome unknown external perturbations and inaccurate model parameters. Extensive experiments show that our method is capable of generating high-quality trajectories online, even in highly constrained environments, and tracking aggressive flight trajectories accurately, even under significant uncertainty. We plan to release our code to benefit the community.
|
|
13:30-15:00, Paper WeBT23-NT.4 | Add to My Program |
Bat Planner: Aggressive Flying Ball Player |
|
Yu, Huan | Zhejiang University |
Tu, Jie | Zhejiang University |
Wang, Pengqin | The Hong Kong University of Science and Technology |
Zheng, Zhi | Zhejiang University |
Zhang, Kewen | Zhejiang University of Technology |
Lu, GuoDong | Zhejiang University |
Gao, Fei | Zhejiang University |
Wang, Jin | Zhejiang University |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Task and Motion Planning
Abstract: In this paper, an aggressive quadrotor Ball plAying sysTem called BAT is proposed, whose goal is to intercept a flying ball and volley it towards a designated target. Aggressive means that BAT operates the quadrotor aggressively to intercept balls that are far away and hit them to distant positions in ways that are beyond the reach of existing methods. The trajectory prediction of the ball is achieved by integrating forward the current position and velocity estimates using an extended Kalman filter, and implementing cubic interpolation at the time resolution to calculate the continuous gradient for optimization. Facing the challenge of finding feasible hitting actions under extreme circumstances, we propose a two-stage planning approach, including transition point design and hitting primitive generation, with a simplified expression of uncoupled hitting actions. To obtain the best hitting motion, a trajectory optimization method is proposed, which can jointly optimize the hitting terminal states and time cost, considering dynamic feasibility and anti-collision constraints. To avoid pathological hitting, a defensive rule constraint and its constraint transcription method are proposed. A large number of simulation and real-world experiments are conducted, which demonstrate that the flying ball player can hit arriving balls from different directions and distances to arbitrary targets.
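[Editor's note] A hedged sketch of the ball-prediction step only: forward integration of a point-mass model from the filter's current position/velocity estimate. The quadratic-drag term, drag coefficient, and integration scheme are assumptions; the EKF, cubic interpolation, and hitting-trajectory optimization described in the paper are not shown.

```python
# Forward-integrate a ballistic ball with quadratic drag:  a = g - c * |v| * v.
import numpy as np

def predict_ball(p0, v0, drag_c=0.015, dt=0.005, horizon=1.5):
    g = np.array([0.0, 0.0, -9.81])
    traj = [p0.copy()]
    p, v = p0.copy(), v0.copy()
    for _ in range(int(horizon / dt)):
        a = g - drag_c * np.linalg.norm(v) * v
        v = v + a * dt                 # semi-implicit Euler
        p = p + v * dt
        traj.append(p.copy())
    return np.array(traj)

# Ball thrown toward the quadrotor at 8 m/s, 45 degrees up.
p0 = np.array([0.0, 0.0, 1.5])
v0 = np.array([0.0, 8.0 * np.cos(np.pi / 4), 8.0 * np.sin(np.pi / 4)])
traj = predict_ball(p0, v0)
apex = traj[traj[:, 2].argmax()]
print(f"predicted apex height: {apex[2]:.2f} m")
```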
|
|
13:30-15:00, Paper WeBT23-NT.5 | Add to My Program |
An NMPC Framework for Tracking and Releasing a Cable-Suspended Load to a Ground Target Using a Multirotor UAV |
|
Panetsos, Fotis | National Technical University of Athens |
Karras, George | University of Thessaly |
Kyriakopoulos, Kostas | New York University - Abu Dhabi |
Keywords: Aerial Systems: Applications, Motion Control, Field Robots
Abstract: In this work, we present a nonlinear Model Predictive Control (NMPC) scheme for tracking a ground target using a multirotor with a cable-suspended load. The NMPC framework relies on the dynamic model of the UAV with the suspended load and, hence, an estimate of the load state is obtained by fusing the measurements of a downward-facing camera and a load cell with an Unscented Kalman Filter (UKF). Additionally, since the NMPC relies on the future behavior of the system, the trajectory of the ground target throughout the predicted time horizon of the NMPC, is required. Towards this direction, Bézier curves are employed in order to predict the future trajectory of the target, which moves in an arbitrary way. The ultimate goal of the proposed framework is to release the suspended load to the ground target and, consequently, a condition is checked at each time instant that triggers the opening of a gripper, located at the lower edge of the cable. The performance of the proposed control scheme is experimentally validated using an octorotor.
|
|
13:30-15:00, Paper WeBT23-NT.6 | Add to My Program |
Multi-Vehicle Dynamic Water Surface Monitoring |
|
Nekovar, Frantisek | Czech Technical University in Prague |
Faigl, Jan | Czech Technical University in Prague |
Saska, Martin | Czech Technical University in Prague |
Keywords: Aerial Systems: Applications, Path Planning for Multiple Mobile Robots or Agents, Environment Monitoring and Management
Abstract: Repeated exploration of a water surface to detect objects of interest and their subsequent monitoring is important in search-and-rescue or ocean clean-up operations. Since the location of any detected object is dynamic, we propose to address the combined surface exploration and monitoring of the detected objects by modeling spatio-temporal reward states and coordinating a team of vehicles to collect the rewards. The model characterizes the dynamics of the water surface and enables the planner to predict future system states. The state reward value relevant to a particular water surface cell increases over time and is nullified when the cell is within sensor range of a vehicle. Thus, the proposed multi-vehicle planning approach is to minimize the collective value of the dynamic model reward states. The purpose is to address the vehicles' motion constraints by using model predictive control on a receding horizon and fully exploiting the vehicles' motion capabilities. The evaluation results indicate that the approach improves on existing solutions to the kinematic orienteering problem and the team orienteering problem in the monitoring task. The proposed approach has been experimentally verified, supporting its feasibility in real-world monitoring tasks.
|
|
13:30-15:00, Paper WeBT23-NT.7 | Add to My Program |
Aerial Physical Human Robot Interaction for Payload Transportation |
|
Prajapati, Pratik | Indian Institute of Technology Gandhinagar |
Vashista, Vineet | Indian Institute of Technology Gandhinagar |
Keywords: Aerial Systems: Applications, Physical Human-Robot Interaction, Human-Robot Collaboration
Abstract: Recent human-robot interaction paradigms for aerial robots open up many potential applications, and efforts are being made to further explore this field. Physical interaction with aerial robots can provide an intuitive way of delivering high-level commands and allows humans to perform collaborative tasks. The presented work demonstrates the feasibility of deploying an aerial robot to physically work with a human operator to transport a payload collaboratively in outdoor settings. A system comprising a rigid object lifted by a human at one end and a quadcopter at the other is considered. Custom-built sensor systems, namely a Human Handle Device and a Cable Attitude Device, have been designed to reliably estimate human commands and state feedback. A control strategy for the quadcopter is designed to interact naturally with the operator for safe and smooth collaborative payload transportation. Successful outdoor experiments with five novice subjects are presented that demonstrate the feasibility and potential application of the proposed modality.
|
|
13:30-15:00, Paper WeBT23-NT.8 | Add to My Program |
On Experimental Emulation of Printability and Fleet Aware Generic Mesh Decomposition for Enabling Aerial 3D Printing |
|
Stamatopoulos, Marios-Nektarios | Luleå University of Technology |
Banerjee, Avijit | Luleå University of Technology |
Nikolakopoulos, George | Luleå University of Technology |
Keywords: Aerial Systems: Applications, Robotics and Automation in Construction
Abstract: This article introduces an experimental emulation of a novel chunk-based, flexible, multi-DoF aerial 3D printing framework. The experimental demonstration of the overall autonomy focuses on precise motion planning and task allocation for a UAV traversing a series of planned space-filling paths involved in the aerial 3D printing process, without physically depositing the overlaying material. Flexible multi-DoF aerial 3D printing is a newly developed framework with the potential to strategically decompose the envisioned 3D model into small, manageable chunks suitable for distributed 3D printing. Moreover, by harnessing the dexterous flexibility offered by the 6-DoF motion of the UAV, the framework allows the overall autonomy stack to be integrated, potentially opening up an entirely new frontier in additive manufacturing. However, this pioneering concept is still at a very early stage of development and has yet to be experimentally verified. In this direction, experimental emulation serves as a crucial stepping stone, providing a pseudo-mockup scenario through virtual material deposition and helping to identify technological gaps from simulation to reality. Experimental emulation results, supported by critical analysis and discussion, lay the foundation for addressing the technological and research challenges needed to significantly push the boundaries of state-of-the-art 3D printing mechanisms.
|
|
WeBT24-NT Oral Session, NT-G402 |
Add to My Program |
Aerial Systems |
|
|
Chair: Merino, Luis | Universidad Pablo De Olavide |
Co-Chair: Ryou, Gilhyun | Massachusetts Institute of Technology |
|
13:30-15:00, Paper WeBT24-NT.1 | Add to My Program |
Path and Trajectory Planning of a Tethered UAV-UGV Marsupial Robotic System |
|
Martinez-Rozas, Simon | Universidad De Antofagasta |
Alejo, David | University Pablo De Olavide |
Caballero, Fernando | Universidad De Sevilla |
Merino, Luis | Universidad Pablo De Olavide |
Keywords: Motion and Path Planning, Aerial Systems: Applications
Abstract: This letter addresses the problem of trajectory planning in a marsupial robotic system consisting of an unmanned aerial vehicle (UAV) linked to an unmanned ground vehicle (UGV) through a non-taut tether with controllable length. To the best of our knowledge, this is the first method that addresses the trajectory planning of a marsupial UGV-UAV system with a non-taut tether. The objective is to determine a synchronized collision-free trajectory for the three marsupial system agents: UAV, UGV, and tether. First, we present a path planning solution based on optimal Rapidly-exploring Random Trees (RRT*) with novel sampling and steering techniques to speed up the computation. This algorithm is able to obtain collision-free paths for the UAV and the UGV, taking into account the 3D environment and the tether. Then, the letter presents a trajectory planner based on non-linear least squares. The optimizer takes into account aspects not considered in the path planning, like temporal constraints of the motion imposed by limits on the velocities and accelerations of the robots, or raising the tether's clearance. Simulated and field test results demonstrate that the approach generates obstacle-free, smooth, and feasible trajectories for the marsupial system.
|
|
13:30-15:00, Paper WeBT24-NT.2 | Add to My Program |
Cooperative Exploration of Heterogeneous UAVs in Mountainous Environments by Constructing Steady Communication |
|
Jiang, Han | The State Key Laboratory of Robotics, Shenyang Institute of Auto |
Chang, Yanchun | Shenyang Institute of Automation |
Yang, Liying | Shenyang Institute of Automation |
Liu, Xu | Shenyang Institute of Automation, Chinese Academy of Sciences |
He, Yuqing | Shenyang Institute of Automation, Chinese Academy of Sciences |
Keywords: Motion and Path Planning, Cooperating Robots, Aerial Systems: Applications
Abstract: Unmanned aerial vehicles (UAVs) must fly at low altitudes to execute certain missions when operating in complex mountainous areas. However, in these environments, UAVs lose their line-of-sight (LOS) communication with the ground station (GS) due to the obstruction of the mountains and are unable to retransmit information such as video, which can lead to mission failure or compromise the flight safety of the UAVs. To address this difficulty, this study proposes a cooperative planning method for heterogeneous UAVs that ensures steady communication while minimizing the total mission time. To accomplish this goal, a relay UAV is positioned to enable indirect but constant LOS connectivity between the mission UAV and the GS. Specifically, to alleviate data storage pressure, a lightweight terrain modeling method is employed. In addition, a new LOS judgment model that constructs communication relay LOS links between communication nodes in complex mountain environments is presented. This study considers different types of UAVs; for the fixed-wing UAV, minimum turning radius constraints are taken into account to plan a flyable trajectory. The problem is formulated as a multi-step optimization model that accounts for communication constraints, obstacle and collision avoidance, and the performance constraints of heterogeneous UAVs, in order to plan the trajectories of the mission UAV and relay UAV and maintain LOS links between the mission UAV and the ground station.
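[Editor's note] A minimal sketch of the kind of line-of-sight test over a terrain heightmap that such a relay-placement scheme needs: sample the 3D segment between two antennas and check that it clears the terrain everywhere. The grid resolution, clearance margin, and sampling density are assumptions; the paper's lightweight terrain model and relay optimization are not shown.

```python
# Line-of-sight check between two 3D points over a terrain heightmap.
import numpy as np

def has_los(p_a, p_b, heightmap, cell_size, clearance=2.0, n_samples=200):
    """p_a, p_b: (x, y, z) in metres; heightmap[i, j] is terrain height at
    (x, y) = (i * cell_size, j * cell_size)."""
    for s in np.linspace(0.0, 1.0, n_samples):
        p = (1.0 - s) * np.asarray(p_a) + s * np.asarray(p_b)
        i = int(np.clip(round(p[0] / cell_size), 0, heightmap.shape[0] - 1))
        j = int(np.clip(round(p[1] / cell_size), 0, heightmap.shape[1] - 1))
        if p[2] < heightmap[i, j] + clearance:
            return False    # segment dips below terrain plus clearance margin
    return True

# A single ridge between the ground station and a low-flying mission UAV.
terrain = np.zeros((100, 100))
terrain[40:60, :] = 120.0                         # 120 m ridge across the map
gs = (10.0, 500.0, 5.0)                           # ground station (10 m cells)
uav = (900.0, 500.0, 80.0)                        # mission UAV below the ridge top
print(has_los(gs, uav, terrain, cell_size=10.0))  # False: the ridge blocks direct LOS
```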
|
|
13:30-15:00, Paper WeBT24-NT.3 | Add to My Program |
Perception-And-Energy-Aware Motion Planning for UAV Using Learning-Based Model under Heteroscedastic Uncertainty |
|
Takemura, Reiya | Keio University |
Ishigami, Genya | Keio University |
Keywords: Integrated Planning and Learning, View Planning for SLAM, Aerial Systems: Perception and Autonomy
Abstract: Environments and conditions in which global navigation satellite systems (GNSS) are denied require unmanned aerial vehicles (UAVs) to fly energy-efficiently and reliably. To this end, this study presents perception-and-energy-aware motion planning for UAVs in GNSS-denied environments. The proposed planner solves the trajectory planning problem by optimizing a cost function consisting of two indices: the total energy consumption of a UAV and the perception quality of the light detection and ranging (LiDAR) sensor mounted on the UAV. Before online navigation, a high-fidelity simulator acquires a flight dataset to learn energy consumption for the UAV and heteroscedastic uncertainty associated with LiDAR measurements, both as functions of the horizontal velocity of the UAV. The learned models enable the online planner to estimate energy consumption and perception quality, reducing UAV battery usage and localization errors. Simulation experiments in a photorealistic environment confirm that the proposed planner can address the trade-off between energy efficiency and perception quality under heteroscedastic uncertainty. The open-source code is released at https://gitlab.com/ReI08/perception-energy-planner.
|
|
13:30-15:00, Paper WeBT24-NT.4 | Add to My Program |
Representing On-Orbit Rendezvous and Proximity Operations with Fully-Actuated Multirotor Aerial Platforms |
|
Garzelli, Alessandro | GRVC Robotics Lab, University Seville |
Yadav, Kumud Darshan | GRVC Robotics Lab, University of Seville |
Scalvini, Alessandro | GRVC University of Seville |
Gonzalez-Morgado, Antonio | Universidad De Sevilla |
Suarez, Alejandro | University of Seville |
Ollero, Anibal | AICIA. G41099946 |
Keywords: Simulation and Animation, Aerial Systems: Applications, Space Robotics and Automation
Abstract: Ground testing is of paramount importance to verify and validate space operations and the associated control algorithms before on-orbit deployment. Although state-of-the-art facilities are capable of reproducing a zero-G environment with a high degree of fidelity, these infrastructures can be complemented with multi-rotors emulating free-flying or free-floating conditions, exploiting the similarities and analogies between both domains in terms of floating nature, attitude dynamics, and thrust-wrench relation through the mixer matrix. Furthermore, the effective workspace of the testbed can be extended to the dimensions of the flight area and the coverage of the positioning system. Therefore, this paper introduces a new way to recreate orbital motion within an indoor facility, considering the case study of trajectories derived from the Clohessy–Wiltshire equations. This advancement opens up avenues for replicating close-proximity operations between chaser and target satellites employing fully-actuated multi-rotors that allow decoupling translational and attitude dynamics.
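[Editor's note] For reference, the Clohessy–Wiltshire (Hill) equations that generate the relative-motion trajectories mentioned in the abstract, with x radial, y along-track, z cross-track, and n the mean motion of the target's circular orbit (a standard textbook result, not specific to this paper's implementation):

```latex
\begin{aligned}
\ddot{x} - 3n^{2}x - 2n\dot{y} &= 0\\
\ddot{y} + 2n\dot{x} &= 0\\
\ddot{z} + n^{2}z &= 0
\end{aligned}
```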
|
|
13:30-15:00, Paper WeBT24-NT.5 | Add to My Program |
On-Device Self-Supervised Learning of Visual Perception Tasks Aboard Hardware-Limited Nano-Quadrotors |
|
Cereda, Elia | USI and SUPSI |
Rusci, Manuele | KU Leuven |
Giusti, Alessandro | IDSIA USI-SUPSI |
Palossi, Daniele | ETH Zurich |
Keywords: Continual Learning, Micro/Nano Robots, Aerial Systems: Perception and Autonomy
Abstract: Sub-50 g nano-drones are gaining momentum in both academia and industry. Their most compelling applications rely on onboard deep learning models for perception despite severe hardware constraints (i.e., sub-100 mW processors). When deployed in unknown environments not represented in the training data, these models often underperform due to domain shift. To cope with this fundamental problem, we propose, for the first time, on-device learning aboard nano-drones, where the first part of the in-field mission is dedicated to self-supervised fine-tuning of a pre-trained convolutional neural network (CNN). Leveraging a real-world vision-based regression task, we thoroughly explore performance-cost trade-offs of the fine-tuning phase along three axes: (i) dataset size (more data increases the regression performance but requires more memory and longer computation); (ii) methodologies (e.g., fine-tuning all model parameters vs. only a subset); and (iii) self-supervision strategy. Our approach demonstrates an improvement in mean absolute error of up to 30% compared to the pre-trained baseline, requiring only 22 s of fine-tuning on an ultra-low-power GWT GAP9 System-on-Chip. Addressing the domain shift problem via on-device learning aboard nano-drones not only marks a novel result for hardware-limited robots but lays the ground for more general advancements for the entire robotics community.
|
|
13:30-15:00, Paper WeBT24-NT.6 | Add to My Program |
Aerobatic Trajectory Generation for a VTOL Fixed-Wing Aircraft Using Differential Flatness |
|
Tal, Ezra | MIT |
Ryou, Gilhyun | Massachusetts Institute of Technology |
Karaman, Sertac | Massachusetts Institute of Technology |
Keywords: Motion and Path Planning, Aerial Systems: Mechanics and Control, Autonomous Vehicle Navigation, Trajectory Optimization
Abstract: This article proposes a novel algorithm for aerobatic trajectory generation for a vertical take-off and landing (VTOL) tailsitter flying wing aircraft. The algorithm differs from existing approaches for fixed-wing trajectory generation, as it considers a realistic six-degree-of-freedom (6-DOF) flight dynamics model, including aerodynamic equations. Using a global dynamics model enables the generation of aerobatics trajectories that exploit the entire flight envelope, allowing agile maneuvering through the stall regime, sideways uncoordinated flight, inverted flight, etc. The method uses the differential flatness property of the global tailsitter flying wing dynamics, which is derived in this work. By performing snap minimization in the differentially flat output space, a computationally efficient algorithm, suitable for online motion planning, is obtained. The algorithm is demonstrated in extensive flight experiments encompassing six aerobatic maneuvers, a time-optimal drone racing trajectory, and an airshow-like aerobatic sequence for three tailsitter aircraft.
|
|
13:30-15:00, Paper WeBT24-NT.7 | Add to My Program |
EVOLVER: Online Learning and Prediction of Disturbances for Robot Control |
|
Jia, Jindou | Beihang University |
Zhang, Wenyu | Beihang University |
Guo, Kexin | Beihang University |
Wang, Jianliang | Hangzhou Innovation Institute of Beihang University |
Yu, Xiang | Beihang University |
Shi, Yang | University of Victoria |
Guo, Lei | Beihang University |
Keywords: Learning and Adaptive Systems, Model Learning for Control, Motion Control of Manipulators, Aerial Systems: Mechanics and Control
Abstract: In nature, when encountering unexpected uncertainty, animals tend to react quickly to ensure safety as the top priority, and gradually adapt to it based on fresh experience. We present a framework, EVOLVER, that mimics this biological behavior in robotics to achieve rapid transient reaction and high-precision steady-state performance simultaneously. In particular, the Koopman operator is leveraged to explore the unknown model of uncertainties, which is subsequently utilized in an evolutionary model-based disturbance observer. The resulting observer guarantees provable convergence under optimal conditions. Several practical considerations, including construction of a training dataset, data noise handling, and selection of lifting functions, are elaborated in pursuit of theoretical optimality in real applications. The lightweight nature of EVOLVER enables online computation. The framework is thoroughly evaluated by 1) trajectory prediction of an irregular free-flying object subject to aerodynamic drag, 2) agile flight of a quadrotor subject to wind gusts, and 3) high-precision end-effector control of a manipulator subject to base-motion disturbances.
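[Editor's note] A hedged sketch of Koopman-based model learning in the EDMD style (lift states with a dictionary, solve a least-squares problem for the operator, then predict forward in the lifted space). The dictionary, data, and toy system are illustrative assumptions; the paper's evolutionary disturbance observer built on top of the Koopman model is not reproduced.

```python
# Extended Dynamic Mode Decomposition (EDMD) sketch: approximate the Koopman
# operator K so that psi(x_{k+1}) ~= K psi(x_k), then predict in lifted space.
import numpy as np

def lift(x):
    # Illustrative dictionary: state, pairwise products, and a constant.
    x = np.atleast_2d(x)
    quad = np.einsum('ni,nj->nij', x, x).reshape(len(x), -1)
    return np.hstack([x, quad, np.ones((len(x), 1))])

def fit_koopman(X, Xnext):
    Psi, Psi_next = lift(X), lift(Xnext)
    # Least squares for Psi @ W ~= Psi_next, then K = W^T so K psi_k ~= psi_{k+1}.
    W, *_ = np.linalg.lstsq(Psi, Psi_next, rcond=None)
    return W.T

def predict(K, x0, steps):
    z = lift(x0)[0]
    traj = [x0]
    n = len(x0)
    for _ in range(steps):
        z = K @ z
        traj.append(z[:n])    # the first dictionary entries are the state itself
    return np.array(traj)

# Toy data from a damped oscillator standing in for the unknown dynamics.
dt = 0.05
A = np.array([[1.0, dt], [-0.5 * dt, 1.0 - 0.1 * dt]])
X = np.random.default_rng(1).normal(size=(500, 2))
Xnext = X @ A.T
K = fit_koopman(X, Xnext)
print(predict(K, np.array([1.0, 0.0]), steps=5).round(3))
```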
|
|
13:30-15:00, Paper WeBT24-NT.8 | Add to My Program |
Time-Optimal Path Planning in a Constant Wind for Uncrewed Aerial Vehicles Using Dubins Set Classification |
|
Moon, Brady | Carnegie Mellon University |
Sachdev, Sagar | Carnegie Mellon University |
Yuan, Junbin | Carnegie Mellon University |
Scherer, Sebastian | Carnegie Mellon University |
Keywords: Motion and Path Planning, Aerial Systems: Perception and Autonomy, Field Robots
Abstract: Time-optimal path planning in high winds for a turning-rate constrained UAV is a challenging problem to solve and is important for deployment and field operations. Previous works have used trochoidal path segments comprising straight and maximum-rate turn segments, as optimal extremal paths in uniform wind conditions. Current methods iterate over all candidate trochoidal trajectory types and select the one that is time-optimal; however, this exhaustive search can be computationally slow. In this paper, we introduce a method to decrease the computation time. This is achieved by reducing the number of candidate trochoidal trajectory types by framing the problem in the air-relative frame and bounding the solution within a subset of candidate trajectories. Our method reduces overall computation by 37.4% compared to pre-existing methods in Bang-Straight-Bang trajectories, freeing up computation for other onboard processes and can lead to significant total computational reductions when solving many trochoidal paths. When used within the framework of a global path planner, faster state expansions help find solutions faster or compute higher-quality paths. We also release our open-source codebase as a C++ package.
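[Editor's note] For reference, the trochoidal turn segment underlying such planners: a constant-rate turn at fixed airspeed, drifted by the uniform wind. The airspeed, turn rate, and wind in the example are assumptions, and the Dubins-set classification logic of the paper is not reproduced.

```python
# Ground-frame trochoid traced by a constant-rate turn at airspeed V with turn
# rate omega in a uniform wind (wx, wy):
#   x(t) = x0 + (V/omega) * (sin(psi0 + omega t) - sin(psi0)) + wx * t
#   y(t) = y0 - (V/omega) * (cos(psi0 + omega t) - cos(psi0)) + wy * t
import numpy as np

def trochoid_segment(x0, y0, psi0, V, omega, wind, duration, n=100):
    wx, wy = wind
    t = np.linspace(0.0, duration, n)
    psi = psi0 + omega * t
    x = x0 + (V / omega) * (np.sin(psi) - np.sin(psi0)) + wx * t
    y = y0 - (V / omega) * (np.cos(psi) - np.cos(psi0)) + wy * t
    return np.column_stack([x, y])

# Full left turn at 15 m/s airspeed, 0.3 rad/s, in a 5 m/s easterly wind.
path = trochoid_segment(0.0, 0.0, 0.0, V=15.0, omega=0.3, wind=(5.0, 0.0),
                        duration=2 * np.pi / 0.3)
print(path[-1].round(1))   # end point is displaced downwind from the start
```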
|
|
13:30-15:00, Paper WeBT24-NT.9 | Add to My Program |
Robust and Efficient Depth-Based Obstacle Avoidance for Autonomous Miniaturized UAVs |
|
Müller, Hanna | ETH Zürich |
Niculescu, Vlad | ETH Zurich |
Polonelli, Tommaso | ETH Zürich |
Magno, Michele | ETH Zurich |
Benini, Luca | University of Bologna |
Keywords: Collision Avoidance, Aerial Systems: Perception and Autonomy, Micro/Nano Robots, Reactive and Sensor-Based Planning
Abstract: Nano-size drones hold enormous potential to explore unknown and complex environments. Their small size makes them agile and safe for operation close to humans and allows them to navigate through narrow spaces. However, their tiny size and payload restrict the possibilities for on-board computation and sensing, making fully autonomous flight extremely challenging. The first step towards full autonomy is reliable obstacle avoidance, which has proven to be challenging by itself in a generic indoor environment. Current approaches utilize vision-based or 1-dimensional sensors to support nano-drone perception algorithms. This work presents a lightweight obstacle avoidance system based on a novel millimeter-form-factor, 64-pixel multizone Time-of-Flight (ToF) sensor and a generalized model-free control policy. In-field tests are based on the Crazyflie 2.1, extended by a custom multi-zone ToF deck, featuring a total flight mass of 35 g. The algorithm uses only 0.3% of the on-board processing power (210 μs execution time) with a frame rate of 15 fps. The presented autonomous nano-size drone reaches 100% reliability at 0.5 m/s in a generic and previously unexplored indoor environment.
|
|
WeBT25-NT Oral Session, NT-G403 |
Add to My Program |
Localization V |
|
|
Chair: He, Fenghua | Harbin Institute of Technology |
Co-Chair: Huang, Guoquan | University of Delaware |
|
13:30-15:00, Paper WeBT25-NT.1 | Add to My Program |
Visual Localization in Repetitive and Symmetric Indoor Parking Lots Using 3D Key Text Graph |
|
Kim, Joohyung | Korea University |
Koo, Gunhee | Samsung Electronics |
Park, Heewon | Samsung Electronics |
Doh, Nakju | Korea University |
Keywords: Localization
Abstract: Indoor parking lots are GPS-denied spaces in which vision-based localization approaches are usually applied to solve localization problems. However, due to the repetitiveness and symmetry of these spaces, visual localization methods commonly confront difficulties in estimating precise 3D poses. In this study, we propose four novel modules that improve localization precision by endowing existing methods with spatial discerning ability. The first module constructs a key text graph that represents the topology of key texts in the space and becomes the basis for discerning repetitiveness and symmetry. Next, the orientation filtering module estimates the unknown 3D orientation of the query image and resolves spatial symmetric ambiguity. The similarity scoring module sorts out the top-scored database images, discerning the spatial repetitiveness based on detected key text bounding boxes. Our pose verification module evaluates the pose confidence of top-scored candidates and determines the most reliable pose. Our method has been validated in two real indoor parking lots, achieving new state-of-the-art performance levels.
|
|
13:30-15:00, Paper WeBT25-NT.2 | Add to My Program |
VOLoc: Visual Place Recognition by Querying Compressed Lidar Map |
|
Cai, Xudong | Renmin University of China |
Wang, Yongcai | Renmin University of China |
Huang, Zhe | Renmin University of China |
Shao, Yu | Renmin University of China |
Li, Deying | Renmin University of China |
Keywords: Localization
Abstract: The availability of city-scale Lidar maps enables the potential of city-scale place recognition using mobile cameras. However, city-scale Lidar maps generally need to be compressed for storage efficiency, which increases the difficulty of direct visual place recognition in compressed Lidar maps. This paper proposes VOLoc, an accurate and efficient visual place recognition method that exploits geometric similarity to directly query the compressed Lidar map via the real-time captured image sequence. In the offline phase, VOLoc compresses the Lidar maps using a Geometry-Preserving Compressor (GPC), in which the compression is reversible, a crucial requirement for the downstream 6DoF pose estimation. In the online phase, VOLoc proposes an online Geometric Recovery Module (GRM), which is composed of online Visual Odometry (VO) and a point cloud optimization module, such that the local scene structure around the camera is recovered online to build the Querying Point Cloud (QPC). Then the QPC is compressed by the same GPC and is aggregated into a global descriptor by an attention-based aggregation module, to query the compressed Lidar map in the vector space. A transfer learning mechanism is also proposed to improve the accuracy and the generality of the aggregation network. Extensive evaluations show that VOLoc provides localization accuracy even better than Lidar-to-Lidar place recognition, setting a new record for utilizing compressed Lidar maps with low-end mobile cameras. The code is publicly available at https://github.com/Master-cai/VOLoc.
|
|
13:30-15:00, Paper WeBT25-NT.3 | Add to My Program |
VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition |
|
Hines, Adam D. | Queensland University of Technology |
Stratton, Peter | University of Queensland |
Milford, Michael J | Queensland University of Technology |
Fischer, Tobias | Queensland University of Technology |
Keywords: Localization, Bioinspired Robot Learning
Abstract: Spiking Neural Networks (SNNs) are at the forefront of neuromorphic computing thanks to their potential energy-efficiency, low latencies, and capacity for continual learning. While these capabilities are well suited for robotics tasks, SNNs have seen limited adoption in this field thus far. This work introduces an SNN for Visual Place Recognition (VPR) that is both trainable within minutes and queryable in milliseconds, making it well suited for deployment on compute-constrained robotic systems. Our proposed system, VPRTempo, overcomes slow training and inference times using an abstracted SNN that trades biological realism for efficiency. VPRTempo employs a temporal code that determines the timing of a single spike based on a pixel’s intensity, as opposed to prior SNNs relying on rate coding that determined the number of spikes, improving spike efficiency by over 100%. VPRTempo is trained using Spike-Timing Dependent Plasticity and a supervised delta learning rule enforcing that each output spiking neuron responds to just a single place. We evaluate our system on the Nordland and Oxford RobotCar benchmark localization datasets, which include up to 27k places. We found that VPRTempo’s accuracy is comparable to prior SNNs and the popular NetVLAD place recognition algorithm, while being several orders of magnitude faster and suitable for real-time deployment – with inference speeds over 50 Hz on CPU. VPRTempo could be integrated as a loop closure component for online SLAM on resource-constrained systems such as space and underwater robots.
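As a rough sketch of the intensity-to-latency temporal code described above (brighter pixels spike earlier, one spike per pixel), under an assumed 8-bit image and coding window; this is not the authors' code.

```python
import numpy as np

def intensity_to_spike_times(image, t_window=100.0):
    """Map each 8-bit pixel intensity to the latency of a single spike within
    a coding window (e.g. in milliseconds): intensity 255 fires at t = 0,
    intensity 0 fires at the end of the window."""
    norm = np.clip(image.astype(float) / 255.0, 0.0, 1.0)
    return t_window * (1.0 - norm)

# Example on a 2x2 patch
patch = np.array([[255, 128], [64, 0]], dtype=np.uint8)
print(intensity_to_spike_times(patch))  # [[  0.  ~49.8] [~74.9 100. ]]
```

Because each pixel contributes at most one spike instead of a rate-coded spike train, the number of spikes processed per image drops sharply, which is what the spike-efficiency claim above refers to.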
|
|
13:30-15:00, Paper WeBT25-NT.4 | Add to My Program |
17-Point Algorithm Revisited: Toward a More Accurate Way |
|
Xie, Chen | Harbin Institute of Technology |
Xing, Rui | Harbin Institute of Technology |
Hao, Ning | Harbin Institute of Technology |
He, Fenghua | Harbin Institute of Technology |
Keywords: Localization, Computer Vision for Automation, SLAM
Abstract: The 17-point algorithm is a popular method for relative pose estimation in multi-camera systems. However, the role of overlap in the 17-point algorithm remains unexplored, and the relaxed way of solving the constrained normal equation leads to sub-optimal results. Both influence the accuracy of the estimated pose. In this paper, we theoretically analyze the influence of overlap and the solvability of the 17-point algorithm. In addition, we show that the abuse of overlap can harm accuracy in practice. In light of these findings, we propose an improved 17-point algorithm, which avoids using overlaps and derives a simple way to solve the normal equation on a manifold. Both simulations and real-world data experiments demonstrate that the proposed algorithm outperforms the traditional 17-point algorithm in terms of accuracy.
|
|
13:30-15:00, Paper WeBT25-NT.5 | Add to My Program |
AnyLoc: Towards Universal Visual Place Recognition |
|
Keetha, Nikhil Varma | Carnegie Mellon University |
Mishra, Avneesh | International Institute of Information Technology, Hyderabad |
Karhade, Jay | Carnegie Mellon University |
Jatavallabhula, Krishna Murthy | MIT |
Scherer, Sebastian | Carnegie Mellon University |
Krishna, Madhava | IIIT Hyderabad |
Garg, Sourav | University of Adelaide |
Keywords: Localization, Recognition, Deep Learning for Visual Perception
Abstract: Visual Place Recognition (VPR) is vital for robot localization. To date, the most performant VPR approaches are environment- and task-specific: while they exhibit strong performance in structured environments (predominantly urban driving), their performance degrades severely in unstructured environments, rendering most approaches too brittle for robust real-world deployment. In this work, we develop a universal solution to VPR -- a technique that works across a broad range of structured and unstructured environments (urban, outdoors, indoors, aerial, underwater, and subterranean environments) without any re-training or fine-tuning. We demonstrate that general-purpose feature representations derived from off-the-shelf self-supervised models with no VPR-specific training are the right substrate upon which to build such a universal VPR solution. Combining these derived features with unsupervised feature aggregation enables our suite of methods, AnyLoc, to achieve up to 4X higher performance than existing approaches. We further obtain a 6% improvement in performance by characterizing the semantic properties of these features, uncovering unique domains which encapsulate datasets from similar environments. Our detailed experiments and analysis lay a foundation for building VPR solutions that may be deployed anywhere, anytime, and across any view. We encourage the readers to explore our project page and interactive demos: https://anyloc.github.io/
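To make the aggregation step concrete, a minimal VLAD-style sketch of unsupervised feature aggregation over dense local features is shown below; the cluster centers, feature dimensions, and random stand-in inputs are assumptions for illustration, not the AnyLoc code.

```python
import numpy as np

def vlad_aggregate(features, centers):
    """Aggregate local features (N x D) into one global descriptor: hard-assign
    each feature to its nearest center (K x D), sum residuals per center,
    then apply intra-cluster and global L2 normalization."""
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)
    K, D = centers.shape
    vlad = np.zeros((K, D))
    for k in range(K):
        members = features[assign == k]
        if len(members):
            vlad[k] = (members - centers[k]).sum(axis=0)
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12  # intra-norm
    vlad = vlad.reshape(-1)
    return vlad / (np.linalg.norm(vlad) + 1e-12)                 # global norm

# Example with random stand-ins for dense self-supervised features
rng = np.random.default_rng(0)
descriptor = vlad_aggregate(rng.normal(size=(500, 64)), rng.normal(size=(8, 64)))
```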
|
|
13:30-15:00, Paper WeBT25-NT.6 | Add to My Program |
Lightweight Ground Texture Localization |
|
Wilhelm, Aaron | Cornell University |
Napp, Nils | Cornell University |
Keywords: Localization, Vision-Based Navigation, SLAM
Abstract: We present a lightweight ground texture based localization algorithm (L-GROUT) that improves the state of the art in performance and can be run in real-time on single board computers without GPU acceleration. Such computers are ubiquitous on small indoor robots and thus this work enables high-precision, millimeter-level localization without instrumenting, marking, or modifying the environment. The key innovations are an improved database feature extraction algorithm, a dimensionality reduction method based on locality preserving projections (LPP) that can accommodate faster-to-compute binary features, and an improved spatial filtering step that better preserves performance when the databases are tuned for lightweight applications. We demonstrate the approach by running the whole system on a low-cost single board computer (Raspberry Pi 4) to produce global localization estimates at greater than 4Hz on an outdoor asphalt dataset.
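A compact sketch of the locality preserving projections (LPP) step mentioned above, using a dense generalized eigensolver; the k-NN graph construction, regularization, and output dimensionality are assumptions, and the original work pairs this with faster-to-compute binary features rather than the float inputs used here.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def lpp(X, n_components=32, n_neighbors=10, reg=1e-6):
    """Locality Preserving Projections for X of shape (n_samples, n_features).
    Returns an (n_features, n_components) projection from the generalized
    eigenproblem X^T L X a = lambda X^T D X a, keeping the smallest lambdas."""
    W = kneighbors_graph(X, n_neighbors, mode='connectivity', include_self=False)
    W = 0.5 * (W + W.T)                         # symmetrize the k-NN graph
    W = np.asarray(W.todense())
    D = np.diag(W.sum(axis=1))
    L = D - W                                   # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(X.shape[1])  # regularize for invertibility
    _, vecs = eigh(A, B)                        # eigenvalues in ascending order
    return vecs[:, :n_components]

# Usage: project high-dimensional ground-texture descriptors to 32 dimensions
# X_low = X @ lpp(X, n_components=32)
```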
|
|
13:30-15:00, Paper WeBT25-NT.7 | Add to My Program |
NeRF-VINS: A Real-Time Neural Radiance Field Map-Based Visual-Inertial Navigation System |
|
Katragadda, Saimouli | University of Delaware |
Lee, Woosik | University of Delaware |
Peng, Yuxiang | University of Delaware |
Geneva, Patrick | University of Delaware |
Chen, Chuchu | University of Delaware |
Guo, Chao | Google |
Li, Mingyang | Alphabet Inc |
Huang, Guoquan | University of Delaware |
Keywords: Localization, Visual-Inertial SLAM, Deep Learning Methods
Abstract: Achieving efficient and consistent localization with a prior map remains challenging in robotics. Conventional keyframe-based approaches often suffer from sub-optimal viewpoints due to limited field of view (FOV) and/or constrained motion, thus degrading the localization performance. To address this issue, we design a real-time tightly-coupled Neural Radiance Fields (NeRF)-aided visual-inertial navigation system (VINS). In particular, by effectively leveraging the NeRF’s potential to synthesize novel views, the proposed NeRF-VINS overcomes the limitations of traditional keyframe-based maps (with limited views) and optimally fuses IMU, monocular images, and synthetically rendered images within an efficient filter-based framework. This tightly-coupled fusion enables efficient 3D motion tracking with bounded errors. We extensively validate the proposed NeRF-VINS against the state-of-the-art methods that use prior map information, and demonstrate its ability to perform real-time localization, at 15 Hz, on a resource-constrained Jetson AGX Orin embedded platform.
|
|
13:30-15:00, Paper WeBT25-NT.8 | Add to My Program |
Night-Rider: Nocturnal Vision-Aided Localization in Streetlight Maps Using Invariant Extended Kalman Filtering |
|
Gao, Tianxiao | University of Macao |
Zhao, Mingle | University of Macau |
Xu, Chengzhong | University of Macau |
Kong, Hui | University of Macau |
Keywords: Localization, Visual-Inertial SLAM, Sensor Fusion
Abstract: Vision-aided localization for low-cost mobile robots in diverse environments has attracted widespread attention recently. Although many current systems are applicable in daytime environments, nocturnal visual localization is still an open problem owing to the lack of stable visual information. An insight from most nocturnal scenes is that the static and bright streetlights are reliable visual information for localization. Hence we propose a nocturnal vision-aided localization system in streetlight maps with a novel data association and matching scheme using object detection methods. We leverage the Invariant Extended Kalman Filter (InEKF) to fuse IMU, odometer, and camera measurements for consistent state estimation at night. Furthermore, a tracking recovery module is also designed for tracking failures. Experimental results indicate that our proposed system achieves accurate and robust localization with less than 0.2% relative error of trajectory length in four nocturnal environments.
|
|
WeBT26-NT Oral Session, NT-G404 |
Add to My Program |
SLAM II |
|
|
Chair: Fallon, Maurice | University of Oxford |
Co-Chair: Mersch, Benedikt | University of Bonn |
|
13:30-15:00, Paper WeBT26-NT.1 | Add to My Program |
ONeK-SLAM: A Robust Object-Level Dense SLAM Based on Joint Neural Radiance Fields and Keypoints |
|
Zhuge, Yue | Institute of Computing Technology, Chinese Academy of Sciences; |
Luo, Haiyong | Institute of Computing Technology, Chinese Academy of Sciences |
Chen, Runze | Beijing University of Posts and Telecommunications |
Chen, Yushi | Beijing University of Posts and Telecommunications |
Yan, Jiaquan | Beijing University of Posts and Telecommunications |
Zhuqing, Jiang | Beijing University of Posts and Telecommunications |
Keywords: SLAM, Localization, Mapping
Abstract: Neural implicit representation has recently achieved significant advancements, especially in the field of SLAM (Simultaneous Localization and Mapping). Previous NeRF-based SLAM methods have difficulties with object-level localization and reconstruction and struggle in dynamic and illumination-varied environments. We propose ONeK-SLAM, a robust object-level SLAM system that effectively combines feature points and neural radiance fields. ONeK-SLAM uses the joint information at the object level to improve localization accuracy and enhance reconstruction details. Moreover, our approach detects and eliminates dynamic objects based on the joint errors, while also harnessing the illumination invariance offered by feature points. Consequently, ONeK-SLAM achieves high-precision localization and detailed object-level mapping, even in dynamic and illumination-varying environments. Our evaluations, conducted on three public datasets that include both dynamic and variable lighting sequences, demonstrate that our method outperforms recent NeRF-based SLAM methods in both localization and reconstruction.
|
|
13:30-15:00, Paper WeBT26-NT.2 | Add to My Program |
A Two-Step Nonlinear Factor Sparsification for Scalable Long-Term SLAM Backend |
|
Jiang, Binqian | Hong Kong University of Science and Technology |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: SLAM, Localization
Abstract: This paper proposes a new nonlinear factor sparsification paradigm for general feature-based long-term SLAM backends. Given a pose sparsification policy, we aim to scale the SLAM problem with space explored instead of time in a principled way, so that the number of time-indexed poses can be limited while their influence and the long-lived landmarks are appropriately maintained. To do this, we propose a new two-step sparsification pipeline. Given a pose node to remove, the first step is performed in the Markov blankets of affected landmarks. It transforms pose-landmark constraints into pose-pose constraints while preserving observability and minimizing information loss in the blanket. Moreover, since landmarks are conditionally independent, we can do this in parallel, disconnecting a pose from all the landmarks. The second step marginalizes the pose of interest with pure pose-wise constraints without affecting any landmarks. Our method decouples the management of landmarks from pose-only measurements, making it general for any feature-based SLAM. We also give a practical example of how our backend works by concatenating it to a monocular VIO frontend. In simulations and on real-world datasets, our sparsified backend is shown to be accurate and efficient. We open-source our backend, along with the VIO+Backend example, with the aim of contributing to the community.
|
|
13:30-15:00, Paper WeBT26-NT.3 | Add to My Program |
Effectively Detecting Loop Closures Using Point Cloud Density Maps |
|
Gupta, Saurabh | University of Bonn |
Guadagnino, Tiziano | University of Bonn |
Mersch, Benedikt | University of Bonn |
Vizzo, Ignacio | Dexory |
Stachniss, Cyrill | University of Bonn |
Keywords: SLAM, Localization
Abstract: The ability to detect loop closures plays an essential role in any SLAM system. Loop closures allow correcting the drifting pose estimates from a sensor odometry pipeline. In this paper, we address the problem of effectively detecting loop closures in LiDAR SLAM systems across various environments, over long sequences, and agnostic to the scanning pattern of the sensor. While many approaches for loop closures using 3D LiDAR sensors rely on individual scans, we propose the usage of local maps generated from locally consistent odometry estimates. Several recent approaches compute the maximum elevation map on a bird's eye view projection of point clouds to compute feature descriptors. In contrast, we use a density image bird's eye view representation, which is robust to viewpoint changes. The utilization of dense local maps allows us to reduce the complexity of features describing these maps, as well as the size of the database required to store these features over a long sequence. This enables real-time application of our approach for a typical robotic 3D LiDAR sensor. We perform extensive experiments to evaluate our approach against other state-of-the-art approaches and show the benefits of our proposed approach.
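A rough sketch of the density-image bird's eye view representation described above; the grid size, resolution, and normalization are assumed for illustration and are not the authors' implementation.

```python
import numpy as np

def density_bev_image(points, resolution=0.5, size_m=100.0):
    """Project a local map's 3D points (N x 3, centered on the local frame)
    onto a bird's eye view grid and count points per cell; the normalized
    counts form a density image that is robust to viewpoint changes."""
    n_cells = int(size_m / resolution)
    half = size_m / 2.0
    ix = np.floor((points[:, 0] + half) / resolution).astype(int)
    iy = np.floor((points[:, 1] + half) / resolution).astype(int)
    valid = (ix >= 0) & (ix < n_cells) & (iy >= 0) & (iy < n_cells)
    img = np.zeros((n_cells, n_cells), dtype=np.float32)
    np.add.at(img, (iy[valid], ix[valid]), 1.0)  # accumulate counts per cell
    return img / max(img.max(), 1.0)             # normalize to [0, 1]

# A loop-closure descriptor can then be computed on the resulting image
# with a feature extractor of choice.
```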
|
|
13:30-15:00, Paper WeBT26-NT.4 | Add to My Program |
LOG-LIO: A LiDAR-Inertial Odometry with Efficient Local Geometric Information Estimation |
|
Huang, Kai | Tongji University |
Zhao, Junqiao | Tongji University |
Zhu, Zhongyang | Tongji University |
Ye, Chen | Tongji University |
Feng, Tiantian | Tongji University |
Keywords: SLAM, Localization
Abstract: Local geometric information, i.e., normal and distribution of points, is crucial for LiDAR-based simultaneous localization and mapping (SLAM) because it provides constraints for data association, which further determines the direction of optimization and ultimately affects the accuracy of localization. However, estimating the normal and distribution of points is time-consuming even with the assistance of a kd-tree or volumetric maps. To achieve fast normal estimation, we look into the structure of the LiDAR scan and propose a ring-based fast approximate least squares (Ring FALS) method. With the ring structural information, estimating the normal requires only the range information of the points when a new scan arrives. To efficiently estimate the distribution of points, we extend the ikd-tree to manage the map in voxels and update the distribution of points in each voxel incrementally while maintaining its consistency with the normal estimation. We further fix the distribution after its convergence to balance the time consumption and the correctness of representation. Based on the extracted and maintained local geometric information, we devise a robust and accurate hierarchical data association scheme where point-to-surfel association is prioritized over point-to-plane. Extensive experiments on diverse public datasets demonstrate the advantages of our system compared to other state-of-the-art methods. Our code is available at https://github.com/tiev-tongji/LOG-LIO.
|
|
13:30-15:00, Paper WeBT26-NT.5 | Add to My Program |
Radar-Only Odometry and Mapping for Autonomous Vehicles |
|
Casado Herraez, Daniel | University of Bonn & CARIAD SE |
Zeller, Matthias | CARIAD SE |
Chang, Le | University of Stuttgart |
Vizzo, Ignacio | Dexory |
Heidingsfeld, Michael | CARIAD SE |
Stachniss, Cyrill | University of Bonn |
Keywords: SLAM, Localization, Autonomous Vehicle Navigation
Abstract: Odometry and mapping play a pivotal role in the navigation of autonomous vehicles. In this paper, we address the problem of pose estimation and map creation using only radar sensors. We focus on two odometry estimation approaches followed by a mapping step. The first one is a new point-to-point ICP approach that leverages the velocity information provided by 3D radar sensors. The second one is advantageous for 2D radars with a low number of samples, and particularly useful for scenarios where the sensor is being blocked by large dynamic obstacles. It exploits a constant velocity filter and the measured Doppler velocities to estimate the vehicle’s ego-motion. We enrich this with a filtering step to improve the accuracy of the points in the resulting map. We put our work to the test using the View of Delft and NuScenes datasets, which involve 3D and 2D radar sensors. Our findings illustrate state-of-the-art performance of our odometry techniques in terms of accuracy when compared to existing alternatives. Moreover, we demonstrate that our map filtering methodology achieves higher similarity rates than the raw unfiltered map when benchmarked against a corresponding LiDAR map.
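To illustrate how per-detection Doppler velocities constrain the ego-motion (the core of the Doppler-based variant described above), here is a minimal least-squares sketch under a static-world assumption; the names and the absence of outlier rejection are simplifications, not the authors' pipeline.

```python
import numpy as np

def ego_velocity_from_doppler(directions, doppler):
    """directions: (N, 3) unit vectors from the radar to each detection;
    doppler: (N,) measured radial velocities. For static detections each
    measurement satisfies doppler_i = -d_i . v_ego, so the ego velocity
    follows from linear least squares (moving objects would be rejected
    by wrapping this model in RANSAC)."""
    A = -np.asarray(directions, dtype=float)
    b = np.asarray(doppler, dtype=float)
    v_ego, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v_ego

# Example: sensor moving at 10 m/s along x, three static detections
d = np.array([[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 1.0, 0.0]])
d /= np.linalg.norm(d, axis=1, keepdims=True)
print(ego_velocity_from_doppler(d, -d @ np.array([10.0, 0.0, 0.0])))
```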
|
|
13:30-15:00, Paper WeBT26-NT.6 | Add to My Program |
IPC: Incremental Probabilistic Consensus-Based Consistent Set Maximization for SLAM Backends |
|
Olivastri, Emilio | University of Padua |
Pretto, Alberto | University of Padova |
Keywords: SLAM, Localization, Autonomous Vehicle Navigation
Abstract: In SLAM (Simultaneous Localization and Mapping) problems, Pose Graph Optimization (PGO) is a technique to refine an initial estimate of a set of poses (positions and orientations) from a set of pairwise relative measurements. The optimization procedure can be negatively affected even by a single outlier measurement, with possible catastrophic and meaningless results. Although recent works on robust optimization aim to mitigate the presence of outlier measurements, robust solutions capable of handling large numbers of outliers are yet to come. This paper presents IPC, an acronym for Incremental Probabilistic Consensus, a method that approximates the solution to the combinatorial problem of finding the maximally consistent set of measurements in an incremental fashion. It evaluates the consistency of each loop closure measurement through a consensus-based procedure, possibly applied to a subset of the global problem, where all previously integrated inlier measurements have veto power. We evaluated IPC on standard benchmarks against several state-of-the-art methods. Although it is simple and relatively easy to implement, IPC competes with or outperforms the other tested methods in handling outliers while providing online performance. We release with this paper an open-source implementation of the proposed method.
|
|
13:30-15:00, Paper WeBT26-NT.7 | Add to My Program |
Generalized Correspondence Matching Via Flexible Hierarchical Refinement and Patch Descriptor Distillation |
|
Han, Yu | Donghua University |
Long, Ziwei | Tongji University |
Zhang, Yanting | Donghua University |
Jin, Wu | UESTC |
Fang, Zhijun | School of Computer Science and Technology, Donghua University |
Fan, Rui | Tongji University |
Keywords: SLAM, Localization, Computer Vision for Automation
Abstract: Correspondence matching plays a crucial role in numerous robotics applications. In comparison to conventional hand-crafted methods and recent data-driven approaches, there is significant interest in plug-and-play algorithms that make full use of pre-trained backbone networks for multi-scale feature extraction and leverage hierarchical refinement strategies to generate matched correspondences. The primary focus of this paper is to address the limitations of deep feature matching (DFM), a state-of-the-art (SoTA) plug-and-play correspondence matching approach. First, we eliminate the pre-defined threshold employed in the hierarchical refinement process of DFM by leveraging a more flexible nearest neighbor search strategy, thereby preventing the exclusion of repetitive yet valid matches during the early stages. Our second technical contribution is the integration of a patch descriptor, which extends the applicability of DFM to accommodate a wide range of backbone networks pre-trained across diverse computer vision tasks, including image classification, semantic segmentation, and stereo matching. Taking into account the practical applicability of our method in real-world robotics applications, we also propose a novel patch descriptor distillation strategy to further reduce the computational complexity of correspondence matching. Extensive experiments conducted on three public datasets demonstrate the superior performance of our proposed method. Specifically, it achieves an overall performance in terms of mean matching accuracy of 0.68, 0.92, and 0.95 with respect to the tolerances of 1, 3, and 5 pixels, respectively, on the HPatches dataset, outperforming all other SoTA algorithms.
|
|
13:30-15:00, Paper WeBT26-NT.8 | Add to My Program |
VOOM: Robust Visual Object Odometry and Mapping Using Hierarchical Landmarks |
|
Wang, Yutong | Beijing Institute of Technology |
Jiang, Chaoyang | Beijing Institute of Technology |
Chen, Xieyuanli | National University of Defense Technology |
Keywords: SLAM, Localization, Mapping
Abstract: In recent years, object-oriented simultaneous localization and mapping (SLAM) has attracted increasing attention due to its ability to provide high-level semantic information while maintaining computational efficiency. Some researchers have attempted to enhance localization accuracy by integrating the modeled object residuals into bundle adjustment. However, few have demonstrated better results than feature-based visual SLAM systems, as the generic coarse object models, such as cuboids or ellipsoids, are less accurate than feature points. In this paper, we propose a Visual Object Odometry and Mapping framework VOOM using high-level objects and low-level points as the hierarchical landmarks in a coarse-to-fine manner instead of directly using object residuals in bundle adjustment. Firstly, we introduce an improved observation model and a novel data association method for dual quadrics, employed to represent physical objects. It facilitates the creation of a 3D map that closely reflects reality. Next, we use object information to enhance the data association of feature points and consequently update the map. In the visual object odometry backend, the updated map is employed to further optimize the camera pose and the objects. Meanwhile, local bundle adjustment is performed utilizing the objects and points-based covisibility graphs in our visual object mapping process. Experiments show that VOOM outperforms both object-oriented SLAM and feature points SLAM systems such as ORB-SLAM2 in terms of localization. The implementation of our method is available at https://github.com/yutongwangBIT/VOOM.git.
|
|
13:30-15:00, Paper WeBT26-NT.9 | Add to My Program |
Lite-SVO: Towards a Lightweight Self-Supervised Semantic Visual Odometry Exploiting Multi-Feature Sharing Architecture |
|
Wei, Wenhui | University of Science and Technology of China |
Li, Jiantao | University of Science and Technology of China |
Huang, Kaizhu | Duke Kunshan University |
Li, Jiadong | Suzhou Institute of Nano-Tech and Nano-Bionics, Chinese Academy |
Liu, Xin | Suzhou Institute of Nano-Tech and Nano-Bionics (SINANO), Chinese |
Zhou, Yangfan | Chinese Academy of Sciences |
Keywords: SLAM, Localization, Mapping
Abstract: Not relying on ground-truth data for training, self-supervised semantic visual odometry (SVO) has recently gained considerable attention. Within self-supervised SVO, feature representation inconsistency between semantic/depth and pose tasks presents a significant challenge, as it may disrupt cross-task feature representations and lead to notable performance degradation. Regrettably, existing self-supervised SVO lacks an effective solution to this obstacle, either overlooking the issue or relying on a too-heavy architecture. In response to this challenge, we propose a groundbreaking solution within the Single-Stream architecture, known as Lite-SVO, which is a lightweight yet efficient multi-feature sharing architecture. Lite-SVO is designed to bolster self-supervised SVO, facilitating its adoption on edge devices without compromising accuracy and performance. The crucial innovation lies in the multi-feature sharing architecture, which fuses the semantic and depth maps as pose features, thus significantly reducing the model complexity and boosting the speed on edge devices. Built upon the novel feature sharing framework, Lite-SVO is able to incorporate the fine-grained feature sharing representation to further optimize the performance. Specifically, the proposed cross-feature sharing module alleviates the impact of object boundaries in depth estimation, and the designed multi-sharing module focuses on fine-grained features, thereby boosting the performance of Lite-SVO. Experimental results demonstrate that our method is at least 84.46% faster than the state-of-the-art Single-Stream approaches, and, excitingly, our pose accuracy is about 79.83% higher than theirs.
|
|
WeBL-EX Poster Session, Exhibition Hall |
Add to My Program |
Late Breaking Results Poster V |
|
|
|
13:30-15:00, Paper WeBL-EX.1 | Add to My Program |
Tailoring Indoor Search Strategies: The Role of User Preferences in Multi-Target Navigation |
|
Chikhalikar, Akash | Tohoku University |
Ravankar, Ankit A. | Tohoku University |
Salazar Luces, Jose Victorio | Tohoku University |
Hirata, Yasuhisa | Tohoku University |
Keywords: Task and Motion Planning, Object Detection, Segmentation and Categorization, Task Planning
Abstract: Searching for objects is a high-level task that is necessary for service robots. Object (target) search involves the integration of scene-awareness with a common-sense knowledge base. We propose a novel framework to combine semantic maps with probabilistic priors to incorporate user preferences in multi-target navigation. In conjunction with our framework, we present the quantitative results of our experiments in an indoor environment.
|
|
13:30-15:00, Paper WeBL-EX.2 | Add to My Program |
Needle End-Effector for a Tele-Operative Pain Intervention Robotic System |
|
Niazi, Muhammad Umer Khan | HEART Lab, Asan Medical Center |
Hyun, Jae ho | Biomedical Engineering Research Center of Asan Medical Center |
Yang, Bomi | Asan Medical Center |
Mehmood, Usman | Asan Medical Center |
Choi, Jaesoon | Asan Medical Center |
Moon, Youngjin | Asan Medical Center |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles, Mechanism Design
Abstract: The crux of pain management interventions such as epidural nerve blocks is the insertion of the needle into the epidural space and then the delivery of pain medication to the region. However, the physician has to continuously take x-rays to successfully guide the needle into the required space. This limits the number of procedures that can be performed due to accumulation of radiation exposure, highlighting the need for a tele-operative robotic system. The current study focuses on the design of an end-effector for such a robotic system. The needle end-effector should have insertion capabilities along with a viable workspace, being able to cover regions from the thoracic vertebrae to the coccyx. The complete robot system consists of a follower section, which includes an end-effector for needle insertion, a surgical tool handler, multiple syringe cartridges, a drug pump, and a control base with a support arm, and a master section, which includes a haptic interface and a control unit. Although the current basic version achieves adequate results, it is still limited in terms of x-ray compatibility and workspace flexibility at the edges. For the advanced version, the current actuator unit lacks stiffness for insertion. Through a follow-up study, we plan to implement these changes along with other feedback to design a device that can be practically applicable in the hands of physicians.
|
|
13:30-15:00, Paper WeBL-EX.5 | Add to My Program |
Exploring Applications for Cyber-Physical Interactions through a Human Coincident Robot |
|
Sasaki, Tomoya | Tokyo University of Science |
Watanabe, Takafumi | Preferred Robotics Inc. |
Inami, Masahiko | The University of Tokyo |
Yoshida, Eiichi | Tokyo University of Science |
Keywords: Human-Robot Collaboration, Virtual Reality and Interfaces, Physical Human-Robot Interaction
Abstract: Recent advancements in information and communication technology have enabled human activities in cyberspace to become more active, just as in physical space. However, most activities are still limited to audiovisual information via displays and mobile devices, although we interact with our environment through whole-body motions involving multiple contacts in our daily lives. Here, we introduce an interactive cyber-physical system using a Human Coincident Robot (HCR). This system consists of an independent two-wheeled mobile robot that follows the human's walking motion to match the coordinates of the robot's body. By using the robot, cyberspace and physical space are seamlessly connected. This poster explores potential applications of cyber-physical interaction using the HCR. Showcased examples include digital twin applications that construct cyberspace by simultaneously acquiring human whole-body motions and surrounding environmental information, VR/AR content that offers encountered-type haptic presentation in cyberspace, and human augmentation with robotic arms.
|
|
13:30-15:00, Paper WeBL-EX.6 | Add to My Program |
San Francisco World: Drift-Free Structure-Aware Visual Compass |
|
Ham, Jungil | Gwangju Institute of Science and Technology |
Kim, Minji | Gwangju Institute of Science and Technology |
Kang, Suyoung | Sookmyung Women's University |
Joo, Kyungdon | UNIST |
Li, Haoang | The Chinese University of Hong Kong |
Kim, Pyojin | Gwangju Institute of Science and Technology (GIST) |
Keywords: RGB-D Perception, Localization, Computer Vision for Automation
Abstract: This study proposes a novel drift-free, structure-aware visual compass system, tailored to the unique urban landscape of San Francisco. By exploiting the city's characteristic inclined slopes, our method provides precise angular displacement measurements, achieving an unprecedented one-degree-level error accuracy. The proposed system comprises a structural model that adapts to San Francisco’s uniform inclination, which is critical for both outdoor navigation and indoor applications, such as staircases. Our SF World detection algorithm is central to the process, effectively filtering non-linear features and enforcing San Francisco's characteristic slope constraint for enhanced drift-free operation. We integrate rotational motion estimation through an improved RANSAC mechanism that accommodates extended line and plane constraints, further refined by PNV(vp1) tracking. The efficacy of the proposed system is demonstrated through comprehensive evaluations, using mean value comparison against several benchmarks, including the LKPC, IJIS-SHA, ORB-SLAM3, and DROID-SLAM systems. The results indicate significant improvements in rotational motion estimation, outperforming existing solutions in the accuracy of one-degree-level mean error, particularly in challenging structural environments like U-shaped staircases. Our work not only advances the field of visual navigation systems but also provides a pragmatic solution for navigation in cities with similar structural features.
|
|
13:30-15:00, Paper WeBL-EX.7 | Add to My Program |
Predicting Optimal Candidates for Stroke Rehabilitation through Exoskeletal Robotic Gait Training Using Clinical Machine Learning |
|
Park, Seonmi | Yonsei University |
Kim, Yunhwan | Yonsei University |
Park, HaEun | Yonsei University |
You, Joshua Sung H | Yonsei University |
Keywords: Rehabilitation Robotics, Optimization and Optimal Control, AI-Based Methods
Abstract: Objective: We aimed to determine the best predictive model based on international classification of functioning impairment domain features (Fugl-Meyer assessment, FMA; Modified Barthel index gait-related scale, MBI; Berg balance scale, BBS) and reveal their responsiveness to robot-assisted gait training (RAGT) in patients with subacute stroke. Method: Data from 187 people with subacute stroke who underwent a 12-week Walkbot RAGT intervention were obtained and analyzed. Overall, 18 potential predictors encompassed demographic characteristics and the baseline scores of functional and structural features. Five predictive ML models, including decision tree, random forest, eXtreme Gradient Boosting, light gradient boosting machine, and categorical boosting, were used. Results: The initial and final BBS, final Modified Ashworth scale, and initial MBI scores were important features predicting functional improvements. eXtreme Gradient Boosting demonstrated superior performance compared to other models in predicting functional recovery after RAGT in patients with subacute stroke. Conclusions: eXtreme Gradient Boosting may be an invaluable prognostic tool, providing clinicians and caregivers with a robust framework to make precise clinical decisions regarding the identification of optimal responders and effectively pinpoint those who are most likely to derive maximum benefits from RAGT interventions.
|
|
13:30-15:00, Paper WeBL-EX.8 | Add to My Program |
Construction of VR Training System Using Visual Feedbacks of Physical Load and Movement for Movement Improvement |
|
Iwami, Kouichi | Tamagawa-University |
Inamura, Tetsunari | Tamagawa University |
Keywords: Human Performance Augmentation
Abstract: An efficient way to learn physical movements is training in a VR space. However, conventional feedback has not taken a person's physical load into account, and it is not clear what feedback is effective for improving a person's physical movements. We constructed a feedback method that visualizes the physical load in the VR space as a color or bar graph and a method that feeds back physical movements on a screen. We compared the feedback methods based on the amount of change in physical load before and after physical action training.
|
|
13:30-15:00, Paper WeBL-EX.9 | Add to My Program |
The Fractal-Based Wing Design with Stiffness Gradient in MAV |
|
Gan, Bian | Zhejiang University |
Cheng, Tianlun | Zhejiang University |
Liu, Yide | Zhejiang University |
Qu, Shaoxing | Zhejiang University |
Keywords: Micro/Nano Robots, Biomimetics, Biologically-Inspired Robots
Abstract: Insect wings possess a complex stiffness gradient resulting from the arrangement of wing veins and the varying thickness of the wing membrane, allowing for flexible deformation during flapping to enhance lift generation. However, replicating such intricate stiffness characteristics of insect wings in artificial wings for insect-scale flapping-wing micro air vehicles (FWMAVs) proves challenging. This study introduces a novel flexible wing design approach with a stiffness gradient based on fractals. The fractal wing is manufactured using the smart composite micro-structure (SCM) process, and its aerodynamic performance is evaluated on a lift test platform. Finite element analysis demonstrates that the deformation behavior of the fractal wing under uniform load closely resembles that of insect wings, with mechanical properties adjustable through modifications in fractal curve density and arrangement. Aerodynamic testing confirms that the fractal wing performs comparably to conventional rigid wings. The fractal-based flexible wing design method proposed in this study offers valuable insights for FWMAV design and manufacture.
|
|
13:30-15:00, Paper WeBL-EX.10 | Add to My Program |
Preliminary Results on 3D Shape Control of Deformable Linear Objects Using Dual-Arm Robots |
|
Choi, Jiyoung | Chonnam National University |
Lee, Donggun | UC Berkeley |
Hong, Ayoung | Chonnam National University |
Keywords: Model Learning for Control, Dual Arm Manipulation, Machine Learning for Robot Control
Abstract: Our daily lives are filled with deformable objects like cables, ropes, cloth-like objects, and even food products, rather than rigid bodies. The effective manipulation of these variable-shaped objects offers significant advantages in robotics. However, manipulating deformable objects is challenging compared to rigid objects due to their high degrees of freedom and the complex deformations caused by twisting and bending. These deformations make it difficult to create dynamic models using traditional analytical methods. To overcome this challenge, we propose an approach utilizing data-driven neural network models for dynamic modeling of deformable objects. Our method uses time-series data to predict the next object states without relying on complex dynamics. This data-driven approach provides benefits over traditional analytical methods and enables optimal robot movements through Model Predictive Control. In this study, the task was to manipulate a deformable linear object, specifically a cable, to a target shape using dual robotic arms. The task achieved an average position error of 0.015 meters, demonstrating the effectiveness and potential of our approach for manipulating deformable objects in various robotic applications.
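As a hedged sketch of how a learned dynamics model can drive Model Predictive Control for this kind of task, here is a simple random-shooting planner; the model callable, action dimensionality, horizon, and bounds are placeholders, not the authors' setup.

```python
import numpy as np

def random_shooting_mpc(model, state, target_shape, horizon=5, n_samples=256,
                        action_low=-0.01, action_high=0.01, rng=None):
    """Sample candidate dual-arm end-effector motion sequences, roll each one
    through a learned one-step model `model(state, action) -> next_state`,
    and return the first action of the sequence whose predicted final cable
    shape is closest to target_shape (mean point-wise error)."""
    rng = rng or np.random.default_rng()
    best_cost, best_action = np.inf, None
    for _ in range(n_samples):
        actions = rng.uniform(action_low, action_high, size=(horizon, 12))  # 2 arms x 6 DoF
        s = state
        for a in actions:
            s = model(s, a)  # learned one-step prediction
        cost = np.mean(np.linalg.norm(s - target_shape, axis=-1))
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action
```

Only the first action of the best sequence would be executed; the planner is then re-run from the newly observed cable state in a receding-horizon fashion.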
|
|
13:30-15:00, Paper WeBL-EX.11 | Add to My Program |
A Multimodal Learning Approach for Automated Produce Freshness Detection Using Vision and Tactile Data |
|
Fahmy, Israa | KU |
Hussain, Irfan | Khalifa University |
Hassan, Taimur | Khalifa University |
Seneviratne, Lakmal | Khalifa University |
Werghi, Naoufel | Khalifa University |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, AI-Based Methods
Abstract: In the realm of agricultural robotics, ensuring the freshness of produce is paramount for maintaining quality and reducing food waste. This research presents a novel multimodal approach for assessing the freshness of fruits and vegetables, combining vision and tactile data through a robotic gripper equipped with RGB imaging and GelSight tactile sensors. Our comprehensive dataset includes 17 types of produce, each labeled with five freshness levels (Unripe, Ripe, Overripe, Dry, Damaged) to cater to various applications from farm to retail. We introduce FastFruitNet, a fusion-based deep learning architecture that integrates features from Convolutional Neural Networks and Vision Transformers, featuring local and global attention layers for precise freshness classification. Our results demonstrate the superiority of the multimodal approach over single-modality methods, with significant improvements in accuracy and robustness. The proposed system has the potential to revolutionize freshness assessment in precision agriculture, offering a scalable solution for automated quality control in the supply chain.
|
|
13:30-15:00, Paper WeBL-EX.12 | Add to My Program |
Efficient and Safe Data Acquisition Method for Robot Modeling with Safe Exploration |
|
Mori, Kenya | The University of Tokyo |
Venture, Gentiane | The University of Tokyo |
Keywords: Data Sets for Robot Learning, Calibration and Identification, Optimization and Optimal Control
Abstract: We propose a method for generating exciting motions for inertial parameter identification by combining active learning and safe exploration. Generating exciting motions is necessary to obtain accurate inertial parameters; however, a certain accuracy of the inertial parameters is required to calculate motion constraints when generating exciting motions. The proposed method estimates the safe torque limits at which the robot can operate using safe exploration, and a higher-quality dataset can be obtained even when there are differences between the initial dynamic model and the actual robot. The effectiveness was confirmed in simulation.
|
|
13:30-15:00, Paper WeBL-EX.13 | Add to My Program |
Evaluation of Difficulty Settings by Validity of Difficulty Levels and Relationship with Self-Efficacy in VR Kendama |
|
Goutsu, Yusuke | Tamagawa University |
Inamura, Tetsunari | Tamagawa University |
Keywords: Human Performance Augmentation, Virtual Reality and Interfaces
Abstract: In rehabilitation, games, sports, and education, it is crucial to set difficulty levels according to the user's skill. We have previously proposed a method to set difficulty levels adapted to each user's skill based on GPDM. However, the difficulty levels do not always correspond with the user's subjective perception, so an investigation of the effectiveness of adaptive difficulty setting that considers the psychological aspect is necessary. This study evaluates adaptive difficulty setting in two unique aspects: the validity of difficulty levels that result in targeted success rates and the relationship between successful experiences and self-efficacy. In the experiment, we employ a Kendama task in a VR space where the difficulty level can be easily adjusted and compare our difficulty setting with a difficulty setting where the difficulty levels are uniformly fixed for all users. The comparison results indicate that the adaptive difficulty setting reduces the error between targeted and actual success rates more than the fixed difficulty setting, confirming that the difficulty levels can be set according to the user's skill. Moreover, the adaptive difficulty setting shows a stronger correlation between the increased number of expected future successes and the improvement of self-efficacy, which supports a psychological hypothesis that successful experience in the imagination is a significant component of self-efficacy.
|
|
13:30-15:00, Paper WeBL-EX.14 | Add to My Program |
RRT*-Connect Based Path Planning Method for Robot Grasping Pose Flexibility |
|
Yonrith, Phayuth | Chonnam National University |
Hong, Ayoung | Chonnam National University |
Keywords: Manipulation Planning, Agricultural Automation, Robotics and Automation in Agriculture and Forestry
Abstract: This poster presents an RRT*-Connect-based path planning method for grasping pose flexibility and a local gap sampling method. In previous approaches, a single grasp pose is usually selected from multiple candidates for planning, which can eliminate candidates with a lower cost. We propose an RRT*-Connect-based method allowing multiple goals to address this issue. Since the proposed method requires high computation time when used in a high-dimensional configuration space, we propose a local gap sampling method. This method aims to explore the trees within a constrained space, determined using the found path and the straight path from the initial configuration to the goal. The experimental results show that the proposed approach (with grasping pose flexibility and local gap sampling) determined the possible path solutions with a faster cost convergence rate.
|
|
13:30-15:00, Paper WeBL-EX.15 | Add to My Program |
A Compact, Battery-Free Capsule for Multi-Targeted Sampling and Navigation within the Gastrointestinal Tract |
|
Ren, Huayang | Shanghai University |
Chen, Ziheng | Shanghai University |
Wang, Zhaokai | Queen's University |
Jiang, MengXi | Shanghai University |
Wang, Xian | Queen's University |
Liu, Na | Shanghai University, Shanghai, China |
Keywords: Medical Robots and Systems, Surgical Robotics: Laparoscopy, Surgical Robotics: Planning
Abstract: Understanding the impact of the human gut microbiome on health necessitates precise sampling from distinct locations within the gastrointestinal (GI) tract. Existing ingestible capsules lack the capability to fulfill this need adequately. This poster introduces a compact, battery-free capsule designed with active navigation and multiple sampling capabilities within the gastrointestinal tract, all steered by an external magnetic field. The capsule is equipped with three humidity-responsive sample structures, enabling three rounds of sampling. Enhanced by a humidity-responsive sealing mechanism, this innovation ensures safe sample transport. Results from the in-vitro multi-target sampling of intestinal liquid indicate successful retrieval of samples from multiple locations within the intestine.
|
|
13:30-15:00, Paper WeBL-EX.16 | Add to My Program |
Pressurization Patterns Determination Method of Pneumatic Soft Actuator with Arrayed Chambers under Interaction with Object |
|
Mizuno, Kaito | Graduate School of Engineering Osaka University |
Higashimori, Mitsuru | Osaka University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Hydraulic/Pneumatic Actuators
Abstract: This paper presents a novel control method of pneumatic soft actuators with arrayed chambers (PSAAC) for object manipulation. A discrete model composed of multiple linear elastic elements is introduced for representing the surface shape of the PSAAC. Based on the model, pressurization patterns of the PSAAC for generating the desired surface shape under the contact pressure from the object are obtained using a mathematical optimization method. The proposed method is experimentally verified by an object tilting task and an object lifting task.
|
|
13:30-15:00, Paper WeBL-EX.17 | Add to My Program |
Offline Reinforcement Learning for Visual Semantic Navigation |
|
Gutiérrez Álvarez, Carlos | Universidad De Alcalá |
Flor, Rafael | UAH |
Kanezaki, Asako | Tokyo Institute of Technology |
López-Sastre, Roberto | University of Alcalá |
Keywords: Reinforcement Learning, Vision-Based Navigation
Abstract: Is it possible to train navigation agents from just human demonstrations? By leveraging the offline reinforcement learning paradigm, we can train agents from a fixed dataset of navigation experience, without querying any environment. This opens the possibility to create many navigation datasets from any navigation agent in any real or simulated environment, and then use them to train new agents for different scenarios without the need to ever query that environment. We show this possibility via the first offline reinforcement learning algorithm implemented for visual semantic navigation, i.e., OffNav, providing a small analysis of its performance on the HM3D dataset.
|
|
13:30-15:00, Paper WeBL-EX.18 | Add to My Program |
Real-Time sEMG-Based Gesture Recognition System Based on Lightweight Networks |
|
Li, Yazhou | Shenyang University of Technology |
Wang, Peiyao | Shenyang University of Technology |
Li, Kairu | Shenyang University of Technology |
Keywords: Gesture, Posture and Facial Expressions, Prosthetics and Exoskeletons
Abstract: For the real-time control of smart hand prostheses based on surface electromyography (sEMG) signals, it remains a challenge to balance high gesture recognition accuracy against a heavy computational workload, which constrains the commercial application of hand prostheses. To address this issue, a real-time sEMG-based gesture recognition system based on lightweight networks is proposed. In the preprocessing stage, data augmentation, random dropout, and L2 regularization are employed to prevent overfitting. The one-dimensional sEMG signals are filtered and then transformed into two-dimensional frequency-domain images using sliding Hamming windows and the short-time Fourier transform (STFT) as inputs. A VanillaNet feature extraction module is utilized for feature extraction. Deep training strategies and stacked nonlinear activation layers are employed to reduce the complexity of the network model. Additionally, mismatched structural weights in the pre-trained model are frozen to achieve the highest classification performance with minimal training cost. The experimental results show that the proposed model achieves accuracies of 93.48% and 86% on a self-built sEMG dataset of multiple gestures and the public Ninapro DB3 dataset, respectively, demonstrating higher recognition accuracy and a lower computational workload compared with ResNet18, ResNet50, and LightViT networks.
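A minimal sketch of the sEMG-to-image step described above, using a Hamming-window STFT; the sampling rate, window length, and normalization are assumptions rather than the authors' exact settings.

```python
import numpy as np
from scipy.signal import stft

def semg_to_spectrogram(x, fs=2000, nperseg=128, noverlap=64):
    """Transform a 1-D sEMG window into a 2-D time-frequency image via a
    short-time Fourier transform with a Hamming window; the normalized
    log-magnitude image can then be fed to a lightweight CNN."""
    _, _, Z = stft(x, fs=fs, window='hamming', nperseg=nperseg, noverlap=noverlap)
    img = np.log1p(np.abs(Z))                                    # compress dynamic range
    return (img - img.min()) / (img.max() - img.min() + 1e-12)   # scale to [0, 1]

# Example on a synthetic 0.5 s window sampled at 2 kHz
x = np.random.default_rng(0).normal(size=1000)
image = semg_to_spectrogram(x)
```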
|
|
13:30-15:00, Paper WeBL-EX.19 | Add to My Program |
Vision-Based Multi-Object Tracking for Autonomous Ship Navigation |
|
Ku, Japyeong | Chungnam National University |
Kim, Youngje | Samsung Heavy Industries |
Kim, Jaewoo | Samsung Heavy Industries |
Jung, Jongdae | Chungnam National University |
Keywords: Computer Vision for Transportation, Intelligent Transportation Systems, Deep Learning for Visual Perception
Abstract: There is a growing need for vision-based situational awareness to ensure safe navigation for autonomous ships. This requires continuous detection of maritime objects such as other ships, buoys, etc. to update collision risk information. An electro-optical targeting system (EOTS) offers optical/thermal imaging, pan/tilt/zoom operation, which is advantageous for distant maritime object recognition. However, the target object can easily move away during PTZ operation, requiring tracking and re-identification technology. In this paper, we developed a deep learning-based visual object tracking system by investigating existing benchmark models and training the models using a custom dataset. We also carried out a quantitative performance evaluation on multiple object tracking and applied the trained model to EOTS for qualitative evaluation.
|
|
13:30-15:00, Paper WeBL-EX.20 | Add to My Program |
Development of Intelligent Navigation System Based on Digital Bridge System for Maritime Autonomous Surface Ships: Framework Design and System Integration |
|
Park, Jeonghong | KRISO |
Kim, Dong-Ham | Korea Research Institute of Ships & Ocean Engineering |
Kang, Minju | Korea Research Institute of Ships & Ocean Engineering |
Choi, Hyun-Taek | Korea Research Institute of Ships and Oceans Engineering |
Choi, Jinwoo | KRISO, Korea Research Institute of Ships & Ocean Engineering |
Keywords: Marine Robotics, Intelligent Transportation Systems, Autonomous Vehicle Navigation
Abstract: This paper introduces an intelligent navigation system that consists of a situational awareness system, an autonomous navigation system, and a digital bridge system for maritime autonomous surface ships (MASSs) at sea. Above all, the intelligent navigation system for MASSs should be designed and developed to achieve a high level of autonomy. The situational awareness system comprises various perception sensors (i.e., cameras, lidar, and radar). It provides the motion information estimated from automatically detected objects as well as the situational information required to maneuver at sea. The autonomous navigation system is developed to maneuver safely and prevent potential risks along the maneuvering trajectory of the MASS. In particular, the digital bridge system enables the provision of various information for the situational awareness and autonomous navigation systems, considering a transition function of control modes with autonomy levels (i.e., remote, semi-autonomous, and fully autonomous operating modes), as well as data acquisition, classification, and management with a variety of navigational and engine devices and systems. The developed intelligent navigation system was integrated into a ship built as a testbed to verify the performance of the autonomous operation function; preliminary tests were conducted, and the results of the developed system are described.
|
|
13:30-15:00, Paper WeBL-EX.21 | Add to My Program |
Education and AI Reliance: Differences in Recommendations, Habits, and Usage |
|
Biswas, Mriganka | University of Sunderland |
Murray, John Christopher | University of Sunderland |
Keywords: Acceptability and Trust, Behavior-Based Systems, Design and Human Factors
Abstract: Artificial Intelligence (AI) technologies are becoming increasingly intertwined with daily life. This study investigates the relationship between education level and various aspects of AI reliance. Findings indicate that individuals with higher education levels exhibit significantly greater reliance on AI-powered recommendations compared to those with lower education. However, education did not significantly impact the use of AI for predictions or assistance. These results highlight the importance of inclusive AI design and the need for AI literacy initiatives that span across all educational backgrounds, ensuring equitable access to and understanding of the benefits of AI technologies.
|
|
13:30-15:00, Paper WeBL-EX.22 | Add to My Program |
Multi-Stable Robot for Search & Rescue and Exploration |
|
Salem, Lior | Technion |
Gat, Amir | Technion - Israel Institute of Technology |
Or, Yizhar | Technion |
Keywords: Search and Rescue Robots, Robotics in Hazardous Fields, Modeling, Control, and Learning for Soft Robots
Abstract: Many exploration applications, particularly search and rescue of survivors in debris, call for highly maneuverable robots that can locomote within complex, narrow, and unknown environments. Conventional approaches address these tasks with snake-like robots composed of multiple serial rigid links, joints, and motors. The numerous components increase design complexity as well as the robot's weight and price. The proposed approach is instead a robot comprising serial elastic multi-stable structures that transform between equilibrium states using a single internal pneumatic actuator. The robot can be deformed to a desired shape and advance along a complex, winding, narrow path. Its mechanical simplicity in design, fabrication, and actuation, alongside lighter weight and lower price, are great advantages compared with conventional robots. The design of the autonomous robotic system, motion planning algorithm simulations, and experimental navigation are demonstrated.
|
|
13:30-15:00, Paper WeBL-EX.23 | Add to My Program |
Fundamental Validation of Forced Soft-Landing of UAVs Controlled by Open Source Software with an Abuse Prevention Processor |
|
Fukuda, Towa | Shibaura Institute of Technology |
Abiko, Satoko | Shibaura Institute of Technology |
Tsujita, Teppei | National Defense Academy of Japan |
Sato, Daisuke | Tokyo City University |
Keywords: Surveillance Robotic Systems, Product Design, Development and Prototyping, Engineering for Robotic Systems
Abstract: Recently, the widespread adoption of Open-Source Software (OSS), which allows programs to be freely distributed and improved, has facilitated the development of Unmanned Aerial Vehicles (UAVs). However, there is growing concern over an increase in crimes, such as terrorist acts, involving UAVs. In this presentation, we propose a system to prevent the misuse of OSS. Unlike commercially available UAVs, robots controlled via OSS can have their anti-abuse systems disabled by modifying the source code. To counteract this, the anti-abuse system is embedded in a section of the processor that is inaccessible to end-users. The system is designed to satisfy the following requirements: 1) preventing UAV misuse without interfering with the OSS's usefulness, 2) allowing developers to implement their own judgments about abuse in the system, and 3) ensuring the anti-abuse system cannot be disabled by users. In this study, the system is designed to force a UAV soft landing if it enters a designated no-entry zone. The proposed approach is verified with an FPGA-based prototype in which a motor attached to a linear guide simulates a UAV. We observed a delay of approximately 300 microseconds between the anti-abuse system's abuse detection and the UAV's control recovery. At a flight speed of 100 km/h, this delay results in a deviation of less than 1.0 m from the no-entry zone. The fundamental experiment clearly demonstrated that the proposed system can effectively preempt misuse.
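A minimal sketch (illustrative only; the deployed check runs inside an FPGA region inaccessible to end-users, and the circular-zone representation here is a hypothetical simplification) of the no-entry-zone test that would trigger the forced soft landing:

def must_force_landing(position, no_entry_zones):
    """Return True if the UAV position lies inside any circular no-entry zone.

    `no_entry_zones` is a hypothetical list of (center_x, center_y, radius) tuples;
    a True result would hand control to the soft-landing routine."""
    x, y = position
    for cx, cy, r in no_entry_zones:
        if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2:
            return True
    return False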
|
|
13:30-15:00, Paper WeBL-EX.24 | Add to My Program |
Deep Learning Based Harvestability Scoring Method for Effective Harvesting in Open Orchard |
|
Yoon, Chanyoung | Korea Institute of Industrial Technology |
Yoo, Ji-Hyeon | Korea Institute of Industrial Technology |
Kang, Jaehyeon | Korea Institute of Industrial Technology |
Pyo, Dongbum | Korea Institute of Industrial Technology |
Ko, KwangEun | Korea Institute of Industrial Technology |
Keywords: Robotics and Automation in Agriculture and Forestry, Deep Learning Methods
Abstract: In the context of robotic fruit harvesting in orchard environments, the process involves autonomously estimating the 6D pose of fruits against complex backgrounds in three-dimensional space and, based on this estimation, determining the poses of harvesting manipulators and end effectors to generate optimal motions. Since multiple fruits are observed simultaneously in real farm scenarios, it is crucial to prioritize objects with a high probability of successful harvesting. In this study, we propose a metric called HavScore to quantify this harvesting probability. The proposed HavScore assigns higher values to fruits that have high manipulability indices in the manipulator's workspace and that are less obstructed by leaves, branches, etc. To this end, we developed a deep learning-based algorithm that estimates the 6D poses and visibility of fruits from RGB images and computes the HavScore by integrating these estimates with manipulability indices. Prioritizing the harvesting of objects with a high probability of success can enhance the efficiency and productivity of agricultural operations.
|
|
13:30-15:00, Paper WeBL-EX.25 | Add to My Program |
Failure Recovery of Robotic Manipulation on Deformable Objects: Learning from Human-Guided Operation |
|
He, Weizan | Tohoku University |
Chen, Dayuan | Tohoku University |
Zhang, Yukuan | Tohoku University |
Petrilli-Barceló, Alberto Elías | Tohoku University |
Salazar Luces, Jose Victorio | Tohoku University |
Hirata, Yasuhisa | Tohoku University |
Keywords: Human-Robot Collaboration, Manipulation Planning, Motion Control
Abstract: With the rapid development of robot manipulation, handling deformable objects has become an important research topic. To address the challenges of unpredictability and perception of the state of deformable objects, employing models trained in simulators for trajectory generation is a viable solution. However, this approach has difficulty recovering from failures during actual robot operation. This study introduces a novel failure recovery strategy in which manual intervention after incorrect operations is used to train correction models, ultimately achieving automated failure recovery.
|
|
13:30-15:00, Paper WeBL-EX.26 | Add to My Program |
Arm Back Support Suit (Abs-Suit) for Load Carriage in Parcel Delivery with a Passive Load Redistribution Mechanism |
|
Yoo, Hye Ju | Seoul National University |
Lee, Jewoo | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Wearable Robotics, Health Care Management, Physically Assistive Devices
Abstract: Parcel delivery officers hold and carry heavy parcels to deliver them right to the customers' doorsteps. In the process, they navigate elevated ground and staircases, especially in crowded cities. As a result of repeatedly holding and carrying heavy parcels, the officers report upper-body musculoskeletal disorders in the arms and lower back. Robotic exosuits are actively being developed but are primarily oriented toward lifting assistance and do not address the entire upper body. In this late breaking poster, we present the Abs-suit, a fully soft passive wearable robot that uses a load-redistribution strategy to provide load-dependent compression around the lower back for trunk support and arm assistance. This is achieved by coupling the load to the human body, which ensures the wearer receives appropriate assistance only when it is needed. The design parameters are studied through mannequin mock-up experiments and modelling to understand the relationship between the design parameters and the pressure applied to the body. Human experiments are also conducted to validate the performance of the Abs-suit.
|
|
13:30-15:00, Paper WeBL-EX.27 | Add to My Program |
Late Breaking Results on vS-Graphs: Integrating Visual SLAM and Situational Graphs for Multi-Level Scene Understanding |
|
Tourani, Ali | University of Luxembourg |
Bavle, Hriday | University of Luxembourg |
Ejaz, Saad | University of Luxembourg |
Morilla-Cabello, David | Universidad De Zaragoza |
Sanchez-Lopez, Jose Luis | University of Luxembourg |
Voos, Holger | University of Luxembourg |
Keywords: Visual-Inertial SLAM, SLAM, Semantic Scene Understanding
Abstract: Integrating Visual SLAM (VSLAM) frameworks with semantic object detection and mapping to create richer maps, particularly of structural elements (e.g., walls and doorways), provides a digital-twin baseline for more complex robotic tasks such as path planning. The fundamental challenge in implementing a framework that satisfies these goals, including situational graph generation, semantic mapping, real-time performance, adequate vision sensor coverage, and 3D scene representation, is re-designing the architecture of the baseline (i.e., ORB-SLAM 3.0) so that it keeps working properly while new components are added and reconstruction accuracy is improved. The under-development VSLAM framework, titled vS-Graphs, modifies the baseline architecture by adding new threads and modules, changing the optimization process, and introducing sensor-based feature extraction (e.g., RGB-D point clouds). In vS-Graphs, the point clouds obtained from the cameras are used to extract all 3D planes and their equations, without any knowledge of their type. In the next step, the portions of the point clouds corresponding to walls and grounds are matched with the previously detected planes to attach semantic information to them. Walls and corridors are also detected based on the geometrical layouts of structural-level entities.
|
|
WeCT1-CC Oral Session, CC-303 |
Add to My Program |
Motion and Path Planning III |
|
|
Chair: Stiffler, Nicholas | University of Dayton |
Co-Chair: Likhachev, Maxim | Carnegie Mellon University |
|
16:30-18:00, Paper WeCT1-CC.1 | Add to My Program |
Constant-Time Motion Planning with Anytime Refinement for Manipulation |
|
Mishani, Itamar | Carnegie Mellon University, Robotics Institute |
Feddock, Hayden | University of Pittsburgh |
Likhachev, Maxim | Carnegie Mellon University |
Keywords: Motion and Path Planning, Manipulation Planning, AI-Based Methods
Abstract: Robotic manipulators are essential for future autonomous systems, yet limited trust in their autonomy has confined them to rigid, task-specific systems. The intricate configuration space of manipulators, coupled with the challenges of obstacle avoidance and constraint satisfaction, often makes motion planning the bottleneck for achieving reliable and adaptable autonomy. Recently, a class of constant-time motion planners (CTMP) was introduced. These planners employ a preprocessing phase to compute data structures that enable online planning to provably generate motion plans, potentially sub-optimal, within a user-defined time bound. This framework has been demonstrated to be effective in a number of time-critical tasks. However, robotic systems often have more time allotted for planning than the online portion of CTMP requires, time that can be used to improve the solution. To this end, we propose an anytime refinement approach that works in combination with CTMP algorithms. Operating as a constant-time algorithm, our framework rapidly generates an initial solution within a user-defined time threshold; functioning as an anytime algorithm, it then iteratively refines the solution's quality within the allocated time budget. This enables our approach to strike a balance between guaranteed fast plan generation and the pursuit of optimization over time. We support our approach by elucidating its analytical properties, showing the convergence of the anytime component towards optimal solutions. Additionally, we provide empirical validation through simulation and real-world demonstrations on a 6-degree-of-freedom robot manipulator, applied to an assembly domain.
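A minimal sketch of the constant-time-then-anytime pattern described above, assuming hypothetical callables lookup_plan (the preprocessed constant-time query) and refine_once (one local improvement step); this is not the authors' implementation:

import time

def plan_with_anytime_refinement(query, lookup_plan, refine_once, time_budget_s):
    """Return the best plan found within the time budget.

    lookup_plan(query) stands in for the constant-time CTMP lookup and returns a
    plan object with a hypothetical `.cost` attribute; refine_once(plan) returns
    an improved candidate or None."""
    start = time.monotonic()
    best = lookup_plan(query)                      # fast, possibly sub-optimal initial solution
    while time.monotonic() - start < time_budget_s:
        candidate = refine_once(best)              # anytime: keep improving while time remains
        if candidate is not None and candidate.cost < best.cost:
            best = candidate
    return best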
|
|
16:30-18:00, Paper WeCT1-CC.2 | Add to My Program |
VAPOR: Legged Robot Navigation in Unstructured Outdoor Environments Using Offline Reinforcement Learning |
|
Kulathun Mudiyanselage, Kasun Weerakoon | University of Maryland, College Park |
Sathyamoorthy, Adarsh Jagan | University of Maryland |
Elnoor, Mohamed | University of Maryland |
Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Reinforcement Learning, AI-Enabled Robotics
Abstract: We present VAPOR, a novel method for autonomous legged robot navigation in unstructured, densely vegetated outdoor environments using offline Reinforcement Learning (RL). Our method trains a novel RL policy using an actor-critic network and arbitrary data collected in real outdoor vegetation. Our policy uses height and intensity-based cost maps derived from 3D LiDAR point clouds, a goal cost map, and processed proprioception data as state inputs, and learns the physical and geometric properties of the surrounding obstacles such as height, density, and solidity/stiffness. The fully-trained policy's critic network is then used to evaluate the quality of dynamically feasible velocities generated from a novel context-aware planner. Our planner adapts the robot's velocity space based on the presence of entrapment hazards such as vegetation and narrow passages in dense environments. We demonstrate our method's capabilities on a Spot robot in complex real-world outdoor scenes, including dense vegetation. We observe that VAPOR's actions improve success rates by up to 40%, decrease the average current consumption by up to 2.9%, and decrease the normalized trajectory length by up to 11.2% compared to existing end-to-end offline RL and other outdoor navigation methods.
|
|
16:30-18:00, Paper WeCT1-CC.3 | Add to My Program |
EDMP: Ensemble-Of-Costs-Guided Diffusion for Motion Planning |
|
Saha, Kallol | International Instititute of Information Technology, Hyderabad |
Mandadi, Vishal Reddy | International Institute of Information Technology, Hyderabad |
Gurram, Jayaram | International Institute of Information Technology, Hyderabad |
Srikanth, Ajit | International Institute of Information Technology, Hyderabad |
Agarwal, Aditya | IIIT Hyderabad |
Sen, Bipasha | International Institute of Information Technology |
Singh, Arun Kumar | University of Tartu |
Krishna, Madhava | IIIT Hyderabad |
Keywords: Motion and Path Planning, Deep Learning Methods, Manipulation Planning
Abstract: Classical motion planning for robotic manipulation includes a set of general algorithms that aim to minimize a scene-specific cost of executing a given plan. These methods offer remarkable adaptability, as they can be used directly off-the-shelf for any new scene without needing specific training datasets. However, without a prior understanding of what diverse valid trajectories look like and without specially designed cost functions for a given scene, the overall solutions tend to have low success rates within a certain time limit. While deep-learning-based algorithms tremendously improve success rates, they are much harder to adopt without specialized training datasets. We propose EDMP, an Ensemble-of-costs-guided Diffusion for Motion Planning that aims to combine the strengths of classical and deep-learning-based motion planning. Our diffusion-based network is trained on a set of diverse kinematically valid trajectories. Like classical planning, for any new scene at the time of inference, we compute scene-specific costs such as "collision cost" and guide the diffusion to generate valid trajectories that satisfy the scene-specific constraints. Further, instead of a single cost function that may be insufficient in capturing diversity across scenes, we use an ensemble of costs to guide the diffusion process, significantly improving the success rate compared to classical planners. EDMP performs comparably with SOTA deep-learning-based methods while retaining the generalization capabilities primarily associated with classical planners.
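A rough sketch of ensemble-of-costs guidance during a single reverse-diffusion step, assuming hypothetical callables for the learned denoiser and the scene-specific cost gradients; it illustrates the idea only, not the authors' network:

def guided_denoise_step(traj, denoiser, cost_grads, step, guidance_scale=1.0):
    """One guided reverse-diffusion update on a trajectory array.

    denoiser(traj, step) is a hypothetical learned model proposing a cleaner
    trajectory; each entry of cost_grads returns the gradient of one scene-specific
    cost (e.g., collision cost) w.r.t. the trajectory."""
    proposal = denoiser(traj, step)                # diffusion model prediction
    guidance = sum(g(traj) for g in cost_grads)    # ensemble of cost gradients
    return proposal - guidance_scale * guidance    # nudge toward lower scene-specific cost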
|
|
16:30-18:00, Paper WeCT1-CC.4 | Add to My Program |
Approximating Robot Configuration Spaces with Few Convex Sets Using Clique Covers of Visibility Graphs |
|
Werner, Peter | Massachusetts Institute of Technology |
Amice, Alexandre | MIT |
Marcucci, Tobia | Massachusetts Institute of Technology |
Rus, Daniela | MIT |
Tedrake, Russ | Massachusetts Institute of Technology |
Keywords: Computational Geometry, Motion and Path Planning, Collision Avoidance
Abstract: Many computations in robotics can be dramatically accelerated if the robot configuration space is described as a collection of simple sets. For example, recently developed motion planners rely on a convex decomposition of free space to design collision-free trajectories using fast convex optimization. In this work, we present an efficient method for approximately covering complex configuration spaces with a small number of polytopes. The approach constructs a visibility graph using sampling, and generates a clique cover of this graph to find clusters of samples that have mutual line of sight. These clusters are then inflated into large, full-dimensional, polytopes. We evaluate our method on a variety of robotic systems, and show that it consistently covers larger portions of free configuration space, with fewer polytopes, and in a fraction of the time compared to previous methods.
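A small sketch of the sample-then-cover pipeline, using the standard reduction of clique cover to coloring the complement graph; line_is_free is a hypothetical straight-line collision checker, and the greedy coloring only approximates a minimum clique cover:

import itertools
import networkx as nx

def approximate_clique_cover(samples, line_is_free):
    """Build a visibility graph over collision-free configuration samples, then
    group samples with mutual line of sight into cliques (a clique cover of G is
    a proper coloring of its complement). Each returned group could then be
    inflated into a full-dimensional polytope."""
    G = nx.Graph()
    G.add_nodes_from(range(len(samples)))
    for i, j in itertools.combinations(range(len(samples)), 2):
        if line_is_free(samples[i], samples[j]):       # mutual line of sight
            G.add_edge(i, j)
    coloring = nx.coloring.greedy_color(nx.complement(G))
    cliques = {}
    for node, color in coloring.items():
        cliques.setdefault(color, []).append(node)
    return list(cliques.values())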
|
|
16:30-18:00, Paper WeCT1-CC.5 | Add to My Program |
Asymptotically-Optimal Multi-Robot Visibility-Based Pursuit-Evasion |
|
Stiffler, Nicholas | University of Dayton |
O'Kane, Jason | Texas A&M University |
Keywords: Motion and Path Planning, Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems
Abstract: The multi-robot visibility-based pursuit-evasion problem tasks a team of robots with systematically searching an environment to detect (capture) an evader. Previous techniques for generating search strategies for the pursuit team have been shown to be either computationally intractable or to produce poor-quality solutions. This paper presents a novel asymptotically optimal algorithm for generating a joint motion strategy for the pursuers. To explore the space of possible pursuer motion strategies, the algorithm utilizes a trio of hierarchical graph data structures that each capture certain elements of the problem, such as connectivity (valid single-pursuer motion), coordination (multiple-pursuer motion), and tracking information (evaluating where an evader may be). The algorithm is inspired by well-known methods in the motion planning literature and inherits its asymptotic optimality from those techniques. In addition, we describe a method that can improve upon solutions found during the formative stages of the main algorithm, using a "fast-forward" approach that foregoes guarantees of asymptotic optimality, implementing heuristics that concentrate future samples on improving the path quality of the nominal solution. The algorithms were validated in simulation and results are provided.
|
|
16:30-18:00, Paper WeCT1-CC.6 | Add to My Program |
APP: A* Post-Processing Algorithm for Robots with Bidirectional Shortcut and Path Perturbation |
|
Li, Yong | Guangzhou Shiyuan Electronic Technology Co., Ltd |
Cheng, Hui | Sun Yat-Sen University |
Keywords: Motion and Path Planning, Task and Motion Planning, Collision Avoidance
Abstract: Paths generated by A* and other graph-search-based planners are widely used in the robotic field. Due to the restricted node-expansion directions, the resulting paths are usually not the shortest. Besides, unnecessary heading changes, or zig-zag patterns, exist even when no obstacle is nearby, which is inconsistent with the human intuition that path segments should be straight in wide-open space due to the absence of obstacles. This article puts forward a general and systematic post-processing algorithm for A* and other graph-search-based planners. The A* post-processing algorithm, called APP, is developed based on the costmap, which is widely used in commercial service robots. First, a bidirectional vertex-reduction algorithm is proposed to tackle the asymmetry of the path and the environment. During the forward and backward vertex reduction, a thorough shortcut strategy is introduced to improve the path-shortening performance and avoid unnecessary heading changes. Second, an iterative path perturbation algorithm is adopted to locally reduce the number of unnecessary heading changes and improve path smoothness. Comparative experiments are then carried out to validate the superiority of the proposed method. Quantitative performance indexes show that APP outperforms existing methods in planning time, path length, and the number of unnecessary heading changes. Finally, field navigation experiments are carried out to verify the applicability of APP.
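A minimal sketch of the bidirectional shortcut idea on a 2D path, with segment_is_free standing in for a hypothetical costmap-based collision check; the paper's thorough shortcut strategy and path perturbation step are not reproduced here:

def shortcut(path, segment_is_free):
    """Greedy forward vertex reduction: keep a vertex only when the straight
    segment from the last kept vertex would otherwise collide."""
    if len(path) < 3:
        return list(path)
    reduced, anchor = [path[0]], 0
    for i in range(2, len(path)):
        if not segment_is_free(path[anchor], path[i]):
            reduced.append(path[i - 1])
            anchor = i - 1
    reduced.append(path[-1])
    return reduced

def bidirectional_shortcut(path, segment_is_free):
    """Run the reduction forward and backward (2D waypoints assumed) and keep
    the shorter result, mitigating the asymmetry of path and environment."""
    fwd = shortcut(path, segment_is_free)
    bwd = list(reversed(shortcut(list(reversed(path)), segment_is_free)))
    def length(p):
        return sum(((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
                   for a, b in zip(p, p[1:]))
    return fwd if length(fwd) <= length(bwd) else bwd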
|
|
16:30-18:00, Paper WeCT1-CC.7 | Add to My Program |
Scaling Infeasibility Proofs Via Concurrent, Codimension-One, Locally-Updated Coxeter Triangulation |
|
Li, Sihui | Colorado School of Mines |
Dantam, Neil | Colorado School of Mines |
Keywords: Motion and Path Planning, Integrated Planning and Learning
Abstract: A complete motion planner has long been desired but is hard to achieve in high dimensions. Previous work proposed an asymptotically complete motion planner that reports a plan or an infeasibility proof given long enough time. The algorithm trains a manifold using configuration space samples as data and triangulates the manifold to ensure its existence in the obstacle region of the configuration space. In this paper, we extend the construction of infeasibility proofs to higher dimensions by adapting Coxeter triangulation's manifold tracing and cell construction procedures to concurrently triangulate the configuration-space codimension-one manifold, and we apply a local elastic update to fix the triangulation when part of it lies in free space. We perform experiments on 4-DOF, 5-DOF, and 6-DOF serial manipulators. Infeasibility proofs in 4D are two orders of magnitude faster than previous results. Infeasibility proofs in 5D complete within minutes.
|
|
16:30-18:00, Paper WeCT1-CC.8 | Add to My Program |
Skeleton Disk-Graph Roadmap: A Sparse Deterministic Roadmap for Safe 2D Navigation and Exploration |
|
Noël, Thibault | INRIA Rennes |
Lehuger, Antoine Lehuger | Groupe Créative |
Marchand, Eric | Univ Rennes, Inria, CNRS, IRISA |
Chaumette, Francois | Inria Center at University of Rennes |
Keywords: Motion and Path Planning, Autonomous Agents, Reactive and Sensor-Based Planning
Abstract: In this paper, we describe a novel roadmap construction method in unknown environments, which relies on the extraction of the Hamilton-Jacobi skeleton of the free space. This skeleton is used to construct a graph of free-space bubbles, effectively compressing the skeleton information in a sparse data structure but retaining its topology. The bubbles also enforce safety directly in the roadmap structure. We first demonstrate the relevance of this approach for standard path-planning tasks. We also propose a frontiers-based exploration strategy able to autonomously and safely build a complete 2D map of the environment.
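A rough sketch, under simplifying assumptions, of building a graph of free-space disks from a binary occupancy grid via the distance transform (the paper instead extracts the Hamilton-Jacobi skeleton; this greedy placement only illustrates the bubble-graph idea):

import numpy as np
from scipy.ndimage import distance_transform_edt

def disk_graph_from_grid(occupancy, min_radius=2.0):
    """Place free-space disks greedily on a grid where nonzero cells are obstacles,
    and connect disks that overlap. Returns (disks, edges) with disks as
    ((row, col), radius) pairs."""
    dist = distance_transform_edt(occupancy == 0)      # distance of free cells to nearest obstacle
    free = dist.copy()
    disks = []
    while True:
        idx = np.unravel_index(np.argmax(free), free.shape)
        r = free[idx]
        if r < min_radius:
            break
        disks.append((idx, r))
        # suppress cells covered by this disk so the next center lands elsewhere
        yy, xx = np.ogrid[:free.shape[0], :free.shape[1]]
        free[(yy - idx[0]) ** 2 + (xx - idx[1]) ** 2 <= r ** 2] = 0.0
    edges = [(i, j)
             for i in range(len(disks)) for j in range(i + 1, len(disks))
             if np.hypot(disks[i][0][0] - disks[j][0][0],
                         disks[i][0][1] - disks[j][0][1]) < disks[i][1] + disks[j][1]]
    return disks, edges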
|
|
16:30-18:00, Paper WeCT1-CC.9 | Add to My Program |
SMUG Planner: A Safe Multi-Goal Planner for Mobile Robots in Challenging Environments |
|
Chen, Changan | ETH Zurich |
Frey, Jonas | ETH Zurich |
Arm, Philip | ETH Zurich |
Hutter, Marco | ETH Zurich |
Keywords: Motion and Path Planning, Constrained Motion Planning
Abstract: Robotic exploration or monitoring missions require mobile robots to autonomously and safely navigate between multiple target locations in potentially challenging environments. Currently, this type of multi-goal mission often relies on humans designing a set of actions for the robot to follow in the form of a path or waypoints. In this work, we consider the multi-goal problem of visiting a set of pre-defined targets, each of which could be visited from multiple potential locations. To increase autonomy in these missions, we propose a safe multi-goal (SMUG) planner that generates an optimal motion path to visit those targets. To increase safety and efficiency, we propose a hierarchical state validity checking scheme, which leverages robot-specific traversability learned in simulation. We use LazyPRM* with an informed sampler to accelerate collision-free path generation. Our iterative dynamic programming algorithm enables the planner to generate a path visiting more than ten targets within seconds. Moreover, the proposed hierarchical state validity checking scheme reduces the planning time by 30% compared to pure volumetric collision checking and increases safety by avoiding high-risk regions. We deploy the SMUG planner on the quadruped robot ANYmal and show its capability to guide the robot in multi-goal missions fully autonomously on rough terrain.
|
|
WeCT2-CC Oral Session, CC-311 |
Add to My Program |
Robust/Adaptive Control |
|
|
Chair: Righetti, Ludovic | New York University |
Co-Chair: Whitcomb, Louis | The Johns Hopkins University |
|
16:30-18:00, Paper WeCT2-CC.1 | Add to My Program |
Non-Singular Fast Terminal Adaptive Visual Tracking Control with Reduced Tuning Parameters for an Aerial Vehicle under Perturbations |
|
Olivas-Martínez, Gustavo | Instituto Tecnológico De Estudios Superiores De Monterrey |
Miranda-Moya, Armando | Tecnologico De Monterrey |
Katt, Carlos | Tecnológico De Monterrey |
Castaneda, Herman | Tecnologico De Monterrey |
Keywords: Robust/Adaptive Control, Aerial Systems: Mechanics and Control, Control Architectures and Programming
Abstract: This paper presents a robust image-based visual servoing design for a quad-rotor unmanned aerial vehicle performing a visual target-tracking operation in the presence of turbulent wind. Image information is extracted and processed to control the positioning and heading of the aerial vehicle. A novel adaptive non-singular fast terminal sliding mode strategy is introduced to manage the visual servoing error. Unlike other sliding mode methods, the proposed approach diminishes the complexity of the system due to the reduction of its control parameters while providing practical finite-time convergence, robustness against bounded external disturbances and model uncertainties, non-overestimation of the control gains, and chattering attenuation. Furthermore, the stability of the system in closed loop is guaranteed through Lyapunov theory. Finally, simulation results demonstrate the capabilities and performance of such a controller in a high-fidelity scenario using the Robot Operating System and Gazebo frameworks.
|
|
16:30-18:00, Paper WeCT2-CC.2 | Add to My Program |
Quadrotor Neural Network Adaptive Control: Design and Experimental Validation |
|
Yu, Gan | Shanghai Jiao Tong University |
Reis, Joel | University of Macau |
Silvestre, Carlos | University of Macau |
Keywords: Robust/Adaptive Control, Aerial Systems: Mechanics and Control, Motion Control
Abstract: This letter presents the design and experimental study of an adaptive nonlinear controller for Unmanned Aerial Vehicles (UAVs) in the presence of unknown time-varying disturbances and model parametric uncertainty. We employ an adaptive Neural Network (NN), used to approximate the partially unknown system, in tandem with a simple controller designed for trajectory tracking of a point located along the UAV's vertical body axis instead of the center of mass. This strategy allows us to: (i) avoid the two-subsystem control paradigm generally adopted by conventional UAV controllers; (ii) define all control inputs at once; and (iii) lump all unknown dynamics from both the translational and rotational levels into a single vector term. The weights of the NN are determined online by an adaptive law based on the Lyapunov synthesis method. The tracking and adaptation errors are shown to be uniformly ultimately bounded. Simulation and experimental results, including comparison data, are provided to validate and assess the proposed control solution.
|
|
16:30-18:00, Paper WeCT2-CC.3 | Add to My Program |
Parameter Identifying Disturbance Rejection Control with Asymptotic Error Convergence |
|
Patelski, Radosław | Poznan University of Technology |
Pazderski, Dariusz | Poznan University of Technology |
Keywords: Robust/Adaptive Control, Calibration and Identification, Formal Methods in Robotics and Automation
Abstract: In this paper, a new kind of adaptive controller for the problem of output feedback tracking is proposed on the basis of the Active Disturbance Rejection Control (ADRC) paradigm. The controller is synthesized for the systems linear in parameters by combining the classic ADRC algorithm with a recent Parameter Identifying Extended State Observer (PIESO) which employs a gradient adaptation law to actively identify the parameters of the plant. By means of the Lyapunov analysis, the asymptotic convergence of tracking, estimation, and identification errors is proved in the nominal case and the stability conditions of the closed-loop system are formulated.
|
|
16:30-18:00, Paper WeCT2-CC.4 | Add to My Program |
Smooth Computation without Input Delay: Robust Tube-Based Model Predictive Control for Robot Manipulator Planning |
|
Sima, Qie | Tsinghua University |
Luo, Yu | Tsinghua University |
Ji, Tianying | Tsinghua University |
Sun, Fuchun | Tsinghua University |
Liu, Huaping | Tsinghua University |
Zhang, Jianwei | University of Hamburg |
Keywords: Robust/Adaptive Control, Integrated Planning and Control, Motion Control
Abstract: Model Predictive Control (MPC) has exhibited remarkable capabilities in optimizing objectives and meeting constraints. However, the substantial computational burden associated with solving the Optimal Control Problem (OCP) at each triggering instant introduces significant delays between state sampling and control application. These delays limit the practicality of MPC in resource-constrained systems engaged in complex tasks. Our key intuition for addressing this issue is that, by predicting the successor state, the controller can solve the OCP one time step ahead, thus avoiding the delay of the next action. To this end, we compute deviations between real and nominal system states, predicting forthcoming real states as initial conditions for the imminent OCP solution. Anticipatory computation stores the optimal control based on the current nominal states, thus mitigating the delay effects. Additionally, we establish an upper bound for the linearization error, effectively linearizing the nonlinear system, reducing OCP complexity, and enhancing response speed. We provide empirical validation through two numerical simulations and corresponding real-world robot tasks, demonstrating significant performance improvements and augmented response speed (up to 90%) resulting from the seamless integration of our proposed approach compared to conventional time-triggered MPC strategies.
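A minimal sketch of the delay-compensation idea, with predict_next (nominal one-step prediction) and solve_ocp as hypothetical interfaces rather than the authors' implementation:

def delay_compensated_mpc_step(x_meas, u_applied, predict_next, solve_ocp):
    """Instead of solving the OCP for the measured state, predict the successor
    state under the control currently being applied and solve one step ahead,
    so the optimal input is ready at the next triggering instant."""
    x_pred = predict_next(x_meas, u_applied)   # nominal one-step-ahead prediction
    u_next = solve_ocp(x_pred)                 # OCP solved during the current interval
    return u_next                              # applied at the next triggering instant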
|
|
16:30-18:00, Paper WeCT2-CC.5 | Add to My Program |
Nullspace Adaptive Model-Based Trajectory-Tracking Control for a 6-DOF Underwater Vehicle with Unknown Plant and Actuator Parameters: Theory and Preliminary Simulation Evaluation |
|
Mao, Annie | Johns Hopkins University |
Moore, Joseph | Johns Hopkins University Applied Physics Lab |
Whitcomb, Louis | The Johns Hopkins University |
Keywords: Robust/Adaptive Control, Model Learning for Control, Marine Robotics
Abstract: We report a novel model-based nullspace adaptive trajectory-tracking control (NS-ATTC) algorithm for fully-actuated 6-degree-of-freedom (DOF) underwater vehicles which estimates unknown plant and actuator model parameters simultaneously. We provide a stability and convergence analysis with proof of asymptotically stable tracking error convergence, as well as a preliminary simulation study demonstrating 6-DOF trajectory tracking. The NS-ATTC algorithm does not require acceleration instrumentation and provides a stable online parameter estimate, enabling robust model-based autonomy.
|
|
16:30-18:00, Paper WeCT2-CC.6 | Add to My Program |
Adaptive Planning and Control with Time-Varying Tire Models for Autonomous Racing Using Extreme Learning Machine |
|
Kalaria, Dvij | Carnegie Mellon University |
Lin, Qin | Cleveland State University |
Dolan, John M. | Carnegie Mellon University |
Keywords: Robust/Adaptive Control, Motion and Path Planning, Intelligent Transportation Systems
Abstract: Autonomous racing is a challenging problem, as the vehicle needs to operate at the friction or handling limits in order to achieve minimum lap times. Autonomous race cars require highly accurate perception, state estimation, planning, and control. Adding to this complexity is the need to accurately identify vehicle model parameters governing lateral tire slip effects, which can evolve over time due to factors such as tire wear and tear. Current approaches to this problem typically either propose offline model identification methods or rely on initial parameters within a narrow range (typically within 15-20% of the actual values). However, these approaches fall short in accounting for significant changes in tire models that can occur during actual races, particularly when pushing the vehicle to its handling limits. In this paper, we present a unified framework that not only learns the tire model in real time from collected data but also adapts the model to environmental changes, even when the model parameters exhibit substantial deviations. We validate our approach through testing in simulators, encompassing a 1:43 scale race car and a full-size car, and also through experiments with a physical F1/10 autonomous race car.
|
|
16:30-18:00, Paper WeCT2-CC.7 | Add to My Program |
Risk-Sensitive Extended Kalman Filter |
|
Jordana, Armand | New York University |
Meduri, Avadesh | New York University |
Arlaud, Etienne | INRIA |
Carpentier, Justin | INRIA |
Righetti, Ludovic | New York University |
Keywords: Robust/Adaptive Control, Optimization and Optimal Control, Legged Robots
Abstract: Designing robust algorithms in the face of estimation uncertainty is a challenging task. Indeed, controllers seldom consider estimation uncertainty and only rely on the most likely estimated state. Consequently, sudden changes in the environment or the robot's dynamics can lead to catastrophic behaviors. Leveraging recent results in risk-sensitive optimal control, this paper presents a risk-sensitive Extended Kalman Filter that can adapt its estimation to the control objective, hence allowing safe output-feedback Model Predictive Control (MPC). By taking a pessimistic estimate of the value function resulting from the MPC controller, the filter provides increased robustness to the controller in phases of uncertainty as compared to a standard Extended Kalman Filter (EKF). The filter has the same computational complexity as an EKF and can be used for real-time control. The paper evaluates the risk-sensitive behavior of the proposed filter when used in a nonlinear MPC loop on a planar drone and industrial manipulator in simulation, as well as on an external force estimation task on a real quadruped robot. These experiments demonstrate the ability of the approach to significantly improve performance in face of uncertainties.
|
|
16:30-18:00, Paper WeCT2-CC.8 | Add to My Program |
Global Terminal Sliding Mode Control of Tethered Satellites Formation with Chattering Reduction Via PID Laws |
|
Su, Bowen | Northwestern Polytechnical University |
Zhang, Fan | Northwestern Polytechnical Univeristy |
Huang, Panfeng | Northwestern Polytechnical University |
Keywords: Robust/Adaptive Control, Space Robotics and Automation
Abstract: This paper investigates a novel global terminal sliding mode control (GTSMC) scheme for a tethered satellite system (TSS) subject to external disturbances, with PI/PD compensation added to restrain chattering on the sliding surface. By taking advantage of the finite-time convergence of the traditional terminal sliding surface, a sliding surface with global and terminal sliding motion is proposed, and the convergence time under GTSMC is qualitatively evaluated from the sliding surface. The integral/derivative action of the low-pass-filtered switching control is then appended to GTSMC; by virtue of the accuracy of the integral term and the damping of the derivative term, respectively, persistent switching on the sliding surface is eliminated, such that the chattering of the controlled system on the surface is consequently restrained. Finally, simulations of the proposed control on the TSS are shown to validate the theoretical analyses.
|
|
16:30-18:00, Paper WeCT2-CC.9 | Add to My Program |
Robust Feedback Quadratic Programming for Kinematic-Controlled Robots |
|
Djeha, Mohamed | Université De Montpellier |
Gergondet, Pierre | CNRS |
Kheddar, Abderrahmane | CNRS-AIST |
Keywords: Robust/Adaptive Control of Robotic Systems, Motion Control, Robot Safety, Optimization and Optimal Control
Abstract: Task-space quadratic programming (QP) is an elegant approach for controlling robots subject to constraints. Yet, in the case of kinematic-controlled (i.e., high-gain position or velocity) robots, closed-loop QP control schemes can be prone to instability depending on how the gains related to the tasks or to the constraints are chosen. In this paper, we address such instability shortcomings. First, we highlight the non-robustness of the closed-loop system against non-modeled dynamics, such as those related to joint dynamics, flexibilities, external perturbations, etc. Then, we propose a robust QP control formulation based on high-level integral feedback terms in the task-space, including the constraints. The proposed method is formally proved to ensure closed-loop robust stability and is intended to be applied to any kinematic-controlled robot under practical assumptions. We assess our approach through experiments on a fixed-base robot performing stable fast motions, and a floating-base humanoid robot robustly reacting to perturbations to keep its balance.
|
|
WeCT3-CC Oral Session, CC-313 |
Add to My Program |
Dynamics |
|
|
Chair: Shen, Yantao | University of Nevada, Reno |
Co-Chair: Kovecses, Jozsef | McGill University |
|
16:30-18:00, Paper WeCT3-CC.1 | Add to My Program |
Model Predictive Control for an Autonomous Underwater Robot with Fully Vectored Propulsion |
|
Gao, Tianzhu | Dalian Maritime University |
Luo, Yudong | Dalian Maritime University |
Lv, Chao | Dalian Maritime University |
Luo, Weirong | Dalian Maritime University |
Fu, Xianping | Dalian Maritime University |
Zhao, Na | Dalian Maritime University |
Luo, Xi | Yichang Testing Tech. Research Institution |
Shen, Yantao | University of Nevada, Reno |
Keywords: Discrete Event Dynamic Automation Systems
Abstract: Due to the low motion efficiency and maneuverability of underwater robots with six degrees of freedom, it is challenging for them to respond quickly to attitude requirements during autonomous underwater maneuvering. This paper presents a novel autonomous underwater robot with fully vectored propulsion, combined with a model predictive control method, to achieve more agile and efficient movements autonomously. In detail, we first design an eight-thruster, vector-distributed layout for fully vectored propulsion of the robot and construct the software architecture based on the Robot Operating System (ROS). Then, we establish the hydrodynamic model for the robot by adopting the Fossen approach, thus constructing a 13-dimensional state-space equation, which is discretized using the explicit fourth-order Runge-Kutta method. To achieve autonomous maneuvering, model predictive control is employed along with the physical constraints of the custom-built robot to enable real-time prediction and optimization of the robot's state for control purposes. Finally, numerical simulations and point-to-point motion experiments are conducted to test the robot's performance. Experimental results reveal that the average position errors are 0.0027 m, 0.0031 m, and 0.0368 m along the x-, y-, and z-axes, respectively, and the average attitude errors are 0.8502, 2.1941, and 0.2408 degrees for the three attitude angles, which verifies the performance of employing MPC to control an autonomous underwater robot with fully vectored propulsion.
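For reference, a generic explicit fourth-order Runge-Kutta discretization step of a continuous-time state-space model x_dot = f(x, u), as used above to discretize the hydrodynamic model (f here is only a placeholder for the robot's dynamics, not the paper's model):

def rk4_discretize(f, x, u, dt):
    """One explicit RK4 step: advance state x by dt under constant input u,
    given continuous-time dynamics f(x, u) returning x_dot (NumPy arrays)."""
    k1 = f(x, u)
    k2 = f(x + 0.5 * dt * k1, u)
    k3 = f(x + 0.5 * dt * k2, u)
    k4 = f(x + dt * k3, u)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)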
|
|
16:30-18:00, Paper WeCT3-CC.2 | Add to My Program |
Attitude Control for Morphing Quadrotor through Model Predictive Control with Constraints |
|
Zhao, Na | Dalian Maritime University |
Luo, Yudong | Dalian Maritime University |
Qin, Chaojun | Dalian Maritime University |
Luo, Xi | Yichang Testing Tech. Research Institution |
Chen, Rong | Dalian Maritime University |
Shen, Yantao | University of Nevada, Reno |
Keywords: Discrete Event Dynamic Automation Systems, Logistics
Abstract: Morphing quadrotors that can be potentially applied to confined spaces such as warehouses, tanks, and pipelines have flourished in recent years. Most work has focused on the mechanical feasibility of the morphing systems and high-level flight controller design, with limited discussions on low-level control. In this paper, a constrained model predictive control (MPC) is proposed and applied to solve the attitude control problem of a morphing quadrotor. Prior to controller design, a custom-built morphing quadrotor is introduced with the kinematic and dynamic models established and corresponding issues and challenges presented. In the controller, to eliminate the steady-state error, an embedded integrator is adopted by exploiting the differential variables; then, the constraints of the morphing quadrotor are incorporated into the MPC formulation to simulate actual flight conditions, and an orthonormal function is employed to approximate the control input sequences in the controller to alleviate the computational burden. In the comparative studies, several scenarios are considered to demonstrate the effectiveness of the proposed control strategy in attitude control.
|
|
16:30-18:00, Paper WeCT3-CC.3 | Add to My Program |
NNgTL: Neural Network Guided Optimal Temporal Logic Task Planning for Mobile Robots |
|
Liu, Ruijia | Shanghai Jiao Tong University |
Li, Shaoyuan | Shanghai Jiao Tong University |
Yin, Xiang | Shanghai Jiao Tong Univ |
Keywords: Discrete Event Dynamic Automation Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: In this work, we investigate task planning for mobile robots under linear temporal logic (LTL) specifications. This problem is particularly challenging when robots navigate in continuous workspaces due to the high computational complexity involved. Sampling-based methods have emerged as a promising avenue for addressing this challenge by incrementally constructing random trees, thereby sidestepping the need to explicitly explore the entire state-space. However, the performance of this sampling-based approach hinges crucially on the chosen sampling strategy, and a well-informed heuristic can notably enhance sample efficiency. In this work, we propose a novel neural-network guided (NN-guided) sampling strategy tailored for LTL planning. Specifically, we employ a multi-modal neural network capable of extracting features concurrently from both the workspace and the Büchi automaton. This neural network generates predictions that serve as guidance for random tree construction, directing the sampling process toward more optimal directions. Through numerical experiments, we compare our approach with existing methods and demonstrate its superior efficiency, requiring less than 15% of the time of the existing methods to find a feasible solution.
|
|
16:30-18:00, Paper WeCT3-CC.4 | Add to My Program |
Synthesis of Temporally-Robust Policies for Signal Temporal Logic Tasks Using Reinforcement Learning |
|
Wang, Siqi | Shanghai Jiao Tong University |
Li, Shaoyuan | Shanghai Jiao Tong University |
Yin, Li | Macau University of Science and Technology |
Yin, Xiang | Shanghai Jiao Tong Univ |
Keywords: Discrete Event Dynamic Automation Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: This paper investigates the problem of designing control policies that satisfy high-level specifications described by signal temporal logic (STL) in unknown, stochastic environments. While many existing works concentrate on optimizing the spatial robustness of a system, our work takes a step further by also considering temporal robustness as a critical metric to quantify the tolerance of time uncertainty in STL. To this end, we formulate two relevant control objectives to enhance the temporal robustness of the synthesized policies. The first objective is to maximize the probability of being temporally robust for a given threshold. The second objective is to maximize the worst-case spatial robustness value within a bounded time shift. We use reinforcement learning to solve both control synthesis problems for unknown systems. Specifically, we approximate both control objectives in a way that enables us to apply the standard Q-learning algorithm. Theoretical bounds in terms of the approximations are also derived. We present case studies to demonstrate the feasibility of our approach.
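For context, the standard tabular Q-learning loop that the approximated objectives reduce to; the environment interface and the reward encoding of temporal robustness are hypothetical placeholders, not the paper's setup:

import random
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.

    env is a hypothetical interface with reset() -> state and
    step(a) -> (next_state, reward, done); rewards would encode the approximated
    temporal-robustness objective."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = random.randrange(n_actions) if random.random() < eps else int(Q[s].argmax())
            s2, r, done = env.step(a)
            target = r + gamma * (0.0 if done else Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])   # standard Q-learning update
            s = s2
    return Q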
|
|
16:30-18:00, Paper WeCT3-CC.5 | Add to My Program |
A Deep Learning Framework for Non-Symmetrical Coulomb Friction Identification of Robotic Manipulators |
|
Lahoud, Marcel | Italian Institute of Technology |
Marchello, Gabriele | Istituto Italiano Di Tecnologia |
D'Imperio, Mariapaola | Istituto Italiano Di Tecnologia |
Mueller, Andreas | Johannes Kepler University |
Cannella, Ferdinando | Istituto Italiano Di Tecnologia |
Keywords: Dynamics, Deep Learning Methods, Calibration and Identification
Abstract: The determination of the dynamic properties of a robot is especially important for designing highly accurate and efficient control systems. Conventional methods for dynamic model identification have proven to be effective, where deep learning (DL) approaches have shown limits due to data inefficiencies. However, thanks to novel physics-informed DL architectures, such as Deep Lagrangian Networks (DeLaN) [1], it is possible to control and extract interpretable physical information of a robot. This paper introduces an augmented DeLaN architecture for linear viscous and non-symmetrical Coulomb friction identification, which also learns motor parameters such as rotor inertia. An approach is proposed for comparing this method with the conventional dynamic identification and previous DeLaN implementations. Moreover, our friction and rotor inertia identification is validated, and the performance of our model is analyzed with a real robot (UR5e).
|
|
16:30-18:00, Paper WeCT3-CC.6 | Add to My Program |
Implicit Time Integration Simulation of Robots with Rigid Bodies and Cosserat Rods Based on a Newton-Euler Recursive Algorithm |
|
Boyer, Frédéric | IMT Atlantique |
Gotelli, Andrea | École Centrale Nantes |
Tempel, Philipp T. | Ecole Centrale De Nantes |
Lebastard, Vincent | IMT Atlantique |
Renda, Federico | Khalifa University of Science and Technology |
Briot, Sébastien | LS2N |
Keywords: Dynamics, Direct/Inverse Dynamics Formulation, Newton-Euler recursive algorithm, Modeling, Control, and Learning for Soft Robots
Abstract: In this paper, we propose a new algorithm for solving the forward dynamics of multibody systems consisting of rigid bodies connected in arbitrary topologies by localised joints and/or soft links, possibly actuated or not. The simulation is based on the implicit time-integration of the Lagrangian model of these systems, where the soft links are modelled by Cosserat rods parameterised by assumed strain modes. This choice imposes a predictor-corrector structure on the approach, and requires computing both the residual vector and the Jacobian of the residual vector of the dynamics constrained by the time integrator. These additional calculations are handled here with a new Newton-Euler recursive inverse dynamics algorithm and its linearized tangent version. The approach is illustrated with numerical examples from the Cosserat rod literature and from recent robotic applications.
|
|
16:30-18:00, Paper WeCT3-CC.7 | Add to My Program |
Efficient Constrained Dynamics Algorithms Based on an Equivalent LQR Formulation Using Gauss' Principle of Least Constraint |
|
Sathya, Ajay Suresha | Inria |
Bruyninckx, Herman | KU Leuven |
Decré, Wilm | Katholieke Universiteit Leuven |
Pipeleers, Goele | KU Leuven |
Keywords: Dynamics, Direct/Inverse Dynamics Formulation, Optimization and Optimal Control, Redundant Robots
Abstract: We derive a family of efficient constrained dynamics algorithms by formulating an equivalent linear quadratic regulator (LQR) problem using Gauss' principle of least constraint and solving it using dynamic programming. Our approach builds upon the pioneering (but largely unknown) O(n+m^2d+m^3) solver by Popov and Vereshchagin (PV), where n,m and d are the number of joints, number of constraints and the kinematic tree depth respectively. We provide an expository derivation for the original PV solver and extend it to floating-base kinematic trees with constraints allowed on any link. We make new connections between the LQR's dual Hessian and the inverse operational space inertia matrix (OSIM), permitting efficient OSIM computation, which we further accelerate using matrix inversion lemma. We generalize the elimination ordering and support MuJoCo-type soft constraint models to obtain O(n+m) complexity solvers. Our numerical results indicate that significant simulation speed-up can be achieved for high dimensional robots like quadrupeds and humanoids using our algorithms as they scale better than the widely used O(nd^2+m^2d+d^2m+m^3) LTL algorithm of Featherstone.
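For intuition, a dense (non-recursive) illustration of Gauss' principle of least constraint: the constrained acceleration is the M-weighted projection of the unconstrained acceleration onto the constraint manifold, solved here via the KKT system rather than the O(n+m) recursion derived in the paper:

import numpy as np

def gauss_constrained_accel(M, qdd_free, A, b):
    """Minimize 0.5 * (qdd - qdd_free)^T M (qdd - qdd_free) subject to A @ qdd = b.

    M is the joint-space inertia matrix, qdd_free the unconstrained acceleration,
    and A, b the acceleration-level constraint; returns the constrained qdd."""
    n, m = M.shape[0], A.shape[0]
    KKT = np.block([[M, A.T],
                    [A, np.zeros((m, m))]])
    rhs = np.concatenate([M @ qdd_free, b])
    sol = np.linalg.solve(KKT, rhs)
    return sol[:n]   # constrained joint accelerations (multipliers in sol[n:])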
|
|
16:30-18:00, Paper WeCT3-CC.8 | Add to My Program |
Model-Based Co-Simulation of Flexible Mechanical Systems with Contacts Using Reduced Interface Models |
|
Dai, Xu | McGill University |
Raoofian, Ali | McGill University |
Kovecses, Jozsef | McGill University |
Teichmann, Marek | CMLabs Simulations Inc |
Keywords: Dynamics, Flexible Robotics
Abstract: Co-simulation is a useful approach in the modelling of robotic systems composed of multiple parts. In co-simulation, the subsystems only exchange information at communication points. The time delay of information exchange may cause error and instability. Thus, an appropriate way to determine the interface variables between the communication points is essential for efficient and stable performance, especially for real-time applications. Reduced interface models (RIMs) can be used to represent the dynamic behaviour of the subsystems at the interface in co-simulation. Such a model-based co-simulation scheme was limited to systems consisting of rigid bodies in previous studies. In this work, we introduce the formulation of RIMs for flexible multibody systems and based on that propose a general co-simulation scheme for systems consisting of both rigid body components and elements with structural flexibility. A robotic model is employed as an example to demonstrate the co-simulation scheme, where a non-smooth subsystem with contact interactions is present. The advantages of constructing RIM using flexible mechanical system models over rigid body models are also addressed by comparing the effective mass properties and the simulation results.
|
|
16:30-18:00, Paper WeCT3-CC.9 | Add to My Program |
RIDER: Reinforcement-Based Inferred Dynamics Via Emulating Rehearsals for Robot Navigation in Unstructured Environments |
|
Siva, Sriram | Army Research Laboratory |
Wigness, Maggie | U.S. Army Research Laboratory |
Keywords: Field Robots, Representation Learning, Integrated Planning and Learning
Abstract: Autonomous navigation in unstructured environments is a challenging task due to the complex and dynamic nature of robot-terrain interactions. Existing approaches often struggle to generalize amidst the complexities of real-world settings. They tend to rely on hand-engineered, rule-based robot models or static weightings assigned to obstacles, semantics, and other perceptual cues to estimate traversability. To address these challenges, we propose a novel approach called Reinforcement-Based Inferred Dynamics via Emulating Rehearsals (RIDER), which learns the dynamics of robot-terrain interactions within a compact latent space, capturing the robot's traversability. Operating within a reinforcement learning paradigm, RIDER learns to infer its own dynamics by predicting how future robot observations and states evolve within this latent space in response to navigational behaviors. Furthermore, our approach leverages emulated rehearsals, where the robot learns within the latent space to predict its rewards and generate navigational behaviors, even when real observations have not been updated. Accordingly, RIDER equips robots with the ability to generate navigational behaviors by predicting environmental changes and to plan beyond the speed at which sensor observations are available. Experimental results and comparisons with baseline methods establish that our proposed method outperforms other approaches in cluttered and unstructured environments and demonstrates an enhanced capacity for autonomous navigation in real-world settings.
|
|
WeCT4-CC Oral Session, CC-315 |
Add to My Program |
Distributed Robot Systems |
|
|
Chair: Guo, Meng | Peking University |
|
16:30-18:00, Paper WeCT4-CC.1 | Add to My Program |
Dynamic Multi-Agent Deep Deterministic Policy Gradient for Autonomous Navigation of Reconfigurable Unmanned Aerial Vehicle |
|
Lu, Xin | University of Electronic Science and Technology |
Wu, Zegui | University of Electronic Science and Technology of China |
Zhao, Ruqing | University of Electronic Science and Technology of China |
Li, Fusheng | University of Electronic Science and Technology of China |
Keywords: Distributed Robot Systems, Aerial Systems: Mechanics and Control, Aerial Systems: Applications
Abstract: The reconfigurable unmanned aerial vehicle (R-UAV) has the ability to create and break physical links to self-assemble and self-disassemble in midair. In response to changes in the task or environment, this system can dynamically disassemble its rectangular structure into multiple individual UAV modules or integrate these modules into a whole. For practical applications, the R-UAV requires collaborative decision-making for autonomous navigation in complex environments. However, the navigation problem of the R-UAV has not been investigated. In this paper, we propose a dynamic multi-agent deep deterministic policy gradient (DMADDPG) algorithm for autonomous navigation of the R-UAV. This algorithm introduces a leader-agent assignment mechanism and a collaborative experience reward. The former deals with the action conflicts caused by the disappearance of UAV agents when multiple UAV modules are assembled. The latter provides guidance for each UAV agent to plan a collision-free and efficient trajectory. We validate our strategy in both simulation and practical scenarios, and experimental results demonstrate that the proposed scheme generates reasonable and efficient paths for the R-UAV in the presence of obstacles. The experiment video is available at https://youtu.be/mVm0qCvB7HY.
|
|
16:30-18:00, Paper WeCT4-CC.2 | Add to My Program |
FogROS2-LS: A Location-Independent Fog Robotics Framework for Latency Sensitive ROS2 Applications |
|
Chen, Kaiyuan | University of California, Berkeley |
Wang, Michael | Bosch |
Gualtieri, Marcus | Bosch Research |
Tian, Nan | University of California, Berkeley |
Juette, Christian | Bosch Research |
Ren, Liu | Robert Bosch North America Research Technology Center |
Kubiatowicz, John | UC Berkeley |
Goldberg, Ken | UC Berkeley |
Keywords: Distributed Robot Systems, Multi-Robot Systems, Networked Robots
Abstract: Limiting latency is essential for critical robot applications such as collision avoidance or target tracking and is challenging for Cloud or Fog robotics applications due to network congestion and failures. We introduce FogROS2-Latency-Sensitive (LS), a Fog Robotics framework that offers secure, location-independent connections between robots and latency-sensitive robotic services. FogROS2-LS offloads conventional on-board state estimators and feedback controllers to Cloud and Edge compute hardware without modifying the existing ROS2 application. In the presence of multiple identical services, it dynamically identifies and transitions to the optimal service deployment that fulfills the application's latency requirement, thereby empowering robots with restricted on-board computing capacity to safely and efficiently navigate dynamic, human-dense environments. We evaluate FogROS2-LS with two latency-sensitive case studies: (1) Collision Avoidance: a robot arm guided by visual feedback from consistent distance estimation and collision checking on Cloud and Edge, where FogROS2-LS reduces collision failures by up to 8.5 times by selecting the best available machine; and (2) Target Tracking: FogROS2-LS also enables robust and continuous target following and can recover from network failures.
|
|
16:30-18:00, Paper WeCT4-CC.3 | Add to My Program |
Leveraging Tethers for Distributed Formation Control of Simple Robots |
|
Cutler, Sadie | Cornell University |
Petersen, Kirstin Hagelskjaer | Cornell University |
Keywords: Distributed Robot Systems, Multi-Robot Systems, Sensor-based Control
Abstract: Tethers have great potential in multi-robot systems, from enabling retrieval of deployed robots and facilitating power transfer to their use by the robots as a net or partition. In this paper, we show in simulation that tethers can also be used to perform distributed formation control on very simple robots. Specifically, our simulated agents are connected in series by un-actuated, flexible, fixed-length tethers and use tether angle and strain, in conjunction with the physical constraints of the tethers, to adjust their position with respect to their neighbors. This presents a significant simplification over traditional formation control, which, at a minimum, requires exteroceptive sensors to perceive bearing and/or distance to nearby agents. We present and evaluate an algorithm on a large set of transitions between formations with 5 agents and an example transition with 35 agents. The convergence time grows with the number of agents; however, the memory and computation time per agent remain constant. Future work will investigate the ability to use tethers and strain for reactive behaviors and more diverse tasks.
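A minimal per-agent control step in this spirit (our sketch, assuming a single tether to the preceding agent; the gain and nominal length are placeholders, not the paper's algorithm):

import math

def tether_step(x, y, angle_rad, strain, nominal_len=1.0, gain=0.5):
    """One position update for an agent using only local tether measurements.
    angle_rad: bearing of the tether in the agent's frame.
    strain:    (current length - nominal length) / nominal length.
    The agent moves along the tether to relieve (or take up) strain, which
    drives the chain toward the desired spacing."""
    step = gain * strain * nominal_len
    return x + step * math.cos(angle_rad), y + step * math.sin(angle_rad)

# Example: an agent whose tether points forward (angle 0) and is 20% over-stretched
# moves 0.1 m toward its neighbor.
print(tether_step(0.0, 0.0, angle_rad=0.0, strain=0.2))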
|
|
16:30-18:00, Paper WeCT4-CC.4 | Add to My Program |
Distributed Differential Dynamic Programming Architectures for Large-Scale Multi-Agent Control |
|
Saravanos, Augustinos | Georgia Institute of Technology |
Aoyama, Yuichiro | Georgia Institute of Technology |
Zhu, Hongchang | Georgia Institute of Technology |
Theodorou, Evangelos | Georgia Institute of Technology |
Keywords: Distributed Robot Systems, Optimization and Optimal Control, Multi-Robot Systems, Swarms
Abstract: This paper proposes two decentralized multi-agent optimal control methods that combine the computational efficiency and scalability of Differential Dynamic Programming (DDP) and the distributed nature of the Alternating Direction Method of Multipliers (ADMM). The first one, Nested Distributed DDP (ND-DDP), is a three-level architecture which employs ADMM for consensus, an augmented Lagrangian layer for local constraints and DDP as the local optimizer. The second one, Merged Distributed DDP (MD-DDP), is a two-level architecture that addresses both consensus and local constraints with ADMM, further reducing computational complexity. Both frameworks are fully decentralized since all computations are parallelizable among the agents and only local communication is necessary. Simulation results that scale up to thousands of cars and hundreds of drones demonstrate the effectiveness of the algorithms. Superior scalability to large-scale systems against other DDP and sequential quadratic programming methods is also illustrated. Finally, hardware experiments on a multi-robot platform verify the applicability of the methods. A video with all results is provided in the supplementary material.
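The consensus layer shared by both architectures follows the standard ADMM pattern; the toy sketch below (our illustration, with a scalar quadratic standing in for each agent's DDP subproblem) shows the local-solve / consensus / dual-update loop:

import numpy as np

def local_solve(a_i, z, u_i, rho):
    # argmin_x 0.5*(x - a_i)**2 + (rho/2)*(x - z + u_i)**2, in closed form.
    return (a_i + rho * (z - u_i)) / (1.0 + rho)

def consensus_admm(targets, rho=1.0, iters=50):
    n = len(targets)
    x = np.zeros(n)   # local copies, one per agent (stand-in for local trajectories)
    u = np.zeros(n)   # scaled dual variables
    z = 0.0           # consensus variable
    for _ in range(iters):
        x = np.array([local_solve(a, z, ui, rho) for a, ui in zip(targets, u)])
        z = float(np.mean(x + u))   # consensus update (averaging)
        u = u + x - z               # dual update
    return x, z

print(consensus_admm(np.array([1.0, 2.0, 4.0])))  # z converges to the mean of the targets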
|
|
16:30-18:00, Paper WeCT4-CC.5 | Add to My Program |
Accelerated K-Serial Stable Coalition for Dynamic Capture and Resource Defense |
|
Chen, Junfeng | Peking University |
Tang, Zili | Peking University |
Guo, Meng | Peking University |
Keywords: Distributed Robot Systems, Planning, Scheduling and Coordination, Integrated Planning and Learning
Abstract: Coalition formation is an important means for multi-robot systems to collaborate on common tasks. An adaptive coalition strategy is essential for online performance in dynamic and unknown environments. In this work, the problem of territory defense by large-scale heterogeneous robotic teams is considered. The tasks include exploration, capture of dynamic targets, and perimeter defense over valuable resources. Since each robot can choose among many tasks, it remains a challenging problem to coordinate these robots jointly such that the overall utility is maximized. This work proposes a generic coalition strategy called the K-serial stable coalition algorithm. Different from centralized approaches, it is distributed and complete, meaning that only local communication is required and a K-serial stable solution is ensured. Furthermore, to accelerate adaptation to dynamic targets and resource distributions that are only perceived online, a heterogeneous graph attention network-based heuristic is learned to select more appropriate parameters and promising initial solutions during local optimization. Compared with manual heuristics or end-to-end predictors, it is shown to both improve online adaptability and retain the quality guarantee. The proposed methods are validated via large-scale simulations with 170 robots and hardware experiments with 13 robots, against several strong baselines such as GreedyNE and FastMaxSum.
|
|
16:30-18:00, Paper WeCT4-CC.6 | Add to My Program |
Sensor-Based Multi-Robot Coverage Control with Spatial Separation in Unstructured Environments |
|
Wang, Xinyi | The Chinese University of Hong Kong |
Xu, Jiwen | The Chinese University of Hong Kong |
Gao, Chuanxiang | The Chinese University of Hong Kong |
Chen, Yizhou | Chinese University of Hong Kong |
Zhang, Jihan | Chinese University of Hong Kong |
Wang, Chenggang | Shanghai Jiao Tong University |
Ding, Yulong | Tongji University |
Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Distributed Robot Systems, Reactive and Sensor-Based Planning, Collision Avoidance
Abstract: Multi-robot systems have increasingly become instrumental in tackling coverage problems. However, the challenge of optimizing task efficiency without compromising task success still persists, particularly in expansive, unstructured scenarios with dense obstacles. This paper presents an innovative, decentralized Voronoi-based coverage control approach to reactively navigate these complexities while guaranteeing safety. This approach leverages the active sensing capabilities of multi-robot systems to supplement GIS (Geographic Information System) data, offering a more comprehensive and real-time understanding of environments such as post-disaster scenes. Based on point cloud data, which is inherently non-convex and unstructured, this method efficiently generates collision-free Voronoi regions using only local sensing information through spatial decomposition and spherical mirroring techniques. Then, a deadlock-aware guided map, integrated with a gradient-optimized, centroidal Voronoi-based coverage control policy, is constructed to improve efficiency by avoiding exhaustive searches and local sensing pitfalls. The effectiveness of our algorithm has been validated through extensive numerical simulations in high-fidelity environments, demonstrating significant improvements in task success rate, coverage ratio, and task execution time compared with other methods.
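For reference, a generic (obstacle-free) centroidal-Voronoi coverage step looks like the sketch below; the paper's contribution lies in making this reactive and collision-free from raw point clouds, which is not shown here:

import numpy as np

def coverage_step(robots, samples, gain=0.3):
    """robots: (N, 2) positions; samples: (M, 2) points discretizing the region.
    Each robot moves a fraction of the way toward the centroid of the samples
    for which it is the nearest robot (its Voronoi cell)."""
    d = np.linalg.norm(samples[:, None, :] - robots[None, :, :], axis=2)
    owner = np.argmin(d, axis=1)
    new_robots = robots.copy()
    for i in range(len(robots)):
        cell = samples[owner == i]
        if len(cell) > 0:
            new_robots[i] += gain * (cell.mean(axis=0) - robots[i])
    return new_robots

rng = np.random.default_rng(0)
robots = rng.uniform(0.0, 1.0, size=(4, 2))
samples = rng.uniform(0.0, 1.0, size=(2000, 2))
for _ in range(20):
    robots = coverage_step(robots, samples)
print(robots)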
|
|
16:30-18:00, Paper WeCT4-CC.7 | Add to My Program |
Localized and Incremental Probabilistic Inference for Large-Scale Networked Dynamical Systems |
|
Matsuka, Kai | California Institute of Technology |
Chung, Soon-Jo | Caltech |
Keywords: Distributed Robot Systems, Sensor Networks, Swarms, SLAM
Abstract: We present new algorithms for Distributed Factor Graph Optimization (DFGO) problems that arise in the probabilistic inference of large-scale networked robotic systems. First, for the batch DFGO problem, we derive the Local Consensus ADMM (LC-ADMM) algorithm. LC-ADMM is fully localized; therefore, the computational effort, communication bandwidth, and memory for each agent scale as O(1) with respect to the network size. We establish two new theoretical results for LC-ADMM: (1) exponential convergence when the objective is strongly convex and has a Lipschitz continuous subdifferential, and (2) o(1/k) convergence when the objective is convex and has a unique solution. Second, we also develop the Incremental DFGO algorithm (iDFGO) for real-time problems by combining the ideas from LC-ADMM and the Bayes tree. The iDFGO algorithm incrementally recomputes estimates when new factors are added to the graph and is scalable with respect to both network size and time. We validate LC-ADMM and iDFGO in simulations with examples from multi-agent Simultaneous Localization and Mapping (SLAM) and power grids.
|
|
WeCT5-CC Oral Session, CC-411 |
Add to My Program |
Sensor Fusion II |
|
|
Chair: Forbes, James Richard | McGill University |
Co-Chair: Wang, Lin | HKUST |
|
16:30-18:00, Paper WeCT5-CC.1 | Add to My Program |
JSTR: Joint Spatio-Temporal Reasoning for Event-Based Moving Object Detection |
|
Zhou, Hanyu | Huazhong University of Science and Technology |
Shi, Zhiwei | National Key Lab of Multispectral Information Intelligent Proces |
Dong, Hao | National Key Laboratory of Science and Technology on Multispectr |
Peng, Shihan | Huazhong University of Science and Technology |
Chang, Yi | National Key Laboratory of Science and Technology on Multispectr |
Yan, Luxin | Huazhong University of Science and Technology |
Keywords: Sensor Fusion, Object Detection, Segmentation and Categorization
Abstract: Event-based moving object detection is a challenging task, where the static background and moving objects are mixed together. Typically, existing methods mainly align the background events to the same spatial coordinate system via motion compensation to distinguish the moving object. However, they neglect the potential spatial tailing effect of moving-object events caused by excessive motion, which may affect the structural integrity of the extracted moving object. We discover that the moving object has a complete columnar structure in the point cloud composed of motion-compensated events along the timestamp. Motivated by this, we propose a novel joint spatio-temporal reasoning method for event-based moving object detection. Specifically, we first compensate the motion of background events using an inertial measurement unit. In the spatial reasoning stage, we project the compensated events into the same image coordinates, discretize the timestamps of events to obtain a time image that reflects motion confidence, and further segment the moving object through adaptive thresholding on the time image. In the temporal reasoning stage, we construct the events into a point cloud along the timestamp and use the RANSAC algorithm to extract the columnar shape in the cloud for peeling off the background. Finally, we fuse the results from the two reasoning stages to extract the final moving-object region. This joint spatio-temporal reasoning framework can effectively detect the moving object from motion confidence and geometric structure. Moreover, we conduct extensive experiments on various datasets to verify that the proposed method can improve the moving object detection accuracy by 13%.
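The time-image idea in the spatial reasoning stage can be sketched as follows (our simplified reading; the per-pixel statistic and thresholds are assumptions, and the temporal RANSAC stage is omitted):

import numpy as np

def time_image(xs, ys, ts, height, width):
    """xs, ys: integer pixel coordinates of motion-compensated events;
    ts: event timestamps normalized to [0, 1]. Returns the mean timestamp
    and the event count per pixel."""
    acc = np.zeros((height, width), dtype=np.float64)
    cnt = np.zeros((height, width), dtype=np.int64)
    np.add.at(acc, (ys, xs), ts)
    np.add.at(cnt, (ys, xs), 1)
    mean_t = np.divide(acc, cnt, out=np.zeros_like(acc), where=cnt > 0)
    return mean_t, cnt

def segment_moving(mean_t, cnt, t_thresh=0.5, min_events=3):
    """Pixels with enough events and late mean timestamps are kept as candidates."""
    return (mean_t > t_thresh) & (cnt >= min_events)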
|
|
16:30-18:00, Paper WeCT5-CC.2 | Add to My Program |
AYDIV: Adaptable Yielding 3D Object Detection Via Integrated Contextual Vision Transformer |
|
Dam, Tanmoy | Saab NTU Joint Lab, Nanyang Technological University, Singapore |
Dharavath, Sanjay Bhargav | Indian Institute of Technology, Kharagpur, India |
Alam, Sameer | Saab-NTU Joint Lab, Nanyang Technological University, Singapore |
Lilith, Nimrod | Saab-NTU Joint Lab, Nanyang Technological University, Singapore |
Chakraborty, Supriyo | Indian Institute of Technology, Kharagpur, India |
Feroskhan, Mir | Nanyang Technological University |
Keywords: Sensor Fusion, Object Detection, Segmentation and Categorization, Autonomous Vehicle Navigation
Abstract: Combining LiDAR and camera data has shown potential in enhancing short-distance object detection in autonomous driving systems. Yet, the fusion encounters difficulties with extended-distance detection due to the contrast between LiDAR's sparse data and the dense resolution of cameras. Moreover, discrepancies in the two data representations further complicate fusion methods. We introduce AYDIV, a novel framework integrating a tri-phase alignment process specifically designed to enhance long-distance detection even amidst data discrepancies. AYDIV consists of the Global Contextual Fusion Alignment Transformer (GCFAT), which improves the extraction of camera features and provides a deeper understanding of large-scale patterns; the Sparse Fused Feature Attention (SFFA), which fine-tunes the fusion of LiDAR and camera details; and the Volumetric Grid Attention (VGA) for a comprehensive spatial data fusion. AYDIV's performance on the Waymo Open Dataset (WOD), with an improvement of 1.24% in mAPH value (L2 difficulty), and on the Argoverse2 Dataset, with a performance improvement of 7.40% in AP value, demonstrates its efficacy in comparison to other existing fusion-based methods. Our code is publicly available at https://github.com/sanjay-810/AYDIV2.
|
|
16:30-18:00, Paper WeCT5-CC.3 | Add to My Program |
RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale |
|
Li, Han | Zhejiang University |
Ma, Yukai | Zhejiang University |
Gu, Yaqing | Zhejiang University |
Hu, Kewei | Zhejiang University |
Liu, Yong | Zhejiang University |
Zuo, Xingxing | Caltech |
Keywords: Sensor Fusion, Range Sensing, Deep Learning for Visual Perception
Abstract: We present a novel approach for metric dense depth estimation based on the fusion of a single-view image and a sparse, noisy Radar point cloud. The direct fusion of heterogeneous Radar and image data, or their encodings, tends to yield dense depth maps with significant artifacts, blurred boundaries, and suboptimal accuracy. To circumvent this issue, we learn to augment versatile and robust monocular depth prediction with the dense metric scale induced from sparse and noisy Radar data. We propose a Radar-Camera framework for highly accurate and fine-detailed dense depth estimation with four stages, including monocular depth prediction, global scale alignment of monocular depth with sparse Radar points, quasi-dense scale estimation through learning the association between Radar points and image patches, and local scale refinement of dense depth using a scale map learner. Our proposed method significantly outperforms the state-of-the-art Radar-Camera depth estimation methods by reducing the mean absolute error (MAE) of depth estimation by 25.6% and 40.2% on the challenging nuScenes dataset and our self-collected ZJU-4DRadarCam dataset, respectively. Our code and dataset will be released at https://github.com/MMOCKING/RadarCam-Depth.
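The global scale alignment stage can be illustrated with a closed-form least-squares fit (a generic sketch under simplified assumptions; the quasi-dense and local refinement stages are not shown):

import numpy as np

def global_scale(mono_depth, radar_depth, radar_uv):
    """mono_depth: (H, W) relative monocular depth; radar_depth: (K,) metric depths;
    radar_uv: (K, 2) integer pixel coordinates of the projected Radar points.
    Returns the scale s minimizing sum_k (s * d_mono[u_k, v_k] - d_radar_k)**2."""
    d_mono = mono_depth[radar_uv[:, 1], radar_uv[:, 0]]
    valid = (d_mono > 0) & (radar_depth > 0)
    return float(np.dot(d_mono[valid], radar_depth[valid])
                 / np.dot(d_mono[valid], d_mono[valid]))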
|
|
16:30-18:00, Paper WeCT5-CC.4 | Add to My Program |
HabitatDyn 2.0: Dataset for Spatial Anticipation and Dynamic Object Localization |
|
Shen, Zhengcheng | TU Berlin |
Kästner, Linh | T-Mobile, TU Berlin |
Gao, Yi | TU Berlin |
Lambrecht, Jens | Technische Universität Berlin |
Keywords: Sensor Fusion, RGB-D Perception, Data Sets for Robotic Vision
Abstract: The ability of a robot to perceive and understand its environment is crucial for its actions and behavior. Humans are adept at using semantic information for object localization and path planning, a skill that robots need to emulate for intelligent adaptation in dynamic settings. Training of the spatial anticipation ability, which can enhance spatial perception through semantic understanding, necessitates the availability of appropriate data. Although extensive research has been conducted on datasets for outdoor environments, especially in the context of autonomous driving, there is still a notable lack of datasets specifically designed for indoor environments, with a focus on dynamic object localization. This paper introduces HabitatDyn 2.0, a dataset specifically designed for enhancing object localization capabilities with semantic information from a robot's perspective. Besides RGB videos, semantic annotations, and depth information, HabitatDyn 2.0 also features top-down view labels for dynamic objects, which is required for training the spatial anticipation ability based on semantic information. Additionally, an algorithm that leverages spatial anticipation for dynamic object localization is presented, trained, and evaluated on the dataset.
|
|
16:30-18:00, Paper WeCT5-CC.5 | Add to My Program |
Attentive Multimodal Fusion for Optical and Scene Flow |
|
Zhou, Youjie | Shandong University |
Mei, Guofeng | University of Technology Sydney |
Wang, Yiming | Fondazione Bruno Kessler |
Poiesi, Fabio | Fondazione Bruno Kessler |
Wan, Yi | Shandong University |
Keywords: Sensor Fusion, RGB-D Perception, Deep Learning for Visual Perception
Abstract: This paper presents an investigation into the estimation of optical and scene flow using RGBD information in scenarios where the RGB modality is affected by noise or captured in dark environments. Existing methods typically rely solely on RGB images or fuse the modalities at later stages, which can result in lower accuracy when the RGB information is unreliable. To address this issue, we propose a novel deep neural network approach called FusionRAFT, which enables early-stage information exchange between sensor modalities (RGB and depth). Our approach incorporates self- and cross-attention layers at different network levels to fuse these modalities and construct informative features that leverage the strengths of both modalities. Through comparative experiments, we demonstrate that our approach surpasses recent methods in terms of performance on the synthetic dataset FlyingThings3D, as well as generalization on the real-world dataset KITTI. We illustrate that our approach exhibits enhanced robustness in the presence of noise and low-lighting conditions affecting the RGB images.
|
|
16:30-18:00, Paper WeCT5-CC.6 | Add to My Program |
LiDAR-Camera Calibration Using Intensity Variance Cost |
|
Ishikawa, Ryoichi | The University of Tokyo |
Zhou, Shuyi | The University of Tokyo |
Sato, Yoshihiro | Kyoto University of Advanced Science |
Oishi, Takeshi | The University of Tokyo |
Ikeuchi, Katsushi | Microsoft |
Keywords: Sensor Fusion, SLAM, Omnidirectional Vision
Abstract: We propose an extrinsic calibration method for LiDAR-camera fusion systems using variations in intensities projected from camera images to the LiDAR point cloud. As the input, the proposed method uses a sequence of LiDAR data and camera images captured while moving the system. Once the camera motion is calculated, camera images are projected onto the point cloud. The variations in the projected intensities at each point are large in the presence of errors in the estimated motion or calibration parameters. Consequently, the extrinsic parameters are optimized for cost minimization based on the intensity variance. In addition, a suitable geometry is proposed for the calibration and verified using simulations. Our experimental results showed that the proposed method accurately performed calibrations using a camera and a sparse multi-beam LiDAR or one-dimensional LiDAR.
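The intensity-variance cost can be sketched as below (our simplified illustration; the projection model, frame handling, and interpolation used by the authors may differ):

import numpy as np

def intensity_variance_cost(points_lidar, images, T_ci_c0_list, K, T_c0_lidar):
    """points_lidar: (N, 3) points in the LiDAR frame; images: list of (H, W) grayscale
    frames; T_ci_c0_list: per-frame 4x4 transforms from the first camera frame to frame i
    (e.g., from estimated camera motion); K: 3x3 intrinsics; T_c0_lidar: candidate
    extrinsic mapping LiDAR points into the first camera frame.
    Returns the mean per-point variance of the intensities sampled across frames."""
    N = points_lidar.shape[0]
    p_h = np.hstack([points_lidar, np.ones((N, 1))])   # homogeneous coordinates (N, 4)
    p_c0 = T_c0_lidar @ p_h.T                          # points in first camera frame (4, N)
    samples = []
    for img, T_ci_c0 in zip(images, T_ci_c0_list):
        p_ci = T_ci_c0 @ p_c0
        uvw = K @ p_ci[:3]
        u = (uvw[0] / uvw[2]).round().astype(int)
        v = (uvw[1] / uvw[2]).round().astype(int)
        ok = (uvw[2] > 0) & (u >= 0) & (u < img.shape[1]) & (v >= 0) & (v < img.shape[0])
        vals = np.full(N, np.nan)
        vals[ok] = img[v[ok], u[ok]]                   # intensity projected onto each point
        samples.append(vals)
    stack = np.vstack(samples)                         # frames x points
    return float(np.nanmean(np.nanvar(stack, axis=0)))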
|
|
16:30-18:00, Paper WeCT5-CC.7 | Add to My Program |
SRFNet: Monocular Depth Estimation with Fine-Grained Structure Via Spatial Reliability-Oriented Fusion of Frames and Events |
|
Pan, Tianbo | Hong Kong University of Science and Technology (Guangzhou) |
Cao, Zidong | HKUST |
Wang, Lin | HKUST |
Keywords: Sensor Fusion, Vision-Based Navigation
Abstract: Monocular depth estimation is a crucial task to measure distance relative to a camera, which is important for applications such as robot navigation and self-driving. Traditional frame-based methods suffer from performance drops due to the limited dynamic range and motion blur. Therefore, recent works leverage novel event cameras to complement or guide the frame modality via frame-event feature fusion. However, event streams exhibit spatial sparsity, leaving some areas unperceived, especially in regions with marginal light changes. Therefore, direct fusion methods, e.g., RAMNet, often ignore the contribution of the most confident regions of each modality. This leads to structural ambiguity in the modality fusion process, thus degrading the depth estimation performance. In this paper, we propose a novel Spatial Reliability-oriented Fusion Network (SRFNet) that can estimate depth with fine-grained structure at both daytime and nighttime. Our method consists of two key technical components. Firstly, we propose an attention-based interactive fusion (AIF) module that applies spatial priors of events and frames as the initial masks and learns the consensus regions to guide the inter-modal feature fusion. The fused features are then fed back to enhance the frame and event feature learning. Meanwhile, it utilizes an output head to generate a fused mask, which is iteratively updated for learning consensual spatial priors. Secondly, we propose the Reliability-oriented Depth Refinement (RDR) module to estimate dense depth with fine-grained structure based on the fused features and masks. We evaluate the effectiveness of our method on synthetic and real-world datasets, which shows that, even without pretraining, our method outperforms prior methods, e.g., RAMNet, especially in night scenes.
|
|
16:30-18:00, Paper WeCT5-CC.8 | Add to My Program |
Bayesian Filtering for Homography Estimation |
|
Del Castillo Bernal, Arturo | McGill University |
Decoste, Philippe | McGill University |
Forbes, James Richard | McGill University |
Keywords: Sensor Fusion, Visual Tracking, Vision-Based Navigation
Abstract: This paper considers homography estimation in a Bayesian filtering framework using rate gyro and camera measurements. The use of rate gyro measurements facilitates a more reliable estimate of homography in the presence of occlusions, while a Bayesian filtering approach generates both a homography estimate and an associated uncertainty. Uncertainty information opens the door to adaptive filtering approaches, post-processing procedures, and safety protocols. In particular, herein an iterative extended Kalman filter and an interacting multiple model (IMM) filter are tested using both simulated and experimental datasets. The IMM is shown to have good consistency properties and better overall performance when compared to the state-of-the-art homography nonlinear deterministic observer in both simulations and experiments.
|
|
16:30-18:00, Paper WeCT5-CC.9 | Add to My Program |
Saturation-Aware Angular Velocity Estimation: Extending the Robustness of SLAM to Aggressive Motions |
|
Deschênes, Simon-Pierre | Université Laval |
Baril, Dominic | Université Laval |
Boxan, Matej | Norlab, Université Laval |
Laconte, Johann | French National Research Institute for Agriculture, Food and Env |
Giguère, Philippe | Université Laval |
Pomerleau, Francois | Université Laval |
Keywords: Sensor Fusion, Visual-Inertial SLAM, Data Sets for SLAM
Abstract: We propose a novel angular velocity estimation method to increase the robustness of Simultaneous Localization And Mapping (SLAM) algorithms against gyroscope saturations induced by aggressive motions. Field robotics exposes robots to various hazards, including steep terrains, landslides, and staircases, where substantial accelerations and angular velocities can occur if the robot loses stability and tumbles. These extreme motions can saturate sensor measurements, especially gyroscopes, which are the first sensors to become inoperative. While the structural integrity of the robot is at risk, the robustness of the SLAM framework is oftentimes given little consideration. Consequently, even if the robot is physically capable of continuing the mission, its operation will be compromised due to a corrupted representation of the world. Regarding this problem, we propose a method to estimate the angular velocity using accelerometers during extreme rotations caused by tumbling. We show that our method reduces the median localization error by 71.5% in translation and 65.5% in rotation and is robust to mapping failures, which occurred in 37.5% of the experiments without our method. We also propose the Tumbling-Induced Gyroscope Saturation (TIGS) dataset, which consists of outdoor experiments recording the motion of a mechanical lidar subject to angular velocities four times higher than in other similar datasets available. The dataset is available online at https://github.com/norlab-ulaval/Norlab_wiki/wiki/TIGS-Dataset.
|
|
WeCT6-CC Oral Session, CC-414 |
Add to My Program |
Visual Perception Applications |
|
|
Chair: Ladikos, Alexander | ImFusion |
Co-Chair: Bian, Gui-Bin | Institute of Automation, Chinese Academy of Sciences |
|
16:30-18:00, Paper WeCT6-CC.1 | Add to My Program |
ContourPose: Monocular 6D Pose Estimation Method for Reflective Texture-Less Metal Parts |
|
He, Zaixing | Zhejiang University |
Li, Quanzhi | Zhejiang University |
Zhao, Xinyue | Zhejiang University |
Wang, Jin | Zhejiang University |
Shen, Huarong | Zhejiang Feihang Intelligent Technology Co., LTD |
Zhang, Shuyou | Zhejiang University |
Tan, Jianrong | Zhejiang University |
Keywords: 6D pose estimation, Computer Vision for Manufacturing, Deep Learning in Robotics and Automation, Grasping
Abstract: Pose estimation is an essential technology for industrial robots to perform precise gripping and assembly. The state-of-the-art deep learning-based approach uses an indirect strategy, i.e., first finding local correspondences between the 2D image and the 3D model, and then using the PnP and RANSAC methods to calculate the poses of ordinary objects. However, the metal parts in industry are reflective and texture-less, making it difficult to identify distinguishable point features to establish 2D-3D correspondences. To address this problem, in this paper, we propose a novel deep learning based two-stage method for pose estimation of reflective texture-less metal parts, which accurately estimates the target pose using monocular RGB images. Since contours play an important role in both the keypoint prediction and pose estimation stages, our method is named ContourPose. First, an additional contour decoder is adopted to implicitly constrain the keypoint prediction in the former stage, which improves the accuracy of the keypoint prediction. Then, the predicted contour of the previous stage is taken as a geometric prior that is used to iteratively solve for the optimal pose. Experiments indicate that the proposed approach for reflective texture-less metal parts has a significant improvement over the state-of-the-art approaches.
|
|
16:30-18:00, Paper WeCT6-CC.2 | Add to My Program |
Soft Acoustic End-Effector |
|
Zhang, Zhiyuan | Acoustic Robotics Systems Laboratory, Institute of Robotics And |
Koch, Michael | Acoustic Robotics Systems Laboratory, Institute of Robotics And |
Ahmed, Daniel | ETH Zurich |
Keywords: Automation at Micro-Nano Scales, Micro/Nano Robots, Soft Robot Applications
Abstract: Acoustic techniques have been developed as multifunctional tools for various microscale manipulations. In prevalent design paradigms, a position-fixed piezoelectric transducer (PZT) is utilized to generate ultrasound waves. However, the immobility of the PZT restricts the modulation of the acoustic field's position and orientation, consequently diminishing the adaptability and effectiveness of subsequent acoustic micromanipulation tasks. Here, we propose a miniaturized soft acoustic end-effector and demonstrate acoustic field modulation and microparticle manipulation by adjusting the PZT position and orientation. The PZT is mounted on the end of a soft robotic arm that has three individual degrees of freedom and can be deformed in 3D space by inflating or deflating each chamber. Experiments showed that the soft acoustic end-effector can change the traveling direction of microparticles and modulate the location of a standing wave field. Our approach is simple, flexible, and controllable. We envision that the soft acoustic end-effector will facilitate multiscale acoustic manipulation in interdisciplinary applications, especially for in vivo acoustic therapies.
|
|
16:30-18:00, Paper WeCT6-CC.3 | Add to My Program |
Deep Learning Based 6-DoF Antipodal Grasp Planning from Point Cloud in Random Bin-Picking Task Using Single-View |
|
Bui, Tat Hieu | Sungkyunkwan University |
Son, Yeong Gwang | SungKyunKwan University |
Moon, Seung Jae | Sungkyunkwan, Mechanical Engineering, Robottory |
Nguyen, Quang Huy | Sungkyunkwan University/ Robotics Innovatory |
Rhee, Issac | Sungkyunkwan University |
Hong, Juyong | Sungkyunkwan Univ |
Choi, Hyouk Ryeol | Sungkyunkwan University |
Keywords: Computer Vision for Automation, Data Sets for Robotic Vision, Industrial Robots
Abstract: Random bin picking is a crucial task in logistics centers, driven by the growth of e-commerce. In this paper, we present an end-to-end method for 6-DoF antipodal grasping from cluttered scenes. Our approach includes two main steps: finding Potential Grasp Areas (PGAs) from a depth image of the bin and detecting suitable parallel grasps in the PGAs from point cloud data. To support our work, the training datasets are generated automatically in the PyBullet simulation environment, including 5,000 depth images and over 30,000 point clouds of cluttered scenes with different numbers of objects, which significantly saves time for data collection and labeling. We implemented real grasping experiments with a UR10 robot arm, a 2-finger gripper, an L515 depth camera, and 10 objects arranged randomly in the bin to evaluate the efficiency of this method. It is simple, fast, and efficient at dealing with many kinds of objects that are random in shape, dimension, pose, and material. Video of the real robotic experiments is available at https://www.youtube.com/watch?v=cx5THPyIKjA
|
|
16:30-18:00, Paper WeCT6-CC.4 | Add to My Program |
Sim-To-Real Object Pose Estimation for Random Bin Picking |
|
Kim, Boyoung | Samsung Electronics |
Min, Junhong | Samsung Electronics |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: In industry, random bin picking is a complex and difficult task where instance segmentation and object pose estimation based on point clouds are key processes. Recently, learning-based segmentation and pose estimation methods for 3D point clouds have been proposed. However, many of them require supervised learning with datasets with annotations of objects. Since it is difficult to annotate all stacked instances in bin picking dataset, learning without real-world datasets has become a major interest. In this paper, we introduce an instance-level object pose estimation method for bin picking, which is trained using only simulated data and seamlessly applied to real-world scenarios without additional adaptation. To enable this, we introduce a method for generating a comprehensive synthetic dataset using a physics simulator, which incorporates 3D CAD models of objects and automatically generates annotations for both segmentation and pose estimation. Our experiments, conducted on synthetic datasets, highlight the competitive performance of our method in terms of recall and accuracy. Furthermore, we demonstrate the successful integration of our approach with real robot random bin picking, resulting in significantly improved picking success rates.
|
|
16:30-18:00, Paper WeCT6-CC.5 | Add to My Program |
Action-By-Detection: Efficient Forklift Action Detection for Autonomous Mobile Robots in Warehouses |
|
Prutsch, Alexander | Graz University of Technology |
Possegger, Horst | Graz University of Technology |
Bischof, Horst | Graz University of Technology |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Recognition
Abstract: Understanding actions of other agents increases the efficiency of autonomous mobile robots (AMRs) since they encompass intention and indicate future movements. We propose a new method that allows us to infer vehicle actions using a shallow image-based classification model. The actions are classified via bird's-eye view scene crops, where we project the detections of a 3D object detection model onto a context map. We learn map context information and aggregate temporal sequence information without requiring object tracking. This results in a highly efficient classification model that can easily be deployed on embedded AMR hardware. To evaluate our approach, we create new large-scale synthetic datasets showing warehouse traffic based on real vehicle models and geometry.
|
|
16:30-18:00, Paper WeCT6-CC.6 | Add to My Program |
RIDE: Self-Supervised Learning of Rotation-Equivariant Keypoint Detection and Invariant Description for Endoscopy |
|
Karaoglu, Mert Asim | Technical University of Munich, ImFusion GmbH |
Markova, Viktoria | ImFusion GmbH |
Navab, Nassir | TU Munich |
Busam, Benjamin | Technical University of Munich |
Ladikos, Alexander | ImFusion |
Keywords: Computer Vision for Medical Robotics
Abstract: Unlike in natural images, in endoscopy there is no clear notion of an upright camera orientation. Endoscopic videos therefore often contain large rotational motions, which require keypoint detection and description algorithms to be robust to these conditions. While most classical methods achieve rotation-equivariant detection and invariant description by design, many learning-based approaches learn to be robust only up to a certain degree. At the same time, learning-based methods often outperform classical approaches under moderate rotations. In order to address this shortcoming, in this paper we propose RIDE, a learning-based method for rotation-equivariant detection and invariant description. Following recent advancements in group-equivariant learning, RIDE models rotation equivariance implicitly within its architecture. Trained in a self-supervised manner on a large curation of endoscopic images, RIDE requires no manual labeling of training data. We test RIDE in the context of surgical tissue tracking on the SuPeR dataset as well as in the context of relative pose estimation on a repurposed version of the SCARED dataset. In addition, we perform explicit studies showing its robustness to large rotations. Our comparison against recent learning-based and classical approaches shows that RIDE sets a new state-of-the-art performance on matching and relative pose estimation tasks and scores competitively on surgical tissue tracking.
|
|
16:30-18:00, Paper WeCT6-CC.7 | Add to My Program |
LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery |
|
Chen, Kexin | Department of Computer Science Engineering, the Chinese Universi |
Du, Yuyang | The Chinese University of Hong Kong |
You, Tao | National University of Singapore |
Islam, Mobarakol | University College London |
Guo, Ziyu | The Chinese University of Hong Kong |
Jin, Yueming | National University of Singapore |
Chen, Guangyong | Shenzhen Institute of Advanced Technology, Chinese Academy of Sc |
Heng, Pheng Ann | The Chinese University of Hong Kong, Shatin, N.T., Hong Kong |
Keywords: Computer Vision for Medical Robotics, Medical Robots and Systems, Education Robotics
Abstract: Visual question answering (VQA) can be fundamentally crucial for promoting robotic-assisted surgical education. In practice, the needs of trainees are constantly evolving, such as learning more surgical types and adapting to new surgical instruments/techniques. Therefore, continually updating the VQA system by a sequential data stream from multiple resources is demanded in robotic surgery to address new tasks. In surgical scenarios, the privacy issue of patient data often restricts the availability of old data when updating the model, necessitating an exemplar-free continual learning (CL) setup. However, prior studies overlooked two vital problems of the surgical domain: i) large domain shifts from diverse surgical operations collected from multiple departments or clinical centers, and ii) severe data imbalance arising from the uneven presence of surgical instruments or activities during surgical procedures. This paper proposes to address these two problems with a multimodal large language model (LLM) and an adaptive weight assignment methodology. We first develop a new multi-teacher CL framework that leverages a multimodal LLM as the additional teacher. The strong generalization ability of the LLM can bridge the knowledge gap when domain shifts and data imbalances occur. We then put forth a novel data processing method that transforms complex LLM embeddings into logits compatible with our CL framework. We also design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of the old CL model. Finally, we construct a new dataset for surgical VQA tasks. Extensive experimental results demonstrate the superiority of our method to other advanced CL models.
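A generic weighted multi-teacher distillation loss of the kind referenced above might look like the following sketch (our illustration; the paper's adaptive weight assignment and its mapping from LLM embeddings to logits are more involved and not shown):

import torch
import torch.nn.functional as F

def multi_teacher_kd_loss(student_logits, teacher_logits_list, weights, temperature=2.0):
    """weights: per-teacher non-negative weights that sum to 1."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    loss = torch.zeros((), device=student_logits.device)
    for w, teacher_logits in zip(weights, teacher_logits_list):
        p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        loss = loss + w * F.kl_div(log_p_student, p_teacher,
                                   reduction="batchmean") * temperature ** 2
    return loss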
|
|
16:30-18:00, Paper WeCT6-CC.8 | Add to My Program |
Procedure Recognition by Knowledge-Driven Segmentation in Robotic-Assisted Vitreoretinal Surgery |
|
Li, Zhen | Institute of Automation, Chinese Academy of Sciences |
Deng, Yawen | Beijing Institute of Technology |
Ye, Qiang | Institute of Automation, Chinese Academy of Sciences |
Yu, Weihong | Peking Union Medical College Hospital |
Qi, Haoxiang | Beijing Institute of Technology |
Liu, Yaliang | Beijing Institute of Technology |
Yu, Zhangguo | Beijing Institute of Technology |
Bian, Gui-Bin | Institute of Automation, Chinese Academy of Sciences |
Keywords: Computer Vision for Medical Robotics, Recognition
Abstract: Internal limiting membrane (ILM) peeling is a vital vitreoretinal surgery procedure. However, due to its thickness of just 1-2 micrometers and the intricacies associated with its varying density and adhesion, the difficulty of manipulation exceeds the physiological limits of human perception and operation. Surgical robots are characterized by high precision and stability. However, navigating intricate intraocular environments and handling minuscule high-precision areas remain enormous challenges. These include issues of uneven lighting, field-of-view loss, and motion blur. This paper proposes a perception method named 'Multimodal Surgical Process Recognition based on Domain Knowledge and Segmentation (MSPR-DKS),' designed to address these challenges and provide input for the precise control of robots. Moreover, a comprehensive dataset focused on ILM peeling during macular hole surgeries was established. Experimental results underscore the efficacy of this approach, with segmentation accuracies exceeding 99.27% for instruments and macular holes and an average accuracy of 98.97% in recognizing surgical processes. This study paves the way for leveraging domain knowledge and image segmentation to improve robot-assisted manipulation of soft tissues in ophthalmology.
|
|
WeCT7-CC Oral Session, CC-416 |
Add to My Program |
Learning in Control II |
|
|
Chair: Kolathaya, Shishir | Indian Institute of Science |
Co-Chair: Park, Jaeheung | Seoul National University |
|
16:30-18:00, Paper WeCT7-CC.1 | Add to My Program |
Weighting Online Decision Transformer with Episodic Memory for Offline-To-Online Reinforcement Learning |
|
Ma, Xiao | Nanjing University |
Li, Wu-Jun | Nanjing University |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Offline reinforcement learning (RL) has been successfully modeled as a sequence modeling problem, drawing inspiration from the success of Transformers. Offline RL is often limited by the quality of the offline dataset, so offline-to-online RL is a more realistic setting. The online decision transformer (ODT) is an effective and representative sequence modeling-based offline-to-online RL method. Despite its effectiveness, ODT still suffers from sample inefficiency during the online fine-tuning phase. This sample inefficiency arises because the agent treats all state-action pairs in the replay buffer equally when trying to learn from the replay buffer. In this paper, we propose a simple yet effective method, called weighting online decision transformer with episodic memory (WODTEM), to improve sample efficiency. We make a first attempt to introduce an episodic memory (EM) mechanism into sequence modeling-based RL methods. By utilizing the EM mechanism, we propose a novel training objective with a weighting function, based on ODT, to improve sample efficiency. Experimental results on multiple tasks show that WODTEM can improve sample efficiency.
|
|
16:30-18:00, Paper WeCT7-CC.2 | Add to My Program |
COMPOSER: Scalable and Robust Modular Policies for Snake Robots |
|
Zhang, Yuyou | Carnegie Mellon University |
Niu, Yaru | Carnegie Mellon University |
Liu, Xingyu | Carnegie Mellon University |
Zhao, Ding | Carnegie Mellon University |
Keywords: Bioinspired Robot Learning, Reinforcement Learning, Modeling, Control, and Learning for Soft Robots
Abstract: Snake robots have showcased remarkable compliance and adaptability in their interaction with environments as their natural counterparts. While their hyper-redundant and high-dimensional characteristics add to this adaptability, they also pose great challenges to robot control. Instead of perceiving the hyper-redundancy and flexibility of snake robots as mere challenges, there lies an unexplored potential in leveraging these traits to enhance robustness and generalizability at the control policy level. We seek to develop a control policy that effectively breaks down the high dimensionality of snake robots while harnessing their redundancy. In this work, we consider the snake robot as a modular robot and formulate the control of the snake robot as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. Each segment of the snake robot is an agent, using its local observation to independently determine its actions. Specifically, we incorporate a self-attention mechanism to enhance the cooperative behavior between agents. A high-level imagination policy is trained to provide additional rewards to guide the low-level control policy. We validate the proposed method COMPOSER with five snake robot tasks, including goal reaching, wall climbing, shape formation, tube crossing, and block pushing. COMPOSER achieves the highest success rate across all tasks when compared to a centralized baseline and four modular policy baselines. Additionally, we show enhanced robustness against module corruption and significantly superior zero-shot generalizability in our proposed method. The videos of this work are available on our project page: https://sites.google.com/view/composer-snake/.
|
|
16:30-18:00, Paper WeCT7-CC.3 | Add to My Program |
Barrier Functions Inspired Reward Shaping for Reinforcement Learning |
|
Nilaksh, Nilaksh | Indian Institue of Technology, Kharagpur |
Ranjan, Abhishek | Indian Institute of Science Bangalore |
Agrawal, Shreenabh | Indian Institute of Science, Bangalore |
Jain, Aayush | Indian Institute of Technology Kharagpur |
Jagtap, Pushpak | Indian Institute of Science |
Kolathaya, Shishir | Indian Institute of Science |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Legged Robots
Abstract: Reinforcement Learning (RL) has progressed from simple control tasks to complex real-world challenges with large state spaces. While RL excels in these tasks, training time remains a limitation. Reward shaping is a popular solution, but existing methods often rely on value functions, which face scalability issues. This paper presents a novel safety-oriented reward-shaping framework inspired by barrier functions, offering simplicity and ease of implementation across various environments and tasks. To evaluate the effectiveness of the proposed reward formulations, we conduct simulation experiments on CartPole, Ant, and Humanoid environments, along with real-world deployment on the Unitree Go1 quadruped robot. Our results demonstrate that our method leads to 1.4-2.8 times faster convergence and as low as 50-60% actuation effort compared to the vanilla reward. In a sim-to-real experiment with the Go1 robot, we demonstrated better control and dynamics of the bot with our reward framework.
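One simple barrier-inspired shaping term of this flavor (a minimal sketch of the general idea, not the exact formulation in the paper; the barrier function h, alpha, and beta are assumptions) is:

import math

def barrier_shaped_reward(reward, h_s, h_s_next, alpha=1.0, beta=0.1):
    """h(s) >= 0 encodes a safety constraint (e.g., distance to a joint limit).
    The bonus is positive when the discrete-time barrier-like condition
    h(s') - h(s) + alpha * h(s) >= 0 holds, and negative otherwise."""
    margin = h_s_next - h_s + alpha * h_s
    return reward + beta * math.tanh(margin)   # bounded shaping term

print(barrier_shaped_reward(1.0, h_s=0.2, h_s_next=0.25))   # safe transition: small bonus
print(barrier_shaped_reward(1.0, h_s=0.2, h_s_next=-0.4))   # unsafe transition: penalty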
|
|
16:30-18:00, Paper WeCT7-CC.4 | Add to My Program |
AdaptAUG: Adaptive Data Augmentation Framework for Multi-Agent Reinforcement Learning |
|
Yu, Xin | Beihang University |
Tian, Yongkai | Beihang University |
Wang, Li | Beihang University |
Feng, Pu | Beihang University |
Wu, Wenjun | Beihang University |
Shi, Rongye | Beihang University |
Keywords: Reinforcement Learning, Deep Learning Methods, Machine Learning for Robot Control
Abstract: Multi-agent reinforcement learning (MARL) has emerged as a promising approach for the control of multi-robot systems. Nevertheless, the low sample efficiency of MARL poses a significant obstacle to its broader application in robotics. While data augmentation appears to be a straightforward solution for improving sample efficiency, it usually incurs training instability, making the sample efficiency worse. Moreover, manually choosing suitable augmentations for a variety of tasks is a tedious and time-consuming process. To mitigate these challenges, our research theoretically analyzes the implications of data augmentation on MARL algorithms. Guided by these insights, we present AdaptAUG, an adaptive framework designed to selectively identify beneficial data augmentations, thereby achieving superior sample efficiency and overall performance in multi-robot tasks. Extensive experiments in both simulated and real-world multi-robot scenarios validate the effectiveness of our proposed framework.
|
|
16:30-18:00, Paper WeCT7-CC.5 | Add to My Program |
HyperPPO: A Scalable Method for Finding Small Policies for Robotic Control |
|
Hegde, Shashank | University of Southern California |
Huang, Zhehui | University of Southern California |
Sukhatme, Gaurav | University of Southern California |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Swarm Robotics
Abstract: Models with fewer parameters are necessary for the neural control of memory-limited, performant robots. Finding these smaller neural network architectures requires repetitive experimentation and can be time-consuming. We propose HyperPPO, an on-policy reinforcement learning algorithm that utilizes graph hypernetworks to estimate the weights of multiple architectures simultaneously. Our method is capable of estimating weights for policies that are much smaller than commonly used networks yet can represent high-performing policies. We obtain multiple trained policies at the same time while maintaining sample efficiency and provide the user the choice of picking a network architecture that satisfies their inference compute constraints. We show that our method scales well - more training resources produce faster convergence to higher-performing architectures. We also demonstrate that the neural policies estimated by HyperPPO are capable of decentralized control of a Crazyflie2.1 quadrotor. Project website: https://sites.google.com/usc.edu/hyperppo
|
|
16:30-18:00, Paper WeCT7-CC.6 | Add to My Program |
Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion |
|
Smith, Laura | UC Berkeley |
Cao, Yunhao | UC Berkeley |
Levine, Sergey | UC Berkeley |
Keywords: Reinforcement Learning, Deep Learning Methods
Abstract: Deep reinforcement learning can enable robots to autonomously acquire complex behaviors such as legged locomotion. However, RL in the real world is complicated by constraints on efficiency, safety, and overall training stability, which limits its practical applicability. We present APRL, a policy regularization framework that modulates the robot's exploration throughout training, striking a balance between flexible improvement potential and focused, efficient exploration. APRL enables a quadrupedal robot to efficiently learn to walk entirely in the real world within minutes and continue to improve with more training where prior work saturates in performance. We demonstrate that continued training with APRL results in a policy that is substantially more capable of navigating challenging situations and adapts to changes in dynamics. Videos and code to reproduce our results are available at: https://sites.google.com/berkeley.edu/aprl
|
|
16:30-18:00, Paper WeCT7-CC.7 | Add to My Program |
Torque-Based Deep Reinforcement Learning for Task-And-Robot Agnostic Learning on Bipedal Robots Using Sim-To-Real Transfer |
|
Kim, Donghyeon | Graduate School of Convergence Science and Technology, Seoul Nat |
Berseth, Glen | Université De Montréal |
Schwartz, Mathew | New Jersey Institute of Technology |
Park, Jaeheung | Seoul National University |
Keywords: Reinforcement Learning, Humanoid and Bipedal Locomotion
Abstract: In this paper, we review the question of which action space is best suited for controlling a real biped robot in combination with Sim2Real training. Position control has been popular as it has been shown to be more sample efficient and intuitive to combine with other planning algorithms. However, position control requires gain tuning to achieve the best possible policy performance. We show that instead, using a torque-based action space enables task-and-robot agnostic learning with less parameter tuning and mitigates the sim-to-reality gap by taking advantage of torque control's inherent compliance. Also, we accelerate the torque-based policy training process by pre-training the policy to remain upright by compensating for gravity. The paper showcases the first successful sim-to-real transfer of a torque-based deep reinforcement learning policy on a real human-sized biped robot.
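The gravity-compensation pre-training target can be illustrated via inverse dynamics at zero velocity and acceleration (a sketch using PyBullet, which is our assumption; the authors' setup may differ):

import pybullet as p
import pybullet_data

def gravity_torques(body_id, q):
    """Joint torques that hold the robot static at configuration q (pure gravity
    compensation): inverse dynamics evaluated with zero velocities and accelerations."""
    zeros = [0.0] * len(q)
    return p.calculateInverseDynamics(body_id, list(q), zeros, zeros)

p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0.0, 0.0, -9.81)
robot = p.loadURDF("kuka_iiwa/model.urdf", useFixedBase=True)  # stand-in arm model
print(gravity_torques(robot, [0.3] * p.getNumJoints(robot)))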
|
|
16:30-18:00, Paper WeCT7-CC.8 | Add to My Program |
Decentralized Motor Skill Learning for Complex Robotic Systems |
|
Guo, Yanjiang | Tsinghua University |
Jiang, Zheyuan | Tsinghua University |
Wang, Yen-Jen | Tsinghua University |
Gao, Jingyue | Tsinghua University |
Chen, Jianyu | Tsinghua University |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning (RL) has achieved remarkable success in complex robotic systems (e.g., quadruped locomotion). In previous works, the RL-based controller was typically implemented as a single neural network with concatenated observation input. However, the corresponding learned policy is highly task-specific. Since all motors are controlled in a centralized way, out-of-distribution local observations can impact global motors through the single coupled neural network policy. In contrast, animals and humans can control their limbs separately. Inspired by this biological phenomenon, we propose a Decentralized motor skill (DEMOS) learning algorithm to automatically discover motor groups that can be decoupled from each other while preserving essential connections, and then learn a decentralized motor control policy. Our method improves the robustness and generalization of the policy without sacrificing performance. Experiments on quadruped and humanoid robots demonstrate that the learned policy is robust against local motor malfunctions and can be transferred to new tasks.
|
|
WeCT8-CC Oral Session, CC-418 |
Add to My Program |
Manipulation Planning |
|
|
Chair: Makita, Satoshi | Fukuoka Institute of Technology |
Co-Chair: Higashimori, Mitsuru | Osaka University |
|
16:30-18:00, Paper WeCT8-CC.1 | Add to My Program |
Toward Optimal Tabletop Rearrangement with Multiple Manipulation Primitives |
|
Huang, Baichuan | Rutgers University |
Zhang, Xujia | Southern University of Science and Technology |
Yu, Jingjin | Rutgers University |
Keywords: Manipulation Planning, Task and Motion Planning
Abstract: In practice, many types of manipulation actions (e.g., pick-n-place and push) are needed to accomplish real-world manipulation tasks. Yet, limited research exists that explores the synergistic integration of different manipulation actions for optimally solving long-horizon task-and-motion planning problems. In this study, we propose and investigate planning high-quality action sequences for solving long-horizon tabletop rearrangement tasks in which multiple manipulation primitives are required. Denoting the problem as rearrangement with multiple manipulation primitives (REMP), we develop two algorithms, hierarchical best-first search (HBFS) and parallel Monte Carlo tree search for multi-primitive rearrangement (PMMR), toward optimally resolving the challenge. Extensive simulation and real robot experiments demonstrate that both methods effectively tackle REMP, with HBFS excelling in planning speed and PMMR producing human-like, high-quality solutions with a nearly 100% success rate. Source code and supplementary materials will be available at https://github.com/arc-l/remp.
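For orientation, a generic best-first search skeleton of the kind underlying HBFS is sketched below (our illustration; the hierarchical structure, primitive-specific successor generation, and cost model of HBFS are not shown, and the state, successor, and heuristic callbacks are placeholders):

import heapq
import itertools

def best_first_search(start, is_goal, successors, heuristic):
    """successors(state) yields (action, next_state, step_cost) tuples.
    States must be hashable so visited states can be tracked."""
    counter = itertools.count()                  # tie-breaker for the heap
    frontier = [(heuristic(start), next(counter), start, [])]
    visited = set()
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan
        if state in visited:
            continue
        visited.add(state)
        for action, nxt, cost in successors(state):
            if nxt not in visited:
                heapq.heappush(frontier,
                               (heuristic(nxt), next(counter), nxt, plan + [action]))
    return None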
|
|
16:30-18:00, Paper WeCT8-CC.2 | Add to My Program |
ReorientDiff: Diffusion Model Based Reorientation for Object Manipulation |
|
Mishra, Utkarsh | Georgia Institute of Technology |
Chen, Yongxin | Georgia Institute of Technology |
Keywords: Probabilistic Inference, Manipulation Planning, Deep Learning Methods
Abstract: The ability to manipulate objects in desired configurations is a fundamental requirement for robots to complete various practical applications. While certain goals can be achieved by picking and placing the objects of interest directly, object reorientation is needed for precise placement in most of the tasks. In such scenarios, the object must be reoriented and re-positioned into intermediate poses that facilitate accurate placement at the target pose. To this end, we propose a reorientation planning method, ReorientDiff, that utilizes a diffusion model-based approach. The proposed method employs both visual inputs from the scene, and goal-specific language prompts to plan intermediate reorientation poses. Specifically, the scene and language-task information are mapped into a joint scene-task representation feature space, which is subsequently leveraged to condition the diffusion model. The diffusion model samples intermediate poses based on the representation using classifier-free guidance and then uses gradients of learned feasibility-score models for implicit iterative pose-refinement. The proposed method is evaluated using a set of YCB-objects and a suction gripper, demonstrating a success rate of 95.2% in simulation. Overall, we present a promising approach to address the reorientation challenge in manipulation by learning a conditional distribution, which is an effective way to move towards generalizable object manipulation. More results can be found on our website: utkarshmishra04.github.io/ReorientDiff
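The classifier-free guidance step mentioned above combines conditional and unconditional noise predictions; a framework-agnostic sketch (the eps_model signature and guidance scale are assumptions, not the ReorientDiff code):

def cfg_eps(eps_model, x_t, t, cond, guidance_scale=3.0):
    """Classifier-free guidance: push the prediction toward the conditional direction.
    eps_model(x_t, t, cond) returns the predicted noise for a pose sample x_t at
    timestep t; passing cond=None stands for the conditioning being dropped."""
    eps_cond = eps_model(x_t, t, cond)
    eps_uncond = eps_model(x_t, t, None)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)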
|
|
16:30-18:00, Paper WeCT8-CC.3 | Add to My Program |
ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals |
|
Collins, Jeremy | Georgia Institute of Technology |
Houff, Cody | Georgia Institute of Technology |
Tan, You Liang | Georgia Institute of Technology |
Kemp, Charles C. | Hello Robot Inc |
Keywords: Manipulation Planning, Deep Learning in Grasping and Manipulation, Mobile Manipulation
Abstract: We present ForceSight, a system for text-guided mobile manipulation that predicts visual-force goals using a text-conditioned vision transformer. Given a single RGBD image and a text prompt, ForceSight determines a target end-effector pose in the camera frame (kinematic goal) and the associated forces (force goal). Together, these two components form a visual-force goal. Prior work has demonstrated that deep models outputting human-interpretable kinematic goals can enable dexterous manipulation by real robots. Forces are critical to manipulation, yet have typically been relegated to low-level execution in these systems. When deployed on a mobile manipulator equipped with an eye-in-hand RGBD camera, ForceSight performed tasks such as precision grasps, drawer opening, and object handovers with an 81% success rate in unseen environments with object instances that differed significantly from the training data. In a separate experiment, relying exclusively on visual servoing and ignoring force goals dropped the success rate from 90% to 45%, demonstrating that force goals can significantly enhance performance. The appendix, videos, code, and trained models are available at https://force-sight.github.io/.
|
|
16:30-18:00, Paper WeCT8-CC.4 | Add to My Program |
Unknown Object Retrieval in Confined Space through Reinforcement Learning with Tactile Exploration |
|
Zhao, Xinyuan | Agency for Science, Technology and Research |
Liang, Wenyu | Institute for Infocomm Research, A*STAR |
Zhang, Xiaoshi | National University of Singapore |
Chew, Chee Meng | National University of Singapore |
Wu, Yan | A*STAR Institute for Infocomm Research |
Keywords: Manipulation Planning, Force and Tactile Sensing, Dexterous Manipulation
Abstract: The potential of tactile sensing for dexterous robotic manipulation has been demonstrated by its ability to enable nuanced real-world interactions. In this study, the retrieval of unknown objects from confined spaces, which is unsuitable for conventional visual perception and gripper-based manipulation, is identified and addressed. Specifically, a tactile-sensorized tool stick that well fits in the narrow space is utilized to provide multi-point contact sensing in object manipulation. A reinforcement learning (RL) agent with a hybrid action space is then proposed to acquire the optimal policy for manipulating the objects without prior knowledge of their physical properties. To accelerate on-hardware training, a focused training strategy is adopted with the hypothesis that an agent trained on a small set of representative shapes can be generalized to a wide range of everyday objects. Additionally, a curriculum on terminal goals is designed to further accelerate the hardware-based training process. Comparative experiments and ablation studies have been conducted to evaluate the effectiveness and robustness of the proposed approach, which highlights the high success rate of our solution for retrieving everyday objects.
|
|
16:30-18:00, Paper WeCT8-CC.5 | Add to My Program |
The Grasp Loop Signature: A Topological Representation for Manipulation Planning with Ropes and Cables |
|
Mitrano, Peter | University of Michigan |
Berenson, Dmitry | University of Michigan |
Keywords: Manipulation Planning, Motion and Path Planning
Abstract: Robotic manipulation of deformable, one-dimensional objects (DOOs) like ropes or cables has important potential applications in manufacturing, agriculture, and surgery. In such environments, the task may involve threading through or avoiding becoming tangled with objects like racks or frames. Grasping with multiple grippers can create closed loops between the robot and DOO, and if an obstacle lies within this loop, it may be impossible to reach the goal. However, prior work has only considered the topology of the DOO in isolation, ignoring the arms that are manipulating it. Searching over possible grasps to accomplish the task without considering such topological information is very inefficient, as many grasps will not lead to progress on the task due to topological constraints. Therefore, we propose a grasp loop signature which categorizes the topology of these grasp loops and show how it can be used to guide planning. We perform experiments in simulation on two DOO manipulation tasks to show that using the signature is faster and succeeds more often than methods that rely on local geometry or finite-horizon planning. Finally, we demonstrate using the signature in the real world to manipulate a cable in a scene with obstacles using a dual-arm robot.
|
|
16:30-18:00, Paper WeCT8-CC.6 | Add to My Program |
Articulated Object Manipulation with Coarse-To-Fine Affordance for Mitigating the Effect of Point Cloud Noise |
|
Ling, Suhan | Peking University |
Wang, Yian | Umass Amherst |
Wu, Ruihai | Peking University |
Wu, Shiguang | Chinese Academy of Sciences Beijing, China |
Zhuang, Yuzheng | Huawei Technologies Company |
Xu, Tianyi | Peking University |
Li, Yu | BUPT |
Liu, Chang | Peking University |
Dong, Hao | Peking University |
Keywords: Manipulation Planning
Abstract: 3D articulated objects are inherently challenging to manipulate due to their varied geometries and intricate functionalities. Point-level affordance, which predicts a per-point actionable score and thus proposes the best point to interact with, has demonstrated excellent performance and generalization capabilities in articulated object manipulation. However, a significant challenge remains: while previous works use perfect point clouds generated in simulation, the resulting models cannot be directly applied to noisy point clouds in the real world. To tackle this challenge, we leverage the property of real-world scanned point clouds that they become less noisy when the camera is closer to the object. We therefore propose a novel coarse-to-fine affordance learning pipeline that mitigates the effect of point cloud noise in two stages. In the first stage, we learn affordance on the noisy far point cloud, which includes the whole object, to propose an approximate place to manipulate. Then, we move the camera in front of this approximate place, scan a less noisy point cloud containing precise local geometries for manipulation, and learn affordance on that point cloud to propose fine-grained final actions. The proposed method is thoroughly evaluated both on large-scale simulated noisy point clouds mimicking real-world scans and in real-world scenarios, showing superiority over existing methods and demonstrating its effectiveness in tackling the noisy real-world point cloud problem.
|
|
16:30-18:00, Paper WeCT8-CC.7 | Add to My Program |
Improved M4M: Faster and Richer Planning for Manipulation among Movable Objects in Cluttered 3D Workspaces |
|
Saxena, Dhruv Mauria | The Robotics Institute, Carnegie Mellon University |
Likhachev, Maxim | Carnegie Mellon University |
Keywords: Manipulation Planning, Task and Motion Planning
Abstract: We are interested in enabling robots to solve difficult pick-and-place manipulation tasks in cluttered and constrained environments. If the robot does not have collision-free access to the object-of-interest (OoI) which it intends to grasp and extract from the workspace, it must reason about which movable objects to rearrange, where to move them, and how it may do so. In recent work we introduced E-M4M, a graph search-based solver for such Manipulation tasks Among Movable Objects (MAMO). In this paper we make several improvements to E-M4M: we introduce prehensile (pick-and-place) rearrangement actions in addition to pushes; we show that running it as a depth-first search improves performance; we show how the search can be run ``eagerly lazily'' so that actions are only simulated in a physics-based simulator when necessary; finally, we relax the assumption of perfect knowledge of the physical properties of objects (mass and coefficient of friction in particular). The improved version of E-M4M presented in this paper, I-M4M, is a faster and more versatile MAMO solver with a rich action space. We discuss the impact of these improvements in an extensive simulation study and show previously unachievable results on a real-world PR2 robot.
|
|
16:30-18:00, Paper WeCT8-CC.8 | Add to My Program |
Preprocessing-Based Kinodynamic Motion Planning Framework for Intercepting Projectiles Using a Robot Manipulator |
|
Natarajan, Ramkumar | Robotics Institute, Carnegie Mellon University |
Yang, Hanlan | Carnegie Mellon University |
Xie, Qintong | University of Oxford |
Oza, Yash | Amazon Robotics |
Das, Manash Pratim | Carnegie Mellon University |
Islam, Fahad | Carnegie Mellon University |
Saleem, Muhammad Suhail | Carnegie Mellon University |
Choset, Howie | Carnegie Mellon University |
Likhachev, Maxim | Carnegie Mellon University |
Keywords: Manipulation Planning, Motion and Path Planning, Optimization and Optimal Control
Abstract: We are interested in studying sports with robots and starting with the problem of intercepting a projectile moving toward a robot manipulator equipped with a shield. To successfully perform this task, the robot needs to (i) detect the incoming projectile, (ii) predict the projectile's future motion, (iii) plan a minimum-time rapid trajectory that can evade obstacles and intercept the projectile, and (iv) execute the planned trajectory. These four steps must be performed under the manipulator's dynamic limits and extreme time constraints (<350ms in our setting) to successfully intercept the projectile. In addition, we want these trajectories to be smooth to reduce the robot's joint torques and the impulse on the platform on which it is mounted. To this end, we propose a kinodynamic motion planning framework that preprocesses smooth trajectories offline to allow real-time collision-free executions online. We present an end-to-end pipeline along with our planning framework, including perception, prediction, and execution modules. We evaluate our framework experimentally in simulation and show that it has a higher blocking success rate than the baselines. Further, we deploy our pipeline on a robotic system comprising an industrial arm (ABB IRB-1600) and an onboard stereo camera (ZED 2i), which achieves a 78% success rate in projectile interceptions.
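As a generic illustration of step (ii) above, a drag-free ballistic model suffices to roll a projectile state estimate forward in time; the sketch below is a simplification under stated assumptions, not the authors' prediction module, and the 350 ms horizon and 10 ms step are illustrative.

import numpy as np

def predict_projectile(p0, v0, g=np.array([0.0, 0.0, -9.81]), horizon=0.35, dt=0.01):
    """Predict future projectile positions under a drag-free ballistic model.
    p0, v0: current position/velocity estimates (3,) in the robot frame.
    Returns an array of shape (N, 3) of predicted positions over `horizon` seconds."""
    ts = np.arange(dt, horizon + dt, dt)
    return p0 + np.outer(ts, v0) + 0.5 * np.outer(ts**2, g)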
|
|
WeCT9-CC Oral Session, CC-419 |
Add to My Program |
Collision Avoidance III |
|
|
Chair: Park, Daehyung | Korea Advanced Institute of Science and Technology, KAIST |
Co-Chair: Liu, Lantao | Indiana University |
|
16:30-18:00, Paper WeCT9-CC.1 | Add to My Program |
MIM: Indoor and Outdoor Navigation in Complex Environments Using Multi-Layer Intensity Maps |
|
Sathyamoorthy, Adarsh Jagan | University of Maryland |
Kulathun Mudiyanselage, Kasun Weerakoon | University of Maryland, College Park |
Elnoor, Mohamed | University of Maryland |
Russell, Mason | Army Research Laboratory |
Pusey, Jason | U.S. Army Research Laboratory (ARL) |
Manocha, Dinesh | University of Maryland |
Keywords: Collision Avoidance, Motion and Path Planning, Field Robots
Abstract: We present MIM (Multi-Layer Intensity Map), a novel 3D object representation for robot perception and autonomous navigation. MIMs consist of multiple stacked layers of 2D grid maps, each derived from reflected point cloud intensities corresponding to a certain height interval. The different layers of MIMs can be used to simultaneously estimate obstacles' height, solidity/density, and opacity. We demonstrate that MIMs can help accurately differentiate obstacles that are safe to navigate through (e.g. beaded/string curtains, pliable tall grass) from ones that must be avoided (e.g. transparent surfaces such as glass walls, bushes, trees, etc.) in indoor and outdoor environments. Further, to handle narrow passages and navigate through non-solid obstacles in dense environments, we propose an approach to adaptively inflate or enlarge the obstacles detected on MIMs based on their solidity and the robot's preferred velocity direction. We demonstrate these improved navigation capabilities in real-world narrow, dense environments using a real Turtlebot and a Boston Dynamics Spot. We observe significant increases in success rates to more than 50%, up to a 9.5% decrease in normalized trajectory length, and up to a 22.6% increase in the F-score compared to current navigation methods using other sensor modalities.
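To make the layered representation concrete, the following sketch shows one plausible way to bin LiDAR returns into per-height-interval intensity grids; the cell size, extent, height bands, and mean-intensity rule are illustrative assumptions rather than the parameters used in MIM.

import numpy as np

def build_intensity_layers(points, intensities, cell=0.1, extent=10.0,
                           height_bands=((0.0, 0.5), (0.5, 1.5), (1.5, 3.0))):
    """Build one 2D grid per height band, each cell holding the mean reflected
    intensity of the points that fall into it. points: (N, 3), intensities: (N,)."""
    n = int(2 * extent / cell)
    layers = np.zeros((len(height_bands), n, n))
    counts = np.zeros_like(layers)
    ix = np.clip(((points[:, 0] + extent) / cell).astype(int), 0, n - 1)
    iy = np.clip(((points[:, 1] + extent) / cell).astype(int), 0, n - 1)
    for k, (lo, hi) in enumerate(height_bands):
        mask = (points[:, 2] >= lo) & (points[:, 2] < hi)
        np.add.at(layers[k], (ix[mask], iy[mask]), intensities[mask])
        np.add.at(counts[k], (ix[mask], iy[mask]), 1.0)
    return np.divide(layers, counts, out=np.zeros_like(layers), where=counts > 0)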
|
|
16:30-18:00, Paper WeCT9-CC.2 | Add to My Program |
Gaussian Process-Based Traversability Analysis for Terrain Mapless Navigation |
|
Abe Leininger, Abraham | Indiana University |
Ali, Mahmoud | Indiana University |
Jardali, Hassan | Indiana University |
Liu, Lantao | Indiana University |
Keywords: Collision Avoidance, Motion and Path Planning
Abstract: Efficient navigation through uneven terrain remains a challenging endeavor for autonomous robots. We propose a new geometric-based uneven terrain mapless navigation framework combining a Sparse Gaussian Process (SGP) local map with a Rapidly-Exploring Random Tree* (RRT*) planner. Our approach begins with the generation of a high-resolution SGP local map, providing an interpolated representation of the robot's immediate environment. This map captures crucial environmental variations, including height, uncertainties, and slope characteristics. Subsequently, we construct a traversability map based on the SGP representation to guide our planning process. The RRT* planner efficiently generates real-time navigation paths, avoiding untraversable terrain in pursuit of the goal. This combination of SGP-based terrain interpretation and RRT* planning enables ground robots to safely navigate environments with varying elevations and steep obstacles. We evaluate the performance of our proposed approach through robust simulation testing, highlighting its effectiveness in achieving safe and efficient navigation compared to existing methods. See the project GitHub for source code and supplementary materials, including a video demonstrating experimental results.
|
|
16:30-18:00, Paper WeCT9-CC.3 | Add to My Program |
Active Collision-Based Navigation for Wheeled Robots |
|
Li, Jingjing | Zhejiang University |
Ji, Jialin | Zhejiang University |
Wang, Qianhao | Zhejiang University |
Yu, Huan | Zhejiang University |
Pan, Yu | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Planning under Uncertainty, Motion and Path Planning, Localization
Abstract: Collision is typically avoided in robot navigation to guarantee safety. However, when a robot's exteroceptive sensors fail, which means it becomes "blind", collision can actually be leveraged to improve localization performance. Our research demonstrates the informative nature of collisions in this context. Moreover, we show that a robot is able to navigate in a known environment with only proprioceptive sensors by actively colliding with its surroundings for more reliable localization. Firstly, we design a collision-based observation model, which is differentiable and can be easily applied to various estimators. Secondly, we integrate this model into a collision-aided localization framework and implement it in two widely used estimators, the Kalman filter and the particle filter. Thirdly, we propose an active collision path planning method, which effectively reduces localization uncertainty.
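A hypothetical sketch of how a binary collision observation could enter a particle filter is shown below; the Gaussian weighting rule, the occupancy and distance-field interfaces, and all constants are assumptions made for exposition, not the paper's observation model.

import numpy as np

def collision_update(particles, weights, collided, occupancy, dist_to_obstacle,
                     sigma=0.1):
    """Reweight particles given a binary collision observation.
    particles: (N, 2) hypothesized positions; occupancy(p) -> bool (True = occupied);
    dist_to_obstacle(p) -> distance to the nearest obstacle surface."""
    for i, p in enumerate(particles):
        d = dist_to_obstacle(p)
        if collided:
            # A collision is most consistent with poses near an obstacle surface.
            weights[i] *= np.exp(-0.5 * (d / sigma) ** 2)
        else:
            # No collision: penalize particles that sit inside obstacles.
            weights[i] *= 0.05 if occupancy(p) else 1.0
    weights /= weights.sum()
    return weights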
|
|
16:30-18:00, Paper WeCT9-CC.4 | Add to My Program |
Graph-Based 3D Collision-Distance Estimation Network with Probabilistic Graph Rewiring |
|
Song, Minjae | KAIST |
Kim, Yeseung | KAIST |
Kim, Min Jun | KAIST |
Park, Daehyung | Korea Advanced Institute of Science and Technology, KAIST |
Keywords: Collision Avoidance, Computational Geometry, AI-Based Methods
Abstract: We aim to solve the problem of data-driven collision-distance estimation given 3-dimensional (3D) geometries. Conventional algorithms suffer from low accuracy due to their reliance on limited representations, such as point clouds. In contrast, our previous graph-based model, GraphDistNet, achieves high accuracy using edge information but incurs higher message-passing costs with growing graph size, limiting its applicability to 3D geometries. To overcome these challenges, we propose GDN-R, a novel 3D graph-based estimation network. GDN-R employs a layer-wise probabilistic graph-rewiring algorithm leveraging the differentiable Gumbel-top-K relaxation. Our method accurately infers minimum distances through iterative graph rewiring and updating of relevant embeddings. The probabilistic rewiring enables fast and robust embedding with respect to unforeseen categories of geometries. Through 41,412 random benchmark tasks with 150 pairs of 3D objects, we show that GDN-R outperforms state-of-the-art baseline methods in terms of accuracy and generalizability. We also show that the proposed rewiring improves update performance while reducing the size of the estimation model. We finally show its batch prediction and auto-differentiation capabilities for trajectory optimization in both simulated and real-world scenarios.
|
|
16:30-18:00, Paper WeCT9-CC.5 | Add to My Program |
Jump Over Block (JOB): An Efficient Line-Of-Sight Checker for Grid/voxel Maps with Sparse Obstacles |
|
Yao, Zhuo | Beihang University |
Wang, Wei | Beihang University |
Zhang, Jiadong | Beihang University |
Wang, Yan | Beihang University, School of Mechanical Engineering and Automat |
Li, Jinjiang | Beihang |
Keywords: Collision Avoidance, Motion and Path Planning, Computational Geometry
Abstract: The Line-Of-Sight (LOS) check plays a crucial role in collision avoidance and is time-consuming, particularly in scenarios involving large-scale maps with sparse obstacles, as it necessitates a grid-by-grid state check. Specifically, LOS checks consume more than half of the computational time in any-angle path planning algorithms such as Theta*, Visibility Graph, and RRT. To address this issue, we propose an efficient LOS checker for maps of arbitrary dimensions with sparse obstacles. Our approach involves a two-step process. Firstly, we partition the passable space into blocks until there is no vacancy for a minimum-sized block. When the adapted Bresenham algorithm reaches a surface of a block, it bypasses grid-by-grid traversal within the block and directly jumps to the opposing surface. This method significantly reduces the number of grids examined, resulting in higher efficiency compared to traditional LOS checks. We refer to our approach as Jump Over Block (JOB). To demonstrate the advantages of JOB, we compare its performance against traditional LOS checks using a widely recognized public dataset. The results indicate that JOB incurs only 1/6 to 1/5 of the computational cost associated with raw LOS checks, making it a valuable tool for both researchers and practitioners in the field. In order to facilitate further research within the community, we have made the source code of the proposed algorithm publicly available.
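The following simplified 2D sketch conveys the block-jumping idea: instead of stepping cell by cell, the ray marcher skips ahead by half the size of a precomputed obstacle-free block around the current cell. The block array, the conservative half-size jump, and the plain ray march are illustrative assumptions and omit the adapted Bresenham details of JOB.

import numpy as np

def line_of_sight(grid, block, p0, p1, eps=1e-9):
    """Simplified 2D sketch of a block-jumping LOS check.
    grid: boolean occupancy (True = obstacle); block[i, j]: side length (in cells)
    of a precomputed obstacle-free square block centered on free cell (i, j).
    Marches along the segment p0 -> p1, skipping ahead by half the local block size."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = p1 - p0
    length = np.linalg.norm(d)
    if length < eps:
        return True
    u = d / length
    t = 0.0
    while t < length:
        i, j = int(p0[0] + t * u[0]), int(p0[1] + t * u[1])
        if grid[i, j]:
            return False                       # the ray entered an occupied cell
        # Jump over the free block instead of stepping cell by cell; the half-size
        # jump keeps the next sample inside or on the boundary of the free block.
        t += max(1.0, float(block[i, j]) / 2.0)
    return not grid[int(p1[0]), int(p1[1])]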
|
|
16:30-18:00, Paper WeCT9-CC.6 | Add to My Program |
Conformal Predictive Safety Filter for RL Controllers in Dynamic Environments |
|
Strawn, Kegan | University of Southern California |
Ayanian, Nora | Brown University |
Lindemann, Lars | University of Southern California |
Keywords: Collision Avoidance, Reinforcement Learning, Robot Safety
Abstract: The interest in using reinforcement learning (RL) controllers in safety-critical applications such as robot navigation around pedestrians motivates the development of additional safety mechanisms. Running RL-enabled systems among uncertain dynamic agents may result in high counts of collisions and failures to reach the goal. The system could be safer if the pre-trained RL policy was uncertainty-informed. For that reason, we propose conformal predictive safety filters that: 1) predict the other agents’ trajectories, 2) use statistical techniques to provide uncertainty intervals around these predictions, and 3) learn an additional safety filter that closely follows the RL controller but avoids the uncertainty intervals. We use conformal prediction to learn uncertainty-informed predictive safety filters, which make no assumptions about the agents’ distribution. The framework is modular and outperforms the existing controllers in simulation. We demonstrate our approach with multiple experiments in a collision avoidance gym environment and show that our approach minimizes the number of collisions without making overly conservative predictions.
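As a brief illustration of the split conformal step the abstract refers to, the sketch below computes a finite-sample-corrected radius from calibration residuals; the Euclidean nonconformity score and calibration interface are assumptions, not the authors' exact construction.

import numpy as np

def conformal_radius(calib_pred, calib_true, alpha=0.05):
    """Split conformal prediction: compute a radius r such that, with probability
    at least 1 - alpha, the true next position lies within r of the prediction.
    calib_pred, calib_true: (N, 2) predicted / observed agent positions."""
    scores = np.linalg.norm(calib_pred - calib_true, axis=1)   # nonconformity scores
    n = len(scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n                     # finite-sample correction
    return np.quantile(scores, min(q, 1.0))

# Usage sketch: inflate each predicted agent position by the conformal radius and
# have the safety filter keep the robot's plan outside the inflated regions.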
|
|
16:30-18:00, Paper WeCT9-CC.7 | Add to My Program |
GrASPE: Graph Based Multimodal Fusion for Robot Navigation in Outdoor Environments |
|
Kulathun Mudiyanselage, Kasun Weerakoon | University of Maryland, College Park |
Sathyamoorthy, Adarsh Jagan | University of Maryland |
Liang, Jing | University of Maryland |
Guan, Tianrui | University of Maryland |
Patel, Utsav | University of Maryland |
Manocha, Dinesh | University of Maryland |
Keywords: Motion and Path Planning, Vision-Based Navigation, Collision Avoidance
Abstract: We present a novel trajectory traversability estimation and planning algorithm for robot navigation in complex outdoor environments. We incorporate multimodal sensory inputs from an RGB camera, 3D LiDAR, and the robot's odometry sensor to train a prediction model to estimate candidate trajectories' success probabilities based on partially reliable multi-modal sensor observations. We encode high-dimensional multi-modal sensory inputs to low-dimensional feature vectors using encoder networks and represent them as a connected graph. The graph is then used to train an attention-based Graph Neural Network (GNN) to predict trajectory success probabilities. We further analyze the number of features in the image (corners) and point cloud data (edges and planes) separately to quantify their reliability to augment the weights of the feature graph representation used in our GNN. During runtime, our model utilizes multi-sensor inputs to predict the success probabilities of the trajectories generated by a local planner to avoid potential collisions and failures. Our algorithm demonstrates robust predictions when one or more sensor modalities are unreliable or unavailable in complex outdoor environments. We evaluate our algorithm's navigation performance using a Spot robot in real-world outdoor environments. We observe an increase of 10-30% in terms of navigation success rate and up to 15% increase in AU-ROC compared to the state-of-the-art navigation methods.
|
|
16:30-18:00, Paper WeCT9-CC.8 | Add to My Program |
E-RRT*: Path Planning for Hyper-Redundant Manipulators |
|
Ji, Hongcheng | Zhejiang University |
Xie, Haibo | Zhejiang University |
Wang, Cheng | Zhejiang University |
Yang, Huayong | ZheJiang University |
Keywords: Motion and Path Planning, Collision Avoidance
Abstract: A hyper-redundant manipulator (HRM) can flexibly accomplish tasks in narrow spaces. However, its excessive degrees of freedom pose challenges for path planning. In this article, an ellipsoid-shaped rapidly-exploring random tree (E-RRT*) method is proposed for path planning of HRMs in the workspace, particularly those with angle limits. This method replaces line segments with ellipsoids to connect adjacent nodes. Firstly, an analysis of the angle constraints of the HRM is conducted, providing restrictions on node selection during path planning. Secondly, a slow-speed informed guiding approach is introduced to optimize the sampling process. Finally, the obtained path is enhanced by adding control points and applying cubic polynomial interpolation to achieve path smoothing. Simulations demonstrate that the proposed E-RRT* method effectively solves the path planning problem for HRMs. Especially in narrow environments, appropriate informed guiding speeds enable E-RRT* to outperform other methods.
|
|
16:30-18:00, Paper WeCT9-CC.9 | Add to My Program |
LiDAR-Based Online Control Barrier Function Synthesis for Safe Navigation in Unknown Environments |
|
Keyumarsi, Shaghayegh | Tampere University |
Atman, Made Widhi Surya | Tampere University |
Gusrialdi, Azwirman | Tampere University |
Keywords: Collision Avoidance, Autonomous Agents, Machine Learning for Robot Control
Abstract: This paper presents a novel extension of the Control Barrier Function (CBF) as the low-level safety controller for autonomous mobile robots navigating in unknown environments. The main challenges of implementing CBFs in real-world situations arise from the absence of a model, or the lack of an exact one, for the environment. Additionally, online learning is needed for the robot to maneuver in an unknown environment, which requires dealing with the sampled data set size, memory, and computational complexity. We address these challenges by designing an online non-parametric LiDAR-based safety function using a Gaussian process (GP). It is efficient in data size and eliminates the requirement to store previous data. A CBF is then synthesized using the proposed safety function to rectify the safe control input. The effectiveness of the LiDAR-based CBF synthesis for navigation in unknown environments was validated by conducting experiments on unicycle-type robots.
|
|
WeCT10-CC Oral Session, CC-501 |
Add to My Program |
Soft Robot Applications II |
|
|
Chair: Zhao, Huichan | Tsinghua University |
Co-Chair: Paik, Jamie | Ecole Polytechnique Federale De Lausanne |
|
16:30-18:00, Paper WeCT10-CC.1 | Add to My Program |
A Soft, Lightweight Flipping Robot with Versatile Motion Capabilities for Wall-Climbing Applications |
|
Chen, Rui | Chongqing University |
Tao, Xinrui | Chongqing University |
Cao, Changyong (Chase) | Case Western Reserve University |
Jiang, Pei | Chongqing University |
Luo, Jun | Chongqing University |
Sun, Yu | University of Toronto |
Keywords: Soft Robot Applications, Soft Robot Materials and Design, Climbing Robots, Electroadhesion
Abstract: Soft wall-climbing robots have been limited in their ability to perform complex locomotion in diverse environments due to their structure and weight. Thus far, soft wall-climbing robots with integrated functions that can locomote in complex 3D environments are yet to be developed. This study addresses this challenge by presenting a lightweight (2.57 g) soft wall-climbing robot with integrated linear, turning, and transitioning motion capabilities. The soft robot employs three pneumatic bending actuators and two adaptive electroadhesion (EA) pads, which enable it to flip forward, transition between two walls, turn in two directions, and adhere to various surfaces. Different motion and control strategies are proposed based on a theoretical model. The experimental results demonstrate that the robot can move at an average speed of 3.85 mm/s on horizontal, vertical, and inverted walls and make transitions between walls with different pinch angles within 180°. Additionally, the soft robot can carry a miniature camera on vertical walls to perform detection and surveillance tasks. This work provides a reliable structure and control strategy to enhance the multifunctionality of soft wall-climbing robots.
|
|
16:30-18:00, Paper WeCT10-CC.2 | Add to My Program |
Tetraflex: A Multigait Soft Robot for Object Transportation in Confined Environments |
|
Wharton, Peter | University of Bristol |
You, Tsam Lung | University of Bristol |
Jenkinson, George | University of Bristol |
Diteesawat, Richard Suphapol | University of Bristol |
Le, Nguyen Hao | University of Bristol |
Hall, Edith-Clare | University of Bristol |
Garrad, Martin | University of Bristol |
Conn, Andrew | University of Bristol |
Rossiter, Jonathan | University of Bristol |
Keywords: Soft Robot Applications, Soft Robot Materials and Design, Search and Rescue Robots
Abstract: Unstructured environments call for versatile robots with adaptable morphology that can perform multiple goal-directed actions including locomotion in confined spaces, environmental mapping, object retrieval and object manipulation. In response to these challenges, we present the Polyflex design concept for fabrication of modular, soft truss robots and demonstrate its varied capabilities in a tetrahedral robot (Tetraflex). Tetraflex is composed of six pneumatically actuated bellows joined at four points by rigid nodes. By extending or contracting the bellows, Tetraflex is capable of large size and shape change, and rolling, crawling and bounding gaits. Furthermore, Tetraflex is able to roll onto and engulf objects then subsequently transport them with the crawling gait. The rolling gait discretises Tetraflex’s locomotion into predictable steps on a triangular grid, simplifying odometry and allowing the use of path planning to attain a desired position. The size of rolling step can be changed at any time by dynamically varying the size of the robot. The crawling and bounding gaits enable Tetraflex to move in smaller incremental steps or through narrow passages (80 mm wide). The maximum speed was attained with a bounding locomotion gait at 19.6 mm/s (0.15 body lengths per second, or BL/s). Rolling locomotion attained between 15.6 and 19.4 mm/s (0.12-0.15 BL/s), and crawling 7.8 mm/s (0.06 BL/s). The rolling gait was the most accurate gait, achieving 2.3% linear deviation.
|
|
16:30-18:00, Paper WeCT10-CC.3 | Add to My Program |
A Strong Underwater Soft Manipulator with Planarly-Bundled Actuators and Accurate Position Control |
|
Tang, Kailuan | Harbin Institute of Technology |
Lu, Chenghua | University of Bristol |
Chen, Yishan | Southern University of Science and Technology |
Xiao, Yin | Southern University of Science and Technology |
Wu, Shijian | Southern University of Science and Technology |
Tang, Shaowu | Southern University of Science and Technology |
Wang, Hexiang | The University of Sydney |
Zhang, Binbin | Southern University of Science and Technology |
Shen, Zhong | The University of Hong Kong |
Yi, Juan | Southern University of Science and Technology |
Liu, Sicong | Southern University of Science and Technology |
Wang, Zheng | Southern University of Science and Technology |
Keywords: Soft Robot Applications, Soft Sensors and Actuators, Hydraulic/Pneumatic Actuators
Abstract: Soft robotic manipulators have inherent advantages in underwater applications, as they generate motion by deforming seamless muscles rather than using rotational joints or sliding cylinders, and they offer excellent passive adaptability. However, limited by insufficient structural stiffness, achieving high payload and positioning accuracy remains challenging in existing soft manipulator designs. In this work, we propose an innovative approach to underwater soft manipulator design: 1) by constraining high-power optimized actuators with densely spaced lateral supporting plates, we significantly enhance structural stiffness and drastically improve model accuracy; 2) paired with a novel flow-controllable open-circuit hydraulic actuation, we keep the manipulator smoothly operated and free of depth compensation; 3) as a result, the manipulator can be modelled kinematically in a simplified way for position control. The entire workflow from mechanical design to actuation and control is presented. A prototype soft manipulator was developed to validate the proposed design experimentally.
|
|
16:30-18:00, Paper WeCT10-CC.4 | Add to My Program |
Learning-Based Object Recognition Via a Eutectogel Electronic Skin Enabled Soft Robotic Gripper |
|
Deng, Mo | University of Science and Technology of China |
Fan, Fengya | University of Science and Technology of China |
Wei, Xi | University of Science and Technology of China |
Keywords: Soft Robot Applications, Soft Sensors and Actuators, Modeling, Control, and Learning for Soft Robots
Abstract: Compared to traditional rigidly structured robots, soft robots, which are usually made of soft materials or follow continuous movement patterns, have attracted extensive attention due to their unique features, such as high adaptability to various unstructured environments and safe interaction with living beings through a deformable interface. However, mechanical and morphological requirements limit the design and implementation of compatible sensing modules, which restricts the further development of robotic functionality. Here, we designed a flexible soft sensing Wire with a piezoresistive Eutectogel packed in an Ecoflex tube (WEE), which is sensitive, stable, and easily manipulated. The wire and its array provide the perception function of the soft gripper and act as an Electronic skin (E-skin) to acquire information from grasped objects. With the built-in E-skin, the gripper achieved object recognition at an accuracy of 93.78% for standard geometric objects in 9 categories based on a machine learning model. In addition, our design successfully demonstrated its application in fruit sorting, which proves its robustness and versatility. The proposed WEE-based E-skin can be easily applied to other soft robots with facile integration and further expedites advanced functionalization in robot-object interaction.
|
|
16:30-18:00, Paper WeCT10-CC.5 | Add to My Program |
Design and Validation of Slender Extensible Continuum Robot for Solar Wing Re-Unfolding in Aerospace |
|
Wang, Pengyuan | Harbin Institute of Technology |
Zheng, Zheng | Yangtze River Delta HIT Robot Technology Research Institute |
Sun, Jiazhen | Harbin Institute of Technology |
Liu, Yuqiang | Beijing Institute of Sapcecraft System Engineering |
He, Zongbo | The Beijing Institute of Spacecraft System Engineering |
Xing, Zhiguang | Harbin Institute of Technology, Weihai |
Zhao, Jianwen | Harbin Institute of Technology, Weihai |
Keywords: Soft Robot Applications, Space Robotics and Automation, Soft Robot Materials and Design
Abstract: Deployment of an orbiting satellite's solar array wing can fail when the connector loses power due to uncertain loads, such as high temperature or vibration, during spacecraft launch. There is currently a lack of suitable unlocking solutions for solar wing re-unfolding. This paper proposes a solution in which an extensible continuum robot (ECR) carrying an unlocking device enters the gap between the satellite and the solar wing and re-unlocks the solar wing. This solution effectively leverages the ECR's advantages of collision buffering and adaptable maneuverability within a confined space. For the proposed solution, the designed ECR, with a two-segment helical spring structure, features scalability, hollowness, light weight, and a large length-to-diameter ratio. To perform the critical unlocking task, an end effector that loosens and unplugs the aerospace communication connector is designed with its drive device located away from the end effector to reduce the inertia of the manipulator. Information from cameras and force sensors is used to estimate the extent of task execution. We establish an experimental setup to simulate the unlocking process. The results validate that the ECR successfully accesses the gap (65 mm) and accomplishes the unlocking task. The ECR has great application potential for on-orbit service.
|
|
16:30-18:00, Paper WeCT10-CC.6 | Add to My Program |
Bio-Inspired Pupal-Mode Actuator with Ultra-Crossing Capability for Soft Robots |
|
Wang, Zhenxing | Chinese Academy of Sciences |
He, Xiao | Shenyang Institute of Automation, Chinese Academy of Sciences |
Zhang, Yuhang | Shenyang Institute of Automation |
Zhang, Cheng | Shenyang Institute of Automation, Chinese Academy of Sciences |
Sun, Lei | The First Affiliated Hospital of China Medical University |
Wang, Zhidong | Chiba Institute of Technology |
Xu, Shun | The First Affiliated Hospital of China Medical University |
Liu, Hao | Chinese Academy of Sciences |
Keywords: Soft Robot Applications, Surgical Robotics: Laparoscopy, Biologically-Inspired Robots
Abstract: Robot-assisted Natural Orifice Transluminal Endoscopic Surgery (NOTES) represents a paradigm shift in surgical practice, significantly minimizing patient morbidity. However, the variability of inner diameter and the inter-luminal crossings within the luminal tracts pose challenges for effective robotic intervention. Inspired by the motion of the chrysalis during its transformation, we designed an innovative pupal-mode actuator for NOTES robots. Through the manipulation of its internal air chambers, this actuator is capable of replicating wriggle-like movements. Through experimental analysis, we acquired the constitutive characteristics of this actuator. Subsequently, an innovative gastric endoscopy robot was developed based on the actuator and tested in a phantom. The results of the task simulations substantiate that the pupal-mode actuator has the capability to reduce resistance and enhance the safety of endoscopic intervention.
|
|
16:30-18:00, Paper WeCT10-CC.7 | Add to My Program |
Crawling Soft Robot Exploiting Wheel-Legs and Multimodal Locomotion for High Terrestrial Maneuverability |
|
Ai, Xinpei | Hanyang University |
Yue, Hengmao | Hanyang University |
Wang, Wei | Hanyang University |
Keywords: Soft Robot Applications, Wheeled Robots, Search and Rescue Robots, Multimodal Locomotion
Abstract: How to efficiently traverse complex terrain remains an unresolved challenge for mobile soft robots, because their deformable bodies limit the magnitude of the forces they can exert on the environment. To achieve high maneuverability, this study demonstrates a pneumatic soft crawling robot equipped with wheel-legs capable of multimodal locomotion to negotiate various obstacles. The soft robot consists of a pneumatic soft actuator capable of multiple modes of bending deformation as the body and four identical multi-spoked wheel-legs with passive unidirectional forward rotation as limbs. The synergy of the body actuator and wheel-legs enables the robot to achieve multiple crawling gaits, including gecko-like crawling and inchworm-like crawling. A single gait or a combination of multiple gaits, as well as shape-morphing of the body, enables the robot to navigate obstacles as diverse as confined spaces, inclined surfaces, gaps, and stairs, or to avoid obstacles by circumventing them. Our study substantially improves the maneuverability of pneumatic soft crawling robots, thereby providing new routes for the potential applications of soft robots in obstacle-filled scenarios, including search and rescue, exploration, and inspection.
|
|
16:30-18:00, Paper WeCT10-CC.8 | Add to My Program |
Compliant Robotic Gripper with Integrated Ripeness Sensing for Blackberry Harvesting |
|
De, Arvyn | Georgia Institute of Technology |
Kumar, Divyam | Georgia Institute of Technology |
Kwuan, Ian | Georgia Institute of Technology |
Qiu, Alex | Georgia Institute of Technology |
Hu, Ai-Ping | Georgia Tech Research Institute |
Keywords: Agricultural Automation, Soft Robot Applications
Abstract: Global blackberry demand has been surging due to their antioxidant and nutritional value in a traditional diet. However, blackberries have extreme fragility (resulting in up to 85% of harvest batches sustaining damage) and near-ripe and ripe blackberries are difficult to distinguish in normal lighting conditions. These challenges in maintaining the blackberry supply motivate the development of an autonomous robotic solution to harvest fully ripe blackberries with minimal damage. The present paper details the mechanical design, methodology, analysis, and experimental results of a compliant robotic gripper created for this purpose. The gripper has a compact form factor and retractable fingers with specialized TPU finger pads for gentle picks, a near-infrared (NIR) reflectance-based probe for detecting full ripeness and a standardized harvesting sequence for effectively picking berries. In an outdoor harvesting experiment, the gripper attempted picking 26 berries without ripeness sensing, with 65.4% (17) being successfully picked and 38.5% (10) sustaining damage. The movements of the robot arm in the harvesting sequence were accordingly adjusted and finalized for following in-lab experiments, in which the gripper was also outfitted with ripeness sensing. Out of 40 berries, 62.5% (25) were successfully picked, with 0% of them sustaining damage. The ripeness probe classified 56 ripe and 11 near-ripe berries, with 89% (50) of the ripe and 64% (7) of the near-ripe berries being correctly classified. In a second in-lab experiment, 16 of 20 berries were successfully picked, with 2 sustaining damage.
|
|
WeCT11-CC Oral Session, CC-502 |
Add to My Program |
Semantic Scene Understanding III |
|
|
Chair: Hauser, Kris | University of Illinois at Urbana-Champaign |
|
16:30-18:00, Paper WeCT11-CC.1 | Add to My Program |
SG-RoadSeg: End-To-End Collision-Free Space Detection Sharing Encoder Representations Jointly Learned Via Unsupervised Deep Stereo |
|
Wu, Zhiyuan | Tongji University |
Li, Jiaqi | Tongji University |
Feng, Yi | Tongji University |
Liu, Chengju | Tongji University |
Ye, Wei | Tongji University |
Chen, Qijun | Tongji University |
Fan, Rui | Tongji University |
Keywords: Semantic Scene Understanding, Robot Safety, Collision Avoidance
Abstract: Collision-free space detection is of utmost importance for autonomous robot perception and navigation. State-of-the-art (SoTA) approaches generally extract features from RGB images and an additional source or modality of 3-D information, such as depth or disparity images, using a pair of independent encoders. The extracted features are subsequently fused and decoded to yield semantic predictions of collision-free spaces. Such feature-fusion approaches become infeasible in scenarios where the sensor for 3-D information acquisition is unavailable, or when multi-sensor calibration falls short of the necessary precision. To overcome these limitations, this paper introduces a novel end-to-end collision-free space detection network, referred to as SG-RoadSeg, built upon our previous work SNE-RoadSeg. A key contribution of this paper is a strategy for sharing encoder representations that are co-learned through both semantic segmentation and unsupervised stereo matching tasks, enabling the features extracted from RGB images to contain both semantic and spatial geometric information. The unsupervised deep stereo serves as an auxiliary functionality, capable of generating accurate disparity maps that can be used by other perception tasks that require depth-related data. Comprehensive experimental results on the KITTI road and semantics datasets validate the effectiveness of our proposed architecture and encoder representation sharing strategy. SG-RoadSeg also demonstrates superior performance over other SoTA collision-free space detection approaches. Our source code, demo video, and supplement are publicly available at mias.group/SG-RoadSeg.
|
|
16:30-18:00, Paper WeCT11-CC.2 | Add to My Program |
Robust Few-Shot 3D Point Cloud Scene Segmentation |
|
Huang, Hao | New York University |
Yuan, Shuaihang | New York University |
Wen, Congcong | New York University Abu Dhabi |
Hao, Yu | New York University |
Fang, Yi | New York University |
Keywords: Semantic Scene Understanding, Representation Learning, Deep Learning for Visual Perception
Abstract: In the domain of 3D point cloud scene semantic segmentation, most methods adopt a fully supervised framework. Such paradigms depend heavily on extensive labeled datasets, which are difficult to acquire, and are unable to segment novel classes, especially when the training data are contaminated by noisy samples. To address these limitations, this study introduces a novel meta-learning-based few-shot segmentation approach to robustly segment 3D point cloud scenes. Specifically, we first build a multi-prototype graph and then suppress noisy samples based on the graph structure. A subgraph voting scheme is proposed to conduct transductive semi-supervised learning to propagate labels. To optimize the graph structure and learn discriminative prototype features, we design a triplet contrastive loss to increase the compactness of the graph. We evaluate our method on two widely used 3D point cloud scene segmentation benchmarks under specific few-shot segmentation (i.e., 2/3-way 5-shot) settings. Experimental results demonstrate improvement over the compared baseline methods, illustrating the robustness of our method in few-shot 3D scene segmentation against noisy samples.
|
|
16:30-18:00, Paper WeCT11-CC.3 | Add to My Program |
Radar Instance Transformer: Reliable Moving Instance Segmentation in Sparse Radar Point Clouds |
|
Zeller, Matthias | CARIAD SE |
Sandhu, Vardeep Singh | University of Bonn, CARIAD |
Mersch, Benedikt | University of Bonn |
Behley, Jens | University of Bonn |
Heidingsfeld, Michael | CARIAD SE |
Stachniss, Cyrill | University of Bonn |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning in Robotics and Automation, Radar Perception
Abstract: The perception of moving objects is crucial for autonomous robots performing collision avoidance in dynamic environments. LiDARs and cameras tremendously enhance scene interpretation but do not provide direct motion information and face limitations under adverse weather. Radar sensors overcome these limitations and provide Doppler velocities, delivering direct information on dynamic objects. In this paper, we address the problem of moving instance segmentation in radar point clouds to enhance scene interpretation for safety-critical tasks. Our Radar Instance Transformer enriches the current radar scan with temporal information without passing aggregated scans through a neural network. We propose a full-resolution backbone to prevent information loss in sparse point cloud processing. Our instance transformer head incorporates essential information to enhance segmentation and enables reliable, class-agnostic instance assignments. In sum, our approach shows superior performance on the new moving instance segmentation benchmarks, including diverse environments, and provides model-agnostic modules to enhance scene interpretation. The benchmark is based on the RadarScenes dataset and is available at https://doi.org/10.5281/zenodo.10203864.
|
|
16:30-18:00, Paper WeCT11-CC.4 | Add to My Program |
On the Overconfidence Problem in Semantic 3D Mapping |
|
Correia Marques, Joao Marcos | University of Illinois at Urbana-Champaign |
Zhai, Albert | UIUC |
Wang, Shenlong | University of Illinois at Urbana-Champaign |
Hauser, Kris | University of Illinois at Urbana-Champaign |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Domestic Robotics
Abstract: Semantic 3D mapping, the process of fusing depth and image segmentation information between multiple views to build 3D maps annotated with object classes in real-time, is a recent topic of interest. This paper highlights the fusion overconfidence problem, in which conventional mapping methods assign high confidence to the entire map even when they are incorrect, leading to miscalibrated outputs. Several methods to improve uncertainty calibration at different stages in the fusion pipeline are presented and compared on the ScanNet dataset. We show that the most widely used Bayesian fusion strategy is among the worst calibrated, and propose a learned pipeline that combines fusion and calibration, GLFS, which achieves simultaneously higher accuracy and 3D map calibration while retaining real-time capability and adding only 525 learned parameters to the pipeline. We further illustrate the importance of map calibration on a downstream task by showing that incorporating proper semantic fusion to an indoor object search agent improves its success rates.
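For context, the snippet below sketches the conventional independence-assuming Bayesian fusion that the paper identifies as prone to overconfidence, together with temperature scaling as one generic post-hoc calibration step; it is not the proposed GLFS pipeline, and the interfaces and constants are illustrative assumptions.

import numpy as np

def bayesian_fuse(log_probs):
    """Conventional per-voxel Bayesian fusion under an independence assumption:
    sum per-view class log-probabilities and renormalize.
    log_probs: (V, C) for V views and C classes."""
    fused = log_probs.sum(axis=0)
    fused -= fused.max()                        # numerical stability
    p = np.exp(fused)
    return p / p.sum()

def temperature_scale(logits, T):
    """Generic post-hoc calibration (temperature scaling); T > 1 softens the
    distribution. In practice T would be fit on held-out data by minimizing NLL."""
    z = logits / T
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()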
|
|
16:30-18:00, Paper WeCT11-CC.5 | Add to My Program |
Complementing Onboard Sensors with Satellite Maps: A New Perspective for HD Map Construction |
|
Gao, Wenjie | Xi'an Jiaotong University |
Fu, Jiawei | Institute of Artificial Intelligence and Robotics |
Shen, Yanqing | Xi'an Jiaotong University |
Jing, Haodong | Xi'an Jiaotong University |
Chen, Shitao | Xi'an Jiaotong University |
Zheng, Nanning | Xi'an Jiaotong University |
Keywords: Semantic Scene Understanding, Sensor Fusion, Deep Learning for Visual Perception
Abstract: High-definition (HD) maps play a crucial role in autonomous driving systems. Recent methods have attempted to construct HD maps in real-time using vehicle onboard sensors. Due to the inherent limitations of onboard sensors, which include sensitivity to detection range and susceptibility to occlusion by nearby vehicles, the performance of these methods significantly declines in complex scenarios and long-range detection tasks. In this paper, we explore a new perspective that boosts HD map construction through the use of satellite maps to complement onboard sensors. We initially generate the satellite map tiles for each sample in nuScenes and release a complementary dataset for further research. To enable better integration of satellite maps with existing methods, we propose a hierarchical fusion module, which includes feature-level fusion and BEV-level fusion. The feature-level fusion, composed of a mask generator and a masked cross-attention mechanism, is used to refine the features from onboard sensors. The BEV-level fusion mitigates the coordinate differences between features obtained from onboard sensors and satellite maps through an alignment module. The experimental results on the augmented nuScenes showcase the seamless integration of our module into three existing HD map construction methods. The satellite maps and our proposed module notably enhance their performance in both HD map semantic segmentation and instance detection tasks. Our code will be available at https://github.com/xjtu-cs-gao/SatforHDMap.
|
|
16:30-18:00, Paper WeCT11-CC.6 | Add to My Program |
Complementary Random Masking for RGB-Thermal Semantic Segmentation |
|
Shin, Ukcheol | CMU(Carnegie Mellon University) |
Lee, Kyunghyun | KAIST |
Kweon, In So | KAIST |
Oh, Jean | Carnegie Mellon University |
Keywords: Semantic Scene Understanding, Sensor Fusion, Visual Learning
Abstract: RGB-thermal semantic segmentation is one potential solution to achieve reliable semantic scene understanding in adverse weather and lighting conditions. However, the previous studies mostly focus on designing a multi-modal fusion module without consideration of the nature of multi-modality inputs. Therefore, the networks easily become over-reliant on a single modality, making it difficult to learn complementary and meaningful representations for each modality. This paper proposes 1) a complementary random masking strategy of RGB-T images and 2) self-distillation loss between clean and masked input modalities. The proposed masking strategy prevents over-reliance on a single modality. It also improves the accuracy and robustness of the neural network by forcing the network to segment and classify objects even when one modality is partially available. Also, the proposed self-distillation loss encourages the network to extract complementary and meaningful representations from a single modality or complementary masked modalities. We achieve state-of-the-art performance over three RGB-T semantic segmentation benchmarks. Our source code is available at https://github.com/UkcheolShin/CRM_RGBTSeg.
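A minimal sketch of one way to realize complementary random patch masking of the two modalities is shown below; the patch size, mask ratio, and masking-by-multiplication convention are assumptions for illustration, not the paper's exact strategy.

import torch

def complementary_patch_masks(h, w, patch=16, mask_ratio=0.5, device="cpu"):
    """Sample a random patch mask for the RGB stream and use its complement for
    the thermal stream, so every patch stays visible to exactly one modality."""
    gh, gw = h // patch, w // patch
    keep_rgb = torch.rand(gh, gw, device=device) > mask_ratio   # True = keep patch
    keep_thr = ~keep_rgb                                        # complementary mask
    up = lambda m: m.repeat_interleave(patch, 0).repeat_interleave(patch, 1).float()
    return up(keep_rgb), up(keep_thr)

# Usage sketch: rgb_masked = rgb * mask_rgb[None]; thermal_masked = thermal * mask_thr[None]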
|
|
16:30-18:00, Paper WeCT11-CC.7 | Add to My Program |
Collaborative Dynamic 3D Scene Graphs for Automated Driving |
|
Greve, Elias | University of Freiburg |
Büchner, Martin | University of Freiburg |
Vödisch, Niclas | University of Freiburg |
Burgard, Wolfram | University of Technology Nuremberg |
Valada, Abhinav | University of Freiburg |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Mapping
Abstract: Maps have played an indispensable role in enabling safe and automated driving. Although there have been many advances on different fronts ranging from SLAM to semantics, building an actionable hierarchical semantic representation of urban dynamic scenes and processing information from multiple agents are still challenging problems. In this work, we present Collaborative URBan Scene Graphs (CURB-SG) that enable higher-order reasoning and efficient querying for many functions of automated driving. CURB-SG leverages panoptic LiDAR data from multiple agents to build large-scale maps using an effective graph-based collaborative SLAM approach that detects inter-agent loop closures. To semantically decompose the obtained 3D map, we build a lane graph from the paths of ego agents and their panoptic observations of other vehicles. Based on the connectivity of the lane graph, we segregate the environment into intersecting and non-intersecting road areas. Subsequently, we construct a multi-layered scene graph that includes lane information, the position of static landmarks and their assignment to certain map sections, other vehicles observed by the ego agents, and the pose graph from SLAM including 3D panoptic point clouds. We extensively evaluate CURB-SG in urban scenarios using a photorealistic simulator. We release our code at http://curb.cs.uni-freiburg.de.
|
|
16:30-18:00, Paper WeCT11-CC.8 | Add to My Program |
BroadBEV: Collaborative LiDAR-Camera Fusion for Broad-Sighted Bird's Eye View Map Construction |
|
Kim, Minsu | Korea Institute of Science and Technology |
Kim, Giseop | NAVER LABS |
Jin, Kyong Hwan | Korea University |
Choi, Sunwook | NAVER LABS Corp |
Keywords: Semantic Scene Understanding, Representation Learning, Sensor Fusion
Abstract: A recent sensor fusion in a Bird's Eye View (BEV) space has shown its utility in various tasks such as 3D detection, map segmentation, etc. However, the approach struggles with inaccurate camera BEV estimation, and a perception of distant areas due to the sparsity of LiDAR points. In this paper, we propose a BEV fusion (BroadBEV) that aims to enhance camera BEV estimation for broad perception in the pre-defined BEV range, while simultaneously improving the completion of LiDAR's sparsity in the entire BEV space. Toward that end, we devise Point-scattering that scatters LiDAR BEV distribution to camera depth distribution. The method boosts the learning of depth estimation of the camera branch and induces accurate location of dense camera features in BEV space. For an effective BEV fusion between the spatially synchronized features, we suggest ColFusion that applies self-attention weights of LiDAR and camera BEV features to each other. Our extensive experiments demonstrate that the suggested methods enable a broad BEV perception with remarkable performance gains.
|
|
16:30-18:00, Paper WeCT11-CC.9 | Add to My Program |
AGRNav: Efficient and Energy-Saving Autonomous Navigation for Air-Ground Robots in Occlusion-Prone Environments |
|
Wang, Junming | The University of Hong Kong |
Sun, Zekai | The University of Hong Kong |
Guan, Xiuxian | The University of Hong Kong |
Shen, Tianxiang | The University of Hong Kong |
Zhang, Zongyuan | The University of Hong Kong |
Duan, Tianyang | The University of Hong Kong |
Huang, Dong | The University of Hong Kong |
Zhao, Shixiong | Huawei Technologies Co., Ltd |
Cui, Heming | The University of Hong Kong |
Keywords: AI-Based Methods, Semantic Scene Understanding
Abstract: The exceptional mobility and long endurance of air-ground robots are raising interest in their usage to navigate complex environments (e.g., forests and large buildings). However, such environments often contain occluded and unknown regions, and without accurate prediction of unobserved obstacles, the movement of the air-ground robot often suffers a suboptimal trajectory under existing mapping-based and learning-based navigation methods. In this work, we present AGRNav, a novel framework designed to search for safe and energy-saving air-ground hybrid paths. AGRNav contains a lightweight semantic scene completion network (SCONet) with self-attention to enable accurate obstacle predictions by capturing contextual information and occlusion area features. The framework subsequently employs a query-based method for low-latency updates of prediction results to the grid map. Finally, based on the updated map, the hierarchical path planner efficiently searches for energy-saving paths for navigation. We validate AGRNav's performance through benchmarks in both simulated and real-world environments, demonstrating its superiority over classical and state-of-the-art methods. The open-source code is available at https://github.com/jmwang0117/AGRNav.
|
|
WeCT12-CC Oral Session, CC-503 |
Add to My Program |
Deep Learning Methods |
|
|
Co-Chair: Liu, Kangcheng | ETH Zurich |
|
16:30-18:00, Paper WeCT12-CC.1 | Add to My Program |
ATPPNet: Attention Based Temporal Point Cloud Prediction Network |
|
Pal, Kaustab | International Institute of Information Technology, Hyderabad |
Sharma, Aditya | Robotics Research Center, IIIT Hyderabad |
Sharma, Avinash | International Institute of Information Technology, |
Krishna, Madhava | IIIT Hyderabad |
Keywords: Deep Learning for Visual Perception, Computer Vision for Transportation, Deep Learning Methods
Abstract: Point cloud prediction is an important yet challenging task in the field of autonomous driving. The goal is to predict future point cloud sequences that maintain object structures while accurately representing their temporal motion. These predicted point clouds help in other subsequent tasks like object trajectory estimation for collision avoidance or estimating locations with the least odometry drift. In this work, we present ATPPNet, a novel architecture that predicts future point cloud sequences given a sequence of previous time step point clouds obtained with a LiDAR sensor. ATPPNet leverages Conv-LSTM along with channel-wise and spatial attention, dually complemented by a 3D-CNN branch, for extracting an enhanced spatio-temporal context to recover high-fidelity predictions of future point clouds. We conduct extensive experiments on publicly available datasets and report impressive performance outperforming the existing methods. We also conduct a thorough ablative study of the proposed architecture and provide an application study that highlights the potential of our model for tasks like odometry estimation.
|
|
16:30-18:00, Paper WeCT12-CC.2 | Add to My Program |
Transformer-CNN Cohort: Semi-Supervised Semantic Segmentation by the Best of Both Students |
|
Zheng, Xu | The Hong Kong University of Science and Technology |
Luo, Yunhao | Brown University |
Fu, Chong | Northeastern University |
Liu, Kangcheng | ETH Zurich |
Wang, Lin | HKUST |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, Computer Vision for Automation
Abstract: The popular methods for semi-supervised semantic segmentation mostly adopt a unitary network model using convolutional neural networks (CNNs) and enforce consistency of the model’s predictions over perturbations applied to the inputs or model. However, such a learning paradigm suffers from two critical limitations: a) learning the discriminative features for the unlabeled data; b) learning both global and local information from the whole image. In this paper, we propose a novel Semi-supervised Learning (SSL) approach, called Transformer-CNN Cohort (TCC), that consists of two students with one based on the vision transformer (ViT) and the other based on the CNN. Our method subtly incorporates the multi-level consistency regularization on the predictions and the heterogeneous feature spaces via pseudo labeling for the unlabeled data. First, as the inputs of the ViT student are image patches, the feature maps extracted encode crucial class-wise statistics. To this end, we propose class-aware feature consistency distillation (CFCD) that first leverages the outputs of each student as the pseudo labels and generates class-aware feature (CF) maps for knowledge transfer between the two students. Second, as the ViT student has more uniform representations for all layers, we propose consistency-aware cross distillation (CCD) to transfer knowledge between the pixel-wise predictions from the cohort. We validate the TCC framework on Cityscapes and Pascal VOC 2012 datasets, which outperforms existing SSL methods by a large margin. Project page: https://vlislab22.github.io/TCC/
|
|
16:30-18:00, Paper WeCT12-CC.3 | Add to My Program |
CrackNex: A Few-Shot Low-Light Crack Segmentation Model Based on Retinex Theory for UAV Inspections |
|
Yao, Zhen | Lehigh University |
Xu, Jiawei | Lehigh University |
Hou, Shuhang | Lehigh University |
Chuah, Mooi Choo | Lehigh University |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, Data Sets for Robotic Vision
Abstract: Routine visual inspections of concrete structures are imperative for upholding the safety and integrity of critical infrastructure. Such visual inspections sometimes happen under low-light conditions, e.g., when checking bridge health. Crack segmentation under such conditions is challenging due to the poor contrast between cracks and their surroundings. However, most deep learning methods are designed for well-illuminated crack images, and hence their performance drops dramatically in low-light scenes. In addition, conventional approaches require many annotated low-light crack images, which are time-consuming to collect. In this paper, we address these challenges by proposing CrackNex, a framework that utilizes reflectance information based on Retinex theory to help the model learn a unified illumination-invariant representation. Furthermore, we utilize few-shot segmentation to address the scarcity of training data. In CrackNex, both a support prototype and a reflectance prototype are extracted from the support set. Then, a prototype fusion module is designed to integrate the features from both prototypes. CrackNex outperforms the SOTA methods on multiple datasets. Additionally, we present the first benchmark dataset, LCSD, for low-light crack segmentation. LCSD consists of 102 well-illuminated crack images and 41 low-light crack images. The dataset and code are available at https://github.com/zy1296/CrackNex.
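The prototype extraction that the abstract refers to can be sketched as masked average pooling over support features, with a simple average standing in for the paper's prototype fusion module; all names, sizes, and the fusion rule below are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def masked_avg_pool(feat, mask):
        # feat: (B, C, H, W) support features; mask: (B, 1, H', W') binary crack mask.
        mask = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
        return (feat * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1e-6)

    # Hypothetical features of the RGB support image and of its Retinex reflectance map.
    support_feat = torch.randn(1, 256, 32, 32)
    reflect_feat = torch.randn(1, 256, 32, 32)
    support_mask = torch.randint(0, 2, (1, 1, 128, 128)).float()

    proto_support = masked_avg_pool(support_feat, support_mask)   # (1, 256)
    proto_reflect = masked_avg_pool(reflect_feat, support_mask)   # (1, 256)
    proto_fused = 0.5 * (proto_support + proto_reflect)           # stand-in for the fusion module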
|
|
16:30-18:00, Paper WeCT12-CC.4 | Add to My Program |
FBPT: A Fully Binary Point Transformer |
|
Hou, Zhixing | Nanjing University of Science and Technology |
Shang, Yuzhang | Illinois Institute of Technology |
Yan, Yan | Illinois Institute of Technology |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, Recognition
Abstract: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) network, which has the potential to be widely applied and extended in the fields of robotics and mobile devices. By compressing the weights and activations of a 32-bit full-precision network to 1-bit binary values, the proposed binary point cloud Transformer significantly reduces the storage footprint and computational resource requirements of neural network models for point cloud processing tasks, compared to full-precision point cloud networks. However, achieving a fully binary point cloud Transformer, where all parts except the task-specific modules are binary, poses challenges and bottlenecks in quantizing the activations of Q, K, V and of the self-attention in the attention module, as they do not adhere to simple probability distributions and can vary with the input data. Furthermore, in our network, the binary attention module suffers a degradation of self-attention due to the uniform distribution that arises after the softmax operation. The primary focus of this paper is on addressing the performance degradation caused by the use of binary point cloud Transformer modules. We propose a novel binarization mechanism called dynamic-static hybridization, which combines static binarization of the overall network model with fine-grained dynamic binarization of data-sensitive components. Furthermore, we use a novel hierarchical training scheme to obtain the optimal model and binarization parameters. These improvements allow the proposed binarization method to outperform binarization methods developed for convolutional neural networks when used in point cloud Transformer structures. To demonstrate the superiority of our algorithm, we conducted experiments on two different tasks: point cloud classification and place recognition.
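The generic building block behind such binary networks, 1-bit binarization with a straight-through estimator, can be sketched in a few lines of PyTorch; this illustrates the standard technique only and does not reproduce the paper's dynamic-static hybridization or hierarchical training.

    import torch

    class BinarizeSTE(torch.autograd.Function):
        # Forward: sign(x), i.e. values in {-1, +1} almost everywhere.
        # Backward: pass gradients through for inputs inside [-1, 1] (straight-through estimator).
        @staticmethod
        def forward(ctx, x):
            ctx.save_for_backward(x)
            return torch.sign(x)

        @staticmethod
        def backward(ctx, grad_out):
            (x,) = ctx.saved_tensors
            return grad_out * (x.abs() <= 1).to(grad_out.dtype)

    x = torch.randn(4, 8, requires_grad=True)
    y = BinarizeSTE.apply(x)        # binary activations
    y.sum().backward()              # gradients flow via the straight-through estimator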
|
|
16:30-18:00, Paper WeCT12-CC.5 | Add to My Program |
Diffusion-Based Point Cloud Super-Resolution for mmWave Radar Data |
|
Luan, Kai | Intelligence Science and Technology, National University of Defense Technology |
Shi, Chenghao | NUDT |
Wang, Neng | National University of Defense Technology |
Cheng, Yuwei | Tsinghua University |
Lu, Huimin | National University of Defense Technology |
Chen, Xieyuanli | National University of Defense Technology |
Keywords: Deep Learning for Visual Perception, Localization, Deep Learning Methods
Abstract: The millimeter-wave radar sensor maintains stable performance under adverse environmental conditions, making it a promising solution for all-weather perception tasks, such as outdoor mobile robotics. However, radar point clouds are relatively sparse and contain a large number of ghost points, which greatly limits the development of mmWave radar technology. In this paper, we propose a novel point cloud super-resolution approach for 3D mmWave radar data, named Radar-diffusion. Our approach employs a diffusion model defined by mean-reverting stochastic differential equations (SDEs). Using our proposed objective function with supervision from corresponding LiDAR point clouds, our approach efficiently handles radar ghost points and enhances sparse mmWave radar point clouds into dense LiDAR-like point clouds. We evaluate our approach on two different datasets, and the experimental results show that our method outperforms the state-of-the-art baseline methods on 3D radar super-resolution tasks. Furthermore, we demonstrate that our enhanced radar point clouds are capable of supporting downstream radar point-based registration tasks.
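The mean-reverting SDE that defines the diffusion process can be illustrated with a small numpy sketch of its forward (degradation) step, dx = theta * (mu - x) dt + sigma dW, simulated with Euler-Maruyama; the parameter values and the point-set stand-ins are illustrative assumptions, not the paper's formulation.

    import numpy as np

    def mean_reverting_forward(x0, mu, theta=1.5, sigma=0.3, dt=1e-2, steps=100):
        # Euler-Maruyama simulation of dx = theta * (mu - x) dt + sigma dW.
        x = x0.copy()
        for _ in range(steps):
            x += theta * (mu - x) * dt + sigma * np.sqrt(dt) * np.random.randn(*x.shape)
        return x

    sparse_radar = np.random.randn(256, 3)     # stand-in for a radar point set
    prior_mean = np.zeros_like(sparse_radar)   # mean state the process reverts to
    degraded = mean_reverting_forward(sparse_radar, prior_mean)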
|
|
16:30-18:00, Paper WeCT12-CC.6 | Add to My Program |
Visual Noun Modifiers: The Problem of Binding Visual and Linguistic Cues |
|
Faridghasemnia, Mohamadreza | Orebro University |
Renoux, Jennifer | Örebro University |
Saffiotti, Alessandro | Orebro University |
Keywords: Deep Learning for Visual Perception, Recognition, Deep Learning Methods
Abstract: In many robotic applications, especially those involving humans and the environment, linguistic and visual information must be processed jointly and bound together. Existing works either encode the image or the language into a subsymbolic space, like the CLIP model, or create a symbolic space of extracted information, like the object detection models. In this paper, we propose to describe images by nouns and modifiers and introduce a new embedded binding space where the linguistic and visual cues can effectively be bound. We investigate how state-of-the-art models perform in recognizing nouns and modifiers from images, and propose our method by introducing a dataset and CLIP-like recognition techniques based on transfer learning and metric learning. We show real-world experiments that demonstrate the practical applicability of our approach to robotics applications. Our results indicate that our method can surpass the state-of-the-art in recognizing nouns and modifiers from images. Interestingly, our method exhibits a language characteristic related to context sensitivity.
|
|
16:30-18:00, Paper WeCT12-CC.7 | Add to My Program |
Cycle-Correspondence Loss: Learning Dense View-Invariant Visual Features from Unlabeled and Unordered RGB Images |
|
Adrian, David Benjamin | Bosch Corporate Research & Ulm University |
Kupcsik, Andras | Bosch Center for Artificial Intelligence |
Spies, Markus | Bosch Center for Artificial Intelligence |
Neumann, Heiko | Ulm University |
Keywords: Deep Learning for Visual Perception, Representation Learning, Deep Learning Methods
Abstract: Robot manipulation relying on learned object-centric descriptors has become popular in recent years. Visual descriptors can easily describe manipulation task objectives, they can be learned efficiently using self-supervision, and they can encode actuated and even non-rigid objects. However, learning robust, view-invariant keypoints in a self-supervised approach requires a meticulous data collection approach involving precise calibration and expert supervision. In this paper we introduce Cycle-Correspondence Loss (CCL) for view-invariant dense descriptor learning, which adopts the concept of cycle-consistency, enabling a simple data collection pipeline and training on unpaired RGB camera views. The key idea is to autonomously detect valid pixel correspondences by attempting to use a prediction over a new image to predict the original pixel in the original image, while scaling error terms based on the estimated confidence. Our evaluation shows that we outperform other self-supervised RGB-only methods and approach the performance of supervised methods, both with respect to keypoint tracking and for a robot grasping downstream task.
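The cycle idea can be sketched as follows: a pixel descriptor from image A is softly matched into image B and back, the round trip should return to the original pixel, and ambiguous matches are down-weighted by a confidence score. The PyTorch sketch below illustrates cycle-consistent dense matching under these assumptions and is not the authors' exact loss.

    import torch
    import torch.nn.functional as F

    def cycle_correspondence_loss(desc_a, desc_b, temperature=0.1):
        # desc_a, desc_b: (N, D) L2-normalized descriptors of N pixels in images A and B.
        p_ab = (desc_a @ desc_b.t() / temperature).softmax(dim=1)   # soft matches A -> B
        p_ba = (desc_b @ desc_a.t() / temperature).softmax(dim=1)   # soft matches B -> A
        p_cycle = p_ab @ p_ba                                       # A -> B -> A round trip
        target = torch.arange(desc_a.shape[0])                      # each pixel should return home
        confidence = p_cycle.max(dim=1).values.detach()             # down-weight ambiguous matches
        nll = F.nll_loss(p_cycle.clamp_min(1e-8).log(), target, reduction="none")
        return (confidence * nll).mean()

    desc_a = F.normalize(torch.randn(512, 64), dim=1)
    desc_b = F.normalize(torch.randn(512, 64), dim=1)
    print(cycle_correspondence_loss(desc_a, desc_b))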
|
|
16:30-18:00, Paper WeCT12-CC.8 | Add to My Program |
End-To-End RGB-D SLAM with Multi-MLPs Dense Neural Implicit Representations |
|
Li, MingRui | Dalian University of Technology |
He, Jiaming | Dalian University of Technology |
Wang, Yangyang | Dalian Maritime University |
Wang, Hongyu | Dalian University of Technology |
Keywords: Deep Learning for Visual Perception, SLAM, Deep Learning Methods
Abstract: Accurate and generalizable dense 3D reconstruction systems have attracted much attention. However, existing 3D dense reconstruction systems are limited by their requirement for pre-training, and there is a demand for improved reconstruction of feature details. We propose an end-to-end 3D reconstruction system that achieves fine scene reconstruction without prior information by utilizing neural implicit encoding. Our system achieves this goal through improved multi-MLP decoders (MLM) and an effective keyframe selection strategy. Experiments conducted on the commonly used Replica and TUM RGB-D datasets demonstrate that our approach can compete with widely adopted NeRF-based SLAM methods in terms of 3D reconstruction accuracy. Moreover, our approach shows a 40.8% improvement in accuracy (excluding Completion Ratio) compared to NICE-SLAM, which does not use prior information.
|
|
16:30-18:00, Paper WeCT12-CC.9 | Add to My Program |
Closing the Visual Sim-To-Real Gap with Object-Composable NeRFs |
|
Mishra, Nikhil | UC Berkeley, Covariant.ai |
Sieb, Maximilian | CovariantAI |
Abbeel, Pieter | UC Berkeley |
Chen, Xi | Embodied Intelligence, UC Berkeley |
Keywords: Deep Learning for Visual Perception, Transfer Learning, RGB-D Perception
Abstract: Deep learning methods for perception are the cornerstone of many robotic systems. Despite their potential for impressive performance, obtaining real-world training data is expensive, and can be impractically difficult for some tasks. Sim-to-real transfer with domain randomization offers a potential workaround, but often requires extensive manual tuning and results in models that are brittle to distribution shift between sim and real. In this work, we introduce Composable Object Volume NeRF (COV-NeRF), an object-composable NeRF model that is the centerpiece of a real-to-sim pipeline for synthesizing training data targeted to scenes and objects from the real world. COV-NeRF extracts objects from real images and composes them into new scenes, generating photorealistic renderings and many types of 2D and 3D supervision, including depth maps, segmentation masks, and meshes. We show that COV-NeRF matches the rendering quality of modern NeRF methods, and can be used to rapidly close the sim-to-real gap across a variety of perceptual modalities.
|
|
WeCT13-AX Oral Session, AX-201 |
Add to My Program |
Human-Robot Collaboration III |
|
|
Chair: Arami, Arash | University of Waterloo |
Co-Chair: Dai, Houde | Haixi Institutes, Chinese Academy of Sciences |
|
16:30-18:00, Paper WeCT13-AX.1 | Add to My Program |
Usability Evaluation Framework for Close-Proximity Collaboration with Large Industrial Manipulators |
|
Hald, Kasper | Aalborg University |
Rehm, Matthias | Aalborg University |
Keywords: Human-Robot Collaboration, Acceptability and Trust, Design and Human Factors
Abstract: Our goal is to design a framework for holistic evaluation of human-robot collaboration systems. To this end, we administer several standardized questionnaires while participants perform collaborative tasks in robot work cells. We used the System Usability Scale and the Usability Metric for User Experience questionnaires to assess usability, the NASA Task Load Index for workload, two questionnaires for human-robot trust, as well as the Unified Theory of Acceptance and Use of Technology questionnaire. We performed two pilot tests of our framework with human-robot collaboration work cells at two test sites as part of the DrapeBot project, whose goal is to enable human-robot collaboration in carbon fiber draping for the production of outer parts. After applying the evaluation framework at the two test sites, we found that the collection of questionnaires was easy to adapt to each work cell and to the practical limitations around running the experiments. Both work cells scored high in usability, expected productivity increase, and trust, with low anxiety, but both scored low on expectancy of use for future work at their current state of development.
|
|
16:30-18:00, Paper WeCT13-AX.2 | Add to My Program |
MyoPassivity Map: Does Multi-Channel sEMG Correlate with the Energetic Behavior of Upper-Limb Biomechanics During Physical Human-Robot Interaction? |
|
Oliver, Suzanne | New York University |
Paik, Peter | New York University |
Zhou, Xingyuan | New York University |
Atashzar, S. Farokh | New York University (NYU), US |
Keywords: Human-Centered Robotics, Telerobotics and Teleoperation, Haptics and Haptic Interfaces
Abstract: The human arm has an intrinsic capacity to absorb energy during physical human-robot interaction (pHRI), which can be identified as biomechanical excess of passivity (EoP). This can be used as a central factor in the development of passivity-based pHRI controllers, securing haptic transparency while guaranteeing pHRI stability. Despite its significance, the real-time estimation of EoP remains an under-investigated topic. For the first time, we investigate the relationship between the EoP and muscle activity of the forearm at the wrist joint while analyzing sixteen surface electromyography (sEMG) sensors. The study explores optimal sensor placement for maximizing the correlation between muscle activity and the estimated EoP. Ten subjects participated in this study. The EoP of the wrist was identified through high-frequency perturbations in four directions, and two instructed co-contraction levels. The results uncover a strong correlation between sEMG and EoP. This paper also reports the effect of the direction of pHRI interaction on the EoP of the wrist, with increased energetic passivity in the abduction-adduction direction compared to supination-pronation. The findings of this paper indicate that sEMG encodes significant potential for real-time estimation of EoP in the design of next-generation pHRI controllers supporting concurrent transparency and stability.
|
|
16:30-18:00, Paper WeCT13-AX.3 | Add to My Program |
Language-Guided Active Sensing of Confined, Cluttered Environments Via Object Rearrangement Planning |
|
Chen, Weihan | Purdue University |
Ren, Hanwen | Purdue University |
Qureshi, Ahmed H. | Purdue University |
Keywords: Human-Centered Robotics, RGB-D Perception, Perception for Grasping and Manipulation
Abstract: Language-guided active sensing is a robotics subtask where a robot with an onboard sensor interacts efficiently with the environment via object manipulation to maximize perceptual information, following given language instructions. These tasks appear in various practical robotics applications, such as household service, search and rescue, and environment monitoring. Despite many applications, the existing works do not account for language instructions and have mainly focused on surface sensing, i.e., perceiving the environment from the outside without rearranging it for dense sensing. Therefore, in this paper, we introduce the first language-guided active sensing approach that allows users to observe specific parts of the environment via object manipulation. Our method spatially associates the environment with language instructions, determines the best camera viewpoints for perception, and then iteratively selects and relocates the best view-blocking objects to provide the dense perception of the region of interest. We evaluate our method against different baseline algorithms in simulation and also demonstrate it in real-world confined cabinet-like settings with multiple unknown objects. Our results show that the proposed method exhibits better performance across different metrics and successfully generalizes to real-world complex scenarios.
|
|
16:30-18:00, Paper WeCT13-AX.4 | Add to My Program |
Feedforward Control of Lower Limb Exoskeletons: Which Torque Profile Should We Use? |
|
Dinovitzer, Hannah | University of Waterloo |
Shushtari, Mohammad | University of Waterloo |
Arami, Arash | University of Waterloo |
Keywords: Human-Centered Robotics, Wearable Robotics, Physical Human-Robot Interaction
Abstract: Despite the increased use of lower limb exoskeletons as gait training and mobility assistive devices, their controllers often lack the ability to synchronize with and adapt to individual users' needs. This paper investigates two control approaches for lower limb exoskeletons: the first estimates desired torques in real time from the kinematic state using an inverse dynamics model and a data-driven component, while the second applies a pre-defined torque profile based on gait speed and phase. These controllers are linearly combined to shift the behavior between pure kinematic state-dependent and pure gait phase-dependent control. The combinations were tested during overground and treadmill walking with nine able-bodied participants. The linearly combined controller with a greater emphasis on kinematic state-dependent control produced a more natural gait in terms of spatiotemporal metrics, reflected by a 0.1 m/s increase in overground walking speed and a 5% decrease in percent stance compared to walking with a passive exoskeleton. This controller also decreased the overall activity of lower limb muscles by up to 25% and thigh co-contractions by up to 40%. Participant feedback through a questionnaire, in terms of perceived effort, walking naturalness, and stability, also favored this controller.
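The linear combination of the two controllers can be written in one line; the blend weight, joint ordering, and torque values below are hypothetical placeholders, not the paper's models or gains.

    import numpy as np

    def blended_torque(alpha, tau_state_dependent, tau_phase_dependent):
        # alpha = 1.0 -> pure kinematic state-dependent control,
        # alpha = 0.0 -> pure gait-phase-dependent control.
        return alpha * tau_state_dependent + (1.0 - alpha) * tau_phase_dependent

    # Hypothetical per-joint torques (hip, knee) from the two controllers at one instant.
    tau_state = np.array([12.0, -6.5])   # inverse dynamics + data-driven estimate [Nm]
    tau_phase = np.array([10.0, -8.0])   # pre-defined profile at the current gait phase [Nm]
    tau_cmd = blended_torque(0.75, tau_state, tau_phase)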
|
|
16:30-18:00, Paper WeCT13-AX.5 | Add to My Program |
Design of Two Morphing Robot Surfaces and Results from a User Study on What People Want and Expect of Them, towards a “Robot-Room” |
|
Kumar, Nithesh | Clemson |
Chao, Hsin-Ming | Cornell University |
Tassari, Bruno Dantas da Silva | Cornell University |
Sabinson, Elena | Cornell University |
Walker, Ian | Clemson University |
Green, Keith Evan | Cornell University |
Keywords: Human-Centered Robotics, Product Design, Development and Prototyping
Abstract: We propose, examine prototypes of, and collect user input on morphing robotic surface “robot-room” elements that, individually or in combination, change the functionality of the rooms we live in, directly controlled by the room’s occupants engaging with them. Robot-rooms represent an advance in human-robot interaction whereby the interaction takes place within a machine that physically envelops us. We discuss the motivation for such robot-rooms, present initial work aimed at their physical realization, and report on a user study of 80 participants to learn what people might want of and expect from robot-rooms. The results will inform both the iterative design of the robot-room and the thinking of our community as it grapples with how we want to live with (and “in”) robots. Keywords: Robot surfaces, User studies
|
|
16:30-18:00, Paper WeCT13-AX.6 | Add to My Program |
Automatic Trust Estimation from Movement Data in Industrial Human-Robot Collaboration Based on Deep Learning |
|
Rehm, Matthias | Aalborg University |
Pontikis, Ioannis | AALBORG UNIVERSITY |
Hald, Kasper | Aalborg University |
Keywords: Human-Robot Collaboration, Acceptability and Trust, Human-Centered Automation
Abstract: Trust in automation is usually assessed with post-interaction questionnaires. For human-robot collaboration, it would be beneficial to assess the trust level during the interaction in order to adjust the robot's collaboration behavior to the user's expectations. In this paper, we investigate whether trust can be estimated from observable behavior, such as movements, during interaction with a large industrial manipulator. To this end, we report on a data collection for two tasks during collaborative draping: the transport of large cut pieces and the actual draping process in close proximity to the robot. The data is used to train and compare different deep learning models. Results show that automatic trust estimation is feasible, which opens up the possibility of using trust as a parameter for informing the interaction with robots.
|
|
16:30-18:00, Paper WeCT13-AX.7 | Add to My Program |
A Dual Closed-Loop Control Strategy for Human-Following Robots Respecting Social Space |
|
Peng, Jianwei | University of Chinese Academy of Sciences |
Liao, Zhelin | Fujian Agriculture and Forestry University |
Su, Zefan | Fuzhou University |
Yao, Hanchen | Haixi Institutes, Chinese Academy of Sciences |
Zeng, Yadan | Nanyang Technology University |
Dai, Houde | Haixi Institutes, Chinese Academy of Sciences |
Keywords: Human-Robot Collaboration
Abstract: Human following for mobile robots has emerged as a promising technique with widespread applications. To ensure psychological comfort while collaborating, coexisting, and interacting with humans, robots need to respect the social space of the target person. In this study, we propose a dual closed-loop human-following control strategy that combines model predictive control (MPC) and impedance control. The outer-loop MPC ensures precise control of the robot's posture while tracking the target person's velocity and direction to coordinate the motion between them. The inner-loop impedance controller is employed to regulate the robot's motion and interaction force with the target person, enabling the robot to maintain a respectful and comfortable distance from the target person. Concretely, the social interaction dynamics characteristics between the robot and the target person are described by human-robot interaction dynamics, which considers the rules of social space. Furthermore, an obstacle avoidance component constructed using behavioral dynamics is integrated into the impedance controller. Experimental results demonstrate the effectiveness of the proposed method in achieving human following and obstacle avoidance without intruding into the intimate zone of the target person.
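The inner impedance loop can be sketched as a virtual mass-spring-damper acting on the following distance; the gains, the one-dimensional state, and the 1.2 m social-space target are illustrative assumptions, not the paper's human-robot interaction dynamics.

    import numpy as np

    def impedance_step(x, v, x_des, f_ext, M=1.0, D=8.0, K=20.0, dt=0.02):
        # Virtual mass-spring-damper: M*a + D*v + K*(x - x_des) = f_ext.
        a = (f_ext - D * v - K * (x - x_des)) / M
        v = v + a * dt
        x = x + v * dt
        return x, v

    # Hypothetical 1-D following distance regulated toward a 1.2 m social-space target.
    x, v = 0.8, 0.0                      # current distance to the person [m] and its rate [m/s]
    for _ in range(50):
        x, v = impedance_step(x, v, x_des=1.2, f_ext=0.0)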
|
|
16:30-18:00, Paper WeCT13-AX.8 | Add to My Program |
A Bayesian Optimization Framework for the Automatic Tuning of MPC-Based Shared Controllers |
|
van der Horst, Anne | Eindhoven University of Technology |
Meere, Bastiaan Guillermo Lorenzo | Eindhoven University of Technology |
Krishnamoorthy, Dinesh | TU Eindhoven |
Bakker, Saray | Delft University of Technology |
van de Vrande, Bram | Philips |
Stoutjesdijk, Henry | Philips Medical Systems |
Alonso, Marco | Company |
Torta, Elena | Eindhoven University of Technology |
Keywords: Human-Robot Collaboration, AI-Based Methods, Medical Robots and Systems
Abstract: This paper presents a Bayesian optimization framework for the automatic tuning of shared controllers which are defined as a Model Predictive Control (MPC) problem. The proposed framework includes the design of performance metrics as well as the representation of user inputs for simulation-based optimization. The framework is applied to the optimization of a shared controller for an Image Guided Therapy robot. VR-based user experiments confirm the increase in performance of the automatically tuned MPC shared controller with respect to a hand-tuned baseline version as well as its generalization ability.
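The tuning loop can be sketched as Bayesian optimization over a simulation-based cost; scikit-optimize's gp_minimize is used here purely as one possible optimizer, and the objective, parameter bounds, and weights are placeholders rather than the paper's metrics or tooling.

    import numpy as np
    from skopt import gp_minimize

    def simulate_shared_controller(q_tracking, r_effort):
        # Placeholder for a closed-loop simulation with recorded user inputs that
        # returns a cost to minimize (e.g. tracking error plus operator-effort penalty).
        return (q_tracking - 3.0) ** 2 + (r_effort - 0.5) ** 2 + 0.01 * np.random.randn()

    result = gp_minimize(
        lambda p: simulate_shared_controller(*p),
        dimensions=[(0.1, 10.0), (0.01, 2.0)],   # bounds of two hypothetical MPC weights
        n_calls=30,
        random_state=0,
    )
    print("tuned weights:", result.x)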
|
|
WeCT15-AX Oral Session, AX-203 |
Add to My Program |
Human Factors and Human-In-The-Loop III |
|
|
Chair: Ayub, Ali | University of Waterloo |
Co-Chair: Carlson, Tom | University College London, UK |
|
16:30-18:00, Paper WeCT15-AX.1 | Add to My Program |
Robust Body Exposure (RoBE): A Graph-Based Dynamics Modeling Approach to Manipulating Blankets Over People |
|
Puthuveetil, Kavya | Carnegie Mellon University |
Wald, Sasha | Carnegie Mellon University |
Pusalkar, Atharva | Carnegie Mellon University |
Karnati, Pratyusha | Google X, Everyday Robots |
Erickson, Zackory | Carnegie Mellon University |
Keywords: Physically Assistive Devices, Physical Human-Robot Interaction, Model Learning for Control
Abstract: Robotic caregivers could potentially improve the quality of life of many who require physical assistance. However, in order to assist individuals who are lying in bed, robots must be capable of dealing with a significant obstacle: the blanket or sheet that will almost always cover the person's body. We propose a method for targeted bedding manipulation over people lying supine in bed where we first learn a model of the cloth's dynamics. Then, we optimize over this model to uncover a given target limb using information about human body shape and pose that only needs to be provided at run-time. We show how this approach enables greater robustness to variation relative to geometric and reinforcement learning baselines via a number of generalization evaluations in simulation and in the real world. We further evaluate our approach in a human study with 12 participants where we demonstrate that a mobile manipulator can adapt to real variation in human body shape, size, pose, and blanket configuration to uncover target body parts without exposing the rest of the body. Source code and supplementary materials are available online.
|
|
16:30-18:00, Paper WeCT15-AX.2 | Add to My Program |
Recency Bias in Task Performance History Affects Perceptions of Robot Competence and Trustworthiness |
|
Luebbers, Matthew | University of Colorado Boulder |
Tabrez, Aaquib | University of Colorado Boulder |
Talanki, Kanaka Samagna | University of Colorado Boulder |
Hayes, Bradley | University of Colorado Boulder |
Keywords: Design and Human Factors, Acceptability and Trust, Human-Aware Motion Planning
Abstract: Human memory of a robot's competence, and resulting subjective perceptions of that robot, are influenced by numerous cognitive biases. One class of cognitive bias deals with the ordering of items or interactions: information presented last among a grouping is most salient in memory formation (recency bias), followed by information presented first (primacy bias), followed by information in the middle, collectively known as the serial-position effect. For example, if a human's last observation of a robot involves a task failure, this will disproportionately negatively alter their perception of the robot's competence, as well as their trust in the robot moving forward. It is valuable to characterize the effect of these biases within human-robot interactions to inform strategies for risk-aware planning that cultivate appropriate levels of human trust. We conducted a human-subjects study (n=53) testing the influence of the serial-position effect on recalled competence (see overview at https://youtu.be/BgH2zhh1s48). Participants viewed videos of a robot performing the same tasks at the same level of competence, with task order differing by experimental condition (rising competence, falling competence, or failures at the midpoint); participants rated robot competence in between every video as well as at the very end of the experiment. We found that while the average between-video rating of robot competence remained stable across conditions, the recalled, post-experiment ratings of competence and trust were significantly lower in the condition with decreasing competence than in either of the other two conditions, suggesting a notable recency bias. We conclude with implications for human-subjects experiment design (i.e., how subjective measures are influenced by ordering effects) and provide design recommendations to minimize them. We further discuss practical applications of these results in creating risk-aware robotic planners capable of trust calibration.
|
|
16:30-18:00, Paper WeCT15-AX.3 | Add to My Program |
LaCE-LHMP: Airflow Modelling-Inspired Long-Term Human Motion Prediction by Enhancing Laminar Characteristics in Human Flow |
|
Zhu, Yufei | Örebro University |
Fan, Han | Örebro University |
Rudenko, Andrey | Robert Bosch GmbH |
Magnusson, Martin | Örebro University |
Schaffernicht, Erik | Örebro University, AASS Research Center |
Lilienthal, Achim J. | Orebro University |
Keywords: Human Detection and Tracking
Abstract: Long-term human motion prediction (LHMP) is essential for safely operating autonomous robots and vehicles in populated environments. It is fundamental for various applications, including motion planning, tracking, human-robot interaction and safety monitoring. However, accurate prediction of human trajectories is challenging due to complex factors, including, for example, social norms and environmental conditions. The influence of such factors can be captured through Maps of Dynamics (MoDs), which encode spatial motion patterns learned from (possibly scattered and partial) past observations of motion in the environment and which can be used for data-efficient, interpretable motion prediction (MoD-LHMP). To address the limitations of prior work, especially regarding accuracy and sensitivity to anomalies in long-term prediction, we propose the Laminar Component Enhanced LHMP approach (LaCE-LHMP). Our approach is inspired by data-driven airflow modelling, which estimates laminar and turbulent flow components and uses predominantly the laminar components to make flow predictions. Based on the hypothesis that human trajectory patterns also manifest laminar flow (that represents predictable motion) and turbulent flow components (that reflect more unpredictable and arbitrary motion), LaCE-LHMP extracts the laminar patterns in human dynamics and uses them for human motion prediction. We demonstrate the superior prediction performance of LaCE-LHMP through benchmark comparisons with state-of-the-art LHMP methods, offering an unconventional perspective and a more intuitive understanding of human movement patterns.
|
|
16:30-18:00, Paper WeCT15-AX.4 | Add to My Program |
Interactive Continual Learning Architecture for Long-Term Personalization of Home Service Robots |
|
Ayub, Ali | University of Waterloo |
Nehaniv, Chrystopher | University of Waterloo |
Dautenhahn, Kerstin | University of Waterloo |
Keywords: Long term Interaction, Learning Categories and Concepts, Continual Learning
Abstract: For robots to perform assistive tasks in unstructured home environments, they must learn and reason on the semantic knowledge of the environments. Despite a resurgence in the development of semantic reasoning architectures, these methods assume that all the training data is available a priori. However, each user's environment is unique and can continue to change over time, which makes these methods unsuitable for personalized home service robots. Although research in continual learning develops methods that can learn and adapt over time, most of these methods are tested in the narrow context of object classification on static image datasets. In this paper, we combine ideas from continual learning, semantic reasoning, and interactive machine learning literature and develop a novel interactive continual learning architecture for continual learning of semantic knowledge in a home environment through human-robot interaction. The architecture builds on core cognitive principles of learning and memory for efficient and real-time learning of new knowledge from humans. We integrate our architecture with a physical mobile manipulator robot and perform extensive system evaluations in a laboratory environment over two months. Our results demonstrate the effectiveness of our architecture to allow a physical robot to continually adapt to the changes in the environment from limited data provided by the users (experimenters), and use the learned knowledge to perform object fetching tasks.
|
|
16:30-18:00, Paper WeCT15-AX.5 | Add to My Program |
Human-Robot Interactive Creation of Artistic Portrait Drawings |
|
Gao, Fei | Xidian University |
Lingna, Dai | AiSketcher Technology Co.Ltd |
Zhu, Jingjie | Aisketcher Technology Co.Ltd |
Du, Mei | Hangzhou Danzi University |
Yiyuan, Zhang | Aisketcher Technology Co.Ltd |
Qiao, Maoying | UTS |
Xia, Chenghao | The University of Sydney |
Wang, Nannan | Xidian University |
Li, Peng | Institute of Software, Chinese Academy of Sciences |
Keywords: Art and Entertainment Robotics, Human-Robot Collaboration, Deep Learning Methods
Abstract: In this paper, we present a novel system for Human-Robot Interactive Creation of Artworks (HRICA). Different from previous robot painters, HRICA allows a human user and a robot to alternately draw strokes on a canvas and collaboratively create a portrait drawing through frequent interactions. The key is to enable the robot to understand human intentions during the interactive creation process. We formulate this as a mask-free image inpainting problem and propose a novel method to estimate the complete version of a portrait drawing after the human user has drawn some initial strokes. In this way, the robot can select complementary strokes and draw them on the canvas. To train and evaluate our inpainting method, we construct a novel large-scale portrait drawing dataset, CelebLine, which consists of high-quality portrait line-drawings with dense labels of both 2D semantic parsing masks and 3D depth maps. Finally, we develop a human-robot interactive drawing system with low-cost hardware, a user-friendly interface, and an engaging creation experience. Experiments show that our robot can stably cooperate with human users to create diverse styles of portrait drawings. In addition, our portrait drawing inpainting method significantly outperforms previous advanced methods. The code and dataset have been released at: https://github.com/fei-aiart/HRICA.
|
|
16:30-18:00, Paper WeCT15-AX.6 | Add to My Program |
High Stimuli Virtual Reality Training for a Brain Controlled Robotic Wheelchair |
|
Thomas, Alexander | University College London |
Chen, Jianan | University College London |
Hella-Szabo, Anna | Univeristy College London |
Kelly, Merlin | University College London |
Carlson, Tom | University College London, UK |
Keywords: Brain-Machine Interfaces, Virtual Reality and Interfaces
Abstract: Smart robotic wheelchairs, as well as other assistive robotic devices, can provide an effective form of independent mobility for people with motor disabilities. Although many control interfaces exist to operate these devices, brain computer interfaces (BCI) offer a control modality for those who have little to no motor function, and can also help re-associate movement with brain functionality. Although BCIs have been designed for robotic wheelchairs, more research and development is required before they can be adopted for use in the ‘real world’. One key challenge on that journey is the user training required to achieve an acceptable control accuracy. In this paper, we aim to identify the best training method by comparing users trained on a simple task, in a simulated environment on a 2D display (VR-2DD), and in a virtual environment using a virtual reality headset (VR-HMD). We trained 15 participants in a mix of high- and low-noise virtual environments or on a simple training task, and found a significant improvement in the classification accuracies of the participants who trained using the VR-2DD task compared with those trained on the simple task. We also carried out active (online) tests across all participants in the same virtual training environment, with varying levels of external stimuli, and found a significant improvement in the performance of participants in both VR groups compared to participants in the simple-task group.
|
|
16:30-18:00, Paper WeCT15-AX.7 | Add to My Program |
Automatic Captioning Based on Visible and Infrared Images |
|
Wang, Yan | Yantai University |
Lou, Shuli | Yantai Univ |
Wang, Kai | Yantai Univ |
Yuan, Xiaohu | Tsinghua Univerisity |
Liu, Huaping | Tsinghua University |
Keywords: Automation Technologies for Smart Cities, Human-Centered Automation
Abstract: In this paper, we tackle the task of image captioning with the complementarity of visible light images and infrared images. To address this problem, we propose an RGB-IR image fusion captioning model, which can take full advantage of visible light images and infrared images under different conditions. Meanwhile, we develop a wearable environment-assisted system. In addition, we collect and annotate a new dataset containing 3510 pairs of RGB-IR images to support model training. Finally, we conduct extensive experiments to evaluate the model and system. Experimental results show that our new method and system significantly outperform baselines on multiple metrics and have potential practical value.
|
|
16:30-18:00, Paper WeCT15-AX.8 | Add to My Program |
A Semi-Automatic Oriental Ink Painting Framework for Robotic Drawing from 3D Models |
|
Jin, Hao | Northwest A&F University |
Lian, Minghui | Northwest A&F University |
Qiu, Shicheng | Northwest A&F University |
Han, Xuxu | Northwest A&F University |
Zhao, Xizhi | Northwest A&F University |
Yang, Long | Northwest A&F University |
Zhang, Zhiyi | Northwest A&F University |
Xie, Haoran | Japan Advanced Institute of Science and Technology |
Konno, Kouichi | Iwate University |
Hu, Shaojun | Northwest A&F University |
Keywords: Art and Entertainment Robotics, Human-Robot Collaboration
Abstract: Creating visually pleasing stylized ink paintings from 3D models is a challenge in robotic manipulation. We propose a semi-automatic framework that can extract expressive strokes from 3D models and draw them in oriental ink painting styles using a robotic arm. The framework consists of a simulation stage and a robotic drawing stage. In the simulation stage, geometrical contours are automatically extracted from a chosen viewpoint and a neural network is employed to create simplified contours. Expressive digital strokes are then generated after interactive editing according to the user's aesthetic understanding. In the robotic drawing stage, an optimization method is presented for drawing smooth strokes that are physically consistent with the digital strokes, and two oriental ink painting styles, termed Noutan (shade) and Kasure (scratchiness), are applied to the strokes by robotic control of the brush's translation, dipping, and scraping. Unlike existing methods that concentrate on generating paintings from 2D images, our framework has the advantage of rendering stylized ink paintings from 3D models using a consumer-grade robotic arm. We evaluate the proposed framework on three standard models and a user-defined model. The results show that our framework is able to draw visually pleasing oriental ink paintings with expressive strokes.
|
|
16:30-18:00, Paper WeCT15-AX.9 | Add to My Program |
A 3D Mixed Reality Interface for Human-Robot Teaming |
|
Chen, Jiaqi | ETH Zurich |
Sun, Boyang | ETH Zurich |
Blum, Hermann | ETH Zurich |
Pollefeys, Marc | ETH Zurich |
Keywords: Virtual Reality and Interfaces, Human-Robot Teaming
Abstract: This paper presents a mixed-reality human-robot teaming system. It allows human operators to see in real-time where robots are located, even if they are not in line of sight. The operator can also visualize the map that the robots create of their environment and can easily send robots to new goal positions. The system mainly consists of a mapping and a control module. The mapping module is a real-time multi-agent visual SLAM system that co-localizes all robots and mixed-reality devices to a common reference frame. Visualizations in the mixed-reality device then allow operators to see a virtual life-sized representation of the cumulative 3D map overlaid onto the real environment. As such, the operator can effectively “see through” walls into other rooms. To control robots and send them to new locations, we propose a drag-and-drop interface. An operator can grab any robot hologram in a 3D mini map and drag it to a new desired goal pose. We validate the proposed system through a user study and real-world deployments. We make the mixed-reality application publicly available at github.com/cvg/hololens_ros.
|
|
WeCT16-AX Oral Session, AX-204 |
Add to My Program |
Wheeled Robots |
|
|
Chair: Asano, Fumihiko | Japan Advanced Institute of Science and Technology |
Co-Chair: Hirano, Masahiro | The University of Tokyo |
|
16:30-18:00, Paper WeCT16-AX.1 | Add to My Program |
Online Camera Orientation Calibration Aided by a High-Speed Ground-View Camera |
|
Su, Junzhe | The University of Tokyo |
Hirano, Masahiro | The University of Tokyo |
Yamakawa, Yuji | The University of Tokyo |
Keywords: Calibration and Identification, Wheeled Robots
Abstract: This paper proposes an online method for calibrating the orientation of cameras mounted on vehicles. To calibrate the orientation of the target camera relative to the vehicle, we use a high-speed vision sensor, focused on the ground, in conjunction with the target camera. First, the high-speed camera's planar motion parallel to the ground plane and the target camera's motion are estimated by a semi-dense approach and a visual odometry method, respectively. Then, the motions are used to calibrate the target camera's orientation through nonlinear optimization based on the invariance constraint of the extrinsic parameters and the nonholonomic constraint of the vehicle. Unlike traditional methods, this approach does not depend on artificial features such as lane markings and utilizes ground information more efficiently, making it applicable in broader scenarios. Simulation and field tests demonstrate that the target camera orientation calibration errors are approximately 1°, even on a bumpy road, affirming the accuracy and robustness of the proposed method.
|
|
16:30-18:00, Paper WeCT16-AX.2 | Add to My Program |
Fast Wheeled Driving to Legged Leaping Onto a Step in a Leg-Wheel Transformable Robot |
|
Chen, Zhi-Ren | NTU |
Yu, Wei-Shun | National Taiwan University |
Lin, Pei-Chun | National Taiwan University |
Keywords: Legged Robots, Dynamics, Wheeled Robots
Abstract: The leg-wheel transformable robot has the advantage of smooth, fast, and power-efficient motion on flat terrain and negotiability on rough terrain. This study presents a highly dynamic maneuver of the robot to leap onto a step using its legged form from its original form of wheeled driving, taking full advantage of the rapid switching capabilities of the leg-wheel design of the robot. The robot motion is designed based on a reduced-order model and is planned using an optimization method with multiple constraints. In addition, both position and impedance control strategies are investigated. The proposed strategy is experimentally evaluated. The results show that the robot can leap onto a step higher than itself and then smoothly transition back to the wheeled mode after leaping. The dynamic driving-to-leaping maneuver endows the robot with an alternative and time-efficient approach to negotiate the step obstacles.
|
|
16:30-18:00, Paper WeCT16-AX.3 | Add to My Program |
Body Velocity Estimation in a Leg–Wheel Transformable Robot without a Priori Knowledge of Leg–Wheel Ground Contacts |
|
Huang, Pei-Chun | National Taiwan University |
Chang, I-Chia | Purdue University |
Yu, Wei-Shun | National Taiwan University |
Lin, Pei-Chun | National Taiwan University |
Keywords: Legged Robots, Wheeled Robots
Abstract: The state estimation of legged robots often relies on ground contact detection. However, due to complex mechanisms and other factors, ground contact detection can be difficult to obtain in certain situations. This paper presents a velocity estimation method that combines an inertial measurement unit (IMU) and encoders, allowing estimation without a priori knowledge of the ground contact state. The initial estimate derived from IMU integration is refined: after computing velocity and ground-contact-state probabilities from encoder data, these probabilities are used to modify particle weights within a particle filter framework. Subsequent resampling ensures that the contact status converges toward the correct result. We test the algorithm in simulations and validate the method with physical experiments, showcasing the feasibility of concurrent ground contact state and velocity estimation.
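The core update can be sketched as a particle filter over body velocity: particles are propagated with the IMU acceleration, weighted by a mixture of encoder-derived velocity hypotheses (one per candidate contact state), and resampled; the measurement model, noise levels, and hypothesis values are illustrative assumptions.

    import numpy as np

    def pf_velocity_step(particles, weights, accel_imu, v_encoder_hypotheses,
                         contact_priors, dt=0.005, sigma=0.05):
        # Propagate each velocity particle with the IMU acceleration plus process noise.
        particles = particles + accel_imu * dt + np.random.randn(*particles.shape) * 0.01

        # Likelihood: mixture over encoder-based velocity hypotheses, one per candidate
        # ground-contact state, weighted by that state's prior probability.
        lik = np.zeros(len(particles))
        for v_hyp, p_contact in zip(v_encoder_hypotheses, contact_priors):
            lik += p_contact * np.exp(-0.5 * ((particles - v_hyp) / sigma) ** 2)
        weights = weights * lik
        weights /= weights.sum()

        # Systematic resampling keeps the particle set focused on consistent velocities.
        positions = (np.arange(len(weights)) + np.random.rand()) / len(weights)
        idx = np.searchsorted(np.cumsum(weights), positions)
        return particles[idx], np.full(len(weights), 1.0 / len(weights))

    particles = np.random.randn(500) * 0.1       # body-velocity particles [m/s]
    weights = np.full(500, 1.0 / 500)
    particles, weights = pf_velocity_step(particles, weights, accel_imu=0.2,
                                          v_encoder_hypotheses=[0.45, 0.0],
                                          contact_priors=[0.7, 0.3])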
|
|
16:30-18:00, Paper WeCT16-AX.4 | Add to My Program |
Rolling with Planar Parametric Curves for Real-Time Robot Locomotion Algorithms |
|
Mane, Adwait | FAMU-FSU College of Engineering, Florida State University, Tallahassee |
Hubicki, Christian | Florida State University |
Keywords: Legged Robots, Wheeled Robots, Motion and Path Planning
Abstract: Robots routinely encounter obstacles and rough terrain, but terrain curvature is seldom included in models for real-time algorithms. We present a closed-form dynamic model for rolling with two planar smooth curves, and apply it to sagittal-plane locomotion problems. We assumed that the body rolls without slip and maintains a single point of contact. Using an auxiliary coordinate system to define the rolling body and terrain as parametric curves, we derived rolling constraints and dynamic equations of motion for model-based control algorithms - specifically Operational Space Control. The formulation was used to simulate an arbitrarily curved rock rolling on undulating terrain and to generate control signals to stabilize it on parabolic terrain. The stabilization problem was solved as a quadratic program in < 3 ms which shows that our formulation is suitable for real-time control algorithms. We also applied this framework to dynamically balance an underactuated 2 degree-of-freedom leg on parabolic terrain and achieve prescribed locomotion tasks for a wheel-leg vehicle on sinusoidal terrain in simulation.
|
|
16:30-18:00, Paper WeCT16-AX.5 | Add to My Program |
Non-Smooth Trajectory Optimization for Wheeled Balancing Robots with Contact Switches and Impacts |
|
Klemm, Victor | ETH Zurich |
de Viragh, Yvain | ETH Zurich |
Rohr, David | ETH Zurich |
Siegwart, Roland | ETH Zurich |
Tognon, Marco | Inria Rennes |
Keywords: Legged Robots, Wheeled Robots, Optimization and Optimal Control, Contact Modeling
Abstract: Recent years have seen a steady rise in the abilities of wheeled-legged balancing robots. Yet, their use is still severely restricted by the lack of efficient control algorithms for overcoming obstacles such as stairs. We take a considerable step towards closing this gap by presenting a fast trajectory optimizer for generating trajectories over a large class of challenging terrains. By limiting the underlying modeling to the planar, nonlinear rigid-body dynamics and subdividing the terrain into contact-phases, a tractable nonlinear programming problem is obtained. The model explicitly accounts for contact switches and impacts, traction limits, and actuation bounds. By introducing an arc-length-related parametrization, the trajectories are rendered inherently contact constraint-consistent. We apply our method to the specific case of the wheeled bipedal robot Ascento, for which we derive closed-form expressions of the dynamics equations, including the kinematic loops. To track the trajectories, we propose a simple LQR-based controller. The approach is validated in real-world experiments where we show the execution of trajectories for traversing steps, driving up ramps, jumping, standing up, and driving up entire stairways. To the authors’ best knowledge, enabling the latter by means of trajectory optimization is a novelty for wheeled-legged robots.
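The LQR tracking controller mentioned at the end can be sketched by solving the continuous-time Riccati equation for a linearized balancing model; the A, B, Q, R values below are a generic cart-pole-like placeholder, not the Ascento dynamics or the authors' gains.

    import numpy as np
    from scipy.linalg import solve_continuous_are

    # Placeholder linearized state x = [pitch, pitch rate, wheel position, wheel velocity].
    A = np.array([[0.0, 1.0, 0.0, 0.0],
                  [25.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [-2.0, 0.0, 0.0, 0.0]])
    B = np.array([[0.0], [-4.0], [0.0], [1.5]])
    Q = np.diag([50.0, 5.0, 10.0, 1.0])   # state weights
    R = np.array([[0.5]])                 # input weight

    P = solve_continuous_are(A, B, Q, R)
    K = np.linalg.solve(R, B.T @ P)       # feedback gain; u = -K @ (x - x_ref)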
|
|
16:30-18:00, Paper WeCT16-AX.6 | Add to My Program |
Design and Central Pattern Generator Control of a New Transformable Wheel-Legged Robot |
|
Bishop, Tyler | University of California, Riverside |
Ye, Keran | University of California, Riverside |
Karydis, Konstantinos | University of California, Riverside |
Keywords: Nonholonomic Mechanisms and Systems, Mechanism Design, Wheeled Robots
Abstract: This paper introduces a new wheel-legged robot and develops motion controllers based on central pattern generators (CPGs) for the robot to navigate over a range of terrains. A transformable leg-wheel design is considered and characterized in terms of key locomotion characteristics as a function of the design. Kinematic analysis is conducted based on a generalized four-bar mechanism driven by a coaxial hub arrangement. The analysis is used to inform the design of a central pattern generator to control the robot by mapping oscillator states to wheel-leg trajectories and implementing differential steering within the oscillator network. Three oscillator models are used as the basis of the CPGs, and their performance is compared over a range of inputs. The CPG-based controller is used to drive the developed robot prototype on level ground and over obstacles. Additional simulated tests are performed for uneven terrain negotiation and obstacle climbing. Results demonstrate the effectiveness of CPG control in transformable wheel-legged robots.
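The oscillator-network idea can be illustrated with coupled phase oscillators whose phases are mapped to wheel-leg hub commands, with differential steering obtained by commanding different frequencies on the two sides; the coupling law, gains, and gait offsets are illustrative assumptions, not the paper's oscillator models.

    import numpy as np

    def cpg_step(phase, desired_offset, omega, k=4.0, dt=0.005):
        # Coupled phase oscillators: each wheel-leg advances at its own rate and is
        # pulled toward the commanded phase offsets relative to the other legs.
        n = len(phase)
        dphase = np.array(omega, dtype=float)
        for i in range(n):
            for j in range(n):
                dphase[i] += k * np.sin(phase[j] - phase[i]
                                        - (desired_offset[j] - desired_offset[i]))
        return phase + dphase * dt

    phase = np.random.rand(4) * 2 * np.pi
    desired_offset = np.array([0.0, np.pi, np.pi, 0.0])    # trot-like gait pattern
    omega = 2 * np.pi * np.array([1.0, 1.2, 1.0, 1.2])     # faster right side to turn left

    for _ in range(4000):
        phase = cpg_step(phase, desired_offset, omega)

    hub_angle = phase % (2 * np.pi)    # mapped to wheel-leg hub commands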
|
|
16:30-18:00, Paper WeCT16-AX.7 | Add to My Program |
Planned Trajectory Classification for Wheeled Mobile Robots to Prevent Rollover and Slip |
|
Jeon, Sang-Yun | Seoul National University |
Chung, Rakjoon | Samsung Electronics |
Lee, Dongjun | Seoul National University |
Keywords: Wheeled Robots, Dynamics, Robot Safety
Abstract: In this paper, a novel planned trajectory classification method (PTCM) is proposed to evaluate the safety of the car-like four-wheeled mobile robots (4-WMRs) with Ackermann steering. To classify a planned trajectory to be safe or unsafe before the 4-WMR actually follows it, the conditions of the wheel forces (WFs: longitudinal, lateral, and normal forces for each wheel) necessary to execute the planned trajectory without rollover and slip are calculated using the passive decomposition of the WMR dynamics with the Pfaffian constraints of the no-rollover and no-slip conditions. Similar to the case of Navier’s table problem, only nine-dimensional WFs projected onto the constrained space are identifiable among the twelve-dimensional WFs. This indeterminacy turns out not to affect the rollover prediction, yet does so for the slip prediction. For this, we propose novel optimistic and pessimistic methods, together upper and lower bounding the exact slip prediction. The proposed PTCM classifies the planned trajectory as safe if rollover and slip are not predicted and unsafe otherwise. The proposed PTCM is demonstrated and validated by simulations and outdoor experiments.
|
|
16:30-18:00, Paper WeCT16-AX.8 | Add to My Program |
Mechanical Design and Kinematics of a Multimodal Two-Wheeled Robot |
|
Sun, Botian | Peking University |
Lang, Qinglin | Peking University |
Li, Minghe | Peking University |
Wang, Xuefeng | Peking University |
Keywords: Wheeled Robots, Kinematics, Mechanism Design
Abstract: A two-wheeled vehicle has a compact structure and high mobility in crowded and complex environments. The bicycle and the self-balancing vehicle are two main modes of the two-wheeled vehicle, and their combination allows for good balance-control stability at both high and low speeds. Four control inputs from two steerable driving wheels are required to implement transformations between the two modes due to the difference in their configuration spaces. However, these control inputs are redundant for planar motions, which results in an over-constraint of the vehicle. In this work, a two-wheeled robot with an additional structural deformation is designed to balance inputs and degrees of freedom (DOFs), so that the over-constraint is avoided. A transition mode based on oblique vehicle motions is used to bridge the transformation between the bicycle and self-balancing vehicle modes. A general kinematic model is developed for planar motions of the two-wheeled robot, and the kinematics of the three modes are special cases with particular servo constraints. Structural deformation control laws are developed and experimentally validated on a prototype robot. Smooth transformations of the multimodal motions are also validated on the prototype.
|
|
16:30-18:00, Paper WeCT16-AX.9 | Add to My Program |
Global Tracking Control for Car-Like Mobile Robots with Zero-Crossing Driving Velocity |
|
Yan, Kai | Beihang University |
Keywords: Wheeled Robots, Nonholonomic Mechanisms and Systems, Collision Avoidance
Abstract: This work proposes a smooth time-varying controller to address the trajectory tracking problem of car-like mobile robots. The current literature does not offer globally asymptotically stable controllers for this problem. Unlike the prototypical method of transforming the model into a nonholonomic chained-form system, the proposed method is designed on the original tracking error equation, and therefore our approach does not suffer from the singularities of chained-form transformations. In contrast to current methods, our control law also handles the cases where the vehicle's velocity passes through zero: the redesigned control law has no singularities, allows the velocity to cross zero, and at the same time has a global attraction region. The design of the controller is divided into two steps. First, the linear velocity and steering angle of the robot are regarded as control inputs and are designed by making the derivative of a positive definite Lyapunov-like function negative semi-definite. Next, another control input is designed by the backstepping approach. Furthermore, the global convergence of the state trajectory to the reference one is strictly proved by Barbalat's Lemma. Finally, simulated and physical experiments on a car-like robot demonstrate the effectiveness of the proposed control scheme.
|
|
WeCT17-AX Oral Session, AX-205 |
Add to My Program |
Legged Robots and Learning II |
|
|
Chair: Tsukagoshi, Hideyuki | Tokyo Institute of Technology |
|
16:30-18:00, Paper WeCT17-AX.1 | Add to My Program |
Cascaded Compositional Residual Learning for Complex Interactive Behaviors |
|
Kannabiran, Niranjan Kumar | Georgia Institute of Technology |
Essa, Irfan | Georgia Institute of Technology |
Ha, Sehoon | Georgia Institute of Technology |
Keywords: Legged Robots, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Real-world autonomous missions often require rich interaction with nearby objects, such as doors or switches, along with effective navigation. However, such complex behaviors are difficult to learn because they involve both high-level planning and low-level motor control. We present a novel framework, Cascaded Compositional Residual Learning (CCRL), which learns composite skills by recursively leveraging a library of previously learned control policies. Our framework combines multiple levels of pre-learned skills by using multiplicative skill composition and residual action learning. We also introduce a goal synthesis network and an observation selector to support combination of heterogeneous skills, each with its unique goals and observation space. Finally, we develop residual regularization for learning policies that solve a new task, while preserving the style of the motion enforced by the skill library. We show that our framework learns joint-level control policies for a diverse set of motor skills ranging from basic locomotion to complex interactive navigation, including navigating around obstacles, pushing objects, crawling under a table, pushing a door open with its leg, and holding it open while walking through it. The proposed CCRL framework leads to policies with consistent styles and lower joint torques, which we successfully transfer to a real Unitree A1 robot without any additional fine-tuning.
|
|
16:30-18:00, Paper WeCT17-AX.2 | Add to My Program |
Deep Compliant Control for Legged Robots |
|
Hartmann, Adrian | ETH Zürich |
Kang, Dongho | ETH Zurich |
Zargarbashi, Fatemeh | ETH Zurich |
Zamora Mora, Miguel Angel | ETH Zurich |
Coros, Stelian | ETH Zurich |
Keywords: Legged Robots, Reinforcement Learning, Motion Control
Abstract: Control policies trained using deep reinforcement learning often generate stiff, high-frequency motions in response to unexpected disturbances. To promote more natural and compliant balance recovery strategies, we propose a simple modification to the typical reinforcement learning training process. Our key insight is that stiff responses to perturbations are due to an agent’s incentive to maximize task rewards at all times, even as perturbations are being applied. As an alternative, we introduce an explicit recovery stage where tracking rewards are given irrespective of the motions generated by the control policy. This allows agents a chance to gradually recover from disturbances before attempting to carry out their main tasks. Through an in-depth analysis, we highlight both the compliant nature of the resulting control policies, as well as the benefits that compliance brings to legged locomotion. In our simulation and hardware experiments, the compliant policy achieves more robust, energy-efficient, and safe interactions with the environment.
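One way to picture the recovery stage is the toy reward gate below (hypothetical names and signature; the paper's exact reward terms differ): during recovery the tracking term is granted in full regardless of the produced motion, so the agent is not pushed to fight the perturbation with stiff, high-frequency corrections.

def staged_reward(task_reward, tracking_reward, max_tracking_reward, in_recovery):
    # During the explicit recovery stage, grant the tracking term in full,
    # irrespective of the motion; afterwards, reward actual tracking again.
    if in_recovery:
        return task_reward + max_tracking_reward
    return task_reward + tracking_reward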
|
|
16:30-18:00, Paper WeCT17-AX.3 | Add to My Program |
Imitating and Finetuning Model Predictive Control for Robust and Symmetric Quadrupedal Locomotion |
|
Youm, Donghoon | Korea Advanced Institute of Science and Technology |
Jung, Hyunyoung | Georgia Institute of Technology |
Kim, HyeongJun | Korea Advanced Institute of Science and Technology |
Hwangbo, Jemin | Korean Advanced Institute of Science and Technology |
Park, Hae-Won | Korea Advanced Institute of Science and Technology |
Ha, Sehoon | Georgia Institute of Technology |
Keywords: Legged Robots, Reinforcement Learning, Motion Control
Abstract: Control of legged robots is a challenging problem that has been investigated by different approaches, such as model-based control and learning algorithms. This work proposes a novel Imitating and Finetuning Model Predictive Control (IFM) framework to combine the strengths of both approaches. Our framework first develops a conventional model predictive controller (MPC) using Differential Dynamic Programming and the Raibert heuristic, which serves as an expert policy. Then we train a clone of the MPC using imitation learning to make the controller learnable. Finally, we leverage deep reinforcement learning with limited exploration to further finetune the policy on more challenging terrains. By conducting comprehensive simulation and hardware experiments, we demonstrate that the proposed IFM framework can significantly improve the performance of the given MPC controller on rough, slippery, and conveyor terrains that require careful coordination of footsteps. We also showcase that IFM can efficiently produce more symmetric, periodic, and energy-efficient gaits compared to vanilla RL with a minimal burden of reward shaping.
|
|
16:30-18:00, Paper WeCT17-AX.4 | Add to My Program |
Learning Agile Locomotion and Adaptive Behaviors Via RL-Augmented MPC |
|
Chen, Yiyu | University of Southern California |
Nguyen, Quan | University of Southern California |
Keywords: Legged Robots, Reinforcement Learning, Optimization and Optimal Control
Abstract: In the context of legged robots, adaptive behavior involves adaptive balancing and adaptive swing foot reflection. While adaptive balancing counteracts perturbations to the robot, adaptive swing foot reflection helps the robot to navigate intricate terrains without foot entrapment. In this paper, we bring both aspects of adaptive behavior to quadruped locomotion by combining RL and MPC, improving the robustness and agility of blind legged locomotion. This integration leverages MPC's strength in predictive capabilities and RL's adeptness in drawing from past experiences. Unlike traditional locomotion controls that separate stance foot control and swing foot trajectory, our approach unifies them, addressing their lack of synchronization. At the heart of our contribution is the synthesis of stance foot control with swing foot reflection, improving agility and robustness in locomotion with adaptive behavior. A hallmark of our approach is robust blind stair climbing through swing foot reflection. Moreover, we intentionally designed the learning module as a general plugin for different robot platforms. We trained the policy and implemented our approach on the Unitree A1 robot, achieving impressive results: a peak turn rate of 8.5 rad/s, a peak running speed of 3 m/s, and steering at a speed of 2.5 m/s. Remarkably, this framework also allows the robot to maintain stable locomotion while bearing an unexpected load of 10 kg, or 83% of its body mass. We further demonstrate the generalizability and robustness of the same policy, which realizes zero-shot transfer to different robot platforms such as the Go1 and AlienGo robots for load carrying. Code is made available for the use of the research community at https://github.com/DRCL-USC/RL_augmented_MPC.git.
|
|
16:30-18:00, Paper WeCT17-AX.5 | Add to My Program |
Extreme Parkour with Legged Robots |
|
Cheng, Xuxin | University of California, San Diego |
Shi, Kexin | Carnegie Mellon University |
Agarwal, Ananye | Carnegie Mellon University |
Pathak, Deepak | Carnegie Mellon University |
Keywords: Legged Robots, Reinforcement Learning, Perception-Action Coupling
Abstract: Humans can perform parkour by traversing obstacles in a highly dynamic fashion requiring precise eye-muscle coordination and movement. Getting robots to do the same task requires overcoming similar challenges. Classically, this is done by independently engineering perception, actuation, and control systems to very low tolerances. This restricts them to tightly controlled settings such as a predetermined obstacle course in labs. In contrast, humans are able to learn parkour through practice without significantly changing their underlying biology. In this paper, we take a similar approach to developing robot parkour on a small low-cost robot with imprecise actuation and a single front-facing depth camera for perception which is low-frequency, jittery, and prone to artifacts. We show how a single neural net policy operating directly from a camera image, trained in simulation with large-scale RL, can overcome imprecise sensing and actuation to output highly precise control behavior end-to-end. We show our robot can perform a high jump on obstacles 2x its height, long jump across gaps 2x its length, do a handstand and run across tilted ramps, and generalize to novel obstacle courses with different physical properties. Parkour videos at https://extreme-parkour.github.io/.
|
|
16:30-18:00, Paper WeCT17-AX.6 | Add to My Program |
Learning Risk-Aware Quadrupedal Locomotion Using Distributional Reinforcement Learning |
|
Schneider, Lukas | ETH Zurich |
Frey, Jonas | ETH Zurich |
Miki, Takahiro | ETH Zurich |
Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Reinforcement Learning, Robot Safety
Abstract: Deployment in hazardous environments requires robots to understand the risks associated with their actions and movements to prevent accidents. Despite their importance, these risks are not explicitly modeled by currently deployed locomotion controllers for legged robots. In this work, we propose a risk-sensitive locomotion training method that employs distributional reinforcement learning to consider safety explicitly. Instead of relying on a value expectation, we estimate the complete value distribution to account for uncertainty in the robot's interaction with the environment. The value distribution is consumed by a risk metric to extract risk-sensitive value estimates. These are integrated into Proximal Policy Optimization (PPO) to derive our method, Distributional Proximal Policy Optimization (DPPO). The risk preference, ranging from risk-averse to risk-seeking, can be controlled by a single parameter, which enables the robot's behavior to be adjusted dynamically. Importantly, our approach removes the need for additional reward function tuning to achieve risk sensitivity. We show emergent risk-sensitive locomotion behavior in simulation and on the quadrupedal robot ANYmal. Videos of the experiments and code are available at https://sites.google.com/leggedrobotics.com/risk-aware-locomotion.
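To illustrate how a risk metric converts a value distribution into a risk-sensitive estimate, the Python sketch below computes a Conditional Value-at-Risk (CVaR) of a set of quantile estimates; CVaR is only one common choice, and the specific metric and its integration into DPPO may differ from this.

import numpy as np

def cvar_from_quantiles(quantile_values, alpha):
    # quantile_values: quantiles of the return distribution (a quantile critic's output)
    # alpha in (0, 1]: fraction of the lower tail to average; alpha = 1 recovers the
    # risk-neutral mean, small alpha yields a risk-averse value estimate.
    q = np.sort(np.asarray(quantile_values, dtype=float))
    k = max(1, int(np.ceil(alpha * len(q))))
    return q[:k].mean()

returns = [-2.0, -0.5, 0.1, 0.4, 0.8, 1.0, 1.2, 1.5]
print(cvar_from_quantiles(returns, alpha=1.0))    # risk-neutral
print(cvar_from_quantiles(returns, alpha=0.25))   # risk-averse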
|
|
16:30-18:00, Paper WeCT17-AX.7 | Add to My Program |
Robust Quadrupedal Locomotion Via Risk-Averse Policy Learning |
|
Shi, Jiyuan | Tsinghua University |
Bai, Chenjia | Shanghai Artificial Intelligence Laboratory |
He, Haoran | Shanghai Jiao Tong University |
Han, Lei | Tencent Robotics X |
Wang, Dong | Shanghai Artificial Intelligence Laboratory |
Zhao, Bin | Northwestern Polytechnical University |
Zhao, Mingguo | Tsinghua University |
Li, Xiu | Tsinghua University |
Li, Xuelong | Northwestern Polytechnical University |
Keywords: Reinforcement Learning, Legged Robots, Robust/Adaptive Control
Abstract: The robustness of legged locomotion is crucial for quadrupedal robots in challenging terrains. Recently, Reinforcement Learning (RL) has shown promising results in legged locomotion, and various methods try to integrate privileged distillation, scene modeling, and external sensors to improve the generalization and robustness of locomotion policies. However, these methods struggle to handle uncertain scenarios such as abrupt terrain changes or unexpected external forces. In this paper, we consider a novel risk-sensitive perspective to enhance the robustness of legged locomotion. Specifically, we employ a distributional value function learned by quantile regression to model the aleatoric uncertainty of environments, and perform risk-averse policy learning by optimizing the worst-case scenarios via a risk distortion measure. Extensive experiments in both simulation environments and on a real Aliengo robot demonstrate that our method is efficient in handling various external disturbances, and the resulting policy exhibits improved robustness in harsh and uncertain situations in legged locomotion. Videos are available at https://risk-averse-locomotion.github.io/.
|
|
16:30-18:00, Paper WeCT17-AX.8 | Add to My Program |
Maximizing Quadruped Velocity by Minimizing Energy |
|
Mahankali, Srinath | Massachusetts Institute of Technology |
Lee, Chi-Chang | Research Center for Information Technology Innovation, Academia |
Margolis, Gabriel | Massachusetts Institute of Technology |
Hong, Zhang-Wei | National Tsing Hua University |
Agrawal, Pulkit | MIT |
Keywords: Reinforcement Learning, Legged Robots, Sensorimotor Learning
Abstract: Reinforcement Learning (RL) has been a powerful tool for training robots to acquire agile locomotion skills. To learn locomotion, it is commonly necessary to introduce additional reward-shaping terms, such as an energy minimization term, to guide an algorithm like Proximal Policy Optimization (PPO) to good performance. Prior works rely on hyper-parameter tuning of the weights of these reward-shaping terms to obtain satisfactory task performance. To avoid the effort of tuning these weights, we adopt the Extrinsic-Intrinsic Policy Optimization (EIPO) framework. The key idea of EIPO is to establish a constrained optimization framework with the primary objective of enhancing task performance and the secondary objective of minimizing energy consumption. It seeks a policy that minimizes the energy consumption objective within the optimal policy space for task performance. This guarantees that the learned policy excels in task performance while conserving energy, all without requiring manual weight adjustments for both objectives. Our experiments evaluate EIPO on various quadruped locomotion tasks, revealing that policies trained with EIPO consistently achieve higher task performance than PPO baselines while maintaining comparable energy consumption levels. Furthermore, EIPO exhibits superior task performance in real-world evaluations compared to PPO.
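A generic constrained-RL sketch of the underlying idea is shown below, with a dual-ascent multiplier standing in for a hand-tuned energy weight; this is an illustrative simplification, not the exact EIPO algorithm.

def scalarized_reward(task_reward, energy_cost, lam):
    # The policy maximizes task reward; the multiplier lam weighs the energy
    # term and is adapted from the constraint instead of being hand-tuned.
    return task_reward - lam * energy_cost

def update_multiplier(lam, task_return, task_return_target, lr=0.01):
    # Dual ascent: when the policy meets the task-performance target, lam may
    # grow and push energy consumption down further; when it falls short, lam
    # shrinks so the energy term matters less. The target would come from a
    # task-only (unconstrained) policy.
    return max(0.0, lam + lr * (task_return - task_return_target))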
|
|
16:30-18:00, Paper WeCT17-AX.9 | Add to My Program |
Dexterous Legged Locomotion in Confined 3D Spaces with Reinforcement Learning |
|
Xu, Zifan | University of Texas at Austin |
Raj, Amir Hossain | George Mason University |
Xiao, Xuesu | George Mason University |
Stone, Peter | University of Texas at Austin |
Keywords: Reinforcement Learning, Legged Robots, Vision-Based Navigation
Abstract: Recent advances in locomotion controllers utilizing deep reinforcement learning (RL) have yielded impressive results in terms of achieving rapid and robust locomotion across challenging terrain, such as rugged rocks, non-rigid ground, and slippery surfaces. However, while these controllers primarily address challenges underneath the robot, relatively little research has investigated legged mobility through confined 3D spaces, such as narrow tunnels or irregular voids, which impose all-around constraints. The cyclic gait patterns produced by existing RL-based methods, which learn parameterized locomotion skills characterized by motion parameters such as velocity and body height, may not be adequate for navigating robots through challenging confined 3D spaces that require both agile 3D obstacle avoidance and robust legged locomotion. Instead, we propose to learn locomotion skills end-to-end from goal-oriented navigation in confined 3D spaces. To address the inefficiency of tracking distant navigation goals, we introduce a hierarchical locomotion controller that combines a classical planner, tasked with planning waypoints to reach a faraway global goal location, and an RL-based policy, trained to follow these waypoints by generating low-level motion commands. This approach allows the policy to explore its own locomotion skills within the entire solution space and facilitates smooth transitions between local goals, enabling long-term navigation towards distant goals. In simulation, our hierarchical approach succeeds at navigating through demanding confined 3D environments, outperforming both pure end-to-end learning approaches and parameterized locomotion skills. We further demonstrate the successful real-world deployment of our simulation-trained controller on a real robot.
|
|
WeCT18-AX Oral Session, AX-206 |
Add to My Program |
Optimization and Optimal Control II |
|
|
Chair: Calinon, Sylvain | Idiap Research Institute |
Co-Chair: Lee, Jinoh | German Aerospace Center (DLR) |
|
16:30-18:00, Paper WeCT18-AX.1 | Add to My Program |
Optimal Control for Clutched-Elastic Robots: A Contact-Implicit Approach |
|
Ossadnik, Dennis | Technical University of Munich |
Rakcevic, Vasilije | Technical University of Munich |
Yildirim, Mehmet Can | Technical University of Munich |
Pozo Fortunić, Edmundo | Technical University of Munich |
Kussaba, Hugo Tadashi | Technical University of Munich |
Swikir, Abdalla | Technical University of Munich |
Haddadin, Sami | Technical University of Munich |
Keywords: Optimization and Optimal Control, Compliant Joints and Mechanisms, Actuation and Joint Mechanisms
Abstract: Intrinsically elastic robots surpass their rigid counterparts in a range of different characteristics. By temporarily storing potential energy and subsequently converting it to kinetic energy, elastic robots are capable of highly dynamic motions even with limited motor power. However, the time-dependency of this energy storage and release mechanism remains one of the major challenges in controlling elastic robots. A possible remedy is the introduction of locking elements (i.e. clutches and brakes) in the drive train. This gives rise to a new class of robots, so-called clutched-elastic robots (CER), with which it is possible to precisely control the energy-transfer timing. A prevalent challenge in the realm of CERs is the automatic discovery of clutch sequences. Due to complexity, many methods still rely on pre-defined modes. In this paper, we introduce a novel contact-implicit scheme designed to optimize both control input and clutch sequence simultaneously. A penalty in the objective function ensures the prevention of unnecessary clutch transitions. We empirically demonstrate the effectiveness of our proposed method on a double pendulum equipped with two of our newly proposed clutch-based Bi-Stiffness Actuators (BSA).
|
|
16:30-18:00, Paper WeCT18-AX.2 | Add to My Program |
Optimal Control for Articulated Soft Robots |
|
Chhatoi, Saroj Prasad | University of Pisa |
Pierallini, Michele | Centro Di Ricerca E. Piaggio - Università Di Pisa |
Angelini, Franco | University of Pisa |
Mastalli, Carlos | Heriot-Watt University |
Garabini, Manolo | Università Di Pisa |
Keywords: Optimization and Optimal Control, Flexible Robots, Underactuated Robots, articulated soft robots
Abstract: Soft robots can execute tasks with safer interactions. However, control techniques that can effectively exploit the systems' capabilities are still missing. Differential dynamic programming (DDP) has emerged as a promising tool for achieving highly dynamic tasks. But most of the literature deals with applying DDP to articulated soft robots by using numerical differentiation, in addition to using pure feed-forward control to perform explosive tasks. Further, underactuated compliant robots are known to be difficult to control, and the use of DDP-based algorithms to control them is not yet addressed. We propose an efficient DDP-based algorithm for trajectory optimization of articulated soft robots that can optimize the state trajectory, input torques, and stiffness profile. We provide an efficient method to compute the forward dynamics and the analytical derivatives of series elastic actuators/variable stiffness actuators and underactuated compliant robots. We present a state-feedback controller that uses locally optimal feedback policies obtained from DDP. We show through simulations and experiments that the method can generate motion plans and control for robots.
|
|
16:30-18:00, Paper WeCT18-AX.3 | Add to My Program |
Force Feedback Model-Predictive Control Via Online Estimation |
|
Jordana, Armand | New York University |
Kleff, Sebastien | New York University |
Carpentier, Justin | INRIA |
Mansard, Nicolas | CNRS |
Righetti, Ludovic | New York University |
Keywords: Optimization and Optimal Control, Force Control, Sensor-based Control
Abstract: Nonlinear model-predictive control has recently shown its practicability in robotics. However it remains limited in contact interaction tasks due to its inability to leverage sensed efforts. In this work, we propose a novel model-predictive control approach that incorporates direct feedback from force sensors while circumventing explicit modeling of the contact force evolution. Our approach is based on the online estimation of the discrepancy between the force predicted by the dynamics model and force measurements, combined with high-frequency nonlinear model-predictive control. We report an experimental validation on a torque-controlled manipulator in challenging tasks for which accurate force tracking is necessary. We show that a simple reformulation of the optimal control problem combined with standard estimation tools enables to achieve state-of-the-art performance in force control while preserving the benefits of model-predictive control, thereby outperforming traditional force control techniques. This work paves the way toward a more systematic integration of force sensors in model predictive control.
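The core idea of estimating the discrepancy between predicted and measured forces online, then feeding it back to the model, can be sketched as follows (a simple first-order filter is assumed here; the paper's estimator and MPC formulation are more elaborate).

import numpy as np

class ForceOffsetEstimator:
    # Tracks the mismatch between model-predicted and measured contact force
    # with an exponential filter; the offset would be added to the model force
    # inside the next MPC solve.
    def __init__(self, dim=3, gain=0.1):
        self.offset = np.zeros(dim)
        self.gain = gain

    def update(self, f_measured, f_predicted):
        residual = np.asarray(f_measured) - np.asarray(f_predicted)
        self.offset += self.gain * (residual - self.offset)
        return self.offset

est = ForceOffsetEstimator()
print(est.update([5.0, 0.0, 0.2], [4.0, 0.0, 0.0]))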
|
|
16:30-18:00, Paper WeCT18-AX.4 | Add to My Program |
Geometric Algebra for Optimal Control with Applications in Manipulation Tasks |
|
Löw, Tobias | Idiap Research Institute, EPFL |
Calinon, Sylvain | Idiap Research Institute |
Keywords: Optimization and Optimal Control, Motion Control of Manipulators, Manipulation Planning, Geometric Algebra
Abstract: Many problems in robotics are fundamentally problems of geometry, which has led to increased research effort in geometric methods for robotics in recent years. This effort has produced algorithms using the various frameworks of screw theory, Lie algebra, and dual quaternions. A unification and generalization of these popular formalisms can be found in geometric algebra. The aim of this paper is to showcase the capabilities of geometric algebra when applied to robot manipulation tasks. In particular, the modelling of cost functions for optimal control can be done uniformly across different geometric primitives, leading to a low symbolic complexity of the resulting expressions and geometric intuitiveness. We demonstrate the usefulness, simplicity, and computational efficiency of geometric algebra in several experiments using a Franka Emika robot. The presented algorithms were implemented in C++20 and resulted in the publicly available library gafro. The benchmark shows faster computation of the kinematics than state-of-the-art robotics libraries.
|
|
16:30-18:00, Paper WeCT18-AX.5 | Add to My Program |
Trajectory Tracking Runtime Assurance for Systems with Partially Unknown Dynamics |
|
Cao, Michael Enqi | Georgia Institute of Technology |
Coogan, Samuel | Georgia Tech |
Keywords: Optimization and Optimal Control, Planning under Uncertainty, Robot Safety
Abstract: We consider the problem of tracking a reference trajectory for dynamical systems subject to a priori unknown state-dependent disturbance behavior. We propose a formulation that embeds the uncertain system into a higher dimensional deterministic system that accounts for worst case disturbances. Our main insight is that a single controlled trajectory of this embedding system corresponds to a controlled forward invariant interval tube around the reference trajectory. By taking observations of the system, we then propose to estimate the state-dependent uncertainty with Gaussian Process regression, which improves the accuracy of the forward invariant tube as data is collected. Given a safety objective, we also provide conditions on when an additional observation of the unknown disturbance behavior needs to be collected to maintain safety. We demonstrate our formulation on a case study of a planar multirotor attempting a safe landing in an unknown wind field.
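As an illustration of the Gaussian Process step, the sketch below fits a GP to noisy observations of a hypothetical 1-D state-dependent disturbance and extracts conservative bounds (mean plus/minus 3 sigma) of the kind a worst-case embedding could consume; it uses scikit-learn and synthetic data rather than the paper's multirotor wind-field case study.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(30, 1))                           # sampled states
y = 0.5 * np.sin(2.0 * X[:, 0]) + 0.05 * rng.standard_normal(30)   # noisy disturbance observations

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-3),
                              normalize_y=True)
gp.fit(X, y)

mean, std = gp.predict(np.array([[0.3]]), return_std=True)
lower, upper = mean - 3.0 * std, mean + 3.0 * std                   # disturbance bounds at this state
print(lower, upper)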
|
|
16:30-18:00, Paper WeCT18-AX.6 | Add to My Program |
How to Train Your Neural Control Barrier Function: Learning Safety Filters for Complex Input-Constrained Systems |
|
So, Oswin | Massachusetts Institute of Technology |
Serlin, Zachary | Boston University |
Mann, Makai | MIT Lincoln Laboratory |
Gonzales, Jake | University of Washington |
Rutledge, Kwesi | University of Michigan |
Roy, Nicholas | Massachusetts Institute of Technology |
Fan, Chuchu | Massachusetts Institute of Technology |
Keywords: Optimization and Optimal Control, Robot Safety, Machine Learning for Robot Control
Abstract: Control barrier functions (CBFs) have become popular as a safety filter to guarantee the safety of nonlinear dynamical systems for arbitrary inputs. However, it is difficult to construct functions that satisfy the CBF constraints for high relative degree systems with input constraints. To address these challenges, recent work has explored learning CBFs using neural networks via neural CBFs (NCBFs). However, such methods face difficulties when scaling to higher dimensional systems under input constraints. In this work, we first identify challenges that NCBFs face during training. Next, to address these challenges, we propose policy neural CBFs (PNCBFs), a method of constructing CBFs by learning the value function of a nominal policy, and show that the value function of the maximum-over-time cost is a CBF. We demonstrate the effectiveness of our method in simulation on a variety of systems ranging from toy linear systems to a jet aircraft with a 16-dimensional state space. Finally, we validate our approach on a two-agent quadcopter system on hardware under tight input constraints.
|
|
16:30-18:00, Paper WeCT18-AX.7 | Add to My Program |
Stable, Safe, and Passive Teleoperation of Multi-Robot Systems |
|
Notomista, Gennaro | University of Waterloo |
Keywords: Optimization and Optimal Control, Safety in HRI, Telerobotics and Teleoperation
Abstract: In this paper, we present a unified framework to ensure the stability, safety, and passivity of a multi-robot teleoperation system in a holistic fashion. The proposed approach consists of encoding these three properties as constraints in an optimization-based controller using control Lyapunov and (integral) control barrier functions. The result is a stability-safety-passivity (SSP) filter implemented as a convex optimization control policy, which can be efficiently evaluated in an online fashion. The developed filter minimally modifies the teleoperation input in order to ensure that the robotic system remains stable, safe, and passive. The effectiveness of the developed approach is showcased using a team of mobile robots in a human-multi-robot teleoperation scenario.
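A minimal quadratic program in the spirit of such a filter is sketched below for single-integrator dynamics using CVXPY; the passivity (integral control barrier function) constraint of the actual SSP filter is omitted, and the gains, dynamics, and scenario are illustrative assumptions.

import numpy as np
import cvxpy as cp

def clf_cbf_filter(u_tele, x, x_goal, obstacle, r_safe,
                   alpha=1.0, gamma=1.0, slack_weight=100.0):
    # Minimally modify the teleoperation input u_tele (dynamics x_dot = u) so that
    # a barrier (keep distance r_safe from an obstacle) holds and a Lyapunov
    # decrease toward x_goal is encouraged via a relaxed CLF constraint.
    u = cp.Variable(2)
    delta = cp.Variable(nonneg=True)                    # CLF relaxation slack
    h = np.sum((x - obstacle) ** 2) - r_safe ** 2       # barrier function
    grad_h = 2.0 * (x - obstacle)
    V = 0.5 * np.sum((x - x_goal) ** 2)                 # Lyapunov function
    grad_V = x - x_goal
    constraints = [grad_h @ u >= -alpha * h,            # safety (CBF)
                   grad_V @ u <= -gamma * V + delta]    # stability (relaxed CLF)
    cost = cp.sum_squares(u - u_tele) + slack_weight * delta
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return u.value

print(clf_cbf_filter(np.array([1.0, 0.0]), np.array([0.0, 0.0]),
                     np.array([2.0, 0.0]), np.array([1.0, 0.2]), 0.5))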
|
|
16:30-18:00, Paper WeCT18-AX.8 | Add to My Program |
Approximate Optimal Controller Synthesis for Cart-Poles and Quadrotors Via Sums-Of-Squares |
|
Yang, Lujie | MIT |
Dai, Hongkai | Toyota Research Institute |
Amice, Alexandre | MIT |
Tedrake, Russ | Massachusetts Institute of Technology |
Keywords: Optimization and Optimal Control, Underactuated Robots
Abstract: Sums-of-squares (SOS) optimization is a promising tool to synthesize certifiable controllers for nonlinear dynamical systems. Building upon prior works, we demonstrate that SOS can synthesize dynamic controllers with bounded suboptimal performance for various underactuated robotic systems by finding good approximations of the value function. We summarize a unified SOS framework to synthesize both under- and over-approximations of the value function for continuous-time, control-affine systems, use these approximations to generate approximate optimal controllers, and perform regional analysis on the closed-loop system driven by these controllers. We then extend the formulation to handle hybrid systems with contacts. We demonstrate that our method can generate tight under- and over-approximations of the value function with low-degree polynomials, which are used to provide stabilizing controllers for continuous-time systems including the inverted pendulum, the cart-pole, and the quadrotor as well as a hybrid system, the planar pusher. To the best of our knowledge, this is the first time that a SOS-based time-invariant controller can swing up and stabilize a cart-pole, and push the planar slider to the desired pose.
|
|
16:30-18:00, Paper WeCT18-AX.9 | Add to My Program |
Online Multi-Contact Feedback Model Predictive Control for Interactive Robotic Tasks |
|
Han, Seo Wook | Korean Advanced Institute of Science and Technology |
Iskandar, Maged | German Aerospace Center - DLR |
Lee, Jinoh | German Aerospace Center (DLR) |
Kim, Min Jun | KAIST |
Keywords: Force Control, Physical Human-Robot Interaction
Abstract: In this paper, we propose a model predictive control (MPC) approach that accomplishes interactive robotic tasks in which multiple contacts may occur at unknown locations. To address such scenarios, we build an explicit contact feedback loop into the MPC framework. An algorithm called Multi-Contact Particle Filter with Exploration Particle (MCP-EP) is employed to establish real-time feedback of multi-contact information. The interaction locations and forces are then accommodated in the MPC framework via a spring contact model. Moreover, we achieve real-time control for a 7-degree-of-freedom robot without any simplifying assumptions by employing a Differential-Dynamic-Programming algorithm. The MPC achieves update rates of 6.8 kHz, 1.9 kHz, and 1.8 kHz for 0, 1, and 2 contacts, respectively. This allows the robot to handle unexpected contacts in real time. Real-world experiments show the effectiveness of the proposed method in various scenarios.
|
|
WeCT19-NT Oral Session, NT-G301 |
Add to My Program |
Medical Robots VI |
|
|
Chair: Do, Thanh Nho | University of New South Wales |
Co-Chair: Iordachita, Ioan Iulian | Johns Hopkins University |
|
16:30-18:00, Paper WeCT19-NT.1 | Add to My Program |
Learning-Based Efficient Phase-Amplitude Modulation and Hybrid Control for MRI-Guided Focused Ultrasound Treatment |
|
Dai, Jing | The University of Hong Kong |
Zhu, Bohao | University of Hong Kong |
Wang, Xiaomei | The University of Hong Kong |
Jiang, Zhiyi | The University of Hong Kong |
Wu, Mengjie | The University of Hong Kong |
Liang, Liyuan | The University of Hong Kong |
Xie, Xiaochen | Harbin Institute of Technology, Shenzhen |
Lam, James | University of Hong Kong |
Chang, Hing-Chiu | The University of Hong Kong |
Kwok, Ka-Wai | The University of Hong Kong |
Keywords: Medical Robots and Systems, Surgical Robotics: Planning
Abstract: Magnetic resonance-guided focused ultrasound (MRg-FUS) has become attractive owing to its non-invasive nature. However, ultrasound beam focusing and steering remain challenging because of aberrations induced by soft-tissue heterogeneity, particularly for beam motion control that must ensure real-time, precise tracking in deep-seated regions over abdominal organs while accounting for full-wave propagation. To this end, we propose a closed-loop hybrid control scheme and a learning-based modulation model for robot-assisted MRg-FUS treatments. By introducing a rapid phase estimator that provides an efficient (<3 ms) solution, the robust H_infinity controller enables real-time and accurate tracking (0.30 mm) without prior knowledge of the heterogeneous media, even under unknown disturbances. Our model enables rapid (2.65 ms) phase-amplitude modulation and precise targeting (mean 0.35 mm, max. 0.65 mm), meeting clinical standards. Focal obliquity is significantly "aligned" to only 2.7°. Results from sensitivity analysis and transducer design also support the model's clinical feasibility and potential in widespread MRg-FUS treatments.
|
|
16:30-18:00, Paper WeCT19-NT.2 | Add to My Program |
Co-Axial Slender Tubular Robot (CAST): Towards Robotized Operation for Transorbital Neurosurgery with Minimal Invasiveness |
|
Wang, Shuai | HKPU(The Hong Kong Polytechnic University) |
Zhao, Qing xiang | Hong Kong Institute of Science & Innovation, Centre for Artifici |
Chen, Jian | University of Chinese Academy of Sciences |
Chen, Mingcong | City University of Hong Kong |
Cao, Guanglin | Institute of Automation, Chinese Academy of Sciences |
Hu, Jian | Institute of Automation, Chinese Academy of Sciences |
Zhu, Runfeng | The Hong Kong Polytechnic University |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles, Modeling, Control, and Learning for Soft Robots
Abstract: Transorbital Neurosurgery (TNS) offers a novel, minimally invasive treatment route to lesions inside the skull. Most conventional TNS tools are rigid and straight, limiting dexterity and accessibility when passing through a small port. Bendable and steerable surgical tools provide an alternative for this issue. In this work, we propose a dual-segment slender surgical robot arm for TNS, a Co-Axial Slender Tubular robot (CAST), and model it using novel approaches. Another contribution is the tendon-mortise shaped slits along the axial direction, which enhance the overall stiffness. The bending of CAST is actuated by a pushing/pulling displacement, and the maximum diameter is only 1.7 mm, offering high dexterity after mounting on a rigid robot arm. Experiments demonstrate that the proposed slit design doubles the stiffness compared to traditional rectangular slit designs. In a path-following task, the position error was at most 3 mm under open-loop control. A test on a skull model demonstrates that the whole system can effectively perform an electrocoagulation procedure deep inside the skull in a robotized manner.
|
|
16:30-18:00, Paper WeCT19-NT.3 | Add to My Program |
Vascular Centerline-Guided Autonomous Navigation Methods for Robot-Lead Endovascular Interventions |
|
Li, Naner | Huazhong University of Science and Technology |
Wang, Yiwei | Huazhong University of Science and Technology |
Cheng, Haoyuan | Huazhong University of Science and Technology |
Zhao, Huan | Huazhong University of Science and Technology |
Ding, Han | Huazhong University of Science and Technology |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles, Robotics and Automation in Life Sciences
Abstract: In minimally invasive endovascular interventional surgery, guidewire navigation is an indispensable process. However, even experienced physicians often encounter difficulties in manually manipulating the guidewire for branch selection, while also facing the risk of radiation exposure. In this study, we investigated robotic autonomous guidewire navigation methods. An electromagnetic system was used to track the real-time position and orientation of the guidewire tip, and a state space representing the guidewire within the vascular environment was constructed to guide the robot in precise guidewire manipulation. Experimental results demonstrated that the proposed trial-and-error and centerline-guided methods successfully completed navigation tasks in a static environment, outperforming human navigation performance in terms of trajectory smoothness, trajectory length, and incorrect branch entry counts. For navigation in dynamic environments, dynamic time warping (DTW), a technique for measuring the similarity between two temporal sequences, was integrated into the centerline-guided method. The proposed approaches eliminate the need for visual feedback and thereby minimize the risk of radiation exposure for both patients and medical staff present in the operating room during the procedure.
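Since the dynamic-environment variant relies on dynamic time warping, a plain textbook DTW distance is sketched below for 1-D sequences; how it is embedded in the centerline-guided navigation loop, and the actual feature sequences being compared, are not shown here.

import numpy as np

def dtw_distance(seq_a, seq_b):
    # Classic dynamic-programming DTW between two 1-D sequences.
    a, b = np.asarray(seq_a, dtype=float), np.asarray(seq_b, dtype=float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw_distance([0, 1, 2, 3, 2, 1], [0, 0, 1, 2, 3, 2, 1]))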
|
|
16:30-18:00, Paper WeCT19-NT.4 | Add to My Program |
A Soft Micro-Robotic Catheter for Aneurysm Treatment: A New Design and Enhanced Euler-Bernoulli Model with Cross-Section Optimization |
|
Emanuele, Nicotra | UNSW Sydney |
Chi Cong, Nguyen | University of New South Wales |
Davies, James J. | University of New South Wales |
Phan, Phuoc Thien | University of New South Wales |
Hoang, Trung Thien | University of New South Wales |
Bibhu, Sharma | UNSW Sydney |
Ji, Adrienne | University of New South Wales |
Zhu, Kefan | UNSW Sydney |
Ngo, Trung Dung | University of Prince Edward Island |
Ho, Van | Japan Advanced Institute of Science and Technology |
La, Hung | University of Nevada at Reno |
Lovell, Nigel Hamilton | University of New South Wales |
Do, Thanh Nho | University of New South Wales |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles, Soft Robot Applications
Abstract: Aneurysms, balloon-like bulges in blood vessels, present a significant health risk due to their potential to rupture, leading to life-threatening internal bleeding. Current treatments often involve delivering embolic materials or metal coils to fill these bulges, occluding them from the pressure of blood flow. However, the clinical micro-catheters that deploy embolic materials today face limitations, primarily their rigidity and the lack of active control over the bending tip of the catheter. This paper introduces a new soft micro-robotic catheter, with a diameter of only 0.8 mm, equipped with a hollow channel. With this design, the device can induce bending motions at its tip for active steerability to reach desired aneurysm targets and then deliver embolic materials and tools. To enhance control and precise navigation during procedures, a robust mathematical model and image processing techniques are also introduced and validated. Experiments are also performed to characterise and validate the model's accuracy and the steerability and navigation capabilities of the new micro-catheter.
|
|
16:30-18:00, Paper WeCT19-NT.5 | Add to My Program |
A Generic Modeling Framework for the Design of Tendon-Driven Continuum Manipulators with Flexure Patterns |
|
Liu, Yang | The University of Texas at Austin |
Kim, Hansoul | The University of California, Berkeley |
Kulkarni, Yash | The University of Texas at Austin |
Alambeigi, Farshid | University of Texas at Austin |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles, Tendon/Wire Mechanism
Abstract: In this paper, a novel mathematical framework is introduced for modeling the deformation behavior of Tendon-Driven Continuum Manipulators (TD-CMs) featuring discontinuous cross-sectional geometries (i.e., having flexural patterns). Leveraging this framework, we also introduce the concept of a design space by which the deformation-behavior space of a TD-CM can intuitively be analyzed via its geometrical design parameters. To thoroughly evaluate the performance of the proposed modeling framework, we have conducted various simulation studies and experiments.
|
|
16:30-18:00, Paper WeCT19-NT.6 | Add to My Program |
Bevel-Tip Needle Deflection Modeling, Simulation, and Validation in Multi-Layer Tissues |
|
Wang, Yanzhou | Johns Hopkins University |
Al-Zogbi, Lidia | Johns Hopkins University |
Liu, Guanyun | University of Florida |
Liu, Jiawei | Johns Hopkins University |
Tokuda, Junichi | Brigham and Women's Hospital and Harvard Medical School |
Krieger, Axel | Johns Hopkins University |
Iordachita, Ioan Iulian | Johns Hopkins University |
Keywords: Medical Robots and Systems, Surgical Robotics: Steerable Catheters/Needles
Abstract: Percutaneous needle insertions are commonly performed for diagnostic and therapeutic purposes as an effective alternative to more invasive surgical procedures. However, the outcome of needle-based approaches relies heavily on the accuracy of needle placement, which remains a challenge even with robot assistance and medical imaging guidance due to needle deflection caused by contact with soft tissues. In this paper, we present a novel mechanics-based 2D bevel-tip needle model that can account for the effect of nonlinear strain-dependent behavior of biological soft tissues under compression. Real-time finite element simulation allows multiple control inputs along the length of the needle with full three-degree-of-freedom (DOF) planar needle motions. Cross-validation studies using custom-designed multi-layer tissue phantoms as well as heterogeneous chicken breast tissues result in less than 1mm in-plane errors for insertions reaching depths of up to 61 mm, demonstrating the validity and generalizability of the proposed method.
|
|
16:30-18:00, Paper WeCT19-NT.7 | Add to My Program |
Excitation Trajectory Optimization for Dynamic Parameter Identification Using Virtual Constraints in Hands-On Robotic System |
|
Huanyu, Tian | Beijing Institute of Technology |
Huber, Martin | King's College London |
Mower, Christopher Edwin | Huawei Technologies Research & Development |
Han, Zhe | Beijing Institute of Technology |
Li, Changsheng | Beijing Institute of Technology |
Duan, Xingguang | Beijing Institute of Technology |
Bergeles, Christos | King's College London |
Keywords: Physical Human-Robot Interaction, Medical Robots and Systems, Optimization and Optimal Control
Abstract: This paper proposes a novel, more computationally efficient method for optimizing robot excitation trajectories for dynamic parameter identification, emphasizing self-collision avoidance. This addresses the system-identification challenge of obtaining high-quality training data for co-manipulated robotic arms that can be equipped with a variety of tools, a common scenario in industrial as well as clinical and research contexts. Utilizing the Unified Robotics Description Format (URDF) and a symbolic Python implementation of the Recursive Newton-Euler Algorithm (RNEA), the approach estimates dynamic parameters such as inertia via regression analysis on data from real robots. The optimized excitation trajectory achieved criteria on par with state-of-the-art reported results, which did not consider self-collision and tool calibration. Furthermore, physical Human-Robot Interaction (pHRI) admittance control experiments were conducted in a surgical context to evaluate the derived inverse dynamics model, showing a 30.1% workload reduction according to the NASA TLX questionnaire.
|
|
WeCT20-NT Oral Session, NT-G302 |
Add to My Program |
Robot Safety II |
|
|
Chair: Das, Ersin | Caltech |
Co-Chair: Del Prete, Andrea | University of Trento |
|
16:30-18:00, Paper WeCT20-NT.1 | Add to My Program |
Learning-Based Inverse Perception Contracts and Applications |
|
Sun, Dawei | UIUC |
Yang, Benjamin | University of Illinois at Urbana Champaign |
Mitra, Sayan | University of Ilinois, Urbana Champagne |
Keywords: Robot Safety, Model Learning for Control, Robust/Adaptive Control
Abstract: Perception modules are integral in many modern autonomous systems, but their accuracy can be subject to the vagaries of the environment. In this paper, we propose a learning-based approach that can automatically characterize the error of a perception module from data and use this for safe control. The proposed approach constructs an inverse perception contract (IPC) which generates a set that contains the ground-truth value that is being estimated by the perception module, with high probability. We apply the proposed approach to study a vision pipeline deployed on a quadcopter. With the proposed approach, we successfully constructed an IPC for the vision pipeline. We then designed a control algorithm that utilizes the learned IPC, with the goal of landing the quadcopter safely on a landing pad. Experiments show that with the learned IPC, the control algorithm safely landed the quadcopter despite the error from the perception module, while the baseline algorithm without using the learned IPC failed to do so.
|
|
16:30-18:00, Paper WeCT20-NT.2 | Add to My Program |
Safe Multi-Robot Exploration Using Symbolic Control |
|
Juvvi, Manas Sashank | Indian Institute of Science, Bengaluru |
Sundarsingh, David Smith | Indian Institute of Science, Bangalore |
Das, Ratnangshu | Indian Institute of Science, Bangalore |
Jagtap, Pushpak | Indian Institute of Science |
Keywords: Robot Safety, Multi-Robot Systems, Mapping
Abstract: Multi-robot exploration is a complex problem that involves multiple robots working in a shared unknown environment. In such scenarios, the safety of the robots is of paramount importance alongside the completion of the exploration task. In this paper, we propose a modular exploration framework that (i) identifies safe frontier targets for multiple robots while taking into account the system dynamics of each robot to ensure collision avoidance with previously unknown obstacles and (ii) ensures that the robots reach their exploration targets while avoiding any obstacles discovered and each other. We employ a scalable approach to generate symbolic controllers for the multi-robot system, utilizing distance functions. We also provide formal guarantees on the safety of the exploration targets and the completion of each exploration run, with the robots avoiding collisions with each other and the obstacles. We test our approach on simulation experiments and a real-world implementation to validate it.
|
|
16:30-18:00, Paper WeCT20-NT.3 | Add to My Program |
Receding-Constraint Model Predictive Control Using a Learned Approximate Control-Invariant Set |
|
Lunardi, Gianni | University of Trento |
La Rocca, Asia | University of Trento |
Saveriano, Matteo | University of Trento |
Del Prete, Andrea | University of Trento |
Keywords: Robot Safety, Optimization and Optimal Control, Machine Learning for Robot Control
Abstract: In recent years, advanced model-based and data-driven control methods are unlocking the potential of complex robotics systems, and we can expect this trend to continue at an exponential rate in the near future. However, ensuring safety with these advanced control methods remains a challenge. A well-known tool to make controllers (either Model Predictive Controllers or Reinforcement Learning policies) safe is the so-called control-invariant set (a.k.a. safe set). Unfortunately, for nonlinear systems, such a set cannot be exactly computed in general. Numerical algorithms exist for computing approximate control-invariant sets, but classic theoretic control methods break down if the set is not exact. This paper presents our recent efforts to address this issue. We present a novel Model Predictive Control scheme that can guarantee recursive feasibility and/or safety under weaker assumptions than classic methods. In particular, recursive feasibility is guaranteed by making the safe-set constraint move backward over the horizon, and assuming that such set satisfies a condition that is weaker than control invariance. Safety is instead guaranteed under an even weaker assumption on the safe set, triggering a safe task-abortion strategy whenever a risk of constraint violation is detected. We evaluated our approach on a simulated robot manipulator, empirically demonstrating that it leads to fewer constraint violations than state-of-the-art approaches, while retaining reasonable performance in terms of tracking cost, number of completed tasks, and computation time.
|
|
16:30-18:00, Paper WeCT20-NT.4 | Add to My Program |
VBOC: Learning the Viability Boundary of a Robot Manipulator Using Optimal Control |
|
La Rocca, Asia | University of Trento |
Saveriano, Matteo | University of Trento |
Del Prete, Andrea | University of Trento |
Keywords: Robot Safety, Optimization and Optimal Control, Machine Learning for Robot Control
Abstract: Safety is often the most important requirement in robotics applications. Nonetheless, control techniques that can provide safety guarantees are still extremely rare for nonlinear systems, such as robot manipulators. A well-known tool to ensure safety is the viability kernel, which is the largest set of states from which safety can be ensured. Unfortunately, computing such a set for a nonlinear system is extremely challenging in general. Several numerical algorithms for approximating it have been proposed in the literature, but they suffer from the curse of dimensionality. This paper presents a new approach for numerically approximating the viability kernel of robot manipulators. Our approach solves optimal control problems to compute states that are guaranteed to be on the boundary of the set. This allows us to learn the set boundary directly, and therefore to learn in a lower-dimensional space. Compared to the state of the art on systems up to dimension 6, our algorithm proved to be more than 2 times as accurate for the same computation time, or 6 times as fast to reach the same accuracy.
|
|
16:30-18:00, Paper WeCT20-NT.5 | Add to My Program |
Closing the Perception-Action Loop for Semantically Safe Navigation in Semi-Static Environments |
|
Qian, Jingxing | University of Toronto |
Zhou, Siqi | Technical University of Munich |
Ren, Nicholas | University of Waterloo |
Chatrath, Veronica | Vector Institute |
Schoellig, Angela P. | TU Munich |
Keywords: Robot Safety, Perception-Action Coupling, Motion Control
Abstract: Autonomous robots navigating in changing environments demand adaptive navigation strategies for safe long-term operation. While many modern control paradigms offer theoretical guarantees, they often assume known extrinsic safety constraints, overlooking challenges when deployed in real-world environments where objects can appear, disappear, and shift over time. In this paper, we present a closed-loop perception-action pipeline that bridges this gap. Our system encodes an online-constructed dense map, along with object-level semantic and consistency estimates into a control barrier function (CBF) to regulate safe regions in the scene. A model predictive controller (MPC) leverages the CBF-based safety constraints to adapt its navigation behaviour, which is particularly crucial when potential scene changes occur. We test the system in simulations and real-world experiments to demonstrate the impact of semantic information and scene change handling on robot behavior, validating the practicality of our approach.
|
|
16:30-18:00, Paper WeCT20-NT.6 | Add to My Program |
Magnetorheological-Actuators: An Enabling Technology for Fast, Safe and Practical Collaborative Robots |
|
St-Jean, Alexandre | Université De Sherbrooke |
Dorval, Francis | Université De Sherbrooke |
Plante, Jean-Sebastien | Université De Sherbrooke |
Lussier Desbiens, Alexis | Université De Sherbrooke |
Keywords: Robot Safety, Physical Human-Robot Interaction, Industrial Robots, Magneto-rheological clutch
Abstract: Collaborative robots are increasingly used in applications requiring robots and humans to work in proximity or direct contact. However, conventional collaborative robots powered by servo-geared actuators are intrinsically dangerous due to their high reflected inertia. Recent studies have shown that low-inertia, high-bandwidth (> 30 Hz) magnetorheological (MR) actuators have the potential to improve the safety of collaborative robots without reducing their force and speed capabilities. The main contribution of this paper is a quantitative assessment of how MR actuators can reduce impact forces with humans and thus increase the safety of collaborative robots. Dynamic models, validated with simplified 1-DOF experiments, show that the safety level of collaborative robots can be increased by a factor of up to 3 simply by replacing conventional servo-geared actuator architectures with MR actuators, with no other changes. The paper also presents a simple, reliable, and fast collision detection method based on band-pass filtering of the joint angular velocity, a method that exploits the unique low-inertia and clean dynamics properties of MR actuators.
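The collision-detection idea, band-pass filtering the joint angular velocity and thresholding the result, can be sketched as follows; the band edges, filter order, and threshold are made-up values, not the paper's tuned parameters.

import numpy as np
from scipy.signal import butter, sosfilt

def detect_collision(joint_velocity, fs, band=(5.0, 30.0), threshold=0.2):
    # A sudden impact excites frequency content inside the pass band that the
    # slow commanded motion does not; thresholding the filtered signal flags it.
    sos = butter(2, band, btype='bandpass', fs=fs, output='sos')
    filtered = sosfilt(sos, np.asarray(joint_velocity, dtype=float))
    return bool(np.abs(filtered).max() > threshold)

fs = 500.0
t = np.arange(0.0, 1.0, 1.0 / fs)
vel = 0.5 * np.sin(2 * np.pi * 0.5 * t)   # slow commanded motion
vel[300:310] += 0.8                       # short impact-like transient
print(detect_collision(vel, fs))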
|
|
16:30-18:00, Paper WeCT20-NT.7 | Add to My Program |
Risk-Aware Control for Robots with Non-Gaussian Belief Spaces |
|
Vahs, Matti | KTH Royal Institute of Technology, Stockholm |
Tumova, Jana | KTH Royal Institute of Technology |
Keywords: Robot Safety, Planning under Uncertainty, Collision Avoidance
Abstract: This paper addresses the problem of safety-critical control of autonomous robots, considering the ubiquitous uncertainties arising from unmodeled dynamics and noisy sensors. To take into account these uncertainties, probabilistic state estimators are often deployed to obtain a belief over possible states. Namely, Particle Filters (PFs) can handle arbitrary non-Gaussian distributions in the robot's state. In this work, we define the belief state and belief dynamics for continuous-discrete PFs and construct safe sets in the underlying belief space. We design a controller that provably keeps the robot's belief state within this safe set. As a result, we ensure that the risk of the unknown robot's state violating a safety specification, such as avoiding a dangerous area, is bounded. We provide an open-source implementation as a ROS2 package and evaluate the solution in simulations and hardware experiments involving high-dimensional belief spaces.
|
|
16:30-18:00, Paper WeCT20-NT.8 | Add to My Program |
Conformal Decision Theory: Safe Autonomous Decisions from Imperfect Predictions |
|
Lekeufack Sopze, Jordan | University of California, Berkeley |
Angelopoulos, Anastasios | University of California, Berkeley |
Bajcsy, Andrea | Carnegie Mellon University |
Jordan, Michael I. | UC Berkeley |
Malik, Jitendra | UC Berkeley |
Keywords: Robot Safety, Planning under Uncertainty, Robust/Adaptive Control
Abstract: We introduce Conformal Decision Theory, a framework for producing safe autonomous decisions despite imperfect machine learning predictions. Examples of such decisions are ubiquitous, from robot planning algorithms that rely on pedestrian predictions, to calibrating autonomous manufacturing to be high throughput but low error, to the choice of trusting a nominal policy versus switching to a safe backup policy at run-time. The decisions produced by our algorithms are safe in the sense that they come with provable statistical guarantees of having low risk without any assumptions on the world model whatsoever; the observations need not be I.I.D. and can even be adversarial. The theory extends results from conformal prediction to calibrate decisions directly, without requiring the construction of prediction sets. Experiments demonstrate the utility of our approach in robot motion planning around humans and robot manufacturing.
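In this spirit, a decision parameter can be recalibrated online from observed outcomes so that the empirical rate of bad outcomes tracks a target risk level, with no distributional assumptions; the toy update below illustrates the flavor of such a rule, though the paper's exact conformal controllers and guarantees may differ.

def conformal_update(lam, bad_outcome, target_risk, step=0.05):
    # lam: conservativeness of the decision rule (larger = more cautious).
    # bad_outcome: 1 if the last decision turned out unsafe/too aggressive, else 0.
    # If bad outcomes occur more often than target_risk, lam increases;
    # otherwise it relaxes, keeping the long-run risk near the target.
    return lam + step * (bad_outcome - target_risk)

lam = 1.0
for bad in [0, 0, 1, 0, 0]:   # hypothetical outcome stream
    lam = conformal_update(lam, bad, target_risk=0.1)
print(lam)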
|
|
16:30-18:00, Paper WeCT20-NT.9 | Add to My Program |
A Learning-Based Framework for Safe Human-Robot Collaboration with Multiple Backup Control Barrier Functions |
|
Janwani, Neil | California Institute of Technology |
Das, Ersin | Caltech |
Touma, Thomas | Caltech |
Wei, Skylar | Caltech |
Molnar, Tamas G. | Wichita State University |
Burdick, Joel | California Institute of Technology |
Keywords: Robot Safety, Safety in HRI, Human-Robot Collaboration
Abstract: Ensuring robot safety in complex environments is a difficult task due to actuation limits, such as torque bounds. This paper presents a safety-critical control framework that leverages learning-based switching between multiple backup controllers to formally guarantee safety under bounded control inputs while satisfying driver intention. By leveraging backup controllers designed to uphold safety and input constraints, backup control barrier functions (BCBFs) construct implicitly defined control invariant sets via a feasible quadratic program (QP). However, BCBF performance largely depends on the design and conservativeness of the chosen backup controller, especially in our setting of human-driven vehicles in complex, e.g., off-road, conditions. While conservativeness can be reduced by using multiple backup controllers, determining when to switch is an open problem. Consequently, we develop a broadcast scheme that estimates driver intention and integrates BCBFs with multiple backup strategies for human-robot interaction. An LSTM classifier uses data inputs from the robot, human, and safety algorithms to continually choose a backup controller in real-time. We demonstrate our method's efficacy on a dual-track robot in obstacle avoidance scenarios. Our framework guarantees robot safety while adhering to driver intention.
|
|
WeCT21-NT Oral Session, NT-G303 |
Add to My Program |
Bioinspired Robot Abilities |
|
|
Chair: Kanada, Ayato | Kyushu University |
Co-Chair: Yoon, Jungwon | Gwangju Institutue of Science and Technology |
|
16:30-18:00, Paper WeCT21-NT.1 | Add to My Program |
Geared Rod-Driven Continuum Robot with Woodpecker-Inspired Extension Mechanism and IMU-Based Force Sensing |
|
Mavinkurve, Ujjal | Kyushu University |
Kanada, Ayato | Kyushu University |
Tafrishi, Seyed Amir | Cardiff Univerity |
Honda, Koki | The University of Tokyo |
Nakashima, Yasutaka | Kyushu University |
Yamamoto, Motoji | Kyushu University |
Keywords: Biologically-Inspired Robots, Compliant Joints and Mechanisms, Force and Tactile Sensing
Abstract: Continuum robot arms that can access confined spaces are useful in many applications, such as invasive surgery, search and rescue, and inspection. However, their reach is often limited because their extension mechanism relies on elastic deformation or folding structures. To address this challenge, we propose a continuum robot with a novel extension mechanism inspired by the impressive ability of woodpeckers to extend and bend their long tongues to catch insects in tree holes. The proposed mechanism can change the effective length of the robot from almost zero to any length by moving the robot's body back and forth. Our prototype robot demonstrated a maximum extension of 450 mm and a minimum bending radius of 125 mm. In addition, we developed a Gaussian process regression model to predict an external force applied to the robot's tip using inertial measurement units. This enabled us to determine the magnitude and direction of the force with error rates of 4.8 percent and 11.1 percent, respectively, even when the robot's length was varied between the training and test data. The unrestricted extension capability of the proposed approach has the potential to increase the application prospects of continuum robots.
|
|
16:30-18:00, Paper WeCT21-NT.2 | Add to My Program |
A Multi-Modal Hybrid Robot with Enhanced Traversal Performance |
|
He, Zhipeng | Beijing Institute of Technology |
Zhao, Na | Dalian Maritime University |
Luo, Yudong | Dalian Maritime University |
Long, Sian | Dalian Maritime University |
Luo, Xi | Yichang Testing Tech. Research Institution |
Deng, Hongbin | Beijing Institute of Technology |
Keywords: Biologically-Inspired Robots, Mechanism Design
Abstract: Current multi-modal hybrid robots with flight and wheeled modes face a dilemma: because of their poor wheeled obstacle-crossing performance, they can only avoid obstacles by taking off again. To tackle this problem, this paper presents a novel multi-modal hybrid robot with the ability to actively adjust its wheel size, inspired by the behavior of a turtle's legs when it encounters obstacles, to enhance traversal performance. In detail, we first describe the hardware design that allows the robot to achieve a modal switch between flight and wheeled modes through foldable structures and variable wheel diameters; then, we present the architecture to control these two morphing mechanisms. After that, we establish the theoretical kinematic models for both the foldable arm and the variable wheel, and carry out extensive experiments to test the performance of the foldable arm, the variable-diameter wheel, and the traversal performance of the robot. Experimental results show that the proposed multi-modal robot can realize the functions of a quadrotor, respond quickly with full-scale folding within 0.9 s, climb a maximum slope of 36 deg, and traverse narrow passageways, exhibiting superior mobility and environmental adaptability.
|
|
16:30-18:00, Paper WeCT21-NT.3 | Add to My Program |
Anisotropic Body Compliance Facilitates Robotic Sidewinding in Complex Environments |
|
Kojouharov, Velin | Georgia Institute of Technology |
Wang, Tianyu | Georgia Institute of Technology |
Fernandez, Matthew | Georgia Institute of Technology |
Maeng, Jiyeon | Georgia Institute of Technology |
Goldman, Daniel | Georgia Institute of Technology |
Keywords: Biologically-Inspired Robots, Mechanism Design, Redundant Robots
Abstract: Sidewinding, a locomotion strategy characterized by the coordination of lateral and vertical body undulations, is frequently observed in rattlesnakes and has been successfully implemented by limbless robotic systems for effective movement across diverse terrestrial terrains. However, the integration of compliant mechanisms into sidewinding limbless robots remains less explored, posing challenges for navigation in complex, rheologically diverse environments. Inspired by a notable control simplification via mechanical intelligence in lateral undulation, which offloads feedback control to passive body mechanics and interactions with the environment, we present an innovative design of a mechanically intelligent limbless robot for sidewinding. This robot features a decentralized bilateral cable actuation system that resembles organismal muscle actuation mechanisms. We develop a feedforward controller that incorporates programmable body compliance into the sidewinding gait template. Our experimental results highlight the emergence of mechanical intelligence when the robot is equipped with an appropriate level of body compliance. This allows the robot to 1) locomote more energetically efficiently, as evidenced by a reduced cost of transport, and 2) navigate through terrain heterogeneities, all achieved in an open-loop manner, without the need for environmental awareness.
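For readers unfamiliar with sidewinding gait templates, the following minimal sketch generates the lateral and vertical joint-angle waves whose phase offset produces sidewinding. Joint counts, amplitudes, and frequencies are illustrative assumptions; the paper's programmable body compliance (implemented through cable tensions) is not modeled here.

```python
import numpy as np

def sidewinding_joint_angles(t, n_joints=16, A_lat=0.6, A_vert=0.3,
                             spatial_freq=2.0 * np.pi / 8.0,
                             temporal_freq=2.0 * np.pi * 0.5,
                             phase_offset=np.pi / 2.0):
    """Joint-angle targets for a sidewinding gait template.

    Lateral and vertical (dorsoventral) waves share the same spatial and
    temporal frequencies; the phase offset between them produces sidewinding.
    Returns two arrays of length n_joints (lateral, vertical joint angles).
    """
    i = np.arange(n_joints)
    lateral = A_lat * np.sin(temporal_freq * t - spatial_freq * i)
    vertical = A_vert * np.sin(temporal_freq * t - spatial_freq * i + phase_offset)
    return lateral, vertical

lat, vert = sidewinding_joint_angles(t=0.25)
print(lat.round(2))
print(vert.round(2))
```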
|
|
16:30-18:00, Paper WeCT21-NT.4 | Add to My Program |
Combining Tail and Reaction Wheel for Underactuated Spatial Reorientation in Robot Falling with Quadratic Programming |
|
Chu, Xiangyu | The Chinese University of Hong Kong |
Wang, Shengzhi | The Chinese University of Hong Kong |
Ng, Raymond | Chinese University of Hong Kong |
Fan, Chun Yin | Chinese University of Hong Kong |
An, Jiajun | The Chinese University of Hong Kong |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Biologically-Inspired Robots, Motion Control, Underactuated Robots
Abstract: Inertial appendages (e.g., tails and reaction wheels) have shown their reorientation capability to enhance robots' mobility while airborne or improve robots' safety in falling. The tail, especially with two Degrees of Freedom (DoFs), is normally subject to a limited Range of Motion (RoM). Although the reaction wheel circumvents this limitation, its efficiency has been shown to be lower than that of the tail in terms of the induced Moment of Inertia (MoI). In the literature, only one type of inertial appendage has been used on terrestrial robots in the air, e.g., either a tail on the hexapedal robot RHex or a reaction wheel on the jumping quadruped robot SpaceBok. In this paper, to benefit from both an unlimited RoM and efficient MoI induction, we propose combining a 1-DoF tail and a reaction wheel for spatial reorientation (regulating the robot body's 3D orientation). Accordingly, a hybrid tail-wheel robot is built, i.e., a tail that creates roll motion is attached to a wheel-equipped robot whose wheels act like a reaction wheel and generate pitch rotation; however, the robot is underactuated in yaw rotation. To achieve real-time spatial reorientation, we propose a novel quadratic programming algorithm based on a geometric metric for the underactuated hybrid tail-wheel robot. Within the proposed algorithm, the physical limitations on tail and wheel velocities are automatically accommodated. Numerical comparisons among wheel-wheel, tail-wheel, and 2-DoF tail robots s
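The core computation described above, distributing a desired body rotation between tail and wheel velocities subject to their physical limits, can be posed as a small box-constrained least-squares (QP) problem. The sketch below uses illustrative inertia-coupling values and SciPy's bounded least-squares solver; it is not the paper's geometric-metric formulation.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Effect of unit tail velocity and unit wheel velocity on body angular velocity
# (rows: roll, pitch, yaw). Values are illustrative, not from the paper.
A = np.array([
    [0.30, 0.00],   # tail mainly induces roll
    [0.00, 0.25],   # wheel mainly induces pitch
    [0.02, 0.01],   # yaw is only weakly (under-)actuated
])
omega_des = np.array([0.8, -0.5, 0.0])   # desired body rates [rad/s]

# Box constraints encode the physical tail/wheel velocity limits.
limits = np.array([6.0, 20.0])
sol = lsq_linear(A, omega_des, bounds=(-limits, limits))
tail_vel, wheel_vel = sol.x
print(f"tail velocity: {tail_vel:.2f} rad/s, wheel velocity: {wheel_vel:.2f} rad/s")
```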
|
|
16:30-18:00, Paper WeCT21-NT.5 | Add to My Program |
Environment-Modulated Self-Assembly by Changes in Modules' Buoyancy |
|
Chen, Xiao | University of Sheffield |
Han, Junyi | University of Sheffield |
Jin, Xin | University of Sheffield |
Miyashita, Shuhei | University of Sheffield |
Keywords: Biologically-Inspired Robots, Process Control, Additive Manufacturing
Abstract: While many inkjet printers employ only four types of ink (i.e., CMYK) to produce a wide range of colors, numerous technical challenges still exist for contemporary 3D printers to fabricate various materials and generate composite products such as electric devices. Conversely, there have been attempts to make things through the self-assembly of parts, analogous to the autonomous and decentralized development process of the human body from just 20 types of amino acids. In our previous work, we proposed a method for the rapid production of 3D objects using centimeter-sized modules (referred to as Roblets) capable of generating a 2D structure and subsequently self-folding into a 3D configuration, akin to origami. To further leverage the capability of generating a wide variety of structures by combining different modules, this research studies a method of automatically selecting and supplying modules using environmental cues. More precisely, we developed a mechanism that couples different modules corresponding to three different environments (on a flat surface, on low-density saline, and on saturated saline) and yields different module configurations. The process of self-assembly necessitated the application of perturbation, which was realized by imparting magnetic torque originating from an external magnetic field onto the magnets embedded in the modules.
|
|
16:30-18:00, Paper WeCT21-NT.6 | Add to My Program |
Analysis and Validation of Stiffness and Payload of Nematode-Inspired Cable Routing Method for Cable Driven Redundant Manipulator |
|
Kim, Hoyoung | GIST |
Yoon, Jungwon | Gwangju Institute of Science and Technology |
Keywords: Biologically-Inspired Robots, Tendon/Wire Mechanism, Kinematics
Abstract: The cable-driven redundant manipulator (CDRM) has significant potential for applications in narrow and hazardous spaces. However, traditional CDRMs have limited stiffness and load capacity due to their cable routing method. To address these limitations, several scholars have proposed new mechanisms and control strategies. Nevertheless, the cable routing method has not changed, and CDRMs continue to suffer from their limitations. Recently, a nematode-inspired cable routing method was proposed; however, stiffness calculations, derivation of inverse kinematics, and validation of stiffness and load capacity were incomplete. In this paper, we calculate the analytic equivalent stiffness of the nematode-inspired cable routing method and compare it with other cable routing methods. Additionally, we derive and simulate the kinematics and an effective inverse kinematics algorithm. Finally, we validate the stiffness and load capacity using a developed prototype.
|
|
WeCT22-NT Oral Session, NT-G304 |
Add to My Program |
Space Robotics I |
|
|
Chair: Chowdhury, Souma | University at Buffalo, State University of New York |
Co-Chair: Zhu, ZhengHong (George) | York University |
|
16:30-18:00, Paper WeCT22-NT.1 | Add to My Program |
Super-Resolution of Lunar Satellite Images for Enhanced Robotic Traverse Planning |
|
Delgado-Centeno, José Ignacio | Universite Du Luxembourg |
Harder, Paula | Mila - Quebec AI Institute |
Bickel, Valentin | ETH Zurich |
Moseley, Ben | ETH Zurich |
Kalaitzis, Freddie | University of Oxford |
Ganju, Siddha | NVIDIA |
Olivares-Mendez, Miguel A. | Interdisciplinary Centre for Security, Reliability and Trust - U |
Keywords: Space Robotics and Automation, AI-Based Methods, Deep Learning Methods
Abstract: Lunar exploration missions require detailed and accurate planning to ensure their safety. Remote sensing data, such as optical satellite imagery acquired by lunar orbiters, is key for the identification of future landing and mission sites. Here, robot- and astronaut-scale obstacles are the most relevant to resolve; however, the spatial resolution of the available image data is often insufficient, particularly in the poorly illuminated polar regions of the Moon, leading to uncertainty. This work shows how a novel single-image Super-Resolution (SR) application, ANUBIS (Adversarial Network for Uncertainty Based Image Super-resolution), can enhance lunar surface imagery by improving its resolution by a factor of 2, outperforming other approaches and benchmarks. The enhanced images improve the reliability and detail of lunar traverse planning and topographic reconstruction, while providing an estimate of the uncertainty associated with the enhancement process, vital to ensure mission planning integrity. This work demonstrates how machine learning-driven processing can enhance existing data products to maximize their value for science and exploration of the Moon and other celestial bodies.
|
|
16:30-18:00, Paper WeCT22-NT.2 | Add to My Program |
PPO-Based Dynamic Control of Uncertain Floating Platforms in Zero-G Environment |
|
Ramezani, Mahya | University of Luxembourg |
Alandihallaj, Mohammadamin | University of Luxembourg |
Hein, Andreas | University of Luxembourg |
Keywords: Space Robotics and Automation, Reinforcement Learning
Abstract: In the realm of space exploration, floating platforms play a crucial role in scientific investigations and technological advancements. However, controlling these platforms in zero-gravity environments presents unique challenges, including uncertainties and disturbances. This paper introduces an innovative approach that combines Proximal Policy Optimization (PPO) with Model Predictive Control (MPC) in the zero-gravity laboratory (Zero-G Lab) at the University of Luxembourg. This approach leverages PPO’s reinforcement learning power and MPC’s precision to navigate the complex control dynamics of floating platforms. Unlike traditional control methods, this PPO-MPC approach learns from MPC predictions, adapting to unmodeled dynamics and disturbances, resulting in a resilient control framework tailored to the zero-gravity environment. Simulations and experiments in the Zero-G Lab validate this approach, showcasing the adaptability of the PPO agent. This research opens new possibilities for controlling floating platforms in zero-gravity settings, promising advancements in space exploration.
|
|
16:30-18:00, Paper WeCT22-NT.3 | Add to My Program |
Learning-Aided Control of Robotic Tether-Net with Maneuverable Nodes to Capture Large Space Debris |
|
Boonrath, Achira | University at Buffalo, SUNY |
Liu, Feng | The State University of New York, University at Buffalo |
Botta, Eleonora | University at Buffalo |
Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Space Robotics and Automation, Reinforcement Learning, Motion Control
Abstract: Maneuverable tether-net systems launched from an unmanned spacecraft offer a promising solution for the active removal of large space debris. Guaranteeing the successful capture of such space debris is dependent on the ability to reliably maneuver the tether-net system -- a flexible, many-DoF (thus complex) system -- for a wide range of launch scenarios. Here, scenarios are defined by the relative location of the debris with respect to the chaser spacecraft. This paper represents and solves this problem as a hierarchically decentralized implementation of robotic trajectory planning and control and demonstrates the effectiveness of the approach when applied to two different tether-net systems, with 4 and 8 maneuverable units (MUs), respectively. Reinforcement learning (policy gradient) is used to design the centralized trajectory planner that, based on the relative location of the target debris at the launch of the net, computes the final aiming positions of each MU, from which their trajectory can be derived. Each MU then seeks to follow its assigned trajectory by using a decentralized PID controller that outputs the MU's thrust vector and is informed by noisy sensor feedback (for realism) of its relative location. System performance is assessed in terms of capture success and overall fuel consumption by the MUs. Reward shaping and surrogate models are used to respectively guide and speed up the RL process. Simulation-based experiments show that this approach allows the successful capture of debris at fuel costs that are notably lower than nominal baselines, including in scenarios where the debris is significantly off-centered compared to the approaching chaser spacecraft.
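The decentralized control layer described above can be illustrated with a per-MU PID loop that turns noisy relative-position feedback into a thrust command. The gains, the unit-mass double-integrator MU model, and the noise level below are assumptions for illustration only, not the paper's controller.

```python
import numpy as np

class MUPIDController:
    """Per-maneuverable-unit PID controller: noisy relative position in,
    thrust vector out. Gains and noise level are illustrative assumptions."""

    def __init__(self, kp=1.5, ki=0.1, kd=0.8, dt=0.1):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = np.zeros(3)
        self.prev_error = np.zeros(3)

    def step(self, position_measured, position_reference):
        error = position_reference - position_measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

rng = np.random.default_rng(1)
ctrl = MUPIDController()
pos, vel = np.zeros(3), np.zeros(3)
target = np.array([5.0, 2.0, -1.0])          # aiming position from the planner
for _ in range(300):                          # unit-mass double-integrator MU model
    thrust = ctrl.step(pos + rng.normal(0, 0.02, 3), target)  # noisy feedback
    vel += thrust * ctrl.dt                   # thrust treated as acceleration here
    pos += vel * ctrl.dt
print("final position:", pos.round(2))
```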
|
|
16:30-18:00, Paper WeCT22-NT.4 | Add to My Program |
Online Supervised Training of Spaceborne Vision During Proximity Operations Using Adaptive Kalman Filtering |
|
Park, Tae Ha | Stanford University |
D’Amico, Simone | Stanford University |
Keywords: Space Robotics and Automation, Deep Learning for Visual Perception, Continual Learning
Abstract: This work presents an Online Supervised Training (OST) method to enable robust vision-based navigation about a non-cooperative spacecraft. Spaceborne Neural Networks (NN) are susceptible to domain gap as they are primarily trained with synthetic images due to the inaccessibility of space. OST aims to close this gap by training a pose estimation NN online using incoming flight images during Rendezvous and Proximity Operations (RPO). The pseudo-labels are provided by an adaptive unscented Kalman filter where the NN is used in the loop as a measurement module. Specifically, the filter tracks the target’s relative orbital and attitude motion, and its accuracy is ensured by robust on-ground training of the NN using only synthetic data. The experiments on real hardware-in-the-loop trajectory images show that OST can improve the NN performance on the target image domain given that OST is performed on images of the target viewed from a diverse set of directions during RPO.
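A minimal sketch of the online supervised training loop: a pose network is fine-tuned on pseudo-labels supplied by a navigation filter. The tiny MLP, the feature encoding, and the placeholder pseudo-label function below are stand-ins, not the authors' adaptive unscented Kalman filter or network architecture.

```python
import torch
import torch.nn as nn

pose_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 7))  # 7-DoF pose
optimizer = torch.optim.Adam(pose_net.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def filter_pseudo_label(features):
    # Placeholder for the navigation filter: in the real system an adaptive UKF
    # tracks relative orbital/attitude motion and returns a pose pseudo-label.
    with torch.no_grad():
        return pose_net(features) + 0.01 * torch.randn(features.shape[0], 7)

for step in range(100):                       # incoming flight images during RPO
    features = torch.randn(8, 128)            # stand-in for an encoded image batch
    pseudo_pose = filter_pseudo_label(features)
    loss = loss_fn(pose_net(features), pseudo_pose)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```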
|
|
16:30-18:00, Paper WeCT22-NT.5 | Add to My Program |
Towards Real-World Efficiency: Domain Randomization in Reinforcement Learning for Pre-Capture of Free-Floating Moving Targets by Autonomous Robots |
|
Beigomi, Bahador | York University |
Zhu, ZhengHong (George) | York University |
Keywords: Space Robotics and Automation, Deep Learning in Grasping and Manipulation, Reinforcement Learning
Abstract: In this research, we introduce a deep reinforcement learning-based control approach to address the intricate challenge of the robotic pre-grasping phase under microgravity conditions. Leveraging reinforcement learning eliminates the necessity for manual feature design, therefore simplifying the problem and empowering the robot to learn pre-grasping policies through trial and error. Our methodology incorporates an off-policy reinforcement learning framework, employing the soft actor-critic technique to enable the gripper to proficiently approach a free-floating moving object, ensuring optimal pre-grasp success. For effective learning of the pre-grasping approach task, we developed a reward function that offers the agent clear and insightful feedback. Our case study examines a pre-grasping task where a Robotiq 3F gripper is required to navigate towards a free-floating moving target, pursue it, and subsequently position itself at the desired pre-grasp location. We assessed our approach through a series of experiments in both simulated and real-world environments. The source code, along with recordings of real-world robot grasping, is available at Fanuc_Robotiq_Grasp.
|
|
16:30-18:00, Paper WeCT22-NT.6 | Add to My Program |
SPADES: A Realistic Spacecraft Pose Estimation Dataset Using Event Sensing |
|
Rathinam, Arunkumar | University of Luxembourg |
Qadadri, Haytam | University of Strasbourg |
Aouada, Djamila | SnT, University of Luxembourg |
Keywords: Space Robotics and Automation, Data Sets for Robotic Vision, Deep Learning for Visual Perception
Abstract: In recent years, there has been a growing demand for improved autonomy for in-orbit operations such as rendezvous, docking, and proximity manoeuvres, leading to increased interest in employing Deep Learning-based Spacecraft Pose Estimation techniques. However, due to limited access to real target datasets, algorithms are often trained using synthetic data and applied in the real domain, resulting in a performance drop due to the domain gap. State-of-the-art approaches employ Domain Adaptation techniques to mitigate this issue. In the search for viable solutions, event sensing has been explored in the past and shown to reduce the domain gap between simulations and real-world scenarios. Event sensors have made significant advancements in hardware and software in recent years. Moreover, the characteristics of the event sensor offer several advantages in space applications compared to RGB sensors. To facilitate further training and evaluation of DL-based models, we introduce a new dataset, SPADES, comprising real event data acquired in a controlled laboratory environment and simulated event data using the same camera intrinsics. Furthermore, we introduce an image-based event representation that performs better than existing representations. In addition, we propose an effective data filtering method to improve the quality of training data, thus enhancing model performance. A multifaceted baseline evaluation was conducted using different event representations, event filtering strategies, and algorithmic frameworks, and the results are summarized. The dataset will be made available at http://cvi2.uni.lu/spades.
|
|
16:30-18:00, Paper WeCT22-NT.7 | Add to My Program |
Covariance Based Terrain Mapping for Autonomous Mobile Robots |
|
Werner, Lennart | ETH Zürich |
Proença, Pedro F. | California Institute of Technology |
Nuechter, Andreas | University of Würzburg |
Brockers, Roland | California Institute of Technology |
Keywords: Space Robotics and Automation, Mapping, Vision-Based Navigation
Abstract: In this paper, we present a local, robot-centric navigation map optimized for autonomous mobile robots operating in unknown environments, enhancing their onboard perception systems for collision-free operation with far look-ahead distances. Utilizing a novel converging covariance cell representation, our approach effectively analyzes hazards such as obstacles and hazardous slopes in both terrestrial and aerial navigation contexts. The new technique specifically targets mapping from stereo scenarios with ultra short baseline and highly oblique viewpoints close to the ground. Our methodology surpasses traditional window-based hazard analysis by resolving sub-cell size obstacles and terrain gradients at the individual cell level, thereby avoiding the computational overhead typically associated with such analyses. It leverages a multi-resolution strategy adaptive to the range errors common in stereo vision systems, making it particularly suitable for embedded systems with computational limitations. Functionality includes constant-time queries for height, obstacle presence, and slope details, boasting improvements in run time, memory usage, precision, and resolvable obstacle size compared to existing grid-based mapping algorithms. We validate our approach through rigorous simulation and real-world testing. This technique will be used for the local mapping and collision avoidance on NASA's CADRE lunar rovers.
|
|
16:30-18:00, Paper WeCT22-NT.8 | Add to My Program |
VINSat: Solving the Lost-In-Space Problem with Visual-Inertial Navigation |
|
McCleary, Kyle | Carnegie Mellon University |
Gurumurthy, Swaminathan | Carnegie Mellon University |
Fisch, Paulo R.M. | Carnegie Mellon University |
Tayal, Saral | Carnegie Mellon University |
Manchester, Zachary | Carnegie Mellon University |
Lucia, Brandon | Carnegie Mellon University |
Keywords: Space Robotics and Automation, Vision-Based Navigation, Data Sets for Robotic Vision
Abstract: Rapid growth in the number of nanosatellite deployments has heightened the need for rapid, cost-effective, and accurate orbit determination (OD). This paper introduces a solution to this “lost-in-space” problem that we call Visual-Inertial Navigation for Satellites (VINSat). VINSat performs OD using data from an inertial measurement unit (IMU) and a low-cost RGB camera. Machine learning techniques are used to identify known landmarks in images captured by the spacecraft. These landmark locations are then combined with IMU data and a dynamics model in a batch nonlinear least-squares state estimator to determine the full state of the spacecraft. We validate VINSat in simulation using real nadir-pointing imagery and find that 85% of simulated satellites are localized to under 5 km within 6 hours (4 orbits). This performance substantially surpasses that of ground radar, demonstrating significantly faster and more precise localization without any reliance on ground infrastructure.
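The batch nonlinear least-squares estimation step can be illustrated with a 2D toy problem: a constant-velocity state is recovered from noisy range observations of known landmarks using SciPy. The landmark layout and measurement model are simplifications of the paper's landmark-plus-IMU formulation.

```python
import numpy as np
from scipy.optimize import least_squares

# Known ground landmarks (2D toy stand-ins for recognized Earth landmarks).
landmarks = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
times = np.linspace(0.0, 5.0, 6)
true_state = np.array([2.0, 3.0, 1.0, 0.5])   # [x0, y0, vx, vy], constant velocity

def positions(state, t):
    return state[:2] + np.outer(t, state[2:])

rng = np.random.default_rng(2)
meas = np.array([np.linalg.norm(landmarks - p, axis=1) for p in positions(true_state, times)])
meas += rng.normal(0, 0.05, meas.shape)       # noisy ranges to each landmark

def residuals(state):
    pred = np.array([np.linalg.norm(landmarks - p, axis=1) for p in positions(state, times)])
    return (pred - meas).ravel()

sol = least_squares(residuals, x0=np.zeros(4))
print("estimated [x0, y0, vx, vy]:", sol.x.round(2))
```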
|
|
WeCT23-NT Oral Session, NT-G401 |
Add to My Program |
Aerial Systems: Applications |
|
|
Chair: Su, Yao | Beijing Institute for General Artificial Intelligence |
Co-Chair: Saska, Martin | Czech Technical University in Prague |
|
16:30-18:00, Paper WeCT23-NT.1 | Add to My Program |
A Compiler Framework for Proactive UAV Regulation Enforcement |
|
Tang, Huaxin | Binghamton University |
Burns, John Henry | SUNY Binghamton |
Strong, Alexander | Binghamton University |
Liu, Yu David | SUNY Binghamton |
Keywords: Aerial Systems: Applications
Abstract: In the rapidly evolving landscape of Unmanned Aerial Vehicles (UAVs), regulation enforcement is critical. Unfortunately, existing practices are largely manual and reactive in nature. In this paper, we present THEMIS, a novel compiler-directed approach for automated and proactive regulation enforcement. By expressing regulations through a specification language and integrating their enforcement into the compilation process, THEMIS enables safe and regulation-compliant UAV flights by enforcing prohibited and restricted areas, avoiding flights over humans, and managing maximum limits of altitude and speed. Our framework features a bi-directional interface that allows the concrete algorithms used for enforcement to be customized. Our evaluation shows THEMIS-compiled autopilots can adhere to regulatory constraints amidst complex flight conditions, while significantly reducing the burden of UAV operators.
|
|
16:30-18:00, Paper WeCT23-NT.2 | Add to My Program |
Real-Time Dynamic-Consistent Motion Planning for Over-Actuated UAVs |
|
Su, Yao | Beijing Institute for General Artificial Intelligence |
Zhang, Jingwen | University of California, Los Angeles |
Jiao, Ziyuan | Beijing Institute for General Artificial Intelligence |
Li, Hang | Beijing Institute for General Artificial Intelligence |
Wang, Meng | Beijing Institute for General Artificial Intelligence |
Liu, Hangxin | Beijing Institute for General Artificial Intelligence (BIGAI) |
Keywords: Aerial Systems: Applications, Constrained Motion Planning, Aerial Systems: Mechanics and Control
Abstract: Existing motion planning approaches for over-actuated unmanned aerial vehicle (UAV) platforms can achieve online planning without considering dynamics. However, in many envisioned application areas such as aerial manipulation, payload delivery, and moving target tracking, it is critical to ensure dynamic consistency in the generated trajectory. The dynamics of these platforms introduce a high nonlinearity, leading to a substantial increase in computational burden. This paper presents an efficient method to plan motions that are consistent with the dynamics of over-actuated UAVs. With a hierarchical control structure, the dimension of the optimization problem is greatly reduced with synthesized wrench commands. Additionally, by exploring the dynamics of over-actuated UAVs, the complex planning process is decoupled into two simpler sub-problems. As a result, the proposed planner can be solved as two small quadratic programs (QPs) and deployed in real-time. The computational efficiency and dynamic consistency of the proposed method are verified through both simulations and experiments, including comparison with other approaches and dynamic target tracking.
|
|
16:30-18:00, Paper WeCT23-NT.3 | Add to My Program |
Trajectory Optimization for Cooperatively Localizing Quadrotor UAVs |
|
Go, H S Helson | University of Toronto |
Liu, Hugh H.-T. | University of Toronto |
Keywords: Aerial Systems: Applications, Cooperating Robots, Path Planning for Multiple Mobile Robots or Agents
Abstract: In this paper, an Active Cooperative Localization system for Quadrotor Unmanned Aerial Vehicles is developed. The optimal trajectories are determined by minimizing the uncertainty in position estimation by an Extended Kalman Filter. In this system, a piecewise-polynomial parameterization of trajectories is adopted for the optimizer, and the underlying state estimator is updated with appropriate models of sensors and quadrotor dynamics. This system is verified in extensive simulations in the scenario of a team of quadrotors with heterogeneous GNSS capabilities. These simulations answer an open question, showing that solving for trajectories by minimizing Kalman covariance computed in a noiseless environment is reasonable and that the optimized trajectories offer visible reductions in positioning uncertainty in the presence of noise.
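To make the "minimize the Kalman covariance" objective concrete, the toy sketch below chooses two waypoints for a GNSS-equipped teammate so that range measurements minimize the trace of a static agent's EKF covariance, evaluated in a noiseless setting. The dimensions, noise values, and Nelder-Mead optimizer are illustrative assumptions, not the paper's piecewise-polynomial formulation.

```python
import numpy as np
from scipy.optimize import minimize

agent = np.array([0.0, 0.0])   # GNSS-denied agent (position unknown to itself)
P0 = np.eye(2) * 4.0           # initial position covariance
R = 0.05                       # range measurement variance

def final_covariance_trace(waypoints_flat):
    P = P0.copy()
    for wp in waypoints_flat.reshape(-1, 2):
        diff = agent - wp
        H = (diff / (np.linalg.norm(diff) + 1e-9)).reshape(1, 2)  # range Jacobian
        S = H @ P @ H.T + R
        K = P @ H.T / S
        P = (np.eye(2) - K @ H) @ P     # static agent: no process noise added
    return np.trace(P)

x0 = np.array([3.0, 0.5, 3.0, 1.0])     # initial guess for two teammate waypoints
res = minimize(final_covariance_trace, x0, method="Nelder-Mead")
print("optimized teammate waypoints:", res.x.reshape(-1, 2).round(2))
print("final covariance trace:", round(final_covariance_trace(res.x), 4))
```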
|
|
16:30-18:00, Paper WeCT23-NT.4 | Add to My Program |
Extending Guiding Vector Field to Track Unbounded UAV Paths |
|
Feurgard, Mael | Ecole Nationale De l'Aviation Civile |
Hattenberger, Gautier | ENAC, French Civil Aviation University |
Lacroix, Simon | LAAS/CNRS |
Keywords: Aerial Systems: Mechanics and Control, Motion Control
Abstract: A recent advance in vector field path following is the introduction of the Parametric Guiding Vector Field method. It allows for singularity-free vector fields with strong convergence guarantees, usable even for self-intersecting paths. However, the method requires significant gain tuning for practical use. In particular, for unbounded paths, the gains will inevitably become ill-suited for efficient path following. We propose a method to overcome this issue by introducing a dynamic step adaptation strategy, which provides additional normalization properties to the field. This allows the following of unbounded curves and reduces the number of gains to tune. The proposed improvements are verified in simulations using the PaparazziUAV software.
|
|
16:30-18:00, Paper WeCT23-NT.5 | Add to My Program |
Tethered Lifting-Wing Multicopter Landing Like Kite |
|
Wei, Haoyu | Beihang University |
Wang, Shuai | Beihang University |
Quan, Quan | Beihang University |
Keywords: Aerial Systems: Mechanics and Control, Motion Control
Abstract: Automatic landing of tethered unmanned aerial vehicles (UAVs) is an important issue. Typically, UAVs rely on location sensors such as global navigation satellite system (GNSS) receivers and external cameras to obtain location data. However, harsh conditions such as GNSS denial or strong winds make it difficult for UAVs to approach the landing area, and common solutions cannot be used for automatic landing. A tethered lifting-wing multicopter has a structure and static stability similar to a kite. Inspired by kites, this paper proposes a new landing method for tethered lifting-wing multicopters, which can be used without location or velocity sensors. During the landing phase, the tethered lifting-wing multicopter only needs to maintain a rotor thrust that actively straightens the tether and a constant, kite-like attitude to keep its position stable and increase damping. Meanwhile, the winch only needs to recover the cable at a constant speed until the tethered lifting-wing multicopter returns to its base. The feasibility and practicability of this method are demonstrated by real flight experiments.
|
|
16:30-18:00, Paper WeCT23-NT.6 | Add to My Program |
AirFisheye Dataset: A Multi-Model Fisheye Dataset for UAV Applications |
|
Jaisawal, Pravin Kumar | Hamburg University of Technology |
Papakonstantinou, Stephanos | Hamburg University of Technology |
Gollnick, Volker | Hamburg University of Technology |
Keywords: Aerial Systems: Perception and Autonomy, AI-Based Methods, Data Sets for Robotic Vision
Abstract: Drone applications require perception all around the vehicle for obstacle avoidance during navigation. Due to the weight and computation limitations of UAVs, using a large number of sensors, e.g., many cameras, can be prohibitive. In such scenarios, a fisheye camera with its wider field of view is very beneficial. Despite the usefulness of fisheye cameras for UAV applications, little work has been carried out to develop perception algorithms for them, one of the main reasons being the lack of publicly available omnidirectional datasets related to drone flight. With this paper, we address this gap by presenting the AirFisheye dataset, which is applicable to tasks such as segmentation, depth estimation, and depth completion, among other tasks required for autonomous drone navigation. A generic framework for creating synthetic fisheye images is also provided. Furthermore, we propose a novel occlusion correction algorithm that removes LiDAR points incorrectly projected into the camera image due to the viewpoint difference between the two sensors. We release about 26K images and LiDAR scans along with annotations. Baseline code and supporting scripts are available at https://collaborating.tuhh.de/ilt/airfisheye-dataset
|
|
16:30-18:00, Paper WeCT23-NT.7 | Add to My Program |
Bio-Inspired Visual Relative Localization for Large Swarms of UAVs |
|
Krizek, Martin | Czech Technical University in Prague |
Vrba, Matous | Faculty of Electrical Engineering, Czech Technical University In |
Barisic, Antonella | University of Zagreb, Faculty of Electrical Engineering and Comp |
Bogdan, Stjepan | University of Zagreb |
Saska, Martin | Czech Technical University in Prague |
Keywords: Aerial Systems: Perception and Autonomy, Swarm Robotics, Deep Learning for Visual Perception
Abstract: We propose a new approach to visual perception for relative localization of agents within large-scale swarms of UAVs. Inspired by biological perception utilized by schools of sardines, swarms of bees, and other large groups of animals capable of moving in a decentralized yet coherent manner, our method does not rely on detecting individual neighbors by each agent and estimating their relative position, but rather we propose to regress a neighbor density over distance. This allows for a more accurate distance estimation as well as better scalability with respect to the number of neighbors. Additionally, a novel swarm control algorithm is proposed to make it compatible with the new relative localization method. We provide a thorough evaluation of the presented methods and demonstrate that the regressing approach to distance estimation is more robust to varying relative pose of the targets and that it is suitable to be used as the main source of relative localization for swarm stabilization.
|
|
16:30-18:00, Paper WeCT23-NT.8 | Add to My Program |
Heuristic-Based Incremental Probabilistic Roadmap for Efficient UAV Exploration in Dynamic Environments |
|
Xu, Zhefan | Carnegie Mellon University |
Suzuki, Christopher | Carnegie Mellon University |
Zhan, Xiaoyang | Carnegie Mellon University |
Shimada, Kenji | Carnegie Mellon University |
Keywords: Aerial Systems: Applications, Field Robots, Search and Rescue Robots
Abstract: Autonomous exploration in dynamic environments necessitates a planner that can proactively respond to changes and make efficient and safe decisions for robots. Although plenty of sampling-based works have shown success in exploring static environments, their inherent sampling randomness and limited utilization of previous samples often result in sub-optimal exploration efficiency. Additionally, most of these methods struggle with efficient replanning and collision avoidance in dynamic settings. To overcome these limitations, we propose the Heuristic-based Incremental Probabilistic Roadmap Exploration (HIRE) planner for UAVs exploring dynamic environments. The proposed planner adopts an incremental sampling strategy based on the probabilistic roadmap constructed by heuristic sampling toward the unexplored region next to the free space, defined as the heuristic frontier regions. The heuristic frontier regions are detected by applying a lightweight vision-based method to the different levels of the occupancy map. Moreover, our dynamic module ensures that the planner dynamically updates roadmap information based on the environment changes and avoids dynamic obstacles. Simulation and physical experiments prove that our planner can efficiently and safely explore dynamic environments. Our software is available on GitHub with the experiment video.
|
|
WeCT24-NT Oral Session, NT-G402 |
Add to My Program |
Robotics and Automation in Agriculture and Forestry I |
|
|
Chair: Davidson, Joseph | Oregon State University |
Co-Chair: Stachniss, Cyrill | University of Bonn |
|
16:30-18:00, Paper WeCT24-NT.1 | Add to My Program |
Dynamic Evaluation of a Suction Based Gripper for Fruit Picking Using a Physical Twin |
|
Velasquez-Lopez, Alejandro | Oregon State University |
Grimm, Cindy | Oregon State University |
Davidson, Joseph | Oregon State University |
Keywords: Agricultural Automation, Grippers and Other End-Effectors, Compliant Joints and Mechanisms
Abstract: We present and evaluate a novel suction-based gripper designed for fruit picking. This work is motivated by common problems observed in field trials of robotic harvesting: Calibration/perception errors, workspace obstacles, fruit swinging/moving when contacted, and varying stem and branch stiffnesses. The gripper consists of three suction-cups located on the palm, along with in-hand perception. To evaluate the gripper, we developed a physical proxy that approximates a realistic apple-stem-branch dynamic system. We performed 756 apple picks on the proxy with varying branch stiffness, stem strength and gripper pose (yaw, roll and offset w.r.t. the apple). Our results show that grasping performance improves when the gripper yaw w.r.t. the apple has two suction cups on the bottom of the apple and one suction cup on top. Even with ±15mm offset, at least two suction cups engaged with the apple 80% of the time, regardless of branch stiffness. Moreover, the gripper withstands ±20mm offset when it approaches the apple near its equator.
|
|
16:30-18:00, Paper WeCT24-NT.2 | Add to My Program |
Strawberry Weight Estimation Based on Plane-Constrained Binary Division Point Cloud Completion |
|
Huang, Yanjiang | Guangdong Provincial Key Lab. of Precision Equipment and Manufac |
Liu, Jiepeng | South China University of Technology |
Zhang, Xianmin | South China University of Technology |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry
Abstract: Labor shortages and the development of digital technology both impose requirements on the fruit industry. Modern agricultural competition has shifted from competition between products to competition between supply chains, and enhancing the digitization of production lines is crucial for gaining a competitive advantage. Strawberries, as fruits with a short shelf life, must be sorted and packaged by weight after harvesting. Estimating strawberry weight through visual technology can save time and labor costs. Common methods are based either on feature sizes, which have larger errors, or on learning, which requires a large amount of data. To address these issues, we propose a dataset for estimating strawberry weight, which includes strawberries at different heights and angles. Additionally, we propose a strawberry weight estimation method based on plane-constrained binary division point cloud completion. This method separates the plane point cloud and the strawberry point cloud, constructs a coordinate system on the strawberry point cloud, generates an axis-aligned bounding box (AABB), and estimates the strawberry weight using the bounding box and the placement plane as constraints. In comparison with different methods, we achieved a maximum improvement of 12.38% in prediction accuracy, demonstrating that our method provides the best estimation accuracy.
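The bounding-box step of such a pipeline can be illustrated as follows: once the supporting plane has been removed and the fruit point cloud is expressed in a fruit-aligned frame, an axis-aligned bounding box gives a volume proxy that is converted to weight. The density and fill-ratio constants and the synthetic cloud are assumptions, not values from the paper.

```python
import numpy as np

def estimate_strawberry_weight(points, density_g_per_cm3=0.6, fill_ratio=0.5):
    """Very rough weight estimate from the axis-aligned bounding box (AABB) of a
    strawberry point cloud. The density and fill ratio are illustrative constants,
    and the cloud is assumed to have the supporting plane already removed and to
    be expressed in a fruit-aligned frame with units of centimeters."""
    extent = points.max(axis=0) - points.min(axis=0)      # AABB side lengths [cm]
    volume_cm3 = np.prod(extent) * fill_ratio             # fruit fills part of the box
    return volume_cm3 * density_g_per_cm3                 # grams

# Hypothetical partial scan of a roughly 3 x 3 x 4 cm strawberry.
rng = np.random.default_rng(3)
cloud = rng.uniform([-1.5, -1.5, 0.0], [1.5, 1.5, 4.0], size=(2000, 3))
print(f"estimated weight: {estimate_strawberry_weight(cloud):.1f} g")
```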
|
|
16:30-18:00, Paper WeCT24-NT.3 | Add to My Program |
AV4GAInsp: An Efficient Dual-Camera System for Identifying Defective Kernels of Cereal Grains |
|
Fan, Lei | The University of New South Wales |
Fan, Dongdong | Gaozhe Technology, Hefei, Anhui, CN |
Ding, Yiwen | Gauture Technology |
Wu, Yong | GaoZhe Technology |
Chu, Hongxia | Gaozhe Technology |
Pagnucco, Maurice | University of New South Wales |
Song, Yang | University of New South Wales |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry
Abstract: Grain Appearance Inspection (GAI) is a pre-requisite for grain quality determination, providing guidance for grain processing, storage and trade. GAI is routinely performed by trained inspectors who are required to visually inspect cereal grains for each individual kernel. Since grain kernels (e.g., wheat, rice) are tiny with heterogeneous shapes and appearance, manually performing GAI is time-consuming and error-prone. This paper presents a machine vision-based customization of an automated system for grain appearance inspection, called AV4GAInsp, which consists of a device and an analysis framework. The device is equipped with an elaborate feeding module and a capturing module for automatically pre-processing grain kernels and efficiently acquiring high-quality images for these kernels. The framework employs deep convolutional neural networks to process these captured images to classify the kernels as normal or defective. We also built and released a large-scale dataset, named GrainDet, that includes over 140K images for three types of grains: wheat, sorghum and rice. Comprehensive experiments are conducted to validate the efficacy and performance of our AV4GAInsp system, achieving an average F1-score of 98.4% and excelling at inspection efficiency by over 20x speedup. Kappa statistic tests are performed to confirm the consistency between our system and human experts. It is expected that AV4GAInsp will alleviate inspectors' workloads and inspire further r
|
|
16:30-18:00, Paper WeCT24-NT.4 | Add to My Program |
Aerial Image-Based Inter-Day Registration for Precision Agriculture |
|
Gao, Chen | ETH Zürich |
Daxinger, Franz | ETH Zurich |
Roth, Lukas | ETH Zurich |
Maffra, Fabiola | ETH Zurich |
Beardsley, Paul | Unity Technologies |
Chli, Margarita | ETH Zurich & University of Cyprus |
Teixeira, Lucas | ETH Zurich |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Aerial Systems: Perception and Autonomy
Abstract: Satellite imagery has traditionally been used to collect crop statistics, but its low resolution and registration accuracy limit agricultural analytics to plant stand levels and large areas. Precision agriculture seeks analytic tools at near single plant level, and this work explores how to improve aerial photogrammetry to enable inter-day precision agriculture analytics for intervals of up to a month. Our work starts by presenting an accurately registered image time series, captured up to twice a week, by an unmanned aerial vehicle over a wheat crop field. The dataset is registered using photogrammetry aided by fiducial ground control points (GCPs). Unfortunately, GCPs severely disrupt crop management activities. To address this, we propose a novel inter-day registration approach that only relies once on GCPs, at the beginning of the season. The method utilises LoFTR, a state-of-the-art image-matching transformer. The original LoFTR network was trained using imagery of outdoor urban areas. One of our contributions is to extend LoFTR's training method, which uses matching images of a static scene, to a dynamic scene of plants undergoing growth. Another contribution is a thorough evaluation of our registration method that integrates intra-day crop reconstruction with earlier-day scans in a seven degree-of-freedom alignment. Experimental results show the advantage of our approach over other matching algorithms and demonstrate the importance of retraining using crop scenes, and a training method customised for growing crops, with an average registration error of 27 cm across a season.
|
|
16:30-18:00, Paper WeCT24-NT.5 | Add to My Program |
Inexpensive, Automated Pruning Weight Estimation in Vineyards |
|
Jaramillo, Jonathan | Cornell University |
Wilhelm, Aaron | Cornell University |
Napp, Nils | Cornell University |
Vanden Heuvel, Justine | Cornell University |
Petersen, Kirstin Hagelskjaer | Cornell University |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Computer Vision for Automation
Abstract: Pruning weight is indicative of a vine’s ability to produce crop the following year, informing vineyard management. Current methods for estimating pruning weight are costly, laborious, and/or require specialized know-how and equipment. In this paper, we demonstrate an affordable, simple, computer vision-based method to measure pruning weight using a smartphone camera and structured light, which produces results better than state-of-the-art techniques for vertical shoot position (VSP) vines, and we demonstrate initial steps towards estimating pruning weight in high cordon procumbent (HC) vines such as Concord. The simplicity and affordability of this technique lends itself to deployment by farmers today or on future viticulture robotics platforms. We achieved R² = 0.80 for VSP vines (better than state-of-the-art computer vision-based methods) and R² = 0.29 for HC vines (not previously attempted with computer vision-based methods).
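The reported R² values come from regressing an image-derived feature against scale-measured pruning weights. The sketch below shows that fitting recipe on synthetic data; the chosen feature (cane pixel area) and all numbers are hypothetical, not the authors' measurements.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Hypothetical per-vine feature from smartphone + structured-light imagery
# (e.g., segmented cane pixel area) against scale-measured pruning weights.
rng = np.random.default_rng(4)
cane_pixel_area = rng.uniform(1e4, 8e4, size=60)
pruning_weight_kg = 0.9 + 2.5e-5 * cane_pixel_area + rng.normal(0, 0.15, size=60)

X = cane_pixel_area.reshape(-1, 1)
model = LinearRegression().fit(X, pruning_weight_kg)
print("R^2 on training vines:", round(r2_score(pruning_weight_kg, model.predict(X)), 2))
```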
|
|
16:30-18:00, Paper WeCT24-NT.6 | Add to My Program |
High Precision Leaf Instance Segmentation for Phenotyping in Point Clouds Obtained under Real Field Conditions |
|
Marks, Elias Ariel | University of Bonn |
Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
Magistri, Federico | University of Bonn |
Wiesmann, Louis | University of Bonn |
Desai, Dhagash | University of Bonn |
Marcuzzi, Rodrigo | University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Deep Learning for Visual Perception
Abstract: Measuring plant traits with a high throughput allows breeders to monitor and select the best cultivars to be used in following breeding generations. This in turn can enable farmers to improve yield to produce more food, feed, and fiber. Current breeding practices involve extracting leaf parameters on a small subset of the leaves present in the breeding plots, while still requiring substantial manual labor. To automate this process an important step is the precise distinction between separate leaves, which is the problem we address in this paper. To this end, we exploit recent advancements in 3D deep learning to build a convolutional neural network that learns to segment individual leaves. As done in current breeding practices, we select a subset of leaves to be used for phenotypic trait evaluation as this allows us to alleviate the influence of segmentation errors on the phenotypic trait estimation. To achieve this we propose to use an additional neural network to predict the quality of each segmented leaf and discard inaccurate leaf instances. The experiments show that our network yields higher segmentation accuracy on sugar beet breeding plots planted under the supervision of the German federal office for plant varieties. Furthermore, we show that our neural network helps in filtering out leaves with lower segmentation accuracy.
|
|
16:30-18:00, Paper WeCT24-NT.7 | Add to My Program |
Towards Robotic Tree Manipulation: Leveraging Graph Representations |
|
Kim, Chung Hee | Carnegie Mellon University |
Lee, Moonyoung | Carnegie Mellon University |
Kroemer, Oliver | Carnegie Mellon University |
Kantor, George | Carnegie Mellon University |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Deep Learning in Grasping and Manipulation
Abstract: There is growing interest in automating agricultural tasks that require intricate and precise interaction with specialty crops, such as trees and vines. However, developing robotic solutions for crop manipulation remains a difficult challenge due to complexities involved in modeling their deformable behavior. In this study, we present a framework for learning the deformation behavior of tree-like crops under contact interaction. Our proposed method involves encoding the state of a spring-damper modeled tree crop as a graph. This representation allows us to employ graph networks to learn both a forward model for predicting resulting deformations, and a contact policy for inferring actions to manipulate tree crops. We conduct a comprehensive set of experiments in a simulated environment and demonstrate generalizability of our method on previously unseen trees. Videos can be found on the project website: https://kantor-lab.github.io/tree_gnn
|
|
16:30-18:00, Paper WeCT24-NT.8 | Add to My Program |
Field Evaluation of a Prioritized Path-Planning Algorithm for Heterogeneous Agricultural Tasks of Multi-UGVs |
|
Jo, Yuseung | Chonnam National University |
Son, Hyoung Il | Chonnam National University |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Field Robots
Abstract: This paper introduces a prioritized path-planning algorithm for heterogeneous tasks performed by multiple unmanned ground vehicles (UGVs) in agricultural environments. The algorithm considers varying robot priorities, thereby extending the traditional multi-agent path finding (MAPF) approach. The proposed algorithm is evaluated in scenarios occurring during representative agricultural operations: harvesting and transportation. An experimental validation is conducted in agriculture-like settings using multiple simultaneous localization and mapping systems and navigation systems. The results reveal that the path of agent 1, which was assigned the highest priority, was shortened considerably in both the indoor and outdoor environments (by 3.38 m, 3.6 m, and 5.6 m, respectively). Especially in the face scenario, the sum of changes in distance calculated using the proposed algorithm was negative, meaning that traffic congestion in the multi-robot system used in the experiment was alleviated without the need for inter-robot communication.
|
|
16:30-18:00, Paper WeCT24-NT.9 | Add to My Program |
Unsupervised Pre-Training for 3D Leaf Instance Segmentation |
|
Roggiolani, Gianmarco | University of Bonn |
Magistri, Federico | University of Bonn |
Guadagnino, Tiziano | University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Agricultural Automation, Robotics and Automation in Agriculture and Forestry, Semantic Scene Understanding
Abstract: Crops for food, feed, fiber, and fuel are key resources for our society. Monitoring plants and measuring their traits is an important task in agriculture often referred to as plant phenotyping. Traditionally, this task is done manually, which is time- and labor-intensive. Robots can automate phenotyping, providing reproducible and high-frequency measurements. Today's perception systems use deep learning to interpret these measurements, but require a substantial amount of annotated data to work well. Obtaining such labels is challenging as it often requires background knowledge on the side of the labelers. This paper addresses the problem of reducing the labeling effort required to perform leaf instance segmentation on 3D point clouds, which is a first step toward phenotyping in 3D. Separating all leaves allows us to count them and compute relevant traits such as their areas, lengths, and widths. We propose a novel self-supervised task-specific pre-training approach to initialize the backbone of a network for leaf instance segmentation. We also introduce a novel automatic postprocessing that considers the difficulty of correctly segmenting the points close to the stem, where all the leaf petioles overlap. The experiments presented in this paper suggest that our approach boosts the performance over all the investigated scenarios. We also evaluate the embeddings to assess the quality of the fully unsupervised approach and see a higher performance of our domain-specific postprocessing.
|
|
WeCT25-NT Oral Session, NT-G403 |
Add to My Program |
Localization VI |
|
|
Chair: Richard, Antoine | University of Luxembourg |
Co-Chair: Adorno, Bruno Vilhena | The University of Manchester |
|
16:30-18:00, Paper WeCT25-NT.1 | Add to My Program |
GPS-VIO Fusion with Online Rotational Calibration |
|
Song, Junlin | University of Luxembourg |
Sanchez-Cuevas, Pedro J | Advanced Center for Aerospace Technologies |
Richard, Antoine | University of Luxembourg |
Rajan, Raj Thilak | Delft University of Technology |
Olivares-Mendez, Miguel A. | Interdisciplinary Centre for Security, Reliability and Trust - U |
Keywords: Localization
Abstract: Accurate global localization is crucial for autonomous navigation and planning. To this end, various GPS-aided Visual-Inertial Odometry (GPS-VIO) fusion algorithms have been proposed in the literature. This paper presents a novel GPS-VIO system that is able to significantly benefit from the online calibration of the rotational extrinsic parameter between the GPS reference frame and the VIO reference frame. The underlying reason is that this parameter is observable; this paper provides a novel proof through nonlinear observability analysis. We also evaluate the proposed algorithm extensively on diverse platforms, including a flying UAV and a driving vehicle. The experimental results support the observability analysis and show increased localization accuracy in comparison to state-of-the-art (SOTA) tightly-coupled algorithms.
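A batch stand-in for the rotational extrinsic calibration idea: given the same trajectory expressed in the VIO and GPS frames, the rotation between them can be estimated by Kabsch/SVD alignment. Matched timestamps and negligible scale error are assumed; the paper instead estimates this parameter online inside the filter.

```python
import numpy as np

def rotation_between_frames(p_vio, p_gps):
    """Estimate the rotation from the VIO frame to the GPS frame by aligning the
    same trajectory expressed in both frames (Kabsch / SVD alignment)."""
    a = p_vio - p_vio.mean(axis=0)
    b = p_gps - p_gps.mean(axis=0)
    H = a.T @ b
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T       # p_gps ≈ R @ p_vio + t

# Hypothetical trajectory, rotated by 30 deg about z between the two frames.
rng = np.random.default_rng(5)
p_vio = rng.uniform(-5, 5, size=(100, 3))
yaw = np.deg2rad(30.0)
R_true = np.array([[np.cos(yaw), -np.sin(yaw), 0],
                   [np.sin(yaw),  np.cos(yaw), 0],
                   [0,            0,           1]])
p_gps = p_vio @ R_true.T + np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.02, (100, 3))
print(np.round(rotation_between_frames(p_vio, p_gps), 3))
```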
|
|
16:30-18:00, Paper WeCT25-NT.2 | Add to My Program |
Fully Onboard Low-Power Localization with Semantic Sensor Fusion on a Nano-UAV Using Floor Plans |
|
Zimmerman, Nicky | University of Lugano |
Müller, Hanna | ETH Zürich |
Magno, Michele | ETH Zurich |
Benini, Luca | University of Bologna |
Keywords: Localization, Aerial Systems: Perception and Autonomy, Object Detection, Segmentation and Categorization
Abstract: Nano-sized unmanned aerial vehicles (UAVs) are well-suited for indoor applications and for operating in close proximity to humans. To enable autonomy, the nano-UAV must be able to self-localize in its operating environment. This is a particularly challenging task due to the limited sensing and compute resources on board. This work presents an online and onboard approach for localization in floor plans annotated with semantic information. Unlike sensor-based maps, floor plans are readily available and do not increase the cost and time of deployment. To overcome the difficulty of localizing in sparse maps, the proposed approach fuses geometric information from miniaturized Time-of-Flight (ToF) sensors and semantic cues. The semantic information is extracted from images by deploying a state-of-the-art object detection model on a high-performance multi-core microcontroller onboard the drone, consuming only 2.5 mJ per frame and executing in 38 ms. In our evaluation, we globally localize in a real-world office environment, achieving a 90% success rate. We also release an open-source implementation of our work.
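The fusion of geometric and semantic cues can be sketched as a particle-filter weight update against a floor plan: each particle's weight is multiplied by a range likelihood and a semantic-agreement likelihood. The map lookups and likelihood models below are placeholders, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(6)

def expected_range(pose, floor_plan):          # placeholder ray-cast into the plan
    return 2.0 + 0.1 * np.sin(pose[:, 2])

def expected_class(pose, floor_plan):          # placeholder semantic lookup
    return (pose[:, 0] > 2.5).astype(int)      # e.g. 0 = corridor, 1 = office door

particles = rng.uniform([0, 0, -np.pi], [5, 5, np.pi], size=(500, 3))  # x, y, yaw
weights = np.full(len(particles), 1.0 / len(particles))

measured_range, detected_class = 2.05, 1
sigma_range = 0.1

range_lik = np.exp(-0.5 * ((expected_range(particles, None) - measured_range) / sigma_range) ** 2)
sem_lik = np.where(expected_class(particles, None) == detected_class, 0.9, 0.1)
weights *= range_lik * sem_lik
weights /= weights.sum()

estimate = weights @ particles                 # weighted mean (naive over yaw, for brevity)
print("pose estimate (x, y, yaw):", estimate.round(2))
```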
|
|
16:30-18:00, Paper WeCT25-NT.3 | Add to My Program |
The LuViRA Dataset: Synchronized Vision, Radio, and Audio Sensors for Indoor Localization |
|
Yaman, Ilayda | Lund University |
Tian, Guoda | Lund University |
Larsson, Martin | Lund University |
Persson, Patrik | Lund University |
Sandra, Michiel | Lund University |
Dürr, Alexander | Lund University |
Tegler, Erik | Lund University |
Challa, Nikhil | Lund University |
Garde, Henrik | Lund University |
Tufvesson, Fredrik | Lund University |
Åström, Karl | LTH, Lund University |
Edfors, Ove | Lund University |
Malkowsky, Steffen | Lund University |
Liu, Liang | Lund University |
Keywords: Localization, Data Sets for SLAM, Sensor Fusion
Abstract: We present a synchronized multisensory dataset for accurate and robust indoor localization: the Lund University Vision, Radio, and Audio (LuViRA) Dataset. The dataset includes color images, corresponding depth maps, inertial measurement unit (IMU) readings, channel response between a 5G massive multiple-input and multiple-output (MIMO) testbed and user equipment, audio recorded by 12 microphones, and accurate six degrees of freedom (6DOF) pose ground truth of 0.5 mm. We synchronize these sensors to ensure that all data is recorded simultaneously. A camera, speaker, and transmit antenna are placed on top of a slowly moving service robot, and 88 trajectories are recorded. Each trajectory includes 20 to 50 seconds of recorded sensor data and ground truth labels. Data from different sensors can be used separately or jointly to perform localization tasks, and data from the motion capture (mocap) system is used to verify the results obtained by the localization algorithms. The main aim of this dataset is to enable research on sensor fusion with the most commonly used sensors for localization tasks. Moreover, the full dataset or some parts of it can also be used for other research areas such as channel estimation, image classification, etc. Our dataset is available at: https://github.com/ilaydayaman/LuViRA_Dataset.
|
|
16:30-18:00, Paper WeCT25-NT.4 | Add to My Program |
Dual-IMU State Estimation for Relative Localization of Two Mobile Agents |
|
Lai, Wenqian | XREAL |
Guo, Ruonan | XREAL |
Wu, Kejian | XREAL |
Keywords: Localization, Multi-Robot Systems, Sensor Fusion
Abstract: In this paper, we address the problem of relative localization of two mobile agents. Specifically, we consider the Dual-IMU system, where each agent is equipped with one IMU, and employs relative pose observations between them. Previous works, however, typically assumed known ego motion and ignored biases of the IMUs. Instead, we study the most general case of unknown biases for both IMUs. Besides the derivation of dynamic model equations of the proposed system, we focus on the observability analysis, for the observability under general motion and the unobservable directions arising from various special motions. Through numerical simulations, we validate our key observability findings and examine their impact on the estimation accuracy and consistency. Finally, the system is implemented to achieve effective relative localization of an HMD with respect to a vehicle moving in the real world.
|
|
16:30-18:00, Paper WeCT25-NT.5 | Add to My Program |
Accurate Prior-Centric Monocular Positioning with Offline LiDAR Fusion |
|
He, Jinhao | The Hong Kong University of Science and Technology (Guangzhou) |
Huang, Huaiyang | The Hong Kong University of Science and Technology |
Zhang, Shuyang | The Hong Kong University of Science and Technology |
Jiao, Jianhao | University College London |
Liu, Chengju | Tongji University |
Liu, Ming | Hong Kong University of Science and Technology (Guangzhou) |
Keywords: Localization, Robotics in Under-Resourced Settings, Sensor Fusion
Abstract: Unmanned vehicles usually rely on Global Positioning System (GPS) and Light Detection and Ranging (LiDAR) sensors to achieve high-precision localization for navigation purposes. However, this combination, with its associated costs and infrastructure demands, poses challenges for widespread adoption in mass-market applications. In this paper, we aim to use only a monocular camera to achieve comparable onboard localization performance by tracking deep-learning visual features on a LiDAR-enhanced visual prior map. Experiments show that the proposed algorithm can provide centimeter-level global positioning results with scale, and it is effortlessly integrated and favorable for low-cost robot system deployment in real-world applications.
|
|
16:30-18:00, Paper WeCT25-NT.6 | Add to My Program |
A Nonlinear Estimator for Dead Reckoning of Aquatic Surface Vehicles Using an IMU and a Doppler Velocity Log |
|
Paterson, Jessica | University of Manchester |
Adorno, Bruno Vilhena | The University of Manchester |
Lennox, Barry | The University of Manchester |
Groves, Keir | The University of Manchester |
Keywords: Localization, Sensor Fusion
Abstract: Aquatic robots require an accurate and reliable localization system to navigate autonomously and perform practical missions. Kalman filters (KFs) and their variants are typically used in aquatic robots to combine sensor data. The two critical drawbacks of KFs are the requirement for skilled tuning of several filter parameters and the fact that changes to how the Inertial Measurement Unit (IMU) is oriented necessitate modifying the filter. To overcome those problems, this paper presents a novel method of fusing sensor data from a Doppler Velocity Log (DVL) and IMU using an adaptive nonlinear estimator to provide dead reckoning localization for a small autonomous surface vehicle. The proposed method has only one insensitive tuning parameter and is agnostic to the configuration of the IMU. The system was validated using a small ASV in a 2.4x3.6x2.4 m water tank, with a motion capture system as ground truth, and was evaluated against a state-of-the-art method based on KFs. Experiments showed that the average drift error of the nonlinear filter was 0.16 m (s.d. 0.06 m) compared to 0.15 m (s.d. 0.05 m) for the state of the art, meaning that the benefits in terms of tuning and flexible configuration do not come at the expense of performance.
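The basic DVL-IMU dead-reckoning computation, rotating body-frame velocities into the world frame using the IMU yaw and integrating, is sketched below. The adaptive nonlinear estimator and its single tuning parameter are not reproduced; this is only the underlying kinematic bookkeeping, with hypothetical inputs.

```python
import numpy as np

def dead_reckon(dvl_body_vel, imu_yaw, dt=0.1, start=(0.0, 0.0)):
    """Planar dead reckoning for a surface vehicle: rotate DVL body-frame
    velocities into the world frame with the IMU yaw and integrate."""
    pos = np.array(start, dtype=float)
    track = [pos.copy()]
    for v_body, yaw in zip(dvl_body_vel, imu_yaw):
        c, s = np.cos(yaw), np.sin(yaw)
        v_world = np.array([c * v_body[0] - s * v_body[1],
                            s * v_body[0] + c * v_body[1]])
        pos += v_world * dt
        track.append(pos.copy())
    return np.array(track)

# Hypothetical 20 s run: constant 0.2 m/s surge while slowly turning.
t = np.arange(0, 20, 0.1)
track = dead_reckon(dvl_body_vel=np.column_stack([np.full_like(t, 0.2), np.zeros_like(t)]),
                    imu_yaw=0.05 * t)
print("final position:", track[-1].round(2))
```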
|
|
16:30-18:00, Paper WeCT25-NT.7 | Add to My Program |
Range-Visual-Inertial Sensor Fusion for Micro Aerial Vehicle Localization and Navigation |
|
Goudar, Abhishek | University of Toronto |
Zhao, Wenda | University of Toronto |
Schoellig, Angela P. | TU Munich |
Keywords: Localization, Sensor Fusion
Abstract: We propose a fixed-lag smoother-based sensor fusion architecture to leverage the complementary benefits of range-based sensors and visual-inertial odometry (VIO) for localization. We use two fixed-lag smoothers (FLS) to decouple accurate state estimation and high-rate pose generation for closed-loop control. The first FLS combines ultrawideband (UWB)-based range measurements and VIO to estimate the robot trajectory and any systematic biases that affect the range measurements in cluttered environments. The second FLS estimates smooth corrections to VIO to generate pose estimates at a high rate for online control. The proposed method is lightweight and can run on a computationally constrained micro-aerial vehicle (MAV). We validate our approach through closed-loop flight tests involving dynamic trajectories in multiple real-world cluttered indoor environments. Our method achieves decimeter-to-sub-decimeter-level positioning accuracy using off-the-shelf sensors and decimeter-level tracking accuracy with minimally-tuned open-source controllers.
|
|
16:30-18:00, Paper WeCT25-NT.8 | Add to My Program |
An Equivariant Approach to Robust State Estimation for the ArduPilot Autopilot System |
|
Fornasier, Alessandro | University of Klagenfurt |
Ge, Yixiao | Australian National University |
van Goor, Pieter | The Australian National University |
Scheiber, Martin | University of Klagenfurt |
Tridgell, Andrew | Australian National University |
Mahony, Robert | Australian National University |
Weiss, Stephan | Universität Klagenfurt |
Keywords: Localization, Sensor Fusion, Aerial Systems: Applications
Abstract: The majority of commercial and open-source autopilot software for uncrewed aerial vehicles relies on the tried and tested extended Kalman filter (EKF) to provide the state estimation solution for the inertial navigation system (INS). While modern implementations achieve remarkable robustness, it is often due to the careful implementation of exception code for a multitude of corner cases along with significant skilled tuning effort. In this paper, we use the data wealth of the ArduPilot community to identify and highlight the most common real-world challenges in INS state estimation, including sensor self-calibration, robustness in static conditions, global navigation satellite system (GNSS) outliers and shifts, and robustness to faulty inertial measurement units (IMUs). We propose a novel equivariant filter (EqF) formulation for the INS solution that exploits a Semi-Direct-Bias symmetry group for multi-sensor fusion with self-calibration capabilities and incorporates equivariant velocity-type measurements. We augment the filter with a simple innovation-covariance inflation strategy that seamlessly handles GNSS outliers and shifts without requiring coding of a whole set of exception cases. We use real-world data from the ArduPilot community to demonstrate the performance of the proposed filter on known cases where existing filters fail without careful exception handling or case-specific tuning, and benchmark against ArduPilot’s EKF3, the most sophisticated EKF implementation currently available.
|
|
16:30-18:00, Paper WeCT25-NT.9 | Add to My Program |
Robust Indoor Localization with Ranging-IMU Fusion |
|
Jiang, Fan | Georgia Institute of Technology |
Caruso, David | Facebook Reality Lab |
Dhekne, Ashutosh | Georgia Institute of Technology |
Qu, Qi | Meta |
Engel, Jakob | Facebook |
Dong, Jing | Facebook |
Keywords: Localization, Sensor Fusion, SLAM
Abstract: Indoor wireless ranging localization is a promising approach for low-power and high-accuracy localization of wearable devices. A primary challenge in this domain stems from non-line-of-sight propagation of radio waves. This study tackles a fundamental issue in wireless ranging: the unpredictability of real-time multipath determination, especially in challenging conditions such as when there is no direct line of sight. We achieve this by fusing range measurements with inertial measurements obtained from a low-cost Inertial Measurement Unit (IMU). For this purpose, we introduce a novel asymmetric noise model crafted specifically for non-Gaussian multipath disturbances. Additionally, we present a novel Levenberg-Marquardt (LM)-family trust-region adaptation of the iSAM2 fusion algorithm, which is optimized for robust performance on our ranging-IMU fusion problem. We evaluate our solution in a densely occupied real office environment. Our proposed solution can achieve temporally consistent localization with an average absolute accuracy of ~0.3 m in real-world settings. Furthermore, our results indicate that we can achieve comparable accuracy even with infrequent range measurements down to 1 Hz.
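As intuition for the asymmetric noise idea only (nothing below is taken from the paper): multipath and non-line-of-sight propagation tend to lengthen measured ranges, so over-long residuals can be treated with a larger standard deviation than under-short ones. The parameter names and values in this sketch are assumptions.

```python
import numpy as np

def asymmetric_range_nll(residual, sigma_short=0.05, sigma_long=0.50):
    """Negative log-likelihood of a range residual (measured - predicted)
    under a two-sided Gaussian: positive residuals, which multipath/NLOS
    typically produces, are penalized more gently than negative ones.
    Illustrative only; not the paper's model."""
    residual = np.asarray(residual, dtype=float)
    sigma = np.where(residual > 0.0, sigma_long, sigma_short)
    return 0.5 * (residual / sigma) ** 2 + np.log(sigma)

# A 1 m over-measurement costs far less than a 1 m under-measurement.
print(asymmetric_range_nll([+1.0, -1.0]))
```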
|
|
WeCT26-NT Oral Session, NT-G404 |
Add to My Program |
SLAM III |
|
|
Chair: Sanchez-Lopez, Jose Luis | University of Luxembourg |
Co-Chair: Garg, Sourav | University of Adelaide |
|
16:30-18:00, Paper WeCT26-NT.1 | Add to My Program |
Efficient Pose Prediction with Rational Regression Applied to VSLAM |
|
Terzakis, George | Varjo Technologies |
Lourakis, Manolis | Foundation for Research and Technology -- Hellas |
Keywords: SLAM, Localization, Visual Tracking
Abstract: Compared to polynomial splines, rational functions are known to be more efficient and well-behaved data fitting models. However, due to the potential presence of zeros in their denominator, rational functions tend to yield notoriously hard optimization problems. In this work, we present a novel least squares method for 6D pose prediction that employs rational regression. Our method can accommodate fixed data points and is able to circumvent the occurrence of zeros for rational quadratic interpolants. We demonstrate the suitability of rational quadratics for pose prediction by applying our approach to real data from the feature tracking stage of a real-time visual SLAM system and showing that it yields far more stable predictions when compared to state-of-the-art rational and polynomial spline methods.
|
|
16:30-18:00, Paper WeCT26-NT.2 | Add to My Program |
IMU-Aided Event-Based Stereo Visual Odometry |
|
Niu, Junkai | Hunan University |
Zhong, Sheng | Hunan University |
Zhou, Yi | Hunan University |
Keywords: SLAM, Localization, Mapping
Abstract: Direct methods for event-based visual odometry solve the mapping and camera pose tracking sub-problems by establishing implicit data association in a way that the generative model of events is exploited. The main bottlenecks faced by state-of-the-art work in this field include the high computational complexity of mapping and the limited accuracy of tracking. In this paper, we improve our previous direct pipeline, Event-based Stereo Visual Odometry, in terms of accuracy and efficiency. To speed up the mapping operation, we propose an efficient strategy of edge-pixel sampling according to the local dynamics of events. The mapping performance in terms of completeness and local smoothness is also improved by combining the temporal stereo results and the static stereo results. To circumvent the degeneracy issue of camera pose tracking in recovering the yaw component of general 6-DoF motion, we introduce gyroscope measurements as a prior via pre-integration. Experiments on publicly available datasets justify our improvement. We release our pipeline as open-source software for future research in this field.
|
|
16:30-18:00, Paper WeCT26-NT.3 | Add to My Program |
S-Graphs+: Real-Time Localization and Mapping Leveraging Hierarchical Representations |
|
Bavle, Hriday | University of Luxembourg |
Sanchez-Lopez, Jose Luis | Interdisciplinary Center for Security, Reliability and Trust (Sn |
Shaheer, Muhammad | University of Luxembourg |
Civera, Javier | Universidad De Zaragoza |
Voos, Holger | University of Luxembourg |
Keywords: SLAM, Localization, Mapping
Abstract: In this paper, we present an evolved version of Situational Graphs, which jointly models in a single optimizable factor graph (1) a pose graph, as a set of robot keyframes comprising associated measurements and robot poses, and (2) a 3D scene graph, as a high-level representation of the environment that encodes its different geometric elements with semantic attributes and the relational information between them. Specifically, our S-Graphs+ is a novel four-layered factor graph that includes: (1) a keyframes layer with robot pose estimates, (2) a walls layer representing wall surfaces, (3) a rooms layer encompassing sets of wall planes, and (4) a floors layer gathering the rooms within a given floor level. The above graph is optimized in real-time to obtain a robust and accurate estimate of the robot’s pose and its map, simultaneously constructing and leveraging high-level information of the environment. To extract this high-level information, we present novel room and floor segmentation algorithms utilizing the mapped wall planes and free-space clusters. We tested S-Graphs+ on multiple datasets, including simulated and real data of indoor environments from varying construction sites, and on a real public dataset of several indoor office areas. On average over our datasets, S-Graphs+ outperforms the accuracy of the second-best method by a margin of 10.67%, while extending the robot situational awareness by a richer scene model. Moreover, we make the software available as a d
|
|
16:30-18:00, Paper WeCT26-NT.4 | Add to My Program |
Visual Place Recognition: A Tutorial |
|
Schubert, Stefan | Chemnitz University of Technology |
Neubert, Peer | University of Koblenz |
Garg, Sourav | University of Adelaide |
Milford, Michael J | Queensland University of Technology |
Fischer, Tobias | Queensland University of Technology |
Keywords: SLAM, Localization, Mapping
Abstract: Localization is an essential capability for mobile robots. A rapidly growing field of research in this area is Visual Place Recognition (VPR), which is the ability to recognize previously seen places in the world based solely on images. This work is the first tutorial paper on visual place recognition. It unifies the terminology of VPR and complements prior research in two important directions: 1) It provides a systematic introduction for newcomers to the field, covering topics such as the formulation of the VPR problem, a general-purpose algorithmic pipeline, an evaluation methodology for VPR approaches, and the major challenges for VPR and how they may be addressed. 2) As a contribution for researchers acquainted with the VPR problem, it examines the intricacies of different VPR problem types regarding input, data processing, and output. The tutorial also discusses the subtleties behind the evaluation of VPR algorithms, e.g., the evaluation of a VPR system that has to find all matching database images per query, as opposed to just a single match. Practical code examples in Python illustrate to prospective practitioners and researchers how VPR is implemented and evaluated.
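The tutorial's own Python examples are not reproduced in this program; as a rough stand-in, the sketch below shows the single-best-match formulation of VPR (one global descriptor per image, cosine similarity, nearest database entry). The `describe` function is a hypothetical placeholder for a real descriptor such as a CNN embedding, and the random "images" are assumptions.

```python
import numpy as np

def describe(image):
    """Placeholder global descriptor (hypothetical): a real VPR system would
    use e.g. a CNN embedding or an aggregated local-feature vector."""
    v = image.astype(np.float64).ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def retrieve(query_image, database_images):
    """Single-best-match VPR: return the index of the database image whose
    descriptor has the highest cosine similarity to the query descriptor."""
    q = describe(query_image)
    sims = np.array([describe(d) @ q for d in database_images])
    return int(np.argmax(sims)), sims

# Toy usage with random arrays standing in for images from a mapping run.
rng = np.random.default_rng(0)
db = [rng.random((32, 32)) for _ in range(5)]
best, scores = retrieve(db[3] + 0.01 * rng.random((32, 32)), db)
print(best, scores.round(3))
```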
|
|
16:30-18:00, Paper WeCT26-NT.5 | Add to My Program |
Multi-Radar Inertial Odometry for 3D State Estimation Using mmWave Imaging Radar |
|
Huang, Jui-Te | Carnegie Mellon University |
Xu, Ruoyang | Carnegie Mellon University |
Hinduja, Akshay | Carnegie Mellon University |
Kaess, Michael | Carnegie Mellon University |
Keywords: SLAM, Localization, Robotics in Hazardous Fields
Abstract: State estimation is a crucial component for the successful implementation of robotic systems, relying on sensors such as cameras, LiDAR, and IMUs. However, in real-world scenarios, the performance of these sensors is degraded by challenging environments, e.g. adverse weather conditions and low-light scenarios. The emerging 4D imaging radar technology is capable of providing robust perception in adverse conditions. Despite its potential, challenges remain for indoor settings where noisy radar data does not present clear geometric features. Moreover, disparities in radar data resolution and field of view (FOV) can lead to inaccurate measurements. While prior research has explored radar-inertial odometry based on Doppler velocity information, challenges remain for the estimation of 3D motion because of the discrepancy in the FOV and resolution of the radar sensor. In this paper, we address Doppler velocity measurement uncertainties. We present a method to optimize body frame velocity while managing Doppler velocity uncertainty. Based on our observations, we propose a dual imaging radar configuration to mitigate the challenge of discrepancy in radar data. To attain high-precision 3D state estimation, we introduce a strategy that seamlessly integrates radar data with a consumer-grade IMU sensor using fixed-lag smoothing optimization. Finally, we evaluate our approach using real-world 3D motion data.
|
|
16:30-18:00, Paper WeCT26-NT.6 | Add to My Program |
Semantically Guided Feature Matching for Visual SLAM |
|
Ilter, Oguzhan | ETH Zürich |
Armeni, Iro | Stanford University |
Pollefeys, Marc | ETH Zurich |
Barath, Daniel | MTA SZTAKI; Visual Recognition Group in CTU Prague |
Keywords: SLAM, Localization, Semantic Scene Understanding
Abstract: We introduce a new algorithm that utilizes semantic information to enhance feature matching in visual SLAM pipelines. The proposed method constructs a high-dimensional semantic descriptor for each detected ORB feature. When integrated with traditional visual ones, these descriptors aid in establishing accurate tentative point correspondences between consecutive frames. Additionally, our semantic descriptors enrich 3D map points, enhancing loop closure detection by providing deeper insights into the underlying map regions. Experiments on public large-scale datasets demonstrate that our technique surpasses the accuracy of established methods. Importantly, given its detector-agnostic nature, our algorithm also amplifies the efficacy of modern keypoint detectors, such as SuperPoint. The implementation of our algorithm can be found on Github.
|
|
16:30-18:00, Paper WeCT26-NT.7 | Add to My Program |
DVI-SLAM: A Dual Visual Inertial SLAM Network |
|
Peng, Xiongfeng | Samsung R&D Institute China-Beijing |
Liu, Zhihua | Samsung Research Center, Beijing, China |
Li, Weiming | Samsung Advanced Institute of Technology (SAIT) |
Tan, Ping | Simon Fraser University |
Cho, SoonYong | Samsung Advanced Institute of Technology |
Wang, Qiang | Samsung |
Keywords: SLAM, Localization, Visual-Inertial SLAM
Abstract: Recent deep learning based visual simultaneous localization and mapping (SLAM) methods have made significant progress. However, how to make full use of visual information, as well as better integrate it with an inertial measurement unit (IMU) in visual SLAM, still has considerable research value. This paper proposes a novel deep SLAM network with dual visual factors. The basic idea is to integrate both photometric factor and re-projection factor into the end-to-end differentiable structure through a multi-factor data association module. We show that the proposed network dynamically learns and adjusts the confidence maps of both visual factors and it can be further extended to include the IMU factors as well. Extensive experiments validate that our proposed method significantly outperforms the state-of-the-art methods on several public datasets, including TartanAir, EuRoC and ETH3D-SLAM. Specifically, when dynamically fusing the three factors together, the absolute trajectory error for the monocular and stereo configurations on the EuRoC dataset is reduced by 45.3% and 36.2%, respectively.
|
|
16:30-18:00, Paper WeCT26-NT.8 | Add to My Program |
DMSA - Dense Multi Scan Adjustment for LiDAR Inertial Odometry and Global Optimization |
|
Skuddis, David | University of Stuttgart |
Haala, Norbert | University of Stuttgart |
Keywords: SLAM, Mapping
Abstract: We propose a new method for fine registering multiple point clouds simultaneously. The approach is characterized by being dense; that is, point clouds are not reduced to pre-selected features in advance. Furthermore, the approach is robust against small overlaps and dynamic objects, since no direct correspondences are assumed between point clouds. Instead, all points are merged into a global point cloud, whose scattering is then iteratively reduced. This is achieved by dividing the global point cloud into uniform grid cells whose contents are subsequently modeled by normal distributions. We show that the proposed approach can be used in a sliding window continuous trajectory optimization combined with IMU measurements to obtain a highly accurate and robust LiDAR inertial odometry estimation. Furthermore, we show that the proposed approach is also suitable for large-scale keyframe optimization to increase accuracy. We provide the source code and some experimental data on https://github.com/davidskdds/DMSA_LiDAR_SLAM.git.
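To make the "grid cells modeled by normal distributions" idea concrete, here is a minimal sketch of how the scattering of a merged point cloud might be scored per cell. The actual DMSA objective and its sliding-window optimization are considerably more involved; the cell size and point threshold below are assumptions.

```python
import numpy as np
from collections import defaultdict

def cell_scatter_score(points, cell_size=1.0, min_pts=5):
    """Bin a merged point cloud into uniform grid cells, fit a normal
    distribution (mean/covariance) per cell, and sum a scatter measure
    (trace of the covariance). Lower scores indicate a crisper, better
    aligned global cloud. Sketch of the scoring idea only."""
    cells = defaultdict(list)
    for p in points:
        cells[tuple(np.floor(p / cell_size).astype(int))].append(p)
    score = 0.0
    for pts in cells.values():
        if len(pts) >= min_pts:
            cov = np.cov(np.asarray(pts).T)   # 3x3 covariance of the cell
            score += np.trace(np.atleast_2d(cov))
    return score

pts = np.random.default_rng(1).normal(size=(1000, 3))
print(cell_scatter_score(pts))
```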
|
|
16:30-18:00, Paper WeCT26-NT.9 | Add to My Program |
CTA-LO: Accurate and Robust LiDAR Odometry Using Continuous-Time Adaptive Estimation |
|
Lv, Yuezhang | Northeastern University |
Zhang, Yunzhou | Northeastern University |
Zhao, Xiaoyu | Northeastern University, China |
Li, Wu | Northeastern University |
Ning, Jian | Northeastern University |
Jin, Yang | Northeastern University |
Keywords: SLAM, Mapping
Abstract: Accurate and robust LiDAR odometry is a crucial technology for robot localization. However, motion distortion and ranging error make it a bottleneck. Most existing methods are limited in accuracy and robustness because they simply compensate for motion distortion with a constant-velocity motion assumption, without an accurate model of ranging error. In this paper, we propose a high-precision and robust LiDAR odometry (LO), which utilizes continuous-time estimation to remove LiDAR distortion and builds a spot uncertainty model to quantify the ranging error. Generally, the number of variables in continuous-time estimation is several times higher than in discrete-time estimation, leading to insufficient constraints on the LiDAR odometry. To solve this problem, we propose a marginalization method to retain prior scans' constraints by exploiting the local support property of the B-spline. To further improve the odometry accuracy, we propose a residual adaptive weighting method and a probabilistic point cloud map based on the spot uncertainty model of LiDAR points. The experimental results show that our method outperforms state-of-the-art LiDAR odometry in accuracy and robustness.
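The local support property of B-splines mentioned above is easy to see with a small continuous-time translation spline: the value at any time depends on only k+1 neighbouring control points, which is what makes marginalizing old control points tractable. The sketch below uses SciPy's BSpline on made-up control points and is purely illustrative, not the paper's estimator.

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3                                    # cubic B-spline
ctrl = np.array([[0, 0, 0], [1, 0, 0], [2, 1, 0],
                 [3, 1, 1], [4, 2, 1], [5, 2, 2]], dtype=float)  # translation control points
n = len(ctrl)
# Clamped knot vector of length n + k + 1.
knots = np.concatenate(([0.0] * k, np.linspace(0.0, 1.0, n - k + 1), [1.0] * k))
trajectory = BSpline(knots, ctrl, k)

# Local support: the position at any normalized time t is a combination of
# only k + 1 = 4 neighbouring control points.
t = 0.37
print(trajectory(t))
```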
|
|
WeCT33-CC Oral Session, CC-301 |
Add to My Program |
Networked and Cooperating Robots |
|
|
Chair: Goldberg, Ken | UC Berkeley |
Co-Chair: Haddadin, Sami | Technical University of Munich |
|
16:30-18:00, Paper WeCT33-CC.1 | Add to My Program |
Coordinated Landing Control for Cross-Domain UAV-USV Fleets Using Heterogeneous-Feature Matching |
|
Ding, Jianing | Huazhong University of Science and Technology |
Zhang, Hai-Tao | Huazhong University of Science and Technology |
Hu, Binbin | Nanyang Technological University |
Keywords: Multi-Robot Systems, Cooperating Robots
Abstract: Coordinated landing control for multiple unmanned aerial vehicles (UAVs) on appropriate multiple unmanned surface vehicles (USVs) is an urgent yet challenging mission with the tremendous development of the modern marine industry. To this end, we propose a coordinated multiple UAV-USV landing control algorithm via heterogeneous-feature matching. Specifically, the heterogeneous landing features of different UAVs and USVs are extracted to establish a dynamic UAV-USV cooperative landing ability mapping for the cross-domain UAV-USV fleets (CDUUFs). Then, by incorporating suitable allocation with UAV-USV landing convergence and collision avoidance among UAVs into constraints with the assistance of both control Lyapunov functions (CLFs) and control barrier functions (CBFs), the multiple UAV-USV landing control problem is formulated as a constraint-based optimization one. Therein, slack variables are introduced to fulfill the assignment and facilitate the search for a balanced solution between control performance and landing safety. Finally, extensive simulations are conducted to substantiate the effectiveness of the proposed multiple UAV-USV landing control law.
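For readers unfamiliar with the CLF/CBF machinery mentioned above, here is a minimal single-agent safety-filter QP written with cvxpy: it minimally alters a nominal velocity command subject to one control barrier constraint under single-integrator dynamics. This is a generic textbook-style sketch, not the paper's multi-UAV-USV formulation; the function, names, and values are assumptions.

```python
import numpy as np
import cvxpy as cp

def safety_filtered_input(u_nom, p, p_obs, r_safe=1.0, alpha=1.0):
    """Minimally modify a nominal velocity command so the control barrier
    function h(p) = ||p - p_obs||^2 - r_safe^2 stays non-negative for a
    single integrator: grad_h(p) @ u + alpha * h(p) >= 0."""
    u = cp.Variable(2)
    h = float(np.dot(p - p_obs, p - p_obs)) - r_safe ** 2
    grad_h = 2.0 * (p - p_obs)
    constraints = [grad_h @ u + alpha * h >= 0]
    problem = cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)), constraints)
    problem.solve()
    return u.value

# Nominal command drives straight toward an obstacle; the filter slows it down.
print(safety_filtered_input(u_nom=np.array([1.0, 0.0]),
                            p=np.array([0.0, 0.0]),
                            p_obs=np.array([2.0, 0.0])))
```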
|
|
16:30-18:00, Paper WeCT33-CC.2 | Add to My Program |
Distributed Control Barrier Functions for Global Connectivity Maintenance |
|
De Carli, Nicola | CNRS |
Salaris, Paolo | University of Pisa |
Robuffo Giordano, Paolo | Irisa Cnrs Umr6074 |
Keywords: Multi-Robot Systems, Cooperating Robots, Aerial Systems: Perception and Autonomy
Abstract: In this work, we propose a framework for the distributed implementation of Quadratic Programs-based controllers, building upon and rectifying a significant limitation in a previously presented approach. The proposed framework is primarily motivated by the distributed implementation of Control Barrier Functions (CBFs), whose primary objective is to make minimal adjustments to a nominal controller while ensuring constraint satisfaction. By improving over some limitations in the current state-of-the-art, we are able to apply distributed CBFs to the problem of global connectivity maintenance in presence of communication and sensing constraints. Specifically, we consider the problem of preserving connectivity for a group of quadrotors with onboard sensors under distance and field of view constraints. Leveraging distributed control barrier functions, our approach maintains global graph connectivity while optimizing the performance of the desired task. Numerical simulations validate its effectiveness.
|
|
16:30-18:00, Paper WeCT33-CC.3 | Add to My Program |
Stability Analysis of Distance-Angle Leader-Follower Formation Control |
|
Machida, Manao | NEC |
Ichien, Masumi | NEC Corporation |
Keywords: Multi-Robot Systems, Cooperating Robots, Nonholonomic Mechanisms and Systems
Abstract: Necessary and sufficient conditions are described for stable distance-angle leader-follower formation control of first- and second-order holonomic and non-holonomic mobile robots. The distance-angle leader-follower formation is a problem of maintaining the desired relative distance and orientation of robots in a group. Our analysis shows that the input constraints on the leader are necessary for stable formation control. These constraints are summarized as follows: 1) In a team of first (second) order holonomic mobile robots, the leader has to be controlled as a first (second) order non-holonomic mobile robot; 2) In a team of first (second) order non-holonomic mobile robots, the control input of the leader must be limited so that the curvature is first (second) order differentiable. We further show that these constraints are sufficient for the followers to maintain formation. Moreover, we present globally asymptotically stable controllers and describe simulation experiments that demonstrate the effectiveness of these controllers.
|
|
16:30-18:00, Paper WeCT33-CC.4 | Add to My Program |
A Distributed Multi-Robot Framework for Exploration, Information Acquisition and Consensus |
|
Patwardhan, Aalok | Imperial College London |
Davison, Andrew J | Imperial College London |
Keywords: Multi-Robot Systems, Cooperating Robots, Path Planning for Multiple Mobile Robots or Agents
Abstract: The distributed coordination of robot teams performing complex tasks is challenging to formulate. The different aspects of a complete task such as local planning for obstacle avoidance, global goal coordination and collaborative mapping are often solved separately, when clearly each of these should influence the others for the most efficient behaviour. In this paper we use the example application of distributed information acquisition as a robot team explores a large space to show that we can formulate the whole problem as a single factor graph with multiple connected layers representing each aspect. We use Gaussian Belief Propagation (GBP) as the inference mechanism, which permits parallel, on-demand or asynchronous computation for efficiency when different aspects are more or less important. This is the first time that a distributed GBP multi-robot solver has been proven to enable intelligent collaborative behaviour rather than just guiding robots to individual, selfish goals. We encourage the reader to view our demos at https://aalpatya.github.io/gbpstack.
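For readers unfamiliar with Gaussian Belief Propagation, the following scalar toy example shows the message algebra in information (canonical) form on a two-variable chain; the paper's multi-layer, multi-robot factor graph applies the same operations at much larger scale. All numbers here are made up.

```python
import numpy as np

# Toy 1-D chain: prior on x1, relative measurement z = x2 - x1 + noise.
mu1, P1 = 0.0, 1.0          # prior mean / variance on x1
z, R = 2.0, 0.25            # relative measurement and its noise variance

# Information (canonical) form of the message sent from x1's prior.
lam1, eta1 = 1.0 / P1, mu1 / P1

# Information form of the relative factor over (x1, x2) with Jacobian J = [-1, 1].
J = np.array([[-1.0, 1.0]])
Lf = J.T @ J / R            # 2x2 factor precision block
ef = J.T.ravel() * z / R    # factor information vector

# Factor-to-x2 message: absorb the x1 message, then marginalize x1 out.
L11 = Lf[0, 0] + lam1
L12, L22 = Lf[0, 1], Lf[1, 1]
e1, e2 = ef[0] + eta1, ef[1]
lam_msg = L22 - L12 * L12 / L11
eta_msg = e2 - L12 * e1 / L11

# Matches the exact marginal: mean 2.0, variance 1.25.
print("x2 marginal mean/var:", eta_msg / lam_msg, 1.0 / lam_msg)
```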
|
|
16:30-18:00, Paper WeCT33-CC.5 | Add to My Program |
Environmental Awareness Dynamic 5G QoS for Retaining Real Time Constraints in Robotic Applications |
|
Damigos, Gerasimos | Ericsson AB - Ericsson Research |
Saradagi, Akshit | Luleå University of Technology, Luleå, Sweden |
Sandberg, Sara | Ericsson AB - Ericsson Research |
Nikolakopoulos, George | Luleå University of Technology |
Keywords: Networked Robots, Aerial Systems: Applications
Abstract: The fifth generation (5G) cellular network technology is mature and increasingly utilized in many industrial and robotics applications, with advanced Quality of Service (QoS) features being an important functionality. Despite the prevalence of 5G QoS discussions in the related literature, there is a notable absence of real-life implementations and studies concerning their application in time-critical robotics scenarios. This article considers the operation of time-critical applications for 5G-enabled unmanned aerial vehicles (UAVs) and how their operation can be improved by the ability to dynamically switch between QoS data flows with different priorities. As such, we introduce a robotics-oriented analysis of the impact of the 5G QoS functionality on the performance of 5G-enabled UAVs. Furthermore, we introduce a novel framework for the dynamic selection of distinct 5G QoS data flows that is autonomously managed by the 5G-enabled UAV. This problem is addressed in a novel feedback-loop fashion utilizing a probabilistic finite state machine (PFSM). Finally, the efficacy of the proposed scheme is experimentally validated with a 5G-enabled UAV in a real-world 5G stand-alone (SA) network.
|
|
16:30-18:00, Paper WeCT33-CC.6 | Add to My Program |
CloudGripper: An Open Source Cloud Robotics Testbed for Robotic Manipulation Research, Benchmarking and Data Collection at Scale |
|
Zahid, Muhammad | KTH Royal Institute of Technology |
Pokorny, Florian T. | KTH Royal Institute of Technology |
Keywords: Networked Robots, Data Sets for Robot Learning, Engineering for Robotic Systems
Abstract: We present CloudGripper, an open source cloud robotics testbed, consisting of a scalable, space and cost-efficient design constructed as a rack of 32 small robot arm work cells. Each robot work cell is fully enclosed and features individual lighting, a low-cost Cartesian robot arm with an attached rotatable parallel jaw gripper and a dual camera setup for experimentation. The system design is focused on continuous operation and features a 10 Gbit/s network connectivity allowing for high throughput remote-controlled experimentation and data collection for robotic manipulation. Furthermore, CloudGripper is intended to form a community testbed to study the challenges of large scale machine learning and cloud and edge-computing in the context of robotic manipulation. In this work, we describe the mechanical design of the system, its initial software stack and evaluate the repeatability of motions executed by the proposed robot arm design. A local network API throughput and latency analysis is also provided. CloudGripper-Rope-100, a dataset of more than a hundred hours of randomized rope pushing interactions and approximately 4 million camera images is collected and serves as a proof of concept demonstrating data collection capabilities. A project website with more information is available at https://cloudgripper.org.
|
|
16:30-18:00, Paper WeCT33-CC.7 | Add to My Program |
FogROS2-Config: A Toolkit for Choosing Server Configurations for Cloud Robotics |
|
Chen, Kaiyuan | University of California, Berkeley |
Hari, Kush | UC Berkeley |
Khare, Rohil | UC Berkeley |
Le, Charlotte | University of California, Berkeley |
Chung, Trinity | UC Berkeley |
Drake, Jaimyn | University of California, Berkeley |
Ichnowski, Jeffrey | Carnegie Mellon University |
Kubiatowicz, John | UC Berkeley |
Goldberg, Ken | UC Berkeley |
Keywords: Networked Robots, Distributed Robot Systems, Multi-Robot Systems
Abstract: Cloud service providers offer a dynamically changing set of over 50,000 distinct cloud server options. To help roboticists make cost-effective decisions, we present FogROS2-Config, an open toolkit that takes ROS2 nodes as input and automatically runs relevant benchmarks to quickly return a menu of cloud compute services that trade off latency and cost. Because it is infeasible to try every hardware configuration, FogROS2-Config quickly samples and tests a small set of edge-case servers. We evaluate FogROS2-Config on three robotics application tasks: visual SLAM, grasp planning, and motion planning. FogROS2-Config can reduce the cost by up to 20x. By comparing with a Pareto frontier for cost and latency obtained by running the application task on all available server configurations, we evaluate cost and latency models and confirm that FogROS2-Config selects efficient hardware configurations to balance cost and latency.
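As a small illustration of the cost/latency menu idea, the sketch below extracts the Pareto-optimal configurations from a set of benchmarked (cost, latency) pairs. The server names and numbers are invented, and this is not FogROS2-Config's actual selection logic.

```python
def pareto_frontier(configs):
    """Return configurations not dominated in (cost, latency): a config is
    kept only if no other config is at least as cheap and at least as fast
    while differing in at least one of the two."""
    frontier = []
    for name, cost, lat in configs:
        dominated = any(c2 <= cost and l2 <= lat and (c2, l2) != (cost, lat)
                        for _, c2, l2 in configs)
        if not dominated:
            frontier.append((name, cost, lat))
    return frontier

# Hypothetical benchmark results: (server type, $/hour, latency in ms).
benchmarks = [("tiny", 0.05, 900), ("small", 0.10, 400),
              ("medium", 0.40, 380), ("medium-b", 0.50, 500), ("gpu", 1.20, 120)]
print(pareto_frontier(benchmarks))   # "medium-b" is dominated by "small"
```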
|
|
16:30-18:00, Paper WeCT33-CC.8 | Add to My Program |
Opportunistic Communication in Robot Teams |
|
Mox, Daniel | University of Pennsylvania |
Garg, Kashish | University of Pennsylvania |
Ribeiro, Alejandro | University of Pennsylvania |
Kumar, Vijay | University of Pennsylvania |
Keywords: Networked Robots, Multi-Robot Systems
Abstract: In this paper we present a new approach to Mobile Infrastructure on Demand (MID) where a dedicated team of robots creates and sustains a wireless network that satisfies the communication requirements of a different team of task-oriented robots seeking to coordinate their actions in the absence of existing communication infrastructure. Different from previous works, our approach forgoes heuristics for network performance such as algebraic-connectivity or network flow optimizations and instead positions communication support robots to directly maximize the probability of packet delivery by the underlying opportunistic routing protocol. Our system is task agnostic and practical to implement and operate on robots equipped with off-the-shelf WiFi radios. We demonstrate this through a set of experiments showing our MID system maintaining the delivery of critical mission data in a situational awareness setting and enabling foraging robots to effectively coordinate their actions during multi-robot exploration.
|
|
16:30-18:00, Paper WeCT33-CC.9 | Add to My Program |
Enhancing the Tracking Performance of Passivity-Based High-Frequency Robot Cloud Control |
|
Jakob, Fabian | Technical University of Munich |
Chen, Xiao | Technical University of Munich |
Sadeghian, Hamid | Technical University of Munich |
Haddadin, Sami | Technical University of Munich |
Keywords: Networked Robots, Telerobotics and Teleoperation
Abstract: This paper addresses the migration of high-frequency robot controllers to remote computing services, which are connected via a communication channel prone to delays and packet loss. The stability of the networked system is guaranteed by ensuring passivity of each subcomponent in the interconnection, as well as the Time-Domain-Passivity-Approach (TDPA) for the communication channel. We reduce conservatism of the TDPA using the model knowledge on both sides of the communication system to identify passivity excesses. This is further used to avoid over-dissipation of energy in the passivity controller by augmentation of a tolerable passivity-shortage. Tracking offsets are eliminated with a position drift compensation algorithm, for which convergence guarantees are provided. The experimental validation of the results conducted on a 7-DoF Franka Research 3 robot demonstrates a substantial enhancement in tracking performance due to the proposed modifications, particularly in scenarios with high communication delays.
|
|
WeCL-EX Poster Session, Exhibition Hall |
Add to My Program |
Late Breaking Results Poster VI |
|
|
|
16:30-18:00, Paper WeCL-EX.1 | Add to My Program |
Deformable Mobile Robot for Adaptive Grasping and Manipulation |
|
Labazanova, Luiza | The Hong Kong Polytechnic University |
Qiu, Liuming | The Hong Kong Polytechnic University |
Nakan, Shokan | The Hong Kong Polytechnic University, RoMI Lab |
Navarro-Alarcon, David | The Hong Kong Polytechnic University |
Keywords: Flexible Robotics, Product Design, Development and Prototyping, Multi-Contact Whole-Body Motion Planning and Control
Abstract: Modern robotics seeks a balance between effectiveness and versatility. Researchers have addressed this by incorporating variable morphology. In our prior work, we introduced the 2SR mobile robot capable of functioning as both a rigid and a flexible robot. While rigid, it excels in speed and strength; in a flexible mode, it adapts easily to its environment. However, the initial design suffered from slow phase transitions, a limitation solved in this work through modularity. Here, we present a mobile robot with variable morphology and a modular structure that can easily grasp and manipulate objects of arbitrary curved shapes. Experiments on path tracking with several objects have been conducted to validate the proposed design and the manipulation method.
|
|
16:30-18:00, Paper WeCL-EX.2 | Add to My Program |
Extending Industrial Robot Systems with Agile Communication Aspects |
|
Balogh, Marcell | Budapest University of Technology and Economics |
Vidacs, Attila | Budapest University of Technology and Economics |
Geza, Szabo | Ericsson Research |
Keywords: Networked Robots, Multi-Robot Systems, Distributed Robot Systems
Abstract: One of the key promises of Industry 4.0 is to highlight the connectivity among components. Although existing frameworks can manage networked robotic systems, the management of robotic and networking components is still rigidly separated. We propose a concept, supported by a ROS 2 simulation setup, of why and how modern robotic systems should include standardised connectivity features and network components. We present our idea of bringing robot systems and networks closer by augmenting current standard technologies. As part of the IEEE P2940 standardisation objectives, the formal description of robot systems is extended to include network components in a standardised way.
|
|
16:30-18:00, Paper WeCL-EX.3 | Add to My Program |
Leveraging Symbolic Models in Reinforcement Learning for Multi-Skill Chaining |
|
Lu, Wenhao | Chalmers |
Ramirez-Amaro, Karinne | Chalmers University of Technology |
Sjöberg, Jonas | Chalmers University of Technology |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: We present an integration of symbolic planning into reinforcement learning, aiming to enhance generalizability by reusing induced goal-conditioned robotic skills while reducing compounding errors in skill chaining for solving robotic tasks.
|
|
16:30-18:00, Paper WeCL-EX.4 | Add to My Program |
Reinforcement Learning with Task Decomposition and Task-Specific Reward System for Automation of High-Level Tasks |
|
Kwon, Gunam | Yeungnam University |
Oh, Sejik | Yeungnam University |
Jo, Hyojin | Yeungnam University |
Kim, Byeongjun | Yeungnam University |
Jung, Yujin | Yeungnam University |
Kwon, Nam Kyu | Yeungnam University |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Task and Motion Planning
Abstract: This paper introduces a reinforcement learning method that leverages task decomposition and a task-specific reward system to address complex high-level tasks, such as door opening, block stacking, and nut assembly. These tasks are decomposed into various subtasks, with the grasping and putting tasks executed through single joint and gripper actions, while other tasks are trained using the SAC algorithm alongside the task-specific reward system. The task-specific reward system aims to increase the learning speed, enhance the success rate, and enable more efficient task execution. The experimental results demonstrate the efficacy of the proposed method, achieving success rates of 99.9% for door opening, 95.25% for block stacking, 80.8% for square-nut assembly, and 90.9% for round-nut assembly. Overall, this method presents a promising solution to address the challenges associated with complex tasks, offering improvements over the traditional end-to-end approach.
|
|
16:30-18:00, Paper WeCL-EX.5 | Add to My Program |
Distributed Multi-Robot Multi-Target Tracking Using Heterogeneous Limited-Range Sensors |
|
Chen, Jun | Nanjing Normal University |
Abugurain, Mohammed | King Abdullah University of Science and Technology |
Dames, Philip | Temple University |
Park, Shinkyu | KAUST |
Xie, Fei | Nanjing Normal University |
Mao, Qi | City University of Hong Kong |
Keywords: Multi-Robot Systems, Sensor Networks, Path Planning for Multiple Mobile Robots or Agents
Abstract: This poster presents a cooperative multi-robot multi-target tracking framework aimed at enhancing the efficiency of the heterogeneous sensor network and, consequently, improving overall target tracking accuracy. The concept of normalized unused sensing capacity is introduced to quantify the information a sensor is currently gathering relative to its theoretical maximum. This measurement can be computed using entirely local information and is applicable to various sensor models, distinguishing it from previous literature on the subject. It is then utilized to develop a distributed coverage control strategy for a heterogeneous sensor network, adaptively balancing the workload based on each sensor's current unused capacity. The algorithm is validated through a series of ROS and MATLAB simulations, demonstrating superior results compared to standard approaches that do not account for heterogeneity or current usage rates.
|
|
16:30-18:00, Paper WeCL-EX.6 | Add to My Program |
Hardware Friendly Neuromorphic Architecture for Efficient and Fast Visibility Enhancement of Underwater Images |
|
Sudevan, Vidya | Khalifa University |
Zayer, Fakhreddine | Khalifa University |
Javed, Sajid | Khalifa University |
Karki, Hamad | Khalifa University |
De Masi, Giulia | Khalifa University |
Dias, Jorge | Khalifa University |
Keywords: Deep Learning for Visual Perception, Robotics in Under-Resourced Settings, Environment Monitoring and Management
Abstract: This work presents a hardware-friendly neuromorphic architecture that effectively improves the visibility of underwater images with significantly less energy consumption. Leveraging insights from the human visual system, the proposed approach adopts a biologically inspired neural network framework to mimic the processing of visual data in challenging underwater conditions. The architecture consists of a 19-layer spiking encoder-decoder framework with skip connections, which reconstructs the visibility-enhanced images from latent space representations of raw input images. The proposed method is trained and evaluated on the 'UIEB' dataset, with performance metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) compared to state-of-the-art CNN architectures. Additionally, model efficiency metrics are presented to demonstrate the potential of using spiking neural networks to develop biologically inspired architectures that improve underwater images.
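The PSNR and SSIM metrics cited above are standard; for reference, this is how they are typically computed with scikit-image on a reference/enhanced image pair. The random arrays below merely stand in for real underwater images, and the snippet is unrelated to the paper's spiking architecture.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.random((64, 64))                               # stand-in reference image
enhanced = np.clip(reference + 0.05 * rng.normal(size=(64, 64)), 0.0, 1.0)

psnr = peak_signal_noise_ratio(reference, enhanced, data_range=1.0)
ssim = structural_similarity(reference, enhanced, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.3f}")
```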
|
|
16:30-18:00, Paper WeCL-EX.7 | Add to My Program |
Realizing a Personal Adaptive Dressing Assistance Robot |
|
Yamasaki, Kakeru | Kyushu Institute of Technology |
Kajiwara, Takumi | Kyushu Institute of Technology |
Shibata, Tomohiro | Kyushu Institute of Technology |
Henaff, Patrick | Université De Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, Fra |
Keywords: Physical Human-Robot Interaction, Imitation Learning
Abstract: The development of assistive technologies for activities of daily living (ADLs) is urgently needed to cope with the future aging of society. Among these, dressing is known to be the most underutilized of all ADLs. The reasons for the lack of utilization of assistive technology are that the task is complex, as it involves handling individuals with large differences in height and symptoms, and requires manipulation of flexible clothing. We have been focusing on individual adaptation in our research. In this poster, we focus on two individual differences in human movement and posture and report our studies that enabled individual adaptation for each. For adaptation to human posture, we focused on the kyphosis posture observed in the elderly and developed a dressing system using DMP that is robust enough to adapt to both straight and kyphosis posture. For adaptation to human movements, we developed a dressing assistance system with a CPG controller based on the characteristics of human movements revealed in preliminary experiments. This system does not require human modeling but enables dressing based on the Assist-As-Needed principle (help only when needed, not when not needed). These studies have significantly improved the individual adaptability of the robot, and have contributed to the realization of future assistance robots. We also discuss issues that need to be addressed for future social implementation and propose a direction for future development of assistive robots.
|
|
16:30-18:00, Paper WeCL-EX.8 | Add to My Program |
An End-Cloud Integration Intelligent Driving System for Unmanned Bulldozer |
|
Peng, Gang | Huazhong University of Science and Technology |
Gao, Qiang | Huazhong University of Science and Technology |
Zhou, YiCheng | Huazhong University of Science and Technology |
Duan, Hangqi | Huazhong University of Science and Technology |
Liu, Xingyu | Huazhong University of Science and Technology |
Keywords: Mining Robotics, Localization, Mapping
Abstract: The traditional manned bulldozer has low automation, high work intensity, and low efficiency. To increase the efficiency of construction and raise the intelligence of construction machinery, we developed an end-cloud integration unmanned bulldozer system, where “cloud” refers to an unmanned aerial vehicle and a digital construction cloud platform, and “end” refers to an unmanned bulldozer. Our unmanned bulldozer system integrates advanced sensors such as lidar and depth camera, applies state-of-the-art object detection, image enhancement, pose estimation, and mapping algorithms, and combines efficient and accurate planning and control schemes. It has three subsystems—perception, planning, and control—connected in series with a low degree of coupling. We evaluated the performance of the unmanned bulldozer system in outdoor scenes based on the most common construction tasks of unmanned bulldozers. Experiments show that the efficiency of the unmanned bulldozer is close to that of experienced operators in the test scenario.
|
|
16:30-18:00, Paper WeCL-EX.9 | Add to My Program |
Ray Casting and Diffusion: A Fast and Efficient Sampling-Based Approach for Path Planning |
|
Maravgakis, Michael | Foundation for Research and Technology - Hellas (FORTH) |
Argiropoulos, Despina-Ekaterini | (a) Institute of Computer Science Foundation for Research and T |
Piperakis, Stylianos | Agility Robotics Inc, |
Papadakis, Emmanouil | Foundation for Research and Technology - Hellas |
Trahanias, Panos | Foundation for Research and Technology – Hellas (FORTH) |
Keywords: Motion and Path Planning, Task and Motion Planning
Abstract: This poster introduces a novel bi-directional randomized sampling-based path planning algorithm coined Ray Casting and Diffusion (RCD). RCD models both the robot and the target as light point-sources that emit pseudo-rays in random directions. Upon collision with obstacles or with rays produced by the same source (robot or target), a new source point is established and assigned a weight coefficient that is used in subsequent iterations. The primary concept of RCD is to iteratively repeat this process until an intersection between the robot-generated rays and the target-generated rays occurs (path found). RCD incorporates multiple optimizations within each layer, contributing to the proposed approach's speed, reliability, probabilistic completeness, and its ability to consistently generate feasible and efficient paths. Both the implementation and the experimental datasets have been released as an open-source project to support future research endeavors. Extensive evaluations of RCD have been conducted, comparing it against multiple well-established state-of-the-art sampling-based path planning approaches. The results indicate superior efficiency, finding paths faster than existing methods. In some instances, RCD yields slightly longer paths; however, its overall performance showcases significant advantages in terms of speed and efficacy in path finding tasks.
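To ground the "pseudo-ray" terminology, the following is a minimal occupancy-grid ray march that stops at an obstacle or the map border. The full RCD machinery (bi-directional sources, weight coefficients, ray-ray intersection) is not reproduced, and the step size, grid, and origin below are assumptions.

```python
import numpy as np

def cast_ray(grid, origin, direction, step=0.5, max_dist=100.0):
    """March from `origin` along `direction` over a boolean occupancy grid
    (True = obstacle) and return the last free point before a collision
    or the map border. A minimal sketch of casting one pseudo-ray."""
    direction = np.asarray(direction, dtype=float)
    direction /= np.linalg.norm(direction)
    p = np.asarray(origin, dtype=float)
    for _ in range(int(max_dist / step)):
        nxt = p + step * direction
        i, j = int(nxt[0]), int(nxt[1])
        if not (0 <= i < grid.shape[0] and 0 <= j < grid.shape[1]) or grid[i, j]:
            return p
        p = nxt
    return p

grid = np.zeros((20, 20), dtype=bool)
grid[10, :] = True                      # a wall across the map
print(cast_ray(grid, origin=(2.0, 2.0), direction=(1.0, 0.3)))
```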
|
|
16:30-18:00, Paper WeCL-EX.10 | Add to My Program |
Soft Material Variable Stiffness Feet for Humanoid Robots |
|
Frizza, Irene | University of Montpellier/National Institute of Advanced Industr |
Kaminaga, Hiroshi | National Inst. of AIST |
Fraisse, Philippe | LIRMM |
Venture, Gentiane | The University of Tokyo |
Keywords: Mechanism Design, Soft Robot Applications, Humanoid and Bipedal Locomotion
Abstract: Humanoids are designed for real-world applications, and uneven ground is a common feature of many natural environments. Ensuring that humanoids can navigate uneven terrain allows them to be used effectively in a wide range of scenarios. Feet are essential for maintaining dynamic stability and propelling the body during walking. Most humanoid robots are designed with rigid flat feet. However, they cannot adapt to uneven terrains, limiting the robot's mobility and making it less capable of navigating complex environments. Our approach is to design, manufacture, and control variable stiffness feet for humanoid robots to enhance walking on different types of uneven ground. Starting from simulation analysis, we demonstrate that changing the stiffness of humanoid feet in conjunction with the ground roughness during the walk improves the stability on different types of ground with rocks and obstacles. Then, we propose a foot design that incorporates compliant and flexible materials, so that the feet can conform to rough and irregular terrain. It is composed of a pneumatic variable stiffness soft sole. Variable stiffness is obtained by pressurizing a pair of bending-type structures placed in an antagonistic manner. We propose a process to fabricate a robust variable stiffness structure by combining insert molding and lost-core techniques. We develop the air pressure control method and a model that relates pressure and bending stiffness, and we validate them with experiments.
|
|
16:30-18:00, Paper WeCL-EX.11 | Add to My Program |
Design Optimization, Modeling and Simulation of 3D Printed Pneumatic Artificial Muscle for Prosthetic Hand Actuation |
|
Arafa, Mostafa | University of Nottingham |
Goodridge, Ruth | University of Nottingham |
Ashcroft, Ian | University of Nottingham |
Goher, Khaled | University of Nottingham |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Prosthetics and Exoskeletons
Abstract: The compliance of Pneumatic Artificial Muscles (PAMs), along with their high power-to-weight ratio, makes them well suited for upper-limb prosthetic actuation. Furthermore, a single hand prosthesis can utilize different actuators of different sizes and force outputs according to the functionality of each actuator. The Hexagonal Band Optimized Bellow (HexBOB) PAM has an optimizable design that can be tailored to the requirements arising from the actuator's anticipated functionality. This work presents the mathematical modeling of the HexBOB actuator and its design optimization method. Iterating the design optimization procedure yields designs with improved performance that converge towards the design requirements.
|
|
16:30-18:00, Paper WeCL-EX.12 | Add to My Program |
Thermal Gaussian Splatting for Depth Estimation of Transparent Objects |
|
Jang, Hyunsoo | Seoul National University |
Kim, Ayoung | Seoul National University |
Keywords: Perception for Grasping and Manipulation, Computer Vision for Manufacturing, Deep Learning for Visual Perception
Abstract: Several studies have applied Gaussian Splatting to SLAM owing to its training speed and adequate quality. However, it has several limitations, one of which is its sensitivity to transparent objects. Both RGB and depth cameras have difficulty detecting these objects. This research utilizes a thermal camera to resolve this problem. Input thermal images are first normalized and clipped to 8 bits, and then used to optimize the Gaussians. The trained model is then used to estimate the depth from a specific viewpoint. With our proposed method, we achieve better results on depth estimation for scenes with transparent objects.
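The normalize-and-clip preprocessing mentioned in the abstract can be sketched in a few lines of NumPy. The percentile bounds here are an assumption for illustration, not the paper's exact procedure.

```python
import numpy as np

def thermal_to_8bit(raw, lo_pct=1.0, hi_pct=99.0):
    """Normalize a raw (e.g. 14/16-bit) thermal frame into an 8-bit image:
    clip to robust percentile bounds, rescale to [0, 255], and cast.
    The percentile choice is an assumption, not taken from the paper."""
    lo, hi = np.percentile(raw, [lo_pct, hi_pct])
    scaled = (np.clip(raw, lo, hi) - lo) / max(hi - lo, 1e-6) * 255.0
    return scaled.astype(np.uint8)

# Toy usage with a random 14-bit frame standing in for a real thermal image.
raw = np.random.default_rng(0).integers(0, 2 ** 14, size=(480, 640))
frame_8bit = thermal_to_8bit(raw)
print(frame_8bit.dtype, frame_8bit.min(), frame_8bit.max())
```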
|
|
16:30-18:00, Paper WeCL-EX.13 | Add to My Program |
Analysis of the Relationship between Chemical Plume Tracking Behavior and Intake Strategy Using Insect-Robot Hybrid System |
|
Shigaki, Shunsuke | National Institute of Informatics |
Matsushita, Takumi | Osaka University |
Hosoda, Koh | Kyoto University |
Keywords: Biologically-Inspired Robots, Biomimetics, Cyborgs
Abstract: We experimentally verified the relationship between periodic odor intake and the chemical plume tracking (CPT) performance of an adult male silkmoth, which uses female sex pheromone as a cue. The silkmoth can localize a female by walking with strong flapping when it receives sex pheromones with its antennae. Although the silkmoth cannot fly, it uses flapping to obtain propulsive force and to actively intake odor in its direction. The flapping frequency is not always constant and is modulated depending on the situation. However, the relationship between frequency modulation and CPT behavior is still unclear. We employed an insect-robot hybrid (IRH) system to generate periodic odor intake equivalent to the flapping of wings, and measured the relationship between the odor intake and behavioral changes during CPT. The results suggest that it is important not only to increase the odor intake frequency but also to create a large difference in the airflow between odor interception and intake.
|
|
16:30-18:00, Paper WeCL-EX.14 | Add to My Program |
Towards Decentralised Formation of Minimal-Length Networks Using Swarms of Robots |
|
Miyauchi, Genki | The University of Sheffield |
Talamali, Mohamed S. | University of Sheffield |
Millard, Alan | University of York |
Gross, Roderich | Technical University of Darmstadt |
Keywords: Swarm Robotics, Distributed Robot Systems, Networked Robots
Abstract: Having robots form mobile ad hoc networks is a promising approach when deploying robot swarms in environments lacking global communication and navigation infrastructures. We propose a decentralised controller that enables robot swarms to form mobile ad hoc networks between multiple locations of interest. The controller is designed to minimise the length of the established networks, helping to reduce the number of robots required for the network, and the costs associated with communication and navigation along the network. Through physics-based simulations and real-robot experiments, we demonstrate that the controller yields networks that are significantly shorter in length than centrally computed optimal starlike trees, and that compare reasonably well with centrally computed minimal-length networks (i.e. Steiner trees). Moreover, robots not required to maintain the network become available for performing tasks at the locations of interest. The findings could pave the way for energy-efficient deployment of robot swarms in a range of environments from underground to outer space.
|
|
16:30-18:00, Paper WeCL-EX.15 | Add to My Program |
Multi-State Constraint Radar-Inertial Odometry |
|
Kim, Changseung | Ulsan National Institute of Science and Technology |
Bae, Geunsik | Ulsan National Institute of Science and Technology |
Shin, Woojae | Ulsan National Institute of Science and Technology |
Oh, Hyondong | UNIST |
Keywords: Localization, SLAM, Mapping
Abstract: Accurate and robust localization in GNSS-denied environments is crucial for autonomous robotics. While approaches utilizing a camera and a LiDAR are commonly considered viable options, the camera is susceptible to changes in illumination, and the LiDAR can be unreliable in environments with smoke or fog due to its short wavelengths. The 4D Frequency Modulated Continuous Wave (FMCW) radar, on the other hand, remains robust and accurate in these challenging environments, providing Doppler velocity and 3D point clouds essential for precise state estimation. In this paper, we propose a multi-state constraint Radar-Inertial Odometry that enhances the existing algorithm, which only uses Doppler velocity, by incorporating 3D point clouds. The proposed algorithm utilizes the stochastic cloning algorithm to estimate past radar states, employing measurements with constraints for features observed across multiple states. The performance of the proposed algorithm is validated in real experiments on accurate 6D pose estimation by comparing it to a state-of-the-art algorithm.
|
|
16:30-18:00, Paper WeCL-EX.16 | Add to My Program |
Learning Strategies for Erecting Horizontal Objects Via Half-Grasping to Aid Subsequent Tasks |
|
Kim, Jinseok | UST, KITECH |
Choi, Iksu | Sungkyunkwan University, KITECH |
Cho, Taeyeop | Hanyang University, KITECH |
Won, Seungjae | University of Science and Technology |
Yang, Gi-Hun | KITECH |
Pyo, Dongbum | Korea Institute of Industrial Technology |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Dexterous Manipulation
Abstract: In robotic assembly environments, the difficulty of subsequent tasks depends on how and where the target object is grasped, such as in peg-in-hole or bolting/screwing scenarios. Therefore, it is often necessary to adjust the pose of the target object before achieving the desired configuration. The conventional wisdom considers grasping a part distant from the center of mass of an object, resulting in the rotational displacement near the grasp point, as a failure. However, in this study, we redefine such scenarios as "half-grasping" and intentionally leverage them to execute the task of erecting horizontally lying objects. In this research, we introduce the concept of uprighting objects using half-grasping. We implemented this approach in the virtual environment using Isaac Sim and trained manipulation strategies through deep reinforcement learning. Using the obtained policy, object erection task was performed in a virtual environment and a success rate of over 99% was achieved confirming the feasibility of the proposal. For future work, we will focus on improving the policy to minimize displacement during uprighting task and generalize its applicability across objects with varying shapes, weights, weight distributions, and frictional characteristics. Additionally, we will explore the application of this method to real-world robotic systems through Sim2Real transfer techniques.
|
|
16:30-18:00, Paper WeCL-EX.17 | Add to My Program |
The Inverse Kinematics of Inextensible Three-Segment PCC Continuum Robots Have at Least Eight Solutions |
|
Li, Yucheng | University of Dayton |
Myszka, David H. | University of Dayton |
Murray, Andrew | University of Dayton |
Keywords: Kinematics, Formal Methods in Robotics and Automation, Soft Robot Applications
Abstract: This late-breaking result presents numerical solutions for the inverse kinematics (IK) problem in inextensible piecewise constant curvature (IPCC) continuum robots (CR) with three segments. For a specific IK problem, the criterion for the existence of the maximum length of each segment can be identified, noting that a maximum length may not exist. After specifying a segment length, all possible positions of the segment tip for the first segment of the CR are determined on a tip locus for a given IK problem. Selecting the tip location along the locus reduces the remaining two segments of the CR to being defined by a tip location on a circle. For an arbitrary choice along the tip locus and along the tip circle, the second and third segments possess arbitrary lengths. Moving along the initial tip locus, and then along the tip circle generates a two-parameter search for finding all three segments of the same length. Via this technique, an example of an IPCC continuum robot IK problem is shown to have eight unique solutions. Eight is the most found by the authors to date, but it is not known to be the maximum number of real, unique solutions.
|
|
16:30-18:00, Paper WeCL-EX.18 | Add to My Program |
Control of Multirotor with Integrated Horizontal Thrusters |
|
Rosales Martinez, Ricardo | Ritsumeikan University |
Paul, Hannibal | Ritsumeikan University |
Shimonomura, Kazuhiro | Ritsumeikan University |
Keywords: Aerial Systems: Mechanics and Control, Aerial Systems: Applications
Abstract: In both industrial and civil infrastructure, maintaining various structures and systems efficiently is crucial for ensuring their safety and longevity. While commercial UAVs (Unmanned Aerial Vehicles) have become increasingly popular for inspection and maintenance tasks, certain applications pose challenges beyond their capabilities. Our proposal addresses these limitations by enhancing the manipulation and movement capabilities of conventional multirotor UAVs, focusing on applications that do not require the full spectrum of control options offered by fully actuated systems. We evaluate multirotor UAVs equipped with different configurations of horizontal thrusters, expanding the capabilities of traditional under-actuated multirotors. To facilitate this, we developed custom firmware based on version v1.14.0 of the PX4 Autopilot, enabling the multirotors to transition seamlessly between various flight modes.
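For intuition on what horizontal thrusters add to an under-actuated multirotor, the following least-squares allocation sketch maps a desired body-frame horizontal force and yaw torque onto thruster commands (an illustrative mixer with placeholder geometry, not the authors' PX4 firmware):

```python
# Build an allocation matrix A such that A @ u = [Fx, Fy, Mz] for horizontal thrust commands u,
# then solve for u with a pseudoinverse.
import numpy as np

def horizontal_mixer(thruster_positions, thruster_dirs):
    rows_fx = [d[0] for d in thruster_dirs]
    rows_fy = [d[1] for d in thruster_dirs]
    rows_mz = [p[0] * d[1] - p[1] * d[0] for p, d in zip(thruster_positions, thruster_dirs)]
    return np.array([rows_fx, rows_fy, rows_mz])

# Two opposing thrusters mounted fore and aft, pointing along +y and -y (example geometry):
positions = [np.array([0.3, 0.0, 0.0]), np.array([-0.3, 0.0, 0.0])]
directions = [np.array([0.0, 1.0, 0.0]), np.array([0.0, -1.0, 0.0])]
A = horizontal_mixer(positions, directions)

wrench_des = np.array([0.0, 0.0, 0.1])   # pure yaw-torque request [N, N, N*m]
u = np.linalg.pinv(A) @ wrench_des       # per-thruster thrust command
print(u)
```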
|
|
16:30-18:00, Paper WeCL-EX.19 | Add to My Program |
A Fabric Soft Robotic Assistive Glove with Novel Ruffles Actuators |
|
Suulker, Cem | Queen Mary University of London |
Althoefer, Kaspar | Queen Mary University of London |
Keywords: Soft Sensors and Actuators, Soft Robot Applications, Prosthetics and Exoskeletons
Abstract: Hand-wearable robots, specifically exoskeletons, are designed to aid the hand in daily activities, playing a crucial role in post-stroke rehabilitation and in assisting the elderly. Our contribution to this field is a textile robotic glove with integrated actuators. These actuators, powered by pneumatic pressure, guide the user's hand to a desired position. Crafted from textile materials, our soft robotic glove prioritizes safety, lightweight construction, and user comfort. Fabricated with the ruffles technique, the integrated actuators deliver high blocking force and effective bending. Additionally, we present a participant study confirming the effectiveness of our robotic device.
|
|
16:30-18:00, Paper WeCL-EX.20 | Add to My Program |
The Effect of Adaptivity on Trust in a Robot Teacher |
|
Lange, Anna L. | Humboldt-Universität Zu Berlin |
Ackermann, Helene Leonie | Department of Educational Sciences, University of Potsdam |
Hafner, Verena Vanessa | Humboldt-Universität Zu Berlin |
Lazarides, Rebecca | Department of Educational Sciences, University of Potsdam |
Keywords: Social HRI, Physical Human-Robot Interaction, Acceptability and Trust
Abstract: Adaptive behavior is a desirable skill for social robots in the real world. It is particularly relevant in social situations, as interaction partners have different needs and situation-appropriate behavior can change quickly. Teaching interactions are a good example: the teacher needs to be sensitive to the learner's behavior and adjust their response as the learner develops. This work highlights one of the adverse effects of implementing adaptive behaviors in robots: the risk that unpredictability leads to a reduction in trust.
|
|
16:30-18:00, Paper WeCL-EX.21 | Add to My Program |
Design and Development of a Novel Tactile Sensor Based on Photoelastic Effect Integrating Shape Adaptive Auxetic Metastructure for Harvesting Soft Fruits |
|
Gohil, Mahendra Kumar | Indian Institute of Technology Kanpur |
Ansari, Shahid | Indian Institute of Technology Kanpur |
Matsui, Itsuma | Yokohama National University |
Maeda, Yusuke | Yokohama National University |
Bhattacharya, Bishakh | Indian Institute of Technology Kanpur |
Keywords: Soft Sensors and Actuators, Force and Tactile Sensing, Deep Learning in Grasping and Manipulation
Abstract: In the field of precision agriculture, handling soft fruits without causing damage is a challenging task. Tactile sensors are key elements that provide continuous force feedback during interaction with the fruit. Existing vision-based photoelastic sensors are bulky due to onboard cameras and electronics, which makes them difficult to mount on existing grippers, and they do not capture the rich information available from photoelastic fringe-pattern sensing. This work introduces a novel tactile sensor for robotic grasping applications that measures normal force using the principle of photoelasticity. A cylindrical block of photoelastic polyurethane, compressed by sliding indenters, generates a fringe pattern that is visualized through a polarizer film with a light source. The observed fringe-pattern image is transmitted remotely through optical fibres and post-processed with a CNN-based model to obtain the calibrated contact force. The computing and electronics modules are placed remotely, reducing sensor size and design complexity. The miniaturized sensor is fitted with an auxetic structure, ensuring shape conformability to a soft fruit or object. A sensing range of up to 11 N with a resolution of around 1 N is achieved. Because it uses light for data transmission, the sensor is largely unaffected by external disturbance and interference (electrical, magnetic, or thermal). The sensor has a small form factor and can be integrated with robotic grippers.
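A minimal sketch of a CNN force regressor of the kind described (our assumed architecture; the paper's actual network, input resolution, and training details are not given in the abstract):

```python
# Small CNN that regresses a single normal-force value from a fringe-pattern image.
import torch
import torch.nn as nn

class FringeForceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        return self.head(self.features(x))  # predicted contact force [N]

# Forward pass on a dummy 128x128 RGB fringe image.
model = FringeForceNet()
force = model(torch.zeros(1, 3, 128, 128))
print(force.shape)  # torch.Size([1, 1])
```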
|
|
16:30-18:00, Paper WeCL-EX.22 | Add to My Program |
Optoelectronic Shape Sensing for Flexible Robotic Application |
|
Osman, Dalia | Brunel University London |
Noh, Yohan | Brunel University London |
Keywords: Soft Sensors and Actuators, Sensor-based Control, Robotics and Automation in Life Sciences
Abstract: Continuum robots are highly flexible robotic systems that allow a versatile range of motion for applications in the manufacturing, aerospace, and medical industries, as well as in space and rescue operations. Shape sensing in continuum robotics enables stable actuation and control, as estimation of complex curvatures is essential for manoeuvring continuum robotic tools through delicate pathways. Various sensing methods, including FBG technology, IMU networks, and magnetic and stretch sensors, have been used for shape sensing. This paper demonstrates the performance of an optoelectronics-based shape sensing system integrated into a two-segment tendon-actuated robotic manipulator. The sensing principle is proximity-intensity sensing and utilizes a convex reflector to modulate proximity during rotation about two degrees of freedom. A streamlined technique for calibrating the sensors is demonstrated, and validation shows that the system can estimate tip position and orientation accurately. For improved sensing performance, the system uses a simplified circuit design with power-switching features to eliminate signal interference, together with low-friction tubes along the tendon routing paths to reduce friction during large bending motions.
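A per-sensor calibration of this kind can be sketched as a simple polynomial fit from optoelectronic intensity to bending angle (the illustrative values and cubic order are our assumptions, not the paper's calibration procedure):

```python
# Fit an inverse map from sensor reading (V) to bending angle (deg) from a calibration sweep,
# then use it to estimate the angle for a new reading.
import numpy as np

angles_deg = np.array([0, 10, 20, 30, 40, 50])               # commanded bending angles
readings_v = np.array([2.95, 2.61, 2.24, 1.90, 1.58, 1.31])  # illustrative sensor values only

coeffs = np.polyfit(readings_v, angles_deg, deg=3)           # inverse map: volts -> degrees

def reading_to_angle(v):
    return np.polyval(coeffs, v)

print(round(float(reading_to_angle(2.0)), 1))  # interpolated bending angle for a new reading
```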
|
|
16:30-18:00, Paper WeCL-EX.23 | Add to My Program |
A Power Efficient Method to Achieve Continuous Spinning for Small-Sized Quadrotor |
|
Yoon, Seongwon | POSCO Holdings |
Keywords: Aerial Systems: Applications
Abstract: This study presents a method for spinning small-sized quadrotors at high speeds in a power-efficient manner to create a full 3D omnidirectional field of view (FoV). A slightly inclined thrust is proposed to provide an effective yaw moment with minimal power consumption. The corresponding mathematical model of the system is constructed to investigate the efficiency of the proposed mechanical design. The consumed power is expressed in terms of the thrust tilting angle and the quadrotor's spinning rate, and is shown to be theoretically minimized for a prescribed spinning rate at an optimal tilting angle. Experiments were conducted using advanced measurement equipment to determine the model parameters and coefficients related to power consumption. The slightly inclined thrust is shown to reduce power consumption by 14.8% at a spinning rate of 5 Hz (revolutions per second), and ultimately to provide a 3D omnidirectional FoV at up to 5 Hz with a 2D LiDAR and no additional rotating actuators.
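The tilt-angle versus power trade-off can be illustrated with a toy model (all coefficients and the T^(3/2) rotor power law below are our assumptions, not the paper's identified model):

```python
# For a prescribed spin rate, sweep the thrust tilt angle and pick the one that minimises
# total consumed power under a simplified hover-plus-yaw-drag balance.
import numpy as np

m, g = 0.5, 9.81      # vehicle mass [kg], gravity [m/s^2]   (assumed)
arm = 0.08            # moment arm of the tilted thrust [m]  (assumed)
c_drag = 2e-4         # yaw drag coefficient [N*m*s^2]       (assumed)
k_power = 1.0         # rotor power constant                 (assumed)

def total_power(tilt_rad, spin_hz):
    omega = 2 * np.pi * spin_hz
    drag_moment = c_drag * omega**2                 # aerodynamic yaw drag to overcome
    T_vert = m * g / np.cos(tilt_rad)               # thrust needed for weight support
    T_yaw = drag_moment / (arm * np.sin(tilt_rad))  # thrust needed to sustain the spin
    T = max(T_vert, T_yaw)
    return k_power * T**1.5                         # toy rotor power law

tilts = np.radians(np.linspace(0.5, 15.0, 300))
powers = [total_power(t, spin_hz=5.0) for t in tilts]
best = tilts[int(np.argmin(powers))]
print(f"power-minimising tilt at 5 Hz: {np.degrees(best):.1f} deg")
```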
|
|
16:30-18:00, Paper WeCL-EX.25 | Add to My Program |
Improving Safety in Human-Robot Collaboration Via Mixed Reality-Augmented Deep Reinforcement Learning |
|
Li, Chengxi | The Hong Kong Polytechnic University |
Yin, Yue | The Hong Kong Polytechnic University |
Zhou, Peng | The University of Hong Kong |
Manyar, Omey Mohan | University of Southern California |
Zheng, Pai | The Hong Kong Polytechnic University |
Gupta, Satyandra K. | University of Southern California |
Keywords: Human-Centered Automation, Human-Robot Collaboration, Safety in HRI
Abstract: In the context of Industry 5.0, the transition towards a human-centric manufacturing paradigm underscores the importance of interactive collaboration between humans and manufacturing equipment. Ensuring safety in Human-Robot Collaboration (HRC) becomes paramount, with traditional rule-based or physical-isolation-based approaches exhibiting limited flexibility and synergy. Deep Reinforcement Learning (DRL) holds promise for safe motion planning in unstructured HRC environments; however, challenges such as inadequate state representation, complex scenes, and safety concerns impede its deployment. To address these challenges, we propose a Mixed Reality (MR)-augmented safe HRC framework integrating DRL. The framework incorporates both passive and proactive human-protective measures through deep reinforcement learning. Experimental validation in practical settings demonstrates the effectiveness of the proposed approach.
|
|
16:30-18:00, Paper WeCL-EX.26 | Add to My Program |
Initial Findings of Using DeepLabCut to Track Robotic Quadrupeds |
|
Davies, Martin | University of Sunderland |
Murray, John Christopher | University of Sunderland |
Manzoor, Umar | University of Sunderland |
Keywords: Legged Robots
Abstract: Markerless motion tracking has been used for many years to track the movements of humans and animals. The use of these tools to track robots is still limited, but it has the potential to allow robots to copy each other from vision alone. DeepLabCut is a markerless pose-estimation toolbox primarily designed for tracking animals (including humans). Many forms of motion tracking rely on bulky equipment, which limits the environments in which motion tracking can be performed as well as its potential deployment on a robotic platform. Markerless pose-estimation toolboxes such as DeepLabCut require only a single camera, giving them potential for tracking animals and robots in a wider range of locations. This poster presents the method and results of initial experiments conducted with DeepLabCut, testing its ability to track robotic quadrupeds.
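For readers unfamiliar with the toolbox, a typical DeepLabCut workflow looks as follows (standard DeepLabCut API calls; the project name, video paths, and keypoints are placeholders, and the poster's exact setup is not specified in the abstract):

```python
import deeplabcut

# Create a project from one or more videos of the quadruped; returns the path to config.yaml.
config_path = deeplabcut.create_new_project(
    "quadruped-tracking", "experimenter", ["videos/quadruped_walk.mp4"], copy_videos=True
)

# Edit config.yaml to name the tracked keypoints (e.g. hip, knee, foot of each leg), then:
deeplabcut.extract_frames(config_path)           # select frames to annotate
deeplabcut.label_frames(config_path)             # opens the annotation GUI
deeplabcut.create_training_dataset(config_path)
deeplabcut.train_network(config_path)
deeplabcut.evaluate_network(config_path)
deeplabcut.analyze_videos(config_path, ["videos/quadruped_walk.mp4"])
deeplabcut.create_labeled_video(config_path, ["videos/quadruped_walk.mp4"])
```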
|
|
WeE-EX Expo Session, Exhibition Hall |
Add to My Program |
ICRA EXPO Day 2 |
|
|
Chair: Ravankar, Ankit A. | Tohoku University |
Co-Chair: Salazar Luces, Jose Victorio | Tohoku University |
|
13:30-17:00, Paper WeE-EX.1 | Add to My Program |
Closed-Loop Aerial Infrastructure Inspection and Repair |
|
Orr, Lachlan | Imperial College London |
Kaya, Yusuf Furkan | Imperial College London |
Kovac, Mirko | Imperial College London |
|
13:30-17:00, Paper WeE-EX.2 | Add to My Program |
Tilting Frame Multirotor UAV with Bidirectional Thruster System |
|
Paul, Hannibal | Ritsumeikan University |
Rosales Martinez, Ricardo | Ritsumeikan University |
Shimonomura, Kazuhiro | Ritsumeikan University |
|
13:30-17:00, Paper WeE-EX.3 | Add to My Program |
Demonstration of Flapping Flying Robot |
|
Mikawa, Yu | University of Tsukuba |
Afakh, Muhammad Labiyb | Tokyo Metropolitan University |
Sato, Hidaka | Tokyo Metropolitan University |
Mochiyama, Hiromi | University of Tsukuba |
Takesue, Naoyuki | Tokyo Metropolitan University |
|
13:30-17:00, Paper WeE-EX.4 | Add to My Program |
Verti-Wheelers: Wheeled Mobility on Vertically Challenging Terrain |
|
Datar, Aniket | George Mason University |
Pan, Chenhui | George Mason University |
Xiao, Xuesu | George Mason University |
|
13:30-17:00, Paper WeE-EX.5 | Add to My Program |
EEWOC: Extended-Reach Enhanced Wheeled Orb for Climbing |
|
Quan, Justin | UCLA |
Hong, Dennis | UCLA |
Zhu, Mingzhang | University of California, Los Angeles |
|
13:30-17:00, Paper WeE-EX.6 | Add to My Program |
Material Handling System for Transporting Large-Size Components Using Multiple Collaborative Autonomous Mobile Robots |
|
Qi, Lipeng | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Zhang, Meng | Xi'an Jiaotong University |
Hu, Jianchen | Xi'an Jiaotong University |
|
13:30-17:00, Paper WeE-EX.7 | Add to My Program |
ICRA 2024 Demo: Dexterous In-Hand Manipulation |
|
Qi, Haozhi | UC Berkeley |
Yi, Brent | University of California, Berkeley |
Kumar, Ashish | UC Berkeley |
Suresh, Sudharshan | Carnegie Mellon University |
Lambeta, Mike Maroje | Facebook |
Ma, Yi | University of Illinois at Urbana-Champaign |
Calandra, Roberto | TU Dresden |
Malik, Jitendra | UC Berkeley |
|
13:30-17:00, Paper WeE-EX.8 | Add to My Program |
Low-Cost and Accessible Autonomous Driving Platforms Based on Autoware for Research and Education |
|
Carballo, Alexander | Gifu University |
Mangharam, Rahul | University of Pennsylvania |
Shih, Chi-Sheng | National Taiwan University |
Kim, Kanghee | Soongsil University |
Chiba, Rumika | RumiCar |
Fredriksson, Lars-Berno | CanEduDev |
Thompson, Simon | Tier IV |
Wang, Po-Jen | University of Pennsylvania |
|
13:30-17:00, Paper WeE-EX.9 | Add to My Program |
WallBo the Handwashing Robot Buddy |
|
Deshmukh, Amol | University of Glasgow |
Cross, Emily S | University of Glasgow |
|
13:30-17:00, Paper WeE-EX.10 | Add to My Program |
Demonstration of Dynamic Loco-Manipulation on HECTOR: Humanoid for Enhanced ConTrol and Open-Source Research |
|
Li, Junheng | University of Southern California |
Ma, Junchao | University of Southern California |
Chen, Yiyu | University of Southern California |
Nguyen, Quan | University of Southern California |
|
13:30-17:00, Paper WeE-EX.11 | Add to My Program |
Introducing Mini Cheetah Pro |
|
Roy, Ronak | Massachusetts Institute of Technology |
Mehrotra, Aditya | Massachusetts Institute of Technology |
Kim, Sangbae | Massachusetts Institute of Technology |
|
13:30-17:00, Paper WeE-EX.12 | Add to My Program |
StaccaToe: A Single-Leg Robot That Mimics the Human Leg and Toe |
|
Perera, Kankanige Nisal Minula | University of Massachusetts Amherst |
Yu, Shangqun | University of Massachusetts Amherst |
Marew, Daniel | University of Massachusetts Amherst |
Tang, Mack | University of Maryland College Park |
Suzuki, Ken | University of Massachusetts Amherst |
McCormack, Aidan | University of Massachusetts Amherst |
Zhu, Shifan | University of Massachusetts Amherst |
Kim, Yong-Jae | Korea University of Technology and Education |
Kim, Donghyun | University of Massachusetts Amherst |
|
13:30-17:00, Paper WeE-EX.13 | Add to My Program |
FiRe Gripper: Fire Resistant Variable Stiffness Gripper Mechanism |
|
Tadakuma, Kenjiro | Osaka University |
Watanabe, Masahiro | Osaka University |
Tadokoro, Satoshi | Tohoku University |
|
13:30-17:00, Paper WeE-EX.14 | Add to My Program |
Cut-Resistant Variable Stiffness Soft Gripper Mechanism |
|
Tadakuma, Kenjiro | Osaka University |
Watanabe, Masahiro | Osaka University |
Tadokoro, Satoshi | Tohoku University |
|
13:30-17:00, Paper WeE-EX.15 | Add to My Program |
Demonstration of Upper-Extremity Exoskeleton Driven by Fusion Hybrid Linear Actuator |
|
Noda, Tomoyuki | ATR Computational Neuroscience Laboratories |
Shimoyama, Takuma | Graduate School of Informatics and Engineering, The University of Electro-Communications |
Teramae, Tatsuya | ATR Computational Neuroscience Laboratories |
Nakata, Yoshihiro | The University of Electro-Communications |
|
13:30-17:00, Paper WeE-EX.16 | Add to My Program |
A Stand-Assist Care Robot for Enhancing Senior Mobility |
|
Manríquez-Cisterna, Ricardo | Tohoku University |
Ravankar, Ankit A. | Tohoku University |
Salazar Luces, Jose Victorio | Tohoku University |
Hatsukari, Takuro | Paramount Bed Co., Ltd. |
Hirata, Yasuhisa | Tohoku University |
|
13:30-17:00, Paper WeE-EX.17 | Add to My Program |
OWLCR: An Omnidirectional Wheel-Legged Cane Robot |
|
Zhao, Yijun | Southern University of Science and Technology |
Liu, Haowen | Southern University of Science and Technology |
Gao, Zhiyi | Southern University of Science and Technology |
Zhang, Mingming | Southern University of Science and Technology |
|
13:30-17:00, Paper WeE-EX.18 | Add to My Program |
MOFU: Artificial Creature Capable of Body Expansion and Contraction |
|
Mogi, Taisei | The University of Electro-Communications |
Saito, Mari | Sony Corporation |
Nakata, Yoshihiro | The University of Electro-Communications |
|
13:30-17:00, Paper WeE-EX.19 | Add to My Program |
Cyber-Enhanced Canine - Robotic Technology to Support and Enhance the Abilities of Rescue & Working Dogs |
|
Ohno, Kazunori | Tohoku University |
|
13:30-17:00, Paper WeE-EX.20 | Add to My Program |
AMR Magnetic Field Sensor Array-Based Nondestructive Testing (NDT) Platform for Metalworking Cutting Fluids |
|
Lin, Ming-Yi | Yuan Ze University |
Lin, Yu-Cheng | Yuan Ze University |
|
13:30-17:00, Paper WeE-EX.21 | Add to My Program |
The ProxySkin Sensor: An Interactive Demo of a Multi-Modal Large-Area Network for Robotic Applications |
|
Grella, Francesco | University of Genova |
Giovinazzo, Francesco | University of Genoa |
Staiano, Marco | University of Genova |
Albini, Alessandro | University of Oxford |
Cannata, Giorgio | University of Genova |
Maiolino, Perla | University of Oxford |
|
13:30-17:00, Paper WeE-EX.22 | Add to My Program |
NARS-Transbot Demonstration at Expo |
|
Hammer, Patrick | KTH Royal Institute of Technology |
Isaev, Peter | Temple University |
|
13:30-17:00, Paper WeE-EX.24 | Add to My Program |
Dielectric Elastomer Bending Actuator for Soft Aquatic Glider |
|
Zhang, Chenyu | Tsinghua University |
|
13:30-17:00, Paper WeE-EX.25 | Add to My Program |
Automotive Workloads Based on Autoware's Open AD Kit and PIXKit 3.0 Autonomous Developer Chassis |
|
Carballo, Alexander | Gifu University |
Walmroth, David | PIX Moving Inc. |
Wong, David | Nagoya University |
Kütük, Samet | LeoDrive |
|
13:30-17:00, Paper WeE-EX.26 | Add to My Program |
Towards Dr. Octopus-Like Robot Arms |
|
Mavinkurve, Ujjal | Kyushu University |
Kanada, Ayato | Kyushu University |
Tafrishi, Seyed Amir | Cardiff University |
Honda, Koki | The University of Tokyo |
Nakashima, Yasutaka | Kyushu University |
Yamamoto, Motoji | Kyushu University |
|
13:30-17:00, Paper WeE-EX.27 | Add to My Program |
Tendon-Driven Exosuits for Upper and Lower Limb Assistance |
|
Missiroli, Francesco | Heidelberg University |
Tricomi, Enrica | Heidelberg University |
Masia, Lorenzo | Heidelberg University |
| |