Last updated on May 27, 2024. This conference program is tentative and subject to change.
Technical Program for Wednesday May 15, 2024
WeAA1-CC Award Session, CC-Main Hall
Robot Manipulation

Chair: Harada, Kensuke | Osaka University
Co-Chair: Dogar, Mehmet R | University of Leeds

10:30-12:00, Paper WeAA1-CC.1
Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Levine, Sergey | UC Berkeley
Finn, Chelsea | Stanford University
Goldberg, Ken | UC Berkeley
Chen, Lawrence Yunliang | UC Berkeley
Sukhatme, Gaurav | University of Southern California
Dass, Shivin | UT Austin
Pinto, Lerrel | New York University
Zhu, Yuke | The University of Texas at Austin
Zhu, Yifeng | The University of Texas at Austin
Song, Shuran | Columbia University
Mees, Oier | University of California, Berkeley
Pathak, Deepak | Carnegie Mellon University
Fang, Hao-Shu | Shanghai Jiao Tong University
Christensen, Henrik Iskov | UC San Diego
Ding, Mingyu | UC Berkeley
Lee, Youngwoon | University of California, Berkeley
Sadigh, Dorsa | Stanford University
Radosavovic, Ilija | UC Berkeley
Bohg, Jeannette | Stanford University
Wang, Xiaolong | UC San Diego
Li, Xuanlin | UC San Diego
Rana, Krishan | Queensland University of Technology
Kawaharazuka, Kento | The University of Tokyo
Matsushima, Tatsuya | The University of Tokyo
Oh, Jihoon | The University of Tokyo
Osa, Takayuki | University of Tokyo
Kroemer, Oliver | Carnegie Mellon University
Kim, Beomjoon | Korea Advanced Institute of Science and Technology
Johns, Edward | Imperial College London
Stulp, Freek | DLR - Deutsches Zentrum Für Luft Und Raumfahrt E.V
Schneider, Jan | Max Planck Institute for Intelligent Systems
Wu, Jiajun | Stanford University
Li, Yunzhu | University of Illinois Urbana-Champaign
Ben Amor, Heni | Arizona State University
Ott, Lionel | ETH Zurich
Martín-Martín, Roberto | University of Texas at Austin
Hausman, Karol | Google Brain
Vuong, Quan | UC San Diego
Sanketi, Pannag | Google
Heess, Nicolas | Google Deepmind
Vanhoucke, Vincent | Google
Pertsch, Karl | UC Berkeley & Stanford University
Schaal, Stefan | Google X
Chi, Cheng | Columbia University
Pan, Chuer | Stanford University
Bewley, Alex | Google
Keywords: Data Sets for Robot Learning, Imitation Learning, Deep Learning Methods
Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to computer vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" cross-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective cross-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.


10:30-12:00, Paper WeAA1-CC.2
Towards Generalizable Zero-Shot Manipulation Via Translating Human Interaction Plans

Bharadhwaj, Homanga | Carnegie Mellon University
Gupta, Abhinav | Carnegie Mellon University
Kumar, Vikash | Meta AI
Tulsiani, Shubham | Carnegie Mellon University
Keywords: Machine Learning for Robot Control, Learning from Demonstration, Big Data in Robotics and Automation
Abstract: We pursue the goal of developing robots that can interact zero-shot with generic unseen objects via a diverse repertoire of manipulation skills, and show how passive human videos can serve as a rich source of data for learning such generalist robots. Unlike typical robot learning approaches, which directly learn how a robot should act from interaction data, we adopt a factorized approach that can leverage large-scale human videos to learn how a human would accomplish a desired task (a human 'plan'), followed by 'translating' this plan to the robot's embodiment. Specifically, we learn a human 'plan predictor' that, given a current image of a scene and a goal image, predicts the future hand and object configurations. We combine this with a 'translation' module that learns a plan-conditioned robot manipulation policy and allows following human plans for generic manipulation tasks in a zero-shot manner with no deployment-time training. Importantly, while the plan predictor can leverage large-scale human videos for learning, the translation module requires only a small amount of in-domain data and can generalize to tasks not seen during training. We show that our learned system can perform over 16 manipulation skills that generalize to 40 objects, encompassing 100 real-world tasks for table-top manipulation and diverse in-the-wild manipulation. https://homangab.github.io/hopman/


10:30-12:00, Paper WeAA1-CC.3
Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation

Mejia, Jared | Carnegie Mellon University
Dean, Victoria | Carnegie Mellon University
Hellebrekers, Tess | Meta AI Research
Gupta, Abhinav | Carnegie Mellon University
Keywords: Representation Learning, Sensorimotor Learning, Robot Audition
Abstract: Although pre-training on a large amount of data is beneficial for robot learning, current paradigms only perform large-scale pretraining for visual representations, whereas representations for other modalities are trained from scratch. In contrast to the abundance of visual data, it is unclear what relevant internet-scale data may be used for pretraining other modalities such as tactile sensing. Such pretraining becomes increasingly crucial in the low-data regimes common in robotics applications. In this paper, we address this gap by using contact microphones as an alternative tactile sensor. Our key insight is that contact microphones capture inherently audio-based information, allowing us to leverage large-scale audio-visual pretraining to obtain representations that boost the performance of robotic manipulation. To the best of our knowledge, our method is the first approach leveraging large-scale multisensory pre-training for robotic manipulation.


10:30-12:00, Paper WeAA1-CC.4
SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention

Leal, Isabel | Google Deepmind
Choromanski, Krzysztof | Google DeepMind Robotics
Jain, Deepali | Robotics at Google
Dubey, Avinava | Google
Varley, Jacob | Google
Ryoo, Michael S. | Google, Stony Brook University
Lu, Yao | Google
Liu, Frederick | Google
Sindhwani, Vikas | Google Brain, NYC
Sarlos, Tamas | Google Research
Oslund, Kenneth | Google
Hausman, Karol | Google Brain
Vuong, Quan | UC San Diego
Rao, Kanishka | Google
Keywords: Deep Learning Methods, Deep Learning in Grasping and Manipulation
Abstract: We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment. SARA-RT relies on a new fine-tuning method we propose, called up-training. It converts pre-trained or already fine-tuned Transformer-based robotic policies of quadratic time complexity (including massive billion-parameter vision-language-action models, or VLAs) into their efficient linear-attention counterparts while maintaining high quality. We demonstrate the effectiveness of SARA-RT by speeding up: (a) the class of recently introduced RT-2 models, the first VLA robotic policies pre-trained on internet-scale data, as well as (b) Point Cloud Transformer (PCT) robotic policies operating on large point clouds. We complement our results with a rigorous mathematical analysis providing deeper insight into the phenomenon of SARA.
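The quadratic-to-linear attention conversion at the heart of this abstract can be illustrated with a small sketch. The snippet below shows generic kernelized linear attention, not SARA-RT's learned up-training procedure; the feature map `phi` (a fixed ELU+1) is a stand-in assumption. The point is why a positive feature map lets attention run in O(L) instead of O(L^2): the sum over keys is accumulated once and reused for every query.

```python
import math

def phi(x):
    """Positive feature map (ELU + 1); SARA learns its map, this is a stand-in."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def attention_quadratic(Qs, Ks, Vs):
    """Kernel attention computed with the explicit L x L similarity matrix."""
    out = []
    for q in Qs:
        fq = phi(q)
        sims = [sum(a * b for a, b in zip(fq, phi(k))) for k in Ks]
        z = sum(sims)
        out.append([sum(s * v[d] for s, v in zip(sims, Vs)) / z
                    for d in range(len(Vs[0]))])
    return out

def attention_linear(Qs, Ks, Vs):
    """Same result in O(L): accumulate S = sum phi(k) v^T and z = sum phi(k) once."""
    dk, dv = len(Ks[0]), len(Vs[0])
    S = [[0.0] * dv for _ in range(dk)]
    z = [0.0] * dk
    for k, v in zip(Ks, Vs):
        fk = phi(k)
        for i in range(dk):
            z[i] += fk[i]
            for d in range(dv):
                S[i][d] += fk[i] * v[d]
    out = []
    for q in Qs:
        fq = phi(q)
        denom = sum(a * b for a, b in zip(fq, z))
        out.append([sum(fq[i] * S[i][d] for i in range(dk)) / denom
                    for d in range(dv)])
    return out
```

Both functions return identical outputs; only the second avoids materializing the attention matrix, which is what makes linear-attention policies cheaper to deploy on-robot.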


10:30-12:00, Paper WeAA1-CC.5
DenseTact-Mini: An Optical Tactile Sensor for Grasping Multi-Scale Objects from Flat Surfaces

Do, Won Kyung | Stanford University
Dhawan, Ankush | Stanford University
Kitzmann, Mathilda | Stanford University
Kennedy, Monroe | Stanford University
Keywords: Grasping, Force and Tactile Sensing, Grippers and Other End-Effectors
Abstract: Dexterous manipulation, especially of small daily objects, continues to pose complex challenges in robotics. This paper introduces the DenseTact-Mini, an optical tactile sensor with a soft, rounded, smooth gel surface and compact design equipped with a synthetic fingernail. We propose three distinct grasping strategies: tap grasping using adhesion forces such as electrostatic and van der Waals, fingernail grasping leveraging rolling/sliding contact between the object and fingernail, and fingertip grasping with two soft fingertips. Through comprehensive evaluations, the DenseTact-Mini demonstrates a lifting success rate exceeding 90.2% when grasping various objects, including items such as 1mm basil seeds, thin paperclips, and items larger than 15mm such as bearings. This work demonstrates the potential of soft optical tactile sensors for dexterous manipulation and grasping.


10:30-12:00, Paper WeAA1-CC.6
Constrained Bimanual Planning with Analytic Inverse Kinematics

Cohn, Thomas | Massachusetts Institute of Technology
Shaw, Seiji | Massachusetts Institute of Technology
Simchowitz, Max | MIT
Tedrake, Russ | Massachusetts Institute of Technology
Keywords: Bimanual Manipulation, Constrained Motion Planning, Kinematics
Abstract: In order for a bimanual robot to manipulate an object that is held by both hands, it must construct motion plans such that the transformation between its end effectors remains fixed. This amounts to complicated nonlinear equality constraints in the configuration space, which are difficult for trajectory optimizers. In addition, the set of feasible configurations becomes a measure zero set, which presents a challenge to sampling-based motion planners. We leverage an analytic solution to the inverse kinematics problem to parametrize the configuration space, resulting in a lower-dimensional representation where the set of valid configurations has positive measure. We describe how to use this parametrization with existing motion planning algorithms, including sampling-based approaches, trajectory optimizers, and techniques that plan through convex inner-approximations of collision-free space.
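The parametrization idea can be made concrete with a planar toy example (all numbers and the 2R arm model are illustrative assumptions; the paper works with full spatial arms): pick one arm's joints freely, then recover the other arm's joints with analytic 2R inverse kinematics so that the fixed end-effector offset holds. Every point of the lower-dimensional parametrization then maps to a valid constrained configuration, which is exactly why the feasible set regains positive measure.

```python
import math

L1, L2 = 1.0, 1.0        # link lengths (both arms identical; an assumption)
BASE_B = (2.5, 0.0)      # base of arm B relative to arm A's base (assumed)
OFFSET = (0.4, 0.0)      # fixed transform between the two end effectors (assumed)

def fk(q1, q2):
    """Planar 2R forward kinematics."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y

def ik(x, y, elbow=+1):
    """Analytic planar 2R inverse kinematics; None if the target is unreachable."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if abs(c2) > 1.0:
        return None
    q2 = elbow * math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
    return q1, q2

def lift(qa1, qa2, elbow=+1):
    """Map a point of the lower-dimensional parametrization (arm A's joints)
    to a full bimanual configuration satisfying the fixed end-effector offset."""
    ax, ay = fk(qa1, qa2)
    tx = ax + OFFSET[0] - BASE_B[0]       # arm B's end-effector target
    ty = ay + OFFSET[1] - BASE_B[1]
    sol = ik(tx, ty, elbow)
    if sol is None:
        return None
    return (qa1, qa2) + sol
```

Sampling-based planners can then sample `(qa1, qa2)` directly and `lift` each sample, instead of rejecting almost every sample drawn in the full joint space.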


WeAA2-CC Award Session, CC-301
Robot Vision

Chair: Chaumette, Francois | Inria Center at University of Rennes
Co-Chair: Hashimoto, Koichi | Tohoku University

10:30-12:00, Paper WeAA2-CC.1
Deep Evidential Uncertainty Estimation for Semantic Segmentation under Out-Of-Distribution Obstacles

Ancha, Siddharth | Massachusetts Institute of Technology
Osteen, Philip | U.S. Army Research Laboratory
Roy, Nicholas | Massachusetts Institute of Technology
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, Visual Learning
Abstract: In order to navigate safely and reliably in novel environments, robots must estimate perceptual uncertainty when confronted with out-of-distribution (OOD) obstacles not seen in training data. We present a method to accurately estimate pixel-wise uncertainty in semantic segmentation without requiring real or synthetic OOD examples at training time. From a shared per-pixel latent feature representation, a classification network predicts a categorical distribution over semantic labels, while a normalizing flow estimates the probability density of features under the training distribution. The label distribution and density estimates are combined in a Dirichlet-based evidential uncertainty framework that efficiently computes epistemic and aleatoric uncertainty in a single neural network forward pass. Our method is enabled by three key contributions. First, we simplify the problem of learning a transformation to the training data density by starting from a fitted Gaussian mixture model instead of the conventional standard normal distribution. Second, we learn a richer and more expressive latent pixel representation to aid OOD detection by training a decoder to reconstruct input image patches. Third, we perform theoretical analysis of the loss function used in the evidential uncertainty framework and propose a principled objective that more accurately balances training the classification and density estimation networks. We demonstrate the accuracy of our uncertainty estimation approach under long-tail OOD obstacle classes for semantic segmentation in both off-road and urban driving environments.
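The combination step, in which the label distribution and the density estimate jointly produce Dirichlet concentrations in a single pass, can be sketched with a common generic evidential recipe (the `scale` parameter and the exact alpha construction are assumptions for illustration; the paper's formulation and its principled loss are more involved):

```python
import math

def dirichlet_uncertainty(class_probs, density, scale=10.0):
    """Combine per-pixel class probabilities with a feature-density estimate
    into Dirichlet concentration parameters.  Low density (OOD pixel) means
    low total evidence, which yields high epistemic uncertainty."""
    alphas = [1.0 + scale * density * p for p in class_probs]
    total = sum(alphas)
    mean = [a / total for a in alphas]
    epistemic = len(alphas) / total                       # vacuity: high when evidence is low
    aleatoric = -sum(m * math.log(m) for m in mean)       # entropy of the expected categorical
    return epistemic, aleatoric
```

Both uncertainties come out of one forward pass, matching the single-pass property the abstract emphasizes: no sampling or ensembling is required.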


10:30-12:00, Paper WeAA2-CC.2
NGEL-SLAM: Neural Implicit Representation-Based Global Consistent Low-Latency SLAM System

Mao, Yunxuan | Zhejiang University
Yu, Xuan | Zhejiang University
Zhang, Zhuqing | Zhejiang University
Wang, Kai | HuaWei
Wang, Yue | Zhejiang University
Xiong, Rong | Zhejiang University
Liao, Yiyi | Zhejiang University
Keywords: SLAM
Abstract: Neural implicit representations have emerged as a promising solution to the challenges of Simultaneous Localization and Mapping (SLAM) in indoor scenes. This paper presents NGEL-SLAM, a low-latency, globally consistent SLAM system that utilizes a neural implicit scene representation. To ensure global consistency, our system incorporates loop closure in the tracking module and maintains a globally consistent map by representing the scene with multiple neural implicit fields and performing a quick adjustment upon loop closure. The fast convergence and rapid response to loop closure make our system truly low-latency while achieving global consistency. The neural implicit representation enables the rendering of high-fidelity RGB-D images and the extraction of explicit, dense, and interactive surfaces. Experiments were conducted on both synthetic and real-world datasets to evaluate the effectiveness of the proposed approach. The results demonstrate the tracking and mapping accuracy and the low-latency performance of our system.


10:30-12:00, Paper WeAA2-CC.3
SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking

Lin, Yu | Northeastern University
Li, Zhiheng | Northeastern University
Cui, Yubo | Northeastern University
Fang, Zheng | Northeastern University
Keywords: Visual Tracking, Deep Learning for Visual Perception, Computer Vision for Transportation
Abstract: 3D single object tracking (SOT) is an important and challenging task for autonomous driving and mobile robotics. Most existing methods perform tracking between two consecutive frames while ignoring the motion patterns of the target over a series of frames, which causes performance degradation in scenes with sparse points. To break through this limitation, we introduce a sequence-to-sequence tracking paradigm and a tracker named SeqTrack3D that captures target motion across continuous frames. Unlike previous methods, which primarily adopted one of three strategies (matching two consecutive point clouds, predicting relative motion, or utilizing sequential point clouds to address feature degradation), our SeqTrack3D combines both historical point clouds and bounding-box sequences. This novel approach ensures robust tracking by leveraging location priors from historical boxes, even in scenes with sparse points. Extensive experiments conducted on large-scale datasets show that SeqTrack3D achieves new state-of-the-art performance, improving by 6.00% on NuScenes and 14.13% on the Waymo dataset.


10:30-12:00, Paper WeAA2-CC.4
Ultrafast Square-Root Filter-Based VINS

Peng, Yuxiang | University of Delaware
Chen, Chuchu | University of Delaware
Huang, Guoquan | University of Delaware
Keywords: Localization, Visual-Inertial SLAM, SLAM
Abstract: In this paper, we strongly advocate square-root covariance (instead of information) filtering for visual-inertial navigation, in particular on resource-constrained edge devices, because of its superior efficiency and numerical stability. Although Visual-Inertial Navigation Systems (VINS) have made tremendous progress in recent years, they still face resource constraints and numerical instability on embedded systems with limited word lengths. To overcome these challenges, we develop an ultrafast and numerically stable square-root filter (SRF)-based VINS algorithm (i.e., SR-VINS). The numerical stability of the proposed SR-VINS is inherited from the adoption of the square-root covariance, while its exceptional efficiency is largely enabled by a novel SRF update method based on our new permuted-QR (P-QR) decomposition, which fully utilizes and properly maintains the upper-triangular structure of the square-root covariance matrix. Furthermore, we choose a special ordering of the state variables that is amenable to (P-)QR operations in the SRF propagation and update and prevents unnecessary computation. The proposed SR-VINS is validated extensively through numerical studies, demonstrating that when state-of-the-art (SOTA) filters have numerical difficulties, our SR-VINS retains superior numerical stability and, remarkably, achieves efficient and robust performance with 32-bit single-precision floats at a speed nearly twice as fast as the SOTA methods. We also conduct comprehensive real-world experiments to validate the efficiency, accuracy, and robustness of the proposed SR-VINS.
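The kind of square-root step this abstract builds on can be sketched as follows. This is a textbook SRF propagation via QR, not the paper's P-QR variant, and plain Python lists stand in for optimized linear algebra: with P = U^T U, the factor of F P F^T + Q is obtained as the R factor of the QR decomposition of the stacked matrix [U F^T; Q^{1/2}], so the covariance itself is never formed, which is the source of the numerical robustness.

```python
def qr_R(A):
    """R factor of a thin QR via modified Gram-Schmidt (adequate for the
    small, well-conditioned blocks used in this illustration)."""
    m, n = len(A), len(A[0])
    V = [row[:] for row in A]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = sum(V[i][j] ** 2 for i in range(m)) ** 0.5
        q = [V[i][j] / R[j][j] for i in range(m)]
        for k in range(j + 1, n):
            R[j][k] = sum(q[i] * V[i][k] for i in range(m))
            for i in range(m):
                V[i][k] -= R[j][k] * q[i]
    return R

def propagate_sqrt(U, F, Qsqrt):
    """Square-root covariance propagation: with P = U^T U, the factor of
    F P F^T + Q is the R factor of QR([U F^T; Qsqrt]).  No covariance matrix
    is ever squared up, so the effective condition number is halved."""
    n = len(U)
    UFt = [[sum(U[i][k] * F[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return qr_R(UFt + Qsqrt)  # stack rows, then triangularize
```

The paper's contribution is an update (not just propagation) that preserves triangularity cheaply; the sketch above only shows why QR-based factor updates avoid the loss of precision that plain covariance recursions suffer in single precision.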


10:30-12:00, Paper WeAA2-CC.5
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

Zhang, Zichen | Allen Institute for AI
Li, Yunshuang | University of Pennsylvania
Bastani, Osbert | University of Pennsylvania
Gupta, Abhishek | University of Washington
Jayaraman, Dinesh | University of Pennsylvania
Ma, Yecheng Jason | University of Pennsylvania
Weihs, Luca | Allen Institute for AI
Keywords: Learning from Demonstration, Imitation Learning, Reinforcement Learning
Abstract: Real-world robotic tasks stretch over extended horizons and encompass multiple stages. Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks. Prior task decomposition methods require task-specific knowledge, are computationally intensive, and cannot readily be applied to new tasks. To address these shortcomings, we propose Universal Visual Decomposer (UVD), an off-the-shelf task decomposition method for visual long-horizon manipulation using pre-trained visual representations for robotic control. At a high level, UVD discovers subgoals by detecting phase shifts in the embedding space of the pre-trained representation. Operating purely on visual demonstrations without auxiliary information, UVD can effectively extract visual subgoals embedded in the videos, while incurring zero additional training cost on top of standard visuomotor policy training. Goal-conditioned policies learned with UVD-discovered subgoals exhibit significantly improved compositional generalization at test time to unseen tasks. Furthermore, UVD-discovered subgoals can be used to construct goal-based reward shaping that jump-starts temporally extended exploration for reinforcement learning. We extensively evaluate UVD on both simulation and real-world tasks, and in all cases, UVD substantially outperforms baselines across imitation and reinforcement learning settings on in-domain and out-of-domain task sequences alike, validating the clear advantage of automated visual task decomposition within the simple, compact UVD framework.
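The phase-shift heuristic at UVD's core can be sketched in a few lines. This is a simplified stand-in, assuming scalar embeddings and an exact monotonicity test, whereas UVD operates on learned visual representations with a more robust criterion: scanning a demonstration backwards, frames within one phase recede monotonically (in embedding distance) from that phase's subgoal, and a break in that monotonicity marks a subgoal boundary.

```python
def discover_subgoals(embs, dist):
    """Return subgoal frame indices by scanning the demonstration backwards.
    Within a phase, earlier frames are monotonically farther from the phase's
    subgoal; a drop in that distance signals a phase shift."""
    n = len(embs)
    subgoals = [n - 1]            # the final frame is always a subgoal
    goal = embs[-1]
    prev = 0.0                    # dist(goal, goal)
    for i in range(n - 2, -1, -1):
        d = dist(embs[i], goal)
        if d >= prev:             # still receding from the current subgoal
            prev = d
        else:                     # phase shift: frame i+1 ends the previous phase
            subgoals.append(i + 1)
            goal = embs[i + 1]
            prev = dist(embs[i], goal)
    return sorted(subgoals)
```

Because the criterion only reads off distances in a frozen embedding space, it adds no training cost on top of policy learning, which is the property the abstract highlights.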


10:30-12:00, Paper WeAA2-CC.6
HEGN: Hierarchical Equivariant Graph Neural Network for 9DoF Point Cloud Registration

Misik, Adam | Siemens Technology, Technical University Munich
Salihu, Driton | Technical University Munich
Su, Xin | Technical University of Munich
Brock, Heike | Siemens AG
Steinbach, Eckehard | Technical University of Munich
Keywords: Deep Learning for Visual Perception, Visual Learning, Computer Vision for Automation
Abstract: Given its wide application in robotics, point cloud registration is a widely researched topic. Conventional methods aim to find a rotation and translation that align two point clouds in 6 degrees of freedom (DoF). However, certain tasks in robotics, such as category-level pose estimation, involve non-uniformly scaled point clouds, requiring a 9DoF transform for accurate alignment. We propose HEGN, a novel equivariant graph neural network for 9DoF point cloud registration. HEGN utilizes equivariance to rotation, translation, and scaling to estimate the transformation without relying on point correspondences. Based on graph representations for both point clouds, we extract equivariant node features aggregated in their local, cross-, and global context. In addition, we introduce a novel node pooling mechanism that leverages the cross-context importance of nodes to pool the graph representation. By repeating the feature extraction and node pooling, we obtain a graph hierarchy. Finally, we determine rotation and translation by aligning equivariant features aggregated over the graph hierarchy. To estimate scaling, we leverage scale information contained in the vector norm of the equivariant features. We evaluate the effectiveness of HEGN through experiments with the synthetic ModelNet40 dataset and the real-world ScanObjectNN dataset. The results show the superior performance of HEGN in 9DoF point cloud registration and its competitive performance in conventional 6DoF point cloud registration.


WeAT1-CC Oral Session, CC-303
Motion and Path Planning I

Chair: Tsagarakis, Nikos | Istituto Italiano Di Tecnologia
Co-Chair: Okuda, Hiroyuki | Nagoya University

10:30-12:00, Paper WeAT1-CC.1
Autonomous Navigation with Online Replanning and Recovery Behaviors for Wheeled-Legged Robots Using Behavior Trees

De Luca, Alessio | Istituto Italiano Di Tecnologia
Muratore, Luca | Istituto Italiano Di Tecnologia
Tsagarakis, Nikos | Istituto Italiano Di Tecnologia
Keywords: Motion and Path Planning, Reactive and Sensor-Based Planning, Field Robots
Abstract: Autonomous navigation in cluttered and unstructured terrains remains a challenging task for legged and wheeled mobile robots. To accomplish it, online planners must incorporate new terrain information perceived while the robot moves through its environment. While hybrid-mobility robots offer high flexibility in traversing challenging terrains by leveraging the advantages of both wheeled and legged locomotion, effective hybrid planning of mobility actions that transparently combines both modes of locomotion has not been extensively explored. In this work, we present a hierarchical, online, hybrid primitive-based planner for autonomous navigation with wheeled-legged robots. The framework is handled by a Behavior Tree (BT) and incorporates recovery methods to deal with possible failures during execution of the navigation/mobility plan. The framework was evaluated in multiple randomly generated, irregular, and heavily cluttered simulated environments and in real-world trials using the CENTAURO robot platform. With these experiments, we demonstrated autonomous capabilities without any human intervention, even in the case of collisions or planner failures.
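The failure-recovery pattern described here maps naturally onto two classic BT composites. A minimal sketch follows; the node names and the recovery structure are illustrative, not the paper's actual tree:

```python
class Sequence:
    """Tick children in order; the first non-SUCCESS status aborts the tick."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for c in self.children:
            status = c.tick()
            if status != "SUCCESS":
                return status
        return "SUCCESS"

class Fallback:
    """Tick children until one does not fail; recovery behaviors hang here."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for c in self.children:
            status = c.tick()
            if status != "FAILURE":
                return status
        return "FAILURE"

class Action:
    """Leaf wrapping a callable that returns SUCCESS/FAILURE/RUNNING."""
    def __init__(self, name, fn, log):
        self.name, self.fn, self.log = name, fn, log
    def tick(self):
        self.log.append(self.name)
        return self.fn()

# A navigation tick that falls back to a recovery behavior when the
# plan-and-execute branch fails (e.g., after a collision).
log = []
tree = Fallback(
    Sequence(Action("plan", lambda: "SUCCESS", log),
             Action("execute", lambda: "FAILURE", log)),   # simulated failure
    Action("recover_and_replan", lambda: "SUCCESS", log),
)
status = tree.tick()
```

Because the Fallback only ticks the recovery branch when the nominal branch fails, the recovery behaviors cost nothing during normal operation and the tree remains reactive at every tick.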


10:30-12:00, Paper WeAT1-CC.2
Signal Temporal Logic Neural Predictive Control

Meng, Yue | Massachusetts Institute of Technology
Fan, Chuchu | Massachusetts Institute of Technology
Keywords: Motion and Path Planning, Machine Learning for Robot Control, AI-Based Methods
Abstract: Ensuring safety and meeting temporal specifications are critical challenges for long-term robotic tasks. Signal temporal logic (STL) has been widely used to systematically and rigorously specify these requirements. However, traditional methods of finding a control policy under STL requirements are computationally complex and do not scale to high-dimensional systems or systems with complex nonlinear dynamics. Reinforcement learning (RL) methods can learn policies that satisfy STL specifications via hand-crafted or STL-inspired rewards, but may exhibit unexpected behaviors due to ambiguity and sparsity in the reward. In this paper, we propose a method to directly learn a neural network controller that satisfies the requirements specified in STL. Our controller learns to roll out trajectories to maximize the STL robustness score during training. At test time, similar to Model Predictive Control (MPC), the learned controller predicts a trajectory within a planning horizon to ensure satisfaction of the STL requirement in deployment. A backup policy is designed to ensure safety when our controller fails. Our approach can adapt to various initial conditions and environmental parameters. We conduct experiments on six tasks, where our method with the backup policy outperforms classical methods (MPC, STL solvers) and model-free and model-based RL methods in STL satisfaction rate, especially on tasks with complex STL specifications, while being 10X-100X faster than the classical methods.
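The STL robustness score the controller maximizes has a simple quantitative semantics for the basic temporal operators. A minimal sketch over discrete-time scalar signals (standard STL semantics, independent of the paper's network architecture):

```python
def rho_always(signal, mu):
    """Robustness of G(mu(x) > 0): the worst-case margin over the horizon."""
    return min(mu(x) for x in signal)

def rho_eventually(signal, mu):
    """Robustness of F(mu(x) > 0): the best margin achieved at any step."""
    return max(mu(x) for x in signal)

def rho_and(r1, r2):
    """Conjunction takes the weaker of the two robustness scores."""
    return min(r1, r2)
```

A trajectory satisfies a formula iff its robustness is positive; "eventually reach the goal region while always staying clear of the obstacle" scores min(best margin to the goal, worst clearance), and maximizing this differentiable-in-practice score is what replaces a hand-crafted reward.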


10:30-12:00, Paper WeAT1-CC.3
Multi-Query TDSP for Path Planning in Time-Varying Flow Fields

Lee, James Ju Heon | University of Technology Sydney
Yoo, Chanyeol | University of Technology Sydney
Anstee, Stuart David | Defence Science and Technology Group
Fitch, Robert | University of Technology Sydney
Keywords: Motion and Path Planning, Marine Robotics
Abstract: Many applications of path planning in time-varying flow fields, particularly in areas such as marine robotics and ship routing, can be modelled as instances of the time-dependent shortest path (TDSP) problem. Although there are no known polynomial-time solutions to TDSP in general, our recent work has identified a tractable case where the flow is modelled as piecewise constant. Extending this method to allow computational reuse in larger multi-query problems, however, requires additional thought. This paper shows that the piecewise-linear form of the cost function employed in previous work can be used to build an analogue of a shortest-path tree, thereby enabling optimal concatenation of sub-problem solutions in the absence of an optimal substructure, and without uniform time discretisation. We present a framework for multi-query TDSP that finds an optimal path passing through a defined sequence of waypoints and is computationally efficient. A performance comparison in simulation shows large (up to 100x) speedups compared to a naive approach. This result is significant for applications such as ship routing, where route evaluation is a desirable capability.


10:30-12:00, Paper WeAT1-CC.4
CTopPRM: Clustering Topological PRM for Planning Multiple Distinct Paths in 3D Environments

Novosad, Matej | Faculty of Electrical Engineering, Czech Technical University in Prague
Penicka, Robert | Czech Technical University in Prague
Vonasek, Vojtech | Czech Technical University in Prague
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination
Abstract: We propose a new method called Clustering Topological PRM (CTopPRM) for finding multiple distinct paths in 3D cluttered environments. Finding such distinct paths is useful in many applications. Among others, multiple distinct paths are necessary for optimization-based trajectory planners, where found trajectories are restricted to a single homotopy class of a given path. Distinct paths can also be used to guide sampling-based motion planning and thus increase the effectiveness of planning in environments with narrow passages. A graph-based representation called a roadmap is commonly used for path planning and also for finding multiple distinct paths. Yet, challenging environments with multiple narrow passages require a densely sampled roadmap to capture the connectivity of the environment, and searching such a dense roadmap for multiple paths is computationally too expensive. The majority of existing methods construct only a sparse roadmap, which, however, struggles to find all distinct paths in challenging environments. To this end, we propose CTopPRM, which creates a sparse graph by clustering an initially sampled dense roadmap. Such a reduced roadmap allows fast identification of the homotopically distinct paths captured in the dense roadmap. We show that, compared to existing methods, CTopPRM improves the probability of finding all distinct paths by almost 20% within the same run-time. The source code of our method is released as an open-source package.


10:30-12:00, Paper WeAT1-CC.5
Stein Variational Guided Model Predictive Path Integral Control: Proposal and Experiments with Fast Maneuvering Vehicles

Honda, Kohei | Nagoya University
Akai, Naoki | Nagoya University
Suzuki, Kosuke | Nagoya University
Aoki, Mizuho | Nagoya University
Hosogaya, Hirotaka | Nagoya University
Okuda, Hiroyuki | Nagoya University
Suzuki, Tatsuya | Nagoya University
Keywords: Motion and Path Planning, Optimization and Optimal Control, Collision Avoidance
Abstract: This paper presents a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI), named Stein Variational Guided MPPI (SVG-MPPI), designed to handle rapidly shifting multimodal optimal action distributions. While MPPI can find a Gaussian-approximated optimal action distribution in closed form, i.e., without iterative solution updates, it struggles with the multimodality of the optimal distributions. This is due to the less representative nature of the Gaussian. To overcome this limitation, our method aims to identify a target mode of the optimal distribution and guide the solution to converge to fit it. In the proposed method, the target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method and embedded into the MPPI algorithm to find a closed-form "mode-seeking" solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in terms of path-tracking and obstacle-avoidance capabilities.
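The closed-form Gaussian-approximated update that MPPI (and hence SVG-MPPI) builds on is compact enough to sketch. Below is a vanilla MPPI iteration on a toy 1-D integrator; SVG-MPPI additionally steers the samples toward a target mode with SVGD, which is omitted here, and all parameter values are illustrative:

```python
import math
import random

def mppi_update(u_nominal, rollout_cost, n_samples=256, sigma=0.5, lam=1.0, rng=None):
    """One MPPI iteration: perturb the nominal control sequence with Gaussian
    noise, score each rollout, and return the softmax-weighted average of the
    samples (the closed-form update the abstract refers to)."""
    rng = rng or random.Random(0)  # fixed seed for a reproducible illustration
    H = len(u_nominal)
    samples, costs = [], []
    for _ in range(n_samples):
        u = [ui + rng.gauss(0.0, sigma) for ui in u_nominal]
        samples.append(u)
        costs.append(rollout_cost(u))
    c_min = min(costs)                                   # for numerical stability
    w = [math.exp(-(c - c_min) / lam) for c in costs]    # path-integral weights
    z = sum(w)
    return [sum(w[k] * samples[k][t] for k in range(n_samples)) / z
            for t in range(H)]
```

When the true optimal action distribution is multimodal, this weighted mean averages across modes, which is exactly the failure the paper's mode-seeking variant is designed to avoid.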


10:30-12:00, Paper WeAT1-CC.6
An Efficient Solution to the 2D Visibility Problem in Cartesian Grid Maps and Its Application in Heuristic Path Planning

Ibrahim, Ibrahim | KU Leuven
Gillis, Joris | KU Leuven
Decré, Wilm | Katholieke Universiteit Leuven
Swevers, Jan | KU Leuven
Keywords: Computational Geometry, Simulation and Animation, Motion and Path Planning
Abstract: This paper introduces a novel, lightweight method to solve the visibility problem for 2D grids. The proposed method evaluates the existence of lines-of-sight from a source point to all other grid cells in a single pass with no preprocessing and independently of the number and shape of obstacles. It has a compute and memory complexity of O(n), where n = n_x × n_y is the size of the grid, and requires at most ten arithmetic operations per grid cell. In the proposed approach, we use a linear first-order hyperbolic partial differential equation to transport the visibility quantity in all directions. In order to accomplish that, we use an entropy-satisfying upwind scheme that converges to the true visibility polygon as the step size goes to zero. This dynamic-programming approach allows the evaluation of visibility for an entire grid much faster than typical algorithms. We provide a practical application of our proposed algorithm by posing the visibility quantity as a heuristic and implementing a deterministic, local-minima-free path planner, setting apart the proposed planner from traditional methods. Lastly, we provide the necessary algorithms and an open-source implementation of the proposed methods.
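A minimal version of the single-pass idea can be written down directly. This is a sketch of an upwind visibility transport on a grid under my own simplifying assumptions; the paper's entropy-satisfying scheme and its convergence analysis are more careful than this illustration:

```python
def visibility(grid, sx, sy):
    """Visibility field from source (sx, sy) on an occupancy grid
    (grid[i][j] == 1 means blocked).  Each cell upwind-interpolates the two
    neighbours one step closer to the source, so visibility is transported
    outward along lines of sight; each cell needs only a handful of
    arithmetic operations, matching the O(n) single-pass claim."""
    nx, ny = len(grid), len(grid[0])
    v = [[0.0] * ny for _ in range(nx)]
    v[sx][sy] = 0.0 if grid[sx][sy] else 1.0
    for si in (1, -1):                      # sweep the four quadrants
        for sj in (1, -1):
            i = sx
            while 0 <= i < nx:
                j = sy
                while 0 <= j < ny:
                    if (i, j) != (sx, sy):
                        di, dj = abs(i - sx), abs(j - sy)
                        vi = v[i - si][j] if di else 0.0
                        vj = v[i][j - sj] if dj else 0.0
                        v[i][j] = 0.0 if grid[i][j] else \
                            (di * vi + dj * vj) / (di + dj)
                    j += sj
                i += si
    return v
```

Thresholding the field (e.g., v > 0.5) approximates the visibility polygon of the source; used directly, the smooth field can serve as the planning heuristic the abstract describes.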
|
|
10:30-12:00, Paper WeAT1-CC.7 |
Efficient Clothoid Tree-Based Local Path Planning for Self-Driving Robots |
|
Lee, Minhyeong | Seoul National University |
Lee, Dongjun | Seoul National University |
Keywords: Motion and Path Planning, Wheeled Robots
Abstract: In this paper, we propose a real-time clothoid tree-based path planning method for self-driving robots. Clothoids, curves that exhibit linear curvature profiles, play an important role in road design and path planning due to their appealing properties. Nevertheless, their real-time application faces considerable challenges, primarily stemming from the lack of a closed-form clothoid expression. To address these challenges, we introduce two innovative techniques: 1) an efficient and precise clothoid approximation using the Gauss-Legendre quadrature; and 2) a data-efficient decoder for interpolating clothoid splines that leverages the symmetry and similarity of clothoids. These techniques are demonstrated with numerical examples. The clothoid approximation ensures an accurate and smooth representation of the curve, and the clothoid spline decoder effectively accelerates the clothoid tree exploration by relaxing the problem constraints and reducing the problem size. Both techniques are integrated into our path planning algorithm and evaluated in various driving scenarios.
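Since the abstract does not give the authors' approximation formulas, here is a minimal sketch of the underlying idea: a clothoid has heading theta(t) = theta0 + kappa0*t + 0.5*c*t^2 (linear curvature), so its endpoint is a Fresnel-type integral that an n-point Gauss-Legendre rule evaluates accurately without a closed form. The function name and default parameters are illustrative.

```python
import numpy as np

def clothoid_point(s, theta0=0.0, kappa0=0.0, c=1.0, n=12):
    """Approximate the endpoint (x, y) of a clothoid of arc length s.

    The position is the integral of (cos theta(t), sin theta(t)) over
    [0, s], evaluated with an n-point Gauss-Legendre rule mapped from
    the reference interval [-1, 1] to [0, s].
    """
    nodes, weights = np.polynomial.legendre.leggauss(n)
    t = 0.5 * s * (nodes + 1.0)   # map quadrature nodes to [0, s]
    w = 0.5 * s * weights         # rescale weights accordingly
    theta = theta0 + kappa0 * t + 0.5 * c * t**2
    return np.sum(w * np.cos(theta)), np.sum(w * np.sin(theta))
```

Setting c = 0 recovers a circular arc (or a straight line if kappa0 is also 0), which gives a quick sanity check against the known closed-form endpoints.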
|
|
10:30-12:00, Paper WeAT1-CC.8 |
Decentralized Lifelong Path Planning for Multiple Ackerman Car-Like Robots |
|
Guo, Teng | Rutgers University |
Yu, Jingjin | Rutgers University |
Keywords: Motion and Path Planning, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: Path planning for multiple non-holonomic robots in continuous domains constitutes a difficult robotics challenge with many applications. Despite significant recent progress on the topic, computationally efficient and high-quality solutions are lacking, especially in lifelong settings where robots must continuously take on new tasks. In this work, we make it possible to extend key ideas enabling state-of-the-art (SOTA) methods for multi-robot planning in discrete domains to the motion planning of multiple Ackerman (car-like) robots in lifelong settings, yielding high-performance centralized and decentralized planners. Our planners compute trajectories that allow the robots to reach precise SE(2) goal poses. The effectiveness of our methods is thoroughly evaluated and confirmed using both simulation and real-world experiments.
|
|
10:30-12:00, Paper WeAT1-CC.9 |
Energy-Aware Ergodic Search: Continuous Exploration for Multi-Agent Systems with Battery Constraints |
|
Seewald, Adam | Yale University |
Lerch, Cameron | Yale University |
Chancán, Marvin | Yale University |
Dollar, Aaron | Yale University |
Abraham, Ian | Yale University |
Keywords: Motion and Path Planning, Energy and Environment-Aware Automation
Abstract: Continuous exploration without interruption is important in scenarios such as search and rescue and precision agriculture, where consistent presence is needed to detect events over large areas. Ergodic search already derives continuous trajectories in these scenarios so that a robot spends more time in areas with high information density. However, existing literature on ergodic search does not consider the robot's energy constraints, limiting how long a robot can explore. In fact, if the robots are battery-powered, it is physically not possible to continuously explore on a single battery charge. Our paper tackles this challenge, integrating ergodic search methods with energy-aware coverage. We trade off battery usage and coverage quality, maintaining uninterrupted exploration by at least one agent. Our approach derives an abstract battery model for future state-of-charge estimation and extends canonical ergodic search to ergodic search under battery constraints. Empirical data from simulations and real-world experiments demonstrate the effectiveness of our energy-aware ergodic search, which ensures continuous exploration and guarantees spatial coverage.
|
|
WeAT2-CC Oral Session, CC-311 |
Actuation |
|
|
Chair: Thomas, Ulrike | Chemnitz University of Technology |
Co-Chair: Haddadin, Sami | Technical University of Munich |
|
10:30-12:00, Paper WeAT2-CC.1 |
Development of Variable Transmission Series Elastic Actuator for Hip Exoskeletons |
|
Wang, Tianci | City University of Hong Kong |
Wen, Hao | City University of Hong Kong |
Song, Zaixin | City University of Hong Kong |
Dong, Zhiping | City University of Hong Kong |
Liu, Chunhua | City University of Hong Kong |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Compliance and Impedance Control
Abstract: Series Elastic Actuator (SEA)-based exoskeletons can offer precise torque control and transparency when interacting with human wearers. Accurate control of SEA-produced torques ensures the wearer's voluntary motion and supports the implementation of multiple assistive paradigms. In this paper, a novel variable transmission series elastic actuator (VTSEA) is developed to meet the torque-speed requirements of different exoskeleton-assisted locomotion modes, such as running, walking, sit-to-stand, and stand-to-sit. The VTSEA features an SEA-coupled variable transmission ratio adjusting mechanism and switches between three discrete transmission-ratio levels depending on the user's initiative. The proposed prototype can also improve transparency in human-robot interaction. Furthermore, an accurate torque controller with inertial compensation is developed for the VTSEA via singular perturbation theory, and its stability is proved. The feasibility of the proposed VTSEA prototype and its precise output torque performance are verified by experiments.
|
|
10:30-12:00, Paper WeAT2-CC.2 |
Optimization of Mono and Bi-Articular Parallel Elastic Elements for a Robotic Arm Performing a Pick-And-Place Task |
|
Marchal, Maxime | Vrije Universiteit Brussel |
Furnémont, Raphaël | Vrije Universiteit Brussel |
Vanderborght, Bram | Vrije Universiteit Brussel |
Mostafaoui, Ghiles | CNRS, University of CergyPontoise, ENSEA |
Verstraten, Tom | Vrije Universiteit Brussel |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Actuation concepts such as Series Elastic Actuation (SEA), Parallel Elastic Actuation (PEA), and Biarticular Actuation (BA), which introduce elastic elements into the structure, have the potential to reduce the electrical energy consumption of a robot. This letter presents an optimization of the arrangement of springs for a 3 degrees of freedom robotic arm, with the aim of decreasing the electrical energy consumption for a given pick-and-place task. Through simulations and experimental validation, we show that the optimal configuration in terms of electrical energy consumption and complexity consists of rigid actuation on joint 1 and PEAs on joints 2 and 3. With this configuration, root mean square (RMS) and peak load torques for a specific pick-and-place task can be reduced respectively by up to 43% and 44% for joint 2, and by 15% and 21% for joint 3 compared to the configuration without springs.
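As a hedged illustration of why a parallel elastic element reduces load torque (not the paper's optimization, which minimizes electrical energy and also covers bi-articular springs), one can fit a linear spring tau_s = k*(q - q0) that minimizes the RMS residual motor torque over a recorded joint cycle; this reduces to a linear least-squares problem. The function name and linear-spring model are assumptions.

```python
import numpy as np

def optimal_parallel_spring(q, tau_load):
    """Least-squares fit of a linear parallel spring tau_s = k*(q - q0)
    that minimizes the RMS motor torque tau_load - tau_s over a cycle.

    q, tau_load : 1D arrays of joint position and load torque samples.
    Returns (k, q0, rms_before, rms_after).
    """
    # tau_s = k*q - (k*q0) is linear in the unknowns (k, k*q0).
    A = np.column_stack([q, -np.ones_like(q)])
    (k, kq0), *_ = np.linalg.lstsq(A, tau_load, rcond=None)
    q0 = kq0 / k if k else 0.0
    residual = tau_load - (k * q - kq0)     # torque the motor still supplies
    rms = lambda x: float(np.sqrt(np.mean(x**2)))
    return k, q0, rms(tau_load), rms(residual)
```

For a load torque that is itself spring-like over the cycle, the residual motor torque drops to nearly zero, which is the mechanism behind the RMS-torque reductions reported in the abstract.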
|
|
10:30-12:00, Paper WeAT2-CC.3 |
A Novel Compact Design of a Lever-Cam-Based Variable Stiffness Actuator: LC-VSA |
|
Zhu, Hongxi | Chemnitz University of Technology |
Thomas, Ulrike | Chemnitz University of Technology |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Safe interaction between humans and robots is one of the key challenges in robotics. To protect humans and robots from impact, researchers have developed many soft robots that incorporate mechanical springs into their joints. The forthcoming generation of soft robots necessitates adaptable joint stiffness to accommodate various tasks. Consequently, the development of variable stiffness actuators (VSA) has become crucial. Among the prevalent approaches to stiffness adjustment, lever mechanisms have been implemented in numerous variable stiffness joints. Nonetheless, integrating lever technology into a VSA often makes achieving a compact design difficult. This paper introduces a mechanically compact design for a novel lever-cam-based variable stiffness joint.
|
|
10:30-12:00, Paper WeAT2-CC.4 |
Design and Modeling of a Compact Serial Variable Stiffness Actuator (SVSA-III) with Linear Stiffness Profile |
|
Yi, Shuowen | Wuhan University |
Liu, Siyu | The School of Power and Mechanical Engineering, Wuhan University |
Liao, Junbei | Wuhan University |
Guo, Zhao | Wuhan University |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Variable stiffness actuators (VSA) can imitate the compliance capability of natural muscles, providing flexible adaptability for robots and improving safety when robots interact with the environment or humans. This paper presents a new compact serial variable stiffness actuator (SVSA-III) with a linear stiffness profile based on a symmetrical variable lever arm mechanism. A stiffness motor regulates the position of the pivot located on the Archimedean Spiral Relocation Mechanism (ASRM), so that the stiffness of the actuator can be adjusted (softening or hardening). By designing the lever length, the range of stiffness adjustment can span from 0.3 Nm/degree to theoretical infinity. Moreover, the continuous linear stiffness profile of the actuator can be customized by solving the transcendental equation relating the actuator stiffness to the rotation angle of the stiffness motor. SVSA-III has the advantages of a compact structure, wide-range stiffness regulation, reduced control difficulty, and a linear stiffness profile. Two experiments, step response and stiffness tracking, demonstrate high accuracy and fast response for both stiffness and position adjustment.
|
|
10:30-12:00, Paper WeAT2-CC.5 |
Optimally Controlling the Timing of Energy Transfer in Elastic Joints: Experimental Validation of the Bi-Stiffness Actuation Concept |
|
Pozo Fortunić, Edmundo | Technical University of Munich |
Yildirim, Mehmet Can | Technical University of Munich |
Ossadnik, Dennis | Technical University of Munich |
Swikir, Abdalla | Technical University of Munich |
Abdolshah, Saeed | KUKA Deutschland GmbH |
Haddadin, Sami | Technical University of Munich |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Optimization and Optimal Control
Abstract: Elastic actuation taps into elastic elements' energy storage for dynamic motions beyond rigid actuation. While Series Elastic Actuators (SEA) and Variable Stiffness Actuators (VSA) are highly sophisticated, they do not fully provide control over energy transfer timing. To overcome this problem on the basic system level, the Bi-Stiffness Actuation (BSA) concept was recently proposed. Theoretically, it allows for full link decoupling, while simultaneously being able to lock the spring in the drive train via a switch-and-hold mechanism. Thus, the user would be in full control of the potential energy storage and release timing. In this work, we introduce an initial proof-of-concept of Bi-Stiffness-Actuation in the form of a 1-DoF physical prototype, which is implemented using a modular testbed. We present a hybrid system model, as well as the mechatronic implementation of the actuator. We corroborate the feasibility of the concept by conducting a series of hardware experiments using an open-loop control signal obtained by trajectory optimization. Here, we compare the performance of the prototype with a comparable SEA implementation. We show that BSA outperforms SEA 1) in terms of maximum velocity at low final times and 2) in terms of the movement strategy itself: The clutch mechanism allows the BSA to generate consistent launch sequences while the SEA has to rely on lengthy and possibly dangerous oscillatory swing-up motions.
|
|
10:30-12:00, Paper WeAT2-CC.6 |
Experimental Comparison of Pinwheel and Non-Pinwheel Designs of 3D-Printed Cycloidal Gearing for Robotics |
|
Roozing, Wesley | University of Twente |
Roozing, Glenn | Auto Elect B.V |
Keywords: Actuation and Joint Mechanisms, Mechanism Design
Abstract: Recent trends in robotic actuation have highlighted the need for low-cost, high-performance, and efficient gearing. We present an experimental study comparing pinwheel and non-pinwheel designs of cycloidal gearing. The open-source designs are 3D-printable combined with off-the-shelf components, achieving a high performance-to-cost ratio. Extensive experimental data is presented comparing two prototypes on run-in behaviour and a number of quantitative metrics including transmission error, play, friction, and stiffness. Furthermore, we assess overall actuator performance through position control experiments and a 10-hour endurance test. The results show strong performance characteristics and, crucially, suggest that non-pinwheel designs of cycloidal gearing can be a lower-complexity and lower-cost alternative to classical pinwheel designs, while offering similar performance.
|
|
10:30-12:00, Paper WeAT2-CC.7 |
Design and Optimization of an Origami-Inspired Foldable Pneumatic Actuator |
|
Chen, Huaiyuan | Shanghai Jiao Tong University |
Ma, Yiyuan | Shanghai Jiao Tong University |
Chen, Weidong | Shanghai Jiao Tong University |
Keywords: Hydraulic/Pneumatic Actuators, Actuation and Joint Mechanisms, Modeling, Control, and Learning for Soft Robots
Abstract: A novel origami-inspired foldable pneumatic actuator is proposed in this letter to satisfy the comprehensive requirements of wearable assistive applications. The pneumatic actuator combines an origami structure based on the designed Quadrangular-Expand pattern with a foldable pneumatic bellows. The integrated origami structure regulates the motion of the actuator with a high contraction ratio and enables accurate modeling. The origami framework also improves the strength under negative pressure, and thus enables bidirectional actuation. The workflow, including design, fabrication, and mathematical modeling of the pneumatic actuator, is presented in detail. Based on the actuator model, a multi-objective parameter optimization using a Genetic Algorithm is conducted to obtain a trade-off design. The static characteristics of output torque, as well as the dynamic characteristics of power density, mechanical efficiency, and frequency response, are verified experimentally. In summary, the proposed actuator is powerful and energy-efficient.
|
|
10:30-12:00, Paper WeAT2-CC.8 |
A Non-Magnetic Dual-Mode Linear Pneumatic Actuator: Initial Design and Assessment |
|
Portha, Timothée | University of Strasbourg |
Barbé, Laurent | University of Strasbourg, ICube CNRS |
Geiskopf, Francois | INSA De Strasbourg |
Vappou, Jonathan | CNRS, Universite De Strasbourg |
Renaud, Pierre | ICube |
Keywords: Hydraulic/Pneumatic Actuators
Abstract: A pneumatic linear actuator is presented and evaluated. Designed to operate in demanding environments such as MRI, it is developed to be used with two motion control modes: 1) a step-by-step mode with tooth-based gripping to ensure precision, 2) a continuous mode available locally for fine positioning. The actuator can also be disengaged to enable direct handling by an operator, for example for comanipulation. The design is presented. A prototype, developed in the medical context, is implemented and characterized. A specific step-by-step control sequence is then elaborated based on its characterization. Testing of the dual-mode actuation is finally described. The complementarity between the two motion modes and possible adaptations of the original design are discussed.
|
|
10:30-12:00, Paper WeAT2-CC.9 |
Variable Stiffness Floating Spring Leg: Performing Net-Zero Energy Cost Tasks Not Achievable Using Fixed Stiffness Springs |
|
Kim, Sung | Vanderbilt University |
Braun, David | Vanderbilt University |
Keywords: Compliant Joints and Mechanisms, Actuation and Joint Mechanisms, Legged Robots
Abstract: Sitting down and standing up from a chair and, similarly, moving heavy objects up and down between factory lines are examples of cyclic tasks that require large forces but little to no net mechanical energy. Motor-driven artificial limbs and industrial robots can help humans do these tasks, but motors require energy to provide force even if they supply no net mechanical energy. Springs are energetically conservative mechanical elements useful for building robots that require no energy when performing cyclic tasks. However, conventional springs can be limited by their non-customizable force-deflection behavior -- for example, when they cannot meet the force demand despite storing enough energy to perform a cyclic task. Variable stiffness springs are a special type of spring with customizable force-deflection behavior, but most typical variable stiffness springs require energy to amplify force similar to motors. In this paper, we introduce a new type of variable stiffness spring design which is energetically conservative despite having a customizable force-deflection behavior. We present the theory of these springs and demonstrate their utility in performing a net-zero mechanical energy cost lifting task that requires force amplification and as such is not realizable using conventional springs.
|
|
WeAT3-CC Oral Session, CC-313 |
Kinematics |
|
|
Chair: Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Co-Chair: Chirikjian, Gregory | National University of Singapore |
|
10:30-12:00, Paper WeAT3-CC.1 |
Accurate Kinematic Modeling Using Autoencoders on Differentiable Joints |
|
Wilhelm, Nikolas Jakob | Technical University of Munich |
Haddadin, Sami | Technical University of Munich |
Burgkart, Rainer | Technische Universität München |
van der Smagt, Patrick | Volkswagen Group |
Karl, Maximilian | Volkswagen AG |
Keywords: Deep Learning Methods, Kinematics
Abstract: In robotics and biomechanics, accurately determining joint parameters and computing the corresponding forward and inverse kinematics are critical yet often challenging tasks, especially when dealing with highly individualized and partly unknown systems. This paper unveils a cutting-edge kinematic optimizer, underpinned by an autoencoder-based architecture, to address these challenges. Utilizing a neural network, our approach simulates inverse kinematics, converting measurement data into joint-specific parameters during encoding, enabling a stable optimization process. These parameters are subsequently processed through a predefined, differentiable forward kinematics model, resulting in a decoded representation of the original data. Beyond offering a comprehensive solution to kinematics challenges, our method also unveils previously unidentified joint parameters. Real experimental data from knee and hand joints validate the optimizer's efficacy. Additionally, our optimizer is multifunctional: it streamlines the modeling and automation of kinematics and enables a nuanced evaluation of diverse modeling techniques. By assessing the differences in reconstruction losses, we illuminate the merits of each approach. Collectively, this preliminary study signifies advancements in kinematic optimization, with potential applications spanning both biomechanics and robotics.
|
|
10:30-12:00, Paper WeAT3-CC.2 |
A Miniature Water Jumping Robot Based on Accurate Interaction Force Analysis |
|
Yan, Jihong | Harbin Institute of Technology |
Zhang, Xin | Harbin Institute of Technology |
Yang, Kai | Harbin Institute of Technology |
Zhao, Jie | Harbin Institute of Technology |
Keywords: Dynamics, Mechanism Design, Kinematics, Trajectory Optimization
Abstract: Water jumping extends a robot's movement space and flexibility. However, jumping performance is influenced by multiple factors such as driving force, rowing trajectory, and robot structure. The interaction force between the robot and the water surface is complicated by water deformation, and the difficulty of water jumping increases with the robot's scale. This paper designs a miniature water jumping robot with rowing driving legs. The hydrodynamic model between the driving legs and water is established based on modified Wagner theory, with consideration of water surface deformation. In particular, a dynamic model of the robot over the whole jumping process is also developed, relating these multiple factors. The jumping performance is then improved by optimizing the energy storage modality, rowing trajectory, and supporting leg shape through theoretical analysis and experiments. The fabricated robot weighs 91 g; its length, width, and height are 220 mm, 410 mm, and 95 mm, respectively. The maximum water jumping height and distance are 241 mm and 965 mm.
|
|
10:30-12:00, Paper WeAT3-CC.3 |
Jerk-Limited Traversal of One-Dimensional Paths and Its Application to Multi-Dimensional Path Tracking |
|
Kiemel, Jonas | Karlsruhe Institute of Technology |
Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Keywords: Kinematics, Constrained Motion Planning
Abstract: In this paper, we present an iterative method to quickly traverse multi-dimensional paths considering jerk constraints. As a first step, we analyze the traversal of each individual path dimension. We derive a range of feasible target accelerations for each intermediate waypoint of a one-dimensional path using a binary search algorithm. Computing a trajectory from waypoint to waypoint leads to the fastest progress on the path when selecting the highest feasible target acceleration. Similarly, it is possible to calculate a trajectory that leads to minimum progress along the path. This insight allows us to control the traversal of a one-dimensional path in such a way that a reference path length of a multi-dimensional path is approximately tracked over time. In order to improve the tracking accuracy, we propose an iterative scheme to adjust the temporal course of the selected reference path length. More precisely, the temporal region causing the largest position deviation is identified and updated at each iteration. In our evaluation, we thoroughly analyze the performance of our method using seven-dimensional reference paths with different path characteristics. We show that our method manages to quickly traverse the reference paths and compare the required traversing time and the resulting path accuracy with other state-of-the-art approaches.
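The per-waypoint binary search can be sketched generically: given a feasibility predicate that is monotone in the target acceleration (e.g., "a jerk-limited trajectory to the next waypoint with this target acceleration violates no velocity or acceleration limit"), bisection finds the highest feasible value. The helper below is a sketch under that monotonicity assumption; the paper's actual feasibility check involves computing candidate trajectories, which is omitted here.

```python
def max_feasible(lo, hi, feasible, tol=1e-6):
    """Binary search for the largest x in [lo, hi] with feasible(x) True.

    Assumes feasibility is monotone: feasible below some threshold,
    infeasible above it. Returns None if even `lo` is infeasible.
    """
    if not feasible(lo):
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid    # mid works: the threshold lies at or above mid
        else:
            hi = mid    # mid fails: the threshold lies below mid
    return lo
```

The same routine, with the predicate negated, yields the lowest feasible target acceleration, giving the range of feasible targets described in the abstract.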
|
|
10:30-12:00, Paper WeAT3-CC.4 |
The Kinematics of Constant Curvature Continuum Robots through Three Segments |
|
Li, Yucheng | University of Dayton |
Myszka, David H. | University of Dayton |
Murray, Andrew | University of Dayton |
Keywords: Kinematics, Formal Methods in Robotics and Automation, Soft Robot Applications
Abstract: This letter presents an investigation into the mathematical relationships between the positions and orientations at the segment tips of a piecewise constant curvature (PCC) continuum robot with up to three segments. For one segment, a reachability criterion is proposed, which simplifies the calculation of the neighboring orientation. For two segments, a reachability criterion is proposed and the redundancy of the inverse kinematics solution is found, establishing a circle of tip locations. For three segments, the redundancy of the inverse kinematics includes tips that lie on a sphere, providing a closed-form solution to the inverse kinematics problem. These relationships are derived from the unique characteristics of the bisecting plane of a single segment. The degenerate cases for the solutions are also addressed. These outcomes stem from a specific PCC parametrization, with implications extending to the general PCC model. Note that this study is grounded solely in simulation.
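For context, the forward kinematics of a single constant-curvature segment (arc length L, curvature kappa, bending-plane angle phi) has a standard closed form. The sketch below uses the common Rz(phi) * (planar arc) * Rz(-phi) convention, which may differ from the specific parametrization used in the letter; the function name is illustrative.

```python
import numpy as np

def pcc_segment(kappa, phi, L):
    """Homogeneous transform (4x4) of one constant-curvature segment tip.

    kappa : curvature (1/length), phi : bending-plane angle (rad),
    L : arc length. As kappa -> 0 this reduces to a straight segment.
    """
    theta = kappa * L                      # total bending angle
    if abs(kappa) < 1e-9:                  # straight-segment limit
        p = np.array([0.0, 0.0, L])
        R_arc = np.eye(3)
    else:
        p = np.array([(1 - np.cos(theta)) / kappa, 0.0,
                      np.sin(theta) / kappa])
        # Rotation about the y-axis by theta (arc bends in the x-z plane).
        R_arc = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                          [0.0, 1.0, 0.0],
                          [-np.sin(theta), 0.0, np.cos(theta)]])
    Rz = np.array([[np.cos(phi), -np.sin(phi), 0.0],
                   [np.sin(phi),  np.cos(phi), 0.0],
                   [0.0, 0.0, 1.0]])
    T = np.eye(4)
    T[:3, :3] = Rz @ R_arc @ Rz.T          # rotate arc into the bending plane
    T[:3, 3] = Rz @ p
    return T
```

Multi-segment tip poses chain by matrix multiplication, e.g. `pcc_segment(k1, p1, L1) @ pcc_segment(k2, p2, L2) @ pcc_segment(k3, p3, L3)` for the three-segment case studied in the letter.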
|
|
10:30-12:00, Paper WeAT3-CC.5 |
An Analytic Solution to the 3D CSC Dubins Path Problem |
|
Montano, Victor | University of Houston |
Navkar, Nikhil | Hamad Medical Corporation |
Becker, Aaron | University of Houston |
Keywords: Kinematics, Nonholonomic Motion Planning, Motion and Path Planning
Abstract: We present an analytic solution to the 3D Dubins path problem for paths composed of an initial circular arc, a straight component, and a final circular arc. These are commonly called CSC paths. By modeling the start and goal configurations of the path as the base frame and final frame of an RRPRR manipulator, we treat this as an inverse kinematics problem. The kinematic features of the 3D Dubins path are built into the constraints of our manipulator model. Furthermore, we show that the number of solutions is not constant, with up to seven valid CSC path solutions even in non-singular regions. An implementation of the solution is available at https://github.com/aabecker/dubins3D
|
|
10:30-12:00, Paper WeAT3-CC.6 |
Polytope-Based Continuous Scalar Performance Measure with Analytical Gradient for Effective Robot Manipulation |
|
Somenedi Nageswara Rao, Keerthi Sagar | Irish Manufacturing Research Limited, Ireland |
Caro, Stéphane | CNRS/LS2N |
Padir, Taskin | Northeastern University |
Long, Philip | Atlantic Technological University |
Keywords: Kinematics, Optimization and Optimal Control, Parallel Robots
Abstract: Performance measures are essential to characterize a robot's ability to carry out manipulation tasks. Generally, these measures examine the system's kinematic transformations from configuration to task space, but the capacity margin, a polytope-based kinetostatic index, additionally provides an accurate evaluation of both the twist and wrench capacities of a robotic manipulator. However, this index is the minimum of a discontinuous scalar function, which makes gradients difficult to compute and renders it unsuitable for online numerical optimization. In this letter, we propose a novel performance index using an approximation of the capacity margin. The proposed index is continuous and differentiable, characteristics that are essential for modelling smooth and predictable system behavior. We demonstrate its effectiveness both as a constraint and as an objective function for inverse kinematics optimization. Moreover, to show its practical use, two opposing robot architectures are chosen: (i) serial robots, the Universal Robots UR5 (6-DoF) and the Rethink Robotics Sawyer (7-DoF); and (ii) a parallel manipulator, a cable-driven parallel robot. Results are validated through both simulation and experiments. A visual representation of the performance index is also presented.
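The abstract does not state the authors' exact approximation, but one standard way to turn a minimum over facet distances into a continuous, differentiable quantity with an analytic gradient is a log-sum-exp soft minimum. The sketch below is illustrative only (the function name and sharpness parameter rho are assumptions); it returns both the smoothed value and its gradient with respect to the distances.

```python
import numpy as np

def soft_min(d, rho=50.0):
    """Smooth, differentiable lower approximation of min(d).

    Computes -(1/rho) * log(sum_i exp(-rho * d_i)) in a numerically
    stable way. The analytic gradient d(val)/d(d_i) is exactly the
    softmax weight w_i, which concentrates on the active (smallest)
    entry as rho grows.
    """
    d = np.asarray(d, dtype=float)
    z = -rho * (d - d.min())               # shift so max(z) = 0 for stability
    w = np.exp(z) / np.exp(z).sum()        # softmax weights = gradient
    val = d.min() - np.log(np.exp(z).sum()) / rho
    return val, w
```

Because the value is always a lower bound on the true minimum and the gradient is smooth in the distances, such an index can be used directly as a constraint or objective in gradient-based inverse kinematics optimization, which is the use case described above.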
|
|
10:30-12:00, Paper WeAT3-CC.7 |
Kinematic Optimization of a Robotic Arm for Automation Tasks with Human Demonstration |
|
Meir, Inbar | Tel Aviv University |
Bechar, Avital | Agricultural Research Organization |
Sintov, Avishai | Tel-Aviv University |
Keywords: Kinematics, Industrial Robots
Abstract: Robotic arms are highly common in various automation processes such as manufacturing lines. However, these highly capable robots are usually degraded to simple repetitive tasks such as pick-and-place. On the other hand, designing an optimal robot for one specific task consumes large resources of engineering time and costs. In this paper, we propose a novel concept for optimizing the fitness of a robotic arm to perform a specific task based on human demonstration. Fitness of a robot arm is a measure of its ability to follow recorded human arm and hand paths. The optimization is conducted using a modified variant of the Particle Swarm Optimization for the robot design problem. In the proposed approach, we generate an optimal robot design along with the required path to complete the task. The approach could reduce the time-to-market of robotic arms and enable the standardization of modular robotic parts. Novice users could easily apply a minimal robot arm to various tasks. Two test cases of common manufacturing tasks are presented yielding optimal designs and reduced computational effort by up to 92%.
|
|
10:30-12:00, Paper WeAT3-CC.8 |
Enhancing Motion Trajectory Segmentation of Rigid Bodies Using a Novel Screw-Based Trajectory-Shape Representation |
|
Verduyn, Arno | KU Leuven |
Vochten, Maxim | KU Leuven |
De Schutter, Joris | KU Leuven |
Keywords: Kinematics, Learning from Demonstration
Abstract: Trajectory segmentation refers to dividing a trajectory into meaningful consecutive sub-trajectories. This paper focuses on trajectory segmentation for 3D rigid-body motions. Most segmentation approaches in the literature represent the body’s trajectory as a point trajectory, considering only its translation and neglecting its rotation. We propose a novel trajectory representation for rigid-body motions that incorporates both translation and rotation, and additionally exhibits several invariant properties. This representation consists of a geometric progress rate and a third-order trajectory-shape descriptor. Concepts from screw theory were used to make this representation time-invariant and also invariant to the choice of body reference point. This new representation is validated for a self-supervised segmentation approach, both in simulation and using real recordings of human-demonstrated pouring motions. The results show a more robust detection of consecutive sub-motions with distinct features and a more consistent segmentation compared to conventional representations. We believe that other existing segmentation methods may benefit from using this trajectory representation to improve their invariance.
|
|
10:30-12:00, Paper WeAT3-CC.9 |
Model Reduction in Soft Robotics Using Locally Volume-Preserving Primitives |
|
Xu, Yi | National University of Singapore |
Chirikjian, Gregory | National University of Singapore |
Keywords: Kinematics, Modeling, Control, and Learning for Soft Robots
Abstract: A new, and extremely efficient, computational modeling paradigm is introduced here for specific finite elasticity problems that arise in the context of soft robotics. Whereas continuum mechanics is a very classical area of study that is broadly applicable throughout engineering, and significant effort has been devoted to the development of intricate constitutive models for finite elasticity, we show that for the most part, the isochoric (locally volume-preserving) constraint dominates behavior, and this can be built into closed-form kinematic deformation fields before even considering other aspects of constitutive modeling. We therefore focus on developing and applying primitive deformations that each observe this constraint. By composing a wide enough variety of such deformations, many of the most common behaviors observed in soft robots can be replicated. Case studies include isotropic objects subjected to different boundary conditions, a non-isotropic helically-reinforced tube, and a not-purely-kinematic scenario with gravity loading. We show that this method is at least 50 times faster than the ABAQUS implementation of the finite element method (FEM), and has speed comparable with the real-time FEM framework SOFA. Experiments show that both our method and ABAQUS have approximately 10% error relative to experimentally measured displacements, as well as to each other. And our method outperforms SOFA when the deformation is highly nonlinear.
|
|
WeAT4-CC Oral Session, CC-315 |
Multi-Robot Systems IV |
|
|
Chair: Lam, Tin Lun | The Chinese University of Hong Kong, Shenzhen |
Co-Chair: Best, Graeme | University of Technology Sydney |
|
10:30-12:00, Paper WeAT4-CC.1 |
Automatic Configuration of Multi-Agent Model Predictive Controllers Based on Semantic Graph World Models |
|
de Vos, Koen | Eindhoven University of Technology |
Torta, Elena | Eindhoven University of Technology |
Bruyninckx, Herman | KU Leuven |
López Martínez, César Augusto | Eindhoven University of Technology |
van de Molengraft, Marinus Jacobus Gerardus | University of Technology Eindhoven |
Keywords: Multi-Robot Systems, Constrained Motion Planning, Cooperating Robots
Abstract: We propose a shared semantic map architecture to dynamically construct and configure Model Predictive Controllers (MPCs) that solve navigation problems for multiple robotic agents sharing parts of the same environment. The navigation task is represented as a sequence of semantically labeled areas in the map that must be traversed sequentially, i.e., a route. Each semantic label represents one or more constraints on the robots’ motion behaviour in that area. The advantages of this approach are: (i) an MPC-based motion controller in each individual robot can be (re-)configured, at runtime, with the locally and temporally relevant parameters; (ii) the application can influence, also at runtime, the navigation behaviour of the robots simply by adapting the semantic labels; and (iii) the robots can reason about their need for coordination by analyzing over which horizon in time and space their routes overlap. The paper provides simulations of various representative situations, showing that runtime configuration of the MPC drastically decreases computation time while retaining task execution performance similar to an approach in which each robot always includes all other robots in its MPC computations.
|
|
10:30-12:00, Paper WeAT4-CC.2 |
Meta-Reinforcement Learning Based Cooperative Surface Inspection of 3D Uncertain Structures Using Multi-Robot Systems |
|
Chen, Junfeng | Peking University |
Gao, Yuan | Shenzhen Institute of Artificial Intelligence and Robotics for S |
Hu, Junjie | The Chinese University of Hong Kong, Shenzhen |
Deng, Fuqin | Shenzhen Institute of Artificial Intelligence and Robotics for S |
Lam, Tin Lun | The Chinese University of Hong Kong, Shenzhen |
Keywords: Multi-Robot Systems, Constrained Motion Planning, Reinforcement Learning
Abstract: This paper presents a decentralized cooperative motion planning approach for surface inspection of 3D structures with uncertainties in size, number, shape, and position, using multi-robot systems (MRS). Given that most existing works focus on surface inspection of single, fully known 3D structures, our motivation is two-fold: first, 3D structures separately distributed in 3D environments are complex, so the use of an MRS can intuitively facilitate inspection by fully exploiting sensors with different capabilities. Second, performing such tasks under uncertainty is a complicated and time-consuming process, because we need to explore, determine the size and shape of the 3D structures, and then plan surface-inspection paths. To overcome these challenges, we present a meta-learning approach that provides a decentralized planner for each robot to improve the exploration and surface inspection capabilities. The experimental results demonstrate that our method can outperform other methods by approximately 10.5%-27% in success rate and 70%-75% in inspection speed.
|
|
10:30-12:00, Paper WeAT4-CC.3 |
Decentralized Multi-Agent Trajectory Planning in Dynamic Environments with Spatiotemporal Occupancy Grid Maps |
|
Wu, Siyuan | Delft University of Technology |
Chen, Gang | Delft University of Technology |
Shi, Moji | Delft University of Technology |
Alonso-Mora, Javier | Delft University of Technology |
Keywords: Multi-Robot Systems, Motion and Path Planning, Distributed Robot Systems
Abstract: This paper proposes a decentralized trajectory planning framework for the collision avoidance problem of multiple micro aerial vehicles (MAVs) in environments with static and dynamic obstacles. The framework utilizes spatiotemporal occupancy grid maps (SOGM), which forecast the occupancy status of neighboring space in the near future, as the environment representation. Based on this representation, we extend the kinodynamic A* and the corridor-constrained trajectory optimization algorithms to efficiently tackle static and dynamic obstacles with arbitrary shapes. Collision avoidance between communicating robots is integrated by sharing planned trajectories and projecting them onto the SOGM. The simulation results show that our method achieves competitive performance against state-of-the-art methods in dynamic environments with different numbers and shapes of obstacles. Finally, the proposed method is validated in real experiments.
|
|
10:30-12:00, Paper WeAT4-CC.4 |
Communicating Intent As Behaviour Trees for Decentralised Multi-Robot Coordination |
|
Hull, Rhett | University of Technology Sydney |
Moratuwage, Diluka Prasanjith | University of Technology Sydney |
Scheide, Emily | Oregon State University |
Fitch, Robert | University of Technology Sydney |
Best, Graeme | University of Technology Sydney |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: We propose a decentralised multi-robot coordination algorithm that features a rich representation for encoding and communicating each robot’s intent. This representation for “intent messages” enables improved coordination behaviour and communication efficiency in difficult scenarios, such as those where there are unknown points of contention that require negotiation between robots. Each intent message is an adaptive policy that conditions on identified points of contention that conflict with the intentions of other robots. These policies are concisely expressed as behaviour trees via algebraic logic simplification, and are interpretable by robot teammates and human operators. We propose this intent representation in the context of the Dec-MCTS online planning algorithm for decentralised coordination. We present results for a generalised multi-robot orienteering domain that show improved plan convergence and coordination performance over standard Dec-MCTS enabled by the intent representation’s ability to encode and facilitate negotiation over points of contention.
|
|
10:30-12:00, Paper WeAT4-CC.5 |
Partial Belief Space Planning for Scaling Stochastic Dynamic Games |
|
Vakil, Kamran | Boston University |
Coffey, Mela | Boston University |
Pierson, Alyssa | Boston University |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty
Abstract: This paper presents a method to reduce computations for stochastic dynamic games with game-theoretic belief space planning through partially propagating beliefs. Complex interactions in scenarios such as surveillance, herding, and racing can be modeled using game-theoretic frameworks in the belief space. Stochastic dynamic games can be solved to a local Nash Equilibrium using a game-theoretic belief space variant of the iterative Linear Quadratic Gaussian (iLQG) algorithm. However, the scalability of this method suffers due to the large dimensionality of the beliefs that the iLQG must propagate. We examine the utility of partial belief space propagation, which reduces the polynomial runtime. We validate our findings through simulations and hardware implementation.
|
|
10:30-12:00, Paper WeAT4-CC.6 |
Decentralized Multi-Agent Active Search and Tracking When Targets Outnumber Agents |
|
Banerjee, Arundhati | Carnegie Mellon University |
Schneider, Jeff | Carnegie Mellon University |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Localization
Abstract: Multi-agent multi-target tracking has a wide range of applications, including wildlife patrolling, security surveillance, and environment monitoring. Such algorithms often make restrictive assumptions: the number of targets and/or their initial locations may be assumed known, or agents may be pre-assigned to monitor disjoint partitions of the environment, reducing the burden of exploration. This also limits applicability when there are fewer agents than targets, since agents are unable to continuously follow the targets in their fields of view. Multi-agent tracking algorithms additionally assume inter-agent synchronization of observations, or the presence of a central controller to coordinate joint actions. Instead, we focus on the setting of decentralized multi-agent, multi-target, simultaneous active search-and-tracking with asynchronous inter-agent communication. Our proposed algorithm DecSTER uses a sequential Monte Carlo implementation of the Probability Hypothesis Density filter for posterior inference combined with Thompson sampling for decentralized multi-agent decision making. We compare different action selection policies, focusing on scenarios where targets outnumber agents. In simulation, we demonstrate that DecSTER is robust to unreliable inter-agent communication and outperforms information-greedy baselines in terms of the Optimal Sub-Pattern Assignment (OSPA) metric for different numbers of targets and varying team sizes.
|
|
10:30-12:00, Paper WeAT4-CC.7 |
Multi-Robot Autonomous Exploration and Mapping under Localization Uncertainty with Expectation-Maximization |
|
Huang, Yewei | Stevens Institute of Technology |
Lin, Xi | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Reactive and Sensor-Based Planning
Abstract: We propose an autonomous exploration algorithm designed for decentralized multi-robot teams, which takes into account map and localization uncertainties of range-sensing mobile robots. Virtual landmarks are used to quantify the combined impact of process noise and sensor noise on map uncertainty. Additionally, we employ an iterative expectation-maximization inspired algorithm to assess the potential outcomes of both a local robot’s and its neighbors’ next-step actions. To evaluate the effectiveness of our framework, we conduct a comparative analysis with state-of-the-art algorithms. The results of our experiments show the proposed algorithm’s capacity to strike a balance between curbing map uncertainty and achieving efficient task allocation among robots.
|
|
10:30-12:00, Paper WeAT4-CC.8 |
Optimal Task Allocation for Heterogeneous Multi-Robot Teams with Battery Constraints |
|
Calvo, Álvaro | University of Seville |
Capitan, Jesus | University of Seville |
Keywords: Multi-Robot Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: This paper presents a novel approach to optimal multi-robot task allocation in heterogeneous teams of robots. When robots have heterogeneous capabilities and there are diverse objectives and constraints to comply with, computing optimal plans can become especially hard. Moreover, we increase the problem complexity by: 1) considering battery-limited robots that need to schedule recharges; 2) tasks that can be decomposed into multiple fragments; and 3) multi-robot tasks that need to be executed by a coalition synchronously. We define a new problem for heterogeneous multi-robot task allocation and formulate it as a Mixed-Integer Linear Program that includes all the aforementioned features. Then we use an off-the-shelf solver to show the type of optimal solutions that our planner can produce and assess its performance in random scenarios. Our method, which is released as open-source code, represents a first step to formalize and analyze a complex problem that has not been solved in the state of the art.
|
|
10:30-12:00, Paper WeAT4-CC.9 |
Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot Task Allocation |
|
Paul, Steve | University of Connecticut |
Maurer, Nathan | University at Buffalo |
Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Multi-Robot Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: Most real-world Multi-Robot Task Allocation (MRTA) problems require fast and efficient decision-making, which is often achieved using heuristics-aided methods such as genetic algorithms, auction-based methods, and bipartite graph matching methods. These methods often assume a form that lends better explainability compared to an end-to-end (learnt) neural network based policy for MRTA. However, deriving suitable heuristics can be tedious, risky, and in some cases impractical if problems are too complex. This raises the question: can these heuristics be learned? To this end, this paper develops a Graph Reinforcement Learning (GRL) framework to learn the heuristics or incentives for a bipartite graph matching approach to MRTA. Specifically, a Capsule Attention policy model is used to learn how to weight task/robot pairings (edges) in the bipartite graph that connects the set of tasks to the set of robots. The original capsule attention network architecture is fundamentally modified by adding an encoding of the robots' state graph, and two Multihead Attention based decoders whose outputs are used to construct a LogNormal distribution matrix from which positive bigraph weights can be drawn. The performance of this new bigraph matching approach augmented with a GRL-derived incentive is found to be on par with the original bigraph matching approach that used expert-specified heuristics, with the former offering notable robustness benefits. During training, the learned incentive policy is found to initially get closer to the expert-specified incentive and then slightly deviate from its trend.
|
|
WeAT5-CC Oral Session, CC-411 |
Visual Perception and Learning I |
|
|
Chair: Najjaran, Homayoun | University of Victoria |
Co-Chair: Ravendran, Ahalya | The Commonwealth Scientific and Industrial Research Organisation |
|
10:30-12:00, Paper WeAT5-CC.1 |
Bag of Views: An Appearance-Based Approach to Next-Best-View Planning for 3D Reconstruction |
|
Hatami Gazani, Sara | University of Victoria |
Tucsok, Matthew | University of British Columbia |
Mantegh, Iraj | National Research Council Canada |
Najjaran, Homayoun | University of Victoria |
Keywords: Computer Vision for Automation, Aerial Systems: Perception and Autonomy, Reactive and Sensor-Based Planning
Abstract: UAV-based intelligent data acquisition for 3D reconstruction and monitoring of infrastructure has experienced an increasing surge of interest due to recent advancements in image processing and deep learning-based techniques. View planning is an essential part of this task that dictates the information capture strategy and heavily impacts the quality of the 3D model generated from the captured data. Recent methods have used prior knowledge or partial reconstruction of the target to accomplish view planning for active reconstruction; the former approach poses a challenge for complex or newly identified targets while the latter is computationally expensive. In this work, we present Bag-of-Views (BoV), a fully appearance-based model used to assign utility to the captured views for both offline dataset refinement and online next-best-view (NBV) planning applications targeting the task of 3D reconstruction. With this contribution, we also developed the View Planning Toolbox (VPT), a lightweight package for training and testing machine learning-based view planning frameworks, custom view dataset generation of arbitrary 3D scenes, and 3D reconstruction. Through experiments which pair a BoV-based reinforcement learning model with VPT, we demonstrate the efficacy of our model in reducing the number of required views for high-quality reconstructions in dataset refinement and NBV planning.
|
|
10:30-12:00, Paper WeAT5-CC.2 |
See through the Real World Haze Scenes: Navigating the Synthetic-To-Real Gap in Challenging Image Dehazing |
|
Chen, Shijie | Fudan University |
Mahdizadeh, Mohammad | Fudan University |
Yu, Chong | Fudan University & NVIDIA |
Fan, Jiayuan | Fudan University |
Chen, Tao | Fudan University |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Computer Vision for Transportation
Abstract: Dehazing and enhancing visibility in real-world hazy images pose significant challenges due to the physical complexity of haze, the variability in haze conditions, the tendency to capture more details in noisy scenes, and the risk of overexposure. Many existing single RGB image dehazing methods tend to perform well in synthetic hazy scenarios but struggle in real-world situations. This stems from the fact that these methods often rely solely on deep learning techniques or classical approaches. In addition, they neglect overall image quality improvement. To partially address these challenges, we introduce an innovative approach that harnesses the strengths of both paradigms to dehaze and enhance visibility in a single real-world hazy RGB image. First, both low-level and deep features are extracted, and then a pre-trained vector quantization GAN is employed to create a discrete codebook of well-detailed data patches. A decoder component, enhanced with a normalized module, effectively utilizes these high-quality features to produce clear results. Additionally, a controllable operation is introduced to improve feature matching. To further enhance dehazing and generalizability, the decoder's output undergoes a sequence of gamma-correction operations and generates a sequence of multi-exposure images that are combined to create a haze-free, visually pleasing, and higher-quality final image. The method effectively reduces haziness, enhances sharpness, preserves natural colors, and minimizes artifacts in challenging real-world scenarios. The proposed approach surpasses five SOTA methods in both qualitative and quantitative evaluations across three key metrics, utilizing two real-world and three synthetic hazy image datasets. Notably, it achieves a substantial improvement in real-world datasets over the second-best method, with gains of 0.5702 and 0.129 in FADE metrics for the RTTS and Fattal datasets, respectively.
|
|
10:30-12:00, Paper WeAT5-CC.3 |
CopperTag: A Real-Time Occlusion-Resilient Fiducial Marker |
|
Bian, Xu | Xi’an Jiaotong University |
Chen, Wenzhao | Youibot Robotics Co., Ltd |
Tian, Xiaoyu | Carnegie Mellon University |
Ran, Donglai | Youibot Robotics Co., Ltd |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Industrial Robots
Abstract: Fiducial markers, like AprilTag and ArUco, are extensively utilized in robotics applications within industrial environments, encompassing navigation, docking, and object grasping tasks. However, in contrast to controlled laboratory conditions, markers installed on factory floors or equipment surfaces often face challenges like damage or contamination. These issues can lead to compromised marker integrity, resulting in reduced detection reliability. To address this challenge, we propose a novel fiducial marker called CopperTag, which incorporates circular and square elements to create a robust occlusion-resistant pattern. The CopperTag detection process relies on three fundamental steps: firstly, extracting all lines from the image; secondly, identifying corners; and lastly, searching for quadrilateral candidate regions using ellipses and nearby corners. The Reed-Solomon (RS) algorithm is utilized for both encoding and decoding the information content. This algorithm possesses the ability to recover corrupted messages in situations where CopperTag data is incomplete. The experimental results illustrate that CopperTag exhibits superior robustness and accuracy in detection when compared to other state-of-the-art fiducial markers, even in scenarios with heavy occlusion. Moreover, CopperTag maintains an average processing time of 10ms per frame on a standard laptop, effectively meeting the real-time demands of robotics applications.
|
|
10:30-12:00, Paper WeAT5-CC.4 |
Robust Collaborative Perception without External Localization and Clock Devices |
|
Lei, Zixing | Shanghai Jiao Tong University |
Ni, Zhenyang | Shanghai Jiao Tong University |
Han, Ruize | Chinese Academy of Sciences |
Tang, Shuo | Shanghai Jiao Tong University |
Feng, Chen | New York University |
Chen, Siheng | Shanghai Jiao Tong University |
Wang, Yanfeng | Shanghai Jiao Tong University |
Keywords: Computer Vision for Automation, Computer Vision for Transportation, Deep Learning for Visual Perception
Abstract: Consistent spatial-temporal coordination across multiple agents is fundamental for collaborative perception, which seeks to improve perception abilities through information exchange among agents. To achieve this spatial-temporal alignment, traditional methods depend on external devices to provide localization and clock signals. However, hardware-generated signals could be vulnerable to noise and potentially malicious attacks, jeopardizing the precision of spatial-temporal alignment. Rather than relying on external hardware, this work proposes a novel approach: aligning by recognizing the inherent geometric patterns within the perceptual data of various agents. Following this spirit, we propose a robust collaborative perception system that operates independently of external localization and clock devices. The key module of our system, FreeAlign, constructs a salient object graph for each agent based on its detected boxes and uses a graph neural network to identify common subgraphs between agents, leading to accurate relative pose and time. We validate FreeAlign on both real-world and simulated datasets. The results show that the FreeAlign-empowered robust collaborative perception system performs comparably to systems relying on precise localization and clock devices. We will release code related to this work.
|
|
10:30-12:00, Paper WeAT5-CC.5 |
DerainNeRF: 3D Scene Estimation with Adhesive Waterdrop Removal |
|
Li, Yunhao | Westlake University |
Wu, Jing | Westlake University |
Zhao, Lingzhe | Westlake University |
Liu, Peidong | Westlake University |
Keywords: Computer Vision for Automation, Computer Vision for Transportation, Visual Learning
Abstract: When capturing images through glass during rainy or snowy weather conditions, the resulting images often contain waterdrops adhered to the glass surface, and these waterdrops significantly degrade the image quality and performance of many computer vision algorithms. To tackle these limitations, we propose a method to reconstruct the clear 3D scene implicitly from multi-view images degraded by waterdrops. Our method exploits an attention network to predict the location of waterdrops and then trains a Neural Radiance Field to recover the 3D scene implicitly. By leveraging the strong scene representation capabilities of NeRF, our method can render high-quality novel-view images with waterdrops removed. Extensive experimental results on both synthetic and real datasets show that our method is able to generate clear 3D scenes and outperforms existing state-of-the-art (SOTA) image adhesive waterdrop removal methods.
|
|
10:30-12:00, Paper WeAT5-CC.6 |
Learning Interaction Regions and Motion Trajectories Simultaneously from Egocentric Demonstration Videos |
|
Xin, Jianjia | Beijing University of Technology |
Wang, Lichun | Beijing University of Technology |
Xu, Kai | Beijing University of Technology |
Yang, Chao | Beijing University of Technology |
Yin, Baocai | Beijing University of Technology |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: Learning to interact with objects is significant for robots to integrate into human environments. When the interaction semantic is definite, manually guiding the manipulator is a commonly used method to teach robots how to interact with objects. However, the learning results are robot-dependent because the mechanical parameters differ between robots, which means the learning process must be executed again. Moreover, during the manual guiding process, operators are responsible for recognizing the region being contacted and providing expert motion programming, which limits the robot's intelligence. To improve the degree of automation for robots interacting with objects, this paper proposes IRMT-Net (Interaction Region and Motion Trajectory prediction Network) to predict the interaction region and motion trajectory simultaneously based on images. IRMT-Net achieves state-of-the-art interaction region prediction results on the Epic-Kitchens dataset, generates reasonable motion trajectories, and can support robot interaction in actual situations.
|
|
10:30-12:00, Paper WeAT5-CC.7 |
Marrying NeRF with Feature Matching for One-Step Pose Estimation |
|
Chen, Ronghan | Shenyang Institute of Automation, Chinese Academy of Sciences |
Cong, Yang | Chinese Academy of Science, China |
Ren, Yu | Shenyang Institute of Automation Chinese Academy of Sciences |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Localization
Abstract: Given the image collection of an object, we aim at building a real-time image-based pose estimation method, which requires neither its CAD model nor hours of object-specific training. Recent NeRF-based methods provide a promising solution by directly optimizing the pose from pixel loss between rendered and target images. However, during inference, they require long converging time, and suffer from local minima, making them impractical for real-time robot applications. We aim at solving this problem by marrying image matching with NeRF. With 2D matches and depth rendered by NeRF, we directly solve the pose in one step by building 2D-3D correspondences between target and initial view, thus allowing for real-time prediction. Moreover, to improve the accuracy of 2D-3D correspondences, we propose a 3D consistent point mining strategy, which effectively discards unfaithful points reconstructed by NeRF. Additionally, current NeRF-based methods that naively optimize pixel loss fail on occluded images. Thus, we further propose a 2D matches based sampling strategy to preclude the occluded area. Experimental results on representative datasets prove that our method outperforms state-of-the-art methods, and improves inference efficiency by 90x, achieving real-time prediction at 6 FPS.
|
|
10:30-12:00, Paper WeAT5-CC.8 |
Occluded Part-Aware Graph Convolutional Networks for Skeleton-Based Action Recognition |
|
Kim, Min Hyuk | Chonnam National University |
Kim, Min Ju | Chonnam National University |
Yoo, Seok Bong | Chonnam National University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Recognition
Abstract: Recognizing human action is one of the most critical factors in the visual perception of robots. Specifically, skeleton-based action recognition has been actively researched to enhance recognition performance at a lower cost. However, action recognition in occlusion situations, where body parts are not visible, is still challenging. We propose an occluded part-aware graph convolutional network (OP-GCN) to address this challenge using the optimal occluded body parts. The proposed model uses an occluded part detector to identify occluded body parts within a human skeleton. It is based on an autoencoder trained on a nonoccluded human skeleton and exploits the symmetry and angular information of the skeleton. Then, we select an optimal group constructed considering the occluded body parts. Each group comprises five sets of joint nodes, focusing on the body parts, excluding the occluded ones. Finally, to enhance interaction within the selected groups, we apply an interpart association module, considering the fusion of global and local elements. The experimental results reveal that the proposed model outperforms others on the occluded datasets. These comparative experiments demonstrate the effectiveness of the study in addressing the challenge of action recognition in occlusion situations. Our code is publicly available at https://github.com/MJ-Kor/OP-GCN.
|
|
10:30-12:00, Paper WeAT5-CC.9 |
MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation |
|
Dong, Yuejiang | Tsinghua University |
Zhang, Fang-Lue | Victoria University of Wellington |
Zhang, Song-Hai | Tsinghua University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, RGB-D Perception
Abstract: Depth perception is crucial for a wide range of robotic applications. Multi-frame self-supervised depth estimation methods have gained research interest due to their ability to leverage large-scale, unlabeled real-world data. However, the self-supervised methods often rely on the assumption of a static scene and their performance tends to degrade in dynamic environments. To address this issue, we present Motion-Aware Loss (MAL), which leverages the temporal relation among consecutive input frames and a novel distillation scheme between the teacher and student networks in multi-frame self-supervised depth estimation methods. Specifically, we associate the spatial locations of moving objects with the temporal order of input frames to eliminate errors induced by object motion. Meanwhile, we enhance the original distillation scheme in multi-frame methods to better exploit the knowledge from a teacher network. MAL is a novel, plug-and-play module designed for seamless integration into multi-frame self-supervised monocular depth estimation methods. Adding MAL into previous state-of-the-art methods leads to a reduction in depth estimation errors by up to 4.2% and 10.8% on the KITTI and CityScapes benchmarks, respectively.
|
|
WeAT6-CC Oral Session, CC-414 |
Visual Servoing |
|
|
Chair: Valada, Abhinav | University of Freiburg |
Co-Chair: Loianno, Giuseppe | New York University |
|
10:30-12:00, Paper WeAT6-CC.1 |
Stereo Image-Based Visual Servoing towards Feature-Based Grasping |
|
Enyedy, Albert | Worcester Polytechnic Institute |
Aswale, Ashay | Worcester Polytechnic Institute |
Calli, Berk | Worcester Polytechnic Institute |
Gennert, Michael | Worcester Polytechnic Institute |
Keywords: Visual Servoing, Grasping, Humanoid Robot Systems
Abstract: This paper presents an image-based visual servoing scheme that can control robotic manipulators in 3D space using 2D stereo images without needing to perform stereo reconstruction. We use a stereo camera in an eye-to-hand configuration for controlling the robot to reach target positions by directly mapping image space errors to joint space actuation. We achieve convergence without a priori knowledge of the target object, a reference 2D image, or 3D data. By doing so, we can reach targets in unstructured environments using high-resolution RGB images instead of utilizing relatively noisy depth data. We conduct several experiments on two different physical robots. The Panda 7DOF arm grasps a static target in 3D space, grasps a pitcher handle, and picks and places a box by determining the approach angle using 2D image features, demonstrating that this algorithm can be used for grasping practical objects in 3D space using only 2D image features for feedback. Our second platform, the Atlas humanoid robot, reaches a target from an unknown starting configuration, demonstrating that this controller achieves convergence to a target, even with the uncertainties introduced by walking to a new location. We believe that this algorithm is a step towards enabling intuitive interfaces that allow a user to initiate a grasp on an object by specifying a grasping point in a 2D image.
|
|
10:30-12:00, Paper WeAT6-CC.2 | Add to My Program |
Visual Feedback Control of an Underactuated Hand for Grasping Brittle and Soft Foods |
|
Kai, Ryogo | Chuo University |
Isobe, Yuzuka | Chuo University |
Pathak, Sarthak | Chuo University |
Umeda, Kazunori | Chuo University |
Keywords: Visual Servoing, Grasping, Underactuated Robots
Abstract: This paper presents a novel method to control an underactuated hand using only a monocular camera, without any internal sensors. In food factories, robots are required to handle a wide variety of foods without damaging them. Underactuated hands are effective for this purpose because they can adapt to various food shapes. However, internal sensors such as tactile and force sensors in underactuated hands may cause hygiene problems and require complicated calibration. If external sensors such as cameras are used instead, foods must be grasped without damage using only external information such as images. In our method, a camera is used as the sole external sensor to tackle these problems. First, contact between the hand and the object is detected using the contours of both, obtained from a camera image. Then, to avoid damaging the object, the following information is extracted from camera images and monitored: the centroids of the hand and the object, the deformation of the object, and the occlusion rate of the hand. Furthermore, to prevent the object from dropping while the robotic arm is in motion, the distance between the centroids of the hand and the object is calculated. Experiments were conducted using twelve different food items.
|
|
10:30-12:00, Paper WeAT6-CC.3 | Add to My Program |
Compositional Servoing by Recombining Demonstrations |
|
Argus, Maximilian | University of Freiburg |
Nayak, Abhijeet | University of Freiburg |
Büchner, Martin | University of Freiburg |
Galesso, Silvio | University of Freiburg |
Valada, Abhinav | University of Freiburg |
Brox, Thomas | University of Freiburg |
Keywords: Visual Servoing, Manipulation Planning, Learning from Demonstration
Abstract: Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, but also enables multitask capability based on a few task-specific demonstrations. We construct demonstration graphs by splitting existing demonstrations and recombining them. In order to traverse the demonstration graph in the inference case, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher task-respective success. We present extensive simulation and real-world experimental results that demonstrate the efficacy of our approach.
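The graph-traversal formulation described above reduces to a standard shortest-path search once demonstrations are split into segments and edges are weighted. The sketch below is a hypothetical illustration (node names and costs are invented); it assumes edge costs have already been produced by a similarity function like the one the abstract mentions.

```python
import heapq

def shortest_demo_path(graph, start, goal):
    """Dijkstra search over a demonstration graph.

    graph: {node: [(neighbor, cost), ...]} where nodes stand for
    demonstration segments and costs come from a similarity function
    (assumed given). Returns the cheapest node sequence start -> goal.
    """
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == goal:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Reconstruct the path by walking the predecessor links backwards.
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1]
```

At inference time, selecting the best demonstration for a new task then amounts to picking the start node whose path to the goal is cheapest.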
|
|
10:30-12:00, Paper WeAT6-CC.4 | Add to My Program |
Second-Order Position-Based Visual Servoing of a Robot Manipulator |
|
Godinho Ribeiro, Eduardo | University of São Paulo |
de Queiroz Mendes, Raul | Eindhoven University of Technology (TU/e) |
Terra, Marco Henrique | University of São Paulo |
Grassi Junior, Valdir | University of São Paulo |
Keywords: Visual Servoing, Motion Control
Abstract: Visual Servoing is an established approach for controlling robots using visual feedback. Most controllers in this domain generate velocity control signals to guide the cameras to desired positions and orientations. However, the dynamic characteristics of conventional visual servoing controllers may be unsatisfactory, and the velocity signal itself hinders the connection between the feature velocity model and the robot's dynamics. Consequently, research has explored models incorporating the second-order derivative of features and the robot's acceleration. The current state-of-the-art techniques mainly focus on image-based visual servoing, which deals with feature errors in the image domain. In this work, we propose an acceleration-based controller for the position-based visual servoing framework, which models the error in Cartesian space. Our approach involves extracting an acceleration control signal from the traditional velocity-based controller. To achieve this, we redefine the camera orientation using quaternions, generate new interaction matrices, and conduct comprehensive comparative experiments in simulated and real robot scenarios. We show that our method provides better dynamic properties in both image and Cartesian spaces, superior tracking performance, and less sensitivity to noise compared to velocity controllers.
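One way to read the abstract's core idea: rather than commanding the classical velocity law v* = −λ e directly, a second-order controller tracks it with an acceleration signal. The sketch below is a hypothetical toy version; the gains and the 6D error parameterization (translation plus a rotation-vector term, standing in for the paper's quaternion formulation) are illustrative, not the paper's.

```python
import numpy as np

def pbvs_acceleration(pose_err, vel, lam=1.0, kv=2.0):
    """Derive an acceleration command from a velocity-based PBVS law.

    pose_err: 6D Cartesian error (translation + rotation-vector part).
    vel: current 6D Cartesian velocity of the end effector.
    The velocity law v* = -lam * pose_err is turned into an
    acceleration tracking term a = kv * (v* - vel).
    """
    v_star = -lam * pose_err       # classical velocity-based PBVS command
    return kv * (v_star - vel)     # acceleration that tracks it
```

The benefit claimed in the abstract (better dynamic properties, less noise sensitivity) comes from shaping this second-order response instead of feeding raw velocity commands to the joint controllers.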
|
|
10:30-12:00, Paper WeAT6-CC.5 | Add to My Program |
Event-Triggered Image Moments Predictive Control for Tracking Evolving Features Using UAVs |
|
Aspragkathos, Sotiris | NTUA |
Karras, George | University of Thessaly |
Kyriakopoulos, Kostas | New York University - Abu Dhabi |
Keywords: Visual Servoing, Visual Tracking, Sensor-based Control
Abstract: This paper presents a novel approach for tracking deformable contour targets using Unmanned Aerial Vehicles (UAVs). The proposed scheme combines image moments descriptor and event-triggered Nonlinear Model Predictive Control (NMPC) for efficient and accurate tracking. The deformable contour model allows adaptation to the evolving target's shape, while the proposed event-triggered scheme achieves improved computational efficiency and extended flight duration while generating new control sequences for the UAV. Real-world experiments validate the scheme, showcasing its robustness in handling complex scenarios. This approach holds promise for various applications, such as surveillance and autonomous navigation.
|
|
10:30-12:00, Paper WeAT6-CC.6 | Add to My Program |
Lattice-Based Shape Tracking and Servoing of Elastic Objects |
|
Shetab-Bushehri, Mohammadreza | Université Clermont Auvergne, Institut Pascal |
Aranda, Miguel | Universidad De Zaragoza |
Mezouar, Youcef | Clermont Auvergne INP - SIGMA Clermont |
Ozgur, Erol | SIGMA-Clermont / Institut Pascal |
Keywords: Visual Servoing, Visual Tracking, Sensor-based Control, Manipulation of Deformable Objects
Abstract: In this paper, we propose a general unified tracking-servoing approach for controlling the shape of elastic deformable objects using robotic arms. Our approach works by forming a lattice around the object, binding the object to the lattice, and tracking and servoing the lattice instead of the object. This makes our approach have full control over the deformation of elastic deformable objects of any general form (linear, thin-shell, volumetric) in 3D space. Furthermore, it decouples the runtime complexity of the approach from the objects’ geometric complexity. Our approach is based on the As-Rigid-As-Possible (ARAP) deformation model. It requires no mechanical parameter of the object to be known and can drive the object toward desired shapes through large deformations. The inputs to our approach are the point cloud of the object’s surface in its rest shape and the point cloud captured by a 3D camera in each frame. Overall, our approach is more broadly applicable than existing approaches. We validate the efficiency of our approach through numerous experiments with elastic deformable objects of various shapes and materials (paper, rubber, plastic, foam).
|
|
10:30-12:00, Paper WeAT6-CC.7 | Add to My Program |
DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs |
|
Zhu, Jiawen | Dalian University of Technology |
Tang, Huayi | Dalian University of Technology |
Cheng, Zhi-Qi | Carnegie Mellon University |
He, Jun-yan | Alibaba Group |
Luo, Bin | Alibaba Group |
Qiu, Shihao | Dalian University of Technology |
Li, Shengming | Dalian University of Technology |
Lu, Huchuan | Dalian University of Technology |
Keywords: Visual Tracking
Abstract: Existing nighttime unmanned aerial vehicle (UAV) trackers follow an “Enhance-then-Track” architecture: first using a light enhancer to brighten the nighttime video, then employing a daytime tracker to locate the object. This separate enhancement and tracking fails to build an end-to-end trainable vision system. To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts. Without a separate enhancer, DCPT directly encodes anti-dark capabilities into prompts using a darkness clue prompter (DCP). Specifically, DCP iteratively learns emphasizing and undermining projections for darkness clues. It then injects these learned visual prompts into a daytime tracker with fixed parameters across transformer layers. Moreover, a gated feature aggregation mechanism enables adaptive fusion between prompts and between prompts and the base model. Extensive experiments show state-of-the-art performance for DCPT on multiple dark scenario benchmarks. The unified end-to-end learning of enhancement and tracking in DCPT enables a more trainable system. The darkness clue prompting efficiently injects anti-dark knowledge without extra modules. Code is available at https://github.com/bearyi26/DCPT.
|
|
10:30-12:00, Paper WeAT6-CC.8 | Add to My Program |
Unifying Foundation Models with Quadrotor Control for Visual Tracking Beyond Object Categories |
|
Saviolo, Alessandro | New York University |
Rao, Pratyaksh | New York University |
Radhakrishnan, Vivek | Technology Innovation Institute, New York University |
Xiao, Jiuhong | New York University |
Loianno, Giuseppe | New York University |
Keywords: Visual Tracking, Aerial Systems: Applications, Vision-Based Navigation
Abstract: Visual control enables quadrotors to adaptively navigate using real-time sensory data, bridging perception with action. Yet, challenges persist, including generalization across scenarios, maintaining reliability, and ensuring real-time responsiveness. This paper introduces a perception framework grounded in foundational models for universal object detection and tracking, moving beyond specific training categories. Integral to our approach is a multi-layered tracker integrated with the foundational detector, ensuring continuous target visibility, even when faced with motion blur, abrupt light shifts, and occlusions. Complementing this, we introduce a model-free controller tailored for resilient quadrotor visual tracking. Our system operates efficiently on limited hardware, relying solely on an onboard camera and an inertial measurement unit. Through extensive validation in diverse challenging indoor and outdoor environments, we demonstrate our system's effectiveness and adaptability. In conclusion, our research represents a step forward in quadrotor visual tracking, moving from task-specific methods to more versatile and adaptable operations.
|
|
10:30-12:00, Paper WeAT6-CC.9 | Add to My Program |
DroneMOT: Drone-Based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects |
|
Wang, Peng | Renmin University of China |
Wang, Yongcai | Renmin University of China |
Li, Deying | Renmin University of China |
Keywords: Visual Tracking, Aerial Systems: Perception and Autonomy, Human Detection and Tracking
Abstract: Multi-object tracking (MOT) on static platforms, such as surveillance cameras, has achieved significant progress, with various paradigms providing attractive performance. However, the effectiveness of traditional MOT methods is significantly reduced on dynamic platforms like drones. This decrease is attributed to the distinctive challenges of the MOT-on-drone scenario: (1) objects are generally small in the image plane, often blurred, and frequently occluded, making them challenging to detect and recognize; (2) drones move and see objects from different angles, making the predicted positions and feature embeddings of the objects unreliable. This paper proposes DroneMOT, which first introduces a Dual-Domain integrated Attention (DIA) module that accounts for the fast movements of drones to enhance drone-based object detection and feature embedding for small, blurred, and occluded objects. Then, an innovative Motion-Driven Association (MDA) scheme is introduced, considering the concurrent movements of both the drone and the objects. Within MDA, an Adaptive Feature Synchronization (AFS) technique is presented to update the object features seen from different angles. Additionally, a Dual Motion-based Prediction (DMP) method is employed to forecast the object positions. Finally, both the refined feature embeddings and the predicted positions are integrated to enhance the object association. Comprehensive evaluations on the VisDrone2019-MOT and UAVDT datasets show that DroneMOT provides substantial performance improvements over the state of the art in the domain of MOT on drones.
|
|
WeAT7-CC Oral Session, CC-416 |
Add to My Program |
Learning in Planning |
|
|
Chair: Zhao, Ding | Carnegie Mellon University |
Co-Chair: Hamaya, Masashi | OMRON SINIC X Corporation |
|
10:30-12:00, Paper WeAT7-CC.1 | Add to My Program |
Human-Robot Gym: Benchmarking Reinforcement Learning in Human-Robot Collaboration |
|
Thumm, Jakob | Technical University of Munich |
Trost, Felix | Technical University of Munich |
Althoff, Matthias | Technische Universität München |
Keywords: Reinforcement Learning, Human-Robot Collaboration, Safety in HRI
Abstract: Deep reinforcement learning (RL) has shown promising results in robot motion planning with first attempts in human-robot collaboration (HRC). However, a fair comparison of RL approaches in HRC under the constraint of guaranteed safety is yet to be made. We, therefore, present human-robot gym, a benchmark suite for safe RL in HRC. We provide challenging, realistic HRC tasks in a modular simulation framework. Most importantly, human-robot gym is the first benchmark suite that includes a safety shield to provably guarantee human safety. This bridges a critical gap between theoretical RL research and its real-world deployment. Our evaluation of six tasks led to three key results: (a) the diverse nature of the tasks offered by human-robot gym creates a challenging benchmark for state-of-the-art RL methods, (b) by leveraging expert knowledge in the form of an action-imitation reward, the RL agent can outperform the expert, and (c) our agents negligibly overfit to training data.
|
|
10:30-12:00, Paper WeAT7-CC.2 | Add to My Program |
Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning Based Local Motion Planners |
|
Ng, Wen Zheng Terence | Nanyang Technological University |
Chen, Jianda | Nanyang Technological University |
Pan, Sinno Jialin | The Chinese University of Hong Kong |
Zhang, Tianwei | Nanyang Technological University |
Keywords: Reinforcement Learning, Collision Avoidance, Machine Learning for Robot Control
Abstract: Deploying a safe mobile robot policy in scenarios with human pedestrians is challenging due to their unpredictable movements. Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements and can suffer from overfitting. Alternatively, framing the collision avoidance problem as a multi-agent framework, where agents generate dynamic movements while learning to reach their goals, can lead to conflicts with human pedestrians due to the agents' homogeneity. To tackle this problem, we introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. This diversity enriches each agent's experiences, improving its adaptability to unseen crowd behaviors. In assessing an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors. Our behavior-conditioned policies outperform existing works in these challenging scenes, reducing potential collisions without additional time or travel.
|
|
10:30-12:00, Paper WeAT7-CC.3 | Add to My Program |
Human-Aligned Longitudinal Control for Occluded Pedestrian Crossing with Visual Attention |
|
Asodia, Vinal | University of Surrey |
Feng, Zhenhua | University of Surrey |
Fallah, Saber | University of Surrey |
Keywords: Reinforcement Learning, Collision Avoidance, Human-Centered Automation
Abstract: Reinforcement Learning (RL) has been widely used to create generalizable autonomous vehicles. However, these approaches rely on fixed reward functions that struggle to balance values like safety and efficiency. How can autonomous vehicles balance different driving objectives and human values in a constantly changing environment? To bridge this gap, we propose an adaptive reward function that utilizes visual attention maps to detect pedestrians in the driving scene and dynamically switch between prioritizing safety or efficiency depending on the current observation. The visual attention map is used to provide spatial attention to the RL agent to boost the training efficiency of the pipeline. We evaluate the pipeline against variants of an occluded pedestrian crossing scenario in the CARLA Urban Driving simulator. Specifically, the proposed pipeline is compared against a modular setup that combines the well-established object detection model, YOLO, with a Proximal Policy Optimization (PPO) agent. The results indicate that the proposed approach can compete with the modular setup while yielding greater training efficiency. The trajectories collected with the approach confirm the effectiveness of the proposed adaptive reward function.
|
|
10:30-12:00, Paper WeAT7-CC.4 | Add to My Program |
Projection-Based Fast and Safe Policy Optimization for Reinforcement Learning |
|
Lin, Shijun | University of Science and Technology of China |
Wang, Hao | University of Science and Technology of China |
Chen, Ziyang | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Reinforcement Learning, Task and Motion Planning
Abstract: While reinforcement learning (RL) attracts increasing research attention, maximizing the return while keeping the agent safe at the same time remains an open problem. Motivated to address this challenge, this work proposes a new Fast and Safe Policy Optimization (FSPO) algorithm, which consists of three steps: the first step performs a reward-improvement update, the second step projects the policy into the neighborhood of the baseline policy to accelerate the optimization process, and the third step addresses constraint violation by projecting the policy back onto the constraint set. Such a projection-based optimization can improve convergence and learning performance. Unlike many existing works that require convex approximations for the objectives and constraints, this work exploits a first-order method to avoid expensive computations and high-dimensional issues, enabling fast and safe policy optimization, especially for challenging tasks. Numerical simulation and physical experiments demonstrate that FSPO outperforms existing methods in terms of safety guarantees and task completion rate.
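The three-step structure of the abstract can be illustrated with a toy parameter-space update. The sketch below is a hypothetical simplification: the gradients, the L2 trust region, and the linearized cost constraint all stand in for the paper's actual estimators and projections.

```python
import numpy as np

def fspo_step(theta, baseline, reward_grad, cost_grad, cost_val,
              lr=0.1, radius=0.5, cost_limit=0.0):
    """One illustrative FSPO-style update (toy setting).

    Step 1: reward-improvement gradient ascent.
    Step 2: project onto an L2 ball of `radius` around the baseline policy.
    Step 3: if the linearized cost exceeds the limit, project back onto the
    half-space {theta : cost_val + cost_grad.(theta - baseline) <= cost_limit}.
    """
    # Step 1: reward improvement.
    theta = theta + lr * reward_grad
    # Step 2: trust-region projection around the baseline policy.
    delta = theta - baseline
    norm = np.linalg.norm(delta)
    if norm > radius:
        theta = baseline + delta * (radius / norm)
    # Step 3: first-order projection onto the linearized constraint set.
    viol = cost_val + cost_grad @ (theta - baseline) - cost_limit
    if viol > 0:
        theta = theta - viol * cost_grad / (cost_grad @ cost_grad)
    return theta
```

Because each projection is closed-form and first-order, no convex approximation of the objective or constraint is required, which is the computational point the abstract emphasizes.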
|
|
10:30-12:00, Paper WeAT7-CC.5 | Add to My Program |
Symmetry Considerations for Learning Task Symmetric Robot Policies |
|
Mittal, Mayank | ETH Zurich |
Rudin, Nikita | ETH Zurich, NVIDIA |
Klemm, Victor | ETH Zurich |
Allshire, Arthur | University of Toronto |
Hutter, Marco | ETH Zurich |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Symmetry is a fundamental aspect of many real-world robotic tasks. However, current deep reinforcement learning (DRL) approaches can seldom harness and exploit symmetry effectively. Often, the learned behaviors fail to achieve the desired transformation invariances and suffer from motion artifacts. For instance, a quadruped may exhibit different gaits when commanded to move forward or backward, even though it is symmetrical about its torso. This issue becomes further pronounced in high-dimensional or complex environments, where DRL methods are prone to local optima and fail to explore regions of the state space equally. Past methods on encouraging symmetry for robotic tasks have studied this topic mainly in a single-task setting, where symmetry usually refers to symmetry in the motion, such as the gait patterns. In this paper, we revisit this topic for goal-conditioned tasks in robotics, where symmetry lies mainly in task execution and not necessarily in the learned motions themselves. In particular, we investigate two approaches to incorporate symmetry invariance into DRL – data augmentation and mirror loss function. We provide a theoretical foundation for using augmented samples in an on-policy setting. Based on this, we show that the corresponding approach achieves faster convergence and improves the learned behaviors in various challenging robotic tasks, from climbing boxes with a quadruped to dexterous manipulation.
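The data-augmentation approach discussed above can be sketched as duplicating each on-policy transition with its mirrored counterpart. The helper below is a hypothetical illustration (the signed permutation matrices encoding the robot's symmetry are task-specific and assumed given; rewards are assumed symmetry-invariant).

```python
import numpy as np

def augment_with_symmetry(batch, obs_mirror, act_mirror):
    """Duplicate an on-policy batch with its mirrored counterpart.

    obs_mirror / act_mirror: signed permutation matrices encoding the
    robot's left-right symmetry (task-specific; assumed given).
    Rewards are invariant under the symmetry, so they are copied as-is.
    """
    obs, act, rew = batch
    m_obs = obs @ obs_mirror.T   # mirrored observations
    m_act = act @ act_mirror.T   # mirrored actions
    return (np.concatenate([obs, m_obs]),
            np.concatenate([act, m_act]),
            np.concatenate([rew, rew]))
```

The alternative the paper compares against, a mirror loss, would instead penalize the policy for disagreeing with its own mirrored predictions rather than enlarging the batch.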
|
|
10:30-12:00, Paper WeAT7-CC.6 | Add to My Program |
Learning Dual-Arm Object Rearrangement for Cartesian Robots |
|
Zhang, Shishun | National University of Defense Technology |
She, Qijin | National University of Defense Technology |
Li, Wenhao | National University of Defense Technology |
Zhu, Chenyang | National University of Defense Technology |
Wang, Yongjun | National University of Defense Technology |
Hu, Ruizhen | Shenzhen University |
Xu, Kai | National University of Defense Technology |
Keywords: Reinforcement Learning, Task and Motion Planning
Abstract: This work focuses on the dual-arm object rearrangement problem abstracted from a realistic industrial scenario of Cartesian robots. The goal of this problem is to transfer all the objects from sources to targets with the minimum total completion time. To achieve the goal, the core idea is to develop an effective object-to-arm task assignment strategy for minimizing the cumulative task execution time and maximizing the dual-arm cooperation efficiency. One of the difficulties in the task assignment is scalability. As the number of objects increases, the computation time of traditional offline-search-based methods grows rapidly due to their computational complexity. Encouraged by the adaptability of reinforcement learning (RL) in long-sequence task decisions, we propose an online task assignment decision method based on RL, whose computation time increases only linearly with the number of objects. Further, we design an attention-based network to model the dependencies between the input states during the whole task execution process to help find the most reasonable object-to-arm correspondence in each task assignment round. In the experimental part, we adapt some search-based methods to this specific setting and compare our method with them. Experimental results show that our approach outperforms search-based methods in total execution time and computational efficiency, and also verify the generalization of our method to different numbers of objects. In addition, we show the effectiveness of our method deployed on the real robot in the supplementary video.
|
|
10:30-12:00, Paper WeAT7-CC.7 | Add to My Program |
Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration |
|
Li, Jinning | University of California, Berkeley |
Liu, Xinyi | University of Michigan |
Zhu, Banghua | University of California, Berkeley |
Jiao, Jiantao | University of California, Berkeley |
Tomizuka, Masayoshi | University of California |
Tang, Chen | University of California Berkeley |
Zhan, Wei | University of California, Berkeley |
Keywords: Reinforcement Learning, Learning from Demonstration, Robot Safety
Abstract: Safe Reinforcement Learning (RL) aims to find a policy that achieves high rewards while satisfying cost constraints. When learning from scratch, safe RL agents tend to be overly conservative, which impedes exploration and restrains the overall performance. In many realistic tasks, e.g. autonomous driving, large-scale expert demonstration data are available. We argue that extracting expert policy from offline data to guide online exploration is a promising solution to mitigate the conservativeness issue. Large-capacity models, e.g. decision transformers (DT), have been proven to be competent in offline policy learning. However, data collected in real-world scenarios rarely contain dangerous cases (e.g., collisions), which prevents the policies from learning safety concepts. Besides, these bulk policy networks cannot meet the computation speed requirements at inference time on real-world tasks such as autonomous driving. To this end, we propose Guided Online Distillation (GOLD), an offline-to-online safe RL framework. GOLD distills an offline DT policy into a lightweight policy network through guided online safe RL training, which outperforms both the offline DT policy and online safe RL algorithms. Experiments in both benchmark safe RL tasks and real-world driving tasks based on the Waymo Open Motion Dataset (WOMD) demonstrate that GOLD can successfully distill lightweight policies and solve decision-making problems in challenging safety-critical scenarios.
|
|
10:30-12:00, Paper WeAT7-CC.8 | Add to My Program |
Sample-Efficient Learning to Solve a Real-World Labyrinth Game Using Data-Augmented Model-Based Reinforcement Learning |
|
Bi, Thomas | ETH Zurich |
D'Andrea, Raffaello | ETHZ |
Keywords: Reinforcement Learning, Engineering for Robotic Systems, Visual Learning
Abstract: Motivated by the challenge of achieving rapid learning in physical environments, this paper presents the development and training of a robotic system designed to navigate and solve a labyrinth game using model-based reinforcement learning techniques. The method involves extracting low-dimensional observations from camera images, along with a cropped and rectified image patch centered on the current position within the labyrinth, providing valuable information about the labyrinth layout. The learning of a control policy is performed purely on the physical system using model-based reinforcement learning, where the progress along the labyrinth's path serves as a reward signal. Additionally, we exploit the system's inherent symmetries to augment the training data. Consequently, our approach learns to successfully solve a popular real-world labyrinth game in record time, with only 5 hours of real-world training data.
|
|
10:30-12:00, Paper WeAT7-CC.9 | Add to My Program |
Active Neural Topological Mapping for Multi-Agent Exploration |
|
Yang, Xinyi | Tsinghua University |
Yang, Yuxiang | Tsinghua University |
Yu, Chao | Tsinghua University |
Chen, Jiayu | Tsinghua University |
Yu, Jincheng | Tsinghua University |
Ren, Haibing | Meituan Inc |
Yang, Huazhong | Tsinghua University |
Wang, Yu | Tsinghua University |
Keywords: Reinforcement Learning, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper investigates the multi-agent cooperative exploration problem, which requires multiple agents to explore an unseen environment via sensory signals in a limited time. A popular approach to exploration tasks is to combine active mapping with planning. Metric maps capture the details of the spatial representation, but are memory intensive and may vary significantly between scenarios, resulting in inferior generalization. Topological maps are a promising alternative as they consist only of nodes and edges with abstract but essential information and are less influenced by the scene structures. However, most existing topology-based exploration tasks utilize classical methods for planning, which are time-consuming and sub-optimal due to their handcrafted design. Deep reinforcement learning (DRL) has shown great potential for learning (near) optimal policies through fast end-to-end inference. In this paper, we propose Multi-Agent Neural Topological Mapping (MANTM) to improve exploration efficiency and generalization for multi-agent exploration tasks. MANTM mainly comprises a Topological Mapper and a novel RL-based Hierarchical Topological Planner (HTP). The Topological Mapper employs a visual encoder and distance-based heuristics to construct a graph containing main nodes and their corresponding ghost nodes. The HTP leverages graph neural networks to capture correlations between agents and graph nodes in a coarse-to-fine manner for effective global goal selection. Extensi
|
|
WeAT8-CC Oral Session, CC-418 |
Add to My Program |
Learning in Grasping and Manipulation I |
|
|
Chair: Tahara, Kenji | Kyushu University |
Co-Chair: Zhang, Haichao | Horizon Robotics |
|
10:30-12:00, Paper WeAT8-CC.1 | Add to My Program |
Efficient Multi-Task and Transfer Reinforcement Learning with Parameter-Compositional Framework |
|
Sun, Lingfeng | University of California, Berkeley |
Zhang, Haichao | Horizon Robotics |
Xu, Wei | Horizon Robotics |
Tomizuka, Masayoshi | University of California |
Keywords: Reinforcement Learning, Transfer Learning, Deep Learning in Grasping and Manipulation
Abstract: In this work, we investigate the potential of improving multi-task training and leveraging it for transfer in the reinforcement learning setting. We identify several challenges towards this goal and propose a transfer approach with a parameter-compositional formulation. We investigate ways to improve the training of multi-task reinforcement learning, which serves as the foundation for transfer. Then we conduct a number of transfer experiments on various manipulation tasks. Experimental results demonstrate that the proposed approach improves performance in the multi-task training stage, and further show effective transfer in terms of both sample efficiency and performance.
|
|
10:30-12:00, Paper WeAT8-CC.2 | Add to My Program |
Goal-Conditioned Reinforcement Learning with Disentanglement-Based Reachability Planning |
|
Qian, Zhifeng | Tongji University |
You, Mingyu | Tongji University |
Zhou, Hongjun | Tongji University |
Xu, Xuanhui | Tongji University |
He, Bin | Tongji University |
Keywords: Reinforcement Learning, Representation Learning, Manipulation Planning
Abstract: Goal-Conditioned Reinforcement Learning (GCRL) can enable agents to spontaneously set diverse goals to learn a set of skills. Despite the excellent works proposed in various fields, reaching distant goals in temporally extended tasks remains a challenge for GCRL. Current works tackled this problem by leveraging planning algorithms to plan intermediate subgoals to augment GCRL. Their methods need two crucial requirements: (i) a state representation space to search valid subgoals, and (ii) a distance function to measure the reachability of subgoals. However, they struggle to scale to high-dimensional state space due to their non-compact representations. Moreover, they cannot collect high-quality training data through standard GC policies, which results in an inaccurate distance function. Both affect the efficiency and performance of planning and policy learning. In the paper, we propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks. In REPlan, a Disentangled Representation Module (DRM) is proposed to learn compact representations which disentangle robot poses and object positions from high-dimensional observations in a self-supervised manner. A simple REachability discrimination Module (REM) is also designed to determine the temporal distance of subgoals. Moreover, REM computes intrinsic bonuses to encourage the collection of novel states for training. We evaluate our REPlan in three vision-
|
|
10:30-12:00, Paper WeAT8-CC.3 | Add to My Program |
KINet: Unsupervised Forward Models for Robotic Pushing Manipulation |
|
Rezazadeh, Alireza | University of Minnesota |
Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Representation Learning, Deep Learning Methods, Manipulation Planning
Abstract: Object-centric representation is an essential abstraction for forward prediction. Most existing forward models learn this representation through extensive supervision (e.g., object class and bounding box), although such ground-truth information is not readily accessible in reality. To address this, we introduce KINet (Keypoint Interaction Network), an end-to-end unsupervised framework to reason about object interactions based on a keypoint representation. Using visual observations, our model learns to associate objects with keypoint coordinates and discovers a graph representation of the system as a set of keypoint embeddings and their relations. It then learns an action-conditioned forward model using contrastive estimation to predict future keypoint states. By learning to perform physical reasoning in the keypoint space, our model automatically generalizes to scenarios with a different number of objects, novel backgrounds, and unseen object geometries. Experiments demonstrate the effectiveness of our model in accurately performing forward prediction and learning plannable object-centric representations for downstream robotic pushing manipulation tasks.
|
|
10:30-12:00, Paper WeAT8-CC.4 |
Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic Manipulation Tasks |
|
Triantafyllidis, Eleftherios | The University of Edinburgh |
Christianos, Filippos | University of Edinburgh |
Li, Zhibin (Alex) | University College London |
Keywords: Deep Learning Methods, Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Current reinforcement learning algorithms struggle in sparse and complex environments, most notably in long-horizon manipulation tasks entailing a plethora of different sequences. In this work, we propose the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework. By leveraging LLMs as an assistive intrinsic reward, IGE-LLMs guides the exploratory process in reinforcement learning to address intricate long-horizon robotic manipulation tasks with sparse rewards. We evaluate our framework and related intrinsic learning methods in an environment challenged with exploration, and in a complex robotic manipulation task challenged by both exploration and long horizons. Results show IGE-LLMs (i) exhibit notably higher performance over related intrinsic methods and the direct use of LLMs in decision-making, (ii) can be combined with and complement existing learning methods, highlighting their modularity, (iii) are fairly insensitive to different intrinsic scaling parameters, and (iv) maintain robustness against increased levels of uncertainty and horizons.
|
|
10:30-12:00, Paper WeAT8-CC.5 |
Touch-Based Manipulation with Multi-Fingered Robot Using Off-Policy RL and Temporal Contrastive Learning |
|
Morihira, Naoki | Honda R&D, Ltd |
Deo, Pranav | Honda R&D Co. Ltd |
Bhadu, Manoj | Honda R&D Co. Ltd |
Hayashi, Akinobu | Honda R&D Co., Ltd |
Hasegawa, Tadaaki | Honda R&D Co., Ltd |
Otsubo, Satoshi | Honda R&D |
Osa, Takayuki | University of Tokyo |
Keywords: Reinforcement Learning, In-Hand Manipulation, Dexterous Manipulation
Abstract: Tactile information holds promise for enhancing the manipulation capabilities of multi-fingered robots. In tasks such as in-hand manipulation, where robots frequently switch between contact and non-contact states, it is important to address the partial observability of tactile sensors and to properly consider the history of observations and actions. Previous studies have shown that Recurrent Neural Networks (RNNs) can be used to learn latent representations for handling observation and action histories. However, this approach is usually combined with on-policy reinforcement learning (RL) and suffers from low sample efficiency. Integrating RNNs with off-policy RL could enhance sample efficiency, but this often compromises stability and robustness, especially as the dimensions of observation and action increase. This paper presents a time-contrastive learning approach tailored for off-policy RL. Our method incorporates a temporal contrastive model and introduces a surrogate loss to extract task-related latent representations, enhancing the pursuit of the optimal policy. Simulations and real robot experiments demonstrate that our proposed method outperforms RNN-based approaches.
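The usual form of such a temporal contrastive objective is InfoNCE over time-adjacent latent pairs: each embedding z_t is pulled toward its successor z_{t+1} while the other time steps in the batch act as negatives. The numpy sketch below shows that generic loss, not the authors' specific surrogate loss; the cosine similarity and the temperature value are common but assumed choices.

```python
import numpy as np

def temporal_infonce(anchors, positives, temperature=0.1):
    """InfoNCE where each anchor z_t must match its own successor z_{t+1}
    against the other time steps in the batch acting as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).mean()             # positive pairs sit on the diagonal
```

Perfectly aligned anchor/positive pairs give a near-zero loss, while mismatched pairs are penalized heavily.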
|
|
10:30-12:00, Paper WeAT8-CC.6 |
Learning Language-Conditioned Deformable Object Manipulation with Graph Dynamics |
|
Deng, Yuhong | National University of Singapore |
Mo, Kai | Tsinghua University, Shenzhen International Graduate School |
Xia, Chongkun | Tsinghua University |
Wang, Xueqian | Center for Artificial Intelligence and Robotics, Graduate School |
Keywords: Manipulation Planning, Deep Learning in Grasping and Manipulation, Dexterous Manipulation
Abstract: Multi-task learning of deformable object manipulation is a challenging problem in robot manipulation. Most previous works address this problem in a goal-conditioned way and adopt goal images to specify different tasks, which limits multi-task learning performance and cannot generalize to new tasks. Thus, we adopt language instructions to specify deformable object manipulation tasks and propose a learning framework. We first design a unified Transformer-based architecture to understand multi-modal data and output picking and placing actions. In addition, we apply the visible connectivity graph to tackle the nonlinear dynamics and complex configurations of deformable objects. Both simulated and real experiments demonstrate that the proposed method is effective and can generalize to unseen instructions and tasks. Compared with the state-of-the-art method, our method achieves higher success rates (87.2% on average) and a 75.6% shorter inference time. We also demonstrate that our method performs well in real-world experiments. Supplementary videos can be found at https://sites.google.com/view/language-deformable.
|
|
10:30-12:00, Paper WeAT8-CC.7 |
Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects |
|
Mosbach, Malte | University of Bonn |
Behnke, Sven | University of Bonn |
Keywords: Reinforcement Learning, Grasping, Sensorimotor Learning
Abstract: Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement learning and policy distillation. After training a teacher policy to master motor control based on object pose information, TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy based on object segmentation. We zero-shot transfer from simulation to a real robot by using the Segment Anything Model for promptable object segmentation. Our trained policies adeptly grasp a wide variety of objects from cluttered scenarios in simulation and the real world based on human-understandable prompts. Furthermore, we show robust zero-shot transfer to novel objects. Videos of our experiments are available at https://maltemosbach.github.io/grasp_anything.
|
|
10:30-12:00, Paper WeAT8-CC.8 |
Composable Interaction Primitives: A Structured Policy Class for Efficiently Learning Sustained-Contact Manipulation Skills |
|
Abbatematteo, Ben | University of Texas at Austin |
Rosen, Eric | Brown University |
Thompson, Skye | MIT |
Akbulut, Mete Tuluhan | Bogazici University |
Rammohan, Sreehari | Brown University |
Konidaris, George | Brown University |
Keywords: Reinforcement Learning, Integrated Planning and Learning, Deep Learning in Grasping and Manipulation
Abstract: We propose a new policy class, Composable Interaction Primitives (CIPs), specialized for learning sustained-contact manipulation skills like opening a drawer, pulling a lever, turning a wheel, or shifting gears. CIPs have two primary design goals: to minimize what must be learned by exploiting structure present in the world and the robot, and to support sequential composition by construction, so that learned skills can be used by a task-level planner. Using an ablation experiment in four simulated manipulation tasks, we show that the structure included in CIPs substantially improves the efficiency of motor skill learning. We then show that CIPs can be used for plan execution in a zero-shot fashion by sequencing learned skills. We validate our approach on real robot hardware by learning and sequencing two manipulation skills.
|
|
10:30-12:00, Paper WeAT8-CC.9 |
MoDem-V2: Visuo-Motor World Models for Real-World Robot Manipulation |
|
Lancaster, Patrick | Meta AI |
Hansen, Nicklas | University of California San Diego |
Rajeswaran, Aravind | Meta AI |
Kumar, Vikash | Meta AI |
Keywords: Reinforcement Learning, Sensorimotor Learning, Imitation Learning
Abstract: Robotic systems that aspire to operate in uninstrumented real-world environments must perceive the world directly via onboard sensing. Vision-based learning systems aim to eliminate the need for environment instrumentation by building an implicit understanding of the world based on raw pixels, but navigating the contact-rich high-dimensional search space from solely sparse visual reward signals significantly exacerbates the challenge of exploration. The applicability of such systems is thus typically restricted to simulated or heavily engineered environments, since agent exploration in the real world without the guidance of explicit state estimation and dense rewards can lead to unsafe behavior and catastrophic safety faults. In this study, we isolate the root causes behind these limitations to develop a system, called MoDem-V2, capable of learning contact-rich manipulation directly in the uninstrumented real world. Building on the latest algorithmic advancements in model-based reinforcement learning (MBRL), demo-bootstrapping, and effective exploration, MoDem-V2 can acquire contact-rich dexterous manipulation skills directly in the real world. We identify key ingredients for leveraging demonstrations in model learning while respecting real-world safety considerations -- exploration centering, agency handover, and actor-critic ensembles. We empirically demonstrate the contribution of these ingredients in four complex visuo-motor manipulation problems in both simulation and the real world. To the best of our knowledge, our work presents the first successful system for demonstration-augmented visual MBRL trained directly in the real world. Visit sites.google.com/view/modemv2 for videos and more details.
|
|
WeAT9-CC Oral Session, CC-419 |
Collision Avoidance I |
|
|
Chair: Wang, Zhuping | Tongji University |
Co-Chair: Albu-Schäffer, Alin | DLR - German Aerospace Center |
|
10:30-12:00, Paper WeAT9-CC.1 |
CollisionGP: Gaussian Process-Based Collision Checking for Robot Motion Planning |
|
Muñoz Mendi, Javier | Universidad Carlos III De Madrid |
Lehner, Peter | German Aerospace Center (DLR) |
Moreno, Luis | Carlos III University |
Albu-Schäffer, Alin | DLR - German Aerospace Center |
Roa, Maximo A. | German Aerospace Center (DLR) |
Keywords: Motion and Path Planning, Collision Avoidance, Probabilistic Inference
Abstract: Collision checking is the primitive operation of motion planning that consumes the most time. Machine learning algorithms have been shown to accelerate collision checking. We propose CollisionGP, a Gaussian process-based algorithm for modeling a robot's configuration space and querying collision checks. CollisionGP introduces a Pólya-Gamma auxiliary variable for each data point in the training set, allowing classification inference to be done exactly with a closed-form expression. Gaussian processes provide a distribution as the output, yielding a mean and variance for each collision check. The obtained variance is processed to reduce false negatives (FN). We demonstrate that CollisionGP can use GPU acceleration to process collision checks for thousands of configurations much faster than traditional collision detection libraries. Furthermore, we obtain better accuracy, TPR, and TNR results than state-of-the-art learning-based algorithms while using fewer support points, thus making our proposed method sparser.
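The mean-plus-variance decision described in the abstract can be illustrated with a plain GP-regression stand-in: fit on +/-1 collision labels and declare a configuration free only when the posterior mean clears a variance-scaled margin. This is a toy sketch, not the authors' exact Pólya-Gamma classifier; the RBF kernel, lengthscale, noise, and margin values are illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=0.5):
    """Squared-exponential kernel between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_predict(Xq, X, y, noise=1e-3):
    """GP-regression posterior mean/std at query configurations Xq,
    trained on configurations X with labels y (+1 free, -1 collision)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Kq = rbf(Xq, X)
    mean = Kq @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ji->i', Kq, np.linalg.solve(K, Kq.T))
    return mean, np.sqrt(np.maximum(var, 0.0))

def in_collision(Xq, X, y, margin=2.0):
    """Conservative check: declare 'free' only when mean - margin*std > 0,
    trading extra false positives for fewer false negatives."""
    mean, std = gp_predict(Xq, X, y)
    return mean - margin * std <= 0.0
```

Training on a grid labeled against a disc-shaped obstacle, a query at the disc center is flagged as a collision while a far corner is reported free; inflating `margin` makes the check more conservative.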
|
|
10:30-12:00, Paper WeAT9-CC.2 |
Probabilistic Motion Planning and Prediction Via Partitioned Scenario Replay |
|
de Groot, Oscar | Delft University of Technology |
Sridharan, Anish | Starnus Technology |
Alonso-Mora, Javier | Delft University of Technology |
Ferranti, Laura | Delft University of Technology |
Keywords: Collision Avoidance, Planning under Uncertainty, Optimization and Optimal Control
Abstract: Autonomous mobile robots require predictions of human motion to plan safe trajectories that avoid them. Because human motion cannot be predicted exactly, future trajectories are typically inferred from real-world data via learning-based approximations. These approximations provide useful information on the pedestrian's behavior, but may deviate from the data, which can lead to collisions during planning. In this work, we introduce a joint prediction and planning framework, Partitioned Scenario Replay (PSR), that stores and partitions previously observed human trajectories, referred to as scenarios. During planning, scenarios observed in similar situations are reintroduced (or replayed) as motion predictions. By sampling real data and by building on scenario optimization and predictive control, the planner provides probabilistic collision avoidance guarantees in the real world. Relying on this guarantee to remain safe, PSR can incrementally improve its prediction and planning performance online. We demonstrate our approach on a mobile robot navigating around pedestrians.
|
|
10:30-12:00, Paper WeAT9-CC.3 |
Prescient Collision-Free Navigation of Mobile Robots with Iterative Multimodal Motion Prediction of Dynamic Obstacles |
|
Zhang, Ze | Chalmers University of Technology |
Hajieghrary, Hadi | Magna International |
Dean, Emmanuel | Chalmers University of Technology |
Åkesson, Knut | Chalmers University of Technology |
Keywords: Collision Avoidance, Deep Learning Methods, AI-Based Methods
Abstract: To explore safe interactions between a mobile robot and dynamic obstacles, this paper presents a comprehensive approach to collision-free navigation in dynamic indoor environments. The approach integrates multimodal motion predictions of dynamic obstacles with predictive control for obstacle avoidance. Multimodal Motion Prediction (MMP) is achieved by a deep-learning method that predicts multiple plausible future positions. By repeating the MMP for each time offset in the future, multi-time-step multimodal motion predictions are obtained. A nonlinear Model Predictive Control (MPC) solver utilizes the prediction outcomes to achieve collision-free trajectory tracking for the mobile robot. The proposed integration of multimodal motion prediction and trajectory tracking outperforms other non-deep-learning methods in complex scenarios. The approach enables safe interaction between the mobile robot and stochastic dynamic obstacles.
|
|
10:30-12:00, Paper WeAT9-CC.4 |
GPU-Accelerated Optimization-Based Collision Avoidance |
|
Wu, Zeming | Tongji University |
Wang, Zhuping | Tongji University |
Zhang, Hao | Tongji University |
Keywords: Motion and Path Planning, Collision Avoidance, Constrained Motion Planning
Abstract: This paper proposes a GPU-accelerated optimization framework for collision avoidance problems where the controlled objects and the obstacles can be modeled as finite unions of convex polyhedra. A novel collision avoidance constraint is proposed based on scale-based collision detection and the strong duality of convex optimization. Under this constraint, the high-dimensional non-convex optimization problems of collision avoidance can be decomposed into several low-dimensional quadratic programs (QPs) following the paradigm of the alternating direction method of multipliers (ADMM). Furthermore, these low-dimensional QPs can be solved in parallel on GPUs, significantly reducing computational time. High-fidelity simulations are conducted to validate the proposed method's effectiveness and practicality.
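The decomposition pattern the abstract relies on (consensus ADMM, where per-block subproblems are solved independently and then reconciled by an averaging step) can be shown on a toy sum of quadratics. This generic sketch is not the paper's duality-based collision formulation; the closed-form x-update stands in for the low-dimensional QPs that would be dispatched to the GPU in parallel.

```python
import numpy as np

def consensus_admm(a_list, rho=1.0, iters=200):
    """Solve min_x sum_i 0.5*||x - a_i||^2 by consensus ADMM: each block i
    keeps a local copy x_i (its subproblem is closed-form here; in general
    a small QP solvable in parallel), and a consensus variable z ties the
    copies together via scaled dual variables u_i."""
    a = np.asarray(a_list, dtype=float)
    _, d = a.shape
    x = np.zeros_like(a)          # local copies, one per block
    z = np.zeros(d)               # consensus variable
    u = np.zeros_like(a)          # scaled dual variables
    for _ in range(iters):
        # x-updates are independent across blocks -> parallelizable
        x = (a + rho * (z - u)) / (1.0 + rho)
        z = (x + u).mean(axis=0)  # gather/average step
        u = u + x - z             # dual ascent step
    return z
```

For blocks a_i = 0, 1, 2, 3 the consensus variable converges to their mean, 1.5, and the gap shrinks geometrically with the iteration count.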
|
|
10:30-12:00, Paper WeAT9-CC.5 |
Learn to Navigate in Dynamic Environments with Normalized LiDAR Scans |
|
Zhu, Wei | Tohoku University |
Hayashibe, Mitsuhiro | Tohoku University |
Keywords: Collision Avoidance, Human-Aware Motion Planning, Reinforcement Learning
Abstract: The latest robot navigation methods for dynamic environments assume that the states of obstacles, including their geometries and trajectories, are fully observable. While it is easy to obtain these states accurately in simulations, it is exceedingly challenging in the real world. Therefore, a viable alternative is to directly map raw sensor observations into robot actions. However, acquiring skills from high-dimensional raw observations demands massive neural networks and extended training periods. Furthermore, there are discrepancies between simulated and real environments that impede real-world implementations. To overcome these limitations, we propose a Learning framework for robot Navigation in Dynamic environments that uses sequential Normalized LiDAR (LNDNL) scans. We employ long short-term memory (LSTM) networks to propagate historical environmental information from the sequential LiDAR observations. Additionally, we customize a LiDAR-integrated simulator to speed up sampling and normalize the geometry of real-world obstacles to match that of simulated objects, thereby bridging the sim-to-real gap. Our extensive comparisons with state-of-the-art baselines and real-world implementations demonstrate the potential of learning to navigate in dynamic environments using raw sensor observations and sim-to-real transfer.
|
|
10:30-12:00, Paper WeAT9-CC.6 |
Learning Terminal State of the Trajectory Planner: Application for Collision Scenarios of Autonomous Vehicles |
|
Lim, Joonhee | KAIST |
Lee, Kibeom | Gachon University |
Shin, Jangho | Hyundai Motor Company |
Kum, Dongsuk | KAIST |
Keywords: Collision Avoidance, Integrated Planning and Learning, Motion and Path Planning
Abstract: Collision Avoidance/Mitigation System (CAMS) for autonomous vehicles is a crucial technology that ensures the safety and reliability of autonomous driving systems. Conventional collision avoidance approaches struggle in complex and varied scenarios because they avoid collisions based on rules designed for specific collision scenarios. This has led to learning-based methods using neural networks for adaptive collision avoidance. However, approaches that directly output control inputs through neural networks have drawbacks in interpretability and stability. To address these limitations, we propose a trajectory planning method for CAMS that combines deep reinforcement learning (DRL) and quintic polynomial (QP) trajectory planning. The proposed method determines the terminal state and confidence of the trajectory using DRL and plans a QP trajectory based on them. By utilizing the terminal state and confidence of the trajectory, rather than direct control inputs, as the output of the neural network, it generates a more realistic and continuous path. Moreover, this approach considers collision avoidance and mitigation in an integrated manner through the RL reward function. Our experimental results demonstrate that the proposed method not only improves interpretability and stability compared to existing learning-based methods but also upholds performance in complex and varied collision scenarios.
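The quintic-polynomial step is standard: six boundary conditions (position, velocity, and acceleration at t=0 and t=T) determine the six coefficients through one linear solve, which is why predicting only the terminal state suffices to define a smooth trajectory. A minimal sketch of that construction follows (illustrative only; the DRL component and the authors' exact formulation are not reproduced):

```python
import numpy as np

def quintic_coeffs(p0, v0, a0, pT, vT, aT, T):
    """Coefficients c0..c5 of p(t) = sum_k c_k t^k matching position,
    velocity, and acceleration at t=0 and t=T."""
    A = np.array([
        [1, 0, 0,    0,      0,       0],       # p(0)
        [0, 1, 0,    0,      0,       0],       # p'(0)
        [0, 0, 2,    0,      0,       0],       # p''(0)
        [1, T, T**2, T**3,   T**4,    T**5],    # p(T)
        [0, 1, 2*T,  3*T**2, 4*T**3,  5*T**4],  # p'(T)
        [0, 0, 2,    6*T,    12*T**2, 20*T**3], # p''(T)
    ], dtype=float)
    b = np.array([p0, v0, a0, pT, vT, aT], dtype=float)
    return np.linalg.solve(A, b)

def eval_quintic(c, t):
    """Return position, velocity, and acceleration at time t."""
    p = sum(ck * t**k for k, ck in enumerate(c))
    v = sum(k * ck * t**(k - 1) for k, ck in enumerate(c) if k >= 1)
    a = sum(k * (k - 1) * ck * t**(k - 2) for k, ck in enumerate(c) if k >= 2)
    return p, v, a
```

For example, a 2 s maneuver from rest at 0 to rest at 5 m starts and ends with exactly zero velocity and acceleration by construction.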
|
|
10:30-12:00, Paper WeAT9-CC.7 |
History-Aware Planning for Risk-Free Autonomous Navigation on Unknown Uneven Terrain |
|
Wang, Yinchuan | Shandong University |
Du, Nianfei | Shandong University |
Qin, Yongsen | Shandong University |
Zhang, Xiang | School of Control Science and Engineering, Shandong University |
Song, Rui | Shandong University |
Wang, Chaoqun | Shandong University |
Keywords: Collision Avoidance, Planning under Uncertainty, Autonomous Vehicle Navigation
Abstract: It is challenging for a mobile robot to achieve autonomous, mapless navigation in unknown environments with uneven terrain. In this study, we present a layered and systematic pipeline. At the local level, we maintain a tree structure that is dynamically extended during navigation. This structure unifies planning with terrain identification and helps explicitly identify hazardous areas on uneven terrain. In particular, certain nodes of the tree are consistently kept to form a sparse graph at the global level, which records the history of the exploration. A series of subgoals obtained from the tree and the graph is utilized to lead the navigation. To determine a subgoal, we develop an evaluation method whose input elements can be efficiently obtained from the layered structure. We conduct both simulation and real-world experiments to evaluate the developed method and its key modules. The experimental results demonstrate the effectiveness and efficiency of our method. The robot can travel through unknown uneven regions safely and reach the target rapidly without a preconstructed map.
|
|
WeAT10-CC Oral Session, CC-501 |
Soft Sensors and Actuators II |
|
|
Chair: Hughes, Josie | EPFL |
Co-Chair: Shi, Chaoyang | Tianjin University |
|
10:30-12:00, Paper WeAT10-CC.1 |
Multi-Tap Resistive Sensing and FEM Modeling Enables Shape and Force Estimation in Soft Robots |
|
Cangan, Barnabas Gavin | ETH Zurich |
Tian, Sizhe | Inria, Université De Lille |
Escaida Navarro, Stefan | Universidad De O'Higgins |
Beger, Artem | Festo SE & Co. KG |
Duriez, Christian | INRIA |
Katzschmann, Robert Kevin | ETH Zurich |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Grasping
Abstract: We tackle the problem of proprioception in soft robots, specifically soft grippers with tight packaging constraints, relying only on intrinsic sensors. While various sensing approaches have been studied for curvature estimation, we look into sensing local deformations. To accomplish this, we use a widely available, off-the-shelf resistive sensor and multi-tap this sensor, i.e., make multiple electrical connections onto the resistive layer of the sensor. This allows us to measure changes in resistance at multiple segments throughout the length of the sensor, providing improved resolution of local deformations in the soft body. These measurements inform a finite-element-method (FEM) based model to then estimate the shape of the soft body and the magnitude of an external force acting at a known arbitrary location. Our model-based approach estimates soft body deformation with approximately 3% average relative error and, taking into account internal fluidic actuation, our estimate of external force disturbance has 11% relative error within a 5 N range. The combined sensing and modeling approach can be integrated into soft manipulation platforms to enable features such as identifying the shape and material properties of an object being grasped. Such manipulators can benefit from softness and compliance while being proprioceptive, relying only on embedded sensing and not on external systems such as motion capture, which is essential for deployment in real-world scenarios.
|
|
10:30-12:00, Paper WeAT10-CC.2 |
Learning Motion Reconstruction from Demonstration Via Multi-Modal Soft Tactile Sensing |
|
Pan, Cheng | Swiss Federal Institute of Technology Lausanne (EPFL) |
Gilday, Kieran | EPFL |
Sologuren, Emily | MIT |
Junge, Kai | École Polytechnique Fédérale De Lausanne |
Hughes, Josie | EPFL |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Learning from Demonstration
Abstract: Learning manipulation from demonstration is a key way for humans to teach complex tasks. However, this domain mainly focuses on kinesthetic teaching and does not consider imitation of interaction forces, which is essential for more contact-rich tasks. We propose a framework that enables robotic imitation of contact from human demonstration using a wearable finger-tip sensor. By developing a multi-modal sensor (providing both force and contact location) and robotic collection of simple training data of different motion primitives (tapping, rotation and translation), an LSTM-based model can be used to replicate motion from tactile demonstration only. To evaluate this approach, we explore the performance on increasingly complex testing data generated by a robot, and also demonstrate the full pipeline from human demonstration via the sensor used as a wearable device. This approach of using tactile sensing as a means of inferring the required robot motion paves the way for imitation of more contact-rich tasks, and enables imitation of tasks where the demonstration and imitation are performed with different body schemas.
|
|
10:30-12:00, Paper WeAT10-CC.3 |
A Generalized Motion Control Framework of Dielectric Elastomer Actuators: Dynamic Modeling, Sliding-Mode Control and Experimental Evaluation |
|
Zou, Jiang | Shanghai Jiao Tong University |
Kassim, Shakiru Olajide | School of Engineering, University of Aberdeen, Scotland |
Ren, Jieji | Shanghai Jiao Tong University |
Vaziri, Vahid | University of Aberdeen |
Aphale, Sumeet S. | University of Aberdeen |
Gu, Guoying | Shanghai Jiao Tong University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Motion Control, Dielectric elastomer actuators
Abstract: The continuous electromechanical deformation of dielectric elastomer actuators (DEAs) suffers from rate-dependent viscoelasticity, mechanical vibration and configuration dependency, making generalized dynamic modeling and precise control elusive. In this work, we present a generalized motion control framework for DEAs capable of accommodating different configurations, materials and degrees of freedom (DOFs). First, a generalized, control-enabling dynamic model is developed for DEAs by taking nonlinear electromechanical coupling, mechanical vibration and rate-dependent viscoelasticity into consideration. Further, a state observer is introduced to predict the unobservable viscoelasticity. Then, an Enhanced Exponential Reaching Law based Sliding-Mode Controller (EERLSMC) is proposed to minimize the viscoelasticity of DEAs. Its stability is also proven mathematically. The experimental results obtained for different DEAs (four configurations, two materials and multiple DOFs) demonstrate that our dynamic model can precisely describe their complex dynamic responses and that the EERLSMC can achieve precise tracking control, verifying the generality of our framework.
|
|
10:30-12:00, Paper WeAT10-CC.4 |
Vision-Based Tip Force Estimation on a Soft Continuum Robot |
|
Chen, Xingyu | University College London |
Shi, Jialei | University College London |
Wurdemann, Helge Arne | University College London |
George Thuruthel, Thomas | University College London |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Soft continuum robots, fabricated from elastomeric materials, offer unparalleled flexibility and adaptability, making them ideal for applications such as minimally invasive surgery and inspections in constrained environments. With the miniaturization of imaging technologies and the development of novel control algorithms, these devices provide exceptional opportunities to visualize the internal structures of the human body. However, there are still challenges in accurately estimating external forces applied to these systems using current technologies. Adding additional sensors is challenging without compromising the softness of the device. This work presents a visual deformation-based force sensing framework for soft continuum robots. The core idea behind this work is that point loads lead to unique deformation profiles in an actuated soft-bodied robot. We introduce a Convolutional Neural Network-based tip force estimation method that utilizes arbitrarily placed camera images and actuation inputs to predict applied tip forces. Experimental validation was performed using the STIFF-FLOP robot, a pneumatically actuated soft robot developed for minimally invasive surgery. Our vision-based force estimation model demonstrated a sensing precision of 0.05 N in the XY plane during testing, with data collection and training taking only 70 minutes.
|
|
10:30-12:00, Paper WeAT10-CC.5 |
Soft Bending Actuator with Fiber-Jamming Variable Stiffness and Fiber-Optic Proprioception |
|
Kang, Joonwon | Seoul National University |
Lee, Sudong | EPFL (École Polytechnique Fédérale De Lausanne) |
Park, Yong-Lae | Seoul National University |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Compliant Joints and Mechanisms
Abstract: Soft actuators with variable stiffness improve the adaptability of robots, expanding their range of applications and environments. We propose a tendon-driven soft bending actuator that can change its stiffness using fiber jamming. The actuator is made of an elastomer tube filled with different types of fiber. The three types of fibers play different roles: maintaining the structure, varying stiffness by jamming, and fiber-optic shape sensing, while sharing the same structure and materials, realizing a compact form factor for the entire structure. The stiffness of the actuator can be increased to more than three times its original stiffness by jamming. In addition to jamming, the proposed actuator has a special shape-sensing function that estimates the tip location of the actuator based on image sensing from optical fibers packaged with the jamming fibers. The tip position sensing shows accuracies with errors of 3.1%, 3.0%, and 6.7% for the x, y, and z axes, respectively, using feature extraction and a deep neural network. The proposed actuator has two degrees of freedom (i.e., bending on two orthogonal planes) and is controlled by two tendons. When connected in series, multiple actuators form a soft robotic manipulator (i.e., arm) that is physically compliant and capable of delivering a relatively high force to target objects.
|
|
10:30-12:00, Paper WeAT10-CC.6 |
A Light and Heat-Seeking Vine-Inspired Robot with Material-Level Responsiveness |
|
Deglurkar, Shivani | University of California, San Diego |
Xiao, Charles | University of California, Santa Barbara |
Gockowski, Luke | University of California Santa Barbara |
Valentine, Megan | University of California, Santa Barbara |
Hawkes, Elliot Wright | University of California, Santa Barbara |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Mechanism Design
Abstract: The fields of soft and bio-inspired robotics promise to imbue synthetic systems with capabilities found in the natural world. However, many of these biological capabilities are yet to be realized. For example, vines in nature direct growth via localized responses embedded in the cells of the vine body, allowing an organism without a central brain to successfully search for resources (e.g., light). To date, however, vine-inspired robots have not shown such localized embedded responsiveness. Here we present a vine-inspired robotic device with material-level responses embedded in its skin that is capable of “growing” and steering toward either a light or heat stimulus. We present basic modeling of the concept, design details, and experimental results showing its behavior in response to infrared (IR) and visible light. Our simple design concept advances the capabilities of bio-inspired robots and lays the foundation for future “growing” robots that are capable of seeking light or heat, yet are extremely simple and low-cost. Potential applications include solar tracking and, in the future, fighting smoldering fires. We envision using similar robots to find hot spots in hard-to-access environments, allowing us to put out potentially long-burning fires faster.
|
|
10:30-12:00, Paper WeAT10-CC.7 | Add to My Program |
Morphological Design for Pneumatic Soft Actuators and Robots with Desired Deformation Behavior |
|
Chen, Feifei | Shanghai Jiao Tong University |
Song, Zenan | Shanghai Jiao Tong University |
Chen, Shitong | Shanghai Jiaotong University |
Gu, Guoying | Shanghai Jiao Tong University |
Zhu, Xiangyang | Shanghai Jiao Tong University |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Shape Optimization, Soft Robot Applications
Abstract: A homogeneous pneumatic soft robot may generate complex output motions using a simple input pressure, resulting from its morphological shape that locally deforms the soft material to different degrees by simultaneously tailoring the structural characteristics and orienting the input pressure. To date, design of the morphological shape (the inverse problem) has not been fully addressed. This article outlines a geometry-mechanics-optimization integrated approach to automatically shaping a pneumatic soft actuator or robot so that it achieves the desired deformation behavior. Instead of constraining the robot's geometry within any predefined regular shape, we employ B-splines to allow generation of freeform boundary surfaces, and use nonlinear mechanical modelling and shape-derivative-based optimization to navigate the high-dimensional design space. Our design framework can readily regulate the surface quality during the morphological evolution by imposing geometric constraints, in terms of the principal curvatures and the minimal distance between surfaces, as penalty functions. The effect of external forces, including gravity and the interaction force at the end-effector, is also taken into account.
|
|
10:30-12:00, Paper WeAT10-CC.8 | Add to My Program |
Thermally-Activated Biochemically-Sustained Reactor for Soft Fluidic Actuation |
|
Liu, Jialun | The University of Sheffield |
Soliman, MennaAllah | University of Sheffield |
Damian, Dana | University of Sheffield |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Soft robots have shown remarkable capabilities owing to their high deformability. Recently, increasing attention has been dedicated to developing fully soft robots to exploit their full potential, with a recognition that electronic powering may limit this achievement. Alternative powering sources compatible with soft robots have been identified, such as combustion and chemical reactions. A further milestone for such systems would be to increase the controllability and responsiveness of their underlying reactions in order to achieve more complex behaviors for soft robots. In this paper, we present a thermally-activated reactor incorporating a biocompatible hydrogel valve that enables control of the biochemical reaction of sugar and yeast. The biochemical reaction is utilized to generate contained pressure, which in turn powers a fluidic soft actuator. Experiments were conducted to evaluate the response time of the hydrogel valves with three different crosslinker concentrations. Among the tested concentrations, we found that the lowest crosslinker concentration yielded the fastest valve response time at an ambient temperature of 50°C. We also evaluated the pressure generation capacity of the reactor, which can reach up to 0.22 bar, and demonstrated the thermo-responsive behavior of the reactor to trigger a biochemical reaction for powering a fluidic soft actuator. This work opens up the possibility of powering and controlling tetherless and fully soft robots.
|
|
10:30-12:00, Paper WeAT10-CC.9 | Add to My Program |
Pulsating Fluidic Sensor for Sensing of Location, Pressure and Contact Area |
|
Jones, Joanna | University of Sheffield |
Pontin, Marco | University of Sheffield |
Damian, Dana | University of Sheffield |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Designing information-rich and space-efficient sensors is a key challenge for soft robotics and crucial for the development of safe soft robots. Sensing and understanding environmental interactions with a minimal footprint is especially important in the medical context, where portability and unhindered patient/user movement are priorities for moving towards personalized and decentralized healthcare solutions. In this work, a pulsating fluidic soft sensor (PFS) capable of determining the location, pressure, and contact area of press events is presented. The sensor relies on spatio-temporal resistance changes driven by a pulsating conductive fluid. The sensor demonstrates good repeatability and distinction of single and multiple press events, detecting single indents of sizes greater than 1 cm, forces larger than 2 N, and various locations across the sensor, as well as multiple indents spaced 2 cm apart. Furthermore, the sensor is demonstrated in two applications to detect foot placement and grip location. Overall, the sensor represents an improvement towards minimizing the electronic hardware and cost of the sensing solution, without sacrificing the richness of the sensing information in the field of soft fluidic sensors.
|
|
WeAT11-CC Oral Session, CC-502 |
Add to My Program |
Semantic Scene Understanding I |
|
|
Chair: Fujii, Hiromitsu | Chiba Institute of Technology |
Co-Chair: Beetz, Michael | University of Bremen |
|
10:30-12:00, Paper WeAT11-CC.1 | Add to My Program |
Perception through Cognitive Emulation: “A Second Iteration of NaivPhys4RP for Learningless and Safe Recognition and 6D-Pose Estimation of (Transparent) Objects” |
|
Kenghagho Kenfack, Franklin | University of Bremen |
Neumann, Michael | Uni Bremen |
Mania, Patrick | University of Bremen |
Beetz, Michael | University of Bremen |
Keywords: Semantic Scene Understanding, Cognitive Modeling, Perception for Grasping and Manipulation
Abstract: In our previous work, we designed NaivPhys4RP, a human-like, white-box, and causal generative model of perception, essentially based on cognitive emulation, to understand the past, present, and future state of complex worlds from poor observations. In this paper, as recommended in that previous work, we first refine the theoretical model of NaivPhys4RP in terms of the integration of variables and the perceptual inference tasks to solve. Intuitively, the system is closed under the injection, update, and dependency of variables. Then, we present a first implementation of NaivPhys4RP that demonstrates learningless and safe recognition and 6D-pose estimation of objects from poor sensor data (e.g., occlusion, transparency, poor depth, in-hand). This not only makes a substantial step forward compared to classical perception systems in perceiving objects in these scenarios, but also escapes the burden of data-intensive learning and operates safely (transparency and causality: we fit sensor data into mentally constructed meaningful worlds). With respect to ChatGPT's ambitions, the system can imagine physico-realistic socio-physical scenes from texts and demonstrate understanding of these texts, all without data- and resource-intensive learning.
|
|
10:30-12:00, Paper WeAT11-CC.2 | Add to My Program |
Mapping High-Level Semantic Regions in Indoor Environments without Object Recognition |
|
Bigazzi, Roberto | University of Modena and Reggio Emilia |
Baraldi, Lorenzo | Università Degli Studi Di Modena E Reggio Emilia |
Kousik, Shreyas | Georgia Institute of Technology |
Cucchiara, Rita | Università Degli Studi Di Modena E Reggio Emilia |
Pavone, Marco | Stanford University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Visual Learning
Abstract: Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments. In the literature, there has been an extensive focus on object labeling and exhaustive scene graph generation; less effort has been focused on the task of purely identifying and mapping large semantic regions. The present work proposes a method for semantic region mapping via embodied navigation in indoor environments, generating a high-level representation of the knowledge of the agent. To enable region identification, the method uses a vision-to-language model to provide scene information for mapping. By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location. This mapping procedure is paired with a trained navigation policy to enable autonomous map generation. The proposed method significantly outperforms a variety of baselines, including an object-based system and a pretrained scene classifier, in experiments in a photorealistic simulator.
|
|
10:30-12:00, Paper WeAT11-CC.3 | Add to My Program |
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model As an Agent |
|
Yang, Jianing | University of Michigan |
Chen, Xuweiyi | University of Michigan |
Qian, Shengyi | University of Michigan |
Madaan, Nikhil | Bloomberg |
Iyengar, Madhavan | University of Michigan |
Fouhey, David | University of Michigan |
Chai, Joyce | University of Michigan |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, RGB-D Perception
Abstract: 3D visual grounding is a critical skill for household robots, enabling them to navigate, manipulate objects, and answer questions based on their environment. While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline. LLM-Grounder utilizes an LLM to decompose complex natural language queries into semantic constituents and employs a visual grounding tool, such as OpenScene or LERF, to identify objects in a 3D scene. The LLM then evaluates the spatial and commonsense relations among the proposed objects to make a final grounding decision. Our method does not require any labeled training data and can generalize to novel 3D scenes and arbitrary text queries. We evaluate LLM-Grounder on the ScanRefer benchmark and demonstrate state-of-the-art zero-shot grounding accuracy. Our findings indicate that LLMs significantly improve the grounding capability, especially for complex language queries, making LLM-Grounder an effective approach for 3D vision-language tasks in robotics.
|
|
10:30-12:00, Paper WeAT11-CC.4 | Add to My Program |
Learning Off-Road Terrain Traversability with Self-Supervisions Only |
|
Seo, Junwon | Agency for Defense Development |
Sim, Sungdae | Agency for Defense Development |
Shim, Inwook | Inha University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Estimating the traversability of terrain should be reliable and accurate in diverse conditions for autonomous driving in off-road environments. However, learning-based approaches often yield unreliable results when confronted with unfamiliar contexts, and it is challenging to obtain manual annotations frequently for new circumstances. In this paper, we introduce a method for learning traversability from images that utilizes only self-supervision and no manual labels, enabling it to easily learn traversability in new circumstances. To this end, we first generate self-supervised traversability labels from past driving trajectories by labeling regions traversed by the vehicle as highly traversable. Using the self-supervised labels, we then train a neural network that identifies terrains that are safe to traverse from an image using a one-class classification algorithm. Additionally, we supplement the limitations of self-supervised labels by incorporating methods of self-supervised learning of visual representations. To conduct a comprehensive evaluation, we collect data in a variety of driving environments and perceptual conditions and show that our method produces reliable estimations in various environments. In addition, the experimental results validate that our method outperforms other self-supervised traversability estimation methods and achieves comparable performance with supervised learning methods trained on manually labeled data.
|
|
10:30-12:00, Paper WeAT11-CC.5 | Add to My Program |
Improving Radial Imbalances with Hybrid Voxelization and RadialMix for LiDAR 3D Semantic Segmentation |
|
Li, Jiale | Zhejiang University |
Dai, Hang | University of Glasgow |
Wang, Yu | YUNJI Technology Co. Ltd |
Cao, Guangzhi | Pegasus Technology |
Luo, Chun | YUNJI Technology Co. Ltd |
Ding, Yong | Zhejiang University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Huge progress has been made in LiDAR 3D semantic segmentation, but there are still two under-explored imbalances on the radial axis: points are unevenly concentrated on the near side, and the distribution of foreground object instances is skewed to the near side. This leads the training of the model to favor semantics at the near side, which has the majority of points and object instances. Both the popular cylindrical and the neglected spherical voxelizations aim to address the problem of imbalanced point distribution by increasing the volume of voxels along the radial distance, so as to include fewer near-side points in a smaller voxel and more far-side points in a bigger voxel. However, this causes the receptive field to enlarge along the radial distance, which is not desirable in LiDAR point clouds since the size of an object is distance-independent. This can be addressed by cubic voxelization, which has a fixed voxel volume. Thus, we propose a new LiDAR 3D semantic segmentation network (Hi-VoxelNet) with Hybrid Voxelization that leverages the advantages of cubic, cylindrical, and spherical voxelizations for hybrid voxel feature learning. To address the radial imbalance of object instances, we propose a novel data augmentation technique termed RadialMix that uses radial sample duplication to increase the number of distant foreground object instances and mixes the radial duplication with another point cloud to enrich the training samples. With the joint improvements of the radial imbalances, our method achieves state-of-the-art performance on the nuScenes and SemanticKITTI datasets, and consistently shows significant improvements along the radial distances. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
|
|
10:30-12:00, Paper WeAT11-CC.6 | Add to My Program |
Few-Shot Panoptic Segmentation with Foundation Models |
|
Käppeler, Markus | University of Freiburg |
Petek, Kürsat | University of Freiburg |
Vödisch, Niclas | University of Freiburg |
Burgard, Wolfram | University of Technology Nuremberg |
Valada, Abhinav | University of Freiburg |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization
Abstract: Current state-of-the-art methods for panoptic segmentation require an immense amount of annotated training data that is both arduous and expensive to obtain, posing a significant challenge for their widespread adoption. Concurrently, recent breakthroughs in visual representation learning have sparked a paradigm shift, leading to the advent of large foundation models that can be trained with completely unlabeled images. In this work, we propose to leverage such task-agnostic image features to enable few-shot panoptic segmentation by presenting Segmenting Panoptic Information with Nearly 0 labels (SPINO). In detail, our method combines a DINOv2 backbone with lightweight network heads for semantic segmentation and boundary estimation. We show that our approach, albeit being trained with only ten annotated images, predicts high-quality pseudo-labels that can be used with any existing panoptic segmentation method. Notably, we demonstrate that SPINO achieves competitive results compared to fully supervised baselines while using less than 0.3% of the ground truth labels, paving the way for learning complex visual recognition tasks by leveraging foundation models. To illustrate its general applicability, we further deploy SPINO on real-world robotic vision systems for both outdoor and indoor environments. To foster future research, we make the code and trained models publicly available at http://spino.cs.uni-freiburg.de.
|
|
10:30-12:00, Paper WeAT11-CC.7 | Add to My Program |
End-To-End Semantic Segmentation Network for Low-Light Scenes |
|
Mu, Hongmin | Beijing University of Chemical Technology |
Zhang, Gang | Beijing University of Chemical Technology |
Zhou, MengChu | New Jersey Institute of Technology |
Cao, Zhengcai | Harbin Institute of Technology |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Visual Learning
Abstract: In the fields of robotic perception and computer vision, achieving accurate semantic segmentation of low-light or nighttime scenes is challenging. This is primarily due to the limited visibility of objects and the reduced texture and color contrasts among them. To address the issue of limited visibility, we propose a hierarchical gated convolution unit, which simultaneously expands the receptive field and restores edge texture. To address the issue of reduced texture among objects, we propose a dual closed-loop bipartite matching algorithm to establish a total loss function consisting of the unsupervised illumination enhancement loss and supervised intersection-over-union loss, thus enabling the joint minimization of both losses via the Hungarian algorithm. We thus achieve end-to-end training for a semantic segmentation network especially suitable for handling low-light scenes. Experimental results demonstrate that the proposed network surpasses existing methods on the Cityscapes dataset and notably outperforms state-of-the-art methods on both Dark Zurich and Nighttime Driving datasets.
|
|
10:30-12:00, Paper WeAT11-CC.8 | Add to My Program |
DefFusion: Deformable Multimodal Representation Fusion for 3D Semantic Segmentation |
|
Xu, Rongtao | Institute of Automation, Chinese Academy of Sciences, Beijing, C |
Wang, Changwei | Casia |
Zhang, Duzhen | Institute of Automation, Chinese Academy of Sciences |
Zhang, Man | Beijing University of Posts and Telecommunications |
Xu, Shibiao | Beijing University of Posts and Telecommunications |
Meng, Weiliang | Institute of Automation, Chinese Academy of Sciences |
Zhang, Xiaopeng | National Laboratory of Pattern Recognition, Institute of Automat |
Keywords: Semantic Scene Understanding, Autonomous Agents, Sensor Fusion
Abstract: The complementarity between camera and LiDAR data makes fusion methods a promising approach to improving 3D semantic segmentation performance. Recent transformer-based methods have also demonstrated superiority in segmentation. However, multimodal solutions incorporating transformers are underexplored and face two key inherent difficulties: over-attention and noise from different modal data. To overcome these challenges, we propose a Deformable Multimodal Representation Fusion (DefFusion) framework consisting mainly of a Deformable Representation Fusion Transformer and Dynamic Representation Enhancement Modules. The Deformable Representation Fusion Transformer introduces a deformable mechanism into multimodal fusion, avoiding over-attention and improving efficiency by adaptively modeling a 2D key/value set for a given 3D query, thus enabling multimodal fusion with higher flexibility. To enhance the 2D and 3D representations, the Dynamic Representation Enhancement Module is proposed to dynamically remove noise in the input representation via Dynamic Grouped Representation Generation and Dynamic Mask Generation. Extensive experiments validate that our model achieves the best 3D semantic segmentation performance on the SemanticKITTI and NuScenes benchmarks.
|
|
10:30-12:00, Paper WeAT11-CC.9 | Add to My Program |
Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2 |
|
Rashid, Adam | UC Berkeley |
Kim, Chung Min | University of California, Berkeley |
Kerr, Justin | University of California, Berkeley |
Fu, Letian | UC Berkeley |
Hari, Kush | UC Berkeley |
Ahmad, Ayah | University of California, Berkeley |
Chen, Kaiyuan | University of California, Berkeley |
Huang, Huang | University of California at Berkeley |
Gualtieri, Marcus | Bosch Research |
Wang, Michael | Bosch |
Juette, Christian | Bosch Research |
Tian, Nan | University of California, Berkeley |
Ren, Liu | Robert Bosch North America Research Technology Center |
Goldberg, Ken | UC Berkeley |
Keywords: Semantic Scene Understanding, Continual Learning, SLAM
Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes and selectively updating these regions of the environment, avoiding the need to exhaustively remap. Human users can query inventory by providing natural language queries and receiving a 3D heatmap of potential object locations. To manage the computational load, we use FogROS2, a cloud robotics platform, to offload resource-intensive tasks. Lifelong LERF obtains poses from a monocular RGBD SLAM backend, and uses these poses to progressively optimize a Language Embedded Radiance Field (LERF) for semantic monitoring. Experiments with 3-5 objects arranged on a tabletop and a Turtlebot with a RealSense camera suggest that Lifelong LERF can persistently adapt to changes in objects with up to 91% accuracy.
|
|
WeAT12-CC Oral Session, CC-503 |
Add to My Program |
Deep Learning in Grasping and Manipulation IV |
|
|
Chair: Yamazaki, Kimitoshi | Shinshu University |
Co-Chair: Dijkman, Daniel | Qualcomm |
|
10:30-12:00, Paper WeAT12-CC.1 | Add to My Program |
RGBManip: Monocular Image-Based Robotic Manipulation through Active Object Pose Estimation |
|
An, Boshi | Peking University |
Geng, Yiran | Peking University |
Chen, Kai | The Chinese University of Hong Kong |
Li, Xiaoqi | Peking University |
Dou, Qi | The Chinese University of Hong Kong |
Dong, Hao | Peking University |
Keywords: AI-Based Methods, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: Robotic manipulation requires accurate perception of the environment, which poses a significant challenge due to its inherent complexity and constantly changing nature. In this context, RGB images and point-cloud observations are two commonly used modalities in vision-based robotic manipulation, but each of these modalities has its own limitations. Commercial point-cloud observations often suffer from issues like sparse sampling and noisy output due to the limits of the emission-reception imaging principle. On the other hand, RGB images, while rich in texture information, lack the essential depth and 3D information crucial for robotic manipulation. To mitigate these challenges, we propose an image-only robotic manipulation framework that leverages an eye-on-hand monocular camera installed on the robot's parallel gripper. By moving with the robot gripper, this camera gains the ability to actively perceive the object from multiple perspectives during the manipulation process. This enables the estimation of 6D object poses, which can be utilized for manipulation. While obtaining images from more and diverse viewpoints typically improves pose estimation, it also increases the manipulation time. To address this trade-off, we employ a reinforcement learning policy to synchronize the manipulation strategy with active perception, achieving a balance between 6D pose accuracy and manipulation efficiency. Our experimental results in both simulated and real-world environments showcase the state-of-the-art effectiveness of our approach. We believe that our method will inspire further research on real-world-oriented robotic manipulation.
|
|
10:30-12:00, Paper WeAT12-CC.2 | Add to My Program |
Part-Guided 3D RL for Sim2Real Articulated Object Manipulation |
|
Xie, Pengwei | Tsinghua University |
Chen, Rui | Tsinghua University |
Chen, Siang | Tsinghua University |
Qin, Yuzhe | UC San Diego |
Xiang, Fanbo | University of California San Diego |
Sun, Tianyu | Tsinghua University |
Xu, Jing | Tsinghua University |
Wang, Guijin | Tsinghua University |
Su, Hao | UCSD |
Keywords: Deep Learning in Grasping and Manipulation, RGB-D Perception, Reinforcement Learning
Abstract: Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on visual affordance learning or other pre-trained visual models to guide manipulation policies, which face challenges for novel instances in real-world scenarios. In this paper, we propose a novel part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of RL policy training. To improve the stability of the policy on real robots, we design a Frame-consistent Uncertainty-aware Sampling (FUS) strategy to get a condensed and hierarchical 3D representation. In addition, a single versatile RL policy can be trained on multiple articulated object manipulation tasks simultaneously in simulation and shows great generalizability to novel categories and instances. Experimental results demonstrate the effectiveness of our framework in both simulation and real-world settings.
|
|
10:30-12:00, Paper WeAT12-CC.3 | Add to My Program |
MORPH: Design Co-Optimization with Reinforcement Learning Via a Differentiable Hardware Model Proxy |
|
He, Zhanpeng | Columbia University |
Ciocarlie, Matei | Columbia University |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Mechanism Design
Abstract: We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning. Like most co-optimization methods, MORPH relies on a model of the hardware being optimized, usually simulated based on the laws of physics. However, such a model is often difficult to integrate into an effective optimization routine. To address this, we introduce a proxy hardware model, which is always differentiable and enables efficient co-optimization alongside a long-horizon control policy using RL. MORPH is designed to ensure that the optimized hardware proxy remains as close as possible to its realistic counterpart, while still enabling task completion. We demonstrate our approach on simulated 2D reaching and 3D multi-fingered manipulation tasks.
|
|
10:30-12:00, Paper WeAT12-CC.4 | Add to My Program |
Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots |
|
Lampe, Thomas | Google UK Ltd |
Abdolmaleki, Abbas | DeepMind |
Huang, Sandy H. | Google DeepMind |
Bechtle, Sarah | Google DeepMind |
Springenberg, Jost Tobias | Albert-Ludwigs Universitaet Freiburg |
Bloesch, Michael | Google |
Groth, Oliver | University of Oxford |
Hafner, Roland | Google DeepMind |
Hertweck, Tim | DeepMind |
Neunert, Michael | Google |
Wulfmeier, Markus | Google DeepMind |
Zhang, Jingwei | DeepMind |
Nori, Francesco | Google DeepMind |
Heess, Nicolas | Google Deepmind |
Riedmiller, Martin | DeepMind |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient through re-using previously collected sub-optimal data. In this paper we demonstrate how the increased understanding of off-policy learning methods and their embedding in an iterative online/offline scheme ("collect and infer") can drastically improve data-efficiency by using all the collected experience, which empowers learning from real robot experience only. Moreover, the resulting policy improves significantly over the state of the art on a recently proposed real robot manipulation benchmark. Our approach learns end-to-end, directly from pixels, and does not rely on additional human domain knowledge such as a simulator or demonstrations.
|
|
10:30-12:00, Paper WeAT12-CC.5 | Add to My Program |
Information-Driven Affordance Discovery for Efficient Robotic Manipulation |
|
Mazzaglia, Pietro | University of Gent |
Cohen, Taco | Qualcomm AI Research |
Dijkman, Daniel | Qualcomm |
Keywords: AI-Based Methods, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: Robotic affordances, providing information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning about affordances requires expensive, large annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem, and we propose an information-based measure to augment the agent's objective and accelerate the affordance discovery process. We provide a theoretical justification for our approach and empirically validate it in both simulated and real-world tasks. Our method, which we dub IDA, enables the efficient discovery of visual affordances for several action primitives, such as grasping, stacking objects, or opening drawers, strongly improving data efficiency in simulation, and it allows us to learn grasping affordances in a small number of interactions on a real-world setup with a UFACTORY xArm 6 robot arm.
|
|
10:30-12:00, Paper WeAT12-CC.6 | Add to My Program |
HybGrasp: A Hybrid Learning-To-Adapt Architecture for Efficient Robot Grasping |
|
Mun, Jungwook | Korea Advanced Institute of Science and Technology |
Truong Giang, Khang | KAIST |
Lee, Yunghee | Korea Advanced Institute of Science and Technology |
Oh, Nayoung | KAIST |
Huh, Sejoon | Korea Advanced Institute of Science and Technology |
Kim, Min | KAIST |
Jo, Sungho | Korea Advanced Institute of Science and Technology (KAIST) |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Multifingered Hands
Abstract: Despite the prevalence of robotic manipulation tasks in various real-world applications with differing requirements and needs, there has been a lack of focus on enhancing the adaptability of robotic grasping systems. Most of the current literature constructs models around a single gripper, succumbing to a tradeoff between gripper complexity and generalizability. Adapting such models, pre-trained on one type of gripper, to another to work around the tradeoff is inefficient and not scalable, as it would require tremendous effort and computational cost to generate new datasets and relearn the grasping task. In this letter, we propose a novel hybrid architecture for robot grasping that efficiently learns to adapt to different gripper designs. Our approach involves a three-step process that first obtains a rough grasp pose prediction from a parallel gripper model, then predicts an adaptive action using a convolutional neural network, and finally refines the predicted action with reinforcement learning. The proposed method shows significant improvements in grasping performance compared to existing methods for both generated datasets and real-world scenarios, presenting a promising direction for improving the adaptability and flexibility of robotic manipulation systems.
|
|
10:30-12:00, Paper WeAT12-CC.7 | Add to My Program |
Efficient Heatmap-Guided 6-DoF Grasp Detection in Cluttered Scenes |
|
Chen, Siang | Tsinghua University |
Tang, Wei | Tsinghua University |
Xie, Pengwei | Tsinghua University |
Yang, Wenming | Tsinghua University |
Wang, Guijin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, RGB-D Perception, Grasping
Abstract: Fast and robust object grasping in clutter is a crucial component of robotics. Most current works resort to the whole observed point cloud for 6-DoF grasp generation, ignoring the guidance information that can be extracted from global semantics, which limits both grasp quality and real-time performance. In this work, we show that widely used heatmaps are an underestimated tool for efficient 6-DoF grasp generation. We therefore propose an effective local grasp generator guided by grasp heatmaps, which infers in a global-to-local, semantic-to-point manner. Specifically, Gaussian encoding and a grid-based strategy are applied to predict grasp heatmaps that aggregate local points into graspable regions and provide global semantic information. Further, a novel non-uniform anchor sampling mechanism is designed to improve grasp accuracy and diversity. Benefiting from high-efficiency encoding in the image space and from focusing on points in local graspable regions, our framework performs high-quality grasp detection in real time and achieves state-of-the-art results. In addition, real robot experiments demonstrate the effectiveness of our method with a success rate of 94% and a clutter completion rate of 100%.
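Gaussian encoding of grasp locations into a heatmap can be sketched as follows (an illustrative sketch, not the paper's code; the grid size and sigma are made-up values): each candidate grasp center is rendered as a 2-D Gaussian peak on an image-space grid.

```python
import numpy as np

def gaussian_heatmap(h, w, center, sigma=2.0):
    """Render a grasp-confidence heatmap: a 2-D Gaussian at `center`
    on an h x w grid, peaking at 1.0 at the grasp location."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

# Encode a hypothetical grasp center; its argmax recovers the location.
hm = gaussian_heatmap(32, 32, (10, 20))
peak = np.unravel_index(np.argmax(hm), hm.shape)
```

In a heatmap-guided pipeline, such maps serve as supervision targets; at inference, high-heatmap regions select the local points that a grasp generator then refines.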
|
|
10:30-12:00, Paper WeAT12-CC.8 | Add to My Program |
A Hyper-Network Based End-To-End Visual Servoing with Arbitrary Desired Poses |
|
Yu, Hongxiang | Zhejiang University |
Chen, Anzhe | Zhejiang University |
Xu, Kechun | Zhejiang University |
Zhou, Zhongxiang | Zhejiang University |
Jing, Wei | Alibaba |
Wang, Yue | Zhejiang University |
Xiong, Rong | Zhejiang University |
Keywords: Deep Learning in Grasping and Manipulation, Transfer Learning, Visual Servoing
Abstract: Recently, several works have achieved end-to-end visual servoing (VS) for robotic manipulation by replacing the traditional controller with differentiable neural networks, but they lose the ability to servo arbitrary desired poses. This letter proposes a differentiable architecture for arbitrary-pose servoing: a hyper-network based neural controller (HPN-NC). HPN-NC consists of a hyper net and a low-level controller: the hyper net learns to generate the parameters of the low-level controller, and the controller uses the 2D keypoint error for control, as in traditional image-based visual servoing (IBVS). HPN-NC can complete six-degree-of-freedom visual servoing with large initial offsets. Taking advantage of the fully differentiable nature of HPN-NC, we provide a three-stage training procedure to servo real-world objects. With self-supervised end-to-end training, the performance of the integrated model can be further improved in unseen scenes, and the amount of manual annotation can be significantly reduced.
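The hyper-network pattern described above can be sketched in a few lines (a minimal linear sketch under assumed dimensions, not the HPN-NC architecture itself): one network maps the desired pose to the weights of a low-level controller that acts on 2-D keypoint errors.

```python
import numpy as np

rng = np.random.default_rng(0)

class HyperController:
    """Hypernetwork sketch: a linear hyper net maps a desired-pose
    embedding to the weight matrix of a low-level proportional
    controller acting on flattened 2-D keypoint errors (IBVS-style)."""

    def __init__(self, pose_dim=6, n_kp=4, act_dim=6):
        self.n_kp, self.act_dim = n_kp, act_dim
        # Hyper-net parameters: one linear map per controller weight.
        self.W_hyper = rng.normal(0.0, 0.1, (act_dim * 2 * n_kp, pose_dim))

    def controller_weights(self, desired_pose):
        """Generate the low-level controller's gain matrix for this pose."""
        w = self.W_hyper @ desired_pose
        return w.reshape(self.act_dim, 2 * self.n_kp)

    def act(self, desired_pose, kp_error):
        """kp_error: flattened 2-D errors of n_kp tracked keypoints."""
        return self.controller_weights(desired_pose) @ kp_error

hc = HyperController()
action = hc.act(np.ones(6), np.zeros(8))  # zero error -> zero command
```

The point of the construction is that changing the desired pose changes the controller itself, not just its input, which is what restores arbitrary-pose servoing.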
|
|
10:30-12:00, Paper WeAT12-CC.9 | Add to My Program |
6-DoF Closed-Loop Grasping with Reinforcement Learning |
|
Herland, Sverre | Norwegian University of Science and Technology |
Bach, Kerstin | Norwegian University of Science and Technology |
Misimi, Ekrem | SINTEF Ocean |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Perception for Grasping and Manipulation
Abstract: We present a novel vision-based, 6-DoF grasping framework based on Deep Reinforcement Learning (DRL) that is capable of directly synthesizing continuous 6-DoF actions in Cartesian space. Our proposed approach uses visual observations from an eye-in-hand RGB-D camera, and we mitigate the sim-to-real gap with a combination of domain randomization, image augmentation, and segmentation tools. Our method consists of an off-policy, maximum-entropy, Actor-Critic algorithm that learns a policy from a binary reward and a few simulated example grasps. It does not need any real-world grasping examples, is trained completely in simulation, and is deployed directly to the real world without any fine-tuning. The efficacy of our approach is demonstrated in simulation and experimentally validated in the real world on 6-DoF grasping tasks, achieving state-of-the-art results of an 86% mean zero-shot success rate on previously unseen objects, an 85% mean zero-shot success rate on a class of previously unseen adversarial objects, and a 74.3% mean zero-shot success rate on a class of previously unseen, challenging "6-DoF" objects. Raw footage of real-world validation can be found at https://youtu.be/bwPf8Imvook
|
|
WeAT13-AX Oral Session, AX-201 |
Add to My Program |
Human-Robot Collaboration I |
|
|
Chair: Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
Co-Chair: Zhang, Yunbo | Rochester Institute of Technology |
|
10:30-12:00, Paper WeAT13-AX.1 | Add to My Program |
Self-Supervised 6-DoF Robot Grasping by Demonstration Via Augmented Reality Teleoperation System |
|
Dengxiong, Xiwen | Rochester Institute of Technology |
Wang, Xueting | Rochester Institute of Technology |
Bai, Shi | Wing |
Zhang, Yunbo | Rochester Institute of Technology |
Keywords: Human-Centered Automation, Telerobotics and Teleoperation, Learning from Demonstration
Abstract: Most existing 6-DoF robot grasping solutions depend on strong supervision of the grasp pose to ensure satisfactory performance, which can be laborious and impractical when the robot works in restricted areas. To this end, we propose a self-supervised 6-DoF grasp pose detection framework built on an Augmented Reality (AR) teleoperation system that can efficiently learn from human demonstrations and provide 6-DoF grasp poses without grasp pose annotations. Specifically, the system collects human demonstrations in the AR environment and contrastively learns the grasping strategy from them. In real-world experiments, the proposed system achieves satisfactory grasping performance and learns to grasp unknown objects within three demonstrations.
|
|
10:30-12:00, Paper WeAT13-AX.2 | Add to My Program |
Trust Recognition in Human-Robot Cooperation Using EEG |
|
Xu, Caiyue | Tongji University |
Zhang, Changming | Tongji University |
Zhou, Yanmin | Tongji University |
Wang, Zhipeng | Tongji University |
Lu, Ping | Tongji University |
He, Bin | Tongji University |
Keywords: Acceptability and Trust, Human-Robot Collaboration
Abstract: Collaboration between humans and robots is becoming increasingly crucial in our daily life. In order to accomplish efficient cooperation, trust recognition is vital, empowering robots to predict human behaviors and make trust-aware decisions. Consequently, there is an urgent need for a generalized approach to recognize human-robot trust. This study addresses this need by introducing an EEG-based method for trust recognition during human-robot cooperation. A human-robot cooperation game scenario is used to stimulate various human trust levels when working with robots. To enhance recognition performance, the study proposes an EEG Vision Transformer model coupled with a 3-D spatial representation to capture the spatial information of EEG, taking into account the topological relationship among electrodes. To validate this approach, a public EEG-based human trust dataset called EEGTrust is constructed. Experimental results indicate the effectiveness of the proposed approach, achieving an accuracy of 74.99% in slice-wise cross-validation and 62.00% in trial-wise cross-validation. This outperforms baseline models in both recognition accuracy and generalization. Furthermore, an ablation study demonstrates a significant improvement in trust recognition performance of the spatial representation. The source code and EEGTrust dataset are available at https://github.com/CaiyueXu/EEGTrust.
|
|
10:30-12:00, Paper WeAT13-AX.3 | Add to My Program |
Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test (I) |
|
Khojasteh, Behnam | Max Planck Institute for Intelligent Systems |
Solowjow, Friedrich | RWTH Aachen University |
Trimpe, Sebastian | RWTH Aachen University |
Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
Keywords: Human-Centered Automation, Force and Tactile Sensing
Abstract: Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns.
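The kernel two-sample test at the heart of this data-versus-data approach compares two sample sets via the Maximum Mean Discrepancy (MMD). A minimal sketch with an RBF kernel follows (the bandwidth and synthetic data are illustrative assumptions, not from the paper):

```python
import numpy as np

def mmd2_unbiased(X, Y, gamma=1.0):
    """Unbiased squared Maximum Mean Discrepancy with an RBF kernel:
    the test statistic of the kernel two-sample test. Near zero when
    X and Y come from the same distribution, large otherwise."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))  # drop diagonal
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(1)
same = mmd2_unbiased(rng.normal(0, 1, (60, 3)), rng.normal(0, 1, (60, 3)))
diff = mmd2_unbiased(rng.normal(0, 1, (60, 3)), rng.normal(3, 1, (60, 3)))
```

For surface recognition, `X` and `Y` would be feature sets extracted from two recordings (images, sounds, haptic signals); a large statistic indicates two different surfaces.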
|
|
10:30-12:00, Paper WeAT13-AX.4 | Add to My Program |
Learning User Preferences for Complex Cobotic Tasks: Meta-Behaviors and Human Groups |
|
Vella, Elena | The University of Melbourne |
Chapman, Airlie | University of Melbourne |
Lipovetzky, Nir | The University of Melbourne |
Keywords: Acceptability and Trust, Human-Robot Teaming, Human-Robot Collaboration
Abstract: In complex tasks (beyond a single targeted controller) requiring robots to collaborate with multiple human users, two challenges arise: complex tasks are often composed of multiple behaviors which can only be evaluated as a collective (a meta-behavior) and user preferences often differ between individuals, yet successful interactions are expected across groups. To address these challenges, we formulate a set-wise preference learning problem, and validate a cost function that captures human group preferences for complex collaborative robotic tasks (cobotics). We develop a sparse optimization formulation to introduce a distinctiveness metric that aggregates individuals with similar preference profiles. Analysis of anonymized unlabelled preferences provides further insight into group preferences. Identification of the mode average most-preferred meta-behavior and minimum covariance bound allows us to analyze group cohesion. A user study with 43 participants is used to validate group preference profiles.
|
|
10:30-12:00, Paper WeAT13-AX.5 | Add to My Program |
Learning Self-Confidence from Semantic Action Embeddings for Improved Trust in Human-Robot Interaction |
|
Goubard, Cedric | Imperial College London |
Demiris, Yiannis | Imperial College London |
Keywords: Acceptability and Trust, Human-Centered Robotics, Human-Robot Collaboration
Abstract: In Human-Robot Interaction scenarios, human factors like trust can greatly impact task performance and interaction quality. Recent research has confirmed that perceived robot proficiency is a major antecedent of trust. By making robots aware of their capabilities, we can allow them to choose when to perform low-confidence actions, thus actively controlling the risk of trust reduction. In this paper, we propose Self-Confidence through Observed Novel Experiences (SCONE), a policy to learn self-confidence from experience using semantic action embeddings. Using an assistive cooking setting, we show that the semantic aspect allows SCONE to learn self-confidence faster than existing approaches, while also achieving promising performance in simple instruction following. Finally, we share results from a pilot study with 31 participants, showing that such a self-confidence-aware policy increases capability-based human trust.
|
|
10:30-12:00, Paper WeAT13-AX.6 | Add to My Program |
Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models |
|
Zhang, Zhen | The Chinese University of Hong Kong |
Lin, Anran | The Chinese University of Hong Kong |
Wong, Chun Wai | The Chinese University of Hong Kong |
Chu, Xiangyu | The Chinese University of Hong Kong |
Dou, Qi | The Chinese University of Hong Kong |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Human-Centered Robotics, AI-Based Methods, Reactive and Sensor-Based Planning
Abstract: This paper proposes an interactive navigation framework by using large language and vision-language models, allowing robots to navigate in environments with traversable obstacles. We utilize the large language model (GPT-3.5) and the open-set Vision-language Model (Grounding DINO) to create an action-aware costmap to perform effective path planning without fine-tuning. With the large models, we can achieve an end-to-end system from textual instructions like “Can you pass through the curtains to deliver medicines to me?”, to bounding boxes (e.g., curtains) with action-aware attributes. They can be used to segment LiDAR point clouds into two parts: traversable and untraversable parts, and then an action-aware costmap is constructed for generating a feasible path. The pre-trained large models have great generalization ability and do not require additional annotated data for training, allowing fast deployment in the interactive navigation tasks. We choose to use multiple traversable objects such as curtains and grasses for verification by instructing the robot to traverse them. Besides, traversing curtains in a medical scenario was tested. All experimental results demonstrated the proposed framework’s effectiveness and adaptability to diverse environments.
|
|
10:30-12:00, Paper WeAT13-AX.7 | Add to My Program |
From Unstable Electrode Contacts to Reliable Control: A Deep Learning Approach for HD-sEMG in Neurorobotics |
|
Tyacke, Eion | New York University |
Gupta, Kunal | New York University |
Patel, Jay | New York University |
Katoch, Raghav | New York University |
Atashzar, S. Farokh | New York University (NYU), US |
Keywords: Human-Centered Robotics, Brain-Machine Interfaces, Gesture, Posture and Facial Expressions
Abstract: In the past decade, there has been significant advancement in designing wearable neural interfaces for controlling neurorobotic systems, particularly bionic limbs. These interfaces function by decoding signals captured non-invasively from the skin's surface. Portable high-density surface electromyography (HD-sEMG) modules combined with deep learning decoding have attracted interest by achieving excellent gesture prediction and myoelectric control of prosthetic systems and neurorobots. However, factors like small electrode size and unstable electrode-skin contacts make HD-sEMG susceptible to pixel-electrode drops. Sparse electrode-skin disconnections, rooted in issues such as low adhesion, sweating, hair blockage, and skin stretch, challenge the reliability and scalability of these modules as the perception unit for neurorobotic systems. This paper proposes a novel deep-learning model providing resiliency for HD-sEMG modules that can be used in the wearable interfaces of neurorobots. The proposed 3D Dilated Efficient CapsNet model trains on an augmented input space to computationally "force" the network to learn channel-dropout variations and thus become robust to channel dropout. The proposed framework maintained high performance in a sensor-dropout reliability study. Results show that conventional models' performance degrades significantly with dropout and is recovered using the proposed architecture and training paradigm.
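The channel-dropout augmentation idea can be sketched simply (an illustrative sketch, not the paper's augmentation pipeline; array shapes and the drop probability are assumptions): random electrode channels are zeroed during training so the decoder learns to tolerate disconnections.

```python
import numpy as np

def channel_dropout(emg, drop_prob=0.1, rng=None):
    """Zero out random electrode channels of an HD-sEMG window to
    simulate electrode-skin disconnections. Training a decoder on
    such augmented inputs is one way to make it robust to
    pixel-electrode drops."""
    rng = rng or np.random.default_rng()
    mask = rng.random(emg.shape[0]) >= drop_prob  # one keep-flag per channel
    return emg * mask[:, None], mask

rng = np.random.default_rng(42)
emg = np.ones((64, 200))  # hypothetical 64-channel, 200-sample window
aug, mask = channel_dropout(emg, drop_prob=0.25, rng=rng)
```

During training, each minibatch would receive a fresh dropout mask, exposing the network to many disconnection patterns.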
|
|
10:30-12:00, Paper WeAT13-AX.8 | Add to My Program |
Enhanced Human-Robot Collaboration with Intent Prediction Using Deep Inverse Reinforcement Learning |
|
Mitra, Mukund | IISc Bangalore |
Kumar, Gyanig | Indian Institute of Sciences, India |
Chakrabarti, Partha Pratim | Indian Institute of Technology, Kharagpur, India |
Biswas, Pradipta | Indian Institute of Science |
Keywords: Human-Centered Automation, Intention Recognition, Human-Robot Collaboration
Abstract: In shared autonomy, human-robot handover for object delivery is crucial. Accurate robot predictions of human hand motion and intentions enhance collaboration efficiency. However, low prediction accuracy increases mental and physical demands on the user. In this work, we propose a system for predicting hand motion and the intended target during human-robot handover using Inverse Reinforcement Learning (IRL). A set of feature functions was designed to explicitly capture users' preferences during the task. The proposed approach was experimentally validated through user studies. Results indicate that the proposed method outperformed other state-of-the-art methods (PI-IRL, BP-HMT, RNNIK-MKF and CMk=5), with users feeling comfortable reaching up to 60% of the total distance to the target for handover, with 90% target prediction accuracy. The target prediction accuracy reaches 99.9% when less than 20% of the task remains.
|
|
10:30-12:00, Paper WeAT13-AX.9 | Add to My Program |
ToP-ToM: Trust-Aware Robot Policy with Theory of Mind |
|
Yu, Chuang | University College London |
Serhan, Baris | The University of Manchester |
Cangelosi, Angelo | University of Manchester |
Keywords: Cognitive Control Architectures, Acceptability and Trust, Human Factors and Human-in-the-Loop
Abstract: Theory of Mind (ToM) is a fundamental cognitive capacity that endows humans with the ability to attribute mental states to others. Humans infer the desires, beliefs, and intentions of others by observing their behavior and, in turn, adjust their actions to facilitate better interpersonal communication and team collaboration. In this paper, we investigated a trust-aware robot policy with theory of mind in a multiagent setting where a human collaborates with a robot against another human opponent. We show that by focusing only on team performance, the robot may resort to the reverse psychology trick, which poses a significant threat to trust maintenance. The human's trust in the robot collapses when they discover deceptive behavior by the robot. To mitigate this problem, we adopt a robot theory of mind model to infer the human's trust beliefs, including true belief and false belief (an essential element of ToM). We designed a dynamic trust-aware reward function based on different trust beliefs to guide robot policy learning, which aims to balance avoiding human trust collapse due to robot reverse psychology against leveraging its potential to boost team performance. The experimental results demonstrate the importance of a ToM-based robot policy for human-robot trust and the effectiveness of our ToM-based robot policy in multiagent interaction settings.
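One way to read the trust-aware reward is as a task reward minus a deception penalty weighted by the inferred trust belief. The toy function below is a hypothetical sketch of that balance, not the authors' reward function:

```python
def trust_aware_reward(team_reward, deceptive, p_true_belief, penalty=1.0):
    """Sketch of a dynamic trust-aware reward: deceptive actions are
    penalised in proportion to the inferred probability that the human
    holds a true belief about the robot's behavior (and would thus
    detect the deception, collapsing trust)."""
    trust_cost = penalty * p_true_belief if deceptive else 0.0
    return team_reward - trust_cost

# When the human likely sees through the trick, deception loses value.
r_deceive = trust_aware_reward(1.0, deceptive=True, p_true_belief=0.8)
r_honest = trust_aware_reward(0.6, deceptive=False, p_true_belief=0.8)
```

Under a belief-dependent penalty like this, a policy learner only exploits reverse psychology when the inferred risk of detection is low.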
|
|
WeAT15-AX Oral Session, AX-203 |
Add to My Program |
Human Factors and Human-In-The-Loop I |
|
|
Chair: Ciocarlie, Matei | Columbia University |
Co-Chair: De Momi, Elena | Politecnico Di Milano |
|
10:30-12:00, Paper WeAT15-AX.1 | Add to My Program |
VIDAR: Data Quality Improvement for Monocular 3D Reconstruction through In-Situ Visual Interaction |
|
Gao, Han | National Key Lab for Novel Software Technology, Nanjing University |
Liu, Yating | Nanjing University |
Cao, Fang | Nanjing University |
Wu, Hao | Nanjing University |
Xu, Fengyuan | National Key Lab for Novel Software Technology, Nanjing University |
Zhong, Sheng | Nanjing University |
Keywords: Human Factors and Human-in-the-Loop
Abstract: 3D reconstruction based on monocular videos has attracted wide attention, and existing reconstruction methods usually work in a reconstruction-after-scanning manner. However, these methods suffer from insufficient data collection due to the lack of effective guidance for users during the scanning process, which affects reconstruction quality. We propose VIDAR, which visually guides users with a streaming, incrementally reconstructed mesh during data collection for monocular 3D reconstruction. We propose an incremental mesh extraction algorithm that achieves lossless fusion of streaming incremental mesh data via slice-style management for guidance quality, and we design an incremental mesh rendering algorithm that achieves precise memory reallocation by updating the buffer in a fill-in-the-blank pattern for guidance efficiency. Besides, we introduce several optimizations of data transmission and human-computer interaction to improve overall system performance. Experimental results on real-world scenes show that VIDAR efficiently delivers high-quality visual guidance and outperforms non-interactive data collection methods for scene reconstruction.
|
|
10:30-12:00, Paper WeAT15-AX.2 | Add to My Program |
Transparency Control of a 1-DoF Knee Exoskeleton Via Human-In-The-Loop Velocity Optimisation |
|
Cha, Lukas | Technical University of Munich |
Guez, Annika | Imperial College London |
Chen, Chih-Yu | Technical University of Munich |
Kim, Sion | Imperial College London |
Yu, Zhenhua | Imperial College London |
Xiao, Bo | Imperial College London |
Vaidyanathan, Ravi | Imperial College London |
Keywords: Human Factors and Human-in-the-Loop, Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Rehabilitative robotics, particularly lower-limb exoskeletons (LLEs), have gained increasing importance in aiding patients regain ambulatory functions. One of the challenges in making these systems effective is the implementation of an assist-as-needed (AAN) control strategy that intervenes only when the patient deviates from the correct movement pattern. Equally crucial is the need for the LLE to exhibit "transparency" — minimising its interaction forces with the wearer to feel as natural as possible. This paper introduces a novel approach to transparency control based on a human-in-the-loop velocity optimisation framework. The proposed method employs torque data captured from past steps through a Series Elastic Actuator (SEA) to approximate the wearer's intended future movements and computes a corresponding transparent velocity trajectory. The velocity commands are complemented by an Adaptive Frequency Oscillator (AFO) based position controller that leverages the periodic nature of human gait and is modified with a force sensor for increased reactiveness to human gait variations. This approach is experimentally evaluated against a standard zero-torque controller with a stationary single-degree-of-freedom knee exoskeleton test platform in a proof-of-concept study. Preliminary results indicate that combining adaptive oscillators with interaction force sensing can improve transparency compared to the conventional zero-torque controller, using force readings for position control and torque measurements for velocity optimisation and control.
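The Adaptive Frequency Oscillator (AFO) used in the position controller can be sketched in its standard phase form (a generic Righetti-style sketch under assumed gains and a synthetic input, not the paper's controller): the oscillator's frequency state entrains to the frequency of a periodic input such as gait.

```python
import math

def afo_track(signal, omega0=1.0, K=2.0, dt=0.002, steps=200_000):
    """Phase-form Adaptive Frequency Oscillator:
        phi'   = omega - K * F(t) * sin(phi)
        omega' =        -K * F(t) * sin(phi)
    The learned frequency omega converges toward the frequency of
    the periodic input F, which is what lets the exoskeleton
    anticipate the wearer's gait phase."""
    phi, omega = 0.0, omega0
    for i in range(steps):
        F = signal(i * dt)
        coupling = K * F * math.sin(phi)
        phi += (omega - coupling) * dt
        omega += -coupling * dt
    return omega

# Hypothetical gait-like signal at 2.5 rad/s; omega should approach 2.5.
omega = afo_track(lambda t: math.sin(2.5 * t))
```

In the exoskeleton context, `signal` would be replaced by a measured periodic quantity (e.g. the interaction force), and the entrained phase/frequency drives the transparent velocity trajectory.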
|
|
10:30-12:00, Paper WeAT15-AX.3 | Add to My Program |
Towards Enhanced Human Activity Recognition for Real-World Human-Robot Collaboration |
|
Yalcinkaya, Beril | Ingeniarius Lda |
Couceiro, Micael | University of Coimbra |
Pina, Lucas | Ingeniarius Lda |
Soares, Salviano | UTAD |
Valente, António | University of Trás Os Montes and Alto Douro |
Remondino, Fabio | FBK |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Robotics and Automation in Agriculture and Forestry
Abstract: This research contributes to the field of Human-Robot Collaboration (HRC) within dynamic and unstructured environments by extending the previously proposed Fuzzy State-Long Short-Term Memory (FS-LSTM) architecture to handle the uncertainty and irregularity inherent in real-world sensor data. Recognising the challenges posed by low-cost sensors, which are highly susceptible to environmental conditions and often fail to provide regular periodic readings, this paper introduces additional pre-processing blocks. These include two indirect Kalman filters and an additional LSTM network, which together enhance the input variables for the fuzzification process. The enhanced FS-LSTM approach is evaluated using real-world data, demonstrating its effectiveness in extracting meaningful information and accurately recognising human activities. This work underscores the potential of robotics in addressing global challenges, particularly in labour-intensive and hazardous tasks. By improving the integration of humans and robots in unstructured environments, this research contributes to the broader exploration of robotics in new societal applications, fostering connections and collaborations across diverse fields.
|
|
10:30-12:00, Paper WeAT15-AX.4 | Add to My Program |
Self-Supervised Regression of sEMG Signals Combining Non-Negative Matrix Factorization with Deep Neural Networks for Robot Hand Multiple Grasping Motion Control |
|
Meattini, Roberto | University of Bologna |
Caporali, Alessio | University of Bologna |
Bernardini, Alessandra | University of Bologna |
Palli, Gianluca | University of Bologna |
Melchiorri, Claudio | University of Bologna |
Keywords: Human Factors and Human-in-the-Loop, Intention Recognition
Abstract: Advanced Human-In-The-Loop (HITL) control strategies for robot hands based on surface electromyography (sEMG) are among the major research questions in robotics. Due to the intrinsic complexity and inaccuracy of labeling procedures, unsupervised regression of sEMG signals has been employed in the literature, but it shows several limitations for realizing multiple-grasp motion control. In this work, we propose a novel Human-Robot interface (HRi) based on self-supervised regression of sEMG signals, combining Non-Negative Matrix Factorization (NMF) with Deep Neural Networks (DNN) to both avoid explicit labeling procedures and retain powerful nonlinear fitting capabilities. Experiments involving 10 healthy subjects were carried out, consisting of an offline session for systematic evaluation and comparison with traditional unsupervised approaches, and an online session assessing real-time control of a wearable anthropomorphic robot hand. The offline results demonstrate that the proposed self-supervised regression approach outperformed traditional unsupervised methods, even considering different robot hands with dissimilar kinematic structures. Furthermore, the subjects were able to successfully perform online control of multiple grasping motions of a real wearable robot hand, reporting high reliability over repeated grasp-transportation-release tasks with different objects. Statistical support is provided along with the experimental outcomes.
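The NMF side of such a pipeline can be sketched with the classic multiplicative-update rules (a generic sketch on synthetic data, not the paper's implementation; matrix sizes are assumptions): a non-negative sEMG envelope matrix is factored into synergies and their activations, which then serve as self-supervised regression targets.

```python
import numpy as np

def nmf(V, k, iters=1000, seed=0):
    """Multiplicative-update NMF: V (channels x time) ~= W @ H, with
    W the muscle-synergy basis and H its activations, all entries
    non-negative. Updates preserve non-negativity by construction."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + 1e-3
    H = rng.random((k, m)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Synthetic rank-2 non-negative "envelope" data: 8 channels, 100 samples.
rng = np.random.default_rng(3)
V = rng.random((8, 2)) @ rng.random((2, 100))
W, H = nmf(V, k=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In the self-supervised scheme described above, the activations `H` (rather than manual labels) would supervise a DNN mapping raw sEMG to grasp commands.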
|
|
10:30-12:00, Paper WeAT15-AX.5 | Add to My Program |
Maximising Coefficiency of Human-Robot Handovers through Reinforcement Learning |
|
Lagomarsino, Marta | Istituto Italiano Di Tecnologia |
Lorenzini, Marta | Istituto Italiano Di Tecnologia |
Constable, Merryn Dale | Northumbria University |
De Momi, Elena | Politecnico Di Milano |
Becchio, Cristina | University Medical Center Hamburg-Eppendorf |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Human Factors and Human-in-the-Loop, Physical Human-Robot Interaction, Human-Centered Robotics
Abstract: Handing objects to humans is an essential capability for collaborative robots. Previous research on human-robot handovers focuses on facilitating the performance of the human partner and possibly minimising the physical effort needed to grasp the object. However, altruistic robot behaviours may result in protracted and awkward robot motions, contributing to unpleasant sensations for the human partner and affecting perceived safety and social acceptance. This paper investigates whether transferring the psychological principle that "humans act coefficiently as a group" (i.e. simultaneously maximising the benefits of all agents involved) to human-robot cooperative tasks promotes a more seamless and natural interaction. Human-robot coefficiency is first modelled by identifying implicit indicators of human comfort and discomfort and by calculating the robot energy consumption in performing the desired trajectory. We then present a reinforcement learning approach that uses the human-robot coefficiency score as reward to adapt and learn online the combination of robot interaction parameters that maximises such coefficiency. Results showed that by acting coefficiently the robot could meet the individual preferences of most subjects involved in the experiments, improve the human's perceived comfort, and foster trust in the robotic partner.
|
|
10:30-12:00, Paper WeAT15-AX.6 | Add to My Program |
Jacquard V2: Refining Datasets Using the Human in the Loop Data Correction Method |
|
Li, Qiuhao | Northeastern University |
Yuan, Shenghai | Nanyang Technological University |
Keywords: Human Factors and Human-in-the-Loop, Learning Categories and Concepts, Data Sets for Robotic Vision
Abstract: In the context of rapid advancements in industrial automation, vision-based robotic grasping plays an increasingly crucial role. To enhance visual recognition accuracy, large-scale datasets are imperative for training models to acquire implicit knowledge about handling various objects. Creating datasets from scratch is a time- and labor-intensive process. Moreover, existing datasets often contain errors due to automated annotations aimed at expediency, making the improvement of these datasets a substantial research challenge. Consequently, several issues have been identified in the annotation of grasp bounding boxes in the popular Jacquard Grasp dataset. We propose utilizing a Human-In-The-Loop (HIL) method to enhance dataset quality. This approach relies on backbone deep learning networks to predict object positions and orientations for robotic grasping. Predictions with Intersection over Union (IoU) values below 0.2 undergo an assessment by human operators. After this evaluation, the data is categorized into False Negatives (FN) and True Negatives (TN). FN are then subcategorized into either missing annotations or catastrophic labeling errors. Images lacking labels are augmented with valid grasp bounding box information, whereas images afflicted by catastrophic labeling errors are completely removed. The open-source tool Labelbee was employed for 53,026 iterations of HIL dataset enhancement, leading to the removal of 2,884 images and the incorporation of ground truth information for 30,292 images. The enhanced dataset, named the Jacquard V2 Grasping Dataset, served as the training data for a range of neural networks. We empirically demonstrate that these dataset improvements significantly enhance the training and prediction performance of the same network, resulting in an increase of 7.1% across most popular detection architectures over ten iterations. This refined dataset will be accessible on Google Drive and Baidu Netdisk.
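The IoU gate that routes predictions to human review can be sketched as follows (an illustrative sketch with made-up boxes; axis-aligned IoU stands in for whatever box representation the dataset uses):

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def flag_for_review(preds, gts, thresh=0.2):
    """Return indices of predictions whose IoU with the annotation falls
    below the threshold; these are the cases a human operator inspects
    in the HIL correction loop."""
    return [i for i, (p, g) in enumerate(zip(preds, gts)) if iou(p, g) < thresh]

preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
gts = [(0, 0, 10, 10), (80, 80, 90, 90)]
```

Only the flagged subset reaches the annotators, which is what keeps a 50k-image correction pass tractable.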
|
|
10:30-12:00, Paper WeAT15-AX.7 | Add to My Program |
Building User Proficiency in Piloting Small Unmanned Aerial Vehicles (sUAV) |
|
Kunde, Siya | University of Nebraska |
Duncan, Brittany | University of Nebraska, Lincoln |
Keywords: Human Factors and Human-in-the-Loop, Design and Human Factors, Long term Interaction
Abstract: Assessing proficiency in small unmanned aerial vehicle (sUAV) pilots is complex and not well understood, but increasingly important for employing these vehicles in serious jobs such as wildland firefighting and infrastructure inspection. The limited prior work has focused on user training with modalities like simulators and VR, with no performance assessments for line-of-sight UAVs. This paper presents a training methodology for novice pilots of sUAVs. We present two studies: a Baseline study (21 participants) and a Training study (16 participants). Our work is of interest to sUAV operators, regulators, and companies developing these technologies seeking a workforce capable of consistent, safe operations. We successfully utilized the method developed in our prior work (Kunde and Duncan, 2022) to assess user proficiency in flying UAVs. We present a UAV pilot training schedule for novice users (in the Training study) and determine the minimum training time necessary to observe performance gains and mitigate damage. Results indicate that task completions noticeably improved and crashes were minimized by day 10 of training, with a training plateau observed by day 15.
|
|
10:30-12:00, Paper WeAT15-AX.8 |
A Probabilistic Model for Cobot Decision Making to Mitigate Human Fatigue in Repetitive Co-Manipulation Tasks |
|
Yaacoub, Aya | LORIA-CNRS |
Thomas, Vincent | LORIA - Universite De Lorraine |
Colas, Francis | Inria Nancy Grand Est |
Maurice, Pauline | Cnrs - Loria |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Planning under Uncertainty
Abstract: Work-related musculoskeletal disorders (WMSDs) are very common. Repetitive motion, which is often present in industrial work, is one of their main physical causes: it loads the same set of human joints repeatedly, which leads to localized joint fatigue. In this work, we present a framework to plan a collaborative robot policy that reduces long-term human fatigue in highly repetitive co-manipulation tasks, while accounting for the uncertainty in the human postural reaction to the robot motion and the partial observability of the human fatigue state. We model the problem as a continuous-state Partially Observable Markov Decision Process (POMDP), and use a physics-based digital human simulator to predict the fatigue cost of possible robot actions. We then use an online planning algorithm to compute the optimal robot policy. We demonstrate our approach in a simulated experiment in which a robot repeatedly carries an object for the human to work on, and the object's Cartesian pose needs to be optimized. We compare the policy generated with our approach against random, cyclic, and greedy (short-term optimization) policies for different user profiles. We show that our approach outperforms the other policies in all tested scenarios.
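The trade-off the planner addresses can be illustrated on a toy joint-fatigue model contrasting a varying (here greedy, one of the paper's baselines) policy with a fixed object pose. All joint names, loads, and recovery rates below are invented for illustration; the paper's POMDP planner and digital human simulator are far richer:

```python
JOINTS = ["shoulder", "elbow", "wrist"]
# Hypothetical per-cycle joint loading for three candidate object poses.
POSE_LOAD = {
    "high": {"shoulder": 0.10, "elbow": 0.02, "wrist": 0.01},
    "mid":  {"shoulder": 0.03, "elbow": 0.06, "wrist": 0.02},
    "low":  {"shoulder": 0.01, "elbow": 0.03, "wrist": 0.08},
}
RECOVERY = 0.02  # passive recovery per cycle for every joint

def step(fatigue, pose):
    """Accumulate load on the joints a pose stresses; all joints recover a little."""
    return {j: max(0.0, fatigue[j] + POSE_LOAD[pose][j] - RECOVERY)
            for j in JOINTS}

def greedy_policy(fatigue):
    """Short-term optimization: minimize worst-joint fatigue one cycle ahead."""
    return min(POSE_LOAD, key=lambda p: max(step(fatigue, p).values()))

def run(policy, cycles=50):
    fatigue = {j: 0.0 for j in JOINTS}
    for _ in range(cycles):
        fatigue = step(fatigue, policy(fatigue))
    return max(fatigue.values())

worst_greedy = run(greedy_policy)
worst_fixed = run(lambda f: "high")  # never varying the object pose
```

Rotating poses spreads load across joints and keeps the worst joint far below the fixed-pose case, which is the localized-fatigue effect the paper's long-horizon planner optimizes under uncertainty.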
|
|
WeAT16-AX Oral Session, AX-204 |
Force and Tactile Sensing IV |
|
|
Chair: Konyo, Masashi | Tohoku University |
Co-Chair: Chen, Chao | Monash University |
|
10:30-12:00, Paper WeAT16-AX.1 |
GelRoller: A Rolling Vision-Based Tactile Sensor for Large Surface Reconstruction Using Self-Supervised Photometric Stereo Method |
|
Zhang, Zhiyuan | Huazhong University of Science and Technology |
Ma, Huan | Huazhong University of Science and Technology |
Zhou, Yulin | Huazhong University of Science and Technology |
Ji, Jingjing | Huazhong University of Science and Technology |
Yang, Hua | Huazhong University of Science and Technology |
Keywords: Force and Tactile Sensing, Deep Learning in Grasping and Manipulation, Product Design, Development and Prototyping
Abstract: Accurate perception of the surrounding environment stands as a primary objective for robots. Through tactile interaction, vision-based tactile sensors provide the capability to capture high-resolution and multi-modal surface information of objects, thereby facilitating robots in achieving more dexterous manipulations. However, the prevailing GelSight sensors entail intricate calibration procedures, posing challenges in their application on curved surfaces and requiring the maintenance of stable lighting conditions throughout experimentation. Additionally, constrained by shape and structure, current vision-based tactile sensors are predominantly applied to measurements within a limited area. In this study, we design a novel cylindrical vision-based tactile sensor that enables continuous and swift perception of large-scale object surfaces through rolling. To tackle the challenges posed by laborious calibration processes, we propose a self-supervised photometric stereo method based on deep learning, which eliminates pre-calibration requirements and enables the derivation of surface normals from a single image without relying on stable lighting conditions. Finally, we perform surface reconstruction from the normals and point-cloud registration across the multiple frames of images obtained by rolling the cylindrical sensor, resulting in a reconstruction of the large surface. We compare our method with the representative lookup-table method of the GelSight sensors. The results show that the proposed method enhances both reconstruction accuracy and robustness, thereby demonstrating the potential of the proposed sensor in large-scale surface reconstruction. Codes and mechanical structures are available at: https://github.com/ZhangZhiyuanZhang/GelRoller
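For contrast with the paper's self-supervised approach, here is a minimal sketch of classic calibrated photometric stereo: it needs exactly the known light directions that GelRoller dispenses with. Synthetic Lambertian data; this is the baseline idea, not the authors' method:

```python
import numpy as np

def photometric_stereo(intensities, lights):
    """Recover surface normal and albedo at one pixel from K images under
    known unit light directions, assuming Lambertian shading I = rho * L @ n.
    intensities: (K,) per-pixel brightness; lights: (K, 3) unit directions."""
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    albedo = np.linalg.norm(g)
    normal = g / albedo if albedo > 0 else g
    return normal, albedo

# Synthetic check: a known normal lit from three calibrated directions.
true_n = np.array([0.0, 0.0, 1.0])
L = np.array([[0.0, 0.0, 1.0],
              [0.6, 0.0, 0.8],
              [0.0, 0.6, 0.8]])
I = 0.5 * L @ true_n  # albedo 0.5
n_hat, rho = photometric_stereo(I, L)
```

The self-supervised network in the paper replaces both the known `L` and the per-pixel least-squares solve, which is what removes the pre-calibration burden.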
|
|
10:30-12:00, Paper WeAT16-AX.2 |
Marker-Embedded Tactile Image Generation Via Generative Adversarial Networks |
|
Kim, Won Dong | Korea Advanced Institute of Science & Technology (KAIST) |
Yang, Sanghoon | KAIST |
Kim, Woojong | KAIST |
Kim, Jeong-Jung | Korea Institute of Machinery & Materials (KIMM) |
Kim, Chang-Hyun | Korea Institute of Machinery and Materials (KIMM) |
Kim, Jung | KAIST |
Keywords: Force and Tactile Sensing, Deep Learning Methods, Simulation and Animation
Abstract: Data-driven methods have been successfully applied to images from vision-based tactile sensors to fulfill various manipulation tasks. Nevertheless, these methods remain inefficient because of the lack of methods for simulating the sensors. Relevant research on simulating vision-based tactile sensors generally focuses on generating images without markers, owing to the challenges in accurately generating the marker motions caused by elastomer deformation. This forgoes the tactile information deducible from markers. In this work, we propose a generative adversarial network (GAN)-based method to generate realistic marker-embedded tactile images for GelSight-like vision-based tactile sensors. We trained the proposed GAN model with an aligned dataset of real tactile images and simulated depth images obtained by deforming the sensor against various objects. This allows the model to translate simulated depth image sequences into RGB tactile images with markers. Furthermore, the generator in the proposed GAN allows the network to integrate the history of deformations from the depth image sequences to generate realistic marker motions during normal and lateral sensor deformations. We evaluated and compared the positional accuracy of the markers and image similarity metrics of the images generated via our method with those from prior methods. The generated tactile images from the proposed model show a 28.3 % decrease in marker positional error and a 93.5 % decrease in the image similarity m
|
|
10:30-12:00, Paper WeAT16-AX.3 |
TEXterity: Tactile Extrinsic DeXterity |
|
Bronars, Antonia | MIT |
Kim, Sangwoon | Massachusetts Institute of Technology |
Patre, Parag | Magna International |
Rodriguez, Alberto | Massachusetts Institute of Technology |
Keywords: Force and Tactile Sensing, In-Hand Manipulation, Perception for Grasping and Manipulation
Abstract: We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans in a receding horizon fashion to control the pose of a grasped object. This approach consists of a discrete pose estimator that tracks the most likely sequence of object poses in a coarsely discretized grid, and a continuous pose estimator-controller to refine the pose estimate and accurately manipulate the pose of the grasped object. Our method is tested on diverse objects and configurations, achieving desired manipulation objectives and outperforming single-shot methods in estimation accuracy. The proposed approach holds potential for tasks requiring precise manipulation and limited intrinsic in-hand dexterity under visual occlusion, laying the foundation for closed-loop behavior in applications such as regrasping, insertion, and tool use. Please see https://sites.google.com/view/texterity for videos of real-world demonstrations.
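Tracking "the most likely sequence of object poses in a coarsely discretized grid" is naturally expressed as Viterbi decoding; a minimal sketch under that assumption (the grid size, scores, and array shapes below are invented, not the paper's implementation):

```python
import numpy as np

def most_likely_sequence(log_trans, log_obs):
    """Viterbi decoding over a coarse grid of candidate poses.
    log_trans[i, j]: log score of moving from cell i to cell j between steps;
    log_obs[t, j]: log likelihood of the measurement at step t in cell j."""
    T, S = log_obs.shape
    score = log_obs[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans          # (prev, cur)
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(S)] + log_obs[t]
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two pose cells, sticky transitions; the last measurement strongly favors cell 1.
log_trans = np.log([[0.9, 0.1], [0.1, 0.9]])
log_obs = np.log([[0.9, 0.1], [0.8, 0.2], [0.05, 0.95]])
path = most_likely_sequence(log_trans, log_obs)
```

In the paper this discrete estimate only seeds the continuous estimator-controller, which refines the pose beyond the grid resolution.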
|
|
10:30-12:00, Paper WeAT16-AX.4 |
Optimization of Flexible Bronchoscopy Shape Sensing Using Fiber Optic Sensors |
|
Liu, Xinran | University of Chinese Academy of Sciences |
Chen, Hao | University of Chinese Academy of Sciences |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Force and Tactile Sensing, Intelligent and Flexible Manufacturing
Abstract: This work presents a novel shape evaluation and optimization approach for shape sensing, specifically targeting the constrained, irregular, and intricate spatial shapes of flexible bronchoscopes (FB) in the human bronchial tree. The proposed evaluation criteria and optimization methods incorporate clinical significance related to bronchial anatomical structures and address the singular points and discontinuities of traditional shape reconstruction models. Three-dimensional experiments were conducted within eight spatially complex configurations printed from a proportional bronchial model. The 3D experimental results demonstrate an average reduction of approximately 34.1% in shape reconstruction errors across all eight airway models compared to the traditional model, validating the effectiveness and feasibility of the approach.
|
|
10:30-12:00, Paper WeAT16-AX.5 |
Tactile-Informed Action Primitives Mitigate Jamming in Dense Clutter |
|
Brouwer, Dane | Stanford University |
Citron, Joshua | Stanford University |
Choi, Hojung | Stanford University |
Lepert, Marion | Stanford University |
Lin, Michael A. | Stanford University |
Bohg, Jeannette | Stanford University |
Cutkosky, Mark | Stanford University |
Keywords: Force and Tactile Sensing, Multi-Contact Whole-Body Motion Planning and Control
Abstract: It is difficult for robots to retrieve objects in densely cluttered lateral access scenes with movable objects as jamming against adjacent objects and walls can inhibit progress. We propose the use of two action primitives---burrowing and excavating---that can fluidize the scene to un-jam obstacles and enable continued progress. Even when these primitives are implemented in an open loop manner at clock-driven intervals, we observe a decrease in the final distance to the target location. Furthermore, we combine the primitives into a closed loop hybrid control strategy using tactile and proprioceptive information to leverage the advantages of both primitives without being overly disruptive. In doing so, we achieve a 10-fold increase in success rate above the baseline control strategy and significantly improve completion times as compared to the primitives alone or a naive combination of them.
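A hypothetical sketch of the closed-loop switching logic such a hybrid strategy implies (the primitive names follow the abstract; the force and progress thresholds, units, and escalation rule are invented):

```python
def select_primitive(tactile_force, progress_rate,
                     jam_force=5.0, stall_rate=0.01):
    """Pick an action primitive from tactile and proprioceptive cues.
    High contact force with no forward progress indicates jamming, so the
    scene is fluidized locally instead of disrupting it at fixed intervals."""
    if tactile_force > jam_force and progress_rate < stall_rate:
        # Jammed: escalate the fluidizing primitive with contact force.
        return "burrow" if tactile_force < 2 * jam_force else "excavate"
    return "advance"
```

Triggering on sensed jamming rather than on a clock is what lets the hybrid controller gain the benefit of both primitives "without being overly disruptive."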
|
|
10:30-12:00, Paper WeAT16-AX.6 |
Crosstalk-Free Impedance-Separating Array Measurement for Iontronic Tactile Sensors |
|
Hou, Funing | Fudan University |
Li, Gang | Hebei University of Technology |
Mu, Chenxing | Hebei University of Technology |
Shi, Mengqi | Hebei University of Technology |
Liu, Jixiao | Hebei University of Technology |
Guo, Shijie | Hebei University of Technology |
Keywords: Force and Tactile Sensing, Physical Human-Robot Interaction
Abstract: Iontronic tactile sensors are promising for measuring spatio-temporal contact information with high performance. However, no suitable measuring method has been available, owing to issues with crosstalk and non-negligible equivalent resistance. Hence, this study presents an impedance-separating method that does not require complex analog components. A general Quadri-Terminal Impedance Network (QTIN) model is introduced to reduce crosstalk, with specific compatibility with the impedance-separating method. The precise measurement ranges are characterized, showing non-rectangular shapes suited to the response of iontronic tactile sensors. A simple denoising method is provided that noticeably reduces initial array noise. This work could benefit various scenarios, such as human-robot interaction and physiological information monitoring.
|
|
10:30-12:00, Paper WeAT16-AX.7 |
Visual-Tactile Learning of Garment Unfolding for Robot-Assisted Dressing |
|
Zhang, Fan | Honda Research Institute EU |
Demiris, Yiannis | Imperial College London |
Keywords: Force and Tactile Sensing, Physical Human-Robot Interaction, Manipulation Planning
Abstract: Assistive robots have the potential to support disabled and elderly people in daily dressing activities. An intermediate stage of dressing is to manipulate the garment from a crumpled initial state to an unfolded configuration that facilitates robust dressing. Applying quasi-static grasping actions with vision feedback for garment unfolding usually suffers from occluded grasping points. In this work, we propose a dynamic manipulation strategy: tracing the garment edge until the hidden corner is revealed. We introduce a model-based approach, in which a deep visual-tactile predictive model iteratively learns to perform servoing from raw sensor data. The predictive model is formalized as a Conditional Variational Autoencoder with contrastive optimization, which jointly learns underlying visual-tactile latent representations, a latent garment dynamics model, and future predictions of garment states. Two cost functions are explored: a visual cost defined by garment corner positions, which drives the gripper towards the corner, and a tactile cost defined by garment edge poses, which prevents the garment from falling from the gripper. The experimental results demonstrate the improvement of our contrastive visual-tactile model predictive control over single sensing modalities and baseline model-learning techniques. The proposed method enables a robot to unfold back-opening hospital gowns and perform upper-body dressing.
|
|
10:30-12:00, Paper WeAT16-AX.8 |
Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training |
|
Dave, Vedant | Montanuniversität Leoben |
Lygerakis, Fotios | University of Leoben |
Rueckert, Elmar | Montanuniversitaet Leoben |
Keywords: Force and Tactile Sensing, Representation Learning
Abstract: The rapidly evolving field of robotics necessitates methods that can facilitate the fusion of multiple modalities. Specifically, when it comes to interacting with tangible objects, effectively combining visual and tactile sensory data is key to understanding and navigating the complex dynamics of the physical world, enabling a more nuanced and adaptable response to changing environments. Nevertheless, much of the earlier work in merging these two sensory modalities has relied on supervised methods utilizing datasets labeled by humans. This paper introduces MViTac, a novel methodology that leverages contrastive learning to integrate vision and touch sensations in a self-supervised fashion. By drawing on both sensory inputs, MViTac leverages intra- and inter-modality losses for learning representations, resulting in enhanced material property classification and more adept grasping prediction. Through a series of experiments, we showcase the effectiveness of our method and its superiority over existing state-of-the-art self-supervised and supervised techniques. In evaluating our methodology, we focus on two distinct tasks: material classification and grasping success prediction. Our results indicate that MViTac facilitates the development of improved modality encoders, yielding more robust representations as evidenced by linear probing assessments.
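The inter-modality contrastive objective is in the family of InfoNCE losses; a minimal NumPy sketch under that assumption (embedding sizes, the temperature, and the exact loss form are illustrative; MViTac's actual encoders and loss weighting may differ):

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Batch InfoNCE: matched rows are positives, all other rows negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Inter-modality case: the vision embedding of a sample should score higher
# against its own tactile embedding than against mismatched ones.
rng = np.random.default_rng(0)
vision = rng.normal(size=(8, 16))
tactile = vision + 0.05 * rng.normal(size=(8, 16))   # well-aligned modalities
loss_aligned = info_nce(vision, tactile)
loss_random = info_nce(vision, rng.normal(size=(8, 16)))
```

No human labels appear anywhere: the pairing of a sample's own vision and tactile views is the only supervision signal, which is what makes the scheme self-supervised.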
|
|
10:30-12:00, Paper WeAT16-AX.9 |
A Hierarchical Framework for Robot Safety Using Whole-Body Tactile Sensors |
|
Jiang, Shuo | Northeastern University |
Wong, Lawson L.S. | Northeastern University |
Keywords: Force and Tactile Sensing, Robot Safety, Multi-Contact Whole-Body Motion Planning and Control
Abstract: Using tactile signals is a natural way to perceive potential dangers and safeguard robots. One possible method is to use full-body tactile sensors on the robot and perform safety maneuvers when dangerous stimuli are detected. In this work, we propose a method based on full-body tactile sensors that operates at three different levels of granularity to ensure that the robot interacts with the environment safely. The results show that our system dramatically reduces the overall chance of collision compared with several baselines, and intelligently handles ongoing collisions. Our proposed framework generalizes to a wide variety of robots, enabling them to predict and avoid dangerous collisions and reactively handle accidental tactile stimuli.
|
|
WeAT17-AX Oral Session, AX-205 |
Legged Robots IV |
|
|
Chair: Zhao, Ye | Georgia Institute of Technology |
Co-Chair: Kober, Jens | TU Delft |
|
10:30-12:00, Paper WeAT17-AX.1 |
Robust Jumping with an Articulated Soft Quadruped Via Trajectory Optimization and Iterative Learning |
|
Ding, Jiatao | Delft University of Technology |
van Löben Sels, Mees Alexander | TU Delft |
Angelini, Franco | University of Pisa |
Kober, Jens | TU Delft |
Della Santina, Cosimo | TU Delft |
Keywords: Legged Robots, Optimization and Optimal Control, Modeling, Control, and Learning for Soft Robots
Abstract: Quadrupeds deployed in real-world scenarios need to be robust to unmodelled dynamic effects. In this work, we aim to increase the robustness of quadrupedal periodic forward jumping (i.e., pronking) by unifying cutting-edge model-based trajectory optimization and iterative learning control. Using a reduced-order soft anchor model, the optimization-based motion planner generates the periodic reference trajectory. The controller then iteratively learns the feedforward control signal in a repetition process, without requiring an accurate full-body model. When enhanced by a continuous learning mechanism, the proposed controller can learn the control input without resetting the system at the end of each iteration. Simulations and experiments on a quadruped with parallel springs demonstrate that continuous jumping can be learned in a matter of minutes, with high robustness against various types of terrain.
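The feedforward iterative learning step can be sketched in its simplest form: each repetition adds a scaled copy of the previous trial's tracking error to the feedforward signal, so no accurate plant model is needed (a textbook ILC update on a toy plant, not the authors' controller; all numbers are illustrative):

```python
import numpy as np

def ilc_update(u, error, gain=0.5):
    """One iterative-learning pass: correct the feedforward signal with a
    scaled copy of the previous trial's tracking error."""
    return u + gain * error

# Toy repetitive task: plant y = u + d with an unknown constant disturbance d.
# Repetition, not a model of d, drives the error down.
d = -0.3
ref = np.ones(10)       # desired output over one period
u = np.zeros(10)        # feedforward signal, refined across trials
for trial in range(20):
    y = u + d           # execute one repetition
    u = ilc_update(u, ref - y)
residual = float(np.max(np.abs(ref - (u + d))))
```

Here the error contracts by the factor (1 - gain) per trial, the same learn-by-repetition mechanism that lets the soft quadruped converge to a robust jump "in a matter of minutes."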
|
|
10:30-12:00, Paper WeAT17-AX.2 |
Unlocking Versatile Locomotion: A Novel Quadrupedal Robot with 4-DoFs Legs for Roller Skating |
|
Chen, Jiawei | Beihang University |
Qin, Ripeng | Inner Mongolia University of Science and Technology |
Huang, Longfei | Beihang University,Beijing Institute of Spacecraft System |
He, Zongbo | Beijing Institute of Sapcecraft System Engineering |
Xu, Kun | Beijing University |
Ding, Xilun | Beijing Univerisity of Aeronautics and Astronautics |
Keywords: Legged Robots, Mechanism Design, Motion Control
Abstract: Roller skating with passive wheels on a quadrupedal robot is more efficient than traditional walking. However, the typical mammalian quadruped robot with 3-DoFs legs can only perform one dynamic roller skating gait and has difficulty achieving turning motion. To address this limitation, we designed a novel quadrupedal robot with each leg having 4-DoFs to enable various roller skating locomotion including Swizzling, Stroking, and trot-like gaits while easily achieving turning motions. We considered the geometrical characteristics of the passive wheel and used the Levenberg-Marquardt method in robot kinematics to improve precision for both roller skating kinematics and contact point position for the dynamics controller. The position of the robot foot and the yaw angle of the passive wheel are decoupled for motion planning of all proposed gaits. Our proposed kinematics with wheeled geometry was verified through experiments to have higher precision, while the feasibility of all proposed roller-skating gaits was confirmed during straight motion and turning motion with a small radius on our prototype robot. Finally, we discussed the mobility efficiency of different roller skating gaits which were found to be more efficient than walking.
|
|
10:30-12:00, Paper WeAT17-AX.3 |
Efficient Terrain Map Using Planar Regions for Footstep Planning on Humanoid Robots |
|
Mishra, Bhavyansh | Institute of Human and Machine Cognition, University of West Flo |
Calvert, Duncan | IHMC, UWF |
Bertrand, Sylvain | Institute for Human and Machine Cognition |
Pratt, Jerry | Inst. for Human and Machine Cognition |
Sevil, Hakki Erhan | University of West Florida |
Griffin, Robert J. | Institute for Human and Machine Cognition (IHMC) |
Keywords: Humanoid and Bipedal Locomotion, Legged Robots, Mapping
Abstract: Humanoid robots possess the ability to perform complex tasks in challenging environments. However, they require a model of the surroundings in a representation sufficient for downstream tasks such as footstep planning. The maps generated by existing mapping algorithms are either sparse, insufficient for footstep planning, memory-intensive, or too slow for dynamic humanoid behaviors. In this work, we develop a mapping algorithm that combines planar region measurements with kinematic-inertial state estimates to build a dense but efficient map of bounded planar surfaces. We present novel algorithms for plane feature matching, tracking, and registration for mapping within a factor graph framework. The generated map is not only memory-efficient, but also offers higher reliability and speed in bipedal footstep planning than was previously possible. The complete algorithm is also demonstrated using a full-scale humanoid robot, Nadia, walking over both flat ground and rough terrain utilizing the generated terrain map.
|
|
10:30-12:00, Paper WeAT17-AX.4 |
Convergent iLQR for Safe Trajectory Planning and Control of Legged Robots |
|
Zhu, James | Carnegie Mellon University |
Payne, J. Joe | Carnegie Mellon University |
Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Optimization and Optimal Control, Robust/Adaptive Control
Abstract: In order to perform highly dynamic and agile maneuvers, legged robots typically spend time in underactuated domains (e.g. with feet off the ground) where the system has limited command of its acceleration and a constrained amount of time before transitioning to a new domain (e.g. foot touchdown). Meanwhile, these transitions can instantaneously change the system’s state, possibly causing perturbations to be mapped arbitrarily far away from the target trajectory. These properties make it difficult for local feedback controllers to effectively recover from disturbances as the system evolves through underactuated domains and hybrid impact events. To address this, we utilize the fundamental solution matrix that characterizes the evolution of perturbations through a hybrid trajectory and its 2-norm, which represents the worst-case growth of perturbations. In this paper, the worst-case perturbation analysis is used to explicitly reason about the tracking performance of a hybrid trajectory and is incorporated in an iLQR framework to optimize a trajectory while taking into account the closed-loop convergence of the trajectory under an LQR tracking controller. The generated convergent trajectories recover more effectively from perturbations, are more robust to large disturbances, and use less feedback control effort than trajectories generated with traditional methods.
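The worst-case growth measure the abstract describes can be sketched for a linear time-varying closed loop: the 2-norm of the product of per-step matrices bounds how much an initial perturbation can be amplified over the segment (the matrices and feedback gain below are illustrative, not from the paper):

```python
import numpy as np

def worst_case_growth(A_list):
    """2-norm of the fundamental solution matrix Phi = A_{N-1} @ ... @ A_0,
    i.e. the worst-case factor by which an initial perturbation can grow
    over the trajectory segment."""
    Phi = np.eye(A_list[0].shape[0])
    for A in A_list:
        Phi = A @ Phi
    return np.linalg.norm(Phi, 2)

# Toy comparison on a 2-state linearized step over 20 steps:
A = np.array([[1.1, 0.2], [0.0, 1.05]])    # mildly unstable open-loop step
BK = np.array([[0.15, 0.2], [0.0, 0.1]])   # effect of an LQR-like feedback gain
growth_open = worst_case_growth([A] * 20)
growth_closed = worst_case_growth([A - BK] * 20)
```

A growth factor below 1 means every perturbation contracts, which is the closed-loop convergence property the iLQR cost rewards when shaping the trajectory.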
|
|
10:30-12:00, Paper WeAT17-AX.5 |
Optimization Based Dynamic Skateboarding of Quadrupedal Robot |
|
Xu, Zhe | Beijing Institute of Technology |
Al-Khulaqui, Mohamed | Xiaomi Inc |
Ma, Hanxin | BeihangUniversity |
Wang, Jiajun | UBTECH Robotics |
Xin, Quanbin | Beijing Xiaomi Robot Technology Co., Ltd |
You, Yangwei | Institute for Infocomm Research |
Zhou, Mingliang | Beijing Xiaomi Mobile Software Co., Ltd |
Xiang, Diyun | XIAOMI |
Zhang, Shiwu | University of Science and Technology of China |
Keywords: Legged Robots, Optimization and Optimal Control, Whole-Body Motion Planning and Control
Abstract: Robot skateboarding is an unexplored and challenging task for legged robots. Accurately modeling the dynamics of dual floating bases and developing effective planning and control methods present significant complexities in accomplishing skateboarding behavior. This paper focuses on enabling the quadrupedal platform CyberDog2 to achieve dynamic balancing and acceleration on a skateboard. An optimization-based control pipeline is developed through careful derivation of the system's equations of motion, considering both the robot and skateboard dynamics. Accounting for the system's physical constraints, an offline trajectory optimization method is employed to generate various acceleration trajectories, creating a motion library for the system. An online linear model predictive control with whole-body control framework is used to track the generated trajectories and stabilize the system in real time. To validate its effectiveness, we conducted experiments in various scenarios. The quadrupedal robot successfully accelerated from a static state to various velocities and demonstrated the ability to balance and steer the skateboard.
|
|
10:30-12:00, Paper WeAT17-AX.6 |
Hierarchical Experience-Informed Navigation for Multi-Modal Quadrupedal Rebar Grid Traversal |
|
Asselmeier, Maxwell | Georgia Institute of Technology |
Ivanova, Evgeniia | SkyMul |
Zhou, Ziyi | Georgia Institute of Technology |
Vela, Patricio | Georgia Institute of Technology |
Zhao, Ye | Georgia Institute of Technology |
Keywords: Legged Robots, Robotics and Automation in Construction, Constrained Motion Planning
Abstract: This study focuses on a layered, experience-based, multi-modal contact planning framework for agile quadrupedal locomotion over a constrained rebar environment. To this end, our hierarchical planner incorporates locomotion-specific modules into the high-level contact sequence planner and solves kinodynamically-aware trajectory optimization as the low-level motion planner. Through quantitative analysis of the experience accumulation process and experimental validation of the kinodynamic feasibility of the generated locomotion trajectories, we demonstrate that the experience planning heuristic offers an effective way of providing candidate footholds for a legged contact planner. Additionally, we introduce a guiding torso path heuristic at the global planning level to enhance the navigation success rate in the presence of environmental obstacles. Our results indicate that the torso-path guided experience accumulation requires significantly fewer offline trials to successfully reach the goal compared to regular experience accumulation. Finally, our planning framework is validated in both dynamics simulations and real hardware implementations on a quadrupedal robot provided by Skymul Inc.
|
|
10:30-12:00, Paper WeAT17-AX.7 |
Learning-Based Propulsion Control for Amphibious Quadruped Robots with Dynamic Adaptation to Changing Environment |
|
Yao, Qingfeng | Shenyang Institute of Automation, Chinese Academy of Sciences |
Meng, Linghan | Shenyang Institute of Automation |
Zhang, Qifeng | Shenyang Institute of Automation, CAS |
Zhao, Jing | Shenyang Institute of Automation (SIA), Chinese Academy of Scien |
Pajarinen, Joni | Aalto University |
Wang, Xiaohui | Heriot-Watt University |
Li, Zhibin (Alex) | University College London |
Wang, Cong | Delft University of Technology (TU Delft) |
Keywords: Legged Robots, Robust/Adaptive Control, Reinforcement Learning
Abstract: This paper proposes a learning-based adaptive propulsion control (APC) method for a quadruped robot integrated with thrusters in amphibious environments, allowing it to move efficiently in water while maintaining its ground locomotion capabilities. We designed a specific reinforcement learning method to train a neural network to perform vector propulsion control. Our approach coordinates the legs and propeller, enabling the robot to achieve speed and trajectory tracking tasks in the presence of actuator failures and unknown disturbances. Our simulated validations of the robot in water demonstrate the effectiveness of the trained neural network in predicting disturbances and actuator failures from historical information, showing that the framework adapts to changing environments and is suitable for dynamically changing situations. The proposed approach lends itself to hardware augmentation of quadruped robots, opening avenues in the field of amphibious robotics and expanding the use of quadruped robots in various applications.
|
|
10:30-12:00, Paper WeAT17-AX.8 |
Reinforcement Learning for Blind Stair Climbing with Legged and Wheeled-Legged Robots |
|
Chamorro, Simon | Université De Sherbrooke |
Klemm, Victor | ETH Zurich |
de la Iglesia Valls, Miguel | ETH Zürich |
Pal, Chris | Polytechnique Montreal |
Siegwart, Roland | ETH Zurich |
Keywords: Machine Learning for Robot Control, Legged Robots, Humanoid and Bipedal Locomotion
Abstract: In recent years, legged and wheeled-legged robots have gained prominence for tasks in environments predominantly created for humans across various domains. One significant challenge faced by many of these robots is their limited capability to navigate stairs, which hampers their functionality in multi-story environments. This study proposes a method aimed at addressing this limitation, employing reinforcement learning to develop a versatile controller applicable to a wide range of robots. In contrast to the conventional velocity-based controllers, our approach builds upon a position-based formulation of the RL task, which we show to be vital for stair climbing. Furthermore, the methodology leverages an asymmetric actor-critic structure, enabling the utilization of privileged information from simulated environments during training while eliminating the reliance on exteroceptive sensors during real-world deployment. Another key feature of the proposed approach is the incorporation of a boolean observation within the controller, enabling the activation or deactivation of a stair-climbing mode. We present our results on different quadrupeds and bipedal robots in simulation and showcase how our method allows the balancing robot Ascento to climb 15cm stairs in the real world, a task that was previously impossible for this robot.
|
|
10:30-12:00, Paper WeAT17-AX.9 |
Modeling and Analysis of Combined Rimless Wheel with Tensegrity Spine |
|
Xiang, Yuxuan | Japan Advanced Institute of Science and Technology |
Zheng, Yanqiu | Ritsumeikan University |
Asano, Fumihiko | Japan Advanced Institute of Science and Technology |
Keywords: Passive Walking, Legged Robots
Abstract: In the natural world, benefiting from the advantages of the spine, quadrupeds exhibit extraordinary flexibility that allows them to move efficiently on variable terrain. Previous research has indicated that legged robots which efficiently utilize their spine can achieve rapid and stable locomotion. However, within the field of legged robot dynamics, how to design the spine and how it positively influences locomotion remain unclear, which matters for quadruped robots seeking efficient and stable walking. In this study, we propose a model combining a tensegrity spine with a rimless wheel to represent quadrupeds, and use passive dynamic walking, a well-established method for observing inherent characteristics, to exhibit the locomotion characteristics of the proposed model. Through numerical simulation, we observe how locomotion performance changes with the configuration of the spine's shape, and identify spine-design directions that have a positive impact on walking. These findings contribute to the design of spine structures in quadruped robots.
|
|
WeAT18-AX Oral Session, AX-206 |
Force Control and Sensing |
|
|
Chair: Tsuji, Toshiaki | Saitama University |
Co-Chair: Huang, Guoquan | University of Delaware |
|
10:30-12:00, Paper WeAT18-AX.1 |
Robot-Camera Calibration in Tightly Constrained Environment Using Interactive Perception |
|
Zhong, Fangxun | The Chinese University of Hong Kong |
Li, Bin | The Chinese University of Hong Kong |
Chen, Wei | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Calibration and Identification, Sensor-based Control, Motion Control of Manipulators, Surgical Robotics: Laparoscopy
Abstract: Manipulation in tight environments is challenging but increasingly common in vision-guided robotic applications. The significantly reduced amount of available feedback (limited visual cues, field of view, robot motion space, etc.) hinders solving the hand-eye relationship accurately. In this paper, we propose a new generic approach for online camera-robot calibration that can cope with the minimal feedback input available in tight environments: an arbitrarily restricted motion space and a single feature point with unknown position for the robot end-effector. We introduce interactive perception to generate prescribed but tunable robot motions that reveal high-dimensional sensory feedback not obtainable from static images. We then define the interactive feature plane (IFP), whose spatial property corresponds to the robot-actuating trajectories. A depth-free adaptive controller is proposed based on image feedback, where the converged orientation of the IFP directly harvests the data for solving the hand-eye relationship. Our algorithm requires neither external calibration sensors/objects nor a large-scale data acquisition process. Simulations demonstrate the va
|
|
10:30-12:00, Paper WeAT18-AX.2 |
Degenerate Motions of Multisensor Fusion-Based Navigation |
|
Lee, Woosik | University of Delaware |
Chen, Chuchu | University of Delaware |
Huang, Guoquan | University of Delaware |
Keywords: Calibration and Identification, SLAM, Sensor Fusion
Abstract: System observability analysis is of practical importance, for example, due to its ability to identify the unobservable directions of the estimated state, which can influence estimation accuracy and help develop consistent and robust estimators. Recent studies have focused on analyzing the observability of the state of various multisensor systems, with particular interest in unobservable directions induced by degenerate motions. However, those studies mostly stay within a specific sensor domain and do not extend the understanding to other heterogeneous systems. To this end, in this work, we provide a degenerate motion analysis for general local and global sensor-paired systems, offering insights applicable to a wide range of existing navigation systems. Our analysis identifies 9 degenerate motions, including 5 already identified in the literature and 4 new motions, covering both synchronous and asynchronous sensor-pair cases. Comprehensive numerical studies are conducted to verify the identified motions, show the effect of degenerate motion on state estimation, and demonstrate the generalizability of our analysis on various multisensor systems.
|
|
10:30-12:00, Paper WeAT18-AX.3 |
Interaction Control for Tool Manipulation on Deformable Objects Using Tactile Feedback |
|
Zhang, Hanwen | Institute of Optics and Electronics, CAS |
Lu, Zeyu | National University of Singapore |
Liang, Wenyu | Institute for Infocomm Research, A*STAR |
Yu, Haoyong | National University of Singapore |
Mao, Yao | Institute of Optics and Electronics, CAS |
Wu, Yan | A*STAR Institute for Infocomm Research |
Keywords: Force and Tactile Sensing, Force Control, Contact Modeling
Abstract: The sense of touch enables humans to perform many delicate tasks on deformable objects and/or in a vision-denied environment. For a robot to achieve similar desirable interactions, such as administering a swab test, tactile information sensed beyond the tool-in-hand is correspondingly crucial for contact state estimation and contact tracking control. In this paper, we propose a tactile-guided planning and control framework using GTac, a hetero-Geneous Tactile sensor tailored for interaction with deformable objects beyond the immediate contact area. The biomimetic GTac in use is an improved version optimised for readout linearity which provides reliability in contact state estimation and force tracking. While a tactile-based classification and manipulation process is designed to estimate and align to the contact angle between the tool and the environment, a Koopman operator-based optimal control scheme is proposed to address challenges in the nonlinear control arising from the interaction with the deformable object. Several experiments are conducted to verify the effectiveness of the proposed framework. The experimental results demonstrate that the proposed framework can achieve accurate contact angle estimation as well as excellent tracking performance and strong robustness in force control.
|
|
10:30-12:00, Paper WeAT18-AX.4 |
Development of an Easy-To-Cut Six-Axis Force Sensor |
|
Kawahara, Takamasa | Saitama University |
Tsuji, Toshiaki | Saitama University |
Keywords: Force and Tactile Sensing, Force Control, Robotics and Automation in Agriculture and Forestry
Abstract: Although the potential demand for force sensors in both robotics and automation is high, the complexity of their structure increases the number of manufacturing processes. As a result, the rising cost of sensors has hindered the practical application of force measurement and force control. In this study, a flexure element comprising a structure that is easier to cut and process than conventional ones, as well as holes through the side of a cuboid, is proposed to simplify the manufacturing of force sensors. To ensure the safety of the proposed sensor design, an approximate equation is derived to predict the maximum von Mises stress on the flexure element using design parameters. Subsequently, we clarified a way to attach the strain gauge in a position that improves sensitivity. The results of the actual prototype sensor based on the proposed method show that the maximum nonlinearity error and decoupling error in the other axes are 0.442 %R.O. and 0.660 %R.O., respectively, and the performance is comparable to that of conventional force sensors. Because the prototype has a difference in resolution between the axes, a method for improving the resolution isotropy without changing the difficulty of machining is also proposed. In addition, the validity of the proposed method is demonstrated using experiments. Consequently, a force sensor with the same level of performance was developed using the proposed method, and the cutting process was made easier compared to that of convention
|
|
10:30-12:00, Paper WeAT18-AX.5 |
An Ultra-Fast Intrinsic Contact Sensing Method for Medical Instruments with Arbitrary Shape |
|
Cao, Guanglin | Institute of Automation, Chinese Academy of Sciences |
Chen, Mingcong | City University of Hong Kong |
Hu, Jian | Institute of Automation, Chinese Academy of Sciences |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Force Control, Force and Tactile Sensing, Medical Robots and Systems
Abstract: Intraoperative contact sensing has the potential to reduce the risk of surgical errors and enhance manipulation capabilities for medical robots, particularly in contact force control. Current intrinsic force sensing (IFS) methods are limited in application to medical instruments with arbitrary shape, due to high computational time and reliance on precise surface equations. This study presents an ultra-fast IFS method that uses multiple planes to establish surface geometry descriptions. The method can reduce high-order contact mechanical models that need to be solved iteratively to a set of linear equations, and calculate contact location analytically. In addition, a robot motion control approach based on the contact sensing method is proposed to maintain stable contact force and regulate the probe's orientation for robotic ultrasound systems (RUSS). Experimental results show that the contact sensing method is robust to friction and can achieve a mean (±SD) displacement error of 1.04±0.43 mm in contact location with computational time less than 1 ms. The system has been evaluated on a phantom with sinusoidal motion. To the best of our knowledge, this is the first study to validate adaptiveness of RUSS under dynamic conditions. The results demonstrated that the system exhibits comparable manipulation capabilities to human operators with only force sensing, indicating a high level of adaptiveness.
|
|
10:30-12:00, Paper WeAT18-AX.6 |
Proprioceptive-Based Whole-Body Disturbance Rejection Control for Dynamic Motions in Legged Robots |
|
Zhu, Zhengguo | Shandong University |
Zhang, Guoteng | Shandong University |
Sun, Zhongkai | Shandong University |
Chen, Teng | Shandong University |
Rong, Xuewen | Shandong University |
Xie, Anhuan | Zhejiang University |
Li, Yibin | Shandong University |
Keywords: Force Control, Motion Control, Robust/Adaptive Control
Abstract: This paper presents a control framework for legged robots that enables self-perception and resistance to external disturbances. First, a novel proprioceptive-based disturbance estimator is proposed. Compared with other disturbance estimators, this estimator possesses notable advantages in terms of filtering foot-ground interaction noise and suppressing the accumulation of estimation errors. Additionally, our estimator is a fully proprioceptive-based estimator, eliminating the need for any exteroceptive devices or observers. Second, we present a hierarchical optimized whole-body controller (WBC), which takes into account the full body dynamics, the actuation limits, the external disturbances, and the interactive constraints. Finally, extensive experimental trials conducted on the point-foot biped robot BRAVER validate the capabilities of the proposed estimator and controller under various disturbance conditions.
|
|
10:30-12:00, Paper WeAT18-AX.7 |
Contact Force Estimation of Robot Manipulators with Imperfect Dynamic Model: On Gaussian Process Adaptive Disturbance Kalman Filter (I) |
|
Wei, Yanran | Beihang University |
Lyu, Shangke | Nanyang Technological University |
Li, Wenshuo | Beihang University |
Yu, Xiang | Beihang University |
Guo, Lei | Beihang University |
Keywords: Industrial Robots, Force and Tactile Sensing, Calibration and Identification
Abstract: This paper is concerned with the contact force estimation problem of robot manipulators based on imperfect dynamic models of the manipulator and the contact force. To handle the imperfect dynamic information of the manipulator, a hybrid model, consisting of the nominal model and the residual dynamics, is established for the manipulator, and the Gaussian process regression (GPR) technique is employed to learn the mean and covariance of the residual dynamics. On this basis, a virtual measurement equation is established for contact force estimation and a Gaussian process adaptive disturbance Kalman filter (GPADKF) is developed, where the variational Bayes technique is employed to achieve online identification of the noise statistics in the force dynamics. The GPADKF is capable of decoupling the contact force from residual dynamics and system noises, thereby reducing the dependence on accurate dynamic models of the manipulator and the contact force. Simulation and experimental results demonstrate that the proposed scheme outperforms state-of-the-art methods.
|
|
WeAT19-NT Oral Session, NT-G301 |
Medical Robots IV |
|
|
Chair: Mylonas, George | Imperial College London |
Co-Chair: Navab, Nassir | TU Munich |
|
10:30-12:00, Paper WeAT19-NT.1 |
On the Disentanglement of Tube Inequalities in Concentric Tube Continuum Robots |
|
Grassmann, Reinhard M. | University of Toronto |
Senyk, Anastasiia | Ukrainian Catholic University |
Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Medical Robots and Systems, Modeling, Control, and Learning for Soft Robots, Kinematics
Abstract: Concentric tube continuum robots utilize nested tubes, which are subject to a set of inequalities. Current approaches to account for these inequalities rely on branching methods such as if-else statements. Such branching can introduce discontinuities, may result in a complicated decision tree, has a high wall-clock time, and cannot be vectorized. This affects the behavior and results of downstream methods in control, learning, workspace estimation, and path planning, among others. In this paper, we investigate a mapping to mitigate branching methods. We derive a lower triangular transformation matrix to disentangle the inequalities and prove its unique existence. It transforms the interdependent inequalities into independent box constraints. Further investigations are made for sampling, control, and workspace estimation. Approaches utilizing the proposed mapping are at least 14 times faster (up to 176 times faster), always generate valid joint configurations, are more interpretable, and are easier to extend.
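The disentangling idea can be illustrated with a minimal sketch: an interdependent ordering constraint b_1 <= b_2 <= ... <= b_n becomes a set of independent non-negativity box constraints on increments under an all-ones lower triangular map. The cumulative-sum matrix below is an illustrative stand-in, not necessarily the paper's exact transformation.

```python
import numpy as np

def ordered_from_box(d, b0=0.0):
    """Map non-negative increments d (independent box constraints, each d_i >= 0)
    to a monotonically non-decreasing sequence b via a lower-triangular matrix."""
    L = np.tril(np.ones((len(d), len(d))))  # all-ones lower-triangular map
    return b0 + L @ d                       # b_k = b0 + d_1 + ... + d_k

# each d_i can be sampled independently from a box; ordering holds by construction
d = np.array([0.2, 0.0, 0.5])
b = ordered_from_box(d)
```

Because the box constraints are independent, sampling or optimizing over them needs no if-else logic and vectorizes trivially, which mirrors the speedups the abstract reports.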
|
|
10:30-12:00, Paper WeAT19-NT.2 |
3D Navigation of a Magnetic Swimmer Using a 2D Ultrasonography Probe Manipulated by a Robotic Arm for Position Feedback |
|
Gorroochurn, Premal | Columbia University |
Hong, Charles | Georgia Institute of Technology |
Klebuc, Carter | University of Houston |
Lu, Yitong | University of Houston |
Phan, Ngoc Tu Khue | Kerr High School |
Garcia Gonzalez, Javier | University of Houston |
Becker, Aaron | University of Houston |
Leclerc, Julien | University of Houston |
Keywords: Medical Robots and Systems, Motion Control, Control Architectures and Programming
Abstract: Millimeter-scale magnetic rotating swimmers have multiple potential medical applications. They could, for example, navigate inside the bloodstream of a patient toward an occlusion and remove it. Magnetic rotating swimmers have internal magnets and propeller fins with a helical shape. A rotating magnetic field applies torque on the swimmer and makes it rotate. The shape of the swimmer, combined with the rotational movement, generates a propulsive force. Visual feedback is suitable for in-vitro closed-loop control. However, in-vivo procedures will require different feedback modalities due to the opacity of the human body. In this paper, we provide new methods and tools that enable the 3D control of a magnetic swimmer using a 2D ultrasonography device attached to a robotic arm to sense the swimmer's position. We also provide an algorithm that computes the placement of the robotic arm and a controller that keeps the swimmer within the ultrasound imaging slice. The position measurement and closed-loop control were tested experimentally.
|
|
10:30-12:00, Paper WeAT19-NT.3 |
An Intelligent Robotic Endoscope Control System Based on Fusing Natural Language Processing and Vision Models |
|
Dong, Beili | Imperial College London |
Chen, Junhong | Imperial College London |
Wang, Zeyu | Imperial College London |
Deng, Kaizhong | Imperial College London |
Li, Yiping | Imperial College London |
Lo, Benny Ping Lai | Imperial College London |
Mylonas, George | Imperial College London |
Keywords: Medical Robots and Systems, Motion Control
Abstract: In recent years, the area of Robot-Assisted Minimally Invasive Surgery (RAMIS) has stood on the verge of a new wave of innovations. However, autonomy in RAMIS is still at a primitive stage. Therefore, most surgeries still require manual control of the endoscope and the robotic instruments, forcing surgeons to switch attention between performing surgical procedures and moving the endoscope camera. Automation may reduce the complexity of surgical operations and consequently reduce the cognitive load on the surgeon while speeding up the surgical process. In this paper, a hybrid robotic endoscope control system is proposed, based on a fusion of a natural language processing (NLP) model and a modified YOLO-V8 vision model. The proposed system can analyze the current surgical workflow and generate logs summarizing the procedure for teaching and for providing feedback to junior surgeons. A user study of the system indicated a significant reduction in the number of clutching actions and in mean task time, effectively enhancing surgical training.
|
|
10:30-12:00, Paper WeAT19-NT.4 |
AiAReSeg: Catheter Detection and Segmentation in Interventional Ultrasound Using Transformers |
|
Ranne, Alex | Imperial College London |
Velikova, Yordanka | TU Munich |
Navab, Nassir | TU Munich |
Rodriguez y Baena, Ferdinando | Imperial College, London, UK |
Keywords: Medical Robots and Systems, Object Detection, Segmentation and Categorization, Simulation and Animation
Abstract: To date, endovascular surgeries are performed using the gold standard of fluoroscopy, which uses ionising radiation to visualise catheters and vasculature. Prolonged fluoroscopic exposure is harmful for the patient and the clinician, and may lead to severe post-operative sequelae such as the development of cancer. Meanwhile, the use of interventional ultrasound has gained popularity due to its well-known benefits: a small spatial footprint, fast data acquisition, and higher tissue contrast. However, ultrasound images are hard to interpret, and it is difficult to localise vessels, catheters, and guidewires within them. This work proposes a solution using an adaptation of a state-of-the-art machine learning architecture (Transformers) to detect and segment catheters in axial interventional ultrasound image sequences. The network architecture was inspired by the Attention in Attention mechanism and temporal tracking networks, and introduces a novel 3D segmentation head that performs 3D deconvolution across time. To facilitate training of such deep learning networks, we introduce a new data synthesis pipeline that uses physics-based catheter insertion simulations, along with a convolutional ray-casting ultrasound simulator, to produce synthetic ultrasound images of endovascular interventions. The proposed method was validated on a hold-out validation dataset, demonstrating robustness to ultrasound noise and a wide range of scanning angles. It was also tested on data collected from silicone-based aorta phantoms, demonstrating its potential for sim-to-real translation. This work represents a significant step towards safer and more efficient endovascular surgery using interventional ultrasound.
|
|
10:30-12:00, Paper WeAT19-NT.5 |
Hybrid Robot for Percutaneous Needle Intervention Procedures: Mechanism Design and Experiment Verification |
|
Zhang, Hanyi | Imperial College London |
Yao, Guocai | Tsinghua University |
Zhang, Feifan | University College London |
Lin, Fanchuan | Beihang University |
Sun, Fuchun | Tsinghua University |
Keywords: Medical Robots and Systems, Parallel Robots, Mechanism Design
Abstract: This paper presents a 6-DOF hybrid robot for percutaneous needle intervention procedures. The new robot combines the advantages of both serial and parallel robots, featuring compactness, high accuracy, and a small footprint, while overcoming the high cost of serial robots and the small workspace and singularity issues of parallel robots. In addition, by analyzing the workspace of the robot, an equation relating the structural parameters to the workspace is derived, so that the robot's parameters can be adjusted to satisfy different working scenarios. Experiments show that the accuracy of the robot is related to position, distance, and insertion angle. The results show that performance is better when working near the center of the workspace and away from the servos, and that the average error of the robot is 1.39 mm. A phantom experiment of lumbar puncture validates its feasibility.
|
|
10:30-12:00, Paper WeAT19-NT.6 |
Envibroscope: Real-Time Monitoring and Prediction of Environmental Motion for Enhancing Safety in Robot-Assisted Microsurgery |
|
Alikhani, Alireza | Augen Klinik Und Poliklinik, Klinikum Rechts Der Isar Der Techn |
Inagaki, Satoshi | NSK.Ltd |
Dehghani, Shervin | TUM |
Maier, Mathias | Klinikum Rechts Der Isar Der TU München |
Navab, Nassir | TU Munich |
Nasseri, M. Ali | Technische Universitaet Muenchen |
Keywords: Medical Robots and Systems, Robot Safety, Machine Learning for Robot Control
Abstract: Several robotic systems have emerged in the recent past to enhance the precision of microsurgeries such as retinal procedures. Significant advancements have recently been achieved to increase the precision of such systems beyond surgeon capabilities. However, little attention has been paid to the impact of non-predicted and sudden movements of the patient and the environment. Therefore, analyzing environmental motion and vibrations is crucial to ensuring the optimal performance and reliability of medical systems that require micron-level precision, especially in real-life scenarios. To address this challenge, this paper introduces a novel environmental motion analysis system that employs a grid layout with distributed sensing nodes throughout the environment. This system effectively tracks undesired movements at designated locations and predicts upcoming motions using neural network-based approaches. The outcomes of our experiments exhibit promising prospects for real-time motion monitoring and prediction, which has the potential to form a solid basis for enhancing the automation, safety, integration, and overall efficiency of robot-assisted microsurgeries.
|
|
10:30-12:00, Paper WeAT19-NT.7 |
Cooperative vs. Teleoperation Control of the Steady Hand Eye Robot with Adaptive Sclera Force Control: A Comparative Study |
|
Esfandiari, Mojtaba | Johns Hopkins University |
Kim, Ji Woong | Johns Hopkins University |
Zhao, Botao | Johns Hopkins University |
Amirkhani, Golchehr | Johns Hopkins University |
Hadi, Muhammad | Johns Hopkins University |
Gehlbach, Peter | Johns Hopkins Medical Institute |
Taylor, Russell H. | The Johns Hopkins University |
Iordachita, Ioan Iulian | Johns Hopkins University |
Keywords: Medical Robots and Systems, Robust/Adaptive Control, Force Control
Abstract: A surgeon's physiological hand tremor can significantly impact the outcome of delicate and precise retinal surgery, such as retinal vein cannulation (RVC) and epiretinal membrane peeling. Robot-assisted eye surgery technology provides ophthalmologists with advanced capabilities such as hand tremor cancellation, hand motion scaling, and safety constraints that enable them to perform these otherwise challenging and high-risk surgeries with high precision and safety. The Steady-Hand Eye Robot (SHER) in cooperative control mode can filter out the surgeon's hand tremor; however, another important safety feature, namely minimizing the contact force between the surgical instrument and the sclera surface to avoid tissue damage, cannot be met in this control mode. Other capabilities, such as hand motion scaling and haptic feedback, also require a teleoperation control framework. In this work, for the first time, we implement a teleoperation control mode incorporating an adaptive sclera force control algorithm, using a PHANTOM Omni haptic device and a force-sensing surgical instrument equipped with Fiber Bragg Grating (FBG) sensors attached to the SHER 2.1 end-effector. This adaptive sclera force control algorithm allows the robot to dynamically minimize the tool-sclera contact force. Moreover, for the first time, we compare the performance of the proposed adaptive teleoperation mode with the cooperative mode by conducting a vessel-following experiment inside an eye phantom under a microscope.
|
|
10:30-12:00, Paper WeAT19-NT.8 |
Adaptive Motion Scaling for Robot-Assisted Microsurgery Based on Hybrid Offline Reinforcement Learning and Damping Control |
|
Jiang, Peiyang | University of Bristol |
Li, Wei | Imperial College London |
Li, Yifan | University of Bristol |
Zhang, Dandan | Imperial College London |
Keywords: Medical Robots and Systems, Robust/Adaptive Control
Abstract: Motion scaling is essential to empower users to conduct precise manipulation during teleoperation for robot-assisted microsurgery (RAMS). A constant, small motion scaling ratio can enhance the precision of teleoperation but hinder the operator from quickly reaching distant targets. The concept of self-adaptive motion scaling has been proposed in previous work. However, previous frameworks required extensive manual tuning of core parameters, which significantly depends on prior knowledge and may potentially lead to non-optimal solutions. This paper presents a hybrid offline reinforcement learning and damping control approach to regulate the motion scaling ratio for different operations during offline training. This method can take user-specific characteristics into consideration and help them achieve better teleoperation performance. Comparisons are made with and without using the adaptive motion-scaling algorithm. Detailed user studies indicate that a suitable motion-scaling ratio can be obtained and adjusted online. The overall performance of the operators in terms of time cost for task completion is significantly improved, while the variance of average speed and the total distance for robot operation is reduced.
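The trade-off the abstract describes (small scaling ratio for precision, large ratio for reaching distant targets) can be sketched with a simple speed-dependent rule. The paper learns the ratio with offline reinforcement learning and damping control; the function below only illustrates the general idea, and all constants are made-up tuning parameters.

```python
def adaptive_scale(master_speed, r_min=0.1, r_max=1.0, k=5.0):
    """Hypothetical speed-adaptive motion scaling ratio: small at low hand
    speed (fine manipulation), approaching r_max when the operator moves
    fast (gross reaching). master_speed is the hand speed magnitude."""
    r = r_min + (r_max - r_min) * (1.0 - 1.0 / (1.0 + k * master_speed))
    return min(max(r, r_min), r_max)  # clamp to the allowed range

# the slave-side increment is then: slave_delta = adaptive_scale(speed) * master_delta
```

A smooth, bounded mapping like this avoids abrupt jumps in the commanded increment, which is the same stability concern that motivates the damping control in the paper.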
|
|
10:30-12:00, Paper WeAT19-NT.9 |
Chained Flexible Capsule Endoscope: Unraveling the Conundrum of Size Limitations and Functional Integration for Gastrointestinal Transitivity |
|
Yuan, Sishen | The Chinese University of Hong Kong |
Li, Guang | Chinese University of Hong Kong |
Liang, Baijia | Chinese University of Hong Kong |
Li, Lailu | The Chinese University of Hong Kong |
Zheng, Qingzhuo | The Chinese University of Hong Kong |
Song, Shuang | Harbin Institute of Technology (Shenzhen) |
Li, Zhen | Qilu Hospital of Shandong University |
Ren, Hongliang | Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS) |
Keywords: Medical Robots and Systems, Soft Robot Applications
Abstract: Capsule endoscopes, predominantly serving diagnostic functions, provide lucid internal imagery but are devoid of surgical or therapeutic capabilities. Consequently, despite lesion detection, physicians frequently resort to traditional endoscopic or open surgical procedures for treatment, resulting in more complex, potentially risky interventions. To surmount these limitations, this study introduces a flexible capsule endoscope (FCE) design concept, specifically conceived to navigate the inherent volume constraints of capsule endoscopes whilst augmenting their therapeutic functionalities. The FCE’s distinctive flexibility originates from a conventional rotating joint design and the incision pattern in the flexible material. In vitro experiments validated the passive navigation ability of the FCE in rugged intestinal tracts. Further, the FCE demonstrates consistent reptile-like peristalsis under the influence of an external magnetic field, and possesses the capability for film expansion and disintegration under high-frequency electromagnetic stimulation. These findings illuminate a promising path toward amplifying the therapeutic capacities of capsule endoscopes without necessitating a size compromise.
|
|
WeAT20-NT Oral Session, NT-G302 |
Performance Evaluation and Benchmarking |
|
|
Chair: Roa, Maximo A. | German Aerospace Center (DLR) |
Co-Chair: Alenyà, Guillem | Institut de Robòtica i Informàtica Industrial, CSIC-UPC |
|
10:30-12:00, Paper WeAT20-NT.1 |
A Group Theoretic Metric for Robot State Estimation Leveraging Chebyshev Interpolation |
|
Agrawal, Varun | Georgia Institute of Technology |
Dellaert, Frank | Verdant Robotics/Georgia Tech |
Keywords: Performance Evaluation and Benchmarking
Abstract: We propose a new metric for robot state estimation based on the recently introduced SE_2(3) Lie group definition. Our metric is related to prior metrics for SLAM but explicitly takes into account the linear velocity of the state estimate, improving over current pose-based trajectory analysis. This has the benefit of providing a single, quantitative metric to evaluate state estimation algorithms against, while being compatible with existing tools and libraries. Since ground truth data generally consists of pose data from motion capture systems, we also propose an approach to compute the ground truth linear velocity based on polynomial interpolation. Using Chebyshev interpolation and a pseudospectral parameterization, we can accurately estimate the ground truth linear velocity of the trajectory in an optimal fashion with best approximation error. We demonstrate how this approach performs on multiple robotic platforms where accurate state estimation is vital, and compare it to alternative approaches such as finite differences. The pseudospectral parameterization also provides a means of trajectory data compression as an additional benefit. Experimental results show our method provides a valid and accurate means of comparing state estimation systems, which is also easy to interpret and report.
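The velocity-recovery idea can be sketched in a few lines using NumPy's Chebyshev utilities rather than the paper's pseudospectral parameterization: fit a Chebyshev series to mocap positions, then differentiate the fitted series analytically instead of finite-differencing noisy samples. The trajectory below is a toy 1-D example, not data from the paper.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

t = np.linspace(0.0, 2.0, 200)   # assumed mocap timestamps [s]
x = 0.5 * t**2                   # toy 1-D position track (constant 1 m/s^2)

coef = C.chebfit(t, x, deg=4)    # least-squares Chebyshev series fit
dcoef = C.chebder(coef)          # analytic derivative of the series
v = C.chebval(t, dcoef)          # ground-truth linear velocity estimate

# for this noise-free quadratic, the recovered velocity should be v(t) = t
```

Differentiating the fitted polynomial avoids the noise amplification of finite differences, which is the comparison the abstract draws.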
|
|
10:30-12:00, Paper WeAT20-NT.2 |
AD4RL: Autonomous Driving Benchmarks for Offline Reinforcement Learning with Value-Based Dataset |
|
Lee, Dongsu | Soongsil University |
Eom, Chanin | Soongsil University |
Kwon, Minhae | Soongsil University |
Keywords: Performance Evaluation and Benchmarking, Data Sets for Robot Learning, Reinforcement Learning
Abstract: Offline reinforcement learning has emerged as a promising technology, enhancing practicality through the use of large pre-collected datasets. Despite its practical benefits, most algorithm development research in offline reinforcement learning still relies on game tasks with synthetic datasets. To address this limitation, this paper provides autonomous driving datasets and benchmarks for offline reinforcement learning research. We provide 19 datasets, including real-world human drivers' datasets, and seven popular offline reinforcement learning algorithms in three realistic driving scenarios. We also provide a unified decision-making process model that can operate effectively across different scenarios, serving as a reference framework in algorithm design. Our research lays the groundwork for further collaborations in the community to explore practical aspects of existing reinforcement learning methods. Datasets and code can be found at https://sites.google.com/view/ad4rl.
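Offline RL datasets are commonly distributed in a D4RL-style dictionary of aligned transition arrays; the sketch below assumes that convention to show how an offline algorithm would sample minibatches from a pre-collected driving dataset. Field names and dimensions are illustrative and are not confirmed for AD4RL specifically.

```python
import numpy as np

# Assumed D4RL-style layout: one entry per logged transition.
dataset = {
    "observations": np.zeros((1000, 24), dtype=np.float32),
    "actions":      np.zeros((1000, 2),  dtype=np.float32),  # e.g. steering, accel
    "rewards":      np.zeros((1000,),    dtype=np.float32),
    "terminals":    np.zeros((1000,),    dtype=bool),
}

def sample_batch(data, batch_size, rng):
    """Uniformly sample a transition minibatch, as offline RL training loops do."""
    idx = rng.integers(0, len(data["rewards"]), size=batch_size)
    return {k: v[idx] for k, v in data.items()}

batch = sample_batch(dataset, 256, np.random.default_rng(0))
```

Because training never queries the environment, the whole algorithm-dataset interface reduces to this sampling step, which is what makes a shared dataset format useful for benchmarking.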
|
|
10:30-12:00, Paper WeAT20-NT.3 |
The Cluttered Environment Picking Benchmark (CEPB) for Advanced Warehouse Automation |
|
D'Avella, Salvatore | Sant'Anna School of Advanced Studies |
Bianchi, Matteo | University of Pisa |
Sundaram, Ashok M. | German Aerospace Center (DLR) |
Avizzano, Carlo Alberto | Scuola Superiore Sant'Anna |
Roa, Maximo A. | German Aerospace Center (DLR) |
Tripicchio, Paolo | Scuola Superiore Sant'Anna |
Keywords: Performance Evaluation and Benchmarking, Grasping, Dexterous Manipulation
Abstract: Autonomous and reliable robotic grasping is a desirable functionality in robotic manipulation and is still an open problem. Standardized benchmarks are important tools for evaluating and comparing robotic grasping and manipulation systems among different research groups, while also sharing with the community best practices for learning from errors. An ideal benchmarking protocol should encompass the different aspects underpinning grasp execution, including the mechatronic design of grippers, planning, perception, and control, to give information on each aspect and on the overall problem. The proposed work gives an overview of the benchmarks, datasets, and competitions that have been proposed and adopted in the last few years, and presents a novel benchmark with protocols for different tasks that evaluate both the single components of the system and the system as a whole, introducing an evaluation metric that allows for a fair comparison in highly cluttered scenes while taking into account the difficulty of the clutter. A website dedicated to the benchmark, containing information on the different tasks, maintaining the leaderboards, and serving as a contact point for the community, is also provided.
|
|
10:30-12:00, Paper WeAT20-NT.4 | Add to My Program |
SceneReplica: Benchmarking Real-World Robot Manipulation by Creating Replicable Scenes |
|
Khargonkar, Ninad | University of Texas at Dallas |
Allu, Sai Haneesh | The University of Texas at Dallas |
Lu, Yangxiao | The University of Texas at Dallas |
P, Jishnu Jaykumar | The University of Texas at Dallas |
Prabhakaran, B | University of Texas at Dallas |
Xiang, Yu | University of Texas at Dallas |
Keywords: Performance Evaluation and Benchmarking, Grasping, Perception for Grasping and Manipulation
Abstract: We present a new reproducible benchmark for evaluating robot manipulation in the real world, specifically focusing on a pick-and-place task. Our benchmark uses the YCB object set, a commonly used dataset in the robotics community, to ensure that our results are comparable to other studies. Additionally, the benchmark is designed to be easily reproducible in the real world, making it accessible to researchers and practitioners. We also provide our experimental results and analyses for model-based and model-free 6D robotic grasping on the benchmark, where representative algorithms are evaluated for object perception, grasp planning, and motion planning. We believe that our benchmark will be a valuable tool for advancing the field of robot manipulation. By providing a standardized evaluation framework, researchers can more easily compare different techniques and algorithms, leading to faster progress in developing robot manipulation methods. The appendix, code, and videos for the project are available at https://irvlutd.github.io/SceneReplica.
|
|
10:30-12:00, Paper WeAT20-NT.5 | Add to My Program |
CRITERIA: A New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving |
|
Chen, Changhe | University of Toronto |
Pourkeshavarz, Mozhgan | Research Scientist at Huawei |
Rasouli, Amir | Huawei Technologies Canada |
Keywords: Performance Evaluation and Benchmarking, Intelligent Transportation Systems, Intention Recognition
Abstract: Benchmarking is a common method for evaluating trajectory prediction models for autonomous driving. Existing benchmarks rely on datasets that are biased towards more common scenarios, such as cruising, and on distance-based metrics computed by averaging over all scenarios. Following such a regimen provides little insight into the properties of the models, both in terms of how well they handle different scenarios and how admissible and diverse their outputs are. A number of complementary metrics have been designed to measure the admissibility and diversity of trajectories; however, they suffer from biases, such as sensitivity to trajectory length. In this paper, we propose a new benChmarking paRadIgm for evaluaTing trajEctoRy predIction Approaches (CRITERIA). In particular, we propose 1) a method for extracting driving scenarios at varying levels of specificity, according to the structure of the roads, the models' performance, and data properties, for fine-grained ranking of prediction models; 2) a set of new bias-free metrics that measure diversity by incorporating the characteristics of a given scenario, and admissibility by considering the structure of roads and kinematic compliance, motivated by real-world driving constraints; and 3) using the proposed benchmark, extensive experimentation on a representative set of prediction models with the large-scale Argoverse dataset. We show that the proposed benchmark produces a more accurate ranking of the models and serves as a means of characterizing their behavior. We further present ablation studies to highlight the contributions of the different elements used to compute the proposed metrics.
|
|
10:30-12:00, Paper WeAT20-NT.6 | Add to My Program |
LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection |
|
Hung, Wei-Chih | Waymo |
Casser, Vincent Michael | Waymo |
Kretzschmar, Henrik | Waymo |
Hwang, Jyh-Jing | Waymo |
Anguelov, Dragomir | Waymo |
Keywords: Performance Evaluation and Benchmarking, Object Detection, Segmentation and Categorization, Data Sets for Robotic Vision
Abstract: 3D Average Precision (3D AP) relies on the intersection over union between predictions and ground-truth objects. However, camera-only detectors have limited depth accuracy, which may cause otherwise reasonable predictions suffering from longitudinal localization errors to be treated as false positives. We therefore propose variants of the 3D AP metric that are more permissive with respect to depth estimation errors. Specifically, our novel longitudinal-error-tolerant metrics, LET-3D-AP and LET-3D-APL, allow longitudinal localization errors of the prediction boxes up to a given tolerance. To evaluate the proposed metrics, we also construct a new test set for the Waymo Open Dataset, tailored to camera-only 3D detection methods. Surprisingly, we find that state-of-the-art camera-based detectors can outperform popular LiDAR-based detectors under our new metrics at a 10% depth error tolerance, suggesting that existing camera-based detectors already have the potential to surpass LiDAR-based detectors in downstream applications. We believe the proposed metrics and the new benchmark dataset will facilitate advances in the field of camera-only 3D detection by providing more informative signals that better indicate system-level performance.
|
|
10:30-12:00, Paper WeAT20-NT.7 | Add to My Program |
HuNavSim: A ROS 2 Human Navigation Simulator for Benchmarking Human-Aware Robot Navigation |
|
Perez-Higueras, Noe | University Pablo De Olavide |
Otero, Roberto | University Pablo De Olavide |
Caballero, Fernando | Universidad De Sevilla |
Merino, Luis | Universidad Pablo De Olavide |
Keywords: Performance Evaluation and Benchmarking, Simulation and Animation, Human-Aware Motion Planning
Abstract: This work presents the Human Navigation Simulator (HuNavSim), a novel open-source tool for simulating different human-agent navigation behaviors in scenarios with mobile robots. The tool, the first of its kind programmed under the ROS 2 framework, can be used together with well-known robotics simulators such as Gazebo. The main goal is to facilitate the development and evaluation of human-aware robot navigation systems in simulation. In addition to a general human-navigation model, HuNavSim includes, as a novelty, a rich set of individual and varied human navigation behaviors and a comprehensive set of metrics for social navigation benchmarking.
|
|
10:30-12:00, Paper WeAT20-NT.8 | Add to My Program |
RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance |
|
Mayoral-Vilches, Victor | Klagenfurt University |
Jabbour, Jason | Harvard University |
Hsiao, Yu-Shun | Harvard University |
Wan, Zishen | Georgia Institute of Technology |
Crespo-Álvarez, Martiño | Acceleration Robotics |
Stewart, Matthew | Harvard University |
Reina-Muñoz, Juan Manuel | Acceleration Robotics |
Nagras, Prateek | Acceleration Robotics |
Vikhe, Gaurav | Acceleration Robotics |
Bakhshalipour, Mohammad | Carnegie Mellon University |
Pinzger, Martin | Universität Klagenfurt |
Rass, Stefan | Alpen-Adria Universität Klagenfurt |
Panigrahi, Smruti | Ford Motor Company |
Corradi, Giulio | AMD |
Roy, Niladri | Intel |
Gibbons, Phillip | Carnegie Mellon University |
Neuman, Sabrina | Boston University |
Plancher, Brian | Barnard College, Columbia University |
Janapa Reddi, Vijay | Harvard University |
Keywords: Performance Evaluation and Benchmarking, Software Tools for Benchmarking and Reproducibility, Computer Architecture for Robotic and Automation
Abstract: We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and replacing them with a test application, and grey-box testing, an application-specific measure that observes internal system states with minimal interference. Our benchmarking framework provides ready-to-use tools and is easily adaptable for the assessment of custom ROS 2 computational graphs. Drawing from the knowledge of leading robot architects and system architecture experts, RobotPerf establishes a standardized approach to robotics benchmarking. As an open-source initiative, RobotPerf remains committed to evolving with community input to advance the future of hardware-accelerated robotics.
|
|
10:30-12:00, Paper WeAT20-NT.9 | Add to My Program |
Standardization of Cloth Objects and Its Relevance in Robotic Manipulation |
|
Garcia-Camacho, Irene | Institut de Robòtica i Informàtica Industrial CSIC-UPC |
Longhini, Alberta | KTH Royal Institute of Technology |
Welle, Michael C. | KTH Royal Institute of Technology |
Alenyà, Guillem | CSIC-UPC |
Kragic, Danica | KTH |
Borràs Sol, Júlia | Institut de Robòtica i Informàtica Industrial (CSIC-UPC) |
Keywords: Performance Evaluation and Benchmarking, Software Tools for Benchmarking and Reproducibility, Grasping
Abstract: The field of robotics faces inherent challenges in manipulating deformable objects, particularly in understanding and standardising fabric properties like elasticity, stiffness, and friction. While the significance of these properties is evident in the realm of cloth manipulation, accurately categorising and comprehending them in real-world applications remains elusive. This study sets out to address two primary objectives: (1) to provide a framework suitable for robotics applications to characterise cloth objects, and (2) to study how these properties influence robotic manipulation tasks. Our preliminary results validate the framework's ability to characterise cloth properties and compare cloth sets, and reveal the influence that different properties have on the outcome of five manipulation primitives. We believe that, in general, results on the manipulation of clothes should be reported along with a better description of the garments used in the evaluation. This paper proposes a set of these measures.
|
|
WeAT22-NT Oral Session, NT-G304 |
Add to My Program |
Marine Robotics IV |
|
|
Chair: Hollinger, Geoffrey | Oregon State University |
Co-Chair: Jang, Junwoo | University of Michigan |
|
10:30-12:00, Paper WeAT22-NT.1 | Add to My Program |
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization |
|
Paine, Tyler | Massachusetts Institute of Technology |
Benjamin, Michael | Massachusetts Institute of Technology |
Keywords: Marine Robotics, Multi-Robot Systems, Cooperating Robots
Abstract: This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a non-linear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS through the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized, and the communication cost scales with the number of group options, not the number of agents. We demonstrated the effectiveness of this approach using a hypothetical 'explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
|
|
10:30-12:00, Paper WeAT22-NT.2 | Add to My Program |
Convex Geometric Trajectory Tracking Using Lie Algebraic MPC for Autonomous Marine Vehicles |
|
Jang, Junwoo | University of Michigan |
Teng, Sangli | University of Michigan, Ann Arbor |
Ghaffari, Maani | University of Michigan |
Keywords: Marine Robotics, Optimization and Optimal Control, Motion Control
Abstract: Controlling marine vehicles in challenging environments is a complex task due to the presence of nonlinear hydrodynamics and uncertain external disturbances. Despite nonlinear model predictive control (MPC) showing potential in addressing these issues, its practical implementation is often constrained by computational limitations. In this paper, we propose an efficient controller for trajectory tracking of marine vehicles by employing a convex error-state MPC on the Lie group. By leveraging the inherent geometric properties of the Lie group, we can construct globally valid error dynamics and formulate a quadratic programming-based optimization problem. Our proposed MPC demonstrates effectiveness in trajectory tracking through extensive numerical simulations, including scenarios involving ocean currents. Notably, our method substantially reduces computation time compared to nonlinear MPC, making it well-suited for real-time control applications with long prediction horizons or involving small marine vehicles.
|
|
10:30-12:00, Paper WeAT22-NT.3 | Add to My Program |
Mission Planning for Multiple Autonomous Underwater Vehicles with Constrained in Situ Recharging |
|
Singh, Priti | Oregon State University |
Hollinger, Geoffrey | Oregon State University |
Keywords: Marine Robotics, Path Planning for Multiple Mobile Robots or Agents, Energy and Environment-Aware Automation
Abstract: Persistent operation of Autonomous Underwater Vehicles (AUVs) without manual interruption for recharging saves time and total cost for offshore monitoring and data collection applications. To enable AUVs to sustain long mission durations without ship support, they can be equipped with docking capabilities to recharge in situ at Wave Energy Converter (WEC) dock recharging stations. However, the power generated at the recharging stations may be constrained depending on the sea conditions. Therefore, a robust mission planning framework is proposed using a centralized Evolutionary Algorithm (EA) and a decentralized Monte Carlo Tree Search (MCTS) method. Both methods incorporate the charge availability constraint at the recharging station in addition to the maximum charge capacity of each AUV. The planner utilizes a time-varying power profile of irregular waves incident at WECs for dock charging and generates efficient mission plans for AUVs by optimizing their time to visit the dock based on the imposed constraint. The effects of increasing the number of AUVs, increasing the number of points of interest in the mission area, and varying sea state on the mission duration are also analyzed.
|
|
10:30-12:00, Paper WeAT22-NT.4 | Add to My Program |
Decentralized Multi-Robot Navigation for Autonomous Surface Vehicles with Distributional Reinforcement Learning |
|
Lin, Xi | Stevens Institute of Technology |
Huang, Yewei | Stevens Institute of Technology |
Chen, Fanfei | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Marine Robotics, Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning
Abstract: Collision avoidance algorithms for Autonomous Surface Vehicles (ASVs) that follow the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs) have been proposed in recent years. However, it may be difficult and unsafe to follow COLREGs in congested waters, where multiple ASVs are navigating in the presence of static obstacles and strong currents, due to the complex interactions. To address this problem, we propose a decentralized multi-ASV collision avoidance policy based on Distributional Reinforcement Learning, which considers the interactions among ASVs as well as with static obstacles and current flows. We evaluate the performance of the proposed Distributional-RL-based policy against a traditional RL-based policy and two classical methods, Artificial Potential Fields (APF) and Reciprocal Velocity Obstacles (RVO), in simulation experiments, which show that the proposed policy achieves superior performance in navigation safety while requiring minimal travel time and energy. A variant of our framework that automatically adapts its risk sensitivity is also demonstrated to improve ASV safety in highly congested environments.
|
|
10:30-12:00, Paper WeAT22-NT.5 | Add to My Program |
Real-Time Planning under Uncertainty for AUVs Using Virtual Maps |
|
Collado-Gonzalez, Ivana | Stevens Institute of Technology |
McConnell, John | Stevens Institute of Technology |
Wang, Jinkun | Stevens Institute of Technology |
Szenher, Paul | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Marine Robotics, Planning under Uncertainty, Reactive and Sensor-Based Planning
Abstract: Reliable localization is an essential capability for marine robots navigating in GPS-denied environments. SLAM, commonly used to mitigate dead reckoning errors, still fails in feature-sparse environments or with limited-range sensors. Pose estimation can be improved by incorporating the uncertainty prediction of future poses into the planning process and choosing actions that reduce uncertainty. However, performing belief propagation is computationally costly, especially when operating in large-scale environments. This work proposes a computationally efficient planning-under-uncertainty framework suitable for large-scale, feature-sparse environments. Our strategy leverages SLAM graph and occupancy map data obtained from a prior exploration phase to create a virtual map, describing the uncertainty of each map cell with a multivariate Gaussian. The virtual map is then used as a cost map in the planning phase, and performing belief propagation at each step is avoided. A receding horizon planning strategy is implemented, managing a tradeoff between goal-reaching and uncertainty reduction. Simulation experiments in a realistic underwater environment validate this approach. Experimental comparisons against a full belief propagation approach and a standard shortest-distance approach are conducted.
|
|
10:30-12:00, Paper WeAT22-NT.6 | Add to My Program |
Sea-U-Foil: A Hydrofoil Marine Vehicle with Multi-Modal Locomotion |
|
Zhao, Zuoquan | The Chinese University of Hong Kong |
Zhai, Yu | The Chinese University of Hong Kong |
Gao, Chuanxiang | The Chinese University of Hong Kong |
Ding, Wendi | The Chinese University of Hong Kong |
Yan, Ruixin | The Chinese University of Hong Kong |
Gao, Songqun | Chinese University of Hong Kong |
Han, Bingxin | The Chinese University of Hong Kong |
Liu, Xuchen | The Chinese University of Hong Kong |
Guo, Zixuan | The Chinese University of Hong Kong |
Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Marine Robotics, Product Design, Development and Prototyping, Search and Rescue Robots
Abstract: Autonomous Marine Vehicles (AMVs) have been widely used in many critical tasks such as surveillance, patrolling, marine environment monitoring, and hydrographic surveying. However, most typical AMVs cannot meet the diverse demands of different marine tasks. In this article, we design a new type of remote-controlled hydrofoil marine vehicle, named Sea-U-Foil, which is suitable for different marine scenarios. Sea-U-Foil features three distinct locomotion modes - displacement mode, foilborne mode, and submarine mode - which give the platform flexible mobility, high-speed and high-load capacities, and superior concealment. Specifically, the submarine mode makes Sea-U-Foil unique among previous studies. In addition, the performance of Sea-U-Foil in foilborne mode surpasses that of most current unmanned surface vehicles (USVs) in terms of speed and payload. To the best of our knowledge, we are the first to introduce a new type of AMV that can work in displacement mode, foilborne mode, and submarine mode. We elaborate on the design principles and methodologies of Sea-U-Foil first, then validate the effectiveness of its tri-modal locomotion through extensive experiments.
|
|
10:30-12:00, Paper WeAT22-NT.7 | Add to My Program |
Learning Which Side to Scan: Multi-View Informed Active Perception with Side Scan Sonar for Autonomous Underwater Vehicles |
|
Venkatramanan Sethuraman, Advaith | University of Michigan |
Baldoni, Philip | United States Naval Research Laboratory |
Skinner, Katherine | University of Michigan |
McMahon, James | The Naval Research Laboratory |
Keywords: Marine Robotics, Reactive and Sensor-Based Planning, Object Detection, Segmentation and Categorization
Abstract: Autonomous underwater vehicles often perform surveys that capture multiple views of targets in order to provide more information for human operators or automatic target recognition algorithms. In this work, we address the problem of choosing the most informative views that minimize survey time while maximizing classifier accuracy. We introduce a novel active perception framework for multi-view adaptive surveying and reacquisition using side scan sonar imagery. Our framework addresses this challenge by using a graph formulation for the adaptive survey task. We then use Graph Neural Networks (GNNs) to classify acquired sonar views and reinforcement learning to choose the next best view to capture based on the collected data. We evaluate our method using simulated surveys in a high-fidelity side scan sonar simulator. Our results demonstrate that our approach is able to surpass the state of the art in classification accuracy and efficiency. This framework is a promising approach for more efficient autonomous missions involving side scan sonar, such as underwater exploration, marine archaeology, and environmental monitoring.
|
|
10:30-12:00, Paper WeAT22-NT.8 | Add to My Program |
Development of a Lightweight Underwater Manipulator for Delicate Structural Repair Operations |
|
Mao, Juzheng | Southeast University |
Song, Guangming | Southeast University |
Hao, Shuang | Southeast University |
Zhang, Mingquan | Southeast University |
Song, Aiguo | Southeast University |
Keywords: Marine Robotics, Robotics and Automation in Construction, Engineering for Robotic Systems
Abstract: In recent years, underwater robots have been increasingly used in the maintenance of hydraulic structures. Underwater manipulators are essential devices that are used to carry out such maintenance tasks. For delicate repair operations such as fixing tiny cracks, most existing underwater manipulators face limitations in terms of size, accuracy, and scalability. Therefore, in this letter, we present a novel electric underwater manipulator, named SEU-4. This four-degree-of-freedom manipulator weighs 8.91 kg and has a maximum payload of 9 kg. It has a rapid-switching interface that supports convenient mechanical and electrical connections for end-effectors. To compensate for the disturbances that are present in the complex underwater environment, a trajectory-tracking control strategy based on a disturbance observer and sliding-mode control (DOB-SMC) is proposed. A prototype of the proposed underwater manipulator was created, and a flowing-water experimental platform was constructed to test its trajectory-tracking performance in fast-flowing water. The experimental results show that the manipulator achieves a trajectory-tracking error of 1.03 mm in static water and 2.91 mm in flowing water at 1.2 m/s, which satisfies the requirements of delicate repair operations.
|
|
WeAT23-NT Oral Session, NT-G401 |
Add to My Program |
Aerial Systems: Mechanics and Control IV |
|
|
Chair: Ott, Christian | TU Wien |
Co-Chair: Perez-Arancibia, Nestor O | Washington State University (WSU) |
|
10:30-12:00, Paper WeAT23-NT.1 | Add to My Program |
MPS: A New Method for Selecting the Stable Closed-Loop Equilibrium Attitude-Error Quaternion of a UAV During Flight |
|
Gonçalves, Francisco | Washington State University |
Bena, Ryan | University of Southern California |
Matveev, Konstantin | Washington State University |
Perez-Arancibia, Nestor O | Washington State University (WSU) |
Keywords: Aerial Systems: Mechanics and Control, Motion Control, Space Robotics and Automation
Abstract: We present model predictive selection (MPS), a new method for selecting the stable closed-loop (CL) equilibrium attitude-error quaternion (AEQ) of an uncrewed aerial vehicle (UAV) during the execution of high-speed yaw maneuvers. In this approach, we minimize the cost of yawing measured with a performance figure of merit (PFM) that takes into account both the aerodynamic-torque control input and attitude-error state of the UAV. Specifically, this method uses a control law with a term whose sign is dynamically switched in real time to select, between two options, the torque associated with the lesser cost of rotation as predicted by a dynamical model of the UAV derived from first principles. This problem is relevant because the selection of the stable CL equilibrium AEQ significantly impacts the performance of a UAV during high-speed rotational flight, from both the power and control-error perspectives. To test and demonstrate the functionality and performance of the proposed method, we present data collected during one hundred real-time high-speed yaw-tracking flight experiments. These results highlight the superior capabilities of the proposed MPS-based scheme when compared to a benchmark controller commonly used in aerial robotics, as the PFM used to quantify the cost of flight is reduced by 60.30% on average. To the best of our knowledge, these are the first flight-test results that thoroughly demonstrate, evaluate, and compare the performance of a real-time controller capable of selecting the stable CL equilibrium AEQ during operation.
|
|
10:30-12:00, Paper WeAT23-NT.2 | Add to My Program |
Realtime Brain-Inspired Adaptive Learning Control for Nonlinear Systems with Configuration Uncertainties (I) |
|
Zhang, Yanhui | Zhejiang University |
Tong, Zheyu | Zhejiang University |
Zhang, YiFan | Zhejiang University |
Chen, Song | Zhejiang University |
Yang, Junyuan | Zhejiang University |
Chen, Weifang | Zhejiang University |
Keywords: Aerial Systems: Mechanics and Control, Reinforcement Learning, Imitation Learning
Abstract: This paper investigates the problem of adaptive tracking control for a quadcopter in the presence of nonlinear configuration uncertainties. It utilizes a real-time brain-inspired learning control (RBiLC) method to address the challenges posed by nonlinear, time-varying uncertain instructions. To address the issue of flight control law reconfiguration caused by unknown changes in the fuselage configuration (e.g., propellers or motors), this paper introduces an online learning-evaluation-optimization reconstruction mechanism based on RBiLC. The proposed adaptive learning controller mitigates the need for extensive human resources and reduces the time required for flight controller design. The Lyapunov-Krasovskii function is introduced as a compensatory measure to address the impact of parameter uncertainty on system stability. Furthermore, this paper proposes a signed sinusoidal function perturbation estimate to guide the direction and magnitude throughout the online learning process. The approach conducts a theoretical stability analysis on a quadcopter vehicle considering uncertainties in UAV dynamics modeling. The results demonstrate that the proposed scheme achieves superior control and faster adaptation, enabling the system to ultimately converge to a compact set within a limited time domain. Finally, software-in-the-loop (SITL) simulations and flight verification results are presented to validate the proposed control strategy.
|
|
10:30-12:00, Paper WeAT23-NT.3 | Add to My Program |
Safety-Conscious Pushing on Diverse Oriented Surfaces with Underactuated Aerial Vehicles |
|
Hui, Tong | Technical University of Denmark |
Fernández González, Manuel Jesús | Automation and Control, Technical University of Denmark |
Fumagalli, Matteo | Danish Technical University |
Keywords: Aerial Systems: Mechanics and Control, Robot Safety, Underactuated Robots
Abstract: Pushing tasks performed by aerial manipulators can be used for contact-based industrial inspections. Underactuated aerial vehicles are widely employed in aerial manipulation due to their widespread availability and relatively low cost. Industrial infrastructures often consist of diverse oriented work surfaces. When interacting with such surfaces, the coupled gravity compensation and interaction force generation of underactuated aerial vehicles can present the potential challenge of near-saturation operations. The blind utilization of these platforms for such tasks can lead to instability and accidents, creating unsafe operating conditions and potentially damaging the platform. In order to ensure safe pushing on these surfaces while managing platform saturation, this work establishes a safety assessment process. This process involves the prediction of the saturation level of each actuator during pushing across variable surface orientations. Furthermore, the assessment results are used to plan and execute physical experiments, ensuring safe operations and preventing platform damage.
|
|
10:30-12:00, Paper WeAT23-NT.4 | Add to My Program |
Geranos: A Novel Tilted-Rotors Aerial Robot for the Transportation of Poles |
|
Gorlo, Nicolas | ETH Zurich |
Müller, Mario Sven | ETH Zürich |
Bamert, Samuel | ETH Zürich |
Reinhart, Tim | ETH Zurich |
Stadler, Henriette | ETH Zürich |
Cathomen, Rafael | ETH Zurich |
Käppeli, Gabriel | ETH Zürich |
Shen, Hua | ETH Zürich |
Cuniato, Eugenio | ETH Zurich |
Tognon, Marco | Inria Rennes |
Siegwart, Roland | ETH Zurich |
Keywords: Aerial Systems: Applications, Robotics and Automation in Construction, Grippers and Other End-Effectors
Abstract: Building and maintaining structures like antennas and cable-car masts in challenging terrain often involves hazardous and expensive sling-loaded helicopter operations. In this work, we challenge this paradigm by proposing Geranos: a multicopter unmanned aerial vehicle (UAV) adept at precisely placing vertical poles, blending load transport with precision. Geranos minimizes the effects of the poles' large moment of inertia by adopting a ring design that accommodates the pole in its center. To grasp the load, we developed a two-part grasping mechanism that creates a near-rigid connection between the UAV and the load. This lightweight construction is reliable and robust while not relying on active forces to maintain the grasp. The UAV utilizes four main propellers to counteract gravity and four auxiliary ones for enhanced lateral positional accuracy, ensuring full actuation around hovering. In a demonstration mimicking the installation of antennas or cable-car masts, we show the capability of Geranos to assemble poles (mass of 3 kg and length of 2 m) on top of each other. In this scenario, Geranos demonstrates an impressive load-placement accuracy of less than 5 cm.
|
|
10:30-12:00, Paper WeAT23-NT.5 | Add to My Program |
Robust and Energy-Efficient Control for Multi-Task Aerial Manipulation with Automatic Arm-Switching |
|
Wu, Ying | Sun Yat-Sen University |
Zhou, Zida | Sun Yat-Sen University |
Wei, Mingxin | Sun Yat-Sen University |
Cheng, Hui | Sun Yat-Sen University |
Keywords: Aerial Systems: Applications, Robust/Adaptive Control, Model Learning for Control
Abstract: Aerial manipulation has received increasing research interest with the wide application of drones. To perform specific tasks, robotic arms with various mechanical structures are mounted on the drone. This results in sudden disturbances to the aerial manipulator when switching the robotic arm or interacting with the environment. Hence, it is challenging to design a generic and robust control strategy that adapts to various robotic arms while achieving multi-task aerial manipulation. In this paper, we present a learning-based control algorithm that allows online trajectory optimization and tracking to accomplish various aerial interaction tasks without manual adjustment. The proposed energy-saving trajectory planning approach integrates a coupled dynamics model with a single rigid body to generate energy-efficient trajectories for the aerial manipulator. Addressing the challenges of precise control when performing aerial manipulation tasks, this paper presents a controller based on deep neural networks that classifies and learns the accurate forces and moments caused by different robotic arms and interactions. Moreover, the forces arising from robotic arm motions are delicately used as part of the drone's power to save energy. Extensive real-world experiments demonstrate that the proposed method can adapt to various robotic arms and interactions when performing multi-task aerial manipulation.
|
|
10:30-12:00, Paper WeAT23-NT.6 | Add to My Program |
Optimal Collaborative Transportation for Under-Capacitated Vehicle Routing Problems Using Aerial Drone Swarms |
|
Kopparam Sreedhara, Akash | Vellore Institute of Technology, Vellore |
Padala, Deepesh | Vellore Institute of Technology, Vellore |
Mahesh, Shashank | Vellore Institute of Technology, Vellore |
Cui, Kai | Technische Universität Darmstadt |
Li, Mengguang | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Aerial Systems: Applications, Optimization and Optimal Control, Swarm Robotics
Abstract: Swarms of aerial drones have recently been considered for last-mile deliveries in urban logistics or automated construction. At the same time, collaborative transportation of payloads by multiple drones is another important area of recent research. However, efficient coordination algorithms for collaborative transportation of many payloads by many drones remain to be considered. In this work, we formulate the collaborative transportation of payloads by a swarm of drones as a novel, under-capacitated generalization of vehicle routing problems (VRP), which may be of independent interest. In contrast to standard VRP and capacitated VRP, we must additionally consider waiting times for payloads lifted cooperatively by multiple drones, and the corresponding coordination. Algorithmically, we provide a solution encoding that avoids deadlocks and formulate an appropriate alternating minimization scheme to solve the problem. On the hardware side, we integrate our algorithms with collision avoidance and drone controllers. The approach and the impact of the system integration are successfully verified empirically, both on a swarm of real nano-quadcopters and for large swarms in simulation. Overall, we provide a framework for collaborative transportation with aerial drone swarms that uses only as many drones as necessary for the transportation of any single payload.
|
|
10:30-12:00, Paper WeAT23-NT.7 | Add to My Program |
A Modular Aerial System Based on Homogeneous Quadrotors with Fault-Tolerant Control |
|
Li, Mengguang | Technische Universität Darmstadt |
Cui, Kai | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Aerial Systems: Mechanics and Control, Swarm Robotics, Aerial Systems: Applications
Abstract: The standard quadrotor is one of the most popular and widely used aerial vehicles of recent decades, offering great maneuverability with mechanical simplicity. However, its under-actuation limits its applications, especially when it comes to generating a desired wrench with six degrees of freedom (DOF). Therefore, existing work often compromises between mechanical complexity and the controllable DOF of the aerial system. To take advantage of the mechanical simplicity of a standard quadrotor, we propose a modular aerial system, IdentiQuad, that combines only homogeneous quadrotor-based modules. Each IdentiQuad can be operated alone like a standard quadrotor, but at the same time allows task-specific assembly, increasing the controllable DOF of the system. Each module is interchangeable within its assembly. We also propose a general controller for different configurations of assemblies, capable of tolerating rotor failures and balancing the energy consumption of each module. The functionality and robustness of the system and its controller are validated using physics-based simulations for different assembly configurations.
|
|
10:30-12:00, Paper WeAT23-NT.8 | Add to My Program |
Observer-Based Controller Design for Oscillation Damping of a Novel Suspended Underactuated Aerial Platform |
|
Das, Hemjyoti | Technical University of Vienna |
Vu, Minh Nhat | TU Wien, Austria |
Egle, Tobias | TU Wien |
Ott, Christian | TU Wien |
Keywords: Aerial Systems: Mechanics and Control, Underactuated Robots
Abstract: In this work, we present a novel actuation strategy for a suspended aerial platform. By utilizing an underactuation approach, we demonstrate successful oscillation damping of the proposed platform, modeled as a spherical double pendulum. A state estimator that uses only onboard IMU measurements is designed to obtain the deflection angles of the platform. The state estimator is an extended Kalman filter (EKF) with intermittent measurements obtained at different frequencies. An optimal state feedback controller and a PD+ controller are designed to damp the oscillations of the platform in joint space and task space, respectively. The proposed underactuated platform is found to be more energy-efficient than an omnidirectional platform and requires fewer actuators. The effectiveness of our proposed system is validated using both simulations and experimental studies.
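To make the intermittent-measurement idea concrete, here is a minimal sketch (not the authors' implementation; the linear models, state layout, and noise values are illustrative placeholders) of an EKF cycle that runs the prediction at every step and applies the correction only when a measurement actually arrives:

```python
import numpy as np

def ekf_step(x, P, F, Q, H, R, z=None):
    """One filter cycle with an intermittent measurement.

    Predict every cycle; correct only when a measurement z arrives
    (z is None on cycles where the sensor produced no data).
    Linear models are used here for brevity.
    """
    # Prediction step
    x = F @ x
    P = F @ P @ F.T + Q
    if z is not None:
        # Correction step (standard Kalman update)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```

In a setup like the paper's, low-rate IMU-derived deflection-angle observations would enter as `z` at their own frequency, while the prediction keeps running at the controller rate.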
|
|
10:30-12:00, Paper WeAT23-NT.9 | Add to My Program |
MOAR Planner: Multi-Objective and Adaptive Risk-Aware Path Planning for Infrastructure Inspection with a UAV |
|
Petit, Louis | McGill University |
Lussier Desbiens, Alexis | Université De Sherbrooke |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Task and Motion Planning
Abstract: The problem of autonomous navigation for UAV inspection remains challenging, as it requires navigating effectively in close proximity to obstacles while accounting for dynamic risk factors such as weather conditions, communication reliability, and battery autonomy. This paper introduces the MOAR path planner, which addresses the complexities of evolving risks during missions. It offers real-time trajectory adaptation while concurrently optimizing safety, time, and energy. The planner employs a risk-aware cost function that integrates pre-computed cost maps, the new concepts of damage and insertion costs, and an adaptive speed planning framework. The optimal path is then searched for in a graph using a discrete representation of the state and action spaces. The method is evaluated through simulations and real-world flight tests. The results show the capability to generate real-time trajectories spanning a broad range of evaluation metrics, around 90% of the range occupied by popular algorithms. The proposed framework contributes by enabling UAVs to navigate more autonomously and reliably in critical missions.
|
|
WeAT24-NT Oral Session, NT-G402 |
Add to My Program |
Field Robotics and Automation |
|
|
Chair: Liarokapis, Minas | The University of Auckland |
Co-Chair: Chowdhary, Girish | University of Illinois at Urbana Champaign |
|
10:30-12:00, Paper WeAT24-NT.1 | Add to My Program |
SOL: A Compact, Portable, Telescopic, Soft-Robotic Sun-Tracking Mechanism for Improved Solar Power Production |
|
Busby, Bryan | The University of Auckland |
Duan, Shifei | University of Auckland |
Thompson, Marcus | Whanauka Limited |
Liarokapis, Minas | The University of Auckland |
Keywords: Energy and Environment-Aware Automation
Abstract: Solar power is becoming an increasingly popular option for energy production in commercial and private applications. While installing solar panels (photovoltaic cells) in a stationary configuration is simple and inexpensive, such a setup fails to maximise their potential solar energy production. Single- and dual-axis sun trackers automatically adjust the tilt angle of photovoltaic cells so that they directly face the sun, but these also come with their own drawbacks, such as increased cost and weight. This paper presents SOL, a soft-robotic, dual-axis, sun-tracking mechanism for improved solar panel efficiency. The proposed design was built to be compact, portable, and lightweight, and it utilises closed-loop control for the intelligent actuation of a set of soft telescopic structures that raise and tilt the solar panels in the direction of the sun. The performance of the proposed solar tracking platform was experimentally validated in terms of its maximum elevation at different azimuths and its ability to balance different loads. The result is a device that provides solar panel users with an accessible, affordable, and convenient means of increasing the efficiency of their solar energy system.
|
|
10:30-12:00, Paper WeAT24-NT.2 | Add to My Program |
Measuring Ball Joint Faults in Parabolic-Trough Solar Plants with Data Augmentation and Deep Learning |
|
Pérez Cutiño, Miguel Angel | Universidad De Sevilla |
Capitan, Jesus | University of Seville |
Díaz-Bañez, José-Miguel | Universidad Sevilla |
Valverde, Juan | University of Seville |
Keywords: Energy and Environment-Aware Automation, Deep Learning for Visual Perception, Aerial Systems: Applications
Abstract: Automatic inspection of parabolic-trough solar plants is key to preventing failures that can harm the environment and the production of green energy. In this work, we propose a novel methodology to inspect ball joints in parabolic trough collectors, which is a relevant problem that is not adequately covered in the literature. Images collected by an Unmanned Aerial Vehicle are segmented using deep learning to extract ball joint components. In order to generate rich training datasets, we develop a novel data augmentation technique by rotating joints and adding synthetic image background, and demonstrate its impact on the object detection accuracy. Then two types of faults are analyzed: fluid leaks, by means of image color filtering; and geometric shape anomalies, by measuring joint angles of the robotic arms. We propose metrics to quantify these faults and evaluate the damage of the inspected components. Our experimental results with images from operating commercial plants show that we can automatically detect leaks and anomalous angular geometry with a low failure rate compared to human labeling.
|
|
10:30-12:00, Paper WeAT24-NT.3 | Add to My Program |
ECDP: Energy Consumption Disaggregation Pipeline for Energy Optimization in Lightweight Robots |
|
Heredia, Juan | University of Southern Denmark |
Kirschner, Robin Jeanne | TU Munich, Institute for Robotics and Systems Intelligence |
Schlette, Christian | University of Southern Denmark (SDU) |
Abdolshah, Saeed | KUKA Deutschland GmbH |
Haddadin, Sami | Technical University of Munich |
Kjærgaard, Mikkel | University of Southern Denmark |
Keywords: Energy and Environment-Aware Automation, Engineering for Robotic Systems
Abstract: Limited resources and the resulting energy crises occurring all over the world highlight the importance of energy efficiency in technological developments such as robotic manipulators. Efficient energy consumption of manipulators is necessary to make them affordable and to spread their application in future industry. Previously, the power consumption of robot motion was the main factor considered in the evaluation of energy efficiency. Lately, the paradigm in industrial robotics has shifted towards lightweight robot manipulators, which requires a new investigation of the disaggregation of robot energy consumption. In this paper, we propose a novel pipeline to identify and disaggregate the energy use of mechatronic devices and apply it to lightweight industrial robots. The proposed method allows the identification of the electronic components' consumption, mechanical losses, electrical losses, and the mechanical energy required for robot motion. We evaluate the pipeline and study the distribution of energy consumption using four different manipulators, namely Universal Robots' UR5e and UR10e, Franka Emika's FR3, and Kinova's Gen3. The experimental results show that most of the energy (60-90%) is consumed by the electronic components of the robot control box. Given this, approaches to further optimize energy consumption need to shift towards efficient robot electronics design instead of efficient robot mass distribution or motion control.
|
|
10:30-12:00, Paper WeAT24-NT.4 | Add to My Program |
Autonomous UAV Mission Cycling: A Mobile Hub Approach for Precise Landings and Continuous Operations in Challenging Environments |
|
Moortgat-Pick, Alexander | Technical University of Munich (TUM) |
Schwahn, Marie | Technical University of Munich |
Adamczyk, Anna | Technical University of Munich (TUM) |
Duecker, Daniel Andre | Technical University of Munich (TUM) |
Haddadin, Sami | Technical University of Munich |
Keywords: Environment Monitoring and Management, Aerial Systems: Applications, Field Robots
Abstract: Environmental monitoring via UAVs offers unprecedented aerial observation capabilities. However, the limited flight durations of typical multirotors and the demands on human attention in outdoor missions call for more autonomous solutions. Addressing the specific challenges of precise UAV landings—especially amidst wind disturbances, obstacles, and unreliable global localization—we introduce a mobile hub concept. This hub facilitates continuous mission cycling for unmodified off-the-shelf UAVs. Our approach centers on a small landing platform affixed to a robotic arm, adeptly correcting UAV pose errors in windy conditions. Compact enough for installation in an economy car, the system emphasizes two novel strategies. Firstly, external visual tracking of the UAV informs the landing controls for both the drone and the robotic arm. The arm compensates for UAV positioning errors and aligns the platform's attitude with the UAV for stable landings, even on small platforms under windy conditions. Secondly, the robotic arm can transport the UAV inside the hub, perform maintenance tasks like battery replacements, and then facilitate direct relaunches. Importantly, our design places all operational responsibility on the hub, ensuring the UAV remains unaltered. This ensures broad compatibility with standard UAVs, only necessitating an API for attitude setpoints. Experimental results underscore the efficiency of our model, achieving safe landings with minimal errors (≤ 7 cm) in winds up to 5 Beaufort (8.1 m/s). In essence, our mobile hub concept significantly boosts UAV mission availability, allowing for autonomous operations even under challenging conditions.
|
|
10:30-12:00, Paper WeAT24-NT.5 | Add to My Program |
Low-To-High Resolution Path Planner for Robotic Gas Distribution Mapping |
|
Nanavati, Rohit Vishwajit | Loughborough University |
Rhodes, Callum | Imperial College London |
Coombes, Matthew | Loughborough University |
Liu, Cunjia | Loughborough University |
Keywords: Environment Monitoring and Management, Robotics in Hazardous Fields, Reactive and Sensor-Based Planning
Abstract: Robotic gas distribution mapping improves the understanding of a hazardous gas dispersion while keeping the human operator out of danger. Generating an accurate gas distribution map quickly is of utmost importance in situations such as gas leaks and industrial incidents, so that resources can be used efficiently in the incident response. In this paper, to incorporate the operational requirement on map granularity, we propose a low-to-high resolution path planner that first guides a single robot to quickly and sparsely sample the region of interest to generate a low-resolution gas distribution map, followed by high-resolution sampling informed by the low-resolution map as a prior. The low-resolution prior acts as a coverage survey, allowing the algorithm to perform a relatively exploitative search of high-concentration regions, resulting in overall shorter mission times. The proposed framework is designed to iteratively identify the next best T locations to sample, prioritising the potentially high-reward locations while ensuring that the robot can travel to and sample the chosen locations within a user-specified map update cycle. We present a simulation study to demonstrate the alternating exploration-exploitation behaviour, along with benchmarking its performance against traditional sampling path planners and various reward functions.
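The "next best T locations within a map update cycle" step can be sketched as a greedy budgeted selection; the reward values, the travel-time model, and the budget below are hypothetical placeholders, not the paper's actual reward functions:

```python
import math

def select_next_locations(start, candidates, reward, T, budget, speed=1.0):
    """Greedily pick up to T sampling locations while the accumulated
    travel time stays within the update-cycle budget.

    candidates: list of (x, y) tuples; reward: dict mapping location -> value.
    Returns the ordered route (a placeholder for the paper's planner).
    """
    route, pos, t_used = [], start, 0.0
    remaining = set(candidates)
    while remaining and len(route) < T:
        # Rank by reward per unit travel time from the current position.
        best = max(remaining,
                   key=lambda c: reward[c] / (1e-6 + math.dist(pos, c) / speed))
        dt = math.dist(pos, best) / speed
        if t_used + dt > budget:
            remaining.remove(best)  # unreachable this cycle; try others
            continue
        route.append(best)
        t_used += dt
        pos = best
        remaining.remove(best)
    return route
```

A real planner would replace the reward values with the map-informed expected information gain described in the abstract; the budget corresponds to the user-specified map update cycle.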
|
|
10:30-12:00, Paper WeAT24-NT.6 | Add to My Program |
Persistent Monitoring of Large Environments with Robot Deployment Scheduling in between Remote Sensing Cycles |
|
Masaba, Kizito | Dartmouth College |
Roznere, Monika | Dartmouth College |
Jeong, Mingi | Dartmouth College |
Quattrini Li, Alberto | Dartmouth College |
Keywords: Environment Monitoring and Management, Robotics in Under-Resourced Settings, Planning under Uncertainty
Abstract: This paper proposes a novel decision-making framework for planning “when” and “where” to deploy robots based on prior data, with the goal of persistently monitoring a spatio-temporal phenomenon in an environment. We specifically focus on large lake monitoring, where remote sensors, such as satellites, can provide a snapshot of the target phenomenon at regular cycles. Between these cycles, Autonomous Surface Vehicles (ASVs) can be deployed to maintain an up-to-date model of the phenomenon. However, deploying ASVs carries a significant logistical overhead in terms of time and cost: it requires a team of people to go on site and typically spend a day monitoring the deployment. It is therefore vital not only to be intentional about where to sample in the environment on a given day, but also to determine whether deploying the ASVs that day is worthwhile at all. We propose a persistent monitoring strategy that provides the days and locations of when and where to sample with the robots by leveraging Gaussian Process model estimates of future trends based on collected remote sensing and point measurement data. Our approach minimizes the number of sampling days and locations while preserving the quality of the estimates. Through simulation experiments using realistic spatio-temporal datasets, we demonstrate the benefits of our approach over traditional deployment strategies, including significant savings in the effort and operational cost of deploying the ASVs.
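A minimal sketch of the underlying idea is to deploy only on days where the Gaussian Process predictive uncertainty is high; the squared-exponential kernel, its hyperparameters, and the threshold below are illustrative assumptions, not the paper's model:

```python
import numpy as np

def rbf(a, b, ls=2.0, var=1.0):
    """Squared-exponential kernel over scalar inputs (e.g. a day index)."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def gp_predict_var(x_train, x_query, noise=1e-4):
    """Posterior predictive variance of a zero-mean GP at x_query."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    Kss = rbf(x_query, x_query)
    v = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return np.diag(v)

def days_to_deploy(sampled_days, horizon_days, threshold=0.5):
    """Deploy the ASVs only on days where the predictive variance
    exceeds the threshold (a placeholder decision rule)."""
    var = gp_predict_var(np.array(sampled_days, float),
                         np.array(horizon_days, float))
    return [d for d, v in zip(horizon_days, var) if v > threshold]
```

Days close to past samples inherit low variance and are skipped, while days far from any observation trigger a deployment, which is the qualitative behaviour the abstract describes.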
|
|
10:30-12:00, Paper WeAT24-NT.7 | Add to My Program |
System Calibration of a Field Phenotyping Robot with Multiple High-Precision Profile Laser Scanners |
|
Esser, Felix | University of Bonn |
Tombrink, Gereon | University of Bonn |
Cornelißen, André | University of Bonn |
Klingbeil, Lasse | University of Bonn |
Kuhlmann, Heiner | University of Bonn |
Keywords: Field Robots, Robotics and Automation in Agriculture and Forestry, Agricultural Automation
Abstract: The creation of precise and high-resolution crop point clouds in agricultural fields has become a key challenge for high-throughput phenotyping applications. This work implements a novel method to calibrate the laser scanning system of an agricultural field robot, consisting of two industrial-grade laser scanners used for high-precision 3D crop point cloud creation. The calibration method optimizes the transformation between the scanner origins and the robot pose by minimizing 3D point omnivariances within the point cloud. Moreover, we present a novel factor-graph-based pose estimation method that fuses total station prism measurements with IMU and GNSS heading information for high-precision pose determination during calibration. The root-mean-square error of the distances to a georeferenced ground-truth point cloud is 0.8 cm after parameter optimization. Furthermore, our results show the importance of a reference point cloud, which the calibration method needs in order to estimate the vertical translation of the calibration. Challenges arise from non-static parameters while the robot moves, indicated by systematic deviations from a ground-truth terrestrial laser scan.
|
|
10:30-12:00, Paper WeAT24-NT.8 | Add to My Program |
Atmospheric Aerosol Diagnostics with UAV-Based Holographic Imaging and Computer Vision |
|
Bristow, Nathaniel | University of Minnesota |
Pardoe, Nikolas | University of Minnesota |
Hong, Jiarong | ME, UMN |
Keywords: Field Robots, Vision-Based Navigation, Aerial Systems: Applications
Abstract: Emissions of particulate matter into the atmosphere are essential to characterize, in terms of properties such as particle size, morphology, and composition, to better understand impacts on public health and the climate. However, there is no currently available technology capable of measuring individual particles with such high detail over the extensive domains associated with events such as wildfires or volcanic eruptions. To solve this problem, we present an autonomous measurement system involving an unmanned aerial vehicle (UAV) coupled with a digital inline holographic microscope for in situ particle diagnostics. The flight control uses computer vision to localize and then trace the movements of particle-laden flows while sampling particles to determine their properties as they are transported away from their source. We demonstrate this system applied to measuring particulate matter in smoke plumes and discuss broader implications for this type of system in similar applications.
|
|
10:30-12:00, Paper WeAT24-NT.9 | Add to My Program |
WayFASTER: A Self-Supervised Traversability Prediction for Increased Navigation Awareness |
|
Valverde Gasparino, Mateus | University of Illinois at Urbana-Champaign |
Sivakumar, Arun Narenthiran | University of Illinois at Urbana Champaign |
Chowdhary, Girish | University of Illinois at Urbana Champaign |
Keywords: Field Robots, Vision-Based Navigation, Robotics and Automation in Agriculture and Forestry
Abstract: Accurate and robust navigation in unstructured environments requires fusing data from multiple sensors. Such fusion ensures that the robot is better aware of its surroundings, including areas of the environment that are not immediately visible but were visible at a different time. To solve this problem, we propose a method for traversability prediction in challenging outdoor environments using a sequence of RGB and depth images fused with pose estimations. Our method, termed WayFASTER (Waypoints-Free Autonomous System for Traversability with Enhanced Robustness), uses experience data recorded from a receding horizon estimator to train a self-supervised neural network for traversability prediction, eliminating the need for heuristics. Our experiments demonstrate that our method excels at avoiding obstacles and correctly detects that terrains such as tall grass are navigable. By using a sequence of images, WayFASTER significantly enhances the robot’s awareness of its surroundings, enabling it to predict the traversability of terrains that are not immediately visible. This enhanced awareness contributes to better navigation performance in environments where such predictive capabilities are essential.
|
|
WeAT25-NT Oral Session, NT-G403 |
Add to My Program |
Localization IV |
|
|
Chair: Chen, Changhao | National University of Defense Technology |
Co-Chair: Tan, U-Xuan | Singapore University of Technology and Design |
|
10:30-12:00, Paper WeAT25-NT.1 | Add to My Program |
A Coarse-To-Fine Place Recognition Approach Using Attention-Guided Descriptors and Overlap Estimation |
|
Fu, Chencan | Zhejiang University |
Li, Lin | Zhejiang University |
Mei, Jianbiao | Zhejiang University |
Ma, Yukai | Zhejiang University |
Peng, Linpeng | Zhejiang University |
Zhao, Xiangrui | Zhejiang University |
Liu, Yong | Zhejiang University |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Place recognition is a challenging but crucial task in robotics. Current description-based methods may be limited by their representation capabilities, while pairwise similarity-based methods require exhaustive searches, which are time-consuming. In this paper, we present a novel coarse-to-fine approach to address these problems, which combines BEV (Bird's Eye View) feature extraction, coarse-grained matching, and fine-grained verification. In the coarse stage, our approach utilizes an attention-guided network to generate attention-guided descriptors. We then employ a fast affinity-based candidate selection process to identify the top-K most similar candidates. In the fine stage, we estimate pairwise overlap among the narrowed-down place candidates to determine the final match. Experimental results on the KITTI and KITTI-360 datasets demonstrate that our approach outperforms state-of-the-art methods. The code will be released publicly soon.
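The coarse stage's affinity-based candidate selection can be sketched as follows (the descriptors here are generic vectors and the cosine affinity is an assumption for illustration, not the paper's attention-guided features):

```python
import numpy as np

def top_k_candidates(query, database, k=5):
    """Coarse stage: rank database descriptors by cosine similarity
    to the query descriptor and return the indices of the top-k
    candidates, which a fine stage would then verify by overlap."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = db @ q                      # cosine similarity to every place
    return np.argsort(-sims)[:k], sims
```

Only the k returned candidates would then enter the (more expensive) pairwise overlap estimation, which is what makes the coarse-to-fine split faster than an exhaustive pairwise search.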
|
|
10:30-12:00, Paper WeAT25-NT.2 | Add to My Program |
LHMap-Loc: Cross-Modal Monocular Localization Using LiDAR Point Cloud Heat Map |
|
Wu, Xinrui | Shanghai Jiao Tong University |
Xu, Jianbo | SJTU |
Hu, Puyuan | Shanghai Jiao Tong University |
Wang, Guangming | University of Cambridge |
Wang, Hesheng | Shanghai Jiao Tong University |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Localization using a monocular camera in a pre-built LiDAR point cloud map has drawn increasing attention in the fields of autonomous driving and mobile robotics. However, there are still many challenges (e.g. the difficulty of map storage, poor localization robustness in large scenes) in accurately and efficiently implementing cross-modal localization. To solve these problems, a novel pipeline termed LHMap-Loc is proposed, which achieves accurate and efficient monocular localization in LiDAR maps. Firstly, feature encoding is carried out on the original LiDAR point cloud map by generating offline heat point clouds, which compresses the size of the original LiDAR map. Then, an end-to-end online pose regression network is designed based on optical flow estimation and spatial attention to achieve real-time monocular visual localization in a pre-built map. In addition, a series of experiments have been conducted to prove the effectiveness of the proposed method. Our code is available at: https://github.com/IRMVLab/LHMap-loc.
|
|
10:30-12:00, Paper WeAT25-NT.3 | Add to My Program |
LocNDF: Neural Distance Field Mapping for Robot Localization |
|
Wiesmann, Louis | University of Bonn |
Guadagnino, Tiziano | University of Bonn |
Vizzo, Ignacio | Dexory |
Zimmerman, Nicky | University of Lugano |
Pan, Yue | University of Bonn |
Kuang, Haofei | University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Mapping an environment is essential for several robotic tasks, particularly for localization. In this paper, we address the problem of mapping the environment using LiDAR point clouds with the goal to obtain a map representation that is well suited for robot localization. To this end, we utilize a neural network to learn a discretization-free distance field of a given scene for localization. In contrast to prior approaches, we directly work on the sensor data and do not assume a perfect model of the environment or rely on normals. Inspired by the recently proposed NeRF representations, we supervise the network by points sampled along the measured beams, and our loss is designed to learn a valid distance field. Additionally, we show how to perform scan registration and global localization directly within the neural distance field. We illustrate the capabilities to globally localize within an indoor environment utilizing a particle filter as well as to perform scan registration by tracking the pose of a car based on matching LiDAR scans to the neural distance field.
|
|
10:30-12:00, Paper WeAT25-NT.4 | Add to My Program |
Looking beneath More: A Sequence-Based Localizing Ground Penetrating Radar Framework |
|
Zhang, Pengyu | National University of Defense Technology |
Zhi, Shuaifeng | National University of Defense Technology |
Yuan, Yuelin | Hikauto |
Bi, Beizhen | National University of Defense Technology |
Xin, Qin | National University of Defense Technology |
Huang, Xiaotao | National University of Defense Technology |
Shen, Liang | National University of Defense Technology |
Keywords: Localization, Mapping, Transfer Learning
Abstract: Localizing ground penetrating radar (LGPR) has been proven to be a promising technology for robot localization in various dynamic environments. However, the extreme scarcity of underground features introduces false candidate matches and brings unique challenges to this task. In this paper, we propose a sequence-based framework for LGPR to address the aforementioned issues. Specifically, we first introduce a trainable strategy to extract robust underground features in multi-weather conditions. By further using sequential information, our LGPR system can observe richer underground scene contexts, and the associated multi-frame scans could also improve the performance of underground place recognition. We demonstrate the superiority of our proposed method by comparing it against several recent state-of-the-art baseline methods applied to GPR image tasks. Experimental results on large public and self-collected datasets show that our proposed framework significantly improves the performance of various baselines in different scenarios.
|
|
10:30-12:00, Paper WeAT25-NT.5 | Add to My Program |
Increasing SLAM Pose Accuracy by Ground-To-Satellite Image Registration |
|
Zhang, Yanhao | University of Technology Sydney |
Shi, Yujiao | The Australian National University |
Wang, Shan | The Australian National University |
Vora, Ankit | Ford Motor Company |
Perincherry, Akhil | Ford Motor Company |
Chen, Yongbo | Australian National University |
Li, Hongdong | Australian National University and NICTA |
Keywords: Localization, SLAM, Deep Learning for Visual Perception
Abstract: Vision-based localization for autonomous driving has been of great interest among researchers. When a pre-built 3D map is not available, the techniques of visual simultaneous localization and mapping (SLAM) are typically adopted. Due to error accumulation, visual SLAM (vSLAM) usually suffers from long-term drift. This paper proposes a framework to increase the localization accuracy by fusing the vSLAM with a deep-learning based ground-to-satellite (G2S) image registration method. In this framework, a coarse (spatial correlation bound check) to fine (visual odometry consistency check) method is designed to select the valid G2S prediction. The selected prediction is then fused with the SLAM measurement by solving a scaled pose graph problem. To further increase the localization accuracy, we provide an iterative trajectory fusion pipeline. The proposed framework is evaluated on two well-known autonomous driving datasets, and the results demonstrate the accuracy and robustness in terms of vehicle localization.
|
|
10:30-12:00, Paper WeAT25-NT.6 | Add to My Program |
EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization |
|
Xiao, Zhendong | South China University of Technology |
Chen, Changhao | National University of Defense Technology |
Shan, Yang | South China University of Technology |
Wei, Wu | South China University of Technology |
Keywords: Localization, SLAM, Deep Learning Methods
Abstract: Camera relocalization is pivotal in computer vision, with applications in AR, drones, robotics, and autonomous driving. It estimates the 3D camera position and orientation (6-DoF) from images. Unlike traditional methods such as SLAM, recent approaches use deep learning for direct end-to-end pose estimation. We propose EffLoc, a novel efficient Vision Transformer for single-image camera relocalization. EffLoc's hierarchical layout, memory-bound self-attention, and feed-forward layers boost memory efficiency and inter-channel communication. Our sequential group attention (SGA) module enhances computational efficiency by diversifying input features, reducing redundancy, and expanding model capacity. EffLoc excels in efficiency and accuracy, outperforming prior methods such as AtLoc and MapNet. It thrives on large-scale outdoor car-driving scenarios, ensuring simplicity, end-to-end trainability, and the elimination of handcrafted loss functions.
|
|
10:30-12:00, Paper WeAT25-NT.7 | Add to My Program |
SAGE-ICP: Semantic Information-Assisted ICP |
|
Cui, Jiaming | Zhejiang University |
Chen, Jiming | Zhejiang University |
Li, Liang | Zhejiang Univerisity |
Keywords: Localization, SLAM, Semantic Scene Understanding
Abstract: Robust and accurate pose estimation in unknown environments is an essential part of robotic applications. We focus on LiDAR-based point-to-point ICP combined with effective semantic information. This paper proposes a novel semantic-information-assisted ICP method named SAGE-ICP, which leverages semantics in odometry. The semantic information for the whole scan is extracted in a timely and efficient manner by a 3D convolution network, and these point-wise labels are deeply involved in every part of the registration, including semantic voxel downsampling, data association, adaptive local map, and dynamic vehicle removal. Unlike previous semantic-aided approaches, the proposed method can improve localization accuracy in large-scale scenes even if the semantic information contains certain errors. Experimental evaluations on KITTI and KITTI-360 show that our method outperforms the baseline methods and improves accuracy while maintaining real-time performance, i.e., it runs faster than the sensor frame rate.
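Semantic voxel downsampling, one of the registration stages the abstract lists, can be sketched as keeping one point per voxel per semantic label, so sparsely represented classes survive the reduction. A minimal illustration; the function name and `voxel` size are assumptions, not details from the paper:

```python
def semantic_voxel_downsample(points, labels, voxel=0.5):
    """Keep at most one point per (voxel cell, semantic label) pair.

    points: iterable of (x, y, z) tuples; labels: per-point class ids.
    Returns the reduced point list, preserving label diversity inside
    each voxel cell instead of collapsing it to a single centroid.
    """
    kept = {}
    for p, lbl in zip(points, labels):
        key = (int(p[0] // voxel), int(p[1] // voxel),
               int(p[2] // voxel), lbl)
        kept.setdefault(key, p)  # first point wins inside each cell
    return list(kept.values())
```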
|
|
10:30-12:00, Paper WeAT25-NT.8 | Add to My Program |
HR-APR: APR-Agnostic Framework with Uncertainty Estimation and Hierarchical Refinement for Camera Relocalisation |
|
Liu, Changkun | The Hong Kong University of Science and Technology |
Chen, Shuai | University of Oxford |
Zhao, Yukun | Hong Kong University of Science and Technology |
Huang, Huajian | The Hong Kong University of Science and Technology |
Prisacariu, Victor | University of Oxford |
Braud, Tristan | HKUST |
Keywords: Localization, Visual Learning, Probabilistic Inference
Abstract: Absolute Pose Regressors (APRs) directly estimate camera poses from monocular images, but their accuracy is unstable for different queries. Uncertainty-aware APRs provide uncertainty information on the estimated pose, alleviating the impact of these unreliable predictions. However, existing uncertainty modelling techniques are often coupled with a specific APR architecture, resulting in suboptimal performance compared to state-of-the-art (SOTA) APR methods. This work introduces a novel APR-agnostic framework, HR-APR, that formulates uncertainty estimation as cosine similarity estimation between the query and database features. It does not rely on or affect the APR network architecture, making it flexible and computationally efficient. In addition, we take advantage of the uncertainty for pose refinement to enhance the performance of APR. Extensive experiments demonstrate the effectiveness of our framework, reducing computational overhead by 27.4% and 15.2% on the 7Scenes and Cambridge Landmarks datasets, respectively, while maintaining SOTA accuracy among single-image APRs.
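As a rough sketch of the core idea, uncertainty can be derived from the cosine similarity between the query feature and its most similar database features: high similarity suggests a reliable pose. This toy version is an assumption-laden illustration (the function names, `k`, and the 1 − mean-similarity mapping are not from the paper):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def pose_uncertainty(query_feat, db_feats, k=3):
    """Uncertainty proxy: 1 minus the mean cosine similarity between
    the query feature and its k most similar database features.
    Higher similarity -> lower uncertainty -> the pose is trusted more
    (and cheap refinement can be skipped)."""
    sims = sorted((cosine_similarity(query_feat, f) for f in db_feats),
                  reverse=True)
    top = sims[:k]
    return 1.0 - sum(top) / len(top)
```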
|
|
10:30-12:00, Paper WeAT25-NT.9 | Add to My Program |
Implicit Learning of Scene Geometry from Poses for Global Localization |
|
Altillawi, Mohammad | Huawei, Autonomous University of Barcelona, |
Li, Shile | Algolux Germany |
Prakhya, Sai Manoj | Huawei Technologies Deutschland GmbH |
Liu, Ziyuan | Huawei Group |
Serrat, Joan | Computer Vision Center and Computer Science Department, Universi |
Keywords: Localization, Visual Learning, Virtual Reality and Interfaces
Abstract: Global visual localization estimates the absolute pose of a camera using a single image, in a previously mapped area. Obtaining the pose from a single image enables many robotics and augmented/virtual reality applications. Inspired by the latest advances in deep learning, many existing approaches directly learn and regress 6 DoF pose from an input image. However, these methods do not fully utilize the underlying scene geometry for pose regression. The challenge in monocular relocalization is the minimal availability of supervised training data, which is just the corresponding 6 DoF poses of the images. In this paper, we propose to utilize these minimal available labels (i.e., poses) to learn the underlying 3D geometry of the scene and use the geometry, in return, to estimate a 6 DoF pose in a geometric manner. We present a learning method that uses these pose labels and rigid alignment to learn two 3D geometric representations (X, Y, Z coordinates) of the scene, one in the camera coordinate frame and the other in the global coordinate frame. Given a single image, it estimates these two 3D scene representations, which are then aligned to estimate a pose that matches the pose label. This formulation allows for the active inclusion of additional learning constraints to minimize 3D alignment errors between the two 3D scene representations and 2D re-projection errors between the 3D global scene representation and 2D image pixels, which improves localization accuracy. At inference time, our mo
|
|
WeAT26-NT Oral Session, NT-G404 |
Add to My Program |
SLAM I |
|
|
Chair: Wang, Sen | Imperial College London |
Co-Chair: Erden, Mustafa Suphi | Heriot-Watt University |
|
10:30-12:00, Paper WeAT26-NT.1 | Add to My Program |
KDD-LOAM: Jointly Learned Keypoint Detector and Descriptors Assisted LiDAR Odometry and Mapping |
|
Huang, Renlang | Zhejiang University |
Zhao, Minglei | Zhejiang University |
Chen, Jiming | Zhejiang University |
Li, Liang | Zhejiang University |
Keywords: SLAM, Deep Learning Methods, Localization
Abstract: Sparse keypoint matching based on distinct 3D feature representations can improve the efficiency and robustness of point cloud registration. Existing learning-based 3D descriptors and keypoint detectors are either independent or loosely coupled, so they cannot fully adapt to each other. In this work, we propose a tightly coupled keypoint detector and descriptor (TCKDD) based on a multi-task fully convolutional network with a probabilistic detection loss. In particular, this self-supervised detection loss fully adapts the keypoint detector to any jointly learned descriptors and benefits the self-supervised learning of descriptors. Extensive experiments on both indoor and outdoor datasets show that our TCKDD achieves state-of-the-art performance in point cloud registration. Furthermore, we design a keypoint detector and descriptors-assisted LiDAR odometry and mapping framework (KDD-LOAM), whose real-time odometry relies on keypoint descriptor matching-based RANSAC. The sparse keypoints are further used for efficient scan-to-map registration and mapping. Experiments on the KITTI dataset demonstrate that KDD-LOAM significantly surpasses LOAM and shows competitive performance in odometry.
|
|
10:30-12:00, Paper WeAT26-NT.2 | Add to My Program |
Campus Map: A Large-Scale Dataset to Support Multi-View VO, SLAM and BEV Estimation |
|
Ross, James | University of Surrey |
Kaygusuz, Nimet | University of Surrey |
Mendez, Oscar | University of Surrey |
Bowden, Richard | University of Surrey |
Keywords: SLAM
Abstract: Significant advances in robotics and machine learning have resulted in many datasets designed to support research into autonomous vehicle technology. However, these datasets are rarely suitable for a wide variety of navigation tasks. For example, datasets that include multiple cameras often have short trajectories without loops that are unsuitable for the evaluation of longer-range SLAM or odometry systems, and datasets with a single camera often lack other sensors, making them unsuitable for sensor fusion approaches. Furthermore, alternative environmental representations such as semantic Bird's Eye View (BEV) maps are growing in popularity, but datasets often lack accurate ground truth and are not flexible enough to adapt to new research trends. To address this gap, we introduce Campus Map, a novel large-scale multi-camera dataset with 2M images from 6 mounted cameras that includes GPS data and 64-beam, 125k point LiDAR scans totalling 8M points (raw packets also provided). The dataset consists of 16 sequences in a large car park and 6 long-term trajectories around a university campus that provide data to support research into a variety of autonomous driving and parking tasks. Long trajectories (average 10 min) and many loops make the dataset ideal for the evaluation of SLAM, odometry and loop closure algorithms, and we provide several state-of-the-art baselines. We also include 40k semantic BEV maps rendered from a digital twin. This novel approach to ground truth generation allows us to produce more accurate and crisp semantic maps than are currently available. We make the simulation environment available to allow researchers to adapt the dataset to their specific needs.
|
|
10:30-12:00, Paper WeAT26-NT.3 | Add to My Program |
DISO: Direct Imaging Sonar Odometry |
|
Xu, Shida | Imperial College London |
Zhang, Kaicheng | Heriot-Watt University |
Hong, Ziyang | Heriot-Watt University |
Liu, Yuanchang | University College London |
Wang, Sen | Imperial College London |
Keywords: SLAM
Abstract: This paper introduces a novel sonar odometry system that estimates the relative spatial transformation between two sonar image frames. Considering the unique challenges, such as low resolution and high noise, of sonar imagery for odometry and Simultaneous Localization and Mapping (SLAM), the proposed Direct Imaging Sonar Odometry (DISO) system is designed to estimate the relative transformation between two sonar frames by minimizing the aggregated sonar intensity errors of points with high intensity gradients. Moreover, DISO incorporates a multi-sensor window optimization technique, a data association strategy and an acoustic intensity outlier rejection algorithm for reliability and accuracy. The effectiveness of DISO is evaluated using both simulated and real-world sonar datasets, showing that it outperforms the existing geometric-only method in localization accuracy and achieves state-of-the-art sonar odometry performance. The source code is available at https://github.com/SenseRoboticsLab/DISO.
|
|
10:30-12:00, Paper WeAT26-NT.4 | Add to My Program |
CURL-MAP: Continuous Mapping and Positioning with CURL Representation |
|
Zhang, Kaicheng | Heriot-Watt University |
Ding, Yining | Heriot-Watt University |
Xu, Shida | Imperial College London |
Hong, Ziyang | Heriot-Watt University |
Kong, Xianwen | Heriot-Watt University |
Wang, Sen | Imperial College London |
Keywords: SLAM
Abstract: Maps of LiDAR Simultaneous Localisation and Mapping (SLAM) are often represented as point clouds. They usually take up a huge amount of storage space for large-scale environments, otherwise much structural detail may not be kept. In this paper, a novel paradigm of LiDAR mapping and odometry is designed by leveraging the Continuous and Ultra-compact Representation of LiDAR (CURL). Termed CURL-MAP (Mapping and Positioning), the proposed approach can not only reconstruct 3D maps with a continuously varying density but also efficiently reduce map storage space by using CURL's spherical harmonics implicit encoding. Different from the popular Iterative Closest Point (ICP) based LiDAR odometry techniques, CURL-MAP formulates LiDAR pose estimation as a unique optimisation problem tailored for CURL. Experimental evaluation shows that CURL-MAP achieves state-of-the-art 3D mapping results and competitive LiDAR odometry accuracy. We will release the CURL-MAP code for the community.
|
|
10:30-12:00, Paper WeAT26-NT.5 | Add to My Program |
Degradation Resilient LiDAR-Radar-Inertial Odometry |
|
Nissov, Morten | NTNU |
Khedekar, Nikhil Vijay | NTNU |
Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
Keywords: SLAM, Aerial Systems: Perception and Autonomy, Field Robots
Abstract: Enabling autonomous robots to operate robustly in challenging environments is necessary in a future with increased autonomy. For many autonomous systems, estimation and odometry remain a single point of failure, from which it can often be difficult, if not impossible, to recover. As such, robust odometry solutions are of key importance. In this work, a method for tightly-coupled LiDAR-Radar-Inertial fusion for odometry is proposed, enabling the mitigation of the effects of LiDAR degeneracy by leveraging a complementary perception modality while preserving the accuracy of LiDAR in well-conditioned environments. The proposed approach combines modalities in a factor graph-based windowed smoother with sensor information-specific factor formulations which enable, in the case of degeneracy, partial information to be conveyed to the graph along the non-degenerate axes. The proposed method is evaluated in real-world tests on a flying robot experiencing degraded conditions including geometric self-similarity as well as obscurant occlusion. For the benefit of the community we release the datasets presented: https://github.com/ntnu-arl/lidar_degeneracy_datasets.
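A common way to detect LiDAR degeneracy, which degeneracy-aware factor formulations like the one described can build on, is to eigen-decompose the approximate Hessian of the registration problem and treat weakly constrained directions as degenerate. A minimal sketch, not the paper's implementation; the `threshold` value is an assumption:

```python
import numpy as np

def degenerate_directions(J, threshold=1.0):
    """Split the solution space of a scan-registration problem into
    well-conditioned and degenerate directions.

    J is the stacked Jacobian of the residuals; its Gauss-Newton
    Hessian approximation is H = J^T J. Eigenvectors of H whose
    eigenvalue falls below `threshold` are treated as degenerate, so
    an update (or a factor) can be restricted to the well-conditioned
    subspace along the remaining axes."""
    H = J.T @ J
    eigvals, eigvecs = np.linalg.eigh(H)      # ascending eigenvalues
    good = eigvecs[:, eigvals >= threshold]   # well-constrained axes
    bad = eigvecs[:, eigvals < threshold]     # degenerate axes
    return good, bad
```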
|
|
10:30-12:00, Paper WeAT26-NT.6 | Add to My Program |
Design and Evaluation of a Generic Visual SLAM Framework for Multi Camera Systems |
|
Kaveti, Pushyami | Northeastern University |
Vaidyanathan, Shankara Narayanan | Northeastern University |
Thamil Chelvan, Arvind | Northeastern University |
Singh, Hanumant | Northeastern University |
Keywords: SLAM, Data Sets for SLAM, Field Robots
Abstract: Multi-camera systems have been shown to improve the accuracy and robustness of SLAM estimates, yet state-of-the-art SLAM systems predominantly support monocular or stereo setups. This paper presents a generic sparse visual SLAM framework capable of running on any number of cameras and in any arrangement. Our SLAM system uses the generalized camera model, which allows us to represent an arbitrary multi-camera system as a single imaging device. Additionally, it takes advantage of the overlapping fields of view (FoV) by extracting cross-matched features across cameras in the rig. This limits the linear rise in the number of features with the number of cameras and keeps the computational load in check while enabling an accurate representation of the scene. We evaluate our method in terms of accuracy, robustness, and run time on indoor and outdoor datasets that include challenging real-world scenarios such as narrow corridors, featureless spaces, and dynamic objects. We show that our system can adapt to different camera configurations and allows real-time execution for typical robotic applications. Finally, we benchmark the impact of the critical design parameters that define the camera configuration for SLAM: the number of cameras and the overlap between their FoVs. All our software and datasets are freely available for further research.
|
|
10:30-12:00, Paper WeAT26-NT.7 | Add to My Program |
Ground-Fusion: A Low-Cost Ground SLAM System Robust to Corner Cases |
|
Yin, Jie | Shanghai Jiao Tong University |
Li, Ang | Shanghai Jiao Tong University |
Xi, Wei | Nankai University |
Yu, Wenxian | Shanghai Jiao Tong University |
Zou, Danping | Shanghai Jiao Tong University |
Keywords: SLAM, Data Sets for SLAM, Sensor Fusion
Abstract: We introduce Ground-Fusion, a low-cost sensor fusion simultaneous localization and mapping (SLAM) system for ground vehicles. Our system features efficient initialization, effective sensor anomaly detection and handling, real-time dense color mapping, and robust localization in diverse environments. We tightly integrate RGB-D images, inertial measurements, wheel odometer and GNSS signals within a factor graph to achieve accurate and reliable localization both indoors and outdoors. To ensure successful initialization, we propose an efficient strategy that comprises three different methods: stationary, visual, and dynamic, tailored to handle diverse cases. Furthermore, we develop mechanisms to detect sensor anomalies and degradation, handling them adeptly to maintain system accuracy. Our experimental results on both public and self-collected datasets demonstrate that Ground-Fusion outperforms existing low-cost SLAM systems in corner cases. We release the code and datasets at https://github.com/SJTU-ViSYS/Ground-Fusion.
|
|
10:30-12:00, Paper WeAT26-NT.8 | Add to My Program |
HERO-SLAM: Hybrid Enhanced Robust Optimization of Neural SLAM |
|
Xin, Zhe | Meituan |
Yue, Yufeng | Beijing Institute of Technology |
Zhang, Liangjun | Baidu |
Wu, Chenming | Baidu Research |
Keywords: SLAM, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Simultaneous Localization and Mapping (SLAM) is a fundamental task in robotics, driving numerous applications such as autonomous driving and virtual reality. Recent progress on neural implicit SLAM has shown encouraging and impressive results. However, the robustness of neural SLAM, particularly in challenging or data-limited situations, remains an unresolved issue. This paper presents HERO-SLAM, a Hybrid Enhanced Robust Optimization method for neural SLAM, which combines the benefits of neural implicit field and feature-metric optimization. This hybrid method optimizes a multi-resolution implicit field and enhances robustness in challenging environments with sudden viewpoint changes or sparse data collection. Our comprehensive experimental results on benchmarking datasets validate the effectiveness of our hybrid approach, demonstrating its superior performance over existing implicit field-based methods in challenging scenarios. HERO-SLAM provides a new pathway to enhance the stability, performance, and applicability of neural SLAM in real-world scenarios. Project page: https://hero-slam.github.io.
|
|
WeAL-EX Poster Session, Exhibition Hall |
Add to My Program |
Late Breaking Results Poster IV |
|
|
|
10:30-12:00, Paper WeAL-EX.1 | Add to My Program |
Development of Real-Time Motion Mapping for Surgical Robot |
|
Peuchpen, Pantita | The Hong Kong University of Science and Technology (Guangzhou) |
Liu, Haichao | The Hong Kong University of Science and Technology |
Ma, Jun | The Hong Kong University of Science and Technology |
Keywords: Telerobotics and Teleoperation, Mapping, Haptics and Haptic Interfaces
Abstract: Enhancing healthcare equity is a key global policy objective under the United Nations' Sustainable Development Goals (SDGs). Geographical hurdles to healthcare access present substantial challenges, resulting in decreased service utilization, lower uptake of preventive care, and diminished survival rates, particularly among individuals residing far from healthcare facilities. Therefore, teleoperation technology has been implemented in the fields of medicine and surgery to address this issue. However, this technology requires a high level of precision and control. This paper presents a method for mapping the surgeon's hand motion to the robot arm and providing haptic feedback that relays forces sensed at the instrument tips back to the surgeon.
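The motion-mapping and force-feedback loop described above can be sketched as two simple transformations; the scaling factor and gain below are illustrative assumptions, not values from the work:

```python
def map_hand_to_robot(hand_delta, scale=0.2):
    """Scale the surgeon's hand displacement down before commanding
    the robot arm, so large hand motions produce small, precise
    instrument motions. `scale` is an illustrative motion-scaling
    factor."""
    return tuple(scale * d for d in hand_delta)

def force_feedback(tip_force, gain=1.0):
    """Reflect the force measured at the instrument tip back to the
    surgeon's haptic device, optionally amplified by `gain`."""
    return tuple(gain * f for f in tip_force)
```

In a teleoperation loop these would run each control cycle: hand deltas flow surgeon-to-robot, tip forces flow robot-to-surgeon.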
|
|
10:30-12:00, Paper WeAL-EX.2 | Add to My Program |
Micromanipulation Assistance Via Motion Guidance to a Spatiotemporal Ideal Trajectory Using GMM and LSTM |
|
Mori, Ryoya | Nagoya University |
Aoyama, Tadayoshi | Nagoya University |
Kobayashi, Taisuke | National Institute of Informatics |
Sakamoto, Kazuya | Nagoya University |
Takeuchi, Masaru | Nagoya University |
Hasegawa, Yasuhisa | Nagoya University |
Keywords: Imitation Learning, Biological Cell Manipulation, Human-Centered Robotics
Abstract: Intracytoplasmic sperm injection (ICSI) requires high skill to rotate the oocyte without causing damage using micropipettes. The oocyte rotation process is challenging due to the need for delicate and fast manipulations, as well as limited depth information. To address these difficulties, we propose a micromanipulation assistance system that utilizes a Gaussian Mixture Model (GMM) and Long Short-Term Memory (LSTM) to guide operators along an ideal spatiotemporal trajectory. The system learns static ideal trajectory points from data on an expert's pipette manipulations using GMM. An LSTM is trained to learn the expert's manipulations by inferring the expected pipette manipulation at each time step. During assistance for novice operators, the system provides real-time haptic and visual guidance by combining spatial guidance toward the static ideal trajectory with time-series-aware inference through the trained LSTM. Experiments conducted on novice operators demonstrated that the GMM+LSTM assistance system significantly improved operational efficiency and reduced cell damage when compared to both the conventional system and an assistance system with LSTM alone. These results demonstrate the effectiveness of the spatiotemporal guidance approach for assisting complex micromanipulation tasks.
|
|
10:30-12:00, Paper WeAL-EX.3 | Add to My Program |
Feedforward Macro-Mini Dynamics Compensation Toward Dynamically Transparent Exoskeletons |
|
Shimoyama, Takuma | Graduate School of Informatics and Engineering, the University O |
Noda, Tomoyuki | ATR Computational Neuroscience Laboratories |
Teramae, Tatsuya | ATR Computational Neuroscience Laboratories |
Nakata, Yoshihiro | The University of Electro-Communications |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Mechanical transparency in exoskeletons, i.e., the robot does not adversely affect the patient's body dynamics, is essential for robot rehabilitation. Conventional transparency has aimed only at zero interaction force; transparency during robot-assisted movements has been overlooked. We focus on the ability to keep interaction forces at a non-zero target value during assistance and name it force-based dynamic transparency. As the interaction force increases, the robot's mechanical losses also increase, making force-based dynamic transparency more challenging to achieve. We aim to achieve it by extending the concept of the distributed macro-mini actuation approach, in which a pneumatic–electromagnetic hybrid actuator effectively compensates each mechanical loss by distributing the losses to the appropriate actuators. We have already demonstrated robust torque generation by distributing the kinetic friction force to the electromagnetic actuator of the pneumatic–electromagnetic hybrid actuator in our ICRA 2024 contributed paper. This research proposes a more general control design policy that distributes and compensates the physical properties of the elements comprising the robot across the actuators, realizing force-based dynamic transparency through the distributed macro-mini actuation approach with hybrid actuators.
|
|
10:30-12:00, Paper WeAL-EX.4 | Add to My Program |
Research on Planning and Control Methods for Refined Operations of Excavation Robot |
|
Lu, Liang | Tongji University |
Zhu, Minyan | Tongji University |
Tang, Chengzong | Tongji University |
Wang, Zhipeng | Tongji University |
He, Bin | Tongji University |
Keywords: Robotics and Automation in Construction, Task Planning, Motion Control
Abstract: A set of planning and control methods is designed to improve the precision operation ability of the excavation robot. The main contributions are summarized as follows: (1) A refined-work comprehensive trajectory optimization strategy was proposed, which improves the excavation robot's ability in refined work at the planning level. (2) A joint trajectory optimization method based on MABC and SQP was proposed, which improves the efficiency of trajectory optimization. (3) A variable universe fuzzy PID control strategy was designed to further reduce trajectory tracking errors.
|
|
10:30-12:00, Paper WeAL-EX.5 | Add to My Program |
Design of a Modular Supernumerary Mechanical Limb Actuated by a Foot Interface |
|
Chao, Elizabeth Ting | The Chinese University of Hong Kong |
Chan, Sheung Yan | The Chinese University of Hong Kong |
Huang, Yanpei | Imperial College London |
Eden, Jonathan | University of Melbourne |
Burdet, Etienne | Imperial College London |
Lau, Darwin | The Chinese University of Hong Kong |
Keywords: Wearable Robotics, Prosthetics and Exoskeletons, Tendon/Wire Mechanism
Abstract: Supernumerary limbs are wearable devices that can act both as a prosthetic and as a mechanism of human augmentation, providing additional extremities rather than replacing missing ones. Compared to typical manipulators, supernumerary limbs are unique in that they are not fixed to a single base. The high redundancy of the human body can be exploited, since the supernumerary limb can be controlled directly by moving the base. Addressing these considerations, we developed a task-specific, cable-driven Superlimb to make such a wearable device practical. For intuitive control of the supernumerary limb, a foot interface was implemented. The end effector is actuated mechanically by translating wheel motion into the displacement of inner steel cables within an outer housing. The result is a wearable device in which haptic feedback from the end effector can be felt by the user at the foot. Experiments demonstrated that the workload of a supernumerary limb attached to one's body was not significantly higher than that of one on a fixed base.
|
|
10:30-12:00, Paper WeAL-EX.6 | Add to My Program |
Drone-Enabled Last Mile Delivery for Energy Management in UGV Teams |
|
Singh, Gaurav | Iowa State University |
Mandal, Shashwata | Iowa State University |
Bhattacharya, Sourabh | Iowa State University |
Keywords: Energy and Environment-Aware Automation, Multi-Robot Systems, Planning, Scheduling and Coordination
Abstract: Autonomous multi-robot systems deployed outdoors experience efficiency bottlenecks due to recharge operations. Many challenges arise when attempting to place static recharge stations outdoors, owing to factors such as distance from the robots, navigation, charge time, and recharge sequence. To overcome these challenges, we implement a recharge methodology that uses UAVs to deliver secondary batteries to a robot, which then recharges its primary battery. We propose a framework and algorithms for finding efficient delivery sequences for recharging the robots, in which we explore the use of a nearest-first approach. The combined impact of the algorithms on the efficiency of UAV-UGV collaboration is studied. A testbed is set up to evaluate the feasibility and scalability of the system in the real world, using Crazyflies and Boe-Bots.
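The nearest-first approach mentioned in the abstract can be sketched as a greedy loop: from the current position, always deliver to the closest robot still awaiting a battery. A minimal illustration under that assumption, not the authors' algorithm:

```python
import math

def nearest_first_sequence(depot, robots):
    """Greedy nearest-first delivery order for a battery-carrying UAV.

    depot: (x, y) start position; robots: list of (x, y) positions of
    robots awaiting a battery. Returns the visit order produced by
    repeatedly flying to the nearest unserved robot."""
    order, pos, remaining = [], depot, list(robots)
    while remaining:
        nxt = min(remaining, key=lambda r: math.dist(pos, r))
        order.append(nxt)
        remaining.remove(nxt)
        pos = nxt
    return order
```

Greedy sequencing is not optimal in general, but it is a natural baseline for studying UAV-UGV recharge efficiency.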
|
|
10:30-12:00, Paper WeAL-EX.7 | Add to My Program |
Vision-Driven Robotic System for Autonomous Sewing of Elastic Fabrics |
|
Marchello, Gabriele | Istituto Italiano Di Tecnologia |
Abidi, Syed Haider Jawad | Istituto Italiano Di Tecnologia |
Lahoud, Marcel | Italian Institute of Technology |
Fontana, Eleonora | Istituto Italiano Di Tecnologia |
Meddahi, Amal | Italian Institute of Technology |
Baizid, Khelifa | Italian Institute of Technology |
Farajtabar, Mohammad | University of Calgary |
D'Imperio, Mariapaola | Istituto Italiano Di Tecnologia |
Cannella, Ferdinando | Istituto Italiano Di Tecnologia |
Keywords: Industrial Robots, Grippers and Other End-Effectors, Soft Robot Applications
Abstract: Manipulating soft materials has always been one of the most difficult problems in robotics, due to the non-linear mechanical behaviour of fabrics. Therefore, the automation of systems based on the manipulation of soft materials (such as the clothing industry) has been very limited. We present a robotic cell that supports workers by automating the production of cyclist garments, each composed of an elastic cloth and a foam pad that must be sewn together. The robotic cell comprises two robotic arms equipped with a two-finger parallel gripper and a pneumatic needle gripper to flatten the cloth and pick the foam pad, respectively. Moreover, a Cartesian robot is employed to drive the two fabrics under the needle of a sewing machine. This project aims to improve garment productivity and the working conditions of the operators. The results obtained by the robotic cell are comparable with conventional methods in both quality and production time. In addition, the modularity underlying the design of this structure ensures a high degree of flexibility; the system can therefore be used to make all types of garments.
|
|
10:30-12:00, Paper WeAL-EX.8 | Add to My Program |
EAIK: A Toolbox for Efficient Analytical Inverse Kinematics by Subproblem Decomposition |
|
Ostermeier, Daniel | Technical University of Munich |
Külz, Jonathan | Technical University of Munich |
Keywords: Kinematics, Industrial Robots, Software Tools for Robot Programming
Abstract: Current methods for general closed-form inverse kinematics (IK) suffer from slow derivation speed and complex setup procedures. Our IK toolbox provides high usability by encapsulating all of its functionality in a Python package. It automatically derives a robot's kinematic structure from either a URDF file or a set of Denavit–Hartenberg (DH) parameters. We achieve millisecond derivation speeds for the subproblem decomposition, numeric stability for the IK solutions, and microsecond IK computation times that surpass current numerical methods while providing a complete analytical solution set.
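Subproblem decomposition reduces analytical IK to canonical geometric subproblems. The simplest, Paden–Kahan subproblem 1 (find the rotation about a known axis that maps one point onto another), has the closed form below; this is a textbook sketch, not code from the EAIK toolbox:

```python
import numpy as np

def subproblem1(p, q, axis):
    """Paden-Kahan subproblem 1: return theta such that rotating p
    about the unit vector `axis` (through the origin) maps it onto q.
    Assumes a solution exists (|p| == |q| and equal components along
    the axis)."""
    w = axis / np.linalg.norm(axis)
    # Project both points onto the plane perpendicular to the axis;
    # the answer is the signed angle between the projections.
    p_perp = p - w * np.dot(w, p)
    q_perp = q - w * np.dot(w, q)
    return np.arctan2(np.dot(w, np.cross(p_perp, q_perp)),
                      np.dot(p_perp, q_perp))
```

Chaining such subproblems along a robot's joint axes yields the full closed-form joint-angle set.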
|
|
10:30-12:00, Paper WeAL-EX.9 | Add to My Program |
3D Actuation and Trajectory Control of Ferrofluidic Droplet Robot Swarms for Targeted Drug Delivery |
|
Fan, Xinjian | Soochow University |
Zhang, Yunfei | Soochow University |
Yang, Zhan | Soochow University |
Keywords: Automation at Micro-Nano Scales, Biologically-Inspired Robots, Micro/Nano Robots
Abstract: Research on microrobot swarms points to exciting applications, but handling these swarms is much more complex than dealing with individual robots. The challenge starts with making these tiny robots efficiently and in large numbers, which current methods cannot always do well. Additionally, orchestrating the collective operation of these swarms within the human body, especially beyond mere planar movement, poses a substantial challenge. Most existing research does not fully tackle how to control these swarms in three-dimensional spaces within the body, which is crucial for delivering medicine right where it is needed. To address the aforementioned problems and challenges, this work presents an innovative method based on microfluidic technology for tackling the issue of mass production of microrobots. Furthermore, this work constructs an 8-axis distributed electromagnetic coil to realize a decoupled three-dimensional (3D) spatial control method for microrobot swarms based on magnetic force and torque, solving the challenges related to three-dimensional actuation and trajectory control. By using 3D printing and animal tissue, we finally create environments that mimic human tissues for 3D locomotion tests of microrobot swarms. This research endeavors to advance our capabilities in manipulating these minuscule robotic swarms, paving the way for novel disease treatment methods that promise greater precision and reduced invasiveness.
|
|
10:30-12:00, Paper WeAL-EX.10 | Add to My Program |
Autonomous Loose Fruit Collection for Oil Palm Plantation |
|
Ismail, Muhamad Khuzaifah | Sime Darby Plantation Research |
Keywords: Robotics and Automation in Agriculture and Forestry, Agricultural Automation
Abstract: The Autonomous Loose Fruit Robot (ALFRo) project, spearheaded by SD Plantation Research, presents a groundbreaking solution to address labor shortages and operational inefficiencies in the oil palm industry. Leveraging advanced technologies including robotics, artificial intelligence, and automation, ALFRo is able to carry out labor-intensive tasks, easing the burden of labor shortage in the plantation industry. The ALFRo project involved research and development activities focused on integrating advanced sensors with image processing algorithms to detect palm fruitlets left on the ground and to evacuate the valuable consignment in a timely manner. ALFRo features scalability and agility for application in oil palm plantations. Operational sustainability is demonstrated through minimal operational losses and minimal modification of soil conditions. With a planned timeline of one year for development and extensive testing, ALFRo aims to set new standards for loose fruit collection in the palm oil industry. By enhancing efficiency, productivity, and safety while minimizing environmental impact, ALFRo represents a transformative shift in oil palm estate management, paving the way for a more sustainable and prosperous future in the industry and beyond.
|
|
10:30-12:00, Paper WeAL-EX.11 | Add to My Program |
Balance Recovery Via Whole-Body Model Predictive Control for Wheeled Bipedal Robots |
|
Lee, Young Hun | Korea Institute of Machinery & Materials |
Kang, Woosong | DGIST |
Park, Jongwoo | Korea Institue of Machinery & Materials |
Ahn, Jeongdo | Korea Institute of Machinery and Materials |
Park, Dongil | Korea Institute of Machinery and Materials (KIMM) |
Park, Chanhun | KIMM |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Optimization and Optimal Control
Abstract: This poster presents a whole-body controller based on model predictive control (MPC), which enables a wheeled bipedal robot to demonstrate dynamic locomotion over various terrains, including slopes and stairs, as well as under various types of external disturbances. To stabilize the robot's balance, optimal torques for each joint are generated through the MPC method. The proposed whole-body controller was tested on the wheeled bipedal robot, and its locomotive abilities are evaluated in the Gazebo simulator.
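The receding-horizon idea behind such a controller can be sketched on a toy model. The block below is a generic illustration only, not the authors' whole-body formulation: a 1D double integrator stands in for the wheeled biped, and the weights, horizon, and time step are arbitrary choices.

```python
import numpy as np

def mpc_control(x0, A, B, Q, R, horizon=20):
    """Unconstrained linear MPC via the batch (condensed) approach:
    stack the predicted states x_k = A^k x0 + sum_j A^(k-1-j) B u_j,
    solve the resulting least-squares problem for the whole input
    sequence, and return only the first input (receding horizon)."""
    n, m = B.shape
    # Prediction matrices: X = Sx @ x0 + Su @ U
    Sx = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(horizon)])
    Su = np.zeros((n * horizon, m * horizon))
    for k in range(horizon):
        for j in range(k + 1):
            Su[k * n:(k + 1) * n, j * m:(j + 1) * m] = (
                np.linalg.matrix_power(A, k - j) @ B)
    Qbar = np.kron(np.eye(horizon), Q)   # stage cost on states
    Rbar = np.kron(np.eye(horizon), R)   # stage cost on inputs
    H = Su.T @ Qbar @ Su + Rbar
    f = Su.T @ Qbar @ Sx @ x0
    U = np.linalg.solve(H, -f)           # minimiser of 0.5 U'HU + f'U
    return U[:m]

# Toy balance model: discrete double integrator (position, velocity).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

x = np.array([1.0, 0.0])                 # start displaced from the origin
for _ in range(60):                      # closed-loop simulation, 6 s
    u = mpc_control(x, A, B, Q, R)
    x = A @ x + B @ u
```

With constraints and the full robot dynamics added, the same pattern becomes a quadratic program solved at every control step.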
|
|
10:30-12:00, Paper WeAL-EX.12 | Add to My Program |
An Origami-Inspired Approach to Height Adjustment of Wind Assisted Ship Propulsion |
|
Kim, Chan | Seoul National University |
Jung, Sun-Pill | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Mechanism Design, Actuation and Joint Mechanisms, Tendon/Wire Mechanism
Abstract: The International Maritime Organization (IMO) is pushing for cleaner seas, setting targets to reduce CO2 emissions by 40% by 2030 and 70% by 2050 relative to 2008 levels. Ships are now required to adopt more eco-friendly "green ship" technologies. One approach to creating a green ship is to reduce fuel consumption by using an auxiliary propulsion device that harnesses wind power, a method that can also be applied to existing ships. We plan to manufacture a rotor sail; the figure illustrates the rotor sail. The rotor sail system uses a rotating actuator on the inner tower to spin the outer panel and operates on the principle of the Magnus effect: when the rotating sail encounters wind, it generates lift. The rotor sail needs to be tall because wind strength increases with altitude at sea, enhancing the Magnus effect; our target rotor sail height is 35 m. If the sail is divided into layers, the diameter is no longer constant and flow separation occurs, leading to efficiency issues. The rotor sail's height can also create navigational issues, such as when passing under bridges or docking, so the sail's height must occasionally be reduced. What if the rotor sail could fold to the required height? By folding only the upper part, the sail can retain a height at which it operates efficiently. Therefore, the selection or additional design of a foldable method that can maintain a constant diameter is necessary.
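For context, the dependence on wind speed and spin rate follows from standard ideal-flow theory (the Kutta-Joukowski theorem applied to a spinning cylinder; this idealization is background material, not taken from the poster):

```latex
% Lift per unit span L' of a cylinder of radius R spinning at rate
% \omega in wind of speed V and air density \rho (ideal flow):
L' = \rho \, V \, \Gamma, \qquad \Gamma = 2\pi R^{2} \omega
% Total lift scales with the sail height H, L \approx \rho V \Gamma H,
% and V itself grows with altitude, which favours taller sails.
```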
|
|
10:30-12:00, Paper WeAL-EX.13 | Add to My Program |
2 DoF Prosthetic Wrist with Concave-Convex Rolling Contact Joint |
|
Jeong, Inchul | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Actuation and Joint Mechanisms, Prosthetics and Exoskeletons, Tendon/Wire Mechanism
Abstract: The wrist, with its two degrees of freedom (DoF), plays a crucial role in orienting the hand to grasp objects with the desired grasping posture. The absence of a wrist simplifies the kinematics of the upper extremity: prosthetic hand users without a wrist rely on other joints or the intact limb and suffer discomfort from compensatory motion, which can lead to residual limb pain, secondary musculoskeletal disease, and overuse syndrome. The wrist is involved not only in orienting the hand but also in manipulating the hand or grasped tools, and people actively use their wrists in activities of daily living (ADL). A coupled-DoF motion known as the dart-throwing motion is mainly used in ADL with various coupling ratios. Given these characteristics of the wrist, prosthetic users prefer a hand with a wrist to one without. To fulfill the needs of prosthetic users, an artificial limb needs to be lighter and offer better usability and functionality at an anthropomorphic size, and a prosthetic wrist must meet these conditions to restore the missing functions for amputees. In this paper, we propose the design of a 3D-printed prosthetic wrist of compact size and low inertia with 2-DoF actuation. To enlarge the load capacity with lightweight material and mimic the concave shape of the wrist row, a rolling contact joint with a concave-convex shape was used to lower the contact stress. A tendon-driven actuating system enables proximal placement of the motor for low inertia. The joint surface and tendon routing are designed to decouple the 2 DoF.
|
|
10:30-12:00, Paper WeAL-EX.14 | Add to My Program |
Size-Adaptive Robotic Gripper with Constant Gripping Force Using Electromagnetic Fuse Switch Mechanism |
|
Kim, Tae Hwan | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Grippers and Other End-Effectors, Grasping, Mechanism Design
Abstract: The demand for customized and personalized products has recently been increasing, and it would be advantageous to have the capability to manufacture products of various sizes on a single production line, necessitating adaptive robotic grippers for different objects. We propose a size-adaptive robotic gripper capable of grasping objects of various sizes without using any sensors. The proposed mechanism employs an electromagnetic fuse that delivers gripping force below a threshold level and disconnects the actuation circuit when overloaded, resulting in a constant gripping force applied to objects of different sizes. In this system, while the gripper holds an object, the gripping force can be controlled by adjusting the amount of electric current supplied to the electromagnetic fuse. Since force transmission is determined by the geometry and motion of the mechanism, the transfer function of force transmission for the proposed mechanism is modeled and optimized to generate a constant gripping force regardless of the size of the object. Experimental results confirm that the gripper is capable of grasping various objects without closed-loop control.
|
|
10:30-12:00, Paper WeAL-EX.15 | Add to My Program |
SEMG-Based Hand Gesture Recognition by Time-Frequency Domain Multifeature Coupling Network |
|
Wang, Peiyao | Shenyang University of Technology |
Li, Yazhou | Shenyang University of Technology |
Li, Kairu | Shenyang University of Technology |
Keywords: Gesture, Posture and Facial Expressions, Prosthetics and Exoskeletons, Datasets for Human Motion
Abstract: Surface electromyography (sEMG), which enables tracking of electrical activity within muscles, is widely applied to human-machine interaction (HMI), such as gesture recognition and prosthetic control. However, electrode displacement can seriously degrade sEMG-based motion recognition accuracy, so in practice users have to retrain the system each time they re-wear the sEMG electrodes, which increases their training burden and degrades the user experience. We therefore propose a Global-Local Time-Frequency Coupling Network (GL-TF Coupling Network) for sEMG-based gesture recognition. The network adopts a compact convolution-transformer structure, where the convolutional module learns dual-channel signals in the time and frequency domains to extract low-level local features of gesture actions; combined with a self-attention module, it captures global correlations within the local time-frequency features. A simple classifier module composed of fully connected layers then predicts the gesture categories of sEMG signals. To enhance multi-channel information fusion among sEMG signals, a "conical flask" structure for the convolutional fusion channel is introduced, coupling information across different channels. Experimental results demonstrate an average gesture recognition accuracy of 90% on the public "EMG data for gestures" dataset and 90.69% on our "ED-sEMG" dataset, which includes scenarios of electrode displacement.
|
|
10:30-12:00, Paper WeAL-EX.16 | Add to My Program |
ImitationBT: Imitation Learning for Behavior Tree Generation from DRL Agents |
|
Bathula, Shailendra Sekhar | University of Georgia |
Parasuraman, Ramviyas | University of Georgia |
Keywords: Imitation Learning, Behavior-Based Systems, Reinforcement Learning
Abstract: Behavior Trees (BT) stand as a favored control architecture among game designers and robotics experts, prized for their modularity, reactivity, and hierarchical structure. These properties enable BTs to offer scalable and clear-cut solutions to a wide array of decision-making challenges. In contrast, Deep Reinforcement Learning (DRL) has demonstrated exceptional performance but faces hesitancy in high-stakes domains due to its reliance on neural networks, which present challenges in verifiability and explainability. In this context, we introduce a novel framework designed to bridge the gap between the high performance of DRL and the desirable transparency and verifiability of BTs. By employing imitation learning to capture and transfer the expertise of a reinforcement learning model, we pave the way for generating BTs that are not only effective but also transparent, interpretable, and readily verifiable for real-world problems.
|
|
10:30-12:00, Paper WeAL-EX.18 | Add to My Program |
Accurate Loop Closure with Panoptic Information and Scan Context++ for LiDAR-Based SLAM |
|
Tan, Louise | Kumoh National Institute of Technology |
Lee, Heoncheol | Kumoh National Institute of Technology |
Keywords: SLAM, Semantic Scene Understanding
Abstract: Loop closing is crucial in a SLAM system to reduce drift accumulation. Most SLAM systems leverage only low-level geometric features, leaving high-level information unused. We propose incorporating panoptic information into the Scan Context++ algorithm to improve loop closure detection accuracy. The proposed approach exploits LiDAR odometry and panoptic information to perform loop closure detection as well as pose estimation and mapping. Experimental results show improvements in loop closure detection with the incorporation of panoptic information.
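The Scan Context family of descriptors the abstract builds on can be sketched generically: bin the LiDAR cloud into a polar (ring x sector) grid keyed on maximum height, then compare descriptors over circular column shifts so a pure yaw change between revisits costs nothing. This is a coarse illustration with arbitrary grid sizes, not the authors' modified pipeline.

```python
import math

def scan_context(points, num_rings=4, num_sectors=8, max_range=10.0):
    """Coarse Scan Context descriptor: bin (x, y, z) points into a
    polar ring-by-sector grid and keep the maximum z per bin."""
    desc = [[0.0] * num_sectors for _ in range(num_rings)]
    for x, y, z in points:
        rng = math.hypot(x, y)
        if rng >= max_range or rng == 0.0:
            continue
        ring = int(rng / max_range * num_rings)
        sector = int((math.atan2(y, x) + math.pi)
                     / (2 * math.pi) * num_sectors) % num_sectors
        desc[ring][sector] = max(desc[ring][sector], z)
    return desc

def sc_distance(a, b):
    """Rotation-tolerant distance: minimum mean absolute difference
    over all circular column (yaw) shifts of descriptor b."""
    rings, sectors = len(a), len(a[0])
    best = float("inf")
    for shift in range(sectors):
        d = sum(abs(a[r][c] - b[r][(c + shift) % sectors])
                for r in range(rings) for c in range(sectors))
        best = min(best, d / (rings * sectors))
    return best

# A toy scan and the same scan yawed by one sector width (45 deg):
pts = [(5.0, 1.0, 1.0), (-2.0, 4.0, 2.0), (-3.0, -3.5, 0.5)]
alpha = 2 * math.pi / 8
rot = [(x * math.cos(alpha) - y * math.sin(alpha),
        x * math.sin(alpha) + y * math.cos(alpha), z) for x, y, z in pts]
d = sc_distance(scan_context(pts), scan_context(rot))
```

Replacing the max-height value per bin with a semantic or panoptic label statistic is one natural way such a descriptor can carry the high-level information the abstract refers to.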
|
|
10:30-12:00, Paper WeAL-EX.19 | Add to My Program |
Flexure Hinge-Based Miniature Parallel Manipulator for Eye-Box Expansion of AR-HUD System |
|
Park, Yong-Min | Seoul National University |
Jung, Sun-Pill | Seoul National University |
You, Jang-Woo | Samsung |
Koh, Je-Sung | Ajou University |
Lee, Hong-Seok | Pukyong National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Parallel Robots, Industrial Robots, Multi-Robot Systems
Abstract: Augmented reality head-up displays (AR-HUD) implement augmented information on the road to provide users with a diverse experience. The Maxwellian-view display and the holographic display were devised to implement realistic AR; however, their small eye-box limited their potential for HUD applications. In this paper, we developed a manipulator to mechanically expand the eye-box into 3D space and maximize the utilization of light, unlike previous solutions. A modified delta robot mechanism, which consists of three legs in a 90-degree arrangement, is used to manipulate two adjacent projectors with discrete 3-DOF movements, delivering images to both eyes in real time. We utilized the origami fabrication method for its light weight and simple fabrication, and propose a triangular-prism-shaped parallelogram linkage, which can be used as a linkage for delta robots. The linkage has the stiffness and zero backlash needed to move the projector at the required speed and precision within the workspace. The eye-box was expanded to 140 mm × 110.6 mm × 140 mm, and the positioning error of the AR-HUD system was evaluated to be less than 1 mm.
|
|
10:30-12:00, Paper WeAL-EX.20 | Add to My Program |
Model-Based Real-Time Simulator for Robotic Electromagnetic Actuation |
|
Ko, Yeongoh | Chonnam National University |
Lee, Han-Sol | Chonnam National University |
Kim, Chang-Sei | Chonnam National University |
Keywords: Medical Robots and Systems, Simulation and Animation
Abstract: This study introduces a real-time simulator designed for a robotic electromagnetic actuator (EMA) system. To address the complexity of electromagnetic field computations, a simplified magnetic field model based on the Biot-Savart law is proposed. The proposed model reduces calculation time from 48 seconds with the Finite Element Method (FEM) to 204 milliseconds and shows less than 4% error relative to FEM simulations and real measurements along the principal axis. Within the ROS Gazebo environment, the simulator visualizes the robotic EMA system and its magnetic fields. It operates by receiving joystick commands for the robot pose, computing the currents required for the desired fields and posture, and transmitting these currents simultaneously to both the real system and the simulator. Experimental results exhibit 5° errors in capsule movements and a 2.29 mm root-mean-square error (RMSE) for guidewire navigation.
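The Biot-Savart idea behind such simplified coil models can be sketched as follows: discretize the conductor into short segments and sum their field contributions. This is a generic single-loop illustration, not the authors' 8-axis model; the loop radius, current, and segment count are arbitrary.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability (T*m/A)

def biot_savart_loop(point, radius=0.05, current=1.0, segments=360):
    """Magnetic field (T) at `point` due to a circular current loop of
    the given radius (m), centered at the origin in the xy-plane,
    approximated by summing Biot-Savart contributions of short
    straight segments: dB = mu0/(4*pi) * I * (dl x r) / |r|^3."""
    bx = by = bz = 0.0
    for k in range(segments):
        t0 = 2.0 * math.pi * k / segments
        t1 = 2.0 * math.pi * (k + 1) / segments
        tm = 0.5 * (t0 + t1)
        # segment midpoint and direction vector dl
        mid = (radius * math.cos(tm), radius * math.sin(tm), 0.0)
        dl = (radius * (math.cos(t1) - math.cos(t0)),
              radius * (math.sin(t1) - math.sin(t0)), 0.0)
        # vector from segment midpoint to the field point
        r = (point[0] - mid[0], point[1] - mid[1], point[2] - mid[2])
        rmag = math.sqrt(r[0] ** 2 + r[1] ** 2 + r[2] ** 2)
        c = MU0 * current / (4.0 * math.pi * rmag ** 3)
        bx += c * (dl[1] * r[2] - dl[2] * r[1])
        by += c * (dl[2] * r[0] - dl[0] * r[2])
        bz += c * (dl[0] * r[1] - dl[1] * r[0])
    return (bx, by, bz)

# Field at the loop center; the analytic value there is mu0*I/(2R).
b_center = biot_savart_loop((0.0, 0.0, 0.0))
analytic = MU0 * 1.0 / (2 * 0.05)
```

Because each evaluation is a fixed-length sum, such models run in milliseconds, which is the kind of speedup over FEM the abstract reports.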
|
|
10:30-12:00, Paper WeAL-EX.21 | Add to My Program |
ILPSR: Imitation Learning with Predictable Skill Representation for Long-Horizon Manipulation Tasks |
|
Wang, Hao | University of Science and Technology of China |
Zhang, Hao | University of Science and Technology of China |
Li, Lin | University of Science and Technology of China |
Qian, Tangyu | University of Science and Technology of China |
Zhou, Zhangli | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Deep Learning in Grasping and Manipulation, AI-Based Methods, Learning from Experience
Abstract: Robots rely heavily on prior experience when learning new tasks. However, traditional supervised learning-based methods are limited by the need for large-scale and high-quality datasets as well as generalization fragility, resulting in poor scalability. To address these problems, this work proposes Imitation Learning with Predictable Skill Representation (ILPSR) to drive robots to learn downstream tasks robustly and efficiently with prior data from previous tasks. To better utilize the prior experience, a Predictable Skill Representation Learning model (PSRL) is first developed to extract predictable skill embeddings and skill priors from the prior data. Subsequently, a skill-based behavioral cloning method is employed to apply the learned skill embeddings for policy learning and generalization in downstream target tasks. Experiments demonstrate that ILPSR can more effectively perform challenging long-horizon complex manipulation skills, with learning performance that outperforms the baselines.
|
|
10:30-12:00, Paper WeAL-EX.22 | Add to My Program |
Servo Integrated Nonlinear Model Predictive Control for Overactuated Tiltable-Quadrotors |
|
Li, Jinjie | The University of Tokyo |
Sugihara, Junichiro | The University of Tokyo |
Zhao, Moju | The University of Tokyo |
Keywords: Aerial Systems: Mechanics and Control, Motion Control
Abstract: Quadrotors are widely employed across various domains, yet conventional models face limitations due to underactuation, where attitude control is closely tied to positional adjustments. In contrast, quadrotors equipped with tiltable rotors offer overactuation, empowering them to track both position and attitude references. However, the nonlinear dynamics of the drone body and the sluggish response of tilting servos pose challenges for conventional cascade controllers. In this study, we propose a control methodology for tilting-rotor quadrotors leveraging nonlinear model predictive control (NMPC). Unlike conventional approaches, our method preserves the full dynamics without simplification and utilizes actuator commands directly as control inputs. Notably, we incorporate a first-order servo model within the NMPC framework. Through simulation, we observe that integrating the servo dynamics not only enhances control performance but also accelerates convergence. To assess the efficacy of our approach, we fabricate a tiltable-quadrotor and deploy the algorithm onboard at a frequency of 100Hz. Extensive real-world experiments demonstrate smooth and rapid pose tracking performance.
|
|
10:30-12:00, Paper WeAL-EX.23 | Add to My Program |
Design of Multi-Functional and Deployable Small-Scale Modular Robot Using Origami-Based Compliant Structure |
|
Kim, Junhyung | Seoul National University |
Jung, Mincheol | Seoul National University |
Kim, Jaehoon | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Cellular and Modular Robots, Multi-Robot Systems, Sensor-based Control
Abstract: Manipulation tasks in confined spaces are challenging for human workers, and modular robots, characterized by deployable and multifunctional capabilities, have recently gained attention for these applications. They can be dexterous even under spatial limitations and can alter their form factors based on their modularity, offering a wide range of motion and functionality. However, achieving lightweight and compact designs remains a challenge due to power sources and electric motors; moreover, oversimplified designs and lightweight structures may degrade precise control performance. To address these issues, we aim to develop a versatile, dexterous, and controllable modular robot at a compact centimeter scale. The actuator modules, made of shape memory alloy (SMA) springs using smart composite microstructure (SCM) technology, enable linear and bending motions. A sensor module integrated directly into the actuator module detects the actuation states by measuring capacitance changes. Multiple actuator-sensor modules can be combined for diverse applications. The module's performance is experimentally characterized by comparing its mechanical responses with analytical models based on the relationship between the SMA temperature and the generated force. Closed-loop control performance is evaluated using the root-mean-square error (RMSE). Lastly, we demonstrate various module combinations as manipulators with different target motions.
|
|
10:30-12:00, Paper WeAL-EX.24 | Add to My Program |
Neuro-Symbolic Task Replanning Using Large Language Models |
|
Kwon, Minseo | Ewha Womans University |
Kim, Young J. | Ewha Womans University |
Keywords: Task Planning
Abstract: We propose a novel task replanning pipeline for executing complicated robotic tasks on physical robots, utilizing a combination of a symbolic task planner and a multi-modal Large Language Model (LLM). Our pipeline begins by obtaining the semantic and spatial relationships of target objects in the environment using a multi-modal LLM and an open-vocabulary object detection model. The LLM then specifies a planning problem based on the scene and user-provided goal descriptions, which a symbolic planner uses to plan tasks. These plans are translated into low-level programming languages for execution on the robot, with syntax and semantic checking by the LLM to ensure correctness, and replanning if execution fails. We demonstrate our pipeline on dual UR5e robots across various benchmark tasks, including pick-and-place, block stacking, and block rearrangement, to verify its effectiveness.
|
|
10:30-12:00, Paper WeAL-EX.25 | Add to My Program |
Lifting 2D Pretrained Knowledge to 3D for Object Grounding |
|
S, Ashwin | Indian Institute of Science |
Bannur, Ganesh | Indian Institute of Science, RV College of Engineering |
Amrutur, Bharadwaj | Indian Institute of Science |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: We propose GAP (Ground And Project), a method for leveraging pretrained 2D models and NeRF for 3D grounding. Recently 3D grounding has seen the emergence of techniques such as LERF, which distil CLIP’s knowledge of scenes by training a separate network. However, it is prohibitive to train such a network to fully extract the capabilities of 2D models. An alternate method for connecting 2D models to 3D is by lifting their 2D mask outputs to 3D. This enables the use of pre-existing 2D models for 3D grounding. GAP adopts this approach and projects 2D grounding masks into 3D using depth information from NeRF. We demonstrate GAP with a 2D grounding pipeline consisting of two models, visual grounding and text spotting. Incorporating text spotting increases the accuracy of grounding by disambiguating between multiple instances of an object (for example an HP laptop vs a Lenovo laptop). GAP demonstrates stronger 3D grounding capabilities when compared to LERF especially in such multi-instance scenes. It is also able to transfer precise masks predicted by 2D models into 3D. GAP has wide utility in robotics such as guiding object manipulation and identifying navigation goals. Critical for mobile robots, GAP enables adapting to new scenes rapidly since it uses pretrained models and only trains NeRF. Finally, while we give a specific pipeline, our technique is generic and can incorporate any model/pipeline which takes image-query pairs as input and gives masks as output.
|
|
10:30-12:00, Paper WeAL-EX.26 | Add to My Program |
Upper-Limb Motion Intention Estimation Using Surface EMG and Soft Strain Sensors for Soft Wearable Robots |
|
Kim, Jaehyeon | Seoul National University |
Lee, Minhee | Seoul National University |
Hwang, Sungjae | Seoul National University |
Choi, YeongJin | Seoul National University |
Kim, Jeongnam | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Human Detection and Tracking, Soft Sensors and Actuators, Deep Learning Methods
Abstract: Motion estimation plays an important role in human-assistive robotic systems, since it provides information on the motion intention of the user and the states of the system. Motion estimation with surface electromyography (sEMG) is a promising approach in that sEMG signals carry information on the user's muscle activation. Researchers have studied different methods of estimating body motions using sEMG, especially via data-driven approaches. Deep learning, one of the most commonly used techniques, has shown reasonable performance on gesture recognition, but estimating accurate joint motions is still a challenge, since it is difficult to extract the intermediate states of the body from sEMG signals alone. To address this issue, we propose a method of estimating upper-limb motions using both sEMG and soft strain sensor data. The soft strain sensor, made of a highly stretchable elastomer embedded with a liquid-metal conductor, detects the strain on the joint where it is placed. We use the output of CNN-RNN models to find the angular displacement of the joint in this work. Using the current muscle activation detected by the sEMG and the strain on the elbow joint measured by the soft sensor, the system reliably estimates the joint angle in real time. The average root-mean-square error of the estimated joint-angle displacement from the model is 1.7 deg, while the maximum angle displacement in the sample dataset is 12.2 deg.
|
|
WeBA1-CC Award Session, CC-Main Hall |
Add to My Program |
Service Robotics |
|
|
Chair: Barfoot, Timothy | University of Toronto |
Co-Chair: Cavallo, Filippo | University of Florence |
|
13:30-15:00, Paper WeBA1-CC.1 | Add to My Program |
Censible: A Robust and Practical Global Localization Framework for Planetary Surface Missions |
|
Nash, Jeremy | Jet Propulsion Laboratory |
Dwight, Quintin | University of Michigan |
Saldyt, Lucas | Jet Propulsion Laboratory |
Wang, Haoda | Jet Propulsion Laboratory, California Institute of Technology |
Myint, Steven | Jet Propulsion Laboratory |
Ansar, Adnan | NASA Jet Propulsion Laboratory |
Verma, Vandi | NASA Jet Propulsion Laboratory, California Institute Of |
Keywords: Space Robotics and Automation, Field Robots, Localization
Abstract: To achieve longer driving distances, planetary robotics missions require accurate localization to counteract position uncertainty. Freedom and precision in driving allows scientists to reach and study sites of interest. Typically, rover global localization has been performed manually by humans, which is accurate but time-consuming as data is relayed between planets. This paper describes a global localization algorithm that is run onboard the Perseverance Mars rover. Our approach matches rover images to orbital maps using a modified census transform to achieve sub-meter accurate, near-human localization performance on a real dataset of 264 Mars rover panoramas. The proposed solution has also been successfully executed on the Perseverance Mars Rover, demonstrating the practicality of our approach.
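The plain census transform underlying the abstract's modified variant can be sketched as follows (a generic illustration, not the authors' implementation): each pixel is replaced by a bit signature of intensity comparisons with its neighbours, and patches are matched by Hamming distance. Because the signatures depend only on the local intensity ordering, they survive the brightness and contrast differences between rover and orbital imagery.

```python
def census_transform(img, w, h):
    """3x3 census transform of a grayscale image given as a flat
    row-major list. Each interior pixel becomes an 8-bit signature:
    bit i is set if the i-th neighbour is darker than the centre."""
    offsets = [(-1, -1), (0, -1), (1, -1), (-1, 0),
               (1, 0), (-1, 1), (0, 1), (1, 1)]
    out = {}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = img[y * w + x]
            sig = 0
            for bit, (dx, dy) in enumerate(offsets):
                if img[(y + dy) * w + (x + dx)] < centre:
                    sig |= 1 << bit
            out[(x, y)] = sig
    return out

def hamming_cost(a, b):
    """Matching cost between two census signatures: number of
    differing bits (lower means more similar local structure)."""
    return bin(a ^ b).count("1")
```

A global brightness shift leaves every signature unchanged, so `census_transform(img, w, h)` and `census_transform([p + 50 for p in img], w, h)` produce identical outputs; dense matching then slides one census image over the other and minimizes the summed Hamming cost.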
|
|
13:30-15:00, Paper WeBA1-CC.2 | Add to My Program |
Learning to Walk in Confined Spaces Using 3D Representation |
|
Miki, Takahiro | ETH Zurich |
Lee, Joonho | ETH Zurich |
Wellhausen, Lorenz | ETH Zürich |
Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Robotics in Hazardous Fields, Reinforcement Learning
Abstract: Legged robots have the potential to traverse complex terrain and access confined spaces beyond the reach of traditional platforms thanks to their ability to carefully select footholds and flexibly adapt their body posture while walking. However, robust deployment in real-world applications is still an open challenge. In this paper, we present a method for legged locomotion control using reinforcement learning and 3D volumetric representations to enable robust and versatile locomotion in confined and unstructured environments. By employing a two-layer hierarchical policy structure, we exploit the capabilities of a highly robust low-level policy to follow 6D commands and a high-level policy to enable three-dimensional spatial awareness for navigating under overhanging obstacles. Our study includes the development of a procedural terrain generator to create diverse training environments. We present a series of experimental evaluations in both simulation and real-world settings, demonstrating the effectiveness of our approach in controlling a quadruped robot in confined, rough terrain. By achieving this, our work extends the applicability of legged robots to a broader range of scenarios.
|
|
13:30-15:00, Paper WeBA1-CC.3 | Add to My Program |
Efficient and Accurate Transformer-Based 3D Shape Completion and Reconstruction of Fruits for Agricultural Robots |
|
Magistri, Federico | University of Bonn |
Marcuzzi, Rodrigo | University of Bonn |
Marks, Elias Ariel | University of Bonn |
Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Robotics and Automation in Agriculture and Forestry, RGB-D Perception
Abstract: Robots that operate in agricultural environments need a robust perception system that can deal with occlusions, which are naturally present in agricultural scenarios. In this paper, we address the problem of estimating 3D shapes of fruits when only partial observations are available. Generally speaking, such a shape completion can be realized by exploiting prior knowledge about the geometry of the fruit. This is typically done by template matching using traditional optimization algorithms, which are slow but accurate, or by encoding such knowledge into the weights of a neural network, leading to faster but often less accurate estimates. Our approach combines the best of both worlds. It exploits the benefit of having a template representing our object of interest with the advantages of using a neural network to learn how to deform a template. Our experimental evaluation demonstrates that our approach yields accurate estimation at a competitively low inference time in challenging greenhouse environments.
|
|
13:30-15:00, Paper WeBA1-CC.4 | Add to My Program |
CoPAL: Corrective Planning of Robot Actions with Large Language Models |
|
Joublin, Frank | Honda Research Institute Europe |
Ceravola, Antonello | Honda Research Institute Europe GmbH |
Smirnov, Pavel | Honda Research Institute Europe |
Ocker, Felix | Honda |
Deigmoeller, Joerg | Honda Research Institute Europe GmbH |
Belardinelli, Anna | Honda Research Institute Europe |
Wang, Chao | Honda Research Institute Europe GmbH |
Hasler, Stephan | Honda Research Institute Europe |
Tanneberg, Daniel | Honda Research Institute |
Gienger, Michael | Honda Research Institute Europe |
Keywords: AI-Enabled Robotics, Software Architecture for Robotic and Automation, Task and Motion Planning
Abstract: In the pursuit of fully autonomous robotic systems capable of taking over tasks traditionally performed by humans, the complexity of open-world environments poses a considerable challenge. Addressing this imperative, this study contributes to the field of Large Language Models (LLMs) applied to task and motion planning for robots. We propose a system architecture that orchestrates a seamless interplay between multiple cognitive levels, encompassing reasoning, planning, and motion generation. At its core lies a novel replanning strategy that handles physically grounded, logical, and semantic errors in the generated plans. We demonstrate the efficacy of the proposed feedback architecture, particularly its impact on executability, correctness, and time complexity via empirical evaluation in the context of a simulation and two intricate real-world scenarios: blocks world, barman and pizza preparation.
|
|
13:30-15:00, Paper WeBA1-CC.5 | Add to My Program |
CalliRewrite: Recovering Handwriting Behaviors from Calligraphy Images without Supervision |
|
Luo, Yuxuan | Peking University |
Wu, Zekun | Peking University |
Lian, Zhouhui | Peking University |
Keywords: Art and Entertainment Robotics, AI-Enabled Robotics
Abstract: Human-like planning skills and dexterous manipulation have long posed challenges in the fields of robotics and artificial intelligence (AI). The task of reinterpreting calligraphy presents a formidable challenge, as it involves the decomposition of strokes and dexterous utensil control. Previous efforts have primarily focused on supervised learning of a single instrument, limiting the performance of robots in the realm of cross-domain text replication. To address these challenges, we propose CalliRewrite: a coarse-to-fine approach for robot arms to discover and recover plausible writing orders from diverse calligraphy images without requiring labeled demonstrations. Our model achieves fine-grained control of various writing utensils. Specifically, an unsupervised image-to-sequence model decomposes a given calligraphy glyph to obtain a coarse stroke sequence. Using an RL algorithm, a simulated brush is fine-tuned to generate stylized trajectories for robotic arm control. Evaluation in simulation and physical robot scenarios reveals that our method successfully replicates unseen fonts and styles while achieving integrity in unknown characters. To access our code and supplementary materials, please visit our project page: https://luoprojectpage.github.io/callirewrite/.
|
|
WeBA2-CC Award Session, CC-301 |
Add to My Program |
Unmanned Aerial Vehicles |
|
|
Chair: Scaramuzza, Davide | University of Zurich |
Co-Chair: Schoellig, Angela P. | TU Munich |
|
13:30-15:00, Paper WeBA2-CC.1 | Add to My Program |
Co-Design Optimisation of Morphing Topology and Control of Winged Drones |
|
Bergonti, Fabio | Istituto Italiano Di Tecnologia |
Nava, Gabriele | Istituto Italiano Di Tecnologia |
Wüest, Valentin | EPFL |
Paolino, Antonello | Istituto Italiano Di Tecnologia |
L'Erario, Giuseppe | Istituto Italiano Di Tecnologia |
Pucci, Daniele | Italian Institute of Technology |
Floreano, Dario | Ecole Polytechnique Fédérale De Lausanne (EPFL) |
Keywords: Aerial Systems: Mechanics and Control, Methods and Tools for Robot System Design, Optimization and Optimal Control
Abstract: The design and control of winged aircraft and drones is an iterative process aimed at identifying a compromise among mission-specific costs and constraints. When agility is required, shape-shifting (morphing) drones represent an efficient solution. However, morphing drones require additional actuated joints that increase the coupling between topology and control, making the design process more complex. We propose a co-design optimisation method that assists engineers by proposing a morphing drone's conceptual design, including topology, actuation, morphing strategy, and controller parameters. The method applies multi-objective constraint-based optimisation to a multi-body winged drone, with trajectory optimisation solving the motion intelligence problem under diverse flight mission requirements, such as energy consumption and mission completion time. We show that co-designed morphing drones outperform fixed-wing drones in terms of energy efficiency and mission time, suggesting that the proposed co-design method could be a useful addition to the aircraft engineering toolbox.
|
|
13:30-15:00, Paper WeBA2-CC.2 | Add to My Program |
FC-Planner: A Skeleton-Guided Planning Framework for Fast Aerial Coverage of Complex 3D Scenes |
|
Feng, Chen | Hong Kong University of Science and Technology |
Li, Haojia | The Hong Kong University of Science and Technology |
Zhang, Mingjie | Northwestern Polytechnical University |
Chen, Xinyi | The Hong Kong University of Science and Technology |
Zhou, Boyu | Sun Yat-Sen University |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Aerial Systems: Perception and Autonomy, Motion and Path Planning, Aerial Systems: Applications
Abstract: 3D coverage path planning for UAVs is a crucial problem in diverse practical applications. However, existing methods have shown unsatisfactory system simplicity, computation efficiency, and path quality in large and complex scenes. To address these challenges, we propose FC-Planner, a skeleton-guided planning framework that can achieve fast aerial coverage of complex 3D scenes without pre-processing. We decompose the scene into several simple subspaces by a skeleton-based space decomposition (SSD). Additionally, the skeleton guides us to effortlessly determine free space. We utilize the skeleton to efficiently generate a minimal set of specialized and informative viewpoints for complete coverage. Based on SSD, a hierarchical planner effectively divides the large planning problem into independent sub-problems, enabling parallel planning for each subspace. The carefully designed global and local planning strategies are then incorporated to guarantee both high quality and efficiency in path generation. We conduct extensive benchmark and real-world tests, where FC-Planner computes over 10 times faster compared to state-of-the-art methods with shorter paths and more complete coverage. The source code will be made publicly available to benefit the community. Project page: https://hkust-aerial-robotics.github.io/FC-Planner.
|
|
13:30-15:00, Paper WeBA2-CC.3 | Add to My Program |
Time-Optimal Gate-Traversing Planner for Autonomous Drone Racing |
|
Qin, Chao | University of Toronto |
Michet, Maxime Simon Joseph | University of Toronto |
Chen, Jingxiang | University of Toronto |
Liu, Hugh H.-T. | University of Toronto |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Art and Entertainment Robotics
Abstract: In drone racing, the time-minimum trajectory is affected by the drone's capabilities, the layout of the race track, and the configurations of the gates (e.g., their shapes and sizes). However, previous studies neglect the configuration of the gates, simply rendering drone racing a waypoint-passing task. This formulation often leads to a conservative choice of paths through the gates, as the spatial potential of the gates is not fully utilized. To address this issue, we present a time-optimal planner that can faithfully model gate constraints with various configurations and thereby generate the most time-efficient trajectory while respecting single-rotor-thrust limits. Our approach excels in computational efficiency, taking only a few seconds to compute the full state and control trajectories of the drone through tracks with dozens of different gates. Extensive simulations and experiments confirm the effectiveness of the proposed methodology, showing that the lap time can be further reduced by taking the gates' configurations into account. We validate our planner in real-world flights and demonstrate highly aggressive flight trajectories through race tracks.
|
|
13:30-15:00, Paper WeBA2-CC.4 | Add to My Program |
Sequential Trajectory Optimization for Externally-Actuated Modular Manipulators with Joint Locking |
|
Choe, Jaeu | Seoul National University |
Lee, Jeongseob | Seoul National University |
Yang, Hyunsoo | Seoul National University |
Nguyen, Hai-Nguyen (Hann) | CNRS |
Lee, Dongjun | Seoul National University |
Keywords: Aerial Systems: Applications, Aerial Systems: Mechanics and Control
Abstract: In this paper, we present a novel trajectory planning method for externally-actuated modular manipulators (EAMMs), consisting of multiple rotor-actuated links with joints that can be either locked or unlocked. This joint-locking feature allows effective balancing of the payload capacity and dexterity of the robot but significantly complicates the planning problem by introducing binary decision variables. To address this challenge, we leverage the problem's intrinsic structure, i.e., that the payload at the end-effector is enhanced by merely locking its immediately connected links; this allows us to break the complex planning problem down into a series of manageable subproblems and solve them sequentially. Our approach significantly reduces the problem's complexity: in a serial n-link EAMM with m joint-lock mechanisms, where there could potentially be 2^m distinct configurational dynamics, we need to solve only n+1 trajectory optimization problems for single-rigid-body dynamics sequentially, rendering the problem tractable. We substantiate the efficacy of our method through various simulation and experimental studies, covering ground-free and ground-bound configurations as well as both motion-only and manipulation tasks.
|
|
13:30-15:00, Paper WeBA2-CC.5 | Add to My Program |
Spatial Assisted Human-Drone Collaborative Navigation and Interaction through Immersive Mixed Reality |
|
Morando, Luca | New York University |
Loianno, Giuseppe | New York University |
Keywords: Aerial Systems: Applications
Abstract: Aerial robots have the potential to play a crucial role in assisting humans with complex and dangerous tasks. Nevertheless, the future industry demands innovative solutions to streamline the interaction process between humans and drones to enable seamless collaboration and efficient co-working. In this paper, we present a novel tele-immersive framework that facilitates cognitive and physical collaboration between humans and robots through Mixed Reality (MR). This includes a novel bi-directional spatial awareness approach and a multi-modal virtual-physical interaction approach. The former seamlessly integrates the physical and virtual worlds, providing bidirectional egocentric and exocentric environment representations. The latter, leveraging the proposed spatial representation, further enhances the collaboration by combining a robot planning algorithm for obstacle avoidance with variable admittance control. This enables the user to generate commands based on virtual forces while ensuring compatibility with the environment map. We validate the proposed approach by conducting several collaborative planning and exploration tasks involving a drone and a user equipped with an MR headset.
|
|
13:30-15:00, Paper WeBA2-CC.6 | Add to My Program |
A Trajectory-Based Flight Assistive System for Novice Pilots in Drone Racing Scenario |
|
Zhong, Yuhang | Zhejiang University |
Zhao, Guangyu | Zhejiang University |
Wang, Qianhao | Zhejiang University |
Xu, Guangtong | Zhejiang University |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Human Factors and Human-in-the-Loop, Telerobotics and Teleoperation, Art and Entertainment Robotics
Abstract: Drone racing has become a popular international competition and has attracted wide attention in recent years. However, the requirement of high-level piloting skill keeps novice pilots from participating in it. This paper presents a trajectory-based flight assistive system that enables various operators to fly a drone through a racing scene at high speed. The whole system is structured hierarchically, consisting of offline and online components. In the offline part, a global time-optimal trajectory is generated as the expert reference, and a dense flight corridor is constructed to provide a sufficiently large safe region. In the online part, a remote-control-mapped primitive is designed to quickly encapsulate the pilot's inputs, and a time-mapping-based trajectory progress is customized to further capture pilot intention. A trajectory planner then periodically generates intention-aligned, smooth, feasible, and safe trajectories. Additionally, a yaw planner that provides the pilot with the most suitable view angle is employed to further reduce operating difficulty. Simulations and real-world experiments verify the performance of our system: a novice drone pilot reached a maximum velocity of 6.0 m/s in a real racing scene. We will open-source our code later.
|
|
WeBT1-CC Oral Session, CC-303 |
Add to My Program |
Motion and Path Planning II |
|
|
Chair: Yong, Sze Zheng | Northeastern University |
Co-Chair: Liu, Sicong | Southern University of Science and Technology |
|
13:30-15:00, Paper WeBT1-CC.1 | Add to My Program |
RBI-RRT*: Efficient Sampling-Based Path Planning for High-Dimensional State Space |
|
Chen, Fang | Southern University of Science and Technology |
Zheng, Yu | Tencent |
Wang, Zheng | Southern University of Science and Technology |
Chi, Wanchao | Tencent |
Liu, Sicong | Southern University of Science and Technology |
Keywords: Motion and Path Planning
Abstract: Sampling-based planning algorithms such as RRT have proven efficient at solving path planning problems for robotic systems. Various improvements to the RRT algorithm, such as Informed RRT*, have been presented to improve the extension and convergence of the random trees. However, as the number of spatial dimensions grows, the time spent randomly sampling the entire state space and incrementally rewiring the random trees rises drastically before a feasible solution is found. In this paper, to enhance convergence to optimal solutions, we present the Reconstructed Bi-directional Informed RRT* (RBI-RRT*) path planning algorithm. The algorithm acts like RRT-Connect to rapidly find a feasible solution, which helps compress the sampling space as Informed RRT* does. After the random trees are transformed into an RRT* structure by the reconstruction process in RBI-RRT*, the algorithm continues searching for a near-optimal path. A series of simulations and real-world robot experiments were conducted to evaluate the algorithm against existing planners. Compared to Informed RRT*-Connect, RBI-RRT* reduced the computation time needed to reach a specified cost by 22.1% on average in simulations and by 11.2% in real-world robotic arm experiments. The results show that RBI-RRT* is more efficient in high-dimensional planning problems.
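For readers unfamiliar with the informed-sampling idea that RBI-RRT* builds on, the ellipsoidal sampling-space compression of Informed RRT* can be sketched as follows. This is an editorial illustration in NumPy, not code from the paper: once a solution of cost c_best is known, further samples need only come from the prolate hyperspheroid whose foci are the start and goal.

```python
import numpy as np

def sample_informed(start, goal, c_best, rng=None):
    """Uniformly sample the prolate hyperspheroid (informed subset) whose
    foci are start/goal and whose transverse diameter is c_best, as used
    by Informed RRT*-style planners."""
    rng = np.random.default_rng() if rng is None else rng
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    n = start.size
    c_min = np.linalg.norm(goal - start)      # theoretical minimum path cost
    center = (start + goal) / 2.0
    # Rotation aligning the first axis with the start->goal direction
    # (the standard SVD-based construction).
    a1 = (goal - start) / c_min
    U, _, Vt = np.linalg.svd(np.outer(a1, np.eye(n)[0]))
    C = U @ np.diag([1.0] * (n - 1) + [np.linalg.det(U) * np.linalg.det(Vt)]) @ Vt
    # Semi-axis lengths of the ellipsoid.
    r = np.full(n, np.sqrt(max(c_best**2 - c_min**2, 0.0)) / 2.0)
    r[0] = c_best / 2.0
    # Uniform sample in the unit n-ball, then warp into the ellipsoid.
    x = rng.normal(size=n)
    x = x / np.linalg.norm(x) * rng.random() ** (1.0 / n)
    return center + C @ (r * x)
```

Every returned point satisfies dist(p, start) + dist(p, goal) <= c_best, so samples that cannot improve the current solution are never drawn; as c_best shrinks, so does the sampled region.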
|
|
13:30-15:00, Paper WeBT1-CC.2 | Add to My Program |
Quasi-Static Path Planning for Continuum Robots by Sampling on Implicit Manifold |
|
Wang, Yifan | Georgia Institute of Technology |
Chen, Yue | Georgia Institute of Technology |
Keywords: Motion and Path Planning, Flexible Robotics
Abstract: Continuum robots (CRs) offer excellent dexterity and compliance in contrast to rigid-link robots, making them suitable for navigating through, and interacting with, confined environments. However, the study of path planning for CRs while considering external elastic contact is limited. The challenge lies in the fact that CRs can have multiple possible configurations when in contact, rendering the forward kinematics not well-defined, and characterizing the set of feasible robot configurations is non-trivial. In this paper, we propose to perform quasi-static path planning on an implicit manifold. We model elastic obstacles as external potential fields and formulate the robot statics in the potential field as the extremal trajectory of an optimal control problem. We show that the set of stable robot configurations is a smooth manifold diffeomorphic to a submanifold embedded in the product space of the CR actuation and base internal wrench. We then propose to perform path planning on this manifold using AtlasRRT*, a sampling-based planner dedicated to planning on implicit manifolds. Simulations in different operation scenarios were conducted and the results show that the proposed planner outperforms Euclidean space planners in terms of success rate and computational efficiency.
|
|
13:30-15:00, Paper WeBT1-CC.3 | Add to My Program |
Reconfiguration of a 2D Structure Using Spatio-Temporal Planning and Load Transferring |
|
Garcia Gonzalez, Javier | University of Houston |
Yannuzzi, Michael | University of Houston |
Kramer, Peter | TU Braunschweig |
Rieck, Christian | Technische Universität Braunschweig |
Fekete, Sándor | Technische Universität Braunschweig |
Becker, Aaron | University of Houston |
Keywords: Building Automation, Motion and Path Planning, Swarm Robotics
Abstract: We present progress on the problem of reconfiguring a 2D arrangement of building material using a cooperative group of robots. These robots must avoid collisions and deadlocks, and are subject to the constraint of maintaining connectivity of the structure. We develop two reconfiguration methods, one based on spatio-temporal planning and one based on target swapping, to increase building efficiency. The first method can significantly reduce planning times compared to other multi-robot planners. The second method helps reduce the time robots spend waiting for paths to be cleared and the overall distance traveled by the robots.
|
|
13:30-15:00, Paper WeBT1-CC.4 | Add to My Program |
Neural Informed RRT*: Learning-Based Path Planning with Point Cloud State Representations under Admissible Ellipsoidal Constraints |
|
Huang, Zhe | University of Illinois at Urbana-Champaign |
Chen, Hongyu | University of Illinois at Urbana-Champaign |
Pohovey, John | University of Illinois Urbana-Champaign |
Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Keywords: Motion and Path Planning, AI-Based Methods
Abstract: Sampling-based planning algorithms like Rapidly-exploring Random Tree (RRT) are versatile in solving path planning problems. RRT* offers asymptotic optimality but requires growing the tree uniformly over the free space, which leaves room for efficiency improvement. To accelerate convergence, rule-based informed approaches sample states in an admissible ellipsoidal subset of the space determined by the current path cost. Learning-based alternatives model the topology of the free space and infer the states close to the optimal path to guide planning. We propose Neural Informed RRT* to combine the strengths of both sides. We define point cloud representations of free states. We perform Neural Focus, which constrains the point cloud to the admissible ellipsoidal subset from Informed RRT* and feeds it into PointNet++ for refined guidance-state inference. In addition, we introduce Neural Connect to build connectivity of the guidance state set and further boost performance in challenging planning problems. Our method surpasses previous works in path planning benchmarks while preserving probabilistic completeness and asymptotic optimality. We deploy our method on a mobile robot and demonstrate real-world navigation around static obstacles and dynamic humans. Code is available at https://github.com/tedhuang96/nirrt_star.
|
|
13:30-15:00, Paper WeBT1-CC.5 | Add to My Program |
Motions in Microseconds Via Vectorized Sampling-Based Planning |
|
Thomason, Wil | Rice University |
Kingston, Zachary | Rice University |
Kavraki, Lydia | Rice University |
Keywords: Motion and Path Planning
Abstract: Modern sampling-based motion planning algorithms typically take from hundreds of milliseconds to dozens of seconds to find collision-free motions for high degree-of-freedom problems. This paper presents performance improvements of more than 500x over the state of the art, bringing planning times into the range of microseconds and solution rates into the range of kilohertz, without specialized hardware. Our key insight is how to exploit fine-grained parallelism within planning, providing generality-preserving algorithmic improvements to any such planner and significantly accelerating critical subroutines, such as forward kinematics and collision checking. We demonstrate our approach on a diverse set of challenging, realistic problems for complex robots ranging from 7 to 14 degrees of freedom. Moreover, we show that our approach does not require high-power hardware by also evaluating on a low-power single-board computer. The planning speeds demonstrated are fast enough to reside in the range of control frequencies and open up new avenues of motion planning research.
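The fine-grained parallelism described above relies on hardware vectorization; as a rough editorial stand-in, the same batch-evaluation idea can be illustrated with vectorized collision checking of an entire motion edge. The point-robot/sphere-obstacle model below is a toy assumption for illustration, not the paper's implementation:

```python
import numpy as np

def edge_in_collision(q_a, q_b, obstacles, radii, robot_radius=0.1, n_states=64):
    """Check an entire motion edge for collision in one vectorized pass.

    Rather than testing interpolated states one at a time, all states are
    evaluated simultaneously with array operations -- a NumPy stand-in for
    the fine-grained (SIMD) parallelism the paper exploits.  Point robot
    with spherical obstacles; a toy model for illustration only.
    """
    t = np.linspace(0.0, 1.0, n_states)[:, None]       # (n_states, 1)
    states = (1.0 - t) * q_a + t * q_b                 # (n_states, dim)
    # Pairwise distances from every state to every obstacle center.
    diff = states[:, None, :] - obstacles[None, :, :]  # (n_states, n_obs, dim)
    dist = np.linalg.norm(diff, axis=-1)               # (n_states, n_obs)
    return bool(np.any(dist < radii[None, :] + robot_radius))
```

The entire edge is validated with three array operations instead of a per-state loop, which is the kind of batched subroutine that lets planning throughput reach control-loop rates.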
|
|
13:30-15:00, Paper WeBT1-CC.6 | Add to My Program |
Gathering Data from Risky Situations with Pareto-Optimal Trajectories |
|
Brodt, Brennan | Boston University |
Pierson, Alyssa | Boston University |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Autonomous Agents
Abstract: This paper proposes a formulation for the risk-aware path planning problem which utilizes multi-objective optimization to dynamically plan trajectories that satisfy multiple complex mission specifications. In the setting of persistent monitoring, we develop a method for representing environmental information and risk in a way that allows for local sampling to generate Pareto-dominant solutions over a receding horizon. We propose two algorithms capable of solving these problems: a dense sampling approach and an improved method utilizing noisy gradient descent. Simulation results demonstrate the efficacy of our methods at persistently gathering information while avoiding risk, robust to randomly-generated environments.
|
|
13:30-15:00, Paper WeBT1-CC.7 | Add to My Program |
RETRO: Reactive Trajectory Optimization for Real-Time Robot Motion Planning in Dynamic Environments |
|
Dastider, Apan | University of Central Florida |
Fang, Hao | University of Central Florida |
Mingjie, Lin | University of Central Florida |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Planning under Uncertainty
Abstract: Reactive trajectory optimization for robotics presents formidable challenges, demanding the rapid generation of purposeful robot motion in complex and swiftly changing dynamic environments. While much existing research predominantly addresses robotic motion planning with predefined objectives, emerging problems in robotic trajectory optimization frequently involve dynamically evolving objectives and stochastic motion dynamics. However, effectively addressing such reactive trajectory optimization challenges for robot manipulators proves difficult due to inefficient, high-dimensional trajectory representations and a lack of consideration for time optimization. In response, we introduce a novel trajectory optimization framework called RETRO. RETRO employs adaptive optimization techniques that span both spatial and temporal dimensions. As a result, it achieves a remarkable computational complexity of O(T^2.4) + O(Tn^2), a significant improvement over the naive application of DDP, which leads to a complexity of O(n^4) when reasonable time step sizes are used. To evaluate RETRO's performance in terms of error, we conducted a comprehensive analysis of its regret bounds, comparing it to an Oracle value function obtained through an Oracle trajectory optimization algorithm. Our analytical findings demonstrate that RETRO's total regret can be upper-bounded by a function of the chosen time step size. Moreover, our approach delivers smoothly optimized robot trajectories in joint space, offering flexibility and adaptability for various tasks. It seamlessly integrates task-specific requirements such as collision avoidance while maintaining real-time control rates. We validate the effectiveness of our framework through extensive simulations and real-world robot experiments in closed-loop manipulation scenarios. For further details and supplementary materials, please visit: https://sites.google.com/view/retro-optimal-control/home
|
|
13:30-15:00, Paper WeBT1-CC.8 | Add to My Program |
WiTHy A*: Winding-Constrained Motion Planning for Tethered Robot Using Hybrid A* |
|
Chipade, Vishnu S. | University of Michigan |
Kumar, Rahul | Northeastern University |
Yong, Sze Zheng | Northeastern University |
Keywords: Motion and Path Planning, Constrained Motion Planning, Nonholonomic Motion Planning
Abstract: In this paper, a variant of hybrid A* is developed to find the shortest path for a curvature-constrained robot that is tethered at its start position, such that the tether satisfies user-defined winding angle constraints. A variant of tangent graphs is used as the underlying graph for the A* search, in order to reduce the overall computation, and appropriate cost metrics are defined to ensure the winding angle constraints are satisfied. Conditions are provided under which the proposed algorithm is guaranteed to find a winding-angle-constrained path. The effectiveness and performance of the proposed algorithm are studied in simulation.
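The winding angle that the constraints bound is the signed angle a path accumulates around the tether's anchor point. A minimal editorial sketch of that quantity (a toy helper, not the paper's implementation) is:

```python
import math

def winding_angle(path, anchor):
    """Accumulated signed angle (radians) that a polyline path winds
    around an anchor point -- the quantity winding constraints bound.
    Toy illustration, not the paper's implementation."""
    ax, ay = anchor
    total = 0.0
    prev = math.atan2(path[0][1] - ay, path[0][0] - ax)
    for x, y in path[1:]:
        cur = math.atan2(y - ay, x - ax)
        d = cur - prev
        # Unwrap the increment to (-pi, pi] so each segment contributes
        # its true signed rotation about the anchor.
        while d <= -math.pi:
            d += 2 * math.pi
        while d > math.pi:
            d -= 2 * math.pi
        total += d
        prev = cur
    return total
```

A closed counter-clockwise loop around the anchor accumulates +2π, a clockwise loop -2π, and a path that never encircles the anchor sums to a value strictly between; a planner can therefore reject candidate paths whose accumulated angle leaves a user-specified interval.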
|
|
13:30-15:00, Paper WeBT1-CC.9 | Add to My Program |
Differentiable Boustrophedon Paths That Enable Optimization Via Gradient Descent |
|
Manzini, Thomas | Texas A&M |
Murphy, Robin | Texas A&M |
Keywords: Motion and Path Planning, Optimization and Optimal Control
Abstract: This paper introduces a differentiable representation for the optimization of boustrophedon path plans in convex polygons, explores an additional parameter of these path plans that can be optimized, discusses the properties of this representation that can be leveraged during optimization, and shows that the previously published attempt at optimizing these path plans was too coarse to be practically useful. Experiments show that this differentiable representation reproduces scores from traditional discrete representations of boustrophedon path plans with high fidelity. Finally, optimization via gradient descent was attempted but found to fail because the search space is far more non-convex than previously considered in the literature. The wide range of applications for boustrophedon path plans means that this work has the potential to improve path planning efficiency in numerous areas of robotics, including mapping and search tasks using uncrewed aerial systems, environmental sampling tasks using uncrewed marine vehicles, and agricultural tasks using ground vehicles, among numerous other applications.
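As a point of reference for the path plans being optimized, a plain discrete boustrophedon (back-and-forth) sweep of a rectangle can be generated as follows. This is an editorial sketch of the traditional representation; the paper's contribution is a differentiable generalization of such plans, which this snippet does not implement:

```python
import numpy as np

def boustrophedon_path(width, height, spacing):
    """Waypoints for a back-and-forth (boustrophedon) sweep of an
    axis-aligned width x height rectangle with the given row spacing.
    Assumes height is a multiple of spacing so the last row is covered."""
    ys = np.arange(0.0, height + 1e-9, spacing)
    pts = []
    for i, y in enumerate(ys):
        if i % 2 == 0:                       # left-to-right row
            pts.append((0.0, y))
            pts.append((width, y))
        else:                                # right-to-left row
            pts.append((width, y))
            pts.append((0.0, y))
    return pts

def path_length(pts):
    """Total Euclidean length of a waypoint polyline (a typical score
    that a discrete plan representation is evaluated on)."""
    p = np.asarray(pts)
    return float(np.sum(np.linalg.norm(np.diff(p, axis=0), axis=1)))
```

Scores such as `path_length` are piecewise functions of discrete waypoints, which is precisely why a differentiable reparameterization is needed before gradient descent can be applied.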
|
|
WeBT2-CC Oral Session, CC-311 |
Add to My Program |
Robot Design |
|
|
Chair: Bonev, Ilian | École De Technologie Supérieure |
Co-Chair: Yeshmukhametov, Azamat | Nazarbayev University |
|
13:30-15:00, Paper WeBT2-CC.1 | Add to My Program |
Torsion-Induced Compliant Joints and Its Application to Flat-Foldable and Self-Assembling Robotic Arm |
|
Yang, Dong-Wook | Korea Advanced Institute of Science and Technology (KAIST) |
Park, Hyun-Su | KAIST |
Jang, Keon-Ik | Korea Advanced Institute of Science and Technology |
Han, Jae-Hung | KAIST |
Lee, Dae-Young | Korea Advanced Institute of Science and Technology |
Keywords: Compliant Joints and Mechanisms, Soft Robot Applications, Soft Robot Materials and Design
Abstract: The joint design of origami-inspired robots is one of the most distinctive features distinguishing them from conventional robots. A joint design that exploits a material's compliance enables origami robots to implement complex transformational movements in a lightweight and simple manner. However, utilizing the continuum bending mode of materials brings critical problems, including undesired movements and a large joint radius. This study addresses these problems with a torsion-based compliant joint (T-C joint) design, which utilizes the torsional deformation of materials. The potential of the T-C joint is demonstrated in a flat-foldable and self-assembling robotic arm, showing its applicability in environments with form-factor limitations and minimal human intervention. The robotic arm, comprising links, joints, and a gripper, can fold into a flat state, deploy with precision and minimal weight, and effectively manipulate target objects. This demonstration shows the real-world applicability of the proposed joint design.
|
|
13:30-15:00, Paper WeBT2-CC.2 | Add to My Program |
OriTrack: A Small, 3 Degree of Freedom, Origami Solar Tracker |
|
Winston, Crystal | Stanford University |
Casey, Leo | Google, Inc |
Keywords: Energy and Environment-Aware Automation, Soft Robot Applications, Soft Robot Materials and Design
Abstract: In response to the need for sustainable energy solutions, solar panels have gained significant traction. One way to increase the energy capture of solar systems is solar tracking: reorienting solar panels throughout the day so that they face the sun. The increase in energy capture that comes with solar tracking often far outweighs the energy required to move the panel, which makes it a compelling strategy for improving solar systems. Unfortunately, while solar trackers are commonly used in large solar farms, they are rarely used on rooftops, where solar panels are commonly installed. This is for two primary reasons: (1) most commercially available solar trackers are too large to be installed on roofs, and (2) even if traditional solar trackers were made in a more compact form factor, it would be difficult to lay them out densely on a roof without the trackers substantially shading each other. To address these issues, we introduce OriTrack, a small three-degree-of-freedom (3-DOF) solar tracker that reduces the area of its shadow by reducing its height as it tracks the sun. In this paper we discuss the design, manufacturing, and control of OriTrack. We then compare OriTrack to a flat reference panel, the solar energy solution commonly used on roofs today, and find that OriTrack produces 23% more energy. This result suggests OriTrack could be a future solution for solar tracking on rooftops.
|
|
13:30-15:00, Paper WeBT2-CC.3 | Add to My Program |
Reinforcement Learning for Freeform Robot Design |
|
Li, Muhan | Northwestern University |
Matthews, David | Northwestern University |
Kriegman, Sam | Northwestern University |
Keywords: Evolutionary Robotics
Abstract: Inspired by the necessity of morphological adaptation in animals, a growing body of work has attempted to expand robot training to encompass physical aspects of a robot's design. However, reinforcement learning methods capable of optimizing the 3D morphology of a robot have been restricted to reorienting or resizing the limbs of a predetermined and static topological genus. Here we show policy gradients for designing freeform robots with arbitrary external and internal structure. This is achieved through actions that deposit or remove bundles of atomic building blocks to form higher-level nonparametric macrostructures such as appendages, organs, and cavities. Although results are provided for open-loop control only, we discuss how this method could be adapted for closed-loop control and sim2real transfer to physical machines in future work.
|
|
13:30-15:00, Paper WeBT2-CC.4 | Add to My Program |
A Helical Bistable Soft Gripper Enabled by Pneumatic Actuation |
|
Yin, Xuanchun | South China Agricultural University |
Xie, Junliang | South China Agricultural University |
Zhou, Pengyu | South China Agricultural University |
Wen, Sheng | South China Agricultural University |
Zhang, Jiantao | South China Agricultural University |
Keywords: Grippers and Other End-Effectors, Biologically-Inspired Robots, Soft Robot Applications
Abstract: In nature, there are many instances of helical mechanisms used to efficiently grasp objects of various shapes and sizes. Inspired by helical grasping in nature, we propose a helical bistable soft gripper with high load capacity and energy-saving operation. An off-the-shelf bistable steel shell (BSS) serving as the stiff element is inserted into a 3D-printed soft helical exoskeleton to coil around and hold objects without energy consumption. Two air pouches act as the actuator to control the transition between the two stable states. To facilitate gripper design, a simplified model of the gripper was developed, and the geometric parameters of the gripper are tabulated for reference. The transition pressures between the two stable states were experimentally characterized. Moreover, we conducted experiments to demonstrate the capability of the gripper in two working modes. The gripper exhibits coiling diameters ranging from 40 mm to 60 mm and successfully attaches to various slender objects of different geometries with a maximum holding force of 92.67 N (up to 135.1 times its own weight) in hanging mode. Finally, the gripper was integrated onto a robot arm and successfully grasped different objects, with a maximum grasping weight of 221.6 g in grasping mode.
|
|
13:30-15:00, Paper WeBT2-CC.5 | Add to My Program |
Singularity Analysis of Kinova's Link 6 Robot Arm Via Grassmann Line Geometry |
|
Asgari, Milad | École De Technologie Supérieure |
Bonev, Ilian | École De Technologie Supérieure |
Gosselin, Clement | Université Laval |
Keywords: Kinematics, Mechanism Design, Actuation and Joint Mechanisms
Abstract: Unlike parallel robots, for which hundreds of different architectures have been proposed, the vast majority of six-degree-of-freedom (DOF) serial robots have one of two simple architectures. In both architectures, the inverse kinematics can be solved in closed form and the singularities described by trivial geometric and algebraic conditions. These conditions can be readily obtained by analyzing the determinant of the robot's Jacobian matrix, and provide an in-depth understanding of the robot's singularities, which is essential for its optimal use. However, for various reasons, robot arms with unorthodox architectures are occasionally designed. Such arms do not have closed-form inverse kinematics and little insight into their singularities can be gained by analyzing the determinant of their Jacobian. One such robot arm for which the conventional singularity analysis approach fails is the new Link 6 collaborative robot by Kinova. In this paper, we study the complex singularities of Link 6 by investigating all possibilities for screw dependencies, deriving a simple equation for each case, and then describing each singularity type using Grassmann line geometry. Twelve different singularity configurations are identified and described with seven relatively simple geometric conditions. Our approach is general and can be applied to other robot arms.
|
|
13:30-15:00, Paper WeBT2-CC.6 | Add to My Program |
Design and Testing of a Multi-Module, Tetherless, Soft Robotic Eel |
|
Hall, Robin | Worcester Polytechnic Institute |
Espinosa, Gabriel | Worcester Polytechnic Institute |
Chiang, Shou-Shan | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Marine Robotics, Soft Robot Materials and Design, Biologically-Inspired Robots
Abstract: This paper presents a free-swimming, tetherless, cable-driven modular soft robotic fish. The body comprises a series of 3D-printed wave spring structures that create a flexible biologically inspired shape that is capable of an anguilliform swimming gait. A three-module soft robotic fish was designed, fabricated, and evaluated. The motion of the robot was characterized, and different combinations of actuation amplitude, frequency, and phase shift were tested experimentally to determine the optimal parameters that maximized speed and minimized the cost of transport (COT). The maximum speed recorded was 0.20 BL/s (body lengths per second) with a COT of 15.82. These results were compared against other robotic and biological fish. We operated the robot, untethered, in a variety of environments to test how it was able to function outside of laboratory settings.
|
|
13:30-15:00, Paper WeBT2-CC.7 | Add to My Program |
Untethered Underwater Soft Robot with Thrust Vectoring |
|
Hall, Robin | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Marine Robotics, Soft Robot Materials and Design, Soft Robot Applications
Abstract: This paper introduces DRAGON: Deformable Robot for Agile Guided Observation and Navigation, a free-swimming deformable impeller-powered vectored underwater vehicle (VUV). A 3D-printed wave spring structure directs the water drawn through the center of the robot by an impeller, enabling it to move smoothly in different directions. The robot is designed to have a narrow cylindrical profile to lower drag and improve agility. It has a maximum recorded speed of 2.1 BL/s (body lengths per second) and a minimum cost of transport (COT) of 2.9. The robot has two degrees of freedom (DoFs) and is capable of performing a variety of maneuvers including a full circle with a radius of 0.23 m (1.4 BL) and a figure eight, which it completed in 4.98 s (72.3 degree/s) and 10.74 s respectively. We operated the robot, untethered, in various environments to test the robustness of the design and analyze its motion and performance.
|
|
13:30-15:00, Paper WeBT2-CC.8 | Add to My Program |
A Backdrivable Axisymmetric Kinematically Redundant (6+3)-Degree-Of-Freedom Hybrid Parallel Manipulator |
|
Kim, Jehyeok | Université Laval |
Gosselin, Clement | Université Laval |
Keywords: Redundant Robots, Mechanism Design, Physical Human-Robot Interaction
Abstract: A kinematically redundant (6+3)-degree-of-freedom (DOF) hybrid parallel robot with an axisymmetric workspace is proposed. By arranging the first revolute joint of each leg such that they have the same rotation axis, this robot can achieve an axisymmetric workspace, resulting in a large reachable workspace. In addition, type II singularities, which critically limit the orientational workspace, can be fully avoided by utilizing kinematic redundancy. A gripper mechanism is developed to increase the orientational workspace by exploiting the redundant DOFs. Moreover, the orientational workspace can be further increased by introducing a redundant DOF with a constant angle. As a result, the proposed hybrid parallel robot achieves a high workspace-to-footprint ratio comparable to that of serial robots. A CAD model of the robot and computer animations are provided to demonstrate the large workspaces and the gripper mechanism. A significant advantage of the proposed robot over serial architectures is that the robot is backdrivable since it uses direct-drive or quasi-direct-drive actuators.
|
|
13:30-15:00, Paper WeBT2-CC.9 | Add to My Program |
Design of a Fully Pulley-Guided Wire-Driven Prismatic Tensegrity Robot: Friction Impact to Robot Payload Capacity |
|
Yeshmukhametov, Azamat | Nazarbayev University |
Koganezawa, Koichi | Tokai University |
Keywords: Redundant Robots, Tendon/Wire Mechanism, Mechanism Design
Abstract: The tensegrity structure was initially created as a static structure, but it has gained significant attention among robotics researchers due to its benefits, including high payload capability, shock resistance, and resiliency. However, implementing tensegrity structures in robotics presents new technical challenges, primarily related to their wire-driven structure, such as wire-routing and wire-friction problems. Therefore, this research letter proposes a technical solution for the aforementioned problems. The main contribution of this research is the design of frictionless pulley-guided nodes. To validate the proposed concept, we conducted comparative experiments between a common tensegrity prototype and a pulley-guided prototype, evaluating wire tension distribution and payload capacity.
|
|
WeBT3-CC Oral Session, CC-313 |
Add to My Program |
Kinematics and Dynamics |
|
|
Chair: Yi, Jingang | Rutgers University |
Co-Chair: Lau, Darwin | The Chinese University of Hong Kong |
|
13:30-15:00, Paper WeBT3-CC.1 | Add to My Program |
Motion Planning and Inertia Based Control for Impact Aware Manipulation |
|
Khurana, Harshit | EPFL |
Billard, Aude | EPFL |
Keywords: Impact Aware Manipulation, Motion Control of Manipulators, Motion and Path Planning, Factory Automation
Abstract: In this paper, we propose a metric called hitting flux, used in motion generation and control for a robot manipulator that interacts with the environment through a hitting or striking motion. Given the task of placing a known object outside of the workspace of the robot, the robot needs to come into contact with it at a non-zero relative speed. The configuration of the robot and the speed at contact matter because they affect the motion of the object. The hitting flux depends on the robot's configuration, the robot's speed, and the properties of the environment. An approach to achieve the desired directional pre-impact flux for the robot through a combination of a dynamical system (DS) for motion generation and a control system that regulates the directional inertia of the robot is presented. Furthermore, a Quadratic Program (QP) formulation for achieving a desired inertia matrix at a desired position while following a motion plan constrained to the robot's limits is presented. The system is tested for different scenarios in simulation, showing the repeatability of the procedure, and in real scenarios with a KUKA LBR iiwa 7 robot.
|
|
13:30-15:00, Paper WeBT3-CC.2 | Add to My Program |
RASCAL: A Scalable, High-Redundancy Robot for Automated Storage and Retrieval Systems |
|
Black, Richard | Microsoft |
Caballero, Marco | Microsoft Research |
Chatzieleftheriou, Andromachi | Microsoft |
Deegan, Tim | Microsoft Research |
Heard, Philip | Microsoft Research, Cambridge, UK |
Hong, Freddie | Microsoft Research |
Joyce, Russell | Microsoft Research |
Legtchenko, Sergey | Microsoft |
Rowstron, Antony | Microsoft Research |
Smith, Adam | Microsoft |
Sweeney, David | Microsoft Research |
Williams, Hugh | Microsoft |
Keywords: Industrial Robots, Climbing Robots, Mechanism Design
Abstract: Automated storage and retrieval systems (ASRS) are a key component of the modern storage industry, and are used in a wide range of applications, carrying anything from lightweight tape cartridges to entire pallets of goods. Many of these systems are under pressure to maximise the use of space by growing in height and density, but this can create challenges for the robots that service them. In this context, we present RASCAL, a novel ASRS robot for small payload items in structured environments, with a focus on system-level scalability and redundancy. We describe the design objectives of RASCAL and how they address some of the limitations of existing robotic systems in this area, such as scalability and redundancy. We then demonstrate the viability of our design with a proof-of-concept implementation of a data centre storage media robot, and show through a series of experiments that its design, speed, accuracy, and energy efficiency are appropriate for this application.
|
|
13:30-15:00, Paper WeBT3-CC.3 | Add to My Program |
Virtual Passive-Joint Space Based Time-Optimal Trajectory Planning for a 4-DOF Parallel Manipulator |
|
Zhao, Jie | Chinese Academy of Sciences |
Yang, Guilin | Ningbo Institute of Material Technology and Engineering, Chines |
Shi, Haoyu | University of Nottingham, Ningbo China |
Chen, Silu | Ningbo Institute of Materials Technology and Engineering, CAS |
Chen, Chin-Yin | Ningbo Institute of Material Technology and Engineering, CAS |
Zhang, Chi | Ningbo Institute of Material Technology and Engineering, CAS |
Keywords: Parallel Robots, Kinematics, Motion and Path Planning
Abstract: The 4-DOF (3T1R) 4PPa-2PaR parallel manipulator is developed for high-speed pick-and-place operations. However, conventional trajectory planning methods in either the active-joint space or Cartesian space have shortcomings due to its highly nonlinear kinematics. Owing to its unique four-to-two leg structure, the middle link that connects the two proximal parallelogram four-bar linkages on each side only generates 2-DOF translational motions in a vertical plane. By treating each middle link as a 2-DOF virtual passive joint, a new trajectory planning method in the 4-DOF virtual passive-joint space is proposed, which not only simplifies the kinematic analysis but also reduces the kinematic nonlinearity. With the virtual passive joints, both displacement and velocity analyses are readily carried out. The Lagrangian method is employed to formulate the closed-form dynamic model. A quintic B-spline is utilized to generate trajectories in the virtual passive-joint space, while a Genetic Algorithm is implemented to search for the time-optimal trajectory. The simulation results indicate that the optimal time planned in the virtual passive-joint space is decreased by 2.8% and 8.1% compared with the active-joint space and Cartesian space methods, respectively. The average and peak jerks of the moving platform are decreased by 14.6% and 37.6% compared with the active-joint space method.
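To illustrate the flavor of time-optimal planning under kinematic limits, here is a much-simplified stand-in: a single-DOF quintic time scaling (zero boundary velocity and acceleration) sized to respect velocity and acceleration bounds. The paper itself optimizes quintic B-spline trajectories with a Genetic Algorithm in the passive-joint space; this sketch only shows the underlying limit calculation:

```python
import math

def min_time_quintic(delta, v_max, a_max):
    # Quintic time scaling s(t) = delta*(10u^3 - 15u^4 + 6u^5), u = t/T,
    # with zero boundary velocity and acceleration.
    # Peak speed:        1.875 * delta / T      (at mid-move)
    # Peak acceleration: (10/sqrt(3)) * delta / T^2
    t_v = 1.875 * delta / v_max                      # T from the velocity limit
    t_a = math.sqrt((10.0 / math.sqrt(3.0)) * delta / a_max)  # T from the accel limit
    return max(t_v, t_a)                             # the binding constraint wins
```

A GA-based planner, roughly speaking, searches over knot placements of the B-spline while a check like this keeps every candidate within the joint limits.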
|
|
13:30-15:00, Paper WeBT3-CC.4 | Add to My Program |
Direct Kinematic Singularities and Stability Analysis of Sagging Cable-Driven Parallel Robots |
|
Briot, Sébastien | LS2N |
Merlet, Jean-Pierre | INRIA |
Keywords: Parallel Robots, Kinematics, Stability, Industrial Robots
Abstract: Sagging cable-driven parallel robots (CDPRs) are often modelled using Irvine's model. We show that their configurations may be unstable and, moreover, that assessing the stability of the robot with Irvine's model cannot be done by checking the spectrum of a stiffness matrix associated with the platform motions. In the present paper, we show that the static configurations of sagging CDPRs are local extrema of the functional describing the robot's potential energy. For assessing the stability, it is then necessary to check two conditions: the Legendre-Clebsch and the Jacobi conditions, both well known in optimal control theory. We also (i) prove that there is a link between some singularities of CDPRs and the limits of stability and (ii) show that singularities of the platform wrench system are not singularities of the geometric model of sagging CDPRs, contrary to what happens in rigid-link parallel robotics. The stability predictions are validated in simulation by cross-validation against a lumped model, for which the stability can be assessed by analyzing the spectrum of a reduced Hessian matrix of the potential energy.
|
|
13:30-15:00, Paper WeBT3-CC.5 | Add to My Program |
Towards Solving Cable-Driven Parallel Robot Inaccuracy Due to Cable Elasticity |
|
Suarez Roos, Adolfo | IRT Jules Verne |
Zake, Zane | IRT Jules Verne |
Rasheed, Tahir | IRCCyN - ECN - IRT JV |
Pedemonte, Nicolo | IRT Jules Verne |
Caro, Stéphane | CNRS/LS2N |
Keywords: Parallel Robots, Tendon/Wire Mechanism, Kinematics
Abstract: Cable elasticity can significantly impact the accuracy of Cable-Driven Parallel Robots (CDPRs). However, it is frequently disregarded as negligible in CDPR simulations and designs. In this paper, we propose a numerical approach, referred to as SEECR, designed to estimate the behavior of a CDPR featuring elastic cables while ensuring the Static Equilibrium (SE) of the Moving-Platform (MP). By modeling the cables as elastic springs, the proposed approach correctly predicts which cables become slack, estimates the tension distribution among cables, and computes unwanted MP motions, making it possible to predict the impact of design choices. The results have been validated experimentally on two cable types and configurations.
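The core modeling idea — cables as unilateral springs that can pull but not push — can be sketched in a few lines (an illustrative simplification; SEECR's full static-equilibrium solve is not reproduced here):

```python
def cable_tension(k, rest_length, length):
    # Unilateral elastic element: a cable pulls when stretched and
    # carries zero tension when slack (length below rest length).
    # k: stiffness [N/m]; lengths in metres; returns tension in newtons.
    return max(0.0, k * (length - rest_length))
```

Evaluating this for every cable at a candidate platform pose is what lets an approach of this kind predict which cables go slack and how the tension redistributes among the remaining ones.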
|
|
13:30-15:00, Paper WeBT3-CC.6 | Add to My Program |
Wrench and Twist Capability Analysis for Cable-Driven Parallel Robots with Consideration of the Actuator Torque-Speed Relationship |
|
Chan, Ngo Foon | The Chinese University of Hong Kong |
Lam, Wai Yi | The Chinese University of Hong Kong |
Lau, Darwin | The Chinese University of Hong Kong |
Keywords: Parallel Robots, Tendon/Wire Mechanism, Manipulation Planning, Wrench-twist Feasibility
Abstract: The wrench and twist feasibility are the workspace conditions that indicate whether the mobile-platform (MP) of the cable-driven parallel robots (CDPRs) can provide a sufficient amount of wrench and twist. Traditionally, these two quantities are evaluated independently from the actuator's torque and speed limits, which are assumed to be fixed in the literature, but they are indeed coupled. This results in a conservative usage of the actuator capability and hence hinders the robot's actual feasibility. In this study, new approaches to analyzing and commanding CDPRs by considering the coupling effect are proposed. First, the required wrench of the MP is mapped into the twist space by the motors' torque-speed relationship and becomes the wrench-dependent available twist set. Then a new workspace condition and a new metric are introduced based on the available twist set. The metric shows the maximum allowable MP speed map of the workspace. Finally, a varying speed trajectory is designed based on the metric to optimize the total MP traveling time. This study shows the potential of robot wrench-twist capability and enhances the robot hardware effectiveness without any ha
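The coupling the abstract describes can be illustrated with the classic linear torque-speed line of a DC motor: the torque available shrinks as speed grows, so the feasible twist depends on the required wrench (an illustrative motor model, not the paper's formulation):

```python
def max_speed_at_torque(tau_req, tau_stall, w_noload):
    # Linear torque-speed line: tau_avail(w) = tau_stall * (1 - w / w_noload).
    # Returns the highest speed at which the motor can still deliver
    # tau_req -- the coupling that fixed, independent torque and speed
    # limits ignore, leading to conservative wrench/twist workspaces.
    if tau_req > tau_stall:
        return 0.0  # required torque infeasible even at standstill
    return w_noload * (1.0 - tau_req / tau_stall)
```

Mapping the required cable tensions through a relation like this, per motor, yields a wrench-dependent available twist set rather than a single fixed speed limit.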
|
|
13:30-15:00, Paper WeBT3-CC.7 | Add to My Program |
RicMonk: A Three-Link Brachiation Robot with Passive Grippers for Energy-Efficient Brachiation |
|
Grama Srinivas Shourie, Grama Srinivas Shourie | Deutsches Forschungszentrum Für Künstliche Intelligenz, Bremen |
Javadi, Mahdi | German Research Center for Artificial Intelligence Robotics Inn |
Kumar, Shivesh | DFKI GmbH |
Zamani Boroujeni, Hossein | DFKI-Robotics Innovation Center |
Kirchner, Frank | University of Bremen |
Keywords: Underactuated Robots, Biologically-Inspired Robots, Education Robotics
Abstract: This paper presents the design, analysis, and performance evaluation of RicMonk, a novel three-link brachiation robot equipped with passive hook-shaped grippers. Brachiation, an agile and energy-efficient mode of locomotion observed in primates, has inspired the development of RicMonk to explore versatile locomotion and maneuvers on ladder-like structures. The robot’s anatomical resemblance to gibbons and the integration of a tail mechanism for energy injection contribute to its unique capabilities. The paper discusses the use of the Direct Collocation methodology for optimizing trajectories for the robot’s dynamic behaviors and stabilization of these trajectories using a Time-varying Linear Quadratic Regulator. With RicMonk we demonstrate bidirectional brachiation, and provide comparative analysis with its predecessor, AcroMonk - a two-link brachiation robot, to demonstrate that the presence of a passive tail helps improve energy efficiency. The system design, controllers, and software implementation are publicly available on GitHub at https://github.com/dfki-ric-underactuated-lab/ricmonk and the video demonstration of the experiments can be viewed at https://youtu.be/hOuDQI7CD8w.
|
|
13:30-15:00, Paper WeBT3-CC.8 | Add to My Program |
Gaussian Process-Enhanced, External and Internal Convertible Form-Based Control of Underactuated Balance Robots |
|
Han, Feng | Rutgers University |
Yi, Jingang | Rutgers University |
Keywords: Underactuated Robots, Dynamics, Machine Learning for Robot Control
Abstract: External and internal convertible (EIC) form-based motion control (i.e., EIC-based control) is one of the effective approaches for underactuated balance robots. Through sequential controller design, trajectory tracking of the actuated subsystem and balance of the unactuated subsystem can be achieved simultaneously. However, under certain conditions, uncontrolled robot motion exists under EIC-based control. We first identify these conditions and then propose an enhanced EIC-based control with a Gaussian process (GP) data-driven robot dynamic model. Under the new enhanced EIC-based control, the stability and performance of the closed-loop system are guaranteed. We demonstrate the GP-enhanced control experimentally using two examples of underactuated balance robots.
|
|
WeBT4-CC Oral Session, CC-315 |
Add to My Program |
Multi-Robot Systems V |
|
|
Chair: Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Co-Chair: Garcia de Marina, Hector | Universidad De Granada |
|
13:30-15:00, Paper WeBT4-CC.1 | Add to My Program |
Automation and Artificial Intelligence Technology in Surface Mining: State of the Art, Challenges and Opportunities |
|
Leung, Raymond | The University of Sydney |
Hill, Andrew John | University of Sydney |
Melkumyan, Arman | The University of Sydney |
Keywords: Mining Robotics, Planning, Scheduling and Coordination, Probability and Statistical Methods
Abstract: This survey article provides a synopsis on some of the engineering problems, technological innovations, robotic development and automation efforts encountered in the mining industry---particularly in the Pilbara iron-ore region of Western Australia. The goal is to paint the technology landscape and highlight issues relevant to an engineering audience to raise awareness of AI and automation trends in mining. It assumes the reader has no prior knowledge of mining and builds context gradually through focused discussion and short summaries of common open-pit mining operations. The principal activities that take place may be categorized in terms of resource development, mine-, rail- and port operations. From mineral exploration to ore shipment, there are roughly nine steps in between. These include: geological assessment, mine planning and development, production drilling and assaying, blasting and excavation, transportation of ore and waste, crush and screen, stockpile and load-out, rail network distribution, and ore-car dumping. The objective is to describe these processes and provide insights on some of the challenges/opportunities from the perspective of a decade-long industry-university R&D partnership.
|
|
13:30-15:00, Paper WeBT4-CC.2 | Add to My Program |
Hierarchical Traffic Management of Multi-AGV Systems with Deadlock Prevention Applied to Industrial Environments (I) |
|
Pratissoli, Federico | Università Degli Studi Di Modena E Reggio Emilia |
Brugioni, Riccardo | RSEngineering Srl |
Battilani, Nicola | University of Modena and Reggio Emilia |
Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Keywords: Multi-Robot Systems, Factory Automation, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper concerns the coordination and traffic management of a group of Automated Guided Vehicles (AGVs) moving in a real industrial scenario, such as an automated factory or warehouse. The proposed methodology is based on a three-layer control architecture, described as follows: 1) the Top Layer (or Topological Layer) models the traffic of vehicles among the different areas of the environment; 2) the Middle Layer allows the path planner to compute a traffic-sensitive path for each vehicle; 3) the Bottom Layer (or Roadmap Layer) defines the final routes to be followed by each vehicle and coordinates the AGVs over time. In the paper we describe the proposed coordination strategy, which is executed once the routes are computed and aims to prevent congestion, collisions, and deadlocks. The coordination algorithm exploits a novel deadlock prevention approach based on time-expanded graphs. Moreover, the presented control architecture aims at grounding theoretical methods in an industrial application by facing the typical practical issues such as graph difficulties (load/unload locations, weak connections), a predefined roadmap (constrained by the plant layout), vehicle errors, dynamic obstacles, etc. In this paper we propose a flexible and robust methodology for multi-AGV traffic-aware management. Moreover, we propose a coordination algorithm, which does not rely on ad hoc assumptions or rules, to prevent collisions and deadlocks and to deal
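A minimal sketch of planning on a time-expanded graph, the structure underlying the deadlock-prevention approach: each state is a (node, time) pair, a vehicle may wait or move each step, and cells reserved by other vehicles are avoided (illustrative only; the paper's reservation bookkeeping is more elaborate):

```python
from collections import deque

def plan_route(graph, start, goal, horizon, reserved):
    # BFS over the time-expanded graph. `graph` maps node -> neighbours,
    # `reserved` is a set of (node, t) pairs already claimed by other
    # AGVs; avoiding them prevents collisions (and, with suitable
    # bookkeeping, deadlocks). Returns one node per time step.
    frontier = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while frontier:
        node, t, path = frontier.popleft()
        if node == goal:
            return path
        if t == horizon:
            continue
        for nxt in [node] + graph.get(node, []):  # wait in place, or move
            state = (nxt, t + 1)
            if state in reserved or state in seen:
                continue
            seen.add(state)
            frontier.append((nxt, t + 1, path + [nxt]))
    return None  # no conflict-free route within the horizon
```

With `('B', 1)` reserved by another vehicle, a route from A to C on the chain A-B-C waits one step at A before moving, instead of colliding at B.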
|
|
13:30-15:00, Paper WeBT4-CC.3 | Add to My Program |
Task Allocation in Heterogeneous Multi-Robot Systems Based on Preference-Driven Hedonic Game |
|
Zhang, Liwang | National University of Defense Technology |
Li, Minglong | National University of Defense Technology |
Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Yang, Shaowu | National University of Defense Technology |
Keywords: Multi-Robot Systems, Search and Rescue Robots, Cooperating Robots
Abstract: Multiple preferences between robots and tasks have been largely overlooked in previous research on Multi-Robot Task Allocation (MRTA) problems. In this paper, we propose a preference-driven approach based on a hedonic game to address the task allocation problem of multi-robot systems in emergency rescue scenarios. We present a distributed framework considering various preferences between robots and tasks to determine the division of coalitions in such problems and evaluate the scalability and adaptability of our algorithm through relevant experiments. Furthermore, considering the strict communication limitations in emergency rescue scenarios, we have verified that our algorithm can efficiently converge to a Nash-stable coalition partition even in conditions of insufficient communication distance.
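The hedonic-game mechanism can be sketched as best-response dynamics: each robot repeatedly deviates to the coalition it prefers until no robot benefits from moving, i.e. a Nash-stable partition. The `utility` argument below is a hypothetical placeholder for the paper's preference model, and convergence of this naive centralized loop is only guaranteed for well-behaved preferences:

```python
def nash_stable_partition(robots, tasks, utility):
    # utility(robot, task, other_members) scores how much `robot` likes
    # joining `task`'s coalition given who else is in it (placeholder
    # for the paper's preference-driven model).
    assign = {r: tasks[0] for r in robots}  # start with everyone on task 0
    changed = True
    while changed:  # loop until no robot wants to deviate
        changed = False
        for r in robots:
            def u(t):
                others = [x for x in robots if x != r and assign[x] == t]
                return utility(r, t, others)
            best = max(tasks, key=u)
            if u(best) > u(assign[r]):  # strictly profitable deviation
                assign[r] = best
                changed = True
    return assign  # Nash-stable: no robot gains by switching alone
```

The distributed version in the paper would run such deviations locally under communication constraints rather than in one global loop.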
|
|
13:30-15:00, Paper WeBT4-CC.4 | Add to My Program |
Persistent Monitoring of Multiple Moving Targets Using High Order Control Barrier Functions |
|
Balandi, Lorenzo | Centre Inria De l'Université De Rennes |
De Carli, Nicola | CNRS |
Robuffo Giordano, Paolo | Irisa Cnrs Umr6074 |
Keywords: Multi-Robot Systems, Sensor Networks, Cooperating Robots
Abstract: This paper considers the problem of persistently monitoring a set of moving targets using a team of aerial vehicles. Each agent in the network is assumed to be equipped with a camera with limited range and Field of View (FoV) providing bearing measurements, and it implements an Information Consensus Filter (ICF) to estimate the state of the target(s). The ICF can be proven to be uniformly globally exponentially stable under a Persistency of Excitation (PE) condition. We then propose a distributed control scheme that maintains a prescribed minimum PE level so as to ensure filter convergence. At the same time, the agents in the group are also allowed to perform additional tasks of interest while maintaining collective observability of the target(s). In order to enforce satisfaction of the observability constraint, we leverage two main tools: (i) the weighted Observability Gramian with a forgetting factor as a measure of the cumulative acquired information, and (ii) High Order Control Barrier Functions (HOCBF) as a means to maintain a minimum level of observability for the targets. Simulation results are reported to show the effectiveness of this approach.
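Tool (i), the weighted Observability Gramian with a forgetting factor, can be sketched for a 2D target state with bearing-like measurement directions (illustrative; the paper's Gramian is defined over the full estimator dynamics):

```python
import math

def update_gramian(W, h, lam=0.95):
    # Weighted Gramian with forgetting factor lam: W <- lam*W + h h^T,
    # for a single 2D measurement direction h (e.g. one bearing).
    # W is symmetric 2x2, stored as ((a, b), (b, c)).
    (a, b), (_, c) = W
    na = lam * a + h[0] * h[0]
    nb = lam * b + h[0] * h[1]
    nc = lam * c + h[1] * h[1]
    return ((na, nb), (nb, nc))

def min_eig_2x2(W):
    # Smallest eigenvalue of W: the excitation level an HOCBF-style
    # constraint would keep above a threshold to preserve observability.
    (a, b), (_, c) = W
    tr, det = a + c, a * c - b * b
    return tr / 2.0 - math.sqrt(max(tr * tr / 4.0 - det, 0.0))
```

Measuring along a single direction leaves the smallest eigenvalue at zero (one state direction unobserved); only once the measurement directions span the plane does the minimum eigenvalue, and hence observability, become positive.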
|
|
13:30-15:00, Paper WeBT4-CC.5 | Add to My Program |
EM-Patroller: Entropy Maximized Multi-Robot Patrolling with Steady State Distribution Approximation |
|
Guo, Hongliang | Agency for Science Technology and Research |
Kang, Qi | National University of Singapore |
Yau, Wei-Yun | I2R |
Ang Jr, Marcelo H | National University of Singapore |
Rus, Daniela | MIT |
Keywords: Multi-Robot Systems, Surveillance Robotic Systems
Abstract: This paper investigates the multi-robot patrolling (MuRP) problem in a discrete environment with the objective of approaching the uniform node coverage probability distri | |