Last updated on May 27, 2024. This conference program is tentative and subject to change.
Technical Program for Wednesday May 15, 2024
WeAA1-CC Award Session, CC-Main Hall
Robot Manipulation

Chair: Harada, Kensuke | Osaka University
Co-Chair: Dogar, Mehmet R | University of Leeds

10:30-12:00, Paper WeAA1-CC.1
Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Levine, Sergey | UC Berkeley
Finn, Chelsea | Stanford University
Goldberg, Ken | UC Berkeley
Chen, Lawrence Yunliang | UC Berkeley
Sukhatme, Gaurav | University of Southern California
Dass, Shivin | UT Austin
Pinto, Lerrel | New York University
Zhu, Yuke | The University of Texas at Austin
Zhu, Yifeng | The University of Texas at Austin
Song, Shuran | Columbia University
Mees, Oier | University of California, Berkeley
Pathak, Deepak | Carnegie Mellon University
Fang, Hao-Shu | Shanghai Jiao Tong University
Christensen, Henrik Iskov | UC San Diego
Ding, Mingyu | UC Berkeley
Lee, Youngwoon | University of California, Berkeley
Sadigh, Dorsa | Stanford University
Radosavovic, Ilija | UC Berkeley
Bohg, Jeannette | Stanford University
Wang, Xiaolong | UC San Diego
Li, Xuanlin | UC San Diego
Rana, Krishan | Queensland University of Technology
Kawaharazuka, Kento | The University of Tokyo
Matsushima, Tatsuya | The University of Tokyo
Oh, Jihoon | The University of Tokyo
Osa, Takayuki | University of Tokyo
Kroemer, Oliver | Carnegie Mellon University
Kim, Beomjoon | Korea Advanced Institute of Science and Technology
Johns, Edward | Imperial College London
Stulp, Freek | DLR - Deutsches Zentrum Für Luft Und Raumfahrt E.V
Schneider, Jan | Max Planck Institute for Intelligent Systems
Wu, Jiajun | Stanford University
Li, Yunzhu | University of Illinois Urbana-Champaign
Ben Amor, Heni | Arizona State University
Ott, Lionel | ETH Zurich
Martín-Martín, Roberto | University of Texas at Austin
Hausman, Karol | Google Brain
Vuong, Quan | UC San Diego
Sanketi, Pannag | Google
Heess, Nicolas | Google Deepmind
Vanhoucke, Vincent | Google
Pertsch, Karl | UC Berkeley & Stanford University
Schaal, Stefan | Google X
Chi, Cheng | Columbia University
Pan, Chuer | Stanford University
Bewley, Alex | Google
Keywords: Data Sets for Robot Learning, Imitation Learning, Deep Learning Methods
Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to computer vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" cross-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective cross-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160,266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms.


10:30-12:00, Paper WeAA1-CC.2
Towards Generalizable Zero-Shot Manipulation Via Translating Human Interaction Plans

Bharadhwaj, Homanga | Carnegie Mellon University
Gupta, Abhinav | Carnegie Mellon University
Kumar, Vikash | Meta AI
Tulsiani, Shubham | Carnegie Mellon University
Keywords: Machine Learning for Robot Control, Learning from Demonstration, Big Data in Robotics and Automation
Abstract: We pursue the goal of developing robots that can interact zero-shot with generic unseen objects via a diverse repertoire of manipulation skills, and show how passive human videos can serve as a rich source of data for learning such generalist robots. Unlike typical robot learning approaches, which directly learn how a robot should act from interaction data, we adopt a factorized approach that can leverage large-scale human videos to learn how a human would accomplish a desired task (a human 'plan'), followed by 'translating' this plan to the robot's embodiment. Specifically, we learn a human 'plan predictor' that, given a current image of a scene and a goal image, predicts the future hand and object configurations. We combine this with a 'translation' module that learns a plan-conditioned robot manipulation policy and allows following human plans for generic manipulation tasks in a zero-shot manner with no deployment-time training. Importantly, while the plan predictor can leverage large-scale human videos for learning, the translation module requires only a small amount of in-domain data and can generalize to tasks not seen during training. We show that our learned system can perform over 16 manipulation skills that generalize to 40 objects, encompassing 100 real-world tasks for table-top manipulation and diverse in-the-wild manipulation. https://homangab.github.io/hopman/


10:30-12:00, Paper WeAA1-CC.3
Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation

Mejia, Jared | Carnegie Mellon University
Dean, Victoria | Carnegie Mellon University
Hellebrekers, Tess | Meta AI Research
Gupta, Abhinav | Carnegie Mellon University
Keywords: Representation Learning, Sensorimotor Learning, Robot Audition
Abstract: Although pre-training on a large amount of data is beneficial for robot learning, current paradigms only perform large-scale pretraining for visual representations, whereas representations for other modalities are trained from scratch. In contrast to the abundance of visual data, it is unclear what relevant internet-scale data may be used for pretraining other modalities such as tactile sensing. Such pretraining becomes increasingly crucial in the low-data regimes common in robotics applications. In this paper, we address this gap by using contact microphones as an alternative tactile sensor. Our key insight is that contact microphones capture inherently audio-based information, allowing us to leverage large-scale audio-visual pretraining to obtain representations that boost the performance of robotic manipulation. To the best of our knowledge, our method is the first approach leveraging large-scale multisensory pre-training for robotic manipulation.


10:30-12:00, Paper WeAA1-CC.4
SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention

Leal, Isabel | Google Deepmind
Choromanski, Krzysztof | Google DeepMind Robotics
Jain, Deepali | Robotics at Google
Dubey, Avinava | Google
Varley, Jacob | Google
Ryoo, Michael S. | Google, Stony Brook University
Lu, Yao | Google
Liu, Frederick | Google
Sindhwani, Vikas | Google Brain, NYC
Sarlos, Tamas | Google Research
Oslund, Kenneth | Google
Hausman, Karol | Google Brain
Vuong, Quan | UC San Diego
Rao, Kanishka | Google
Keywords: Deep Learning Methods, Deep Learning in Grasping and Manipulation
Abstract: We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment. SARA-RT relies on a new fine-tuning method we propose, called up-training. It converts pre-trained or already fine-tuned Transformer-based robotic policies of quadratic time complexity (including massive billion-parameter vision-language-action models, or VLAs) into their efficient linear-attention counterparts while maintaining high quality. We demonstrate the effectiveness of SARA-RT by speeding up: (a) the class of recently introduced RT-2 models, the first VLA robotic policies pre-trained on internet-scale data, as well as (b) Point Cloud Transformer (PCT) robotic policies operating on large point clouds. We complement our results with a rigorous mathematical analysis providing deeper insight into the phenomenon of SARA.
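The quadratic-to-linear attention conversion at the heart of this abstract can be illustrated with a small sketch. The snippet below shows generic kernelized linear attention, not SARA-RT's learned up-training procedure; the feature map `phi` (a fixed ELU+1) is a stand-in assumption. The point is why a positive feature map lets attention run in O(L) instead of O(L^2): the sum over keys is accumulated once and reused for every query.

```python
import math

def phi(x):
    """Positive feature map (ELU + 1); SARA learns its map, this is a stand-in."""
    return [xi + 1.0 if xi > 0 else math.exp(xi) for xi in x]

def attention_quadratic(Qs, Ks, Vs):
    """Kernel attention computed with the explicit L x L similarity matrix."""
    out = []
    for q in Qs:
        fq = phi(q)
        sims = [sum(a * b for a, b in zip(fq, phi(k))) for k in Ks]
        z = sum(sims)
        out.append([sum(s * v[d] for s, v in zip(sims, Vs)) / z
                    for d in range(len(Vs[0]))])
    return out

def attention_linear(Qs, Ks, Vs):
    """Same result in O(L): accumulate S = sum phi(k) v^T and z = sum phi(k) once."""
    dk, dv = len(Ks[0]), len(Vs[0])
    S = [[0.0] * dv for _ in range(dk)]
    z = [0.0] * dk
    for k, v in zip(Ks, Vs):
        fk = phi(k)
        for i in range(dk):
            z[i] += fk[i]
            for d in range(dv):
                S[i][d] += fk[i] * v[d]
    out = []
    for q in Qs:
        fq = phi(q)
        denom = sum(a * b for a, b in zip(fq, z))
        out.append([sum(fq[i] * S[i][d] for i in range(dk)) / denom
                    for d in range(dv)])
    return out
```

Both functions return identical outputs; only the second avoids materializing the attention matrix, which is what makes linear-attention policies cheaper to deploy on-robot.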


10:30-12:00, Paper WeAA1-CC.5
DenseTact-Mini: An Optical Tactile Sensor for Grasping Multi-Scale Objects from Flat Surfaces

Do, Won Kyung | Stanford University
Dhawan, Ankush | Stanford University
Kitzmann, Mathilda | Stanford University
Kennedy, Monroe | Stanford University
Keywords: Grasping, Force and Tactile Sensing, Grippers and Other End-Effectors
Abstract: Dexterous manipulation, especially of small daily objects, continues to pose complex challenges in robotics. This paper introduces the DenseTact-Mini, an optical tactile sensor with a soft, rounded, smooth gel surface and compact design equipped with a synthetic fingernail. We propose three distinct grasping strategies: tap grasping using adhesion forces such as electrostatic and van der Waals, fingernail grasping leveraging rolling/sliding contact between the object and fingernail, and fingertip grasping with two soft fingertips. Through comprehensive evaluations, the DenseTact-Mini demonstrates a lifting success rate exceeding 90.2% when grasping various objects, including items such as 1mm basil seeds, thin paperclips, and items larger than 15mm such as bearings. This work demonstrates the potential of soft optical tactile sensors for dexterous manipulation and grasping.


10:30-12:00, Paper WeAA1-CC.6
Constrained Bimanual Planning with Analytic Inverse Kinematics

Cohn, Thomas | Massachusetts Institute of Technology
Shaw, Seiji | Massachusetts Institute of Technology
Simchowitz, Max | MIT
Tedrake, Russ | Massachusetts Institute of Technology
Keywords: Bimanual Manipulation, Constrained Motion Planning, Kinematics
Abstract: In order for a bimanual robot to manipulate an object that is held by both hands, it must construct motion plans such that the transformation between its end effectors remains fixed. This amounts to complicated nonlinear equality constraints in the configuration space, which are difficult for trajectory optimizers. In addition, the set of feasible configurations becomes a measure zero set, which presents a challenge to sampling-based motion planners. We leverage an analytic solution to the inverse kinematics problem to parametrize the configuration space, resulting in a lower-dimensional representation where the set of valid configurations has positive measure. We describe how to use this parametrization with existing motion planning algorithms, including sampling-based approaches, trajectory optimizers, and techniques that plan through convex inner-approximations of collision-free space.
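The parametrization idea can be made concrete with a planar toy example (all numbers and the 2R arm model are illustrative assumptions; the paper works with full spatial arms): pick one arm's joints freely, then recover the other arm's joints with analytic 2R inverse kinematics so that the fixed end-effector offset holds. Every point of the lower-dimensional parametrization then maps to a valid constrained configuration, which is exactly why the feasible set regains positive measure.

```python
import math

L1, L2 = 1.0, 1.0        # link lengths (both arms identical; an assumption)
BASE_B = (2.5, 0.0)      # base of arm B relative to arm A's base (assumed)
OFFSET = (0.4, 0.0)      # fixed transform between the two end effectors (assumed)

def fk(q1, q2):
    """Planar 2R forward kinematics."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y

def ik(x, y, elbow=+1):
    """Analytic planar 2R inverse kinematics; None if the target is unreachable."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if abs(c2) > 1.0:
        return None
    q2 = elbow * math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
    return q1, q2

def lift(qa1, qa2, elbow=+1):
    """Map a point of the lower-dimensional parametrization (arm A's joints)
    to a full bimanual configuration satisfying the fixed end-effector offset."""
    ax, ay = fk(qa1, qa2)
    tx = ax + OFFSET[0] - BASE_B[0]       # arm B's end-effector target
    ty = ay + OFFSET[1] - BASE_B[1]
    sol = ik(tx, ty, elbow)
    if sol is None:
        return None
    return (qa1, qa2) + sol
```

Sampling-based planners can then sample `(qa1, qa2)` directly and `lift` each sample, instead of rejecting almost every sample drawn in the full joint space.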


WeAA2-CC Award Session, CC-301
Robot Vision

Chair: Chaumette, Francois | Inria Center at University of Rennes
Co-Chair: Hashimoto, Koichi | Tohoku University

10:30-12:00, Paper WeAA2-CC.1
Deep Evidential Uncertainty Estimation for Semantic Segmentation under Out-Of-Distribution Obstacles

Ancha, Siddharth | Massachusetts Institute of Technology
Osteen, Philip | U.S. Army Research Laboratory
Roy, Nicholas | Massachusetts Institute of Technology
Keywords: Deep Learning for Visual Perception, Deep Learning Methods, Visual Learning
Abstract: In order to navigate safely and reliably in novel environments, robots must estimate perceptual uncertainty when confronted with out-of-distribution (OOD) obstacles not seen in training data. We present a method to accurately estimate pixel-wise uncertainty in semantic segmentation without requiring real or synthetic OOD examples at training time. From a shared per-pixel latent feature representation, a classification network predicts a categorical distribution over semantic labels, while a normalizing flow estimates the probability density of features under the training distribution. The label distribution and density estimates are combined in a Dirichlet-based evidential uncertainty framework that efficiently computes epistemic and aleatoric uncertainty in a single neural network forward pass. Our method is enabled by three key contributions. First, we simplify the problem of learning a transformation to the training data density by starting from a fitted Gaussian mixture model instead of the conventional standard normal distribution. Second, we learn a richer and more expressive latent pixel representation to aid OOD detection by training a decoder to reconstruct input image patches. Third, we perform theoretical analysis of the loss function used in the evidential uncertainty framework and propose a principled objective that more accurately balances training the classification and density estimation networks. We demonstrate the accuracy of our uncertainty estimation approach under long-tail OOD obstacle classes for semantic segmentation in both off-road and urban driving environments.
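The combination step, in which the label distribution and the density estimate jointly produce Dirichlet concentrations in a single pass, can be sketched with a common generic evidential recipe (the `scale` parameter and the exact alpha construction are assumptions for illustration; the paper's formulation and its principled loss are more involved):

```python
import math

def dirichlet_uncertainty(class_probs, density, scale=10.0):
    """Combine per-pixel class probabilities with a feature-density estimate
    into Dirichlet concentration parameters.  Low density (OOD pixel) means
    low total evidence, which yields high epistemic uncertainty."""
    alphas = [1.0 + scale * density * p for p in class_probs]
    total = sum(alphas)
    mean = [a / total for a in alphas]
    epistemic = len(alphas) / total                       # vacuity: high when evidence is low
    aleatoric = -sum(m * math.log(m) for m in mean)       # entropy of the expected categorical
    return epistemic, aleatoric
```

Both uncertainties come out of one forward pass, matching the single-pass property the abstract emphasizes: no sampling or ensembling is required.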


10:30-12:00, Paper WeAA2-CC.2
NGEL-SLAM: Neural Implicit Representation-Based Global Consistent Low-Latency SLAM System

Mao, Yunxuan | Zhejiang University
Yu, Xuan | Zhejiang University
Zhang, Zhuqing | Zhejiang University
Wang, Kai | HuaWei
Wang, Yue | Zhejiang University
Xiong, Rong | Zhejiang University
Liao, Yiyi | Zhejiang University
Keywords: SLAM
Abstract: Neural implicit representations have emerged as a promising solution to the challenges of Simultaneous Localization and Mapping (SLAM) in indoor scenes. This paper presents NGEL-SLAM, a low-latency, globally consistent SLAM system that utilizes a neural implicit scene representation. To ensure global consistency, our system incorporates loop closure in the tracking module and maintains a globally consistent map by representing the scene with multiple neural implicit fields and performing a quick adjustment upon loop closure. The fast convergence and rapid response to loop closure make our system truly low-latency while achieving global consistency. The neural implicit representation enables the rendering of high-fidelity RGB-D images and the extraction of explicit, dense, and interactive surfaces. Experiments were conducted on both synthetic and real-world datasets to evaluate the effectiveness of the proposed approach. The results demonstrate the tracking and mapping accuracy and the low-latency performance of our system.


10:30-12:00, Paper WeAA2-CC.3
SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking

Lin, Yu | Northeastern University
Li, Zhiheng | Northeastern University
Cui, Yubo | Northeastern University
Fang, Zheng | Northeastern University
Keywords: Visual Tracking, Deep Learning for Visual Perception, Computer Vision for Transportation
Abstract: 3D single object tracking (SOT) is an important and challenging task for autonomous driving and mobile robotics. Most existing methods perform tracking between two consecutive frames while ignoring the motion patterns of the target over a series of frames, which causes performance degradation in scenes with sparse points. To break through this limitation, we introduce a sequence-to-sequence tracking paradigm and a tracker named SeqTrack3D that captures target motion across continuous frames. Unlike previous methods, which primarily adopted one of three strategies (matching two consecutive point clouds, predicting relative motion, or utilizing sequential point clouds to address feature degradation), our SeqTrack3D combines both historical point clouds and bounding-box sequences. This novel approach ensures robust tracking by leveraging location priors from historical boxes, even in scenes with sparse points. Extensive experiments conducted on large-scale datasets show that SeqTrack3D achieves new state-of-the-art performance, improving by 6.00% on NuScenes and 14.13% on the Waymo dataset.


10:30-12:00, Paper WeAA2-CC.4
Ultrafast Square-Root Filter-Based VINS

Peng, Yuxiang | University of Delaware
Chen, Chuchu | University of Delaware
Huang, Guoquan | University of Delaware
Keywords: Localization, Visual-Inertial SLAM, SLAM
Abstract: In this paper, we strongly advocate square-root covariance (instead of information) filtering for visual-inertial navigation, in particular on resource-constrained edge devices, because of its superior efficiency and numerical stability. Although Visual-Inertial Navigation Systems (VINS) have made tremendous progress in recent years, they still face resource constraints and numerical instability on embedded systems with limited word lengths. To overcome these challenges, we develop an ultrafast and numerically stable square-root filter (SRF)-based VINS algorithm (i.e., SR-VINS). The numerical stability of the proposed SR-VINS is inherited from the adoption of the square-root covariance, while its exceptional efficiency is largely enabled by a novel SRF update method based on our new permuted-QR (P-QR) decomposition, which fully utilizes and properly maintains the upper-triangular structure of the square-root covariance matrix. Furthermore, we choose a special ordering of the state variables that is amenable to (P-)QR operations in the SRF propagation and update and prevents unnecessary computation. The proposed SR-VINS is validated extensively through numerical studies, demonstrating that when state-of-the-art (SOTA) filters have numerical difficulties, our SR-VINS retains superior numerical stability and, remarkably, achieves efficient and robust performance with 32-bit single-precision floats at a speed nearly twice as fast as the SOTA methods. We also conduct comprehensive real-world experiments to validate the efficiency, accuracy, and robustness of the proposed SR-VINS.
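The kind of square-root step this abstract builds on can be sketched as follows. This is a textbook SRF propagation via QR, not the paper's P-QR variant, and plain Python lists stand in for optimized linear algebra: with P = U^T U, the factor of F P F^T + Q is obtained as the R factor of the QR decomposition of the stacked matrix [U F^T; Q^{1/2}], so the covariance itself is never formed, which is the source of the numerical robustness.

```python
def qr_R(A):
    """R factor of a thin QR via modified Gram-Schmidt (adequate for the
    small, well-conditioned blocks used in this illustration)."""
    m, n = len(A), len(A[0])
    V = [row[:] for row in A]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = sum(V[i][j] ** 2 for i in range(m)) ** 0.5
        q = [V[i][j] / R[j][j] for i in range(m)]
        for k in range(j + 1, n):
            R[j][k] = sum(q[i] * V[i][k] for i in range(m))
            for i in range(m):
                V[i][k] -= R[j][k] * q[i]
    return R

def propagate_sqrt(U, F, Qsqrt):
    """Square-root covariance propagation: with P = U^T U, the factor of
    F P F^T + Q is the R factor of QR([U F^T; Qsqrt]).  No covariance matrix
    is ever squared up, so the effective condition number is halved."""
    n = len(U)
    UFt = [[sum(U[i][k] * F[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return qr_R(UFt + Qsqrt)  # stack rows, then triangularize
```

The paper's contribution is an update (not just propagation) that preserves triangularity cheaply; the sketch above only shows why QR-based factor updates avoid the loss of precision that plain covariance recursions suffer in single precision.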


10:30-12:00, Paper WeAA2-CC.5
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

Zhang, Zichen | Allen Institute for AI
Li, Yunshuang | University of Pennsylvania
Bastani, Osbert | University of Pennsylvania
Gupta, Abhishek | University of Washington
Jayaraman, Dinesh | University of Pennsylvania
Ma, Yecheng Jason | University of Pennsylvania
Weihs, Luca | Allen Institute for AI
Keywords: Learning from Demonstration, Imitation Learning, Reinforcement Learning
Abstract: Real-world robotic tasks stretch over extended horizons and encompass multiple stages. Learning long-horizon manipulation tasks, however, is a long-standing challenge, and demands decomposing the overarching task into several manageable subtasks to facilitate policy learning and generalization to unseen tasks. Prior task decomposition methods require task-specific knowledge, are computationally intensive, and cannot readily be applied to new tasks. To address these shortcomings, we propose Universal Visual Decomposer (UVD), an off-the-shelf task decomposition method for visual long-horizon manipulation using pre-trained visual representations for robotic control. At a high level, UVD discovers subgoals by detecting phase shifts in the embedding space of the pre-trained representation. Operating purely on visual demonstrations without auxiliary information, UVD can effectively extract visual subgoals embedded in the videos, while incurring zero additional training cost on top of standard visuomotor policy training. Goal-conditioned policies learned with UVD-discovered subgoals exhibit significantly improved compositional generalization at test time to unseen tasks. Furthermore, UVD-discovered subgoals can be used to construct goal-based reward shaping that jump-starts temporally extended exploration for reinforcement learning. We extensively evaluate UVD on both simulation and real-world tasks, and in all cases, UVD substantially outperforms baselines across imitation and reinforcement learning settings on in-domain and out-of-domain task sequences alike, validating the clear advantage of automated visual task decomposition within the simple, compact UVD framework.
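The phase-shift heuristic at UVD's core can be sketched in a few lines. This is a simplified stand-in, assuming scalar embeddings and an exact monotonicity test, whereas UVD operates on learned visual representations with a more robust criterion: scanning a demonstration backwards, frames within one phase recede monotonically (in embedding distance) from that phase's subgoal, and a break in that monotonicity marks a subgoal boundary.

```python
def discover_subgoals(embs, dist):
    """Return subgoal frame indices by scanning the demonstration backwards.
    Within a phase, earlier frames are monotonically farther from the phase's
    subgoal; a drop in that distance signals a phase shift."""
    n = len(embs)
    subgoals = [n - 1]            # the final frame is always a subgoal
    goal = embs[-1]
    prev = 0.0                    # dist(goal, goal)
    for i in range(n - 2, -1, -1):
        d = dist(embs[i], goal)
        if d >= prev:             # still receding from the current subgoal
            prev = d
        else:                     # phase shift: frame i+1 ends the previous phase
            subgoals.append(i + 1)
            goal = embs[i + 1]
            prev = dist(embs[i], goal)
    return sorted(subgoals)
```

Because the criterion only reads off distances in a frozen embedding space, it adds no training cost on top of policy learning, which is the property the abstract highlights.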


10:30-12:00, Paper WeAA2-CC.6
HEGN: Hierarchical Equivariant Graph Neural Network for 9DoF Point Cloud Registration

Misik, Adam | Siemens Technology, Technical University Munich
Salihu, Driton | Technical University Munich
Su, Xin | Technical University of Munich
Brock, Heike | Siemens AG
Steinbach, Eckehard | Technical University of Munich
Keywords: Deep Learning for Visual Perception, Visual Learning, Computer Vision for Automation
Abstract: Given its wide application in robotics, point cloud registration is a widely researched topic. Conventional methods aim to find a rotation and translation that align two point clouds in 6 degrees of freedom (DoF). However, certain tasks in robotics, such as category-level pose estimation, involve non-uniformly scaled point clouds, requiring a 9DoF transform for accurate alignment. We propose HEGN, a novel equivariant graph neural network for 9DoF point cloud registration. HEGN utilizes equivariance to rotation, translation, and scaling to estimate the transformation without relying on point correspondences. Based on graph representations for both point clouds, we extract equivariant node features aggregated in their local, cross-, and global context. In addition, we introduce a novel node pooling mechanism that leverages the cross-context importance of nodes to pool the graph representation. By repeating the feature extraction and node pooling, we obtain a graph hierarchy. Finally, we determine rotation and translation by aligning equivariant features aggregated over the graph hierarchy. To estimate scaling, we leverage scale information contained in the vector norm of the equivariant features. We evaluate the effectiveness of HEGN through experiments with the synthetic ModelNet40 dataset and the real-world ScanObjectNN dataset. The results show the superior performance of HEGN in 9DoF point cloud registration and its competitive performance in conventional 6DoF point cloud registration.


WeAT1-CC Oral Session, CC-303
Motion and Path Planning I

Chair: Tsagarakis, Nikos | Istituto Italiano Di Tecnologia
Co-Chair: Okuda, Hiroyuki | Nagoya University

10:30-12:00, Paper WeAT1-CC.1
Autonomous Navigation with Online Replanning and Recovery Behaviors for Wheeled-Legged Robots Using Behavior Trees

De Luca, Alessio | Istituto Italiano Di Tecnologia
Muratore, Luca | Istituto Italiano Di Tecnologia
Tsagarakis, Nikos | Istituto Italiano Di Tecnologia
Keywords: Motion and Path Planning, Reactive and Sensor-Based Planning, Field Robots
Abstract: Autonomous navigation in cluttered and unstructured terrains remains a challenging task for legged and wheeled mobile robots. To accomplish it, online planners must incorporate new terrain information perceived while the robot moves through its environment. While hybrid-mobility robots offer high flexibility in traversing challenging terrains by leveraging the advantages of both wheeled and legged locomotion, effective hybrid planning of mobility actions that transparently combines both modes of locomotion has not been extensively explored. In this work, we present a hierarchical, online, hybrid primitive-based planner for autonomous navigation with wheeled-legged robots. The framework is handled by a Behavior Tree (BT) and incorporates recovery methods to deal with possible failures during execution of the navigation/mobility plan. The framework was evaluated in multiple randomly generated, irregular, and heavily cluttered simulated environments and in real-world trials using the CENTAURO robot platform. With these experiments, we demonstrated autonomous capabilities without any human intervention, even in the case of collisions or planner failures.
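The failure-recovery pattern described here maps naturally onto two classic BT composites. A minimal sketch follows; the node names and the recovery structure are illustrative, not the paper's actual tree:

```python
class Sequence:
    """Tick children in order; the first non-SUCCESS status aborts the tick."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for c in self.children:
            status = c.tick()
            if status != "SUCCESS":
                return status
        return "SUCCESS"

class Fallback:
    """Tick children until one does not fail; recovery behaviors hang here."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for c in self.children:
            status = c.tick()
            if status != "FAILURE":
                return status
        return "FAILURE"

class Action:
    """Leaf wrapping a callable that returns SUCCESS/FAILURE/RUNNING."""
    def __init__(self, name, fn, log):
        self.name, self.fn, self.log = name, fn, log
    def tick(self):
        self.log.append(self.name)
        return self.fn()

# A navigation tick that falls back to a recovery behavior when the
# plan-and-execute branch fails (e.g., after a collision).
log = []
tree = Fallback(
    Sequence(Action("plan", lambda: "SUCCESS", log),
             Action("execute", lambda: "FAILURE", log)),   # simulated failure
    Action("recover_and_replan", lambda: "SUCCESS", log),
)
status = tree.tick()
```

Because the Fallback only ticks the recovery branch when the nominal branch fails, the recovery behaviors cost nothing during normal operation and the tree remains reactive at every tick.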


10:30-12:00, Paper WeAT1-CC.2
Signal Temporal Logic Neural Predictive Control

Meng, Yue | Massachusetts Institute of Technology
Fan, Chuchu | Massachusetts Institute of Technology
Keywords: Motion and Path Planning, Machine Learning for Robot Control, AI-Based Methods
Abstract: Ensuring safety and meeting temporal specifications are critical challenges for long-term robotic tasks. Signal temporal logic (STL) has been widely used to systematically and rigorously specify these requirements. However, traditional methods of finding a control policy under STL requirements are computationally complex and do not scale to high-dimensional systems or systems with complex nonlinear dynamics. Reinforcement learning (RL) methods can learn policies that satisfy STL specifications via hand-crafted or STL-inspired rewards, but may exhibit unexpected behaviors due to ambiguity and sparsity in the reward. In this paper, we propose a method to directly learn a neural network controller that satisfies the requirements specified in STL. Our controller learns to roll out trajectories to maximize the STL robustness score during training. At test time, similar to Model Predictive Control (MPC), the learned controller predicts a trajectory within a planning horizon to ensure satisfaction of the STL requirement in deployment. A backup policy is designed to ensure safety when our controller fails. Our approach can adapt to various initial conditions and environmental parameters. We conduct experiments on six tasks, where our method with the backup policy outperforms classical methods (MPC, STL solvers) and model-free and model-based RL methods in STL satisfaction rate, especially on tasks with complex STL specifications, while being 10X-100X faster than the classical methods.
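The STL robustness score the controller maximizes has a simple quantitative semantics for the basic temporal operators. A minimal sketch over discrete-time scalar signals (standard STL semantics, independent of the paper's network architecture):

```python
def rho_always(signal, mu):
    """Robustness of G(mu(x) > 0): the worst-case margin over the horizon."""
    return min(mu(x) for x in signal)

def rho_eventually(signal, mu):
    """Robustness of F(mu(x) > 0): the best margin achieved at any step."""
    return max(mu(x) for x in signal)

def rho_and(r1, r2):
    """Conjunction takes the weaker of the two robustness scores."""
    return min(r1, r2)
```

A trajectory satisfies a formula iff its robustness is positive; "eventually reach the goal region while always staying clear of the obstacle" scores min(best margin to the goal, worst clearance), and maximizing this differentiable-in-practice score is what replaces a hand-crafted reward.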


10:30-12:00, Paper WeAT1-CC.3
Multi-Query TDSP for Path Planning in Time-Varying Flow Fields

Lee, James Ju Heon | University of Technology Sydney
Yoo, Chanyeol | University of Technology Sydney
Anstee, Stuart David | Defence Science and Technology Group
Fitch, Robert | University of Technology Sydney
Keywords: Motion and Path Planning, Marine Robotics
Abstract: Many applications of path planning in time-varying flow fields, particularly in areas such as marine robotics and ship routing, can be modelled as instances of the time-dependent shortest path (TDSP) problem. Although there are no known polynomial-time solutions to TDSP in general, our recent work has identified a tractable case where the flow is modelled as piecewise constant. Extending this method to allow computational reuse in larger multi-query problems, however, requires additional thought. This paper shows that the piecewise-linear form of the cost function employed in previous work can be used to build an analogue of a shortest-path tree, thereby enabling optimal concatenation of sub-problem solutions in the absence of an optimal substructure, and without uniform time discretisation. We present a framework for multi-query TDSP that finds an optimal path passing through a defined sequence of waypoints and is computationally efficient. A performance comparison in simulation shows large (up to 100x) speedups compared to a naive approach. This result is significant for applications such as ship routing, where route evaluation is a desirable capability.


10:30-12:00, Paper WeAT1-CC.4
CTopPRM: Clustering Topological PRM for Planning Multiple Distinct Paths in 3D Environments

Novosad, Matej | Faculty of Electrical Engineering, Czech Technical University in Prague
Penicka, Robert | Czech Technical University in Prague
Vonasek, Vojtech | Czech Technical University in Prague
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination
Abstract: We propose a new method called Clustering Topological PRM (CTopPRM) for finding multiple distinct paths in 3D cluttered environments. Finding such distinct paths is useful in many applications. Among others, multiple distinct paths are necessary for optimization-based trajectory planners, where found trajectories are restricted to a single homotopy class of a given path. Distinct paths can also be used to guide sampling-based motion planning and thus increase the effectiveness of planning in environments with narrow passages. A graph-based representation called a roadmap is commonly used for path planning and also for finding multiple distinct paths. Yet, challenging environments with multiple narrow passages require a densely sampled roadmap to capture the connectivity of the environment, and searching such a dense roadmap for multiple paths is computationally too expensive. The majority of existing methods construct only a sparse roadmap, which, however, struggles to find all distinct paths in challenging environments. To this end, we propose CTopPRM, which creates a sparse graph by clustering an initially sampled dense roadmap. Such a reduced roadmap allows fast identification of the homotopically distinct paths captured in the dense roadmap. We show that, compared to existing methods, CTopPRM improves the probability of finding all distinct paths by almost 20% within the same run-time. The source code of our method is released as an open-source package.


10:30-12:00, Paper WeAT1-CC.5
Stein Variational Guided Model Predictive Path Integral Control: Proposal and Experiments with Fast Maneuvering Vehicles

Honda, Kohei | Nagoya University
Akai, Naoki | Nagoya University
Suzuki, Kosuke | Nagoya University
Aoki, Mizuho | Nagoya University
Hosogaya, Hirotaka | Nagoya University
Okuda, Hiroyuki | Nagoya University
Suzuki, Tatsuya | Nagoya University
Keywords: Motion and Path Planning, Optimization and Optimal Control, Collision Avoidance
Abstract: This paper presents a novel Stochastic Optimal Control (SOC) method based on Model Predictive Path Integral control (MPPI), named Stein Variational Guided MPPI (SVG-MPPI), designed to handle rapidly shifting multimodal optimal action distributions. While MPPI can find a Gaussian-approximated optimal action distribution in closed form, i.e., without iterative solution updates, it struggles with the multimodality of the optimal distributions. This is due to the less representative nature of the Gaussian. To overcome this limitation, our method aims to identify a target mode of the optimal distribution and guide the solution to converge to fit it. In the proposed method, the target mode is roughly estimated using a modified Stein Variational Gradient Descent (SVGD) method and embedded into the MPPI algorithm to find a closed-form "mode-seeking" solution that covers only the target mode, thus preserving the fast convergence property of MPPI. Our simulation and real-world experimental results demonstrate that SVG-MPPI outperforms both the original MPPI and other state-of-the-art sampling-based SOC algorithms in terms of path-tracking and obstacle-avoidance capabilities.
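The closed-form Gaussian-approximated update that MPPI (and hence SVG-MPPI) builds on is compact enough to sketch. Below is a vanilla MPPI iteration on a toy 1-D integrator; SVG-MPPI additionally steers the samples toward a target mode with SVGD, which is omitted here, and all parameter values are illustrative:

```python
import math
import random

def mppi_update(u_nominal, rollout_cost, n_samples=256, sigma=0.5, lam=1.0, rng=None):
    """One MPPI iteration: perturb the nominal control sequence with Gaussian
    noise, score each rollout, and return the softmax-weighted average of the
    samples (the closed-form update the abstract refers to)."""
    rng = rng or random.Random(0)  # fixed seed for a reproducible illustration
    H = len(u_nominal)
    samples, costs = [], []
    for _ in range(n_samples):
        u = [ui + rng.gauss(0.0, sigma) for ui in u_nominal]
        samples.append(u)
        costs.append(rollout_cost(u))
    c_min = min(costs)                                   # for numerical stability
    w = [math.exp(-(c - c_min) / lam) for c in costs]    # path-integral weights
    z = sum(w)
    return [sum(w[k] * samples[k][t] for k in range(n_samples)) / z
            for t in range(H)]
```

When the true optimal action distribution is multimodal, this weighted mean averages across modes, which is exactly the failure the paper's mode-seeking variant is designed to avoid.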


10:30-12:00, Paper WeAT1-CC.6
An Efficient Solution to the 2D Visibility Problem in Cartesian Grid Maps and Its Application in Heuristic Path Planning

Ibrahim, Ibrahim | KU Leuven
Gillis, Joris | KU Leuven
Decré, Wilm | Katholieke Universiteit Leuven
Swevers, Jan | KU Leuven
Keywords: Computational Geometry, Simulation and Animation, Motion and Path Planning
Abstract: This paper introduces a novel, lightweight method to solve the visibility problem for 2D grids. The proposed method evaluates the existence of lines-of-sight from a source point to all other grid cells in a single pass with no preprocessing and independently of the number and shape of obstacles. It has a compute and memory complexity of O(n), where n = n_x × n_y is the size of the grid, and requires at most ten arithmetic operations per grid cell. In the proposed approach, we use a linear first-order hyperbolic partial differential equation to transport the visibility quantity in all directions. In order to accomplish that, we use an entropy-satisfying upwind scheme that converges to the true visibility polygon as the step size goes to zero. This dynamic-programming approach allows the evaluation of visibility for an entire grid much faster than typical algorithms. We provide a practical application of our proposed algorithm by posing the visibility quantity as a heuristic and implementing a deterministic, local-minima-free path planner, setting apart the proposed planner from traditional methods. Lastly, we provide the necessary algorithms and an open-source implementation of the proposed methods.
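A minimal version of the single-pass idea can be written down directly. This is a sketch of an upwind visibility transport on a grid under my own simplifying assumptions; the paper's entropy-satisfying scheme and its convergence analysis are more careful than this illustration:

```python
def visibility(grid, sx, sy):
    """Visibility field from source (sx, sy) on an occupancy grid
    (grid[i][j] == 1 means blocked).  Each cell upwind-interpolates the two
    neighbours one step closer to the source, so visibility is transported
    outward along lines of sight; each cell needs only a handful of
    arithmetic operations, matching the O(n) single-pass claim."""
    nx, ny = len(grid), len(grid[0])
    v = [[0.0] * ny for _ in range(nx)]
    v[sx][sy] = 0.0 if grid[sx][sy] else 1.0
    for si in (1, -1):                      # sweep the four quadrants
        for sj in (1, -1):
            i = sx
            while 0 <= i < nx:
                j = sy
                while 0 <= j < ny:
                    if (i, j) != (sx, sy):
                        di, dj = abs(i - sx), abs(j - sy)
                        vi = v[i - si][j] if di else 0.0
                        vj = v[i][j - sj] if dj else 0.0
                        v[i][j] = 0.0 if grid[i][j] else \
                            (di * vi + dj * vj) / (di + dj)
                    j += sj
                i += si
    return v
```

Thresholding the field (e.g., v > 0.5) approximates the visibility polygon of the source; used directly, the smooth field can serve as the planning heuristic the abstract describes.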
|
|
10:30-12:00, Paper WeAT1-CC.7 |
Efficient Clothoid Tree-Based Local Path Planning for Self-Driving Robots |
|
Lee, Minhyeong | Seoul National University |
Lee, Dongjun | Seoul National University |
Keywords: Motion and Path Planning, Wheeled Robots
Abstract: In this paper, we propose a real-time clothoid tree-based path planning method for self-driving robots. Clothoids, curves that exhibit linear curvature profiles, play an important role in road design and path planning due to their appealing properties. Nevertheless, their real-time application faces considerable challenges, primarily stemming from the lack of a closed-form clothoid expression. To address these challenges, we introduce two innovative techniques: 1) an efficient and precise clothoid approximation using the Gauss-Legendre quadrature; and 2) a data-efficient decoder for interpolating clothoid splines that leverages the symmetry and similarity of clothoids. These techniques are demonstrated with numerical examples. The clothoid approximation ensures an accurate and smooth representation of the curve, and the clothoid spline decoder effectively accelerates the clothoid tree exploration by relaxing the problem constraints and reducing the problem size. Both techniques are integrated into our path planning algorithm and evaluated in various driving scenarios.
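Since the abstract does not give the authors' approximation formulas, here is a minimal sketch of the underlying idea: a clothoid has heading theta(t) = theta0 + kappa0*t + 0.5*c*t^2 (linear curvature), so its endpoint is a Fresnel-type integral that an n-point Gauss-Legendre rule evaluates accurately without a closed form. The function name and default parameters are illustrative.

```python
import numpy as np

def clothoid_point(s, theta0=0.0, kappa0=0.0, c=1.0, n=12):
    """Approximate the endpoint (x, y) of a clothoid of arc length s.

    The position is the integral of (cos theta(t), sin theta(t)) over
    [0, s], evaluated with an n-point Gauss-Legendre rule mapped from
    the reference interval [-1, 1] to [0, s].
    """
    nodes, weights = np.polynomial.legendre.leggauss(n)
    t = 0.5 * s * (nodes + 1.0)   # map quadrature nodes to [0, s]
    w = 0.5 * s * weights         # rescale weights accordingly
    theta = theta0 + kappa0 * t + 0.5 * c * t**2
    return np.sum(w * np.cos(theta)), np.sum(w * np.sin(theta))
```

Setting c = 0 recovers a circular arc (or a straight line if kappa0 is also 0), which gives a quick sanity check against the known closed-form endpoints.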
|
|
10:30-12:00, Paper WeAT1-CC.8 |
Decentralized Lifelong Path Planning for Multiple Ackerman Car-Like Robots |
|
Guo, Teng | Rutgers University |
Yu, Jingjin | Rutgers University |
Keywords: Motion and Path Planning, Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: Path planning for multiple non-holonomic robots in continuous domains constitutes a difficult robotics challenge with many applications. Despite significant recent progress on the topic, computationally efficient and high-quality solutions are lacking, especially in lifelong settings where robots must continuously take on new tasks. In this work, we make it possible to extend key ideas enabling state-of-the-art (SOTA) methods for multi-robot planning in discrete domains to the motion planning of multiple Ackerman (car-like) robots in lifelong settings, yielding high-performance centralized and decentralized planners. Our planners compute trajectories that allow the robots to reach precise SE(2) goal poses. The effectiveness of our methods is thoroughly evaluated and confirmed using both simulation and real-world experiments.
|
|
10:30-12:00, Paper WeAT1-CC.9 |
Energy-Aware Ergodic Search: Continuous Exploration for Multi-Agent Systems with Battery Constraints |
|
Seewald, Adam | Yale University |
Lerch, Cameron | Yale University |
Chancán, Marvin | Yale University |
Dollar, Aaron | Yale University |
Abraham, Ian | Yale University |
Keywords: Motion and Path Planning, Energy and Environment-Aware Automation
Abstract: Continuous exploration without interruption is important in scenarios such as search and rescue and precision agriculture, where consistent presence is needed to detect events over large areas. Ergodic search already derives continuous trajectories in these scenarios so that a robot spends more time in areas with high information density. However, existing literature on ergodic search does not consider the robot's energy constraints, limiting how long a robot can explore. In fact, if the robots are battery-powered, it is physically not possible to continuously explore on a single battery charge. Our paper tackles this challenge, integrating ergodic search methods with energy-aware coverage. We trade off battery usage and coverage quality, maintaining uninterrupted exploration by at least one agent. Our approach derives an abstract battery model for future state-of-charge estimation and extends canonical ergodic search to ergodic search under battery constraints. Empirical data from simulations and real-world experiments demonstrate the effectiveness of our energy-aware ergodic search, which ensures continuous exploration and guarantees spatial coverage.
|
|
WeAT2-CC Oral Session, CC-311 |
Actuation |
|
|
Chair: Thomas, Ulrike | Chemnitz University of Technology |
Co-Chair: Haddadin, Sami | Technical University of Munich |
|
10:30-12:00, Paper WeAT2-CC.1 |
Development of Variable Transmission Series Elastic Actuator for Hip Exoskeletons |
|
Wang, Tianci | City University of Hong Kong |
Wen, Hao | City University of Hong Kong |
Song, Zaixin | City University of Hong Kong |
Dong, Zhiping | City University of Hong Kong |
Liu, Chunhua | City University of Hong Kong |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Compliance and Impedance Control
Abstract: Series Elastic Actuator (SEA)-based exoskeletons can offer precise torque control and transparency when interacting with human wearers. Accurate control of SEA-produced torques ensures the wearer's voluntary motion and supports the implementation of multiple assistive paradigms. In this paper, a novel variable transmission series elastic actuator (VTSEA) is developed to meet the torque-speed requirements of different exoskeleton-assisted locomotion modes, such as running, walking, sit-to-stand, and stand-to-sit. The VTSEA features an SEA-coupled variable transmission ratio adjusting mechanism and switches between three discrete transmission-ratio levels depending on the user's initiative. The proposed prototype can also improve transparency in human-robot interaction. Furthermore, an accurate torque controller with inertial compensation is developed for the VTSEA via singular perturbation theory, and its stability is proved. The feasibility of the proposed VTSEA prototype and its precise output torque performance are verified by experiments.
|
|
10:30-12:00, Paper WeAT2-CC.2 |
Optimization of Mono and Bi-Articular Parallel Elastic Elements for a Robotic Arm Performing a Pick-And-Place Task |
|
Marchal, Maxime | Vrije Universiteit Brussel |
Furnémont, Raphaël | Vrije Universiteit Brussel |
Vanderborght, Bram | Vrije Universiteit Brussel |
Mostafaoui, Ghiles | CNRS, University of CergyPontoise, ENSEA |
Verstraten, Tom | Vrije Universiteit Brussel |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Actuation concepts such as Series Elastic Actuation (SEA), Parallel Elastic Actuation (PEA), and Biarticular Actuation (BA), which introduce elastic elements into the structure, have the potential to reduce the electrical energy consumption of a robot. This letter presents an optimization of the arrangement of springs for a 3 degrees of freedom robotic arm, with the aim of decreasing the electrical energy consumption for a given pick-and-place task. Through simulations and experimental validation, we show that the optimal configuration in terms of electrical energy consumption and complexity consists of rigid actuation on joint 1 and PEAs on joints 2 and 3. With this configuration, root mean square (RMS) and peak load torques for a specific pick-and-place task can be reduced respectively by up to 43% and 44% for joint 2, and by 15% and 21% for joint 3 compared to the configuration without springs.
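As a hedged illustration of why a parallel elastic element reduces load torque (not the paper's optimization, which minimizes electrical energy and also covers bi-articular springs), one can fit a linear spring tau_s = k*(q - q0) that minimizes the RMS residual motor torque over a recorded joint cycle; this reduces to a linear least-squares problem. The function name and linear-spring model are assumptions.

```python
import numpy as np

def optimal_parallel_spring(q, tau_load):
    """Least-squares fit of a linear parallel spring tau_s = k*(q - q0)
    that minimizes the RMS motor torque tau_load - tau_s over a cycle.

    q, tau_load : 1D arrays of joint position and load torque samples.
    Returns (k, q0, rms_before, rms_after).
    """
    # tau_s = k*q - (k*q0) is linear in the unknowns (k, k*q0).
    A = np.column_stack([q, -np.ones_like(q)])
    (k, kq0), *_ = np.linalg.lstsq(A, tau_load, rcond=None)
    q0 = kq0 / k if k else 0.0
    residual = tau_load - (k * q - kq0)     # torque the motor still supplies
    rms = lambda x: float(np.sqrt(np.mean(x**2)))
    return k, q0, rms(tau_load), rms(residual)
```

For a load torque that is itself spring-like over the cycle, the residual motor torque drops to nearly zero, which is the mechanism behind the RMS-torque reductions reported in the abstract.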
|
|
10:30-12:00, Paper WeAT2-CC.3 |
A Novel Compact Design of a Lever-Cam-Based Variable Stiffness Actuator: LC-VSA |
|
Zhu, Hongxi | Chemnitz University of Technology |
Thomas, Ulrike | Chemnitz University of Technology |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Safe interaction between humans and robots is one of the key challenges in robotics. To protect humans and robots from impact, researchers have developed many soft robots that incorporate mechanical springs into their joints. The forthcoming generation of soft robots necessitates adaptable joint stiffness to accommodate various tasks. Consequently, the development of variable stiffness actuators (VSA) has become crucial. Among the prevalent approaches to stiffness adjustment, lever mechanisms have been implemented in numerous variable stiffness joints. Nonetheless, integrating lever technology into a VSA often makes achieving a compact design difficult. This paper introduces a mechanically compact design for a novel lever-cam-based variable stiffness joint.
|
|
10:30-12:00, Paper WeAT2-CC.4 |
Design and Modeling of a Compact Serial Variable Stiffness Actuator (SVSA-III) with Linear Stiffness Profile |
|
Yi, Shuowen | Wuhan University |
Liu, Siyu | The School of Power and Mechanical Engineering, Wuhan University |
Liao, Junbei | Wuhan University |
Guo, Zhao | Wuhan University |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Mechanism Design
Abstract: Variable stiffness actuators (VSA) can imitate the compliance capability of natural muscles, providing flexible adaptability for robots and improving safety when robots interact with the environment or humans. This paper presents a new compact serial variable stiffness actuator (SVSA-III) with a linear stiffness profile based on a symmetrical variable lever arm mechanism. A stiffness motor regulates the position of the pivot located on the Archimedean Spiral Relocation Mechanism (ASRM), so that the stiffness of the actuator can be adjusted (softening or hardening). By designing the lever length, the range of stiffness adjustment can span from 0.3 Nm/degree to theoretical infinity. Moreover, the continuous linear stiffness profile of the actuator can be customized by solving the transcendental equation relating the actuator stiffness to the rotation angle of the stiffness motor. SVSA-III has the advantages of a compact structure, wide-range stiffness regulation, reduced control difficulty, and a linear stiffness profile. Two experiments, step response and stiffness tracking, demonstrate high accuracy and fast response for both stiffness and position adjustment.
|
|
10:30-12:00, Paper WeAT2-CC.5 |
Optimally Controlling the Timing of Energy Transfer in Elastic Joints: Experimental Validation of the Bi-Stiffness Actuation Concept |
|
Pozo Fortunić, Edmundo | Technical University of Munich |
Yildirim, Mehmet Can | Technical University of Munich |
Ossadnik, Dennis | Technical University of Munich |
Swikir, Abdalla | Technical University of Munich |
Abdolshah, Saeed | KUKA Deutschland GmbH |
Haddadin, Sami | Technical University of Munich |
Keywords: Actuation and Joint Mechanisms, Compliant Joints and Mechanisms, Optimization and Optimal Control
Abstract: Elastic actuation taps into elastic elements' energy storage for dynamic motions beyond rigid actuation. While Series Elastic Actuators (SEA) and Variable Stiffness Actuators (VSA) are highly sophisticated, they do not fully provide control over energy transfer timing. To overcome this problem on the basic system level, the Bi-Stiffness Actuation (BSA) concept was recently proposed. Theoretically, it allows for full link decoupling, while simultaneously being able to lock the spring in the drive train via a switch-and-hold mechanism. Thus, the user would be in full control of the potential energy storage and release timing. In this work, we introduce an initial proof-of-concept of Bi-Stiffness-Actuation in the form of a 1-DoF physical prototype, which is implemented using a modular testbed. We present a hybrid system model, as well as the mechatronic implementation of the actuator. We corroborate the feasibility of the concept by conducting a series of hardware experiments using an open-loop control signal obtained by trajectory optimization. Here, we compare the performance of the prototype with a comparable SEA implementation. We show that BSA outperforms SEA 1) in terms of maximum velocity at low final times and 2) in terms of the movement strategy itself: The clutch mechanism allows the BSA to generate consistent launch sequences while the SEA has to rely on lengthy and possibly dangerous oscillatory swing-up motions.
|
|
10:30-12:00, Paper WeAT2-CC.6 |
Experimental Comparison of Pinwheel and Non-Pinwheel Designs of 3D-Printed Cycloidal Gearing for Robotics |
|
Roozing, Wesley | University of Twente |
Roozing, Glenn | Auto Elect B.V |
Keywords: Actuation and Joint Mechanisms, Mechanism Design
Abstract: Recent trends in robotic actuation have highlighted the need for low-cost, high-performance, and efficient gearing. We present an experimental study comparing pinwheel and non-pinwheel designs of cycloidal gearing. The open-source designs are 3D-printable combined with off-the-shelf components, achieving a high performance-to-cost ratio. Extensive experimental data is presented comparing two prototypes on run-in behaviour and a number of quantitative metrics including transmission error, play, friction, and stiffness. Furthermore, we assess overall actuator performance through position control experiments and a 10-hour endurance test. The results show strong performance characteristics and, crucially, suggest that non-pinwheel designs of cycloidal gearing can be a lower-complexity and lower-cost alternative to classical pinwheel designs, while offering similar performance.
|
|
10:30-12:00, Paper WeAT2-CC.7 |
Design and Optimization of an Origami-Inspired Foldable Pneumatic Actuator |
|
Chen, Huaiyuan | Shanghai Jiao Tong University |
Ma, Yiyuan | Shanghai Jiao Tong University |
Chen, Weidong | Shanghai Jiao Tong University |
Keywords: Hydraulic/Pneumatic Actuators, Actuation and Joint Mechanisms, Modeling, Control, and Learning for Soft Robots
Abstract: A novel origami-inspired foldable pneumatic actuator is proposed in this letter to satisfy the comprehensive requirements of wearable assistive applications. The pneumatic actuator combines an origami structure based on the designed Quadrangular-Expand pattern with a foldable pneumatic bellows. The integrated origami structure regulates the motion of the actuator with a high contraction ratio and enables accurate modeling. The origami framework also improves the strength under negative pressure, and thus enables bidirectional actuation. The workflow, including design, fabrication, and mathematical modeling of the pneumatic actuator, is presented in detail. Based on the actuator model, a multi-objective parameter optimization using a Genetic Algorithm is conducted to obtain a trade-off design. The static characteristics of output torque, as well as the dynamic characteristics of power density, mechanical efficiency, and frequency response, are verified experimentally. In summary, the proposed actuator is powerful and energy-efficient.
|
|
10:30-12:00, Paper WeAT2-CC.8 |
A Non-Magnetic Dual-Mode Linear Pneumatic Actuator: Initial Design and Assessment |
|
Portha, Timothée | University of Strasbourg |
Barbé, Laurent | University of Strasbourg, ICube CNRS |
Geiskopf, Francois | INSA De Strasbourg |
Vappou, Jonathan | CNRS, Universite De Strasbourg |
Renaud, Pierre | ICube |
Keywords: Hydraulic/Pneumatic Actuators
Abstract: A pneumatic linear actuator is presented and evaluated. Designed to operate in demanding environments such as MRI, it is developed to be used with two motion control modes: 1) a step-by-step mode with tooth-based gripping to ensure precision, 2) a continuous mode available locally for fine positioning. The actuator can also be disengaged to enable direct handling by an operator, for example for comanipulation. The design is presented. A prototype, developed in the medical context, is implemented and characterized. A specific step-by-step control sequence is then elaborated based on its characterization. Testing of the dual-mode actuation is finally described. The complementarity between the two motion modes and possible adaptations of the original design are discussed.
|
|
10:30-12:00, Paper WeAT2-CC.9 |
Variable Stiffness Floating Spring Leg: Performing Net-Zero Energy Cost Tasks Not Achievable Using Fixed Stiffness Springs |
|
Kim, Sung | Vanderbilt University |
Braun, David | Vanderbilt University |
Keywords: Compliant Joints and Mechanisms, Actuation and Joint Mechanisms, Legged Robots
Abstract: Sitting down and standing up from a chair and, similarly, moving heavy objects up and down between factory lines are examples of cyclic tasks that require large forces but little to no net mechanical energy. Motor-driven artificial limbs and industrial robots can help humans do these tasks, but motors require energy to provide force even if they supply no net mechanical energy. Springs are energetically conservative mechanical elements useful for building robots that require no energy when performing cyclic tasks. However, conventional springs can be limited by their non-customizable force-deflection behavior -- for example, when they cannot meet the force demand despite storing enough energy to perform a cyclic task. Variable stiffness springs are a special type of spring with customizable force-deflection behavior, but most typical variable stiffness springs require energy to amplify force similar to motors. In this paper, we introduce a new type of variable stiffness spring design which is energetically conservative despite having a customizable force-deflection behavior. We present the theory of these springs and demonstrate their utility in performing a net-zero mechanical energy cost lifting task that requires force amplification and as such is not realizable using conventional springs.
|
|
WeAT3-CC Oral Session, CC-313 |
Kinematics |
|
|
Chair: Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Co-Chair: Chirikjian, Gregory | National University of Singapore |
|
10:30-12:00, Paper WeAT3-CC.1 |
Accurate Kinematic Modeling Using Autoencoders on Differentiable Joints |
|
Wilhelm, Nikolas Jakob | Technical University of Munich |
Haddadin, Sami | Technical University of Munich |
Burgkart, Rainer | Technische Universität München |
van der Smagt, Patrick | Volkswagen Group |
Karl, Maximilian | Volkswagen AG |
Keywords: Deep Learning Methods, Kinematics
Abstract: In robotics and biomechanics, accurately determining joint parameters and computing the corresponding forward and inverse kinematics are critical yet often challenging tasks, especially when dealing with highly individualized and partly unknown systems. This paper unveils a cutting-edge kinematic optimizer, underpinned by an autoencoder-based architecture, to address these challenges. Utilizing a neural network, our approach simulates inverse kinematics, converting measurement data into joint-specific parameters during encoding, enabling a stable optimization process. These parameters are subsequently processed through a predefined, differentiable forward kinematics model, resulting in a decoded representation of the original data. Beyond offering a comprehensive solution to kinematics challenges, our method also unveils previously unidentified joint parameters. Real experimental data from knee and hand joints validate the optimizer's efficacy. Additionally, our optimizer is multifunctional: it streamlines the modeling and automation of kinematics and enables a nuanced evaluation of diverse modeling techniques. By assessing the differences in reconstruction losses, we illuminate the merits of each approach. Collectively, this preliminary study signifies advancements in kinematic optimization, with potential applications spanning both biomechanics and robotics.
|
|
10:30-12:00, Paper WeAT3-CC.2 |
A Miniature Water Jumping Robot Based on Accurate Interaction Force Analysis |
|
Yan, Jihong | Harbin Institute of Technology |
Zhang, Xin | Harbin Institute of Technology |
Yang, Kai | Harbin Institute of Technology |
Zhao, Jie | Harbin Institute of Technology |
Keywords: Dynamics, Mechanism Design, Kinematics, Trajectory Optimization
Abstract: Water jumping extends a robot's movement space and flexibility. However, jumping performance is influenced by multiple factors such as driving force, rowing trajectory, and robot structure. The interaction force between the robot and the water surface is complicated by water deformation, and the difficulty of water jumping increases with the robot's scale. This paper designs a miniature water jumping robot with rowing driving legs. The hydrodynamic model between the driving legs and water is established based on modified Wagner theory, with consideration of water surface deformation. In particular, a dynamic model of the robot over the whole jumping process is also developed, relating these multiple factors. The jumping performance is then improved by optimizing the energy storage modality, rowing trajectory, and supporting leg shape through theoretical analysis and experiments. The fabricated robot weighs 91 g; its length, width, and height are 220 mm, 410 mm, and 95 mm, respectively. The maximum water jumping height and distance are 241 mm and 965 mm.
|
|
10:30-12:00, Paper WeAT3-CC.3 |
Jerk-Limited Traversal of One-Dimensional Paths and Its Application to Multi-Dimensional Path Tracking |
|
Kiemel, Jonas | Karlsruhe Institute of Technology |
Kroeger, Torsten | Karlsruher Institut Für Technologie (KIT) |
Keywords: Kinematics, Constrained Motion Planning
Abstract: In this paper, we present an iterative method to quickly traverse multi-dimensional paths considering jerk constraints. As a first step, we analyze the traversal of each individual path dimension. We derive a range of feasible target accelerations for each intermediate waypoint of a one-dimensional path using a binary search algorithm. Computing a trajectory from waypoint to waypoint leads to the fastest progress on the path when selecting the highest feasible target acceleration. Similarly, it is possible to calculate a trajectory that leads to minimum progress along the path. This insight allows us to control the traversal of a one-dimensional path in such a way that a reference path length of a multi-dimensional path is approximately tracked over time. In order to improve the tracking accuracy, we propose an iterative scheme to adjust the temporal course of the selected reference path length. More precisely, the temporal region causing the largest position deviation is identified and updated at each iteration. In our evaluation, we thoroughly analyze the performance of our method using seven-dimensional reference paths with different path characteristics. We show that our method manages to quickly traverse the reference paths and compare the required traversing time and the resulting path accuracy with other state-of-the-art approaches.
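The per-waypoint binary search can be sketched generically: given a feasibility predicate that is monotone in the target acceleration (e.g., "a jerk-limited trajectory to the next waypoint with this target acceleration violates no velocity or acceleration limit"), bisection finds the highest feasible value. The helper below is a sketch under that monotonicity assumption; the paper's actual feasibility check involves computing candidate trajectories, which is omitted here.

```python
def max_feasible(lo, hi, feasible, tol=1e-6):
    """Binary search for the largest x in [lo, hi] with feasible(x) True.

    Assumes feasibility is monotone: feasible below some threshold,
    infeasible above it. Returns None if even `lo` is infeasible.
    """
    if not feasible(lo):
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid    # mid works: the threshold lies at or above mid
        else:
            hi = mid    # mid fails: the threshold lies below mid
    return lo
```

The same routine, with the predicate negated, yields the lowest feasible target acceleration, giving the range of feasible targets described in the abstract.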
|
|
10:30-12:00, Paper WeAT3-CC.4 |
The Kinematics of Constant Curvature Continuum Robots through Three Segments |
|
Li, Yucheng | University of Dayton |
Myszka, David H. | University of Dayton |
Murray, Andrew | University of Dayton |
Keywords: Kinematics, Formal Methods in Robotics and Automation, Soft Robot Applications
Abstract: This letter presents an investigation into the mathematical relationships between the positions and orientations at the segment tips of a piecewise constant curvature (PCC) continuum robot with up to three segments. For one segment, a reachability criterion is proposed, which simplifies the calculation of the neighboring orientation. For two segments, a reachability criterion is proposed and the redundancy of the inverse kinematics solution is found, establishing a circle of tip locations. For three segments, the redundancy of the inverse kinematics includes tips that lie on a sphere, providing a closed-form solution to the inverse kinematics problem. These relationships are derived from the unique characteristics of the bisecting plane of a single segment. The degenerate cases for the solutions are also addressed. These outcomes stem from a specific PCC parametrization, with implications extending to the general PCC model. Note that this study is grounded solely in simulation.
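For context, the forward kinematics of a single constant-curvature segment (arc length L, curvature kappa, bending-plane angle phi) has a standard closed form. The sketch below uses the common Rz(phi) * (planar arc) * Rz(-phi) convention, which may differ from the specific parametrization used in the letter; the function name is illustrative.

```python
import numpy as np

def pcc_segment(kappa, phi, L):
    """Homogeneous transform (4x4) of one constant-curvature segment tip.

    kappa : curvature (1/length), phi : bending-plane angle (rad),
    L : arc length. As kappa -> 0 this reduces to a straight segment.
    """
    theta = kappa * L                      # total bending angle
    if abs(kappa) < 1e-9:                  # straight-segment limit
        p = np.array([0.0, 0.0, L])
        R_arc = np.eye(3)
    else:
        p = np.array([(1 - np.cos(theta)) / kappa, 0.0,
                      np.sin(theta) / kappa])
        # Rotation about the y-axis by theta (arc bends in the x-z plane).
        R_arc = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                          [0.0, 1.0, 0.0],
                          [-np.sin(theta), 0.0, np.cos(theta)]])
    Rz = np.array([[np.cos(phi), -np.sin(phi), 0.0],
                   [np.sin(phi),  np.cos(phi), 0.0],
                   [0.0, 0.0, 1.0]])
    T = np.eye(4)
    T[:3, :3] = Rz @ R_arc @ Rz.T          # rotate arc into the bending plane
    T[:3, 3] = Rz @ p
    return T
```

Multi-segment tip poses chain by matrix multiplication, e.g. `pcc_segment(k1, p1, L1) @ pcc_segment(k2, p2, L2) @ pcc_segment(k3, p3, L3)` for the three-segment case studied in the letter.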
|
|
10:30-12:00, Paper WeAT3-CC.5 |
An Analytic Solution to the 3D CSC Dubins Path Problem |
|
Montano, Victor | University of Houston |
Navkar, Nikhil | Hamad Medical Corporation |
Becker, Aaron | University of Houston |
Keywords: Kinematics, Nonholonomic Motion Planning, Motion and Path Planning
Abstract: We present an analytic solution to the 3D Dubins path problem for paths composed of an initial circular arc, a straight component, and a final circular arc. These are commonly called CSC paths. By modeling the start and goal configurations of the path as the base frame and final frame of an RRPRR manipulator, we treat this as an inverse kinematics problem. The kinematic features of the 3D Dubins path are built into the constraints of our manipulator model. Furthermore, we show that the number of solutions is not constant, with up to seven valid CSC path solutions even in non-singular regions. An implementation of the solution is available at https://github.com/aabecker/dubins3D
|
|
10:30-12:00, Paper WeAT3-CC.6 |
Polytope-Based Continuous Scalar Performance Measure with Analytical Gradient for Effective Robot Manipulation |
|
Somenedi Nageswara Rao, Keerthi Sagar | Irish Manufacturing Research Limited, Ireland |
Caro, Stéphane | CNRS/LS2N |
Padir, Taskin | Northeastern University |
Long, Philip | Atlantic Technological University |
Keywords: Kinematics, Optimization and Optimal Control, Parallel Robots
Abstract: Performance measures are essential to characterize a robot's ability to carry out manipulation tasks. Generally, these measures examine the system's kinematic transformations from configuration to task space, but the capacity margin, a polytope-based kinetostatic index, additionally provides an accurate evaluation of both the twist and wrench capacities of a robotic manipulator. However, this index is the minimum of a discontinuous scalar function, which makes gradients difficult to compute and renders it unsuitable for online numerical optimization. In this letter, we propose a novel performance index using an approximation of the capacity margin. The proposed index is continuous and differentiable, characteristics that are essential for modelling smooth and predictable system behavior. We demonstrate its effectiveness both as a constraint and as an objective function for inverse kinematics optimization. Moreover, to show its practical use, two opposing robot architectures are chosen: (i) serial robots, the Universal Robots UR5 (6-DoF) and the Rethink Robotics Sawyer (7-DoF); and (ii) a parallel manipulator, a cable-driven parallel robot. Results are validated through both simulation and experiments. A visual representation of the performance index is also presented.
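The abstract does not state the authors' exact approximation, but one standard way to turn a minimum over facet distances into a continuous, differentiable quantity with an analytic gradient is a log-sum-exp soft minimum. The sketch below is illustrative only (the function name and sharpness parameter rho are assumptions); it returns both the smoothed value and its gradient with respect to the distances.

```python
import numpy as np

def soft_min(d, rho=50.0):
    """Smooth, differentiable lower approximation of min(d).

    Computes -(1/rho) * log(sum_i exp(-rho * d_i)) in a numerically
    stable way. The analytic gradient d(val)/d(d_i) is exactly the
    softmax weight w_i, which concentrates on the active (smallest)
    entry as rho grows.
    """
    d = np.asarray(d, dtype=float)
    z = -rho * (d - d.min())               # shift so max(z) = 0 for stability
    w = np.exp(z) / np.exp(z).sum()        # softmax weights = gradient
    val = d.min() - np.log(np.exp(z).sum()) / rho
    return val, w
```

Because the value is always a lower bound on the true minimum and the gradient is smooth in the distances, such an index can be used directly as a constraint or objective in gradient-based inverse kinematics optimization, which is the use case described above.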
|
|
10:30-12:00, Paper WeAT3-CC.7 |
Kinematic Optimization of a Robotic Arm for Automation Tasks with Human Demonstration |
|
Meir, Inbar | Tel Aviv University |
Bechar, Avital | Agricultural Research Organization |
Sintov, Avishai | Tel-Aviv University |
Keywords: Kinematics, Industrial Robots
Abstract: Robotic arms are highly common in various automation processes such as manufacturing lines. However, these highly capable robots are usually degraded to simple repetitive tasks such as pick-and-place. On the other hand, designing an optimal robot for one specific task consumes large resources of engineering time and costs. In this paper, we propose a novel concept for optimizing the fitness of a robotic arm to perform a specific task based on human demonstration. Fitness of a robot arm is a measure of its ability to follow recorded human arm and hand paths. The optimization is conducted using a modified variant of the Particle Swarm Optimization for the robot design problem. In the proposed approach, we generate an optimal robot design along with the required path to complete the task. The approach could reduce the time-to-market of robotic arms and enable the standardization of modular robotic parts. Novice users could easily apply a minimal robot arm to various tasks. Two test cases of common manufacturing tasks are presented yielding optimal designs and reduced computational effort by up to 92%.
|
|
10:30-12:00, Paper WeAT3-CC.8 |
Enhancing Motion Trajectory Segmentation of Rigid Bodies Using a Novel Screw-Based Trajectory-Shape Representation |
|
Verduyn, Arno | KU Leuven |
Vochten, Maxim | KU Leuven |
De Schutter, Joris | KU Leuven |
Keywords: Kinematics, Learning from Demonstration
Abstract: Trajectory segmentation refers to dividing a trajectory into meaningful consecutive sub-trajectories. This paper focuses on trajectory segmentation for 3D rigid-body motions. Most segmentation approaches in the literature represent the body’s trajectory as a point trajectory, considering only its translation and neglecting its rotation. We propose a novel trajectory representation for rigid-body motions that incorporates both translation and rotation, and additionally exhibits several invariant properties. This representation consists of a geometric progress rate and a third-order trajectory-shape descriptor. Concepts from screw theory were used to make this representation time-invariant and also invariant to the choice of body reference point. This new representation is validated for a self-supervised segmentation approach, both in simulation and using real recordings of human-demonstrated pouring motions. The results show a more robust detection of consecutive sub-motions with distinct features and a more consistent segmentation compared to conventional representations. We believe that other existing segmentation methods may benefit from using this trajectory representation to improve their invariance.
|
|
10:30-12:00, Paper WeAT3-CC.9 |
Model Reduction in Soft Robotics Using Locally Volume-Preserving Primitives |
|
Xu, Yi | National University of Singapore |
Chirikjian, Gregory | National University of Singapore |
Keywords: Kinematics, Modeling, Control, and Learning for Soft Robots
Abstract: A new, and extremely efficient, computational modeling paradigm is introduced here for specific finite elasticity problems that arise in the context of soft robotics. Whereas continuum mechanics is a very classical area of study that is broadly applicable throughout engineering, and significant effort has been devoted to the development of intricate constitutive models for finite elasticity, we show that for the most part, the isochoric (locally volume-preserving) constraint dominates behavior, and this can be built into closed-form kinematic deformation fields before even considering other aspects of constitutive modeling. We therefore focus on developing and applying primitive deformations that each observe this constraint. By composing a wide enough variety of such deformations, many of the most common behaviors observed in soft robots can be replicated. Case studies include isotropic objects subjected to different boundary conditions, a non-isotropic helically-reinforced tube, and a not-purely-kinematic scenario with gravity loading. We show that this method is at least 50 times faster than the ABAQUS implementation of the finite element method (FEM), and has speed comparable with the real-time FEM framework SOFA. Experiments show that both our method and ABAQUS have approximately 10% error relative to experimentally measured displacements, as well as to each other. And our method outperforms SOFA when the deformation is highly nonlinear.
|
|
WeAT4-CC Oral Session, CC-315 |
Multi-Robot Systems IV |
|
|
Chair: Lam, Tin Lun | The Chinese University of Hong Kong, Shenzhen |
Co-Chair: Best, Graeme | University of Technology Sydney |
|
10:30-12:00, Paper WeAT4-CC.1 |
Automatic Configuration of Multi-Agent Model Predictive Controllers Based on Semantic Graph World Models |
|
de Vos, Koen | Eindhoven University of Technology |
Torta, Elena | Eindhoven University of Technology |
Bruyninckx, Herman | KU Leuven |
López Martínez, César Augusto | Eindhoven University of Technology |
van de Molengraft, Marinus Jacobus Gerardus | University of Technology Eindhoven |
Keywords: Multi-Robot Systems, Constrained Motion Planning, Cooperating Robots
Abstract: We propose a shared semantic map architecture to dynamically construct and configure Model Predictive Controllers (MPCs) that solve navigation problems for multiple robotic agents sharing parts of the same environment. The navigation task is represented as a sequence of semantically labeled areas in the map that must be traversed sequentially, i.e., a route. Each semantic label represents one or more constraints on the robots’ motion behaviour in that area. The advantages of this approach are: (i) an MPC-based motion controller in each individual robot can be (re-)configured, at runtime, with the locally and temporally relevant parameters; (ii) the application can influence, also at runtime, the navigation behaviour of the robots simply by adapting the semantic labels; and (iii) the robots can reason about their need for coordination by analyzing over which horizon in time and space their routes overlap. The paper provides simulations of various representative situations, showing that runtime configuration of the MPC drastically decreases computation time while retaining task execution performance similar to an approach in which each robot always includes all other robots in its MPC computations.
|
|
10:30-12:00, Paper WeAT4-CC.2 |
Meta-Reinforcement Learning Based Cooperative Surface Inspection of 3D Uncertain Structures Using Multi-Robot Systems |
|
Chen, Junfeng | Peking University |
Gao, Yuan | Shenzhen Institute of Artificial Intelligence and Robotics for S |
Hu, Junjie | The Chinese University of Hong Kong, Shenzhen |
Deng, Fuqin | Shenzhen Institute of Artificial Intelligence and Robotics for S |
Lam, Tin Lun | The Chinese University of Hong Kong, Shenzhen |
Keywords: Multi-Robot Systems, Constrained Motion Planning, Reinforcement Learning
Abstract: This paper presents a decentralized cooperative motion planning approach for surface inspection of 3D structures with uncertainties in size, number, shape, and position, using multi-robot systems (MRS). Given that most existing works focus on surface inspection of single, fully known 3D structures, our motivation is two-fold: first, 3D structures separately distributed in 3D environments are complex, so the use of an MRS can intuitively facilitate inspection by fully exploiting sensors with different capabilities. Second, performing such tasks under uncertainty is a complicated and time-consuming process, because we need to explore, determine the size and shape of the 3D structures, and then plan surface-inspection paths. To overcome these challenges, we present a meta-learning approach that provides a decentralized planner for each robot to improve the exploration and surface inspection capabilities. The experimental results demonstrate that our method can outperform other methods by approximately 10.5%-27% in success rate and 70%-75% in inspection speed.
|
|
10:30-12:00, Paper WeAT4-CC.3 |
Decentralized Multi-Agent Trajectory Planning in Dynamic Environments with Spatiotemporal Occupancy Grid Maps |
|
Wu, Siyuan | Delft University of Technology |
Chen, Gang | Delft University of Technology |
Shi, Moji | Delft University of Technology |
Alonso-Mora, Javier | Delft University of Technology |
Keywords: Multi-Robot Systems, Motion and Path Planning, Distributed Robot Systems
Abstract: This paper proposes a decentralized trajectory planning framework for the collision avoidance problem of multiple micro aerial vehicles (MAVs) in environments with static and dynamic obstacles. The framework utilizes spatiotemporal occupancy grid maps (SOGM), which forecast the occupancy status of neighboring space in the near future, as the environment representation. Based on this representation, we extend the kinodynamic A* and the corridor-constrained trajectory optimization algorithms to efficiently tackle static and dynamic obstacles with arbitrary shapes. Collision avoidance between communicating robots is integrated by sharing planned trajectories and projecting them onto the SOGM. The simulation results show that our method achieves competitive performance against state-of-the-art methods in dynamic environments with different numbers and shapes of obstacles. Finally, the proposed method is validated in real experiments.
|
|
10:30-12:00, Paper WeAT4-CC.4 |
Communicating Intent As Behaviour Trees for Decentralised Multi-Robot Coordination |
|
Hull, Rhett | University of Technology Sydney |
Moratuwage, Diluka Prasanjith | University of Technology Sydney |
Scheide, Emily | Oregon State University |
Fitch, Robert | University of Technology Sydney |
Best, Graeme | University of Technology Sydney |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents
Abstract: We propose a decentralised multi-robot coordination algorithm that features a rich representation for encoding and communicating each robot’s intent. This representation for “intent messages” enables improved coordination behaviour and communication efficiency in difficult scenarios, such as those where there are unknown points of contention that require negotiation between robots. Each intent message is an adaptive policy that conditions on identified points of contention that conflict with the intentions of other robots. These policies are concisely expressed as behaviour trees via algebraic logic simplification, and are interpretable by robot teammates and human operators. We propose this intent representation in the context of the Dec-MCTS online planning algorithm for decentralised coordination. We present results for a generalised multi-robot orienteering domain that show improved plan convergence and coordination performance over standard Dec-MCTS enabled by the intent representation’s ability to encode and facilitate negotiation over points of contention.
|
|
10:30-12:00, Paper WeAT4-CC.5 |
Partial Belief Space Planning for Scaling Stochastic Dynamic Games |
|
Vakil, Kamran | Boston University |
Coffey, Mela | Boston University |
Pierson, Alyssa | Boston University |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, Planning under Uncertainty
Abstract: This paper presents a method to reduce computations for stochastic dynamic games with game-theoretic belief space planning through partially propagating beliefs. Complex interactions in scenarios such as surveillance, herding, and racing can be modeled using game-theoretic frameworks in the belief space. Stochastic dynamic games can be solved to a local Nash Equilibrium using a game-theoretic belief space variant of the iterative Linear Quadratic Gaussian (iLQG) algorithm. However, the scalability of this method suffers due to the large dimensionality of the beliefs that the iLQG must propagate. We examine the utility of partial belief space propagation, which reduces the polynomial runtime. We validate our findings through simulations and hardware implementation.
|
|
10:30-12:00, Paper WeAT4-CC.6 |
Decentralized Multi-Agent Active Search and Tracking When Targets Outnumber Agents |
|
Banerjee, Arundhati | Carnegie Mellon University |
Schneider, Jeff | Carnegie Mellon University |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Localization
Abstract: Multi-agent multi-target tracking has a wide range of applications, including wildlife patrolling, security surveillance, and environment monitoring. Such algorithms often make restrictive assumptions: the number of targets and/or their initial locations may be assumed known, or agents may be pre-assigned to monitor disjoint partitions of the environment, reducing the burden of exploration. This also limits applicability when there are fewer agents than targets, since agents are unable to continuously follow the targets in their fields of view. Multi-agent tracking algorithms additionally assume inter-agent synchronization of observations, or the presence of a central controller to coordinate joint actions. Instead, we focus on the setting of decentralized multi-agent, multi-target, simultaneous active search-and-tracking with asynchronous inter-agent communication. Our proposed algorithm DecSTER uses a sequential Monte Carlo implementation of the Probability Hypothesis Density filter for posterior inference combined with Thompson sampling for decentralized multi-agent decision making. We compare different action selection policies, focusing on scenarios where targets outnumber agents. In simulation, we demonstrate that DecSTER is robust to unreliable inter-agent communication and outperforms information-greedy baselines in terms of the Optimal Sub-Pattern Assignment (OSPA) metric for different numbers of targets and varying team sizes.
|
|
10:30-12:00, Paper WeAT4-CC.7 |
Multi-Robot Autonomous Exploration and Mapping under Localization Uncertainty with Expectation-Maximization |
|
Huang, Yewei | Stevens Institute of Technology |
Lin, Xi | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Multi-Robot Systems, Planning under Uncertainty, Reactive and Sensor-Based Planning
Abstract: We propose an autonomous exploration algorithm designed for decentralized multi-robot teams, which takes into account map and localization uncertainties of range-sensing mobile robots. Virtual landmarks are used to quantify the combined impact of process noise and sensor noise on map uncertainty. Additionally, we employ an iterative expectation-maximization inspired algorithm to assess the potential outcomes of both a local robot’s and its neighbors’ next-step actions. To evaluate the effectiveness of our framework, we conduct a comparative analysis with state-of-the-art algorithms. The results of our experiments show the proposed algorithm’s capacity to strike a balance between curbing map uncertainty and achieving efficient task allocation among robots.
|
|
10:30-12:00, Paper WeAT4-CC.8 |
Optimal Task Allocation for Heterogeneous Multi-Robot Teams with Battery Constraints |
|
Calvo, Álvaro | University of Seville |
Capitan, Jesus | University of Seville |
Keywords: Multi-Robot Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: This paper presents a novel approach to optimal multi-robot task allocation in heterogeneous teams of robots. When robots have heterogeneous capabilities and there are diverse objectives and constraints to comply with, computing optimal plans can become especially hard. Moreover, we increase the problem complexity by: 1) considering battery-limited robots that need to schedule recharges; 2) tasks that can be decomposed into multiple fragments; and 3) multi-robot tasks that need to be executed by a coalition synchronously. We define a new problem for heterogeneous multi-robot task allocation and formulate it as a Mixed-Integer Linear Program that includes all the aforementioned features. Then we use an off-the-shelf solver to show the type of optimal solutions that our planner can produce and assess its performance in random scenarios. Our method, which is released as open-source code, represents a first step to formalize and analyze a complex problem that has not been solved in the state of the art.
|
|
10:30-12:00, Paper WeAT4-CC.9 |
Bigraph Matching Weighted with Learnt Incentive Function for Multi-Robot Task Allocation |
|
Paul, Steve | University of Connecticut |
Maurer, Nathan | University at Buffalo |
Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Multi-Robot Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: Most real-world Multi-Robot Task Allocation (MRTA) problems require fast and efficient decision-making, which is often achieved using heuristics-aided methods such as genetic algorithms, auction-based methods, and bipartite graph matching methods. These methods often assume a form that lends better explainability compared to an end-to-end (learnt) neural network based policy for MRTA. However, deriving suitable heuristics can be tedious, risky, and in some cases impractical if problems are too complex. This raises the question: can these heuristics be learned? To this end, this paper develops a Graph Reinforcement Learning (GRL) framework to learn the heuristics or incentives for a bipartite graph matching approach to MRTA. Specifically, a Capsule Attention policy model is used to learn how to weight task/robot pairings (edges) in the bipartite graph that connects the set of tasks to the set of robots. The original capsule attention network architecture is fundamentally modified by adding an encoding of the robots' state graph, and two Multihead Attention based decoders whose outputs are used to construct a LogNormal distribution matrix from which positive bigraph weights can be drawn. The performance of this new bigraph matching approach augmented with a GRL-derived incentive is found to be on par with the original bigraph matching approach that used expert-specified heuristics, with the former offering notable robustness benefits. During training, the learned incentive policy is found to initially get closer to the expert-specified incentive and then slightly deviate from its trend.
|
|
WeAT5-CC Oral Session, CC-411 |
Visual Perception and Learning I |
|
|
Chair: Najjaran, Homayoun | University of Victoria |
Co-Chair: Ravendran, Ahalya | The Commonwealth Scientific and Industrial Research Organisation |
|
10:30-12:00, Paper WeAT5-CC.1 |
Bag of Views: An Appearance-Based Approach to Next-Best-View Planning for 3D Reconstruction |
|
Hatami Gazani, Sara | University of Victoria |
Tucsok, Matthew | University of British Columbia |
Mantegh, Iraj | National Research Council Canada |
Najjaran, Homayoun | University of Victoria |
Keywords: Computer Vision for Automation, Aerial Systems: Perception and Autonomy, Reactive and Sensor-Based Planning
Abstract: UAV-based intelligent data acquisition for 3D reconstruction and monitoring of infrastructure has experienced an increasing surge of interest due to recent advancements in image processing and deep learning-based techniques. View planning is an essential part of this task that dictates the information capture strategy and heavily impacts the quality of the 3D model generated from the captured data. Recent methods have used prior knowledge or partial reconstruction of the target to accomplish view planning for active reconstruction; the former approach poses a challenge for complex or newly identified targets while the latter is computationally expensive. In this work, we present Bag-of-Views (BoV), a fully appearance-based model used to assign utility to the captured views for both offline dataset refinement and online next-best-view (NBV) planning applications targeting the task of 3D reconstruction. With this contribution, we also developed the View Planning Toolbox (VPT), a lightweight package for training and testing machine learning-based view planning frameworks, custom view dataset generation of arbitrary 3D scenes, and 3D reconstruction. Through experiments which pair a BoV-based reinforcement learning model with VPT, we demonstrate the efficacy of our model in reducing the number of required views for high-quality reconstructions in dataset refinement and NBV planning.
|
|
10:30-12:00, Paper WeAT5-CC.2 |
See through the Real World Haze Scenes: Navigating the Synthetic-To-Real Gap in Challenging Image Dehazing |
|
Chen, Shijie | Fudan University |
Mahdizadeh, Mohammad | Fudan University |
Yu, Chong | Fudan University & NVIDIA |
Fan, Jiayuan | Fudan University |
Chen, Tao | Fudan University |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Computer Vision for Transportation
Abstract: Dehazing and enhancing visibility in real-world hazy images pose significant challenges due to the physical complexity of haze, the variability in haze conditions, the tendency to capture more details in noisy scenes, and the risk of overexposure. Many existing single RGB image dehazing methods tend to perform well in synthetic hazy scenarios but struggle in real-world situations. This stems from the fact that these methods often rely solely on deep learning techniques or classical approaches. In addition, they neglect overall image quality improvement. To partially address these challenges, we introduce an innovative approach that harnesses the strengths of both paradigms to dehaze and enhance visibility in a single real-world hazy RGB image. First, both low-level and deep features are extracted, and then a pre-trained vector quantization GAN is employed to create a discrete codebook of well-detailed data patches. A decoder component, enhanced with a normalized module, effectively utilizes these high-quality features to produce clear results. Additionally, a controllable operation is introduced to improve feature matching. To further enhance dehazing and generalizability, the decoder's output undergoes a sequence of gamma-correction operations and generates a sequence of multi-exposure images that are combined to create a haze-free, visually pleasing, and higher-quality final image. The method effectively reduces haziness, enhances sharpness, preserves natural colors, and minimizes artifacts in challenging real-world scenarios. The proposed approach surpasses five SOTA methods in both qualitative and quantitative evaluations across three key metrics, utilizing two real-world and three synthetic hazy image datasets. Notably, it achieves a substantial improvement in real-world datasets over the second-best method, with gains of 0.5702 and 0.129 in FADE metrics for the RTTS and Fattal datasets, respectively.
|
|
10:30-12:00, Paper WeAT5-CC.3 |
CopperTag: A Real-Time Occlusion-Resilient Fiducial Marker |
|
Bian, Xu | Xi’an Jiaotong University |
Chen, Wenzhao | Youibot Robotics Co., Ltd |
Tian, Xiaoyu | Carnegie Mellon University |
Ran, Donglai | Youibot Robotics Co., Ltd |
Keywords: Computer Vision for Automation, Computer Vision for Manufacturing, Industrial Robots
Abstract: Fiducial markers, like AprilTag and ArUco, are extensively utilized in robotics applications within industrial environments, encompassing navigation, docking, and object grasping tasks. However, in contrast to controlled laboratory conditions, markers installed on factory floors or equipment surfaces often face challenges like damage or contamination. These issues can lead to compromised marker integrity, resulting in reduced detection reliability. To address this challenge, we propose a novel fiducial marker called CopperTag, which incorporates circular and square elements to create a robust occlusion-resistant pattern. The CopperTag detection process relies on three fundamental steps: firstly, extracting all lines from the image; secondly, identifying corners; and lastly, searching for quadrilateral candidate regions using ellipses and nearby corners. The Reed-Solomon (RS) algorithm is utilized for both encoding and decoding the information content. This algorithm possesses the ability to recover corrupted messages in situations where CopperTag data is incomplete. The experimental results illustrate that CopperTag exhibits superior robustness and accuracy in detection when compared to other state-of-the-art fiducial markers, even in scenarios with heavy occlusion. Moreover, CopperTag maintains an average processing time of 10ms per frame on a standard laptop, effectively meeting the real-time demands of robotics applications.
|
|
10:30-12:00, Paper WeAT5-CC.4 |
Robust Collaborative Perception without External Localization and Clock Devices |
|
Lei, Zixing | Shanghai Jiao Tong University |
Ni, Zhenyang | Shanghai Jiao Tong University |
Han, Ruize | Chinese Academy of Sciences |
Tang, Shuo | Shanghai Jiao Tong University |
Feng, Chen | New York University |
Chen, Siheng | Shanghai Jiao Tong University |
Wang, Yanfeng | Shanghai Jiao Tong University |
Keywords: Computer Vision for Automation, Computer Vision for Transportation, Deep Learning for Visual Perception
Abstract: Consistent spatial-temporal coordination across multiple agents is fundamental for collaborative perception, which seeks to improve perception abilities through information exchange among agents. To achieve this spatial-temporal alignment, traditional methods depend on external devices to provide localization and clock signals. However, hardware-generated signals could be vulnerable to noise and potentially malicious attacks, jeopardizing the precision of spatial-temporal alignment. Rather than relying on external hardware, this work proposes a novel approach: aligning by recognizing the inherent geometric patterns within the perceptual data of various agents. Following this spirit, we propose a robust collaborative perception system that operates independently of external localization and clock devices. The key module of our system, FreeAlign, constructs a salient object graph for each agent based on its detected boxes and uses a graph neural network to identify common subgraphs between agents, leading to accurate relative pose and time. We validate FreeAlign on both real-world and simulated datasets. The results show that the FreeAlign-empowered robust collaborative perception system performs comparably to systems relying on precise localization and clock devices. We will release code related to this work.
|
|
10:30-12:00, Paper WeAT5-CC.5 |
DerainNeRF: 3D Scene Estimation with Adhesive Waterdrop Removal |
|
Li, Yunhao | Westlake University |
Wu, Jing | Westlake University |
Zhao, Lingzhe | Westlake University |
Liu, Peidong | Westlake University |
Keywords: Computer Vision for Automation, Computer Vision for Transportation, Visual Learning
Abstract: When capturing images through glass during rainy or snowy weather conditions, the resulting images often contain waterdrops adhered to the glass surface, and these waterdrops significantly degrade the image quality and performance of many computer vision algorithms. To tackle these limitations, we propose a method to reconstruct the clear 3D scene implicitly from multi-view images degraded by waterdrops. Our method exploits an attention network to predict the location of waterdrops and then trains a Neural Radiance Field to recover the 3D scene implicitly. By leveraging the strong scene representation capabilities of NeRF, our method can render high-quality novel-view images with waterdrops removed. Extensive experimental results on both synthetic and real datasets show that our method is able to generate clear 3D scenes and outperforms existing state-of-the-art (SOTA) image adhesive waterdrop removal methods.
|
|
10:30-12:00, Paper WeAT5-CC.6 |
Learning Interaction Regions and Motion Trajectories Simultaneously from Egocentric Demonstration Videos |
|
Xin, Jianjia | Beijing University of Technology |
Wang, Lichun | Beijing University of Technology |
Xu, Kai | Beijing University of Technology |
Yang, Chao | Beijing University of Technology |
Yin, Baocai | Beijing University of Technology |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Data Sets for Robotic Vision
Abstract: Learning to interact with objects is significant for robots to integrate into human environments. When the interaction semantic is definite, manually guiding the manipulator is a commonly used method to teach robots how to interact with objects. However, the learning results are robot-dependent because the mechanical parameters differ between robots, which means the learning process must be executed again. Moreover, during the manual guiding process, operators are responsible for recognizing the region being contacted and providing expert motion programming, which limits the robot's intelligence. To improve the degree of automation for robots interacting with objects, this paper proposes IRMT-Net (Interaction Region and Motion Trajectory prediction Network) to predict the interaction region and motion trajectory simultaneously based on images. IRMT-Net achieves state-of-the-art interaction region prediction results on the Epic-Kitchens dataset, generates reasonable motion trajectories, and can support robot interaction in actual situations.
|
|
10:30-12:00, Paper WeAT5-CC.7 |
Marrying NeRF with Feature Matching for One-Step Pose Estimation |
|
Chen, Ronghan | Shenyang Institute of Automation, Chinese Academy of Sciences |
Cong, Yang | Chinese Academy of Science, China |
Ren, Yu | Shenyang Institute of Automation Chinese Academy of Sciences |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Localization
Abstract: Given the image collection of an object, we aim at building a real-time image-based pose estimation method, which requires neither its CAD model nor hours of object-specific training. Recent NeRF-based methods provide a promising solution by directly optimizing the pose from pixel loss between rendered and target images. However, during inference, they require long converging time, and suffer from local minima, making them impractical for real-time robot applications. We aim at solving this problem by marrying image matching with NeRF. With 2D matches and depth rendered by NeRF, we directly solve the pose in one step by building 2D-3D correspondences between target and initial view, thus allowing for real-time prediction. Moreover, to improve the accuracy of 2D-3D correspondences, we propose a 3D consistent point mining strategy, which effectively discards unfaithful points reconstructed by NeRF. Additionally, current NeRF-based methods that naively optimize pixel loss fail on occluded images. Thus, we further propose a 2D matches based sampling strategy to preclude the occluded area. Experimental results on representative datasets prove that our method outperforms state-of-the-art methods, and improves inference efficiency by 90x, achieving real-time prediction at 6 FPS.
|
|
10:30-12:00, Paper WeAT5-CC.8 |
Occluded Part-Aware Graph Convolutional Networks for Skeleton-Based Action Recognition |
|
Kim, Min Hyuk | Chonnam National University |
Kim, Min Ju | Chonnam National University |
Yoo, Seok Bong | Chonnam National University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, Recognition
Abstract: Recognizing human action is one of the most critical factors in the visual perception of robots. Specifically, skeleton-based action recognition has been actively researched to enhance recognition performance at a lower cost. However, action recognition in occlusion situations, where body parts are not visible, is still challenging. We propose an occluded part-aware graph convolutional network (OP-GCN) to address this challenge using the optimal occluded body parts. The proposed model uses an occluded part detector to identify occluded body parts within a human skeleton. It is based on an autoencoder trained on a nonoccluded human skeleton and exploits the symmetry and angular information of the skeleton. Then, we select an optimal group constructed considering the occluded body parts. Each group comprises five sets of joint nodes, focusing on the body parts, excluding the occluded ones. Finally, to enhance interaction within the selected groups, we apply an interpart association module, considering the fusion of global and local elements. The experimental results reveal that the proposed model outperforms others on the occluded datasets. These comparative experiments demonstrate the effectiveness of the study in addressing the challenge of action recognition in occlusion situations. Our code is publicly available at https://github.com/MJ-Kor/OP-GCN.
|
|
10:30-12:00, Paper WeAT5-CC.9 |
MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation |
|
Dong, Yuejiang | Tsinghua University |
Zhang, Fang-Lue | Victoria University of Wellington |
Zhang, Song-Hai | Tsinghua University |
Keywords: Computer Vision for Automation, Deep Learning for Visual Perception, RGB-D Perception
Abstract: Depth perception is crucial for a wide range of robotic applications. Multi-frame self-supervised depth estimation methods have gained research interest due to their ability to leverage large-scale, unlabeled real-world data. However, the self-supervised methods often rely on the assumption of a static scene and their performance tends to degrade in dynamic environments. To address this issue, we present Motion-Aware Loss (MAL), which leverages the temporal relation among consecutive input frames and a novel distillation scheme between the teacher and student networks in multi-frame self-supervised depth estimation methods. Specifically, we associate the spatial locations of moving objects with the temporal order of input frames to eliminate errors induced by object motion. Meanwhile, we enhance the original distillation scheme in multi-frame methods to better exploit the knowledge from a teacher network. MAL is a novel, plug-and-play module designed for seamless integration into multi-frame self-supervised monocular depth estimation methods. Adding MAL into previous state-of-the-art methods leads to a reduction in depth estimation errors by up to 4.2% and 10.8% on the KITTI and CityScapes benchmarks, respectively.
|
|
WeAT6-CC Oral Session, CC-414 |
Visual Servoing |
|
|
Chair: Valada, Abhinav | University of Freiburg |
Co-Chair: Loianno, Giuseppe | New York University |
|
10:30-12:00, Paper WeAT6-CC.1 |
Stereo Image-Based Visual Servoing towards Feature-Based Grasping |
|
Enyedy, Albert | Worcester Polytechnic Institute |
Aswale, Ashay | Worcester Polytechnic Institute |
Calli, Berk | Worcester Polytechnic Institute |
Gennert, Michael | Worcester Polytechnic Institute |
Keywords: Visual Servoing, Grasping, Humanoid Robot Systems
Abstract: This paper presents an image-based visual servoing scheme that can control robotic manipulators in 3D space using 2D stereo images without needing to perform stereo reconstruction. We use a stereo camera in an eye-to-hand configuration for controlling the robot to reach target positions by directly mapping image space errors to joint space actuation. We achieve convergence without a priori knowledge of the target object, a reference 2D image, or 3D data. By doing so, we can reach targets in unstructured environments using high-resolution RGB images instead of utilizing relatively noisy depth data. We conduct several experiments on two different physical robots. The Panda 7DOF arm grasps a static target in 3D space, grasps a pitcher handle, and picks and places a box by determining the approach angle using 2D image features, demonstrating that this algorithm can be used for grasping practical objects in 3D space using only 2D image features for feedback. Our second platform, the Atlas humanoid robot, reaches a target from an unknown starting configuration, demonstrating that this controller achieves convergence to a target, even with the uncertainties introduced by walking to a new location. We believe that this algorithm is a step towards enabling intuitive interfaces that allow a user to initiate a grasp on an object by specifying a grasping point in a 2D image.
|
|
10:30-12:00, Paper WeAT6-CC.2 | Add to My Program |
Visual Feedback Control of an Underactuated Hand for Grasping Brittle and Soft Foods |
|
Kai, Ryogo | Chuo University |
Isobe, Yuzuka | Chuo University |
Pathak, Sarthak | Chuo University |
Umeda, Kazunori | Chuo University |
Keywords: Visual Servoing, Grasping, Underactuated Robots
Abstract: This paper presents a novel method to control an underactuated hand using only a monocular camera, without any internal sensors. In food factories, robots are required to handle a wide variety of foods without damaging them. Underactuated hands are effective for this purpose because they can adapt to various food shapes. However, internal sensors such as tactile and force sensors in underactuated hands may cause hygiene problems and require complicated calibration. If external sensors such as cameras are used instead, foods must be grasped without damage using only external information such as images. In our method, a camera is used as the sole external sensor to tackle these problems. First, contact between the hand and the object is detected using the contours of both, obtained from a camera image. Then, to avoid damaging the object, the following information is extracted from camera images and monitored: the centroids of the hand and the object, the deformation of the object, and the occlusion rate of the hand. Furthermore, to prevent the object from dropping while the robotic arm is in motion, the distance between the centroids of the hand and the object is calculated. Experiments were conducted using twelve different food items.
|
|
10:30-12:00, Paper WeAT6-CC.3 | Add to My Program |
Compositional Servoing by Recombining Demonstrations |
|
Argus, Maximilian | University of Freiburg |
Nayak, Abhijeet | University of Freiburg |
Büchner, Martin | University of Freiburg |
Galesso, Silvio | University of Freiburg |
Valada, Abhinav | University of Freiburg |
Brox, Thomas | University of Freiburg |
Keywords: Visual Servoing, Manipulation Planning, Learning from Demonstration
Abstract: Learning-based manipulation policies from image inputs often show weak task transfer capabilities. In contrast, visual servoing methods allow efficient task transfer in high-precision scenarios while requiring only a few demonstrations. In this work, we present a framework that formulates the visual servoing task as graph traversal. Our method not only extends the robustness of visual servoing, but also enables multitask capability based on a few task-specific demonstrations. We construct demonstration graphs by splitting existing demonstrations and recombining them. In order to traverse the demonstration graph in the inference case, we utilize a similarity function that helps select the best demonstration for a specific task. This enables us to compute the shortest path through the graph. Ultimately, we show that recombining demonstrations leads to higher task-respective success. We present extensive simulation and real-world experimental results that demonstrate the efficacy of our approach.
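The graph-traversal formulation described above reduces to a standard shortest-path search once demonstrations are split into segments and edges are weighted. The sketch below is a hypothetical illustration (node names and costs are invented); it assumes edge costs have already been produced by a similarity function like the one the abstract mentions.

```python
import heapq

def shortest_demo_path(graph, start, goal):
    """Dijkstra search over a demonstration graph.

    graph: {node: [(neighbor, cost), ...]} where nodes stand for
    demonstration segments and costs come from a similarity function
    (assumed given). Returns the cheapest node sequence start -> goal.
    """
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == goal:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Reconstruct the path by walking the predecessor links backwards.
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1]
```

At inference time, selecting the best demonstration for a new task then amounts to picking the start node whose path to the goal is cheapest.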
|
|
10:30-12:00, Paper WeAT6-CC.4 | Add to My Program |
Second-Order Position-Based Visual Servoing of a Robot Manipulator |
|
Godinho Ribeiro, Eduardo | University of São Paulo |
de Queiroz Mendes, Raul | Eindhoven University of Technology (TU/e) |
Terra, Marco Henrique | University of São Paulo |
Grassi Junior, Valdir | University of São Paulo |
Keywords: Visual Servoing, Motion Control
Abstract: Visual Servoing is an established approach for controlling robots using visual feedback. Most controllers in this domain generate velocity control signals to guide the cameras to desired positions and orientations. However, the dynamic characteristics of conventional visual servoing controllers may be unsatisfactory, and the velocity signal itself hinders the connection between the feature velocity model and the robot's dynamics. Consequently, research has explored models incorporating the second-order derivative of features and the robot's acceleration. The current state-of-the-art techniques mainly focus on image-based visual servoing, which deals with feature errors in the image domain. In this work, we propose an acceleration-based controller for the position-based visual servoing framework, which models the error in Cartesian space. Our approach involves extracting an acceleration control signal from the traditional velocity-based controller. To achieve this, we redefine the camera orientation using quaternions, generate new interaction matrices, and conduct comprehensive comparative experiments in simulated and real robot scenarios. We show that our method provides better dynamic properties in both image and Cartesian spaces, superior tracking performance, and less sensitivity to noise compared to velocity controllers.
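One way to read the abstract's core idea: rather than commanding the classical velocity law v* = −λ e directly, a second-order controller tracks it with an acceleration signal. The sketch below is a hypothetical toy version; the gains and the 6D error parameterization (translation plus a rotation-vector term, standing in for the paper's quaternion formulation) are illustrative, not the paper's.

```python
import numpy as np

def pbvs_acceleration(pose_err, vel, lam=1.0, kv=2.0):
    """Derive an acceleration command from a velocity-based PBVS law.

    pose_err: 6D Cartesian error (translation + rotation-vector part).
    vel: current 6D Cartesian velocity of the end effector.
    The velocity law v* = -lam * pose_err is turned into an
    acceleration tracking term a = kv * (v* - vel).
    """
    v_star = -lam * pose_err       # classical velocity-based PBVS command
    return kv * (v_star - vel)     # acceleration that tracks it
```

The benefit claimed in the abstract (better dynamic properties, less noise sensitivity) comes from shaping this second-order response instead of feeding raw velocity commands to the joint controllers.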
|
|
10:30-12:00, Paper WeAT6-CC.5 | Add to My Program |
Event-Triggered Image Moments Predictive Control for Tracking Evolving Features Using UAVs |
|
Aspragkathos, Sotiris | NTUA |
Karras, George | University of Thessaly |
Kyriakopoulos, Kostas | New York University - Abu Dhabi |
Keywords: Visual Servoing, Visual Tracking, Sensor-based Control
Abstract: This paper presents a novel approach for tracking deformable contour targets using Unmanned Aerial Vehicles (UAVs). The proposed scheme combines image moments descriptor and event-triggered Nonlinear Model Predictive Control (NMPC) for efficient and accurate tracking. The deformable contour model allows adaptation to the evolving target's shape, while the proposed event-triggered scheme achieves improved computational efficiency and extended flight duration while generating new control sequences for the UAV. Real-world experiments validate the scheme, showcasing its robustness in handling complex scenarios. This approach holds promise for various applications, such as surveillance and autonomous navigation.
|
|
10:30-12:00, Paper WeAT6-CC.6 | Add to My Program |
Lattice-Based Shape Tracking and Servoing of Elastic Objects |
|
Shetab-Bushehri, Mohammadreza | Université Clermont Auvergne, Institut Pascal |
Aranda, Miguel | Universidad De Zaragoza |
Mezouar, Youcef | Clermont Auvergne INP - SIGMA Clermont |
Ozgur, Erol | SIGMA-Clermont / Institut Pascal |
Keywords: Visual Servoing, Visual Tracking, Sensor-based Control, Manipulation of Deformable Objects
Abstract: In this paper, we propose a general unified tracking-servoing approach for controlling the shape of elastic deformable objects using robotic arms. Our approach works by forming a lattice around the object, binding the object to the lattice, and tracking and servoing the lattice instead of the object. This makes our approach have full control over the deformation of elastic deformable objects of any general form (linear, thin-shell, volumetric) in 3D space. Furthermore, it decouples the runtime complexity of the approach from the objects’ geometric complexity. Our approach is based on the As-Rigid-As-Possible (ARAP) deformation model. It requires no mechanical parameter of the object to be known and can drive the object toward desired shapes through large deformations. The inputs to our approach are the point cloud of the object’s surface in its rest shape and the point cloud captured by a 3D camera in each frame. Overall, our approach is more broadly applicable than existing approaches. We validate the efficiency of our approach through numerous experiments with elastic deformable objects of various shapes and materials (paper, rubber, plastic, foam).
|
|
10:30-12:00, Paper WeAT6-CC.7 | Add to My Program |
DCPT: Darkness Clue-Prompted Tracking in Nighttime UAVs |
|
Zhu, Jiawen | Dalian University of Technology |
Tang, Huayi | Dalian University of Technology |
Cheng, Zhi-Qi | Carnegie Mellon University |
He, Jun-yan | Alibaba Group |
Luo, Bin | Alibaba Group |
Qiu, Shihao | Dalian University of Technology |
Li, Shengming | Dalian University of Technology |
Lu, Huchuan | Dalian University of Technology |
Keywords: Visual Tracking
Abstract: Existing nighttime unmanned aerial vehicle (UAV) trackers follow an “Enhance-then-Track” architecture: first using a light enhancer to brighten the nighttime video, then employing a daytime tracker to locate the object. This separate enhancement and tracking fails to build an end-to-end trainable vision system. To address this, we propose a novel architecture called Darkness Clue-Prompted Tracking (DCPT) that achieves robust UAV tracking at night by efficiently learning to generate darkness clue prompts. Without a separate enhancer, DCPT directly encodes anti-dark capabilities into prompts using a darkness clue prompter (DCP). Specifically, DCP iteratively learns emphasizing and undermining projections for darkness clues. It then injects these learned visual prompts into a daytime tracker with fixed parameters across transformer layers. Moreover, a gated feature aggregation mechanism enables adaptive fusion between prompts and between prompts and the base model. Extensive experiments show state-of-the-art performance for DCPT on multiple dark scenario benchmarks. The unified end-to-end learning of enhancement and tracking in DCPT enables a more trainable system. The darkness clue prompting efficiently injects anti-dark knowledge without extra modules. Code is available at https://github.com/bearyi26/DCPT.
|
|
10:30-12:00, Paper WeAT6-CC.8 | Add to My Program |
Unifying Foundation Models with Quadrotor Control for Visual Tracking Beyond Object Categories |
|
Saviolo, Alessandro | New York University |
Rao, Pratyaksh | New York University |
Radhakrishnan, Vivek | Technology Innovation Institute, New York University |
Xiao, Jiuhong | New York University |
Loianno, Giuseppe | New York University |
Keywords: Visual Tracking, Aerial Systems: Applications, Vision-Based Navigation
Abstract: Visual control enables quadrotors to adaptively navigate using real-time sensory data, bridging perception with action. Yet, challenges persist, including generalization across scenarios, maintaining reliability, and ensuring real-time responsiveness. This paper introduces a perception framework grounded in foundational models for universal object detection and tracking, moving beyond specific training categories. Integral to our approach is a multi-layered tracker integrated with the foundational detector, ensuring continuous target visibility, even when faced with motion blur, abrupt light shifts, and occlusions. Complementing this, we introduce a model-free controller tailored for resilient quadrotor visual tracking. Our system operates efficiently on limited hardware, relying solely on an onboard camera and an inertial measurement unit. Through extensive validation in diverse challenging indoor and outdoor environments, we demonstrate our system's effectiveness and adaptability. In conclusion, our research represents a step forward in quadrotor visual tracking, moving from task-specific methods to more versatile and adaptable operations.
|
|
10:30-12:00, Paper WeAT6-CC.9 | Add to My Program |
DroneMOT: Drone-Based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects |
|
Wang, Peng | Renmin University of China |
Wang, Yongcai | Renmin University of China |
Li, Deying | Renmin University of China |
Keywords: Visual Tracking, Aerial Systems: Perception and Autonomy, Human Detection and Tracking
Abstract: Multi-object tracking (MOT) on static platforms, such as surveillance cameras, has achieved significant progress, with various paradigms providing attractive performance. However, the effectiveness of traditional MOT methods is significantly reduced on dynamic platforms like drones. This decrease is attributed to the distinctive challenges of the MOT-on-drone scenario: (1) objects are generally small in the image plane, often blurred, and frequently occluded, making them challenging to detect and recognize; (2) drones move and see objects from different angles, making the predicted positions and feature embeddings of the objects unreliable. This paper proposes DroneMOT, which first introduces a Dual-Domain integrated Attention (DIA) module that accounts for the fast movements of drones to enhance drone-based object detection and feature embedding for small, blurred, and occluded objects. Then, an innovative Motion-Driven Association (MDA) scheme is introduced, considering the concurrent movements of both the drone and the objects. Within MDA, an Adaptive Feature Synchronization (AFS) technique is presented to update the object features seen from different angles. Additionally, a Dual Motion-based Prediction (DMP) method is employed to forecast the object positions. Finally, both the refined feature embeddings and the predicted positions are integrated to enhance the object association. Comprehensive evaluations on the VisDrone2019-MOT and UAVDT datasets show that DroneMOT provides substantial performance improvements over the state of the art in the domain of MOT on drones.
|
|
WeAT7-CC Oral Session, CC-416 |
Add to My Program |
Learning in Planning |
|
|
Chair: Zhao, Ding | Carnegie Mellon University |
Co-Chair: Hamaya, Masashi | OMRON SINIC X Corporation |
|
10:30-12:00, Paper WeAT7-CC.1 | Add to My Program |
Human-Robot Gym: Benchmarking Reinforcement Learning in Human-Robot Collaboration |
|
Thumm, Jakob | Technical University of Munich |
Trost, Felix | Technical University of Munich |
Althoff, Matthias | Technische Universität München |
Keywords: Reinforcement Learning, Human-Robot Collaboration, Safety in HRI
Abstract: Deep reinforcement learning (RL) has shown promising results in robot motion planning with first attempts in human-robot collaboration (HRC). However, a fair comparison of RL approaches in HRC under the constraint of guaranteed safety is yet to be made. We, therefore, present human-robot gym, a benchmark suite for safe RL in HRC. We provide challenging, realistic HRC tasks in a modular simulation framework. Most importantly, human-robot gym is the first benchmark suite that includes a safety shield to provably guarantee human safety. This bridges a critical gap between theoretical RL research and its real-world deployment. Our evaluation of six tasks led to three key results: (a) the diverse nature of the tasks offered by human-robot gym creates a challenging benchmark for state-of-the-art RL methods, (b) by leveraging expert knowledge in the form of an action-imitation reward, the RL agent can outperform the expert, and (c) our agents negligibly overfit to training data.
|
|
10:30-12:00, Paper WeAT7-CC.2 | Add to My Program |
Improving the Generalization of Unseen Crowd Behaviors for Reinforcement Learning Based Local Motion Planners |
|
Ng, Wen Zheng Terence | Nanyang Technological University |
Chen, Jianda | Nanyang Technological University |
Pan, Sinno Jialin | The Chinese University of Hong Kong |
Zhang, Tianwei | Nanyang Technological University |
Keywords: Reinforcement Learning, Collision Avoidance, Machine Learning for Robot Control
Abstract: Deploying a safe mobile robot policy in scenarios with human pedestrians is challenging due to their unpredictable movements. Current Reinforcement Learning-based motion planners rely on a single policy to simulate pedestrian movements and can suffer from overfitting. Alternatively, framing the collision avoidance problem as a multi-agent framework, where agents generate dynamic movements while learning to reach their goals, can lead to conflicts with human pedestrians due to the agents' homogeneity. To tackle this problem, we introduce an efficient method that enhances agent diversity within a single policy by maximizing an information-theoretic objective. This diversity enriches each agent's experiences, improving its adaptability to unseen crowd behaviors. In assessing an agent's robustness against unseen crowds, we propose diverse scenarios inspired by pedestrian crowd behaviors. Our behavior-conditioned policies outperform existing works in these challenging scenes, reducing potential collisions without additional time or travel.
|
|
10:30-12:00, Paper WeAT7-CC.3 | Add to My Program |
Human-Aligned Longitudinal Control for Occluded Pedestrian Crossing with Visual Attention |
|
Asodia, Vinal | University of Surrey |
Feng, Zhenhua | University of Surrey |
Fallah, Saber | University of Surrey |
Keywords: Reinforcement Learning, Collision Avoidance, Human-Centered Automation
Abstract: Reinforcement Learning (RL) has been widely used to create generalizable autonomous vehicles. However, these approaches rely on fixed reward functions that struggle to balance values like safety and efficiency. How can autonomous vehicles balance different driving objectives and human values in a constantly changing environment? To bridge this gap, we propose an adaptive reward function that utilizes visual attention maps to detect pedestrians in the driving scene and dynamically switch between prioritizing safety or efficiency depending on the current observation. The visual attention map is used to provide spatial attention to the RL agent to boost the training efficiency of the pipeline. We evaluate the pipeline against variants of an occluded pedestrian crossing scenario in the CARLA Urban Driving simulator. Specifically, the proposed pipeline is compared against a modular setup that combines the well-established object detection model, YOLO, with a Proximal Policy Optimization (PPO) agent. The results indicate that the proposed approach can compete with the modular setup while yielding greater training efficiency. The trajectories collected with the approach confirm the effectiveness of the proposed adaptive reward function.
|
|
10:30-12:00, Paper WeAT7-CC.4 | Add to My Program |
Projection-Based Fast and Safe Policy Optimization for Reinforcement Learning |
|
Lin, Shijun | University of Science and Technology of China |
Wang, Hao | University of Science and Technology of China |
Chen, Ziyang | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Reinforcement Learning, Task and Motion Planning
Abstract: While reinforcement learning (RL) attracts increasing research attention, maximizing the return while keeping the agent safe at the same time remains an open problem. Motivated to address this challenge, this work proposes a new Fast and Safe Policy Optimization (FSPO) algorithm, which consists of three steps: the first step performs a reward-improvement update, the second step projects the policy into the neighborhood of the baseline policy to accelerate the optimization process, and the third step addresses constraint violation by projecting the policy back onto the constraint set. Such a projection-based optimization can improve convergence and learning performance. Unlike many existing works that require convex approximations for the objectives and constraints, this work exploits a first-order method to avoid expensive computations and high-dimensional issues, enabling fast and safe policy optimization, especially for challenging tasks. Numerical simulation and physical experiments demonstrate that FSPO outperforms existing methods in terms of safety guarantees and task completion rate.
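The three-step structure of the abstract can be illustrated with a toy parameter-space update. The sketch below is a hypothetical simplification: the gradients, the L2 trust region, and the linearized cost constraint all stand in for the paper's actual estimators and projections.

```python
import numpy as np

def fspo_step(theta, baseline, reward_grad, cost_grad, cost_val,
              lr=0.1, radius=0.5, cost_limit=0.0):
    """One illustrative FSPO-style update (toy setting).

    Step 1: reward-improvement gradient ascent.
    Step 2: project onto an L2 ball of `radius` around the baseline policy.
    Step 3: if the linearized cost exceeds the limit, project back onto the
    half-space {theta : cost_val + cost_grad.(theta - baseline) <= cost_limit}.
    """
    # Step 1: reward improvement.
    theta = theta + lr * reward_grad
    # Step 2: trust-region projection around the baseline policy.
    delta = theta - baseline
    norm = np.linalg.norm(delta)
    if norm > radius:
        theta = baseline + delta * (radius / norm)
    # Step 3: first-order projection onto the linearized constraint set.
    viol = cost_val + cost_grad @ (theta - baseline) - cost_limit
    if viol > 0:
        theta = theta - viol * cost_grad / (cost_grad @ cost_grad)
    return theta
```

Because each projection is closed-form and first-order, no convex approximation of the objective or constraint is required, which is the computational point the abstract emphasizes.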
|
|
10:30-12:00, Paper WeAT7-CC.5 | Add to My Program |
Symmetry Considerations for Learning Task Symmetric Robot Policies |
|
Mittal, Mayank | ETH Zurich |
Rudin, Nikita | ETH Zurich, NVIDIA |
Klemm, Victor | ETH Zurich |
Allshire, Arthur | University of Toronto |
Hutter, Marco | ETH Zurich |
Keywords: Reinforcement Learning, Machine Learning for Robot Control
Abstract: Symmetry is a fundamental aspect of many real-world robotic tasks. However, current deep reinforcement learning (DRL) approaches can seldom harness and exploit symmetry effectively. Often, the learned behaviors fail to achieve the desired transformation invariances and suffer from motion artifacts. For instance, a quadruped may exhibit different gaits when commanded to move forward or backward, even though it is symmetrical about its torso. This issue becomes further pronounced in high-dimensional or complex environments, where DRL methods are prone to local optima and fail to explore regions of the state space equally. Past methods on encouraging symmetry for robotic tasks have studied this topic mainly in a single-task setting, where symmetry usually refers to symmetry in the motion, such as the gait patterns. In this paper, we revisit this topic for goal-conditioned tasks in robotics, where symmetry lies mainly in task execution and not necessarily in the learned motions themselves. In particular, we investigate two approaches to incorporate symmetry invariance into DRL – data augmentation and mirror loss function. We provide a theoretical foundation for using augmented samples in an on-policy setting. Based on this, we show that the corresponding approach achieves faster convergence and improves the learned behaviors in various challenging robotic tasks, from climbing boxes with a quadruped to dexterous manipulation.
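The data-augmentation approach discussed above can be sketched as duplicating each on-policy transition with its mirrored counterpart. The helper below is a hypothetical illustration (the signed permutation matrices encoding the robot's symmetry are task-specific and assumed given; rewards are assumed symmetry-invariant).

```python
import numpy as np

def augment_with_symmetry(batch, obs_mirror, act_mirror):
    """Duplicate an on-policy batch with its mirrored counterpart.

    obs_mirror / act_mirror: signed permutation matrices encoding the
    robot's left-right symmetry (task-specific; assumed given).
    Rewards are invariant under the symmetry, so they are copied as-is.
    """
    obs, act, rew = batch
    m_obs = obs @ obs_mirror.T   # mirrored observations
    m_act = act @ act_mirror.T   # mirrored actions
    return (np.concatenate([obs, m_obs]),
            np.concatenate([act, m_act]),
            np.concatenate([rew, rew]))
```

The alternative the paper compares against, a mirror loss, would instead penalize the policy for disagreeing with its own mirrored predictions rather than enlarging the batch.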
|
|
10:30-12:00, Paper WeAT7-CC.6 | Add to My Program |
Learning Dual-Arm Object Rearrangement for Cartesian Robots |
|
Zhang, Shishun | National University of Defense Technology |
She, Qijin | National University of Defense Technology |
Li, Wenhao | National University of Defense Technology |
Zhu, Chenyang | National University of Defense Technology |
Wang, Yongjun | National University of Defense Technology |
Hu, Ruizhen | Shenzhen University |
Xu, Kai | National University of Defense Technology |
Keywords: Reinforcement Learning, Task and Motion Planning
Abstract: This work focuses on the dual-arm object rearrangement problem abstracted from a realistic industrial scenario of Cartesian robots. The goal of this problem is to transfer all the objects from sources to targets with the minimum total completion time. To achieve the goal, the core idea is to develop an effective object-to-arm task assignment strategy for minimizing the cumulative task execution time and maximizing the dual-arm cooperation efficiency. One of the difficulties in the task assignment is scalability. As the number of objects increases, the computation time of traditional offline-search-based methods grows rapidly due to their computational complexity. Encouraged by the adaptability of reinforcement learning (RL) in long-sequence task decisions, we propose an online task assignment decision method based on RL, whose computation time increases only linearly with the number of objects. Further, we design an attention-based network to model the dependencies between the input states during the whole task execution process to help find the most reasonable object-to-arm correspondence in each task assignment round. In the experimental part, we adapt some search-based methods to this specific setting and compare our method with them. Experimental results show that our approach outperforms search-based methods in total execution time and computational efficiency, and also verify the generalization of our method to different numbers of objects. In addition, we show the effectiveness of our method deployed on the real robot in the supplementary video.
|
|
10:30-12:00, Paper WeAT7-CC.7 | Add to My Program |
Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration |
|
Li, Jinning | University of California, Berkeley |
Liu, Xinyi | University of Michigan |
Zhu, Banghua | University of California, Berkeley |
Jiao, Jiantao | University of California, Berkeley |
Tomizuka, Masayoshi | University of California |
Tang, Chen | University of California Berkeley |
Zhan, Wei | University of California, Berkeley |
Keywords: Reinforcement Learning, Learning from Demonstration, Robot Safety
Abstract: Safe Reinforcement Learning (RL) aims to find a policy that achieves high rewards while satisfying cost constraints. When learning from scratch, safe RL agents tend to be overly conservative, which impedes exploration and restrains the overall performance. In many realistic tasks, e.g. autonomous driving, large-scale expert demonstration data are available. We argue that extracting expert policy from offline data to guide online exploration is a promising solution to mitigate the conservativeness issue. Large-capacity models, e.g. decision transformers (DT), have been proven to be competent in offline policy learning. However, data collected in real-world scenarios rarely contain dangerous cases (e.g., collisions), which prevents the policies from learning safety concepts. Besides, these bulk policy networks cannot meet the computation speed requirements at inference time on real-world tasks such as autonomous driving. To this end, we propose Guided Online Distillation (GOLD), an offline-to-online safe RL framework. GOLD distills an offline DT policy into a lightweight policy network through guided online safe RL training, which outperforms both the offline DT policy and online safe RL algorithms. Experiments in both benchmark safe RL tasks and real-world driving tasks based on the Waymo Open Motion Dataset (WOMD) demonstrate that GOLD can successfully distill lightweight policies and solve decision-making problems in challenging safety-critical scenarios.
|
|
10:30-12:00, Paper WeAT7-CC.8 | Add to My Program |
Sample-Efficient Learning to Solve a Real-World Labyrinth Game Using Data-Augmented Model-Based Reinforcement Learning |
|
Bi, Thomas | ETH Zurich |
D'Andrea, Raffaello | ETHZ |
Keywords: Reinforcement Learning, Engineering for Robotic Systems, Visual Learning
Abstract: Motivated by the challenge of achieving rapid learning in physical environments, this paper presents the development and training of a robotic system designed to navigate and solve a labyrinth game using model-based reinforcement learning techniques. The method involves extracting low-dimensional observations from camera images, along with a cropped and rectified image patch centered on the current position within the labyrinth, providing valuable information about the labyrinth layout. The learning of a control policy is performed purely on the physical system using model-based reinforcement learning, where the progress along the labyrinth's path serves as a reward signal. Additionally, we exploit the system's inherent symmetries to augment the training data. Consequently, our approach learns to successfully solve a popular real-world labyrinth game in record time, with only 5 hours of real-world training data.
|
|
10:30-12:00, Paper WeAT7-CC.9 | Add to My Program |
Active Neural Topological Mapping for Multi-Agent Exploration |
|
Yang, Xinyi | Tsinghua University |
Yang, Yuxiang | Tsinghua University |
Yu, Chao | Tsinghua University |
Chen, Jiayu | Tsinghua University |
Yu, Jincheng | Tsinghua University |
Ren, Haibing | Meituan Inc |
Yang, Huazhong | Tsinghua University |
Wang, Yu | Tsinghua University |
Keywords: Reinforcement Learning, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper investigates the multi-agent cooperative exploration problem, which requires multiple agents to explore an unseen environment via sensory signals in a limited time. A popular approach to exploration tasks is to combine active mapping with planning. Metric maps capture the details of the spatial representation, but are memory intensive and may vary significantly between scenarios, resulting in inferior generalization. Topological maps are a promising alternative as they consist only of nodes and edges with abstract but essential information and are less influenced by the scene structures. However, most existing topology-based exploration tasks utilize classical methods for planning, which are time-consuming and sub-optimal due to their handcrafted design. Deep reinforcement learning (DRL) has shown great potential for learning (near) optimal policies through fast end-to-end inference. In this paper, we propose Multi-Agent Neural Topological Mapping (MANTM) to improve exploration efficiency and generalization for multi-agent exploration tasks. MANTM mainly comprises a Topological Mapper and a novel RL-based Hierarchical Topological Planner (HTP). The Topological Mapper employs a visual encoder and distance-based heuristics to construct a graph containing main nodes and their corresponding ghost nodes. The HTP leverages graph neural networks to capture correlations between agents and graph nodes in a coarse-to-fine manner for effective global goal selection. Extensi
|
|
WeAT8-CC Oral Session, CC-418 |
Add to My Program |
Learning in Grasping and Manipulation I |
|
|
Chair: Tahara, Kenji | Kyushu University |
Co-Chair: Zhang, Haichao | Horizon Robotics |
|
10:30-12:00, Paper WeAT8-CC.1 | Add to My Program |
Efficient Multi-Task and Transfer Reinforcement Learning with Parameter-Compositional Framework |
|
Sun, Lingfeng | University of California, Berkeley |
Zhang, Haichao | Horizon Robotics |
Xu, Wei | Horizon Robotics |
Tomizuka, Masayoshi | University of California |
Keywords: Reinforcement Learning, Transfer Learning, Deep Learning in Grasping and Manipulation
Abstract: In this work, we investigate the potential of improving multi-task training and leveraging it for transfer in the reinforcement learning setting. We identify several challenges towards this goal and propose a transfer approach with a parameter-compositional formulation. We investigate ways to improve the training of multi-task reinforcement learning, which serves as the foundation for transfer. Then we conduct a number of transfer experiments on various manipulation tasks. Experimental results demonstrate that the proposed approach improves performance in the multi-task training stage, and further show effective transfer in terms of both sample efficiency and performance.
|
|
10:30-12:00, Paper WeAT8-CC.2 | Add to My Program |
Goal-Conditioned Reinforcement Learning with Disentanglement-Based Reachability Planning |
|
Qian, Zhifeng | Tongji University |
You, Mingyu | Tongji University |
Zhou, Hongjun | Tongji University |
Xu, Xuanhui | Tongji University |
He, Bin | Tongji University |
Keywords: Reinforcement Learning, Representation Learning, Manipulation Planning
Abstract: Goal-Conditioned Reinforcement Learning (GCRL) can enable agents to spontaneously set diverse goals to learn a set of skills. Despite the excellent works proposed in various fields, reaching distant goals in temporally extended tasks remains a challenge for GCRL. Current works tackled this problem by leveraging planning algorithms to plan intermediate subgoals to augment GCRL. Their methods need two crucial requirements: (i) a state representation space to search valid subgoals, and (ii) a distance function to measure the reachability of subgoals. However, they struggle to scale to high-dimensional state space due to their non-compact representations. Moreover, they cannot collect high-quality training data through standard GC policies, which results in an inaccurate distance function. Both affect the efficiency and performance of planning and policy learning. In the paper, we propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks. In REPlan, a Disentangled Representation Module (DRM) is proposed to learn compact representations which disentangle robot poses and object positions from high-dimensional observations in a self-supervised manner. A simple REachability discrimination Module (REM) is also designed to determine the temporal distance of subgoals. Moreover, REM computes intrinsic bonuses to encourage the collection of novel states for training. We evaluate our REPlan in three vision-
|
|
10:30-12:00, Paper WeAT8-CC.3 | Add to My Program |
KINet: Unsupervised Forward Models for Robotic Pushing Manipulation |
|
Rezazadeh, Alireza | University of Minnesota |
Choi, Changhyun | University of Minnesota, Twin Cities |
Keywords: Representation Learning, Deep Learning Methods, Manipulation Planning
Abstract: Object-centric representation is an essential abstraction for forward prediction. Most existing forward models learn this representation through extensive supervision (e.g., object class and bounding box), although such ground-truth information is not readily accessible in reality. To address this, we introduce KINet (Keypoint Interaction Network), an end-to-end unsupervised framework to reason about object interactions based on a keypoint representation. Using visual observations, our model learns to associate objects with keypoint coordinates and discovers a graph representation of the system as a set of keypoint embeddings and their relations. It then learns an action-conditioned forward model using contrastive estimation to predict future keypoint states. By learning to perform physical reasoning in the keypoint space, our model automatically generalizes to scenarios with a different number of objects, novel backgrounds, and unseen object geometries. Experiments demonstrate the effectiveness of our model in accurately performing forward prediction and learning plannable object-centric representations for downstream robotic pushing manipulation tasks.
|
|
10:30-12:00, Paper WeAT8-CC.4 |
Intrinsic Language-Guided Exploration for Complex Long-Horizon Robotic Manipulation Tasks |
|
Triantafyllidis, Eleftherios | The University of Edinburgh |
Christianos, Filippos | University of Edinburgh |
Li, Zhibin (Alex) | University College London |
Keywords: Deep Learning Methods, Reinforcement Learning, Deep Learning in Grasping and Manipulation
Abstract: Current reinforcement learning algorithms struggle in sparse and complex environments, most notably in long-horizon manipulation tasks entailing a plethora of different sequences. In this work, we propose the Intrinsically Guided Exploration from Large Language Models (IGE-LLMs) framework. By leveraging LLMs as an assistive intrinsic reward, IGE-LLMs guides the exploratory process in reinforcement learning to address intricate long-horizon robotic manipulation tasks with sparse rewards. We evaluate our framework and related intrinsic learning methods in an environment challenged with exploration, and in a complex robotic manipulation task challenged by both exploration and long horizons. Results show IGE-LLMs (i) exhibit notably higher performance over related intrinsic methods and the direct use of LLMs in decision-making, (ii) can be combined with and complement existing learning methods, highlighting their modularity, (iii) are fairly insensitive to different intrinsic scaling parameters, and (iv) maintain robustness against increased levels of uncertainty and horizons.
|
|
10:30-12:00, Paper WeAT8-CC.5 |
Touch-Based Manipulation with Multi-Fingered Robot Using Off-Policy RL and Temporal Contrastive Learning |
|
Morihira, Naoki | Honda R&D, Ltd |
Deo, Pranav | Honda R&D Co. Ltd |
Bhadu, Manoj | Honda R&D Co. Ltd |
Hayashi, Akinobu | Honda R&D Co., Ltd |
Hasegawa, Tadaaki | Honda R&D Co., Ltd |
Otsubo, Satoshi | Honda R&D |
Osa, Takayuki | University of Tokyo |
Keywords: Reinforcement Learning, In-Hand Manipulation, Dexterous Manipulation
Abstract: Tactile information holds promise for enhancing the manipulation capabilities of multi-fingered robots. In tasks such as in-hand manipulation, where robots frequently switch between contact and non-contact states, it is important to address the partial observability of tactile sensors and to properly consider the history of observations and actions. Previous studies have shown that Recurrent Neural Networks (RNNs) can be used to learn latent representations for handling observation and action histories. However, this approach is usually combined with on-policy reinforcement learning (RL) and suffers from low sample efficiency. Integrating RNNs with off-policy RL could enhance sample efficiency, but this often compromises stability and robustness, especially as the dimensions of observation and action increase. This paper presents a time-contrastive learning approach tailored for off-policy RL. Our method incorporates a temporal contrastive model and introduces a surrogate loss to extract task-related latent representations, enhancing the pursuit of the optimal policy. Simulations and real robot experiments demonstrate that our proposed method outperforms RNN-based approaches.
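The usual form of such a temporal contrastive objective is InfoNCE over time-adjacent latent pairs: each embedding z_t is pulled toward its successor z_{t+1} while the other time steps in the batch act as negatives. The numpy sketch below shows that generic loss, not the authors' specific surrogate loss; the cosine similarity and the temperature value are common but assumed choices.

```python
import numpy as np

def temporal_infonce(anchors, positives, temperature=0.1):
    """InfoNCE where each anchor z_t must match its own successor z_{t+1}
    against the other time steps in the batch acting as negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).mean()             # positive pairs sit on the diagonal
```

Perfectly aligned anchor/positive pairs give a near-zero loss, while mismatched pairs are penalized heavily.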
|
|
10:30-12:00, Paper WeAT8-CC.6 |
Learning Language-Conditioned Deformable Object Manipulation with Graph Dynamics |
|
Deng, Yuhong | National University of Singapore |
Mo, Kai | Tsinghua University, Shenzhen International Graduate School |
Xia, Chongkun | Tsinghua University |
Wang, Xueqian | Center for Artificial Intelligence and Robotics, Graduate School |
Keywords: Manipulation Planning, Deep Learning in Grasping and Manipulation, Dexterous Manipulation
Abstract: Multi-task learning of deformable object manipulation is a challenging problem in robot manipulation. Most previous works address this problem in a goal-conditioned way and adopt goal images to specify different tasks, which limits multi-task learning performance and cannot generalize to new tasks. Thus, we adopt language instructions to specify deformable object manipulation tasks and propose a learning framework. We first design a unified Transformer-based architecture to understand multi-modal data and output picking and placing actions. In addition, we apply the visible connectivity graph to tackle the nonlinear dynamics and complex configurations of deformable objects. Both simulated and real experiments demonstrate that the proposed method is effective and can generalize to unseen instructions and tasks. Compared with the state-of-the-art method, our method achieves higher success rates (87.2% on average) and a 75.6% shorter inference time. We also demonstrate that our method performs well in real-world experiments. Supplementary videos can be found at https://sites.google.com/view/language-deformable.
|
|
10:30-12:00, Paper WeAT8-CC.7 |
Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects |
|
Mosbach, Malte | University of Bonn |
Behnke, Sven | University of Bonn |
Keywords: Reinforcement Learning, Grasping, Sensorimotor Learning
Abstract: Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement learning and policy distillation. After training a teacher policy to master motor control based on object pose information, TAPG facilitates guided, yet adaptive, learning of a sensorimotor policy based on object segmentation. We zero-shot transfer from simulation to a real robot by using the Segment Anything Model for promptable object segmentation. Our trained policies adeptly grasp a wide variety of objects from cluttered scenarios in simulation and the real world based on human-understandable prompts. Furthermore, we show robust zero-shot transfer to novel objects. Videos of our experiments are available at https://maltemosbach.github.io/grasp_anything.
|
|
10:30-12:00, Paper WeAT8-CC.8 |
Composable Interaction Primitives: A Structured Policy Class for Efficiently Learning Sustained-Contact Manipulation Skills |
|
Abbatematteo, Ben | University of Texas at Austin |
Rosen, Eric | Brown University |
Thompson, Skye | MIT |
Akbulut, Mete Tuluhan | Bogazici University |
Rammohan, Sreehari | Brown University |
Konidaris, George | Brown University |
Keywords: Reinforcement Learning, Integrated Planning and Learning, Deep Learning in Grasping and Manipulation
Abstract: We propose a new policy class, Composable Interaction Primitives (CIPs), specialized for learning sustained-contact manipulation skills like opening a drawer, pulling a lever, turning a wheel, or shifting gears. CIPs have two primary design goals: to minimize what must be learned by exploiting structure present in the world and the robot, and to support sequential composition by construction, so that learned skills can be used by a task-level planner. Using an ablation experiment in four simulated manipulation tasks, we show that the structure included in CIPs substantially improves the efficiency of motor skill learning. We then show that CIPs can be used for plan execution in a zero-shot fashion by sequencing learned skills. We validate our approach on real robot hardware by learning and sequencing two manipulation skills.
|
|
10:30-12:00, Paper WeAT8-CC.9 |
MoDem-V2: Visuo-Motor World Models for Real-World Robot Manipulation |
|
Lancaster, Patrick | Meta AI |
Hansen, Nicklas | University of California San Diego |
Rajeswaran, Aravind | Meta AI |
Kumar, Vikash | Meta AI |
Keywords: Reinforcement Learning, Sensorimotor Learning, Imitation Learning
Abstract: Robotic systems that aspire to operate in uninstrumented real-world environments must perceive the world directly via onboard sensing. Vision-based learning systems aim to eliminate the need for environment instrumentation by building an implicit understanding of the world based on raw pixels, but navigating the contact-rich high-dimensional search space from solely sparse visual reward signals significantly exacerbates the challenge of exploration. The applicability of such systems is thus typically restricted to simulated or heavily engineered environments, since agent exploration in the real world without the guidance of explicit state estimation and dense rewards can lead to unsafe behavior and catastrophic safety faults. In this study, we isolate the root causes behind these limitations to develop a system, called MoDem-V2, capable of learning contact-rich manipulation directly in the uninstrumented real world. Building on the latest algorithmic advancements in model-based reinforcement learning (MBRL), demo-bootstrapping, and effective exploration, MoDem-V2 can acquire contact-rich dexterous manipulation skills directly in the real world. We identify key ingredients for leveraging demonstrations in model learning while respecting real-world safety considerations -- exploration centering, agency handover, and actor-critic ensembles. We empirically demonstrate the contribution of these ingredients in four complex visuo-motor manipulation problems in both simulation and the real world. To the best of our knowledge, our work presents the first successful system for demonstration-augmented visual MBRL trained directly in the real world. Visit sites.google.com/view/modemv2 for videos and more details.
|
|
WeAT9-CC Oral Session, CC-419 |
Collision Avoidance I |
|
|
Chair: Wang, Zhuping | Tongji University |
Co-Chair: Albu-Schäffer, Alin | DLR - German Aerospace Center |
|
10:30-12:00, Paper WeAT9-CC.1 |
CollisionGP: Gaussian Process-Based Collision Checking for Robot Motion Planning |
|
Muñoz Mendi, Javier | Universidad Carlos III De Madrid |
Lehner, Peter | German Aerospace Center (DLR) |
Moreno, Luis | Carlos III University |
Albu-Schäffer, Alin | DLR - German Aerospace Center |
Roa, Maximo A. | German Aerospace Center (DLR) |
Keywords: Motion and Path Planning, Collision Avoidance, Probabilistic Inference
Abstract: Collision checking is the primitive operation of motion planning that consumes the most time. Machine learning algorithms have been shown to accelerate collision checking. We propose CollisionGP, a Gaussian process-based algorithm for modeling a robot's configuration space and querying collision checks. CollisionGP introduces a Pólya-Gamma auxiliary variable for each data point in the training set, allowing classification inference to be done exactly with a closed-form expression. Gaussian processes provide a distribution as the output, yielding a mean and variance for each collision check. The obtained variance is processed to reduce false negatives (FN). We demonstrate that CollisionGP can use GPU acceleration to process collision checks for thousands of configurations much faster than traditional collision detection libraries. Furthermore, we obtain better accuracy, TPR, and TNR results than state-of-the-art learning-based algorithms while using fewer support points, thus making our proposed method sparser.
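The mean-plus-variance decision described in the abstract can be illustrated with a plain GP-regression stand-in: fit on +/-1 collision labels and declare a configuration free only when the posterior mean clears a variance-scaled margin. This is a toy sketch, not the authors' exact Pólya-Gamma classifier; the RBF kernel, lengthscale, noise, and margin values are illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=0.5):
    """Squared-exponential kernel between rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_predict(Xq, X, y, noise=1e-3):
    """GP-regression posterior mean/std at query configurations Xq,
    trained on configurations X with labels y (+1 free, -1 collision)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Kq = rbf(Xq, X)
    mean = Kq @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum('ij,ji->i', Kq, np.linalg.solve(K, Kq.T))
    return mean, np.sqrt(np.maximum(var, 0.0))

def in_collision(Xq, X, y, margin=2.0):
    """Conservative check: declare 'free' only when mean - margin*std > 0,
    trading extra false positives for fewer false negatives."""
    mean, std = gp_predict(Xq, X, y)
    return mean - margin * std <= 0.0
```

Training on a grid labeled against a disc-shaped obstacle, a query at the disc center is flagged as a collision while a far corner is reported free; inflating `margin` makes the check more conservative.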
|
|
10:30-12:00, Paper WeAT9-CC.2 |
Probabilistic Motion Planning and Prediction Via Partitioned Scenario Replay |
|
de Groot, Oscar | Delft University of Technology |
Sridharan, Anish | Starnus Technology |
Alonso-Mora, Javier | Delft University of Technology |
Ferranti, Laura | Delft University of Technology |
Keywords: Collision Avoidance, Planning under Uncertainty, Optimization and Optimal Control
Abstract: Autonomous mobile robots require predictions of human motion to plan safe trajectories that avoid them. Because human motion cannot be predicted exactly, future trajectories are typically inferred from real-world data via learning-based approximations. These approximations provide useful information on the pedestrian's behavior, but may deviate from the data, which can lead to collisions during planning. In this work, we introduce a joint prediction and planning framework, Partitioned Scenario Replay (PSR), that stores and partitions previously observed human trajectories, referred to as scenarios. During planning, scenarios observed in similar situations are reintroduced (or replayed) as motion predictions. By sampling real data and by building on scenario optimization and predictive control, the planner provides probabilistic collision avoidance guarantees in the real world. Relying on this guarantee to remain safe, PSR can incrementally improve its prediction and planning performance online. We demonstrate our approach on a mobile robot navigating around pedestrians.
|
|
10:30-12:00, Paper WeAT9-CC.3 |
Prescient Collision-Free Navigation of Mobile Robots with Iterative Multimodal Motion Prediction of Dynamic Obstacles |
|
Zhang, Ze | Chalmers University of Technology |
Hajieghrary, Hadi | Magna International |
Dean, Emmanuel | Chalmers University of Technology |
Åkesson, Knut | Chalmers University of Technology |
Keywords: Collision Avoidance, Deep Learning Methods, AI-Based Methods
Abstract: To explore safe interactions between a mobile robot and dynamic obstacles, this paper presents a comprehensive approach to collision-free navigation in dynamic indoor environments. The approach integrates multimodal motion predictions of dynamic obstacles with predictive control for obstacle avoidance. Multimodal Motion Prediction (MMP) is achieved by a deep-learning method that predicts multiple plausible future positions. By repeating the MMP for each time offset in the future, multi-time-step multimodal motion predictions are obtained. A nonlinear Model Predictive Control (MPC) solver utilizes the prediction outcomes to achieve collision-free trajectory tracking for the mobile robot. The proposed integration of multimodal motion prediction and trajectory tracking outperforms other non-deep-learning methods in complex scenarios. The approach enables safe interaction between the mobile robot and stochastic dynamic obstacles.
|
|
10:30-12:00, Paper WeAT9-CC.4 |
GPU-Accelerated Optimization-Based Collision Avoidance |
|
Wu, Zeming | Tongji University |
Wang, Zhuping | Tongji University |
Zhang, Hao | Tongji University |
Keywords: Motion and Path Planning, Collision Avoidance, Constrained Motion Planning
Abstract: This paper proposes a GPU-accelerated optimization framework for collision avoidance problems where the controlled objects and the obstacles can be modeled as finite unions of convex polyhedra. A novel collision avoidance constraint is proposed based on scale-based collision detection and the strong duality of convex optimization. Under this constraint, the high-dimensional non-convex optimization problems of collision avoidance can be decomposed into several low-dimensional quadratic programs (QPs) following the paradigm of the alternating direction method of multipliers (ADMM). Furthermore, these low-dimensional QPs can be solved in parallel on GPUs, significantly reducing computational time. High-fidelity simulations are conducted to validate the proposed method's effectiveness and practicality.
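The decomposition pattern the abstract relies on (consensus ADMM, where per-block subproblems are solved independently and then reconciled by an averaging step) can be shown on a toy sum of quadratics. This generic sketch is not the paper's duality-based collision formulation; the closed-form x-update stands in for the low-dimensional QPs that would be dispatched to the GPU in parallel.

```python
import numpy as np

def consensus_admm(a_list, rho=1.0, iters=200):
    """Solve min_x sum_i 0.5*||x - a_i||^2 by consensus ADMM: each block i
    keeps a local copy x_i (its subproblem is closed-form here; in general
    a small QP solvable in parallel), and a consensus variable z ties the
    copies together via scaled dual variables u_i."""
    a = np.asarray(a_list, dtype=float)
    _, d = a.shape
    x = np.zeros_like(a)          # local copies, one per block
    z = np.zeros(d)               # consensus variable
    u = np.zeros_like(a)          # scaled dual variables
    for _ in range(iters):
        # x-updates are independent across blocks -> parallelizable
        x = (a + rho * (z - u)) / (1.0 + rho)
        z = (x + u).mean(axis=0)  # gather/average step
        u = u + x - z             # dual ascent step
    return z
```

For blocks a_i = 0, 1, 2, 3 the consensus variable converges to their mean, 1.5, and the gap shrinks geometrically with the iteration count.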
|
|
10:30-12:00, Paper WeAT9-CC.5 |
Learn to Navigate in Dynamic Environments with Normalized LiDAR Scans |
|
Zhu, Wei | Tohoku University |
Hayashibe, Mitsuhiro | Tohoku University |
Keywords: Collision Avoidance, Human-Aware Motion Planning, Reinforcement Learning
Abstract: The latest robot navigation methods for dynamic environments assume that the states of obstacles, including their geometries and trajectories, are fully observable. While it is easy to obtain these states accurately in simulations, it is exceedingly challenging in the real world. Therefore, a viable alternative is to directly map raw sensor observations into robot actions. However, acquiring skills from high-dimensional raw observations demands massive neural networks and extended training periods. Furthermore, there are discrepancies between simulated and real environments that impede real-world implementations. To overcome these limitations, we propose a Learning framework for robot Navigation in Dynamic environments that uses sequential Normalized LiDAR (LNDNL) scans. We employ long short-term memory (LSTM) networks to propagate historical environmental information from the sequential LiDAR observations. Additionally, we customize a LiDAR-integrated simulator to speed up sampling and normalize the geometry of real-world obstacles to match that of simulated objects, thereby bridging the sim-to-real gap. Our extensive comparisons with state-of-the-art baselines and real-world implementations demonstrate the potential of learning to navigate in dynamic environments using raw sensor observations and sim-to-real transfer.
|
|
10:30-12:00, Paper WeAT9-CC.6 |
Learning Terminal State of the Trajectory Planner: Application for Collision Scenarios of Autonomous Vehicles |
|
Lim, Joonhee | KAIST |
Lee, Kibeom | Gachon University |
Shin, Jangho | Hyundai Motor Company |
Kum, Dongsuk | KAIST |
Keywords: Collision Avoidance, Integrated Planning and Learning, Motion and Path Planning
Abstract: Collision Avoidance/Mitigation System (CAMS) for autonomous vehicles is a crucial technology that ensures the safety and reliability of autonomous driving systems. Conventional collision avoidance approaches struggle in complex and varied scenarios because they avoid collisions based on rules designed for specific collision scenarios. This has led to learning-based methods using neural networks for adaptive collision avoidance. However, approaches that directly output control inputs through neural networks have drawbacks in interpretability and stability. To address these limitations, we propose a trajectory planning method for CAMS that combines deep reinforcement learning (DRL) and quintic polynomial (QP) trajectory planning. The proposed method determines the terminal state and confidence of the trajectory using DRL and plans a QP trajectory based on them. By utilizing the terminal state and confidence of the trajectory, rather than direct control inputs, as the output of the neural network, it generates a more realistic and continuous path. Moreover, this approach considers collision avoidance and mitigation in an integrated manner through the RL reward function. Our experimental results demonstrate that the proposed method not only improves interpretability and stability compared to existing learning-based methods but also upholds performance in complex and varied collision scenarios.
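The quintic-polynomial step is standard: six boundary conditions (position, velocity, and acceleration at t=0 and t=T) determine the six coefficients through one linear solve, which is why predicting only the terminal state suffices to define a smooth trajectory. A minimal sketch of that construction follows (illustrative only; the DRL component and the authors' exact formulation are not reproduced):

```python
import numpy as np

def quintic_coeffs(p0, v0, a0, pT, vT, aT, T):
    """Coefficients c0..c5 of p(t) = sum_k c_k t^k matching position,
    velocity, and acceleration at t=0 and t=T."""
    A = np.array([
        [1, 0, 0,    0,      0,       0],       # p(0)
        [0, 1, 0,    0,      0,       0],       # p'(0)
        [0, 0, 2,    0,      0,       0],       # p''(0)
        [1, T, T**2, T**3,   T**4,    T**5],    # p(T)
        [0, 1, 2*T,  3*T**2, 4*T**3,  5*T**4],  # p'(T)
        [0, 0, 2,    6*T,    12*T**2, 20*T**3], # p''(T)
    ], dtype=float)
    b = np.array([p0, v0, a0, pT, vT, aT], dtype=float)
    return np.linalg.solve(A, b)

def eval_quintic(c, t):
    """Return position, velocity, and acceleration at time t."""
    p = sum(ck * t**k for k, ck in enumerate(c))
    v = sum(k * ck * t**(k - 1) for k, ck in enumerate(c) if k >= 1)
    a = sum(k * (k - 1) * ck * t**(k - 2) for k, ck in enumerate(c) if k >= 2)
    return p, v, a
```

For example, a 2 s maneuver from rest at 0 to rest at 5 m starts and ends with exactly zero velocity and acceleration by construction.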
|
|
10:30-12:00, Paper WeAT9-CC.7 |
History-Aware Planning for Risk-Free Autonomous Navigation on Unknown Uneven Terrain |
|
Wang, Yinchuan | Shandong University |
Du, Nianfei | Shandong University |
Qin, Yongsen | Shandong University |
Zhang, Xiang | School of Control Science and Engineering, Shandong University |
Song, Rui | Shandong University |
Wang, Chaoqun | Shandong University |
Keywords: Collision Avoidance, Planning under Uncertainty, Autonomous Vehicle Navigation
Abstract: It is challenging for a mobile robot to achieve autonomous, mapless navigation in unknown environments with uneven terrain. In this study, we present a layered and systematic pipeline. At the local level, we maintain a tree structure that is dynamically extended during navigation. This structure unifies planning with terrain identification and helps explicitly identify hazardous areas on uneven terrain. In particular, certain nodes of the tree are consistently kept to form a sparse graph at the global level, which records the history of the exploration. A series of subgoals obtained from the tree and the graph is utilized to lead the navigation. To determine a subgoal, we develop an evaluation method whose input elements can be efficiently obtained from the layered structure. We conduct both simulation and real-world experiments to evaluate the developed method and its key modules. The experimental results demonstrate the effectiveness and efficiency of our method. The robot can travel through unknown uneven regions safely and reach the target rapidly without a preconstructed map.
|
|
WeAT10-CC Oral Session, CC-501 |
Soft Sensors and Actuators II |
|
|
Chair: Hughes, Josie | EPFL |
Co-Chair: Shi, Chaoyang | Tianjin University |
|
10:30-12:00, Paper WeAT10-CC.1 |
Multi-Tap Resistive Sensing and FEM Modeling Enables Shape and Force Estimation in Soft Robots |
|
Cangan, Barnabas Gavin | ETH Zurich |
Tian, Sizhe | Inria, Université De Lille |
Escaida Navarro, Stefan | Universidad De O'Higgins |
Beger, Artem | Festo SE & Co. KG |
Duriez, Christian | INRIA |
Katzschmann, Robert Kevin | ETH Zurich |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Grasping
Abstract: We tackle the problem of proprioception in soft robots, specifically soft grippers with tight packaging constraints, relying only on intrinsic sensors. While various sensing approaches have been studied for curvature estimation, we look into sensing local deformations. To accomplish this, we use a widely available, off-the-shelf resistive sensor and multi-tap this sensor, i.e., make multiple electrical connections onto the resistive layer of the sensor. This allows us to measure changes in resistance at multiple segments throughout the length of the sensor, providing improved resolution of local deformations in the soft body. These measurements inform a finite-element-method (FEM) based model to then estimate the shape of the soft body and the magnitude of an external force acting at a known arbitrary location. Our model-based approach estimates soft body deformation with approximately 3% average relative error and, taking into account internal fluidic actuation, our estimate of external force disturbance has 11% relative error within a 5 N range. The combined sensing and modeling approach can be integrated into soft manipulation platforms to enable features such as identifying the shape and material properties of an object being grasped. Such manipulators can benefit from softness and compliance while being proprioceptive, relying only on embedded sensing and not on external systems such as motion capture, which is essential for deployment in real-world scenarios.
|
|
10:30-12:00, Paper WeAT10-CC.2 |
Learning Motion Reconstruction from Demonstration Via Multi-Modal Soft Tactile Sensing |
|
Pan, Cheng | Swiss Federal Institute of Technology Lausanne (EPFL) |
Gilday, Kieran | EPFL |
Sologuren, Emily | MIT |
Junge, Kai | École Polytechnique Fédérale De Lausanne |
Hughes, Josie | EPFL |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Learning from Demonstration
Abstract: Learning manipulation from demonstration is a key way for humans to teach complex tasks. However, this domain mainly focuses on kinesthetic teaching and does not consider imitation of interaction forces, which is essential for more contact-rich tasks. We propose a framework that enables robotic imitation of contact from human demonstration using a wearable finger-tip sensor. By developing a multi-modal sensor (providing both force and contact location) and robotic collection of simple training data of different motion primitives (tapping, rotation and translation), an LSTM-based model can be used to replicate motion from tactile demonstration only. To evaluate this approach, we explore the performance on increasingly complex testing data generated by a robot, and also demonstrate the full pipeline from human demonstration via the sensor used as a wearable device. This approach of using tactile sensing as a means of inferring the required robot motion paves the way for imitation of more contact-rich tasks, and enables imitation of tasks where the demonstration and imitation are performed with different body schemas.
|
|
10:30-12:00, Paper WeAT10-CC.3 |
A Generalized Motion Control Framework of Dielectric Elastomer Actuators: Dynamic Modeling, Sliding-Mode Control and Experimental Evaluation |
|
Zou, Jiang | Shanghai Jiao Tong University |
Kassim, Shakiru Olajide | School of Engineering, University of Aberdeen, Scotland |
Ren, Jieji | Shanghai Jiao Tong University |
Vaziri, Vahid | University of Aberdeen |
Aphale, Sumeet S. | University of Aberdeen |
Gu, Guoying | Shanghai Jiao Tong University |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Motion Control, Dielectric elastomer actuators
Abstract: The continuous electromechanical deformation of dielectric elastomer actuators (DEAs) suffers from rate-dependent viscoelasticity, mechanical vibration and configuration dependency, making generalized dynamic modeling and precise control elusive. In this work, we present a generalized motion control framework for DEAs capable of accommodating different configurations, materials and degrees of freedom (DOFs). First, a generalized, control-enabling dynamic model is developed for DEAs by taking nonlinear electromechanical coupling, mechanical vibration and rate-dependent viscoelasticity into consideration. Further, a state observer is introduced to predict the unobservable viscoelasticity. Then, an Enhanced Exponential Reaching Law based Sliding-Mode Controller (EERLSMC) is proposed to minimize the viscoelasticity of DEAs. Its stability is also proven mathematically. The experimental results obtained for different DEAs (four configurations, two materials and multiple DOFs) demonstrate that our dynamic model can precisely describe their complex dynamic responses and that the EERLSMC can achieve precise tracking control, verifying the generality of our framework.
|
|
10:30-12:00, Paper WeAT10-CC.4 |
Vision-Based Tip Force Estimation on a Soft Continuum Robot |
|
Chen, Xingyu | University College London |
Shi, Jialei | University College London |
Wurdemann, Helge Arne | University College London |
George Thuruthel, Thomas | University College London |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Soft continuum robots, fabricated from elastomeric materials, offer unparalleled flexibility and adaptability, making them ideal for applications such as minimally invasive surgery and inspections in constrained environments. With the miniaturization of imaging technologies and the development of novel control algorithms, these devices provide exceptional opportunities to visualize the internal structures of the human body. However, there are still challenges in accurately estimating external forces applied to these systems using current technologies. Adding additional sensors is challenging without compromising the softness of the device. This work presents a visual deformation-based force sensing framework for soft continuum robots. The core idea behind this work is that point loads lead to unique deformation profiles in an actuated soft-bodied robot. We introduce a Convolutional Neural Network-based tip force estimation method that utilizes arbitrarily placed camera images and actuation inputs to predict applied tip forces. Experimental validation was performed using the STIFF-FLOP robot, a pneumatically actuated soft robot developed for minimally invasive surgery. Our vision-based force estimation model demonstrated a sensing precision of 0.05 N in the XY plane during testing, with data collection and training taking only 70 minutes.
|
|
10:30-12:00, Paper WeAT10-CC.5 |
Soft Bending Actuator with Fiber-Jamming Variable Stiffness and Fiber-Optic Proprioception |
|
Kang, Joonwon | Seoul National University |
Lee, Sudong | EPFL (École Polytechnique Fédérale De Lausanne) |
Park, Yong-Lae | Seoul National University |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Compliant Joints and Mechanisms
Abstract: Soft actuators with variable stiffness improve the adaptability of robots, expanding their range of applications and environments. We propose a tendon-driven soft bending actuator that can change its stiffness using fiber jamming. The actuator is made of an elastomer tube filled with different types of fiber. The three types of fibers play different roles: maintaining the structure, varying stiffness by jamming, and fiber-optic shape sensing, while sharing the same structure and materials, realizing a compact form factor for the entire structure. The stiffness of the actuator can be increased to more than three times its original stiffness by jamming. In addition to jamming, the proposed actuator has a special shape-sensing function that estimates the tip location of the actuator based on image sensing from optical fibers packaged with the jamming fibers. The tip position sensing shows accuracies with errors of 3.1%, 3.0%, and 6.7% for the x, y, and z axes, respectively, using feature extraction and a deep neural network. The proposed actuator has two degrees of freedom (i.e., bending on two orthogonal planes) and is controlled by two tendons. When connected in series, multiple actuators form a soft robotic manipulator (i.e., arm) that is physically compliant and capable of delivering a relatively high force to target objects.
|
|
10:30-12:00, Paper WeAT10-CC.6 |
A Light and Heat-Seeking Vine-Inspired Robot with Material-Level Responsiveness |
|
Deglurkar, Shivani | University of California, San Diego |
Xiao, Charles | University of California, Santa Barbara |
Gockowski, Luke | University of California Santa Barbara |
Valentine, Megan | University of California, Santa Barbara |
Hawkes, Elliot Wright | University of California, Santa Barbara |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Mechanism Design
Abstract: The fields of soft and bio-inspired robotics promise to imbue synthetic systems with capabilities found in the natural world. However, many of these biological capabilities are yet to be realized. For example, vines in nature direct growth via localized responses embedded in the cells of the vine body, allowing an organism without a central brain to successfully search for resources (e.g., light). To date, however, vine-inspired robots have not shown such localized embedded responsiveness. Here we present a vine-inspired robotic device with material-level responses embedded in its skin that is capable of “growing” and steering toward either a light or heat stimulus. We present basic modeling of the concept, design details, and experimental results showing its behavior in response to infrared (IR) and visible light. Our simple design concept advances the capabilities of bio-inspired robots and lays the foundation for future “growing” robots that are capable of seeking light or heat, yet are extremely simple and low-cost. Potential applications include solar tracking and, in the future, fighting smoldering fires. We envision using similar robots to find hot spots in hard-to-access environments, allowing us to put out potentially long-burning fires faster.
|
|
10:30-12:00, Paper WeAT10-CC.7 | Add to My Program |
Morphological Design for Pneumatic Soft Actuators and Robots with Desired Deformation Behavior |
|
Chen, Feifei | Shanghai Jiao Tong University |
Song, Zenan | Shanghai Jiao Tong University |
Chen, Shitong | Shanghai Jiaotong University |
Gu, Guoying | Shanghai Jiao Tong University |
Zhu, Xiangyang | Shanghai Jiao Tong University |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Shape Optimization, Soft Robot Applications
Abstract: A homogeneous pneumatic soft robot may generate complex output motions using a simple input pressure, resulting from its morphological shape that locally deforms the soft material to different degrees by simultaneously tailoring the structural characteristics and orienting the input pressure. To date, design of the morphological shape (the inverse problem) has not been fully addressed. This article outlines a geometry-mechanics-optimization integrated approach to automatically shaping a pneumatic soft actuator or robot so that it achieves the desired deformation behavior. Instead of constraining the robot's geometry within any predefined regular shape, we employ B-splines to allow generation of freeform boundary surfaces, and use nonlinear mechanical modelling and shape-derivative-based optimization to navigate the high-dimensional design space. Our design framework can readily regulate the surface quality during the morphological evolution by imposing geometric constraints, in terms of the principal curvatures and the minimal distance between surfaces, as penalty functions. The effect of external forces, including gravity and the interaction force at the end-effector, is also taken into account.
|
|
10:30-12:00, Paper WeAT10-CC.8 | Add to My Program |
Thermally-Activated Biochemically-Sustained Reactor for Soft Fluidic Actuation |
|
Liu, Jialun | The University of Sheffield |
Soliman, MennaAllah | University of Sheffield |
Damian, Dana | University of Sheffield |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Soft robots have shown remarkable capabilities owing to their high deformability. Recently, increasing attention has been dedicated to developing fully soft robots to exploit their full potential, with a recognition that electronic powering may limit this achievement. Alternative powering sources compatible with soft robots have been identified, such as combustion and chemical reactions. A further milestone for such systems would be to increase the controllability and responsiveness of their underlying reactions in order to achieve more complex behaviors for soft robots. In this paper, we present a thermally-activated reactor incorporating a biocompatible hydrogel valve that enables control of the biochemical reaction of sugar and yeast. The biochemical reaction is utilized to generate contained pressure, which in turn powers a fluidic soft actuator. Experiments were conducted to evaluate the response time of the hydrogel valves with three different crosslinker concentrations. Among the tested concentrations, we found that the lowest crosslinker concentration yielded the fastest valve response time at an ambient temperature of 50°C. We also evaluated the pressure generation capacity of the reactor, which can reach up to 0.22 bar, and demonstrated the thermo-responsive behavior of the reactor to trigger a biochemical reaction for powering a fluidic soft actuator. This work opens up the possibility of powering and controlling tetherless and fully soft robots.
|
|
10:30-12:00, Paper WeAT10-CC.9 | Add to My Program |
Pulsating Fluidic Sensor for Sensing of Location, Pressure and Contact Area |
|
Jones, Joanna | University of Sheffield |
Pontin, Marco | University of Sheffield |
Damian, Dana | University of Sheffield |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, Soft Robot Applications
Abstract: Designing information-rich and space-efficient sensors is a key challenge for soft robotics and crucial for the development of safe soft robots. Sensing and understanding environmental interactions with a minimal footprint is especially important in the medical context, where portability and unhindered patient/user movement are priorities for moving towards personalized and decentralized healthcare solutions. In this work, a pulsating fluidic soft sensor (PFS) capable of determining the location, pressure, and contact area of press events is presented. The sensor relies on spatio-temporal resistance changes driven by a pulsating conductive fluid. The sensor demonstrates good repeatability and distinction of single and multiple press events, detecting single indents of sizes greater than 1 cm, forces larger than 2 N, and various locations across the sensor, as well as multiple indents spaced 2 cm apart. Furthermore, the sensor is demonstrated in two applications to detect foot placement and grip location. Overall, the sensor represents an improvement towards minimizing the electronic hardware and cost of the sensing solution, without sacrificing the richness of the sensing information in the field of soft fluidic sensors.
|
|
WeAT11-CC Oral Session, CC-502 |
Add to My Program |
Semantic Scene Understanding I |
|
|
Chair: Fujii, Hiromitsu | Chiba Institute of Technology |
Co-Chair: Beetz, Michael | University of Bremen |
|
10:30-12:00, Paper WeAT11-CC.1 | Add to My Program |
Perception through Cognitive Emulation: “A Second Iteration of NaivPhys4RP for Learningless and Safe Recognition and 6D-Pose Estimation of (Transparent) Objects” |
|
Kenghagho Kenfack, Franklin | University of Bremen |
Neumann, Michael | Uni Bremen |
Mania, Patrick | University of Bremen |
Beetz, Michael | University of Bremen |
Keywords: Semantic Scene Understanding, Cognitive Modeling, Perception for Grasping and Manipulation
Abstract: In our previous work, we designed NaivPhys4RP, a human-like, white-box, and causal generative model of perception, essentially based on cognitive emulation, to understand the past, present, and future state of complex worlds from poor observations. In this paper, as recommended in that previous work, we first refine the theoretical model of NaivPhys4RP in terms of the integration of variables and the perceptual inference tasks to solve. Intuitively, the system is closed under the injection, update, and dependency of variables. Then, we present a first implementation of NaivPhys4RP that demonstrates learningless and safe recognition and 6D-pose estimation of objects from poor sensor data (e.g., occlusion, transparency, poor depth, in-hand). This not only makes a substantial step forward compared to classical perception systems in perceiving objects in these scenarios, but also escapes the burden of data-intensive learning and operates safely (transparency and causality: we fit sensor data into mentally constructed meaningful worlds). With respect to ChatGPT's ambitions, the system can imagine physico-realistic socio-physical scenes from texts and demonstrate understanding of these texts, all without data- and resource-intensive learning.
|
|
10:30-12:00, Paper WeAT11-CC.2 | Add to My Program |
Mapping High-Level Semantic Regions in Indoor Environments without Object Recognition |
|
Bigazzi, Roberto | University of Modena and Reggio Emilia |
Baraldi, Lorenzo | Università Degli Studi Di Modena E Reggio Emilia |
Kousik, Shreyas | Georgia Institute of Technology |
Cucchiara, Rita | Università Degli Studi Di Modena E Reggio Emilia |
Pavone, Marco | Stanford University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Visual Learning
Abstract: Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments. In the literature, there has been an extensive focus on object labeling and exhaustive scene graph generation; less effort has been focused on the task of purely identifying and mapping large semantic regions. The present work proposes a method for semantic region mapping via embodied navigation in indoor environments, generating a high-level representation of the knowledge of the agent. To enable region identification, the method uses a vision-to-language model to provide scene information for mapping. By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location. This mapping procedure is paired with a trained navigation policy to enable autonomous map generation. The proposed method significantly outperforms a variety of baselines, including an object-based system and a pretrained scene classifier, in experiments in a photorealistic simulator.
|
|
10:30-12:00, Paper WeAT11-CC.3 | Add to My Program |
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model As an Agent |
|
Yang, Jianing | University of Michigan |
Chen, Xuweiyi | University of Michigan |
Qian, Shengyi | University of Michigan |
Madaan, Nikhil | Bloomberg |
Iyengar, Madhavan | University of Michigan |
Fouhey, David | University of Michigan |
Chai, Joyce | University of Michigan |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, RGB-D Perception
Abstract: 3D visual grounding is a critical skill for household robots, enabling them to navigate, manipulate objects, and answer questions based on their environment. While existing approaches often rely on extensive labeled data or exhibit limitations in handling complex language queries, we propose LLM-Grounder, a novel zero-shot, open-vocabulary, Large Language Model (LLM)-based 3D visual grounding pipeline. LLM-Grounder utilizes an LLM to decompose complex natural language queries into semantic constituents and employs a visual grounding tool, such as OpenScene or LERF, to identify objects in a 3D scene. The LLM then evaluates the spatial and commonsense relations among the proposed objects to make a final grounding decision. Our method does not require any labeled training data and can generalize to novel 3D scenes and arbitrary text queries. We evaluate LLM-Grounder on the ScanRefer benchmark and demonstrate state-of-the-art zero-shot grounding accuracy. Our findings indicate that LLMs significantly improve the grounding capability, especially for complex language queries, making LLM-Grounder an effective approach for 3D vision-language tasks in robotics.
|
|
10:30-12:00, Paper WeAT11-CC.4 | Add to My Program |
Learning Off-Road Terrain Traversability with Self-Supervisions Only |
|
Seo, Junwon | Agency for Defense Development |
Sim, Sungdae | Agency for Defense Development |
Shim, Inwook | Inha University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Estimating the traversability of terrain should be reliable and accurate in diverse conditions for autonomous driving in off-road environments. However, learning-based approaches often yield unreliable results when confronted with unfamiliar contexts, and it is challenging to obtain manual annotations frequently for new circumstances. In this paper, we introduce a method for learning traversability from images that utilizes only self-supervision and no manual labels, enabling it to easily learn traversability in new circumstances. To this end, we first generate self-supervised traversability labels from past driving trajectories by labeling regions traversed by the vehicle as highly traversable. Using the self-supervised labels, we then train a neural network that identifies terrains that are safe to traverse from an image using a one-class classification algorithm. Additionally, we supplement the limitations of self-supervised labels by incorporating methods of self-supervised learning of visual representations. To conduct a comprehensive evaluation, we collect data in a variety of driving environments and perceptual conditions and show that our method produces reliable estimations in various environments. In addition, the experimental results validate that our method outperforms other self-supervised traversability estimation methods and achieves comparable performance with supervised learning methods trained on manually labeled data.
|
|
10:30-12:00, Paper WeAT11-CC.5 | Add to My Program |
Improving Radial Imbalances with Hybrid Voxelization and RadialMix for LiDAR 3D Semantic Segmentation |
|
Li, Jiale | Zhejiang University |
Dai, Hang | University of Glasgow |
Wang, Yu | YUNJI Technology Co. Ltd |
Cao, Guangzhi | Pegasus Technology |
Luo, Chun | YUNJI Technology Co. Ltd |
Ding, Yong | Zhejiang University |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Huge progress has been made in LiDAR 3D semantic segmentation, but there are still two under-explored imbalances on the radial axis: points are unevenly concentrated on the near side, and the distribution of foreground object instances is skewed to the near side. This leads the training of the model to favor semantics at the near side, which has the majority of points and object instances. Both the popular cylindrical and the neglected spherical voxelizations aim to address the problem of imbalanced point distribution by increasing the volume of voxels along the radial distance, so as to include fewer near-side points in a smaller voxel and more far-side points in a bigger voxel. However, this causes the receptive field to enlarge along the radial distance, which is not desirable in LiDAR point clouds since the size of an object is distance-independent. This can be addressed by cubic voxelization, which has a fixed voxel volume. Thus, we propose a new LiDAR 3D semantic segmentation network (Hi-VoxelNet) with Hybrid Voxelization that leverages the advantages of cubic, cylindrical, and spherical voxelizations for hybrid voxel feature learning. To address the radial imbalance of object instances, we propose a novel data augmentation technique termed RadialMix that uses radial sample duplication to increase the number of distant foreground object instances and mixes the radial duplication with another point cloud to enrich the training samples. With the joint improvements of the radial imbalances, our method achieves state-of-the-art performance on the nuScenes and SemanticKITTI datasets, and consistently shows significant improvements along the radial distances. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
|
|
10:30-12:00, Paper WeAT11-CC.6 | Add to My Program |
Few-Shot Panoptic Segmentation with Foundation Models |
|
Käppeler, Markus | University of Freiburg |
Petek, Kürsat | University of Freiburg |
Vödisch, Niclas | University of Freiburg |
Burgard, Wolfram | University of Technology Nuremberg |
Valada, Abhinav | University of Freiburg |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Object Detection, Segmentation and Categorization
Abstract: Current state-of-the-art methods for panoptic segmentation require an immense amount of annotated training data that is both arduous and expensive to obtain, posing a significant challenge for their widespread adoption. Concurrently, recent breakthroughs in visual representation learning have sparked a paradigm shift, leading to the advent of large foundation models that can be trained with completely unlabeled images. In this work, we propose to leverage such task-agnostic image features to enable few-shot panoptic segmentation by presenting Segmenting Panoptic Information with Nearly 0 labels (SPINO). In detail, our method combines a DINOv2 backbone with lightweight network heads for semantic segmentation and boundary estimation. We show that our approach, albeit being trained with only ten annotated images, predicts high-quality pseudo-labels that can be used with any existing panoptic segmentation method. Notably, we demonstrate that SPINO achieves competitive results compared to fully supervised baselines while using less than 0.3% of the ground truth labels, paving the way for learning complex visual recognition tasks by leveraging foundation models. To illustrate its general applicability, we further deploy SPINO on real-world robotic vision systems for both outdoor and indoor environments. To foster future research, we make the code and trained models publicly available at http://spino.cs.uni-freiburg.de.
|
|
10:30-12:00, Paper WeAT11-CC.7 | Add to My Program |
End-To-End Semantic Segmentation Network for Low-Light Scenes |
|
Mu, Hongmin | Beijing University of Chemical Technology |
Zhang, Gang | Beijing University of Chemical Technology |
Zhou, MengChu | New Jersey Institute of Technology |
Cao, Zhengcai | Harbin Institute of Technology |
Keywords: Semantic Scene Understanding, Deep Learning for Visual Perception, Visual Learning
Abstract: In the fields of robotic perception and computer vision, achieving accurate semantic segmentation of low-light or nighttime scenes is challenging. This is primarily due to the limited visibility of objects and the reduced texture and color contrasts among them. To address the issue of limited visibility, we propose a hierarchical gated convolution unit, which simultaneously expands the receptive field and restores edge texture. To address the issue of reduced texture among objects, we propose a dual closed-loop bipartite matching algorithm to establish a total loss function consisting of the unsupervised illumination enhancement loss and supervised intersection-over-union loss, thus enabling the joint minimization of both losses via the Hungarian algorithm. We thus achieve end-to-end training for a semantic segmentation network especially suitable for handling low-light scenes. Experimental results demonstrate that the proposed network surpasses existing methods on the Cityscapes dataset and notably outperforms state-of-the-art methods on both Dark Zurich and Nighttime Driving datasets.
|
|
10:30-12:00, Paper WeAT11-CC.8 | Add to My Program |
DefFusion: Deformable Multimodal Representation Fusion for 3D Semantic Segmentation |
|
Xu, Rongtao | Institute of Automation, Chinese Academy of Sciences, Beijing, C |
Wang, Changwei | Casia |
Zhang, Duzhen | Institute of Automation, Chinese Academy of Sciences |
Zhang, Man | Beijing University of Posts and Telecommunications |
Xu, Shibiao | Beijing University of Posts and Telecommunications |
Meng, Weiliang | Institute of Automation, Chinese Academy of Sciences |
Zhang, Xiaopeng | National Laboratory of Pattern Recognition, Institute of Automat |
Keywords: Semantic Scene Understanding, Autonomous Agents, Sensor Fusion
Abstract: The complementarity between camera and LiDAR data makes fusion methods a promising approach to improving 3D semantic segmentation performance. Recent transformer-based methods have also demonstrated superiority in segmentation. However, multimodal solutions incorporating transformers are underexplored and face two key inherent difficulties: over-attention and noise from different modal data. To overcome these challenges, we propose a Deformable Multimodal Representation Fusion (DefFusion) framework consisting mainly of a Deformable Representation Fusion Transformer and Dynamic Representation Enhancement Modules. The Deformable Representation Fusion Transformer introduces a deformable mechanism into multimodal fusion, avoiding over-attention and improving efficiency by adaptively modeling a 2D key/value set for a given 3D query, thus enabling multimodal fusion with higher flexibility. To enhance the 2D and 3D representations, the Dynamic Representation Enhancement Module is proposed to dynamically remove noise in the input representation via Dynamic Grouped Representation Generation and Dynamic Mask Generation. Extensive experiments validate that our model achieves the best 3D semantic segmentation performance on the SemanticKITTI and NuScenes benchmarks.
|
|
10:30-12:00, Paper WeAT11-CC.9 | Add to My Program |
Lifelong LERF: Local 3D Semantic Inventory Monitoring Using FogROS2 |
|
Rashid, Adam | UC Berkeley |
Kim, Chung Min | University of California, Berkeley |
Kerr, Justin | University of California, Berkeley |
Fu, Letian | UC Berkeley |
Hari, Kush | UC Berkeley |
Ahmad, Ayah | University of California, Berkeley |
Chen, Kaiyuan | University of California, Berkeley |
Huang, Huang | University of California at Berkeley |
Gualtieri, Marcus | Bosch Research |
Wang, Michael | Bosch |
Juette, Christian | Bosch Research |
Tian, Nan | University of California, Berkeley |
Ren, Liu | Robert Bosch North America Research Technology Center |
Goldberg, Ken | UC Berkeley |
Keywords: Semantic Scene Understanding, Continual Learning, SLAM
Abstract: Inventory monitoring in homes, factories, and retail stores relies on maintaining data despite objects being swapped, added, removed, or moved. We introduce Lifelong LERF, a method that allows a mobile robot with minimal compute to jointly optimize a dense language and geometric representation of its surroundings. Lifelong LERF maintains this representation over time by detecting semantic changes and selectively updating these regions of the environment, avoiding the need to exhaustively remap. Human users can query inventory by providing natural language queries and receiving a 3D heatmap of potential object locations. To manage the computational load, we use FogROS2, a cloud robotics platform, to offload resource-intensive tasks. Lifelong LERF obtains poses from a monocular RGBD SLAM backend, and uses these poses to progressively optimize a Language Embedded Radiance Field (LERF) for semantic monitoring. Experiments with 3-5 objects arranged on a tabletop and a Turtlebot with a RealSense camera suggest that Lifelong LERF can persistently adapt to changes in objects with up to 91% accuracy.
|
|
WeAT12-CC Oral Session, CC-503 |
Add to My Program |
Deep Learning in Grasping and Manipulation IV |
|
|
Chair: Yamazaki, Kimitoshi | Shinshu University |
Co-Chair: Dijkman, Daniel | Qualcomm |
|
10:30-12:00, Paper WeAT12-CC.1 | Add to My Program |
RGBManip: Monocular Image-Based Robotic Manipulation through Active Object Pose Estimation |
|
An, Boshi | Peking University |
Geng, Yiran | Peking University |
Chen, Kai | The Chinese University of Hong Kong |
Li, Xiaoqi | Peking University |
Dou, Qi | The Chinese University of Hong Kong |
Dong, Hao | Peking University |
Keywords: AI-Based Methods, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: Robotic manipulation requires accurate perception of the environment, which poses a significant challenge due to its inherent complexity and constantly changing nature. In this context, RGB images and point-cloud observations are two commonly used modalities in vision-based robotic manipulation, but each of these modalities has its own limitations. Commercial point-cloud observations often suffer from issues like sparse sampling and noisy output due to the limits of the emission-reception imaging principle. On the other hand, RGB images, while rich in texture information, lack the essential depth and 3D information crucial for robotic manipulation. To mitigate these challenges, we propose an image-only robotic manipulation framework that leverages an eye-on-hand monocular camera installed on the robot's parallel gripper. By moving with the robot gripper, this camera gains the ability to actively perceive the object from multiple perspectives during the manipulation process. This enables the estimation of 6D object poses, which can be utilized for manipulation. While obtaining images from more and diverse viewpoints typically improves pose estimation, it also increases the manipulation time. To address this trade-off, we employ a reinforcement learning policy to synchronize the manipulation strategy with active perception, achieving a balance between 6D pose accuracy and manipulation efficiency. Our experimental results in both simulated and real-world environments showcase the state-of-the-art effectiveness of our approach. We believe that our method will inspire further research on real-world-oriented robotic manipulation.
|
|
10:30-12:00, Paper WeAT12-CC.2 | Add to My Program |
Part-Guided 3D RL for Sim2Real Articulated Object Manipulation |
|
Xie, Pengwei | Tsinghua University |
Chen, Rui | Tsinghua University |
Chen, Siang | Tsinghua University |
Qin, Yuzhe | UC San Diego |
Xiang, Fanbo | University of California San Diego |
Sun, Tianyu | Tsinghua University |
Xu, Jing | Tsinghua University |
Wang, Guijin | Tsinghua University |
Su, Hao | UCSD |
Keywords: Deep Learning in Grasping and Manipulation, RGB-D Perception, Reinforcement Learning
Abstract: Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on visual affordance learning or other pre-trained visual models to guide manipulation policies, which face challenges for novel instances in real-world scenarios. In this paper, we propose a novel part-guided 3D RL framework, which can learn to manipulate articulated objects without demonstrations. We combine the strengths of 2D segmentation and 3D RL to improve the efficiency of RL policy training. To improve the stability of the policy on real robots, we design a Frame-consistent Uncertainty-aware Sampling (FUS) strategy to get a condensed and hierarchical 3D representation. In addition, a single versatile RL policy can be trained on multiple articulated object manipulation tasks simultaneously in simulation and shows great generalizability to novel categories and instances. Experimental results demonstrate the effectiveness of our framework in both simulation and real-world settings.
|
|
10:30-12:00, Paper WeAT12-CC.3 | Add to My Program |
MORPH: Design Co-Optimization with Reinforcement Learning Via a Differentiable Hardware Model Proxy |
|
He, Zhanpeng | Columbia University |
Ciocarlie, Matei | Columbia University |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Mechanism Design
Abstract: We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning. Like most co-optimization methods, MORPH relies on a model of the hardware being optimized, usually simulated based on the laws of physics. However, such a model is often difficult to integrate into an effective optimization routine. To address this, we introduce a proxy hardware model, which is always differentiable and enables efficient co-optimization alongside a long-horizon control policy using RL. MORPH is designed to ensure that the optimized hardware proxy remains as close as possible to its realistic counterpart, while still enabling task completion. We demonstrate our approach on simulated 2D reaching and 3D multi-fingered manipulation tasks.
|
|
10:30-12:00, Paper WeAT12-CC.4 | Add to My Program |
Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots |
|
Lampe, Thomas | Google UK Ltd |
Abdolmaleki, Abbas | DeepMind |
Huang, Sandy H. | Google DeepMind |
Bechtle, Sarah | Google DeepMind |
Springenberg, Jost Tobias | Albert-Ludwigs Universitaet Freiburg |
Bloesch, Michael | Google |
Groth, Oliver | University of Oxford |
Hafner, Roland | Google DeepMind |
Hertweck, Tim | DeepMind |
Neunert, Michael | Google |
Wulfmeier, Markus | Google DeepMind |
Zhang, Jingwei | DeepMind |
Nori, Francesco | Google DeepMind |
Heess, Nicolas | Google Deepmind |
Riedmiller, Martin | DeepMind |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Machine Learning for Robot Control
Abstract: Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient through re-using previously collected sub-optimal data. In this paper we demonstrate how the increased understanding of off-policy learning methods and their embedding in an iterative online/offline scheme ("collect and infer") can drastically improve data-efficiency by using all the collected experience, which empowers learning from real robot experience only. Moreover, the resulting policy improves significantly over the state of the art on a recently proposed real robot manipulation benchmark. Our approach learns end-to-end, directly from pixels, and does not rely on additional human domain knowledge such as a simulator or demonstrations.
|
|
10:30-12:00, Paper WeAT12-CC.5 | Add to My Program |
Information-Driven Affordance Discovery for Efficient Robotic Manipulation |
|
Mazzaglia, Pietro | University of Gent |
Cohen, Taco | Qualcomm AI Research |
Dijkman, Daniel | Qualcomm |
Keywords: AI-Based Methods, Deep Learning in Grasping and Manipulation, Perception for Grasping and Manipulation
Abstract: Robotic affordances, providing information about what actions can be taken in a given situation, can aid robotic manipulation. However, learning about affordances requires expensive, large annotated datasets of interactions or demonstrations. In this work, we argue that well-directed interactions with the environment can mitigate this problem, and we propose an information-based measure to augment the agent's objective and accelerate the affordance discovery process. We provide a theoretical justification for our approach and empirically validate it in both simulated and real-world tasks. Our method, which we dub IDA, enables the efficient discovery of visual affordances for several action primitives, such as grasping, stacking objects, or opening drawers, strongly improving data efficiency in simulation, and it allows us to learn grasping affordances in a small number of interactions on a real-world setup with a UFACTORY xArm 6 robot arm.
|
|
10:30-12:00, Paper WeAT12-CC.6 | Add to My Program |
HybGrasp: A Hybrid Learning-To-Adapt Architecture for Efficient Robot Grasping |
|
Mun, Jungwook | Korea Advanced Institute of Science and Technology |
Truong Giang, Khang | KAIST |
Lee, Yunghee | Korea Advanced Institute of Science and Technology |
Oh, Nayoung | KAIST |
Huh, Sejoon | Korea Advanced Institute of Science and Technology |
Kim, Min | KAIST |
Jo, Sungho | Korea Advanced Institute of Science and Technology (KAIST) |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Multifingered Hands
Abstract: Despite the prevalence of robotic manipulation tasks in various real-world applications with differing requirements and needs, there has been a lack of focus on enhancing the adaptability of robotic grasping systems. Most of the current literature constructs models around a single gripper, succumbing to a tradeoff between gripper complexity and generalizability. Adapting such models, pre-trained on one type of gripper, to another to work around the tradeoff is inefficient and not scalable, as it would require tremendous effort and computational cost to generate new datasets and relearn the grasping task. In this letter, we propose a novel hybrid architecture for robot grasping that efficiently learns to adapt to different gripper designs. Our approach involves a three-step process that first obtains a rough grasp pose prediction from a parallel gripper model, then predicts an adaptive action using a convolutional neural network, and finally refines the predicted action with reinforcement learning. The proposed method shows significant improvements in grasping performance compared to existing methods for both generated datasets and real-world scenarios, presenting a promising direction for improving the adaptability and flexibility of robotic manipulation systems.
|
|
10:30-12:00, Paper WeAT12-CC.7 | Add to My Program |
Efficient Heatmap-Guided 6-DoF Grasp Detection in Cluttered Scenes |
|
Chen, Siang | Tsinghua University |
Tang, Wei | Tsinghua University |
Xie, Pengwei | Tsinghua University |
Yang, Wenming | Tsinghua University |
Wang, Guijin | Tsinghua University |
Keywords: Deep Learning in Grasping and Manipulation, RGB-D Perception, Grasping
Abstract: Fast and robust object grasping in clutter is a crucial component of robotics. Most current works resort to the whole observed point cloud for 6-DoF grasp generation, ignoring the guidance information that can be extracted from global semantics, which limits both grasp quality and real-time performance. In this work, we show that widely used heatmaps are an underestimated tool for efficient 6-DoF grasp generation. We therefore propose an effective local grasp generator guided by grasp heatmaps, which infers in a global-to-local, semantic-to-point manner. Specifically, Gaussian encoding and a grid-based strategy are applied to predict grasp heatmaps that aggregate local points into graspable regions and provide global semantic information. Further, a novel non-uniform anchor sampling mechanism is designed to improve grasp accuracy and diversity. Benefiting from high-efficiency encoding in the image space and from focusing on points in local graspable regions, our framework performs high-quality grasp detection in real time and achieves state-of-the-art results. In addition, real robot experiments demonstrate the effectiveness of our method with a success rate of 94% and a clutter completion rate of 100%.
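Gaussian encoding of grasp locations into a heatmap can be sketched as follows (an illustrative sketch, not the paper's code; the grid size and sigma are made-up values): each candidate grasp center is rendered as a 2-D Gaussian peak on an image-space grid.

```python
import numpy as np

def gaussian_heatmap(h, w, center, sigma=2.0):
    """Render a grasp-confidence heatmap: a 2-D Gaussian at `center`
    on an h x w grid, peaking at 1.0 at the grasp location."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

# Encode a hypothetical grasp center; its argmax recovers the location.
hm = gaussian_heatmap(32, 32, (10, 20))
peak = np.unravel_index(np.argmax(hm), hm.shape)
```

In a heatmap-guided pipeline, such maps serve as supervision targets; at inference, high-heatmap regions select the local points that a grasp generator then refines.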
|
|
10:30-12:00, Paper WeAT12-CC.8 | Add to My Program |
A Hyper-Network Based End-To-End Visual Servoing with Arbitrary Desired Poses |
|
Yu, Hongxiang | Zhejiang University |
Chen, Anzhe | Zhejiang University |
Xu, Kechun | Zhejiang University |
Zhou, Zhongxiang | Zhejiang University |
Jing, Wei | Alibaba |
Wang, Yue | Zhejiang University |
Xiong, Rong | Zhejiang University |
Keywords: Deep Learning in Grasping and Manipulation, Transfer Learning, Visual Servoing
Abstract: Recently, several works have achieved end-to-end visual servoing (VS) for robotic manipulation by replacing the traditional controller with differentiable neural networks, but they lose the ability to servo arbitrary desired poses. This letter proposes a differentiable architecture for arbitrary-pose servoing: a hyper-network based neural controller (HPN-NC). HPN-NC consists of a hyper net and a low-level controller: the hyper net learns to generate the parameters of the low-level controller, and the controller uses the 2D keypoint error for control, as in traditional image-based visual servoing (IBVS). HPN-NC can complete six-degree-of-freedom visual servoing with large initial offsets. Taking advantage of the fully differentiable nature of HPN-NC, we provide a three-stage training procedure to servo real-world objects. With self-supervised end-to-end training, the performance of the integrated model can be further improved in unseen scenes, and the amount of manual annotation can be significantly reduced.
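The hyper-network pattern described above can be sketched in a few lines (a minimal linear sketch under assumed dimensions, not the HPN-NC architecture itself): one network maps the desired pose to the weights of a low-level controller that acts on 2-D keypoint errors.

```python
import numpy as np

rng = np.random.default_rng(0)

class HyperController:
    """Hypernetwork sketch: a linear hyper net maps a desired-pose
    embedding to the weight matrix of a low-level proportional
    controller acting on flattened 2-D keypoint errors (IBVS-style)."""

    def __init__(self, pose_dim=6, n_kp=4, act_dim=6):
        self.n_kp, self.act_dim = n_kp, act_dim
        # Hyper-net parameters: one linear map per controller weight.
        self.W_hyper = rng.normal(0.0, 0.1, (act_dim * 2 * n_kp, pose_dim))

    def controller_weights(self, desired_pose):
        """Generate the low-level controller's gain matrix for this pose."""
        w = self.W_hyper @ desired_pose
        return w.reshape(self.act_dim, 2 * self.n_kp)

    def act(self, desired_pose, kp_error):
        """kp_error: flattened 2-D errors of n_kp tracked keypoints."""
        return self.controller_weights(desired_pose) @ kp_error

hc = HyperController()
action = hc.act(np.ones(6), np.zeros(8))  # zero error -> zero command
```

The point of the construction is that changing the desired pose changes the controller itself, not just its input, which is what restores arbitrary-pose servoing.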
|
|
10:30-12:00, Paper WeAT12-CC.9 | Add to My Program |
6-DoF Closed-Loop Grasping with Reinforcement Learning |
|
Herland, Sverre | Norwegian University of Science and Technology |
Bach, Kerstin | Norwegian University of Science and Technology |
Misimi, Ekrem | SINTEF Ocean |
Keywords: Deep Learning in Grasping and Manipulation, Reinforcement Learning, Perception for Grasping and Manipulation
Abstract: We present a novel vision-based, 6-DoF grasping framework based on Deep Reinforcement Learning (DRL) that is capable of directly synthesizing continuous 6-DoF actions in Cartesian space. Our proposed approach uses visual observations from an eye-in-hand RGB-D camera, and we mitigate the sim-to-real gap with a combination of domain randomization, image augmentation, and segmentation tools. Our method consists of an off-policy, maximum-entropy, Actor-Critic algorithm that learns a policy from a binary reward and a few simulated example grasps. It does not need any real-world grasping examples, is trained completely in simulation, and is deployed directly to the real world without any fine-tuning. The efficacy of our approach is demonstrated in simulation and experimentally validated in the real world on 6-DoF grasping tasks, achieving state-of-the-art results of an 86% mean zero-shot success rate on previously unseen objects, an 85% mean zero-shot success rate on a class of previously unseen adversarial objects, and a 74.3% mean zero-shot success rate on a class of previously unseen, challenging "6-DoF" objects. Raw footage of real-world validation can be found at https://youtu.be/bwPf8Imvook
|
|
WeAT13-AX Oral Session, AX-201 |
Add to My Program |
Human-Robot Collaboration I |
|
|
Chair: Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
Co-Chair: Zhang, Yunbo | Rochester Institute of Technology |
|
10:30-12:00, Paper WeAT13-AX.1 | Add to My Program |
Self-Supervised 6-DoF Robot Grasping by Demonstration Via Augmented Reality Teleoperation System |
|
Dengxiong, Xiwen | Rochester Institute of Technology |
Wang, Xueting | Rochester Institute of Technology |
Bai, Shi | Wing |
Zhang, Yunbo | Rochester Institute of Technology |
Keywords: Human-Centered Automation, Telerobotics and Teleoperation, Learning from Demonstration
Abstract: Most existing 6-DoF robot grasping solutions depend on strong supervision of the grasp pose to ensure satisfactory performance, which can be laborious and impractical when the robot works in restricted areas. To this end, we propose a self-supervised 6-DoF grasp pose detection framework built on an Augmented Reality (AR) teleoperation system that can efficiently learn from human demonstrations and provide 6-DoF grasp poses without grasp pose annotations. Specifically, the system collects human demonstrations in the AR environment and contrastively learns the grasping strategy from them. In real-world experiments, the proposed system achieves satisfactory grasping performance and learns to grasp unknown objects within three demonstrations.
|
|
10:30-12:00, Paper WeAT13-AX.2 | Add to My Program |
Trust Recognition in Human-Robot Cooperation Using EEG |
|
Xu, Caiyue | Tongji University |
Zhang, Changming | Tongji University |
Zhou, Yanmin | Tongji University |
Wang, Zhipeng | Tongji University |
Lu, Ping | Tongji University |
He, Bin | Tongji University |
Keywords: Acceptability and Trust, Human-Robot Collaboration
Abstract: Collaboration between humans and robots is becoming increasingly crucial in our daily life. In order to accomplish efficient cooperation, trust recognition is vital, empowering robots to predict human behaviors and make trust-aware decisions. Consequently, there is an urgent need for a generalized approach to recognize human-robot trust. This study addresses this need by introducing an EEG-based method for trust recognition during human-robot cooperation. A human-robot cooperation game scenario is used to stimulate various human trust levels when working with robots. To enhance recognition performance, the study proposes an EEG Vision Transformer model coupled with a 3-D spatial representation to capture the spatial information of EEG, taking into account the topological relationship among electrodes. To validate this approach, a public EEG-based human trust dataset called EEGTrust is constructed. Experimental results indicate the effectiveness of the proposed approach, achieving an accuracy of 74.99% in slice-wise cross-validation and 62.00% in trial-wise cross-validation. This outperforms baseline models in both recognition accuracy and generalization. Furthermore, an ablation study demonstrates a significant improvement in trust recognition performance of the spatial representation. The source code and EEGTrust dataset are available at https://github.com/CaiyueXu/EEGTrust.
|
|
10:30-12:00, Paper WeAT13-AX.3 | Add to My Program |
Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test (I) |
|
Khojasteh, Behnam | Max Planck Institute for Intelligent Systems |
Solowjow, Friedrich | RWTH Aachen University |
Trimpe, Sebastian | RWTH Aachen University |
Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
Keywords: Human-Centered Automation, Force and Tactile Sensing
Abstract: Machine learning and deep learning have been used extensively to classify physical surfaces through images and time-series contact data. However, these methods rely on human expertise and entail the time-consuming processes of data and parameter tuning. To overcome these challenges, we propose an easily implemented framework that can directly handle heterogeneous data sources for classification tasks. Our data-versus-data approach automatically quantifies distinctive differences in distributions in a high-dimensional space via kernel two-sample testing between two sets extracted from multimodal data (e.g., images, sounds, haptic signals). We demonstrate the effectiveness of our technique by benchmarking against expertly engineered classifiers for visual-audio-haptic surface recognition due to the industrial relevance, difficulty, and competitive baselines of this application; ablation studies confirm the utility of key components of our pipeline. As shown in our open-source code, we achieve 97.2% accuracy on a standard multi-user dataset with 108 surface classes, outperforming the state-of-the-art machine-learning algorithm by 6% on a more difficult version of the task. The fact that our classifier obtains this performance with minimal data processing in the standard algorithm setting reinforces the powerful nature of kernel methods for learning to recognize complex patterns.
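The kernel two-sample test at the heart of this data-versus-data approach compares two sample sets via the Maximum Mean Discrepancy (MMD). A minimal sketch with an RBF kernel follows (the bandwidth and synthetic data are illustrative assumptions, not from the paper):

```python
import numpy as np

def mmd2_unbiased(X, Y, gamma=1.0):
    """Unbiased squared Maximum Mean Discrepancy with an RBF kernel:
    the test statistic of the kernel two-sample test. Near zero when
    X and Y come from the same distribution, large otherwise."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))  # drop diagonal
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(1)
same = mmd2_unbiased(rng.normal(0, 1, (60, 3)), rng.normal(0, 1, (60, 3)))
diff = mmd2_unbiased(rng.normal(0, 1, (60, 3)), rng.normal(3, 1, (60, 3)))
```

For surface recognition, `X` and `Y` would be feature sets extracted from two recordings (images, sounds, haptic signals); a large statistic indicates two different surfaces.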
|
|
10:30-12:00, Paper WeAT13-AX.4 | Add to My Program |
Learning User Preferences for Complex Cobotic Tasks: Meta-Behaviors and Human Groups |
|
Vella, Elena | The University of Melbourne |
Chapman, Airlie | University of Melbourne |
Lipovetzky, Nir | The University of Melbourne |
Keywords: Acceptability and Trust, Human-Robot Teaming, Human-Robot Collaboration
Abstract: In complex tasks (beyond a single targeted controller) requiring robots to collaborate with multiple human users, two challenges arise: complex tasks are often composed of multiple behaviors which can only be evaluated as a collective (a meta-behavior) and user preferences often differ between individuals, yet successful interactions are expected across groups. To address these challenges, we formulate a set-wise preference learning problem, and validate a cost function that captures human group preferences for complex collaborative robotic tasks (cobotics). We develop a sparse optimization formulation to introduce a distinctiveness metric that aggregates individuals with similar preference profiles. Analysis of anonymized unlabelled preferences provides further insight into group preferences. Identification of the mode average most-preferred meta-behavior and minimum covariance bound allows us to analyze group cohesion. A user study with 43 participants is used to validate group preference profiles.
|
|
10:30-12:00, Paper WeAT13-AX.5 | Add to My Program |
Learning Self-Confidence from Semantic Action Embeddings for Improved Trust in Human-Robot Interaction |
|
Goubard, Cedric | Imperial College London |
Demiris, Yiannis | Imperial College London |
Keywords: Acceptability and Trust, Human-Centered Robotics, Human-Robot Collaboration
Abstract: In Human-Robot Interaction scenarios, human factors like trust can greatly impact task performance and interaction quality. Recent research has confirmed that perceived robot proficiency is a major antecedent of trust. By making robots aware of their capabilities, we can allow them to choose when to perform low-confidence actions, thus actively controlling the risk of trust reduction. In this paper, we propose Self-Confidence through Observed Novel Experiences (SCONE), a policy to learn self-confidence from experience using semantic action embeddings. Using an assistive cooking setting, we show that the semantic aspect allows SCONE to learn self-confidence faster than existing approaches, while also achieving promising performance in simple instruction following. Finally, we share results from a pilot study with 31 participants, showing that such a self-confidence-aware policy increases capability-based human trust.
|
|
10:30-12:00, Paper WeAT13-AX.6 | Add to My Program |
Interactive Navigation in Environments with Traversable Obstacles Using Large Language and Vision-Language Models |
|
Zhang, Zhen | The Chinese University of Hong Kong |
Lin, Anran | The Chinese University of Hong Kong |
Wong, Chun Wai | The Chinese University of Hong Kong |
Chu, Xiangyu | The Chinese University of Hong Kong |
Dou, Qi | The Chinese University of Hong Kong |
Au, K. W. Samuel | The Chinese University of Hong Kong |
Keywords: Human-Centered Robotics, AI-Based Methods, Reactive and Sensor-Based Planning
Abstract: This paper proposes an interactive navigation framework by using large language and vision-language models, allowing robots to navigate in environments with traversable obstacles. We utilize the large language model (GPT-3.5) and the open-set Vision-language Model (Grounding DINO) to create an action-aware costmap to perform effective path planning without fine-tuning. With the large models, we can achieve an end-to-end system from textual instructions like “Can you pass through the curtains to deliver medicines to me?”, to bounding boxes (e.g., curtains) with action-aware attributes. They can be used to segment LiDAR point clouds into two parts: traversable and untraversable parts, and then an action-aware costmap is constructed for generating a feasible path. The pre-trained large models have great generalization ability and do not require additional annotated data for training, allowing fast deployment in the interactive navigation tasks. We choose to use multiple traversable objects such as curtains and grasses for verification by instructing the robot to traverse them. Besides, traversing curtains in a medical scenario was tested. All experimental results demonstrated the proposed framework’s effectiveness and adaptability to diverse environments.
|
|
10:30-12:00, Paper WeAT13-AX.7 | Add to My Program |
From Unstable Electrode Contacts to Reliable Control: A Deep Learning Approach for HD-sEMG in Neurorobotics |
|
Tyacke, Eion | New York University |
Gupta, Kunal | New York University |
Patel, Jay | New York University |
Katoch, Raghav | New York University |
Atashzar, S. Farokh | New York University (NYU), US |
Keywords: Human-Centered Robotics, Brain-Machine Interfaces, Gesture, Posture and Facial Expressions
Abstract: In the past decade, there has been significant advancement in designing wearable neural interfaces for controlling neurorobotic systems, particularly bionic limbs. These interfaces function by decoding signals captured non-invasively from the skin's surface. Portable high-density surface electromyography (HD-sEMG) modules combined with deep learning decoding have attracted interest by achieving excellent gesture prediction and myoelectric control of prosthetic systems and neurorobots. However, factors like small electrode size and unstable electrode-skin contacts make HD-sEMG susceptible to pixel-electrode drops. Sparse electrode-skin disconnections, rooted in issues such as low adhesion, sweating, hair blockage, and skin stretch, challenge the reliability and scalability of these modules as the perception unit for neurorobotic systems. This paper proposes a novel deep-learning model providing resiliency for HD-sEMG modules that can be used in the wearable interfaces of neurorobots. The proposed 3D Dilated Efficient CapsNet model trains on an augmented input space to computationally "force" the network to learn channel-dropout variations and thus become robust to channel dropout. The proposed framework maintained high performance in a sensor-dropout reliability study. Results show that conventional models' performance degrades significantly with dropout and is recovered using the proposed architecture and training paradigm.
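The channel-dropout augmentation idea can be sketched simply (an illustrative sketch, not the paper's augmentation pipeline; array shapes and the drop probability are assumptions): random electrode channels are zeroed during training so the decoder learns to tolerate disconnections.

```python
import numpy as np

def channel_dropout(emg, drop_prob=0.1, rng=None):
    """Zero out random electrode channels of an HD-sEMG window to
    simulate electrode-skin disconnections. Training a decoder on
    such augmented inputs is one way to make it robust to
    pixel-electrode drops."""
    rng = rng or np.random.default_rng()
    mask = rng.random(emg.shape[0]) >= drop_prob  # one keep-flag per channel
    return emg * mask[:, None], mask

rng = np.random.default_rng(42)
emg = np.ones((64, 200))  # hypothetical 64-channel, 200-sample window
aug, mask = channel_dropout(emg, drop_prob=0.25, rng=rng)
```

During training, each minibatch would receive a fresh dropout mask, exposing the network to many disconnection patterns.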
|
|
10:30-12:00, Paper WeAT13-AX.8 | Add to My Program |
Enhanced Human-Robot Collaboration with Intent Prediction Using Deep Inverse Reinforcement Learning |
|
Mitra, Mukund | IISc Bangalore |
Kumar, Gyanig | Indian Institute of Sciences, India |
Chakrabarti, Partha Pratim | Indian Institute of Technology, Kharagpur, India |
Biswas, Pradipta | Indian Institute of Science |
Keywords: Human-Centered Automation, Intention Recognition, Human-Robot Collaboration
Abstract: In shared autonomy, human-robot handover for object delivery is crucial. Accurate robot predictions of human hand motion and intentions enhance collaboration efficiency. However, low prediction accuracy increases mental and physical demands on the user. In this work, we propose a system for predicting hand motion and the intended target during human-robot handover using Inverse Reinforcement Learning (IRL). A set of feature functions was designed to explicitly capture users' preferences during the task. The proposed approach was experimentally validated through user studies. Results indicate that the proposed method outperformed other state-of-the-art methods (PI-IRL, BP-HMT, RNNIK-MKF and CMk=5), with users feeling comfortable reaching up to 60% of the total distance to the target for handover, with 90% target prediction accuracy. The target prediction accuracy reaches 99.9% when less than 20% of the task remains.
|
|
10:30-12:00, Paper WeAT13-AX.9 | Add to My Program |
ToP-ToM: Trust-Aware Robot Policy with Theory of Mind |
|
Yu, Chuang | University College London |
Serhan, Baris | The University of Manchester |
Cangelosi, Angelo | University of Manchester |
Keywords: Cognitive Control Architectures, Acceptability and Trust, Human Factors and Human-in-the-Loop
Abstract: Theory of Mind (ToM) is a fundamental cognitive capacity that endows humans with the ability to attribute mental states to others. Humans infer the desires, beliefs, and intentions of others by observing their behavior and, in turn, adjust their actions to facilitate better interpersonal communication and team collaboration. In this paper, we investigated a trust-aware robot policy with theory of mind in a multiagent setting where a human collaborates with a robot against another human opponent. We show that by focusing only on team performance, the robot may resort to the reverse psychology trick, which poses a significant threat to trust maintenance. The human's trust in the robot collapses when they discover deceptive behavior by the robot. To mitigate this problem, we adopt a robot theory of mind model to infer the human's trust beliefs, including true belief and false belief (an essential element of ToM). We designed a dynamic trust-aware reward function based on different trust beliefs to guide robot policy learning, which aims to balance avoiding human trust collapse due to robot reverse psychology against leveraging its potential to boost team performance. The experimental results demonstrate the importance of a ToM-based robot policy for human-robot trust and the effectiveness of our ToM-based robot policy in multiagent interaction settings.
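One way to read the trust-aware reward is as a task reward minus a deception penalty weighted by the inferred trust belief. The toy function below is a hypothetical sketch of that balance, not the authors' reward function:

```python
def trust_aware_reward(team_reward, deceptive, p_true_belief, penalty=1.0):
    """Sketch of a dynamic trust-aware reward: deceptive actions are
    penalised in proportion to the inferred probability that the human
    holds a true belief about the robot's behavior (and would thus
    detect the deception, collapsing trust)."""
    trust_cost = penalty * p_true_belief if deceptive else 0.0
    return team_reward - trust_cost

# When the human likely sees through the trick, deception loses value.
r_deceive = trust_aware_reward(1.0, deceptive=True, p_true_belief=0.8)
r_honest = trust_aware_reward(0.6, deceptive=False, p_true_belief=0.8)
```

Under a belief-dependent penalty like this, a policy learner only exploits reverse psychology when the inferred risk of detection is low.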
|
|
WeAT15-AX Oral Session, AX-203 |
Add to My Program |
Human Factors and Human-In-The-Loop I |
|
|
Chair: Ciocarlie, Matei | Columbia University |
Co-Chair: De Momi, Elena | Politecnico Di Milano |
|
10:30-12:00, Paper WeAT15-AX.1 | Add to My Program |
VIDAR: Data Quality Improvement for Monocular 3D Reconstruction through In-Situ Visual Interaction |
|
Gao, Han | National Key Lab for Novel Software Technology, Nanjing University |
Liu, Yating | Nanjing University |
Cao, Fang | Nanjing University |
Wu, Hao | Nanjing University |
Xu, Fengyuan | National Key Lab for Novel Software Technology, Nanjing University |
Zhong, Sheng | Nanjing University |
Keywords: Human Factors and Human-in-the-Loop
Abstract: 3D reconstruction based on monocular videos has attracted wide attention, and existing reconstruction methods usually work in a reconstruction-after-scanning manner. However, these methods suffer from insufficient data collection due to the lack of effective guidance for users during the scanning process, which affects reconstruction quality. We propose VIDAR, which visually guides users with a streaming, incrementally reconstructed mesh during data collection for monocular 3D reconstruction. We propose an incremental mesh extraction algorithm that achieves lossless fusion of streaming incremental mesh data via slice-style management for guidance quality, and we design an incremental mesh rendering algorithm that achieves precise memory reallocation by updating the buffer in a fill-in-the-blank pattern for guidance efficiency. Besides, we introduce several optimizations of data transmission and human-computer interaction to improve overall system performance. Experimental results on real-world scenes show that VIDAR efficiently delivers high-quality visual guidance and outperforms non-interactive data collection methods for scene reconstruction.
|
|
10:30-12:00, Paper WeAT15-AX.2 | Add to My Program |
Transparency Control of a 1-DoF Knee Exoskeleton Via Human-In-The-Loop Velocity Optimisation |
|
Cha, Lukas | Technical University of Munich |
Guez, Annika | Imperial College London |
Chen, Chih-Yu | Technical University of Munich |
Kim, Sion | Imperial College London |
Yu, Zhenhua | Imperial College London |
Xiao, Bo | Imperial College London |
Vaidyanathan, Ravi | Imperial College London |
Keywords: Human Factors and Human-in-the-Loop, Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Rehabilitative robotics, particularly lower-limb exoskeletons (LLEs), have gained increasing importance in aiding patients regain ambulatory functions. One of the challenges in making these systems effective is the implementation of an assist-as-needed (AAN) control strategy that intervenes only when the patient deviates from the correct movement pattern. Equally crucial is the need for the LLE to exhibit "transparency" — minimising its interaction forces with the wearer to feel as natural as possible. This paper introduces a novel approach to transparency control based on a human-in-the-loop velocity optimisation framework. The proposed method employs torque data captured from past steps through a Series Elastic Actuator (SEA) to approximate the wearer's intended future movements and computes a corresponding transparent velocity trajectory. The velocity commands are complemented by an Adaptive Frequency Oscillator (AFO) based position controller that leverages the periodic nature of human gait and is modified with a force sensor for increased reactiveness to human gait variations. This approach is experimentally evaluated against a standard zero-torque controller with a stationary single-degree-of-freedom knee exoskeleton test platform in a proof-of-concept study. Preliminary results indicate that combining adaptive oscillators with interaction force sensing can improve transparency compared to the conventional zero-torque controller, using force readings for position control and torque measurements for velocity optimisation and control.
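The Adaptive Frequency Oscillator (AFO) used in the position controller can be sketched in its standard phase form (a generic Righetti-style sketch under assumed gains and a synthetic input, not the paper's controller): the oscillator's frequency state entrains to the frequency of a periodic input such as gait.

```python
import math

def afo_track(signal, omega0=1.0, K=2.0, dt=0.002, steps=200_000):
    """Phase-form Adaptive Frequency Oscillator:
        phi'   = omega - K * F(t) * sin(phi)
        omega' =        -K * F(t) * sin(phi)
    The learned frequency omega converges toward the frequency of
    the periodic input F, which is what lets the exoskeleton
    anticipate the wearer's gait phase."""
    phi, omega = 0.0, omega0
    for i in range(steps):
        F = signal(i * dt)
        coupling = K * F * math.sin(phi)
        phi += (omega - coupling) * dt
        omega += -coupling * dt
    return omega

# Hypothetical gait-like signal at 2.5 rad/s; omega should approach 2.5.
omega = afo_track(lambda t: math.sin(2.5 * t))
```

In the exoskeleton context, `signal` would be replaced by a measured periodic quantity (e.g. the interaction force), and the entrained phase/frequency drives the transparent velocity trajectory.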
|
|
10:30-12:00, Paper WeAT15-AX.3 | Add to My Program |
Towards Enhanced Human Activity Recognition for Real-World Human-Robot Collaboration |
|
Yalcinkaya, Beril | Ingeniarius Lda |
Couceiro, Micael | University of Coimbra |
Pina, Lucas | Ingeniarius Lda |
Soares, Salviano | UTAD |
Valente, António | University of Trás Os Montes and Alto Douro |
Remondino, Fabio | FBK |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Robotics and Automation in Agriculture and Forestry
Abstract: This research contributes to the field of Human-Robot Collaboration (HRC) within dynamic and unstructured environments by extending the previously proposed Fuzzy State-Long Short-Term Memory (FS-LSTM) architecture to handle the uncertainty and irregularity inherent in real-world sensor data. Recognising the challenges posed by low-cost sensors, which are highly susceptible to environmental conditions and often fail to provide regular periodic readings, this paper introduces additional pre-processing blocks. These include two indirect Kalman filters and an additional LSTM network, which together enhance the input variables for the fuzzification process. The enhanced FS-LSTM approach is evaluated using real-world data, demonstrating its effectiveness in extracting meaningful information and accurately recognising human activities. This work underscores the potential of robotics in addressing global challenges, particularly in labour-intensive and hazardous tasks. By improving the integration of humans and robots in unstructured environments, this research contributes to the broader exploration of robotics in new societal applications, fostering connections and collaborations across diverse fields.
|
|
10:30-12:00, Paper WeAT15-AX.4 | Add to My Program |
Self-Supervised Regression of sEMG Signals Combining Non-Negative Matrix Factorization with Deep Neural Networks for Robot Hand Multiple Grasping Motion Control |
|
Meattini, Roberto | University of Bologna |
Caporali, Alessio | University of Bologna |
Bernardini, Alessandra | University of Bologna |
Palli, Gianluca | University of Bologna |
Melchiorri, Claudio | University of Bologna |
Keywords: Human Factors and Human-in-the-Loop, Intention Recognition
Abstract: Advanced Human-In-The-Loop (HITL) control strategies for robot hands based on surface electromyography (sEMG) are among the major research questions in robotics. Due to the intrinsic complexity and inaccuracy of labeling procedures, unsupervised regression of sEMG signals has been employed in the literature, but it shows several limitations for realizing multiple-grasp motion control. In this work, we propose a novel Human-Robot interface (HRi) based on self-supervised regression of sEMG signals, combining Non-Negative Matrix Factorization (NMF) with Deep Neural Networks (DNN) to both avoid explicit labeling procedures and retain powerful nonlinear fitting capabilities. Experiments involving 10 healthy subjects were carried out, consisting of an offline session for systematic evaluation and comparison with traditional unsupervised approaches, and an online session assessing real-time control of a wearable anthropomorphic robot hand. The offline results demonstrate that the proposed self-supervised regression approach outperformed traditional unsupervised methods, even considering different robot hands with dissimilar kinematic structures. Furthermore, the subjects were able to successfully perform online control of multiple grasping motions of a real wearable robot hand, reporting high reliability over repeated grasp-transportation-release tasks with different objects. Statistical support is provided along with the experimental outcomes.
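The NMF side of such a pipeline can be sketched with the classic multiplicative-update rules (a generic sketch on synthetic data, not the paper's implementation; matrix sizes are assumptions): a non-negative sEMG envelope matrix is factored into synergies and their activations, which then serve as self-supervised regression targets.

```python
import numpy as np

def nmf(V, k, iters=1000, seed=0):
    """Multiplicative-update NMF: V (channels x time) ~= W @ H, with
    W the muscle-synergy basis and H its activations, all entries
    non-negative. Updates preserve non-negativity by construction."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + 1e-3
    H = rng.random((k, m)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Synthetic rank-2 non-negative "envelope" data: 8 channels, 100 samples.
rng = np.random.default_rng(3)
V = rng.random((8, 2)) @ rng.random((2, 100))
W, H = nmf(V, k=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

In the self-supervised scheme described above, the activations `H` (rather than manual labels) would supervise a DNN mapping raw sEMG to grasp commands.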
|
|
10:30-12:00, Paper WeAT15-AX.5 | Add to My Program |
Maximising Coefficiency of Human-Robot Handovers through Reinforcement Learning |
|
Lagomarsino, Marta | Istituto Italiano Di Tecnologia |
Lorenzini, Marta | Istituto Italiano Di Tecnologia |
Constable, Merryn Dale | Northumbria University |
De Momi, Elena | Politecnico Di Milano |
Becchio, Cristina | University Medical Center Hamburg-Eppendorf |
Ajoudani, Arash | Istituto Italiano Di Tecnologia |
Keywords: Human Factors and Human-in-the-Loop, Physical Human-Robot Interaction, Human-Centered Robotics
Abstract: Handing objects to humans is an essential capability for collaborative robots. Previous research on human-robot handovers focuses on facilitating the performance of the human partner and possibly minimising the physical effort needed to grasp the object. However, altruistic robot behaviours may result in protracted and awkward robot motions, contributing to unpleasant sensations for the human partner and affecting perceived safety and social acceptance. This paper investigates whether transferring the psychological principle that "humans act coefficiently as a group" (i.e. simultaneously maximising the benefits of all agents involved) to human-robot cooperative tasks promotes a more seamless and natural interaction. Human-robot coefficiency is first modelled by identifying implicit indicators of human comfort and discomfort and by calculating the robot energy consumption in performing the desired trajectory. We then present a reinforcement learning approach that uses the human-robot coefficiency score as reward to adapt and learn online the combination of robot interaction parameters that maximises such coefficiency. Results showed that by acting coefficiently the robot could meet the individual preferences of most subjects involved in the experiments, improve the human's perceived comfort, and foster trust in the robotic partner.
|
|
10:30-12:00, Paper WeAT15-AX.6 | Add to My Program |
Jacquard V2: Refining Datasets Using the Human in the Loop Data Correction Method |
|
Li, Qiuhao | Northeastern University |
Yuan, Shenghai | Nanyang Technological University |
Keywords: Human Factors and Human-in-the-Loop, Learning Categories and Concepts, Data Sets for Robotic Vision
Abstract: In the context of rapid advancements in industrial automation, vision-based robotic grasping plays an increasingly crucial role. To enhance visual recognition accuracy, large-scale datasets are imperative for training models to acquire implicit knowledge about handling various objects. Creating datasets from scratch is a time- and labor-intensive process. Moreover, existing datasets often contain errors due to automated annotations aimed at expediency, making the improvement of these datasets a substantial research challenge. Consequently, several issues have been identified in the annotation of grasp bounding boxes in the popular Jacquard Grasp dataset. We propose utilizing a Human-In-The-Loop (HIL) method to enhance dataset quality. This approach relies on backbone deep learning networks to predict object positions and orientations for robotic grasping. Predictions with Intersection over Union (IoU) values below 0.2 undergo an assessment by human operators. After this evaluation, the data is categorized into False Negatives (FN) and True Negatives (TN). FN are then subcategorized into either missing annotations or catastrophic labeling errors. Images lacking labels are augmented with valid grasp bounding box information, whereas images afflicted by catastrophic labeling errors are completely removed. The open-source tool Labelbee was employed for 53,026 iterations of HIL dataset enhancement, leading to the removal of 2,884 images and the incorporation of ground truth information for 30,292 images. The enhanced dataset, named the Jacquard V2 Grasping Dataset, served as the training data for a range of neural networks. We empirically demonstrate that these dataset improvements significantly enhance the training and prediction performance of the same network, resulting in an increase of 7.1% across most popular detection architectures over ten iterations. This refined dataset will be accessible on Google Drive and Baidu Netdisk.
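The IoU gate that routes predictions to human review can be sketched as follows (an illustrative sketch with made-up boxes; axis-aligned IoU stands in for whatever box representation the dataset uses):

```python
def iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def flag_for_review(preds, gts, thresh=0.2):
    """Return indices of predictions whose IoU with the annotation falls
    below the threshold; these are the cases a human operator inspects
    in the HIL correction loop."""
    return [i for i, (p, g) in enumerate(zip(preds, gts)) if iou(p, g) < thresh]

preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
gts = [(0, 0, 10, 10), (80, 80, 90, 90)]
```

Only the flagged subset reaches the annotators, which is what keeps a 50k-image correction pass tractable.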
|
|
10:30-12:00, Paper WeAT15-AX.7 | Add to My Program |
Building User Proficiency in Piloting Small Unmanned Aerial Vehicles (sUAV) |
|
Kunde, Siya | University of Nebraska |
Duncan, Brittany | University of Nebraska, Lincoln |
Keywords: Human Factors and Human-in-the-Loop, Design and Human Factors, Long term Interaction
Abstract: Assessing proficiency in small unmanned aerial vehicle (sUAV) pilots is complex and not well understood, but increasingly important for employing these vehicles in serious jobs such as wildland firefighting and infrastructure inspection. The limited prior work has focused on user training with modalities like simulators and VR, with no performance assessments for line-of-sight UAVs. This paper presents a training methodology for novice pilots of sUAVs. We present two studies: a Baseline study (21 participants) and a Training study (16 participants). Our work is of interest to sUAV operators, regulators, and companies developing these technologies seeking a workforce capable of consistent, safe operations. We successfully utilized the method developed in our prior work (Kunde and Duncan, 2022) to assess user proficiency in flying UAVs. We present a UAV pilot training schedule for novice users (in the Training study) and determine the minimum training time necessary to observe performance gains and mitigate damage. Results indicate that task completions noticeably improved and crashes were minimized by day 10 of training, with a training plateau observed by day 15.
|
|
10:30-12:00, Paper WeAT15-AX.8 |
A Probabilistic Model for Cobot Decision Making to Mitigate Human Fatigue in Repetitive Co-Manipulation Tasks |
|
Yaacoub, Aya | LORIA-CNRS |
Thomas, Vincent | LORIA - Universite De Lorraine |
Colas, Francis | Inria Nancy Grand Est |
Maurice, Pauline | Cnrs - Loria |
Keywords: Human Factors and Human-in-the-Loop, Human-Robot Collaboration, Planning under Uncertainty
Abstract: Work-related musculoskeletal disorders (WMSDs) are very common. Repetitive motion, which is often present in industrial work, is one of their main physical causes: it loads the same set of human joints repeatedly, which leads to localized joint fatigue. In this work, we present a framework to plan a collaborative robot policy that reduces long-term human fatigue in highly repetitive co-manipulation tasks, while accounting for the uncertainty in the human postural reaction to the robot motion and the partial observability of the human fatigue state. We model the problem as a continuous-state Partially Observable Markov Decision Process (POMDP), and use a physics-based digital human simulator to predict the fatigue cost of possible robot actions. We then use an online planning algorithm to compute the optimal robot policy. We demonstrate our approach in a simulated experiment in which a robot repeatedly carries an object for the human to work on, and the object's Cartesian pose needs to be optimized. We compare the policy generated with our approach against random, cyclic, and greedy (short-term optimization) policies for different user profiles. We show that our approach outperforms the other policies in all tested scenarios.
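The trade-off the planner addresses can be illustrated on a toy joint-fatigue model contrasting a varying (here greedy, one of the paper's baselines) policy with a fixed object pose. All joint names, loads, and recovery rates below are invented for illustration; the paper's POMDP planner and digital human simulator are far richer:

```python
JOINTS = ["shoulder", "elbow", "wrist"]
# Hypothetical per-cycle joint loading for three candidate object poses.
POSE_LOAD = {
    "high": {"shoulder": 0.10, "elbow": 0.02, "wrist": 0.01},
    "mid":  {"shoulder": 0.03, "elbow": 0.06, "wrist": 0.02},
    "low":  {"shoulder": 0.01, "elbow": 0.03, "wrist": 0.08},
}
RECOVERY = 0.02  # passive recovery per cycle for every joint

def step(fatigue, pose):
    """Accumulate load on the joints a pose stresses; all joints recover a little."""
    return {j: max(0.0, fatigue[j] + POSE_LOAD[pose][j] - RECOVERY)
            for j in JOINTS}

def greedy_policy(fatigue):
    """Short-term optimization: minimize worst-joint fatigue one cycle ahead."""
    return min(POSE_LOAD, key=lambda p: max(step(fatigue, p).values()))

def run(policy, cycles=50):
    fatigue = {j: 0.0 for j in JOINTS}
    for _ in range(cycles):
        fatigue = step(fatigue, policy(fatigue))
    return max(fatigue.values())

worst_greedy = run(greedy_policy)
worst_fixed = run(lambda f: "high")  # never varying the object pose
```

Rotating poses spreads load across joints and keeps the worst joint far below the fixed-pose case, which is the localized-fatigue effect the paper's long-horizon planner optimizes under uncertainty.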
|
|
WeAT16-AX Oral Session, AX-204 |
Force and Tactile Sensing IV |
|
|
Chair: Konyo, Masashi | Tohoku University |
Co-Chair: Chen, Chao | Monash University |
|
10:30-12:00, Paper WeAT16-AX.1 |
GelRoller: A Rolling Vision-Based Tactile Sensor for Large Surface Reconstruction Using Self-Supervised Photometric Stereo Method |
|
Zhang, Zhiyuan | Huazhong University of Science and Technology |
Ma, Huan | Huazhong University of Science and Technology |
Zhou, Yulin | Huazhong University of Science and Technology |
Ji, Jingjing | Huazhong University of Science and Technology |
Yang, Hua | Huazhong University of Science and Technology |
Keywords: Force and Tactile Sensing, Deep Learning in Grasping and Manipulation, Product Design, Development and Prototyping
Abstract: Accurate perception of the surrounding environment stands as a primary objective for robots. Through tactile interaction, vision-based tactile sensors provide the capability to capture high-resolution and multi-modal surface information of objects, thereby facilitating robots in achieving more dexterous manipulations. However, the prevailing GelSight sensors entail intricate calibration procedures, posing challenges in their application on curved surfaces and requiring the maintenance of stable lighting conditions throughout experimentation. Additionally, constrained by shape and structure, current vision-based tactile sensors are predominantly applied to measurements within a limited area. In this study, we design a novel cylindrical vision-based tactile sensor that enables continuous and swift perception of large-scale object surfaces through rolling. To tackle the challenges posed by laborious calibration processes, we propose a self-supervised photometric stereo method based on deep learning, which eliminates pre-calibration requirements and enables the derivation of surface normals from a single image without relying on stable lighting conditions. Finally, we perform surface reconstruction from the normals and point-cloud registration across the multiple frames of images obtained by rolling the cylindrical sensor, resulting in a reconstruction of the large surface. We compare our method with the representative lookup-table method of the GelSight sensors. The results show that the proposed method enhances both reconstruction accuracy and robustness, thereby demonstrating the potential of the proposed sensor in large-scale surface reconstruction. Codes and mechanical structures are available at: https://github.com/ZhangZhiyuanZhang/GelRoller
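For contrast with the paper's self-supervised approach, here is a minimal sketch of classic calibrated photometric stereo: it needs exactly the known light directions that GelRoller dispenses with. Synthetic Lambertian data; this is the baseline idea, not the authors' method:

```python
import numpy as np

def photometric_stereo(intensities, lights):
    """Recover surface normal and albedo at one pixel from K images under
    known unit light directions, assuming Lambertian shading I = rho * L @ n.
    intensities: (K,) per-pixel brightness; lights: (K, 3) unit directions."""
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    albedo = np.linalg.norm(g)
    normal = g / albedo if albedo > 0 else g
    return normal, albedo

# Synthetic check: a known normal lit from three calibrated directions.
true_n = np.array([0.0, 0.0, 1.0])
L = np.array([[0.0, 0.0, 1.0],
              [0.6, 0.0, 0.8],
              [0.0, 0.6, 0.8]])
I = 0.5 * L @ true_n  # albedo 0.5
n_hat, rho = photometric_stereo(I, L)
```

The self-supervised network in the paper replaces both the known `L` and the per-pixel least-squares solve, which is what removes the pre-calibration burden.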
|
|
10:30-12:00, Paper WeAT16-AX.2 |
Marker-Embedded Tactile Image Generation Via Generative Adversarial Networks |
|
Kim, Won Dong | Korea Advanced Institute of Science & Technology (KAIST) |
Yang, Sanghoon | KAIST |
Kim, Woojong | KAIST |
Kim, Jeong-Jung | Korea Institute of Machinery & Materials (KIMM) |
Kim, Chang-Hyun | Korea Institute of Machinery and Materials (KIMM) |
Kim, Jung | KAIST |
Keywords: Force and Tactile Sensing, Deep Learning Methods, Simulation and Animation
Abstract: Data-driven methods have been successfully applied to images from vision-based tactile sensors to fulfill various manipulation tasks. Nevertheless, these methods remain inefficient because of the lack of methods for simulating the sensors. Relevant research on simulating vision-based tactile sensors generally focuses on generating images without markers, owing to the challenges in accurately generating the marker motions caused by elastomer deformation. This forgoes the tactile information deducible from markers. In this work, we propose a generative adversarial network (GAN)-based method to generate realistic marker-embedded tactile images for GelSight-like vision-based tactile sensors. We trained the proposed GAN model with an aligned dataset of real tactile images and simulated depth images obtained by deforming the sensor against various objects. This allows the model to translate simulated depth image sequences into RGB tactile images with markers. Furthermore, the generator in the proposed GAN allows the network to integrate the history of deformations from the depth image sequences to generate realistic marker motions during normal and lateral sensor deformations. We evaluated and compared the positional accuracy of the markers and image similarity metrics of the images generated via our method with those from prior methods. The generated tactile images from the proposed model show a 28.3 % decrease in marker positional error and a 93.5 % decrease in the image similarity m
|
|
10:30-12:00, Paper WeAT16-AX.3 |
TEXterity: Tactile Extrinsic DeXterity |
|
Bronars, Antonia | MIT |
Kim, Sangwoon | Massachusetts Institute of Technology |
Patre, Parag | Magna International |
Rodriguez, Alberto | Massachusetts Institute of Technology |
Keywords: Force and Tactile Sensing, In-Hand Manipulation, Perception for Grasping and Manipulation
Abstract: We introduce a novel approach that combines tactile estimation and control for in-hand object manipulation. By integrating measurements from robot kinematics and an image-based tactile sensor, our framework estimates and tracks object pose while simultaneously generating motion plans in a receding horizon fashion to control the pose of a grasped object. This approach consists of a discrete pose estimator that tracks the most likely sequence of object poses in a coarsely discretized grid, and a continuous pose estimator-controller to refine the pose estimate and accurately manipulate the pose of the grasped object. Our method is tested on diverse objects and configurations, achieving desired manipulation objectives and outperforming single-shot methods in estimation accuracy. The proposed approach holds potential for tasks requiring precise manipulation and limited intrinsic in-hand dexterity under visual occlusion, laying the foundation for closed-loop behavior in applications such as regrasping, insertion, and tool use. Please see https://sites.google.com/view/texterity for videos of real-world demonstrations.
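Tracking "the most likely sequence of object poses in a coarsely discretized grid" is naturally expressed as Viterbi decoding; a minimal sketch under that assumption (the grid size, scores, and array shapes below are invented, not the paper's implementation):

```python
import numpy as np

def most_likely_sequence(log_trans, log_obs):
    """Viterbi decoding over a coarse grid of candidate poses.
    log_trans[i, j]: log score of moving from cell i to cell j between steps;
    log_obs[t, j]: log likelihood of the measurement at step t in cell j."""
    T, S = log_obs.shape
    score = log_obs[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans          # (prev, cur)
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(S)] + log_obs[t]
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two pose cells, sticky transitions; the last measurement strongly favors cell 1.
log_trans = np.log([[0.9, 0.1], [0.1, 0.9]])
log_obs = np.log([[0.9, 0.1], [0.8, 0.2], [0.05, 0.95]])
path = most_likely_sequence(log_trans, log_obs)
```

In the paper this discrete estimate only seeds the continuous estimator-controller, which refines the pose beyond the grid resolution.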
|
|
10:30-12:00, Paper WeAT16-AX.4 |
Optimization of Flexible Bronchoscopy Shape Sensing Using Fiber Optic Sensors |
|
Liu, Xinran | University of Chinese Academy of Sciences |
Chen, Hao | University of Chinese Academy of Sciences |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Force and Tactile Sensing, Intelligent and Flexible Manufacturing
Abstract: This work presents a novel shape evaluation and optimization approach for shape sensing, specifically targeting the constrained, irregular, and intricate spatial shapes of flexible bronchoscopes (FB) in the human bronchial tree. The proposed evaluation criteria and optimization methods incorporate clinical significance related to bronchial anatomical structures and address the singular points and discontinuities of traditional shape reconstruction models. Three-dimensional experiments were conducted within eight spatially complex configurations printed from a proportional bronchial model. The 3D experimental results demonstrate an average reduction of approximately 34.1% in shape reconstruction errors across all eight airway models compared to the traditional model, validating the effectiveness and feasibility of the approach.
|
|
10:30-12:00, Paper WeAT16-AX.5 |
Tactile-Informed Action Primitives Mitigate Jamming in Dense Clutter |
|
Brouwer, Dane | Stanford University |
Citron, Joshua | Stanford University |
Choi, Hojung | Stanford University |
Lepert, Marion | Stanford University |
Lin, Michael A. | Stanford University |
Bohg, Jeannette | Stanford University |
Cutkosky, Mark | Stanford University |
Keywords: Force and Tactile Sensing, Multi-Contact Whole-Body Motion Planning and Control
Abstract: It is difficult for robots to retrieve objects in densely cluttered lateral access scenes with movable objects as jamming against adjacent objects and walls can inhibit progress. We propose the use of two action primitives---burrowing and excavating---that can fluidize the scene to un-jam obstacles and enable continued progress. Even when these primitives are implemented in an open loop manner at clock-driven intervals, we observe a decrease in the final distance to the target location. Furthermore, we combine the primitives into a closed loop hybrid control strategy using tactile and proprioceptive information to leverage the advantages of both primitives without being overly disruptive. In doing so, we achieve a 10-fold increase in success rate above the baseline control strategy and significantly improve completion times as compared to the primitives alone or a naive combination of them.
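A hypothetical sketch of the closed-loop switching logic such a hybrid strategy implies (the primitive names follow the abstract; the force and progress thresholds, units, and escalation rule are invented):

```python
def select_primitive(tactile_force, progress_rate,
                     jam_force=5.0, stall_rate=0.01):
    """Pick an action primitive from tactile and proprioceptive cues.
    High contact force with no forward progress indicates jamming, so the
    scene is fluidized locally instead of disrupting it at fixed intervals."""
    if tactile_force > jam_force and progress_rate < stall_rate:
        # Jammed: escalate the fluidizing primitive with contact force.
        return "burrow" if tactile_force < 2 * jam_force else "excavate"
    return "advance"
```

Triggering on sensed jamming rather than on a clock is what lets the hybrid controller gain the benefit of both primitives "without being overly disruptive."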
|
|
10:30-12:00, Paper WeAT16-AX.6 |
Crosstalk-Free Impedance-Separating Array Measurement for Iontronic Tactile Sensors |
|
Hou, Funing | Fudan University |
Li, Gang | Hebei University of Technology |
Mu, Chenxing | Hebei University of Technology |
Shi, Mengqi | Hebei University of Technology |
Liu, Jixiao | Hebei University of Technology |
Guo, Shijie | Hebei University of Technology |
Keywords: Force and Tactile Sensing, Physical Human-Robot Interaction
Abstract: Iontronic tactile sensors are promising for measuring spatio-temporal contact information with high performance. However, no suitable measuring method has been available, owing to issues with crosstalk and non-negligible equivalent resistance. Hence, this study presents an impedance-separating method that does not require complex analog components. A general Quadri-Terminal Impedance Network (QTIN) model is introduced to reduce crosstalk, with specific compatibility with the impedance-separating method. The precise measurement ranges are characterized, showing non-rectangular shapes suited to the response of iontronic tactile sensors. A simple denoising method is provided that noticeably reduces initial array noise. This work could benefit various scenarios, such as human-robot interaction and physiological information monitoring.
|
|
10:30-12:00, Paper WeAT16-AX.7 |
Visual-Tactile Learning of Garment Unfolding for Robot-Assisted Dressing |
|
Zhang, Fan | Honda Research Institute EU |
Demiris, Yiannis | Imperial College London |
Keywords: Force and Tactile Sensing, Physical Human-Robot Interaction, Manipulation Planning
Abstract: Assistive robots have the potential to support disabled and elderly people in daily dressing activities. An intermediate stage of dressing is to manipulate the garment from a crumpled initial state to an unfolded configuration that facilitates robust dressing. Applying quasi-static grasping actions with vision feedback for garment unfolding usually suffers from occluded grasping points. In this work, we propose a dynamic manipulation strategy: tracing the garment edge until the hidden corner is revealed. We introduce a model-based approach, in which a deep visual-tactile predictive model iteratively learns to perform servoing from raw sensor data. The predictive model is formalized as a Conditional Variational Autoencoder with contrastive optimization, which jointly learns underlying visual-tactile latent representations, a latent garment dynamics model, and future predictions of garment states. Two cost functions are explored: a visual cost defined by garment corner positions, which drives the gripper towards the corner, and a tactile cost defined by garment edge poses, which prevents the garment from falling from the gripper. The experimental results demonstrate the improvement of our contrastive visual-tactile model predictive control over single sensing modalities and baseline model-learning techniques. The proposed method enables a robot to unfold back-opening hospital gowns and perform upper-body dressing.
|
|
10:30-12:00, Paper WeAT16-AX.8 |
Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training |
|
Dave, Vedant | Montanuniversität Leoben |
Lygerakis, Fotios | University of Leoben |
Rueckert, Elmar | Montanuniversitaet Leoben |
Keywords: Force and Tactile Sensing, Representation Learning
Abstract: The rapidly evolving field of robotics necessitates methods that can facilitate the fusion of multiple modalities. Specifically, when it comes to interacting with tangible objects, effectively combining visual and tactile sensory data is key to understanding and navigating the complex dynamics of the physical world, enabling a more nuanced and adaptable response to changing environments. Nevertheless, much of the earlier work in merging these two sensory modalities has relied on supervised methods utilizing datasets labeled by humans. This paper introduces MViTac, a novel methodology that leverages contrastive learning to integrate vision and touch sensations in a self-supervised fashion. By drawing on both sensory inputs, MViTac leverages intra- and inter-modality losses for learning representations, resulting in enhanced material property classification and more adept grasping prediction. Through a series of experiments, we showcase the effectiveness of our method and its superiority over existing state-of-the-art self-supervised and supervised techniques. In evaluating our methodology, we focus on two distinct tasks: material classification and grasping success prediction. Our results indicate that MViTac facilitates the development of improved modality encoders, yielding more robust representations as evidenced by linear probing assessments.
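The inter-modality contrastive objective is in the family of InfoNCE losses; a minimal NumPy sketch under that assumption (embedding sizes, the temperature, and the exact loss form are illustrative; MViTac's actual encoders and loss weighting may differ):

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Batch InfoNCE: matched rows are positives, all other rows negatives."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Inter-modality case: the vision embedding of a sample should score higher
# against its own tactile embedding than against mismatched ones.
rng = np.random.default_rng(0)
vision = rng.normal(size=(8, 16))
tactile = vision + 0.05 * rng.normal(size=(8, 16))   # well-aligned modalities
loss_aligned = info_nce(vision, tactile)
loss_random = info_nce(vision, rng.normal(size=(8, 16)))
```

No human labels appear anywhere: the pairing of a sample's own vision and tactile views is the only supervision signal, which is what makes the scheme self-supervised.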
|
|
10:30-12:00, Paper WeAT16-AX.9 |
A Hierarchical Framework for Robot Safety Using Whole-Body Tactile Sensors |
|
Jiang, Shuo | Northeastern University |
Wong, Lawson L.S. | Northeastern University |
Keywords: Force and Tactile Sensing, Robot Safety, Multi-Contact Whole-Body Motion Planning and Control
Abstract: Using tactile signals is a natural way to perceive potential dangers and safeguard robots. One possible method is to use full-body tactile sensors on the robot and perform safety maneuvers when dangerous stimuli are detected. In this work, we propose a method based on full-body tactile sensors that operates at three different levels of granularity to ensure that the robot interacts with the environment safely. The results show that our system dramatically reduces the overall chance of collision compared with several baselines, and intelligently handles ongoing collisions. Our proposed framework generalizes to a wide variety of robots, enabling them to predict and avoid dangerous collisions and reactively handle accidental tactile stimuli.
|
|
WeAT17-AX Oral Session, AX-205 |
Legged Robots IV |
|
|
Chair: Zhao, Ye | Georgia Institute of Technology |
Co-Chair: Kober, Jens | TU Delft |
|
10:30-12:00, Paper WeAT17-AX.1 |
Robust Jumping with an Articulated Soft Quadruped Via Trajectory Optimization and Iterative Learning |
|
Ding, Jiatao | Delft University of Technology |
van Löben Sels, Mees Alexander | TU Delft |
Angelini, Franco | University of Pisa |
Kober, Jens | TU Delft |
Della Santina, Cosimo | TU Delft |
Keywords: Legged Robots, Optimization and Optimal Control, Modeling, Control, and Learning for Soft Robots
Abstract: Quadrupeds deployed in real-world scenarios need to be robust to unmodelled dynamic effects. In this work, we aim to increase the robustness of quadrupedal periodic forward jumping (i.e., pronking) by unifying cutting-edge model-based trajectory optimization and iterative learning control. Using a reduced-order soft anchor model, the optimization-based motion planner generates the periodic reference trajectory. The controller then iteratively learns the feedforward control signal in a repetition process, without requiring an accurate full-body model. When enhanced by a continuous learning mechanism, the proposed controller can learn the control input without resetting the system at the end of each iteration. Simulations and experiments on a quadruped with parallel springs demonstrate that continuous jumping can be learned in a matter of minutes, with high robustness against various types of terrain.
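The feedforward iterative learning step can be sketched in its simplest form: each repetition adds a scaled copy of the previous trial's tracking error to the feedforward signal, so no accurate plant model is needed (a textbook ILC update on a toy plant, not the authors' controller; all numbers are illustrative):

```python
import numpy as np

def ilc_update(u, error, gain=0.5):
    """One iterative-learning pass: correct the feedforward signal with a
    scaled copy of the previous trial's tracking error."""
    return u + gain * error

# Toy repetitive task: plant y = u + d with an unknown constant disturbance d.
# Repetition, not a model of d, drives the error down.
d = -0.3
ref = np.ones(10)       # desired output over one period
u = np.zeros(10)        # feedforward signal, refined across trials
for trial in range(20):
    y = u + d           # execute one repetition
    u = ilc_update(u, ref - y)
residual = float(np.max(np.abs(ref - (u + d))))
```

Here the error contracts by the factor (1 - gain) per trial, the same learn-by-repetition mechanism that lets the soft quadruped converge to a robust jump "in a matter of minutes."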
|
|
10:30-12:00, Paper WeAT17-AX.2 |
Unlocking Versatile Locomotion: A Novel Quadrupedal Robot with 4-DoFs Legs for Roller Skating |
|
Chen, Jiawei | Beihang University |
Qin, Ripeng | Inner Mongolia University of Science and Technology |
Huang, Longfei | Beihang University,Beijing Institute of Spacecraft System |
He, Zongbo | Beijing Institute of Sapcecraft System Engineering |
Xu, Kun | Beijing University |
Ding, Xilun | Beijing Univerisity of Aeronautics and Astronautics |
Keywords: Legged Robots, Mechanism Design, Motion Control
Abstract: Roller skating with passive wheels on a quadrupedal robot is more efficient than traditional walking. However, the typical mammalian quadruped robot with 3-DoFs legs can only perform one dynamic roller skating gait and has difficulty achieving turning motion. To address this limitation, we designed a novel quadrupedal robot with each leg having 4-DoFs to enable various roller skating locomotion including Swizzling, Stroking, and trot-like gaits while easily achieving turning motions. We considered the geometrical characteristics of the passive wheel and used the Levenberg-Marquardt method in robot kinematics to improve precision for both roller skating kinematics and contact point position for the dynamics controller. The position of the robot foot and the yaw angle of the passive wheel are decoupled for motion planning of all proposed gaits. Our proposed kinematics with wheeled geometry was verified through experiments to have higher precision, while the feasibility of all proposed roller-skating gaits was confirmed during straight motion and turning motion with a small radius on our prototype robot. Finally, we discussed the mobility efficiency of different roller skating gaits which were found to be more efficient than walking.
|
|
10:30-12:00, Paper WeAT17-AX.3 |
Efficient Terrain Map Using Planar Regions for Footstep Planning on Humanoid Robots |
|
Mishra, Bhavyansh | Institute of Human and Machine Cognition, University of West Flo |
Calvert, Duncan | IHMC, UWF |
Bertrand, Sylvain | Institute for Human and Machine Cognition |
Pratt, Jerry | Inst. for Human and Machine Cognition |
Sevil, Hakki Erhan | University of West Florida |
Griffin, Robert J. | Institute for Human and Machine Cognition (IHMC) |
Keywords: Humanoid and Bipedal Locomotion, Legged Robots, Mapping
Abstract: Humanoid robots possess the ability to perform complex tasks in challenging environments. However, they require a model of the surroundings in a representation sufficient for downstream tasks such as footstep planning. The maps generated by existing mapping algorithms are either sparse, insufficient for footstep planning, memory-intensive, or too slow for dynamic humanoid behaviors. In this work, we develop a mapping algorithm that combines planar region measurements with kinematic-inertial state estimates to build a dense but efficient map of bounded planar surfaces. We present novel algorithms for plane feature matching, tracking, and registration for mapping within a factor graph framework. The generated map is not only memory-efficient, but also offers higher reliability and speed in bipedal footstep planning than was previously possible. The complete algorithm is also demonstrated using a full-scale humanoid robot, Nadia, walking over both flat ground and rough terrain utilizing the generated terrain map.
|
|
10:30-12:00, Paper WeAT17-AX.4 |
Convergent iLQR for Safe Trajectory Planning and Control of Legged Robots |
|
Zhu, James | Carnegie Mellon University |
Payne, J. Joe | Carnegie Mellon University |
Johnson, Aaron M. | Carnegie Mellon University |
Keywords: Legged Robots, Optimization and Optimal Control, Robust/Adaptive Control
Abstract: In order to perform highly dynamic and agile maneuvers, legged robots typically spend time in underactuated domains (e.g. with feet off the ground) where the system has limited command of its acceleration and a constrained amount of time before transitioning to a new domain (e.g. foot touchdown). Meanwhile, these transitions can instantaneously change the system’s state, possibly causing perturbations to be mapped arbitrarily far away from the target trajectory. These properties make it difficult for local feedback controllers to effectively recover from disturbances as the system evolves through underactuated domains and hybrid impact events. To address this, we utilize the fundamental solution matrix that characterizes the evolution of perturbations through a hybrid trajectory and its 2-norm, which represents the worst-case growth of perturbations. In this paper, the worst-case perturbation analysis is used to explicitly reason about the tracking performance of a hybrid trajectory and is incorporated in an iLQR framework to optimize a trajectory while taking into account the closed-loop convergence of the trajectory under an LQR tracking controller. The generated convergent trajectories recover more effectively from perturbations, are more robust to large disturbances, and use less feedback control effort than trajectories generated with traditional methods.
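The worst-case growth measure the abstract describes can be sketched for a linear time-varying closed loop: the 2-norm of the product of per-step matrices bounds how much an initial perturbation can be amplified over the segment (the matrices and feedback gain below are illustrative, not from the paper):

```python
import numpy as np

def worst_case_growth(A_list):
    """2-norm of the fundamental solution matrix Phi = A_{N-1} @ ... @ A_0,
    i.e. the worst-case factor by which an initial perturbation can grow
    over the trajectory segment."""
    Phi = np.eye(A_list[0].shape[0])
    for A in A_list:
        Phi = A @ Phi
    return np.linalg.norm(Phi, 2)

# Toy comparison on a 2-state linearized step over 20 steps:
A = np.array([[1.1, 0.2], [0.0, 1.05]])    # mildly unstable open-loop step
BK = np.array([[0.15, 0.2], [0.0, 0.1]])   # effect of an LQR-like feedback gain
growth_open = worst_case_growth([A] * 20)
growth_closed = worst_case_growth([A - BK] * 20)
```

A growth factor below 1 means every perturbation contracts, which is the closed-loop convergence property the iLQR cost rewards when shaping the trajectory.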
|
|
10:30-12:00, Paper WeAT17-AX.5 |
Optimization Based Dynamic Skateboarding of Quadrupedal Robot |
|
Xu, Zhe | Beijing Institute of Technology |
Al-Khulaqui, Mohamed | Xiaomi Inc |
Ma, Hanxin | BeihangUniversity |
Wang, Jiajun | UBTECH Robotics |
Xin, Quanbin | Beijing Xiaomi Robot Technology Co., Ltd |
You, Yangwei | Institute for Infocomm Research |
Zhou, Mingliang | Beijing Xiaomi Mobile Software Co., Ltd |
Xiang, Diyun | XIAOMI |
Zhang, Shiwu | University of Science and Technology of China |
Keywords: Legged Robots, Optimization and Optimal Control, Whole-Body Motion Planning and Control
Abstract: Robot skateboarding is an unexplored and challenging task for legged robots. Accurately modeling the dynamics of dual floating bases and developing effective planning and control methods present significant complexities in accomplishing skateboarding behavior. This paper focuses on enabling the quadrupedal platform CyberDog2 to achieve dynamic balancing and acceleration on a skateboard. An optimization-based control pipeline is developed through careful derivation of the system's equations of motion, considering both the robot and skateboard dynamics. Accounting for the system's physical constraints, an offline trajectory optimization method is employed to generate various acceleration trajectories, creating a motion library for the system. An online linear model predictive control with whole-body control framework is used to track the generated trajectories and stabilize the system in real time. To validate its effectiveness, we conducted experiments in various scenarios. The quadrupedal robot successfully accelerated from a static state to various velocities and demonstrated the ability to balance and steer the skateboard.
|
|
10:30-12:00, Paper WeAT17-AX.6 |
Hierarchical Experience-Informed Navigation for Multi-Modal Quadrupedal Rebar Grid Traversal |
|
Asselmeier, Maxwell | Georgia Institute of Technology |
Ivanova, Evgeniia | SkyMul |
Zhou, Ziyi | Georgia Institute of Technology |
Vela, Patricio | Georgia Institute of Technology |
Zhao, Ye | Georgia Institute of Technology |
Keywords: Legged Robots, Robotics and Automation in Construction, Constrained Motion Planning
Abstract: This study focuses on a layered, experience-based, multi-modal contact planning framework for agile quadrupedal locomotion over a constrained rebar environment. To this end, our hierarchical planner incorporates locomotion-specific modules into the high-level contact sequence planner and solves kinodynamically-aware trajectory optimization as the low-level motion planner. Through quantitative analysis of the experience accumulation process and experimental validation of the kinodynamic feasibility of the generated locomotion trajectories, we demonstrate that the experience planning heuristic offers an effective way of providing candidate footholds for a legged contact planner. Additionally, we introduce a guiding torso path heuristic at the global planning level to enhance the navigation success rate in the presence of environmental obstacles. Our results indicate that the torso-path guided experience accumulation requires significantly fewer offline trials to successfully reach the goal compared to regular experience accumulation. Finally, our planning framework is validated in both dynamics simulations and real hardware implementations on a quadrupedal robot provided by Skymul Inc.
|
|
10:30-12:00, Paper WeAT17-AX.7 |
Learning-Based Propulsion Control for Amphibious Quadruped Robots with Dynamic Adaptation to Changing Environment |
|
Yao, Qingfeng | Shenyang Institute of Automation, Chinese Academy of Sciences |
Meng, Linghan | Shenyang Institute of Automation |
Zhang, Qifeng | Shenyang Institute of Automation, CAS |
Zhao, Jing | Shenyang Institute of Automation (SIA), Chinese Academy of Scien |
Pajarinen, Joni | Aalto University |
Wang, Xiaohui | Heriot-Watt University |
Li, Zhibin (Alex) | University College London |
Wang, Cong | Delft University of Technology (TU Delft) |
Keywords: Legged Robots, Robust/Adaptive Control, Reinforcement Learning
Abstract: This paper proposes a learning-based adaptive propulsion control (APC) method for a quadruped robot integrated with thrusters in amphibious environments, allowing it to move efficiently in water while maintaining its ground locomotion capabilities. We designed a specific reinforcement learning method to train a neural network to perform vector propulsion control. Our approach coordinates the legs and propeller, enabling the robot to achieve speed and trajectory tracking tasks in the presence of actuator failures and unknown disturbances. Our simulated validations of the robot in water demonstrate the effectiveness of the trained neural network in predicting disturbances and actuator failures from historical information, showing that the framework adapts to changing environments and is suitable for dynamically changing situations. The proposed approach lends itself to hardware augmentation of quadruped robots, opening avenues in the field of amphibious robotics and expanding the use of quadruped robots in various applications.
|
|
10:30-12:00, Paper WeAT17-AX.8 |
Reinforcement Learning for Blind Stair Climbing with Legged and Wheeled-Legged Robots |
|
Chamorro, Simon | Université De Sherbrooke |
Klemm, Victor | ETH Zurich |
de la Iglesia Valls, Miguel | ETH Zürich |
Pal, Chris | Polytechnique Montreal |
Siegwart, Roland | ETH Zurich |
Keywords: Machine Learning for Robot Control, Legged Robots, Humanoid and Bipedal Locomotion
Abstract: In recent years, legged and wheeled-legged robots have gained prominence for tasks in environments predominantly created for humans across various domains. One significant challenge faced by many of these robots is their limited capability to navigate stairs, which hampers their functionality in multi-story environments. This study proposes a method aimed at addressing this limitation, employing reinforcement learning to develop a versatile controller applicable to a wide range of robots. In contrast to the conventional velocity-based controllers, our approach builds upon a position-based formulation of the RL task, which we show to be vital for stair climbing. Furthermore, the methodology leverages an asymmetric actor-critic structure, enabling the utilization of privileged information from simulated environments during training while eliminating the reliance on exteroceptive sensors during real-world deployment. Another key feature of the proposed approach is the incorporation of a boolean observation within the controller, enabling the activation or deactivation of a stair-climbing mode. We present our results on different quadrupeds and bipedal robots in simulation and showcase how our method allows the balancing robot Ascento to climb 15cm stairs in the real world, a task that was previously impossible for this robot.
|
|
10:30-12:00, Paper WeAT17-AX.9 |
Modeling and Analysis of Combined Rimless Wheel with Tensegrity Spine |
|
Xiang, Yuxuan | Japan Advanced Institute of Science and Technology |
Zheng, Yanqiu | Ritsumeikan University |
Asano, Fumihiko | Japan Advanced Institute of Science and Technology |
Keywords: Passive Walking, Legged Robots
Abstract: In the natural world, benefiting from the advantages of the spine, quadrupeds exhibit extraordinary flexibility that allows them to move efficiently on variable terrain. Previous research has indicated that legged robots which efficiently utilize their spine can achieve rapid and stable locomotion. However, within the field of legged robot dynamics, how to design the spine and how it positively influences locomotion remain unclear, which matters for quadruped robots seeking efficient and stable walking. In this study, we propose a model combining a tensegrity spine with a rimless wheel to represent quadrupeds, and use passive dynamic walking, a well-established method for observing inherent characteristics, to exhibit the locomotion characteristics of the proposed model. Through numerical simulation, we observe how locomotion performance changes with the configuration of the spine's shape, and identify spine-design directions that have a positive impact on walking. These findings contribute to the design of spine structures in quadruped robots.
|
|
WeAT18-AX Oral Session, AX-206 |
Force Control and Sensing |
|
|
Chair: Tsuji, Toshiaki | Saitama University |
Co-Chair: Huang, Guoquan | University of Delaware |
|
10:30-12:00, Paper WeAT18-AX.1 |
Robot-Camera Calibration in Tightly Constrained Environment Using Interactive Perception |
|
Zhong, Fangxun | The Chinese University of Hong Kong |
Li, Bin | The Chinese University of Hong Kong |
Chen, Wei | The Chinese University of Hong Kong |
Liu, Yunhui | Chinese University of Hong Kong |
Keywords: Calibration and Identification, Sensor-based Control, Motion Control of Manipulators, Surgical Robotics: Laparoscopy
Abstract: Manipulation in tight environments is challenging but increasingly common in vision-guided robotic applications. The significantly reduced amount of available feedback (limited visual cues, field of view, robot motion space, etc.) hinders solving the hand-eye relationship accurately. In this paper, we propose a new generic approach for online camera-robot calibration that can cope with the minimal feedback input available in tight environments: an arbitrarily restricted motion space and a single feature point with unknown position for the robot end-effector. We introduce interactive perception to generate prescribed but tunable robot motions that reveal high-dimensional sensory feedback not obtainable from static images. We then define the interactive feature plane (IFP), whose spatial property corresponds to the robot-actuating trajectories. A depth-free adaptive controller is proposed based on image feedback, where the converged orientation of the IFP directly harvests the data for solving the hand-eye relationship. Our algorithm requires neither external calibration sensors/objects nor a large-scale data acquisition process. Simulations demonstrate the va
|
|
10:30-12:00, Paper WeAT18-AX.2 |
Degenerate Motions of Multisensor Fusion-Based Navigation |
|
Lee, Woosik | University of Delaware |
Chen, Chuchu | University of Delaware |
Huang, Guoquan | University of Delaware |
Keywords: Calibration and Identification, SLAM, Sensor Fusion
Abstract: System observability analysis is of practical importance, for example, due to its ability to identify the unobservable directions of the estimated state, which can influence estimation accuracy and help develop consistent and robust estimators. Recent studies have focused on analyzing the observability of the state of various multisensor systems, with particular interest in unobservable directions induced by degenerate motions. However, those studies mostly stay within a specific sensor domain and do not extend the understanding to other heterogeneous systems. To this end, in this work, we provide a degenerate motion analysis for general local and global sensor-paired systems, offering insights applicable to a wide range of existing navigation systems. Our analysis identifies 9 degenerate motions, including 5 already identified in the literature and 4 new motions, covering both synchronous and asynchronous sensor-pair cases. Comprehensive numerical studies are conducted to verify the identified motions, show the effect of degenerate motion on state estimation, and demonstrate the generalizability of our analysis on various multisensor systems.
|
|
10:30-12:00, Paper WeAT18-AX.3 |
Interaction Control for Tool Manipulation on Deformable Objects Using Tactile Feedback |
|
Zhang, Hanwen | Institute of Optics and Electronics, CAS |
Lu, Zeyu | National University of Singapore |
Liang, Wenyu | Institute for Infocomm Research, A*STAR |
Yu, Haoyong | National University of Singapore |
Mao, Yao | Institute of Optics and Electronics, CAS |
Wu, Yan | A*STAR Institute for Infocomm Research |
Keywords: Force and Tactile Sensing, Force Control, Contact Modeling
Abstract: The sense of touch enables humans to perform many delicate tasks on deformable objects and/or in a vision-denied environment. For a robot to achieve similar desirable interactions, such as administering a swab test, tactile information sensed beyond the tool-in-hand is correspondingly crucial for contact state estimation and contact tracking control. In this paper, we propose a tactile-guided planning and control framework using GTac, a hetero-Geneous Tactile sensor tailored for interaction with deformable objects beyond the immediate contact area. The biomimetic GTac in use is an improved version optimised for readout linearity which provides reliability in contact state estimation and force tracking. While a tactile-based classification and manipulation process is designed to estimate and align to the contact angle between the tool and the environment, a Koopman operator-based optimal control scheme is proposed to address challenges in the nonlinear control arising from the interaction with the deformable object. Several experiments are conducted to verify the effectiveness of the proposed framework. The experimental results demonstrate that the proposed framework can achieve accurate contact angle estimation as well as excellent tracking performance and strong robustness in force control.
|
|
10:30-12:00, Paper WeAT18-AX.4 |
Development of an Easy-To-Cut Six-Axis Force Sensor |
|
Kawahara, Takamasa | Saitama University |
Tsuji, Toshiaki | Saitama University |
Keywords: Force and Tactile Sensing, Force Control, Robotics and Automation in Agriculture and Forestry
Abstract: Although the potential demand for force sensors in both robotics and automation is high, the complexity of their structure increases the number of manufacturing processes. As a result, the rising cost of sensors has hindered the practical application of force measurement and force control. In this study, a flexure element comprising a structure that is easier to cut and process than conventional ones, as well as holes through the side of a cuboid, is proposed to simplify the manufacturing of force sensors. To ensure the safety of the proposed sensor design, an approximate equation is derived to predict the maximum von Mises stress on the flexure element using design parameters. Subsequently, we clarified a way to attach the strain gauge in a position that improves sensitivity. The results of the actual prototype sensor based on the proposed method show that the maximum nonlinearity error and decoupling error in the other axes are 0.442 %R.O. and 0.660 %R.O., respectively, and the performance is comparable to that of conventional force sensors. Because the prototype has a difference in resolution between the axes, a method for improving the resolution isotropy without changing the difficulty of machining is also proposed. In addition, the validity of the proposed method is demonstrated using experiments. Consequently, a force sensor with the same level of performance was developed using the proposed method, and the cutting process was made easier compared to that of convention
|
|
10:30-12:00, Paper WeAT18-AX.5 |
An Ultra-Fast Intrinsic Contact Sensing Method for Medical Instruments with Arbitrary Shape |
|
Cao, Guanglin | Institute of Automation, Chinese Academy of Sciences |
Chen, Mingcong | City University of Hong Kong |
Hu, Jian | Institute of Automation, Chinese Academy of Sciences |
Liu, Hongbin | Hong Kong Institute of Science & Innovation, Chinese Academy Of |
Keywords: Force Control, Force and Tactile Sensing, Medical Robots and Systems
Abstract: Intraoperative contact sensing has the potential to reduce the risk of surgical errors and enhance manipulation capabilities for medical robots, particularly in contact force control. Current intrinsic force sensing (IFS) methods are limited in application to medical instruments with arbitrary shape, due to high computational time and reliance on precise surface equations. This study presents an ultra-fast IFS method that uses multiple planes to establish surface geometry descriptions. The method can reduce high-order contact mechanical models that need to be solved iteratively to a set of linear equations, and calculate contact location analytically. In addition, a robot motion control approach based on the contact sensing method is proposed to maintain stable contact force and regulate the probe's orientation for robotic ultrasound systems (RUSS). Experimental results show that the contact sensing method is robust to friction and can achieve a mean (±SD) displacement error of 1.04±0.43 mm in contact location with computational time less than 1 ms. The system has been evaluated on a phantom with sinusoidal motion. To the best of our knowledge, this is the first study to validate adaptiveness of RUSS under dynamic conditions. The results demonstrated that the system exhibits comparable manipulation capabilities to human operators with only force sensing, indicating a high level of adaptiveness.
|
|
10:30-12:00, Paper WeAT18-AX.6 |
Proprioceptive-Based Whole-Body Disturbance Rejection Control for Dynamic Motions in Legged Robots |
|
Zhu, Zhengguo | Shandong University |
Zhang, Guoteng | Shandong University |
Sun, Zhongkai | Shandong University |
Chen, Teng | Shandong University |
Rong, Xuewen | Shandong University |
Xie, Anhuan | Zhejiang University |
Li, Yibin | Shandong University |
Keywords: Force Control, Motion Control, Robust/Adaptive Control
Abstract: This paper presents a control framework for legged robots that enables self-perception and resistance to external disturbances. First, a novel proprioceptive-based disturbance estimator is proposed. Compared with other disturbance estimators, this estimator possesses notable advantages in terms of filtering foot-ground interaction noise and suppressing the accumulation of estimation errors. Additionally, our estimator is a fully proprioceptive-based estimator, eliminating the need for any exteroceptive devices or observers. Second, we present a hierarchical optimized whole-body controller (WBC), which takes into account the full body dynamics, the actuation limits, the external disturbances, and the interactive constraints. Finally, extensive experimental trials conducted on the point-foot biped robot BRAVER validate the capabilities of the proposed estimator and controller under various disturbance conditions.
|
|
10:30-12:00, Paper WeAT18-AX.7 |
Contact Force Estimation of Robot Manipulators with Imperfect Dynamic Model: On Gaussian Process Adaptive Disturbance Kalman Filter (I) |
|
Wei, Yanran | Beihang University |
Lyu, Shangke | Nanyang Technological University |
Li, Wenshuo | Beihang University |
Yu, Xiang | Beihang University |
Guo, Lei | Beihang University |
Keywords: Industrial Robots, Force and Tactile Sensing, Calibration and Identification
Abstract: This paper is concerned with the contact force estimation problem of robot manipulators based on imperfect dynamic models of the manipulator and the contact force. To handle the imperfect dynamic information of the manipulator, a hybrid model, consisting of the nominal model and the residual dynamics, is established for the manipulator, and the Gaussian process regression (GPR) technique is employed to learn the mean and covariance of the residual dynamics. On this basis, a virtual measurement equation is established for contact force estimation and a Gaussian process adaptive disturbance Kalman filter (GPADKF) is developed, where the variational Bayes technique is employed to achieve online identification of the noise statistics in the force dynamics. The GPADKF is capable of decoupling the contact force from residual dynamics and system noises, thereby reducing the dependence on accurate dynamic models of the manipulator and the contact force. Simulation and experimental results demonstrate that the proposed scheme outperforms state-of-the-art methods.
|
|
WeAT19-NT Oral Session, NT-G301 |
Medical Robots IV |
|
|
Chair: Mylonas, George | Imperial College London |
Co-Chair: Navab, Nassir | TU Munich |
|
10:30-12:00, Paper WeAT19-NT.1 |
On the Disentanglement of Tube Inequalities in Concentric Tube Continuum Robots |
|
Grassmann, Reinhard M. | University of Toronto |
Senyk, Anastasiia | Ukrainian Catholic University |
Burgner-Kahrs, Jessica | University of Toronto |
Keywords: Medical Robots and Systems, Modeling, Control, and Learning for Soft Robots, Kinematics
Abstract: Concentric tube continuum robots utilize nested tubes, which are subject to a set of inequalities. Current approaches to account for these inequalities rely on branching methods such as if-else statements. Such branching can introduce discontinuities, may result in a complicated decision tree, has a high wall-clock time, and cannot be vectorized. This affects the behavior and results of downstream methods in control, learning, workspace estimation, and path planning, among others. In this paper, we investigate a mapping to mitigate branching methods. We derive a lower triangular transformation matrix to disentangle the inequalities and prove its unique existence. It transforms the interdependent inequalities into independent box constraints. Further investigations are made for sampling, control, and workspace estimation. Approaches utilizing the proposed mapping are at least 14 times faster (up to 176 times faster), always generate valid joint configurations, are more interpretable, and are easier to extend.
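The disentangling idea can be illustrated with a minimal sketch: an interdependent ordering constraint b_1 <= b_2 <= ... <= b_n becomes a set of independent non-negativity box constraints on increments under an all-ones lower triangular map. The cumulative-sum matrix below is an illustrative stand-in, not necessarily the paper's exact transformation.

```python
import numpy as np

def ordered_from_box(d, b0=0.0):
    """Map non-negative increments d (independent box constraints, each d_i >= 0)
    to a monotonically non-decreasing sequence b via a lower-triangular matrix."""
    L = np.tril(np.ones((len(d), len(d))))  # all-ones lower-triangular map
    return b0 + L @ d                       # b_k = b0 + d_1 + ... + d_k

# each d_i can be sampled independently from a box; ordering holds by construction
d = np.array([0.2, 0.0, 0.5])
b = ordered_from_box(d)
```

Because the box constraints are independent, sampling or optimizing over them needs no if-else logic and vectorizes trivially, which mirrors the speedups the abstract reports.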
|
|
10:30-12:00, Paper WeAT19-NT.2 |
3D Navigation of a Magnetic Swimmer Using a 2D Ultrasonography Probe Manipulated by a Robotic Arm for Position Feedback |
|
Gorroochurn, Premal | Columbia University |
Hong, Charles | Georgia Institute of Technology |
Klebuc, Carter | University of Houston |
Lu, Yitong | University of Houston |
Phan, Ngoc Tu Khue | Kerr High School |
Garcia Gonzalez, Javier | University of Houston |
Becker, Aaron | University of Houston |
Leclerc, Julien | University of Houston |
Keywords: Medical Robots and Systems, Motion Control, Control Architectures and Programming
Abstract: Millimeter-scale magnetic rotating swimmers have multiple potential medical applications. They could, for example, navigate inside the bloodstream of a patient toward an occlusion and remove it. Magnetic rotating swimmers have internal magnets and propeller fins with a helical shape. A rotating magnetic field applies torque on the swimmer and makes it rotate. The shape of the swimmer, combined with the rotational movement, generates a propulsive force. Visual feedback is suitable for in-vitro closed-loop control. However, in-vivo procedures will require different feedback modalities due to the opacity of the human body. In this paper, we provide new methods and tools that enable the 3D control of a magnetic swimmer using a 2D ultrasonography device attached to a robotic arm to sense the swimmer's position. We also provide an algorithm that computes the placement of the robotic arm and a controller that keeps the swimmer within the ultrasound imaging slice. The position measurement and closed-loop control were tested experimentally.
|
|
10:30-12:00, Paper WeAT19-NT.3 |
An Intelligent Robotic Endoscope Control System Based on Fusing Natural Language Processing and Vision Models |
|
Dong, Beili | Imperial College London |
Chen, Junhong | Imperial College London |
Wang, Zeyu | Imperial College London |
Deng, Kaizhong | Imperial College London |
Li, Yiping | Imperial College London |
Lo, Benny Ping Lai | Imperial College London |
Mylonas, George | Imperial College London |
Keywords: Medical Robots and Systems, Motion Control
Abstract: In recent years, the area of Robot-Assisted Minimally Invasive Surgery (RAMIS) has stood on the verge of a new wave of innovations. However, autonomy in RAMIS is still at a primitive stage. Therefore, most surgeries still require manual control of the endoscope and the robotic instruments, forcing surgeons to switch attention between performing surgical procedures and moving the endoscope camera. Automation may reduce the complexity of surgical operations and consequently reduce the cognitive load on the surgeon while speeding up the surgical process. In this paper, a hybrid robotic endoscope control system is proposed, based on a fusion of a natural language processing (NLP) model and a modified YOLO-V8 vision model. The proposed system can analyze the current surgical workflow and generate logs summarizing the procedure for teaching and for providing feedback to junior surgeons. A user study of the system indicated a significant reduction in the number of clutching actions and in mean task time, effectively enhancing surgical training.
|
|
10:30-12:00, Paper WeAT19-NT.4 |
AiAReSeg: Catheter Detection and Segmentation in Interventional Ultrasound Using Transformers |
|
Ranne, Alex | Imperial College London |
Velikova, Yordanka | TU Munich |
Navab, Nassir | TU Munich |
Rodriguez y Baena, Ferdinando | Imperial College, London, UK |
Keywords: Medical Robots and Systems, Object Detection, Segmentation and Categorization, Simulation and Animation
Abstract: To date, endovascular surgeries are performed using the gold standard of fluoroscopy, which uses ionising radiation to visualise catheters and vasculature. Prolonged fluoroscopic exposure is harmful for the patient and the clinician, and may lead to severe post-operative sequelae such as the development of cancer. Meanwhile, the use of interventional ultrasound has gained popularity due to its well-known benefits: a small spatial footprint, fast data acquisition, and higher tissue contrast. However, ultrasound images are hard to interpret, and it is difficult to localise vessels, catheters, and guidewires within them. This work proposes a solution using an adaptation of a state-of-the-art machine learning architecture (Transformers) to detect and segment catheters in axial interventional ultrasound image sequences. The network architecture was inspired by the Attention in Attention mechanism and temporal tracking networks, and introduces a novel 3D segmentation head that performs 3D deconvolution across time. To facilitate training of such deep learning networks, we introduce a new data synthesis pipeline that uses physics-based catheter insertion simulations, along with a convolutional ray-casting ultrasound simulator, to produce synthetic ultrasound images of endovascular interventions. The proposed method was validated on a hold-out validation dataset, demonstrating robustness to ultrasound noise and a wide range of scanning angles. It was also tested on data collected from silicone-based aorta phantoms, demonstrating its potential for sim-to-real translation. This work represents a significant step towards safer and more efficient endovascular surgery using interventional ultrasound.
|
|
10:30-12:00, Paper WeAT19-NT.5 |
Hybrid Robot for Percutaneous Needle Intervention Procedures: Mechanism Design and Experiment Verification |
|
Zhang, Hanyi | Imperial College London |
Yao, Guocai | Tsinghua University |
Zhang, Feifan | University College London |
Lin, Fanchuan | Beihang University |
Sun, Fuchun | Tsinghua University |
Keywords: Medical Robots and Systems, Parallel Robots, Mechanism Design
Abstract: This paper presents a 6-DOF hybrid robot for percutaneous needle intervention procedures. The new robot combines the advantages of both serial and parallel robots, featuring compactness, high accuracy, and a small footprint, while overcoming the high cost of serial robots and the small workspace and singularity issues of parallel robots. In addition, by analyzing the workspace of the robot, an equation relating the structural parameters to the workspace is derived, so that the robot's parameters can be adjusted to satisfy different working scenarios. Experiments show that the accuracy of the robot is related to position, distance, and insertion angle. The results show that performance is better when working near the center of the workspace and away from the servos, and that the average error of the robot is 1.39 mm. A phantom experiment of lumbar puncture validates its feasibility.
|
|
10:30-12:00, Paper WeAT19-NT.6 |
Envibroscope: Real-Time Monitoring and Prediction of Environmental Motion for Enhancing Safety in Robot-Assisted Microsurgery |
|
Alikhani, Alireza | Augen Klinik Und Poliklinik, Klinikum Rechts Der Isar Der Techn |
Inagaki, Satoshi | NSK.Ltd |
Dehghani, Shervin | TUM |
Maier, Mathias | Klinikum Rechts Der Isar Der TU München |
Navab, Nassir | TU Munich |
Nasseri, M. Ali | Technische Universitaet Muenchen |
Keywords: Medical Robots and Systems, Robot Safety, Machine Learning for Robot Control
Abstract: Several robotic systems have emerged in the recent past to enhance the precision of microsurgeries such as retinal procedures. Significant advancements have recently been achieved to increase the precision of such systems beyond surgeon capabilities. However, little attention has been paid to the impact of non-predicted and sudden movements of the patient and the environment. Therefore, analyzing environmental motion and vibrations is crucial to ensuring the optimal performance and reliability of medical systems that require micron-level precision, especially in real-life scenarios. To address this challenge, this paper introduces a novel environmental motion analysis system that employs a grid layout with distributed sensing nodes throughout the environment. This system effectively tracks undesired movements at designated locations and predicts upcoming motions using neural network-based approaches. The outcomes of our experiments exhibit promising prospects for real-time motion monitoring and prediction, which has the potential to form a solid basis for enhancing the automation, safety, integration, and overall efficiency of robot-assisted microsurgeries.
|
|
10:30-12:00, Paper WeAT19-NT.7 |
Cooperative vs. Teleoperation Control of the Steady Hand Eye Robot with Adaptive Sclera Force Control: A Comparative Study |
|
Esfandiari, Mojtaba | Johns Hopkins University |
Kim, Ji Woong | Johns Hopkins University |
Zhao, Botao | Johns Hopkins University |
Amirkhani, Golchehr | Johns Hopkins University |
Hadi, Muhammad | Johns Hopkins University |
Gehlbach, Peter | Johns Hopkins Medical Institute |
Taylor, Russell H. | The Johns Hopkins University |
Iordachita, Ioan Iulian | Johns Hopkins University |
Keywords: Medical Robots and Systems, Robust/Adaptive Control, Force Control
Abstract: A surgeon's physiological hand tremor can significantly impact the outcome of delicate and precise retinal surgery, such as retinal vein cannulation (RVC) and epiretinal membrane peeling. Robot-assisted eye surgery technology provides ophthalmologists with advanced capabilities such as hand tremor cancellation, hand motion scaling, and safety constraints that enable them to perform these otherwise challenging and high-risk surgeries with high precision and safety. The Steady-Hand Eye Robot (SHER) in cooperative control mode can filter out the surgeon's hand tremor; however, another important safety feature, namely minimizing the contact force between the surgical instrument and the sclera surface to avoid tissue damage, cannot be met in this control mode. Other capabilities, such as hand motion scaling and haptic feedback, also require a teleoperation control framework. In this work, for the first time, we implement a teleoperation control mode incorporating an adaptive sclera force control algorithm, using a PHANTOM Omni haptic device and a force-sensing surgical instrument equipped with Fiber Bragg Grating (FBG) sensors attached to the SHER 2.1 end-effector. This adaptive sclera force control algorithm allows the robot to dynamically minimize the tool-sclera contact force. Moreover, for the first time, we compare the performance of the proposed adaptive teleoperation mode with the cooperative mode by conducting a vessel-following experiment inside an eye phantom under a microscope.
|
|
10:30-12:00, Paper WeAT19-NT.8 |
Adaptive Motion Scaling for Robot-Assisted Microsurgery Based on Hybrid Offline Reinforcement Learning and Damping Control |
|
Jiang, Peiyang | University of Bristol |
Li, Wei | Imperial College London |
Li, Yifan | University of Bristol |
Zhang, Dandan | Imperial College London |
Keywords: Medical Robots and Systems, Robust/Adaptive Control
Abstract: Motion scaling is essential to empower users to conduct precise manipulation during teleoperation for robot-assisted microsurgery (RAMS). A constant, small motion scaling ratio can enhance the precision of teleoperation but hinder the operator from quickly reaching distant targets. The concept of self-adaptive motion scaling has been proposed in previous work. However, previous frameworks required extensive manual tuning of core parameters, which significantly depends on prior knowledge and may potentially lead to non-optimal solutions. This paper presents a hybrid offline reinforcement learning and damping control approach to regulate the motion scaling ratio for different operations during offline training. This method can take user-specific characteristics into consideration and help them achieve better teleoperation performance. Comparisons are made with and without using the adaptive motion-scaling algorithm. Detailed user studies indicate that a suitable motion-scaling ratio can be obtained and adjusted online. The overall performance of the operators in terms of time cost for task completion is significantly improved, while the variance of average speed and the total distance for robot operation is reduced.
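The trade-off the abstract describes (small scaling ratio for precision, large ratio for reaching distant targets) can be sketched with a simple speed-dependent rule. The paper learns the ratio with offline reinforcement learning and damping control; the function below only illustrates the general idea, and all constants are made-up tuning parameters.

```python
def adaptive_scale(master_speed, r_min=0.1, r_max=1.0, k=5.0):
    """Hypothetical speed-adaptive motion scaling ratio: small at low hand
    speed (fine manipulation), approaching r_max when the operator moves
    fast (gross reaching). master_speed is the hand speed magnitude."""
    r = r_min + (r_max - r_min) * (1.0 - 1.0 / (1.0 + k * master_speed))
    return min(max(r, r_min), r_max)  # clamp to the allowed range

# the slave-side increment is then: slave_delta = adaptive_scale(speed) * master_delta
```

A smooth, bounded mapping like this avoids abrupt jumps in the commanded increment, which is the same stability concern that motivates the damping control in the paper.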
|
|
10:30-12:00, Paper WeAT19-NT.9 |
Chained Flexible Capsule Endoscope: Unraveling the Conundrum of Size Limitations and Functional Integration for Gastrointestinal Transitivity |
|
Yuan, Sishen | The Chinese University of Hong Kong |
Li, Guang | Chinese University of Hong Kong |
Liang, Baijia | Chinese University of Hong Kong |
Li, Lailu | The Chinese University of Hong Kong |
Zheng, Qingzhuo | The Chinese University of Hong Kong |
Song, Shuang | Harbin Institute of Technology (Shenzhen) |
Li, Zhen | Qilu Hospital of Shandong University |
Ren, Hongliang | Chinese Univ Hong Kong (CUHK) & National Univ Singapore(NUS) |
Keywords: Medical Robots and Systems, Soft Robot Applications
Abstract: Capsule endoscopes, predominantly serving diagnostic functions, provide lucid internal imagery but are devoid of surgical or therapeutic capabilities. Consequently, despite lesion detection, physicians frequently resort to traditional endoscopic or open surgical procedures for treatment, resulting in more complex, potentially risky interventions. To surmount these limitations, this study introduces a flexible capsule endoscope (FCE) design concept, specifically conceived to navigate the inherent volume constraints of capsule endoscopes whilst augmenting their therapeutic functionalities. The FCE’s distinctive flexibility originates from a conventional rotating joint design and the incision pattern in the flexible material. In vitro experiments validated the passive navigation ability of the FCE in rugged intestinal tracts. Further, the FCE demonstrates consistent reptile-like peristalsis under the influence of an external magnetic field, and possesses the capability for film expansion and disintegration under high-frequency electromagnetic stimulation. These findings illuminate a promising path toward amplifying the therapeutic capacities of capsule endoscopes without necessitating a size compromise.
|
|
WeAT20-NT Oral Session, NT-G302 |
Performance Evaluation and Benchmarking |
|
|
Chair: Roa, Maximo A. | German Aerospace Center (DLR) |
Co-Chair: Alenyà, Guillem | Institut de Robòtica i Informàtica Industrial, CSIC-UPC |
|
10:30-12:00, Paper WeAT20-NT.1 |
A Group Theoretic Metric for Robot State Estimation Leveraging Chebyshev Interpolation |
|
Agrawal, Varun | Georgia Institute of Technology |
Dellaert, Frank | Verdant Robotics/Georgia Tech |
Keywords: Performance Evaluation and Benchmarking
Abstract: We propose a new metric for robot state estimation based on the recently introduced SE_2(3) Lie group definition. Our metric is related to prior metrics for SLAM but explicitly takes into account the linear velocity of the state estimate, improving over current pose-based trajectory analysis. This has the benefit of providing a single, quantitative metric to evaluate state estimation algorithms against, while being compatible with existing tools and libraries. Since ground truth data generally consists of pose data from motion capture systems, we also propose an approach to compute the ground truth linear velocity based on polynomial interpolation. Using Chebyshev interpolation and a pseudospectral parameterization, we can accurately estimate the ground truth linear velocity of the trajectory in an optimal fashion with best approximation error. We demonstrate how this approach performs on multiple robotic platforms where accurate state estimation is vital, and compare it to alternative approaches such as finite differences. The pseudospectral parameterization also provides a means of trajectory data compression as an additional benefit. Experimental results show our method provides a valid and accurate means of comparing state estimation systems, which is also easy to interpret and report.
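The velocity-recovery idea can be sketched in a few lines using NumPy's Chebyshev utilities rather than the paper's pseudospectral parameterization: fit a Chebyshev series to mocap positions, then differentiate the fitted series analytically instead of finite-differencing noisy samples. The trajectory below is a toy 1-D example, not data from the paper.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

t = np.linspace(0.0, 2.0, 200)   # assumed mocap timestamps [s]
x = 0.5 * t**2                   # toy 1-D position track (constant 1 m/s^2)

coef = C.chebfit(t, x, deg=4)    # least-squares Chebyshev series fit
dcoef = C.chebder(coef)          # analytic derivative of the series
v = C.chebval(t, dcoef)          # ground-truth linear velocity estimate

# for this noise-free quadratic, the recovered velocity should be v(t) = t
```

Differentiating the fitted polynomial avoids the noise amplification of finite differences, which is the comparison the abstract draws.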
|
|
10:30-12:00, Paper WeAT20-NT.2 |
AD4RL: Autonomous Driving Benchmarks for Offline Reinforcement Learning with Value-Based Dataset |
|
Lee, Dongsu | Soongsil University |
Eom, Chanin | Soongsil University |
Kwon, Minhae | Soongsil University |
Keywords: Performance Evaluation and Benchmarking, Data Sets for Robot Learning, Reinforcement Learning
Abstract: Offline reinforcement learning has emerged as a promising technology, enhancing practicality through the use of large pre-collected datasets. Despite its practical benefits, most algorithm development research in offline reinforcement learning still relies on game tasks with synthetic datasets. To address this limitation, this paper provides autonomous driving datasets and benchmarks for offline reinforcement learning research. We provide 19 datasets, including real-world human drivers' datasets, and seven popular offline reinforcement learning algorithms in three realistic driving scenarios. We also provide a unified decision-making process model that can operate effectively across different scenarios, serving as a reference framework in algorithm design. Our research lays the groundwork for further collaborations in the community to explore practical aspects of existing reinforcement learning methods. Datasets and code can be found at https://sites.google.com/view/ad4rl.
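Offline RL datasets are commonly distributed in a D4RL-style dictionary of aligned transition arrays; the sketch below assumes that convention to show how an offline algorithm would sample minibatches from a pre-collected driving dataset. Field names and dimensions are illustrative and are not confirmed for AD4RL specifically.

```python
import numpy as np

# Assumed D4RL-style layout: one entry per logged transition.
dataset = {
    "observations": np.zeros((1000, 24), dtype=np.float32),
    "actions":      np.zeros((1000, 2),  dtype=np.float32),  # e.g. steering, accel
    "rewards":      np.zeros((1000,),    dtype=np.float32),
    "terminals":    np.zeros((1000,),    dtype=bool),
}

def sample_batch(data, batch_size, rng):
    """Uniformly sample a transition minibatch, as offline RL training loops do."""
    idx = rng.integers(0, len(data["rewards"]), size=batch_size)
    return {k: v[idx] for k, v in data.items()}

batch = sample_batch(dataset, 256, np.random.default_rng(0))
```

Because training never queries the environment, the whole algorithm-dataset interface reduces to this sampling step, which is what makes a shared dataset format useful for benchmarking.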
|
|
10:30-12:00, Paper WeAT20-NT.3 |
The Cluttered Environment Picking Benchmark (CEPB) for Advanced Warehouse Automation |
|
D'Avella, Salvatore | Sant'Anna School of Advanced Studies |
Bianchi, Matteo | University of Pisa |
Sundaram, Ashok M. | German Aerospace Center (DLR) |
Avizzano, Carlo Alberto | Scuola Superiore Sant'Anna |
Roa, Maximo A. | German Aerospace Center (DLR) |
Tripicchio, Paolo | Scuola Superiore Sant'Anna |
Keywords: Performance Evaluation and Benchmarking, Grasping, Dexterous Manipulation
Abstract: Autonomous and reliable robotic grasping is a desirable functionality in robotic manipulation and is still an open problem. Standardized benchmarks are important tools for evaluating and comparing robotic grasping and manipulation systems among different research groups, while also sharing with the community best practices for learning from errors. An ideal benchmarking protocol should encompass the different aspects underpinning grasp execution, including the mechatronic design of grippers, planning, perception, and control, to give information on each aspect and on the overall problem. The proposed work gives an overview of the benchmarks, datasets, and competitions that have been proposed and adopted in the last few years, and presents a novel benchmark with protocols for different tasks that evaluate both the single components of the system and the system as a whole, introducing an evaluation metric that allows for a fair comparison in highly cluttered scenes while taking into account the difficulty of the clutter. A website dedicated to the benchmark, containing information on the different tasks, maintaining the leaderboards, and serving as a contact point for the community, is also provided.
|
|
10:30-12:00, Paper WeAT20-NT.4 | Add to My Program |
SceneReplica: Benchmarking Real-World Robot Manipulation by Creating Replicable Scenes |
|
Khargonkar, Ninad | University of Texas at Dallas |
Allu, Sai Haneesh | The University of Texas at Dallas |
Lu, Yangxiao | The University of Texas at Dallas |
P, Jishnu Jaykumar | The University of Texas at Dallas |
Prabhakaran, B | University of Texas at Dallas |
Xiang, Yu | University of Texas at Dallas |
Keywords: Performance Evaluation and Benchmarking, Grasping, Perception for Grasping and Manipulation
Abstract: We present a new reproducible benchmark for evaluating robot manipulation in the real world, specifically focusing on a pick-and-place task. Our benchmark uses the YCB object set, a commonly used dataset in the robotics community, to ensure that our results are comparable to other studies. Additionally, the benchmark is designed to be easily reproducible in the real world, making it accessible to researchers and practitioners. We also provide our experimental results and analyses for model-based and model-free 6D robotic grasping on the benchmark, where representative algorithms are evaluated for object perception, grasp planning, and motion planning. We believe that our benchmark will be a valuable tool for advancing the field of robot manipulation. By providing a standardized evaluation framework, researchers can more easily compare different techniques and algorithms, leading to faster progress in developing robot manipulation methods. The appendix, code, and videos for the project are available at https://irvlutd.github.io/SceneReplica.
|
|
10:30-12:00, Paper WeAT20-NT.5 | Add to My Program |
CRITERIA: A New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving |
|
Chen, Changhe | University of Toronto |
Pourkeshavarz, Mozhgan | Research Scientist at Huawei |
Rasouli, Amir | Huawei Technologies Canada |
Keywords: Performance Evaluation and Benchmarking, Intelligent Transportation Systems, Intention Recognition
Abstract: Benchmarking is a common method for evaluating trajectory prediction models for autonomous driving. Existing benchmarks rely on datasets that are biased towards more common scenarios, such as cruising, and on distance-based metrics computed by averaging over all scenarios. Following such a regimen provides little insight into the properties of the models, both in terms of how well they handle different scenarios and how admissible and diverse their outputs are. A number of complementary metrics have been designed to measure the admissibility and diversity of trajectories; however, they suffer from biases, such as sensitivity to trajectory length. In this paper, we propose a new benChmarking paRadIgm for evaluaTing trajEctoRy predIction Approaches (CRITERIA). In particular, we propose 1) a method for extracting driving scenarios at varying levels of specificity, according to the structure of the roads, the models' performance, and data properties, for fine-grained ranking of prediction models; 2) a set of new bias-free metrics that measure diversity by incorporating the characteristics of a given scenario, and admissibility by considering the structure of roads and kinematic compliance, motivated by real-world driving constraints; and 3) using the proposed benchmark, extensive experimentation on a representative set of prediction models with the large-scale Argoverse dataset. We show that the proposed benchmark produces a more accurate ranking of the models and serves as a means of characterizing their behavior. We further present ablation studies to highlight the contributions of the different elements used to compute the proposed metrics.
|
|
10:30-12:00, Paper WeAT20-NT.6 | Add to My Program |
LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection |
|
Hung, Wei-Chih | Waymo |
Casser, Vincent Michael | Waymo |
Kretzschmar, Henrik | Waymo |
Hwang, Jyh-Jing | Waymo |
Anguelov, Dragomir | Waymo |
Keywords: Performance Evaluation and Benchmarking, Object Detection, Segmentation and Categorization, Data Sets for Robotic Vision
Abstract: 3D Average Precision (3D AP) relies on the intersection over union between predictions and ground-truth objects. However, camera-only detectors have limited depth accuracy, which may cause otherwise reasonable predictions suffering from longitudinal localization errors to be treated as false positives. We therefore propose variants of the 3D AP metric that are more permissive with respect to depth estimation errors. Specifically, our novel longitudinal-error-tolerant metrics, LET-3D-AP and LET-3D-APL, allow longitudinal localization errors of the prediction boxes up to a given tolerance. To evaluate the proposed metrics, we also construct a new test set for the Waymo Open Dataset, tailored to camera-only 3D detection methods. Surprisingly, we find that state-of-the-art camera-based detectors can outperform popular LiDAR-based detectors under our new metrics at a 10% depth error tolerance, suggesting that existing camera-based detectors already have the potential to surpass LiDAR-based detectors in downstream applications. We believe the proposed metrics and the new benchmark dataset will facilitate advances in the field of camera-only 3D detection by providing more informative signals that better indicate system-level performance.
|
|
10:30-12:00, Paper WeAT20-NT.7 | Add to My Program |
HuNavSim: A ROS 2 Human Navigation Simulator for Benchmarking Human-Aware Robot Navigation |
|
Perez-Higueras, Noe | University Pablo De Olavide |
Otero, Roberto | University Pablo De Olavide |
Caballero, Fernando | Universidad De Sevilla |
Merino, Luis | Universidad Pablo De Olavide |
Keywords: Performance Evaluation and Benchmarking, Simulation and Animation, Human-Aware Motion Planning
Abstract: This work presents the Human Navigation Simulator (HuNavSim), a novel open-source tool for simulating different human-agent navigation behaviors in scenarios with mobile robots. The tool, the first of its kind programmed under the ROS 2 framework, can be used together with well-known robotics simulators such as Gazebo. The main goal is to facilitate the development and evaluation of human-aware robot navigation systems in simulation. In addition to a general human-navigation model, HuNavSim includes, as a novelty, a rich set of individual and varied human navigation behaviors and a comprehensive set of metrics for social navigation benchmarking.
|
|
10:30-12:00, Paper WeAT20-NT.8 | Add to My Program |
RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance |
|
Mayoral-Vilches, Victor | Klagenfurt University |
Jabbour, Jason | Harvard University |
Hsiao, Yu-Shun | Harvard University |
Wan, Zishen | Georgia Institute of Technology |
Crespo-Álvarez, Martiño | Acceleration Robotics |
Stewart, Matthew | Harvard University |
Reina-Muñoz, Juan Manuel | Acceleration Robotics |
Nagras, Prateek | Acceleration Robotics |
Vikhe, Gaurav | Acceleration Robotics |
Bakhshalipour, Mohammad | Carnegie Mellon University |
Pinzger, Martin | Universität Klagenfurt |
Rass, Stefan | Alpen-Adria Universität Klagenfurt |
Panigrahi, Smruti | Ford Motor Company |
Corradi, Giulio | AMD |
Roy, Niladri | Intel |
Gibbons, Phillip | Carnegie Mellon University |
Neuman, Sabrina | Boston University |
Plancher, Brian | Barnard College, Columbia University |
Janapa Reddi, Vijay | Harvard University |
Keywords: Performance Evaluation and Benchmarking, Software Tools for Benchmarking and Reproducibility, Computer Architecture for Robotic and Automation
Abstract: We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and replacing them with a test application, and grey-box testing, an application-specific measure that observes internal system states with minimal interference. Our benchmarking framework provides ready-to-use tools and is easily adaptable for the assessment of custom ROS 2 computational graphs. Drawing from the knowledge of leading robot architects and system architecture experts, RobotPerf establishes a standardized approach to robotics benchmarking. As an open-source initiative, RobotPerf remains committed to evolving with community input to advance the future of hardware-accelerated robotics.
|
|
10:30-12:00, Paper WeAT20-NT.9 | Add to My Program |
Standardization of Cloth Objects and Its Relevance in Robotic Manipulation |
|
Garcia-Camacho, Irene | Institut de Robòtica i Informàtica Industrial CSIC-UPC |
Longhini, Alberta | KTH Royal Institute of Technology |
Welle, Michael C. | KTH Royal Institute of Technology |
Alenyà, Guillem | CSIC-UPC |
Kragic, Danica | KTH |
Borràs Sol, Júlia | Institut de Robòtica i Informàtica Industrial (CSIC-UPC) |
Keywords: Performance Evaluation and Benchmarking, Software Tools for Benchmarking and Reproducibility, Grasping
Abstract: The field of robotics faces inherent challenges in manipulating deformable objects, particularly in understanding and standardising fabric properties like elasticity, stiffness, and friction. While the significance of these properties is evident in the realm of cloth manipulation, accurately categorising and comprehending them in real-world applications remains elusive. This study sets out to address two primary objectives: (1) to provide a framework suitable for robotics applications to characterise cloth objects, and (2) to study how these properties influence robotic manipulation tasks. Our preliminary results validate the framework's ability to characterise cloth properties and compare cloth sets, and reveal the influence that different properties have on the outcome of five manipulation primitives. We believe that, in general, results on the manipulation of clothes should be reported along with a better description of the garments used in the evaluation. This paper proposes a set of these measures.
|
|
WeAT22-NT Oral Session, NT-G304 |
Add to My Program |
Marine Robotics IV |
|
|
Chair: Hollinger, Geoffrey | Oregon State University |
Co-Chair: Jang, Junwoo | University of Michigan |
|
10:30-12:00, Paper WeAT22-NT.1 | Add to My Program |
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization |
|
Paine, Tyler | Massachusetts Institute of Technology |
Benjamin, Michael | Massachusetts Institute of Technology |
Keywords: Marine Robotics, Multi-Robot Systems, Cooperating Robots
Abstract: This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a non-linear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS through the selection of a relatively small set of parameters. The resulting behavior - both collective actions and individual actions - can be understood intuitively. The approach is entirely decentralized, and the communication cost scales with the number of group options, not the number of agents. We demonstrated the effectiveness of this approach using a hypothetical 'explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
|
|
10:30-12:00, Paper WeAT22-NT.2 | Add to My Program |
Convex Geometric Trajectory Tracking Using Lie Algebraic MPC for Autonomous Marine Vehicles |
|
Jang, Junwoo | University of Michigan |
Teng, Sangli | University of Michigan, Ann Arbor |
Ghaffari, Maani | University of Michigan |
Keywords: Marine Robotics, Optimization and Optimal Control, Motion Control
Abstract: Controlling marine vehicles in challenging environments is a complex task due to the presence of nonlinear hydrodynamics and uncertain external disturbances. Despite nonlinear model predictive control (MPC) showing potential in addressing these issues, its practical implementation is often constrained by computational limitations. In this paper, we propose an efficient controller for trajectory tracking of marine vehicles by employing a convex error-state MPC on the Lie group. By leveraging the inherent geometric properties of the Lie group, we can construct globally valid error dynamics and formulate a quadratic programming-based optimization problem. Our proposed MPC demonstrates effectiveness in trajectory tracking through extensive numerical simulations, including scenarios involving ocean currents. Notably, our method substantially reduces computation time compared to nonlinear MPC, making it well-suited for real-time control applications with long prediction horizons or involving small marine vehicles.
|
|
10:30-12:00, Paper WeAT22-NT.3 | Add to My Program |
Mission Planning for Multiple Autonomous Underwater Vehicles with Constrained in Situ Recharging |
|
Singh, Priti | Oregon State University |
Hollinger, Geoffrey | Oregon State University |
Keywords: Marine Robotics, Path Planning for Multiple Mobile Robots or Agents, Energy and Environment-Aware Automation
Abstract: Persistent operation of Autonomous Underwater Vehicles (AUVs) without manual interruption for recharging saves time and total cost for offshore monitoring and data collection applications. To enable AUVs to sustain long mission durations without ship support, they can be equipped with docking capabilities to recharge in situ at Wave Energy Converter (WEC) dock recharging stations. However, the power generated at the recharging stations may be constrained depending on the sea conditions. Therefore, a robust mission planning framework is proposed using a centralized Evolutionary Algorithm (EA) and a decentralized Monte Carlo Tree Search (MCTS) method. Both methods incorporate the charge availability constraint at the recharging station in addition to the maximum charge capacity of each AUV. The planner utilizes a time-varying power profile of irregular waves incident at WECs for dock charging and generates efficient mission plans for AUVs by optimizing their time to visit the dock based on the imposed constraint. The effects of increasing the number of AUVs, increasing the number of points of interest in the mission area, and varying sea state on the mission duration are also analyzed.
|
|
10:30-12:00, Paper WeAT22-NT.4 | Add to My Program |
Decentralized Multi-Robot Navigation for Autonomous Surface Vehicles with Distributional Reinforcement Learning |
|
Lin, Xi | Stevens Institute of Technology |
Huang, Yewei | Stevens Institute of Technology |
Chen, Fanfei | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Marine Robotics, Path Planning for Multiple Mobile Robots or Agents, Reinforcement Learning
Abstract: Collision avoidance algorithms for Autonomous Surface Vehicles (ASVs) that follow the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs) have been proposed in recent years. However, it may be difficult and unsafe to follow COLREGs in congested waters, where multiple ASVs are navigating in the presence of static obstacles and strong currents, due to the complex interactions. To address this problem, we propose a decentralized multi-ASV collision avoidance policy based on Distributional Reinforcement Learning, which considers the interactions among ASVs as well as with static obstacles and current flows. We evaluate the performance of the proposed Distributional-RL-based policy against a traditional RL-based policy and two classical methods, Artificial Potential Fields (APF) and Reciprocal Velocity Obstacles (RVO), in simulation experiments, which show that the proposed policy achieves superior performance in navigation safety while requiring minimal travel time and energy. A variant of our framework that automatically adapts its risk sensitivity is also demonstrated to improve ASV safety in highly congested environments.
|
|
10:30-12:00, Paper WeAT22-NT.5 | Add to My Program |
Real-Time Planning under Uncertainty for AUVs Using Virtual Maps |
|
Collado-Gonzalez, Ivana | Stevens Institute of Technology |
McConnell, John | Stevens Institute of Technology |
Wang, Jinkun | Stevens Institute of Technology |
Szenher, Paul | Stevens Institute of Technology |
Englot, Brendan | Stevens Institute of Technology |
Keywords: Marine Robotics, Planning under Uncertainty, Reactive and Sensor-Based Planning
Abstract: Reliable localization is an essential capability for marine robots navigating in GPS-denied environments. SLAM, commonly used to mitigate dead reckoning errors, still fails in feature-sparse environments or with limited-range sensors. Pose estimation can be improved by incorporating the uncertainty prediction of future poses into the planning process and choosing actions that reduce uncertainty. However, performing belief propagation is computationally costly, especially when operating in large-scale environments. This work proposes a computationally efficient planning-under-uncertainty framework suitable for large-scale, feature-sparse environments. Our strategy leverages SLAM graph and occupancy map data obtained from a prior exploration phase to create a virtual map, describing the uncertainty of each map cell with a multivariate Gaussian. The virtual map is then used as a cost map in the planning phase, and performing belief propagation at each step is avoided. A receding horizon planning strategy is implemented, managing a tradeoff between goal-reaching and uncertainty reduction. Simulation experiments in a realistic underwater environment validate this approach. Experimental comparisons against a full belief propagation approach and a standard shortest-distance approach are conducted.
|
|
10:30-12:00, Paper WeAT22-NT.6 | Add to My Program |
Sea-U-Foil: A Hydrofoil Marine Vehicle with Multi-Modal Locomotion |
|
Zhao, Zuoquan | The Chinese University of Hong Kong |
Zhai, Yu | The Chinese University of Hong Kong |
Gao, Chuanxiang | The Chinese University of Hong Kong |
Ding, Wendi | The Chinese University of Hong Kong |
Yan, Ruixin | The Chinese University of Hong Kong |
Gao, Songqun | Chinese University of Hong Kong |
Han, Bingxin | The Chinese University of Hong Kong |
Liu, Xuchen | The Chinese University of Hong Kong |
Guo, Zixuan | The Chinese University of Hong Kong |
Chen, Ben M. | Chinese University of Hong Kong |
Keywords: Marine Robotics, Product Design, Development and Prototyping, Search and Rescue Robots
Abstract: Autonomous Marine Vehicles (AMVs) have been widely used in many critical tasks such as surveillance, patrolling, marine environment monitoring, and hydrographic surveying. However, most typical AMVs cannot meet the diverse demands of different marine tasks. In this article, we design a new type of remote-controlled hydrofoil marine vehicle, named Sea-U-Foil, which is suitable for different marine scenarios. Sea-U-Foil features three distinct locomotion modes - displacement mode, foilborne mode, and submarine mode - which give the platform flexible mobility, high-speed and high-load capacities, and superior concealment. Specifically, the submarine mode makes Sea-U-Foil unique among previous studies. In addition, the performance of Sea-U-Foil in foilborne mode surpasses that of most current unmanned surface vehicles (USVs) in terms of speed and payload. To the best of our knowledge, we are the first to introduce a new type of AMV that can work in displacement mode, foilborne mode, and submarine mode. We elaborate on the design principles and methodologies of Sea-U-Foil first, then validate the effectiveness of its tri-modal locomotion through extensive experiments.
|
|
10:30-12:00, Paper WeAT22-NT.7 | Add to My Program |
Learning Which Side to Scan: Multi-View Informed Active Perception with Side Scan Sonar for Autonomous Underwater Vehicles |
|
Venkatramanan Sethuraman, Advaith | University of Michigan |
Baldoni, Philip | United States Naval Research Laboratory |
Skinner, Katherine | University of Michigan |
McMahon, James | The Naval Research Laboratory |
Keywords: Marine Robotics, Reactive and Sensor-Based Planning, Object Detection, Segmentation and Categorization
Abstract: Autonomous underwater vehicles often perform surveys that capture multiple views of targets in order to provide more information for human operators or automatic target recognition algorithms. In this work, we address the problem of choosing the most informative views that minimize survey time while maximizing classifier accuracy. We introduce a novel active perception framework for multi-view adaptive surveying and reacquisition using side scan sonar imagery. Our framework addresses this challenge by using a graph formulation for the adaptive survey task. We then use Graph Neural Networks (GNNs) to classify acquired sonar views and reinforcement learning to choose the next best view to capture based on the collected data. We evaluate our method using simulated surveys in a high-fidelity side scan sonar simulator. Our results demonstrate that our approach is able to surpass the state of the art in classification accuracy and efficiency. This framework is a promising approach for more efficient autonomous missions involving side scan sonar, such as underwater exploration, marine archaeology, and environmental monitoring.
|
|
10:30-12:00, Paper WeAT22-NT.8 | Add to My Program |
Development of a Lightweight Underwater Manipulator for Delicate Structural Repair Operations |
|
Mao, Juzheng | Southeast University |
Song, Guangming | Southeast University |
Hao, Shuang | Southeast University |
Zhang, Mingquan | Southeast University |
Song, Aiguo | Southeast University |
Keywords: Marine Robotics, Robotics and Automation in Construction, Engineering for Robotic Systems
Abstract: In recent years, underwater robots have been increasingly used in the maintenance of hydraulic structures. Underwater manipulators are essential devices that are used to carry out such maintenance tasks. For delicate repair operations such as fixing tiny cracks, most existing underwater manipulators face limitations in terms of size, accuracy, and scalability. Therefore, in this letter, we present a novel electric underwater manipulator, named SEU-4. This four-degree-of-freedom manipulator weighs 8.91 kg and has a maximum payload of 9 kg. It has a rapid-switching interface that supports convenient mechanical and electrical connections for end-effectors. To compensate for the disturbances that are present in the complex underwater environment, a trajectory-tracking control strategy based on a disturbance observer and sliding-mode control (DOB-SMC) is proposed. A prototype of the proposed underwater manipulator was created, and a flowing-water experimental platform was constructed to test its trajectory-tracking performance in fast-flowing water. The experimental results show that the manipulator achieves a trajectory-tracking error of 1.03 mm in static water and 2.91 mm in flowing water at 1.2 m/s, which satisfies the requirements of delicate repair operations.
|
|
WeAT23-NT Oral Session, NT-G401 |
Add to My Program |
Aerial Systems: Mechanics and Control IV |
|
|
Chair: Ott, Christian | TU Wien |
Co-Chair: Perez-Arancibia, Nestor O | Washington State University (WSU) |
|
10:30-12:00, Paper WeAT23-NT.1 | Add to My Program |
MPS: A New Method for Selecting the Stable Closed-Loop Equilibrium Attitude-Error Quaternion of a UAV During Flight |
|
Gonçalves, Francisco | Washington State University |
Bena, Ryan | University of Southern California |
Matveev, Konstantin | Washington State University |
Perez-Arancibia, Nestor O | Washington State University (WSU) |
Keywords: Aerial Systems: Mechanics and Control, Motion Control, Space Robotics and Automation
Abstract: We present model predictive selection (MPS), a new method for selecting the stable closed-loop (CL) equilibrium attitude-error quaternion (AEQ) of an uncrewed aerial vehicle (UAV) during the execution of high-speed yaw maneuvers. In this approach, we minimize the cost of yawing measured with a performance figure of merit (PFM) that takes into account both the aerodynamic-torque control input and attitude-error state of the UAV. Specifically, this method uses a control law with a term whose sign is dynamically switched in real time to select, between two options, the torque associated with the lesser cost of rotation as predicted by a dynamical model of the UAV derived from first principles. This problem is relevant because the selection of the stable CL equilibrium AEQ significantly impacts the performance of a UAV during high-speed rotational flight, from both the power and control-error perspectives. To test and demonstrate the functionality and performance of the proposed method, we present data collected during one hundred real-time high-speed yaw-tracking flight experiments. These results highlight the superior capabilities of the proposed MPS-based scheme when compared to a benchmark controller commonly used in aerial robotics, as the PFM used to quantify the cost of flight is reduced by 60.30% on average. To the best of our knowledge, these are the first flight-test results that thoroughly demonstrate, evaluate, and compare the performance of a real-time controller capable of selecting the stable CL equilibrium AEQ during operation.
|
|
10:30-12:00, Paper WeAT23-NT.2 | Add to My Program |
Realtime Brain-Inspired Adaptive Learning Control for Nonlinear Systems with Configuration Uncertainties (I) |
|
Zhang, Yanhui | Zhejiang University |
Tong, Zheyu | Zhejiang University |
Zhang, YiFan | Zhejiang University |
Chen, Song | Zhejiang University |
Yang, Junyuan | Zhejiang University |
Chen, Weifang | Zhejiang University |
Keywords: Aerial Systems: Mechanics and Control, Reinforcement Learning, Imitation Learning
Abstract: This paper investigates the problem of adaptive tracking control for a quadcopter in the presence of nonlinear configuration uncertainties. It utilizes a real-time brain-inspired learning control (RBiLC) method to address the challenges posed by nonlinear, time-varying uncertain instructions. To address the issue of flight control law reconfiguration caused by unknown changes in the fuselage configuration (e.g., propellers or motors), this paper introduces an online learning-evaluation-optimization reconstruction mechanism based on RBiLC. The proposed adaptive learning controller mitigates the need for extensive human resources and reduces the time required for flight controller design. The Lyapunov-Krasovskii function is introduced as a compensatory measure to address the impact of parameter uncertainty on system stability. Furthermore, this paper proposes a signed sinusoidal function perturbation estimate to guide the direction and magnitude throughout the online learning process. The approach conducts a theoretical stability analysis on a quadcopter vehicle considering uncertainties in UAV dynamics modeling. The results demonstrate that the proposed scheme achieves superior control and faster adaptation, enabling the system to ultimately converge to a compact set within a limited time domain. Finally, software-in-the-loop (SITL) simulations and flight verification results are presented to validate the proposed control strategy.
|
|
10:30-12:00, Paper WeAT23-NT.3 | Add to My Program |
Safety-Conscious Pushing on Diverse Oriented Surfaces with Underactuated Aerial Vehicles |
|
Hui, Tong | Technical University of Denmark |
Fernández González, Manuel Jesús | Automation and Control, Technical University of Denmark |
Fumagalli, Matteo | Danish Technical University |
Keywords: Aerial Systems: Mechanics and Control, Robot Safety, Underactuated Robots
Abstract: Pushing tasks performed by aerial manipulators can be used for contact-based industrial inspections. Underactuated aerial vehicles are widely employed in aerial manipulation due to their widespread availability and relatively low cost. Industrial infrastructures often consist of diverse oriented work surfaces. When interacting with such surfaces, the coupled gravity compensation and interaction force generation of underactuated aerial vehicles can present the potential challenge of near-saturation operations. The blind utilization of these platforms for such tasks can lead to instability and accidents, creating unsafe operating conditions and potentially damaging the platform. In order to ensure safe pushing on these surfaces while managing platform saturation, this work establishes a safety assessment process. This process involves the prediction of the saturation level of each actuator during pushing across variable surface orientations. Furthermore, the assessment results are used to plan and execute physical experiments, ensuring safe operations and preventing platform damage.
|
|
10:30-12:00, Paper WeAT23-NT.4 | Add to My Program |
Geranos: A Novel Tilted-Rotors Aerial Robot for the Transportation of Poles |
|
Gorlo, Nicolas | ETH Zurich |
Müller, Mario Sven | ETH Zürich |
Bamert, Samuel | ETH Zürich |
Reinhart, Tim | ETH Zurich |
Stadler, Henriette | ETH Zürich |
Cathomen, Rafael | ETH Zurich |
Käppeli, Gabriel | ETH Zürich |
Shen, Hua | ETH Zürich |
Cuniato, Eugenio | ETH Zurich |
Tognon, Marco | Inria Rennes |
Siegwart, Roland | ETH Zurich |
Keywords: Aerial Systems: Applications, Robotics and Automation in Construction, Grippers and Other End-Effectors
Abstract: Building and maintaining structures like antennas and cable-car masts in challenging terrain often involves hazardous and expensive sling-loaded helicopter operations. In this work, we challenge this paradigm by proposing Geranos: a multicopter unmanned aerial vehicle (UAV) adept at precisely placing vertical poles, blending load transport with precision. Geranos minimizes the effects of the poles' large moment of inertia by adopting a ring design that accommodates the pole in its center. To grasp the load, we developed a two-part grasping mechanism that creates a near-rigid connection between the UAV and the load. This lightweight construction is reliable and robust while not relying on active forces to maintain the grasp. The UAV utilizes four main propellers to counteract gravity and four auxiliary ones for enhanced lateral positional accuracy, ensuring full actuation around hovering. In a demonstration mimicking the installation of antennas or cable-car masts, we show the capability of Geranos to assemble poles (mass of 3 kg and length of 2 m) on top of each other. In this scenario, Geranos demonstrates an impressive load-placement accuracy of less than 5 cm.
|
|
10:30-12:00, Paper WeAT23-NT.5 | Add to My Program |
Robust and Energy-Efficient Control for Multi-Task Aerial Manipulation with Automatic Arm-Switching |
|
Wu, Ying | Sun Yat-Sen University |
Zhou, Zida | Sun Yat-Sen University |
Wei, Mingxin | Sun Yat-Sen University |
Cheng, Hui | Sun Yat-Sen University |
Keywords: Aerial Systems: Applications, Robust/Adaptive Control, Model Learning for Control
Abstract: Aerial manipulation has received increasing research interest with the wide application of drones. To perform specific tasks, robotic arms with various mechanical structures are mounted on the drone. This results in sudden disturbances to the aerial manipulator when switching the robotic arm or interacting with the environment. Hence, it is challenging to design a generic and robust control strategy that adapts to various robotic arms while achieving multi-task aerial manipulation. In this paper, we present a learning-based control algorithm that allows online trajectory optimization and tracking to accomplish various aerial interaction tasks without manual adjustment. The proposed energy-saving trajectory planning approach integrates a coupled dynamics model with a single rigid body to generate energy-efficient trajectories for the aerial manipulator. Addressing the challenges of precise control when performing aerial manipulation tasks, this paper presents a controller based on deep neural networks that classifies and learns the accurate forces and moments caused by different robotic arms and interactions. Moreover, the forces arising from robotic arm motions are delicately used as part of the drone's power to save energy. Extensive real-world experiments demonstrate that the proposed method can adapt to various robotic arms and interactions when performing multi-task aerial manipulation.
|
|
10:30-12:00, Paper WeAT23-NT.6 | Add to My Program |
Optimal Collaborative Transportation for Under-Capacitated Vehicle Routing Problems Using Aerial Drone Swarms |
|
Kopparam Sreedhara, Akash | Vellore Institute of Technology, Vellore |
Padala, Deepesh | Vellore Institute of Technology, Vellore |
Mahesh, Shashank | Vellore Institute of Technology, Vellore |
Cui, Kai | Technische Universität Darmstadt |
Li, Mengguang | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Aerial Systems: Applications, Optimization and Optimal Control, Swarm Robotics
Abstract: Swarms of aerial drones have recently been considered for last-mile deliveries in urban logistics or automated construction. At the same time, collaborative transportation of payloads by multiple drones is another important area of recent research. However, efficient coordination algorithms for collaborative transportation of many payloads by many drones remain to be considered. In this work, we formulate the collaborative transportation of payloads by a swarm of drones as a novel, under-capacitated generalization of vehicle routing problems (VRP), which may be of independent interest. In contrast to standard VRP and capacitated VRP, we must additionally consider waiting times for payloads lifted cooperatively by multiple drones, and the corresponding coordination. Algorithmically, we provide a solution encoding that avoids deadlocks and formulate an appropriate alternating minimization scheme to solve the problem. On the hardware side, we integrate our algorithms with collision avoidance and drone controllers. The approach and the impact of the system integration are successfully verified empirically, both on a swarm of real nano-quadcopters and for large swarms in simulation. Overall, we provide a framework for collaborative transportation with aerial drone swarms that uses only as many drones as necessary for the transportation of any single payload.
|
|
10:30-12:00, Paper WeAT23-NT.7 | Add to My Program |
A Modular Aerial System Based on Homogeneous Quadrotors with Fault-Tolerant Control |
|
Li, Mengguang | Technische Universität Darmstadt |
Cui, Kai | Technische Universität Darmstadt |
Koeppl, Heinz | Technische Universität Darmstadt |
Keywords: Aerial Systems: Mechanics and Control, Swarm Robotics, Aerial Systems: Applications
Abstract: The standard quadrotor is one of the most popular and widely used aerial vehicles of recent decades, offering great maneuverability with mechanical simplicity. However, its under-actuation limits its applications, especially when it comes to generating a desired wrench with six degrees of freedom (DOF). Therefore, existing work often compromises between mechanical complexity and the controllable DOF of the aerial system. To take advantage of the mechanical simplicity of a standard quadrotor, we propose a modular aerial system, IdentiQuad, that combines only homogeneous quadrotor-based modules. Each IdentiQuad can be operated alone like a standard quadrotor, but at the same time allows task-specific assembly, increasing the controllable DOF of the system. Each module is interchangeable within its assembly. We also propose a general controller for different configurations of assemblies, capable of tolerating rotor failures and balancing the energy consumption of each module. The functionality and robustness of the system and its controller are validated using physics-based simulations for different assembly configurations.
|
|
10:30-12:00, Paper WeAT23-NT.8 | Add to My Program |
Observer-Based Controller Design for Oscillation Damping of a Novel Suspended Underactuated Aerial Platform |
|
Das, Hemjyoti | Technical University of Vienna |
Vu, Minh Nhat | TU Wien, Austria |
Egle, Tobias | TU Wien |
Ott, Christian | TU Wien |
Keywords: Aerial Systems: Mechanics and Control, Underactuated Robots
Abstract: In this work, we present a novel actuation strategy for a suspended aerial platform. By utilizing an underactuation approach, we demonstrate successful oscillation damping of the proposed platform, modeled as a spherical double pendulum. A state estimator that uses only onboard IMU measurements is designed to obtain the deflection angles of the platform. The state estimator is an extended Kalman filter (EKF) with intermittent measurements obtained at different frequencies. An optimal state feedback controller and a PD+ controller are designed to damp the oscillations of the platform in joint space and task space, respectively. The proposed underactuated platform is found to be more energy-efficient than an omnidirectional platform and requires fewer actuators. The effectiveness of our proposed system is validated using both simulations and experimental studies.
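To make the intermittent-measurement idea concrete, here is a minimal sketch (not the authors' implementation; the linear models, state layout, and noise values are illustrative placeholders) of an EKF cycle that runs the prediction at every step and applies the correction only when a measurement actually arrives:

```python
import numpy as np

def ekf_step(x, P, F, Q, H, R, z=None):
    """One filter cycle with an intermittent measurement.

    Predict every cycle; correct only when a measurement z arrives
    (z is None on cycles where the sensor produced no data).
    Linear models are used here for brevity.
    """
    # Prediction step
    x = F @ x
    P = F @ P @ F.T + Q
    if z is not None:
        # Correction step (standard Kalman update)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```

In a setup like the paper's, low-rate IMU-derived deflection-angle observations would enter as `z` at their own frequency, while the prediction keeps running at the controller rate.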
|
|
10:30-12:00, Paper WeAT23-NT.9 | Add to My Program |
MOAR Planner: Multi-Objective and Adaptive Risk-Aware Path Planning for Infrastructure Inspection with a UAV |
|
Petit, Louis | McGill University |
Lussier Desbiens, Alexis | Université De Sherbrooke |
Keywords: Aerial Systems: Perception and Autonomy, Aerial Systems: Applications, Task and Motion Planning
Abstract: The problem of autonomous navigation for UAV inspection remains challenging, as it requires navigating effectively in close proximity to obstacles while accounting for dynamic risk factors such as weather conditions, communication reliability, and battery autonomy. This paper introduces the MOAR path planner, which addresses the complexities of evolving risks during missions. It offers real-time trajectory adaptation while concurrently optimizing safety, time, and energy. The planner employs a risk-aware cost function that integrates pre-computed cost maps, the new concepts of damage and insertion costs, and an adaptive speed planning framework. The optimal path is then searched for in a graph using a discrete representation of the state and action spaces. The method is evaluated through simulations and real-world flight tests. The results show the capability to generate real-time trajectories spanning a broad range of evaluation metrics, around 90% of the range occupied by popular algorithms. The proposed framework contributes by enabling UAVs to navigate more autonomously and reliably in critical missions.
|
|
WeAT24-NT Oral Session, NT-G402 |
Add to My Program |
Field Robotics and Automation |
|
|
Chair: Liarokapis, Minas | The University of Auckland |
Co-Chair: Chowdhary, Girish | University of Illinois at Urbana Champaign |
|
10:30-12:00, Paper WeAT24-NT.1 | Add to My Program |
SOL: A Compact, Portable, Telescopic, Soft-Robotic Sun-Tracking Mechanism for Improved Solar Power Production |
|
Busby, Bryan | The University of Auckland |
Duan, Shifei | University of Auckland |
Thompson, Marcus | Whanauka Limited |
Liarokapis, Minas | The University of Auckland |
Keywords: Energy and Environment-Aware Automation
Abstract: Solar power is becoming an increasingly popular option for energy production in commercial and private applications. While installing solar panels (photovoltaic cells) in a stationary configuration is simple and inexpensive, such a setup fails to maximise their potential solar energy production. Single- and dual-axis sun trackers automatically adjust the tilt angle of photovoltaic cells so that they directly face the sun, but these also come with their own drawbacks, such as increased cost and weight. This paper presents SOL, a soft-robotic, dual-axis, sun-tracking mechanism for improved solar panel efficiency. The proposed design was built to be compact, portable, and lightweight, and it utilises closed-loop control for the intelligent actuation of a set of soft telescopic structures that raise and tilt the solar panels in the direction of the sun. The performance of the proposed solar tracking platform was experimentally validated in terms of its maximum elevation at different azimuths and its ability to balance different loads. The result is a device that provides solar panel users with an accessible, affordable, and convenient means of increasing the efficiency of their solar energy system.
|
|
10:30-12:00, Paper WeAT24-NT.2 | Add to My Program |
Measuring Ball Joint Faults in Parabolic-Trough Solar Plants with Data Augmentation and Deep Learning |
|
Pérez Cutiño, Miguel Angel | Universidad De Sevilla |
Capitan, Jesus | University of Seville |
Díaz-Bañez, José-Miguel | Universidad Sevilla |
Valverde, Juan | University of Seville |
Keywords: Energy and Environment-Aware Automation, Deep Learning for Visual Perception, Aerial Systems: Applications
Abstract: Automatic inspection of parabolic-trough solar plants is key to preventing failures that can harm the environment and the production of green energy. In this work, we propose a novel methodology to inspect ball joints in parabolic trough collectors, which is a relevant problem that is not adequately covered in the literature. Images collected by an Unmanned Aerial Vehicle are segmented using deep learning to extract ball joint components. In order to generate rich training datasets, we develop a novel data augmentation technique by rotating joints and adding synthetic image background, and demonstrate its impact on the object detection accuracy. Then two types of faults are analyzed: fluid leaks, by means of image color filtering; and geometric shape anomalies, by measuring joint angles of the robotic arms. We propose metrics to quantify these faults and evaluate the damage of the inspected components. Our experimental results with images from operating commercial plants show that we can automatically detect leaks and anomalous angular geometry with a low failure rate compared to human labeling.
|
|
10:30-12:00, Paper WeAT24-NT.3 | Add to My Program |
ECDP: Energy Consumption Disaggregation Pipeline for Energy Optimization in Lightweight Robots |
|
Heredia, Juan | University of Southern Denmark |
Kirschner, Robin Jeanne | TU Munich, Institute for Robotics and Systems Intelligence |
Schlette, Christian | University of Southern Denmark (SDU) |
Abdolshah, Saeed | KUKA Deutschland GmbH |
Haddadin, Sami | Technical University of Munich |
Kjærgaard, Mikkel | University of Southern Denmark |
Keywords: Energy and Environment-Aware Automation, Engineering for Robotic Systems
Abstract: Limited resources and the resulting energy crises occurring all over the world highlight the importance of energy efficiency in technological developments such as robotic manipulators. Efficient energy consumption of manipulators is necessary to make them affordable and to spread their application in future industry. Previously, the power consumption of robot motion was the main factor considered in the evaluation of energy efficiency. Lately, the paradigm in industrial robotics has shifted towards lightweight robot manipulators, which requires a new investigation of the disaggregation of robot energy consumption. In this paper, we propose a novel pipeline to identify and disaggregate the energy use of mechatronic devices and apply it to lightweight industrial robots. The proposed method allows the identification of the electronic components' consumption, mechanical losses, electrical losses, and the mechanical energy required for robot motion. We evaluate the pipeline and study the distribution of energy consumption using four different manipulators, namely Universal Robots' UR5e and UR10e, Franka Emika's FR3, and Kinova's Gen3. The experimental results show that most of the energy (60-90%) is consumed by the electronic components of the robot control box. Given this, approaches to further optimize energy consumption need to shift towards efficient robot electronics design instead of efficient robot mass distribution or motion control.
|
|
10:30-12:00, Paper WeAT24-NT.4 | Add to My Program |
Autonomous UAV Mission Cycling: A Mobile Hub Approach for Precise Landings and Continuous Operations in Challenging Environments |
|
Moortgat-Pick, Alexander | Technical University of Munich (TUM) |
Schwahn, Marie | Technical University of Munich |
Adamczyk, Anna | Technical University of Munich (TUM) |
Duecker, Daniel Andre | Technical University of Munich (TUM) |
Haddadin, Sami | Technical University of Munich |
Keywords: Environment Monitoring and Management, Aerial Systems: Applications, Field Robots
Abstract: Environmental monitoring via UAVs offers unprecedented aerial observation capabilities. However, the limited flight durations of typical multirotors and the demands on human attention in outdoor missions call for more autonomous solutions. Addressing the specific challenges of precise UAV landings—especially amidst wind disturbances, obstacles, and unreliable global localization—we introduce a mobile hub concept. This hub facilitates continuous mission cycling for unmodified off-the-shelf UAVs. Our approach centers on a small landing platform affixed to a robotic arm, adeptly correcting UAV pose errors in windy conditions. Compact enough for installation in an economy car, the system emphasizes two novel strategies. Firstly, external visual tracking of the UAV informs the landing controls for both the drone and the robotic arm. The arm compensates for UAV positioning errors and aligns the platform's attitude with the UAV for stable landings, even on small platforms under windy conditions. Secondly, the robotic arm can transport the UAV inside the hub, perform maintenance tasks like battery replacements, and then facilitate direct relaunches. Importantly, our design places all operational responsibility on the hub, ensuring the UAV remains unaltered. This ensures broad compatibility with standard UAVs, only necessitating an API for attitude setpoints. Experimental results underscore the efficiency of our model, achieving safe landings with minimal errors (≤ 7 cm) in winds up to 5 Beaufort (8.1 m/s). In essence, our mobile hub concept significantly boosts UAV mission availability, allowing for autonomous operations even under challenging conditions.
|
|
10:30-12:00, Paper WeAT24-NT.5 | Add to My Program |
Low-To-High Resolution Path Planner for Robotic Gas Distribution Mapping |
|
Nanavati, Rohit Vishwajit | Loughborough University |
Rhodes, Callum | Imperial College London |
Coombes, Matthew | Loughborough University |
Liu, Cunjia | Loughborough University |
Keywords: Environment Monitoring and Management, Robotics in Hazardous Fields, Reactive and Sensor-Based Planning
Abstract: Robotic gas distribution mapping improves the understanding of a hazardous gas dispersion while keeping the human operator out of danger. Generating an accurate gas distribution map quickly is of utmost importance in situations such as gas leaks and industrial incidents, so that resources can be used efficiently in the incident response. In this paper, to incorporate the operational requirement on map granularity, we propose a low-to-high resolution path planner that first guides a single robot to quickly and sparsely sample the region of interest to generate a low-resolution gas distribution map, followed by high-resolution sampling informed by the low-resolution map as a prior. The low-resolution prior acts as a coverage survey, allowing the algorithm to perform a relatively exploitative search of high-concentration regions, resulting in overall shorter mission times. The proposed framework is designed to iteratively identify the next best T locations to sample, prioritising the potentially high-reward locations while ensuring that the robot can travel to and sample the chosen locations within a user-specified map update cycle. We present a simulation study to demonstrate the alternating exploration-exploitation behaviour, along with benchmarking its performance against traditional sampling path planners and various reward functions.
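The "next best T locations within a map update cycle" step can be sketched as a greedy budgeted selection; the reward values, the travel-time model, and the budget below are hypothetical placeholders, not the paper's actual reward functions:

```python
import math

def select_next_locations(start, candidates, reward, T, budget, speed=1.0):
    """Greedily pick up to T sampling locations while the accumulated
    travel time stays within the update-cycle budget.

    candidates: list of (x, y) tuples; reward: dict mapping location -> value.
    Returns the ordered route (a placeholder for the paper's planner).
    """
    route, pos, t_used = [], start, 0.0
    remaining = set(candidates)
    while remaining and len(route) < T:
        # Rank by reward per unit travel time from the current position.
        best = max(remaining,
                   key=lambda c: reward[c] / (1e-6 + math.dist(pos, c) / speed))
        dt = math.dist(pos, best) / speed
        if t_used + dt > budget:
            remaining.remove(best)  # unreachable this cycle; try others
            continue
        route.append(best)
        t_used += dt
        pos = best
        remaining.remove(best)
    return route
```

A real planner would replace the reward values with the map-informed expected information gain described in the abstract; the budget corresponds to the user-specified map update cycle.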
|
|
10:30-12:00, Paper WeAT24-NT.6 | Add to My Program |
Persistent Monitoring of Large Environments with Robot Deployment Scheduling in between Remote Sensing Cycles |
|
Masaba, Kizito | Dartmouth College |
Roznere, Monika | Dartmouth College |
Jeong, Mingi | Dartmouth College |
Quattrini Li, Alberto | Dartmouth College |
Keywords: Environment Monitoring and Management, Robotics in Under-Resourced Settings, Planning under Uncertainty
Abstract: This paper proposes a novel decision-making framework for planning “when” and “where” to deploy robots based on prior data, with the goal of persistently monitoring a spatio-temporal phenomenon in an environment. We specifically focus on large lake monitoring, where remote sensors, such as satellites, can provide a snapshot of the target phenomenon at regular cycles. Between these cycles, Autonomous Surface Vehicles (ASVs) can be deployed to maintain an up-to-date model of the phenomenon. However, deploying ASVs carries a significant logistical overhead in terms of time and cost: it requires a team of people to go on site and typically spend a day monitoring the deployment. It is therefore vital not only to be intentional about where to sample in the environment on a given day, but also to determine whether deploying the ASVs that day is worthwhile at all. We propose a persistent monitoring strategy that provides the days and locations of when and where to sample with the robots by leveraging Gaussian Process model estimates of future trends based on collected remote sensing and point measurement data. Our approach minimizes the number of sampling days and locations while preserving the quality of the estimates. Through simulation experiments using realistic spatio-temporal datasets, we demonstrate the benefits of our approach over traditional deployment strategies, including significant savings in the effort and operational cost of deploying the ASVs.
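A minimal sketch of the underlying idea is to deploy only on days where the Gaussian Process predictive uncertainty is high; the squared-exponential kernel, its hyperparameters, and the threshold below are illustrative assumptions, not the paper's model:

```python
import numpy as np

def rbf(a, b, ls=2.0, var=1.0):
    """Squared-exponential kernel over scalar inputs (e.g. a day index)."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2)

def gp_predict_var(x_train, x_query, noise=1e-4):
    """Posterior predictive variance of a zero-mean GP at x_query."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    Kss = rbf(x_query, x_query)
    v = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return np.diag(v)

def days_to_deploy(sampled_days, horizon_days, threshold=0.5):
    """Deploy the ASVs only on days where the predictive variance
    exceeds the threshold (a placeholder decision rule)."""
    var = gp_predict_var(np.array(sampled_days, float),
                         np.array(horizon_days, float))
    return [d for d, v in zip(horizon_days, var) if v > threshold]
```

Days close to past samples inherit low variance and are skipped, while days far from any observation trigger a deployment, which is the qualitative behaviour the abstract describes.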
|
|
10:30-12:00, Paper WeAT24-NT.7 | Add to My Program |
System Calibration of a Field Phenotyping Robot with Multiple High-Precision Profile Laser Scanners |
|
Esser, Felix | University of Bonn |
Tombrink, Gereon | University of Bonn |
Cornelißen, André | University of Bonn |
Klingbeil, Lasse | University of Bonn |
Kuhlmann, Heiner | University of Bonn |
Keywords: Field Robots, Robotics and Automation in Agriculture and Forestry, Agricultural Automation
Abstract: The creation of precise and high-resolution crop point clouds in agricultural fields has become a key challenge for high-throughput phenotyping applications. This work implements a novel method to calibrate the laser scanning system of an agricultural field robot, consisting of two industrial-grade laser scanners used for high-precision 3D crop point cloud creation. The calibration method optimizes the transformation between the scanner origins and the robot pose by minimizing 3D point omnivariances within the point cloud. Moreover, we present a novel factor-graph-based pose estimation method that fuses total station prism measurements with IMU and GNSS heading information for high-precision pose determination during calibration. The root-mean-square error of the distances to a georeferenced ground-truth point cloud is 0.8 cm after parameter optimization. Furthermore, our results show the importance of a reference point cloud, which the calibration method needs in order to estimate the vertical translation of the calibration. Challenges arise from non-static parameters while the robot moves, indicated by systematic deviations from a ground-truth terrestrial laser scan.
|
|
10:30-12:00, Paper WeAT24-NT.8 | Add to My Program |
Atmospheric Aerosol Diagnostics with UAV-Based Holographic Imaging and Computer Vision |
|
Bristow, Nathaniel | University of Minnesota |
Pardoe, Nikolas | University of Minnesota |
Hong, Jiarong | ME, UMN |
Keywords: Field Robots, Vision-Based Navigation, Aerial Systems: Applications
Abstract: Emissions of particulate matter into the atmosphere are essential to characterize, in terms of properties such as particle size, morphology, and composition, to better understand impacts on public health and the climate. However, there is no currently available technology capable of measuring individual particles with such high detail over the extensive domains associated with events such as wildfires or volcanic eruptions. To solve this problem, we present an autonomous measurement system involving an unmanned aerial vehicle (UAV) coupled with a digital inline holographic microscope for in situ particle diagnostics. The flight control uses computer vision to localize and then trace the movements of particle-laden flows while sampling particles to determine their properties as they are transported away from their source. We demonstrate this system applied to measuring particulate matter in smoke plumes and discuss broader implications for this type of system in similar applications.
|
|
10:30-12:00, Paper WeAT24-NT.9 | Add to My Program |
WayFASTER: A Self-Supervised Traversability Prediction for Increased Navigation Awareness |
|
Valverde Gasparino, Mateus | University of Illinois at Urbana-Champaign |
Sivakumar, Arun Narenthiran | University of Illinois at Urbana Champaign |
Chowdhary, Girish | University of Illinois at Urbana Champaign |
Keywords: Field Robots, Vision-Based Navigation, Robotics and Automation in Agriculture and Forestry
Abstract: Accurate and robust navigation in unstructured environments requires fusing data from multiple sensors. Such fusion ensures that the robot is better aware of its surroundings, including areas of the environment that are not immediately visible but were visible at a different time. To solve this problem, we propose a method for traversability prediction in challenging outdoor environments using a sequence of RGB and depth images fused with pose estimations. Our method, termed WayFASTER (Waypoints-Free Autonomous System for Traversability with Enhanced Robustness), uses experience data recorded from a receding horizon estimator to train a self-supervised neural network for traversability prediction, eliminating the need for heuristics. Our experiments demonstrate that our method excels at avoiding obstacles and correctly detects that terrains such as tall grass are navigable. By using a sequence of images, WayFASTER significantly enhances the robot’s awareness of its surroundings, enabling it to predict the traversability of terrains that are not immediately visible. This enhanced awareness contributes to better navigation performance in environments where such predictive capabilities are essential.
|
|
WeAT25-NT Oral Session, NT-G403 |
Add to My Program |
Localization IV |
|
|
Chair: Chen, Changhao | National University of Defense Technology |
Co-Chair: Tan, U-Xuan | Singapore University of Technology and Design |
|
10:30-12:00, Paper WeAT25-NT.1 | Add to My Program |
A Coarse-To-Fine Place Recognition Approach Using Attention-Guided Descriptors and Overlap Estimation |
|
Fu, Chencan | Zhejiang University |
Li, Lin | Zhejiang University |
Mei, Jianbiao | Zhejiang University |
Ma, Yukai | Zhejiang University |
Peng, Linpeng | Zhejiang University |
Zhao, Xiangrui | Zhejiang University |
Liu, Yong | Zhejiang University |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Place recognition is a challenging but crucial task in robotics. Current description-based methods may be limited by their representation capabilities, while pairwise similarity-based methods require exhaustive searches, which are time-consuming. In this paper, we present a novel coarse-to-fine approach to address these problems, which combines BEV (Bird's Eye View) feature extraction, coarse-grained matching, and fine-grained verification. In the coarse stage, our approach utilizes an attention-guided network to generate attention-guided descriptors. We then employ a fast affinity-based candidate selection process to identify the top-K most similar candidates. In the fine stage, we estimate pairwise overlap among the narrowed-down place candidates to determine the final match. Experimental results on the KITTI and KITTI-360 datasets demonstrate that our approach outperforms state-of-the-art methods. The code will be released publicly soon.
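The coarse stage's affinity-based candidate selection can be sketched as follows (the descriptors here are generic vectors and the cosine affinity is an assumption for illustration, not the paper's attention-guided features):

```python
import numpy as np

def top_k_candidates(query, database, k=5):
    """Coarse stage: rank database descriptors by cosine similarity
    to the query descriptor and return the indices of the top-k
    candidates, which a fine stage would then verify by overlap."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = db @ q                      # cosine similarity to every place
    return np.argsort(-sims)[:k], sims
```

Only the k returned candidates would then enter the (more expensive) pairwise overlap estimation, which is what makes the coarse-to-fine split faster than an exhaustive pairwise search.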
|
|
10:30-12:00, Paper WeAT25-NT.2 | Add to My Program |
LHMap-Loc: Cross-Modal Monocular Localization Using LiDAR Point Cloud Heat Map |
|
Wu, Xinrui | Shanghai Jiao Tong University |
Xu, Jianbo | SJTU |
Hu, Puyuan | Shanghai Jiao Tong University |
Wang, Guangming | University of Cambridge |
Wang, Hesheng | Shanghai Jiao Tong University |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Localization using a monocular camera in a pre-built LiDAR point cloud map has drawn increasing attention in the fields of autonomous driving and mobile robotics. However, there are still many challenges (e.g. the difficulty of map storage, poor localization robustness in large scenes) in accurately and efficiently implementing cross-modal localization. To solve these problems, a novel pipeline termed LHMap-Loc is proposed, which achieves accurate and efficient monocular localization in LiDAR maps. Firstly, feature encoding is carried out on the original LiDAR point cloud map by generating offline heat point clouds, which compresses the size of the original LiDAR map. Then, an end-to-end online pose regression network is designed based on optical flow estimation and spatial attention to achieve real-time monocular visual localization in a pre-built map. In addition, a series of experiments have been conducted to prove the effectiveness of the proposed method. Our code is available at: https://github.com/IRMVLab/LHMap-loc.
|
|
10:30-12:00, Paper WeAT25-NT.3 | Add to My Program |
LocNDF: Neural Distance Field Mapping for Robot Localization |
|
Wiesmann, Louis | University of Bonn |
Guadagnino, Tiziano | University of Bonn |
Vizzo, Ignacio | Dexory |
Zimmerman, Nicky | University of Lugano |
Pan, Yue | University of Bonn |
Kuang, Haofei | University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Localization, Mapping, Deep Learning Methods
Abstract: Mapping an environment is essential for several robotic tasks, particularly for localization. In this paper, we address the problem of mapping the environment using LiDAR point clouds with the goal to obtain a map representation that is well suited for robot localization. To this end, we utilize a neural network to learn a discretization-free distance field of a given scene for localization. In contrast to prior approaches, we directly work on the sensor data and do not assume a perfect model of the environment or rely on normals. Inspired by the recently proposed NeRF representations, we supervise the network by points sampled along the measured beams, and our loss is designed to learn a valid distance field. Additionally, we show how to perform scan registration and global localization directly within the neural distance field. We illustrate the capabilities to globally localize within an indoor environment utilizing a particle filter as well as to perform scan registration by tracking the pose of a car based on matching LiDAR scans to the neural distance field.
|
|
10:30-12:00, Paper WeAT25-NT.4 | Add to My Program |
Looking beneath More: A Sequence-Based Localizing Ground Penetrating Radar Framework |
|
Zhang, Pengyu | National University of Defense Technology |
Zhi, Shuaifeng | National University of Defense Technology |
Yuan, Yuelin | Hikauto |
Bi, Beizhen | National University of Defense Technology |
Xin, Qin | National University of Defense Technology |
Huang, Xiaotao | National University of Defense Technology |
Shen, Liang | National University of Defense Technology |
Keywords: Localization, Mapping, Transfer Learning
Abstract: Localizing ground penetrating radar (LGPR) has been proven to be a promising technology for robot localization in various dynamic environments. However, the extreme scarcity of underground features introduces false candidate matches and brings unique challenges to this task. In this paper, we propose a sequence-based framework for LGPR to address the aforementioned issues. Specifically, we first introduce a trainable strategy to extract robust underground features in multi-weather conditions. By further using sequential information, our LGPR system can observe richer underground scene contexts, and the associated multi-frame scans could also improve the performance of underground place recognition. We demonstrate the superiority of our proposed method by comparing it against several recent state-of-the-art baseline methods applied to GPR image tasks. Experimental results on large public and self-collected datasets show that our proposed framework significantly improves the performance of various baselines in different scenarios.
|
|
10:30-12:00, Paper WeAT25-NT.5 | Add to My Program |
Increasing SLAM Pose Accuracy by Ground-To-Satellite Image Registration |
|
Zhang, Yanhao | University of Technology Sydney |
Shi, Yujiao | The Australian National University |
Wang, Shan | The Australian National University |
Vora, Ankit | Ford Motor Company |
Perincherry, Akhil | Ford Motor Company |
Chen, Yongbo | Australian National University |
Li, Hongdong | Australian National University and NICTA |
Keywords: Localization, SLAM, Deep Learning for Visual Perception
Abstract: Vision-based localization for autonomous driving has been of great interest among researchers. When a pre-built 3D map is not available, the techniques of visual simultaneous localization and mapping (SLAM) are typically adopted. Due to error accumulation, visual SLAM (vSLAM) usually suffers from long-term drift. This paper proposes a framework to increase the localization accuracy by fusing the vSLAM with a deep-learning based ground-to-satellite (G2S) image registration method. In this framework, a coarse (spatial correlation bound check) to fine (visual odometry consistency check) method is designed to select the valid G2S prediction. The selected prediction is then fused with the SLAM measurement by solving a scaled pose graph problem. To further increase the localization accuracy, we provide an iterative trajectory fusion pipeline. The proposed framework is evaluated on two well-known autonomous driving datasets, and the results demonstrate the accuracy and robustness in terms of vehicle localization.
|
|
10:30-12:00, Paper WeAT25-NT.6 | Add to My Program |
EffLoc: Lightweight Vision Transformer for Efficient 6-DOF Camera Relocalization |
|
Xiao, Zhendong | South China University of Technology |
Chen, Changhao | National University of Defense Technology |
Shan, Yang | South China University of Technology |
Wei, Wu | South China University of Technology |
Keywords: Localization, SLAM, Deep Learning Methods
Abstract: Camera relocalization is pivotal in computer vision, with applications in AR, drones, robotics, and autonomous driving. It estimates the 3D camera position and orientation (6-DoF) from images. Unlike traditional methods such as SLAM, recent approaches use deep learning for direct end-to-end pose estimation. We propose EffLoc, a novel efficient Vision Transformer for single-image camera relocalization. EffLoc's hierarchical layout, memory-bound self-attention, and feed-forward layers boost memory efficiency and inter-channel communication. Our sequential group attention (SGA) module enhances computational efficiency by diversifying input features, reducing redundancy, and expanding model capacity. EffLoc excels in efficiency and accuracy, outperforming prior methods such as AtLoc and MapNet. It thrives on large-scale outdoor car-driving scenarios, ensuring simplicity, end-to-end trainability, and the elimination of handcrafted loss functions.
|
|
10:30-12:00, Paper WeAT25-NT.7 | Add to My Program |
SAGE-ICP: Semantic Information-Assisted ICP |
|
Cui, Jiaming | Zhejiang University |
Chen, Jiming | Zhejiang University |
Li, Liang | Zhejiang Univerisity |
Keywords: Localization, SLAM, Semantic Scene Understanding
Abstract: Robust and accurate pose estimation in unknown environments is an essential part of robotic applications. We focus on LiDAR-based point-to-point ICP combined with effective semantic information. This paper proposes a novel semantic-information-assisted ICP method named SAGE-ICP, which leverages semantics in odometry. The semantic information for the whole scan is extracted in a timely and efficient manner by a 3D convolution network, and these point-wise labels are deeply involved in every part of the registration, including semantic voxel downsampling, data association, adaptive local map, and dynamic vehicle removal. Unlike previous semantic-aided approaches, the proposed method can improve localization accuracy in large-scale scenes even if the semantic information contains certain errors. Experimental evaluations on KITTI and KITTI-360 show that our method outperforms the baseline methods and improves accuracy while maintaining real-time performance, i.e., it runs faster than the sensor frame rate.
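Semantic voxel downsampling, one of the registration stages the abstract lists, can be sketched as keeping one point per voxel per semantic label, so sparsely represented classes survive the reduction. A minimal illustration; the function name and `voxel` size are assumptions, not details from the paper:

```python
def semantic_voxel_downsample(points, labels, voxel=0.5):
    """Keep at most one point per (voxel cell, semantic label) pair.

    points: iterable of (x, y, z) tuples; labels: per-point class ids.
    Returns the reduced point list, preserving label diversity inside
    each voxel cell instead of collapsing it to a single centroid.
    """
    kept = {}
    for p, lbl in zip(points, labels):
        key = (int(p[0] // voxel), int(p[1] // voxel),
               int(p[2] // voxel), lbl)
        kept.setdefault(key, p)  # first point wins inside each cell
    return list(kept.values())
```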
|
|
10:30-12:00, Paper WeAT25-NT.8 | Add to My Program |
HR-APR: APR-Agnostic Framework with Uncertainty Estimation and Hierarchical Refinement for Camera Relocalisation |
|
Liu, Changkun | The Hong Kong University of Science and Technology |
Chen, Shuai | University of Oxford |
Zhao, Yukun | Hong Kong University of Science and Technology |
Huang, Huajian | The Hong Kong University of Science and Technology |
Prisacariu, Victor | University of Oxford |
Braud, Tristan | HKUST |
Keywords: Localization, Visual Learning, Probabilistic Inference
Abstract: Absolute Pose Regressors (APRs) directly estimate camera poses from monocular images, but their accuracy is unstable for different queries. Uncertainty-aware APRs provide uncertainty information on the estimated pose, alleviating the impact of these unreliable predictions. However, existing uncertainty modelling techniques are often coupled with a specific APR architecture, resulting in suboptimal performance compared to state-of-the-art (SOTA) APR methods. This work introduces a novel APR-agnostic framework, HR-APR, that formulates uncertainty estimation as cosine similarity estimation between the query and database features. It does not rely on or affect the APR network architecture, making it flexible and computationally efficient. In addition, we take advantage of the uncertainty for pose refinement to enhance the performance of APR. Extensive experiments demonstrate the effectiveness of our framework, reducing computational overhead by 27.4% and 15.2% on the 7Scenes and Cambridge Landmarks datasets, respectively, while maintaining SOTA accuracy among single-image APRs.
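As a rough sketch of the core idea, uncertainty can be derived from the cosine similarity between the query feature and its most similar database features: high similarity suggests a reliable pose. This toy version is an assumption-laden illustration (the function names, `k`, and the 1 − mean-similarity mapping are not from the paper):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def pose_uncertainty(query_feat, db_feats, k=3):
    """Uncertainty proxy: 1 minus the mean cosine similarity between
    the query feature and its k most similar database features.
    Higher similarity -> lower uncertainty -> the pose is trusted more
    (and cheap refinement can be skipped)."""
    sims = sorted((cosine_similarity(query_feat, f) for f in db_feats),
                  reverse=True)
    top = sims[:k]
    return 1.0 - sum(top) / len(top)
```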
|
|
10:30-12:00, Paper WeAT25-NT.9 | Add to My Program |
Implicit Learning of Scene Geometry from Poses for Global Localization |
|
Altillawi, Mohammad | Huawei, Autonomous University of Barcelona, |
Li, Shile | Algolux Germany |
Prakhya, Sai Manoj | Huawei Technologies Deutschland GmbH |
Liu, Ziyuan | Huawei Group |
Serrat, Joan | Computer Vision Center and Computer Science Department, Universi |
Keywords: Localization, Visual Learning, Virtual Reality and Interfaces
Abstract: Global visual localization estimates the absolute pose of a camera using a single image, in a previously mapped area. Obtaining the pose from a single image enables many robotics and augmented/virtual reality applications. Inspired by the latest advances in deep learning, many existing approaches directly learn and regress 6 DoF pose from an input image. However, these methods do not fully utilize the underlying scene geometry for pose regression. The challenge in monocular relocalization is the minimal availability of supervised training data, which is just the corresponding 6 DoF poses of the images. In this paper, we propose to utilize these minimal available labels (i.e., poses) to learn the underlying 3D geometry of the scene and use the geometry, in return, to estimate a 6 DoF pose in a geometric manner. We present a learning method that uses these pose labels and rigid alignment to learn two 3D geometric representations (X, Y, Z coordinates) of the scene, one in the camera coordinate frame and the other in the global coordinate frame. Given a single image, it estimates these two 3D scene representations, which are then aligned to estimate a pose that matches the pose label. This formulation allows for the active inclusion of additional learning constraints to minimize 3D alignment errors between the two 3D scene representations and 2D re-projection errors between the 3D global scene representation and 2D image pixels, which improves localization accuracy. At inference time, our mo
|
|
WeAT26-NT Oral Session, NT-G404 |
Add to My Program |
SLAM I |
|
|
Chair: Wang, Sen | Imperial College London |
Co-Chair: Erden, Mustafa Suphi | Heriot-Watt University |
|
10:30-12:00, Paper WeAT26-NT.1 | Add to My Program |
KDD-LOAM: Jointly Learned Keypoint Detector and Descriptors Assisted LiDAR Odometry and Mapping |
|
Huang, Renlang | Zhejiang University |
Zhao, Minglei | Zhejiang University |
Chen, Jiming | Zhejiang University |
Li, Liang | Zhejiang University |
Keywords: SLAM, Deep Learning Methods, Localization
Abstract: Sparse keypoint matching based on distinct 3D feature representations can improve the efficiency and robustness of point cloud registration. Existing learning-based 3D descriptors and keypoint detectors are either independent or loosely coupled, so they cannot fully adapt to each other. In this work, we propose a tightly coupled keypoint detector and descriptor (TCKDD) based on a multi-task fully convolutional network with a probabilistic detection loss. In particular, this self-supervised detection loss fully adapts the keypoint detector to any jointly learned descriptors and benefits the self-supervised learning of descriptors. Extensive experiments on both indoor and outdoor datasets show that our TCKDD achieves state-of-the-art performance in point cloud registration. Furthermore, we design a keypoint detector and descriptors-assisted LiDAR odometry and mapping framework (KDD-LOAM), whose real-time odometry relies on keypoint descriptor matching-based RANSAC. The sparse keypoints are further used for efficient scan-to-map registration and mapping. Experiments on the KITTI dataset demonstrate that KDD-LOAM significantly surpasses LOAM and shows competitive performance in odometry.
|
|
10:30-12:00, Paper WeAT26-NT.2 | Add to My Program |
Campus Map: A Large-Scale Dataset to Support Multi-View VO, SLAM and BEV Estimation |
|
Ross, James | University of Surrey |
Kaygusuz, Nimet | University of Surrey |
Mendez, Oscar | University of Surrey |
Bowden, Richard | University of Surrey |
Keywords: SLAM
Abstract: Significant advances in robotics and machine learning have resulted in many datasets designed to support research into autonomous vehicle technology. However, these datasets are rarely suitable for a wide variety of navigation tasks. For example, datasets that include multiple cameras often have short trajectories without loops that are unsuitable for the evaluation of longer-range SLAM or odometry systems, and datasets with a single camera often lack other sensors, making them unsuitable for sensor fusion approaches. Furthermore, alternative environmental representations such as semantic Bird's Eye View (BEV) maps are growing in popularity, but datasets often lack accurate ground truth and are not flexible enough to adapt to new research trends. To address this gap, we introduce Campus Map, a novel large-scale multi-camera dataset with 2M images from 6 mounted cameras that includes GPS data and 64-beam, 125k point LiDAR scans totalling 8M points (raw packets also provided). The dataset consists of 16 sequences in a large car park and 6 long-term trajectories around a university campus that provide data to support research into a variety of autonomous driving and parking tasks. Long trajectories (average 10 min) and many loops make the dataset ideal for the evaluation of SLAM, odometry and loop closure algorithms, and we provide several state-of-the-art baselines. We also include 40k semantic BEV maps rendered from a digital twin. This novel approach to ground truth generation allows us to produce more accurate and crisp semantic maps than are currently available. We make the simulation environment available to allow researchers to adapt the dataset to their specific needs.
|
|
10:30-12:00, Paper WeAT26-NT.3 | Add to My Program |
DISO: Direct Imaging Sonar Odometry |
|
Xu, Shida | Imperial College London |
Zhang, Kaicheng | Heriot-Watt University |
Hong, Ziyang | Heriot-Watt University |
Liu, Yuanchang | University College London |
Wang, Sen | Imperial College London |
Keywords: SLAM
Abstract: This paper introduces a novel sonar odometry system that estimates the relative spatial transformation between two sonar image frames. Considering the unique challenges, such as low resolution and high noise, of sonar imagery for odometry and Simultaneous Localization and Mapping (SLAM), the proposed Direct Imaging Sonar Odometry (DISO) system is designed to estimate the relative transformation between two sonar frames by minimizing the aggregated sonar intensity errors of points with high intensity gradients. Moreover, DISO incorporates a multi-sensor window optimization technique, a data association strategy and an acoustic intensity outlier rejection algorithm for reliability and accuracy. The effectiveness of DISO is evaluated using both simulated and real-world sonar datasets, showing that it outperforms the existing geometric-only method in localization accuracy and achieves state-of-the-art sonar odometry performance. The source code is available at https://github.com/SenseRoboticsLab/DISO.
|
|
10:30-12:00, Paper WeAT26-NT.4 | Add to My Program |
CURL-MAP: Continuous Mapping and Positioning with CURL Representation |
|
Zhang, Kaicheng | Heriot-Watt University |
Ding, Yining | Heriot-Watt University |
Xu, Shida | Imperial College London |
Hong, Ziyang | Heriot-Watt University |
Kong, Xianwen | Heriot-Watt University |
Wang, Sen | Imperial College London |
Keywords: SLAM
Abstract: Maps of LiDAR Simultaneous Localisation and Mapping (SLAM) are often represented as point clouds. They usually take up a huge amount of storage space for large-scale environments, otherwise much structural detail may not be kept. In this paper, a novel paradigm of LiDAR mapping and odometry is designed by leveraging the Continuous and Ultra-compact Representation of LiDAR (CURL). Termed CURL-MAP (Mapping and Positioning), the proposed approach can not only reconstruct 3D maps with a continuously varying density but also efficiently reduce map storage space by using CURL's spherical harmonics implicit encoding. Different from the popular Iterative Closest Point (ICP) based LiDAR odometry techniques, CURL-MAP formulates LiDAR pose estimation as a unique optimisation problem tailored for CURL. Experimental evaluation shows that CURL-MAP achieves state-of-the-art 3D mapping results and competitive LiDAR odometry accuracy. We will release the CURL-MAP code for the community.
|
|
10:30-12:00, Paper WeAT26-NT.5 | Add to My Program |
Degradation Resilient LiDAR-Radar-Inertial Odometry |
|
Nissov, Morten | NTNU |
Khedekar, Nikhil Vijay | NTNU |
Alexis, Kostas | NTNU - Norwegian University of Science and Technology |
Keywords: SLAM, Aerial Systems: Perception and Autonomy, Field Robots
Abstract: Enabling autonomous robots to operate robustly in challenging environments is necessary in a future with increased autonomy. For many autonomous systems, estimation and odometry remain a single point of failure, from which it can often be difficult, if not impossible, to recover. As such, robust odometry solutions are of key importance. In this work, a method for tightly-coupled LiDAR-Radar-Inertial fusion for odometry is proposed, enabling the mitigation of the effects of LiDAR degeneracy by leveraging a complementary perception modality while preserving the accuracy of LiDAR in well-conditioned environments. The proposed approach combines modalities in a factor graph-based windowed smoother with sensor information-specific factor formulations which enable, in the case of degeneracy, partial information to be conveyed to the graph along the non-degenerate axes. The proposed method is evaluated in real-world tests on a flying robot experiencing degraded conditions including geometric self-similarity as well as obscurant occlusion. For the benefit of the community we release the datasets presented: https://github.com/ntnu-arl/lidar_degeneracy_datasets.
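A common way to detect LiDAR degeneracy, which degeneracy-aware factor formulations like the one described can build on, is to eigen-decompose the approximate Hessian of the registration problem and treat weakly constrained directions as degenerate. A minimal sketch, not the paper's implementation; the `threshold` value is an assumption:

```python
import numpy as np

def degenerate_directions(J, threshold=1.0):
    """Split the solution space of a scan-registration problem into
    well-conditioned and degenerate directions.

    J is the stacked Jacobian of the residuals; its Gauss-Newton
    Hessian approximation is H = J^T J. Eigenvectors of H whose
    eigenvalue falls below `threshold` are treated as degenerate, so
    an update (or a factor) can be restricted to the well-conditioned
    subspace along the remaining axes."""
    H = J.T @ J
    eigvals, eigvecs = np.linalg.eigh(H)      # ascending eigenvalues
    good = eigvecs[:, eigvals >= threshold]   # well-constrained axes
    bad = eigvecs[:, eigvals < threshold]     # degenerate axes
    return good, bad
```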
|
|
10:30-12:00, Paper WeAT26-NT.6 | Add to My Program |
Design and Evaluation of a Generic Visual SLAM Framework for Multi Camera Systems |
|
Kaveti, Pushyami | Northeastern University |
Vaidyanathan, Shankara Narayanan | Northeastern University |
Thamil Chelvan, Arvind | Northeastern University |
Singh, Hanumant | Northeastern University |
Keywords: SLAM, Data Sets for SLAM, Field Robots
Abstract: Multi-camera systems have been shown to improve the accuracy and robustness of SLAM estimates, yet state-of-the-art SLAM systems predominantly support monocular or stereo setups. This paper presents a generic sparse visual SLAM framework capable of running on any number of cameras and in any arrangement. Our SLAM system uses the generalized camera model, which allows us to represent an arbitrary multi-camera system as a single imaging device. Additionally, it takes advantage of the overlapping fields of view (FoV) by extracting cross-matched features across cameras in the rig. This limits the linear rise in the number of features with the number of cameras and keeps the computational load in check while enabling an accurate representation of the scene. We evaluate our method in terms of accuracy, robustness, and run time on indoor and outdoor datasets that include challenging real-world scenarios such as narrow corridors, featureless spaces, and dynamic objects. We show that our system can adapt to different camera configurations and allows real-time execution for typical robotic applications. Finally, we benchmark the impact of the critical design parameters that define the camera configuration for SLAM: the number of cameras and the overlap between their FoVs. All our software and datasets are freely available for further research.
|
|
10:30-12:00, Paper WeAT26-NT.7 | Add to My Program |
Ground-Fusion: A Low-Cost Ground SLAM System Robust to Corner Cases |
|
Yin, Jie | Shanghai Jiao Tong University |
Li, Ang | Shanghai Jiao Tong University |
Xi, Wei | Nankai University |
Yu, Wenxian | Shanghai Jiao Tong University |
Zou, Danping | Shanghai Jiao Tong University |
Keywords: SLAM, Data Sets for SLAM, Sensor Fusion
Abstract: We introduce Ground-Fusion, a low-cost sensor fusion simultaneous localization and mapping (SLAM) system for ground vehicles. Our system features efficient initialization, effective sensor anomaly detection and handling, real-time dense color mapping, and robust localization in diverse environments. We tightly integrate RGB-D images, inertial measurements, wheel odometer and GNSS signals within a factor graph to achieve accurate and reliable localization both indoors and outdoors. To ensure successful initialization, we propose an efficient strategy that comprises three different methods: stationary, visual, and dynamic, tailored to handle diverse cases. Furthermore, we develop mechanisms to detect sensor anomalies and degradation, handling them adeptly to maintain system accuracy. Our experimental results on both public and self-collected datasets demonstrate that Ground-Fusion outperforms existing low-cost SLAM systems in corner cases. We release the code and datasets at https://github.com/SJTU-ViSYS/Ground-Fusion.
|
|
10:30-12:00, Paper WeAT26-NT.8 | Add to My Program |
HERO-SLAM: Hybrid Enhanced Robust Optimization of Neural SLAM |
|
Xin, Zhe | Meituan |
Yue, Yufeng | Beijing Institute of Technology |
Zhang, Liangjun | Baidu |
Wu, Chenming | Baidu Research |
Keywords: SLAM, Deep Learning for Visual Perception, Vision-Based Navigation
Abstract: Simultaneous Localization and Mapping (SLAM) is a fundamental task in robotics, driving numerous applications such as autonomous driving and virtual reality. Recent progress on neural implicit SLAM has shown encouraging and impressive results. However, the robustness of neural SLAM, particularly in challenging or data-limited situations, remains an unresolved issue. This paper presents HERO-SLAM, a Hybrid Enhanced Robust Optimization method for neural SLAM, which combines the benefits of neural implicit field and feature-metric optimization. This hybrid method optimizes a multi-resolution implicit field and enhances robustness in challenging environments with sudden viewpoint changes or sparse data collection. Our comprehensive experimental results on benchmarking datasets validate the effectiveness of our hybrid approach, demonstrating its superior performance over existing implicit field-based methods in challenging scenarios. HERO-SLAM provides a new pathway to enhance the stability, performance, and applicability of neural SLAM in real-world scenarios. Project page: https://hero-slam.github.io.
|
|
WeAL-EX Poster Session, Exhibition Hall |
Add to My Program |
Late Breaking Results Poster IV |
|
|
|
10:30-12:00, Paper WeAL-EX.1 | Add to My Program |
Development of Real-Time Motion Mapping for Surgical Robot |
|
Peuchpen, Pantita | The Hong Kong University of Science and Technology (Guangzhou) |
Liu, Haichao | The Hong Kong University of Science and Technology |
Ma, Jun | The Hong Kong University of Science and Technology |
Keywords: Telerobotics and Teleoperation, Mapping, Haptics and Haptic Interfaces
Abstract: Enhancing healthcare equity is a key global policy objective under the United Nations' Sustainable Development Goals (SDGs). Geographical hurdles to healthcare access present substantial challenges, resulting in decreased service utilization, lower uptake of preventive care, and diminished survival rates, particularly among individuals residing far from healthcare facilities. Therefore, teleoperation technology has been implemented in the fields of medicine and surgery to address this issue. However, this technology requires a high level of precision and control. This paper presents a method for mapping the surgeon's hand motion to the robot arm and providing haptic feedback that relays forces sensed at the instrument tips back to the surgeon.
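The motion-mapping and force-feedback loop described above can be sketched as two simple transformations; the scaling factor and gain below are illustrative assumptions, not values from the work:

```python
def map_hand_to_robot(hand_delta, scale=0.2):
    """Scale the surgeon's hand displacement down before commanding
    the robot arm, so large hand motions produce small, precise
    instrument motions. `scale` is an illustrative motion-scaling
    factor."""
    return tuple(scale * d for d in hand_delta)

def force_feedback(tip_force, gain=1.0):
    """Reflect the force measured at the instrument tip back to the
    surgeon's haptic device, optionally amplified by `gain`."""
    return tuple(gain * f for f in tip_force)
```

In a teleoperation loop these would run each control cycle: hand deltas flow surgeon-to-robot, tip forces flow robot-to-surgeon.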
|
|
10:30-12:00, Paper WeAL-EX.2 | Add to My Program |
Micromanipulation Assistance Via Motion Guidance to a Spatiotemporal Ideal Trajectory Using GMM and LSTM |
|
Mori, Ryoya | Nagoya University |
Aoyama, Tadayoshi | Nagoya University |
Kobayashi, Taisuke | National Institute of Informatics |
Sakamoto, Kazuya | Nagoya University |
Takeuchi, Masaru | Nagoya University |
Hasegawa, Yasuhisa | Nagoya University |
Keywords: Imitation Learning, Biological Cell Manipulation, Human-Centered Robotics
Abstract: Intracytoplasmic sperm injection (ICSI) requires high skill to rotate the oocyte without causing damage using micropipettes. The oocyte rotation process is challenging due to the need for delicate and fast manipulations, as well as limited depth information. To address these difficulties, we propose a micromanipulation assistance system that utilizes a Gaussian Mixture Model (GMM) and Long Short-Term Memory (LSTM) to guide operators along an ideal spatiotemporal trajectory. The system learns static ideal trajectory points from data on an expert's pipette manipulations using GMM. An LSTM is trained to learn the expert's manipulations by inferring the expected pipette manipulation at each time step. During assistance for novice operators, the system provides real-time haptic and visual guidance by combining spatial guidance toward the static ideal trajectory with time-series-aware inference through the trained LSTM. Experiments conducted on novice operators demonstrated that the GMM+LSTM assistance system significantly improved operational efficiency and reduced cell damage when compared to both the conventional system and an assistance system with LSTM alone. These results demonstrate the effectiveness of the spatiotemporal guidance approach for assisting complex micromanipulation tasks.
|
|
10:30-12:00, Paper WeAL-EX.3 | Add to My Program |
Feedforward Macro-Mini Dynamics Compensation Toward Dynamically Transparent Exoskeletons |
|
Shimoyama, Takuma | Graduate School of Informatics and Engineering, the University O |
Noda, Tomoyuki | ATR Computational Neuroscience Laboratories |
Teramae, Tatsuya | ATR Computational Neuroscience Laboratories |
Nakata, Yoshihiro | The University of Electro-Communications |
Keywords: Prosthetics and Exoskeletons, Rehabilitation Robotics
Abstract: Mechanical transparency in exoskeletons, i.e., the robot does not adversely affect the patient's body dynamics, is essential for robot rehabilitation. Conventional transparency has aimed only at zero interaction force; transparency during robot-assisted movements has been overlooked. We focus on the ability to keep interaction forces at a non-zero target value during assistance and name it force-based dynamic transparency. As the interaction force increases, the robot's mechanical losses also increase, making force-based dynamic transparency more challenging to achieve. We aim to achieve it by extending the concept of the distributed macro-mini actuation approach, in which a pneumatic–electromagnetic hybrid actuator effectively compensates each mechanical loss by distributing the losses to the appropriate actuators. We have already demonstrated robust torque generation by distributing the kinetic friction force to the electromagnetic actuator of the pneumatic–electromagnetic hybrid actuator in our ICRA 2024 contributed paper. This research proposes a more general control design policy that distributes and compensates the physical properties of the elements comprising the robot across the actuators, realizing force-based dynamic transparency through the distributed macro-mini actuation approach with hybrid actuators.
|
|
10:30-12:00, Paper WeAL-EX.4 | Add to My Program |
Research on Planning and Control Methods for Refined Operations of Excavation Robot |
|
Lu, Liang | Tongji University |
Zhu, Minyan | Tongji University |
Tang, Chengzong | Tongji University |
Wang, Zhipeng | Tongji University |
He, Bin | Tongji University |
Keywords: Robotics and Automation in Construction, Task Planning, Motion Control
Abstract: A set of planning and control methods is designed to improve the precision operation ability of the excavation robot. The main contributions are summarized as follows: (1) A refined-work comprehensive trajectory optimization strategy was proposed, which improves the excavation robot's ability in refined work at the planning level. (2) A joint trajectory optimization method based on MABC and SQP was proposed, which improves the efficiency of trajectory optimization. (3) A variable universe fuzzy PID control strategy was designed to further reduce trajectory tracking errors.
|
|
10:30-12:00, Paper WeAL-EX.5 | Add to My Program |
Design of a Modular Supernumerary Mechanical Limb Actuated by a Foot Interface |
|
Chao, Elizabeth Ting | The Chinese University of Hong Kong |
Chan, Sheung Yan | The Chinese University of Hong Kong |
Huang, Yanpei | Imperial College London |
Eden, Jonathan | University of Melbourne |
Burdet, Etienne | Imperial College London |
Lau, Darwin | The Chinese University of Hong Kong |
Keywords: Wearable Robotics, Prosthetics and Exoskeletons, Tendon/Wire Mechanism
Abstract: Supernumerary limbs are wearable devices that can act both as a prosthetic and as a mechanism of human augmentation, providing additional extremities rather than replacing missing ones. Compared to typical manipulators, supernumerary limbs are unique in that they are not fixed to a single base. The high redundancy of the human body can be exploited, since the supernumerary limb can be controlled directly by moving the base. Addressing these considerations, we developed a task-specific, cable-driven Superlimb to make such a wearable device practical. For intuitive control of the supernumerary limb, a foot interface was implemented. The end effector is actuated mechanically by translating wheel motion into the displacement of inner steel cables within an outer housing. The result is a wearable device in which haptic feedback from the end effector can be felt by the user at the foot. Experiments demonstrated that the workload of a supernumerary limb attached to one's body was not significantly higher than that of one on a fixed base.
|
|
10:30-12:00, Paper WeAL-EX.6 | Add to My Program |
Drone-Enabled Last Mile Delivery for Energy Management in UGV Teams |
|
Singh, Gaurav | Iowa State University |
Mandal, Shashwata | Iowa State University |
Bhattacharya, Sourabh | Iowa State University |
Keywords: Energy and Environment-Aware Automation, Multi-Robot Systems, Planning, Scheduling and Coordination
Abstract: Autonomous multi-robot systems deployed outdoors experience efficiency bottlenecks due to recharge operations. Many challenges arise when attempting to place static recharge stations outdoors, owing to factors such as distance from the robots, navigation, charge time, and recharge sequence. To overcome these challenges, we implement a recharge methodology that uses UAVs to deliver secondary batteries to a robot, which then recharges its primary battery. We propose a framework and algorithms for finding efficient delivery sequences for recharging the robots, in which we explore the use of a nearest-first approach. The combined impact of the algorithms on the efficiency of UAV-UGV collaboration is studied. A testbed is set up to evaluate the feasibility and scalability of the system in the real world, using Crazyflies and Boe-Bots.
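The nearest-first approach mentioned in the abstract can be sketched as a greedy loop: from the current position, always deliver to the closest robot still awaiting a battery. A minimal illustration under that assumption, not the authors' algorithm:

```python
import math

def nearest_first_sequence(depot, robots):
    """Greedy nearest-first delivery order for a battery-carrying UAV.

    depot: (x, y) start position; robots: list of (x, y) positions of
    robots awaiting a battery. Returns the visit order produced by
    repeatedly flying to the nearest unserved robot."""
    order, pos, remaining = [], depot, list(robots)
    while remaining:
        nxt = min(remaining, key=lambda r: math.dist(pos, r))
        order.append(nxt)
        remaining.remove(nxt)
        pos = nxt
    return order
```

Greedy sequencing is not optimal in general, but it is a natural baseline for studying UAV-UGV recharge efficiency.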
|
|
10:30-12:00, Paper WeAL-EX.7 | Add to My Program |
Vision-Driven Robotic System for Autonomous Sewing of Elastic Fabrics |
|
Marchello, Gabriele | Istituto Italiano Di Tecnologia |
Abidi, Syed Haider Jawad | Istituto Italiano Di Tecnologia |
Lahoud, Marcel | Italian Institute of Technology |
Fontana, Eleonora | Istituto Italiano Di Tecnologia |
Meddahi, Amal | Italian Institute of Technology |
Baizid, Khelifa | Italian Institute of Technology |
Farajtabar, Mohammad | University of Calgary |
D'Imperio, Mariapaola | Istituto Italiano Di Tecnologia |
Cannella, Ferdinando | Istituto Italiano Di Tecnologia |
Keywords: Industrial Robots, Grippers and Other End-Effectors, Soft Robot Applications
Abstract: Manipulating soft materials has always been one of the most difficult problems in robotics, due to the non-linear mechanical behaviour of fabrics. Therefore, the automation of systems based on the manipulation of soft materials (such as the clothing industry) has been very limited. We present a robotic cell that supports workers by automating the production of cyclist garments, each composed of an elastic cloth and a foam pad that must be sewn together. The robotic cell comprises two robotic arms equipped with a two-finger parallel gripper and a pneumatic needle gripper to flatten the cloth and pick the foam pad, respectively. Moreover, a Cartesian robot is employed to drive the two fabrics under the needle of a sewing machine. This project aims to improve garment productivity and the working conditions of the operators. The results obtained by the robotic cell are comparable with conventional methods in both quality and production time. In addition, the modularity underlying the design of this structure ensures a high degree of flexibility; the system can therefore be used to make all types of garments.
|
|
10:30-12:00, Paper WeAL-EX.8 | Add to My Program |
EAIK: A Toolbox for Efficient Analytical Inverse Kinematics by Subproblem Decomposition |
|
Ostermeier, Daniel | Technical University of Munich |
Külz, Jonathan | Technical University of Munich |
Keywords: Kinematics, Industrial Robots, Software Tools for Robot Programming
Abstract: Current methods for general closed-form inverse kinematics (IK) suffer from slow derivation speed and complex setup procedures. Our IK toolbox provides high usability by encapsulating all of its functionality in a Python package. It automatically derives a robot's kinematic structure from either a URDF file or a set of Denavit–Hartenberg (DH) parameters. We achieve millisecond derivation speeds for the subproblem decomposition, numeric stability for the IK solutions, and microsecond IK computation times that surpass current numerical methods while providing a complete analytical solution set.
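Subproblem decomposition reduces analytical IK to canonical geometric subproblems. The simplest, Paden–Kahan subproblem 1 (find the rotation about a known axis that maps one point onto another), has the closed form below; this is a textbook sketch, not code from the EAIK toolbox:

```python
import numpy as np

def subproblem1(p, q, axis):
    """Paden-Kahan subproblem 1: return theta such that rotating p
    about the unit vector `axis` (through the origin) maps it onto q.
    Assumes a solution exists (|p| == |q| and equal components along
    the axis)."""
    w = axis / np.linalg.norm(axis)
    # Project both points onto the plane perpendicular to the axis;
    # the answer is the signed angle between the projections.
    p_perp = p - w * np.dot(w, p)
    q_perp = q - w * np.dot(w, q)
    return np.arctan2(np.dot(w, np.cross(p_perp, q_perp)),
                      np.dot(p_perp, q_perp))
```

Chaining such subproblems along a robot's joint axes yields the full closed-form joint-angle set.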
|
|
10:30-12:00, Paper WeAL-EX.9 | Add to My Program |
3D Actuation and Trajectory Control of Ferrofluidic Droplet Robot Swarms for Targeted Drug Delivery |
|
Fan, Xinjian | Soochow University |
Zhang, Yunfei | Soochow University |
Yang, Zhan | Soochow University |
Keywords: Automation at Micro-Nano Scales, Biologically-Inspired Robots, Micro/Nano Robots
Abstract: Research on microrobot swarms points to exciting applications, but handling these swarms is much more complex than dealing with individual robots. The challenge starts with making these tiny robots efficiently and in large numbers, which current methods cannot always do well. Additionally, orchestrating the collective operation of these swarms within the human body, especially beyond mere planar movement, poses a substantial challenge. Most existing research does not fully tackle how to control these swarms in three-dimensional spaces within the body, which is crucial for delivering medicine right where it is needed. To address the aforementioned problems and challenges, this work presents an innovative method based on microfluidic technology for tackling the issue of mass production of microrobots. Furthermore, this work constructs an 8-axis distributed electromagnetic coil to realize a decoupled three-dimensional (3D) spatial control method for microrobot swarms based on magnetic force and torque, solving the challenges related to three-dimensional actuation and trajectory control. By using 3D printing and animal tissue, we finally create environments that mimic human tissues for 3D locomotion tests of microrobot swarms. This research endeavors to advance our capabilities in manipulating these minuscule robotic swarms, paving the way for novel disease treatment methods that promise greater precision and reduced invasiveness.
|
|
10:30-12:00, Paper WeAL-EX.10 | Add to My Program |
Autonomous Loose Fruit Collection for Oil Palm Plantation |
|
Ismail, Muhamad Khuzaifah | Sime Darby Plantation Research |
Keywords: Robotics and Automation in Agriculture and Forestry, Agricultural Automation
Abstract: The Autonomous Loose Fruit Robot (ALFRo) project, spearheaded by SD Plantation Research, presents a groundbreaking solution to address labor shortages and operational inefficiencies in the oil palm industry. Leveraging advanced technologies including robotics, artificial intelligence, and automation, ALFRo is able to carry out labor-intensive tasks, easing the burden of labor shortage in the plantation industry. The ALFRo project involved research and development activities focused on integrating advanced sensors with image processing algorithms to detect palm fruitlets left on the ground and to evacuate the valuable consignment in a timely manner. ALFRo features scalability and agility for application in oil palm plantations. Operational sustainability is demonstrated through minimal operational losses and minimal modification of soil conditions. With a planned timeline of one year for development and extensive testing, ALFRo aims to set new standards for loose fruit collection in the palm oil industry. By enhancing efficiency, productivity, and safety while minimizing environmental impact, ALFRo represents a transformative shift in oil palm estate management, paving the way for a more sustainable and prosperous future in the industry and beyond.
|
|
10:30-12:00, Paper WeAL-EX.11 | Add to My Program |
Balance Recovery Via Whole-Body Model Predictive Control for Wheeled Bipedal Robots |
|
Lee, Young Hun | Korea Institute of Machinery & Materials |
Kang, Woosong | DGIST |
Park, Jongwoo | Korea Institue of Machinery & Materials |
Ahn, Jeongdo | Korea Institute of Machinery and Materials |
Park, Dongil | Korea Institute of Machinery and Materials (KIMM) |
Park, Chanhun | KIMM |
Keywords: Legged Robots, Whole-Body Motion Planning and Control, Optimization and Optimal Control
Abstract: This poster presents a whole-body controller based on model predictive control (MPC), which enables a wheeled bipedal robot to demonstrate dynamic locomotion over various terrains, including slopes and stairs, as well as under various types of external disturbances. To stabilize the robot's balance, optimal torques for each joint are generated through the MPC method. The proposed whole-body controller was tested on the wheeled bipedal robot, and its locomotive abilities are evaluated in the Gazebo simulator.
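The receding-horizon idea behind such a controller can be sketched on a toy model. The block below is a generic illustration only, not the authors' whole-body formulation: a 1D double integrator stands in for the wheeled biped, and the weights, horizon, and time step are arbitrary choices.

```python
import numpy as np

def mpc_control(x0, A, B, Q, R, horizon=20):
    """Unconstrained linear MPC via the batch (condensed) approach:
    stack the predicted states x_k = A^k x0 + sum_j A^(k-1-j) B u_j,
    solve the resulting least-squares problem for the whole input
    sequence, and return only the first input (receding horizon)."""
    n, m = B.shape
    # Prediction matrices: X = Sx @ x0 + Su @ U
    Sx = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(horizon)])
    Su = np.zeros((n * horizon, m * horizon))
    for k in range(horizon):
        for j in range(k + 1):
            Su[k * n:(k + 1) * n, j * m:(j + 1) * m] = (
                np.linalg.matrix_power(A, k - j) @ B)
    Qbar = np.kron(np.eye(horizon), Q)   # stage cost on states
    Rbar = np.kron(np.eye(horizon), R)   # stage cost on inputs
    H = Su.T @ Qbar @ Su + Rbar
    f = Su.T @ Qbar @ Sx @ x0
    U = np.linalg.solve(H, -f)           # minimiser of 0.5 U'HU + f'U
    return U[:m]

# Toy balance model: discrete double integrator (position, velocity).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

x = np.array([1.0, 0.0])                 # start displaced from the origin
for _ in range(60):                      # closed-loop simulation, 6 s
    u = mpc_control(x, A, B, Q, R)
    x = A @ x + B @ u
```

With constraints and the full robot dynamics added, the same pattern becomes a quadratic program solved at every control step.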
|
|
10:30-12:00, Paper WeAL-EX.12 | Add to My Program |
An Origami-Inspired Approach to Height Adjustment of Wind Assisted Ship Propulsion |
|
Kim, Chan | Seoul National University |
Jung, Sun-Pill | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Mechanism Design, Actuation and Joint Mechanisms, Tendon/Wire Mechanism
Abstract: The International Maritime Organization (IMO) is pushing for cleaner seas, setting targets to reduce CO2 emissions by 40% by 2030 and 70% by 2050 relative to 2008 levels. Ships are now required to adopt more eco-friendly "green ship" technologies. One approach to creating a green ship is to reduce fuel consumption by using an auxiliary propulsion device that harnesses wind power, a method that can also be applied to existing ships. We plan to manufacture a rotor sail; the figure illustrates the rotor sail. The rotor sail system uses a rotating actuator on the inner tower to spin the outer panel and operates on the principle of the Magnus effect: when the rotating sail encounters wind, it generates lift. The rotor sail needs to be tall because wind strength increases with altitude at sea, enhancing the Magnus effect; our target rotor sail height is 35 m. If the sail is divided into layers, the diameter is no longer constant and flow separation occurs, leading to efficiency issues. The rotor sail's height can also create navigational issues, such as when passing under bridges or docking, so the sail's height must occasionally be reduced. What if the rotor sail could fold to the required height? By folding only the upper part, the sail can retain a height at which it operates efficiently. Therefore, the selection or additional design of a foldable method that can maintain a constant diameter is necessary.
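For context, the dependence on wind speed and spin rate follows from standard ideal-flow theory (the Kutta-Joukowski theorem applied to a spinning cylinder; this idealization is background material, not taken from the poster):

```latex
% Lift per unit span L' of a cylinder of radius R spinning at rate
% \omega in wind of speed V and air density \rho (ideal flow):
L' = \rho \, V \, \Gamma, \qquad \Gamma = 2\pi R^{2} \omega
% Total lift scales with the sail height H, L \approx \rho V \Gamma H,
% and V itself grows with altitude, which favours taller sails.
```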
|
|
10:30-12:00, Paper WeAL-EX.13 | Add to My Program |
2 DoF Prosthetic Wrist with Concave-Convex Rolling Contact Joint |
|
Jeong, Inchul | Seoul National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Actuation and Joint Mechanisms, Prosthetics and Exoskeletons, Tendon/Wire Mechanism
Abstract: The wrist, with its two degrees of freedom (DoF), plays a crucial role in orienting the hand to grasp objects with the desired grasping posture. The absence of a wrist simplifies the kinematics of the upper extremity: prosthetic hand users without a wrist rely on other joints or the intact limb and suffer discomfort from compensatory motion, which can lead to residual limb pain, secondary musculoskeletal disease, and overuse syndrome. The wrist is involved not only in orienting the hand but also in manipulating the hand or grasped tools, and people actively use their wrists in activities of daily living (ADL). A coupled-DoF motion known as the dart-throwing motion is mainly used in ADL with various coupling ratios. Given these characteristics of the wrist, prosthetic users prefer a hand with a wrist to one without. To fulfill the needs of prosthetic users, an artificial limb needs to be lighter and offer better usability and functionality at an anthropomorphic size, and a prosthetic wrist must meet these conditions to restore the missing functions for amputees. In this paper, we propose the design of a 3D-printed prosthetic wrist of compact size and low inertia with 2-DoF actuation. To enlarge the load capacity with lightweight material and mimic the concave shape of the wrist row, a rolling contact joint with a concave-convex shape was used to lower the contact stress. A tendon-driven actuating system enables proximal placement of the motor for low inertia. The joint surface and tendon routing are designed to decouple the 2 DoF.
|
|
10:30-12:00, Paper WeAL-EX.14 | Add to My Program |
Size-Adaptive Robotic Gripper with Constant Gripping Force Using Electromagnetic Fuse Switch Mechanism |
|
Kim, Tae Hwan | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Grippers and Other End-Effectors, Grasping, Mechanism Design
Abstract: The demand for customized and personalized products has recently been increasing, and it would be advantageous to have the capability to manufacture products of various sizes on a single production line, necessitating adaptive robotic grippers for different objects. We propose a size-adaptive robotic gripper capable of grasping objects of various sizes without using any sensors. The proposed mechanism employs an electromagnetic fuse that delivers gripping force below a threshold level and disconnects the actuation circuit when overloaded, resulting in a constant gripping force applied to objects of different sizes. In this system, while the gripper holds an object, the gripping force can be controlled by adjusting the amount of electric current supplied to the electromagnetic fuse. Since force transmission is determined by the geometry and motion of the mechanism, the transfer function of force transmission for the proposed mechanism is modeled and optimized to generate a constant gripping force regardless of the size of the object. Experimental results confirm that the gripper is capable of grasping various objects without closed-loop control.
|
|
10:30-12:00, Paper WeAL-EX.15 | Add to My Program |
SEMG-Based Hand Gesture Recognition by Time-Frequency Domain Multifeature Coupling Network |
|
Wang, Peiyao | Shenyang University of Technology |
Li, Yazhou | Shenyang University of Technology |
Li, Kairu | Shenyang University of Technology |
Keywords: Gesture, Posture and Facial Expressions, Prosthetics and Exoskeletons, Datasets for Human Motion
Abstract: Surface electromyography (sEMG), which enables tracking of electrical activity within muscles, is widely applied to human-machine interaction (HMI), such as gesture recognition and prosthetic control. However, electrode displacement can seriously degrade sEMG-based motion recognition accuracy, so in practice users have to retrain the system each time they re-wear the sEMG electrodes, which increases their training burden and degrades the user experience. We therefore propose a Global-Local Time-Frequency Coupling Network (GL-TF Coupling Network) for sEMG-based gesture recognition. The network adopts a compact convolution-transformer structure, where the convolutional module learns dual-channel signals in the time and frequency domains to extract low-level local features of gesture actions; combined with a self-attention module, it captures global correlations within the local time-frequency features. A simple classifier module composed of fully connected layers then predicts the gesture categories of sEMG signals. To enhance multi-channel information fusion among sEMG signals, a "conical flask" structure for the convolutional fusion channel is introduced, coupling information across different channels. Experimental results demonstrate an average gesture recognition accuracy of 90% on the public "EMG data for gestures" dataset and 90.69% on our "ED-sEMG" dataset, which includes scenarios of electrode displacement.
|
|
10:30-12:00, Paper WeAL-EX.16 | Add to My Program |
ImitationBT: Imitation Learning for Behavior Tree Generation from DRL Agents |
|
Bathula, Shailendra Sekhar | University of Georgia |
Parasuraman, Ramviyas | University of Georgia |
Keywords: Imitation Learning, Behavior-Based Systems, Reinforcement Learning
Abstract: Behavior Trees (BT) stand as a favored control architecture among game designers and robotics experts, prized for their modularity, reactivity, and hierarchical structure. These properties enable BTs to offer scalable and clear-cut solutions to a wide array of decision-making challenges. In contrast, Deep Reinforcement Learning (DRL) has demonstrated exceptional performance but faces hesitancy in high-stakes domains due to its reliance on neural networks, which present challenges in verifiability and explainability. In this context, we introduce a novel framework designed to bridge the gap between the high performance of DRL and the desirable transparency and verifiability of BTs. By employing imitation learning to capture and transfer the expertise of a reinforcement learning model, we pave the way for generating BTs that are not only effective but also transparent, interpretable, and readily verifiable for real-world problems.
|
|
10:30-12:00, Paper WeAL-EX.18 | Add to My Program |
Accurate Loop Closure with Panoptic Information and Scan Context++ for LiDAR-Based SLAM |
|
Tan, Louise | Kumoh National Institute of Technology |
Lee, Heoncheol | Kumoh National Institute of Technology |
Keywords: SLAM, Semantic Scene Understanding
Abstract: Loop closing is crucial in a SLAM system to reduce drift accumulation. Most SLAM systems leverage only low-level geometric features, leaving high-level information unused. We propose incorporating panoptic information into the Scan Context++ algorithm to improve loop closure detection accuracy. The proposed approach exploits LiDAR odometry and panoptic information to perform loop closure detection as well as pose estimation and mapping. Experimental results show improvements in loop closure detection with the incorporation of panoptic information.
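The Scan Context family of descriptors the abstract builds on can be sketched generically: bin the LiDAR cloud into a polar (ring x sector) grid keyed on maximum height, then compare descriptors over circular column shifts so a pure yaw change between revisits costs nothing. This is a coarse illustration with arbitrary grid sizes, not the authors' modified pipeline.

```python
import math

def scan_context(points, num_rings=4, num_sectors=8, max_range=10.0):
    """Coarse Scan Context descriptor: bin (x, y, z) points into a
    polar ring-by-sector grid and keep the maximum z per bin."""
    desc = [[0.0] * num_sectors for _ in range(num_rings)]
    for x, y, z in points:
        rng = math.hypot(x, y)
        if rng >= max_range or rng == 0.0:
            continue
        ring = int(rng / max_range * num_rings)
        sector = int((math.atan2(y, x) + math.pi)
                     / (2 * math.pi) * num_sectors) % num_sectors
        desc[ring][sector] = max(desc[ring][sector], z)
    return desc

def sc_distance(a, b):
    """Rotation-tolerant distance: minimum mean absolute difference
    over all circular column (yaw) shifts of descriptor b."""
    rings, sectors = len(a), len(a[0])
    best = float("inf")
    for shift in range(sectors):
        d = sum(abs(a[r][c] - b[r][(c + shift) % sectors])
                for r in range(rings) for c in range(sectors))
        best = min(best, d / (rings * sectors))
    return best

# A toy scan and the same scan yawed by one sector width (45 deg):
pts = [(5.0, 1.0, 1.0), (-2.0, 4.0, 2.0), (-3.0, -3.5, 0.5)]
alpha = 2 * math.pi / 8
rot = [(x * math.cos(alpha) - y * math.sin(alpha),
        x * math.sin(alpha) + y * math.cos(alpha), z) for x, y, z in pts]
d = sc_distance(scan_context(pts), scan_context(rot))
```

Replacing the max-height value per bin with a semantic or panoptic label statistic is one natural way such a descriptor can carry the high-level information the abstract refers to.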
|
|
10:30-12:00, Paper WeAL-EX.19 | Add to My Program |
Flexure Hinge-Based Miniature Parallel Manipulator for Eye-Box Expansion of AR-HUD System |
|
Park, Yong-Min | Seoul National University |
Jung, Sun-Pill | Seoul National University |
You, Jang-Woo | Samsung |
Koh, Je-Sung | Ajou University |
Lee, Hong-Seok | Pukyong National University |
Cho, Kyu-Jin | Seoul National University, Biorobotics Laboratory |
Keywords: Parallel Robots, Industrial Robots, Multi-Robot Systems
Abstract: Augmented reality head-up displays (AR-HUD) implement augmented information on the road to provide users with a diverse experience. The Maxwellian-view display and the holographic display were devised to implement realistic AR; however, their small eye-box limited their potential for HUD applications. In this paper, we developed a manipulator to mechanically expand the eye-box into 3D space and maximize the utilization of light, unlike previous solutions. A modified delta robot mechanism, which consists of three legs in a 90-degree arrangement, is used to manipulate two adjacent projectors with discrete 3-DOF movements, delivering images to both eyes in real time. We utilized the origami fabrication method for its light weight and simple fabrication, and propose a triangular-prism-shaped parallelogram linkage, which can be used as a linkage for delta robots. The linkage has the stiffness and zero backlash needed to move the projector at the required speed and precision within the workspace. The eye-box was expanded to 140 mm × 110.6 mm × 140 mm, and the positioning error of the AR-HUD system was evaluated to be less than 1 mm.
|
|
10:30-12:00, Paper WeAL-EX.20 | Add to My Program |
Model-Based Real-Time Simulator for Robotic Electromagnetic Actuation |
|
Ko, Yeongoh | Chonnam National University |
Lee, Han-Sol | Chonnam National University |
Kim, Chang-Sei | Chonnam National University |
Keywords: Medical Robots and Systems, Simulation and Animation
Abstract: This study introduces a real-time simulator designed for a robotic electromagnetic actuator (EMA) system. To address the complexity of electromagnetic field computations, a simplified magnetic field model based on the Biot-Savart law is proposed. The proposed model reduces calculation time from 48 seconds with the Finite Element Method (FEM) to 204 milliseconds and shows less than 4% error relative to FEM simulations and real measurements along the principal axis. Within the ROS Gazebo environment, the simulator visualizes the robotic EMA system and its magnetic fields. It operates by receiving joystick commands for the robot pose, computing the currents required for the desired fields and posture, and transmitting these currents simultaneously to both the real system and the simulator. Experimental results exhibit 5° errors in capsule movements and a 2.29 mm root-mean-square error (RMSE) for guidewire navigation.
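The Biot-Savart idea behind such simplified coil models can be sketched as follows: discretize the conductor into short segments and sum their field contributions. This is a generic single-loop illustration, not the authors' 8-axis model; the loop radius, current, and segment count are arbitrary.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability (T*m/A)

def biot_savart_loop(point, radius=0.05, current=1.0, segments=360):
    """Magnetic field (T) at `point` due to a circular current loop of
    the given radius (m), centered at the origin in the xy-plane,
    approximated by summing Biot-Savart contributions of short
    straight segments: dB = mu0/(4*pi) * I * (dl x r) / |r|^3."""
    bx = by = bz = 0.0
    for k in range(segments):
        t0 = 2.0 * math.pi * k / segments
        t1 = 2.0 * math.pi * (k + 1) / segments
        tm = 0.5 * (t0 + t1)
        # segment midpoint and direction vector dl
        mid = (radius * math.cos(tm), radius * math.sin(tm), 0.0)
        dl = (radius * (math.cos(t1) - math.cos(t0)),
              radius * (math.sin(t1) - math.sin(t0)), 0.0)
        # vector from segment midpoint to the field point
        r = (point[0] - mid[0], point[1] - mid[1], point[2] - mid[2])
        rmag = math.sqrt(r[0] ** 2 + r[1] ** 2 + r[2] ** 2)
        c = MU0 * current / (4.0 * math.pi * rmag ** 3)
        bx += c * (dl[1] * r[2] - dl[2] * r[1])
        by += c * (dl[2] * r[0] - dl[0] * r[2])
        bz += c * (dl[0] * r[1] - dl[1] * r[0])
    return (bx, by, bz)

# Field at the loop center; the analytic value there is mu0*I/(2R).
b_center = biot_savart_loop((0.0, 0.0, 0.0))
analytic = MU0 * 1.0 / (2 * 0.05)
```

Because each evaluation is a fixed-length sum, such models run in milliseconds, which is the kind of speedup over FEM the abstract reports.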
|
|
10:30-12:00, Paper WeAL-EX.21 | Add to My Program |
ILPSR: Imitation Learning with Predictable Skill Representation for Long-Horizon Manipulation Tasks |
|
Wang, Hao | University of Science and Technology of China |
Zhang, Hao | University of Science and Technology of China |
Li, Lin | University of Science and Technology of China |
Qian, Tangyu | University of Science and Technology of China |
Zhou, Zhangli | University of Science and Technology of China |
Kan, Zhen | University of Science and Technology of China |
Keywords: Deep Learning in Grasping and Manipulation, AI-Based Methods, Learning from Experience
Abstract: Robots rely heavily on prior experience when learning new tasks. However, traditional supervised learning-based methods are limited by the need for large-scale and high-quality datasets as well as generalization fragility, resulting in poor scalability. To address these problems, this work proposes Imitation Learning with Predictable Skill Representation (ILPSR) to drive robots to learn downstream tasks robustly and efficiently with prior data from previous tasks. To better utilize the prior experience, a Predictable Skill Representation Learning model (PSRL) is first developed to extract predictable skill embeddings and skill priors from the prior data. Subsequently, a skill-based behavioral cloning method is employed to apply the learned skill embeddings for policy learning and generalization in downstream target tasks. Experiments demonstrate that ILPSR can more effectively perform challenging long-horizon complex manipulation skills, with learning performance that outperforms the baselines.
|
|
10:30-12:00, Paper WeAL-EX.22 | Add to My Program |
Servo Integrated Nonlinear Model Predictive Control for Overactuated Tiltable-Quadrotors |
|
Li, Jinjie | The University of Tokyo |
Sugihara, Junichiro | The University of Tokyo |
Zhao, Moju | The University of Tokyo |
Keywords: Aerial Systems: Mechanics and Control, Motion Control
Abstract: Quadrotors are widely employed across various domains, yet conventional models face limitations due to underactuation, where attitude control is closely tied to positional adjustments. In contrast, quadrotors equipped with tiltable rotors offer overactuation, empowering them to track both position and attitude references. However, the nonlinear dynamics of the drone body and the sluggish response of tilting servos pose challenges for conventional cascade controllers. In this study, we propose a control methodology for tilting-rotor quadrotors leveraging nonlinear model predictive control (NMPC). Unlike conventional approaches, our method preserves the full dynamics without simplification and utilizes actuator commands directly as control inputs. Notably, we incorporate a first-order servo model within the NMPC framework. Through simulation, we observe that integrating the servo dynamics not only enhances control performance but also accelerates convergence. To assess the efficacy of our approach, we fabricate a tiltable-quadrotor and deploy the algorithm onboard at a frequency of 100Hz. Extensive real-world experiments demonstrate smooth and rapid pose tracking performance.
|
|
10:30-12:00, Paper WeAL-EX.23 | Add to My Program |
Design of Multi-Functional and Deployable Small-Scale Modular Robot Using Origami-Based Compliant Structure |
|
Kim, Junhyung | Seoul National University |
Jung, Mincheol | Seoul National University |
Kim, Jaehoon | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Cellular and Modular Robots, Multi-Robot Systems, Sensor-based Control
Abstract: Manipulation tasks in confined spaces are challenging for human workers, and modular robots, characterized by deployable and multifunctional capabilities, have recently gained attention for these applications. They can be dexterous even under spatial limitations and can alter their form factors based on their modularity, offering a wide range of motion and functionality. However, achieving lightweight and compact designs remains a challenge due to power sources and electric motors; moreover, oversimplified designs and lightweight structures may degrade precise control performance. To address these issues, we aim to develop a versatile, dexterous, and controllable modular robot at a compact centimeter scale. The actuator modules, made of shape memory alloy (SMA) springs using smart composite microstructure (SCM) technology, enable linear and bending motions. A sensor module integrated directly into the actuator module detects the actuation states by measuring capacitance changes. Multiple actuator-sensor modules can be combined for diverse applications. The module's performance is experimentally characterized by comparing its mechanical responses with analytical models based on the relationship between the SMA temperature and the generated force. Closed-loop control performance is evaluated using the root-mean-square error (RMSE). Lastly, we demonstrate various module combinations as manipulators with different target motions.
|
|
10:30-12:00, Paper WeAL-EX.24 | Add to My Program |
Neuro-Symbolic Task Replanning Using Large Language Models |
|
Kwon, Minseo | Ewha Womans University |
Kim, Young J. | Ewha Womans University |
Keywords: Task Planning
Abstract: We propose a novel task replanning pipeline for executing complicated robotic tasks on physical robots, utilizing a combination of a symbolic task planner and a multi-modal Large Language Model (LLM). Our pipeline begins by obtaining the semantic and spatial relationships of target objects in the environment using a multi-modal LLM and an open-vocabulary object detection model. The LLM then specifies a planning problem based on the scene and user-provided goal descriptions, which a symbolic planner uses to plan tasks. These plans are translated into low-level programming languages for execution on the robot, with syntax and semantic checking by the LLM to ensure correctness, and replanning if execution fails. We demonstrate our pipeline on dual UR5e robots across various benchmark tasks, including pick-and-place, block stacking, and block rearrangement, to verify its effectiveness.
|
|
10:30-12:00, Paper WeAL-EX.25 | Add to My Program |
Lifting 2D Pretrained Knowledge to 3D for Object Grounding |
|
S, Ashwin | Indian Institute of Science |
Bannur, Ganesh | Indian Institute of Science, RV College of Engineering |
Amrutur, Bharadwaj | Indian Institute of Science |
Keywords: Semantic Scene Understanding, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: We propose GAP (Ground And Project), a method for leveraging pretrained 2D models and NeRF for 3D grounding. Recently 3D grounding has seen the emergence of techniques such as LERF, which distil CLIP’s knowledge of scenes by training a separate network. However, it is prohibitive to train such a network to fully extract the capabilities of 2D models. An alternate method for connecting 2D models to 3D is by lifting their 2D mask outputs to 3D. This enables the use of pre-existing 2D models for 3D grounding. GAP adopts this approach and projects 2D grounding masks into 3D using depth information from NeRF. We demonstrate GAP with a 2D grounding pipeline consisting of two models, visual grounding and text spotting. Incorporating text spotting increases the accuracy of grounding by disambiguating between multiple instances of an object (for example an HP laptop vs a Lenovo laptop). GAP demonstrates stronger 3D grounding capabilities when compared to LERF especially in such multi-instance scenes. It is also able to transfer precise masks predicted by 2D models into 3D. GAP has wide utility in robotics such as guiding object manipulation and identifying navigation goals. Critical for mobile robots, GAP enables adapting to new scenes rapidly since it uses pretrained models and only trains NeRF. Finally, while we give a specific pipeline, our technique is generic and can incorporate any model/pipeline which takes image-query pairs as input and gives masks as output.
|
|
10:30-12:00, Paper WeAL-EX.26 | Add to My Program |
Upper-Limb Motion Intention Estimation Using Surface EMG and Soft Strain Sensors for Soft Wearable Robots |
|
Kim, Jaehyeon | Seoul National University |
Lee, Minhee | Seoul National University |
Hwang, Sungjae | Seoul National University |
Choi, YeongJin | Seoul National University |
Kim, Jeongnam | Seoul National University |
Park, Yong-Lae | Seoul National University |
Keywords: Human Detection and Tracking, Soft Sensors and Actuators, Deep Learning Methods
Abstract: Motion estimation plays an important role in human-assistive robotic systems, since it provides information on the motion intention of the user and the states of the system. Motion estimation with surface electromyography (sEMG) is a promising approach in that sEMG signals carry information on the user's muscle activation. Researchers have studied different methods of estimating body motions using sEMG, especially via data-driven approaches. Deep learning, one of the most commonly used techniques, has shown reasonable performance on gesture recognition, but estimating accurate joint motions is still a challenge, since it is difficult to extract the intermediate states of the body from sEMG signals alone. To address this issue, we propose a method of estimating upper-limb motions using both sEMG and soft strain sensor data. The soft strain sensor, made of a highly stretchable elastomer embedded with a liquid-metal conductor, detects the strain on the joint where it is placed. We use the output of CNN-RNN models to find the angular displacement of the joint in this work. Using the current muscle activation detected by the sEMG and the strain on the elbow joint measured by the soft sensor, the system reliably estimates the joint angle in real time. The average root-mean-square error of the estimated joint-angle displacement from the model is 1.7 deg, while the maximum angle displacement in the sample dataset is 12.2 deg.
|
|
WeBA1-CC Award Session, CC-Main Hall |
Add to My Program |
Service Robotics |
|
|
Chair: Barfoot, Timothy | University of Toronto |
Co-Chair: Cavallo, Filippo | University of Florence |
|
13:30-15:00, Paper WeBA1-CC.1 | Add to My Program |
Censible: A Robust and Practical Global Localization Framework for Planetary Surface Missions |
|
Nash, Jeremy | Jet Propulsion Laboratory |
Dwight, Quintin | University of Michigan |
Saldyt, Lucas | Jet Propulsion Laboratory |
Wang, Haoda | Jet Propulsion Laboratory, California Institute of Technology |
Myint, Steven | Jet Propulsion Laboratory |
Ansar, Adnan | NASA Jet Propulsion Laboratory |
Verma, Vandi | NASA Jet Propulsion Laboratory, California Institute Of |
Keywords: Space Robotics and Automation, Field Robots, Localization
Abstract: To achieve longer driving distances, planetary robotics missions require accurate localization to counteract position uncertainty. Freedom and precision in driving allows scientists to reach and study sites of interest. Typically, rover global localization has been performed manually by humans, which is accurate but time-consuming as data is relayed between planets. This paper describes a global localization algorithm that is run onboard the Perseverance Mars rover. Our approach matches rover images to orbital maps using a modified census transform to achieve sub-meter accurate, near-human localization performance on a real dataset of 264 Mars rover panoramas. The proposed solution has also been successfully executed on the Perseverance Mars Rover, demonstrating the practicality of our approach.
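The plain census transform underlying the abstract's modified variant can be sketched as follows (a generic illustration, not the authors' implementation): each pixel is replaced by a bit signature of intensity comparisons with its neighbours, and patches are matched by Hamming distance. Because the signatures depend only on the local intensity ordering, they survive the brightness and contrast differences between rover and orbital imagery.

```python
def census_transform(img, w, h):
    """3x3 census transform of a grayscale image given as a flat
    row-major list. Each interior pixel becomes an 8-bit signature:
    bit i is set if the i-th neighbour is darker than the centre."""
    offsets = [(-1, -1), (0, -1), (1, -1), (-1, 0),
               (1, 0), (-1, 1), (0, 1), (1, 1)]
    out = {}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = img[y * w + x]
            sig = 0
            for bit, (dx, dy) in enumerate(offsets):
                if img[(y + dy) * w + (x + dx)] < centre:
                    sig |= 1 << bit
            out[(x, y)] = sig
    return out

def hamming_cost(a, b):
    """Matching cost between two census signatures: number of
    differing bits (lower means more similar local structure)."""
    return bin(a ^ b).count("1")
```

A global brightness shift leaves every signature unchanged, so `census_transform(img, w, h)` and `census_transform([p + 50 for p in img], w, h)` produce identical outputs; dense matching then slides one census image over the other and minimizes the summed Hamming cost.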
|
|
13:30-15:00, Paper WeBA1-CC.2 | Add to My Program |
Learning to Walk in Confined Spaces Using 3D Representation |
|
Miki, Takahiro | ETH Zurich |
Lee, Joonho | ETH Zurich |
Wellhausen, Lorenz | ETH Zürich |
Hutter, Marco | ETH Zurich |
Keywords: Legged Robots, Robotics in Hazardous Fields, Reinforcement Learning
Abstract: Legged robots have the potential to traverse complex terrain and access confined spaces beyond the reach of traditional platforms thanks to their ability to carefully select footholds and flexibly adapt their body posture while walking. However, robust deployment in real-world applications is still an open challenge. In this paper, we present a method for legged locomotion control using reinforcement learning and 3D volumetric representations to enable robust and versatile locomotion in confined and unstructured environments. By employing a two-layer hierarchical policy structure, we exploit the capabilities of a highly robust low-level policy to follow 6D commands and a high-level policy to enable three-dimensional spatial awareness for navigating under overhanging obstacles. Our study includes the development of a procedural terrain generator to create diverse training environments. We present a series of experimental evaluations in both simulation and real-world settings, demonstrating the effectiveness of our approach in controlling a quadruped robot in confined, rough terrain. By achieving this, our work extends the applicability of legged robots to a broader range of scenarios.
|
|
13:30-15:00, Paper WeBA1-CC.3 | Add to My Program |
Efficient and Accurate Transformer-Based 3D Shape Completion and Reconstruction of Fruits for Agricultural Robots |
|
Magistri, Federico | University of Bonn |
Marcuzzi, Rodrigo | University of Bonn |
Marks, Elias Ariel | University of Bonn |
Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
Behley, Jens | University of Bonn |
Stachniss, Cyrill | University of Bonn |
Keywords: Robotics and Automation in Agriculture and Forestry, RGB-D Perception
Abstract: Robots that operate in agricultural environments need a robust perception system that can deal with occlusions, which are naturally present in agricultural scenarios. In this paper, we address the problem of estimating 3D shapes of fruits when only partial observations are available. Generally speaking, such a shape completion can be realized by exploiting prior knowledge about the geometry of the fruit. This is typically done by template matching using traditional optimization algorithms, which are slow but accurate, or by encoding such knowledge into the weights of a neural network, leading to faster but often less accurate estimates. Our approach combines the best of both worlds. It exploits the benefit of having a template representing our object of interest with the advantages of using a neural network to learn how to deform a template. Our experimental evaluation demonstrates that our approach yields accurate estimation at a competitively low inference time in challenging greenhouse environments.
|
|
13:30-15:00, Paper WeBA1-CC.4 | Add to My Program |
CoPAL: Corrective Planning of Robot Actions with Large Language Models |
|
Joublin, Frank | Honda Research Institute Europe |
Ceravola, Antonello | Honda Research Institute Europe GmbH |
Smirnov, Pavel | Honda Research Institute Europe |
Ocker, Felix | Honda |
Deigmoeller, Joerg | Honda Research Institute Europe GmbH |
Belardinelli, Anna | Honda Research Institute Europe |
Wang, Chao | Honda Research Institute Europe GmbH |
Hasler, Stephan | Honda Research Institute Europe |
Tanneberg, Daniel | Honda Research Institute |
Gienger, Michael | Honda Research Institute Europe |
Keywords: AI-Enabled Robotics, Software Architecture for Robotic and Automation, Task and Motion Planning
Abstract: In the pursuit of fully autonomous robotic systems capable of taking over tasks traditionally performed by humans, the complexity of open-world environments poses a considerable challenge. Addressing this imperative, this study contributes to the field of Large Language Models (LLMs) applied to task and motion planning for robots. We propose a system architecture that orchestrates a seamless interplay between multiple cognitive levels, encompassing reasoning, planning, and motion generation. At its core lies a novel replanning strategy that handles physically grounded, logical, and semantic errors in the generated plans. We demonstrate the efficacy of the proposed feedback architecture, particularly its impact on executability, correctness, and time complexity via empirical evaluation in the context of a simulation and two intricate real-world scenarios: blocks world, barman and pizza preparation.
|
|
13:30-15:00, Paper WeBA1-CC.5 | Add to My Program |
CalliRewrite: Recovering Handwriting Behaviors from Calligraphy Images without Supervision |
|
Luo, Yuxuan | Peking University |
Wu, Zekun | Peking University |
Lian, Zhouhui | Peking University |
Keywords: Art and Entertainment Robotics, AI-Enabled Robotics
Abstract: Human-like planning skills and dexterous manipulation have long posed challenges in the fields of robotics and artificial intelligence (AI). The task of reinterpreting calligraphy presents a formidable challenge, as it involves the decomposition of strokes and dexterous utensil control. Previous efforts have primarily focused on supervised learning of a single instrument, limiting the performance of robots in the realm of cross-domain text replication. To address these challenges, we propose CalliRewrite: a coarse-to-fine approach for robot arms to discover and recover plausible writing orders from diverse calligraphy images without requiring labeled demonstrations. Our model achieves fine-grained control of various writing utensils. Specifically, an unsupervised image-to-sequence model decomposes a given calligraphy glyph to obtain a coarse stroke sequence. Using an RL algorithm, a simulated brush is fine-tuned to generate stylized trajectories for robotic arm control. Evaluation in simulation and physical robot scenarios reveals that our method successfully replicates unseen fonts and styles while achieving integrity in unknown characters. To access our code and supplementary materials, please visit our project page: https://luoprojectpage.github.io/callirewrite/.
|
|
WeBA2-CC Award Session, CC-301 |
Add to My Program |
Unmanned Aerial Vehicles |
|
|
Chair: Scaramuzza, Davide | University of Zurich |
Co-Chair: Schoellig, Angela P. | TU Munich |
|
13:30-15:00, Paper WeBA2-CC.1 | Add to My Program |
Co-Design Optimisation of Morphing Topology and Control of Winged Drones |
|
Bergonti, Fabio | Istituto Italiano Di Tecnologia |
Nava, Gabriele | Istituto Italiano Di Tecnologia |
Wüest, Valentin | EPFL |
Paolino, Antonello | Istituto Italiano Di Tecnologia |
L'Erario, Giuseppe | Istituto Italiano Di Tecnologia |
Pucci, Daniele | Italian Institute of Technology |
Floreano, Dario | Ecole Polytechnique Fédérale De Lausanne (EPFL) |
Keywords: Aerial Systems: Mechanics and Control, Methods and Tools for Robot System Design, Optimization and Optimal Control
Abstract: The design and control of winged aircraft and drones is an iterative process aimed at identifying a compromise among mission-specific costs and constraints. When agility is required, shape-shifting (morphing) drones represent an efficient solution. However, morphing drones require additional actuated joints that increase the coupling between topology and control, making the design process more complex. We propose a co-design optimisation method that assists engineers by proposing a morphing drone's conceptual design, including topology, actuation, morphing strategy, and controller parameters. The method applies multi-objective constraint-based optimisation to a multi-body winged drone, with trajectory optimisation solving the motion intelligence problem under diverse flight mission requirements, such as energy consumption and mission completion time. We show that co-designed morphing drones outperform fixed-wing drones in terms of energy efficiency and mission time, suggesting that the proposed co-design method could be a useful addition to the aircraft engineering toolbox.
|
|
13:30-15:00, Paper WeBA2-CC.2 | Add to My Program |
FC-Planner: A Skeleton-Guided Planning Framework for Fast Aerial Coverage of Complex 3D Scenes |
|
Feng, Chen | Hong Kong University of Science and Technology |
Li, Haojia | The Hong Kong University of Science and Technology |
Zhang, Mingjie | Northwestern Polytechnical University |
Chen, Xinyi | The Hong Kong University of Science and Technology |
Zhou, Boyu | Sun Yat-Sen University |
Shen, Shaojie | Hong Kong University of Science and Technology |
Keywords: Aerial Systems: Perception and Autonomy, Motion and Path Planning, Aerial Systems: Applications
Abstract: 3D coverage path planning for UAVs is a crucial problem in diverse practical applications. However, existing methods have shown unsatisfactory system simplicity, computation efficiency, and path quality in large and complex scenes. To address these challenges, we propose FC-Planner, a skeleton-guided planning framework that can achieve fast aerial coverage of complex 3D scenes without pre-processing. We decompose the scene into several simple subspaces by a skeleton-based space decomposition (SSD). Additionally, the skeleton guides us to effortlessly determine free space. We utilize the skeleton to efficiently generate a minimal set of specialized and informative viewpoints for complete coverage. Based on SSD, a hierarchical planner effectively divides the large planning problem into independent sub-problems, enabling parallel planning for each subspace. The carefully designed global and local planning strategies are then incorporated to guarantee both high quality and efficiency in path generation. We conduct extensive benchmark and real-world tests, where FC-Planner computes over 10 times faster compared to state-of-the-art methods with shorter paths and more complete coverage. The source code will be made publicly available to benefit the community. Project page: https://hkust-aerial-robotics.github.io/FC-Planner.
|
|
13:30-15:00, Paper WeBA2-CC.3 | Add to My Program |
Time-Optimal Gate-Traversing Planner for Autonomous Drone Racing |
|
Qin, Chao | University of Toronto |
Michet, Maxime Simon Joseph | University of Toronto |
Chen, Jingxiang | University of Toronto |
Liu, Hugh H.-T. | University of Toronto |
Keywords: Aerial Systems: Applications, Motion and Path Planning, Art and Entertainment Robotics
Abstract: In drone racing, the time-minimum trajectory is affected by the drone's capabilities, the layout of the race track, and the configurations of the gates (e.g., their shapes and sizes). However, previous studies neglect the configuration of the gates, simply rendering drone racing a waypoint-passing task. This formulation often leads to a conservative choice of paths through the gates, as the spatial potential of the gates is not fully utilized. To address this issue, we present a time-optimal planner that can faithfully model gate constraints with various configurations and thereby generate the most time-efficient trajectory while respecting single-rotor-thrust limits. Our approach excels in computational efficiency, taking only a few seconds to compute the full state and control trajectories of the drone through tracks with dozens of different gates. Extensive simulations and experiments confirm the effectiveness of the proposed methodology, showing that the lap time can be further reduced by taking the gates' configurations into account. We validate our planner in real-world flights and demonstrate highly aggressive flight trajectories through race tracks.
|
|
13:30-15:00, Paper WeBA2-CC.4 | Add to My Program |
Sequential Trajectory Optimization for Externally-Actuated Modular Manipulators with Joint Locking |
|
Choe, Jaeu | Seoul National University |
Lee, Jeongseob | Seoul National University |
Yang, Hyunsoo | Seoul National University |
Nguyen, Hai-Nguyen (Hann) | CNRS |
Lee, Dongjun | Seoul National University |
Keywords: Aerial Systems: Applications, Aerial Systems: Mechanics and Control
Abstract: In this paper, we present a novel trajectory planning method for externally-actuated modular manipulators (EAMMs), consisting of multiple rotor-actuated links with joints that can be either locked or unlocked. This joint-locking feature allows effective balancing of the payload capacity and dexterity of the robot but significantly complicates the planning problem by introducing binary decision variables. To address this challenge, we leverage the problem's intrinsic structure, i.e., that the payload at the end-effector is enhanced by merely locking its immediately connected links; this allows us to break the complex planning problem down into a series of manageable subproblems and solve them sequentially. Our approach significantly reduces the problem's complexity: in a serial n-link EAMM with m joint-lock mechanisms, where there could potentially be 2^m distinct configurational dynamics, we need to solve only n+1 trajectory optimization problems for single-rigid-body dynamics sequentially, rendering the problem tractable. We substantiate the efficacy of our method through various simulation and experimental studies, covering ground-free and ground-bound configurations as well as both motion-only and manipulation tasks.
|
|
13:30-15:00, Paper WeBA2-CC.5 | Add to My Program |
Spatial Assisted Human-Drone Collaborative Navigation and Interaction through Immersive Mixed Reality |
|
Morando, Luca | New York University |
Loianno, Giuseppe | New York University |
Keywords: Aerial Systems: Applications
Abstract: Aerial robots have the potential to play a crucial role in assisting humans with complex and dangerous tasks. Nevertheless, the future industry demands innovative solutions to streamline the interaction process between humans and drones to enable seamless collaboration and efficient co-working. In this paper, we present a novel tele-immersive framework that facilitates cognitive and physical collaboration between humans and robots through Mixed Reality (MR). This includes a novel bi-directional spatial awareness approach and a multi-modal virtual-physical interaction approach. The former seamlessly integrates the physical and virtual worlds, providing bidirectional egocentric and exocentric environment representations. The latter, leveraging the proposed spatial representation, further enhances the collaboration by combining a robot planning algorithm for obstacle avoidance with variable admittance control. This enables the user to generate commands based on virtual forces while ensuring compatibility with the environment map. We validate the proposed approach by conducting several collaborative planning and exploration tasks involving a drone and a user equipped with an MR headset.
|
|
13:30-15:00, Paper WeBA2-CC.6 | Add to My Program |
A Trajectory-Based Flight Assistive System for Novice Pilots in Drone Racing Scenario |
|
Zhong, Yuhang | Zhejiang University |
Zhao, Guangyu | Zhejiang University |
Wang, Qianhao | Zhejiang University |
Xu, Guangtong | Zhejiang University |
Xu, Chao | Zhejiang University |
Gao, Fei | Zhejiang University |
Keywords: Human Factors and Human-in-the-Loop, Telerobotics and Teleoperation, Art and Entertainment Robotics
Abstract: Drone racing has become a popular international competition and has attracted wide attention in recent years. However, the requirement of high-level piloting skill keeps novice pilots from participating in it. This paper presents a trajectory-based flight assistive system that enables various operators to fly a drone through a racing scene at high speed. The whole system is structured hierarchically, consisting of offline and online components. In the offline part, a global time-optimal trajectory is generated as the expert reference, and a dense flight corridor is constructed to provide a sufficiently large safe region. In the online part, a remote-control-mapped primitive is designed to quickly encapsulate the pilot's inputs, and a time-mapping-based trajectory progress is customized to further capture pilot intention. A trajectory planner then periodically generates intention-aligned, smooth, feasible, and safe trajectories. Additionally, a yaw planner that provides the pilot with the most suitable view angle is employed to further reduce operating difficulty. Simulations and real-world experiments verify the performance of our system: a novice drone pilot reached a maximum velocity of 6.0 m/s in a real racing scene. We will open-source our code later.
|
|
WeBT1-CC Oral Session, CC-303 |
Add to My Program |
Motion and Path Planning II |
|
|
Chair: Yong, Sze Zheng | Northeastern University |
Co-Chair: Liu, Sicong | Southern University of Science and Technology |
|
13:30-15:00, Paper WeBT1-CC.1 | Add to My Program |
RBI-RRT*: Efficient Sampling-Based Path Planning for High-Dimensional State Space |
|
Chen, Fang | Southern University of Science and Technology |
Zheng, Yu | Tencent |
Wang, Zheng | Southern University of Science and Technology |
Chi, Wanchao | Tencent |
Liu, Sicong | Southern University of Science and Technology |
Keywords: Motion and Path Planning
Abstract: Sampling-based planning algorithms such as RRT have proven efficient at solving path planning problems for robotic systems. Various improvements to the RRT algorithm, such as Informed RRT*, have been presented to improve the extension and convergence of the random trees. However, as the number of spatial dimensions grows, the time spent randomly sampling the entire state space and incrementally rewiring the random trees rises drastically before a feasible solution is found. In this paper, to enhance convergence to optimal solutions, we present the Reconstructed Bi-directional Informed RRT* (RBI-RRT*) path planning algorithm. The algorithm acts like RRT-Connect to rapidly find a feasible solution, which helps compress the sampling space as Informed RRT* does. After the random trees are transformed into an RRT* structure by the reconstruction process in RBI-RRT*, the algorithm continues searching for a near-optimal path. A series of simulations and real-world robot experiments were conducted to evaluate the algorithm against existing planners. Compared to Informed RRT*-Connect, RBI-RRT* reduced the computation time needed to reach a specified cost by 22.1% on average in simulations and by 11.2% in real-world robotic arm experiments. The results show that RBI-RRT* is more efficient in high-dimensional planning problems.
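For readers unfamiliar with the informed-sampling idea that RBI-RRT* builds on, the ellipsoidal sampling-space compression of Informed RRT* can be sketched as follows. This is an editorial illustration in NumPy, not code from the paper: once a solution of cost c_best is known, further samples need only come from the prolate hyperspheroid whose foci are the start and goal.

```python
import numpy as np

def sample_informed(start, goal, c_best, rng=None):
    """Uniformly sample the prolate hyperspheroid (informed subset) whose
    foci are start/goal and whose transverse diameter is c_best, as used
    by Informed RRT*-style planners."""
    rng = np.random.default_rng() if rng is None else rng
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    n = start.size
    c_min = np.linalg.norm(goal - start)      # theoretical minimum path cost
    center = (start + goal) / 2.0
    # Rotation aligning the first axis with the start->goal direction
    # (the standard SVD-based construction).
    a1 = (goal - start) / c_min
    U, _, Vt = np.linalg.svd(np.outer(a1, np.eye(n)[0]))
    C = U @ np.diag([1.0] * (n - 1) + [np.linalg.det(U) * np.linalg.det(Vt)]) @ Vt
    # Semi-axis lengths of the ellipsoid.
    r = np.full(n, np.sqrt(max(c_best**2 - c_min**2, 0.0)) / 2.0)
    r[0] = c_best / 2.0
    # Uniform sample in the unit n-ball, then warp into the ellipsoid.
    x = rng.normal(size=n)
    x = x / np.linalg.norm(x) * rng.random() ** (1.0 / n)
    return center + C @ (r * x)
```

Every returned point satisfies dist(p, start) + dist(p, goal) <= c_best, so samples that cannot improve the current solution are never drawn; as c_best shrinks, so does the sampled region.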
|
|
13:30-15:00, Paper WeBT1-CC.2 | Add to My Program |
Quasi-Static Path Planning for Continuum Robots by Sampling on Implicit Manifold |
|
Wang, Yifan | Georgia Institute of Technology |
Chen, Yue | Georgia Institute of Technology |
Keywords: Motion and Path Planning, Flexible Robotics
Abstract: Continuum robots (CRs) offer excellent dexterity and compliance in contrast to rigid-link robots, making them suitable for navigating through, and interacting with, confined environments. However, the study of path planning for CRs while considering external elastic contact is limited. The challenge lies in the fact that CRs can have multiple possible configurations when in contact, rendering the forward kinematics not well-defined, and characterizing the set of feasible robot configurations is non-trivial. In this paper, we propose to perform quasi-static path planning on an implicit manifold. We model elastic obstacles as external potential fields and formulate the robot statics in the potential field as the extremal trajectory of an optimal control problem. We show that the set of stable robot configurations is a smooth manifold diffeomorphic to a submanifold embedded in the product space of the CR actuation and base internal wrench. We then propose to perform path planning on this manifold using AtlasRRT*, a sampling-based planner dedicated to planning on implicit manifolds. Simulations in different operation scenarios were conducted and the results show that the proposed planner outperforms Euclidean space planners in terms of success rate and computational efficiency.
|
|
13:30-15:00, Paper WeBT1-CC.3 | Add to My Program |
Reconfiguration of a 2D Structure Using Spatio-Temporal Planning and Load Transferring |
|
Garcia Gonzalez, Javier | University of Houston |
Yannuzzi, Michael | University of Houston |
Kramer, Peter | TU Braunschweig |
Rieck, Christian | Technische Universität Braunschweig |
Fekete, Sándor | Technische Universität Braunschweig |
Becker, Aaron | University of Houston |
Keywords: Building Automation, Motion and Path Planning, Swarm Robotics
Abstract: We present progress on the problem of reconfiguring a 2D arrangement of building material using a cooperative group of robots. These robots must avoid collisions and deadlocks, and are subject to the constraint of maintaining connectivity of the structure. We develop two reconfiguration methods, one based on spatio-temporal planning and one based on target swapping, to increase building efficiency. The first method can significantly reduce planning times compared to other multi-robot planners. The second method helps reduce the time robots spend waiting for paths to be cleared and the overall distance traveled by the robots.
|
|
13:30-15:00, Paper WeBT1-CC.4 | Add to My Program |
Neural Informed RRT*: Learning-Based Path Planning with Point Cloud State Representations under Admissible Ellipsoidal Constraints |
|
Huang, Zhe | University of Illinois at Urbana-Champaign |
Chen, Hongyu | University of Illinois at Urbana-Champaign |
Pohovey, John | University of Illinois Urbana-Champaign |
Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Keywords: Motion and Path Planning, AI-Based Methods
Abstract: Sampling-based planning algorithms like Rapidly-exploring Random Tree (RRT) are versatile in solving path planning problems. RRT* offers asymptotic optimality but requires growing the tree uniformly over the free space, which leaves room for efficiency improvement. To accelerate convergence, rule-based informed approaches sample states in an admissible ellipsoidal subset of the space determined by the current path cost. Learning-based alternatives model the topology of the free space and infer the states close to the optimal path to guide planning. We propose Neural Informed RRT* to combine the strengths of both sides. We define point cloud representations of free states. We perform Neural Focus, which constrains the point cloud to the admissible ellipsoidal subset from Informed RRT* and feeds it into PointNet++ for refined guidance-state inference. In addition, we introduce Neural Connect to build connectivity of the guidance state set and further boost performance in challenging planning problems. Our method surpasses previous works in path planning benchmarks while preserving probabilistic completeness and asymptotic optimality. We deploy our method on a mobile robot and demonstrate real-world navigation around static obstacles and dynamic humans. Code is available at https://github.com/tedhuang96/nirrt_star.
|
|
13:30-15:00, Paper WeBT1-CC.5 | Add to My Program |
Motions in Microseconds Via Vectorized Sampling-Based Planning |
|
Thomason, Wil | Rice University |
Kingston, Zachary | Rice University |
Kavraki, Lydia | Rice University |
Keywords: Motion and Path Planning
Abstract: Modern sampling-based motion planning algorithms typically take from hundreds of milliseconds to dozens of seconds to find collision-free motions for high degree-of-freedom problems. This paper presents performance improvements of more than 500x over the state of the art, bringing planning times into the range of microseconds and solution rates into the range of kilohertz, without specialized hardware. Our key insight is how to exploit fine-grained parallelism within planning, providing generality-preserving algorithmic improvements to any such planner and significantly accelerating critical subroutines, such as forward kinematics and collision checking. We demonstrate our approach on a diverse set of challenging, realistic problems for complex robots ranging from 7 to 14 degrees of freedom. Moreover, we show that our approach does not require high-power hardware by also evaluating on a low-power single-board computer. The planning speeds demonstrated are fast enough to reside in the range of control frequencies and open up new avenues of motion planning research.
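The fine-grained parallelism described above relies on hardware vectorization; as a rough editorial stand-in, the same batch-evaluation idea can be illustrated with vectorized collision checking of an entire motion edge. The point-robot/sphere-obstacle model below is a toy assumption for illustration, not the paper's implementation:

```python
import numpy as np

def edge_in_collision(q_a, q_b, obstacles, radii, robot_radius=0.1, n_states=64):
    """Check an entire motion edge for collision in one vectorized pass.

    Rather than testing interpolated states one at a time, all states are
    evaluated simultaneously with array operations -- a NumPy stand-in for
    the fine-grained (SIMD) parallelism the paper exploits.  Point robot
    with spherical obstacles; a toy model for illustration only.
    """
    t = np.linspace(0.0, 1.0, n_states)[:, None]       # (n_states, 1)
    states = (1.0 - t) * q_a + t * q_b                 # (n_states, dim)
    # Pairwise distances from every state to every obstacle center.
    diff = states[:, None, :] - obstacles[None, :, :]  # (n_states, n_obs, dim)
    dist = np.linalg.norm(diff, axis=-1)               # (n_states, n_obs)
    return bool(np.any(dist < radii[None, :] + robot_radius))
```

The entire edge is validated with three array operations instead of a per-state loop, which is the kind of batched subroutine that lets planning throughput reach control-loop rates.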
|
|
13:30-15:00, Paper WeBT1-CC.6 | Add to My Program |
Gathering Data from Risky Situations with Pareto-Optimal Trajectories |
|
Brodt, Brennan | Boston University |
Pierson, Alyssa | Boston University |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Autonomous Agents
Abstract: This paper proposes a formulation for the risk-aware path planning problem which utilizes multi-objective optimization to dynamically plan trajectories that satisfy multiple complex mission specifications. In the setting of persistent monitoring, we develop a method for representing environmental information and risk in a way that allows for local sampling to generate Pareto-dominant solutions over a receding horizon. We propose two algorithms capable of solving these problems: a dense sampling approach and an improved method utilizing noisy gradient descent. Simulation results demonstrate the efficacy of our methods at persistently gathering information while avoiding risk, robust to randomly-generated environments.
|
|
13:30-15:00, Paper WeBT1-CC.7 | Add to My Program |
RETRO: Reactive Trajectory Optimization for Real-Time Robot Motion Planning in Dynamic Environments |
|
Dastider, Apan | University of Central Florida |
Fang, Hao | University of Central Florida |
Mingjie, Lin | University of Central Florida |
Keywords: Motion and Path Planning, Optimization and Optimal Control, Planning under Uncertainty
Abstract: Reactive trajectory optimization for robotics presents formidable challenges, demanding the rapid generation of purposeful robot motion in complex and swiftly changing dynamic environments. While much existing research predominantly addresses robotic motion planning with predefined objectives, emerging problems in robotic trajectory optimization frequently involve dynamically evolving objectives and stochastic motion dynamics. However, effectively addressing such reactive trajectory optimization challenges for robot manipulators proves difficult due to inefficient, high-dimensional trajectory representations and a lack of consideration for time optimization. In response, we introduce a novel trajectory optimization framework called RETRO. RETRO employs adaptive optimization techniques that span both spatial and temporal dimensions. As a result, it achieves a remarkable computational complexity of O(T^2.4) + O(Tn^2), a significant improvement over the naive application of DDP, which leads to a complexity of O(n^4) when reasonable time step sizes are used. To evaluate RETRO's performance in terms of error, we conducted a comprehensive analysis of its regret bounds, comparing it to an Oracle value function obtained through an Oracle trajectory optimization algorithm. Our analytical findings demonstrate that RETRO's total regret can be upper-bounded by a function of the chosen time step size. Moreover, our approach delivers smoothly optimized robot trajectories in joint space, offering flexibility and adaptability for various tasks. It seamlessly integrates task-specific requirements such as collision avoidance while maintaining real-time control rates. We validate the effectiveness of our framework through extensive simulations and real-world robot experiments in closed-loop manipulation scenarios. For further details and supplementary materials, please visit: https://sites.google.com/view/retro-optimal-control/home
|
|
13:30-15:00, Paper WeBT1-CC.8 | Add to My Program |
WiTHy A*: Winding-Constrained Motion Planning for Tethered Robot Using Hybrid A* |
|
Chipade, Vishnu S. | University of Michigan |
Kumar, Rahul | Northeastern University |
Yong, Sze Zheng | Northeastern University |
Keywords: Motion and Path Planning, Constrained Motion Planning, Nonholonomic Motion Planning
Abstract: In this paper, a variant of hybrid A* is developed to find the shortest path for a curvature-constrained robot that is tethered at its start position, such that the tether satisfies user-defined winding angle constraints. A variant of tangent graphs is used as the underlying graph for the A* search, in order to reduce the overall computation, and appropriate cost metrics are defined to ensure the winding angle constraints are satisfied. Conditions are provided under which the proposed algorithm is guaranteed to find a winding-angle-constrained path. The effectiveness and performance of the proposed algorithm are studied in simulation.
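The winding angle that the constraints bound is the signed angle a path accumulates around the tether's anchor point. A minimal editorial sketch of that quantity (a toy helper, not the paper's implementation) is:

```python
import math

def winding_angle(path, anchor):
    """Accumulated signed angle (radians) that a polyline path winds
    around an anchor point -- the quantity winding constraints bound.
    Toy illustration, not the paper's implementation."""
    ax, ay = anchor
    total = 0.0
    prev = math.atan2(path[0][1] - ay, path[0][0] - ax)
    for x, y in path[1:]:
        cur = math.atan2(y - ay, x - ax)
        d = cur - prev
        # Unwrap the increment to (-pi, pi] so each segment contributes
        # its true signed rotation about the anchor.
        while d <= -math.pi:
            d += 2 * math.pi
        while d > math.pi:
            d -= 2 * math.pi
        total += d
        prev = cur
    return total
```

A closed counter-clockwise loop around the anchor accumulates +2π, a clockwise loop -2π, and a path that never encircles the anchor sums to a value strictly between; a planner can therefore reject candidate paths whose accumulated angle leaves a user-specified interval.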
|
|
13:30-15:00, Paper WeBT1-CC.9 | Add to My Program |
Differentiable Boustrophedon Paths That Enable Optimization Via Gradient Descent |
|
Manzini, Thomas | Texas A&M |
Murphy, Robin | Texas A&M |
Keywords: Motion and Path Planning, Optimization and Optimal Control
Abstract: This paper introduces a differentiable representation for the optimization of boustrophedon path plans in convex polygons, explores an additional parameter of these path plans that can be optimized, discusses the properties of this representation that can be leveraged during optimization, and shows that the previously published attempt at optimizing these path plans was too coarse to be practically useful. Experiments show that this differentiable representation reproduces scores from traditional discrete representations of boustrophedon path plans with high fidelity. Finally, optimization via gradient descent was attempted but found to fail because the search space is far more non-convex than previously considered in the literature. The wide range of applications for boustrophedon path plans means that this work has the potential to improve path planning efficiency in numerous areas of robotics, including mapping and search tasks using uncrewed aerial systems, environmental sampling tasks using uncrewed marine vehicles, and agricultural tasks using ground vehicles, among numerous other applications.
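As a point of reference for the path plans being optimized, a plain discrete boustrophedon (back-and-forth) sweep of a rectangle can be generated as follows. This is an editorial sketch of the traditional representation; the paper's contribution is a differentiable generalization of such plans, which this snippet does not implement:

```python
import numpy as np

def boustrophedon_path(width, height, spacing):
    """Waypoints for a back-and-forth (boustrophedon) sweep of an
    axis-aligned width x height rectangle with the given row spacing.
    Assumes height is a multiple of spacing so the last row is covered."""
    ys = np.arange(0.0, height + 1e-9, spacing)
    pts = []
    for i, y in enumerate(ys):
        if i % 2 == 0:                       # left-to-right row
            pts.append((0.0, y))
            pts.append((width, y))
        else:                                # right-to-left row
            pts.append((width, y))
            pts.append((0.0, y))
    return pts

def path_length(pts):
    """Total Euclidean length of a waypoint polyline (a typical score
    that a discrete plan representation is evaluated on)."""
    p = np.asarray(pts)
    return float(np.sum(np.linalg.norm(np.diff(p, axis=0), axis=1)))
```

Scores such as `path_length` are piecewise functions of discrete waypoints, which is precisely why a differentiable reparameterization is needed before gradient descent can be applied.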
|
|
WeBT2-CC Oral Session, CC-311 |
Add to My Program |
Robot Design |
|
|
Chair: Bonev, Ilian | École De Technologie Supérieure |
Co-Chair: Yeshmukhametov, Azamat | Nazarbayev University |
|
13:30-15:00, Paper WeBT2-CC.1 | Add to My Program |
Torsion-Induced Compliant Joints and Its Application to Flat-Foldable and Self-Assembling Robotic Arm |
|
Yang, Dong-Wook | Korea Advanced Institute of Science and Technology (KAIST) |
Park, Hyun-Su | KAIST |
Jang, Keon-Ik | Korea Advanced Institute of Science and Technology |
Han, Jae-Hung | KAIST |
Lee, Dae-Young | Korea Advanced Institute of Science and Technology |
Keywords: Compliant Joints and Mechanisms, Soft Robot Applications, Soft Robot Materials and Design
Abstract: The joint design of origami-inspired robots is one of the most distinctive features distinguishing them from conventional robots. A joint design that exploits a material's compliance enables origami robots to implement complex transformational movements in a lightweight and simple manner. However, utilizing the continuum bending mode of materials brings critical problems, including undesired movements and a large joint radius. This study addresses these problems with a torsion-based compliant joint (T-C joint) design, which utilizes the torsional deformation of materials. The potential of the T-C joint is demonstrated in a flat-foldable and self-assembling robotic arm, showing its applicability in environments with form-factor limitations and minimal human intervention. The robotic arm, comprising links, joints, and a gripper, can fold into a flat state, deploy with precision and minimal weight, and effectively manipulate target objects. This demonstration shows the real-world applicability of the proposed joint design.
|
|
13:30-15:00, Paper WeBT2-CC.2 | Add to My Program |
OriTrack: A Small, 3 Degree of Freedom, Origami Solar Tracker |
|
Winston, Crystal | Stanford University |
Casey, Leo | Google, Inc |
Keywords: Energy and Environment-Aware Automation, Soft Robot Applications, Soft Robot Materials and Design
Abstract: In response to the need for sustainable energy solutions, solar panels have gained significant traction. One way to increase the energy capture of solar systems is solar tracking: reorienting solar panels throughout the day so that they face the sun. The increase in energy capture that comes with solar tracking often far outweighs the energy required to move the panel, which makes it a compelling strategy for improving solar systems. Unfortunately, while solar trackers are commonly used in large solar farms, they are rarely used on rooftops, where solar panels are commonly installed. This is for two primary reasons: (1) most commercially available solar trackers are too large to be installed on roofs, and (2) even if traditional solar trackers were made in a more compact form factor, it would be difficult to lay them out densely on a roof without the trackers substantially shading each other. To address these issues, we introduce OriTrack, a small three-degree-of-freedom (3-DOF) solar tracker that reduces the area of its shadow by reducing its height as it tracks the sun. In this paper we discuss the design, manufacturing, and control of OriTrack. We then compare OriTrack to a flat reference panel, the solar energy solution commonly used on roofs today, and find that OriTrack produces 23% more energy. This result suggests OriTrack could be a future solution for solar tracking on rooftops.
|
|
13:30-15:00, Paper WeBT2-CC.3 | Add to My Program |
Reinforcement Learning for Freeform Robot Design |
|
Li, Muhan | Northwestern University |
Matthews, David | Northwestern University |
Kriegman, Sam | Northwestern University |
Keywords: Evolutionary Robotics
Abstract: Inspired by the necessity of morphological adaptation in animals, a growing body of work has attempted to expand robot training to encompass physical aspects of a robot's design. However, reinforcement learning methods capable of optimizing the 3D morphology of a robot have been restricted to reorienting or resizing the limbs of a predetermined and static topological genus. Here we show policy gradients for designing freeform robots with arbitrary external and internal structure. This is achieved through actions that deposit or remove bundles of atomic building blocks to form higher-level nonparametric macrostructures such as appendages, organs, and cavities. Although results are provided for open-loop control only, we discuss how this method could be adapted for closed-loop control and sim2real transfer to physical machines in future work.
|
|
13:30-15:00, Paper WeBT2-CC.4 | Add to My Program |
A Helical Bistable Soft Gripper Enabled by Pneumatic Actuation |
|
Yin, Xuanchun | South China Agricultural University |
Xie, Junliang | South China Agricultural University |
Zhou, Pengyu | South China Agricultural University |
Wen, Sheng | South China Agricultural University |
Zhang, Jiantao | South China Agricultural University |
Keywords: Grippers and Other End-Effectors, Biologically-Inspired Robots, Soft Robot Applications
Abstract: In nature, there are many instances of helical mechanisms used to efficiently grasp objects of various shapes and sizes. Inspired by helical grasping in nature, we propose a helical bistable soft gripper with high load capacity and energy-saving operation. An off-the-shelf bistable steel shell (BSS) serving as the stiff element is inserted into a 3D-printed soft helical exoskeleton to coil around and hold objects without energy consumption. Two air pouches act as the actuator to control the transition between the two stable states. To facilitate gripper design, a simplified model of the gripper was developed, and the geometric parameters of the gripper are tabulated for reference. The transition pressures between the two stable states were experimentally characterized. Moreover, we conducted experiments to demonstrate the capability of the gripper in two working modes. The gripper exhibits coiling diameters ranging from 40 mm to 60 mm and successfully attaches to various slender objects of different geometries with a maximum holding force of 92.67 N (up to 135.1 times its own weight) in hanging mode. Finally, the gripper was integrated onto a robot arm and successfully grasped different objects, with a maximum grasping weight of 221.6 g in grasping mode.
|
|
13:30-15:00, Paper WeBT2-CC.5 | Add to My Program |
Singularity Analysis of Kinova's Link 6 Robot Arm Via Grassmann Line Geometry |
|
Asgari, Milad | École De Technologie Supérieure |
Bonev, Ilian | École De Technologie Supérieure |
Gosselin, Clement | Université Laval |
Keywords: Kinematics, Mechanism Design, Actuation and Joint Mechanisms
Abstract: Unlike parallel robots, for which hundreds of different architectures have been proposed, the vast majority of six-degree-of-freedom (DOF) serial robots have one of two simple architectures. In both architectures, the inverse kinematics can be solved in closed form and the singularities described by trivial geometric and algebraic conditions. These conditions can be readily obtained by analyzing the determinant of the robot's Jacobian matrix, and provide an in-depth understanding of the robot's singularities, which is essential for its optimal use. However, for various reasons, robot arms with unorthodox architectures are occasionally designed. Such arms do not have closed-form inverse kinematics and little insight into their singularities can be gained by analyzing the determinant of their Jacobian. One such robot arm for which the conventional singularity analysis approach fails is the new Link 6 collaborative robot by Kinova. In this paper, we study the complex singularities of Link 6 by investigating all possibilities for screw dependencies, deriving a simple equation for each case, and then describing each singularity type using Grassmann line geometry. Twelve different singularity configurations are identified and described with seven relatively simple geometric conditions. Our approach is general and can be applied to other robot arms.
|
|
13:30-15:00, Paper WeBT2-CC.6 | Add to My Program |
Design and Testing of a Multi-Module, Tetherless, Soft Robotic Eel |
|
Hall, Robin | Worcester Polytechnic Institute |
Espinosa, Gabriel | Worcester Polytechnic Institute |
Chiang, Shou-Shan | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Marine Robotics, Soft Robot Materials and Design, Biologically-Inspired Robots
Abstract: This paper presents a free-swimming, tetherless, cable-driven modular soft robotic fish. The body comprises a series of 3D-printed wave spring structures that create a flexible biologically inspired shape that is capable of an anguilliform swimming gait. A three-module soft robotic fish was designed, fabricated, and evaluated. The motion of the robot was characterized, and different combinations of actuation amplitude, frequency, and phase shift were tested experimentally to determine the optimal parameters that maximized speed and minimized the cost of transport (COT). The maximum speed recorded was 0.20 BL/s (body lengths per second) with a COT of 15.82. These results were compared against other robotic and biological fish. We operated the robot, untethered, in a variety of environments to test how it was able to function outside of laboratory settings.
|
|
13:30-15:00, Paper WeBT2-CC.7 | Add to My Program |
Untethered Underwater Soft Robot with Thrust Vectoring |
|
Hall, Robin | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Marine Robotics, Soft Robot Materials and Design, Soft Robot Applications
Abstract: This paper introduces DRAGON: Deformable Robot for Agile Guided Observation and Navigation, a free-swimming deformable impeller-powered vectored underwater vehicle (VUV). A 3D-printed wave spring structure directs the water drawn through the center of the robot by an impeller, enabling it to move smoothly in different directions. The robot is designed to have a narrow cylindrical profile to lower drag and improve agility. It has a maximum recorded speed of 2.1 BL/s (body lengths per second) and a minimum cost of transport (COT) of 2.9. The robot has two degrees of freedom (DoFs) and is capable of performing a variety of maneuvers including a full circle with a radius of 0.23 m (1.4 BL) and a figure eight, which it completed in 4.98 s (72.3 degree/s) and 10.74 s respectively. We operated the robot, untethered, in various environments to test the robustness of the design and analyze its motion and performance.
|
|
13:30-15:00, Paper WeBT2-CC.8 | Add to My Program |
A Backdrivable Axisymmetric Kinematically Redundant (6+3)-Degree-Of-Freedom Hybrid Parallel Manipulator |
|
Kim, Jehyeok | Université Laval |
Gosselin, Clement | Université Laval |
Keywords: Redundant Robots, Mechanism Design, Physical Human-Robot Interaction
Abstract: A kinematically redundant (6+3)-degree-of-freedom (DOF) hybrid parallel robot with an axisymmetric workspace is proposed. By arranging the first revolute joint of each leg such that they have the same rotation axis, this robot can achieve an axisymmetric workspace, resulting in a large reachable workspace. In addition, type II singularities, which critically limit the orientational workspace, can be fully avoided by utilizing kinematic redundancy. A gripper mechanism is developed to increase the orientational workspace by exploiting the redundant DOFs. Moreover, the orientational workspace can be further increased by introducing a redundant DOF with a constant angle. As a result, the proposed hybrid parallel robot achieves a high workspace-to-footprint ratio comparable to that of serial robots. A CAD model of the robot and computer animations are provided to demonstrate the large workspaces and the gripper mechanism. A significant advantage of the proposed robot over serial architectures is that the robot is backdrivable since it uses direct-drive or quasi-direct-drive actuators.
|
|
13:30-15:00, Paper WeBT2-CC.9 | Add to My Program |
Design of a Fully Pulley-Guided Wire-Driven Prismatic Tensegrity Robot: Friction Impact to Robot Payload Capacity |
|
Yeshmukhametov, Azamat | Nazarbayev University |
Koganezawa, Koichi | Tokai University |
Keywords: Redundant Robots, Tendon/Wire Mechanism, Mechanism Design
Abstract: The tensegrity structure was initially created as a static structure, but it has gained significant attention among robotics researchers due to its benefits, including high payload capability, shock resistance, and resiliency. However, implementing tensegrity structures in robotics presents new technical challenges, primarily related to their wire-driven structure, such as wire-routing and wire-friction problems. Therefore, this research letter proposes a technical solution for the aforementioned problems. The main contribution of this research is the design of frictionless pulley-guided nodes. To validate the proposed concept, we conducted comparative experiments between a common tensegrity prototype and a pulley-guided prototype, evaluating wire tension distribution and payload capacity.
|
|
WeBT3-CC Oral Session, CC-313 |
Add to My Program |
Kinematics and Dynamics |
|
|
Chair: Yi, Jingang | Rutgers University |
Co-Chair: Lau, Darwin | The Chinese University of Hong Kong |
|
13:30-15:00, Paper WeBT3-CC.1 | Add to My Program |
Motion Planning and Inertia Based Control for Impact Aware Manipulation |
|
Khurana, Harshit | EPFL |
Billard, Aude | EPFL |
Keywords: Impact Aware Manipulation, Motion Control of Manipulators, Motion and Path Planning, Factory Automation
Abstract: In this paper, we propose a metric called hitting flux, used in motion generation and control for a robot manipulator that interacts with the environment through a hitting or striking motion. Given the task of placing a known object outside of the workspace of the robot, the robot needs to come into contact with it at a non-zero relative speed. The configuration of the robot and the speed at contact matter because they affect the motion of the object. The hitting flux depends on the robot's configuration, the robot's speed, and the properties of the environment. An approach to achieve the desired directional pre-impact flux for the robot through a combination of a dynamical system (DS) for motion generation and a control system that regulates the directional inertia of the robot is presented. Furthermore, a Quadratic Program (QP) formulation for achieving a desired inertia matrix at a desired position while following a motion plan constrained to the robot's limits is presented. The system is tested for different scenarios in simulation, showing the repeatability of the procedure, and in real scenarios with a KUKA LBR iiwa 7 robot.
|
|
13:30-15:00, Paper WeBT3-CC.2 | Add to My Program |
RASCAL: A Scalable, High-Redundancy Robot for Automated Storage and Retrieval Systems |
|
Black, Richard | Microsoft |
Caballero, Marco | Microsoft Research |
Chatzieleftheriou, Andromachi | Microsoft |
Deegan, Tim | Microsoft Research |
Heard, Philip | Microsoft Research, Cambridge, UK |
Hong, Freddie | Microsoft Research |
Joyce, Russell | Microsoft Research |
Legtchenko, Sergey | Microsoft |
Rowstron, Antony | Microsoft Research |
Smith, Adam | Microsoft |
Sweeney, David | Microsoft Research |
Williams, Hugh | Microsoft |
Keywords: Industrial Robots, Climbing Robots, Mechanism Design
Abstract: Automated storage and retrieval systems (ASRS) are a key component of the modern storage industry, and are used in a wide range of applications, carrying anything from lightweight tape cartridges to entire pallets of goods. Many of these systems are under pressure to maximise the use of space by growing in height and density, but this can create challenges for the robots that service them. In this context, we present RASCAL, a novel ASRS robot for small payload items in structured environments, with a focus on system-level scalability and redundancy. We describe the design objectives of RASCAL and how they address some of the limitations of existing robotic systems in this area, such as scalability and redundancy. We then demonstrate the viability of our design with a proof-of-concept implementation of a data centre storage media robot, and show through a series of experiments that its design, speed, accuracy, and energy efficiency are appropriate for this application.
|
|
13:30-15:00, Paper WeBT3-CC.3 | Add to My Program |
Virtual Passive-Joint Space Based Time-Optimal Trajectory Planning for a 4-DOF Parallel Manipulator |
|
Zhao, Jie | Chinese Academy of Sciences |
Yang, Guilin | Ningbo Institute of Material Technology and Engineering, Chines |
Shi, Haoyu | University of Nottingham, Ningbo China |
Chen, Silu | Ningbo Institute of Materials Technology and Engineering, CAS |
Chen, Chin-Yin | Ningbo Institute of Material Technology and Engineering, CAS |
Zhang, Chi | Ningbo Institute of Material Technology and Engineering, CAS |
Keywords: Parallel Robots, Kinematics, Motion and Path Planning
Abstract: The 4-DOF (3T1R) 4PPa-2PaR parallel manipulator is developed for high-speed pick-and-place operations. However, conventional trajectory planning methods in either the active-joint space or Cartesian space have shortcomings due to its highly nonlinear kinematics. Owing to its unique four-to-two leg structure, the middle link that connects the two proximal parallelogram four-bar linkages on each side only generates 2-DOF translational motions in a vertical plane. By treating each middle link as a 2-DOF virtual passive joint, a new trajectory planning method in the 4-DOF virtual passive-joint space is proposed, which not only simplifies the kinematic analysis but also reduces the kinematic nonlinearity. With the virtual passive joints, both displacement and velocity analyses are readily carried out. The Lagrangian method is employed to formulate the closed-form dynamic model. A quintic B-spline is utilized to generate trajectories in the virtual passive-joint space, while a Genetic Algorithm is implemented to search for the time-optimal trajectory. The simulation results indicate that the optimal time planned in the virtual passive-joint space is decreased by 2.8% and 8.1% compared with the active-joint space and Cartesian space methods, respectively. The average and peak jerks of the moving platform are decreased by 14.6% and 37.6% compared with the active-joint space method.
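To illustrate the flavor of time-optimal planning under kinematic limits, here is a much-simplified stand-in: a single-DOF quintic time scaling (zero boundary velocity and acceleration) sized to respect velocity and acceleration bounds. The paper itself optimizes quintic B-spline trajectories with a Genetic Algorithm in the passive-joint space; this sketch only shows the underlying limit calculation:

```python
import math

def min_time_quintic(delta, v_max, a_max):
    # Quintic time scaling s(t) = delta*(10u^3 - 15u^4 + 6u^5), u = t/T,
    # with zero boundary velocity and acceleration.
    # Peak speed:        1.875 * delta / T      (at mid-move)
    # Peak acceleration: (10/sqrt(3)) * delta / T^2
    t_v = 1.875 * delta / v_max                      # T from the velocity limit
    t_a = math.sqrt((10.0 / math.sqrt(3.0)) * delta / a_max)  # T from the accel limit
    return max(t_v, t_a)                             # the binding constraint wins
```

A GA-based planner, roughly speaking, searches over knot placements of the B-spline while a check like this keeps every candidate within the joint limits.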
|
|
13:30-15:00, Paper WeBT3-CC.4 | Add to My Program |
Direct Kinematic Singularities and Stability Analysis of Sagging Cable-Driven Parallel Robots |
|
Briot, Sébastien | LS2N |
Merlet, Jean-Pierre | INRIA |
Keywords: Parallel Robots, Kinematics, Stability, Industrial Robots
Abstract: Sagging cable-driven parallel robots (CDPRs) are often modelled using Irvine's model. We show that their configurations may be unstable and, moreover, that assessing the stability of the robot with Irvine's model cannot be done by checking the spectrum of a stiffness matrix associated with the platform motions. In the present paper, we show that the static configurations of sagging CDPRs are local extrema of the functional describing the robot's potential energy. For assessing the stability, it is then necessary to check two conditions: the Legendre-Clebsch and the Jacobi conditions, both well known in optimal control theory. We also (i) prove that there is a link between some singularities of CDPRs and the limits of stability and (ii) show that singularities of the platform wrench system are not singularities of the geometric model of sagging CDPRs, contrary to what happens in rigid-link parallel robotics. The stability predictions are validated in simulation by cross-validation against a lumped model, for which the stability can be assessed by analyzing the spectrum of a reduced Hessian matrix of the potential energy.
|
|
13:30-15:00, Paper WeBT3-CC.5 | Add to My Program |
Towards Solving Cable-Driven Parallel Robot Inaccuracy Due to Cable Elasticity |
|
Suarez Roos, Adolfo | IRT Jules Verne |
Zake, Zane | IRT Jules Verne |
Rasheed, Tahir | IRCCyN - ECN - IRT JV |
Pedemonte, Nicolo | IRT Jules Verne |
Caro, Stéphane | CNRS/LS2N |
Keywords: Parallel Robots, Tendon/Wire Mechanism, Kinematics
Abstract: Cable elasticity can significantly impact the accuracy of Cable-Driven Parallel Robots (CDPRs). However, it is frequently disregarded as negligible in CDPR simulations and designs. In this paper, we propose a numerical approach, referred to as SEECR, designed to estimate the behavior of a CDPR featuring elastic cables while ensuring the Static Equilibrium (SE) of the Moving-Platform (MP). By modeling the cables as elastic springs, the proposed approach correctly predicts which cables become slack, estimates the tension distribution among cables, and computes unwanted MP motions, making it possible to predict the impact of design choices. The results have been validated experimentally on two cable types and configurations.
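The core modeling idea — cables as unilateral springs that can pull but not push — can be sketched in a few lines (an illustrative simplification; SEECR's full static-equilibrium solve is not reproduced here):

```python
def cable_tension(k, rest_length, length):
    # Unilateral elastic element: a cable pulls when stretched and
    # carries zero tension when slack (length below rest length).
    # k: stiffness [N/m]; lengths in metres; returns tension in newtons.
    return max(0.0, k * (length - rest_length))
```

Evaluating this for every cable at a candidate platform pose is what lets an approach of this kind predict which cables go slack and how the tension redistributes among the remaining ones.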
|
|
13:30-15:00, Paper WeBT3-CC.6 | Add to My Program |
Wrench and Twist Capability Analysis for Cable-Driven Parallel Robots with Consideration of the Actuator Torque-Speed Relationship |
|
Chan, Ngo Foon | The Chinese University of Hong Kong |
Lam, Wai Yi | The Chinese University of Hong Kong |
Lau, Darwin | The Chinese University of Hong Kong |
Keywords: Parallel Robots, Tendon/Wire Mechanism, Manipulation Planning, Wrench-twist Feasibility
Abstract: The wrench and twist feasibility are the workspace conditions that indicate whether the mobile-platform (MP) of the cable-driven parallel robots (CDPRs) can provide a sufficient amount of wrench and twist. Traditionally, these two quantities are evaluated independently from the actuator's torque and speed limits, which are assumed to be fixed in the literature, but they are indeed coupled. This results in a conservative usage of the actuator capability and hence hinders the robot's actual feasibility. In this study, new approaches to analyzing and commanding CDPRs by considering the coupling effect are proposed. First, the required wrench of the MP is mapped into the twist space by the motors' torque-speed relationship and becomes the wrench-dependent available twist set. Then a new workspace condition and a new metric are introduced based on the available twist set. The metric shows the maximum allowable MP speed map of the workspace. Finally, a varying speed trajectory is designed based on the metric to optimize the total MP traveling time. This study shows the potential of robot wrench-twist capability and enhances the robot hardware effectiveness without any ha
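The coupling the abstract describes can be illustrated with the classic linear torque-speed line of a DC motor: the torque available shrinks as speed grows, so the feasible twist depends on the required wrench (an illustrative motor model, not the paper's formulation):

```python
def max_speed_at_torque(tau_req, tau_stall, w_noload):
    # Linear torque-speed line: tau_avail(w) = tau_stall * (1 - w / w_noload).
    # Returns the highest speed at which the motor can still deliver
    # tau_req -- the coupling that fixed, independent torque and speed
    # limits ignore, leading to conservative wrench/twist workspaces.
    if tau_req > tau_stall:
        return 0.0  # required torque infeasible even at standstill
    return w_noload * (1.0 - tau_req / tau_stall)
```

Mapping the required cable tensions through a relation like this, per motor, yields a wrench-dependent available twist set rather than a single fixed speed limit.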
|
|
13:30-15:00, Paper WeBT3-CC.7 | Add to My Program |
RicMonk: A Three-Link Brachiation Robot with Passive Grippers for Energy-Efficient Brachiation |
|
Grama Srinivas Shourie, Grama Srinivas Shourie | Deutsches Forschungszentrum Für Künstliche Intelligenz, Bremen |
Javadi, Mahdi | German Research Center for Artificial Intelligence Robotics Inn |
Kumar, Shivesh | DFKI GmbH |
Zamani Boroujeni, Hossein | DFKI-Robotics Innovation Center |
Kirchner, Frank | University of Bremen |
Keywords: Underactuated Robots, Biologically-Inspired Robots, Education Robotics
Abstract: This paper presents the design, analysis, and performance evaluation of RicMonk, a novel three-link brachiation robot equipped with passive hook-shaped grippers. Brachiation, an agile and energy-efficient mode of locomotion observed in primates, has inspired the development of RicMonk to explore versatile locomotion and maneuvers on ladder-like structures. The robot’s anatomical resemblance to gibbons and the integration of a tail mechanism for energy injection contribute to its unique capabilities. The paper discusses the use of the Direct Collocation methodology for optimizing trajectories for the robot’s dynamic behaviors and stabilization of these trajectories using a Time-varying Linear Quadratic Regulator. With RicMonk we demonstrate bidirectional brachiation, and provide comparative analysis with its predecessor, AcroMonk - a two-link brachiation robot, to demonstrate that the presence of a passive tail helps improve energy efficiency. The system design, controllers, and software implementation are publicly available on GitHub at https://github.com/dfki-ric-underactuated-lab/ricmonk and the video demonstration of the experiments can be viewed at https://youtu.be/hOuDQI7CD8w.
|
|
13:30-15:00, Paper WeBT3-CC.8 | Add to My Program |
Gaussian Process-Enhanced, External and Internal Convertible Form-Based Control of Underactuated Balance Robots |
|
Han, Feng | Rutgers University |
Yi, Jingang | Rutgers University |
Keywords: Underactuated Robots, Dynamics, Machine Learning for Robot Control
Abstract: External and internal convertible (EIC) form-based motion control (i.e., EIC-based control) is one of the effective approaches for underactuated balance robots. Through sequential controller design, trajectory tracking of the actuated subsystem and balance of the unactuated subsystem can be achieved simultaneously. However, under certain conditions, uncontrolled robot motion exists under EIC-based control. We first identify these conditions and then propose an enhanced EIC-based control with a Gaussian process (GP) data-driven robot dynamic model. Under the new enhanced EIC-based control, the stability and performance of the closed-loop system are guaranteed. We demonstrate the GP-enhanced control experimentally using two examples of underactuated balance robots.
|
|
WeBT4-CC Oral Session, CC-315 |
Add to My Program |
Multi-Robot Systems V |
|
|
Chair: Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Co-Chair: Garcia de Marina, Hector | Universidad De Granada |
|
13:30-15:00, Paper WeBT4-CC.1 | Add to My Program |
Automation and Artificial Intelligence Technology in Surface Mining: State of the Art, Challenges and Opportunities |
|
Leung, Raymond | The University of Sydney |
Hill, Andrew John | University of Sydney |
Melkumyan, Arman | The University of Sydney |
Keywords: Mining Robotics, Planning, Scheduling and Coordination, Probability and Statistical Methods
Abstract: This survey article provides a synopsis on some of the engineering problems, technological innovations, robotic development and automation efforts encountered in the mining industry---particularly in the Pilbara iron-ore region of Western Australia. The goal is to paint the technology landscape and highlight issues relevant to an engineering audience to raise awareness of AI and automation trends in mining. It assumes the reader has no prior knowledge of mining and builds context gradually through focused discussion and short summaries of common open-pit mining operations. The principal activities that take place may be categorized in terms of resource development, mine-, rail- and port operations. From mineral exploration to ore shipment, there are roughly nine steps in between. These include: geological assessment, mine planning and development, production drilling and assaying, blasting and excavation, transportation of ore and waste, crush and screen, stockpile and load-out, rail network distribution, and ore-car dumping. The objective is to describe these processes and provide insights on some of the challenges/opportunities from the perspective of a decade-long industry-university R&D partnership.
|
|
13:30-15:00, Paper WeBT4-CC.2 | Add to My Program |
Hierarchical Traffic Management of Multi-AGV Systems with Deadlock Prevention Applied to Industrial Environments (I) |
|
Pratissoli, Federico | Università Degli Studi Di Modena E Reggio Emilia |
Brugioni, Riccardo | RSEngineering Srl |
Battilani, Nicola | University of Modena and Reggio Emilia |
Sabattini, Lorenzo | University of Modena and Reggio Emilia |
Keywords: Multi-Robot Systems, Factory Automation, Path Planning for Multiple Mobile Robots or Agents
Abstract: This paper concerns the coordination and traffic management of a group of Automated Guided Vehicles (AGVs) moving in a real industrial scenario, such as an automated factory or warehouse. The proposed methodology is based on a three-layer control architecture, described as follows: 1) the Top Layer (or Topological Layer) models the traffic of vehicles among the different areas of the environment; 2) the Middle Layer allows the path planner to compute a traffic-sensitive path for each vehicle; 3) the Bottom Layer (or Roadmap Layer) defines the final routes to be followed by each vehicle and coordinates the AGVs over time. In the paper we describe the proposed coordination strategy, which is executed once the routes are computed and aims to prevent congestion, collisions, and deadlocks. The coordination algorithm exploits a novel deadlock prevention approach based on time-expanded graphs. Moreover, the presented control architecture aims at grounding theoretical methods in an industrial application by facing the typical practical issues such as graph difficulties (load/unload locations, weak connections), a predefined roadmap (constrained by the plant layout), vehicle errors, dynamic obstacles, etc. In this paper we propose a flexible and robust methodology for multi-AGV traffic-aware management. Moreover, we propose a coordination algorithm, which does not rely on ad hoc assumptions or rules, to prevent collisions and deadlocks and to deal
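A minimal sketch of planning on a time-expanded graph, the structure underlying the deadlock-prevention approach: each state is a (node, time) pair, a vehicle may wait or move each step, and cells reserved by other vehicles are avoided (illustrative only; the paper's reservation bookkeeping is more elaborate):

```python
from collections import deque

def plan_route(graph, start, goal, horizon, reserved):
    # BFS over the time-expanded graph. `graph` maps node -> neighbours,
    # `reserved` is a set of (node, t) pairs already claimed by other
    # AGVs; avoiding them prevents collisions (and, with suitable
    # bookkeeping, deadlocks). Returns one node per time step.
    frontier = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while frontier:
        node, t, path = frontier.popleft()
        if node == goal:
            return path
        if t == horizon:
            continue
        for nxt in [node] + graph.get(node, []):  # wait in place, or move
            state = (nxt, t + 1)
            if state in reserved or state in seen:
                continue
            seen.add(state)
            frontier.append((nxt, t + 1, path + [nxt]))
    return None  # no conflict-free route within the horizon
```

With `('B', 1)` reserved by another vehicle, a route from A to C on the chain A-B-C waits one step at A before moving, instead of colliding at B.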
|
|
13:30-15:00, Paper WeBT4-CC.3 | Add to My Program |
Task Allocation in Heterogeneous Multi-Robot Systems Based on Preference-Driven Hedonic Game |
|
Zhang, Liwang | National University of Defense Technology |
Li, Minglong | National University of Defense Technology |
Yang, Wenjing | State Key Laboratory of High Performance Computing (HPCL), Schoo |
Yang, Shaowu | National University of Defense Technology |
Keywords: Multi-Robot Systems, Search and Rescue Robots, Cooperating Robots
Abstract: Multiple preferences between robots and tasks have been largely overlooked in previous research on Multi-Robot Task Allocation (MRTA) problems. In this paper, we propose a preference-driven approach based on a hedonic game to address the task allocation problem of multi-robot systems in emergency rescue scenarios. We present a distributed framework considering various preferences between robots and tasks to determine the division of coalitions in such problems and evaluate the scalability and adaptability of our algorithm through relevant experiments. Furthermore, considering the strict communication limitations in emergency rescue scenarios, we have verified that our algorithm can efficiently converge to a Nash-stable coalition partition even in conditions of insufficient communication distance.
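The hedonic-game mechanism can be sketched as best-response dynamics: each robot repeatedly deviates to the coalition it prefers until no robot benefits from moving, i.e. a Nash-stable partition. The `utility` argument below is a hypothetical placeholder for the paper's preference model, and convergence of this naive centralized loop is only guaranteed for well-behaved preferences:

```python
def nash_stable_partition(robots, tasks, utility):
    # utility(robot, task, other_members) scores how much `robot` likes
    # joining `task`'s coalition given who else is in it (placeholder
    # for the paper's preference-driven model).
    assign = {r: tasks[0] for r in robots}  # start with everyone on task 0
    changed = True
    while changed:  # loop until no robot wants to deviate
        changed = False
        for r in robots:
            def u(t):
                others = [x for x in robots if x != r and assign[x] == t]
                return utility(r, t, others)
            best = max(tasks, key=u)
            if u(best) > u(assign[r]):  # strictly profitable deviation
                assign[r] = best
                changed = True
    return assign  # Nash-stable: no robot gains by switching alone
```

The distributed version in the paper would run such deviations locally under communication constraints rather than in one global loop.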
|
|
13:30-15:00, Paper WeBT4-CC.4 | Add to My Program |
Persistent Monitoring of Multiple Moving Targets Using High Order Control Barrier Functions |
|
Balandi, Lorenzo | Centre Inria De l'Université De Rennes |
De Carli, Nicola | CNRS |
Robuffo Giordano, Paolo | Irisa Cnrs Umr6074 |
Keywords: Multi-Robot Systems, Sensor Networks, Cooperating Robots
Abstract: This paper considers the problem of persistently monitoring a set of moving targets using a team of aerial vehicles. Each agent in the network is assumed to be equipped with a camera with limited range and Field of View (FoV) providing bearing measurements, and it implements an Information Consensus Filter (ICF) to estimate the state of the target(s). The ICF can be proven to be uniformly globally exponentially stable under a Persistency of Excitation (PE) condition. We then propose a distributed control scheme that maintains a prescribed minimum PE level so as to ensure filter convergence. At the same time, the agents in the group are also allowed to perform additional tasks of interest while maintaining collective observability of the target(s). In order to enforce satisfaction of the observability constraint, we leverage two main tools: (i) the weighted Observability Gramian with a forgetting factor as a measure of the cumulative acquired information, and (ii) High Order Control Barrier Functions (HOCBF) as a means to maintain a minimum level of observability for the targets. Simulation results are reported to show the effectiveness of this approach.
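Tool (i), the weighted Observability Gramian with a forgetting factor, can be sketched for a 2D target state with bearing-like measurement directions (illustrative; the paper's Gramian is defined over the full estimator dynamics):

```python
import math

def update_gramian(W, h, lam=0.95):
    # Weighted Gramian with forgetting factor lam: W <- lam*W + h h^T,
    # for a single 2D measurement direction h (e.g. one bearing).
    # W is symmetric 2x2, stored as ((a, b), (b, c)).
    (a, b), (_, c) = W
    na = lam * a + h[0] * h[0]
    nb = lam * b + h[0] * h[1]
    nc = lam * c + h[1] * h[1]
    return ((na, nb), (nb, nc))

def min_eig_2x2(W):
    # Smallest eigenvalue of W: the excitation level an HOCBF-style
    # constraint would keep above a threshold to preserve observability.
    (a, b), (_, c) = W
    tr, det = a + c, a * c - b * b
    return tr / 2.0 - math.sqrt(max(tr * tr / 4.0 - det, 0.0))
```

Measuring along a single direction leaves the smallest eigenvalue at zero (one state direction unobserved); only once the measurement directions span the plane does the minimum eigenvalue, and hence observability, become positive.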
|
|
13:30-15:00, Paper WeBT4-CC.5 | Add to My Program |
EM-Patroller: Entropy Maximized Multi-Robot Patrolling with Steady State Distribution Approximation |
|
Guo, Hongliang | Agency for Science Technology and Research |
Kang, Qi | National University of Singapore |
Yau, Wei-Yun | I2R |
Ang Jr, Marcelo H | National University of Singapore |
Rus, Daniela | MIT |
Keywords: Multi-Robot Systems, Surveillance Robotic Systems
Abstract: This paper investigates the multi-robot patrolling (MuRP) problem in a discrete environment with the objective of approaching the uniform node coverage probability distri | |