Last updated on March 1, 2026. This conference program is tentative and subject to change.
Technical Program for Wednesday March 11, 2026
| WedFA |
| Interactive Session & Demos 1 & Coffee |
| Interactive |
| 14:05-15:25, Paper WedFA.1 |
| Opportunities in Learning Physically Consistent Dynamics for Floating-Base Systems |
| Schulze, Lucas | Technische Universität Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
| Arenz, Oleg | TU Darmstadt |
Keywords: Legged Robots, Model Learning for Control, RIG Cluster: Legged Locomotion
Abstract: Grey-box methods for system identification integrate deep learning with physics-based constraints, capturing complex dependencies while improving out-of-distribution generalization. However, despite the increasing relevance of floating-base systems like humanoids and quadrupeds, existing grey-box models fail to account for their unique physical constraints. In this work, we revisit Floating-Base Deep Lagrangian Networks (FeLaN), discussing further research directions, applications, and open challenges.

| 14:05-15:25, Paper WedFA.2 |
| Swarms of Chain-Like Vehicles |
| Schönnagel, Adrian | Otto-von-Guericke-University Magdeburg & Fraunhofer Institute for Transportation and Infrastructure Systems |
| Dubé, Michael | Otto-von-Guericke-University |
| Steup, Christoph | Fraunhofer Institute for Transportation and Infrastructure Systems |
| Keppler, Felix | Fraunhofer Institute for Transportation and Infrastructure Systems |
| Mostaghim, Sanaz | Faculty of Computer Science, University of Magdeburg |

| 14:05-15:25, Paper WedFA.3 |
| PLID: Probabilistic Latent Intention Dynamics for Interpretable Egocentric Action Prediction |
| Gaus, Johannes Albert | Hertie Center of Neurology, Hertie Institute for Clinical Brain Research, NeuRoMech Group, University of Tuebingen |
| Haeufle, Daniel Florian Benedict | University of Tübingen |

| 14:05-15:25, Paper WedFA.4 |
| Multi-Objective Photoreal Simulation (MOPS) Dataset for Computer Vision in Robot Manipulation |
| Li, Maximilian Xiling | Karlsruhe Institute of Technology |
| Mattes, Paul | Karlsruhe Institute of Technology |
| Blank, Nils | KIT |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: Data Sets for Robotic Vision, Visual Learning, Imitation Learning
Abstract: We introduce the Multi-Object Photoreal Simulation (MOPS) dataset, providing comprehensive ground-truth annotations for robotic manipulation scenes in photorealistic simulation. MOPS employs an LLM-based pipeline to automatically normalize 3D object scales and generate part-level affordances. The dataset features pixel-level segmentations for fine-grained part segmentation and affordance prediction (e.g., "graspable" or "pushable"). MOPS generates diverse scenes to accelerate progress in robot perception and manipulation. Code and dataset are available at https://intuitive-robots.github.io/mops/

| 14:05-15:25, Paper WedFA.5 |
| SIR: Structured Image Representations for Explainable Robot Learning |
| Mattes, Paul | Karlsruhe Institute of Technology |
| Li, Maximilian Xiling | Karlsruhe Institute of Technology |
| Blank, Nils | KIT |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: Representation Learning, Imitation Learning, Deep Learning Methods
Abstract: Existing robot policies based on learned visual embeddings lack explicit structure and are sensitive to visual distractions. As a result, the representations that drive their behaviour are often opaque, making their decision-making process difficult to interpret. To address this, we introduce Structured Image Representations (SIR), a method that leverages scene graphs as an intermediate representation for robot policy learning. Evaluations on RoboCasa show that our sparse graph policies outperform image-based baselines, with an average success rate of 19.5% versus 14.81%. We show that the learned sparse graphs are a powerful tool for model analysis.

| 14:05-15:25, Paper WedFA.6 |
| MagBotSim: Physics-Based Simulation and Reinforcement Learning Environments for Magnetic Robotics |
| Bergmann, Lara | Bielefeld University |
| Grothues, Cedric | Universität Bielefeld |
| Neumann, Klaus | Bielefeld University / Fraunhofer IOSB-INA |
Keywords: RIG Cluster: Multi-Robot Systems, RIG TC: Reconfigurable Robotics, RIG TC: Swarm Robotics
Abstract: Magnetic levitation is about to revolutionize in-machine material flow in industrial automation. Such systems are flexibly configurable and can include a large number of independently actuated shuttles (movers) that dynamically rebalance production capacity. Beyond their capabilities for dynamic transportation, these systems possess the inherent yet unexploited potential to perform manipulation. By merging the fields of transportation and manipulation into a coordinated swarm of magnetic robots (MagBots), we enable manufacturing systems to achieve significantly higher efficiency, adaptability, and compactness. To support the development of intelligent algorithms for magnetic levitation systems, we introduce MagBotSim (Magnetic Robotics Simulation): a physics-based simulation for magnetic levitation systems. By framing magnetic levitation systems as robot swarms and providing a dedicated simulation, this work lays the foundation for next generation manufacturing systems powered by Magnetic Robotics. MagBotSim’s documentation, videos, experiments, and code are available at: https://ubi-coro.github.io/MagBotSim/

| 14:05-15:25, Paper WedFA.7 |
| A Human-Inspired Design of a Robotic Cell for Composite Manufacturing Using Knowledge Representation |
| Lennartz, Moritz | Institut Für Textiltechnik of RWTH Aachen University |
| Thelen, Martin | Institut Für Textiltechnik of RWTH Aachen University |
| Zähl, Konstantin | Institut Für Unternehmenskybernetik |
| Henke, Christoph | RWTH-Aachen University |
| Gries, Thomas | Institut Für Textiltechnik of RWTH Aachen University |
Keywords: Intelligent and Flexible Manufacturing, Manipulation Planning, Grippers and Other End-Effectors
Abstract: Due to the complex dynamics and high-dimensional configuration space of the fabrics to be handled, as well as the high quality requirements for the resulting component properties, the manufacture of fiber-reinforced composites remains a manual, labor-intensive industrial task that requires years of experience. This paper presents a knowledge-engineering-driven approach to designing a component-adaptive robotic cell for the automated manufacturing of fiber-reinforced composite components. In a study, we examine the manual operations of domain experts and formalize them in a domain knowledge model, which we then translate into function-based hardware requirements. Based on this, we develop a fully automated and component-adaptive robot cell for the flexible manufacture of diverse and geometrically complex composite parts.

| 14:05-15:25, Paper WedFA.8 |
| Uncertainty-Guided Continual Adaptation for Traversability Prediction |
| Lee, Hojin | Ulsan National Institute of Science and Technology |
| Lee, Yunho | Department of Mechanical Engineering, Ulsan National Institute of Science and Technology (UNIST) |
| Duecker, Daniel Andre | Technical University of Munich (TUM) |
| Kwon, Cheolhyeon | Ulsan National Institute of Science and Technology |

| 14:05-15:25, Paper WedFA.9 |
| Robotic Ultrasound for 3D Bone Reconstruction in the Knee: Preliminary Results |
| Phlippen, Lovis | RWTH Aachen University |
| Brößner, Peter | Chair of Medical Engineering, RWTH Aachen University |
| Radermacher, Klaus | RWTH Aachen University |
Keywords: Medical Robots and Systems, Task and Motion Planning, Object Detection, Segmentation and Categorization
Abstract: Total Knee Arthroplasty (TKA) is a frequently performed surgery in which patient-specific planning and implants may improve surgical outcomes. For this purpose, 3D bone models are required, which are typically obtained from computed tomography (CT). Ultrasound offers a radiation-free and more cost-effective alternative. In this work, we present a fully automatic bone reconstruction pipeline based on robotic ultrasound, incorporating machine-learning-based image segmentation. Preliminary tests show feasibility. Further studies are necessary to thoroughly assess its accuracy.

| 14:05-15:25, Paper WedFA.10 |
| Where Did I Leave My Glasses? Open-Vocabulary Semantic Exploration in Real-World Semi-Static Environments |
| Bogenberger, Benjamin | Technical University of Munich (TUM) |
| Harrison, Oliver | University of Toronto |
| Dahanaggamaarachchi, Dinushka Orrin | University of Toronto |
| Brunke, Lukas | Technical University of Munich |
| Qian, Jingxing | University of Toronto |
| Zhou, Siqi | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, Semantic Scene Understanding, Vision-Based Navigation
Abstract: Robots deployed in real-world environments, such as homes, must not only navigate safely but also understand their surroundings and adapt to changes in the environment. Existing research on semantic exploration largely focuses on static scenes without persistent object-level instance tracking. In this work, we propose an open-vocabulary, semantic exploration system for semi-static environments. Our system maintains a consistent map by building a probabilistic model of object instance stationarity, systematically tracking semi-static changes, and actively exploring areas that have not been visited for an extended period. In addition to active map maintenance, our approach leverages the map’s semantic richness with large language model (LLM)-based reasoning for open-vocabulary object-goal navigation. Evaluated against state-of-the-art baselines on publicly available object navigation and mapping datasets, our method outperforms the baselines both in success rate on object-goal navigation tasks and in handling scene changes during mapping. A video of full (including real-world) experimental results can be found at https://tiny.cc/sem-explor-semi-static and on our website https://utiasdsl.github.io/semi-static-semantic-exploration/.

| 14:05-15:25, Paper WedFA.11 |
| Remote Experience Center: An Experimental Platform for Remote Teleoperation Over Configurable 5G Networks |
| Yang, Dong | Technical University of Munich |
| Janes, Adam | Czech Technical University in Prague |
| Danek, Jan | Czech Technical University in Prague |
| Xu, Xiao | Technical University of Munich |
| Becvar, Zdenek | Czech Technical University in Prague, Faculty of Electrical Engineering |
| Steinbach, Eckehard | Technical University of Munich |

| 14:05-15:25, Paper WedFA.12 |
| SVN-ICP: Uncertainty Estimation of ICP-Based LiDAR Odometry Using Stein Variational Newton – an Extended Abstract |
| Ma, Shiping | Technische Universität Berlin |
| Zhang, Haoming | Technical University of Munich |
| Toussaint, Marc | TU Berlin |
Keywords: SLAM, Probabilistic Inference, Sensor Fusion
Abstract: We introduce SVN-ICP, a novel ICP algorithm with uncertainty estimation based on Stein Variational Newton on manifold. By approximating the posterior with particles, SVN-ICP avoids explicit noise modeling and manual tuning. Integrated into a simple error-state Kalman filter with an IMU, it is evaluated across diverse datasets and robot platforms. Results show superior performance in challenging scenarios while providing reliable uncertainty estimates. Code and a video are publicly available.

| 14:05-15:25, Paper WedFA.13 |
| Toward a Safety-Oriented Strategy for Intentional Patient-Robot Interaction in Telediagnostics |
| Kolb, Sven | Technical University of Munich |
| Zhang, Yueyang | Technical University of Munich |
| Wilhelm, Dirk | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Safety in HRI, Medical Robots and Systems, RIG Cluster: Rigorous Perception
Abstract: The expansion of robotic systems into complex, unstructured and yet safety-critical medical environments poses many challenges. Physical Human-Robot Interaction (pHRI) scenarios between patients and medical robotic systems require new approaches to ensure a patient's physical and perceived safety. In this work, we discuss a safety-oriented strategy for intentional patient-robot interaction in telediagnostic applications. Our approach is based on an adaptive impedance controller that monitors and reacts to external forces and, in the future, will also incorporate higher-level scene information.

| 14:05-15:25, Paper WedFA.14 |
| Towards QoE Oriented Teleoperation Based on Variable Impedance Control and Imitation Learning |
| Wang, Zican | Technical University of Munich |
| Xu, Xiao | Technical University of Munich |
| Yang, Dong | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Human-Centered Robotics, RIG TC: Tactile Robotics, Robust/Adaptive Control
Abstract: Human–robot interaction (HRI) in teleoperation scenarios has gained increasing attention in recent years. Conventional teleoperation systems are evaluated using objective metrics, such as transparency or task completion time. However, as human-in-the-loop applications, teleoperation systems require a more comprehensive evaluation framework that also accounts for the human operator’s quality of experience (QoE). Recent advances in machine learning have enabled real-time QoE prediction, making it possible to adapt control parameters and task strategies dynamically during teleoperation. In this paper, we review state-of-the-art methods for QoE prediction in teleoperation systems and discuss QoE-aware teleoperation architectures and their applications.

| 14:05-15:25, Paper WedFA.15 |
| Failure Prediction at Runtime for Generative Robot Policies |
| Römer, Ralf | Technical University of Munich |
| Kobras, Adrian | TUM |
| Worbis, Luca | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics, Failure Detection and Recovery
Abstract: Imitation learning with generative policies has enabled robots to perform complex manipulation tasks, but distribution shifts and compounding errors can still lead to unpredictable failures. Therefore, early failure prediction during runtime is essential for safe deployment in human-centered environments. We propose FIPER, a framework for Failure Prediction at Runtime for generative policies that identifies two key indicators of impending failure: (i) out-of-distribution observation embeddings detected via random network distillation, and (ii) high uncertainty in generated actions measured by a novel action-chunk entropy score. Both scores are calibrated using successful rollouts via conformal prediction, and a failure alarm is triggered when both indicators exceed their thresholds. Experiments across five simulation and real-world environments demonstrate that FIPER predicts failures more accurately and earlier than existing methods without requiring failure data. Code, data and videos are available at tum-lsy.github.io/fiper_website.

| 14:05-15:25, Paper WedFA.16 |
| Dynamically Constraining Diffusion Policies During Inference |
| Fiedler, Niklas | University of Hamburg |
| Vahl, Florian André | University of Hamburg |
| Zhang, Jianwei | University of Hamburg |

| 14:05-15:25, Paper WedFA.17 |
| PRoVoCE: An Open Platform for Interactive Voice Design in Human-Robot Interaction |
| Heimerl, Niklas | University of Augsburg |
| Kuch, Johanna Magdalena | University of Augsburg |
| Dietz, Michael | University of Augsburg |
| Schörner, Matthias | Technische Hochschule Augsburg |
| Mertes, Silvan | Technische Hochschule Augsburg |
| Andre, Elisabeth | University of Augsburg |
Keywords: Methods and Tools for Robot System Design, Product Design, Development and Prototyping, Human Factors and Human-in-the-Loop
Abstract: Voice design plays a crucial role in human–robot interaction, influencing user acceptance, trust, and perceived social competence of robots. In this work, we present PRoVoCE (Personalized Robot Voice Customization Engine), a system that enables fine-grained, user-driven voice customization for robots using a human-in-the-loop approach. PRoVoCE combines a state-of-the-art neural speech synthesis backbone with an evolutionary voice customization algorithm and provides seamless integration into robotic systems via a Python API and ROS-compatible solutions. The system is designed as a scalable backend that supports both local and web-based deployment, making advanced voice customization accessible to the robotics community. PRoVoCE establishes a platform where personalized robot voices can be created, openly shared, and reused across projects.

| 14:05-15:25, Paper WedFA.18 |
| LSTM-Based Task Detection for Haptic Teleoperation Using Kinaesthetic Data |
| Rodriguez-Guevara, Daniel | Technical University of Munich |
| Hernandez Gobertti, Fernando | Universitat Politecnica De Valencia |
| Wei, Wenxuan | Technical University of Munich |
| Xu, Xiao | Technical University of Munich |
| Güleçyüz, Başak | Technical University of Munich |
| Gomez Barquero, David | Universitat Politecnica De Valencia |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Haptics and Haptic Interfaces, RIG TC: Semantic Perception, RIG Cluster: Rigorous Perception
Abstract: Bilateral teleoperation significantly enhances task performance by providing force feedback; however, maintaining transparent and stable interaction over lossy networks remains challenging. High-fidelity haptic data transmission in bilateral teleoperation systems demands high sampling rates and low latency, making the system vulnerable to communication delays and jitter. To provide the semantic context necessary for mitigating these limitations through dynamic network resource allocation, this work proposes a Physics-Informed Long Short-Term Memory (LSTM) network for real-time task detection. Unlike traditional approaches that rely on computationally expensive visual feedback or heuristic thresholds, our model classifies user intent—such as pressing, dragging, or tapping—relying exclusively on invariant kinaesthetic features derived from force and velocity data. Preliminary validation demonstrates high classification accuracy and robustness across varying environmental stiffness conditions. By enabling real-time activity recognition, this model facilitates adaptive control strategies and activity-aware signal coding, thereby improving system resilience against latency and packet loss in future teleoperation scenarios.

| 14:05-15:25, Paper WedFA.19 |
| E2FAI: Events to Optical Flow and Intensity |
| Guo, Shuang | TU Berlin |
| Hamann, Friedhelm | Technical University Berlin |
| Gallego, Guillermo | Technische Universität Berlin |
Keywords: Computer Vision for Automation, RIG TC: Robot Perception, Deep Learning for Visual Perception
Abstract: Event cameras capture scene appearance through motion, inherently coupling appearance and motion in event data. However, most prior works recover these two quantities independently, overlooking their intrinsic relationship. We propose an unsupervised framework that jointly estimates optical flow and image intensity using a single network. Starting from the event generation model, we derive an event-based photometric error that explicitly depends on both optical flow and intensity, and combine it with contrast maximization to form a unified loss that constrains both tasks. Experiments demonstrate state-of-the-art performance in unsupervised optical flow estimation, reducing EPE and AE by 20% and 25%, respectively, while achieving competitive intensity reconstruction results, especially in high dynamic range scenes. Our method also achieves compelling efficiency, despite jointly estimating both quantities. Project page: https://github.com/tub-rip/E2FAI

| 14:05-15:25, Paper WedFA.20 |
| Robust Heterogeneous Multi-Robot Interception of Adversarial Bounded Rational Agents |
| Kramer, Markus | Technical University of Darmstadt |
| Daun, Kevin | Technische Universität Darmstadt |
| von Stryk, Oskar | Technische Universität Darmstadt |
Keywords: Multi-Robot Systems, Path Planning for Multiple Mobile Robots or Agents, RIG TC: Civil Safety Robotics
Abstract: This paper proposes a Robust Adversarial Receding Horizon Control framework for intercepting non-cooperative, boundedly rational agents using heterogeneous robot teams. To overcome the computational limits of differential games, the approach decouples the problem into strategic containment and tactical search via a "Commit-Predict-Search" architecture. The system first commits to topological bottlenecks to limit escape routes using Mixed-Integer Linear Programming (MILP), predicts the target's reactive motion, and finally optimizes search trajectories for maximum visibility. This hierarchical loop ensures real-time containment and effective coverage in complex environments.

| 14:05-15:25, Paper WedFA.21 |
| BirdRecorder: A Modular AI-Driven Robotic System for Wildlife Protection at Wind-Turbines |
| Klar, Nico | Center for Solar Energy and Hydrogen Research Baden-Württemberg (ZSW) |
| Amann, Ursula | ZSW |
| Gifary, Nizam | Zentrum für Sonnenenergie- und Wasserstoff-Forschung |
| Traub, Jakob | Zentrum für Sonnenenergie- und Wasserstoff-Forschung Baden-Württemberg (ZSW) |
| Sehnke, Frank | Zentrum für Sonnenenergie- und Wasserstoff-Forschung |
| Ahmad, Aamir | University of Stuttgart |

| 14:05-15:25, Paper WedFA.22 |
| SARs for Icebreaking in Psychotherapy and the SPARCi-Project (Supportive Psychotherapeutic Assistant Robot for Children) |
| Wildner, Andreas S. | Universität Augsburg |
| Reck, Corinna | Ludwig-Maximilians-Universität |
| Müller, Mitho | Ludwig-Maximilians-Universität |
| Schuwerk, Tobias | Ludwig-Maximilians-Universität |
| Andre, Elisabeth | Augsburg University |
| Nasir, Jauwairia | University of Augsburg |
Keywords: Human-Robot Collaboration, Human-Robot Teaming
Abstract: Mental healthcare is a promising field of application for Socially Assistive Robots (SARs). However, in this context, most research has been targeted at the treatment of autism spectrum disorder and dementia. In this extended abstract, we give an update on the SPARCi project, with which we aim to investigate the efficacy of SARs in child and adolescent psychotherapy.

| 14:05-15:25, Paper WedFA.23 |
| Laboratory Demonstration of an Underwater Welding Robot |
| Koch, Christian Ernst Siegfried | German Research Center for Artificial Intelligence GmbH |
| Yuan, Chunrong | Technische Hochschule Koeln, University of Applied Sciences |
| Antoniou, Antonios | Technische Hochschule Köln |
| Krause, Tom | Fraunhofer-Institut Für Graphische Datenverarbeitung IGD |
| Rößeler, Dirk | AMT GmbH |
| Koglin, Jens | AMT GmbH |
| Kirchner, Frank | University of Bremen |
Keywords: Marine Robotics, Manufacturing, Maintenance and Supply Chains, RIG TC: AI-driven Marine Robotics
Abstract: In the MARIOW project, a robot for semi-autonomous underwater welding has been developed. The robot is based on recent advances in underwater welding technology, in particular flux-cored arc welding, and integrates a stereo camera system and artificial intelligence methods to recognize welding seams and plan welding tasks. The MARIOW system thus presents a solution to the increasing demand for maintenance of maritime infrastructure that is safer, more cost-effective, and more sustainable than the current state of the art.

| 14:05-15:25, Paper WedFA.24 |
| Adaptive Control Based Friction Estimation for Tracking Control of Robot Manipulators |
| Huang, Junning | Intelligent Autonomous Systems |
| Tateo, Davide | Technische Universität Darmstadt |
| Liu, Puze | German Research Center for Artificial Intelligence |
| Peters, Jan | Technische Universität Darmstadt |
Keywords: Model Learning for Control, Robust/Adaptive Control, Dynamics
Abstract: Adaptive control is often used for friction compensation in trajectory tracking tasks because it does not require torque sensors. However, it has some drawbacks: first, the most common certainty-equivalence adaptive control design is based on a linear parameterization of the friction model, so nonlinear effects, including stiction and the Stribeck effect, are usually omitted. Second, the adaptive control-based estimation can be biased due to non-zero steady-state error. Third, neglecting unknown model mismatch can result in non-robust estimation. This paper proposes a novel linearly parameterized friction model capturing the nonlinear static friction phenomenon. Subsequently, an adaptive control-based friction estimator based on backstepping is proposed to reduce the bias during estimation. Finally, we propose an algorithm to generate excitation for robust estimation. Using a KUKA iiwa 14, we conducted trajectory tracking experiments to evaluate the estimated friction model, including random Fourier and drawing trajectories, showing the effectiveness of our methodology in different control schemes.

| 14:05-15:25, Paper WedFA.25 |
| Reinforcement Learning of Corrective Forces for Compensating Hand Movement Disturbances |
| Trautmann, Merle | University of Tübingen |
| Charaja, Jhon Paul Feliciano | Hertie Institute for Clinical Brain Research, and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany |
| Haeufle, Daniel Florian Benedict | University of Tübingen |

| 14:05-15:25, Paper WedFA.26 |
| Imitation Learning for Large-Scale Decentralised Multi-Robot Systems |
| Fialho Jesus, André | University of Konstanz |
| Kuckling, Jonas | University of Konstanz |

| 14:05-15:25, Paper WedFA.27 |
| A Concept for the Use of Vision-Language-Action Models for Human-Robot Collaboration |
| Moehrle, Jannik | Fraunhofer Institute for Casting Composite and Processing Technology |

| 14:05-15:25, Paper WedFA.28 |
| Improving the Accuracy of Haptic Rendering for Model Mediated Teleoperation in Dynamic Environments Using Deep Learning |
| Fernandez Prado, Diego | Technical University of Munich / Agile Robots SE |
| Mirza, Asfia | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Telerobotics and Teleoperation, Simulation and Animation, Deep Learning Methods
Abstract: Teleoperation enables humans to perform tasks in remote, inaccessible, or hazardous environments, such as medical interventions, nuclear handling, and space exploration. Increasingly, it is also used for skill teaching and data collection in robot learning frameworks. Accurate haptic feedback is essential in these systems to ensure immersion and precise task execution. However, communication delays and packet losses severely degrade the force feedback, negatively impacting stability and performance. Model-Mediated Teleoperation (MMT) addresses this issue by replacing direct force feedback with locally computed forces and torques derived from a model of the remote environment. While MMT improves robustness to delays, its performance depends on the accuracy of the underlying model, which often fails to capture real-world effects such as noise, unmodeled dynamics, and contact uncertainties. In this work, we propose a learning-based approach to enhance MMT by compensating for discrepancies between simulated and real interaction forces. We train a neural network to predict the residual between forces estimated by the model and forces measured at the real follower robot. The predicted residual is used to correct the model-based force feedback provided to the operator. Experimental results demonstrate that the proposed approach reduces the RMSE of rendered haptic feedback by 98% and 90% for force and torque, respectively, which can enable a more realistic and consistent teleoperation experience. Increasing the temporal sequence length from 1 to 4 also reduced the RMSE of force and torque in our experiments by 30% and 31%, respectively.

| 14:05-15:25, Paper WedFA.29 |
| O-STaR: Effective Object Search through Spatio-Temporal Reasoning on Dynamic Scene Graphs |
| Menon, Rohit | University of Bonn |
| Schmiede, Yasmin | University of Bonn |
| Blum, Hermann | Uni Bonn | Lamarr Institute |
| Bennewitz, Maren | University of Bonn |
Keywords: Service Robotics, RIG TC: Semantic Perception, RIG TC: AI-powered and Cognition-Enabled Robotics
Abstract: We present O-STaR, an integrated reasoning framework for personalized object search that combines semantic common sense, geometric feasibility, and adaptive temporal belief updates over dynamic scene graphs. While generic LLM priors associate objects with likely furniture categories, O-STaR grounds these hypotheses in physical feasibility through volume exclusion tests and personalizes them through long-term temporal adaptation. Physical experiments on a Stretch mobile manipulator demonstrate that geometric reasoning reduces concealed-space search time by 68%, while simulation benchmarks on 30-day household drift scenarios show that our adaptive transition model allows the robot to recover performance even under heavily corrupted initial priors. By combining semantic common sense knowledge with temporal adaptation, O-STaR enables robots to resolve open-vocabulary search requests while adapting to the unique habits of individual households.

| 14:05-15:25, Paper WedFA.30 |
| Crazyflow: An Accurate, GPU-Accelerated, Differentiable Drone Simulator in JAX |
| Schuck, Martin | Technical University of Munich |
| Rath, Marcel Peter | Technical University of Munich |
| Hua, Yufei | Technical University of Munich |
| von Rohr, Alexander | Technical University of Munich |
| Zhou, Siqi | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, RIG Cluster: Multi-Robot Systems, RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics
Abstract: In this work, we introduce Crazyflow, an accurate, differentiable simulator built on JAX. By leveraging JIT compilation via XLA, Crazyflow unifies physics and control into a single differentiable computation graph, enabling massive parallelization on accelerated hardware without sacrificing modeling accuracy. This architecture achieves order-of-magnitude speedups over existing baselines, capable of training deployable reinforcement learning agents in as little as 1.4 seconds. To support reliable sim-to-real transfer, the framework provides two levels of model fidelity and integrates seamlessly with the open-source Crazyflie ecosystem. Beyond direct compatibility with Crazyflies, Crazyflow includes a streamlined system identification pipeline, allowing users to easily adapt the simulation framework to any drone platform. By combining speed, accuracy, and differentiability, Crazyflow serves as a foundational tool that enables efficient learning, control, and optimization of quadrotor systems across both single- and multi-agent settings.

| 14:05-15:25, Paper WedFA.31 |
| Reinforcement Learning for Humanoid Walking with Explicit Backlash Simulation |
| Güldenstein, Jasper | University of Hamburg |
| Vahl, Florian | University of Hamburg |
| Zhang, Jianwei | University of Hamburg |
Keywords: Legged Robots, Reinforcement Learning
Abstract: Reinforcement Learning has shown great success in learning walking controllers for humanoid robots. Sim2Real transfer is usually enabled by extensive domain randomization of simulation parameters during training. A critical physical characteristic of actuators that is often overlooked is the backlash present in gear trains. In this work, we investigate the effect of simulating backlash during training on the performance of simulation-learned walking policies on a real humanoid robot. We describe our approach to modeling backlash in the MuJoCo physics simulator. Results from our real-world experiments show that simulating backlash during training improves the robustness of the learned policies when deployed on real hardware.
|
| |
| 14:05-15:25, Paper WedFA.32 | |
| SwarmKit: The ROS2 Swarm Framework for Rapid Development of Reliable, Scalable, and Efficient Swarm Applications |
|
| Röper, Eva | Fraunhofer Institute for Transportation and Infrastructure Systems IVI |
| Schönnagel, Adrian | Otto-von-Guericke-University Magdeburg & Fraunhofer Institute for Transportation and Infrastructure Systems |
| Shao, Fengyun | Otto von Guericke University |
| Prabhakaran, Surya | Otto von Guericke University |
| Zumbusch, Bastian | Otto von Guericke University Magdeburg |
| Urtheil, Alexander | Otto von Guericke Universität Magdeburg |
| Olivas-Martínez, Gustavo | Instituto Tecnológico de Estudios Superiores de Monterrey |
| Ashok Kumar, Bharath | Fraunhofer-Institut für Verkehrs- und Infrastruktursysteme (IVI) |
| Dubé, Michael | Otto-von-Guericke-University |
| Steup, Christoph | Fraunhofer Institute for Transportation and Infrastructure Systems |
| Mostaghim, Sanaz | Faculty of Computer Science, University of Magdeburg |
|
|
| |
| 14:05-15:25, Paper WedFA.33 | |
| Spatio-Temporal Semantic Mapping for Long-Term Fruit Monitoring with Mobile Robots |
|
| Lobefaro, Luca | University of Bonn |
| Sodano, Matteo | Photogrammetry and Robotics Lab, University of Bonn |
| Fusaro, Daniel | Department of Information Engineering (DEI), University of Padova |
| Magistri, Federico | University of Bonn |
| Malladi, Meher Venkata Ramakrishna | University of Bonn |
| Guadagnino, Tiziano | University of Bonn |
| Pretto, Alberto | University of Padova |
| Stachniss, Cyrill | University of Bonn |
|
|
| |
| 14:05-15:25, Paper WedFA.34 | |
| Safety Augmented Model-Based Reinforcement Learning: A Purely Data-Driven Approach to State-Wise Safety Certification |
|
| Eisele, Artur | RWTH Aachen |
| Frauenknecht, Bernd | RWTH Aachen University |
| Solowjow, Friedrich | RWTH Aachen University |
| Trimpe, Sebastian | RWTH Aachen University |
Keywords: Reinforcement Learning, Robot Safety, Model Learning for Control
Abstract: Real-world reinforcement learning (RL) risks violating strict state-wise safety constraints. To address this, we propose Dyna-style Safety Augmented Reinforcement Learning (Dyna-SAuR), which learns a safety filter and a policy via RL utilizing synthetic data generated by an uncertainty-aware dynamics model. Experiments on an extended CartPole task demonstrate zero constraint violations during training and deployment.
|
| |
| 14:05-15:25, Paper WedFA.35 | |
| Learning Bipedal Musculoskeletal Locomotion through a Bioinspired Double Curriculum |
|
| Badie, Nadine Shafik | University of Stuttgart |
| Al-Hafez, Firas | TU Darmstadt |
| Schumacher, Pierre | Max Planck Institute for Intelligent Systems, Tübingen, Germany |
| Peters, Jan | Technische Universität Darmstadt |
| Haeufle, Daniel Florian Benedict | University of Tübingen |
| Schmitt, Syn | University of Stuttgart, Germany |
Keywords: Bioinspired Robot Learning, Modeling and Simulating Humans, Reinforcement Learning
Abstract: Human locomotion exhibits capabilities that current robotic systems have yet to achieve. For musculoskeletal humanoids, acquiring comparable skills with reinforcement learning (RL) is particularly challenging due to complex control dynamics and over-actuation. To address this, we introduce a bioinspired double curriculum framework that integrates a progressive morphology curriculum, reflecting physical growth, with a staged progression of locomotion tasks. By leveraging body development to drive exploration, this approach enables a single policy to acquire stable balance and adaptable gaits that generalize across speeds and withstand external perturbations. Comprehensive experiments and ablation studies demonstrate that our approach surpasses current exploration-based methods for muscle-actuated control. These findings suggest that incorporating developmental and morphological structure into learning can enable more human-like, embodied intelligence in artificial agents.
|
| |
| 14:05-15:25, Paper WedFA.36 | |
| Containerized LLM-Agent Architectures for Natural Language Control of ROS-Based Robotic Systems |
|
| Martino, Gerardo | University of the Bundeswehr Munich |
| Kolb, Julia | University of Stuttgart |
| May, Michael | University of Stuttgart |
| Monnin, David | ISL |
| Neve, Antje | University of the Bundeswehr |
Keywords: AI-Enabled Robotics, Natural Dialog for HRI, Human-Robot Teaming
Abstract: Large Language Models (LLMs) are increasingly used as natural language interfaces in assistive systems, automotive applications, e-commerce, and robotics. Their ability to interpret multilingual free-form text and map it to machine-readable representations makes them attractive as high-level planners and dialogue managers for robots. When combined with ROS-based platforms, LLMs can support task specification, execution monitoring and explanation in a way that is accessible to non-expert users. However, integrating LLM agents safely and robustly on edge devices remains challenging, as design patterns, middleware interfaces and deployment practices are still evolving. This work presents a modular, containerized LLM-agent architecture for ROS-based robots in which each functional component is encapsulated as an independent Docker container. The system couples a web-based human-machine interface (HMI), speech recognition and synthesis components, and an LLM-agent built on the ROSA framework to enable natural-language control of ROS/ROS2 robots. Containerization is a central design principle, allowing components to be independently developed, replaced, and scaled. All containers are interconnected via Zenoh, resulting in a flexible, reproducible and hardware-agnostic experimental platform suitable for both simulation and real-robot deployment. We describe the speech-language-action pipeline from the operator’s spoken command to robot behavior and back to spoken feedback. We report preliminary results from experiments in Gazebo simulation and on a physical LeoRover platform. An initial comparison of several open-weight LLMs highlights trade-offs between model size, responsiveness, and robustness in real robotic settings. The proposed architecture is intended as a practical blueprint for robotics practitioners seeking to experiment with LLM-driven control of ROS-based systems.
|
| |
| 14:05-15:25, Paper WedFA.37 | |
| Dynamic Human-To-Robot Object Handover with VLM-Based Intention Detection and Movement Primitives |
|
| Rietsch, Sebastian | Karlsruhe Institute of Technology (KIT) |
| Ruf, Lukas | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Human-Robot Interaction, Human-Robot Collaboration
Abstract: This work presents an initial exploration of using Vision-Language Models (VLMs) for dynamic Human-to-Robot (H2R) handovers, integrating VLM-based intention detection with Via-Point Movement Primitives (VMPs) for adaptive motion generation. By employing a structured chain-of-thought prompt and a majority vote over a circular buffer of recent predictions, the system achieves 95.1% handover intention detection accuracy on the ARMAR-6 robot without task-specific training. Preliminary results suggest the approach can react dynamically to changing human behaviors and grasp strategies, though our evaluation reveals current challenges that must be addressed before practical deployment.
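The prompt format and the ARMAR-6 interface are not given in the abstract; the majority-vote smoothing over a fixed-size buffer of per-frame VLM predictions, however, can be sketched with standard containers (buffer size and label set are illustrative assumptions):

```python
from collections import Counter, deque

class IntentionFilter:
    """Majority vote over a circular buffer of per-frame predictions,
    smoothing out single-frame VLM errors. Sketch only; buffer size
    and labels are not the paper's exact configuration."""

    def __init__(self, size: int = 5):
        self.buffer = deque(maxlen=size)  # oldest predictions fall out automatically

    def update(self, prediction: str) -> str:
        self.buffer.append(prediction)
        # most_common(1) returns [(label, count)] for the current mode.
        return Counter(self.buffer).most_common(1)[0][0]

f = IntentionFilter(size=5)
votes = ["idle", "handover", "handover", "handover", "idle"]
decisions = [f.update(v) for v in votes]
```

A single misclassified frame thus cannot flip the detected intention; only a sustained run of new predictions can.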
|
| |
| 14:05-15:25, Paper WedFA.38 | |
| TechnoSapiens: Experiencing Bionic Prostheses in Augmented Reality |
|
| Rudolph, Carsten | Chemnitz University of Technology |
| Brunnett, Guido | Chemnitz University of Technology |
| Bretschneider, Maximilian | Technische Universität Chemnitz |
| Meyer, Bertolt | Chemnitz University of Technology |
| Asbrock, Frank | TU Chemnitz |
Keywords: Virtual Reality and Interfaces, Haptics and Haptic Interfaces
Abstract: We present TechnoSapiens, a unique system that enables the interactive simulation of bionic prostheses in augmented reality. Unlike conventional AR applications, which merely superimpose a virtual prosthesis onto the user’s body, our approach employs diminished reality to first visually remove the user’s arm, thereby enabling a virtual replacement of the limb. The prosthesis is intuitively controlled via markerless optical hand tracking. A physics-based simulation governs interactions with virtual objects to ensure a coherent and consistent visual experience. To enhance realism, we developed a custom data glove that delivers haptic feedback, creating a multi-modal illusion of touch during object interaction. In this way, TechnoSapiens provides a versatile platform for investigating the perception of bionic technologies within a controlled experimental environment.
|
| |
| 14:05-15:25, Paper WedFA.39 | |
| Who Was Where: Natural Language Verbalization of Localized Persons from a Humanoid Robot's Episodic Memory |
|
| Plewnia, Joana | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG TC: AI-powered and Cognition-Enabled Robotics
Abstract: The ability to verbalize past locations of people is essential for natural human-robot interaction in multi-person environments. We present a system enabling the humanoid robot ARMAR-7 to answer person-related queries about its past, such as "Where did you last see Joana?". Our approach combines Azure Kinect Body Tracking for pose detection, InsightFace for face recognition, and a memory-based cognitive architecture to create person entities with symbolic spatial representations integrated with an existing verbalization framework. Evaluation in a laboratory environment demonstrates successful verbalization. Analysis reveals that verbalization quality depends critically on face recognition accuracy. We discuss challenges and future directions for person-aware spatial verbalization.
|
| |
| 14:05-15:25, Paper WedFA.40 | |
| When Gradients Are Enough: Non-Stationary Potential Fields for Reactive Control |
|
| Mengers, Vito | Technische Universität Berlin |
| Brock, Oliver | Technische Universität Berlin |
Keywords: Reactive and Sensor-Based Planning, Perception-Action Coupling, Behavior-Based Systems
Abstract: Reactive control based on gradient descent in fixed potential fields is attractive for its robustness and simplicity, but it is fundamentally limited by local minima in complex tasks. This extended abstract presents a general framework for reactive control in which the potential field is made non-stationary in structured ways, allowing a single controller to resolve sequential objectives, multi-objective trade-offs, and exploration without explicit planning or discrete mode switching.
|
| |
| 14:05-15:25, Paper WedFA.41 | |
| Noise Reduction in Quadrupedal Robot Locomotion |
|
| Kohl, Katharina | Karlsruhe Institute of Technology (KIT) |
| Roennau, Arne | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Legged Locomotion, RIG TC: Manyfold Legged Locomotion in Various Terrains
Abstract: Quadrupedal robots are increasingly deployed in indoor, human-centered environments, where locomotion-induced noise can negatively affect comfort, acceptance, and practical usability. While prior work has addressed quiet locomotion primarily through learning-based approaches, reducing walking noise on hard, non-compliant surfaces remains challenging. This extended abstract reviews relevant state-of-the-art methods and proposes three complementary ideas for quieter quadrupedal locomotion. First, foot impact velocity is regulated during the pre-contact swing phase using distance-based sensing and a velocity-modulating joint-level controller integrated into existing reinforcement learning frameworks. Second, biologically inspired foot trajectory shaping is proposed, drawing on observations of animal locomotion and implemented through imitation learning with a focus on foot motion. Third, compliant, biologically inspired foot designs are considered to passively damp impact forces and vibrations at the foot–ground interface. While each approach is independently applicable, their combination has the potential to further reduce acoustic emissions and improve walking efficiency, highlighting promising directions for future research on quiet legged locomotion.
|
| |
| 14:05-15:25, Paper WedFA.42 | |
| Temperature-Guided Diffusion Planning |
|
| Busch, Johannes | Dresden University of Technology |
| Calandra, Roberto | TU Dresden |
Keywords: Imitation Learning, Reinforcement Learning
Abstract: Diffusion planners address sequential decision-making by framing plan generation as a generative modeling task over trajectories, mitigating the compounding errors and myopic predictions typical of autoregressive methods. They sample long-horizon, globally consistent plans in a single pass, enabling parallel refinement and robust handling of multimodal futures. Reward conditioning is typically achieved through classifier guidance or classifier-free guidance (CFG), with CFG favored for its performance and flexibility but requiring extensive, task-specific hyperparameter tuning that limits scalability and generalization. Our analysis reveals that guidance performance hinges on careful adaptation to the data manifold and reward distribution, contributing to CFG’s hyperparameter fragility. In this work, we propose the temperature-guided diffusion planner (TGDP), which adapts CFG for reward conditioning by self-calibrating to these task-specific characteristics. TGDP leverages temperature-based sample reweighting during training and adaptive guidance scaling at inference, yielding robust high-reward plan generation without per-task hyperparameter optimization. Across standard reward-driven benchmarks, TGDP matches the performance of prior methods while maintaining a single set of default hyperparameters, establishing a practical, scalable, and generalizable approach to diffusion-based planning.
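TGDP's exact reweighting rule is not given in the abstract; the generic idea of temperature-based sample reweighting by return can be sketched as a numerically stable softmax (a purely illustrative stand-in, not TGDP's rule):

```python
import math

def temperature_weights(returns, temperature):
    """Softmax reweighting of training samples by return: a low
    temperature concentrates weight on high-return trajectories,
    a high temperature approaches uniform weighting.
    Generic sketch, not TGDP's exact scheme."""
    # Subtract the max return for numerical stability before exponentiating.
    m = max(returns)
    exps = [math.exp((r - m) / temperature) for r in returns]
    total = sum(exps)
    return [e / total for e in exps]

sharp = temperature_weights([1.0, 2.0, 3.0], temperature=0.1)   # peaked on the best sample
flat = temperature_weights([1.0, 2.0, 3.0], temperature=100.0)  # nearly uniform
```

The temperature thus acts as a single knob interpolating between greedy, reward-focused training and uniform behavioral cloning.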
|
| |
| 14:05-15:25, Paper WedFA.43 | |
| Hybrid Model-Based Learning for Cobots |
|
| Aubeeluck, Chandra Yuvesh | Cologne University of Applied Sciences |
| Permin, Eike | Cologne University of Applied Sciences |
| Schilberg, Daniel | Hochschule Bochum |
| Pyschny, Nicolas | Cologne University of Applied Sciences |
Keywords: Learning from Demonstration, Machine Learning for Robot Control, Task Planning
Abstract: Mobile collaborative robots (cobots) increasingly operate in unstructured, contact-rich environments where classical position control and pre-programmed strategies fail to ensure reliable and adaptable behavior. Complex tasks such as precision assembly and obstacle-aware manipulation require online adaptation of motion and interaction while maintaining safety and responsiveness. Recent advances in robot learning, particularly learning from demonstration with large models, enable flexible trajectory generation but often lack verifiable safety mechanisms and require extensive retraining when deployed beyond the original task distribution. This extended abstract presents a hybrid framework that combines structured motion representations with learning-based policies to enable adaptive compliance during task execution. The approach emphasizes skill reuse, structured decision-making, and minimal data adaptation, allowing cobots to handle complex task variations without large task-specific datasets or large-scale retraining.
|
| |
| 14:05-15:25, Paper WedFA.44 | |
| Generating Whole-Body Motions on Riemannian Manifolds to Grasp Objects on the Move |
|
| Reister, Fabian | Karlsruhe Institute of Technology (KIT) |
| Meixner, Andre | Karlsruhe Institute of Technology (KIT) |
| Haag, Kevin | Karlsruhe Institute of Technology (KIT) |
| Jaquier, Noémie | KTH Royal Institute of Technology |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG TC: Manipulation On-the-Move, Mobile Manipulation, Whole-Body Motion Planning and Control
Abstract: To perform mobile manipulation tasks efficiently, humanoid robots must generate fluent and energy-efficient whole-body motion that coordinates their mobile base with their arms. This extended abstract describes a novel method to generate whole-body motion based on geodesic synergies. Our method generates trajectories according to the Riemannian kinetic-energy metric, thereby leading to smooth, dynamically consistent, and energy-efficient behaviors. We enable grasping-on-the-move by enforcing via-point constraints for task-relevant motion adaptation and by introducing an adapted trajectory initialization. In addition, we integrate a hand control method based on an in-hand time-of-flight sensor to support robust object grasping. We report early results, including real robot experiments on the humanoid robot ARMAR-DE.
|
| |
| 14:05-15:25, Paper WedFA.45 | |
| Hyperbolic Embeddings for Reliable Open-Set Object Detection |
|
| Lu, Yao | University of Freiburg |
| Valada, Abhinav | University of Freiburg |
Keywords: Deep Learning for Visual Perception, Deep Learning Methods
Abstract: Modern object detectors achieve strong performance under closed-world assumptions, where the set of categories is fixed and fully known during training. However, they often rely on flat classification schemes that treat categories as independent, ignoring the rich semantic hierarchies that structure real-world concepts. Ignoring this structure limits generalization and interpretability when novel categories appear at test time, which is particularly problematic in robotics where systems must operate in dynamic environments and handle unseen objects. In contrast, hierarchical representations provide powerful inductive biases for organizing visual knowledge. In this work, we propose HYP-DETR, which regularizes hierarchical embeddings in hyperbolic space, whose constant negative curvature provides a natural geometry for representing such structure. This regularization encourages the model to learn both category-level and ancestor-level representations, capturing abstract concepts such as objectness and improving the detection of unknown objects in open-set object detection. We validate HYP-DETR on both closed- and open-world benchmarks, including COCO val2017, COCO-Mixed, and LVIS, demonstrating consistent performance improvements across settings.
|
| |
| 14:05-15:25, Paper WedFA.46 | |
| Correct-By-Construction Control Architectures for Contact-Rich Mobile-Manipulation Robots from Composable Models |
|
| Kalagaturu, Vamsi Krishna | University of Bremen |
| Sawant, Kishan Ravindra | University of Bremen |
| Schneider, Sven | Hochschule Bonn-Rhein-Sieg |
| Bruyninckx, Herman | KU Leuven |
| Hochgeschwender, Nico | University of Bremen |
Keywords: RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics, RIG TC: Principles and Methods for Building AI-powered Robust and Resilient Robots, Control Architectures and Programming
Abstract: While frameworks such as Stack of Tasks (SoT) and Stanford Whole-Body Control (WBC) support complex contact-rich manipulation, jgeom constr and eTaSL are formal, computer-interpretable languages for declarative task specification. We analyze these languages and identify key limitations in composability and compositionality. Similar limitations arise in controller frameworks such as ros2_control. Although controllers such as PID admit multiple variants through different compositions of P, I, and D components and additional functional elements, existing frameworks primarily offer parameterized implementations rather than explicit, compositional specifications of controller semantics. Graphical tools such as Simulink support functional composition but remain tightly coupled to proprietary toolchains. To address these issues, this work introduces a graph-structured, composable, and compositional interchange format for constraint-based task specifications and the controllers that realize them, enabling automated structural verification and correct-by-construction code generation. The approach is validated through a contact-rich workspace alignment task on a highly redundant dual-arm mobile manipulator with torque-controlled joints.
|
| |
| 14:05-15:25, Paper WedFA.47 | |
| Joint Target-Less Intrinsic and Extrinsic Camera-LiDAR Calibration Using Deep Point Correspondences |
|
| Bultmann, Simon | University of Freiburg |
| Cattaneo, Daniele | University of Freiburg |
| Valada, Abhinav | University of Freiburg |
Keywords: Calibration and Identification, Deep Learning for Visual Perception, RIG Cluster: Rigorous Perception
Abstract: Accurate camera-LiDAR calibration is a prerequisite for robust multi-modal perception in robotics. Recent target-less approaches based on deep point correspondences achieve remarkable performance for extrinsic calibration but assume rectified images with known intrinsics. In this work, we overcome this limitation and present the first fully target-less pipeline that jointly estimates camera intrinsics (pinhole model with radial-tangential distortion) and camera-LiDAR extrinsics with deep pixel-point correspondences. Our approach extends deep correspondence-based calibration by (i) automatic intrinsic initialization via structure-from-motion, (ii) generalizing camera-LiDAR matching to raw images with unknown intrinsics including distortion, and (iii) tightly coupling correspondence estimation with joint nonlinear optimization over both intrinsics and extrinsics. We evaluate our method on the KITTI dataset with unseen camera-LiDAR pairs and demonstrate that joint calibration achieves improved extrinsic accuracy while additionally recovering accurate intrinsics.
|
| |
| 14:05-15:25, Paper WedFA.48 | |
| From Perception to Action: A Pipeline for Autonomous Garment Manipulation |
|
| Hohensee, Julia | Karlsruhe Institute of Technology (KIT) |
| Dreher, Christian R. G. | Karlsruhe Institute of Technology (KIT) |
| Gaukel, Andreas | Karlsruhe Institute of Technology |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: Perception for Grasping and Manipulation, Task and Motion Planning
Abstract: Manipulation of deformable objects such as garments remains a major challenge in robotics due to their high-dimensional configurations, complex dynamics, and frequent self-occlusions. Previous studies on folding or dressing have often relied on detailed cloth models, extensive training, or large datasets, which limit their applicability. We present a perception–action pipeline for autonomous garment handling that focuses on reliably picking up and transporting garments. The system uses FoundationStereo for stereo-based depth estimation and Grounded SAM 2 for open-vocabulary segmentation to generate garment point clouds. Grasping is based on geometric heuristics, and manipulation is executed with Via-Point Movement Primitives learned from kinesthetic demonstrations, with force-based hand closure. Implemented on a humanoid robot, the system autonomously clears garments from a tabletop into a laundry basket, demonstrating that effective garment handling can be achieved without complex modeling or task-specific training.
|
| |
| 14:05-15:25, Paper WedFA.49 | |
| Learning Agile Locomotion Skills on a 12-DoF Quadruped Via Emulated Muscle Dynamics |
|
| Kerner, Jan | Hertie Institute for Clinical Brain Research, and Centre for Integrative Neuroscience, University of Tübingen, Germany |
| Singh, Neelaksh | ETH Zürich |
| Charaja, Jhon Paul Feliciano | Hertie Institute for Clinical Brain Research, and Centre for Integrative Neuroscience, University of Tübingen, Tübingen, Germany |
| Araz, Matthew | University of Tübingen |
| Bunz, Elsa Katharina | University of Stuttgart |
| Haeufle, Daniel Florian Benedict | University of Tübingen |
|
|
| |
| 14:05-15:25, Paper WedFA.50 | |
| Balancing Responsiveness, Reliability, and Flexibility in Natural-Language Robot Interaction |
|
| Birr, Timo | Karlsruhe Institute of Technology (KIT) |
| Bärmann, Leonard | Karlsruhe Institute of Technology |
| Weberruß, Timo | Karlsruher Institut für Technologie |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG TC: AI-powered and Cognition-Enabled Robotics, RIG Cluster: Learning and Multimodal AI for Robotics, AI-Enabled Robotics
Abstract: As robotic assistants become increasingly prevalent, natural language interaction is emerging as a key enabler of effective human–robot collaboration. Although recent advances in Large Language Models (LLMs) have substantially improved open-ended language understanding, robust human–robot interaction requires more than text interpretation alone. In particular, a system must decide when an utterance warrants the robot’s attention and ensure that responses are generated within acceptable time constraints. Moreover, purely LLM-based approaches often incur higher latency and are less predictable than traditional grammar-based methods when handling predefined commands. We propose a hybrid speech processing and attention management architecture that integrates low-latency command recognition with LLM-based open-vocabulary understanding. By combining the strengths of both approaches, the system achieves a balance between responsiveness and flexibility, enabling natural, efficient, and timely interaction with humanoid robots.
|
| |
| 14:05-15:25, Paper WedFA.51 | |
| The 2025 GOOSE Dataset Competition for Semantic Segmentation in Unstructured Environments |
|
| Hagmanns, Raphael | Karlsruhe Institute of Technology |
| Mortimer, Peter | Universität Der Bundeswehr München |
| Granero, Miguel | Fraunhofer IOSB |
| Luettel, Thorsten | University of the Bundeswehr Munich |
| Petereit, Janko | Fraunhofer IOSB |
Keywords: RIG TC: Large-Scale Offroad Robotics, RIG Cluster: Field Robotics, Performance Evaluation and Benchmarking
Abstract: The GOOSE Dataset Competition for Semantic Segmentation was hosted in conjunction with the Workshop on Field Robotics at ICRA 2025 in Atlanta, USA. The competition focused on developing semantic segmentation methods for image and point cloud data from unstructured environments provided by the GOOSE datasets. Its primary aim was to evaluate the robustness and real-world performance of segmentation approaches across diverse, challenging scenarios and robotic platforms. In this report, we present the competition setup, baseline methods and summarize the winning approaches.
|
| |
| 14:05-15:25, Paper WedFA.52 | |
| The Mini Wheelbot Dataset: High-Fidelity Data for Robot Learning |
|
| Hose, Henrik | RWTH Aachen |
| Brunzema, Paul | RWTH Aachen University |
| Subhasish, Devdutt | RWTH Aachen University |
| Trimpe, Sebastian | RWTH Aachen University |
Keywords: Data Sets for Robot Learning, Underactuated Robots, RIG TC: Foundations of Optimization and Learning for Robotics
Abstract: The development of robust learning-based control algorithms for unstable systems requires high-quality, real-world data, yet access to specialized robotic hardware remains a significant barrier for many researchers. This letter introduces a comprehensive dynamics dataset for the Mini Wheelbot, an open-source, quasi-symmetric balancing reaction wheel unicycle. The dataset provides 1 kHz synchronized data encompassing all onboard sensor readings, state estimates, ground-truth poses from a motion capture system, and third-person video logs. To ensure data diversity, we include experiments across multiple hardware instances and surfaces using various control paradigms, including pseudo-random binary excitation, nonlinear model predictive control, and reinforcement learning agents. We include several example applications in dynamics model learning, state estimation, and time-series classification to illustrate common robotics algorithms that can be benchmarked on our dataset.
|
| |
| 14:05-15:25, Paper WedFA.53 | |
| Perception-Driven Digital Twin Generation for Simulation-Ready Robotic Environments |
|
| Ruegemer, Leroy | Bielefeld University |
| Leins, David Philip | Bielefeld University |
Keywords: Simulation and Animation, Semantic Scene Understanding, Object Detection, Segmentation and Categorization
Abstract: We present a perception-driven pipeline that converts a mapped environment into a simulation-ready digital twin by incrementally enriching the robot's internal world state representation in the Entity Component World Model (ECWM), reconstructing object assets with SAM 3D, and exporting scenes to MuJoCo ROS, yielding consistent parallel testing, planning, and behavior validation. The resulting simulated scene can be used for action validation, manipulation feasibility checks, motion prediction, and what-if reasoning, supporting robust decision making in dynamic environments such as RoboCup@Home scenarios.
|
| |
| 14:05-15:25, Paper WedFA.54 | |
| Scalable Multi-Agent Navigation in Maze-Like Environments |
|
| Rau, Julian | Technical University of Darmstadt |
| Argote-Gerald, Jahir | The University of Sheffield |
| Miyauchi, Genki | The University of Sheffield |
| Gross, Roderich | Technical University of Darmstadt |
Keywords: RIG Cluster Multi-Robot Systems, RIG TC: Multi-Robot Coordination, RIG TC: Swarm Robotics
Abstract: Cave networks, pipe systems, and similar maze-like environments pose significant challenges for multi-agent navigation, particularly when the environment is unknown and communication is limited. To address this challenge, we propose a distributed algorithm that enables agents with limited communication range to move through an unknown, possibly cyclic, graph while simultaneously avoiding collisions and ensuring that all agents reach the goal. Agents enter the graph one at a time through a designated start node and are tasked with finding and navigating to a designated goal node. The algorithm leverages a leader-switching mechanism and leader-follower relationships between agents to coordinate them solely through local communication. For exploration, one agent runs a single-agent maze solver. Simulations show that the algorithm scales efficiently and that its performance approaches that of fully informed agents following an optimal path.
|
| |
| 14:05-15:25, Paper WedFA.55 | |
| Feasibility of LLM-Based Robot Programming in SMEs: A Human-In-The-Loop Approach |
|
| Gashi, Adriatik | Technische Hochschule Nürnberg Georg Simon Ohm |
| Bibbig, Tobias | Isento GmbH |
| Schmidt-Vollus, Ronald | Technische Hochschule Nürnberg Georg Simon Ohm |
Keywords: AI-Enabled Robotics, Software Tools for Robot Programming, Natural Dialog for HRI
Abstract: The economic viability of automating High-Mix Low-Volume (HMLV) manufacturing is limited by the high expertise and setup times required for manual robot programming. While Generative AI (GenAI) promises to lower this barrier via Natural Language Interfaces, industrial adoption remains hindered by the lack of safety and auditability in black-box models. To close this transfer gap, we present a Work-in-Progress framework, developed in collaboration with an industrial brush manufacturer, designed to render current GenAI capabilities applicable to factory environments. The proposed system utilizes a verifiable, Human-in-the-Loop architecture comprising a two-stage pipeline: (1) An Intent Reasoning Agent transforms user prompts into a visual Intermediate Representation (IR), enabling non-experts to verify process logic before code generation; and (2) A Code Translation Agent utilizes a parser-guided feedback loop to generate executable robot code, which is then subject to a second verification step via kinematic simulation in a digital twin. Preliminary results demonstrate the feasibility of this approach: natural language instructions for a manipulation task were successfully converted into syntactically valid robot programs and visually verified within a digital twin.
|
| |
| WedGA |
|
| Oral Session 1 |
Regular |
| |
| 15:25-15:31, Paper WedGA.1 | |
| A Generalized Model-Free Placeability Metric for Unified Pick-And-Place Planning |
|
| Wingender, Benno | University of Bonn |
| Dengler, Nils | University of Bonn |
| Menon, Rohit | University of Bonn |
| Pan, Sicong | University of Bonn |
| Bennewitz, Maren | University of Bonn |
Keywords: Manipulation Planning, RIG Cluster: Manipulate Anything, Anywhere, Anytime, Grasping
Abstract: The ability to pick and place objects is central to robotics across various domains. Recent work addresses the grasping and placing of objects jointly. However, most methods rely on CAD priors, evaluate only a limited number of placements, or require domain knowledge. This potentially limits integration with off-the-shelf grasping approaches and can restrict the robustness of these methods. Furthermore, learning-based approaches often struggle to identify grasps that preserve the same object-relative configuration across both the pick and place poses (shared grasps). We propose a generalized placeability metric that evaluates 6-DoF placements directly from point clouds without shape priors. It fuses physical stability and altitude-based clearance with robot feasibility and collision constraints, producing stable, collision-free placements online while directly incorporating shared grasps. Our framework thus computes placement-aware (placeability) scores, efficiently evaluating large sets of grasp–place pairs in real time while remaining compatible with different grasp prediction methods.
|
| |
| 15:31-15:37, Paper WedGA.2 | |
| A Vision-Based Approach for Hand-Object Grasp Retargeting with Taxonomy Awareness |
|
| Shi, Yitian | Karlsruhe Institute of Technology |
| Guo, Zicheng | Karlsruher Institut Für Technologie |
| Wolf, Rosa Petra | Karlsruhe Institute of Technology |
| Welte, Edgar | Karlsruhe Institute of Technology (KIT) |
| Rayyes, Rania | Karlsruhe Institute for Technology (KIT) |
Keywords: Perception for Grasping and Manipulation, Deep Learning in Grasping and Manipulation, Grasping
Abstract: Current robotic grasping methods typically require detailed 3D models or geometric priors of the target object. We bypass this limitation with Hand-Object (HO) GraspFlow, a new affordance-centric framework. By analyzing a single RGB image of a human hand interacting with an object, our system translates that interaction into executable parallel-jaw grasps. We achieve this by employing denoising flow matching (FM) to synthesize SE(3) poses. To ensure accuracy, the model is guided by three distinct signals: visual semantics from foundation models, reconstructed contact points, and a taxonomy-based understanding of grasp types. We demonstrate reliable, object-agnostic grasp synthesis from human demonstrations in real-world experiments, achieving an average success rate of over 83%.
|
| |
| 15:37-15:43, Paper WedGA.3 | |
| Learning Hierarchical Domain Models through Environment Interaction |
|
| Kienle, Claudius | TU Darmstadt |
| Alt, Benjamin | University of Bremen |
| Arenz, Oleg | TU Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
Keywords: RIG TC: Neuro-symbolic Learning for Robotics
Abstract: Domain models enable autonomous agents to solve long-horizon tasks by producing interpretable plans. However, in open-world environments, a single general domain model cannot capture the variety of tasks, so agents must generate suitable task-specific models on the fly. LLMs, with their implicit common knowledge, can generate such domains, but suffer from high error rates that limit their applicability. Hence, related work relies on extensive human feedback or prior knowledge, which undermines autonomous, open-world deployment. In this work, we propose LODGE, a framework for autonomous domain learning from LLMs and environment grounding. LODGE builds on hierarchical abstractions and automated simulations to identify and correct inconsistencies between abstraction layers and between the model and environment. Our framework is task-agnostic, as it generates predicates, operators, and their preconditions and effects, while only assuming access to a simulator and a set of generic, executable low-level skills. Experiments on two IPC domains and a robotic assembly domain show that LODGE yields more accurate domain models and higher task success than existing methods, requiring remarkably few environment interactions and no human feedback or demonstrations.
|
| |
| 15:43-15:49, Paper WedGA.4 | |
| A Sensorized Bicycle to Evaluate a Robotic Bicycle Simulator |
|
| Kohler, Christina | RWTH Aachen University |
| Pena Perez, Nuria | RWTH Aachen University |
| Schwab, Arend L. | Delft University of Technology |
| Vallery, Heike | RWTH Aachen University and Delft University of Technology |
Keywords: RIG Cluster: Healthcare Robotics and Human Augmentation, Engineering for Robotic Systems, Physically Assistive Devices
Abstract: This work presents a proof of concept for a sensorized bicycle developed to generate ground-truth data that can be used to evaluate the realism of a robotic bicycle simulator. The system measures the steering angle using an encoder, the roll angle and yaw rate using an Inertial Measurement Unit (IMU), and the forward speed using redundant measurements from a GPS receiver and a Hall sensor. The resulting data can be stored both locally and in the cloud. We have so far evaluated the steering angle, roll angle, and GPS forward speed in a short cycling task. Future work will compare the forward speed derived from the GPS to that of the Hall sensor, and evaluate the roll angle measurements against optical tracking.
|
| |
| 15:49-15:55, Paper WedGA.5 | |
| Scene Graph-Based Exploration with Learned Search Heuristics for Open World Interactive Object Search |
|
| Mahdi, Imen | University of Freiburg |
| Cassinelli, Matteo | Toyota Motor Europe |
| Despinoy, Fabien | Toyota Motor Europe |
| Welschehold, Tim | Albert-Ludwigs-Universität Freiburg |
| Valada, Abhinav | University of Freiburg |
|
|
| |
| 15:55-16:01, Paper WedGA.6 | |
| Cross-Modal and Multi-Point Coding of Tactile Data |
|
| Wei, Wenxuan | Technical University of Munich |
| Nockenberg, Lars | Technical University of Munich |
| Rodriguez-Guevara, Daniel | Technical University of Munich |
| Xu, Xiao | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: RIG Cluster: Rigorous Perception, Haptics and Haptic Interfaces, Touch in HRI
Abstract: As the Tactile Internet evolves towards rich, multi-point physical immersion, advanced and efficient tactile codecs are crucial for high-fidelity remote interactions. This extended abstract summarizes the state-of-the-art in tactile codecs, particularly the recent advancements in cross-modal and multi-point codecs. We also outline and discuss the potential future research directions for this rapidly evolving field.
|
| |
| 16:01-16:07, Paper WedGA.7 | |
| From Transportation to Manipulation: Transforming Magnetic Levitation to Magnetic Robotics |
|
| Bergmann, Lara | Bielefeld University |
| Greis, Noah | Bielefeld University |
| Gross, Roderich | Technical University of Darmstadt |
| Haschke, Robert | Bielefeld University |
| Hochgeschwender, Nico | University of Bremen |
| Hoenig, Wolfgang | TU Berlin |
| Jost, Jana | Fraunhofer Institute for Material Flow and Logistics |
| Kopp, Stefan | Bielefeld University |
| Rayyes, Rania | Karlsruhe Institute for Technology (KIT) |
| Trimpe, Sebastian | RWTH Aachen University |
| Vollmer, Anna-Lisa | Bielefeld University |
| Wirkus, Malte | DFKI GmbH |
| Wrede, Sebastian | Bielefeld University |
| Neumann, Klaus | Bielefeld University / Fraunhofer IOSB-INA |
|
|
| |
| 16:07-16:13, Paper WedGA.8 | |
| A Cloud and 5G-Supported Robotic Demonstrator Using AI-Based Spatial Contextualization for Steel Construction Pre-Assembly |
|
| Ergin, Emre | RWTH Aachen University |
| Brell-Cokcan, Sigrid | RWTH Aachen University |
Keywords: Robotics and Automation in Construction, Assembly, Sensor-based Control
Abstract: The steel construction industry is characterized by a low degree of automation in assembly-related processes, particularly in joining and tack welding operations, which are still predominantly performed manually despite increasing geometric complexity and precision requirements in the industry. As part of the CLOUD56 research project, which investigates virtualized cloud radio access network (Cloud RAN) and 5G-based network infrastructures for industrial applications across multiple sectors, a robotic pre-assembly demonstrator for steel construction was developed and validated. The contribution focuses on transferring cloud- and 5G-enabled automation concepts to steel construction processes, with emphasis on the design, implementation, and validation of a modular demonstrator that integrates cooperative industrial robots, heterogeneous sensor systems, adaptive control, AI-based spatial contextualization, and distributed edge–cloud services into a continuous digital process chain for automated handling, joining, and inspection of steel components.
|
| |
| 16:13-16:19, Paper WedGA.9 | |
| SQ-CBF: Signed Distance Functions for Superquadric-Based Safety Filtering |
|
| Zhao, Haocheng | Technical University of Munich |
| Brunke, Lukas | Technical University of Munich |
| Lagerquist, Oliver | University of Toronto |
| Zhou, Siqi | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
Keywords: Collision Avoidance, Robot Safety, RIG TC: Safety and Reliability of AI-based Robotics
Abstract: Safety filters based on control barrier functions (CBFs) provide an effective mechanism for enforcing collision avoidance in robot manipulation, but their performance critically depends on the underlying geometric representation of the environment. This work presents a geometry-aware safety filter that models robot and environment geometries using superquadrics. Our approach yields a numerically stable, real-time-capable CBF safety filter, SQ-CBF, by leveraging signed distance functions and their efficient gradient estimation. Simulation and real-world experiments show that our proposed SQ-CBF achieves stable collision avoidance while improving teleoperation efficiency in highly constrained or dynamic scenes.
|
| |
| 16:19-16:25, Paper WedGA.10 | |
| Beyond Performance: Rethinking Quadruped Robots for Scalable Robotics Education |
|
| Schmidt, Annika | Technical University of Munich (TUM) |
| Gumpert, Thomas | German Aerospace Center (DLR) |
| Ehlert, Tristan | German Aerospace Center (DLR) |
| Calzolari, Davide | German Aerospace Center, Technical University of Munich |
| Raffin, Antonin | DLR |
| Seidel, Daniel | German Aerospace Center (DLR) |
| Loeffl, Florian | German Aerospace Center (DLR) |
| Griesbauer, Korbinian | Technical University of Munich, German Aerospace Center (DLR) |
| Lee, Jinoh | German Aerospace Center (DLR) |
| Lii, Neal Y | German Aerospace Center (DLR) |
| Albu-Schäffer, Alin | DLR - German Aerospace Center |
Keywords: Education Robotics, Human-Centered Robotics, Legged Robots
Abstract: Robots are becoming an integral part of our lives, yet most educational platforms remain simple wheeled systems with limited capabilities, particularly in terms of public acceptance and outreach. Thus, we highlight the opportunities presented by an educational legged robot, whose interaction capabilities can be scaled to learners of all ages to introduce coding, mechanics, and electronics in a playful, hands-on manner. Optional AI and voice interfaces further lower barriers for younger learners. By combining legged locomotion with progressive complexity, such a platform complements existing robots in education and opens new opportunities for engaging, embodied robotics education, which will gain importance in a future where robots may play a more integral role in our societies.
|
| |