Last updated on March 18, 2026. This conference program is tentative and subject to change.
Technical Program for Thursday, March 12, 2026

ThuBA
Oral Session 2 (Regular Session)
Chair: Vollmer, Anna-Lisa | Bielefeld University

09:30-09:36, Paper ThuBA.1
Flying Model-Based Gas Tomography for Carbon Dioxide Mapping: A Cooperative Robotic System
Schaab, Marius | Technical University of Munich (TUM)
Wiedemann, Thomas | German Aerospace Center (DLR)
Lilienthal, Achim J. | Technical University of Munich (TUM)
Keywords: Robotics in Hazardous Fields, RIG TC: Robot Perception, Environment Monitoring and Management
Abstract: Drones equipped with gas sensors monitor hazardous emissions; however, rotor downwash frequently renders onboard in-situ sensors ineffective by dispersing gas plumes. To address this, we present a cooperative robotic system designed for remote gas tomography. The architecture consists of a ground-based tracking unit and an aerial reflector unit, establishing an open measurement path that traverses the undisturbed plume while keeping the drone's propulsion external to the sensing volume. The system relies on a robust visual tracking and control framework that maintains precise laser alignment between the stationary ground robot and the moving aerial target. By fusing these remote measurements through model-based gas tomography, we generate spatial maps of gas distribution. Field validation confirms the system’s ability to bypass aerodynamic disturbances, outperforming standard in-situ drones and enabling environmental monitoring of gas concentrations.

09:36-09:42, Paper ThuBA.2
Bootstrapping Indoor Semantic Digital Twins from 2D Video
Alt, Benjamin | University of Bremen
Krohm, Luca | University of Bremen
Mania, Patrick | University of Bremen
Stefańczyk, Maciej | Warsaw University of Technology
Wilkowski, Artur | Institute of Control and Computation Engineering, Warsaw University of Technology
Beetz, Michael | University of Bremen

09:42-09:48, Paper ThuBA.3
Role-Played Human-Robot Interaction under Uncertainty and Its Impact on Trust
Shen, Shuyuan | University of Augsburg
Klein, Stina | University of Augsburg
Nasir, Jauwairia | University of Augsburg
Andre, Elisabeth | University of Augsburg
Kraus, Matthias | University of Augsburg
Keywords: Human-Centered Robotics, Design and Human Factors, RIG Cluster: Human-Robot Interaction
Abstract: Large language models (LLMs) are increasingly embedded in robotic systems to enable flexible, open-ended human–robot interaction. However, there are concerns that these models fail to communicate uncertainty, leading to overconfident behavior and potential overtrust by users, particularly in sensitive domains such as healthcare. This work used role-play to investigate how uncertainty is communicated in human–robot dialogue and how such uncertainty communication shapes trust. Forty-two participants took part in paired role-play sessions in which one participant enacted a robot and the other a user in a care facility scenario. Our analysis focused on how perceived uncertainty shaped trust, considering both the robot players’ self-assessed trust and the user players’ trust. Findings showed that when robot players perceived uncertainty, their self-assessed trust decreased significantly in both the capacity and moral dimensions, while interestingly, users’ trust also declined in the capacity dimension. We propose treating uncertainty communication as a two-party process and translating it into dialogue policies that calibrate trust, foster transparency, and support grounding for safer human–robot collaboration.

09:48-09:54, Paper ThuBA.4
Understanding and Addressing Mental Model Mismatches in Human-Robot Teaching
Richter, Phillip | Universität Bielefeld
Wersing, Heiko | Honda Research Institute Europe
Vollmer, Anna-Lisa | Bielefeld University
Keywords: Learning Categories and Concepts, RIG Cluster: Human-Robot Interaction
Abstract: A major challenge in human-robot interaction is the mental model mismatch, which arises when a human's understanding of a robot's capabilities differs from the robot's actual operational model. Such mismatches can result in ineffective teaching, suboptimal performance, and interaction breakdowns. This work aims to quantify and systematically categorize mental model mismatches by formalizing human expectations and comparing them with robot learning processes. We have developed metrics to measure mismatch and are currently developing a comprehensive taxonomy that identifies specific types of cognitive misalignments. By providing structured frameworks for understanding these mismatches, the goal is to enable targeted interventions fostering intuitive and effective cooperative learning between humans and robots.

09:54-10:00, Paper ThuBA.5
Adaptive Domestic Service Robotics through Foundation Models for Perception, Interaction, and Action
Memmesheimer, Raphael | University of Bonn
Pavlichenko, Dmytro | University of Bonn
Kruzhkov, Evgenii | University of Bonn
Bode, Jonas | University of Bonn
Schilke, Fynn | University of Bonn
Tutevych, Vitalii | University of Bonn
Lenz, Christian | University of Bonn
Schreiber, Michael | University of Bonn
Behnke, Sven | University of Bonn
Keywords: Domestic Robotics, Mobile Manipulation, RIG TC: Robotics Foundation Models
Abstract: This extended abstract presents an overview of the approaches and results of the NimbRo@Home team, winner of the RoboCup@Home 2024 Open Platform League (OPL) in Eindhoven, Netherlands and runner-up of the RoboCup@Home OPL in Salvador, Brazil. The competition evaluates domestic service robots in realistic household scenarios, requiring robust perception, interaction, manipulation, and autonomous task execution. Our work focuses on advancing generalization and robustness through the integration of foundation models for perception, human-robot interaction, and planning. A central contribution is the deployment of foundation models such as Large Language Models (LLMs) for human-robot interaction and task planning, Vision Language Models (VLMs) for scene state analysis, and open-vocabulary object segmentation for perception and grasping. Traditional RoboCup systems rely on closed-set, supervised vision pipelines that require extensive labeling and retraining for each competition environment. We demonstrate that promptable vision models enable segmentation and manipulation of previously unseen objects described only by natural language. This significantly reduces labeling overhead and allows the robot to adapt on the fly. We combine these models with a closed-set pipeline based on YOLO and MaskDINO, yielding a hybrid perception system that balances robustness, speed, and flexibility across tasks.

10:00-10:06, Paper ThuBA.6
How Far Are LLMs from AGI? Evidence from a Large World Problem
Zenkri, Oussama | Technische Universität Berlin
Brock, Oliver | Technische Universität Berlin
Keywords: Performance Evaluation and Benchmarking, RIG TC: AI-powered and Cognition-Enabled Robotics, RIG Cluster: Learning and Multimodal AI for Robotics
Abstract: Robotics stands to benefit profoundly from Artificial General Intelligence (AGI), which promises agents capable of solving complex, long-horizon problems, showing adaptive behavior, and performing robust decision-making in dynamic and uncertain environments. While Large Language Models (LLMs) are widely promoted as a viable pathway to this goal, their efficacy is typically benchmarked on tasks that fall into what Leonard Savage defined as the small world: static, fully observable, and deterministic. We evaluate LLMs, including reasoning models, on a long-horizon task designed to enforce large-world constraints, requiring the accumulation and updating of information across successive interactions and the formation of stable hypotheses about latent task structure across varying instances. The task prevents reliance on trial-specific recall and instead demands the identification of behavioral regularities. We contrast LLMs’ performance with a human baseline that serves as a reference for genuine generalization. Our evaluation reveals that current models consistently fail to leverage interaction history to infer the latent regularities that are necessary for robust generalization across related task instances. These findings indicate that, while LLMs are effective at small-world inference, they do not yet exhibit the cognitive properties required to cope with the demands of real-world problem solving.

10:06-10:12, Paper ThuBA.7
Versatile LiDAR Bundle Adjustment for Multi-Scan Alignment Utilizing Continuous-Time Trajectories
Wiesmann, Louis | University of Bonn
Marks, Elias Ariel | University of Bonn
Gupta, Saurabh | University of Bonn
Guadagnino, Tiziano | University of Bonn
Behley, Jens | University of Bonn
Stachniss, Cyrill | University of Bonn
Keywords: SLAM, Mapping, Localization
Abstract: Constructing precise global maps is a key task in robotics and is required for localization, surveying, monitoring, or constructing digital twins. To build accurate maps, data from mobile light detection and ranging (LiDAR) sensors is often used. Mapping requires correctly aligning the individual point clouds to each other to obtain a globally consistent map. In this paper, we investigate the problem of multi-scan alignment to obtain globally consistent point cloud maps. We propose a 3D LiDAR bundle adjustment approach that targets high-accuracy mapping and pose estimation. Our method solves the misalignment error of the corresponding scans in a joint least squares adjustment using all available data. We utilize a continuous-time trajectory to better model the actual ego-motion of the LiDAR instead of using the classical discrete-time assumption. To enable the joint optimization of thousands of LiDAR scans, we prune the search space of correspondences and utilize out-of-core circular buffers. We show that with our general optimization strategy, we can address tasks like simultaneous localization and mapping, multi-session alignment, and scan-to-map matching with different sensor types in different application areas.

10:12-10:18, Paper ThuBA.8
SE(3)-PoseFlow: Estimating 6D Pose Distributions for Uncertainty-Aware Robotic Manipulation
Jin, Yufeng | Technische Universität Darmstadt
Funk, Niklas Wilhelm | TU Darmstadt
Prasad, Vignesh | TU Darmstadt
Li, Zechu | Technische Universität Darmstadt
Franzius, Mathias | Honda Research Institute (HRI)
Peters, Jan | Technische Universität Darmstadt
Chalvatzaki, Georgia | Technische Universität Darmstadt
Keywords: Perception for Grasping and Manipulation
Abstract: Object pose estimation is challenging due to partial observability and symmetries, which often lead to pose ambiguity. While deterministic networks struggle to capture this multi-modality, we propose a novel probabilistic framework leveraging flow matching on the SE(3) manifold. Our approach models full 6D pose distributions via sample-based estimates, enabling robust reasoning about uncertainty in ambiguous scenarios. We achieve state-of-the-art results on Real275, YCB-V, and LM-O, and demonstrate how our uncertainty-aware estimates effectively guide downstream robotic tasks such as active perception and grasp synthesis.

10:18-10:24, Paper ThuBA.9
Minsound: Adding Internal Audio Sensing to Internal Vision Enables Human-Like In-Hand Fabric Recognition with Soft Robotic Fingertips
Andrussow, Iris | Max Planck Institute for Intelligent Systems
Solano, Jans | Max Planck Institute for Intelligent Systems
Richardson, Benjamin A. | Max Planck Institute for Intelligent Systems
Martius, Georg | Max Planck Institute for Intelligent Systems
Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems
Keywords: Force and Tactile Sensing, Soft Sensors and Actuators, RIG TC: Tactile Robotics
Abstract: Distinguishing the feel of smooth silk from coarse cotton is a trivial everyday task for humans. When exploring such fabrics, fingertip skin senses both spatio-temporal force patterns and texture-induced vibrations that are integrated to form a haptic representation of the explored material. In this work, we present a robotic system that can sense both of these types of haptic information, and we investigate how each type influences robotic tactile perception of fabrics. Our robotic hand's middle finger and thumb each feature a soft tactile sensor: one is the open-source Minsight sensor that uses an internal camera to measure fingertip deformation and force at 50 Hz, and the other is our new sensor Minsound that captures vibrations through an internal MEMS microphone with a bandwidth from 50 Hz to 15 kHz. Inspired by the movements humans make to evaluate fabrics, our robot actively encloses and rubs folded fabric samples between its two sensitive fingers. Experiments test the influence of each sensing modality on overall classification performance on a new dataset of 20 common fabrics; our transformer-based method achieves a maximum fabric classification accuracy of 97%. Incorporating an external microphone away from Minsound increases our method's robustness in loud ambient-noise conditions.

10:24-10:30, Paper ThuBA.10
Automated On-Site Assembly of Timber Building Components on the livMatS Biomimetic Shell
Lauer, Anja Patricia Regina | University of Stuttgart
Benner, Elisabeth | University of Stuttgart
Stark, Tim | University of Stuttgart
Klassen, Sergej | University of Stuttgart
Abolhasani, Sahar | University of Stuttgart
Schroth, Lukas | ETH Zürich
Gienger, Andreas | University of Stuttgart
Wagner, Hans-Jakob | University of Stuttgart
Schwieger, Volker | University of Stuttgart
Menges, Achim | Institute for Computational Design and Construction, University of Stuttgart
Sawodny, Oliver | University of Stuttgart

ThuGA
Interactive Session & Demos 2 & Lunch (Interactive Session)

11:30-13:00, Paper ThuGA.1
Pololu-Rs: A Rust-Based Framework for Reproducible Multi-Robot Experiments
Li, Jiaming | TU Berlin
Stentzler, Charlotte | TU Berlin
Roser, Johannes | TU Berlin
Hoenig, Wolfgang | TU Berlin
Keywords: RIG Cluster: Multi-Robot Systems, Software-Hardware Integration for Robot Systems, Wheeled Robots
Abstract: We introduce a modular, low-cost, open-source framework for differential-drive and tracked multi-robot experiments. The framework combines commercial off-the-shelf (COTS) hardware, a Rust-based real-time firmware, and a ROS 2 integration that together support reproducible, real-world multi-robot experiments. Its successful use in published research demonstrates its real-world applicability.

11:30-13:00, Paper ThuGA.2
Extended Abstract: Preferential Bayesian Optimization with Crash Feedback
Menn, Johanna | RWTH Aachen University
Stenger, David | RWTH Aachen University
Trimpe, Sebastian | RWTH Aachen University
Keywords: Human Factors and Human-in-the-Loop, Machine Learning for Robot Control
Abstract: Bayesian optimization is commonly used for black-box parameter tuning in robotics, but typically requires an explicit objective function, which is often unavailable in practice. Preferential Bayesian optimization (PBO) addresses this limitation by directly learning from human preferences. When applied to hardware systems, evaluating unsafe parameters can cause crashes, resulting in downtime and increased hardware stress. Standard PBO cannot exploit feedback from such failures and therefore repeatedly explores unsafe regions. We introduce CrashPBO, an extension of PBO that incorporates both preference feedback and explicit crash reports. Synthetic benchmarks show a substantial reduction in crashes, and experiments across three real robotic platforms demonstrate the potential to reduce tuning time and hardware strain, resulting in a more reliable and practical framework for parameter learning with human feedback. Video: https://tinyurl.com/crashpbo

11:30-13:00, Paper ThuGA.3
Risk Awareness and Management for Autonomous Robots: Assessing Non-Perceivable Hazards through Context-Aware Safety Adaptation
Wolf, Patrick | University of Kaiserslautern-Landau | Fraunhofer IESE
Helten, Catharina | RPTU University of Kaiserslautern-Landau
Adler, Rasmus | Fraunhofer IESE
Schneider, Daniel | Fraunhofer IESE
Keywords: Robot Safety, RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics, RIG TC: Safety and Reliability of AI-based Robotics
Abstract: Autonomous robots can operate under a wide range of conditions and must act safely, which raises the question of how to describe all unsafe conditions. Machinery manufacturers are responsible for specifying safe operating conditions, but operators are responsible for ensuring they are met. This interface between manufacturers and operators is becoming increasingly challenging because manufacturers do not know how operators will use the robots. A further challenge is that some dangerous situations cannot be detected by sensors alone and must be managed by restricting the robot's operational domain. This paper discusses these challenges and proposes an approach to incorporate context-aware risk assessment into a robot's runtime autonomy. The central idea is demonstrated using a Unitree Go2 legged robot and an autonomous Unimog truck in a realistic workshop scenario.

11:30-13:00, Paper ThuGA.4
Fast IMU-Based Contact Detection and Reaction for Spatial Parallel Robots for Human-Robot Collaboration
Piosik, Jan | Leibniz Universität Hannover
Mohammad, Aran | Leibniz University Hannover
Seel, Thomas | Leibniz Universität Hannover
Schappler, Moritz | Institute of Mechatronic Systems, Leibniz Universität Hannover
Keywords: RIG Cluster: Human-Robot Interaction, Parallel Robots, Safety in HRI
Abstract: Ensuring safety in human-robot collaboration (HRC) requires rapid contact detection and reaction to minimize potential damage. While such strategies are established for serial and demonstrated for planar parallel robots, spatial parallel robots (PRs) pose unique challenges due to their complex, nonlinear dynamics. This work presents a framework for fast contact detection and reaction specifically developed for spatial PRs. The approach utilizes an Unscented Kalman Filter (UKF) for sensor fusion of IMU and encoder data to directly determine external forces. Four distinct reaction strategies—stop, zero-g, retraction, and reflex—are investigated to terminate contact immediately upon detection. Experimental results demonstrate that the IMU-based detection is approximately 3.6 times faster than a conventional momentum observer. Furthermore, the investigated strategies successfully terminate contact within a short time. These findings provide a vital foundation for implementing safe HRC in spatial parallel robotic systems.

11:30-13:00, Paper ThuGA.5
AI-Assisted Risk Assessment for Safe Industrial Robot Applications
Stuhlenmiller, Florian | ABB AG Corporate Research Center Germany
Dai, Fan | ABB AG, Corporate Research Germany
Benzi, Federico | ABB AG
Keywords: Robot Safety, AI-Based Methods, Agent-Based Systems
Abstract: Designing safe industrial robot applications is essential to protect people, equipment, and the environment. The growing complexity of robot applications increases the respective design effort. To support the early design phase, an agent-based approach leveraging generative artificial intelligence is presented to semi-automatically identify hazards and propose feasible risk reduction measures under human supervision. A prototype implementation demonstrates the potential to reduce engineering effort, indicating its applicability in industrial scenarios.

11:30-13:00, Paper ThuGA.6
A Multi-View Heterogeneous Multi-Robot Dataset for Relative Localization and Collaborative Perception in Dynamic Scenes
Lichtenfeld, Jonathan | Technical University of Darmstadt
von Stryk, Oskar | Technische Universität Darmstadt
Keywords: RIG Cluster: Multi-Robot Systems, RIG Cluster: Field Robotics, Multi-Robot SLAM
Abstract: Multi-robot research in localization and mapping has primarily focused on large-scale SLAM in static environments, often lacking the mutual visibility needed for close-range coordination. We propose a novel multi-sensor dataset featuring heterogeneous platforms with frequent mutual line-of-sight in highly dynamic scenarios. Currently in development, the dataset will provide precise ground-truth poses and pointwise dynamic labels to facilitate research in robust relative localization and multi-robot moving object detection. By offering simultaneous, diverse viewpoints of dynamic scenes, this dataset is designed to enable algorithms that overcome occlusions and improve collective scene understanding. We invite researchers within the Robotics Institute Germany (RIG) to discuss and potentially collaborate on this initiative to establish a highly recognized RIG multi-robot benchmark.

11:30-13:00, Paper ThuGA.7
PINGS: Gaussian Splatting Meets Distance Fields within a Point-Based Implicit Neural Map
Pan, Yue | University of Bonn
Zhong, Xingguang | University of Bonn
Jin, Liren | University of Bonn
Wiesmann, Louis | University of Bonn
Popovic, Marija | TU Delft
Behley, Jens | University of Bonn
Stachniss, Cyrill | University of Bonn
Keywords: SLAM, Mapping, RIG TC: Robot Perception
Abstract: Robots benefit from high-fidelity reconstructions of their environment, which should be geometrically accurate and photorealistic to support downstream tasks. While this can be achieved by building distance fields from range sensors and radiance fields from cameras, realizing scalable incremental mapping of both fields consistently and at the same time with high quality is challenging. In this paper, we propose a novel map representation that unifies a continuous signed distance field and a Gaussian splatting radiance field within an elastic and compact point-based implicit neural map. By enforcing geometric consistency between these fields, we achieve mutual improvements by exploiting both modalities. We present a novel LiDAR-visual SLAM system called PINGS using the proposed map representation and evaluate it on several challenging large-scale datasets. Experimental results demonstrate that PINGS can incrementally build globally consistent distance and radiance fields encoded with a compact set of neural points. Compared to state-of-the-art methods, PINGS achieves superior photometric and geometric rendering at novel views by constraining the radiance field with the distance field. Furthermore, by utilizing dense photometric cues and multi-view consistency from the radiance field, PINGS produces more accurate distance fields, leading to improved odometry estimation and mesh reconstruction.

11:30-13:00, Paper ThuGA.8
Towards Industrial Robot As a Service: Current Status and Future Research Directions
Tanz, Lukas | Technical University of Munich
Geng, Paul | Technical University of Munich (TUM)
Daub, Rüdiger | Technical University of Munich (TUM), Fraunhofer IGCV
Keywords: Industrial Robots, Flexible Robotics, Factory Automation
Abstract: Industrial robot as a service aims to increase the flexibility and economic viability of industrial automation in high-mix, low-volume production environments. Instead of focusing on robot-specific programming, this paradigm requires integrated solutions for rapid programming and commissioning, robust perception under uncertainty, and efficient adaptation of hardware to changing products and processes. Recent advances in artificial intelligence, including large language models, symbolic planning, and data-driven optimization, have enabled new approaches to these challenges, but essential industrial requirements remain unmet. This paper highlights current research gaps in programming and commissioning, learning perception skills, and hardware adaptation.

11:30-13:00, Paper ThuGA.9
How Can AI Empower Autonomous Sediment Sampling in Deep-Sea Environments?
Sourkounis, Cora Maria | Leibniz University Hannover
Kwasnitschka, Tom | GEOMAR Helmholtz Centre for Ocean Research Kiel
Raatz, Annika | Leibniz Universität Hannover
Keywords: RIG TC: AI-driven Marine Robotics, RIG Cluster: Field Robotics
Abstract: This article presents a project focused on accelerating suction sampling in deep-sea environments through the development of an innovative robotic system. The novel design aims to reduce transportation and preparation time while enhancing the sampling process itself, as manual operation of deep-sea robots is notably time-consuming. Following the completion of the robotic system concept, the current phase involves optimizing the design of a suction sampling pipeline. This pipeline begins with the generation of a 3D model of the target area using stereo camera data. The ultimate goal is to achieve a seamless and largely automated sampling process. This article introduces the discussion on how artificial intelligence can enhance and support this pipeline, paving the way for more efficient and effective deep-sea sediment sampling.

11:30-13:00, Paper ThuGA.10
Child and Parent Perspectives on SonoBox: A Robotic Contactless Ultrasound System for Pediatric Forearm Fracture Diagnosis
Golwalkar, Rucha | University of Lübeck
Polzin, Louis | University of Lübeck
de Vries, Anton | University of Lübeck
Tüshaus, Ludger | Klinik für Kinderchirurgie, Universitätsklinikum Schleswig-Holstein, Lübeck
Ernst, Floris | University of Lübeck

11:30-13:00, Paper ThuGA.11
FlowTouch: View-Invariant Visuo-Tactile Prediction
Bien, Seongjin | University of Technology Nuremberg
Kneissl, Carlo | LMU Munich
Ressler-Antal, Thomas | Ludwig Maximilian University of Munich
Fundel, Frank | LMU Munich
Jülg, Tobias Thomas | University of Technology Nuremberg
Walter, Florian | Technical University Munich
Ommer, Bjorn | LMU Munich
Kutyniok, Gitta | Ludwig Maximilian University of Munich
Burgard, Wolfram | University of Technology Nuremberg
Keywords: RIG TC: Tactile Robotics, RIG Cluster: Learning and Multimodal AI for Robotics
Abstract: Humans can predict tactile sensation from visual stimuli. This remarkable ability helps us to guide the way we manipulate objects and interact with our environment. However, tactile prediction is not an isolated capability that arises solely from vision. It draws upon prior experiences and models formed about the environment. Developing a similar capability for robots should thus not be viewed as a purely vision-based task, but rather as a problem that requires multiple modalities to achieve similar results. In this work, we propose FlowTouch, which builds upon these premises for tactile state prediction. It leverages 3D information of the target object to extract more information than would be available in an image alone, in order to achieve more robust tactile prediction capabilities that can generalize to a wider range of objects.

11:30-13:00, Paper ThuGA.12
MuST-C: The Multi-Sensor, Multi-Temporal, and Multi-Crop Dataset for In-Field Phenotyping and Monitoring
Chong, Yue Linn | University of Bonn
Krämer, Julie | Forschungszentrum Jülich
Chakhvashvili, Erekle | Forschungszentrum Jülich
Marks, Elias Ariel | University of Bonn
Esser, Felix | University of Bonn
Dreier, Ansgar | University of Bonn
Rosu, Radu Alexandru | University of Bonn
Warstat, Kevin | Forschungszentrum Jülich
Pude, Ralf | University of Bonn
Behnke, Sven | University of Bonn
Muller, Onno | Forschungszentrum Jülich
Rascher, Uwe | Forschungszentrum Jülich GmbH
Kuhlmann, Heiner | University of Bonn
Stachniss, Cyrill | University of Bonn
Behley, Jens | University of Bonn
Klingbeil, Lasse | University of Bonn
Keywords: Data Sets for Robotic Vision, Robotics and Automation in Agriculture and Forestry, RIG TC: Agri-Robotics
Abstract: Phenotyping is crucial for understanding crop trait variation and advancing research, but is currently limited by expensive, labor-intensive monitoring. New phenotypic trait monitoring methods are being proposed to reduce this so-called phenotyping bottleneck via automation. These methods are often data-driven, requiring a dataset recorded with a specific sensor and corresponding reference values for developing novel methods. To this end, we present the MuST-C (Multi-Sensor, multi-Temporal, multiple Crops) dataset, which contains field data from various sensors collected over a growing season, covering six crop species. All data was georeferenced for alignment across sensors and dates. To collect our dataset, we deployed aerial and ground robotic platforms equipped with RGB cameras, LiDARs, and multispectral cameras, aiming to capture a wide variety of modalities and observations from different viewpoints. In addition to sensor data, we also provide manually collected leaf area index and biomass reference measurements. Our dataset enables the development of novel automatic phenotypic trait estimation methods, allows comparisons across different sensors, and supports generalization across crop species.

11:30-13:00, Paper ThuGA.13
ReMoRA: A Resilient and Modular Framework for Building Dependable Robotic Applications
Wu, Ruichao | Fraunhofer IPA
Youssef, Mohamed | University of Stuttgart
Kahl, Bjoern | Fraunhofer IPA
Kraus, Werner | Fraunhofer IPA
Morozov, Andrey | University of Stuttgart
Keywords: Software Tools for Robot Programming, Engineering for Robotic Systems, RIG TC: Integration and Engineering of Industrial Robot Systems
Abstract: Robotic systems increasingly require modular and resilient software that supports reuse, runtime awareness, and structured data access. However, many existing solutions lack explicit execution supervision and semantically consistent execution traces, limiting both reliability and learning-based research. This paper presents the Resilient Modular Robotic Application (ReMoRA) framework, a ROS 2-based architecture that structures robotic applications into reusable skill servers with standardized interfaces and explicit control flows. By embedding quality supervision at skill boundaries, ReMoRA enables fault containment and transparent execution monitoring. Beyond application development, ReMoRA provides structured execution data that supports research directions such as predictive maintenance, noise-aware data collection for imitation learning, and learning over a robot's operational lifetime.

11:30-13:00, Paper ThuGA.14
Zero-Shot Semantic Object Placement with Foundation Models
Mirjalili, Reihaneh | University of Technology Nuremberg
Krawez, Michael | University of Technology Nuremberg
Blei, Yannik | University of Technology Nuremberg
Walter, Florian | Technical University Munich
Burgard, Wolfram | University of Technology Nuremberg
Keywords: Perception for Grasping and Manipulation, AI-Enabled Robotics, RIG TC: Robotics Foundation Models
Abstract: In this paper, we present a zero-shot object placement pipeline that uses pretrained foundation models to place objects in semantically appropriate set-down orientations. Given a single RGB image, we reconstruct a 3D model and estimate the object pose. We then render a small set of axis-aligned candidate orientations and prompt a vision-language model (VLM) to choose the orientation that matches proper placement. Next, we convert the chosen orientation into an end-effector rotation and execute it on the robot. We refer to this estimation-and-execution step as an alignment cycle. We repeat the alignment cycle once more after a fixed 90° yaw reorientation, and then place the object on a planar surface. Experiments on six household objects across multiple initial roll-pitch configurations achieve an average placement success rate of 0.87, with failures primarily due to perception challenges (e.g., transparent objects) or incorrect orientation selection by the vision-language model.
|
| |
| 11:30-13:00, Paper ThuGA.15 | |
| LLM-Agent Supported Programming of Micro-Assembly Machines |
|
| Wiemann, Rolf | Leibniz University of Hannover |
| Terei, Niklas | Leibniz University of Hanover |
| Raatz, Annika | Leibniz Universität Hannover |
|
|
| |
| 11:30-13:00, Paper ThuGA.16 | |
| Toward Human-Like Locomotion through Modal Gait Decomposition and Optimal Control |
|
| Kist, Arian | Technical University of Munich |
| Flor, Isabella | Technical University Munich |
| Perrin, Clément | Technische Universität München |
| Rixen, Daniel | Technische Universität München |
Keywords: Humanoid and Bipedal Locomotion, Motion Control, Whole-Body Motion Planning and Control
Abstract: The degree to which a bipedal robot's locomotion pattern resembles that of a human is rarely quantified or explicitly considered in motion planning. In this work, we present an approach to address this issue. Using modal decomposition techniques on gait data enables a quantitative analysis of motion patterns and precise comparison of bipedal robot and human locomotion. Based on this, we define an optimal control problem for bipedal locomotion to actively generate a human-like gait. Using a planar minimal bipedal model, we demonstrate preliminary results that indicate an improved human-like motion pattern and provide an outlook on future work.
|
| |
| 11:30-13:00, Paper ThuGA.17 | |
| Learning Dexterous Manipulation with Three Independent Fingers from Human Demonstrations |
|
| Gürtler, Nico | Uni. Tübingen and Max Planck Institute for Intelligent Systems |
| Andrussow, Iris | Max-Planck-Institute for Intelligent Systems |
| Walia, Rohan | University of Tübingen |
| Schölkopf, Bernhard | Max Planck Institute for Intelligent Systems |
| Martius, Georg | Uni. Tübingen and Max Planck Institute for Intelligent Systems |
Keywords: Dexterous Manipulation, In-Hand Manipulation, Imitation Learning
Abstract: Humans have proven to be powerful teachers for robot manipulation skills via imitation learning. How can we leverage this potential for robots with a morphology that differs from humans? In this work, we demonstrate that teleoperation of a three-fingered robot morphology is both feasible and effective for dexterous manipulation tasks. To address the challenges posed by the embodiment gap between human demonstrators and non-humanoid robots, we investigate three teleoperation strategies: fingertip matching using hand tracking from a commercial AR headset, direct control via motion controllers, and kinesthetic teaching with a leader robot. For each of the three strategies, we collect demonstrations on a suite of dexterous manipulation tasks, including assembling a 3D-printed object and folding a napkin. We then train manipulation policies with state-of-the-art imitation learning methods and evaluate their success on the respective tasks. The policies trained on data collected via motion controllers and kinesthetic teaching generally outperform those trained on hand-tracking data.
|
| |
| 11:30-13:00, Paper ThuGA.18 | |
| End-To-End Low-Level Neural Control of an Industrial-Grade 6D Magnetic Levitation System |
|
| Hartmann, Philipp | Bielefeld University |
| Stranghöner, Jannick | Bielefeld University |
| Neumann, Klaus | Bielefeld University / Fraunhofer IOSB-INA |
Keywords: RIG TC: Foundations of Optimization and Learning for Robotics, RIG Cluster: Learning and Multimodal AI for Robotics, Neural and Fuzzy Control
Abstract: Magnetic levitation (MagLev) is poised to revolutionize industrial automation by integrating flexible product transport and seamless manipulation. However, controlling such systems is inherently difficult due to their complex, unstable dynamics. Traditional control methods depend on complex, hand-crafted pipelines that are sensitive to model mismatches, resulting in robust but conservative solutions. In contrast, we present the first neural controller for 6D MagLev. Trained end-to-end on interaction data from a proprietary controller, it maps raw sensor data directly to coil currents. The controller demonstrates robust stabilization, generalizes to unseen trajectories, and extrapolates to previously unseen situations while maintaining accurate and robust control. This suggests that learning-based control can effectively substitute traditional engineering in demanding high-frequency physical systems. Demonstration videos are publicly available at https://sites.google.com/view/neural-maglev.
|
| |
| 11:30-13:00, Paper ThuGA.19 | |
| MoLaB - A Benchmark for Mobile Mapping Systems |
|
| Wagner, Markus | University of Bonn |
| Stapper, Tobias | University of Bonn |
| Klingbeil, Lasse | University of Bonn |
| Kuhlmann, Heiner | University of Bonn |
Keywords: Mapping, Performance Evaluation and Benchmarking, SLAM
Abstract: The generation of maps of the environment is one of the tasks for which mobile sensing systems, such as robots, are utilized. These maps are created using various perception sensors, such as LiDARs, in conjunction with the pose information of the system. Determining the accuracy of these maps is challenging due to multiple influencing factors, including pose estimation and system calibration, which impact the final map. We propose a method to benchmark the 3D mapping accuracy of mobile sensing systems using a freely accessible test environment with highly accurate reference data. We evaluate the accuracy of various parameters derived from the generated 3D map of the test environment, which are relevant for real-world applications. Our approach can assess mapping accuracy under different conditions, such as changing environmental settings, and provides insights into correlations primarily arising from pose estimation.
|
| |
| 11:30-13:00, Paper ThuGA.20 | |
| FARM: Force-Aware Robotic Manipulation with Tactile-Conditioned Diffusion Policies |
|
| Helmut, Erik | Technische Universität Darmstadt |
| Funk, Niklas Wilhelm | TU Darmstadt |
| Schneider, Tim | Technical University Darmstadt |
| de Farias, Cristiana | TU Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
Keywords: Imitation Learning, Deep Learning Methods, Force and Tactile Sensing
Abstract: Contact-rich manipulation requires precise force control, yet many imitation-learning approaches treat visuotactile feedback as a passive observation rather than an explicit control target. In this work, we present Force-Aware Robotic Manipulation (FARM), an imitation learning framework that leverages high-dimensional tactile data to define a force-based action space. Using a modified version of the handheld Universal Manipulation Interface (UMI) gripper equipped with a GelSight Mini tactile sensor, we collect human demonstrations and deploy them on a matching actuated gripper. During policy rollouts, the proposed FARM diffusion policy jointly predicts robot pose, grip width, and grip force. FARM outperforms several baselines across high-force, low-force, and dynamic force adaptation tasks, demonstrating the advantages of force-grounded, high-dimensional tactile observations and a force-based control space. The codebase and design files are open-sourced and available at https://tactile-farm.github.io.
|
| |
| 11:30-13:00, Paper ThuGA.21 | |
| Stein Variational Ergodic Surface Coverage with SE(3) Constraints |
|
| Li, Jiayun | Technical University of Darmstadt |
| Jin, Yufeng | Technische Universität Darmstadt |
| Teng, Sangli | University of California, Berkeley |
| Gong, Dejian | Technical University of Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Constrained Motion Planning, Optimization and Optimal Control
Abstract: Robotic surface manipulation requires generating trajectories that achieve comprehensive coverage of complex 3D surfaces while maintaining precise end-effector poses in SE(3). Although ergodic trajectory optimization (TO) provides a principled framework for coverage, existing approaches struggle on discrete point-cloud surfaces due to highly nonconvex objectives and the lack of manifold-aware sampling mechanisms. This work presents TSVEC, a sampling-based ergodic trajectory optimization framework that extends Stein Variational Gradient Descent (SVGD) to SE(3) and incorporates trajectory-level preconditioning. By formulating point-cloud ergodic coverage as inference on the SE(3) manifold, TSVEC enables parallel exploration of multiple trajectory modes while preserving geometric consistency. A Gauss–Newton preconditioner further mitigates the severe ill-conditioning inherent in long-horizon ergodic optimization. Experiments on point-cloud surface coverage benchmarks and real-world robotic surface drawing tasks demonstrate that TSVEC consistently produces higher-quality coverage trajectories than representative optimization-based and sampling-based baselines, with successful validation on a robot manipulator.
|
| |
| 11:30-13:00, Paper ThuGA.22 | |
| From Expert Fusion to Scalable Reinforcement Learning for Complex Legged Robots |
|
| Enslin, Louis-Elias | Karlsruhe Institute of Technology |
| Eichmann, Christian | FZI Research Center for Information Technology |
| Roennau, Arne | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG TC: Manyfold Legged Locomotion in Various Terrains, RIG Cluster: Legged Locomotion
Abstract: This extended abstract presents a reinforcement learning (RL) approach for complex legged robots with many degrees of freedom. The work focuses on simplifying training through a multi-expert policy distillation method developed for the six-legged robot LAURON VI. Several expert policies were trained for different terrains and then combined into one generalized policy through imitation learning. The resulting controller showed smooth transitions between tasks, reduced reward shaping requirements, and improved generalization compared to single-policy RL. The approach was successfully tested in simulation and transferred to the real robot. Based on these results, future research will explore scalable RL methods for robots with higher complexity, such as quadrupeds with manipulation arms or physically coupled robots, aiming to make RL training more efficient and adaptable for real-world applications.
|
| |
| 11:30-13:00, Paper ThuGA.23 | |
| Evidential Learning of Semantic Scene Graphs for Occlusion-Aware Pepper Plant Perception |
|
| Mueller-Goldingen, Niklas | University of Bonn |
| Menon, Rohit | University of Bonn |
| Pan, Sicong | University of Bonn |
| Chenchani, Gokul Krishna Gandhi | Hochschule Bonn-Rhein-Sieg |
| Bennewitz, Maren | University of Bonn |
Keywords: Semantic Scene Understanding, RIG TC: Semantic Perception, Representation Learning
Abstract: Automated harvesting in dense horticultural environments remains challenging due to complex plant topology, severe occlusions, and limited sensor viewpoints. In sweet pepper plants, fruits, stems, peduncles, and leaves form articulated structures that are often only partially observable, leading to failures when hidden attachments or occlusions cannot be inferred reliably. While modern perception pipelines can detect and map plant instances, they typically do not explicitly model structural relations or quantify uncertainty arising from partial observability. This paper proposes an uncertainty-aware semantic scene graph formulation for pepper plant perception based on evidential deep learning. From geometric observations of plant organs, we infer semantic scene graphs whose nodes represent plant organs and whose edges encode attachment and direction-conditioned occlusion relations. A GraphSAGE-based graph neural network with evidential prediction heads models uncertainty in node and edge predictions using Dirichlet distributions. Occlusion relations are supervised using a projection-based geometric formulation in a fruit-local reference frame. Experiments on a procedurally generated dataset show that the proposed approach accurately predicts plant structure while expressing meaningful uncertainty for ambiguous relations caused by occlusion. The resulting uncertainty estimates support downstream decisions such as targeted leaf manipulation and next-best-view planning.
|
| |
| 11:30-13:00, Paper ThuGA.24 | |
| A Novel Powered Jaw Exoskeleton to Treat Temporomandibular Disorders: Design and Control Challenges |
|
| Müller, Paul-Otto | Technical University of Darmstadt |
| von Stryk, Oskar | Technische Universität Darmstadt |
Keywords: Rehabilitation Robotics, RIG Cluster: Healthcare Robotics and Human Augmentation, RIG TC: Robotic Augmentation of the Human Body
Abstract: Temporomandibular disorders severely impair masticatory function and quality of life. While powered jaw exoskeletons offer potential for rehabilitation, they face challenges related to complex biomechanics and safe force transmission. This paper presents a novel hybrid active jaw exoskeleton that addresses these issues by combining a rigid chin mechanism for precise force application with a compliant facial interface for enhanced safety. We develop a high-fidelity MuJoCo simulation and outline a control concept to handle partial observability and soft dynamics. This integrates a learned, deformation-aware dynamics model with latent states into a constrained, differentiable model predictive control scheme tuned via RL. This work establishes a foundation for safe, wearable robot-assisted therapy for temporomandibular disorders.
|
| |
| 11:30-13:00, Paper ThuGA.25 | |
| Sparse and Dense Rendering for Event-Based 3D Gaussian Splatting |
|
| Kohyama, Kai | Keio University |
| Aoki, Yoshimitsu | Keio University |
| Gallego, Guillermo | Technische Universität Berlin |
| Shiba, Shintaro | Keio University |
Keywords: RIG TC: Robot Perception, Computer Vision for Automation, SLAM
Abstract: Event cameras offer advantages over traditional frame-based cameras, making them suitable for motion and structure estimation. However, it is unclear how event-based 3D Gaussian Splatting (3DGS) approaches can leverage fine-grained temporal information of sparse events. This work proposes a framework to address the trade-off between accuracy and temporal resolution in event-based 3DGS. Our key idea is to decouple the rendering into two branches: sparse, event-by-event geometry (depth) rendering, and dense, snapshot-based radiance (intensity) rendering, by using ray-tracing and the image of warped events. Our method achieves state-of-the-art performance on real-world datasets and competitive results on a synthetic dataset. It works without prior information (e.g., pretrained image reconstruction models) or COLMAP-based initialization, is more flexible in the number of events sliced, and achieves sharp reconstruction on scene edges with fast training time. We hope that this work deepens our understanding of the sparse nature of events for 3D reconstruction.
|
| |
| 11:30-13:00, Paper ThuGA.26 | |
| Class-Incremental End-To-End Motion Prediction |
|
| Schischka, Nicolas | University of Freiburg |
| Gosala, Nikhil | University of Freiburg |
| Ravi, Kiran | Qualcomm |
| Yogamani, Senthil | Qualcomm |
| Valada, Abhinav | University of Freiburg |
Keywords: Deep Learning for Visual Perception, Incremental Learning, Continual Learning
Abstract: In recent years, end-to-end autonomous driving models for motion prediction and planning have become increasingly popular due to their potential to cope with imperfect detections by leveraging end-to-end differentiability. Camera-based end-to-end systems, in particular, have emerged as a promising alternative to LiDAR-centric pipelines due to their affordability. In this abstract, we focus on motion forecasting, which aims to predict the future movement of all agents present in a scene over the next few seconds. Achieving this in the most effective manner is crucial for scene understanding and subsequent planning of the ego-vehicle. Despite this progress, most existing approaches implicitly assume that exhaustive annotations for all classes are available during training, which constrains real-world deployability. In practice, operational domains evolve as new classes need to be added due to regional differences or novel modes of transportation, such as e-scooters. Accommodating such changes typically requires full retraining, incurring substantial computational cost and delaying deployment. A more practical alternative is class-incremental learning, where the model is updated to recognize and forecast newly introduced agent classes using annotations for those classes only, while retaining performance on previously learned ones. Although class-incremental learning has been studied in other perception tasks such as 2D object detection and semantic segmentation, it remains largely unexplored for camera-based end-to-end motion forecasting. Moreover, even in continual learning work on tracklet-based motion forecasting that relies on ground-truth detections as input, class-incremental settings have received little attention, leaving a notable gap in the current literature.
|
| |
| 11:30-13:00, Paper ThuGA.27 | |
| Temporal Task Segmentation and Attribute-Based Rules for Hazard Identification in Human-Robot Collaboration |
|
| Scharping, Robert | Fraunhofer Institute for Factory Operation and Automation IFF |
| Öltjen, Julian | Voraus Robotik GmbH |
| Bollmann, Yannick | Fraunhofer Institute for Factory Operation and Automation IFF |
| Behrens, Roland | Fraunhofer IFF |
| Stark, Alexander | Voraus Robotik GmbH |
Keywords: Human-Robot Collaboration, Human-Centered Robotics, Human Factors and Human-in-the-Loop
Abstract: This work presents an attribute-based method for systematic hazard identification and risk assessment in human-robot collaboration applications. By formalizing ISO 12100 concepts through rule-based reasoning and temporal task segmentation, the approach reduces the possible hazard space while preserving qualified personnel's responsibility.
|
| |
| 11:30-13:00, Paper ThuGA.28 | |
| Dynamic Human-To-Robot Object Handover with VLM-Based Intention Detection and Movement Primitives |
|
| Rietsch, Sebastian | Karlsruhe Institute of Technology (KIT) |
| Ruf, Lukas | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Human-Robot Interaction, Human-Robot Collaboration
Abstract: This work presents an initial exploration of using Vision-Language Models (VLMs) for dynamic Human-to-Robot (H2R) handovers, integrating VLM-based intention detection with Via-Point Movement Primitives (VMPs) for adaptive motion generation. By employing a structured chain-of-thought prompt and a majority vote over a prediction circular buffer, the system achieves 95.1% handover intention detection accuracy on the ARMAR-6 robot without task-specific training. Preliminary results suggest the approach can react dynamically to changing human behaviors and grasp strategies, though our evaluation reveals current challenges that must be addressed before practical deployment.
|
| |
| 11:30-13:00, Paper ThuGA.29 | |
| From Demonstrations to Safe Deployment: Path-Consistent Safety Filtering for Diffusion Policies |
|
| Römer, Ralf | Technical University of Munich |
| Balletshofer, Julian | Technical University of Munich |
| Thumm, Jakob | Technical University of Munich |
| Pavone, Marco | Stanford University |
| Schoellig, Angela P. | TU Munich |
| Althoff, Matthias | Technische Universität München |
Keywords: RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics, RIG Cluster: Learning and Multimodal AI for Robotics, RIG Cluster: Human-Robot Interaction
Abstract: Diffusion policies (DPs) achieve state-of-the-art performance on complex, long-horizon manipulation tasks by learning from expert demonstration datasets. However, since they cannot guarantee safe behavior, external safety mechanisms are needed. These, however, alter actions in ways unseen during training, causing unpredictable behavior and performance degradation. To address these problems, we propose path-consistent safety filtering (PACS) for DPs. Our approach performs path-consistent braking on a trajectory computed from the sequence of generated actions, keeping execution consistent with the policy's training distribution. We verify safety using set-based reachability analysis, enabling real-time deployment at 1kHz. Our experimental evaluation in simulation and on three challenging real-world human-robot interaction tasks shows that PACS (a) provides formal safety guarantees in dynamic environments, (b) preserves task success rates, and (c) outperforms reactive safety approaches, such as control barrier functions, by up to 68% in task success. Videos and extensive results are available at tum-lsy.github.io/pacs.
|
| |
| 11:30-13:00, Paper ThuGA.30 | |
| A Framework for Learning Temporal Task Constraints for Bimanual Manipulation Tasks from Human Demonstration |
|
| Dreher, Christian R. G. | Karlsruhe Institute of Technology (KIT) |
| Dormanns, Patrick | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, RIG TC: AI-powered and Cognition-Enabled Robotics
Abstract: This work presents a framework for learning temporal task constraints for bimanual manipulation tasks from human demonstration. The approach integrates three parts: assessing temporal relationships between actions, inferring symbolic and subsymbolic task constraints, and generating executable temporal plans for robots. By combining qualitative Allen relations with quantitative timing parameters, the framework enables flexible and precise synchronization of bimanual actions in robot task executions.
|
| |
| 11:30-13:00, Paper ThuGA.31 | |
| Augmented Reality for RObots (ARRO): Pointing Visuomotor Policies towards Visual Robustness |
|
| Mirjalili, Reihaneh | University of Technology Nuremberg |
| Jülg, Tobias Thomas | University of Technology Nuremberg |
| Walter, Florian | Technical University Munich |
| Burgard, Wolfram | University of Technology Nuremberg |
Keywords: RIG TC: Robotics Foundation Models, Imitation Learning, AI-Enabled Robotics
Abstract: In this paper, we present ARRO, a novel visual representation that leverages zero-shot open-vocabulary segmentation and object detection models to efficiently mask out task-irrelevant regions of the scene in real time without requiring additional training, modeling of the setup, or camera calibration. By filtering visual distractors and overlaying virtual cues during both training and inference, ARRO improves robustness to scene variations and reduces the need for additional data collection. We extensively evaluate ARRO with Diffusion Policy on a range of tabletop manipulation tasks in real-world environments, and further demonstrate its compatibility and effectiveness with generalist robot policies, such as Octo, OpenVLA and pizero. Across all settings in our evaluation, ARRO yields consistent performance gains, allows for selective masking to choose between different objects, and shows robustness even to challenging segmentation conditions. Videos showcasing our results are available at: https://augmented-reality-for-robots.github.io/
|
| |
| 11:30-13:00, Paper ThuGA.32 | |
| Dynamics-Informed Vision–Language Models: An Extended Abstract on Dynamics-Aware Reasoning towards Next Generation Autonomous Systems |
|
| Schäfer, Finn Rasmus | Technical University Munich |
| Betz, Johannes | Technical University of Munich |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, AI-Enabled Robotics, Formal Methods in Robotics and Automation
Abstract: Recent advances in vision–language models (VLMs) and vision–language–action (VLA) architectures have enabled impressive semantic understanding and generalization capabilities in robotics and autonomous driving. However, current foundation models are predominantly vision-centric and often abstract away the agent’s internal physical state. In contrast, classical robotic and autonomous driving systems explicitly rely on ego-dynamics, system constraints, and physical feasibility to ensure safe and reliable behavior. This discrepancy leads to an increasing semantic–dynamic mismatch between what learned models reason about and what embodied systems can physically execute, particularly in safety-critical and out-of-distribution scenarios. In this extended abstract, we argue that the next generation of VLM and VLA architectures must move beyond vision-dominant representations and toward dynamics-informed multimodal alignment. We propose treating ego-motion and dynamic state as first-class modalities that condition semantic reasoning, rather than as post-hoc safety filters. By embedding vision, language, and dynamics into a shared representation space, models can align intent with physical feasibility and execution constraints. We discuss current trends in robotics that indicate a shift from classical sensor fusion toward representation-level multimodal alignment, while highlighting the absence of explicit ego-dynamic grounding in existing approaches. Finally, we outline key challenges related to multimodal alignment, data availability, and evaluation under physical constraints, and argue that dynamics-informed foundation models are a necessary step toward reliable, deployable embodied intelligence in robotics and autonomous driving.
|
| |
| 11:30-13:00, Paper ThuGA.33 | |
| Walking on Roofs: Exploring the Potential of Walking Robots for Construction Work on Roofs |
|
| Dettmar, Bjoern-Felix | Karlsruher Institute of Technology (KIT) |
| Roennau, Arne | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Legged Locomotion, RIG TC: Manyfold Legged Locomotion in Various Terrains
Abstract: The construction industry remains one of the least automated sectors, with roof work posing particularly high safety risks to human workers due to steep inclines, smooth surfaces, and risk of falling. Quadruped robots offer promising mobility and adaptability, yet their suitability for roof environments has not been systematically investigated. This paper presents an experimental evaluation of the commercial quadruped robot Unitree Go2 on realistic roof inclines. An adjustable roof test rig and a motion capture system were employed to objectively assess locomotion performance, with a particular focus on foot slippage. Three fundamental capabilities – static balancing, getting up and lying down, and incline walking – were evaluated across increasing slope angles. The results show that the unmodified robot fails to meet manufacturer-claimed slope capabilities, with critical failure occurring at significantly lower inclines. Static balance was on average lost at (27.58±0.72)°, get-up and lay-down motions were only reliable on shallow slopes, and incline walking exhibited a highly significant quadratic increase in slippage with slope angle. These findings demonstrate that current commercial quadrupeds are not yet suitable for safe roof operation without targeted adaptations. The study establishes quantitative performance limits and provides a baseline for future roof-specific developments, including adapted gaits, improved friction modeling, and end-effector design.
|
| |
| 11:30-13:00, Paper ThuGA.34 | |
| Hashed TSDF Submapping with Loop Closure Using NDDs |
|
| Kuhlmann, Jan | Fulda University of Applied Sciences |
| Wiemann, Thomas | Fulda University of Applied Sciences |
Keywords: Mapping, SLAM
Abstract: Truncated Signed Distance Fields (TSDFs) are a continuous representation of surfaces in 3D space. For accurate mapping, loop closure detection and pose graph optimization are crucial to compensate for drift. TSDF representations are normally static and therefore do not support optimization after loop detection. Overlapping submaps can be used to solve this problem at the cost of increased memory consumption. In previous work, we presented a cluster-hashed associative and discretized data structure (CHAD TSDF) tailored to address this problem. It reduces memory consumption by hashing node contents instead of spatial positions. In this paper, we extend CHAD TSDF to support memory-efficient submapping for pose graph optimization. Loop closures are detected with normal distribution descriptors (NDD), which give a rough estimation of rotational error. The translational error is compensated by a gradient descent approach on TSDF values. To construct a consistent global map from the submaps, TSDF-to-TSDF fusion with weighted trilinear interpolation is used.
|
| |
| 11:30-13:00, Paper ThuGA.35 | |
| Hierarchical Bayesian Optimization for Efficient Multi-Task Robot Controller Parameter Learning |
|
| Hirt, Sebastian | TU Darmstadt |
| Theiner, Lukas | TU Darmstadt |
| Pfefferkorn, Maik | Technical University of Darmstadt |
| Findeisen, Rolf | Control and Cyber-Physical Systems Laboratory |
Keywords: Machine Learning for Robot Control, Optimization and Optimal Control, Transfer Learning
Abstract: Robots often rely on controllers with tunable parameters (e.g., MPC weights, whole-body control shaping terms, safety or comfort penalties). These parameters must be re-tuned across tasks such as changing objectives, payloads, terrains, or users, while each closed-loop evaluation may be expensive. Bayesian optimization is commonly used for this purpose, but typically treats each task independently and models the total episode cost as a black-box function of the parameters, resulting in limited data efficiency and poor task transfer. We therefore propose a hierarchical Bayesian optimization method that exploits the rollout structure of closed-loop evaluations: instead of learning a black-box mapping from parameters to total cost, we learn parameter-dependent closed-loop trajectories and compute task-specific costs from predicted rollouts. This enables efficient transfer across tasks that share the same robot and controller structure but differ in evaluation criteria. We provide theoretical guarantees showing sublinear regret comparable to standard approaches and demonstrate improved sample efficiency and faster adaptation in a multi-task simulation benchmark.
|
| |
| 11:30-13:00, Paper ThuGA.36 | |
| Effective Explanations for Belief-Desire-Intention Robots: When and What to Explain |
|
| Wang, Cong | TU Dresden |
| Calandra, Roberto | TU Dresden |
| Klös, Verena | Carl Von Ossietzky Universität Oldenburg |
Keywords: Human-Robot Collaboration, Social HRI, Human-Centered Robotics
Abstract: When robots perform complex and context-dependent tasks in our daily lives, deviations from expectations can confuse users. Explanations of the robot’s reasoning process can help users understand the robot’s intentions. However, when to provide explanations and what they should contain must be chosen carefully to avoid user annoyance. We have investigated user preferences for explanation demand and content for a robot that helps with daily cleaning tasks in a kitchen. Our results show that users want explanations in surprising situations and prefer concise explanations that clearly state the intention behind the confusing action and the contextual factors that were relevant to this decision. Based on these findings, we propose two algorithms to identify surprising actions and to construct effective explanations for Belief-Desire-Intention (BDI) robots. Our algorithms can be easily integrated into the BDI reasoning process and pave the way for better human-robot interaction with context- and user-specific explanations. This paper summarizes and builds upon the research presented at IEEE RO-MAN 2025.
|
| |
| 11:30-13:00, Paper ThuGA.37 | |
| On the Impact of Sensor Modalities in ACT-Based Humanoid Manipulation |
|
| Kühn, Robin | Leibniz University Hanover |
| Seel, Thomas | Leibniz Universität Hannover |
| Schappler, Moritz | Institute of Mechatronic Systems, Leibniz Universitaet Hannover |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, Imitation Learning, Humanoid Robot Systems
Abstract: The widespread adoption of humanoid robots in industrial environments is hindered by the complexity of teaching new tasks. While Imitation Learning (IL), particularly Action Chunking with Transformers (ACT), enables rapid task acquisition, there is no consensus yet on the optimal sensory hardware required for manipulation tasks. This paper benchmarks 14 sensor combinations, explicitly evaluating the integration of tactile and proprioceptive modalities alongside active vision, on the Unitree G1 humanoid robot equipped with three-finger hands. Our analysis demonstrates that strategic sensor selection can outperform complex configurations in data-limited regimes. We introduce an open-source Unified Ablation Framework that utilizes sensor masking on a comprehensive master dataset to eliminate human variability. Results indicate that additional modalities often degrade performance for IL with limited data. A minimal active stereo camera setup outperformed complex multi-sensor configurations, achieving 87.5% success in spatial generalization. Conversely, adding pressure sensors reduced success from 94% to 67% due to a low signal-to-noise ratio. We conclude that in data-limited regimes, active vision offers a superior trade-off between robustness and complexity. While tactile modalities may require larger datasets to be effective, our findings validate that strategic sensor selection is critical for designing an efficient learning process.
|
| |
| 11:30-13:00, Paper ThuGA.38 | |
| Exploiting Foundation Model Guided BEV Maps for 3D Object Detection and Tracking |
|
| Käppeler, Markus | University of Freiburg |
| Çiçek, Özgün | Bosch |
| Cattaneo, Daniele | University of Freiburg |
| Glaeser, Claudius | Robert Bosch GmbH |
| Miron, Yakov | Bosch |
| Valada, Abhinav | University of Freiburg |
Keywords: Computer Vision for Transportation, Object Detection, Segmentation and Categorization, Deep Learning for Visual Perception
Abstract: Camera-based 3D object detection and tracking are fundamental tasks in autonomous driving. Existing state-of-the-art approaches often rely exclusively on either perspective-view (PV) or bird’s-eye-view (BEV) features, limiting their ability to leverage both fine-grained object details and spatially structured scene representations. In this work, we propose DualViewDistill, a hybrid detection and tracking framework that incorporates both PV and BEV camera image features to leverage their complementary strengths. Our approach introduces BEV maps guided by foundation models, leveraging descriptive DINOv2 features that are distilled into BEV representations through a novel distillation process. By integrating PV features with BEV maps enriched with semantic and geometric features from DINOv2, our model leverages this hybrid representation via deformable aggregation to enhance 3D object detection and tracking. Extensive experiments on the nuScenes benchmark demonstrate that DualViewDistill achieves state-of-the-art performance. The results showcase the potential of foundation model BEV maps to enable more reliable perception for autonomous driving.
|
| |
| 11:30-13:00, Paper ThuGA.39 | |
| Next Best View for Text Detection and Recognition in Port Monitoring Unmanned Aerial Vehicles |
|
| Gülsoylu, Emre | University of Hamburg |
| Fiedler, Niklas | University of Hamburg |
| Frintrop, Simone | University of Hamburg |
Keywords: Aerial Systems: Applications, Computer Vision for Transportation, Motion and Path Planning
Abstract: Next-Best-View (NBV) planning is a critical capability for autonomous drones operating in complex, occluded environments. While NBV has been widely applied to tasks such as 3D reconstruction, object detection, and exploration, its use for scene-text detection and recognition, particularly in industrial settings, remains underexplored. This work addresses this gap by formalising NBV optimisation for identifying intermodal loading units (ILUs) in ports, where textual identifiers (e.g., ISO 6346 ID codes) can be occluded or degraded, leading to operational inefficiencies. We propose a two-mission approach for robust ILU identification. First, a survey mission captures nadir-view images using the Divide Areas Algorithm for Optimal Multi-Robot Coverage Path Planning (DARP), generating georeferenced 3D point clouds and orthophotos. These are processed via the Three-stage Identification of Transportation UnitS (TITUS) pipeline for ILU segmentation, text detection, and ID code recognition. However, survey missions are limited by their top-down perspective, which fails to capture legible ID codes on stacked or damaged ILUs. To resolve this, we introduce a targeted mission, where drones dynamically navigate to optimal viewpoints for text detection, guided by a novel Legibility Score (LS). The LS balances viewing angle, distance, and line-of-sight constraints to maximise ID code legibility while minimising flight time. The targeted mission leverages 3D point clouds from the survey mission to estimate each ILU’s pose. For each ILU face, candidate waypoints are sampled within a truncated half-cone and evaluated using the LS, which combines an angle term (viewing alignment) and a distance term (proximity). Waypoints are optimised using Ant Colony Optimisation, prioritising both path efficiency and legibility. This work proposes a domain-specific NBV utility function for text detection. 
Future work includes adaptive weighting for the LS and extending the framework to dynamic multi-drone coordination.
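The abstract describes the Legibility Score (LS) as combining an angle term (viewing alignment) and a distance term (proximity). The paper's exact formula is not given here, so the following is a hypothetical sketch: the convex weighting, the cosine angle term, the exponential distance term, and the optimal stand-off distance are all illustrative assumptions, not the authors' definition.

```python
import math

def legibility_score(view_angle_deg, distance_m,
                     best_distance_m=8.0, w_angle=0.6, w_dist=0.4):
    """Hypothetical Legibility Score sketch.

    Combines an angle term (viewing alignment) and a distance term
    (proximity) as a convex combination; all constants are assumed.
    """
    # Angle term: 1.0 when the camera faces the ILU text head-on,
    # falling to 0.0 at grazing angles.
    angle_term = max(0.0, math.cos(math.radians(view_angle_deg)))
    # Distance term: peaks at an assumed optimal stand-off distance.
    dist_term = math.exp(-abs(distance_m - best_distance_m) / best_distance_m)
    return w_angle * angle_term + w_dist * dist_term
```

A waypoint sampled inside the truncated half-cone would be scored with such a function and the best-scoring, line-of-sight-feasible candidates passed to the Ant Colony Optimisation path search.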
|
| |
| 11:30-13:00, Paper ThuGA.40 | |
| Robust Robotic Disassembly under Structural and Shape Uncertainty |
|
| Baumgärtner, Jan | Karlsruhe Institute of Technology |
| Fleischer, Jürgen | Karlsruhe Institute of Technology (KIT) |
Keywords: Disassembly, RIG TC: AI-Robotics in Industry, Task and Motion Planning
Abstract: To support the circular economy, robotic systems must not only assemble new products but also disassemble end-of-life (EOL) ones for reuse, recycling, or safe disposal. Existing approaches to disassembly sequence planning often assume deterministic and fully observable product models, yet real EOL products frequently deviate from their initial designs in both structure and shape due to wear, corrosion, or undocumented repairs. In this work, we argue that the uncertainty inherent in the structure of the EOL products should be formulated as a POMDP, and we propose a probabilistic task and motion planning framework for disassembly that can cope with such uncertainties. We furthermore show that uncertainty in the object shape can be addressed by autonomously designing grippers that consider shape uncertainty during grasp planning.
|
| |
| 11:30-13:00, Paper ThuGA.41 | |
| Synthesis Costs of Specialized Robot Controllers in an Object Retrieval and Delivery Scenario for Multi-Robot Systems |
|
| Leopardi, Paolo | University of Konstanz |
| Hamann, Heiko | University of Konstanz |
| Kuckling, Jonas | University of Konstanz |
| Kaiser, Tanja Katharina | University of Technology Nuremberg |
Keywords: RIG Cluster Multi-Robot Systems, RIG TC: Swarm Robotics, Swarm Robotics
Abstract: Designing control strategies for multi-robot systems is often guided by the intuition that dividing work into specialized roles simplifies individual robot behavior and improves overall system efficiency. This intuition is supported by biological examples of task partitioning and self-organized specialization. However, in engineered systems, specialization is not free: it introduces additional synthesis effort, coordination requirements, and interfaces between controllers that must function reliably under uncertainty. In this work, we investigate the cost of task specialization in multi-robot systems when controller synthesis is constrained by a limited evaluation budget. We study a two-stage object retrieval and delivery scenario in which robots must transport objects from a source to a target area. The task can be executed either by generalist robots that perform the full task end-to-end, or by task-specialist robots that split the task into sequential subtasks connected by an intermediate handoff. Robot controllers are synthesized using evolutionary optimization, represented as neural network policies. To ensure a fair comparison, evaluation budgets account for differences in task duration between specialist and generalist behaviors, while keeping the total number of evaluations constant. After optimization, controllers are deployed in a multi-robot setting and evaluated based on task completion performance. Our results show that, across all tested configurations, teams of generalist robots consistently outperform task-specialist teams. While specialized controllers for individual subtasks can be successfully synthesized in isolation, their combination leads to substantially lower system-level performance. Performance across specialist combinations varies widely, indicating strong sensitivity to weak links among subtasks. We attribute this performance gap primarily to task interdependence. 
In specialized systems, overall performance is constrained by the weaker subtask, and additional handoffs increase coordination demands and failure probabilities. In contrast, generalist robots contribute more independently to task completion, resulting in higher robustness under limited synthesis budgets.
|
| |
| 11:30-13:00, Paper ThuGA.42 | |
| Classical Trajectory Planning for Dual-Camera Visual Servoing on Edge Systems |
|
| Madavath, Abilash Philip | Cologne University of Applied Sciences |
| Aubeeluck, Chandra Yuvesh | Cologne University of Applied Sciences |
| Raju, Augustin | Cologne University of Applied Sciences |
| Pyschny, Nicolas | Cologne University of Applied Sciences |
| Zwanzig, Florian | Cologne University of Applied Sciences |
| Hackelöer, Felix | Cologne University of Applied Sciences |
Keywords: Visual Servoing, Sensor Fusion, Industrial Robots
Abstract: Dynamic grasping of moving objects in industrial environments requires tight synchronization between perception and robot actuation. Although deep learning has advanced object detection, inference latency on resource-constrained edge platforms can significantly reduce interception accuracy in high-speed conveyor systems. This paper presents a comparative analysis of classical trajectory prediction and interception algorithms implemented on an NVIDIA Jetson Orin Nano for industrial conveyor picking. We evaluate two prediction methods—RANSAC-based linear extrapolation and Kalman filtering—and benchmark five interception solvers using high-precision ground-truth data from an OptiTrack motion capture system. The results show that Kalman filtering achieves sub-2 ms execution times suitable for real-time control, while iterative numerical solvers outperform analytical closed-form solutions in robustness. Additionally, we quantify and compensate for systematic perception-to-action delays via temporal lead-time adjustment, providing practical guidelines for algorithm selection in real-time dynamic picking systems.
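The abstract reports Kalman filtering with sub-2 ms execution times and lead-time compensation for perception-to-action delay. A minimal sketch of such a constant-velocity Kalman predictor is shown below; the 1-D state model, noise parameters, and lead-time handling are illustrative assumptions, not the paper's implementation.

```python
def kalman_cv_predict(measurements, dt=0.02, lead_time=0.1,
                      q=1e-4, r=1e-2):
    """Constant-velocity Kalman filter over 1-D conveyor positions.

    Returns the position extrapolated `lead_time` seconds past the last
    measurement, compensating for a known perception-to-action delay.
    All noise parameters are assumed for illustration.
    """
    # State [pos, vel]; covariance P stored element-wise.
    pos, vel = measurements[0], 0.0
    p00, p01, p10, p11 = 1.0, 0.0, 0.0, 1.0
    for z in measurements[1:]:
        # Predict: x = F x, P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos += dt * vel
        a00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
        a01 = p01 + dt * p11
        a10 = p10 + dt * p11
        a11 = p11 + q
        p00, p01, p10, p11 = a00, a01, a10, a11
        # Update with scalar position measurement (H = [1, 0]).
        s = p00 + r
        k0, k1 = p00 / s, p10 / s
        y = z - pos
        pos += k0 * y
        vel += k1 * y
        n00 = (1 - k0) * p00
        n01 = (1 - k0) * p01
        n10 = p10 - k1 * p00
        n11 = p11 - k1 * p01
        p00, p01, p10, p11 = n00, n01, n10, n11
    # Temporal lead-time adjustment: aim ahead of the last observation.
    return pos + vel * lead_time
```

For an object moving at 0.5 m/s sampled every 20 ms, the filter converges to the true velocity and the 100 ms lead time shifts the interception point roughly 5 cm downstream.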
|
| |
| 11:30-13:00, Paper ThuGA.43 | |
| Smart Fabrics: A Scalable, Modular Approach to In-Place Printed Strain Sensors for Robotic Proprioception |
|
| Macher, Philipp Linus | Technical University of Darmstadt |
| Ali, Usama | Technische Universität Darmstadt |
| Gross, Roderich | Technical University of Darmstadt |
Keywords: RIG Cluster Multi-Robot Systems, RIG TC: Reconfigurable Robotics, RIG TC: Swarm Robotics
Abstract: We present a smart fabric with sensing ability that can scale to large sensor counts due to its modular structure. The fabric combines inkjet-printed dual-layer resistive strain sensors with distributed compute nodes. By comparing resistance changes on the two sensor layers, the system distinguishes stretching from bending within the same sensing element. Each compute node digitizes up to six sensors using a high-resolution ADC and transmits measurements via identifier-based CAN arbitration. Prototype measurements and throughput analysis support operation at approximately 4,750 sensors at 1 Hz, or 36 sensors at 125 Hz for the tested message format and acquisition pipeline. This architecture enables reconfigurable, large-area robotic fabrics with scalable wiring and incremental node addition.
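The throughput figures in the abstract (roughly 4,750 sensors at 1 Hz, or 36 at 125 Hz) are on the order of a classical CAN bus frame budget. A back-of-envelope check follows; the bit rate and frame size are assumptions, since the paper's tested message format is not given here.

```python
# Assumed parameters -- the abstract does not state them.
BITRATE_BPS = 500_000   # classical CAN, assumed 500 kbit/s
FRAME_BITS = 105        # assumed bits per data frame, including the
                        # arbitration ID, payload, CRC, and worst-case
                        # bit stuffing

# With one measurement per frame, the bus carries a fixed frame budget
# that can be spent on many slow sensors or few fast ones.
frames_per_second = BITRATE_BPS / FRAME_BITS
sensors_at_1_hz = int(frames_per_second // 1)
sensors_at_125_hz = int(frames_per_second // 125)

print(sensors_at_1_hz, sensors_at_125_hz)
```

The resulting counts land in the same range as the abstract's numbers, illustrating why sensor count trades off linearly against sampling rate for a fixed message format.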
|
| |
| 11:30-13:00, Paper ThuGA.44 | |
| Leveraging Generative Models to Learn Preference Vectors for Context-Based Multi-Objective Robot Navigation |
|
| Sethuraman, Tharun | Hochschule Bonn-Rhein-Sieg |
| Agrawal, Subham | University of Bonn |
| de Heuvel, Jorge | University of Bonn |
| Hassan, Teena | Bonn-Rhein-Sieg University of Applied Sciences |
| Bennewitz, Maren | University of Bonn |
Keywords: Human-Aware Motion Planning, Humanoid Robot Systems, RIG TC: Robotics Foundation Models
Abstract: Robots are increasingly deployed in environments where they share physical space with humans and collaborate on tasks. In such settings, humans expect robots to follow social norms and personal preferences, ensuring comfort, safety, and acceptance. These dynamic preferences are context-dependent, shaped by environmental factors, such as the room type or object locations, and necessitate context understanding to reflect human preferences. Recently, advancements in generative models have led to general-purpose reasoning systems with strong generalization and contextual understanding capabilities. These capabilities offer great potential for robot behavior in human-shared environments, as they allow robots to reason about their surroundings, but are often impractical for direct robot control due to high latency and resource consumption. In response, hybrid approaches using low-latency motion policies, such as Multi-Objective Reinforcement Learning (MORL) for low-level control, offer a viable alternative. Such multi-objective navigation approaches use a numerical vector to weigh the different objectives during runtime and, in this way, tune robot behavior to reflect user preferences. However, since numerical preference vectors are not intuitive for users, we propose a framework that uses multiple generative models to understand and maintain context-dependent preferences and translate them into vectors for MORL control.
|
| |
| 11:30-13:00, Paper ThuGA.45 | |
| Correct Robots before They Make Mistakes: Proactive Interactive Learning Framework Via Extended Reality |
|
| Jiang, Xinkai | Karlsruhe Institute of Technology |
| Zhou, Hongyi | Karlsruhe Institute of Technology |
| Vanjani, Pankhuri | Karlsruhe Institute of Technology |
| Li, Zhuoyue | Karlsruhe Institute of Technology |
| Baki, Ahmad | Karlsruhe Institute of Technology |
| Neumann, Gerhard | Karlsruhe Institute of Technology |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: Imitation Learning, Learning from Demonstration, Human Factors and Human-in-the-Loop
Abstract: Imitation learning has shown strong potential for training robot policies from human demonstrations, but its performance critically depends on large, high-quality datasets. In practice, limited data coverage often causes learned policies to encounter out-of-distribution states during execution, leading to compounding errors and task failures. Addressing these failures typically requires human intervention; however, existing approaches rely on post-hoc manual inspection and correction of collected data, which is labor-intensive and difficult to scale. Extended Reality (XR) offers a natural interface for human-in-the-loop robot learning by enabling intuitive visualization and interaction with 3D robot states, trajectories, and policy behavior. In this work, we propose an XR-based framework that allows humans to proactively correct robot data and policies before failures occur. By visualizing policy execution, users can provide timely, structured corrections that are directly integrated into the learning pipeline. Our approach shifts human correction from a reactive, offline process to a proactive, in-context interaction, reducing the need for manual dataset cleanup while improving data efficiency and policy robustness. This framework demonstrates the potential of XR as a scalable and effective tool for human-guided robot policy learning.
|
| |
| 11:30-13:00, Paper ThuGA.46 | |
| Robotics Data Management at Scale Via Query-Centric Storage |
|
| Krack, Pierre | University of Technology Nuremberg |
| Blei, Yannik | University of Technology Nuremberg |
| Jülg, Tobias Thomas | University of Technology Nuremberg |
| Walter, Florian | Technical University Munich |
| Burgard, Wolfram | University of Technology Nuremberg |
Keywords: Data Sets for Robot Learning, RIG Cluster: Learning and Multimodal AI for Robotics, Big Data in Robotics and Automation
Abstract: Robot learning research is increasingly constrained by data engineering. Datasets vary in structure, modalities, and file formats, requiring significant effort in reading documentation, writing parsers and extract-transform-load code, and converting large datasets into task-specific formats—only to repeat the process for every new dataset, model, or experiment. At its core, researchers face a data problem that has been studied extensively by the database community. By taking a database perspective, robotics datasets can be treated as structured, queryable collections rather than opaque files tied to specific training pipelines. In this paper, we analyze the data requirements of robot learning research and propose a query-centric approach to storing datasets. We show how heterogeneous robotics datasets can be explored, filtered, transformed, and combined using simple, high-performance SQL queries.
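The abstract's query-centric idea can be illustrated with a small sketch: episodes from heterogeneous datasets land in one table and are filtered and aggregated declaratively. The engine, schema, column names, and sample rows below are all hypothetical, chosen only to make the example self-contained (Python's built-in sqlite3 stands in for whatever storage backend the paper actually uses).

```python
import sqlite3

# Hypothetical episode-level schema; the paper does not specify one.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE episodes (
    id INTEGER PRIMARY KEY, dataset TEXT, task TEXT,
    robot TEXT, length_steps INTEGER, success INTEGER)""")
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "bridge", "pick", "widowx", 120, 1),
     (2, "rt-1", "pick", "google_robot", 90, 0),
     (3, "bridge", "place", "widowx", 200, 1)])

# One declarative query explores, filters, and combines datasets --
# no per-dataset parser or conversion step required.
rows = conn.execute("""
    SELECT dataset, COUNT(*) AS n, AVG(success) AS success_rate
    FROM episodes
    WHERE task = 'pick'
    GROUP BY dataset
    ORDER BY dataset""").fetchall()
print(rows)  # [('bridge', 1, 1.0), ('rt-1', 1, 0.0)]
```

Swapping the WHERE clause or joining additional metadata tables yields new task-specific subsets without touching the underlying files.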
|
| |
| 11:30-13:00, Paper ThuGA.47 | |
| Natural Control – Hybrid Impedance on Series Elastic Actuators |
|
| Vonwirth, Patrick | RPTU University Kaiserslautern-Landau |
| Berns, Karsten | University of Kaiserslautern-Landau |
Keywords: Compliance and Impedance Control, Natural Machine Motion, RIG Cluster: Legged Locomotion
Abstract: Modern robots, especially humanoids, have made significant advances in actuation, control, and motion capabilities. However, they are still outperformed by their biological counterparts, particularly in adaptability and computational power. Studying natural control therefore offers significant potential to advance the fundamental principles of robot control. Modeled on the biological muscle–tendon complex and the first neural reflex circuits, natural control can be cast as a hybrid impedance approach: centralized whole-body force control, stabilized with distributed local damping.
|
| |
| 11:30-13:00, Paper ThuGA.48 | |
| Object Collection with Modular Robots in Aquatic Environments |
|
| Ali, Usama | Technische Universität Darmstadt |
| Lei, Zheshui | The University of Sheffield |
| Talamali, Mohamed S. | University of Sheffield |
| Argote-Gerald, Jahir | The University of Sheffield |
| Miyauchi, Genki | The University of Sheffield |
| Rau, Julian | Technical University of Darmstadt |
| Cao, Lin | University of Sheffield |
| Gross, Roderich | Technical University of Darmstadt |
Keywords: RIG Cluster Multi-Robot Systems, RIG TC: Reconfigurable Robotics, RIG TC: Multi-Robot Coordination
Abstract: Floating-object collection on water surfaces is an important capability for environmental cleanup and monitoring. We study object collection using a modular aquatic robot assembled into a U-shaped morphology that captures stationary objects by funneling and retaining them in a frontal cavity, avoiding precise grasping. The central technical challenge is scalability: as the number of modules increases, the number of independently actuated pump faces grows rapidly. We present a morphology-aware wrench mapping and a constant-size composite allocation method that realizes body wrench commands with nearly constant optimization dimension, independent of module count. In simulation, the composite allocator achieves collection performance comparable to a face-level (“granular”) allocator while providing substantial computational advantages at scale. We further show how cavity geometry should be adapted to object density and how modular resolution improves robustness to sensor/actuator faults with diminishing returns.
|
| |
| 11:30-13:00, Paper ThuGA.49 | |
| Leveraging 2D Foundation Models for 3D Segmentation |
|
| Knaebel, Karim | RWTH Aachen University |
| Yilmaz, Kadir | RWTH Aachen University |
| de Geus, Daan | Eindhoven University of Technology |
| Hermans, Alexander | RWTH Aachen University |
| Adrian, David Benjamin | Bosch Corporate Research & Ulm University |
| Linder, Timm | Robert Bosch GmbH |
| Leibe, Bastian | RWTH Aachen University |
Keywords: Deep Learning for Visual Perception
Abstract: Vision foundation models (VFMs) trained on large-scale image datasets provide high-quality features that have significantly advanced 2D visual recognition. However, their potential in 3D vision remains largely untapped, despite the common availability of 2D images alongside 3D point cloud datasets. While significant research has been dedicated to 2D--3D fusion, recent state-of-the-art 3D methods predominantly focus on 3D data, leaving the integration of VFMs into 3D models underexplored. In this work, we challenge this trend by introducing DITR, a simple yet effective approach that extracts 2D foundation model features, projects them to 3D, and finally injects them into a 3D point cloud segmentation model. DITR achieves state-of-the-art results on both indoor and outdoor 3D semantic segmentation benchmarks.
|
| |
| 11:30-13:00, Paper ThuGA.50 | |
| Data Generation Via Reinforcement Learning for Language-Conditioned Bimanual Dexterous Manipulation |
|
| Li, Zechu | Technische Universität Darmstadt |
| Jin, Yufeng | Technische Universität Darmstadt |
| Liu, Puze | German Research Center for Artificial Intelligence |
| Peters, Jan | Technische Universität Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Dexterous Manipulation, Reinforcement Learning
Abstract: A key bottleneck in training generalist policies for bimanual dexterous manipulation is the lack of large-scale, high-quality datasets. Synthetic data generation in simulation provides a scalable alternative to human video demonstrations by overcoming challenges such as morphology mismatch, missing physical interactions, and the generation of robot actions. We propose a systematic RL-based data generation framework that integrates generalizable reward design, effective domain randomization, and language-conditioned task annotations. This framework synthesizes diverse, high-quality datasets for dexterous bimanual manipulation and enables training of language-conditioned multi-task policies. Our experiments show that the generated data significantly improves generalization across a wide range of manipulation tasks.
|
| |
| 11:30-13:00, Paper ThuGA.51 | |
| Steering-Angle-Controlled Robotic Ultrasound for Spinal Imaging |
|
| Bi, Yuan | TUM |
| Duelmer, Felix | Technical University of Munich |
| Manalil, Larissa | TUM |
| Navab, Nassir | TU Munich |
Keywords: Medical Robots and Systems, RIG TC: Surgical Robotics
Abstract: Spinal interventions are commonly guided by fluoroscopy or computed tomography (CT), exposing both patients and clinicians to ionizing radiation. Robotic ultrasound (US) offers a real-time, radiation-free alternative, but accurate 3D reconstruction of the spine remains challenging due to limited visualization of surfaces aligned with the ultrasound propagation direction. We propose a robotic ultrasound scanning approach that dynamically controls the steering angle of a linear probe to enhance spinal surface visibility. Ultrasound images acquired at multiple steering angles are fused to generate a more complete 3D reconstruction of the spinal surface. Experimental results demonstrate improved reconstruction accuracy and completeness compared to fixed-angle scanning, achieving a mean error of 0.79 mm and a coverage of 80%. This approach provides improved 3D visualization of spinal anatomy and could potentially support downstream tasks such as image-guided spinal interventions.
|
| |
| 11:30-13:00, Paper ThuGA.52 | |
| Damage Risk Quantification for Robot Collisions Using Vision-Language Models |
|
| Kiemel, Jonas | Karlsruhe Institute of Technology |
| Oztop, Erhan | Osaka University / Ozyegin University |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics, RIG TC: Safety and Reliability of AI-based Robotics, Collision Avoidance
Abstract: This work investigates the use of Vision-Language Models (VLMs) to estimate the risk of damage from robot-object collisions. Using a curated dataset of 100 images, each depicting a moving object or substance close to collision with a robot, we evaluate how state-of-the-art VLMs quantify the risk of damage to both the robot and the object on a scale from 0 to 10. While numerical outputs vary among models, an analysis across eight object categories shows that VLMs can produce plausible risk quantifications. Our dataset of everyday objects provides reference points for quantified risk values, enabling future VLM applications in damage-aware collision avoidance.
|
| |
| 11:30-13:00, Paper ThuGA.53 | |
| Task and Motion Planning for Humanoid Loco-Manipulation |
|
| Ciebielski, Michal | Technical University of Munich |
| Dhédin, Victor | Technical University of Munich |
| Khadiv, Majid | Technical University of Munich |
Keywords: RIG TC: Foundations of Optimization and Learning for Robotics, RIG Cluster: Legged Locomotion, RIG TC: Safety and Reliability of AI-based Robotics
Abstract: This work presents an optimization-based task and motion planning (TAMP) framework that unifies planning for locomotion and manipulation through a shared representation of contact modes. We define symbolic actions as contact mode changes, grounding high-level planning in low-level motion. This enables a unified search that spans task, contact, and motion planning while incorporating whole-body dynamics, as well as all constraints between the robot, the manipulated object, and the environment. Results on a humanoid platform show that our method can generate a broad range of physically consistent loco-manipulation behaviors over long action sequences requiring complex reasoning. To the best of our knowledge, this is the first work that enables the resolution of an integrated TAMP formulation with fully acyclic planning and whole body dynamics with actuation constraints for the humanoid loco-manipulation problem.
|
| |
| 11:30-13:00, Paper ThuGA.54 | |
| A Gamified Testbed for Teleoperated Robotics Enabled by Digital Twins and 6G |
|
| Sabanovic, Kevin-Ismet | TU Dortmund University |
| Schippers, Hendrik | TU Dortmund University |
| Heimann, Karsten | TU Dortmund University |
| Wietfeld, Christian | TU Dortmund University |
Keywords: Telerobotics and Teleoperation, Human-Robot Collaboration, Virtual Reality and Interfaces
Abstract: Teleoperation is a key enabler for robotic systems in complex and unpredictable environments. With the advent of 6G, immersive teleoperation using Virtual Reality (VR) and Digital Twins (DTs) is becoming a core service, offering intuitive control and robust perception through semantic state representations. We developed a teleoperation testbed that combines two industrial robotic arms with a virtualized air hockey scenario, enabling real-time interaction under different visual feedback modes. To showcase the system’s capabilities, we created a science communication video, which was successfully presented at public events, generating significant interest and demonstrating the potential of the platform. We evaluated four modes: high-quality 6G VR Video and 6G VR DT, and their impaired counterparts, impaired VR Video and impaired VR DT, under emulated wireless impairments. Players controlled the robots using gesture-based interaction in VR, ensuring consistent control while isolating the impact of visual feedback. Our findings reveal that digital twin feedback is significantly more robust to packet loss compared to video streaming. While impaired VR DT maintained near-optimal performance, impaired VR Video suffered from severe artifacts and interruptions, leading to reduced playability and confidence. Digital twin feedback achieved up to 70% higher offensive and defensive performance under degraded conditions, while requiring substantially less bandwidth. These results highlight the potential of semantic, state-based visual feedback to enhance both Quality of Experience (QoE) and reliability, providing a robust foundation for immersive teleoperation in challenging network environments.
|
| |
| 11:30-13:00, Paper ThuGA.55 | |
| A Force-Amplified Tendon–Pulley Finger for Humanoid Robotic Hands |
|
| Mueller, Tobias | Fulda University of Applied Sciences |
| Schultheis, Marius | Fulda University of Applied Sciences |
| Schreiner, Niklas | Alpaka Innovation |
Keywords: Human-Robot Collaboration, Humanoid Robot Systems, Grippers and Other End-Effectors
Abstract: Humanoid robotic hands require compact finger mechanisms with high force density to robustly grasp a wide range of real-world objects. This contribution presents a finger module with two actively driven degrees of freedom and integrated force amplification, combining electric motors with planetary gearboxes and a miniaturized block-and-tackle (4:1) tendon–pulley mechanism.
|
| |
| ThuMA |
|
| Interactive Session & Demos 3 & Coffee |
Interactive |
| |
| 15:00-16:20, Paper ThuMA.1 | |
| Out of the Cage: Advanced Functional Safety for Humanoids – Positron Safety AI Architecture |
|
| Weisshardt, Florian | Synapticon |
| Fröhlich, Tim | Synapticon |
| Volpert, Dieter | Synapticon |
| Lukin, Petr | Synapticon |
| Bharadwaj, Varun | Synapticon |
| Ingale, Abhilash | Synapticon |
| Habib, William | Synapticon |
| Ballesteros, Roque | Synapticon |
| Gofre, Jauri | Synapticon |
Keywords: Robot Safety, Humanoid Robot Systems, AI-Enabled Robotics
Abstract: As humanoid robots mature from research novelties to viable solutions in logistics, healthcare, and home assistance, they must exit the traditional industrial cage. However, existing safety paradigms - predicated on the assumption that a stopped robot is a safe robot - are insufficient for dynamically stable mobile robots. Humanoids introduce unique risks: they are mechanically unstable (inverted pendulums), heavy, and increasingly driven by non-deterministic AI controllers. Existing safety functions such as Safe Torque Off (STO) are dangerous for unstable bipedal systems. This paper introduces Positron Safety AI, a 3-layer architecture (Safe Motion, Safe Human Detection, AI Behavior) designed to address humanoid tipping hazards and AI non-determinism, and proposes a new separation distance calculation for humanoids based on the formula from ISO 10218-2:2025 Annex L.
|
| |
| 15:00-16:20, Paper ThuMA.2 | |
| Strongly Entangled Wire Harness Disentangling with Interactive Perception |
|
| Zhou, Zexu | University of Stuttgart |
| Zeh, Lukas | University of Stuttgart |
| Lechler, Armin | University Stuttgart |
| Verl, Alexander | University of Stuttgart |
Keywords: Intelligent and Flexible Manufacturing, RIG TC: Deformable Object Manipulation, RIG Cluster AI-Powered Industrial Robotics
Abstract: Researchers have spent decades working to enable robots to manipulate objects as humans do, yet dexterous manipulation of deformable objects remains a challenge. In the automotive industry, there has been significant interest in robotized wire harness assembly. At our institute, a series of depth-image-based tracking solutions for shape-variant cables and complex branched wire harnesses have been implemented. For more complex wire harnesses, graph-based topology matching is enabled using feature extraction. Strongly entangled wire harnesses, however, present an enormous challenge to perception. A disentangling solution inspired by interactive perception could address this problem.
|
| |
| 15:00-16:20, Paper ThuMA.3 | |
| Diffusion-Based Radar Point Cloud Enhancement for Robust 3D Perception |
|
| Xiong, Mengchen | Technical University of Munich |
| Peng, Yifei | Technical University of Munich |
| Xu, Xiao | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: RIG TC: Robot Perception, RIG Cluster: Rigorous Perception, Object Detection, Segmentation and Categorization
Abstract: Millimeter-wave radar is a robust sensing modality for autonomous perception, yet its utility for 3D tasks is often limited by inherent sparsity and multipath noise. In this work, we present a diffusion-based framework that reconstructs dense, LiDAR-like geometric representations from sparse radar data. Experimental results show that the enhanced radar point clouds effectively recover scene geometry and enable reliable performance in downstream 3D object detection.
|
| |
| 15:00-16:20, Paper ThuMA.4 | |
| Fast and Accurate Radar-Only Teach-And-Repeat Localization |
|
| Hilger, Maximilian | Technical University of Munich |
| Adolfsson, Daniel | Örebro University |
| Becker, Ralf | Bosch Rexroth |
| Andreasson, Henrik | Örebro University |
| Lilienthal, Achim J. | TU Munich |
Keywords: RIG Cluster: Field Robotics, RIG TC: Civil Safety Robotics, Localization
Abstract: Reliable localization in prior maps is crucial for autonomous navigation, especially in vision-degraded settings where optical sensors may fail. In this work, we present a teach-and-repeat localization pipeline utilizing a spinning radar, designed for robust and accurate performance in adverse conditions. Our method performs localization by jointly aligning incoming scans to stored keyframes from the teach pass and to a sliding window of recent live keyframes. We represent scans as a sparse set of oriented surface points, computed from Doppler-compensated measurements. The map is maintained as a pose graph whose nodes are traversed during localization. Experiments on the Boreas dataset demonstrate localization accuracies of 0.117 m and 0.096°, corresponding to improvements of up to 63% over the previous state of the art, while running efficiently at 29 Hz. These results reduce the gap to lidar-level localization, with the largest improvement observed in heading estimation.
|
| |
| 15:00-16:20, Paper ThuMA.5 | |
| Uncertainty-Aware Intention Prediction from Egocentric Video: A Controlled Comparison of Temporal Models |
|
| Schlegel, Patricia | University of Tuebingen |
| Gaus, Johannes Albert | University of Tuebingen |
| Wochner, Isabell | University of Tübingen |
| Haeufle, Daniel Florian Benedict | University of Tübingen |
|
|
| |
| 15:00-16:20, Paper ThuMA.6 | |
| Comparison of Omni-Directional Platforms for Mobile Manipulation |
|
| Hess, Daniel | University of Applied Sciences and Arts in Dortmund |
| Trinh, Buu Hai Dang | Dortmund University of Applied Sciences and Arts, Dortmund, Germany |
| Roehrig, Christof | Univ. of Appl. Sci. Dortmund |
|
|
| |
| 15:00-16:20, Paper ThuMA.7 | |
| Evaluating an MR-Based System for Human–Robot Assembly Training |
|
| Lang, Silvio | Technical University of Applied Sciences Würzburg-Schweinfurt (thws) |
| Pfister, Tom | Technical University of Applied Sciences Würzburg-Schweinfurt |
| Kaupp, Tobias | Technical University of Applied Sciences Würzburg-Schweinfurt |
|
|
| |
| 15:00-16:20, Paper ThuMA.8 | |
| A Comparative Study of Intuitive Teleoperation Interfaces for Dexterous Robotic Manipulation |
|
| Zhong, Weiqiang | Karlsruhe Institute of Technology (KIT) |
| Welte, Edgar | Karlsruhe Institute of Technology (KIT) |
| Rayyes, Rania | Karlsruhe Institute of Technology (KIT) |
Keywords: Telerobotics and Teleoperation, Dexterous Manipulation, Human-Robot Collaboration
Abstract: Teleoperating dexterous robotic hands remains challenging due to limited feedback, occlusions, and the difficulty of accurately mapping human hand motion to high-DOF robot joints. This work presents a comparative study of three teleoperation interfaces for controlling a Shadow Dexterous Hand: a haptic glove with force-feedback, a VR headset with hand tracking, and a custom vision-based stereo camera system. The objective of this study is to systematically compare these interfaces in terms of task performance, control reliability, usability, and user experience, and to identify their task-dependent trade-offs. All interfaces are evaluated within a unified teleoperation framework, sharing the same Shadow Hand control interface while relying on different sensing modalities. User experiments across three manipulation tasks (gesture imitation, pick-and-place, and pouring) provide objective performance measures and subjective user experience ratings, highlighting the strengths and limitations of each interface. This work reports preliminary results from an initial user study with 15 participants, intended to inform an ongoing and larger-scale evaluation.
|
| |
| 15:00-16:20, Paper ThuMA.9 | |
| Reinforcement Learning Control of Unstable Nonlinear Physical Systems: An Inverted Hydraulic Pendulum Application |
|
| Karaoglu, Selim | RWTH Aachen University, Institute for Fluid Power Drives and Systems (ifas) |
| Roeder, Patrick | RWTH Aachen University, Institute for Fluid Power Drives and Systems (ifas) |
| Brumand-Poor, Faras | RWTH Aachen University, Institute for Fluid Power Drives and Systems (ifas) |
| Schmitz, Katharina | RWTH Aachen University, Institute for Fluid Power Drives and Systems (ifas) |
|
|
| |
| 15:00-16:20, Paper ThuMA.10 | |
| Style-Biased Reinforcement Learning for Quadruped Locomotion |
|
| Ju, Siwei | Technische Universität Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
| Arenz, Oleg | Technische Universität Darmstadt |
Keywords: Imitation Learning, Legged Robots, RIG Cluster: Legged Locomotion
Abstract: Reinforcement learning has emerged as a powerful approach for learning locomotion policies, typically commanded through desired velocities or keyframes. However, such interfaces lack the spatial and temporal expressiveness needed to capture motion styles or to serve as context for low-level policies in hierarchical settings. When using more detailed references such as end-effector trajectories, manual tuning of reward coefficients becomes difficult. In addition, reference motions generated by high-level policies or originating from different embodiments (e.g., humans or dogs) are often physically infeasible, leaving the agent uncertain about when to deviate from them. To address these challenges, we propose a style-biased reinforcement learning (SBRL) framework that formulates hybrid reinforcement–imitation learning as a constrained optimization problem, automatically adjusting reward coefficients to satisfy predefined imitation error bounds. We further introduce a receding-horizon trajectory prediction module that improves temporal credit assignment. We evaluate our method on both simulated and real quadruped locomotion tasks with toe trajectory tracking, demonstrating that it achieves a more favorable Pareto frontier than prior state-of-the-art approaches.
|
| |
| 15:00-16:20, Paper ThuMA.11 | |
| UniFField: A Generalizable Unified Neural Feature Field for Visual, Semantic, and Spatial Uncertainties in Any Scene |
|
| Maurer, Christian | Technische Universität Darmstadt |
| Jauhri, Snehal | Technische Universität Darmstadt |
| Lueth, Sophie C. | Technische Universität Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Deep Learning for Visual Perception, Computer Vision for Automation, RGB-D Perception
Abstract: Comprehensive visual, geometric and semantic understanding of a 3D scene is crucial for successful execution of robotic tasks, especially in unstructured and complex environments. While recent 3D neural feature fields enable robots to leverage pretrained vision models for tasks such as language-guided manipulation and navigation, existing methods are typically scene-specific and do not model prediction uncertainty. We present UniFField, a unified uncertainty-aware neural feature field that combines visual, semantic, and geometric features in a single generalizable representation while also predicting uncertainty in each modality. Our approach generalizes zero-shot to any new environment and incrementally integrates RGB-D images into our voxel-based feature representation as the robot explores the scene, while simultaneously updating its uncertainty estimates. We evaluate the quality of the uncertainty predictions and demonstrate their effectiveness in an active object search task with a mobile manipulator robot.
|
| |
| 15:00-16:20, Paper ThuMA.12 | |
| An Integrated Robotic Platform for Autonomous Fresco Assembly |
|
| Dengler, Nils | University of Bonn |
| Kreis, Benedikt | University of Bonn |
| Catalano, Manuel Giuseppe | Istituto Italiano Di Tecnologia |
| Tsagarakis, Nikos | Istituto Italiano Di Tecnologia |
| Bennewitz, Maren | University of Bonn |
Keywords: Manipulation Planning, Dual Arm Manipulation, Assembly
Abstract: Preserving cultural heritage is a fundamental challenge in modern archaeology, as it enables the transfer of knowledge across generations. However, this process is complicated by factors such as natural aging, environmental change, and human activities. In the case of the ancient city of Pompeii, countless archaeological treasures were damaged or destroyed by the eruption of Mount Vesuvius and later by bombings during the Second World War. Archaeological restoration is traditionally performed by hand and requires exceptional skill and patience, often taking months or years depending on the number of fragments. The reconstruction of ancient frescoes in this context is comparable to assembling a jigsaw puzzle with damaged or missing pieces and no reference image. In this work, we present an integrated robotic platform designed to support the reconstruction process in a safe and robust manner, as handling ancient fresco fragments differs fundamentally from industrial robotics tasks in structured environments. To this end, within the EU Horizon 2020 project RePAIR, we developed a dual-arm robotic system that integrates perception, motion planning, and grasping to enable precise manipulation and assembly of fragmented cultural heritage frescoes. Building upon game-theoretic puzzle-solving algorithms, we validate the system through real-world assembly trials under supervised conditions.
|
| |
| 15:00-16:20, Paper ThuMA.13 | |
| FBGA: A Forward-Backward Method for Online Time-Optimal Velocity Planning with Generic Acceleration Constraints |
|
| Piazza, Mattia | University of Trento |
| Piccinini, Mattia | Technical University of Munich |
| Taddei, Sebastiano | University of Trento, Politecnico Di Bari |
| Biral, Francesco | University of Trento |
| Bertolazzi, Enrico | University of Trento |
Keywords: Constrained Motion Planning, Optimization and Optimal Control, Motion and Path Planning
Abstract: We present FBGA, a new algorithm for time-optimal velocity planning under generic acceleration constraints. By extending previous forward-backward approaches to handle custom acceleration constraints, our FBGA matches the accuracy of optimal control baselines while being up to three orders of magnitude faster. Our open-source C++ implementation is available at: https://github.com/DRIVEWISE/FBGA.
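The forward-backward structure this abstract builds on can be illustrated with a minimal sketch. This is not the authors' FBGA implementation (which handles generic acceleration constraints); it is the classic special case with simple box limits, where a backward pass enforces braking feasibility and a forward pass enforces acceleration feasibility under a curvature-dependent speed cap. All function and parameter names here are illustrative.

```python
import math

def velocity_profile(curvatures, ds, v_cap, a_acc, a_brake, a_lat):
    """Time-optimal velocity profile along a sampled path (forward-backward sketch).

    curvatures: path curvature at each sample [1/m]
    ds: arc-length spacing between samples [m]
    v_cap, a_acc, a_brake, a_lat: speed cap and longitudinal/lateral accel limits
    """
    n = len(curvatures)
    # Pointwise speed limit from the lateral-acceleration constraint v^2*|k| <= a_lat.
    v = [min(v_cap, math.sqrt(a_lat / abs(k)) if k != 0 else v_cap)
         for k in curvatures]
    v[0] = 0.0   # start at rest
    v[-1] = 0.0  # stop at the end
    # Backward pass: every point must be reachable under the braking limit.
    for i in range(n - 2, -1, -1):
        v[i] = min(v[i], math.sqrt(v[i + 1] ** 2 + 2 * a_brake * ds))
    # Forward pass: every point must be reachable under the acceleration limit.
    for i in range(1, n):
        v[i] = min(v[i], math.sqrt(v[i - 1] ** 2 + 2 * a_acc * ds))
    return v

# Straight - curve - straight toy path, 5 m between samples.
profile = velocity_profile(
    curvatures=[0.0, 0.0, 0.2, 0.2, 0.0, 0.0], ds=5.0,
    v_cap=20.0, a_acc=2.0, a_brake=4.0, a_lat=3.0)
```

Each pass is a single sweep over the samples, which is why this family of methods is fast enough for online replanning; FBGA generalizes the per-point limits used in the two sweeps.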
|
| |
| 15:00-16:20, Paper ThuMA.14 | |
| Learning Quadruped Locomotion from Casual Videos |
|
| Hausdörfer, Oliver | Technical University of Munich (Chair of Safety, Performance and Reliability for Learning Systems, Prof. Angela Schoellig) |
| von Rohr, Alexander | Technical University of Munich |
| Skubacz, Filip | Technical University of Munich |
| Omar, Shafeef | Munich Institute of Robotics and Machine Intelligence, Technical University of Munich |
| Zhou, Siqi | Technical University of Munich |
| Khadiv, Majid | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
|
|
| |
| 15:00-16:20, Paper ThuMA.15 | |
| A Participatory Interview Study for Service Robots and Their Value for Care |
|
| Klein, Stina | University of Augsburg |
| Shen, Shuyuan | University of Augsburg |
| Andre, Elisabeth | University of Augsburg |
| Kraus, Matthias | University of Augsburg |
Keywords: Human-Centered Robotics, Long term Interaction, Social HRI
Abstract: Care facilities increasingly consider service robots (SRs) to mitigate staff shortages and documentation burden, yet adoption often stalls because robots misalign with care as a value-driven, relational practice. We report a participatory interview study in three German care facilities with caregivers (n=7) and care recipients (n=3), grounded in Value-Sensitive Design (VSD). Stakeholders identified credible SR roles in logistics, documentation support, reminders, and wayfinding/visitor guidance, but drew strong boundaries around intimate and safety-critical care tasks. Acceptance depends less on any single function than on whether SRs can adapt their initiative, autonomy, timing, and interaction modality to preserve human attentiveness and warmth, while sustaining care recipients' independence and addressing their safety concerns as well as caregivers' job security, control over their workload, and legal constraints. We synthesize these insights into a framing of fluid adaptivity as an operational bridge from abstract values to concrete robot behavior.
|
| |
| 15:00-16:20, Paper ThuMA.16 | |
| COFFAIL: A Dataset of Successful and Anomalous Robot Skill Executions in the Context of Coffee Preparation |
|
| Mitrevski, Alex | Chalmers University of Technology |
| Salunke, Ayush | Hochschule Bonn-Rhein-Sieg |
Keywords: Data Sets for Robot Learning, Learning from Demonstration
Abstract: In the context of robot learning for manipulation, curated datasets are an important resource for advancing the state of the art; however, available datasets typically only include successful executions or are focused on one particular type of skill. In this short paper, we briefly describe a dataset of various skills performed in the context of coffee preparation. The dataset, which we call COFFAIL, includes both successful and anomalous skill execution episodes collected with a physical robot in a kitchen environment, a couple of which are performed with bimanual manipulation. In addition to describing the data collection setup and the collected data, the paper illustrates the use of the data in COFFAIL to learn a robot policy using imitation learning.
|
| |
| 15:00-16:20, Paper ThuMA.17 | |
| Transparent Robot Skill Execution Using Visual Predictive Capabilities |
|
| Mitrevski, Alex | Chalmers University of Technology |
| Zhang, Jing | Chalmers University of Technology |
| Ramirez-Amaro, Karinne | Chalmers University of Technology |
| Dean, Emmanuel | Chalmers University of Technology |
Keywords: Cognitive Control Architectures, Learning from Experience
Abstract: Learning-based robot skills represent the current state of the art in robot manipulation, but they can struggle to generalise to out-of-distribution tasks, and failures may be difficult to understand and resolve. In our ongoing work, we aim to develop a more interpretable framework that combines a learned policy with a learned forward model and a learned semantic representation that facilitates monitoring and simplifies adaptation. In this short paper, we briefly describe the ideas we pursue in this direction, with a concrete focus on the forward modelling aspect. We particularly discuss two network-based forward model variants and illustrate some preliminary results of the obtained predictions on a domestic object pick-up task.
|
| |
| 15:00-16:20, Paper ThuMA.18 | |
| Robustness Evaluation of Uncertainty-Gated Intention Prediction with Noise and Dropouts |
|
| Mees, Hans | Eberhard Karls Universität Tübingen |
| Gaus, Johannes Albert | University of Tuebingen |
| Schmitt, Syn | University of Stuttgart, Germany |
| Haeufle, Daniel Florian Benedict | University of Tübingen |
|
|
| |
| 15:00-16:20, Paper ThuMA.19 | |
| LeARN: Learnable and Adaptive Representations for Nonlinear Dynamics in System Identification |
|
| Singh, Arunabh | Birla Institute of Technology and Science, Hyderabad |
| Mukherjee, Joyjit | Birla Institute of Technology and Science, Hyderabad |
Keywords: RIG TC: Principles and Methods for Building AI-powered Robust and Resilient Robots, Calibration and Identification, Model Learning for Control
Abstract: System identification, the process of deriving mathematical models of dynamical systems from observed input-output data, has undergone a paradigm shift with the advent of learning-based methods. Addressing the intricate challenges of data-driven discovery in nonlinear dynamical systems, these methods have garnered significant attention. Among them, Sparse Identification of Nonlinear Dynamics (SINDy) has emerged as a transformative approach, distilling complex dynamical behaviors into interpretable linear combinations of basis functions. However, SINDy's reliance on domain-specific expertise to construct its foundational 'library' of basis functions limits its adaptability and universality. In this work, we introduce a nonlinear system identification framework, LeARN, that transcends the need for prior domain knowledge by learning the library of basis functions directly from data. To enhance adaptability to evolving system dynamics under varying noise conditions, we employ a novel meta-learning-based system identification approach that utilizes a lightweight Deep Neural Network (DNN) to dynamically refine these basis functions. This not only captures intricate system behaviors but also adapts seamlessly to new dynamical regimes. We validate our framework on the Neural Fly dataset, showcasing its robust adaptation and generalization capabilities. Despite its simplicity, our LeARN achieves dynamical error performance competitive with SINDy. This work presents a step towards autonomous discovery of dynamical systems, paving the way for a future where machine learning uncovers the governing principles of complex systems without requiring extensive domain-specific interventions.
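The SINDy baseline the abstract refers to can be sketched in a few lines: stack a hand-chosen library of candidate basis functions into a matrix and solve for sparse coefficients by sequentially thresholded least squares. This toy example (a known linear system, illustrative names throughout) shows the fixed-library step that LeARN replaces with learned basis functions; it is not the LeARN method itself.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, iters=10):
    """Sequentially thresholded least squares, the sparse solver behind SINDy."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(iters):
        small = np.abs(xi) < threshold
        xi[small] = 0.0          # prune weak terms from the model
        big = ~small
        if big.any():            # refit the surviving terms only
            xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Data from the known system dx/dt = -2x, i.e. x(t) = exp(-2t).
t = np.linspace(0.0, 2.0, 200)
x = np.exp(-2.0 * t)
dxdt = -2.0 * x
# Hand-chosen candidate library: [1, x, x^2, x^3] -- the part SINDy
# requires domain expertise for, and which LeARN learns from data.
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])
xi = stlsq(theta, dxdt)
# xi is sparse: only the coefficient of x survives, close to -2.
```

The identified model is read off directly from the nonzero entries of `xi`, which is what makes the representation interpretable.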
|
| |
| 15:00-16:20, Paper ThuMA.20 | |
| Efficient Fine-Tuning of VLA Models for Industrial Manipulation |
|
| Wrede, Konstantin | Fraunhofer IIS/EAS |
| Di, Yibo | Fraunhofer IIS |
| Neumann, Julius | Fraunhofer IIS/EAS |
| Martin, Ron | Fraunhofer IIS/EAS |
| Schneider, Peter | Fraunhofer IIS/EAS |
Keywords: AI-Enabled Robotics, Data Sets for Robot Learning
Abstract: This work investigates the potential of Vision-Language-Action (VLA) models for industrial robotic manipulation, aiming to address the need for flexible automation solutions. By deploying and evaluating the open-source VLA model π0 (openpi) on a Franka Panda robot, this study analyzes the data efficiency and generalization capabilities of robot foundation models in industrial settings. We compare three training strategies: Real-only, Sim-and-Real Co-Training, and Sim-then-Real across two tasks of varying complexity: Plug Removal and Long-Horizon Object Sorting. Results demonstrate that the Sim-then-Real approach significantly outperforms other strategies, achieving high success rates with minimal real-world data (100% success on Plug Removal with only 10 real demonstrations). This study shows how to efficiently leverage simulation-based pre-training coupled with real-world fine-tuning in industrial robotic manipulation tasks.
|
| |
| 15:00-16:20, Paper ThuMA.21 | |
| Human-Interpretable Uncertainty Explanations for Point Cloud Registration |
|
| Gaus, Johannes Albert | University of Tuebingen |
| Schneider, Loris | Karlsruhe Institute of Technology |
| Shi, Yitian | Karlsruhe Institute of Technology |
| Lee, Jongseok | German Aerospace Center |
| Rayyes, Rania | Karlsruhe Institute of Technology (KIT) |
| Triebel, Rudolph | German Aerospace Center (DLR) |
|
|
| |
| 15:00-16:20, Paper ThuMA.22 | |
| Autonomous Docking of Multi-Rotor UAVs on Blimps under the Influence of Wind Gusts |
|
| Goldschmid, Pascal | University of Stuttgart |
| Ahmad, Aamir | University of Stuttgart |
Keywords: Aerial Systems: Applications, Aerial Systems: Mechanics and Control, Aerial Systems: Perception and Autonomy
Abstract: Multi-rotor UAVs face limited flight time due to battery constraints. Autonomous docking on blimps with onboard battery recharging and data offloading offers a promising solution for extended UAV missions. However, the vulnerability of blimps to wind gusts causes trajectory deviations, requiring precise, obstacle-aware docking strategies. To this end, this work introduces two key novelties: (i) a temporal convolutional network that predicts blimp responses to wind gusts, enabling rapid gust detection and estimation of points where the wind gust effect has subsided; (ii) a model predictive controller (MPC) that leverages these predictions to compute collision-free trajectories for docking, enabled by a novel obstacle avoidance method for close-range maneuvers near the blimp. Simulation results show that our method significantly outperforms a baseline using a constant-velocity model of the blimp across different scenarios. We further validate the approach in real-world experiments, demonstrating the first autonomous multi-rotor docking control strategy on blimps shown outside simulation. Source code is available at https://github.com/robot-perception-group/multi_rotor_airship_docking.
|
| |
| 15:00-16:20, Paper ThuMA.23 | |
| Joint Denoising and Motion Estimation with Event Cameras |
|
| Shiba, Shintaro | Keio University |
| Aoki, Yoshimitsu | Keio University |
| Gallego, Guillermo | Technische Universität Berlin |
Keywords: Computer Vision for Automation, RIG TC: Robot Perception, RIG Cluster: Rigorous Perception
Abstract: Event cameras are emerging vision sensors whose noise is challenging to characterize. Existing denoising methods for event cameras are often designed in isolation and thus consider other tasks, such as motion estimation, separately (i.e., sequentially after denoising). However, motion is an intrinsic part of event data, since scene edges cannot be sensed without motion. We propose the first method that simultaneously estimates motion in its various forms (e.g., ego-motion, optical flow) and noise. The method is flexible, as it allows replacing the one-step motion estimation of the widely-used Contrast Maximization framework with any other motion estimator, such as deep neural networks. The experiments show that the proposed method achieves state-of-the-art results on the E-MLB denoising benchmark and competitive results on the DND21 benchmark, while demonstrating effectiveness across motion estimation and intensity reconstruction tasks. Our approach advances event-data denoising theory and expands practical denoising use-cases via open-source code. Project page: https://github.com/tub-rip/ESMD
|
| |
| 15:00-16:20, Paper ThuMA.24 | |
| Towards Mixed-Reality-Based Robot Programming |
|
| Pfister, Tom | Technical University of Applied Sciences Würzburg-Schweinfurt |
| Lang, Silvio | Technical University of Applied Sciences Würzburg-Schweinfurt (thws) |
| Kaupp, Tobias | Technical University of Applied Sciences Würzburg-Schweinfurt |
|
|
| |
| 15:00-16:20, Paper ThuMA.25 | |
| Memory-Aware Environmental Knowledge Sharing for Cooperative Autonomous Robot Systems |
|
| Helten, Catharina | RPTU University of Kaiserslautern-Landau |
| Wolf, Patrick | University of Kaiserslautern-Landau | Fraunhofer IESE |
Keywords: RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics, Cooperating Robots
Abstract: Cooperative autonomous operation requires effective coordination under uncertainty. Leader-follower convoying in off-road environments is challenged by limited visibility, which makes purely local perception and short-term data exchange insufficient. This paper presents a memory-aware leader-follower concept that introduces persistent, confidence-aware environmental knowledge as a basis for cooperation. Vehicles maintain structured short- and long-term representations of environmental context and selectively share abstract, confidence-annotated knowledge via vehicle-to-vehicle (V2V) communication. By explicitly distinguishing transient observations from persistent environmental properties and reasoning about their reliability, follower vehicles adapt their behavior through interpretable behavior modes. This enables robust and self-adaptive convoy operation under uncertainty without explicit path or trajectory sharing.
|
| |
| 15:00-16:20, Paper ThuMA.26 | |
| Task-Adaptive Perception for Human-Robot Interaction |
|
| Mania, Patrick | University of Bremen |
| Beetz, Michael | University of Bremen |
Keywords: RIG TC: Robot Perception, Perception for Grasping and Manipulation, RIG TC: Semantic Perception
Abstract: Human-robot interaction (HRI) requires perception systems that can handle diverse tasks such as multi-person re-identification, gaze estimation and control, gesture and activity recognition, body posture analysis, and spatial reasoning in dynamic crowds. These requirements span both single-shot tasks (e.g., classifying guest attributes) and continuous perception (e.g., tracking speakers or monitoring groups over time), often in unstructured environments where rigid pipelines are insufficient. However, no single perception algorithm can robustly address this spectrum: human detection, re-identification, gaze estimation, activity recognition, and continuous tracking differ fundamentally in their assumptions, data modalities, and temporal characteristics, making monolithic solutions impractical in real-world HRI. To address these limitations, our system RoboKudo enables generalized, task-adaptive perception through query-driven Perception Pipeline Trees (PPTs). PPTs use Behavior Trees (BTs) to compose specialized vision experts into task-specific pipelines at runtime, while a shared data structure allows annotations and belief state information to be maintained consistently across single-shot and continuous execution. By supporting reactivity, looping, and parallel inference within a unified process model, RoboKudo accommodates the heterogeneous demands of HRI. We demonstrate this capability in the Receptionist and Restaurant challenges at RoboCup@Home, which require seamless integration of diverse perception behaviors.
|
| |
| 15:00-16:20, Paper ThuMA.27 | |
| Task-Based Evaluation of Robot Foot Geometries for Granular Substrates Using Three-Dimensional Resistive Force Theory |
|
| Aslam, Umair | RWTH Aachen University |
| Adak, Omer Kemal | RWTH Aachen |
| Fuentes, Raul | RWTH Aachen |
Keywords: Legged Robots, Dynamics, Field Robots
Abstract: Foot geometry plays a critical role in legged robot locomotion on granular substrates, influencing thrust generation, energetic cost, load support, and stability. However, systematic evaluation of robot foot designs in three-dimensional granular interaction remains limited. In this paper, we present a compact, task-based framework for evaluating rigid robot foot geometries using three-dimensional resistive force theory (3D-RFT). A representative stance-phase interaction is defined, consisting of vertical intrusion followed by horizontal shear under prescribed yaw misalignment. Performance is quantified using stroke-integrated metrics capturing thrust, energetic cost, and peak yaw moment, complemented by a sinkage-based proxy for load support. The framework is applied to three representative foot geometries (a flat plate, a high-aspect-ratio ski, and a ribbed plate) under identical kinematic conditions. Results reveal clear trade-offs between shallow-sinkage load support, thrust generation, and yaw robustness, demonstrating the utility of 3D-RFT as a practical design evaluation tool for robot feet interacting with granular media.
|
| |
| 15:00-16:20, Paper ThuMA.28 | |
| Towards Whole-Body VLA: A Scalable Data Collection Framework for Quadrupedal Mobile Manipulators |
|
| Gao, Yuan | Technical University of Munich |
| Piccinini, Mattia | Technical University of Munich |
| Betz, Johannes | Technical University of Munich |
Keywords: Whole-Body Motion Planning and Control, Data Sets for Robot Learning, RIG TC: Robotics Foundation Models
Abstract: Quadrupedal mobile manipulators combine locomotion and manipulation for versatile operation in unstructured environments. While conventional model-based control ensures stability, it often lacks the generalization capabilities required for diverse daily tasks. Conversely, data-driven approaches offer a promising alternative but are hindered by a critical scarcity of unified whole-body data. To bridge this gap, we introduce a scalable dataset collection pipeline designed to enable the training of Vision Language Action (VLA) models. Our framework automates the generation of diverse demonstrations through a two-stage process: bootstrapping from expert model-based controllers and scaling via autonomous rollouts. This work provides the foundational data infrastructure to extend the success of VLA models to quadrupedal mobile manipulators.
|
| |
| 15:00-16:20, Paper ThuMA.29 | |
| ANN-CMCGS: Generalizing Continuous Monte Carlo Graph Search with Approximate Nearest Neighbors |
|
| Scherer, Christoph | Technical University Berlin |
| Hoenig, Wolfgang | Technical University Berlin |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, Motion and Path Planning, Planning under Uncertainty
Abstract: Continuous Monte Carlo Graph Search (CMCGS) enables state reuse in continuous domains but relies on a layered, acyclic structure, limiting its effectiveness. We introduce ANN-CMCGS, a generalized, non-layered formulation to detect approximate transpositions in continuous spaces via approximate nearest-neighbor search. By allowing arbitrary directed graphs and enabling incremental reuse across decision steps, ANN-CMCGS demonstrates improved exploration efficiency and success rates in challenging continuous domains.
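The core idea of detecting approximate transpositions in a continuous state space can be sketched with a simple spatial hash: states that fall within the same cell are merged onto one graph node, so search effort is reused. This is an illustrative stand-in (all names hypothetical); ANN-CMCGS itself uses approximate nearest-neighbor search over arbitrary directed graphs rather than a fixed grid.

```python
import math

def make_merger(cell_size):
    """Approximate transposition detection via a spatial hash.

    Continuous states closer than roughly `cell_size` collapse onto the
    same graph node, so statistics gathered for one state are reused for
    its near-duplicates.
    """
    nodes = {}

    def node_for(state):
        # Quantize each coordinate; nearby states share a cell key.
        key = tuple(math.floor(c / cell_size) for c in state)
        return nodes.setdefault(key, {"state": state, "visits": 0})

    return node_for

node_for = make_merger(cell_size=0.5)
a = node_for((1.02, 3.40))
b = node_for((1.10, 3.45))  # same cell -> treated as a transposition of `a`
c = node_for((9.00, 0.00))  # distant state -> fresh node
a["visits"] += 1
b["visits"] += 1            # accumulates on the shared node
```

A grid hash misses near-duplicates that straddle a cell boundary, which is one reason a true approximate nearest-neighbor index is the more robust choice for this merging step.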
|
| |
| 15:00-16:20, Paper ThuMA.30 | |
| A Manipulation Pipeline for Grasping Unknown Objects in Heavy Clutter for Decontamination |
|
| Hyseni, Engjell | Karlsruhe Institute of Technology (KIT) |
| Nutto, Sebastian | Karlsruhe Institute of Technology (KIT) |
| Nefzer, Janna | Karlsruhe Institute of Technology (KIT) |
| De Diego Pérez, Miguel | Universitat Jaume I |
| Morales, Antonio | Universitat Jaume I |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Manipulate Anything, Manipulation Planning, Perception for Grasping and Manipulation
Abstract: The decontamination of nuclear waste remains a challenging and largely manual process, exposing human workers to physical strain and potential health risks due to radiation. In this work, we present a manipulation pipeline for grasping unknown objects in heavily cluttered environments, motivated by real-world decontamination scenarios addressed in the ROBDEKON project. The proposed system integrates a robust perception pipeline for scene segmentation, a manipulation framework for grasp generation and selection, and a failure detection and recovery mechanism. Our approach enables autonomous grasping of previously unseen objects in a cluttered scene from containers and prepares them for subsequent decontamination steps, thereby improving worker safety and increasing overall process efficiency.
|
| |
| 15:00-16:20, Paper ThuMA.31 | |
| Towards Maximum Distance and Accurate Throwing by Exploiting Dynamics of Robotic Manipulators |
|
| Barten, Moritz | Karlsruhe Institute of Technology |
| Meyer, Anne | Karlsruhe Institute of Technology |
| Roennau, Arne | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster AI-Powered Industrial Robotics, RIG TC: AI-Robotics in Industry
Abstract: Modern warehouse logistics demand automated and highly efficient solutions to ensure rapid and reliable commissioning processes. Even though there are several approaches addressing robotic throwing, most of them rely on motion primitives, learning only end-effector velocities or specific joint behaviors. These restrictions prevent the robot from fully exploiting its dynamics, which limits the reachable task space and maximum throwing range. To address these gaps, this work proposes an architecture to exploit the dynamic capabilities of a robotic arm, optimizing both throwing distance and accuracy for stationary and moving targets. To this end, we will extend an existing optimization approach and combine it with reinforcement learning. The optimization module will compute an optimal release state for a maximum throwing distance, whereas the reinforcement learning agent will be trained to find a trajectory to the determined release state. In a second step, the trained agent for maximum distance throwing will serve as a pre-trained policy to warm-start the training of a second reinforcement learning agent addressing throwing accuracy. While purely data-driven RL suffers from limited sample efficiency and a lack of physical guarantees, we will incorporate physical knowledge directly into the loss terms of the underlying neural networks.
|
| |
| 15:00-16:20, Paper ThuMA.32 | |
| Development of Robotic Hands for Grasping of Deformable Objects |
|
| Hundhausen, Felix | Karlsruhe Institute of Technology |
| Moosmüller, Moritz | Karlsruhe Institute of Technology (KIT) |
| Ruffler, Daniel | Karlsruhe Institute of Technology (KIT) |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: Grippers and Other End-Effectors, RIG TC: Deformable Object Manipulation
Abstract: Manipulating deformable objects such as textiles remains a key challenge in robotics, requiring dexterous hardware and tactile sensing. This paper presents the design and development of an anthropomorphic five-finger robotic hand for deformable object manipulation. The fingers are realized using a four-bar mechanism to replicate human finger trajectories. We compare two distinctive thumb designs with two actuated degrees of freedom to allow pinch grasps. Each finger incorporates an embedded tactile sensing system without the need for finger internal cables. A hand-internal embedded system is designed for real-time control and sensor data processing. To evaluate the performance of the hand design in grasping deformable objects, a prototype is currently being built.
|
| |
| 15:00-16:20, Paper ThuMA.33 | |
| Composable and Interpretable Theory of Mind for Fluid Human-Robot Collaboration Via Behavior Trees |
|
| Schröder, Florian | Bielefeld University |
| Heinrich, Fabian | Bielefeld University |
| Kopp, Stefan | Bielefeld University |
Keywords: RIG Cluster: Human-Robot Interaction, Human-Robot Collaboration, Intention Recognition
Abstract: Fluid collaboration (FC) refers to highly flexible, real-time teamwork characterized by dynamic task and role allocation, as often observable in everyday human interaction. Enabling fluid human-robot collaboration requires robots to employ online Theory of Mind (ToM), inferring latent mental states such as goals and intentions of others to support decentralized planning. Key challenges include managing the computational complexity of ToM, enabling extensibility to new tasks and strategies, providing planning-relevant mental state representations, and ensuring interpretability and explainability for human-robot interaction. We present an approach to ToM based on Behavior Trees (BTs) to address these challenges and support fluid human-robot collaboration. Our approach offers extensibility, adaptable computational complexity, explicit integration of uncertainty, and reuse of the robot’s action policy for ToM, enabling action-driven mental state representations that map to the robot’s task knowledge.
|
| |
| 15:00-16:20, Paper ThuMA.34 | |
| Entity-Grounded Procedural Knowledge Graphs for Executable Task Understanding from Instructional Videos |
|
| Oguz, Cennet | German Research Center for Artificial Intelligence (DFKI) |
| Ostermann, Simon | Deutsches Forschungszentrum Für Künstliche Intelligenz |
| Neumann, Günter | DFKI GmbH & University of Saarland |
Keywords: Integrated Planning and Learning, Visual Learning, Visual Tracking
Abstract: Instructional videos contain rich procedural knowledge that could support robotic task execution. However, most existing video understanding approaches produce free-form captions or high-level action labels that lack the explicit, entity-centric semantics required for robotic planning. We present Entity-Grounded Procedural Knowledge Graphs (EGPKGs), a neuro-symbolic representation that decomposes instructional videos into explicit entity-level transformations with grounded preconditions and effects. EGPKGs integrate language-based action schemas, vision-based entity grounding, and symbolic state transitions to produce executable task representations suitable for AI-powered robotic systems.
|
| |
| 15:00-16:20, Paper ThuMA.35 | |
| Visual Event-Gait-Based Human Following for Quadruped Robots |
|
| Nguyen, Hong Phuoc Nguyen | Karlsruhe Institute of Technology (KIT) |
| Roennau, Arne | Karlsruhe Institute of Technology (KIT) |
Keywords: Human Detection and Tracking, Biologically-Inspired Robots, Recognition
Abstract: As personal service robots transition into human-centric environments, autonomous human-following capabilities are essential for practical deployment. This work presents a robust pipeline for gait-based human following utilizing an event-based camera, leveraging its high temporal resolution to track fast-moving subjects. Our approach employs a detection network to identify pedestrians, followed by a recognition network that utilizes unique walking gait biometric features for human identification. Once a target is identified, the system enables continuous tracking and autonomous following. We introduce a novel event frame representation that retains information from the previous accumulation window, significantly enhancing network performance in dynamic settings. Experimental results confirm the effectiveness of gait-based recognition in real-world scenarios, demonstrating high reliability even when subjects are unseen during the training phase.
|
| |
| 15:00-16:20, Paper ThuMA.36 | |
| Sampling-Based Trajectory Optimization for Humanoid Loco-Manipulation Motion Retargeting |
|
| Dhédin, Victor | Technical University of Munich |
| Khadiv, Majid | Technical University of Munich |
Keywords: RIG TC: Foundations of Optimization and Learning for Robotics, Humanoid Robot Systems, Multi-Contact Whole-Body Motion Planning and Control
Abstract: In this work, we present a sampling-based trajectory optimization framework that retargets imperfect kinematic humanoid loco-manipulation demonstrations into dynamically feasible motions. Our method leverages the temporal structure of the tracking objective by incrementally increasing the optimization horizon, enabling the use of single-shooting to optimize long trajectories efficiently. We validate the approach by successfully retargeting hundreds of demonstrated motions on a fully actuated humanoid interacting with a box. The framework also generalizes across varying object properties such as mass, size, and geometry with the exact same tracking objective. This ability to robustly retarget diverse demonstrations opens the door to generating large-scale synthetic datasets of humanoid loco-manipulation trajectories, addressing a major bottleneck in real-world data collection.
|
| |
| 15:00-16:20, Paper ThuMA.37 | |
| Race Car Aerobatics Via Position-Indexed Iterative-Learning Control |
|
| Wildberger, Lukas | RWTH Aachen University |
| Hose, Henrik | RWTH Aachen |
| Solowjow, Friedrich | RWTH Aachen University |
| Trimpe, Sebastian | RWTH Aachen University |
Keywords: RIG TC: Foundations of Optimization and Learning for Robotics, Wheeled Robots, Learning from Experience
Abstract: We develop a robust 1:10 scale race car platform for repeated autonomous jump experiments with controlled takeoff, free flight, and touchdown. A position-indexed iterative learning control (ILC) formulation is proposed to refine open-loop jump maneuvers from hardware data while mitigating timing variability. Using this approach, precise and reliable jumps exceeding 2 m are achieved within 50 learning iterations.
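The iteration described in this abstract can be illustrated with a toy position-indexed ILC update, u_{k+1}(s) = u_k(s) + L·e_k(s), applied per position bin rather than per time step so that timing variability does not smear the correction. The gain, trivial stand-in plant, and function names are illustrative assumptions, not the authors' formulation:

```python
import numpy as np

def ilc_update(u: np.ndarray, e: np.ndarray, gain: float = 0.5) -> np.ndarray:
    """One position-indexed ILC iteration: correct the feedforward input at
    each position bin by a fraction of that bin's tracking error."""
    return u + gain * e

# toy example: learn to track a constant reference through a unit-gain plant
ref = np.ones(100)          # desired output over 100 position bins
u = np.zeros(100)
for _ in range(20):
    y = u                   # trivial plant y = u (stand-in for the hardware)
    e = ref - y             # position-indexed tracking error
    u = ilc_update(u, e)
```

With a gain of 0.5 and this unit-gain plant, the residual error halves every iteration, mirroring how the paper's maneuvers converge over repeated hardware trials.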
|
| |
| 15:00-16:20, Paper ThuMA.38 | |
| Soft Drag Gripper - a Soft Simultaneous Multiple Object Gripper Designed to Work in Rectangle Boxes |
|
| Friedl, Werner | German Aerospace Center (DLR) |
Keywords: Grippers and Other End-Effectors, Robotics and Automation in Agriculture and Forestry, Soft Robot Applications
Abstract: In the field of logistics, humans have the ability to grasp multiple objects simultaneously. This paper presents a hardware solution in the form of the Soft Drag Gripper (SDG), which demonstrates excellent capabilities for grasping multiple objects simultaneously. The drag design enables efficient emptying of rectangular boxes. Simple strategies can be employed to increase the number of picks per grasp, thereby reducing time and costs. A benchmark compares the SDG to existing design solutions, showing a higher pick rate than other designs.
|
| |
| 15:00-16:20, Paper ThuMA.39 | |
| Including Meshed-Based Costlayers in Path Following Control |
|
| Braun, Justus | Osnabrück University |
| Mock, Alexander | Osnabrück University |
| kl. Piening, Malte | Nature Robots |
| Wiemann, Thomas | Fulda University of Applied Sciences |
Keywords: Task and Motion Planning, Motion and Path Planning, RIG Cluster: Field Robotics
Abstract: Robust collision-free navigation in uneven terrain is necessary for autonomous robots to be deployed in dynamic outdoor environments. The mobile robot navigation problem can be split into global path planning and path-following control. One method that has been shown to be an effective solution to the control problem is Model Predictive Control (MPC). Existing solutions either assume that the robot moves on a flat surface or use 2D height maps, which are limited to single-story environments. In this contribution, we provide a proof of concept that incorporates mesh-based cost maps into MPPI (Model Predictive Path Integral control, a sampling-based MPC variant). It uses a unified cost map representation of geometric traversability metrics derived from the mesh geometry and unmapped obstacles detected using 3D LiDAR sensors. We show that the representation of terrain geometry used by our controller enables safer behaviors than existing 2D and 3D methods.
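For readers unfamiliar with MPPI: it scores sampled control rollouts against a cost map (here mesh-based) and averages the sampled perturbations with softmin importance weights. A generic sketch of that weighting step, not the authors' code, with the temperature parameter `lam` chosen arbitrarily:

```python
import numpy as np

def mppi_weights(costs: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Softmin importance weights over sampled rollout costs, as used in
    MPPI to average control perturbations: low-cost rollouts dominate."""
    c = costs - costs.min()     # shift for numerical stability
    w = np.exp(-c / lam)
    return w / w.sum()
```

The controller then applies the weighted mean of the sampled control sequences; rollouts crossing high-cost mesh regions receive exponentially smaller weight.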
|
| |
| 15:00-16:20, Paper ThuMA.40 | |
| Human Gesture & Activity Recognition in Scene Context for Intralogistic Mobile Robots |
|
| Käs, Stephanie | RWTH Aachen University |
| Linder, Timm | Robert Bosch GmbH |
| Leibe, Bastian | RWTH Aachen University |
Keywords: Datasets for Human Motion, Gesture, Posture and Facial Expressions, Human-Robot Collaboration
Abstract: We study human gesture and activity recognition for intralogistic mobile robots using fisheye cameras. To address strong distortions, we propose a dynamic projection selection strategy for monocular 3D human pose estimation, validated on the new FISHnCHIPS dataset. Building on robust pose estimates, we evaluate gesture recognition using skeleton-based models and vision foundation models on NUGGET, highlighting current trade-offs between accuracy and adaptability for human–robot interaction.
|
| |
| 15:00-16:20, Paper ThuMA.41 | |
| Rendering Forces with a Modular Cable System, Motors, and Brakes |
|
| Bartels, Jan U. | Max-Planck Institute for Intelligent Systems |
| Achberger, Alexander | University of Stuttgart |
| Kuchenbecker, Katherine J. | Max Planck Institute for Intelligent Systems |
| Sedlmair, Michael | University of Stuttgart |
Keywords: Haptics and Haptic Interfaces, Virtual Reality and Interfaces, Human-Centered Robotics
Abstract: We describe the hardware design, force-rendering approach, and evaluation of a new reconfigurable haptic interface consisting of a network of hybrid motor-brake actuation modules that apply forces via cables. Each module contains both a motor and a brake, enabling it to smoothly render active forces up to 6 N using its motor and collision forces up to 186 N using its passive one-way brake. The modular design, meanwhile, allows the system to deliver rich haptic feedback in a flexible number of DoF and widely ranging configurations.
|
| |
| 15:00-16:20, Paper ThuMA.42 | |
| BATEX: Biarticular Soft Exosuit Assistance Improves Walking Efficiency |
|
| Ahmadi, Arjang | Technische Universität Darmstadt |
| Firouzi, Vahid | Technical University of Darmstadt |
| Seyfarth, Andre | TU Darmstadt |
| Rinderknecht, Stephan | TU Darmstadt |
| Findeisen, Rolf | Control and Cyber-Physical Systems Laboratory |
| Sharbafi, Maziar | Technische Universität Darmstadt |
Keywords: Wearable Robotics, RIG TC: Robotic Augmentation of the Human Body, RIG Cluster: Healthcare Robotics and Human Augmentation
Abstract: Human locomotion is a highly adaptive and efficient process shaped by biomechanical and neuromuscular control. However, aging and neuromuscular impairments can reduce walking efficiency and increase the metabolic cost of movement. This study investigates the Biarticular Thigh Exosuit (BATEX), a soft wearable device designed to assist both hip and knee joints through coordinated biarticular actuation. By supporting the function of the rectus femoris and hamstring muscle groups, BATEX aims to improve locomotor efficiency and reduce the energetic demands of walking. An experimental study with 12 participants evaluated the effects of BATEX on energy consumption. Results demonstrate that BATEX significantly lowers energy expenditure during walking, achieving a 9% reduction compared to the No-Exosuit (NE) condition and an 18% reduction compared to the Zero-Torque (ZT) condition. These findings indicate that BATEX can enhance walking efficiency while reducing neuromuscular effort, supporting its potential for applications in healthcare and human-robot interaction, such as mobility assistance.
|
| |
| 15:00-16:20, Paper ThuMA.43 | |
| A Human-Centered Perspective on Interactive Robot Learning |
|
| Beierling, Helen | Bielefeld University |
| Vollmer, Anna-Lisa | Bielefeld University |
Keywords: Human-Centered Robotics, Human-Robot Collaboration, Human Factors and Human-in-the-Loop
Abstract: Recent advances in artificial intelligence and robotics have enabled robots to increasingly enter everyday environments, including highly burdened domains such as healthcare. However, these contexts are strongly shaped by individual needs, making it infeasible to rely solely on preprogrammed robot behaviors. Therefore, robots must be trained by end users, who are often non-experts. While Human-in-the-Loop and interactive robot learning approaches address this challenge by incorporating user feedback, they commonly assume that users are able to provide effective and meaningful input. In practice, mismatches between users’ mental models and the robot’s learning process can slow down or hinder learning. To address these challenges, we developed an initial implementation of Co-Constructive Training (CCT) within a four-year research project. CCT conceptualizes robot learning as a mutual process in which both the user and the system monitor and scaffold each other, aiming to align user understanding and robot learning for more effective human–robot interaction.
|
| |
| 15:00-16:20, Paper ThuMA.44 | |
| Baseline Lower-Limb Kinematics Are Associated with Individual Responses to Exosuit Assistance |
|
| Firouzi, Vahid | Technical University of Darmstadt |
| von Stryk, Oskar | Technische Universität Darmstadt |
| Sharbafi, Maziar | Technische Universität Darmstadt |
Keywords: Prosthetics and Exoskeletons, RIG Cluster: Healthcare Robotics and Human Augmentation, RIG TC: Robotic Augmentation of the Human Body
Abstract: Individuals respond differently to exosuit assistance, with some experiencing metabolic benefits and others not. This study examined whether baseline lower-limb joint kinematics during unassisted walking differ between positive and negative responders to a passive biarticular exosuit. Subjects were classified based on metabolic cost changes across multiple assisted configurations, and unassisted joint kinematics were compared between groups. Significant differences were found in hip flexion–extension, hip abduction–adduction, and knee flexion–extension. These results suggest that baseline gait mechanics may help predict responsiveness to exosuit assistance.
|
| |
| 15:00-16:20, Paper ThuMA.45 | |
| Bipedal Robot Squatting Control Using Human Kinematics |
|
| Jiang, Yelin | Technical University of Darmstadt |
| Zhao, Guoping | Southeast University |
| Haufe, Dennis | Technische Universität Darmstadt |
| Findeisen, Rolf | Control and Cyber-Physical Systems Laboratory |
| Ahmad Sharbafi, Maziar | Technical University of Darmstadt |
Keywords: Humanoid and Bipedal Locomotion, RIG Cluster: Legged Locomotion, Biologically-Inspired Robots
Abstract: Controlling humanoid robot locomotion can be challenging, while biological systems demonstrate adaptive and robust locomotion with minimal control efforts. In that sense, human kinematics hold potential for locomotion controller design. Among various human movements, squatting serves as a fundamental behavior that integrates both stance and balance subfunctions of locomotion. This study investigates how observed human kinematics can be leveraged to control squatting motions in a humanoid robot. We propose a bioinspired open-loop controller to map human joint angles to robot reference trajectories. This controller was implemented on our simulation model and real robot. We explored the parameter space of joint gains to evaluate three key performance metrics: stability, efficiency, and similarity. Experimental results show that our kinematic-based controller is effective for human-like squatting behaviors. By tuning gains, trade-offs among stability, efficiency, and similarity can be achieved to obtain optimal performance. This work contributes to the control of bipedal squatting motion and the understanding of bio-inspired legged locomotion.
|
| |
| 15:00-16:20, Paper ThuMA.46 | |
| Large Language Models for Automatic Specification Design in Supervisory Control of Multi-Robot Systems |
|
| Isildar, Ecem | Technical University of Darmstadt |
| Miyauchi, Genki | The University of Sheffield |
| Gross, Roderich | Technical University of Darmstadt |
Keywords: RIG Cluster Multi-Robot Systems, RIG TC: Swarm Robotics, RIG TC: Multi-Robot Coordination
Abstract: This paper explores using Large Language Models (LLMs) to generate formal control specifications for Supervisory Control Theory (SCT) to ensure safe multi-robot navigation in unmapped environments. Manual specification design is labor-intensive, while direct LLM commanding poses significant safety risks, including collisions and unpredictable behavior in human-populated spaces; our approach bridges this gap by leveraging LLMs for formal synthesis. Results show that GPT-4.1 reliably generates valid specifications from predefined events, enabling collision- and deadlock-free collaborative patrolling. Furthermore, the model effectively filters redundant events and discovers strategies that improve environmental coverage, highlighting the potential for LLMs to accelerate safe controller design.
|
| |
| 15:00-16:20, Paper ThuMA.47 | |
| Robotics in Sensitive Settings: Lessons Learned from Case Studies Exploring Real-World Integration |
|
| Rixen, Jan Ole | Karlsruhe Institute of Technology |
| Gerling, Kathrin | KIT |
| Neef, Caterina | Karlsruhe Institute of Technology |
| Bruno, Barbara | Karlsruhe Institute of Technology (KIT) |
| Herzog, Olivia | Technical University of Munich |
| Ackermann, Marko | Karlsruhe Institute of Technology |
| Mombaur, Katja | Karlsruhe Institute of Technology |
| Pascher, Max | TU Dortmund University |
| Gerken, Jens | TU Dortmund University |
| Vollmer, Anna-Lisa | Bielefeld University |
Keywords: RIG TC: Human-Robot Interaction in Sensitive Settings, RIG Cluster: Human-Robot Interaction, Long term Interaction
Abstract: In this work, we draw on six case studies conducted to gain insights into how robots can support stakeholders in different sensitive settings. In this way, we identify and outline research opportunities for future work within the robotics community that bridges the gap between human-oriented and technical robotics research, enabling the development of solutions that are both effective and suitably aligned with societal demands.
|
| |
| 15:00-16:20, Paper ThuMA.48 | |
| AI-Ready Information Architecture for Smart Factories with RFID, Edge Intelligence, Digital Twins, and Policy Control |
|
| Nagrath, Vineet | Technical University of Munich (TUM) |
| Rajaei, Nader | Technical University of Munich |
| Lilienthal, Achim J. | TU Munich |
Keywords: Mapping, Integrated Planning and Control, RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics
Abstract: Smart factories increasingly rely on Artificial Intelligence (AI) operating over shared physical infrastructure, heterogeneous robots, and multi-stakeholder environments. This paper presents an information architecture that positions RFID-based localization as a foundational sensing layer for AI-driven cyber-physical production systems (CPPS). The architecture integrates high-performance UHF RFID (Siemens SIMATIC RF600), edge computing, digital twins, and policy-governed control (AI.Lock) to provide real-time awareness of assets, personnel, robots, tools, and work-in-progress. By combining RFID proximity, reflected-power triangulation, digital twin baselines, and trajectory persistence, the system enables probabilistic localization, safety enforcement, experiment traceability, and scalable coordination across trustless, multi-vendor environments. The architecture supports use cases ranging from safety geofencing to automated bill-of-material verification and synthetic data generation for machine learning. This paper details system components, key capabilities, and application domains, positioning the information architecture as a critical enabler of safe, explainable, and reusable AI in Industry 4.0.
|
| |
| 15:00-16:20, Paper ThuMA.49 | |
| Evaluation and Future Prospects of the SHIVAA Strawberry-Picking Robot |
|
| Wirkus, Malte | German Research Center for Artificial Intelligence (DFKI) |
| Peters, Heiner | German Research Center for Artificial Intelligence (DFKI) |
| Janzen, Janne | German Research Center for Artificial Intelligence (DFKI) |
| Stark, Tobias | German Research Center for Artificial Intelligence (DFKI) |
| Stoeffler, Christoph | German Research Center for Artificial Intelligence (DFKI) |
Keywords: RIG TC: Agri-Robotics, Robotics and Automation in Agriculture and Forestry, RIG Cluster Multi-Robot Systems
Abstract: The SHIVAA robot was specifically designed for harvesting strawberries grown in outdoor environments. The system features a lightweight manipulator and a perception system based on multispectral camera images for strawberry detection and classification. A passive suspension mechanism ensures all-wheel contact with uneven open-field terrain. A series of field tests was conducted during the 2024 and 2025 strawberry seasons on different professional strawberry plantations at various stages throughout the season. The aim was to evaluate the system's performance in picking strawberries and navigating within rows of plants. Performance parameters such as manipulation success rate, damage or bycatch rate, and total output were determined from the data acquired during the field and outdoor laboratory tests. In addition to the field tests, opportunities to increase the operating speed of the system were identified. Video analysis revealed potential for optimizing high-level coordination, and laboratory tests determined the maximum manipulator speed. To obtain an initial limit value for maximum movement speed, optimal trajectory plans for the manipulator's upward and downward movements were generated using an iterative linear-quadratic regulator. Differential times of 0.8 seconds were feasible in laboratory experiments. During normal operation, the system's individual capabilities are combined to create an autonomous sequence control for the gripping process. Some sequential actions can also be performed in parallel to save time. For example, the manipulator can be moved to its rest position at the same time as moving to the next harvesting section. Additionally, the linear joint can be integrated into the manipulator control system, meaning it no longer needs to be controlled individually during harvesting or fruit placement. Currently, the robot is being further developed to operate within a hybrid team of human field workers and other robots to complete the field logistics.
|
| |
| 15:00-16:20, Paper ThuMA.50 | |
| An Open-Source Humanoid Research Platform for Democratizing Robotics (pib Introduction for Researchers) |
|
| Okujava, Shota | Isento GmbH |
| Baier, Jürgen | Isento GmbH |
Keywords: Education Robotics, Developmental Robotics, Embedded Systems for Robotic and Automation
Abstract: Research in humanoid robotics is often hindered by high hardware costs and proprietary software barriers. This paper introduces pib (printable intelligent bot), an open-source, 3D-printable research platform designed to democratize access to advanced robotics through a modular hardware approach and agile development processes. Built on industry standards like ROS 2 and Onshape, pib features a comprehensive digital twin environment in Webots and MuJoCo to facilitate seamless Sim2Real transfer and Reinforcement Learning. The platform advances Human-Robot Interaction (HRI) by integrating LLMs and a LangGraph-based "Intelligence Node" for orchestrating complex sensor-actor workflows. Supported by the cloud-based perception platform TRYB, pib accelerates the development of vision models and offers a scalable ecosystem for future mobility and embodied AI research.
|
| |
| 15:00-16:20, Paper ThuMA.51 | |
| Learning to Race in Minutes: Infoprop Dyna on the Mini Wheelbot |
|
| Subhasish, Devdutt | RWTH Aachen University |
| Hose, Henrik | RWTH Aachen |
| Trimpe, Sebastian | RWTH Aachen University |
Keywords: RIG TC: Foundations of Optimization and Learning for Robotics, RIG Cluster: Learning and Multimodal AI for Robotics
Abstract: Reinforcement Learning (RL) has the potential to enable robots with fast, nonlinear, and unstable dynamics to reach the limits of their performance. However, most recent advances rely on carefully designed physics-based simulators and domain randomization to achieve successful sim-to-real transfer within reasonable wall-clock time. In this work, we bypass the need for such simulators and demonstrate that Infoprop Dyna, a state-of-the-art uncertainty-aware model-based reinforcement learning (MBRL) framework, can enable robots to learn directly from real-world interactions. Using Infoprop Dyna, the Mini Wheelbot, an underactuated unicycle robot, learns to race around a track within 11 minutes of real-world experience.
|
| |
| 15:00-16:20, Paper ThuMA.52 | |
| Distributed Boat Detection Via Acoustic Buoy Networks with Consensus-Based Fusion |
|
| Matzdorf, Felix | Technical University of Darmstadt |
| Talamali, Mohamed S. | University of Sheffield |
| Rau, Julian | Technical University of Darmstadt |
| Miyauchi, Genki | The University of Sheffield |
| Watteyne, Thomas | Inria |
| Gross, Roderich | Technical University of Darmstadt |
Keywords: RIG Cluster Multi-Robot Systems, RIG TC: Swarm Robotics, RIG TC: Networked Robotics
Abstract: Multi-agent sensing enables spatially distributed measurements that improve coverage and robustness compared to a single platform. We explore the use of a network of buoys equipped with microphone arrays with the purpose of detecting passing boats. Each buoy estimates time differences of arrival from short audio windows using the generalized cross-correlation with phase transform and derives a local direction-of-arrival under a plane-wave approximation. We fuse measurements by exchanging limited information parameters and running decentralized consensus, yielding a global maximum a posteriori estimate without transmitting raw audio. Simulation results suggest that localization error increases with measurement noise, whereas additional buoys improve accuracy and convergence. These findings indicate a scalable approach for distributed acoustic boat detection.
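The first processing step named in this abstract, generalized cross-correlation with phase transform (GCC-PHAT), can be sketched compactly: whiten the cross-spectrum so that only phase (i.e., delay) information remains, then pick the peak lag. This is a generic textbook version under our own assumptions (function name, regularization constant, frame lengths), not the authors' implementation:

```python
import numpy as np

def gcc_phat(sig: np.ndarray, ref: np.ndarray, fs: float) -> float:
    """Estimate the time difference of arrival (seconds) of `sig` relative
    to `ref` using generalized cross-correlation with phase transform."""
    n = sig.size + ref.size
    X = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    X /= np.abs(X) + 1e-12                 # PHAT weighting: keep phase only
    cc = np.fft.irfft(X, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs
```

Pairwise TDOAs from a microphone array then yield the local direction-of-arrival under the plane-wave approximation, which is the per-buoy quantity fused by the consensus step.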
|
| |
| 15:00-16:20, Paper ThuMA.53 | |
| Magnetic Jamming for Reconfigurable Robotic Structures |
|
| Aktas, Buse | Robotic Composites and Compositions Group, Max Planck Institute for Intelligent Systems |
| Kim, Minsoo | ETH Zurich |
| Bäckert, Marc | ETH Zurich |
| Posada, Alejandro | Max Planck Institute for Intelligent Systems |
|
|
| |
| 15:00-16:20, Paper ThuMA.54 | |
| Applications and Functional Extensions of Vision–Language–Navigation Models in Indoor Environments |
|
| Chen, Donglin | Northeastern University |
| Zhang, Jiazhao | Peking University |
| Liu, Jiahang | Harbin Institute of Technology (Shenzhen) |
| Qiguan, Shiqun | University of Hamburg |
| Liu, Shang-Ching | Universität Hamburg |
| Wang, He | Peking University |
| Zhang, Jianwei | Hamburg University |
Keywords: AI-Enabled Robotics, AI-Based Methods, Vision-Based Navigation
Abstract: In this work, we propose a unified framework for mobile robots based on a vision–language–navigation (VLN) model, enabling navigation-driven execution of multiple indoor tasks. Specifically, we formulate object counting, object searching, and human-instruction-based navigation within a single navigation paradigm, allowing a robot to interpret natural language instructions and accomplish diverse goals through consistent action planning. To validate the feasibility of our approach, we conduct preliminary deployment and evaluation in simulation. Our preliminary results indicate that the framework's performance is constrained by several practical factors, including the gap in data quality, the scale of fine-tuning, the level of instruction specificity, and the flexibility of action control. These factors also point to promising directions for future work. The code has been open-sourced to facilitate further research and development.
|
| |
| 15:00-16:20, Paper ThuMA.55 | |
| Toolbox of Modular Components to Demonstrate Reconfigurable Space Robots |
|
| Langosz, Malte | DFKI GmbH |
| Brinkmann, Wiebke | DFKI Robotics Innovation Center Bremen |
| Schilling, Moritz | University of Bremen |
| Eisenmenger, Jonas | DFKI GmbH |
| Wirkus, Malte | DFKI GmbH |
Keywords: RIG TC: Reconfigurable Robotics, RIG Cluster Multi-Robot Systems, RIG TC: Space Robotics for Sustainability and Exploration
Abstract: The MODKOM (Modular components as Building Blocks for application-specific configurable space robots) project aims to create a toolbox that allows robots to be configured and recombined for specific tasks using specialized, standardized building blocks, throughout different mission phases. To showcase the toolbox's capabilities, a demonstration scenario was created using a selection of hardware and software components. The video shows the demonstration scenario involving autonomous docking, rover reconfiguration and payload deployment, all of which are embedded within a broader mission context.
|
| |
| 15:00-16:20, Paper ThuMA.56 | |
| Contact-Implicit Optimization for Sequential Object Placements |
|
| Zhang, Yuezhe | Technische Universität Darmstadt |
| Tateo, Davide | Lund University |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Optimization and Optimal Control, Collision Avoidance
Abstract: Robotic object packing has attracted significant research interest in both academia and the automation industry over the last decade. It is challenging due to the curse of dimensionality in the combinatorial search and the difficulty of dealing with dynamic and contact constraints for irregular-shaped objects. Current heuristic and learning-based methods assume a limited resolution of spatial discretization and overlook the significance of contacts. In this work, we eliminate these assumptions by introducing a contact-implicit optimization framework that naturally incorporates contact constraints into Signed Distance Functions. We use a convex decomposition module to divide the collision-free space into various convex sets, which yields tight solutions for convex objects. For non-convex objects, we divide the obstacle space into convex sets and parallelize the collision checking to improve efficiency. Through extensive evaluations on a variety of irregular-shaped objects and comparison with existing methods, we demonstrate that our method can handle convex and non-convex object placements and leads to better performance in terms of packing utility and computational efficiency.
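The idea of expressing contact and collision constraints through Signed Distance Functions can be illustrated with a standard axis-aligned-box SDF: a placement of a spherical object is collision-free when the SDF at its center is at least its radius, and contact corresponds to the SDF equaling the radius. This is a generic sketch (our function names and the sphere simplification), not the paper's formulation:

```python
import numpy as np

def sdf_box(p: np.ndarray, half_extents: np.ndarray) -> float:
    """Signed distance from point p to an axis-aligned box centered at the
    origin (negative inside, positive outside)."""
    q = np.abs(p) - half_extents
    outside = np.linalg.norm(np.maximum(q, 0.0))
    inside = min(q.max(), 0.0)
    return outside + inside

def collision_free(center: np.ndarray, r: float, half_extents: np.ndarray) -> bool:
    """A sphere of radius r does not penetrate the box iff sdf(center) >= r."""
    return sdf_box(center, half_extents) >= r
```

Because the SDF is a smooth(-ish) scalar function of the placement variables, constraints like `sdf(center) >= r` can be handed directly to a continuous optimizer instead of a discretized grid search.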
|
| |