Last updated on March 18, 2026. This conference program is tentative and subject to change.
Technical Program for Friday March 13, 2026
FrZA: Interactive Session & Demos 4 & Coffee (Interactive)

10:45-12:05, Paper FrZA.1
Neuromusculoskeletal Modeling of Human Bipedal Gaits: Exploring the Role of Reflexes in Locomotion
| Bunz, Elsa Katharina | University of Stuttgart |
| Haeufle, Daniel Florian Benedict | University of Tübingen |
| Schmitt, Syn | University of Stuttgart, Germany |
Keywords: Human and Humanoid Motion Analysis and Synthesis, Humanoid and Bipedal Locomotion, Biologically-Inspired Robots
Abstract: Human locomotion is characterized by impressive robustness and versatility that are currently unmatched by humanoid robots. These abilities arise from a complex interplay between the musculoskeletal system, spinal circuits, and input from supraspinal centers. Despite decades of research, the neural mechanisms underlying this intricate human motor control remain controversial. The most basal control components in the central nervous system are reflexes: involuntary responses to sensory stimuli that occur via neural pathways in the spinal cord. While their importance in reacting to perturbations has been widely accepted, the role of reflexes as a stand-alone control component is unclear. In several works, we used neuromusculoskeletal simulations to investigate the role of reflexes in robust and versatile locomotion. The results show that reflexes play an important role in ensuring robust locomotion and can replicate versatile human locomotion involving different gaits and speed adaptations. They support the idea that reflexes are a powerful control primitive and encourage future research on implementing reflexive control on humanoids, prostheses, orthoses, and exoskeletons.

10:45-12:05, Paper FrZA.2
Autonomous Robotics As an Enabler for Sustainable Agroforestry Systems
| Troesken, Lennart | Technical University of Munich (TUM) |
| Duecker, Daniel Andre | Technical University of Munich (TUM) |
Keywords: RIG TC: Agri-Robotics, RIG Cluster: Field Robotics, Agricultural Automation
Abstract: Robotic technologies in agriculture have primarily focused on automating existing machinery in large-scale, homogeneous production systems. As a result, their applicability to structurally heterogeneous and ecologically driven farming paradigms remains limited, particularly with respect to long-term autonomy, learning, and system-level integration. This work argues for a shift in perspective toward autonomous robotics as a system enabler for sustainable agriculture. Agroforestry is considered as an emerging farming paradigm and reference system that exposes fundamental challenges for agricultural robotics. The paper outlines research directions on long-term autonomy, learning, human–robot interaction, and extended planning horizons.

10:45-12:05, Paper FrZA.3
Underwater Manipulation Wrench Estimation with Small-Scale Robots
| Graf, Moritz | Technical University of Munich |
| Duecker, Daniel Andre | Technical University of Munich |
Keywords: RIG TC: AI-driven Marine Robotics, RIG Cluster: Field Robotics
Abstract: Performing dexterous manipulation underwater with small-scale robots remains a challenging problem due to unpredictable current disturbances and the impracticality of integrating force/torque sensors. We propose a model-based wrench estimator, relying solely on onboard IMU and DVL measurements, together with a probabilistic interaction detection method based on a Gaussian mixture model that treats wrenches resulting from steady currents as a quasi-static background and interaction wrenches as a dynamic foreground. The proposed method is evaluated in practical experiments in an indoor basin.
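The background/foreground idea can be sketched numerically. The paper fits a Gaussian mixture; this hypothetical NumPy sketch uses a single-Gaussian background model and flags interactions by Mahalanobis distance (the threshold, dimensionality, and data are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def fit_background(wrenches):
    """Fit a Gaussian to quasi-static background wrenches.
    (The paper uses a Gaussian mixture; one component keeps the sketch short.)"""
    mu = wrenches.mean(axis=0)
    cov = np.cov(wrenches, rowvar=False) + 1e-9 * np.eye(wrenches.shape[1])
    return mu, np.linalg.inv(cov)

def is_interaction(wrench, mu, cov_inv, threshold=16.0):
    """Flag a wrench as dynamic foreground when its squared Mahalanobis
    distance from the background model exceeds a chi-square-style threshold."""
    d = wrench - mu
    return float(d @ cov_inv @ d) > threshold

rng = np.random.default_rng(0)
background = rng.normal(0.0, 0.1, size=(500, 6))   # 6-D wrenches under steady current
mu, cov_inv = fit_background(background)

contact = np.full(6, 2.0)                          # strong interaction wrench
print(is_interaction(contact, mu, cov_inv))        # True: far outside background
```

In a real system the background model would be refit online so that slowly varying currents stay in the quasi-static foreground/background split.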

10:45-12:05, Paper FrZA.4
Vision-Based Screw Detection and Robotic Unscrewing
| Menetrey-Meinhold, Sara | Chemnitz University of Technology |
| Schlegel, Holger | Technische Universität Chemnitz |
| Rehm, Matthias | Chemnitz University of Technology |
| Dix, Martin | Chemnitz University of Technology |
Keywords: Disassembly, AI-Enabled Robotics, RGB-D Perception
Abstract: Automated disassembly is an important topic given the limited resources available on Earth. The most established nondestructive process remains unscrewing. However, there is still no system on the market that allows automated unscrewing in a robust and highly flexible manner. Most research focuses on the removal of very specific screws in predefined environments. This document presents a solution tested on screws from M5 to M8, with hexagonal, Torx, and Allen drives and different head types, placed in orientations of up to 51° from the vertical. The system includes a robotic arm, an AI-assisted 3D vision module, and an industrial screwdriver adapted for unscrewing. Experiments showed a detection rate of 84% to 100% and an unscrewing rate between 88% and 100%, both depending on the screw type. Overall, including the detection, the unscrewing, and the removal of the screws from the working table, the system is successful at least 80% of the time and shows potential for improvement.

10:45-12:05, Paper FrZA.5
Learned Incremental Nonlinear Dynamic Inversion for Quadrotors with and without Slung Payloads
| Cobo-Briesewitz, Eckart | TU Berlin |
| Wahba, Khaled | TU Berlin |
| Hoenig, Wolfgang | TU Berlin |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, Machine Learning for Robot Control, Model Learning for Control
Abstract: The increasing complexity of multirotor applications demands flight controllers that can accurately account for all forces acting on the vehicle. Conventional controllers model most aerodynamic and dynamic effects but often neglect higher-order forces, as their accurate estimation is computationally expensive. Incremental Nonlinear Dynamic Inversion (INDI) offers an alternative by estimating residual forces from differences in sensor measurements; however, its reliance on specialized and often noisy sensors limits its applicability. Recent work has demonstrated that residual forces can be predicted using learning-based methods. In this paper, we show that a neural network can generate smooth approximations of INDI outputs without requiring additional sensor inputs. We further propose a hybrid approach that integrates learning-based predictions with INDI and demonstrate both methods for multirotors and multirotors carrying slung payloads. Experimental results on trajectory tracking errors demonstrate that the specialized sensor measurements required by INDI can be eliminated by replacing the residual computation with a neural network.
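The residual quantity such a network learns to predict can be illustrated with a toy translational point-mass model. All numbers, the nominal-model form, and the function name below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])   # world-frame gravity [m/s^2]

def indi_residual(mass, accel_meas, thrust_cmd):
    """Residual force unexplained by the nominal model, recovered INDI-style
    from measured acceleration; a network can be trained to predict this
    label from state alone, removing the need for the noisy measurement."""
    f_model = thrust_cmd + mass * GRAVITY   # nominal: commanded thrust + gravity
    f_total = mass * accel_meas             # Newton's law with measured accel
    return f_total - f_model

# hypothetical hover with a small lateral drag/wind residual
mass = 0.03                                    # [kg]
accel = np.array([0.1, 0.0, 0.0])              # measured acceleration [m/s^2]
thrust = np.array([0.0, 0.0, mass * 9.81])     # hover thrust command [N]
print(indi_residual(mass, accel, thrust))      # ~[0.003, 0, 0] N unmodeled force
```

The hybrid approach in the paper would blend such a learned prediction with the sensor-based residual rather than replace it outright.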

10:45-12:05, Paper FrZA.6
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
| Zhou, Hongyi | Karlsruhe Institute of Technology |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: Imitation Learning, Deep Learning in Grasping and Manipulation
Abstract: We present the B-spline Encoded Action Sequence Tokenizer (BEAST), a novel action tokenizer that encodes action sequences into compact discrete or continuous tokens using B-splines. In contrast to existing action tokenizers based on vector quantization or byte pair encoding, BEAST requires no separate tokenizer training and consistently produces tokens of uniform length, enabling fast action sequence generation via parallel decoding. Leveraging our B-spline formulation, BEAST inherently generates smooth trajectories without discontinuities between adjacent segments.
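As a rough illustration of the encoding idea (not the authors' code; the token count, spline degree, and SciPy-based least-squares fit are assumptions), a 1-D action sequence can be compressed into a fixed-length coefficient vector and decoded back into a smooth trajectory:

```python
import numpy as np
from scipy.interpolate import BSpline, make_lsq_spline

def encode(actions, t_grid, n_tokens=8, k=3):
    """Least-squares fit of a clamped B-spline; the control-point coefficients
    serve as a fixed-length, continuous token vector for the whole sequence."""
    interior = np.linspace(t_grid[0], t_grid[-1], n_tokens - k + 1)[1:-1]
    knots = np.concatenate([[t_grid[0]] * (k + 1), interior,
                            [t_grid[-1]] * (k + 1)])
    return make_lsq_spline(t_grid, actions, knots, k).c, knots

def decode(tokens, knots, t_grid, k=3):
    """Evaluate the spline: the reconstruction is smooth by construction."""
    return BSpline(knots, tokens, k)(t_grid)

t = np.linspace(0.0, 1.0, 50)
actions = np.sin(2.0 * np.pi * t)             # toy 1-D action sequence
tokens, knots = encode(actions, t)
recon = decode(tokens, knots, t)
print(tokens.shape)                           # (8,): uniform token length
print(round(float(np.abs(recon - actions).max()), 4))
```

Because every sequence maps to the same number of coefficients, all tokens can be emitted in one parallel decoding step, which is the efficiency argument in the abstract.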

10:45-12:05, Paper FrZA.7
Plan2Pose: Bridging Symbolic Planning and Robot Control Via Context-Aware Goal Generation on Point Clouds
| Swoboda, Daniel Maximilian | RWTH Aachen University |
| Rakhman, Ulzhalgas | RWTH Aachen University |
| Hofmann, Till | RWTH Aachen University |
Keywords: AI-Enabled Robotics, Deep Learning in Grasping and Manipulation, Deep Learning for Visual Perception
Abstract: Goal-conditioned policies excel at manipulation tasks but lack the context to respect long-term constraints. Conversely, symbolic planners efficiently generate long-horizon action skeletons, but often abstract away the object geometries necessary for execution. However, combining the two approaches is challenging: translating task plans into geometric sub-goals requires translating an abstract symbolic state into a complete geometric goal representation. In this work, we present preliminary results on Plan2Pose, a neuro-symbolic transformer based on point clouds that produces pose transformations for each object and every state transition in the high-level plan. Each step in the plan is represented by a combination of symbolic predicate embeddings and geometric point cloud embeddings. The resulting sequence of grounded state representations is processed using a transformer. Plan2Pose attends over the full plan to predict target poses that satisfy both immediate and future geometric constraints. The proposed approach can be used as a sub-goal generator and combined with a goal-conditioned policy to realize each step, thereby executing long-term plans while taking all geometric constraints into account.

10:45-12:05, Paper FrZA.8
Bridging the Loco-Manipulation Disconnect: A Framework for Dynamic Whole-Body Impulse Control on Floating-Base Robots
| Brusnicki, Roberto | Technical University of Munich |
| Betz, Johannes | Technical University of Munich |
| Piccinini, Mattia | Technical University of Munich |
Keywords: Whole-Body Motion Planning and Control, Integrated Planning and Learning, Reinforcement Learning
Abstract: Mobile manipulation in the mid-2020s is defined by a paradox: locomotion has achieved remarkable agility across legged and wheeled-legged platforms, while Vision-Language-Action (VLA) models enable semantic manipulation with increasing generalization. Yet, a fundamental "Loco-Manipulation Disconnect" remains a critical barrier yet to be overcome for deployment in high-utility environments. This disconnect – the architectural segregation of the floating base from the manipulation chain – results in robots that must stop to act, failing in forceful scenarios where body momentum could be exploited. We propose Impulse-Aware Whole-Body Control (IA-WBC), a framework that transitions from quasi-static interaction to Dynamic Whole-Body Impulse Control. Our approach unifies a monolithic RL policy with certified stability guarantees, enabling floating-base robots to reason about impulse (J = ∫F dt) as a resource rather than force as a disturbance. The framework targets deployment on humanoid and quadruped platforms with manipulation capabilities, with experimental validation planned on dynamic tasks such as door breaching, cart pushing, and dynamic lifting – scenarios requiring peak forces beyond the robot's static actuation limits.
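The impulse integral the abstract reasons about is straightforward to evaluate numerically; this sketch uses a hypothetical half-sine contact-force profile (all numbers are illustrative):

```python
import numpy as np

# Impulse as a resource: integrate a contact-force profile over the contact
# window to get the momentum change J = integral of F dt that it delivers.
t = np.linspace(0.0, 0.2, 201)                 # 200 ms contact window [s]
force = 300.0 * np.sin(np.pi * t / 0.2)        # half-sine contact force [N]
dt = np.diff(t)
impulse = float(np.sum(0.5 * (force[1:] + force[:-1]) * dt))  # trapezoid rule
print(round(impulse, 1))                       # analytic value: 2*300*0.2/pi ~ 38.2
```

A modest 300 N peak sustained for 200 ms thus delivers roughly 38 N·s of momentum, which is the kind of quantity an impulse-aware controller can budget for tasks like door breaching.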

10:45-12:05, Paper FrZA.9
Development of a Differential Hip for Dynamic Bipedal Locomotion
| Schroeders, Florian | Technical University of Munich |
| Radecker, Philipp | Technical University of Munich |
| Rixen, Daniel | Technische Universität München |
Keywords: Actuation and Joint Mechanisms, Tendon/Wire Mechanism, Legged Robots
Abstract: High leg inertia limits dynamic agility and energy efficiency in humanoid robotics. This work develops a differential hip architecture that reduces distal mass by relocating actuation to the torso via a tendon-driven differential. A prototype was designed and modeled in the flexible multibody dynamics framework Exudyn to validate kinematic coupling and structural loads. Comparative simulations against a standard serial hip show substantially reduced leg inertia and peak power demand during dynamic gait cycles. Physical experiments corroborate the simulation trends, supporting proximal actuation as a promising strategy for improving bipedal performance.

10:45-12:05, Paper FrZA.10
An Effective and Robust Loop Closure Detection Pipeline for 3D LiDARs in Urban Environments
| Gupta, Saurabh | University of Bonn |
| Guadagnino, Tiziano | University of Bonn |
| Mersch, Benedikt | Filics GmbH |
| Trekel, Niklas | University of Bonn |
| Malladi, Meher Venkata Ramakrishna | University of Bonn |
| Stachniss, Cyrill | University of Bonn |
Keywords: SLAM, Mapping, Localization
Abstract: Globally consistent mapping for autonomous robots relies on accurate pose estimation, where loop closures are essential for correcting accumulated drift. This extended abstract presents a robust loop closure detection pipeline for outdoor LiDAR SLAM. Our method constructs local maps from LiDAR scans, performs ground-based alignment, and generates a density-preserving bird's-eye-view representation. We then extract ORB feature descriptors for place recognition, followed by self-similarity pruning to mitigate perceptual aliasing. Experimental results demonstrate high-precision loop closure detection across varying LiDAR scanning patterns, fields of view, and motion profiles.

10:45-12:05, Paper FrZA.11
ROS2 Interface for Aggregating Robot Manipulation Performances with Electronic Task Boards
| So, Peter | Technical University of Munich |
| Abdelrahman, Ahmed | Technical University of Munich |
| Le, Hoan Quang | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Performance Evaluation and Benchmarking, Dexterous Manipulation, Disassembly
Abstract: Conducting cross-comparable robotic object manipulation experiments remains a goal for robotics research, yet progress is still reported manually and with non-harmonized use cases that are difficult to compare. We present a software architecture for benchmarking robot performance capable of supporting multiple versions of physical electronic task boards and contribute a streamlined process for conducting and reporting experiment performance with test condition reproducibility guarantees. We extend the software interface of the existing electronic task board using a ROS2 Action Server and a local web server to perform standard and custom tasks. The interface allows users to programmatically start experiments with the task board and seamlessly publish their experimental data to a public web dashboard for the benchmarking community.

10:45-12:05, Paper FrZA.12
DexTeRo: Dexterous Telemanipulation System for Upper Body Humanoid Robots
| Schwarz, Stephan Andreas | Chemnitz University of Technology |
| Nieberle, Benedikt | Chemnitz University of Technology |
| Thomas, Ulrike | Chemnitz University of Technology |
Keywords: Telerobotics and Teleoperation, Human-Centered Robotics, Multi-Robot Systems
Abstract: The rapid development of complex robots such as humanoids increases the interest in telemanipulation systems. Besides the capability to control robots remotely, dual-arm telemanipulation setups are suitable tools for many research areas such as imitation learning or complex manipulation tasks. In this work, we introduce our new telemanipulation system DexTeRo. It enables an operator to control the upper-body of a humanoid robot and provides new features to overcome previous restrictions, such as limited payloads and small workspaces. The components of the follower and leader side together with the applied control architectures are presented. Finally, an outlook on our planned improvements and current research is given.

10:45-12:05, Paper FrZA.13
Implementation of the Patellar Tendon Reflex in a Muscle-Driven Robotic Leg Based on Bioinspired Motor Control
| Nadler, Tobias | University of Stuttgart |
| Schmitt, Syn | University of Stuttgart, Germany |
Keywords: Biologically-Inspired Robots, Robust/Adaptive Control, Humanoid and Bipedal Locomotion
Abstract: We developed an anthropomorphic, muscle-driven biorobotic leg that replicates the human monosynaptic reflex loop. By explicitly wiring the artificial muscle spindle signals to mimic the human afferent pathway, defined impacts on the patellar tendon elicit feedback-modulated stimulation mirroring human physiology. By calibrating the controller against dynamics from 14 healthy subjects, we achieved reflex behaviors indistinguishable from humans. This demonstrates the successful implementation of low-level sensorimotor control in muscle-like, soft actuation devices, enabling engineered systems to exploit the robustness and adaptability of biological locomotion.

10:45-12:05, Paper FrZA.14
VLAgents: A Policy Server for Efficient VLA Inference
| Jülg, Tobias Thomas | University of Technology Nuremberg |
| Gamal, Khaled | University of Technology Nuremberg |
| Nilavadi, Nisarga | University of Technology Nuremberg |
| Krack, Pierre | University of Technology Nuremberg |
| Bien, Seongjin | University of Technology Nuremberg |
| Krawez, Michael | University of Technology Nuremberg |
| Walter, Florian | Technical University Munich |
| Burgard, Wolfram | University of Technology Nuremberg |
Keywords: RIG TC: Robotics Foundation Models, Software Architecture for Robotic and Automation, Software, Middleware and Programming Environments
Abstract: The rapid emergence of Vision-Language-Action models (VLAs) has a significant impact on robotics. However, their deployment remains complex due to fragmented interfaces and the inherent communication latency in distributed setups. To address this, we introduce VLAgents, a modular policy server that abstracts VLA inference behind a unified Gymnasium-style protocol. Crucially, its communication layer transparently adapts to the context by supporting both zero-copy shared memory for high-speed simulation and compressed streaming for remote hardware. In this work, we present the architecture of VLAgents and validate it by integrating seven policies, including OpenVLA and Pi0. In a benchmark with both local and remote communication, we further demonstrate how it outperforms the default policy servers provided by OpenVLA, OpenPi, and LeRobot. VLAgents is available at github.com/RobotControlStack/vlagents.

10:45-12:05, Paper FrZA.15
Collaborative Multi-Agent Architectures for Autonomous Robot Self-Optimization Via Shared Deliberation
| Enose Kamalabai, Nampuraja | Infosys |
Keywords: AI-Enabled Robotics
Abstract: Autonomous mobile robots require continuous adaptation of operational and control parameters across tightly coupled subsystems to maintain stable performance. We propose a multi-agent architecture where specialized Large Language Model (LLM) based agents collaborate through a shared conversation network and leverage interaction history to achieve runtime self-optimization. Validated on the Rover Robotics 2WD platform running ROS2 Humble, the system orchestrates multiparameter tuning across 396+ parameters, spanning 15 ROS2 nodes including the Nav2 navigation stack, AMCL localization, and velocity control. This is defined within a transformation tree that encompasses 18 coordinate frames, extending from the map down to individual sensor frames. All agents participate in shared deliberation, enabling emergent cross-subsystem optimization and inherent explainability through natural language reasoning.

10:45-12:05, Paper FrZA.16
What Over How: Sparse Graphical Task Models from Minimal Demonstrations
| Röfer, Adrian | University of Freiburg |
| Heppert, Nick | University of Freiburg |
| Valada, Abhinav | University of Freiburg |
Keywords: Learning from Demonstration, Task and Motion Planning, Probability and Statistical Methods
Abstract: Learning robotic manipulation from demonstration has traditionally emphasized behavior‑cloning approaches that map raw state observations to actions, thereby focusing on how a task is performed. Such methods are fragile to substantial variations in task‑space scale, layout, or embodiment, even after extensive training. To improve robustness, recent work has introduced object‑centric representations, yet these still struggle under large environmental changes. An answer to this challenge can lie in understanding the high-level goals of a task first, by modeling manipulation tasks as evolving object graphs which capture the semantic intent (e.g., toast to toaster, to plate, to tray). Our method constructs probabilistic kinematic graphs, which connect objects throughout an entire manipulation, including idle phases. Unlike earlier approaches that require known object correspondences, we match objects across demonstrations using similarity of pretrained visual feature vectors, and we further simplify matching by focusing on transitions between independent subgraphs. This yields compact activation/deactivation sequences and pose distributions for objects at key moments. We study the quality of segmentations on two datasets and a robotic benchmark and qualitatively deploy our approach on a real robotic system.
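The cross-demonstration matching step can be sketched as a one-to-one assignment over cosine similarities of feature vectors (the feature dimensionality, the solver choice, and the toy data below are illustrative assumptions, not the authors' pipeline):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_objects(feats_a, feats_b):
    """One-to-one object matching across two demonstrations by maximizing
    the total cosine similarity of (pretrained) visual feature vectors."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    rows, cols = linear_sum_assignment(-(a @ b.T))   # minimize negative similarity
    return dict(zip(rows.tolist(), cols.tolist()))

# hypothetical features: demo B observes the same 3 objects, permuted and noisy
rng = np.random.default_rng(1)
feats_a = rng.normal(size=(3, 16))
feats_b = feats_a[[2, 0, 1]] + 0.01 * rng.normal(size=(3, 16))
print(match_objects(feats_a, feats_b))           # {0: 1, 1: 2, 2: 0}
```

The Hungarian assignment guarantees each object is matched exactly once, which is what makes the recovered correspondences usable for building a consistent object graph across demonstrations.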

10:45-12:05, Paper FrZA.17
Towards Analyzing the Characteristics of Model-Based and Model-Free Decision Making Algorithms
| Hall, Adam W. | University of Toronto |
| Che, Mingxuan | TU Munich |
| Sawant, Shambhuraj | Max Planck Institute for Intelligent Systems |
| Zou, Joey | University of Toronto |
| Pizarro Bejarano, Federico | University of Toronto |
| Brunke, Lukas | Technical University of Munich |
| Zhou, Siqi | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
Keywords: RIG TC: Foundations of Optimization and Learning for Robotics, Machine Learning for Robot Control, RIG TC: Principles and Methods for Building AI-powered Robust and Resilient Robots
Abstract: Continuous advancements in robot sensors, actuators, and computing hardware have made robot platforms more compact, capable, and accessible. With improvements in hardware, as well as supporting tools such as large-scale simulation, we have seen decision-making algorithms push the limits of what is possible, achieving previously unseen performance. The rapid progress is also creating a disparity between modular, model-based control methods and data-driven learning approaches, with a limited understanding of the fundamental trade-offs that differentiate them in real-world deployments. These algorithms vary primarily in how they internalize a model of the world, from analytical representations in optimal and learning-based control to experiential data representations in reinforcement learning; these choices lead to distinct processes during policy optimization and, consequently, distinct behaviors during deployment. In this work, we present a systematic analysis of representative methods across this spectrum and provide a holistic view of their characteristics using six metrics that capture core trade-offs in robot decision-making: model complexity, learning complexity, runtime efficiency, performance, robustness, and task generalization. Through extensive experiments on a practical robotic platform, supported by open-source implementations, we demonstrate how these trade-offs manifest in realistic settings and highlight key considerations for selecting appropriate decision-making frameworks. As robot systems begin to incorporate larger amounts of data, these distinctions will provide a crucial foundation for developing decision-making algorithms that scale safely and effectively for future applications.

10:45-12:05, Paper FrZA.18
KI.Fabrik: Shaping the AI-Driven Factory of the Future by Turning Embodied AI into Industrial Reality
| Rajaei, Nader | Technical University of Munich |
| Rudenko, Andrey | Robert Bosch GmbH |
| Lehsing, Christian | Technical University of Munich |
| Nagrath, Vineet | Technical University of Munich |
| Koenig, Alexander | Technical University of Munich |
| Diepold, Klaus | Technical University of Munich |
| Knoll, Alois | Technical University of Munich |
| Lilienthal, Achim J. | TU Munich |
Keywords: Intelligent and Flexible Manufacturing, Factory Automation, Embodied Cognitive Science
Abstract: While AI and automation are successfully prototyped in research labs, they often struggle with complex real production environments. This is particularly true for "Embodied AI" systems. KI.Fabrik was established as a long-term program to bridge this gap, moving beyond isolated demonstrations toward reliable industrial use. By utilizing a networked ecosystem of modular components, ranging from robot learning hubs to remote teleoperation portals, KI.Fabrik enables the system-level development necessary to turn flexible, AI-driven manufacturing and "Production-as-a-Service" into a tangible reality.

10:45-12:05, Paper FrZA.19
SURE: Safe Uncertainty-Aware Robot-Environment Interaction Using Trajectory Optimization
| Zhang, Zhuocheng | Technical University of Munich |
| Zhao, Haizhou | New York University |
| Sun, Xudong | Technical University of Munich |
| Johnson, Aaron M. | Carnegie Mellon University |
| Khadiv, Majid | Technical University of Munich |
Keywords: Optimization and Optimal Control, Planning under Uncertainty, RIG TC: Foundations of Optimization and Learning for Robotics
Abstract: Robotic tasks involving contact interactions pose significant challenges for trajectory optimization due to discontinuous dynamics. Conventional formulations typically assume deterministic contact events, which limit robustness and adaptability in real-world settings. In this work, we propose SURE, a robust trajectory optimization framework that explicitly accounts for contact timing uncertainty. By allowing multiple trajectories to branch from possible pre-impact states and later rejoin a shared trajectory, SURE achieves both robustness and computational efficiency within a unified optimization framework. We evaluate SURE on two representative tasks with unknown impact times. In a cart–pole balancing task involving an uncertain wall location, SURE achieves an average improvement of 21.6% in success rate when branch switching is enabled during control. In an egg-catching experiment using a robotic manipulator, SURE improves the success rate by 40%. These results demonstrate that SURE substantially enhances robustness compared to conventional nominal formulations.

10:45-12:05, Paper FrZA.20
Multimodal Human-Cobot-Interaction for Collaborative Tasks
| Milde, Sven | Fulda University of Applied Sciences |
| Milde, Jan-Torsten | Fulda University of Applied Sciences |
| Blum, Rainer | Fulda University of Applied Sciences |
| Mueller, Tobias | Fulda University of Applied Sciences |
| Schultheis, Marius | Fulda University of Applied Sciences |
| Schreiner, Niklas | Alpaka Innovation |
Keywords: Human-Robot Collaboration, Behavior-Based Systems, Virtual Reality and Interfaces
Abstract: We present CoMeSy, a specialized system for multimodal human-cobot interaction designed to make collaboration more intuitive and efficient through the use of natural language and physical gestures. A primary finding of our research is the system's ability to interpret linguistic instructions situationally, which enables the robot to resolve ambiguous commands and deictic expressions—such as pointing—within a shared workspace. To achieve this, we implemented Dynamic Action Planning using Behavior Trees, which allow the system to decompose complex tasks into manageable subtasks while adapting to evolving environmental states. This is paired with a reactive and robust control architecture that ensures the cobot can adjust to unexpected changes or interruptions without sacrificing performance. Furthermore, the system utilizes a Unified Control Architecture, allowing it to operate interchangeably between a physical work cell and a virtual Augmented Reality (AR) application via a Meta Quest 3, all powered by the same ROS2 computer. Finally, the framework employs Hierarchical Language Processing, a two-level conceptual approach that processes both immediate situational modifiers and more complex multi-stage action sequences.
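A minimal behavior-tree skeleton illustrates the task-decomposition mechanism described above. This is a generic sketch under common behavior-tree semantics, not the CoMeSy implementation; node names and the example tree are invented for illustration:

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Action:
    """Leaf node wrapping a callable that returns a status string."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self):
        return self.fn()

class Sequence:
    """Ticks children in order; stops at the first child that is not SUCCESS.
    This is how a complex task decomposes into ordered subtasks."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != SUCCESS:
                return status
        return SUCCESS

class Fallback:
    """Ticks children until one does not fail; encodes reactive alternatives,
    which is what lets the robot recover from interruptions mid-task."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != FAILURE:
                return status
        return FAILURE

# "pick the pointed-at object": try a grasp first, otherwise run a recovery
tree = Sequence(
    Action(lambda: SUCCESS),                 # locate the referenced object (stub)
    Fallback(Action(lambda: FAILURE),        # first grasp attempt fails...
             Action(lambda: SUCCESS)),       # ...recovery behavior succeeds
)
print(tree.tick())                           # SUCCESS
```

Re-ticking the tree every control cycle is what makes the decomposition reactive: a changed environment simply alters which branch succeeds on the next tick.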

10:45-12:05, Paper FrZA.21
Real-Time Ground Reaction Force Estimation for Wearable Robotics Using Temporal Convolutional Networks
| Jazini, Mohammadjavad | Technical University of Darmstadt |
| Firouzi, Vahid | Technical University of Darmstadt |
| Sharbafi, Maziar | Technical University of Darmstadt |
| Findeisen, Rolf | Technical University of Darmstadt |
Keywords: AI-Based Methods, Rehabilitation Robotics
Abstract: Estimating ground reaction forces outside laboratory environments is a key requirement for wearable robotics and mobile gait analysis. While force plates and instrumented treadmills provide accurate measurements, their stationary nature limits applicability in real-world scenarios. This paper presents a learning-based framework for estimating vertical ground reaction forces from lower-body inertial measurement unit data using temporal convolutional networks. The approach exploits the causal and computationally efficient structure of temporal convolutional networks to enable continuous, low-latency estimation suitable for real-time deployment. Two processing paradigms are investigated: a gait-cycle–segmented formulation and a fully continuous formulation that avoids explicit gait phase detection. Their behavior with respect to accuracy, robustness, and generalization across subjects and walking speeds is analyzed. The results highlight the trade-off between prediction accuracy and real-time feasibility, demonstrating the suitability of the proposed approach for wearable and assistive robotic systems.
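The causal structure that makes TCNs suitable for streaming, low-latency estimation can be shown with a dilated causal convolution in plain NumPy (a generic sketch, not the authors' network; weights and dilation are illustrative):

```python
import numpy as np

def causal_conv1d(x, weights, dilation=1):
    """Dilated causal convolution: y[t] = sum_i weights[i] * x[t - i*dilation].
    Left zero-padding preserves length and guarantees no future samples leak
    into the estimate, which is what permits real-time deployment."""
    k = len(weights)
    pad = (k - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])
    taps = np.stack([padded[i * dilation : i * dilation + len(x)]
                     for i in range(k)])
    return weights[::-1] @ taps

x = np.zeros(10)
x[5] = 1.0                                     # unit impulse at t = 5
y = causal_conv1d(x, np.array([1.0, 0.5, 0.25]))
print(y[:5].any(), y[5], y[6], y[7])           # False 1.0 0.5 0.25
```

Stacking such layers with growing dilations gives the long receptive field needed to cover a gait cycle while each output still depends only on past IMU samples.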

10:45-12:05, Paper FrZA.22
Identifying Inductive Biases for Efficient Robot Co-Design
| Vaish, Apoorv | Technische Universität Berlin |
| Brock, Oliver | Technische Universität Berlin |
Keywords: Soft Robot Materials and Design, Soft Sensors and Actuators, RIG TC: AI-Driven Co-Design for Task-Oriented Humanoids
Abstract: The co-design of robot morphology and control is computationally intensive, as we lack inductive biases tailored to it. We analyze co-design landscapes to systematically identify inductive biases tailored to the structure of co-design problems. Our method reveals that within regions of the co-design space, a low-dimensional manifold governs the quality of co-designs. Higher-quality regions exhibit variation across more dimensions, with a tighter coupling between morphology and control. To leverage these inductive biases, we propose an adaptive co-design algorithm that extracts this low-dimensional structure within a region and adjusts its exploration bias accordingly.

10:45-12:05, Paper FrZA.23
SHaRe-RL: Towards Co-Constructing Industrial Manipulation with Human-In-The-Loop Reinforcement Learning
| Stranghöner, Jannick | Universität Bielefeld |
| Hartmann, Philipp | Bielefeld University |
| Braun, Marco | Bielefeld University |
| Wrede, Sebastian | Bielefeld University / Fraunhofer IOSB-INA |
| Neumann, Klaus | Bielefeld University / Fraunhofer IOSB-INA |
Keywords: RIG Cluster: Human-Robot Interaction, RIG Cluster AI-Powered Industrial Robotics, RIG Cluster: Learning and Multimodal AI for Robotics
Abstract: Reinforcement learning (RL) is a promising route to adaptive robot assembly, yet real-world training is often sample-inefficient and unsafe in contact-rich tasks. A recurring lesson from practical systems is to leverage domain expertise as priors that restrict learning to plausible behaviors. Motivated by this principle, we present SHaRe-RL, a real-world RL framework that takes a first step toward co-constructing contact-rich assembly skills with an operator by combining multiple forms of prior knowledge. SHaRe-RL integrates (i) a partial task specification via a manipulation primitive net, (ii) operator demonstrations and online interventions, and (iii) a deterministic per-axis compliance layer that bounds interaction forces during exploration. This design yields sample-efficient learning and scales to complex, vision-based assembly. On insertion of Harting Han-Modular connectors with 0.2 mm–0.4 mm clearance, SHaRe-RL reaches reliable performance within a three-hour wall-clock budget.
|
| |
| 10:45-12:05, Paper FrZA.24 | |
| Automated Acceptance Testing of Robotic Systems Using Behavior-Driven Models |
|
| Nguyen, Minh | University of Bremen |
| Wrede, Sebastian | Bielefeld University |
| Hochgeschwender, Nico | University of Bremen |
Keywords: Software Tools for Robot Programming, RIG TC: Principles and Methods for Building AI-powered Robust and Resilient Robots, RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics
Abstract: We present an approach extending behavior-driven development (BDD) for robotic systems by introducing explicit, composable behavior-driven models for automated acceptance testing. These models formalize acceptance criteria by combining temporal semantics, domain knowledge about robots, objects, and environments, and inter-scenario relations, specifying what makes robotic behavior acceptable. We represent these models as knowledge graphs, which enable querying, manipulation, and transformation into executable test artifacts. To improve developer usability, a domain-specific language is provided and can be transformed into the underlying graph representation. These models enable systematic test case generation, automated execution in simulation (e.g., Isaac Sim), and evaluation of robotic behavior under diverse variations, providing evidence of behavior conformance and failure modes.
|
| |
| 10:45-12:05, Paper FrZA.25 | |
| Robot Path Planning Via Flow Matching with Safety and Adaptivity through Predictive Control |
|
| Holzmann, Philipp | Technical University of Darmstadt |
| Pfefferkorn, Maik | Technical University of Darmstadt |
| Carvalho, João | Albert-Ludwigs-Universitaet Freiburg |
| Younes, Ali | TU Darmstadt |
| Le, An Thai | Vin University |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
| Findeisen, Rolf | Control and Cyber-Physical Systems Laboratory |
Keywords: Constrained Motion Planning, Robot Safety, Machine Learning for Robot Control
Abstract: Learning-based path planners based on diffusion and flow matching can generate diverse trajectories from demonstrations but classically lack guarantees on safety and constraint satisfaction during deployment. We propose a framework that integrates flow-matching-based path planning, trained on demonstrations, with model predictive path-following control to combine data-driven path diversity with real-time safe execution. The flow-matching model efficiently captures multimodal path distributions, while the predictive controller adapts the motion online, ensuring satisfaction of state/input constraints and obstacle avoidance. We introduce an event-triggered re-planning scheme that biases new path generation using solutions from the predictive controller, enabling safety even in environments with previously unseen obstacles.
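The event-triggered re-planning scheme can be sketched as a control loop (an illustrative sketch only; `sample_path` and `mpc_step` are hypothetical stand-ins for the flow-matching sampler and the path-following MPC, and the error-based trigger is an assumption):

```python
import numpy as np

def tracking_error(path, state):
    """Distance from the current state to the closest point on the reference path."""
    return np.min(np.linalg.norm(path - state, axis=1))

def control_loop(sample_path, mpc_step, state, steps=200, err_thresh=0.1):
    """Follow a sampled path with a predictive controller. A new path is
    sampled, biased by the controller's last solution, whenever tracking
    degrades, e.g. because an unseen obstacle forced the MPC off the
    reference. Both callables are stand-ins for components the abstract
    does not fully specify."""
    path = sample_path(state, bias=None)
    for _ in range(steps):
        state, mpc_solution = mpc_step(path, state)   # safe, constraint-satisfying step
        if tracking_error(path, state) > err_thresh:  # event trigger
            path = sample_path(state, bias=mpc_solution)
    return state
```

The key design point is that the sampler is only invoked on the trigger, so the expensive generative model runs far less often than the controller.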
|
| |
| 10:45-12:05, Paper FrZA.27 | |
| CRISP - Compliant ROS2 Controllers for Learning-Based Manipulation Policies and Teleoperation |
|
| San José Pro, Daniel | Technical University Munich |
| Hausdörfer, Oliver | Technical University of Munich |
| Römer, Ralf | Technical University of Munich |
| Dösch, Maximilian | Technical University of Munich |
| Schuck, Martin | Technical University of Munich |
| Schoellig, Angela P. | Technical University of Munich |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, Imitation Learning, Compliance and Impedance Control
Abstract: Learning-based controllers, such as diffusion policies and vision-language action models, often generate low-frequency or discontinuous robot state changes. Achieving smooth reference tracking requires a low-level controller that converts high-level target commands into joint torques, enabling compliant behavior during contact interactions. We present CRISP, a lightweight C++ implementation of compliant Cartesian and joint-space controllers for the ROS2 control standard, designed for seamless integration with high-level learning-based policies as well as teleoperation. The controllers are compatible with any manipulator that exposes a joint-torque interface. Through our Python and Gymnasium interfaces, CRISP provides a unified pipeline for recording data from hardware and simulation and for deploying high-level learning-based policies, facilitating rapid experimentation. The system has been validated on hardware with the Franka Robotics FR3 and in simulation with the Kuka IIWA14 and Kinova Gen3. Designed for rapid integration, flexible deployment, and real-time performance, our implementation lowers the barrier to applying learning-based methods on ROS2-compatible manipulators. Detailed documentation is available at the project website.
|
| |
| 10:45-12:05, Paper FrZA.28 | |
| Cable Combat: The Manipulandum Begins – a Force-Exerting 3D Interface for Biomechanical Assessment and Assistive Robotics |
|
| Behrendt, Jacob | Friedrich-Alexander Universität Erlangen-Nürnberg |
| Demir, Ayşe Betül | FAU Erlangen-Nuernberg |
| Scheidl, Marc-Anton | Friedrich-Alexander Universität Erlangen-Nürnberg |
| Castellini, Claudio | Friedrich-Alexander-Universität Erlangen-Nürnberg |
| Thuerauf, Sabine | Friedrich-Alexander-University Erlangen-Nuremberg |
Keywords: Calibration and Identification, Telerobotics and Teleoperation, RIG Cluster: Healthcare Robotics and Human Augmentation
Abstract: In this work, we describe our concept for a 3D manipulandum (a device to concurrently measure position, orientation, and interaction forces) and its necessity. The designed manipulandum is cable-driven and an accurate measurement system that can also exert forces on the user. It can be used for multiple purposes, like the evaluation of assistive devices, the evaluation and calibration of biomechanical models, the tracking of rehabilitation or neurodegenerative diseases, and for impedance parameter measurements of the human body, like stiffness.
|
| |
| 10:45-12:05, Paper FrZA.29 | |
| Whole-Body Diffusion Trajectory Generation and Reinforcement Learning Control for Humanoid Loco-Manipulation |
|
| Omar, Shafeef | Technical University of Munich |
| Yu, Dian | Technical University of Munich |
| Khadiv, Majid | Technical University of Munich |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, RIG Cluster: Legged Locomotion, Whole-Body Motion Planning and Control
Abstract: Loco-manipulation for humanoid robots remains a long-standing challenge due to the difficulty of learning complex whole-body behaviours from scratch. While humans can easily perform many such tasks, collecting teleoperation data at scale is often infeasible. In this work, we propose a framework that enables large-scale whole-body trajectory generation and control for humanoid robots from limited demonstrations. Specifically, we leverage Sampling-Based Trajectory Optimisation (SBTO) to generate a rich and physically consistent motion dataset after kinematic retargeting, which is then used by a diffusion-based whole-body motion planner to generate motions that are tracked using a general low-level tracking RL policy. By decoupling motion generation from control, the diffusion model produces diverse, smooth, and feasible trajectories, while the lightweight RL policy ensures robust high-frequency tracking. Experiments demonstrate versatile object transport behaviors, including kicking, pushing, and carrying, from arbitrary initial states, significantly outperforming baselines trained on raw retargeted data with physical inconsistencies.
|
| |
| 10:45-12:05, Paper FrZA.30 | |
| Velocity Field Based Data Augmentation for Corrective Imitation Learning |
|
| Ma, Shiping | Technische Universität Berlin |
| Auddy, Sayantan | Technische Universität Berlin |
| Toussaint, Marc | TU Berlin |
Keywords: Data Sets for Robot Learning, Imitation Learning
Abstract: While imitation learning (IL) has demonstrated strong performance in a variety of tasks, policies trained purely from demonstrations often suffer from distribution mismatch during execution. Small errors can accumulate over time, driving the system into states that are insufficiently covered by the training data and leading to task failure. Although distribution shift in IL has been widely studied, existing approaches often provide limited coverage of recovery behaviors from off-trajectory states. In this abstract, we collect demonstrations using a motion-planning solver in simulation and then construct a velocity field around the trajectories to generate additional recovery behaviors. We investigate whether the proposed approach can enhance the robustness of imitation learning.
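The velocity-field construction might look as follows (a minimal sketch; the nearest-point rule, the corrective gain, and the Gaussian perturbations are assumptions for illustration, not the authors' exact procedure):

```python
import numpy as np

def recovery_velocity(traj, traj_vel, state, gain=1.0):
    """Velocity-field label for an off-trajectory state: the demonstrated
    velocity at the nearest trajectory point plus a corrective term pulling
    the system back toward the demonstration. Gain and nearest-point rule
    are illustrative assumptions."""
    i = np.argmin(np.linalg.norm(traj - state, axis=1))
    correction = gain * (traj[i] - state)
    return traj_vel[i] + correction

def augment(traj, traj_vel, noise_std=0.05, samples_per_point=4, seed=0):
    """Generate (state, action) training pairs around a demonstration,
    covering off-trajectory states with recovery behaviors."""
    rng = np.random.default_rng(seed)
    pairs = []
    for x in traj:
        for _ in range(samples_per_point):
            x_off = x + rng.normal(0.0, noise_std, size=x.shape)
            pairs.append((x_off, recovery_velocity(traj, traj_vel, x_off)))
    return pairs
```

A policy trained on the augmented pairs then sees labeled recovery actions for exactly the perturbed states that compounding errors would otherwise leave uncovered.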
|
| |
| 10:45-12:05, Paper FrZA.31 | |
| A ROS-Based Platform for Standard-Compliant Haptic Teleoperation |
|
| Liu, Siwen | Technical University of Munich |
| Zhou, Xuanyu | Technical University of Munich |
| Xu, Xiao | Technical University of Munich |
| Steinbach, Eckehard | Technical University of Munich |
Keywords: Telerobotics and Teleoperation, RIG Cluster: Rigorous Perception, RIG Cluster: Human-Robot Interaction
Abstract: This work presents an IEEE 1918.1.1-compliant human-in-the-loop haptic teleoperation testbed. The testbed implements the haptic codec defined in the standard, enabling bandwidth-efficient haptic communication and teleoperation in the presence of communication delay. Two key contributions are the first implementations of the required metadata exchange as an implicit handshake mechanism and the bilateral Plug-and-Play (PnP) procedure specified in IEEE 1918.1.1, which were not included in the previously released open-source reference code accompanying the standard. The system is realized on Linux and evaluated in human-in-the-loop experiments using a NOVINT Falcon as the leader device and a simulated follower robot in Gazebo (Panda arm), demonstrating correct operation of the codec, metadata exchange, and PnP mechanism.
|
| |
| 10:45-12:05, Paper FrZA.32 | |
| LLM-Pack: Intuitive Grocery Handling for Logistics Applications |
|
| Blei, Yannik | University of Technology Nuremberg |
| Krawez, Michael | University of Technology Nuremberg |
| Göß, Adrian | University of Technology Nuremberg (UTN) |
| Jülg, Tobias Thomas | University of Technology Nuremberg |
| Krack, Pierre | University of Technology Nuremberg |
| Walter, Florian | Technical University Munich |
| Burgard, Wolfram | University of Technology Nuremberg |
Keywords: Logistics, Manipulation Planning, Deep Learning in Grasping and Manipulation
Abstract: Robotics and automation are increasingly influential in logistics but remain largely confined to traditional warehouses. In grocery retail, advancements such as cashier-less supermarkets exist, yet customers still manually pick and pack groceries. While robotics research has extensively addressed bin picking, packing objects has received comparatively little attention. Packing grocery items, however, can be crucial for several reasons. First, densely packing objects is typically beneficial for optimizing subsequent logistics due to the saved space. Second, the order in which the items are packed can be important for preventing product damage, e.g., heavy objects should not be placed on top of fragile ones. Unfortunately, it is difficult to exactly specify the criteria for the right packing scheme for all objects the robot might encounter, given the huge variety of objects typically found in stores. In this paper, we introduce LLM-Pack, a novel approach to grocery packing. LLM-Pack leverages language and vision foundation models for identifying groceries and generating packing constraints that mimic human packing strategies. These constraints serve as input to a Mixed-Integer Linear Programming (MIP) approach, which computes an optimal packing scheme. LLM-Pack does not require any training and can flexibly handle new grocery items. We evaluate our approach in simulation and real-world experiments to demonstrate its performance.
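The ordering constraint described above ("heavy must not be placed on top of fragile") can be illustrated with a toy enumerator (the paper solves this as a MIP; brute force and the `cost` callback here are hypothetical stand-ins, viable only for a handful of items):

```python
from itertools import permutations

def violates(order, heavy, fragile):
    """Packing order is bottom-up: a heavy item placed after a fragile one
    would rest on top of it, which the constraint forbids."""
    seen_fragile = False
    for item in order:
        if item in fragile:
            seen_fragile = True
        elif item in heavy and seen_fragile:
            return True
    return False

def best_order(items, heavy, fragile, cost):
    """Tiny stand-in for the MIP solve: enumerate packing orders and keep
    the cheapest feasible one under a user-supplied cost function."""
    feasible = [p for p in permutations(items) if not violates(p, heavy, fragile)]
    return min(feasible, key=cost)
```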
|
| |
| 10:45-12:05, Paper FrZA.33 | |
| Policy Distillation from a Model-Based Expert for Non-Prehensile Manipulation |
|
| Shcherba, Denis | TU Berlin |
| Toussaint, Marc | TU Berlin |
Keywords: Imitation Learning
Abstract: While significant advancements have been made in prehensile pick-and-place operations, achieving human-level manipulation capabilities requires further advancements in non-prehensile manipulation. Strategies such as pushing, sliding, or toppling remain formidable challenges, as they require the system to reason about complex contact dynamics and frictional forces without the stability of a firm grasp. This motivates the use of policy distillation from a model-based solver. We leverage a nonlinear-programming-based optimal planner in simulation to generate an abundance of optimal trajectories for complex non-prehensile tasks, such as hooking books from a shelf or pushing a puck. This privileged knowledge is then distilled into a sensorimotor student policy. Through extensive ablation studies, we evaluate the impact of different observation modalities—including depth maps and pre-trained visual features—alongside various policy paradigms, ranging from transformer-based sequence-to-sequence models to diffusion-based policies. Finally, we investigate sim-to-real capabilities, addressing sensor mismatch and distribution shifts by comparing simulation-trained models with policies trained on real-world data.
|
| |
| 10:45-12:05, Paper FrZA.34 | |
| Learning Semantic-Geometric Task Graph-Representations from Human Demonstrations |
|
| Herbert, Franziska | Technische Universität Darmstadt |
| Prasad, Vignesh | Technische Universität Darmstadt |
| Liu, Han | Technische Universität Darmstadt |
| Koert, Dorothea | Technische Universität Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Representation Learning, Semantic Scene Understanding, Bimanual Manipulation
Abstract: Learning structured task representations from human demonstrations is essential for understanding long-horizon manipulation behaviors, particularly in bimanual settings where action ordering, object involvement, and interaction geometry can vary significantly. A key challenge lies in jointly capturing the discrete semantic structure of tasks and the temporal evolution of object-centric geometric relations in a form that supports reasoning over task progression. In this work, we introduce a semantic–geometric task graph-representation that encodes object identities, inter-object relations, and their temporal geometric evolution from human demonstrations. Building on this formulation, we propose a learning framework that combines a Message Passing Neural Network (MPNN) encoder with a Transformer-based decoder, decoupling scene representation learning from action-conditioned reasoning about task progression. The encoder operates solely on temporal scene graphs to learn structured representations, while the decoder conditions on action-context to predict future action sequences, associated objects, and object motions over extended time horizons. Through extensive evaluation on human demonstration datasets, we show that semantic–geometric task graph-representations are particularly beneficial for tasks with high action and object variability, where simpler sequence-based models struggle to capture task progression. Finally, we demonstrate that task graph representations can be transferred to a physical bimanual robot and used for online action selection, highlighting their potential as reusable task abstractions for downstream decision-making in manipulation systems.
|
| |
| 10:45-12:05, Paper FrZA.35 | |
| A Unified Human-Likeness Criterion for Evaluating Human-Like Motion Retargeting on Bimanual Manipulation Tasks |
|
| Meixner, Andre | Karlsruhe Institute of Technology (KIT) |
| Carl, Mischa | Karlsruhe Institute of Technology (KIT) |
| Krebs, Franziska | Karlsruhe Institute of Technology (KIT) |
| Steudle, Steffen | Karlsruhe Institute of Technology |
| Jaquier, Noémie | KTH Royal Institute of Technology |
| Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: RIG Cluster: Manipulate Anything, Anywhere, Anytime, Human and Humanoid Motion Analysis and Synthesis, Bimanual Manipulation
Abstract: Understanding bimanual and human-like motion is pivotal to equip humanoid robots with human-like capabilities and manipulation skills and to enable intuitive human-robot interaction. In this extended abstract, we present a multi-modal dataset of accurate whole-body human bimanual manipulation actions. Moreover, we conceptualize and derive a novel unified human-likeness criterion to assess human-like robot motions, which we evaluated across applications related to motion retargeting on bimanual manipulation tasks. Building on these results, we propose an importance-based motion retargeting approach improving human likeness.
|
| |
| 10:45-12:05, Paper FrZA.36 | |
| Supervisory Control for Runtime-Safe LLM-Generated Swarm Controllers |
|
| Bauer, Jannis | Technical University of Darmstadt |
| Isildar, Ecem | Technical University of Darmstadt |
| Gross, Roderich | Technical University of Darmstadt |
|
|
| |
| 10:45-12:05, Paper FrZA.37 | |
| Towards Automated Disassembly for Battery Removal of Robot Vacuum Cleaners |
|
| Singh, Dheeraj | Fraunhofer IPA |
| Hoeltge, Lasse | Fraunhofer IPA |
| Al Assadi, Anwar | Fraunhofer IPA |
| Bargmann, Daniel | Fraunhofer IPA |
| Kraus, Werner | Fraunhofer IPA |
| Huber, Marco F. | Fraunhofer IPA |
Keywords: Disassembly, Factory Automation
Abstract: The annual amount of electronic waste is increasing worldwide, and the number of battery-powered electrical appliances, such as Robot Vacuum Cleaners (RVCs), is also on the rise. The current state-of-the-art disposal process involves collection, partial disassembly, shredding, and subsequent sorting, which currently does not prioritize batteries, leading to fire incidents. Furthermore, this process does not guarantee that other value-preserving recycling methods can directly reuse materials as recyclate. Rather than having humans perform this dirty and hazardous task, robot-based disassembly could serve as a new and effective treatment to address these issues. This paper presents a case study of a critical step: the robot-based removal of batteries from RVCs. The proposed solution is a robust and modular robotic cell that fully automates the removal of spring terminal batteries from RVCs once a user feeds the object into the cell. The disassembly process consists of screw detection, unfastening, removal of the battery cover, and finally, removal of the battery. For this objective, we use a force-controlled skill-based robot programming framework combined with computer vision-based detection and comprehensive error handling strategies. Using the robot cell, we achieved an 81% success rate in the entire pipeline for different models.
|
| |
| 10:45-12:05, Paper FrZA.38 | |
| Data-Free Training of Diverse Neural Samplers for Constrained Sets |
|
| Burghoff, Tilman | Technical University Berlin |
| Braun, Cornelius Valentin | Technische Universität Berlin |
| Toussaint, Marc | TU Berlin |
Keywords: Deep Learning Methods, AI-Based Methods, RIG Cluster: Learning and Multimodal AI for Robotics
Abstract: Sampling from constrained sets is a core paradigm for many robotics problems, for example grasping or path planning. Although optimizing under constraints is a well studied problem, sampling under constraints has received less attention. In comparison, diffusion and flow based sampling is well-studied and has proven effective in many domains. However, these models usually need a lot of data for their training. This proves to be a bottleneck, especially in robotics, where datasets are often not readily available and expensive to create. We propose a novel method to train diffusion and flow matching models to sample uniformly from a constrained set. Our algorithm trains the model without any initial dataset, using only differentiable constraints. This allows novel applications for problems where constraints are known but data is rare.
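The idea of supervising a sampler with constraints alone can be sketched as a loss that penalizes constraint violation while rewarding spread over the feasible set (illustrative; the unit-disc constraint and the repulsion term are assumptions, and the actual method trains diffusion/flow models rather than scoring a fixed sample batch):

```python
import numpy as np

def constraint_violation(x):
    """Example differentiable constraint g(x) = ||x||^2 - 1 <= 0 (unit disc).
    A hypothetical stand-in for a grasp or path-planning constraint."""
    return np.maximum((x ** 2).sum(axis=-1) - 1.0, 0.0)

def data_free_loss(samples, repulsion_scale=0.1):
    """No dataset required: the only supervision is the constraint itself,
    plus a pairwise-repulsion term encouraging uniform coverage."""
    feas = constraint_violation(samples).mean()
    diff = samples[:, None, :] - samples[None, :, :]
    d = np.linalg.norm(diff, axis=-1) + np.eye(len(samples))  # mask self-distances
    diversity = (1.0 / d).mean()
    return feas + repulsion_scale * diversity
```

In the actual method this objective would be differentiated through the generative model's samples; here NumPy only evaluates it.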
|
| |
| 10:45-12:05, Paper FrZA.39 | |
| CLEVERR: Commonsense LLM-Enhanced Vehicle Routing for Efficient Room Rearrangement |
|
| Gassen, Martina | Technical University of Darmstadt |
| Rudra, Sohan | TU Darmstadt |
| Prasad, Vignesh | TU Darmstadt |
| Schultze, Sven | Technical University of Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Integrated Planning and Learning, Task Planning, Domestic Robotics
Abstract: Tidying up rooms is a challenging task that requires agents to navigate their environments, locate and interact with objects, and reason about placements. Given the open-ended nature of room tidying, recent works have shifted their focus to using Large Language Models (LLMs). However, existing approaches often rely on predefined goal states or incur high prompting costs, leaving efficiency in goal-free tidying largely unaddressed. Different from existing works, we present a novel method that combines the commonsense reasoning of LLMs with classical optimization-based planning methods for efficient and open-ended goal-free room rearrangement. Our method incrementally constructs a 3D semantic scene graph and queries an LLM to identify misplaced objects, propose plausible placements, and highlight areas of interest. The resulting waypoints are optimized via a Vehicle Routing Problem (VRP) formulation to minimize travel and improve tidying efficiency. Our method outperforms greedy baselines in both tidying efficiency and success rates, achieving high accuracy in misplaced object detection and placement suggestions. Finally, we demonstrate real-world feasibility through qualitative experiments on misplacement detection and placement reasoning.
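The travel-minimization step can be illustrated with a greedy stand-in for the VRP solve (nearest-neighbor routing over waypoints; the actual formulation solves a proper VRP rather than this heuristic):

```python
import math

def route_length(points, order):
    """Total Euclidean travel along a visiting order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))

def nearest_neighbor_route(points, start=0):
    """Greedy tour over waypoints (misplaced-object pickups and placements):
    always visit the closest unvisited waypoint next. Simple, but can be
    arbitrarily worse than the optimal VRP solution."""
    unvisited = set(range(len(points))) - {start}
    order = [start]
    while unvisited:
        last = order[-1]
        nxt = min(unvisited, key=lambda i: math.dist(points[last], points[i]))
        order.append(nxt)
        unvisited.remove(nxt)
    return order
```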
|
| |
| 10:45-12:05, Paper FrZA.40 | |
| Unified Legged Locomotion: A Single Policy for Millions of Morphologies |
|
| Bohlinger, Nico | TU Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
Keywords: RIG Cluster: Legged Locomotion, Reinforcement Learning, Humanoid and Bipedal Locomotion
Abstract: We present a single, general locomotion policy trained on a diverse collection of 50 legged robots. By combining an improved embodiment-aware architecture (URMAv2) with a performance-based curriculum for extreme Embodiment Randomization, our policy learns to control millions of morphological variations. Our policy achieves zero-shot transfer to unseen real-world humanoid and quadruped robots.
|
| |
| 10:45-12:05, Paper FrZA.41 | |
| Verifier-Guided Action Selection for Embodied Agents |
|
| Singhi, Nishad | Technische Universität Darmstadt |
| Bialas, Christian | TU Darmstadt |
| Jauhri, Snehal | TU Darmstadt |
| Prasad, Vignesh | TU Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
| Rohrbach, Marcus | TU Darmstadt |
| Rohrbach, Anna | TU Darmstadt |
Keywords: Integrated Planning and Learning, Task Planning, RIG Cluster: Learning and Multimodal AI for Robotics
Abstract: Creating generalist embodied agents that solve complex real-world tasks is a grand challenge for AI. Multimodal large language models (MLLMs) have enhanced the reasoning capabilities of embodied agents, yet they struggle with distributional shifts. Standard Chain-of-Thought (CoT) reasoning improves performance but is insufficient to overcome these out-of-distribution challenges. We introduce Verifier-Guided Action Selection (VeGAS), a novel framework that fundamentally enhances the robustness of MLLM reasoning by integrating an explicit verification step. Instead of relying on a single decoded action, VeGAS generates an ensemble of candidate actions and employs a learned generative verifier to select the most reliable action. To train the verifier without any costly human data collection, we introduce an LLM-driven data generation strategy to automatically synthesise a diverse curriculum of failures, enabling the verifier to learn from a rich distribution of potential mistakes. Experiments on long-horizon embodied reasoning tasks showcase the power of the proposed approach to improve performance and generalization across all tasks, leading even to a 70% relative performance gain on challenging scenarios over strong CoT baselines.
|
| |
| 10:45-12:05, Paper FrZA.42 | |
| Fast Path, Slow State: Dual-Rate Vision-Language-Action Control under Asynchrony |
|
| Vanjani, Pankhuri | Karlsruhe Institute of Technology |
| Reuss, Moritz | Karlsruhe Institute of Technology |
| Li, Zhuoyue | Karlsruhe Institute of Technology |
| Suliga, Jakub | Karlsruhe Institute of Technology |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: RIG TC: Robotics Foundation Models, Imitation Learning, Learning from Demonstration
Abstract: Real-world robot perception runs on heterogeneous sensors at different rates and receives observations asynchronously, but most VLA methods assume synchronized observation bundles. We propose a dual-rate VLA design that decouples control-rate action generation from slower context estimation. A fast flow-matching-style action generator acts continuously from the latest available tokens, while a compact slow-state context is updated asynchronously. We evaluate robustness by injecting controlled modality delays and frame dropouts at inference time, and study update rules and representations for the slow-state context. Preliminary LIBERO results indicate that compressed memory tokens, particularly GRU-based compression, improve success rates and inference-time dropout robustness compared to naive frame stacking and a single-system baseline, motivating event-triggered context updates as a next step.
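The dual-rate scheduling can be sketched as follows (illustrative; `update_context` stands in for the GRU-based compression, and the fixed `slow_period` is a simplification of asynchronous observation arrival):

```python
def dual_rate_control(fast_policy, update_context, observations, slow_period=5):
    """Fast path: act every tick from the latest observation plus the current
    compressed context. Slow path: refresh the context only when new (possibly
    delayed) observations arrive, modeled here as every `slow_period` ticks.
    Both callables are hypothetical stand-ins for the paper's components."""
    context = None
    actions = []
    for t, obs in enumerate(observations):
        if t % slow_period == 0:          # slow, asynchronous context update
            context = update_context(context, obs)
        actions.append(fast_policy(obs, context))
    return actions
```

The fast policy never blocks on the slow path, which is what lets action generation run at control rate even when context estimation lags.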
|
| |
| 10:45-12:05, Paper FrZA.43 | |
| CampusEye: A Visual Corpus for Self-Localization on the Fulda Campus |
|
| Milde, Jan-Torsten | Fulda University of Applied Sciences |
Keywords: Data Sets for Robotic Vision, Vision-Based Navigation
Abstract: The CampusEye project establishes a visual dataset designed to help autonomous robots navigate the Fulda University of Applied Sciences using only basic cameras. We identified 55 specific locations across the campus, using the existing tactile guidance system to ensure precise positioning and environmental variety. By recording panoramic videos at these spots, we captured over one million images reflecting different lighting conditions and perspectives. This data was used to train a lightweight deep learning model that identifies a robot's location and heading with high accuracy. The study shows that vision-based localization is a functional, low-cost alternative to expensive sensors like LiDAR for navigating complex pedestrian environments. Overall, the corpus provides a robust foundation for real-time robotic orientation in semi-structured outdoor spaces.
|
| |
| 10:45-12:05, Paper FrZA.44 | |
| RoboGrounder - Grounded Spatio-Temporal Reasoning for Robotics |
|
| Blank, Nils | KIT |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: Deep Learning for Visual Perception, Data Sets for Robot Learning, Data Sets for Robotic Vision
Abstract: High-quality language and reasoning annotations are essential for training generalizable Vision Language Action Models (VLAs), yet manual labeling is unscalable. Existing automated methods often lack precision in robotic domains or require complex hand-crafted annotation pipelines. To address this, we introduce RoboGrounder, a VLM framework designed to generate reliable spatio-temporal reasoning and object-centric annotations for manipulation demonstrations. We propose a robust annotation pipeline that combines foundation models with robot proprioception to annotate a diverse dataset of 500k VQA grounding pairs sourced from a variety of robot manipulation demonstrations. Experiments demonstrate that RoboGrounder significantly outperforms base models, showing substantial improvements in task identification, temporal action localization, object interaction detection, and spatial grounding accuracy.
|
| |
| 10:45-12:05, Paper FrZA.45 | |
| MAD-IRL: Multi-Agent Drone Racing Using Inverse RL and Iterative Best-Response MPC |
|
| Schlüter, Niklas | Technical University of Munich |
| Schuck, Martin | Technical University of Munich |
| Brunke, Lukas | Technical University of Munich |
| Samavi, Sepehr | University of Toronto |
| Zhou, Siqi | Technical University of Munich |
| Schoellig, Angela P. | TU Munich |
Keywords: Aerial Systems: Perception and Autonomy, Multi-Robot Systems, RIG Cluster: Safety, Reliability and Resilience of AI-based Robotics
Abstract: While autonomous drone racing has achieved superhuman performance in time trials, competitive multi-agent racing remains challenging due to complex interactions. We address this by modeling opponents as optimal agents, using Inverse Reinforcement Learning (IRL) to infer their reward functions. This learned reward drives an iterative best response Model Predictive Control (MPC) that jointly predicts opponent behavior and optimizes our agent’s strategy, explicitly accounting for collisions and aerodynamic effects. Extensive experiments demonstrate that our approach significantly improves prediction accuracy across diverse opponent controllers, which translates directly to higher success rates in interactive maneuvers, such as overtaking, and therefore lower crash rates.
|
| |
| 10:45-12:05, Paper FrZA.46 | |
| Pinky 2: A Vision-Based Tactile Sensor for Minimal Invasive Surgery |
|
| Koch, Robin | Technische Universität Dresden |
| Mascot, Annabella | Stanford University |
| Younis, Rayan | University Hospital and Medical Faculty Carl Gustav Carus, TU Dresden |
| Wagner, Martin | University Hospital and Faculty of Medicine Carl Gustav Carus at TUD Dresden University of Technology |
| Speidel, Stefanie | National Center for Tumor Diseases |
| Cutkosky, Mark | Stanford University |
| Sieber, Ingo | Karlsruhe Institute of Technology (KIT) |
| Calandra, Roberto | TU Dresden |
|
|
| |
| 10:45-12:05, Paper FrZA.47 | |
| Structured Planning Using Vision Language Models for Physical Agents |
|
| S Prabhu, Pranav | Technische Universität Dortmund |
| Xavier, Aaron | Technische Universität Dortmund |
|
|
| |
| 10:45-12:05, Paper FrZA.48 | |
| Sim-To-Real for Muscle-Actuated Robots Via Learned Actuator Models |
|
| Schneider, Jan | Max Planck Institute for Intelligent Systems, Tübingen |
| Mahajan, Mridul | Boston University |
| Chen, Le | Max Planck Institute for Intelligent Systems, Tübingen |
| Guist, Simon | Max Planck Institute for Intelligent Systems, Tübingen |
| Schölkopf, Bernhard | ELLIS Institute Tübingen |
| Posner, Ingmar | University of Oxford |
| Büchler, Dieter | University of Alberta |
Keywords: Soft Robot Applications, Reinforcement Learning, Transfer Learning
Abstract: Tendon drives and soft muscle actuation, as seen in humans, can make robots safer, faster, and potentially accelerate skill learning. Still, such robots are rarely used due to inherent nonlinearities, friction, and hysteresis, which make such systems challenging to model and control. These challenges prohibit learning entire policies in simulation and transferring them to the real system. To enable sim-to-real training for robots with soft actuation and tendon drives, we propose learning a model of the actuators and combining it with a torque-based simulator, allowing us to learn the difficult-to-model actuation and leverage well-studied rigid body models for the rest. This combined simulation enables training reinforcement learning policies for a goal-reaching and a ball-in-a-cup task purely in simulation, achieving success rates of 90% and 75%, respectively, on the real robot.
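The hybrid simulation idea, a learned actuator model feeding a torque-based rigid-body model, can be sketched with a single pendulum joint (the pendulum dynamics and the linear "learned" model in the test are illustrative assumptions, not the paper's robot):

```python
import math

def pendulum_step(state, torque, dt=0.01, g=9.81, l=1.0, m=1.0):
    """Minimal torque-based rigid-body model: one pendulum joint integrated
    with explicit Euler. Stands in for the well-studied rigid-body part."""
    theta, omega = state
    alpha = (torque - m * g * l * math.sin(theta)) / (m * l * l)
    return (theta + dt * omega, omega + dt * alpha)

def hybrid_sim_step(actuator_model, state, command, dt=0.01):
    """The learned model covers the hard-to-model actuation (nonlinearities,
    friction, hysteresis), mapping (state, command) to joint torque; the
    analytic model then integrates the resulting dynamics."""
    torque = actuator_model(state, command)
    return pendulum_step(state, torque, dt)
```

Policies trained in a loop over `hybrid_sim_step` see actuation behavior learned from the real system while still benefiting from a cheap analytic simulator.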
|
| |
| 10:45-12:05, Paper FrZA.49 | |
| Hybrid Approach for Asymmetric Teacher-Student Training |
|
| Schik, Maximilian | FZI Forschungszentrum Informatik |
| Daaboul, Karam | Karlsruhe Institut for Technology |
| Neumann, Gerhard | Karlsruhe Institute of Technology |
Keywords: AI-Enabled Robotics
Abstract: Vision-based locomotion policies typically train with on-policy methods in massively parallel simulation, but rendering depth images is computationally expensive and limits feasible parallelism. Off-policy methods can reuse rendered observations through experience replay, reducing computational cost, but struggle with instability when training from high-dimensional visual inputs. We introduce a hybrid approach that combines off-policy RL with time-decaying knowledge distillation from a privileged teacher. Strong early supervision prevents the student from exploiting poorly calibrated critics, while exponential decay transfers control to the RL objective as the replay buffer matures. Trained in parallelized simulation, the resulting policy transfers zero-shot to a Unitree Go2 quadruped, where it executes crawling behaviors under low obstacles using only egocentric depth and proprioception.
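The time-decaying distillation described above amounts to a weighted sum of the RL objective and a teacher-imitation term whose weight shrinks exponentially with training steps. A minimal sketch, with illustrative hyperparameters (`w0`, `tau`) that are assumptions rather than the paper's values:

```python
import math

def hybrid_loss(rl_loss, distill_loss, step, w0=1.0, tau=10_000.0):
    """Off-policy RL loss plus teacher distillation whose weight decays
    exponentially, handing control to the RL objective as the replay
    buffer matures."""
    w = w0 * math.exp(-step / tau)
    return rl_loss + w * distill_loss

early = hybrid_loss(rl_loss=0.5, distill_loss=2.0, step=0)        # teacher dominates
late = hybrid_loss(rl_loss=0.5, distill_loss=2.0, step=100_000)   # nearly pure RL
```

Early in training the student is held close to the privileged teacher; once the critic has seen enough replayed data, the distillation term is negligible.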
|
| |
| 10:45-12:05, Paper FrZA.50 | |
| Coarse-To-Fine BEAST: Dual-System Spline Tokenization for Vision-Language-Action Models |
|
| Stranghöner, Jannick | Universität Bielefeld |
| Gruner, Theo | TU Darmstadt |
| Vanjani, Pankhuri | Karlsruhe Institute of Technology |
| Jülg, Tobias Thomas | University of Technology Nuremberg |
| Scherer, Christian Felix | TU Darmstadt |
| Peters, Jan | Technische Universität Darmstadt |
| Burgard, Wolfram | University of Technology Nuremberg |
| Neumann, Klaus | Bielefeld University / Fraunhofer IOSB-INA |
| Lioutikov, Rudolf | Karlsruhe Institute of Technology |
Keywords: RIG Cluster: Learning and Multimodal AI for Robotics, RIG TC: Robotics Foundation Models, Imitation Learning
Abstract: B-spline action tokenizers such as BEAST compress high-frequency robot trajectories into a small, fixed-length token sequence, enabling fast parallel decoding and smooth motion within the chunk. In online control, however, a fundamental tension between reactivity and throughput remains: replanning frequently improves responsiveness, but repeatedly invoking a large vision-language-action (VLA) backbone is expensive, while longer action chunks improve efficiency at the cost of delayed corrections and reduced responsiveness. We propose Coarse-to-Fine BEAST, a dual-system formulation that factorizes planning and refinement directly in B-spline control-point space. A slow System~2 VLA predicts a small set of coarse control points, while a lightweight System~1 refiner performs fast, observation-conditioned residual updates on a small subset of future fine control points, preserving continuity by construction. This decomposition reduces backbone calls and provides an interface for adding new modalities, such as force/torque or tactile input, during finetuning without modifying the large VLA.
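The coarse-to-fine factorization can be illustrated in control-point space: the slow system emits a few coarse control points, which are mapped to a fine grid, and the fast system adds residuals only to a suffix of future points so already-executed motion is untouched. Linear interpolation here is a stand-in for the actual spline-space mapping, and all values are illustrative assumptions:

```python
import numpy as np

def upsample_control_points(coarse, factor=4):
    """Map coarse System-2 control points onto a fine grid (linear
    interpolation stands in for the true B-spline-space mapping)."""
    n = len(coarse)
    x_fine = np.linspace(0, n - 1, (n - 1) * factor + 1)
    return np.interp(x_fine, np.arange(n), coarse)

def refine(fine, residuals, k):
    """System-1 residual update on the last k future control points;
    earlier points stay fixed, preserving continuity with the
    already-executed portion by construction."""
    out = fine.copy()
    out[-k:] += residuals
    return out

coarse = np.array([0.0, 1.0, 0.5, 2.0])       # assumed slow-VLA output
fine = upsample_control_points(coarse)        # 13 fine control points
fine = refine(fine, residuals=0.1 * np.ones(3), k=3)
```

Because only the trailing control points change, the refiner can run at a much higher rate than the backbone without introducing discontinuities.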
|
| |
| 10:45-12:05, Paper FrZA.51 | |
| Enhanced Co-Design of a Collective Robotic Construction System for the Assembly of In-Plane Timber Structures |
|
| Leder, Samuel | University of Stuttgart |
| Kim, HyunGyu | Korea Aerospace University |
| Sitti, Metin | Max-Planck Institute for Intelligent Systems |
| Menges, Achim | Institute for Computational Design and Construction, University of Stuttgart |
|
|
| |
| 10:45-12:05, Paper FrZA.52 | |
| Digital Twins for Virtual Validation of Social Robots Leveraging Semantic Knowledge and Cognition |
|
| Zebisch, Raoul | University of Augsburg |
| Schilp, Johannes | University of Augsburg |
Keywords: Social HRI, Semantic Scene Understanding, Cognitive Modeling
Abstract: In recent years, the field of social robotics has gained popularity, most prominently in areas such as service, healthcare, and care of the elderly. More recently, the topic has also gained relevance in other application fields such as manufacturing. Social capabilities have the potential to improve the quality of interaction between humans and robots, enhancing robot acceptance and quality of life in the process. However, validating these capabilities through digital approaches is difficult: processes of social human-robot interaction involve a large number of social facts and rules, often highly context-dependent, which makes solutions difficult even with data-driven AI approaches. It is therefore the authors' belief that context-sensitive virtual validation systems for social robots, possibly a digital twin with an interactive 3D virtualization of the robot, achieve better results if an AI system is combined with a formal, possibly fuzzified, rule base realized in a semantic knowledge graph. In this extended abstract, the authors give an overview of their methodological approach and recent advances in developing a social robot system in a digital twin, enabling virtual validation and improved social robot cognition, such as for communication or task planning.
|
| |
| 10:45-12:05, Paper FrZA.53 | |
| Mistake-Aware LLM Finetuning for Robust Planning |
|
| Prescher, Erik | Technical University Darmstadt |
| Schultze, Sven | Technical University of Darmstadt |
| Rudra, Sohan | TU Darmstadt |
| Prasad, Vignesh | TU Darmstadt |
| Stock-Homburg, Ruth | Technical University of Darmstadt |
| Chalvatzaki, Georgia | Technische Universität Darmstadt |
Keywords: Task Planning, Failure Detection and Recovery, AI-Based Methods
Abstract: While finetuned Large Language Models (LLMs) for embodied planning excel at producing reliable plans for a given environment, they do so in a very narrow area of operation: usually a single wrong step in such a plan is sufficient to put the agent into an unseen scenario. Current training paradigms focus on preventing mistakes by learning from optimal demonstrations, but neglect the crucial skill of recovering after deviating from the correct plan. To address this gap, we propose Mistake-Aware Finetuning (MAF), a novel training methodology that explicitly teaches agents to recover from planning errors. In the MAF paradigm, the model is exposed to plans containing intentional mistakes, but a targeted loss mask ensures it only learns from the subsequent, correct recovery actions. This allows the model to learn the association between a failure state and its resolution without being negatively influenced by the erroneous action. We demonstrate the effectiveness of MAF, substantially improving the task success rate from 67% to 96% on the complex MiniGrid MiniBossLevel environment.
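The targeted loss mask described above can be sketched as follows: per-token losses are computed over the whole plan, but tokens belonging to injected mistakes are excluded from the average, so gradients flow only through the recovery actions. The function and example values are illustrative assumptions, not the paper's implementation:

```python
def maf_loss(token_losses, is_mistake):
    """Mistake-Aware Finetuning style masking (illustrative sketch):
    zero out the contribution of injected-mistake tokens so the model
    learns only from the subsequent recovery actions."""
    assert len(token_losses) == len(is_mistake)
    kept = [loss for loss, m in zip(token_losses, is_mistake) if not m]
    return sum(kept) / max(len(kept), 1)

# Plan tokens: [ok, ok, MISTAKE, recovery, recovery]
losses = [0.2, 0.3, 5.0, 0.4, 0.1]
mask = [False, False, True, False, False]
avg = maf_loss(losses, mask)   # the high mistake loss is excluded
```

The model thus sees the failure state as context while never being trained to reproduce the erroneous action itself.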
|
| |
| 10:45-12:05, Paper FrZA.54 | |
| Close-Proximity Human-Robot Interaction in Medical Interventions and Patient Care |
|
| Plonka, Björn Sören | Karlsruhe Institute of Technology (KIT) |
| Stallkamp, Jan | University of Heidelberg, Medical Faculty Mannheim, MIiSM |
| Mombaur, Katja | Karlsruhe Institute of Technology |
Keywords: Medical Robots and Systems, Physical Human-Robot Interaction, Task Planning
Abstract: Healthcare systems are increasingly strained by demographic change and staff shortages, motivating the integration of robotic assistants into daily patient care. While existing hospital robots mostly focus on surgical applications, routine patient-centered tasks remain unsupported. This work presents a research vision for the interaction strategies of a humanoid robotic assistant designed to operate safely in close physical contact with patients. We propose intuitive task selection from a predefined yet adaptive set of robot-executable actions. Additionally, we plan to examine the applicability and limitations of state-of-the-art dexterous humanoid hands in medical settings. Finally, we discuss the usability of artificial intelligence in hospital environments, highlighting its potential to increase robot acceptance through human-like conversation, while emphasizing that AI-driven actuation of the robot's motors may lead to catastrophic failure.
|
| |
| 10:45-12:05, Paper FrZA.55 | |
| Designing Gaze-Guided Spatial Referencing for Trustworthy Mobile Robot Interaction |
|
| Elangovan, Govindaprasath | PES University, Bangalore |
| Janardhana, Vivek Kashyap | PES University, Bangalore |
| Vinay Krishna Sharma, Vinay Krishna | Indian Institute of Science |
| Bharti, Priyanka | IIT Kanpur |
Keywords: Design and Human Factors, Social HRI
Abstract: This work proposes a gaze-driven spatial referencing framework that couples robot gaze behavior with SLAM-based spatial representations to enable explainable and human-legible interaction in autonomous mobile robots. While gaze has been widely used as a social cue in Human–Robot Interaction (HRI), its integration with a robot’s internal spatial cognition remains limited. Building on the Mirror Eyes concept of gaze as an explainability interface and prior findings on trust and legibility in HRI, we transform gaze into a computational mechanism that grounds spatial intent in observable robot behavior. The proposed approach projects the robot’s gaze vector into the SLAM-generated map to anchor objects, regions, and navigation goals, allowing the robot to communicate spatial references through coordinated gaze and interaction cues. This framework bridges the gap between internal navigation reasoning and human-understandable communication, supporting transparency, predictability, and user trust in mobile robot interaction.
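Projecting a gaze vector into a SLAM map reduces, in the simplest case, to intersecting the gaze ray with the map's ground plane. A minimal geometric sketch, assuming a planar map and (x, y, z) tuples in the map frame (function and variable names are illustrative, not from the paper):

```python
def gaze_to_map_point(origin, direction, ground_z=0.0):
    """Intersect the robot's gaze ray with the ground plane of a SLAM
    map to anchor a spatial reference. Returns the (x, y) map point,
    or None when the ray never reaches the ground."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz >= 0:                       # gaze level with or above horizon
        return None
    t = (ground_z - oz) / dz          # ray parameter at the plane
    return (ox + t * dx, oy + t * dy)

# Head at 1.2 m height, looking forward and slightly down:
point = gaze_to_map_point((0.0, 0.0, 1.2), (1.0, 0.0, -0.6))
```

The resulting map point can then be matched against mapped objects or regions to turn an observable gaze cue into a grounded spatial reference.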
|
| |
| 10:45-12:05, Paper FrZA.56 | |
| Learning Autonomous Excavation: A Reinforcement Learning Approach to Rock Manipulation |
|
| Daaboul, Karam | Karlsruhe Institute of Technology |
| Wiberg, Viktor | Algoryx Simulation AB |
| Singh, Arvind | Karlsruhe Institute of Technology |
| Weißkopf, Tobias | Karlsruhe Institute of Technology |
| Neumann, Gerhard | Karlsruhe Institute of Technology |
Keywords: Robotics and Automation in Construction, Reinforcement Learning
Abstract: Autonomous control of hydraulic excavators is challenging due to complex contact dynamics with granular materials, actuator lag, and stability constraints. We present a reinforcement learning approach for robustly grasping and lifting rocks while maintaining machine stability under realistic physical constraints. Using Proximal Policy Optimization, we train an agent entirely in a high-fidelity AGX Dynamics simulation environment that accurately captures excavator dynamics and soil interactions. Through domain randomization and curriculum learning, the agent acquires coordinated control policies without manually crafted heuristics. The learned policy achieves a 91% success rate across randomized soil parameters, rock positions, and joint configurations, and 98% under fixed conditions. These results demonstrate the potential of reinforcement learning for automating complex operations with heavy construction machinery.
|
| |