RO-MAN 2025 Program | Tuesday August 26, 2025


TuOC Plenary Session, Blauwe Zaal	Add to My Program
Plenary Opening 8: 30-9: 30

Chair: Barakova, Emilia I.	Eindhoven University of Technology


TuP1L Plenary Session, Blauwe Zaal	Add to My Program
Keynote Speech by Prof. Lynne Baillie 9: 30-10: 30

Chair: Nejat, Goldie	University of Toronto


TuAM1_BR Voorhof/Hall	Add to My Program
Coffee Break (Tu1) 10: 30-10: 50


TuAT1 Regular Session, Auditorium 1	Add to My Program
Machine Learning and Adaptation I

Chair: Tozadore, Daniel	University College London (UCL)
Co-Chair: Abdulazeem, Nourhan	University of Waterloo

10:50-11:02, Paper TuAT1.1	Add to My Program
Deep Learning for Recognition of Object Manipulation Hand Gestures Using Wearable Sensor Data

Sharples, Jack (University of Bath), Martinez-Hernandez, Uriel (University of Bath)
Keywords: Detecting and Understanding Human Activity Abstract: In this paper, we present a technique for recognising hand gestures using data obtained from wearable sensors. A Neural Network takes three input streams - accelerometer data, electromyography (EMG) data, and joint angle data - from sensors affixed to the user’s arm and estimates the gesture the user is performing based on the incoming data. Our proposed method handles each of the three input streams separately using multiple convolution and Long-Short Term Memory (LSTM) layers before concatenating them together and passing them through 3 layers of fully connected neurons. This work was trained on a subset of Database 2 (DB2) Exercise 3 from the Ninapro dataset, composed of data collected from 40 participants performing 23 gestures. This work achieved a classification accuracy of 82.11%. As this falls below the classification accuracy of established literature on similar data, we aim to improve its performance by training and validating the model with a larger quantity of training data.

11:02-11:14, Paper TuAT1.2	Add to My Program
CHARM: Considering Human Attributes for Reinforcement Modeling

Fang, Qidi (Tufts University), YU, HANG (Tufts University), Fang, Shijie (Tufts University), Huang, Jindan (Tufts University), Chen, Qiuyu (Tufts University), Aronson, Reuben (Tufts University), Short, Elaine Schaertl (Tufts University)
Keywords: Social Learning and Skill Acquisition Via Teaching and Imitation, Machine Learning and Adaptation Abstract: Reinforcement Learning from Human Feedback has recently achieved significant success in various fields, and its performance is highly related to feedback quality. While much prior work acknowledged that human teachers' characteristics would affect human feedback patterns, there is little work that has closely investigated the actual effects. In this work, we designed an exploratory study investigating how human feedback patterns are associated with human characteristics. We conducted a public space study with two long-horizon tasks and 46 participants. We found that feedback patterns are not only correlated with task statistics, such as rewards, but also correlated with participants' characteristics, especially robot experience and educational background. Additionally, we demonstrated that human feedback value can be more accurately predicted with human characteristics compared to only using task statistics. All human feedback and characteristics we collected, and codes for our data collection and predicting more accurate human feedback are available at url{https://github.com/AABL-Lab/CHARM}.

11:14-11:26, Paper TuAT1.3	Add to My Program
Multimodal Transfer Learning for Privacy in Human Activity Recognition

Rolfsjord, Sigmund (Norwegian Defence Research Establishment), von Arnim, Hugh Alexander (University of Oslo), Fatima, Safia (Simula Research Laboratory), Baselizadeh, Adel (University of Oslo (UiO))
Keywords: Detecting and Understanding Human Activity, Machine Learning and Adaptation, Multi-modal Situation Awareness and Spatial Cognition Abstract: Human Activity Recognition (HAR) models often rely on small, specialized datasets, limiting their generalizability. In addition, many systems rely on privacy-invasive RGB video as their primary sensing modality. This choice raises ethical concerns, especially in health- and home-care robotics, where patient privacy is paramount. In this study, we evaluate transfer learning as a method to improve HAR generalizability across RGB, IMU, and depth imaging modalities while assessing how privacy-preserving modalities can compensate for a lack of RGB video in multimodal learning contexts. We train a feature-fusion model on aggregated HAR datasets, leveraging pretrained backbones for each modality, and compare it to a general multimodal model pretrained on non-HAR datasets. Evaluating on the PriMA-Care privacy-focused dataset across combinations of modalities, we find that the general model outperforms the HAR model, with best accuracies of 98.29% for the general model with RGB + IMU and 94.97% for the HAR-specific model with RGB + depth. Analysis shows that the general model more readily identifies individuals from RGB input, while IMU and depth better preserve privacy with a small accuracy loss (5%).

11:26-11:38, Paper TuAT1.4	Add to My Program
Robust Optimal Motion Planning for Nonlinear Systems in the Context of Neural Abstraction

Farid, Yousef (University of Pisa), Baracca, Marco (University of Pisa), Simonini, Giorgio (Università Di Pisa), Salaris, Paolo (University of Pisa)
Keywords: Machine Learning and Adaptation, Motion Planning and Navigation in Human-Centered Environments Abstract: Robotic automation provides a great contribution to increase efficiency and precision of several tasks. However, most of the solutions used rely on the perfect knowledge of the system's model. While this requirement could be easily fulfilled in industrial setup, this cannot be assumed as true in more general scenarios, like daily living environment. To this aim, in this study, we present a novel theoretical framework for optimal motion planning in unknown nonlinear dynamical systems using neural abstraction. The proposed approach establishes an ε-approximate simulation relation, leveraging deep neural networks to create an abstracted representation of the system with quantified approximation errors. The state space is partitioned into polyhedral regions using neural networks with ReLU activation functions, resulting in piecewise linear dynamical models. A hybrid control scheme combining Tube Model Predictive Control and Sliding Mode Control is incorporated into this framework to generate optimal trajectories and control signals while ensuring stability in the presence of unknown external disturbances and model inaccuracies. Its effectiveness is demonstrated through a case study on object manipulation using a Franka arm equipped with a Pisa/IIT SoftHand gripper in a ROS/Gazebo simulation environment.

11:38-11:50, Paper TuAT1.5	Add to My Program
Robot Delivery of Actionable Counterfactual Neural Network Explanations: Results in Group Perception

Narcomey, Austin (Yale University), Claure, Houston (Yale University), Vázquez, Marynel (Yale University)
Keywords: Machine Learning and Adaptation, Multi-modal Situation Awareness and Spatial Cognition Abstract: Ensuring transparency in robot’s decision-making is increasingly important as they become better at making complex decisions when collaborating with humans. Among the most promising approaches in Human-Robot Interaction (HRI) is the use of counterfactual explanations. Within HRI, counterfactual explanations typically provide insight into robot's models by presenting changes to the inputs of the model that influence policy decisions and resulting outcomes. While prior work presents counterfactuals that may vary features that users cannot control, we focus on actionable counterfactuals that show how the robot responds to feasible user actions and thereby reveal only what users need to understand to interact effectively. We introduce a novel explanation framework that generates actionable counterfactuals for neural network models in HRI applications. We evaluate this framework in simulation and on live sensor data during an in-person demonstration with groups of participants. Our results highlight the value of each component of our framework and demonstrate its effectiveness in real-time robotic explanations.


TuAT2 Regular Session, Auditorium 2	Add to My Program
Child - Robot Interaction I

Chair: Sandygulova, Anara	Nazarbayev University
Co-Chair: D'Arco, Luigi	University of Naples Federico II

10:50-11:02, Paper TuAT2.1	Add to My Program
Designing Cognitively-Aware Psychomotor Intelligent Tutoring Systems Via Multi-Objective and Human-Centric Optimization

Yuh, Madeleine (Purdue University), Jain, Neera (Purdue University)
Keywords: Monitoring of Behaviour and Internal States of Humans, Detecting and Understanding Human Activity, User-centered Design of Robots Abstract: In this work, we present a methodology for integrating multiple tutoring objectives into model-based control policies for a psychomotor intelligent tutoring system (ITS). While tutoring objectives, like increasing self-efficacy, are well defined in the learning literature, realizing these types of objectives in an algorithmic framework is challenging. Motivated by the role of self-efficacy in psychomotor learning, we employ a Markov Decision Process (MDP) framework to train probabilistic models of self-confidence, workload, and learning stage. A custom reward function combines three key objectives: calibrating self-confidence with learning stage, calibrating workload to learning demands, and advancing learners from novice toward expert. Using this framework, we train two optimal tutoring policies—one prioritizing cognitive state calibration and the other focusing on learning progression. Using these two policies as test case scenarios, we conduct a between-subjects study and demonstrate how one can leverage the tunability of the multi-objective reward function to achieve desired learning outcomes.

11:02-11:14, Paper TuAT2.2	Add to My Program
Age-Related Differences in Children's Spontaneous Gesturing with a Robot versus Human Instructor

Wilson, Jason (Franklin & Marshall College), Langer, Allison (Temple University), Howard, Lauren (Franklin and Marshall College), Marshall, Peter (Temple University)
Keywords: Child-Robot Interaction, Non-verbal Cues and Expressiveness, Robots in Education, Therapy and Rehabilitation Abstract: Research on gestures in human-robot interaction has largely focused on finding that children may learn better and enjoy interacting with robots that gesture more often. However, no research to date has examined how children themselves spontaneously gesture in the presence of a human vs. robot instructor. A child’s use of gesture might be indicative of engagement or rapport with the robot instructor and may provide key information about a robot instructor’s efficacy or opportunities for intervention. As such, the current study examines 5-8-year-old children’s rate of deictic and conventional gestures when being assisted by a robot vs. human instructor. Overall, we find age-related effects in children’s gestures relation to the specific instructors. There is a significant negative correlation between age and gesture rate when learning from the human instructor, but no significant correlation with the robot instructor. These results are discussed in relation to children’s perceptions of the instructor, task difficulty, and age-related cognitive development shifts.

11:14-11:26, Paper TuAT2.3	Add to My Program
Designing Robot-Mediated Phonological Awareness Activities: Child-Centered Approach and Kindergarten Integration

Azarfar, Jasmin (Karlsruhe Institute of Technology (KIT)), Norman, Utku (Karlsruhe Institute of Technology (KIT)), Rudenko, Irina (Karlsruhe Institute of Technology (KIT)), Bruno, Barbara (Karlsruhe Institute of Technology (KIT))
Keywords: Child-Robot Interaction, Long-term Experience and Longitudinal HRI Studies, Robots in Education, Therapy and Rehabilitation Abstract: Long-term child-robot interaction (CRI) in kindergartens is key for integrating robots into children’s daily lives and supporting their development. This unsupervised 4-weeks study explores how the robot Pepper can support phonological awareness (PA) in kindergarten children through the 'Sound Game', leveraging robot logs, questionnaires and open feedback by teachers and experts' analysis of audio recordings. Our findings reveal a generally high engagement, with the game being played 12 times, for a total of 233 minutes. Teachers played a crucial role in mediating the game, especially for younger children, and noted its potential for long-term use. The game proved beneficial across age groups, highlighting its potential as a first step for sustained, effective PA development and the transformative impact of social robots in early childhood education.

11:26-11:38, Paper TuAT2.4	Add to My Program
Robot-Led Vision Language Model Wellbeing Assessment of Children

Abbasi, Nida Itrat (University of Cambridge), Dogan, Fethiye Irmak (University of Cambridge), Laban, Guy (University of Cambridge), Anderson, Joanna (University of Cambridge), Ford, Tamsin (University of Cambridge), Jones, Peter B. (University of Cambridge), Gunes, Hatice (University of Cambridge)
Keywords: Applications of Social Robots, Child-Robot Interaction Abstract: This study presents a novel robot-led approach to assessing children’s mental wellbeing using a Vision Language Model (VLM). Inspired by the Child Apperception Test (CAT), the social robot NAO presented children with pictorial stimuli to elicit their verbal narratives of the images, which were then evaluated by a VLM in accordance with CAT assessment guidelines. The VLM’s assessments were systematically compared to those provided by a trained psychologist. The results reveal that while the VLM demonstrates moderate reliability in identifying cases with no wellbeing concerns, its ability to accurately classify assessments with wellbeing concerns remains limited. Moreover, although the model’s performance was generally consistent when prompted with varying demographic factors such as age and gender, a significantly higher false positive rate was observed for girls, indicating potential sensitivity to gender attribute. These findings highlight both the promise and the challenges of integrating VLMs into robot-led assessments of children's wellbeing.

11:38-11:50, Paper TuAT2.5	Add to My Program
Building Friendships across Borders: The Role of Social Robot Haru in Children Group Communication and Connection Development

Yi, Zhennan (Luddy School of Informatics, Computing, and Engineering, Indiana), Levinson, Leigh (Indiana University), Delgado-Chaves, Diego (Universidad Pablo De Olavide, Seville), Perez-Moleron, Jose Manuel (Universidad Pablo De Olavide), Bougria, Nabil (Universidad Pablo De Olavide, Seville), Krummheuer, Antonia (Aalborg University), Rehm, Matthias (Aalborg University), Kalsgaard Møller, Anders (Aalborg University), Kielsholm Ramsgaard, Katrine (Aalborg University), Auala, Selma (Namibia University of Science and Technology), Winschiers-Theophilus, Heike (Namibia University of Science and Technology), Nepolo, Edward (Namibia University of Science and Technology), Calero, David (Eurecat, Centre Tecnològic De Catalunya), Dal Moro, Devis (University of Trento), Serrano, Daniel (Eurecat, Technology Centre of Catalonia), Dalmau-Moreno, Magí (Eurecat Technology Centre), Gomez, Randy (Honda Research Institute Japan Co., Ltd), Merino, Luis (Universidad Pablo De Olavide), Sabanovic, Selma (Indiana University Bloomington)
Keywords: Child-Robot Interaction, Robot Companions and Social Robots, Novel Interfaces and Interaction Modalities Abstract: Forming friendship with peers from diverse backgrounds is key to children's social emotional development. In this study, we explored the use of social robot, Haru, as mediator for remote communication in children group, to support connection and friendship building. We invited children from different countries aged from 10 to 15 to participate in two interaction sessions with peers from other countries, after which we conducted interviews with children from three countries, focusing on their experiences, and perceptions of the robot's roles in the process. The findings indicated that social robot Haru effectively served as an icebreaker and entertainer; However, improvements are needed in conversation flow, transitions between different roles, and supporting children's autonomy in guiding the conversation and the depth of their communication.


TuAT3 Regular Session, Auditorium 3	Add to My Program
Assistive Robotics I

Chair: Rosenthal-von der Pütten, Astrid Marieke	RWTH Aachen University
Co-Chair: Nipatphonsakun, Kawinna	Kanazawa University

10:50-11:02, Paper TuAT3.1	Add to My Program
Design and Preliminary Evaluation of a Walker-Mounted Robotic System for Elderly Toilet Dressing Assistance

Unde, Jayant (Nagoya University), Urata, Taisei (Nagoya University), Kamei, Shinnosuke (Nagoya University), Ito, Yahiro (Nagoya University), Kihara, Ryusei (Nagoya University), Colan, Jacinto (Nagoya University), Hasegawa, Yasuhisa (Nagoya University)
Keywords: Assistive Robotics, Innovative Robot Designs, Human Factors and Ergonomics Abstract: This paper presents the design, development, and preliminary evaluation of a robotic dressing assistance system integrated into a mobile walker, RoboSnail, to support frail elderly individuals during toileting. The system features telescopic linear arms and adaptable roller grippers that automate the lowering and raising of trousers and underwear, addressing challenges in confined restroom environments. The compact design ensures unobstructed user mobility when not in use, while safety and adaptability are prioritized through mechanisms such as adaptable roller grippers and pivoting arms. Preliminary experiments demonstrated reliable trouser-lowering performance with a 95% success rate. Future improvements will focus on enhancing gripper adaptability, expanding stroke length, and conducting user trials to validate the system’s usability and effectiveness. This work represents a step toward autonomous toileting solutions that enhance independence, privacy, and quality of life for older adults with mild mobility impairments.

11:02-11:14, Paper TuAT3.2	Add to My Program
Evaluating an Assistive Robot for the Detection of Urinary Tract Infections

Nault, Emilyann (Heriot-Watt University & University of Edinburgh), Bettosi, Carl (Heriot-Watt University), JAGADEESAN, VIJAYA BALAN (Heriot Watt University), Baillie, Lynne (Heriot-Watt University)
Keywords: Assistive Robotics, Monitoring of Behaviour and Internal States of Humans, Multimodal Interaction and Conversational Skills Abstract: Urinary Tract Infections (UTIs) are highly prevalent and, if left untreated, can lead to serious health complications. While urine testing is an effective diagnostic tool, it is often conducted late in the infection's progression. As part of a broader project exploring technology-driven approaches to early UTI detection, we introduce an assistive robot prototype designed to facilitate daily interactions with users. This robot gathers a range of information that may serve as potential early indicators of UTIs, through a three-tiered interaction framework, increasing in intensity from light conversational topics to structured clinical questionnaires. To assess the feasibility and effectiveness of this approach, we conduct a user study in a psueudo-home environment involving 15 individuals with a history of UTIs, spanning 55 interactive sessions. Findings reveal promising levels of engagement and acceptance and insight on our LLM-driven interaction as a feasible approach. We provide key next steps for improvement towards our long-term study in-situ, highlighting the need for a more robust conversational strategy and promoting agency through user-initiated interactions.

11:14-11:26, Paper TuAT3.3	Add to My Program
Attitudes towards Humanoid Robots for In-Home Assistance

Radka, Basia (University of Washington), Layne, Evolone (University of Washington), Cakmak, Maya (University of Washington)
Keywords: User-centered Design of Robots, Assistive Robotics, Anthropomorphic Robots and Virtual Humans Abstract: Humanoid robots are the latest bet of the robotics community in advancing ways robots carry out a large variety of tasks that generate profit or increase quality of life for people. While their capabilities might extend to assistive care tasks, such as feeding, dressing, or household tasks, it is unclear if people are comfortable with having humanoid robots in their homes assisting with those tasks. In this paper we explore people’s attitudes towards assistive humanoid robots in the the home. We present two questionnaire studies, with 76 total participants, in which people are shown imaginary images of humanoid robots performing assistance tasks in the home, along with special purpose robot alternatives. Participants are asked to rate and compare robots in the context of eight different tasks and share their reasoning. The second study also shows participants pictures of advertised humanoid robot both without any context and in the context of in-home assistance tasks, and asks their opinions about these robots. Our findings indicate that people prefer special purpose robots over humanoids in most cases and their preferences vary by task. Although people think that humanoids are acceptable for assistance with some tasks, they express concerns about having them in their homes.

11:26-11:38, Paper TuAT3.4	Add to My Program
Step by Step: Enhancing Gait Analysis with Sensor-Equipped Robotic Platforms

Sorrentino, Alessandra (University of Florence), Pagliacci, Vanessa (University of Florence), Fiorini, Laura (University of Florence), Cavallo, Filippo (University of Florence)
Keywords: Assistive Robotics, Robots in Education, Therapy and Rehabilitation, Motion Planning and Navigation in Human-Centered Environments Abstract: Neurodegenerative diseases often result in pathological gait patterns, reducing mobility, stability, and overall functional capabilities. Given their impact on older adults' quality of life, early and accurate diagnosis is crucial for timely intervention. Traditional gait assessment technologies present some limitations related to low portability levels and user comfort. In this context, Socially Assistive Robots (SARs) offer an alternative by enabling non-intrusive gait monitoring while also supporting professional caregivers with objective measurements of users' motor performance. This study investigates the feasibility of using a mobile robotic platform to extract and analyze digital biomarkers related to gait activity. A novel pipeline was developed to automatically detect gait parameters from laser sensor data, segment the gait cycle, and compare these measurements against inertial measurement unit (IMU) data, which is the widely used approach. Results demonstrate a strong correlation (CI > 0.7) between laser-derived and IMU-based temporal gait parameters. However, discrepancies in step length measurements suggest that laser-based tracking provides more precise spatial information than IMU estimations. Additionally, this study explores the influence of the robotic platform on gait performance. Findings indicate that users walk faster when the robot is absent, despite its position behind them and out of sight. This suggests an unconscious adaptation to the robot’s presence, aligning with previous studies on human-robot interaction.

11:38-11:50, Paper TuAT3.5	Add to My Program
Personalized Communication of Socially Assistive Robots for Older Adults: A Perspective on Explicit, Implicit, Individual, and Group-Level Approaches

Hofstede, Bob Matthias (Vilans), Ipakchian Askari, Sima (Vilans), Cuijpers, Raymond (Eindhoven University of Technology), IJsselsteijn, Wijnand (Technische Universiteit Eindhoven), Nap, Henk Herman (Vilans)
Keywords: Robot Companions and Social Robots, Linguistic Communication and Dialogue, Assistive Robotics Abstract: As populations are ageing, innovative solutions are needed to support older adults in living independently. Socially Assistive Robots (SARs) hold significant potential, but their adoption in real-world healthcare settings remains limited, partially due to low user acceptance and diverse end-user needs. Therefore, personalization might be an interesting approach to address these user needs and increase user acceptance. This perspective paper presents a targeted synthesized analysis of findings from four prior studies conducted by the authors, to explore the role of personalized communication in SARs and its impact on user experience. The analysis reveals that both explicit and implicit personalization strategies enhance user engagement, acceptance, and communication effectiveness in SARs. It was found that explicit personalization primarily involves adjustments such as customizing speech characteristics, while implicit personalization focuses more on tailoring message content (e.g., based on individual interests of older adults). By linking empirical findings to existing literature, this paper provides a novel perspective on personalization approaches in SAR design, emphasizing the importance of both individual and group-level explicit and/or implicit adaptations to improve the user experience of SARs for older adults.


TuAT4 Regular Session, Blauwe Zaal	Add to My Program
Applications of Social Robots I

Chair: Wang, Fan	Eindhoven University of Technology
Co-Chair: Claure, Houston	Yale University

10:50-11:02, Paper TuAT4.1	Add to My Program
Towards Multi-Modal Learning by Demonstration: Combining Depthless Visual Data with Kinesthetic Teaching

Papageorgiou, Dimitrios (Hellenic Mediterranean University), Athanasiadis, Charalampos (Hellenic Mediterranean University), Fasoulas, John (Hellenic Mediterranean University), Sfakiotakis, Michael (Hellenic Mediterranean University)
Keywords: Programming by Demonstration, Multi-modal Situation Awareness and Spatial Cognition Abstract: Human-human learning is usually conducted utilizing multiple modalities (verbal, visual, kinesthetic etc), which is proven to significantly improve the quality of communication between the teacher and the learner. Towards extending this idea to human-robot learning, we propose here a multi-modal scheme for robot Learning by Demonstration (LbD), that combines visual observation of the human hand from a depthless RGB camera, with enhanced direct kinesthetic teaching of the robot. The proposed method, exploits the visually perceived action of the human hand, to provide haptic guidance during kinesthetic teaching, facilitating corrections along the axes with the maximum uncertainty. The method is experimentally evaluated using an RGB camera and a UR10e robot, and compared against single-modal approaches, i.e. visually trained behavior and trained through kinesthetic teaching. Our method is shown to outperform the latter, in terms of both the quality of skill transfer, and the cognitive/physical load required for the training.

11:02-11:14, Paper TuAT4.2	Add to My Program
Conversations with Andrea: Visitors' Opinions on Android Robots in a Museum

Heisler, Marcel (Hochschule Der Medien Stuttgart), Becker-Asano, Christian (Stuttgart Media University)
Keywords: Androids, Applications of Social Robots, Linguistic Communication and Dialogue Abstract: The android robot Andrea was set up at a public museum in Germany for six consecutive days to have conversations with visitors, fully autonomously. No specific context was given, so visitors could state their opinions regarding possible use-cases in structured interviews, without any bias. Additionally the 44 interviewees were asked for their general opinions of the robot, their reasons (not) to interact with it and necessary improvements for future use. The android's voice and wig were changed between different days of operation to give varying cues regarding its gender. This did not have a significant impact on the positive overall perception of the robot. Most visitors want the robot to provide information about exhibits in the future, while opinions on other roles, like a receptionist, were both wanted and explicitly not wanted by different visitors. Speaking more languages (than only English) and faster response times were the improvements most desired. These findings from the interviews are in line with an analysis of the system logs, which revealed, that after chitchat and personal questions, most of the 4436 collected requests asked for information related to the museum and to converse in a different language. The valuable insights gained from these real-world interactions are now used to improve the system to become a useful real-world application.

11:14-11:26, Paper TuAT4.3	Add to My Program
Studying Disagreement in Grasp Intentions of Givers and Receivers Studying Disagreement in Grasp Intentions of Givers and Receivers Engaging in Human-Human Handover

Wiederhold, Noah (Clarkson University), Banerjee, Sean (Wright State Univeristy), Banerjee, Natasha Kholgade (Wright State University)
Keywords: Monitoring of Behaviour and Internal States of Humans, Curiosity, Intentionality and Initiative in Interaction, Detecting and Understanding Human Activity Abstract: Successful handover of objects between two agents---two humans, or a human and a robot---plays an important role in ensuring success of larger-scale collaborative physical interactions between the agents. Handovers are likely to be seen as successful if the intentions of the agents participating in the handover are met, e.g., if the receiver intends to hold the object at the location presented by the giver. In this work, we analyze participant responses on giver and receiver comfort, giver's intention being met on receiver's receipt grasp, and receiver's intentions being met on giving and receipt grasp from a dataset of 1,632 handovers performed by 48 human giver-receiver dyads using 204 objects. Our findings show evidence of misalignment in intentions in unprompted handovers. In 19.47% or nearly a fifth of the analyzed grasps, the receiver would not have performed the giving grasp where the giver did. In 8.66% and 10.20%, i.e., nearly an eleventh and a tenth of cases, the receipt grasp was not in alignment with the giver and the receiver's intentions respectively. Our findings show the need for grasp and manipulation algorithms for robots engaging in handover to be aware of object structure and expected behavior to minimize misalignment in intentions for successful handover.

11:26-11:38, Paper TuAT4.4	Add to My Program
Investigating Robot Behaviors to Resolve Navigation Blocks

PARK, JISUN (Naver Labs Europe), Willamowski, Jutta (Naver Labs Europe), Colombino, Tommaso (Naver Labs Europe), Gallo, Danilo (Naver Labs Europe)
Keywords: Non-verbal Cues and Expressiveness, Motion Planning and Navigation in Human-Centered Environments, Multimodal Interaction and Conversational Skills Abstract: Various strategies have been explored for robots to prevent navigation blocks. However, such blocks may still happen and robots then need to resolve them. Blocks may happen either on the robot’s path or target location and may be caused either by human or object obstacles. In this paper we explore the design of robot behaviors to resolve such navigation blocks by asking humans for help. These behaviors combine different communication modalities in steps with increasing urgency. We focus on robots with limited sensing capabilities and present findings from an in-person experiment evaluating these behaviors. Our findings illustrate that humans can be more easily engaged to solve blocks caused by themselves rather than by third party objects. They also highlight the complexity of having robots with limited sensing capabilities successfully enact sequential interactions with their bystanders.

11:38-11:50, Paper TuAT4.5	Add to My Program
Adaptive Gait Pattern Switching under External Disturbances Using Multi-Modal MPC

Kang, Jeonguk (Samsung Research, Samsung Electronics), Ham, Byeong-Il (KAIST), Han, Seungho (Korea Advanced Institute of Science and Technology (KAIST)), Kim, Hyun-Bin (KAIST), Kim, Kyung-Soo (KAIST(Korea Advanced Institute of Science and Technology))
Keywords: Anthropomorphic Robots and Virtual Humans, Cooperation and Collaboration in Human-Robot Teams, Motion Planning and Navigation in Human-Centered Environments Abstract: This study presents the development of a gait selection strategy for quadrupedal robots capable of responding to external forces. Unlike traditional controllers, our approach directly utilizes information about external forces and introduces a strategy to adapt the robot's gait patterns accordingly. To simultaneously solve and compare various gait types, we propose a multimodal Model Predictive Control (MPC) framework. This framework dynamically adjusts four distinct gaits-trotting, bounding, pacing, and walking—as well as the contact time with the ground in real-time. The results demonstrate that this approach improves the robot's ability to withstand disturbances while interacting with external environments, significantly enhancing operational performance and stability.


TuAT5 Regular Session, Auditorium 5	Add to My Program
Computational Architectures and Cognitive Systems I

Chair: Ayub, Ali	Concordia University

10:50-11:02, Paper TuAT5.1	Add to My Program
LM-MCVT: A Lightweight Multi-Modal Multi-View Convolutional-Vision Transformer Approach for 3D Object Recognition

Xiong, Songsong (University of Groningen), Kasaei, Hamidreza (University of Groningen)
Keywords: Assistive Robotics Abstract: In human-centered environments such as restaurants, homes, and warehouses, robots often face challenges in accurately recognizing 3D objects. These challenges stem from the complexity and variability of these environments, including diverse object shapes. In this paper, we propose a novel Lightweight Multi-modal Multi-view Convolutional-Vision Transformer network (LM-MCVT) to enhance 3D object recognition in robotic applications. Our approach leverages the Globally Entropy-based Embeddings Fusion (GEEF) method to integrate multi-views efficiently. The LM-MCVT architecture incorporates pre- and mid-level convolutional encoders and local and global transformers to enhance feature extraction and recognition accuracy. We evaluate our method on the synthetic ModelNet40 dataset and achieve a recognition accuracy of 95.6% using a four-view setup, surpassing existing state-of-the-art methods. To further validate its effectiveness, we conduct 5-fold cross-validation on the real-world OmniObject3D dataset using the same configuration. Results consistently show superior performance, demonstrating the method’s robustness in 3D object recognition across synthetic and real-world 3D data.

11:02-11:14, Paper TuAT5.2	Add to My Program
Generalization of Machine and Deep Learning Models for Brain-Computer Interfaces across Sessions and Paradigms in a Completely Locked-In Patient

Garrote, Luís Carlos (Institute of Systems and Robotics, University of Coimbra), Bettencourt, Rute (Institute of Systems and Robotics, University of Coimbra), Perdiz, João (University of Coimbra), Pires, Gabriel (University of Coimbra), Nunes, Urbano J. (Instituto De Sistemas E Robotica)
Keywords: Machine Learning and Adaptation, Monitoring of Behaviour and Internal States of Humans, Medical and Surgical Applications Abstract: Brain-Computer Interfaces (BCIs) are one of the few remaining communication options for individuals in a Completely Locked-In State (CLIS), where all voluntary motor functions are lost. However, decoding electroencephalographic (EEG) signals in CLIS is particularly challenging due to low signal-to-noise ratios, high intra- and inter-session variability, and cognitive fluctuations. In this study, we systematically evaluate classical and deep learning-based (DL) classification methods on a longitudinal P300-based BCI dataset acquired from a CLIS patient over ten months, comprising seven different stimulation paradigms. A systematic approach is followed to assess model generalization across BCI sessions and paradigms. Overall, more than 40 approaches are compared, including spatial filters for feature extraction with standard classifiers, as well as DL methods based on CNNs and Attention-based architectures. All methods are evaluated with raw input data and three different normalization strategies. Additionally, SMOTE data augmentation is applied to upsample the minority class. The results show high generalization performance across sessions and paradigms, with some approaches achieving nearly 100% performance. Normalization strategies significantly influence performance, while SMOTE often leads to performance degradation. These findings offer valuable insights for designing more robust BCI systems tailored to CLIS users, showing that collecting data across sessions and multiple BCI paradigms can improve BCI performance, while reducing or eliminating the need for per session calibration. Despite the very promising results, they are based on offline analysis. Thus, the best-performing approaches now require online validation for deployment in real-world CLIS scenarios.

11:14-11:26, Paper TuAT5.3	Add to My Program
Blending Participatory Design and Artificial Awareness for Trustworthy Autonomous Vehicles (I)

Tanevska, Ana (Uppsala University), Ratheesh Kumar, Ananthapathmanabhan (Uppsala University), Ghosh, Arabinda (Max Planck Institute for Software Systems), Casablanca, Ernesto (Newcastle University), Castellano, Ginevra (Uppsala University), Soudjani, Sadegh (Max Planck Institute for Software Systems)
Keywords: Computational Architectures, Ethical Issues in Human-robot Interaction Research, User-centered Design of Robots Abstract: Current robotic agents, such as autonomous vehicles (AVs) and drones, need to deal with uncertain real-world environments with appropriate situational awareness (SA), risk awareness, coordination, and decision-making. The SymAware project strives to address this issue by designing an architecture for artificial awareness in multi-agent systems, enabling safe collaboration of autonomous vehicles and drones. However, these agents will also need to interact with human users (drivers, pedestrians, drone operators), which in turn requires an understanding of how to model the human in the interaction scenario, and how to foster trust and transparency between the agent and the human. In this work, we aim to create a data-driven model of a human driver to be integrated into our SA architecture, grounding our research in the principles of trustworthy human-agent interaction. To collect the data necessary for creating the model, we conducted a large-scale user-centered study on human-AV interaction, in which we investigate the interaction between the AV's transparency and the users' behavior. The contributions of this paper are twofold: First, we illustrate in detail our human-AV study and its findings, and second we present the resulting Markov chain models of the human driver computed from the study's data. Our results show that depending on the AV's transparency, the scenario's environment, and the users' demographics, we can obtain significant differences in the model's transitions.

11:26-11:38, Paper TuAT5.4	Add to My Program
Enhancing Elder Cognitive Engagement with Adaptive Serious Games: A MAPE-K Framework for Personalized Human-Robot Interaction

Blanco, Antonio (Universidad De Extremadura), Condón, Alicia (Universidad De Extremadura), Rodríguez-Domínguez, Trinidad (Universidad De Extremadura), Núñez, Pedro (University of Extremadura)
Keywords: Applications of Social Robots, Human Factors and Ergonomics, Social Intelligence for Robots Abstract: The growing number of older adults worldwide has increased demand for innovative cognitive and social stimulation tools. While socially assistive robots with serious games show promise, traditional platforms often lack flexibility to adapt to users’ changing cognitive states and engagement levels. We propose a novel self-adaptive framework combining the MAPE-K (Monitor, Analyze, Plan, Execute, Knowledge) control loop with the CORTEX cognitive robotics architecture. Deployed on the EBOv2 robot, our system supports two serious games—“Simon Says” for attention and memory, and “Pasapalabras” (a word quiz) for semantic and language training. Specialized software agents in CORTEX adjust each game's structure and parameters using multimodal inputs, including user performance and emotional cues. Simulation-based evaluation with profiles based on cognitive models (from high capacity to mild impairment) shows the system’s adaptability and reliability. Latency stays well under the 200 ms threshold, ensuring seamless real-time reconfiguration. The adaptive difficulty mechanism adjusts challenge levels to user profiles, stabilizing within five sessions and improving engagement compared to static setups. High-capacity users progressed to advanced levels without frustration, while users with impairments benefited from gradual increases in difficulty, boosting task completion. These results show the technical feasibility and potential therapeutic value of agent-based robotic systems for personalized cognitive stimulation. Future work includes long-term studies with older adults in care settings to assess sustained cognitive and psychosocial benefits.

11:38-11:50, Paper TuAT5.5	Add to My Program
Towards a Cognitive Architecture to Enable Natural Language Interaction in Co-Constructive Task Learning

Scheibl, Manuel (University Bielefeld), Richter, Birte (University of Bielefield), Müller, Alissa (University of Bielefeld), Beetz, Michael (University of Bremen), Wrede, Britta (Bielefeld University)
Keywords: Multimodal Interaction and Conversational Skills, Social Learning and Skill Acquisition Via Teaching and Imitation, Linguistic Communication and Dialogue Abstract: This research addresses the question, which characteristics a cognitive architecture must have to leverage the benefits of natural language in Co-Constructive Task Learning (CCTL). To provide context, we first discuss Interactive Task Learning, the mechanisms of the human memory system, and the significance of natural language and multi-modality. Next, we examine the current state of cognitive architectures, analyzing their capabilities to inform a concept of CCTL grounded in multiple sources. We then integrate insights from various research domains to develop a unified framework. Finally, we conclude by identifying the remaining challenges and requirements necessary to achieve CCTL in Human-Robot Interaction.


TuAT6 Regular Session, Auditorium 6	Add to My Program
Sound Design for Robots

Chair: Nakadai, Kazuhiro	Institute of Science Tokyo

10:50-11:02, Paper TuAT6.1	Add to My Program
The Role of Voice and Appearance in Gender Perception of Speaking Robots

van Veen, Sjoerd (University of Twente), Willemse, Cesco (University of Twente), Garcia Goo, Hideki (University of Twente), Truong, Khiet (University of Twente)
Keywords: Sound design for robots, Multimodal Interaction and Conversational Skills, Non-verbal Cues and Expressiveness Abstract: To enhance the design of gender-ambiguous speaking robots, designers could benefit from more insights into how people integrate robot appearance and voice in the gender perception of robots. In an online survey, we presented audio-only, visual-only, and audiovisual displays of gendered (and gender-ambiguous) robots and asked participants to rate the perceived level of femininity, masculinity, and gender-ambiguity of these robots. We investigated how the addition of an embodiment feature (voice, robot appearance) or gender (feminine, masculine, ambiguous) affected the perceived gender of speaking robots. Results showed a complex interplay between the variables under study: the magnitude and direction of effect on perceived gender are dependent on the embodiment feature, the gender of the added feature, and the gender identity of the basis that was added to. Furthermore, we found that the addition of a voice to a gender-ambiguous identity has more impact than the addition of a robot appearance on the perceived gender-ambiguity by lowering perceived ambiguity. For the design of gender-ambiguous robots, efforts should focus on making the appearance more ambiguous to counterbalance the gendered effect of voice.

11:02-11:14, Paper TuAT6.2	Add to My Program
Auditory Localization and Assessment of Consequential Robot Sounds: A Multi-Method Study in Virtual Reality

Wessels, Marlene (Johannes Gutenberg-University Mainz), de Heuvel, Jorge (University of Bonn), Müller, Leon (Chalmers University of Technology), Maier, Anna Luisa (Johannes Gutenberg-University Mainz), Bennewitz, Maren (University of Bonn), Kraus, Johannes (Johannes-Gutenberg University of Mainz)
Keywords: Sound design for robots, Human Factors and Ergonomics Abstract: Mobile robots increasingly operate alongside humans but are often out of sight, so that humans need to rely on the sounds of the robots to recognize their presence. For successful human-robot interaction (HRI), it is therefore crucial to understand how humans perceive robots by their consequential sounds, i.e., operating noise. Prior research suggests that the sound of a quadruped Go1 is more detectable than that of a wheeled Turtlebot. This study builds on this and examines the human ability to localize consequential sounds of three robots (quadruped Go1, wheeled Turtlebot 2i, wheeled HSR) in Virtual Reality. In a within-subjects design, we assessed participants' localization performance for the robots with and without an acoustic vehicle alerting system (AVAS) for two velocities (0.3, 0.8 m/s) and two trajectories (head-on, radial). In each trial, participants were presented with the sound of a moving robot for 3 s and were tasked to point at its final position (localization task). Localization errors were measured as the absolute angular difference between the participants' estimated and the actual robot position. Results showed that the robot type significantly influenced the localization accuracy and precision, with the sound of the wheeled HSR (especially without AVAS) performing worst under all experimental conditions. Surprisingly, participants rated the HSR sound as more positive, less annoying, and more trustworthy than the Turtlebot and Go1 sound. This reveals a tension between subjective evaluation and objective auditory localization performance. Our findings highlight consequential robot sounds as a critical factor for designing intuitive and effective HRI, with implications for human-centered robot design and social navigation.

11:14-11:26, Paper TuAT6.3	Add to My Program
Your Robot, My Voice: Enhancing Android Robot Likability through Personalization by Cloning the User's Voice

Kuch, Johanna Magdalena (Universität Augsburg), Heisler, Marcel (Hochschule Der Medien Stuttgart), Klein, Stina (Universität Augsburg), Mertes, Silvan (University of Augsburg), Eing, Lennart (Universität Augsburg), André, Elisabeth (Universität Augsburg), Becker-Asano, Christian (Stuttgart Media University)
Keywords: Sound design for robots, User-centered Design of Robots, Anthropomorphic Robots and Virtual Humans Abstract: This study investigates whether personalized voice cloning can improve a robot's likability compared to a design-congruent voice and a distinctly dissimilar voice. Participants interacted with a gender-ambiguous android robot in three different voice conditions. We compared: (1) a personalized voice clone based on the participant's voice, (2) a design-congruent voice matching the robot's appearance, and (3) a dissimilar voice, which differs from both the participant's and the robot's features. The cloned and design-congruent voices significantly increased likability compared to the dissimilar voice, while anthropomorphism and familiarity showed no significant differences across conditions. Most participants did not immediately recognize their cloned voice until informed that one of the voices was a clone. However, most of the participants were successful when asked to pick out their cloned voice from those used. We assume that voice personalization through similarity to the user improves likability even before the user is aware of this similarity. Our results show that personalized voice cloning is a simple alternative to other methods for the design of robotic voices. It significantly increases robot likability while requiring minimal user effort.

11:26-11:38, Paper TuAT6.4	Add to My Program
Do Re Mi Fa so Pass the Tool: Using Melodic Prediction to Improve Human-Robot Fluency

Rogel, Amit (Georgia Institute of Technology), Yang, Qiaoyu (Georgia Institute of Technology), Hayley, Jack (Georgia Institute of Technology), Weinberg, Gil (Georgia Inst. of Technology)
Keywords: Sound design for robots, Non-verbal Cues and Expressiveness, Novel Interfaces and Interaction Modalities Abstract: This paper investigates how melodic prediction can enhance synchronization and fluency in human-robot collaborative tasks. We leverage humans' natural ability to anticipate musical progressions to improve the fluency of handoff interactions. Through an experiment with 21 participants performing a dual-task scenario of sorting tiles while handing objects to a robot, we compared three sonification approaches: robot consequential sounds, musical scales, and musical melodies. Participants synchronized their handoffs based on the finale of the scale/melody, and aligned their actions with the leading tone rather than the final tonic note. Results demonstrate that both musical conditions significantly improved timing accuracy and enabled participants to better perform concurrent tasks compared to motor sounds alone. Melodies proved to be statistically significantly more consistent over repeated use of the stimuli, which shows that the diverse nature of tonal melodies can improve long term interactions without sacrificing performance. These findings suggest that the predictive qualities of tonal melodies provides an effective tool for anticipatory action.

11:38-11:50, Paper TuAT6.5	Add to My Program
A Hands-Free Interface for Disinfection During Tabletop Tasks

Sanchez, Alan Giovanny (Oregon State University), Miller, Matthew (Oregon State University), Smart, William D. (Oregon State University)
Keywords: Degrees of Autonomy and Teleoperation, Detecting and Understanding Human Activity, Multi-modal Situation Awareness and Spatial Cognition Abstract: In the wake of the global health crisis caused by the COVID-19 pandemic, there is a pressing need for innovative disinfection methods that are both effective and user-friendly to a broad user base. This paper introduces an approach that allows a user to instruct tasks to an ultraviolet (UV) disinfection robot via speech. The implementation of a voice interface offers a hands-free operation and caters to non-technical users who require a simple and effective way to command the robot. Through a combination of object recognition, natural language processing using a large language model (LLM), and task planning, our system can execute tasks more effectively since it is more aware of the context of its sanitizing duties.


TuAT7 Regular Session, Auditorium 7	Add to My Program
HRI and Collaboration in Manufacturing Contexts I

Chair: Schneider, Sebastian	University of Twente

10:50-11:02, Paper TuAT7.1	Add to My Program
What Should an Industrial Robot Record? Understanding Worker Perceptions Toward Privacy

Yankee, Tyler (Clarkson University), Kyrarini, Maria (Santa Clara University), Banerjee, Natasha Kholgade (Wright State University), Banerjee, Sean (Wright State Univeristy)
Keywords: Creating Human-Robot Relationships, HRI and Collaboration in Manufacturing Environments, Monitoring of Behaviour and Internal States of Humans Abstract: Modern robots found in the workplace include high-resolution color cameras, depth cameras, thermal sensors, and microphones, as well as onboard computing that enables generation of human body pose and facial landmarks that can be used to infer the physiological state of the worker. Prior research reveals that privacy concerns in the workplace tend higher towards robots than towards humans, and that robots that include privacy controllers are perceived to be more trustworthy. However, prior research lacks a systematic understanding of how workers in lower-income positions in industries such as warehousing and manufacturing perceive sensing technologies, such as color cameras and microphones, and inferred human data, such as body pose, in their concerns for privacy. In this paper, we use survey responses from 530 workers across four countries and six job domains to understand how workers perceive robots that record human activity through video, audio, or body pose. Our studies reveal that privacy concern grows with the amount of personally identifiable data that is available, i.e., workers show higher concern for video rather than body pose inferred from video. We find that Millennials show more concern about being recorded than Gen Z individuals or than Gen X individuals and older. We find that privacy concerns vary by the worker's country with lowest concerns in the United States and highest in Canada, and that worker concern is lowest in large-size cities and highest in suburban areas.

11:02-11:14, Paper TuAT7.2	Add to My Program
A Review of Personalisation in Human-Robot Collaboration and Future Perspectives towards Industry 5.0

Fant-Male, James (Tampere University), Pieters, Roel S. (Tampere University)
Keywords: HRI and Collaboration in Manufacturing Environments, Human Factors and Ergonomics, User-centered Design of Robots Abstract: The shift in research focus from Industry 4.0 (I4.0) to Industry 5.0 (I5.0) promises a human-centric workplace, with social and well-being values at the centre of technological implementation. Human-Robot Collaboration (HRC) is a core aspect of I5.0 development, with an increase in adaptive and personalised interactions and behaviours. This review investigates recent advancements towards personalised HRC, where user-centric adaptation is key. There is a growing trend for adaptable HRC research, however there lacks a consistent and unified approach. The review highlights key research trends on which personal factors are considered, the design of human-robot interaction, and adaptive task completion. This raises various key considerations for future developments, particularly around the ethical and regulatory development of personalised systems, which are discussed in detail.

11:14-11:26, Paper TuAT7.3	Add to My Program
Deliberative Layered Behavior Tree Approach for Real-Time Concurrent Decision-Making in Human-Robot Interaction for Assembly

Rodríguez-Guerra, Diego (SUPSI - ISTePS), Avram, Oliver (SUPSI-ISTePS), Baraldo, Stefano (Scuola Universitaria Professionale Della Svizzera Italiana), Zamboni, Mattia (SUPSI ISTePS), Valente, Anna (SUPSI-ISTePS)
Keywords: HRI and Collaboration in Manufacturing Environments, Computational Architectures, Cooperation and Collaboration in Human-Robot Teams Abstract: This paper presents a Layered Behavior Tree (BT) architecture for improving decision-making and responsiveness in collaborative assembly for manufacturing value chains. The aim of the proposed software architecture is to ensure safety while maintaining productivity. To this aim, a multi-layered software architecture is proposed with three different layers, to handle different aspects of the context: the State Interpreters for context awareness, a Mode Handler for operational mode transitions, and the Executors for advanced behavior execution. By modularizing the logic and the mode handling of the application, and utilizing a shared memory for real-time communication between the layers, this architecture addresses the limitations of traditional monolithic BTs, achieving faster and more reliable responses to critical events while adapting the production to both the operator and context status. The proposed Layered Behavior Tree approach has been validated on an aerospace assembly use case, where a rough 79% of reduction in the time to handle critical events has been appreciated in the measured results.

11:26-11:38, Paper TuAT7.4	Add to My Program
A Dual-Shear Ring End-Effector for Autonomous Pomegranate Harvesting

Ma, Peifeng (Xi'an Jiaotong University), Zhu, Aibin (Xi'an Jiaotong University), Mao, Han (Xi'an Jiaotong University), Li, Dangchao (Xi'an Jiaotong University，Practical Education Center (Eng), Xu, Rui (Xi'an Jiaotong University), Wang, Jing (Xi'an Jiaotong University), Zhang, Yu (Xi'an Jiaotong University), Li, Meng (Xi'an Jiaotong University), Song, Jiyuan (Guangming Laboratory, Guangdong Laboratory of Artificial Intelli), Tu, Yao (Guangdong Laboratory of Artificial Intelligence and Digital Econ), Wu, xue (Xi'an Jiaotong University), Dong, Xia (Xi'an Jiaotong University)
Keywords: Innovative Robot Designs, User-centered Design of Robots, Social Intelligence for Robots Abstract: Pomegranate harvesting remains a challenging task due to the fruit's tough stem, dense canopy, and sensitivity to mechanical damage. Traditional harvesting robots rely on vision-based stem localization, which increases computational complexity and reduces robustness in unstructured orchard environments. This paper presents a dual-shear ring end-effector designed to eliminate the need for precise stem detection, utilizing a self-locking shear mechanism that allows the stem to naturally align between the cutting blades. The system integrates a vision-assisted robotic manipulator for fruit detection and a torque regulation mechanism for optimized cutting force application. Experimental validation demonstrates a success rate of over 90% for stems up to 8 mm in diameter and robust performance even under partial and full occlusion conditions. The results confirm that the proposed system achieves efficient, adaptable, and damage-free harvesting, providing a viable solution for autonomous pomegranate harvesting.

11:38-11:50, Paper TuAT7.5	Add to My Program
LMPVC and Policy Bank: Adaptive Voice Control for Industrial Robots with Code Generating LLMs and Reusable Pythonic Policies

Parikka, Ossi (Tampere University), Pieters, Roel S. (Tampere University)
Keywords: Machine Learning and Adaptation, HRI and Collaboration in Manufacturing Environments Abstract: Modern industry is increasingly moving away from mass manufacturing, towards more specialized and personalized products. As manufacturing tasks become more complex, full automation is not always an option, human involvement may be required. This has increased the need for advanced human robot collaboration (HRC), and with it, improved methods for interaction, such as voice control. Recent advances in natural language processing, driven by artificial intelligence (AI), have the potential to answer this demand. Large language models (LLMs) have rapidly developed very impressive general reasoning capabilities, and many methods of applying this to robotics have been proposed, including through the use of code generation. This paper presents textit{Language Model Program Voice Control} (LMPVC), an LLM-based prototype voice control architecture with integrated policy programming and teaching capabilities, built for use with Robot Operating System 2 (ROS2) compatible robots. The architecture builds on prior works using code generation for voice control, by implementing an additional programming and teaching system, the textit{Policy Bank}. We find this system can compensate for the limitations of the underlying LLM, and allow LMPVC to adapt to different downstream tasks without a slow and costly training process. The architecture and additional results are released on GitHub (https://github.com/ozzyuni/LMPVC).


TuLU_BR Voorhof/Hall	Add to My Program
Lunch (Tu) 11: 50-12: 50


TuBT1 Regular Session, Auditorium 1	Add to My Program
Machine Learning and Adaptation II

Chair: Jullens, Monique Schaule	University of Amsterdam
Co-Chair: Khan, Nabeela	Institute of Science Tokyo

12:50-13:02, Paper TuBT1.1	Add to My Program
Maximizing Query Diversity for Terrain Cost Preference Learning in Robot Navigation

Sinclair, Jordan (University of Denver), Alabi, Elijah (University of Denver), Wigness, Maggie (U.S. Army Research Laboratory), Reily, Brian (Army Research Laboratory), Reardon, Christopher M. (MITRE)
Keywords: Machine Learning and Adaptation, Social Learning and Skill Acquisition Via Teaching and Imitation, Motion Planning and Navigation in Human-Centered Environments Abstract: Effective robot navigation in real-world environments requires an understanding of terrain properties, as different terrain types impact factors such as speed, safety, and wear on the platform. Preference-based learning offers a compelling framework in which terrain costs can be inferred through simple trajectory queries to the user. However, existing query selection methods often suffer from redundant selection due to limited trajectory diversity, as well as query ambiguity, where the user must choose between trajectories with minimal distinguishable differences. These issues lead to inefficient learning and suboptimal terrain cost estimation. In this paper, we introduce a joint optimization framework that enhances learning efficiency by improving both trajectory set diversity and the query selection strategy. We leverage a variational autoencoder (VAE) to encode and cluster trajectories based on their terrain characteristics, ensuring a balanced and representative query set. Additionally, we introduce a cluster-aware query selection mechanism that prioritizes diverse trajectory pairs pulled from distinct clusters to maximize information gain. Experimental results demonstrate that our approach significantly reduces the number of queries required to converge to the ground truth terrain cost assignment, outperforming state-of-the-art query selection techniques.

13:02-13:14, Paper TuBT1.2	Add to My Program
Personalized Robotic Object Rearrangement from Scene Context

Ramachandruni, Kartik (Georgia Institute of Technology), Chernova, Sonia (Georgia Institute of Technology)
Keywords: Machine Learning and Adaptation, Assistive Robotics Abstract: Object rearrangement is a key task for household robots requiring personalization without explicit instructions, meaningful object placement in environments occupied with objects, and generalization to unseen objects and new environments. To facilitate research addressing these challenges, we introduce PARSEC, an object rearrangement benchmark for learning user organizational preferences from observed scene context to place objects in a partially arranged environment. PARSEC is built upon a novel dataset of 110K rearrangement examples crowdsourced from 72 users, featuring 93 object categories and 15 environments. To better align with real-world organizational habits, we propose ContextSortLM, an LLM-based personalized rearrangement model that handles flexible user preferences by explicitly accounting for objects with multiple valid placement locations when placing items in partially arranged environments. We evaluate ContextSortLM and existing personalized rearrangement approaches on the PARSEC benchmark and complement these findings with a crowdsourced evaluation of 108 online raters ranking model predictions based on alignment with user preferences. Our results indicate that personalized rearrangement models leveraging multiple scene context sources perform better than models relying on a single context source. Moreover, ContextSortLM outperforms other models in placing objects to replicate the target user's arrangement and ranks among the top two in all three environment categories, as rated by online evaluators. Importantly, our evaluation highlights challenges associated with modeling environment semantics across different environment categories and provides recommendations for future work.

13:14-13:26, Paper TuBT1.3	Add to My Program
Challenges and Research Directions from the Operational Use of a Machine Learning Damage Assessment System Via Small Uncrewed Aerial Systems at Hurricanes Debby and Helene

Manzini, Thomas (Texas A&M), Perali, Priyankari (Texas A&M University), Merrick, David (Florida State University), Murphy, Robin (Texas A&M)
Keywords: Machine Learning and Adaptation, Computational Architectures Abstract: This paper details four principal challenges encountered with machine learning (ML) damage assessment using small uncrewed aerial systems (sUAS) at Hurricanes Debby and Helene that prevented, degraded, or delayed the delivery of data products during operations and suggests three research directions for future real-world deployments. The presence of these challenges is not surprising given that a review of the literature considering both datasets and proposed ML models suggests this is the first sUAS-based ML system for disaster damage assessment actually deployed as a part of real-world operations. The sUAS-based ML system was applied by the State of Florida to Hurricanes Helene (2 orthomosaics, 3.0 gigapixels collected over 2 sorties by a Wintra WingtraOne sUAS) and Debby (1 orthomosaic, 0.59 gigapixels collected via 1 sortie by a Wintra WingtraOne sUAS) in Florida. The same model was applied to crewed aerial imagery of inland flood damage resulting from post-tropical remnants of Hurricane Debby in Pennsylvania (436 orthophotos, 136.5 gigapixels), providing further insights into the advantages and limitations of sUAS for disaster response. The four challenges (variation in spatial resolution of input imagery, spatial misalignment between imagery and geospatial data, wireless connectivity, and data product format) lead to three recommendations that specify research needed to improve ML model capabilities to accommodate the wide variation of potential spatial resolutions used in practice, handle spatial misalignment, and minimize the dependency on wireless connectivity. These recommendations are expected to improve the effective operational use of sUAS and sUAS-based ML damage assessment systems for disaster response.

13:26-13:38, Paper TuBT1.4	Add to My Program
Non-Uniform Spatial Alignment Errors in sUAS Imagery from Wide-Area Disasters

Manzini, Thomas (Texas A&M), Perali, Priyankari (Texas A&M University), Karnik, Raisa (Texas A&M University), Godbole, Mihir (Texas A&M University), Abdullah, Hasnat Md (Texas A&M University), Murphy, Robin (Texas A&M)
Keywords: Machine Learning and Adaptation, Motion Planning and Navigation in Human-Centered Environments Abstract: This work presents the first quantitative study of alignment errors between small uncrewed aerial systems (sUAS) georectified imagery and a priori building polygons and finds that alignment errors are non-uniform and irregular, which negatively impacts field robotics systems and human-robot interfaces that rely on geospatial information. There are no efforts that have considered the alignment of a priori spatial data with georectified sUAS imagery, possibly because straight-forward linear transformations often remedy any misalignment in satellite imagery. However, an attempt to develop machine learning models for an sUAS field robotics system for disaster response from nine wide-area disasters using the CRASAR-U-DROIDs dataset uncovered serious translational alignment errors. The analysis considered 21,608 building polygons in 51 orthomosaic images, covering 16787.2 Acres (26.23 square miles), and 7,880 adjustment annotations, averaging 75.36 pixels and an average intersection over union of 0.65. Further analysis found no uniformity among the angle and distance metrics of the building polygon alignments, presenting an average circular variance of 0.28 and an average distance variance of 0.45 pixels2, making it impossible to use the linear transform used to align satellite imagery. The study’s primary contribution is alerting field robotics and human-robot interaction (HRI) communities to the problem of spatial alignment and that a new method will be needed to automate and communicate the alignment of spatial data in sUAS georectified imagery. This paper also contributes a description of the updated CRASAR-U-DROIDs dataset of sUAS imagery, which contains building polygons and human-curated corrections to spatial misalignment for further research in field robotics and HRI

13:38-13:50, Paper TuBT1.5	Add to My Program
Incremental and Interactive Exploration of Robot Appearance Designs Using GenAI

Hielscher, Till (University of Stuttgart), Bartenbach, Arne (Universität Stuttgart), Leonhardt, Wolf (TLH), Arras, Kai Oliver (University of Stuttgart)
Keywords: User-centered Design of Robots, Machine Learning and Adaptation, Innovative Robot Designs Abstract: A robot’s physical design impacts user acceptance, engagement, and trust while also influencing social and functional expectations about the robot’s capabilities. Robot design, and industrial design in general, can also be driver of differentiation in a competitive market. In this paper, we leverage generative AI to incrementally explore design spaces of robot appearances. With the goal of overcoming training data bias of text-to-image models that favor stereotypical morphologies and designs, we propose a set of generation methods that enable a designer to guide the exploration process through text-, style-, and structure-based specifications from user or client feedback. Further, using Low-Rank Adaptation for model fine-tuning, the method allows to define the aesthetic direction by an image collection that conveys a particular style or theme (“mood boards”). The experiments demonstrate that our extensions retain image quality in terms of statistical and structural features and allow for both diversity and specificity in the design process. In a case study, we apply this method in the user-centered design process and discuss its opportunities and limitations.


TuBT2 Regular Session, Auditorium 2	Add to My Program
Child - Robot Interaction II

Chair: Bruno, Barbara	Karlsruhe Institute of Technology (KIT)
Co-Chair: O'Connor, Sean	Bucknell University

12:50-13:02, Paper TuBT2.1	Add to My Program
Gestures vs. Faces: Exploring Emotion Recognition in Child-Robot Interaction

Klimecka, Julia (Jagiellonian University), Wróbel, Alicja (Jagiellonian University), Zguda, Paulina (Jagiellonian University), Tymon, Kukier (Jagiellonian University), Indurkhya, Bipin (Jagiellonian University)
Keywords: Child-Robot Interaction, Robots in Education, Therapy and Rehabilitation, Motivations and Emotions in Robotics Abstract: As robots are increasingly integrated into everyday life, their ability to communicate emotions becomes crucial to foster positive human-robot interactions. However, the key factors for the recognition of artificial emotions are still poorly understood. This study explores whether children can better recognize emotions expressed by robots using gestures or by robots using facial expressions. We observed ten five-year-olds and fourteen six-year-olds during an in-the-wild workshop. The results revealed significant differences in emotion recognition based on robot type, age, and emotion. Children who interact with the gesture robot were 90% less likely to accurately recognize emotions than the robot with facial expressions. Six-year-olds demonstrated significantly higher accuracy, with 8.30 times greater chances of recognizing emotions than five-year-olds. Regarding emotion type, children were significantly more likely to identify sadness and happiness correctly than anger. These findings highlight the importance of incorporating facial expressions to aid in emotion recognition and make robots more approachable for young children.

13:02-13:14, Paper TuBT2.2	Add to My Program
Dressing Trust: Exploring the Impact of Clothing CoDesign on Disclosure and Attitudes in Children - Social Robots Interaction

Gioumatzidou, Sofia (Univercity of Macedonia), Velentza, Anna Maria (Brest National School of Engineering (ENIB)), Fachantidis, Nikolaos (University of Macedonia)
Keywords: Child-Robot Interaction, Embodiment, Empathy and Intersubjectivity, Robots in Education, Therapy and Rehabilitation Abstract: Trust is fundamental in Human-Robot Interaction (HRI) and can be influenced by factors such as the robot's appearance, and texture. Research suggests that children can develop trust with robots, particularly humanoid ones; however, other studies raise concerns about robotic behaviors and the extent to which they are perceived as trustworthy. In this study, we investigate whether the co-designing of clothes for a Socially Assistive Robot (SAR) affects their trust in the SAR, specifically the Nao robot. Children aged 7-10 in small group workshops created with the needle felting technique clothes for Nao, guided by the robot itself and an expert. Each child made their own piece of clothing. Their a) trust, b) empathy, c) perceived liveliness, and d) negative attitudes before and after making the clothes were assessed with pre-post questionnaires, indicating a significant increase in trust and empathy. After making the clothes, each child interacted alone with 3 Nao robots; one with default appearance, one dressed in the clothes they made, and one with other similar clothes, and were invited to share a secret wish with one of them. Behavioral observations, individual interviews and linguistic analysis of the wishes, provided insights into factors influencing children’s trust and appearance preferences in HRI. Results indicated that most children chose to confide in the robot who wore their crafted outfit, expressing personal desires grammatically with the verb 'wish' in contrast to the children who shared immediate desires with the verb 'want' and chose the default robot. Finally, interviews indicated that the primary characteristics that make a SAR trustworthy to children are kindness, intelligence, anthropomorphism, and helpfulness.

13:14-13:26, Paper TuBT2.3	Add to My Program
Furhat Robot for Children: Designing an Interactive Educational Activity

Oralbayeva, Nurziya (Nazarbayev University), Isteleyeva, Ameli (Human-Robot Interaction (HRI) Lab, Nazarbayev University), Zhenissova, Nurbanu (Nazarbayev University), Telisheva, Zhansaule (Nazarbayev University), Tungatarova, Aida (Nazarbayev University), Sandygulova, Anara (Nazarbayev University)
Keywords: Child-Robot Interaction, Narrative and Story-telling in Interaction, Creating Human-Robot Relationships Abstract: Children bring unique perspectives and valuable input into design processes, unburdened by the complexities of societal influences. Their direct involvement in the design process is essential for creating meaningful, engaging, and inclusive activities. Participatory design (PD) research is be- coming increasingly important in involving end-users in co- designing robotic systems and software. This paper presents the preliminary findings of a PD workshop aimed at establishing an interactive and collaborative environment for children’s co- design of a learning activity for robot-assisted learning on the Furhat robot. To this end, we conducted four workshop sessions with children aged 2-8 years old to co-create a robot- assisted learning scenario surrounding the topic of animals. As a result, together with the children, we designed and tested the prototype of a learning scenario, which helped us identify the shortcomings and set the foundation for future robot- assisted learning scenarios and activities. By reflecting on the challenges and lessons learned through PD with young children, we contribute to enhancing the understanding of PD.

13:26-13:38, Paper TuBT2.4	Add to My Program
Design Activity for Robot Faces: Evaluating Child Responses to Expressive Faces

Oliva, Denielle (University of Nevada, Reno), Knight, Joshua (University of Nevada, Reno), Becker, Tyler J (University of Nevada, Reno), Amistani, Heather (University of Nevada, Reno), Nicolescu, Monica (University of Nevada, Reno), Feil-Seifer, David (University of Nevada, Reno)
Keywords: Child-Robot Interaction, User-centered Design of Robots, Social Presence for Robots and Virtual Humans Abstract: Facial expressiveness plays a crucial role in a robot’s ability to engage and interact with children. Prior research has shown that expressive robots can enhance child engagement during human-robot interactions. However, many robots used in therapy settings feature non-personalized, static faces designed with traditional facial feature considerations, which can limit the depth of interactions and emotional connections. Digital faces offer opportunities for personalization, yet the current landscape of robot face design lacks a dynamic, user-centered approach. Specifically, there is a significant research gap in designing robot faces based on child preferences. Instead, most robots in child-focused therapy spaces are developed from an adult-centric perspective. We present a novel study investigating the influence of child-drawn digital faces in child-robot interactions. This approach focuses on a design activity with children instructed to draw their own custom robot faces. We compare the perceptions of social intelligence (PSI) of two implementations: a generic digital face and a robot face personalized using the user’s drawn robot faces. The results of this study show a significant difference in the PSI of a customized agent compared to a generic face, where a customized agent was rated higher than a non-personalized agent over multiple sub-scales.

13:38-13:50, Paper TuBT2.5	Add to My Program
Emotionally Expressive Robots: Implications for Children's Behavior Toward Robot

Zibetti, Elisabetta (CHArt-LUTIN Laboratory), Waheed Palmer, Sureya (Laboratoire CHArt, Université Paris 8), Stower, Rebecca (KTH), Anzalone, Salvatore Maria (Université Paris 8)
Keywords: Non-verbal Cues and Expressiveness, Child-Robot Interaction, Embodiment, Empathy and Intersubjectivity Abstract: The growing development of robots with artificial emotional expressiveness raises important questions about their persuasive potential in children's behavior. While research highlights the pragmatic value of emotional expressiveness in human social communication, the extent to which robotic expressiveness can or should influence empathic responses in children is grounds for debate. In a pilot study with 22 children (aged 7-11) we begin to explore the ways in which different levels of embodied expressiveness (body only, face only, body and face) of two basic emotions (happiness and sadness) displayed by an anthropomorphic robot (QTRobot) might modify children’s behavior in a child-robot cooperative turn-taking game. We observed that children aligned their behavior to the robot’s inferred emotional state. However, higher levels of expressiveness did not result in increased alignment. The preliminary results reported here provide a starting point for reflecting on robotic expressiveness and its role in shaping children's social-emotional behavior toward robots as social peers in the near future.


TuBT3 Regular Session, Auditorium 3	Add to My Program
Assistive Robotics II

Chair: Spitale, Micol	Politecnico Di Milano
Co-Chair: Cuijpers, Raymond	Eindhoven University of Technology

12:50-13:02, Paper TuBT3.1	Add to My Program
Combining LLM, Non-Monotonic Logical Reasoning, and Human-In-The-Loop Feedback in an Assistive AI Agent

Fu, Tianyi (The University of Edinburgh), Jauw, Brian (University of Edinburgh), Sridharan, Mohan (University of Edinburgh)
Keywords: Computational Architectures, Machine Learning and Adaptation, Assistive Robotics Abstract: Large Language Models (LLMs) are considered state of the art for many tasks in robotics and AI. At the same time, there is increasing evidence of their critical limitations such as generating arbitrary responses in new situations, inability to support rapid incremental updates based on limited examples, and opacity. Toward addressing these limitations, our architecture leverages the complementary strengths of LLMs and knowledge-based reasoning. Specifically, the architecture enables an AI agent assisting a human to use an LLM to provide generic abstract predictions of upcoming tasks. The agent also reasons with domain-specific knowledge, recent history of interactions with the human, and semantic databases to: (a) provide contextual prompts to the LLM; and (b) compute a plan of concrete actions that jointly implements the current task and prepares for the anticipated task, replanning as needed. Furthermore, the agent solicits and uses high-level human feedback based on need and availability to incrementally revise the domain-specific knowledge and interactions with the LLM. We ground and evaluate our architecture’s abilities in the realistic VirtualHome simulation environment, demonstrating a substantial performance improvement compared with just using an LLM or an LLM and logical reasoner. Project website: https://brianej.github.io/igfmrdskaa.github.io/

13:02-13:14, Paper TuBT3.2	Add to My Program
A Standing Support Mobility Robot for Enhancing Independence in Elderly Daily Living

Manríquez-Cisterna, Ricardo (Tohoku University), Ravankar, Ankit A. (Tohoku University), Salazar Luces, Jose Victorio (Tohoku University), Hatsukari, Takuro (Paramount Bed Co., Ltd), Hirata, Yasuhisa (Tohoku University)
Keywords: Assistive Robotics, Innovative Robot Designs, Human Factors and Ergonomics Abstract: This paper presents a standing support mobility robot "Moby" developed to enhance independence and safety for elderly individuals during daily activities such as toilet transfers. Unlike conventional seated mobility aids, the robot maintains users in an upright posture, reducing physical strain, supporting natural social interaction at eye level, and fostering a greater sense of self-efficacy. Moby offers a novel alternative by functioning both passively and with mobility support, enabling users to perform daily tasks more independently. Its main advantages include ease of use, lightweight design, comfort, versatility, and effective sit-to-stand assistance. The robot leverages the Robot Operating System (ROS) for seamless control, featuring manual and autonomous operation modes. A custom control system enables safe and intuitive interaction, while the integration with NAV2 and LiDAR allows for robust navigation capabilities. This paper reviews existing mobility solutions and compares them to Moby, details the robot’s design, and presents objective and subjective experimental results using the NASA-TLX method and time comparisons to other methods to validate our design criteria and demonstrate the advantages of our contribution.

13:14-13:26, Paper TuBT3.3	Add to My Program
Clinical Deployment of Socially Assistive Robot for Physical Health Assessment

LACHAUX, Killian (Université Du Québec à Chicoutimi), GAGNON, Élodie (Université Du Québec à Chicoutimi), THULLIER, Florentin (Université Du Québec à Chicoutimi), MAÎTRE, Julien (Université Du Québec à Chicoutimi), BOUCHARD, Kevin (Université Du Québec à Chicoutimi), GAGNON, Cynthia (Université De Sherbrooke), DUCHESNE, Élise (Université Laval), GABOURY, Sébastien (Université Du Québec à Chicoutimi)
Keywords: Applications of Social Robots, Medical and Surgical Applications, Robot Companions and Social Robots Abstract: This paper builds on previous work to explore the feasibility of a Socially Assistive Robot (SAR)-based system for automated physical health assessments in clinical settings. Transitioning from lab experiments to real-world deployment, the TEMI robot integrates AI analytics to automate three key tests: the 30-Second Chair Stand Test (30sCST), 10-Meter Walk Test (10mWT), and Grip Strength Test (GST). Clinical evaluations with 12 healthy controls and 3 neuromuscular disease patients confirmed its reliability, matching clinician-obtained measurements. Key challenges, including network performance, video stability, and patient-specific anthropometric adjustments, were addressed. A major innovation was a locally hosted Vision Language Model (VLM) for automated grip strength data extraction, reducing errors and improving accuracy. Future work includes refining real-time data validation, expanding trials, and integrating multimodal AI for enhanced patient engagement and adaptability. This research highlights SAR-based systems as promising tools for scalable, AI-driven physical health assessments in clinical practice.

13:26-13:38, Paper TuBT3.4	Add to My Program
The Myth Buster Robot: Supporting Older Adults’ Robot Literacy through Robot-Assisted Data Privacy Learning Application

Ahtinen, Aino (Tampere University), Jarske, Salla (Tampere University), Chowdhury, Aparajita (Tampere University), Kiuru, Hilla (University of Jyväskylä), Vasara, Paula (University of Jyväskylä), Valokivi, Heli (University of Jyväskylä), Siirtola, Harri (Tampere University), Raisamo, Roope (Tampere University)
Keywords: Applications of Social Robots, Robots in Education, Therapy and Rehabilitation, User-centered Design of Robots Abstract: Robots interacting with humans pose data privacy risks, potentially leading to uncontrollable leaks of personal information. Robots interact with older adults in public places, homes, care facilities, and hospitals. Older adults may have concerns about privacy issues related to robots and would benefit from developing their robot literacy skills regarding data privacy. Research on older adults’ robot literacy in relation to data privacy is scarce. We conducted a qualitative and explorative human-centered design study with care home residents (N=9) to explore their perceptions of and interest in robot data privacy literacy. In the study, they interacted with an early prototype of a robot-assisted learning application implemented on the social robot QTrobot. Participants were concerned about the “superpowers” and data storage of robots. Based on our findings and existing literature, we redesigned the prototype into “Myth Buster,” a robot-assisted learning application aimed at enhancing older adults’ data privacy literacy regarding robots. Our work contributes to the understanding of older adults’ data privacy literacy, which is currently under-researched in Human-Robot Interaction. We also present design-relevant insights for developing robot-assisted data privacy learning applications to enhance robot literacy of older adults.

13:38-13:50, Paper TuBT3.5	Add to My Program
Humanoid Robot Personalised Serious Games in an Older Adults’ Care Center (I)

Canapa, Giulio (CNR-ISTI), Catricalà, Benedetta (CNR-ISTI), Manca, Marco (CNR-ISTI), Paternò, Fabio (CNR-ISTI, HIIS Laboratory), Santoro, Carmen (CNR-ISTI), Zedda, Eleonora (University of Pisa; ISTI-CNR)
Keywords: Applications of Social Robots Abstract: Most countries are ageing rapidly, creating significant challenges in providing adequate care to elderly population. Older adults’ care centers face several difficulties in ensuring support able to address their diverse needs, due to several factors including increasing caregiver shortages, high variability in cognitive and health conditions of elderly, and challenges in delivering personalized interventions to them. Additionally, maintaining elderly engagement during cognitive training is often problematic due to the repetitive and impersonal nature of involved tasks. To address these limitations, we carried out a study investigating the use of a humanoid robot to deliver interactive, personalised serious games based on older adults' personal memories, to enhance relevance and engagement for them. The approach has been evaluated in a trial conducted in a center for older adults, involving users with varying cognitive abilities. Results indicated that such personalised games were well received by them, with a positive impact on their experience.


TuBT4 Regular Session, Blauwe Zaal	Add to My Program
Applications of Social Robots II

Chair: Louie, Wing-Yue Geoffrey	Oakland University
Co-Chair: Brass, Emma	University of Liverpool

12:50-13:02, Paper TuBT4.1	Add to My Program
Grasping Posture Estimation Method Using Range-And-Infrared Images and EMG for Control of Multi-Degree-Of-Freedom Prosthetic Hands

Nishizawa, Shintaro (Yokohama National University), Kato, Ryu (Yokohama National University), Watanabe, Hiroshi (Yokohama National University)
Keywords: Machine Learning and Adaptation, Robots in Education, Therapy and Rehabilitation, User-centered Design of Robots Abstract: Current myoelectric hands have two problems: instability in performance due to low estimation accuracy and individual differences, and deviation from the natural grasping approach of the hand, such as the need to preset operations and strain for stable movement. These problems are thought to be due to the nature of electromyography (EMG); therefore, it is necessary to introduce a new multimodal sensing method that uses control inputs other than EMG. Existing research on multimodal sensing has focused on improving performance rather than replicating the grasping approach of the hand. To solve the two problems presented above, this study developed a system that uses both EMG and distance images as control inputs and compared the estimation and grasping rates with existing methods. The results show improvements in both the performance and approach.

13:02-13:14, Paper TuBT4.2	Add to My Program
Socially-Aware Object Transportation by a Mobile Manipulator in Static Planar Environments with Obstacles

Ribeiro, Caio (Universidade Federal De Minas Gerais), Paes, Leonardo Reis Domingues (Universidade Federal De Minas Gerais), G. Macharet, Douglas (Universidade Federal De Minas Gerais)
Keywords: Motion Planning and Navigation in Human-Centered Environments Abstract: Socially-aware robotic navigation is essential in environments where humans and robots coexist, ensuring both safety and comfort. However, most existing approaches have been primarily developed for mobile robots, leaving a significant gap in research that addresses the unique challenges posed by mobile manipulators. In this paper, we tackle the challenge of navigating a robotic mobile manipulator, carrying a non-negligible load, within a static human-populated environment while adhering to social norms. Our goal is to develop a method that enables the robot to simultaneously manipulate an object and navigate between locations in a socially-aware manner. We propose an approach based on the Risk-RRT* framework that enables the coordinated actuation of both the mobile base and manipulator. This approach ensures collision-free navigation while adhering to human social preferences. We compared our approach in a simulated environment to socially-aware mobile-only methods applied to a mobile manipulator. The results highlight the necessity for mobile manipulator-specific techniques, with our method outperforming mobile-only approaches. Our method enabled the robot to navigate, transport an object, avoid collisions, and minimize social discomfort effectively.

13:14-13:26, Paper TuBT4.3	Add to My Program
A Blessing or a Burden? Exploring Worker Perspectives of Using a Social Robot in a Church

Blair, Andrew (University of Glasgow), Gregory, Peggy (University of Glasgow), Foster, Mary Ellen (University of Glasgow)
Keywords: Applications of Social Robots, User-centered Design of Robots, Philosophical Issues in Human-Robot Coexistence Abstract: Recent technological advances have allowed robots to assist in the service sector, and consequently accelerate job and sector transformation. Less attention has been paid to the use of robots in real-world organisations where social benefits, as opposed to profits, are the primary motivator. To explore these opportunities, we have partnered with a working church and visitor attraction in the United Kingdom. We conducted interviews with 15 participants from a range of stakeholder groups within the church to understand worker perspectives of introducing a social robot to the church and analysed the results using reflexive thematic analysis. Findings indicate mixed responses to the use of a robot, with participants highlighting the empathetic responsibility the church has towards people and the potential for unintended consequences. However, information provision and alleviation of menial or mundane tasks were identified as potential use cases. This highlights the need to consider not only the financial aspects of robot introduction, but also how social and intangible values shape what roles a robot should take on within an organisation.

13:26-13:38, Paper TuBT4.4	Add to My Program
Learning Dexterous Object Handover

Frau-Alfaro, Daniel (University of Alicante), Castaño-Amorós, Julio (University of Alicante), Puente, Santiago (University of Alicante), Gil, Pablo (University of Alicante), Calandra, Roberto (TU Dresden)
Keywords: Machine Learning and Adaptation, Social Learning and Skill Acquisition Via Teaching and Imitation, Cooperation and Collaboration in Human-Robot Teams Abstract: Object handover is an important skill that we use daily when interacting with other humans. To deploy robots in collaborative setting, like houses, being able to receive and handing over objects safely and efficiently becomes a crucial skill. In this work, we demonstrate the use of reinforcement learning for dexterous object handover between two multi-finger hands. Key to this task is the use of a novel reward function based on dual quaternions to minimize the rotation distance, which outperforms other rotation representations such as Euler and rotation matrices. The robustness of the trained policy is experimentally evaluated by testing w.r.t. objects that are not included in the training distribution, and perturbations during the handover process. The results demonstrate that the trained policy successfully perform this task, achieving a total success rate of 94% in the best-case scenario after 100 experiments, thereby showing the robustness of our policy with novel objects. In addition, the best-case performance of the trained policy decreases by only 13.8% when the other robot moves during the handover, proving that our policy is also robust to this type of perturbation, which is common in real-world object handovers.

13:38-13:50, Paper TuBT4.5	Add to My Program
Collecting Object-Level Affordance for RGBD Datasets

Schoot Uiterkamp, Luc (University of Twente), Englebienne, Gwenn (University of Twente), Heylen, Dirk (University of Twente)
Keywords: Multi-modal Situation Awareness and Spatial Cognition, Cognitive Skills and Mental Models Abstract: Accurate interpretation of the environment is both essential to automated robots and highly beneficial for teleoperated robots. Going beyond obstacle recognition, interpreting the semantics of the environment and the actions it affords, enables robots to interact with environments made for humans in a human-like manner. This paper describes the collection of affordance labels at the object level for multiple indoors datasets, to train computer vision algorithms for detecting object affordances in indoor spaces. It is a first step towards determining high level ``semantic'' affordances, to allow reasoning about what to do with objects, rather than ``functional'' affordances, which allow reasoning about how to use the same. A baseline model is provided, which highlights the value of the affordance labels in a variety of robotics applications.


TuBT5 Special Session, Auditorium 5	Add to My Program
SS: Explainable Human-Robot Interaction

Chair: Nasir, Jauwairia	University of Augsburg
Co-Chair: Andriella, Antonio	Institut De Robòtica I Informàtica Industrial

12:50-13:02, Paper TuBT5.1	Add to My Program
Evaluating Embeddable Language Models in Verbalizing Rule-Based Inferences through Justifications (I)

Dussard, Bastien (LAAS-CNRS), Clodic, Aurélie (Laas - Cnrs), Sarthou, Guillaume (LAAS-CNRS)
Keywords: Evaluation Methods, Linguistic Communication and Dialogue Abstract: While Language Models have shown promising performance, they still struggle with limitations regarding reasoning and are very token-sensitive. In contrast, knowledge-based systems, such as ontologies, allow for provable logically valid reasoning and provide explicit justifications regarding newly inferred knowledge. However, those justifications can be hard to understand for non-expert users given their formal syntax and their length. We investigated if language models could be considered as reliable tools for verbalizing such explanations, thus increasing explainability over reasoning output. This paper presents a reference evaluation of a set of embeddable language models on a task of translation from rule-based ontology formatted inferences and justifications into natural language sentences. We show that the order of justifications significantly decreases performance, whereas adding the inference rule as additional context significantly improves performance, leading to more reliable results.

13:02-13:14, Paper TuBT5.2	Add to My Program
Understanding Human-Machine Team Communication from an Explainable-AI Perspective (I)

Roig Vilamala, Marc (Cardiff University), Furby, Jack (Cardiff University), de Gortari Briseno, Julian (University of California, Los Angeles), Srivastava, Mani (UCLA), Preece, Alun (Cardiff University), Fuentes, Carolina (Cardiff University)
Keywords: Linguistic Communication and Dialogue, Cooperation and Collaboration in Human-Robot Teams Abstract: In this paper, we explore how humans communicate with teammates from an explainable-AI perspective, comparing how they interact with both human and AI-controlled robot teammates under a number of different strategies in scenarios such as disaster relief. We find that while humans do adapt their communication based on which strategy they are following, they consistently communicate differently with AI teammates than human teammates, tending to give explicit orders to the former while sending more vague messages with implicit understanding to the later. However, we also find that modern Large Language Models (LLMs) are capable of understanding the explainability intent of messages to the same level as humans, implying that such differences may not be required if LLMs are fully integrated into AI agents.

13:14-13:26, Paper TuBT5.3	Add to My Program
Effective Explanations for Belief-Desire-Intention Robots: When and What to Explain (I)

Wang, Cong (TU Dresden), Calandra, Roberto (TU Dresden), Klös, Verena (Carl Von Ossietzky Universität Oldenburg)
Keywords: User-centered Design of Robots, Social Intelligence for Robots, Cognitive Skills and Mental Models Abstract: When robots perform complex and context-dependent tasks in our daily lives, deviations from expectations can confuse users. Explanations of the robot’s reasoning process can help users to understand the robot intentions. However, when to provide explanations and what they contain are important to avoid user annoyance. We have investigated user preferences for explanation demand and content for a robot that helps with daily cleaning tasks in a kitchen. Our results show that users want explanations in surprising situations and prefer concise explanations that clearly state the intention behind the confusing action and the contextual factors that were relevant to this decision. Based on these findings, we propose two algorithms to identify surprising actions and to construct effective explanations for Belief-Desire-Intention (BDI) robots. Our algorithms can be easily integrated in the BDI reasoning process and pave the way for better human-robot interaction with context- and user-specific explanations.

13:26-13:38, Paper TuBT5.4	Add to My Program
Let Me Explain Why I Didn't Take the Action You Wanted! Comparing Different Modalities for Explanations in Human-Robot Interaction (I)

Akalin, Neziha (Jönköping University), Riveiro, Maria (Jönköping University)
Keywords: Multimodal Interaction and Conversational Skills, Human Factors and Ergonomics, User-centered Design of Robots Abstract: Socially assistive humanoid robots are becoming increasingly integrated into home environments, where they are expected to interact naturally and transparently with users. In this context, the aim of the study presented in this paper is to understand how they should deliver explanations, especially when the robots cannot fulfill a user's request. We explore different explanation modalities by combining spoken explanations with an additional element (speech alone, speech plus lights, speech plus sounds, and speech plus gestures) and examining user preferences among them. We designed a video-based between-subjects user study featuring the Nao robot across three everyday scenarios where the robot fails to perform a task due to overheated motors. Participants evaluated four explanation modalities and provided feedback on their preferences. Our findings show that multimodal explanations are generally favored over speech alone, with speech and lights being the most preferred. Although scenario context did not have a statistically significant association with modality preference, qualitative feedback highlights the importance of context-aware and adaptive communication strategies. These insights provide practical guidance for designing more natural and effective human-robot interactions.

13:38-13:50, Paper TuBT5.5	Add to My Program
Control Methodology Impact on User Cognitive Workload in Gaze-Controlled Robotic Manipulation Tasks (I)

He, Yiyang (LANCASTER UNIVERSITY), Wang, Ziwei (Lancaster University), Yan, Lei (Harbin Institute of Technology, Shenzhen), Xue, Tao (Tsinghua University), Fei, Haolin (Lancaster University)
Keywords: Novel Interfaces and Interaction Modalities, Assistive Robotics, Human Factors and Ergonomics Abstract: This paper investigates two distinct paradigms for gaze-based control in human-robot collaboration (HRC). While gaze tracking offers a promising hands-free interaction method, the optimal mapping between eye movements and robot control remains an open research question. We examine two fundamentally different control approaches: (1) position-based control, which utilizes fiducial markers for spatial referencing and maps gaze positions directly to physical target locations; and (2) velocity-based control, whose functions are similar to a joystick where gaze position relative to camera frame centers determines movement direction and speed. Participants completed standardized pick-and-place tasks with both control methods. Performance was assessed through objective metrics including task completion time, trajectory efficiency, and error rates. Subjective experiences were evaluated using NASA Task Load Index questionnaires. Both systems incorporate a blink detection mechanism for gripper activation, enabling completely hands-free operation. This research addresses fundamental questions in eye-based robotic control for HRC, with applications spanning assistive technologies for mobility-impaired users, industrial settings that require hands-free operation, and medical environments where maintaining sterility is crucial. Results indicate significant differences between control paradigms, providing design insights for more intuitive and effective gaze-based interfaces in human-robot systems.


TuBT6 Special Session, Auditorium 6	Add to My Program
SS: Human Modeling for Hybrid Interactions with Robots

Chair: Beraldo, Gloria	National Research Council of Italy
Co-Chair: Berto, Leticia Mara	University of Campinas

12:50-13:02, Paper TuBT6.1	Add to My Program
A Theory of Mind Motivational Framework for Social Interaction with Autonomous Cognitive Robots (I)

Berto, Leticia Mara (University of Campinas), Hellou, Mehdi (University of Manchester), Sciutti, Alessandra (Italian Institute of Technology), Gudwin, Ricardo Ribeiro (University of Campinas), Colombini, Esther (University of Campinas), Cangelosi, Angelo (University of Manchester)
Keywords: Cognitive Skills and Mental Models, Motivations and Emotions in Robotics, Cognitive and Sensorimotor Development Abstract: As hybrid interactions between humans and artificial agents become more prevalent, social skills are increasingly essential for autonomous systems. Beyond assisting in various tasks, robots are expected to understand human states and recognize that knowledge and perceptions of the world can differ, influencing overall behavior. This ability is closely tied to motivation, which plays a crucial role in driving autonomous agents' actions. In this work, we explore the interaction between two intrinsically motivated cognitive autonomous robots with distinct profiles and preferences, utilizing Theory of Mind to infer each other's motivations. We investigate the conditions under which they successfully collaborate to achieve mutual well-being and the circumstances that hinder cooperation. Our findings indicate that successful interactions emerge when at least one agent prioritizes helping others and when their profiles are aligned, leading to positive outcomes for both.

13:02-13:14, Paper TuBT6.2	Add to My Program
Perspective-Shifted Neuro-Symbolic World Models: A Framework for Socially-Aware Robot Navigation (I)

Alcedo, Kevin (Institute for Systems and Robotics, Instituto Superior Técnico), Lima, Pedro U. (Instituto Superior Técnico - Institute for Systems and Robotics), Alami, Rachid (CNRS)
Keywords: Monitoring of Behaviour and Internal States of Humans, Cognitive Skills and Mental Models, Social Intelligence for Robots Abstract: Navigating in environments alongside humans requires agents to reason under uncertainty and account for the beliefs and intentions of those around them. Under a sequential decision-making framework, egocentric navigation can naturally be represented as a Markov Decision Process (MDP). However, social navigation additionally requires reasoning about the hidden beliefs of others, inherently leading to a Partially Observable Markov Decision Process (POMDP), where agents lack direct access to others' mental states. Inspired by Theory of Mind and Epistemic Planning, we propose (1) a neuro-symbolic model-based reinforcement learning architecture for social navigation, addressing the challenge of belief tracking in partially observable environments; and (2) a perspective-shift operator for belief estimation, leveraging recent work on Influence-based Abstractions (IBA) in structured multi-agent settings.

13:14-13:26, Paper TuBT6.3	Add to My Program
Reasoning LLMs for User-Aware Multimodal Conversational Agents (I)

Rahimi, Hamed (Sorbonne University), Cattoni, Jeanne (Université Paris Cité), Beghili, Meriem (Sorbonne University), Abrini, Mouad (Sorbonne University), Khoramshahi, Mahdi (Sorbonne Université), Pino, Maribel (Hôpital Broca (APHP), Université Paris Cité), Chetouani, Mohamed (Sorbonne University)
Keywords: User-centered Design of Robots, Machine Learning and Adaptation, Social Intelligence for Robots Abstract: Personalization in social robotics is critical for fostering effective human-robot interactions, yet systems often face the cold start problem, where initial user preferences or characteristics are unavailable. This paper proposes a novel framework called USER-LLM R1 for a user-aware conversational agent that addresses this challenge through dynamic user profiling and model initiation. Our approach integrates chain-of-thought (CoT) reasoning models to iteratively infer user preferences and vision-language models (VLMs) to initialize user profiles from multimodal inputs, enabling personalized interactions from the first encounter. Leveraging a Retrieval-Augmented Generation (RAG) architecture, the system dynamically refines user representations within an inherent CoT process, ensuring contextually relevant and adaptive responses. Evaluations on the ElderlyTech-VQA Bench demonstrate significant improvements in ROUGE-1 (+23.2%) ROUGE-2 (+0.6%) and ROUGE-L (+8%) F1 scores over state-of-the-art baselines, with ablation studies underscoring the impact of reasoning model size on performance. Human evaluations further validate the framework’s efficacy, particularly for elderly users, where tailored responses enhance engagement and trust. Ethical considerations, including privacy preservation and bias mitigation, are rigorously discussed and addressed to ensure responsible deployment.

13:26-13:38, Paper TuBT6.4	Add to My Program
Introducing a Socially Interacting Robot in Clinical Rehabilitation Practice (I)

Beraldo, Gloria (National Research Council of Italy), Bajrami, Albin (Università Politecnica Delle Marche), Baldini, Nicolò (Universita Politecnica Delle Marche), Capecci, Marianna (Universita Politecnica Delle Marche), Ceravolo, Maria Gabriella (Università Politecnica Delle Marche), Palpacelli, Matteo Claudio (Università Politecnica Delle Marche), Umbrico, Alessandro (National Research Council of Italy), Cortellessa, Gabriella (CNR -- National Research Council of Italy, ISTC)
Keywords: Assistive Robotics, Robots in Education, Therapy and Rehabilitation, Multimodal Interaction and Conversational Skills Abstract: As the aging population increases, so does the demand for personalized care and rehabilitation for individuals with neurological disorders. Effective recovery programs require intensive, task-oriented training, yet delivering continuous and individualized care remains a major challenge. Technological innovations such as wearable sensors and socially interactive robots can enhance patient monitoring and improve medical teams’ situational awareness. This paper presents a preliminary evaluation of a robot-based architecture deployed in a real clinical setting, designed to support rehabilitation tasks and patient monitoring. The results demonstrate the system's feasibility in providing therapists with timely and accurate data, facilitating natural interactions with patients, and minimizing the need for technical interventions during use.

13:38-13:50, Paper TuBT6.5	Add to My Program
Enhancing Adaptive Robotic Coaches with Multimodal Workload Estimation (I)

Tamantini, Christian (National Research Council of Italy), Cristofanelli, Maria Laura (Università Campus Bio-Medico Di Roma, Research Unit of Advanced), Umbrico, Alessandro (National Research Council of Italy), Fracasso, Francesca (National Research Council of Italy), Cortellessa, Gabriella (CNR -- National Research Council of Italy, ISTC), Cordella, Francesca (University Campus Biomedico of Rome), Orlandini, Andrea (National Research Council of Italy)
Keywords: Monitoring of Behaviour and Internal States of Humans, Robots in Education, Therapy and Rehabilitation, Detecting and Understanding Human Activity Abstract: Social robots are increasingly being explored as interactive coaches capable of delivering personalized physical and cognitive training sessions. Improving their effectiveness entails personalized interventions through adaptive robotic systems with continuous workload quantification. This study presents a workload estimation methodology based on physiological and kinematic monitoring, designed for integration into a social robotic coach. Physical, mental, and dual-task activities were administered to 15 healthy participants, and Support Vector Regression was used to model their perceived workload levels. Physical workload was estimated with a mean absolute error (MAE) of 0.12±0.01 and a correlation of 0.75±0.02, demonstrating high reliability across conditions. Mental workload estimation, however, showed greater variability (MAE: 0.18±0.01, correlation: 0.62±0.03), particularly in cognitively demanding and high-intensity tasks. This is likely due to overlapping physiological responses to cognitive and physical demands, which introduce ambiguity in signal interpretation. The continuous workload estimation provided by the model can be leveraged to define thresholds offering a discrete interpretation of workload levels.


TuBT7 Regular Session, Auditorium 7	Add to My Program
HRI and Collaboration in Manufacturing Contexts II

Chair: Okada, Shogo	Japan Advanced Institute of Technology
Co-Chair: Thalahitiya Vithanage, Ranul Helitha Vithanage	University of Moratuwa

12:50-13:02, Paper TuBT7.1	Add to My Program
VAIRO: A Vision-Based Adaptive Impedance-Control Robotic Framework

Lee, Chun Hei Jeffrey (University of Waterloo), Wong, Alexander (University of Waterloo), Hu, Yue (University of Waterloo)
Keywords: HRI and Collaboration in Manufacturing Environments, Computational Architectures Abstract: In this work, we present VAIRO, a Vision-based Adaptive Impedance-control RObotic framework for the purpose of manipulating soft materials, centered around the use case of rolling croissant dough for use in artisanal bakeries. Traditional automated processes for the industrial production of croissants consist of overly bulky equipment and fail to preserve the artisanal quality of hand-rolled croissants, with one of the major challenges being the high variability in the dough properties. VAIRO addresses these challenges by introducing a novel vision-based adaptive Cartesian impedance control strategy for collaborative robot arms to regulate rolling forces in real-time without the need for estimating the properties of the soft material. As such, VAIRO mimics the tactile adjustments made by human pastry chefs, ensuring consistent layer thickness and eliminating gaps. Using a Kinova Gen3 robotic arm and a custom-designed end-effector, we demonstrate that VAIRO can successfully manipulate various ”doughs” without knowing any material properties. These results are promising and offer a cost-effective, small-scale alternative for local craft bakeries to leverage automation while maintaining high artisanal quality.

13:02-13:14, Paper TuBT7.2	Add to My Program
Harvesting Perspectives: A Worker-Centered Inquiry into the Future of Fruit-Picking Farm Robots

Malik, Muhammad Abdul Basit (King's College London), Brandao, Martim (King's College London), Coopamootoo, Kovila (King's College London)
Keywords: Human Factors and Ergonomics, User-centered Design of Robots, Ethical Issues in Human-robot Interaction Research Abstract: The integration of robotics in agriculture presents promising solutions to challenges such as labour shortages and increasing global food demand. However, existing visions of agriculture robots often prioritize technological and business needs over workers'. In this paper, we explicitly investigate farm workers' perspectives on robots, particularly regarding privacy, inclusivity, and safety, three critical dimensions of agricultural HRI. Through a thematic analysis of semi-structured interviews, we: 1) outline how privacy, safety and inclusivity issues manifest within modern picking-farms; 2) reveal worker attitudes and concerns about the adoption of robots; and 3) articulate a set of worker-centered requirements and alternative visions for robotic systems deployed in farm settings. Some of these visions open the door to the development of new systems and HRI research. For example, workers' visions included robots for enhancing workplace inclusivity and solidarity, training, workplace accountability, reducing workplace accidents and responding to emergencies, as well as privacy-sensitive robots. We conclude with actionable recommendations for designers and policymakers. By centering worker perspectives, this study contributes to ongoing discussions in human-centered robotics, participatory HRI, and the future of work in agriculture.

13:14-13:26, Paper TuBT7.3	Add to My Program
How Do Foundation Models Compare to Skeleton-Based Approaches for Gesture Recognition in Human-Robot Interaction?

Käs, Stephanie (RWTH Aachen University), Burenko, Anton (RWTH Aachen), Markert, Louis (RWTH Aachen University), Çulha, Onur Alp (RWTH Aachen University), Mack, Dennis (Robert Bosch GmbH), Linder, Timm (Robert Bosch GmbH), Leibe, Bastian (RWTH Aachen University)
Keywords: Detecting and Understanding Human Activity, HRI and Collaboration in Manufacturing Environments, Non-verbal Cues and Expressiveness Abstract: Gestures enable non-verbal human-robot communication, especially in noisy environments like agile production. Traditional deep learning-based gesture recognition relies on task-specific architectures using images, videos, or skeletal pose estimates as input. Meanwhile, Vision Foundation Models (VFMs) and Vision Language Models (VLMs) with their strong generalization abilities offer potential to reduce system complexity by replacing dedicated task-specific modules. This study investigates adapting such models for dynamic, full-body gesture recognition, comparing V-JEPA (a state-of-the-art VFM), Gemini Flash 2.0 (a multimodal VLM), and HD-GCN (a top-performing skeleton-based approach). We introduce NUGGET, a dataset tailored for human-robot communication in intralogistics environments, to evaluate the different gesture recognition approaches. In our experiments, HD-GCN achieves best performance, but V-JEPA comes close with a simple, task-specific classification head—thus paving a possible way towards reducing system complexity, by using it as a shared multi-task model. In contrast, Gemini struggles to differentiate gestures based solely on textual descriptions in the zero-shot setting, highlighting the need of further research on suitable input representations for gestures.

13:26-13:38, Paper TuBT7.4	Add to My Program
A Flexible Safety System for Achieving Close Proximity in Industrial Human Robot Collaboration

Sanan, Siddharth (Omron Research Center of America), Misra, Gaurav (OMRON Research Center of America), Von Edge, David (OMRON Research Center of America), Rodriguez Campo, Andres (OMRON Research Center of America), Reynell, Alexander Stuart (Omron Research Center of America), Bailey, Sean (OMRON Research Center of America), Drinkard, John (Omron Research Center of America)
Keywords: HRI and Collaboration in Manufacturing Environments, Motion Planning and Navigation in Human-Centered Environments, Cooperation and Collaboration in Human-Robot Teams Abstract: Modern manufacturing applications require frequent changes to the production line and require a combination of humans and high-speed industrial robots. Establishing human safety in such applications is a key requirement. Existing solutions, such as zone-based safeguarding systems, offer an unsatisfactory trade-off between productivity and flexibility. This is because these solutions result in frequent safety stops that disrupt production, or involve complex safety setups. 3D sensors can reduce the area where safety stops occur, increasing productivity. Additionally, eliminating zone setup can reduce the safety setup effort, increasing flexibility. A key challenge is leveraging the higher fidelity data from 3D sensing while keeping computational complexity low. Another key challenge is achieving safety guarantees aligned with industrial safety standards. In this work, we propose a speed and separation monitoring (SSM) safeguarding system that utilizes 3D sensors and addresses the challenges noted above. The proposed system addresses computational complexity by using simple, fast-to-generate representations that enable fast human-robot distance computations. Fast distance computations reduce the required separation distance (defined in ISO 13855), reducing safety stops at close proximity. Simple representations are further leveraged to proactively adjust the robot speed, using a fast analytical formulation, such that the required separation distance is maintained and safety stops are prevented. Finally, safety guarantees are further established by embedding uncertainty information in the representation itself. Experimental results for a real-time implementation of the system are presented that demonstrate the performance and safety achieved.

13:38-13:50, Paper TuBT7.5	Add to My Program
Estimating Scene Flow in Robot Surroundings with Distributed Miniaturised Time-Of-Flight Sensors

Sander, Jack (University of Oxford), Caroleo, Giammarco (University of Oxford), Albini, Alessandro (University of Oxford), Maiolino, Perla (University of Oxford)
Keywords: Detecting and Understanding Human Activity, HRI and Collaboration in Manufacturing Environments Abstract: Tracking the motion of humans or objects in a robot's surroundings is essential to improve safe robot motions and reactions. In this work, we present an approach for scene flow estimation from low-density and noisy point clouds acquired from miniaturised Time-of-Flight (ToF) sensors distributed across the robot's body. The proposed method clusters points from consecutive frames and applies the Iterative Closest Point (ICP) algorithm to estimate a dense motion flow, with additional steps introduced to mitigate the impact of sensor noise and low-density data points. Specifically, we employ a fitness-based classification to distinguish between stationary and moving points and an inlier removal strategy to refine geometric correspondences. The proposed approach is validated in an experimental setup where 24 ToF are used to estimate the velocity of an object moving at different controlled speeds. Experimental results show that the method consistently approximates the direction of the motion and its magnitude with an error which is in line with sensor noise.


TuPM2_BR Voorhof/Hall	Add to My Program
Break (Tu1) 13: 50-14: 00


TuCT1 Regular Session, Auditorium 1	Add to My Program
Virtual and Telepresence I

Chair: Holthaus, Patrick	University of Hertfordshire

14:00-14:12, Paper TuCT1.1	Add to My Program
Exploring the Effects of (Re)Embodiment on Perceptions of Robot Teammates in Virtual Reality Environments

Kelly, Karla Bransky (The Australian National University), Sweetser, Penny (Australian National University)
Keywords: Embodiment, Empathy and Intersubjectivity, Cooperation and Collaboration in Human-Robot Teams, Anthropomorphic Robots and Virtual Humans Abstract: This study explores how robot embodiment influences human perceptions of robot teammates in virtual reality (VR) environments. In a mixed-design experiment, we simulated an immersive control room where participants enacted teaming with autonomous robots to respond to emergency events. We investigated the effects of robot re-embodiment during VR collaboration, comparing avatar type (machinelike, augmented, humanlike) for different robot types (a drone, vehicle, and humanoid) on perceptions of robot teammates. We found increased anthropomorphism improved perceptions of the robots' non-verbal expressiveness and bodily-based capabilities but reduced the perceived appearance-based trustworthiness of the robots. Despite their limited non-verbal communication abilities, machinelike embodiments were perceived as more suitable for VR interaction. At the same time, our results suggest that augmented forms offer a compromise, improving the non-verbal communication capabilities of non-anthropomorphic robots with little impact on perceptions of intelligence, trustworthiness, or social abilities. These findings highlight the trade-offs in designing multi-embodied artificial teammates and suggest that alignment between appearance and functionality is critical for effective VR-based human-robot teams.

14:12-14:24, Paper TuCT1.2	Add to My Program
Virtual Robot Riding with a Person in a Real Elevator - Mixed Reality System for Robot Behavior Design -

Adachi, Mau (Mitsubishi Electric Corporation), Kakio, Masayuki (Mitsubishi Electric Corporation), Shiomi, Masahiro (ATR), Miyashita, Takahiro (ATR)
Keywords: Social Intelligence for Robots, Novel Interfaces and Interaction Modalities Abstract: To provide more services on multiple floors in buildings, recent mobile robots have the capability to take elevators and need to board them with other passengers. When a mobile robot boards an elevator, the current technological limitations result in stress for the surrounding passengers. However, it is still unclear what impressions a robot’s quick boarding behavior will impart to surrounding passengers, when such limitations are removed in the future and mobile robots acquire human-like quick movements. It is also difficult to increase the speed of robots drastically in a manner which ensures the safety of experiment participants. In this study, we developed a mixed reality system that integrates a virtual robot into the real world to ensure the physical safety of participants during the experiment. Visual and auditory information about the virtual robot was provided to participants through a head-mounted display and was synchronized with the robot’s behavior. We focused on three basic factors of robot’s boarding behavior: forward speed during entry, rotation speed, and rotation direction in the elevator car. A statistical analysis revealed that quick, human-like entry into an elevator by a virtual robot reduces perceived anthropomorphism, likeability, intelligence, and safety, and increases the stress levels experienced by the passengers. The statistical analysis also showed that quick rotation inside the elevator by the virtual robot reduces perceived likeability. Our results provide useful knowledge for designing the behavior of future robots when they board elevators with other passengers.

14:24-14:36, Paper TuCT1.3	Add to My Program
Investigating the Influence of Cultural Backgrounds on Proxemics to Telepresence Robots

Prilla, Michael (University of Duisburg-Essen), Mouliom, Seydou Njoya (University of Clausthal)
Keywords: User-centered Design of Robots, Human Factors and Ergonomics, Cooperation and Collaboration in Human-Robot Teams Abstract: This paper reports on a study investigating the influence of cultural background on the distance people keep in communication mediated by telepresence robots. Such robots are used increasingly to allow people to connect across distances without traveling for economic, ecological, and even political reasons. However, little is known about how and whether differences in face-to-face communication between people from different cultural backgrounds also play a role in communication mediated by a telepresence robot. To close that gap and inform the design of telepresence robots, we ran a study with 20 participants from Germany and 20 from Cameroon, who talked to each other in different pairs via a telepresence robot. Our results suggest that their cultural background had an influence on distancing behavior and on the comfort they perceived with the distance. These and other results inform the design of telepresence robots for intercultural settings

14:36-14:48, Paper TuCT1.4	Add to My Program
Feel the Presence: The Effects of Haptic Sensation on VR-Based Human-Robot Interaction

Yu, Xinyan (The University of Sydney), Hoggenmueller, Marius (The University of Sydney), Tran, Tram Thi Minh (The University of Sydney), Tomitsch, Martin (University of Technology Sydney)
Keywords: Evaluation Methods, Social Touch in Human–Robot Interaction, User-centered Design of Robots Abstract: Virtual reality (VR) has been increasingly utilised as a simulation tool for human-robot interaction (HRI) studies due to its ability to facilitate fast and flexible prototyping. Despite efforts to achieve high validity in VR studies, haptic sensation, an essential sensory modality for perception and a critical factor in enhancing VR realism, is often absent from these experiments. Studying an interactive robot help-seeking scenario, we used a VR simulation with haptic gloves that provide highly realistic tactile and force feedback to examine the effects of haptic sensation on VR-based HRI. We compared participants' sense of presence and their assessments of the robot to a traditional setup using hand controllers. Our results indicate that haptic sensation enhanced participants' social and self-presence in VR and fostered more diverse and natural bodily engagement. Additionally, haptic sensations significantly influenced participants’ affective-related perceptions of the robot. Our study provides insights to guide HRI researchers in building VR-based simulations that better align with their study contexts and objectives.

14:48-15:00, Paper TuCT1.5	Add to My Program
Gaze to Grasp: Shared Autonomy in VR Robot Teleoperation

Joseph, Kevin (University of Waterloo), Hu, Yue (University of Waterloo)
Keywords: Degrees of Autonomy and Teleoperation, Novel Interfaces and Interaction Modalities, Virtual and Augmented Tele-presence Environments Abstract: Shared autonomy in robot teleoperation can ease task completion and lower cognitive load for operators by combining human intent with the autonomous capabilities of robots. As many manipulation control tasks involve the grasping of objects as the first step, augmenting assistance at this stage has the potential to improve user experience and task performance. This work discusses a new grasping assistance framework based on users' intent signalling via eye gaze. Specifically, the eye gaze direction is retrieved from a virtual reality headset during the teleoperation. This information is used to automatically determine grasping locations on the target object, after which the grasping sequence is executed without the need to perform 1-to-1 motion mapping. The grasping assistance system is implemented with the ROS2 framework to control a Kinova Gen 3 robotic manipulator in velocity mode, using a Meta Quest Pro virtual reality headset. A user study is performed with 30 participants using the developed system, with the objective of comparing the usability, workload, and performance of the grasping-assisted teleoperation with pure teleoperation (motion mapping). Results show that grasping assistance significantly reduces users' workload, but also leads to lower performance metrics with respect to pure teleoperation.


TuCT2 Special Session, Auditorium 2	Add to My Program
SS: LLM/Gen AI-Based Multimodal, Multilingual and Multitask Modeling Technologies for Robotic Systems

Chair: Li, Sheng	Institute of Science Tokyo
Co-Chair: Nakadai, Kazuhiro	Institute of Science Tokyo

14:00-14:12, Paper TuCT2.1	Add to My Program
LatentSpeech: Latent Diffusion for Text-To-Speech Generation (I)

Lou, Haowei (UNSW Sydney), Paik, Hye young (UNSW Sydney), Delir Haghighi, Pari (Monash University), Li, Sheng (Institute of Science Tokyo), HU, Wen (UNSW), Yao, Lina (Csiro & Unsw)
Keywords: Sound design for robots, Multimodal Interaction and Conversational Skills Abstract: Text-to-Speech (TTS) generation plays a crucial role in human-robot interaction by allowing robots to communicate naturally with humans. Researchers have developed various TTS models to enhance speech generation. More recently, diffusion models have emerged as a powerful generative framework, achieving state-of-the-art performance in tasks such as image and video generation. However, their application in TTS has been limited by its slow inference speeds due to their iterative denoising process. Previous work has applied diffusion models to Mel-spectrograms with an additional vocoder to convert them into waveforms. To address these limitations, we propose LatentSpeech, a novel diffusion-based TTS framework that operates directly in a latent space. This space is significantly more compact and information-rich than raw Mel-spectrograms. Furthermore, we introduce an alternative latent space of Pseudo-Quadrature Mirror Filters (PQMF), which decomposes speech into multiple subbands. By leveraging PQMF's near-perfect waveform reconstruction capability, LatentSpeech eliminates the need for a separate vocoder and reduces both model size and inference time. Our PQMF-based LatentSpeech model reduces inference time by 45% and model size by 77% compared to Mel-spectrogram diffusion models. On benchmark datasets, it achieves 25% lower WER and 58% higher MOS using the same training data. These results highlight LatentSpeech as an efficient, high-quality TTS solution for real-time and human-robot interaction. Code and models are available at https://github.com/haoweilou/LatentSpeech_Demo

14:12-14:24, Paper TuCT2.2	Add to My Program
LLM-Driven Approach for Motion Control in Human-Robot Dialogue for Elevating Engagement (I)

Baihaqi, Muhammad Yeza (Nara Institute of Science and Technology and RIKEN), Garcia Contreras, Angel Fernando (Guardian Robot Project, RIKEN), Kawano, Seiya (RIKEN), Yoshino, Koichiro (Institute of Physical and Chemical Research (RIKEN))
Keywords: Multimodal Interaction and Conversational Skills, Non-verbal Cues and Expressiveness, Novel Interfaces and Interaction Modalities Abstract: Non-verbal behaviors, such as body movements, play a crucial role in enhancing a robot’s speech to elevate engagement in human-robot dialogue. Many existing approach based on rules offered natural and engaging motions aligned with the robot's utterances but required significant resources to maintain. Recent methods leveraging large language models (LLMs) offer a promising alternative to reduce these costs. However, there is a trade-off between flexibility and safety when determining whether the language model should generate motions based on joint angle parameters or action primitives. In this study, we evaluated two LLM-based motion control models: one for motion generation based on joint angle parameters (LLM-GJA) and the other for motion generation based on primitive actions (LLM-GPA). Our human evaluations indicated that directly generating joint angles outperformed generating action primitives in naturalness, timing consistency, and overall engagement, even achieving performance comparable to rule-based systems. This work highlights the potential of LLMs in generating expressive and contextually appropriate robot motions at the joint angle level.

14:24-14:36, Paper TuCT2.3	Add to My Program
Parameter-Efficient Personalized Speech Synthesis Via EMD-Based Speaker Modeling (I)

Lei, Chengxi (Massey University), Hou, Feng (Massey University), Jahnke, Huia (Massey University), Wang, Ruili (Massey University)
Keywords: Machine Learning and Adaptation, Sound design for robots, Linguistic Communication and Dialogue Abstract: Personalized speech synthesis has attracted increasing attention in recent years. Compared to traditional speech synthesis, it faces two primary challenges: the limited availability of adaptation data and the necessity for highly efficient adaptation method using compact parameters to reduce both training time and memory consumption. To address both the challenges, this paper proposes a personalized speech synthesis approach that incorporates an Empirical Mode Decomposition (EMD)-based speaker modeling method alongside a novel decoder structure with masked inputs, which improves the model’s ability to extract speaker-specific features accurately. Furthermore, we introduce a parameter-efficient fine-tuning technique, Attention-based Speaker-Text Scaling and Shifting Feature (AST-SSF), to enhance adaptation efficiency. We validate our approach using the MAGICDATA Corpus. The results indicate that our proposed approach outperforms the baseline in both naturalness and similarity, demonstrating its effectiveness. Moreover, although the proposed adaptation method substantially reduces the number of parameters, it exhibits only minimal performance degradation compared to full and partial fine-tuning strategies.

14:36-14:48, Paper TuCT2.4	Add to My Program
Leveraging the SHALCAS22A Chinese Numerical Corpus for Enhanced Text-Dependent Speaker Verification with Decoupled Speaker and Text Embeddings (I)

Hong, Feng (Shanghai Acoustics Laboratory, Chinese Academy of Sciences), Zheng, Wan (Shanghai Acoustics Laboratory, Chinese Academy of Sciences), Zheng, Litong (Shanghai Acoustics Laboratory, Chinese Academy of Sciences), Xu, Weijie (Shanghai Acoustics Laboratory, Chinese Academy of Sciences)
Keywords: Linguistic Communication and Dialogue Abstract: Speaker verification is a core component of embodied robotic systems because it enables fast and secure user authentication. Mandarin numerical pass-phrases offer a compact lexical scope and high entropy, thus combining convenience with strong security and extending the same technology to voice-based financial payments. The community, however, lacks a public corpus of Chinese numerical strings. We address this gap by releasing SHALCAS22A, an 18.3-hour studio-quality corpus of Mandarin numerical utterances from 60 balanced speakers across nine rhythm-controlled templates, now hosted on OpenSLR (https://openslr.org/138/). Building on this resource, we present textbf{DE-CNSV}, a dual-ended network that explicitly separates text and speaker embeddings. The text branch employs an enhanced Transformer trained with a composite loss that merges text classification, connectionist temporal classification, and sequence-to-sequence decoding objectives. The speaker branch adopts a sliding-window attentive statistics pooling mechanism to better capture temporal speaker characteristics from short utterances, thereby enhancing discriminability in low-duration scenarios. The proposed system attains an equal error rate of 0.32% on the Hi-Mia dataset and 0.12% on SHALCAS22A, establishing a new benchmark for Mandarin text-dependent speaker verification. These results demonstrate that numerical TD-SV can support both secure robot–human interaction and practical voice payment services.

14:48-15:00, Paper TuCT2.5	Add to My Program
Take That for Me: Multimodal Exophora Resolution with Interactive Questioning for Ambiguous Out-Of-View Instructions (I)

Oyama, Akira (Ritsumeikan University), Hasegawa, Shoichi (Ritsumeikan University), Taniguchi, Akira (Ritsumeikan University), Hagiwara, Yoshinobu (Soka University), Taniguchi, Tadahiro (Kyoto University)
Keywords: Multi-modal Situation Awareness and Spatial Cognition Abstract: Daily life support robots must interpret ambiguous verbal instructions involving demonstratives such as ``Bring me that cup,'' even when objects or users are out of the robot's view. Existing approaches to exophora resolution primarily rely on visual data and thus fail in real-world scenarios where the object or user is not visible. We propose Multimodal Interactive Exophora resolution with user Localization (MIEL), which is a multimodal exophora resolution framework leveraging sound source localization (SSL), semantic mapping, visual-language models (VLMs), and interactive questioning with GPT-4o. Our approach first constructs a semantic map of the environment and estimates candidate objects from a linguistic query with the user's skeletal data. SSL is utilized to orient the robot toward users who are initially outside its visual field, enabling accurate identification of user gestures and pointing directions. When ambiguities remain, the robot proactively interacts with the user, employing GPT-4o to formulate clarifying questions. Experiments in a real-world environment showed results that were approximately 1.3 times better when the user was visible to the robot and 2.0 times better when the user was not visible to the robot, compared to the methods without SSL and interactive questioning. The project website is https://emergentsystemlabstudent.github.io/MIEL/.


TuCT3 Regular Session, Auditorium 3	Add to My Program
Assistive Robotics III

Chair: Mwangi, Eunice Njeri	Jomo Kenyatta University of Agriculture and Technology
Co-Chair: He, Yiyang	LANCASTER UNIVERSITY

14:00-14:12, Paper TuCT3.1	Add to My Program
3D Path Control: Can We Use Lower Limb Inter-Joint Coordination to Assist Gait and Balance?

Orhan, Zeynep Özge (EPFL), Ijspeert, Auke (EPFL), Bouri, Mohamed (EPFL)
Keywords: Assistive Robotics, User-centered Design of Robots, Cooperation and Collaboration in Human-Robot Teams Abstract: Maintaining balance during walking is a critical yet under-addressed challenge in the control of lower-limb exoskeletons, especially for users with progressive neurological conditions such as multiple sclerosis and muscular dystrophy. While assist-as-needed strategies have enabled flexible support in the sagittal plane, most exoskeletons lack active control in the frontal plane, limiting their ability to support mediolateral (ML) balance. In this study, we introduce a 3D Path Control strategy that enables coordinated assistance across hip abduction/adduction, hip flexion/extension, and knee flexion/extension. The controller is designed to provide partial gait assistance while preserving user autonomy and the ability to modulate step width, an essential mechanism for maintaining ML balance. Two experiments with healthy participants were conducted to evaluate the approach. The first experiment showed that increasing ML assistance improved alignment with a nominal coordination pattern and allowed modulation of hip abduction/adduction range of motion, and consequently, lateral foot placement. The second experiment demonstrated that even under constraining controller settings, users could still deviate from the desired path and adopt different step widths. These results suggest that 3D Path Control can simultaneously assist gait and support balance by combining structured inter-joint coordination while still providing flexibility in foot placement to the subjects.

14:12-14:24, Paper TuCT3.2	Add to My Program
An Automatic Cutting Plane Planning Method Based on Multi-Objective Optimization for Robot-Assisted Laminectomy Surgery

Liu, Gaodeng (Shenzhen Institute of Advanced Technology, Chinese Academy of Sc), Qi, Xiaozhi (Shenzhen Institutes of Advanced Technology, Chinese Academy of S), Li, Meng (Shenzhen Institute of Advanced Technology, Chinese Academy of Sci), Gao, Yongsheng (Harbin Institute of Technology), HU, Ying (Shenzhen Institute of Advanced Technology, ShenZhen, China), Hu, Lei (Beihang University), Zhao, Yu (Peking Union Medical College Hospital)
Keywords: Surgical Robotics: Planning, Constrained Motion Planning, Software Architecture for Robotic and Automation Abstract: Laminectomy represents an effective surgical procedure for the treatment of lumbar spinal stenosis. Due to the intricate anatomical structure of the lumbar spine, meticulous surgical path planning is essential to ensure the safety of the procedure and enhance the likelihood of successful outcomes. This study aims to implement multi-objective optimization techniques in the context of laminectomy, with a particular emphasis on identifying the optimal reference cutting path for the lamina. In clinical practice,the cutting path is typically characterized as a relatively straight line, akin to making an incision through the lamina with a sharp, rigid plane. Consequently, the optimal reference cutting path can be established by determining the ideal reference cutting plane. In our methodology, the cutting contour, defined as the intersection of the cutting plane with the lamina, is treated as a variable. Key features of the lamina are extracted and classified into three obiective functions: the average thickness of the lamina, the derivative of the entry point of the cutting path, and the degree of overlap between adjacent vertebrae. We then apply multi-objective optimization algorithms and utilize the weighted sum method to solve the multi-objective problems in the laminectomy task. The experimental findings are validated by an enhanced laminectomy plane evaluation system, demonstrating that the automatically generated cutting planes achieve a high level of excellence (94%), thereby satisfying the surgical requirements validated by professional surgeons.

14:24-14:36, Paper TuCT3.3	Add to My Program
Quantifying Human Mental State in Interactive pHRI: Maintaining Balancing

Abdulazeem, Nourhan (University of Waterloo), Sichert, Nils (Hamburg University of Technology), Feng, Ji Yuan (University of Waterloo), Hu, Yue (University of Waterloo)
Keywords: Human Factors and Human-in-the-Loop, Physical Human-Robot Interaction, Human-Centered Robotics Abstract: As robots increasingly enter domestic environments, investigating the impact of their physical behaviors and the potential to leverage human mental states during interaction becomes crucial. This study examines how a robot's active behavior (unanticipated physical actions) versus passive behavior (actions aligned with user expectation) affects users' mental states during a physical balance task. Our findings show that passive interaction is generally more cognitively ergonomic, while active behavior, though it reduces imbalance, adds cognitive strain. Users' perceptions of the robot are not affected by its behavior type. We conclude that combining peripheral skin temperature with age and personality traits holds significant potential for enhancing robots' ability to infer users' cognitive ergonomics and belief levels. This study explores the relatively under-researched area of active behavior in physical assistive applications with minimal sensor requirements and identifies easily obtainable online data as indicators of human mental state.

14:36-14:48, Paper TuCT3.4	Add to My Program
Leveraging GCN-Based Action Recognition for Teleoperation in Daily Activity Assistance

Kwok, Thomas M. (University of Waterloo), Li, Jiaan (University of Waterloo), Hu, Yue (University of Waterloo)
Keywords: Degrees of Autonomy and Teleoperation, Assistive Robotics, Robots in Education, Therapy and Rehabilitation Abstract: Caregiving for older adults is an urgent global challenge, with many preferring to age in place rather than enter residential care. However, providing adequate home-based assistance is difficult, particularly in geographically vast regions. Teleoperated robots offer a promising solution, but conventional motion-mapping teleoperation imposes unnatural movement constraints, causing operator fatigue and reducing usability. This paper presents a novel teleoperation framework that leverages action recognition for intuitive remote robot control. A simplified Spatio-Temporal Graph Convolutional Network (S-ST-GCN) recognizes human actions and executes preset robot trajectories, eliminating the need for direct motion synchronization. A finite-state machine (FSM) further enhances reliability by filtering misclassified actions. Experiments demonstrate that the framework enables effortless operator movement and accurate robot execution. This proof-of-concept study highlights the potential of action-recognition-based teleoperation to help caregivers remotely assist older adults with daily activities. Future work will focus on improving the S-ST-GCN's recognition accuracy and generalization, integrating advanced motion planning techniques for greater robot autonomy, and conducting user studies to evaluate telepresence and usability.


TuCT4 Regular Session, Blauwe Zaal	Add to My Program
Applications of Social Robots III

Chair: Haring, Kerstin Sophie	University of Denver
Co-Chair: Blair, Andrew	University of Glasgow

14:00-14:12, Paper TuCT4.1	Add to My Program
Should Delivery Robots Intervene If They Witness Civilian or Police Violence? an Exploratory Investigation

Seassau, Tilly (King's College London), Wu, Wenxi (King's College London), Williams, Tom (Colorado School of Mines), Brandao, Martim (King's College London)
Keywords: User-centered Design of Robots, Human Factors and Ergonomics, Ethical Issues in Human-robot Interaction Research Abstract: As public space robots navigate our streets, they are likely to witness various human behavior, including verbal or physical violence. In this paper we investigate whether people believe delivery robots should intervene when they witness violence, and their perceptions of the effectiveness of different conflict de-escalation strategies. We consider multiple types of violence (verbal, physical), sources of violence (civilian, police), and robot designs (wheeled, humanoid), and analyze their relationship with participants' perceptions. Our analysis is based on two experiments using online questionnaires, investigating the decision to intervene (N=80) and intervention mode (N=100). We show that participants agreed more with human than robot intervention, though they often perceived robots as more effective, and preferred certain strategies, such as filming. Overall, the paper shows the need to investigate whether and when robot intervention in human-human conflict is socially acceptable, to consider police-led violence as a special case of robot de-escalation, and to involve communities that are common victims of violence in the design of public space robots with safety and security capabilities.

14:12-14:24, Paper TuCT4.2	Add to My Program
Improving Social Robot Acceptance in Public Libraries by Qualitative Analysis of TAM

Kubullek, Ann-Kathrin (Ruhr West University of Applied Science), Hermann, Julia (Ruhr West University of Applied Sciences), Mäder, Aiden Danny (Hochschule Ruhr-West), Lisetschko, Artur (Ruhr West University of Applied Sciences), Dogangün, Aysegül (University of Applied Sciences Ruhr West)
Keywords: Applications of Social Robots, User-centered Design of Robots, Social Intelligence for Robots Abstract: As public libraries adopt social robots to enhance visitor interactions, understanding factors driving user acceptance is crucial. Based on the Technology Acceptance Model (TAM), present study investigates how perceived usefulness (PU), perceived ease of use (PEOU), social influences, and perceived enjoyment shape the acceptance of social robots in public libraries. 65 participants interacted in a field study held in two public libraries with the Pepper robot for book presentations. Following a qualitative approach, the influence of TAM constructs on the user experience (UX) with Pepper was explored. Results show participants valued pragmatic properties such as efficiency and task fulfillment, underscoring PU. However, transparency issues impaired usability. Social influences shaped users‘ attitudes, and perceived enjoyment emerged as an important factor, with mixed responses to the robot’s emotional engagement. Insights were used to derive 20 recommendations for improving Human-Robot Interaction (HRI) in public libraries.

14:24-14:36, Paper TuCT4.3	Add to My Program
Rebranding Sex Robots: Realbotix's Corporate Metamorphosis

Masterson, Annette (University of Michigan Ann Arbor), Robert, Lionel (University of Michigan)
Keywords: Creating Human-Robot Relationships, Robot Companions and Social Robots, Social Intelligence for Robots Abstract: Robots are rapidly becoming more interactive and dyadic. With advancements in artificial intelligence and robotic movements, companies are shifting their corporate messaging to highlight the social and companionship features of their robots. Realbotix’s recent rebranding exemplifies a deliberate effort to carve a new path within the humanoid robotics industry. Grounded in political economy and discourse analysis, this paper examines 86 publicity interviews and press releases from Realbotix to assess the positioning of intimacy and its associated corporate power. The findings reveal a focus on the robot’s social intelligence, framing the company as a leader in humanoid robotics and reshaping human–robot interactions.

14:36-14:48, Paper TuCT4.4	Add to My Program
Enhancing Safety and User Experience in Automated Driving: A Multimodal Comparison of Pneumatic and Vibrotactile Haptic Feedback Takeover Scenarios

Liu, Yang (Télécom Paris, IP Paris), Shangguan, Zhegong (University of Manchester), Tapus, Adriana (ENSTA Paris, Institut Polytechnique De Paris), Détienne, Françoise (Télécom Paris), Safin, Stéphane (Télécom Paris), Lecolinet, Eric (Télécom ParisTech)
Keywords: Affective Computing, Multimodal Interaction and Conversational Skills, Multi-modal Situation Awareness and Spatial Cognition Abstract: The seamless transition of control between drivers and autonomous systems remains a critical challenge in automated driving, affecting both safety outcomes and overall user experience. To address this challenge, our study examines the effectiveness of two distinct haptic feedback approaches—pneumatic and vibrotactile—when implemented as intelligent interface components for takeover requests (TORs) during these transition periods. We specifically investigate how these haptic modalities can effectively signal drivers when human intervention is required, facilitating smoother control transitions from automated to manual driving. We designed a comprehensive experimental setup integrating these haptic modalities with audio and visual cues and evaluated their performance across nine interaction tasks to understand how multi-modal feedback influences driver responsiveness during takeover scenarios. Our findings reveal that multi-modal approaches incorporating either pneumatic or vibrotactile feedback, combined with standard visual cues, substantially outperform audio-only alerts in both response time and accuracy metrics for takeover requests (TORs). Notably, pneumatic feedback offered more natural sensation and smoother transitions than vibrotactile feedback, with pneumatic systems excelling in comfort while vibrotactile feedback better serves urgent takeovers. This first systematic comparison provides valuable insights for developing interfaces that balance effectiveness with comfort in human-machine systems.

14:48-15:00, Paper TuCT4.5	Add to My Program
GRAsPAD: Generalized Framework for Optimal Grasp Key-Points Active Detection

Efstathopoulos, Nikolaos (Hellenic Mediterranean University), Kounalakis, Nikolaos (Hellenic Mediterranean University), Balaska, Vasiliki (Democritus University of Thrace), Fasoulas, John (Hellenic Mediterranean University), Papageorgiou, Dimitrios (Hellenic Mediterranean University)
Keywords: Detecting and Understanding Human Activity, Programming by Demonstration Abstract: In the challenging context of human-robot collaboration, the ability to capture critical human-grasp-related features in a short period is essential for the transfer of skills/experience and the effective imitation learning of human grasping strategies to robotic systems. The paper at hand introduces a novel system for pose optimization of an active (moving) sensor, reducing uncertainty in grasp perception and enabling a more accurate observation of the human grasp pose. The proposed framework, coined as ”GRAsPAD” (Generalized fRAmework for optimal graSp key-Points Active Detection), leverages a keypoint detection model to identify grasp-related features in a tomato fruit. The camera pose is iteratively adjusted through the Covariance Matrix Adaptation with Margin - Evolution Strategy (CMAwM-ES) to minimize uncertainty in detected keypoints, enhancing the accuracy of grasp perception and ensuring a clearer, more stable grasp representation. To mitigate sensor noise and variability, a Kalman filter is applied to refine keypoint detections over time. The experimental results demonstrate the significant improvement achieved by the pose optimization of the observer, in terms of grasp perception quality, which is critical for robotic skill learning and effective transfer of human grasping strategies to robotic systems.


TuCT5 Regular Session, Auditorium 5	Add to My Program
Linguistic Communication and Dialogue I

Chair: Jelinek, Matous	University of Southern Denmark
Co-Chair: Tan, Sihan	Institute of Science Tokyo

14:00-14:12, Paper TuCT5.1	Add to My Program
Exploring Unstructured Language Feedback for Robot Learning

Kuehn, Hannah (KTH Royal Institute of Technology), Ahlberg, William (Royal Institute of Technology (KTH)), La Delfa, Joseph (KTH Royal Institute of Technology), Leite, Iolanda (KTH Royal Institute of Technology)
Keywords: Social Learning and Skill Acquisition Via Teaching and Imitation, Linguistic Communication and Dialogue, Machine Learning and Adaptation Abstract: In this paper, we aim to explore how humans give unstructured free-form natural language feedback towards correcting robot task policies. We present a qualitative study based on crowd-sourced feedback from 66 participants. Participants give feedback on the execution of three robotic tasks in the form of mobile navigation, dexterous object manipulation and a robot arm opening a door. We find through reflexive thematic analysis which features participants reference most and what other patterns are present in the feedback, including that responders do not naturally give concrete and actionable feedback. We attribute the lack of feedback concreteness to a lack of engagement with robot behavior and false assumptions about who is receiving the feedback and how much knowledge participants have. The study presents a step towards better understanding unstructured language feedback for robotic learning.

14:12-14:24, Paper TuCT5.2	Add to My Program
How Conversation Type and Presumed Message Source Influence Users’ Trust towards Mental Health Conversational Agents: The Mediator Effect of Intentional Stance

Fang, Chen (Ghent University), Guo, Fu (Northeastern University), Belpaeme, Tony (University of Ghent - IMEC)
Keywords: Human Factors and Ergonomics, Cognitive Skills and Mental Models, Detecting and Understanding Human Activity Abstract: Mental health conversational agents (CAs) are gaining increasing attention as accessible tools for social communication, emotional support, and stress relief. These agents introduce new forms of human-AI interaction, yet the factors influencing user trust remain underexplored. Prior research suggests that conversation type and presumed message source may shape users’ experience, but their effects on users’ intentional stance and trust in CAs are not well understood. To address this gap, we first conducted a pre-study to develop a questionnaire for measuring users’ intentional stance towards mental health CAs. We then carried out a 2 × 2 mixed-design experiment to examine how conversation type and presumed message source influence intentional stance and trust, and whether intentional stance mediates the relationship between conversation type and trust. Results show that conversation type significantly influences user trust, mediated by intentional stance, while presumed message source had no significant effect. These findings advance our understanding of how users form trust in mental health CAs and offer implications for designing more engaging and trustworthy conversational systems in mental health contexts.

14:24-14:36, Paper TuCT5.3	Add to My Program
A Model-Agnostic Approach for Semantically Driven Disambiguation in Human-Robot Interaction

Dogan, Fethiye Irmak (KTH Royal Institute of Technology), Patel, Maithili (Georgia Institute of Technology), Liu, Weiyu (Stanford University), Leite, Iolanda (KTH Royal Institute of Technology), Chernova, Sonia (Georgia Institute of Technology)
Keywords: Linguistic Communication and Dialogue, Machine Learning and Adaptation, Multimodal Interaction and Conversational Skills Abstract: Ambiguities are inevitable in human-robot interaction, especially when a robot follows user instructions in a large, shared space. For example, if a user asks the robot to find an object in a home environment with underspecified instructions, the object could be in multiple locations depending on missing factors. For instance, a bowl might be in the kitchen cabinet or on the dining room table, depending on whether it is clean or dirty, full or empty, and the presence of other objects around it. Previous works on object search have assumed that the queried object is immediately visible to the robot or have predicted object locations using one-shot inferences, which are likely to fail for ambiguous or partially understood instructions. This paper focuses on these gaps and presents a novel model-agnostic approach leveraging semantically driven clarifications to enhance the robot's ability to locate queried objects in fewer attempts. Specifically, we leverage different knowledge embedding models, and when ambiguities arise, we propose an informative clarification method, which follows an iterative prediction process. The user experiment evaluation of our method shows that our approach is applicable to different custom semantic encoders as well as LLMs, and informative clarifications improve performances, enabling the robot to locate objects on its first attempts. The user experiment data is publicly available at https://github.com/IrmakDogan/ExpressionDataset.

14:48-15:00, Paper TuCT5.5	Add to My Program
Bootstrapping Human-Like Planning Via LLMs

Porfirio, David (U.S. Naval Research Laboratory), Hsiao, Vincent (U.S. Naval Research Laboratory), Fine-Morris, Morgan (U.S. Naval Research Laboratory), Smith, Leslie (Naval Research Laboratory), Hiatt, Laura M. (Naval Research Laboratory)
Keywords: Linguistic Communication and Dialogue, Cooperation and Collaboration in Human-Robot Teams, Novel Interfaces and Interaction Modalities Abstract: Robot end users increasingly require accessible means of specifying tasks for robots to perform. Two common end-user programming paradigms include drag-and-drop interfaces and natural language programming. Although natural language interfaces harness an intuitive form of human communication, drag-and-drop interfaces enable users to meticulously and precisely dictate the key actions of the robot's task. In this paper, we investigate the degree to which both approaches can be combined. Specifically, we construct a large language model (LLM)-based pipeline that accepts natural language as input and produces human-like action sequences as output, specified at a level of granularity that a human would produce. We then compare these generated action sequences to another dataset of hand-specified action sequences. Although our results reveal that larger models tend to outperform smaller ones in the production of human-like action sequences, smaller models nonetheless achieve satisfactory performance.


TuCT6 Regular Session, Auditorium 6	Add to My Program
LLM-Enhanced Social Robotics I

Chair: Shibata, Takanori	AIST

14:00-14:12, Paper TuCT6.1	Add to My Program
SYNERGY: An LLM-Based System for Smart Home Device Control and User Social Interaction

Nguyen Huynh Thao, My (International University, Ho Chi Minh City, Vietnam - Vietnam Na), Dang, Nguyen Nam Anh (University of Huddersfield), Le, Duy Tan (Vietnam National University Ho Chi Minh City), Tuyen, Nguyen Tan Viet (University of Southampton)
Keywords: Cooperation and Collaboration in Human-Robot Teams, Motion Planning and Navigation in Human-Centered Environments, Machine Learning and Adaptation Abstract: Socially assistive robots and smart home devices are increasingly integrated into daily life, offering emotional and social support in a modern society where many individuals live alone. Inspired by this context, this paper presents the SYNERGY framework for managing smart home environments comprising multiple smart devices and assistive robots. SYNERGY is designed to handle a broad spectrum of user requests, from simple queries to complex tasks requiring multi-step reasoning, and assigns them to appropriate agents for optimal execution. We conducted a series of experiments under various configurations to evaluate the framework's performance in processing user queries across socially relevant contexts and its effectiveness in task allocation. Experimental results demonstrate that incorporating contextual retrieval into our designed decision-making module significantly improves the system’s understanding of user intent and is crucial in handling complex, multi-step tasks. Additionally, the designed task allocation module proves its effectiveness in optimizing assignments using cost matrices, enabling flexible and efficient multi-agent coordination.

14:12-14:24, Paper TuCT6.2	Add to My Program
Gaze-Supported Large Language Model Framework for Bi-Directional Human-Robot Interaction

Rüppel, Jens Volker (Technical University of Munich), Rudenko, Andrey (Robert Bosch GmbH), Schreiter, Tim (Örebro University), Magnusson, Martin (Örebro University), Lilienthal, Achim J. (Orebro University)
Keywords: Multimodal Interaction and Conversational Skills, Cooperation and Collaboration in Human-Robot Teams, Non-verbal Cues and Expressiveness Abstract: The rapid development of Large Language Models (LLMs) creates an exciting potential for flexible, general knowledge-driven Human-Robot Interaction (HRI) systems for assistive robots. Existing HRI systems demonstrate great progress in interpreting and following user instructions, action generation, and robot task solving. On the other hand, bi-directional, multimodal, and context-aware support of the user in collaborative tasks still remains an open challenge. In this paper, we present a gaze- and speech-informed interface to the assistive robot, which is able to perceive the working environment from multiple vision inputs and support the dynamic user in their tasks. Our system is designed to be modular and transferable to adapt to diverse tasks and robots, and it is real-time capable due to the language-based interaction state representation and fast on-board perception modules. Its development was supported by multiple public dissemination events, contributing important considerations for improved robustness and user experience. Furthermore, in a lab study, we compare the performance and user ratings of our system with those of a traditional scripted HRI pipeline. Our findings indicate that an LLM-based approach enhances adaptability and marginally improves user engagement and task execution metrics but may produce redundant output, while a scripted pipeline is well suited for more straightforward tasks.

14:24-14:36, Paper TuCT6.3	Add to My Program
Speech Recognition and LLM Performance in Elderly Care Home Conversations

Pinto-Bernal, Maria (Ghent University—imec), Belpaeme, Tony (University of Ghent - IMEC)
Keywords: Applications of Social Robots, Multimodal Interaction and Conversational Skills, Evaluation Methods Abstract: Conversational robots offer promise in elderly care, but dialectal speech poses challenges for automatic speech recognition (ASR). This study evaluates a conversational robot integrating Microsoft Azure ASR and GPT-4o in real-world interactions with elderly users. Results show that ASR accuracy varied significantly (95% for standard French, 45–56% for Dutch dialects (e.g., West Flemish), often leading to transcription errors. Despite this, the LLM restored conversational coherence in 44–52% of misrecognitions, while users contributed 25–35% of repairs. Comparative ASR analysis showed Whisper’s superior dialectal robustness (28% WER) but high latency. Interaction durations ranged from 17 to 45 minutes, with participants perceiving the robot as understanding them despite ASR challenges. This study uniquely integrates ASR performance, LLM recovery, and user adaptation, highlighting the need for hybrid ASR solutions and context-aware dialogue management in elderly-care robots. Findings highlight the importance of context-aware dialogue management, hybrid ASR strategies, and user-driven conversational adaptation for effective human-robot interactions in real-world settings.

14:36-14:48, Paper TuCT6.4	Add to My Program
Dynamic Prompting Improves Turn-Taking in Embodied Spoken Dialogue Systems

SHEN, Yifan (Hong Kong University of Science and Technology), Liu, Dingdong (HKUST), MO, Xiaoyu (The Hong Kong University of Science and Technology), Tsung, Fugee (HKUST), Ma, Xiaojuan (Computer Science & Engineering, Hong Kong University of Science), Shi, Bertram Emil (Hong Kong University of Science and Technology)
Keywords: Linguistic Communication and Dialogue, Anthropomorphic Robots and Virtual Humans, Multimodal Interaction and Conversational Skills Abstract: The ability to coordinate turn taking during spoken dialogue is crucial for embodied spoken dialogue systems (SDS), like humanoid robots. The SDS needs to model shifts in the conversational floor, which describes each party's stance (either speaking or listening). Further, the SDS needs to signal its perception of the floor to the human, so that they can coordinate floor shifts and resolve conflicts. Conventional SDS employ standalone modules to control floor shifts but do not produce timely and appropriate responses. Recent end-to-end audio LLMs generate responses quickly, but do not coordinate floor shifts as accurately. In this work, we propose an SDS architecture that dynamically adjusts its prompts to an end-to-end audio LLM based upon its perception of the conversational floor state. The LLM output determines not only the audio output, but also the perceived floor state. This enables the system to signal its stance to the human, both when listening and when speaking. We conducted an experiment where a humanoid robot administered a semi-structured interview with human subjects. Results show that, compared with baseline systems using static prompts, dynamic prompting enables the LLM to model floor shifts more accurately and generate more appropriate signalling and fewer interruptions, Overall, this leads to smoother turn-taking in dialogue.

14:48-15:00, Paper TuCT6.5	Add to My Program
LLM-Enhanced Interactions in Human-Robot Collaborative Drawing with Older Adults

Bossema, Marianne (University of Applied Sciences Amsterdam), Ben Allouch, Somaya (University of Amsterdam), Plaat, Aske (Leiden University), Saunders, Rob (Leiden University)
Keywords: Robots in Education, Therapy and Rehabilitation, Cooperation and Collaboration in Human-Robot Teams, Multimodal Interaction and Conversational Skills Abstract: The goal of this study is to identify factors that support and enhance older adults’ creative experiences in human-robot co-creativity. Because the research into the use of robots for creativity support with older adults remains underexplored, we carried out an exploratory case study. We took a participatory approach and collaborated with professional art educators to design a course "Drawing with Robots" for adults aged 65 and over. The course featured human-human and human-robot drawing activities with various types of robots. We observed collaborative drawing interactions, interviewed participants on their experiences, and analyzed collected data. Findings show that participants preferred acting as curators, evaluating creative suggestions from the robot in a teacher or coach role. When we enhanced a robot with a multimodal Large Language Model (LLM), participants appreciated its spoken dialogue capabilities. They reported however, that the robot’s feedback sometimes lacked an understanding of the context, and sensitivity to their artistic goals and preferences. Our findings highlight the potential of LLM-enhanced robots to support creativity and offer future directions for advancing human-robot co-creativity with older adults.


TuCT7 Regular Session, Auditorium 7	Add to My Program
Robots in Families, Education, Therapeutic Contexts & Arts I

Chair: Li, Yinchu	Eindhoven University of Technology

14:00-14:12, Paper TuCT7.1	Add to My Program
Exploring Causality for HRI: A Case Study on Robotic Mental Well-Being Coaching

Spitale, Micol (Politecnico Di Milano), Gadipudi, Srikar Babu (Indian Institute of Technology Madras), Cakmak, Serhan (Department of Computer En Gineering, Bogazici University), Cheong, Jiaee (University of Cambridge), Gunes, Hatice (University of Cambridge)
Keywords: Applications of Social Robots, Affective Computing Abstract: One of the primary goals of Human-Robot Interaction (HRI) research is to develop robots that can interpret human behavior and adapt their responses accordingly. Adaptive learning models, such as continual and reinforcement learning, play a crucial role in improving robots' ability to interact effectively in real-world settings. However, these models face significant challenges due to the limited availability of real-world data, particularly in sensitive domains like healthcare and well-being. This data scarcity can hinder a robot’s ability to adapt to new situations. To address these challenges, causality provides a structured framework for understanding and modeling the underlying relationships between actions, events, and outcomes. By moving beyond mere pattern recognition, causality enables robots to make more explainable and generalizable decisions. This paper presents an exploratory causality-based analysis through a case study of an adaptive robotic coach delivering positive psychology exercises over four weeks in a workplace setting. The robotic coach autonomously adapts to multimodal human behaviors, such as facial valence and speech duration. By conducting both macro- and micro-level causal analyses, this study aims to gain deeper insights into how adaptability can enhance well-being during interactions. Ultimately, this research seeks to advance our understanding of how causality can help overcome challenges in HRI, particularly in real-world applications.

14:12-14:24, Paper TuCT7.2	Add to My Program
Preserving Style Identity of Dance Choreographies Mapped from Human to Robotic Arm

Villani, Alberto (University of Siena), SAVIANO, GIUSEPPE (University of Pisa), Prattichizzo, Domenico (University of Siena)
Keywords: Art pieces supported by robotics, Robots in art and entertainment, Applications of Social Robots Abstract: Traditional dances play a crucial role in preserving cultural identity, fostering community bonds, and maintaining artistic heritage. The integration of robotics into this domain, leveraging AI, motion-capture, and mapping algorithms, introduces new possibilities for replicating traditional choreography using artificial agents. However, this fusion raises important questions about authenticity, cultural impact, and the role of technology in artistic expression. These concerns become even more relevant when the artificial agent is a non-humanoid robot, as the mapping process becomes more complex. In a previous study, we proposed a Principal Component Analysis (PCA)-based projection method to transfer human dance movements onto robotic arms. This method aims to minimize movement loss by efficiently adapting the high degrees of freedom of the human body to the constrained capabilities of a robotic manipulator. While earlier research confirmed the method’s ability to produce robot movements consistent with human references, this study further explores its impact on preserving dance style identity. To evaluate this, we perform a two-phase numerical analysis. First, we investigate whether the method retains stylistic differences between movements and how it influences them. If a reduction in stylistic distinctions is observed, we proceed to examine the statistical uniqueness of the robot-generated trajectories by comparing them to human movements from both the same and different dance styles. This deeper analysis provides insights into whether the proposed mapping method maintains the essence of stylistic identity despite the transformation from human motion to robotic execution.

14:24-14:36, Paper TuCT7.3	Add to My Program
Exploring Human Perceptions of AI-Driven Musical Robots: A Study on RoSAS with Keirzo

Melder, Trinity (Macquarie University), Savery, Richard (Georgia Inst. of Technology)
Keywords: Motivations and Emotions in Robotics, Social Intelligence for Robots, Multimodal Interaction and Conversational Skills Abstract: As AI-driven robots become more integrated into daily life, understanding user perceptions is crucial for improving their design and interaction. This study investigates the impact of interruptibility and response unpredictability on user engagement with Keirzo, an AI-powered musical robot. Using the Robotic Social Attributes Scale (RoSAS), participants engaged with Keirzo under two conditions: one allowing interruptions and one requiring them to wait for complete responses. Findings suggest that while the ability to interrupt offered a greater sense of control, it did not significantly increase engagement. Participants generally rated Keirzo as more competent when its responses were structured and coherent, whereas repetitive or unpredictable replies reduced perceived intelligence. Perceptions of personality were mixed; some found the robot engaging and expressive, while others viewed it as mechanical or detached. These results highlight the importance of balancing control, coherence, and expressiveness in AI-driven musical interactions. As the findings are exploratory, future work should involve more adaptive systems and larger sample sizes to further examine these dynamics in creative HRI contexts.

14:36-14:48, Paper TuCT7.4	Add to My Program
If I Move, Do You Move? Investigating the Role of Interpersonal Synchrony in Human-Robot Joint Painting

Boadi-Agyemang, Abena (Carnegie Mellon University), Schaldenbrand, Peter (Carnegie Mellon University), Misra, Vihaan (Carnegie Mellon University), Carter, Elizabeth (Carnegie Mellon University), Oh, Jean (Carnegie Mellon University), Steinfeld, Aaron (Carnegie Mellon University)
Keywords: Cooperation and Collaboration in Human-Robot Teams, Robots in art and entertainment, Social Intelligence for Robots Abstract: Interpersonal synchrony (IS), the behavioral and physiological coordination across time and space, plays a crucial role in social interactions by fostering empathy, closeness, and prosocial behaviors. However, there is a need for more examination of human-robot interaction (HRI) research focused on interactions where the temporal alignment of body movements and the spatial coordination of the content produced by those movements is vital to the quality of the interaction, such as in joint visual art-making. In this work, we investigated the impact of IS on human raters' perceptions of a human-robot (HR) dyad engaged in a joint painting activity. We conducted two online studies (n=70, total) in which participants watched 4 videos (1 repeated synchronous video and 3 asynchronous videos). We varied the degree of IS displayed by an HR dyad on two axes: (a) temporal alignment (e.g., speed of producing brush strokes) and (b) spatial similarity (i.e., similarity in the visual content produced). Our results indicate that some temporal and spatial dimensions of IS displayed by an HR dyad during joint painting have significant positive impacts on external observers' perceptions of the robot, including prosocial tendencies (i.e., empathy, synchrony, and closeness) and acceptance. These findings are significant for emergent research on collaborative robots.

14:48-15:00, Paper TuCT7.5	Add to My Program
The Fluffy Tightrope - Examining Zoomorphic Robot Interactions for Promoting Active Behavior in a Comfortable Setting

Ringe, Rachel (University of Bremen), Dänekas, Bastian (Universität Bremen), Bork, Anika (University of Bremen), Hurrelbrink, Lars (University of Bremen), Kröger, Christopher (University of Bremen), Litvin, Yuliya (University of Bremen), Madam Sampangiramu, Srujana (University of Bremen), Žemberi, Ivana (University of Bremen), Malaka, Rainer (University of Bremen)
Keywords: Robot Companions and Social Robots, Non-verbal Cues and Expressiveness Abstract: In this explorative study, we investigated how a zoomorphic dog-like robot could encourage users to engage in active behavior through nudges with different levels of intrusiveness. We examined three hypotheses on the effectiveness of different intensity levels, the impact of zoomorphic design on emotional responses, and the influence of prior dog experience on interactions. Using a within subject approach with 34 participants, we found that low-intensity nudges were most sufficient at encouraging active behavior, contradicting our first hypothesis. Participants generally responded positively to the zoomorphic design, supporting our second hypothesis. Our third hypothesis was approved as participants with prior dog experience perceived the robot as less intrusive and reported higher user experience. This finding was complemented by these specific users putting a powerful mental model of a real dog directly on our robot dog, sometimes even resulting in projecting attributes like “thirst" or other desires to the robot. We call this new finding “Fluffy Tightrope" - the balancing act between realism and abstraction in designing a zoomorphic robot, so that users with animal experience respond to nudges, but do not project too strong wishes and desires onto the robot resulting in misinterpretations.


TuPM3_BR Voorhof/Hall	Add to My Program
Coffee Break (Tu2) 15: 00-15: 20


TuDT1 Regular Session, Auditorium 1	Add to My Program
Design Methodologies in Social Robotics I

Chair: Tilbury, Dawn	University of Michigan
Co-Chair: Kelly, Karla Bransky	The Australian National University

15:20-15:32, Paper TuDT1.1	Add to My Program
Exploring Participatory Design for Delivery Robots to Prevent and Address Impediment from a Bystander Perspective

Song, Heqiu (RWTH Aachen University), Fan, Xing (RWTH Aachen University), Meyer, Pascal (RWTH Aachen Univeristy), Rosenthal-von der Pütten, Astrid Marieke (RWTH Aachen University)
Keywords: Innovative Robot Designs, Assistive Robotics, Embodiment, Empathy and Intersubjectivity Abstract: The increasing deployment of delivery robots has highlighted both opportunities and challenges, including intentional and unintentional impediments to these robots. This study explores the design space of robot behavior to respond to such situations while considering the social group characteristics of the human interactants from the perspective of a bystander. In focus groups (n = 13), we identified human personas who (un)intentionally impede the tasks of delivery robots and collaboratively designed potentially suitable robot responses, resulting in the identification of seven human personas and five robot personas. The suitability of robot responses to human personas was subsequently evaluated in a within-between-subject online survey (n = 49), which also examined the influence of gendered robot voice (male vs. female; between-subject factor). The findings reveal a clear preference for passive and polite robot responses over assertive robot responses, with no significant differences in perception between male and female-voiced robots. This study provides insights for designers seeking to create effective and socially aware delivery robots capable of managing obstructive encounters.

15:32-15:44, Paper TuDT1.2	Add to My Program
Speculative News on Possible Futures with Robots

Rezzani, Andrea (Free University of Bolzano-Bozen), Bermúdez Chinea, Julio Daniel (Free University of Bozen-Bolzano), Domanti, Umberto (Free University of Bozen-Bolzano), Menéndez-Blanco, María (Free University of Bozen-Bolzano), De Angeli, Antonella (Free University of Bozen-Bolzano)
Keywords: Narrative and Story-telling in Interaction, Storytelling in HRI, Ethical Issues in Human-robot Interaction Research Abstract: Imagining the future of HRI is crucial for anticipating ethical, social, and technological challenges. This paper proposes a Speculative News workshop procedure to engage high school students in discussion and reflection while creating the front page of a newspaper published in 2125. Results from 40 participants explored robotic imaginaries highlighting societal issues, human attitudes toward robots, and technological feasibility. The results open questions that expand HRI knowledge beyond artefacts and deep into societies. In conclusion, we reflect on how Speculative News can foster participatory and inclusive design.

15:44-15:56, Paper TuDT1.3	Add to My Program
Shaping Expressiveness in Robotics: The Role of Design Tools in Crafting Embodied Robot Movements

Zibetti, Elisabetta (CHArt-LUTIN Laboratory), Mercader, Alexandra Léna Victoria (Technical University Munich), Duval, Hélène (Universite Du Quebec a Montreal), Levillain, Florent (Ensadlab-Reflective Interaction), Rochette, Audrey (Universite Du Quebec a Montreal), St-Onge, David (Ecole De Technologie Superieure)
Keywords: User-centered Design of Robots, Interaction Kinesics, Embodiment, Empathy and Intersubjectivity Abstract: As robots increasingly become part of shared human spaces, their movements must transcend basic functionality by incorporating expressive qualities to enhance engagement and communication. This paper introduces a movement-centered design pedagogy designed to support engineers in creating expressive robotic arm movements. Through a hands-on interactive workshop informed by interdisciplinary methodologies, participants explored various creative possibilities, generating valuable insights into expressive motion design. The iterative approach proposed integrates analytical frameworks from dance, enabling designers to examine motion through dynamic and embodied dimensions. A custom manual remote controller facilitates interactive, real-time manipulation of the robotic arm, while dedicated animation software supports visualization, detailed motion sequencing, and precise parameter control. Qualitative analysis of this interactive design process reveals that the proposed "toolbox" effectively bridges the gap between human intent and robotic expressiveness resulting in more intuitive and engaging expressive robotic arm movements.

15:56-16:08, Paper TuDT1.4	Add to My Program
Experiential Science Fiction Prototyping for Envisioning Future Life with Robots

Sawada, Tomoka (The University of Tokyo), Ichikura, Aiko (University of Tokyo), Yanokura, Iori (University of Tokyo), Okada, Kei (The University of Tokyo), Inaba, Masayuki (The University of Tokyo)
Keywords: User-centered Design of Robots, Creating Human-Robot Relationships, Innovative Robot Designs Abstract: Science Fiction Prototyping (SFP) is a method that uses science fiction to imagine future technologies and foster innovation. It is considered effective for exploring human-robot relationships and envisioning better robot designs. However, robot embodiment influences human perception, which plays a crucial role in interaction. Simply imagining future scenarios with robots through SFP may overlook these aspects. We propose an approach called Experiential Science Fiction Prototyping (ESFP), which adds a phase of experiencing the story to the traditional SFP process. To explore the effects of ESFP, we conducted a workshop with Japanese teenagers under the theme of designing a robot that contributes to a sense of “ibasho”—a Japanese concept referring to a space or relationship where one feels accepted and comfortable. ESFP unfolds in three phases: Storytelling, where participants envision future lives with robots and create stories; Experience, where they bring these stories to life through interaction with a physical robot; and Discussion, where they reflect on the story they created and experienced. The results suggested that, through the experiential phase, participants developed new ideas about interaction with robots and expanded their imagination about future relationships. Experiencing the story helped participants connect more closely with the envisioned robot interactions and inspired new reflections and expectations. This study contributes by proposing the ESFP method, detailing its implementation, and discussing its potential through a case study.

16:08-16:20, Paper TuDT1.5	Add to My Program
Teaching Human-Robot Interaction: Using Speculation and Fiction to Make Spaces for Possible Robotic Futures

Lim, Sharmayne (Cornell University), Kwon, Nayeon (Cornell University), Gendreau, Eric (Cornell University), Green, Keith Evan (Cornell University)
Keywords: User-centered Design of Robots, Narrative and Story-telling in Interaction, Innovative Robot Designs Abstract: We explore speculation ("wondering about how things could be”) as a pedagogical tool for teaching HRI. We focus on the potential of speculation to create spaces for discussion and debate on future HRI scenarios. Drawing particularly on dystopian fiction, students are encouraged to imagine human-robot interactions that offer alternative, "possible futures” in response to the negative consequences of technology. This approach challenges traditional user-centered design by guiding students to prototype complex HRI systems. To illustrate this, we introduce the motivations for and the process of our approach and present case studies from our own course delivery. While our case study centers on designing robotic environments--an emerging subfield of HRI--we see the approach as broadly applicable to the design of social robots and other embodied forms of robotics.


TuDT2 Regular Session, Auditorium 2	Add to My Program
Explainable Human-Robot Interaction

Chair: Lager, Anders	ABB AB

15:20-15:32, Paper TuDT2.1	Add to My Program
Personalised Explanations in Long-Term Human-Robot Interactions (I)

Gebellí, Ferran (PAL Robotics), Garrell, Anais (UPC-CSIC), Habekost, Jan-Gerrit (University of Hamburg), Lemaignan, Séverin (PAL Robotics), Wermter, Stefan (University of Hamburg), Ros, Raquel (IIIA-CSIC)
Keywords: Machine Learning and Adaptation, Cognitive Skills and Mental Models, Ethical Issues in Human-robot Interaction Research Abstract: In the field of Human-Robot Interaction (HRI), a fundamental challenge is to facilitate human understanding of robots. The emerging domain of eXplainable HRI (XHRI) investigates methods to generate explanations and evaluate their impact on human-robot interactions. Previous works have highlighted the need to personalise the level of detail of these explanations to enhance usability and comprehension. Our paper presents a framework designed to update and retrieve user knowledge-memory models, allowing for adapting the explanations' level of detail while referencing previously acquired concepts. Three architectures based on our proposed framework that use Large Language Models (LLMs) are evaluated in two distinct scenarios: a hospital patrolling robot and a kitchen assistant robot. Experimental results demonstrate that a two-stage architecture, which first generates an explanation and then personalises it, is the framework architecture that effectively reduces the level of detail only when there is related user knowledge.

15:32-15:44, Paper TuDT2.2	Add to My Program
Modeling of Bowing Behaviors in Apology Based on Severity

Shiomi, Masahiro (ATR), Hirayama, Taichi (Doshisha University), Kimoto, Mitsuhiko (Meiji University), Iio, Takamasa (Doshisha University), Shimohara, Katsunori (Doshisha University)
Keywords: Social HRI, Acceptability and Trust, Gesture, Posture and Facial Expressions Abstract: With the expansion of social robots’ working environments, developing strategies to mitigate their mistakes has become crucial, especially given the difficulty of entirely avoiding errors. Previous studies have explored effective strategies for robot apologies, including conversational content, yet have paid less attention to nonverbal behaviors like bowing. In this study, we focused on the modeling of bowing behaviors. A distinctive aspect of our study is its consideration of failure severity, since we hypothesized that apology behaviors vary with the situation. For instance, a minor delay might prompt a rather mechanical apology, whereas a serious failure would undoubtedly elicit a more sincere response. We elucidated the relationship in apologies between failure severity and bowing behaviors, based on data collected from participants. Then we implemented the modeled bowing behavior in a robot and conducted a web-based survey to assess its effectiveness. Our findings highlighted the significance of taking into account the severity of bowing behaviors.

15:44-15:56, Paper TuDT2.3	Add to My Program
Bridging the Human-Agent Representation Gap for Decision-Making Explanations in Autonomous Robots

Lindsay, Alan (Heriot-Watt University), Ramirez-Duque, Andrés (University of Glasgow), Craenen, Bart (Heriot-Watt University), Robb, David A. (Heriot Watt University), De Pellegrin, Emanuele (Heriot-Watt University), Boé, Laurence (SeeByte), Munafò, Andrea (National Oceanography Centre), Petrick, Ron (Heriot-Watt University)
Keywords: User-centered Design of Robots, Cooperation and Collaboration in Human-Robot Teams, Linguistic Communication and Dialogue Abstract: In autonomous vehicle mission planning, supporting human operators to understand and influence the decision-making process is crucial for building the operator’s trust and establishing effective collaboration. However, it has been observed that human and agent representations will typically not align. As a consequence, concepts that are useful for effective human-agent communication, will not necessarily feature in the agent’s representation. Focusing on specific spatial-temporal concepts, we define automatic model extensions, which can introduce these additional concepts. We report on a qualitative user study, where we investigate the use of these new structural concepts in underwater autonomous vehicle scenarios. Our study indicates that the extended concepts can be used in user queries and agent responses, enabling the user to better communicate their intent in shaping mission objectives, and supporting explanations with more relevant information.


TuDT3 Regular Session, Auditorium 3	Add to My Program
Robot Behaviours for HRI

Chair: Levy-Tzedek, Shelly	Ben Gurion University
Co-Chair: Guo, Xinliang	The University of Melbourne

15:20-15:32, Paper TuDT3.1	Add to My Program
Mirror Eyes: Explainable Human-Robot Interaction at a Glance

Krüger, Matti (Honda Research Institute Europe GmbH), Tanneberg, Daniel (Honda Research Institute Europe), Wang, Chao (Honda Research Institute Europe GmbH), Hasler, Stephan (Honda Research Institute Europe), Gienger, Michael (Honda Research Institute Europe)
Keywords: Novel Interfaces and Interaction Modalities, Cooperation and Collaboration in Human-Robot Teams, Non-verbal Cues and Expressiveness Abstract: The gaze of a person tends to reflect their interest. This work explores what happens when this statement is taken literally and applied to robots. Here we present a robot system that employs a moving robot head with a screen-based eye model that can direct the robot's gaze to points in physical space and present a reflection-like mirror image of the attended region on top of each eye. We conducted a user study with 33 participants, who were asked to instruct the robot to perform pick-and-place tasks, monitor the robot's task execution, and interrupt it in case of erroneous actions. Despite a deliberate lack of instructions about the role of the eyes and a very brief system exposure, participants felt more aware about the robot's information processing, detected erroneous actions earlier, and rated the user experience higher when eye-based mirroring was enabled compared to non-reflective eyes. These results suggest a beneficial and intuitive utilization of the introduced method in cooperative human-robot interaction.

15:32-15:44, Paper TuDT3.2	Add to My Program
Robot Continuity across Embodiments: Portability, Identity and Migration of Robotic Systems

Laity, Weston (University of Denver), Haring, Kerstin Sophie (University of Denver), Holthaus, Patrick (University of Hertfordshire)
Keywords: User-centered Design of Robots, Human Factors and Ergonomics, Robot Companions and Social Robots Abstract: This paper explores the elements that are needed to facilitate the seamless transfer of a robotic agent's "persona" between various embodiments. It addresses the challenges in maintaining the robot identity, user trust, and engagement during transitions into other robot embodiments. Through a literature review, we propose a framework for decomposing and reconstructing a robot's persona across embodiments, integrating visual, audial, and behavioral identity signals. By leveraging insights from robot interaction studies, this research contributes to the design of transferable and adaptable robot companions, with applications in eldercare and assistive technologies. The findings have broader implications for advancing human-robot interaction and fostering sustainable, user-centered robotic systems.

15:44-15:56, Paper TuDT3.3	Add to My Program
A Matter of Height: The Impact of a Robotic Object on Human Compliance

Faber, Michael (Media Innovation Lab, School of Communication, Reichman Universi), Grishko, Andrey (Media Innovation Lab, Interdisciplinary Center Herzliya), Waksberg, Julian (Media Innovation Lab, School of Communication, Reichman Universi), Pardo, David (Media Innovation Lab, School of Communication, Reichman Universi), Leivy, Tomer (Media Innovation Lab, School of Communication, Reichman Universi), Hazan, Yuval David (Media Innovation Lab, School of Communication, Reichman Universi), Talmasky, Emanuel (Media Innovation Lab, School of Communication, Reichman Universi), Megidish, Benny (Media Innovation Lab, Reichman University), Erel, Hadas (Media Innovation Lab, Interdisciplinary Center Herzliya)
Keywords: Creating Human-Robot Relationships Abstract: Robots come in various forms and have different characteristics that may shape the interaction with them. In human-human interactions, height is a characteristic that shapes human dynamics, with taller people typically perceived as more persuasive. In this work, we aspired to evaluate if the same impact replicates in a human-robot interaction and specifically with a highly non-humanoid robotic object. The robot was designed with modules that could be easily added or removed, allowing us to change its height without altering other design features. To test the impact of the robot's height, we evaluated participants' compliance with its request to volunteer to perform a tedious task. In the experiment, participants performed a cognitive task on a computer, which was framed as the main experiment. When done, they were informed that the experiment was completed. While waiting to receive their credits, the robotic object, designed as a mobile robotic service table, entered the room, carrying a tablet that invited participants to complete a 300-question questionnaire voluntarily. We compared participants' compliance in two conditions: A Short robot composed of two modules and 95cm in height and a Tall robot consisting of three modules and 132cm in height. Our findings revealed higher compliance with the Short robot's request, demonstrating an opposite pattern to human dynamics. We conclude that while height has a substantial social impact on human-robot interactions, it follows a unique pattern of influence. Our findings suggest that designers cannot simply adopt and implement elements from human social dynamics to robots without testing them first.

15:56-16:08, Paper TuDT3.4	Add to My Program
Integrating Responsible Computing into the Design of Robot Applications

Mwangi, Eunice Njeri (Jomo Kenyatta University of Agriculture and Technology), Kimani, Stephen (Jomo Kenyatta University of Agriculture and Technology), Irungu, Annette (Jomo Kenyatta University of Agriculture and Technology), Kabunyi, Peter (Jomo Kenyatta University of Agriculture and Technology), Onyango, Eric Otieno (Jomo Kenyatta University of Agriculture and Technology), Oteyo, Isaac Nyabisa (Jomo Kenyatta University of Agriculture and Technology), Mbogho, Chao (Mozilla Foundation)
Keywords: Ethical Issues in Human-robot Interaction Research, User-centered Design of Robots Abstract: As robots continue to advance and become more prevalent in social settings, integrating responsible computing into their development has become increasingly essential. Equipping technologists with the necessary skills and awareness can serve as a model for designing and developing responsible robotic applications. This paper describes a training program that was conducted with final year BSc. Computer Science students at Jomo Kenyatta University of Agriculture and Technology (JKUAT, Kenya). The aim was to teach and provide practical knowledge of responsible computing (RC) and human-centered design (HCD) through the design of social robots. To measure the impact of the training, we conducted pre- and post-training surveys to assess changes in the knowledge of RC and HCD dimensions, adoption of RC dimensions, and the ranking of the relevance of RC dimensions before and after the training. Students reported increased knowledge of the concepts of human-centered design and responsible computing related to design and social robotics. Results showed a significant increase in students' self-reported knowledge of the societal impact assessment, empathy and emotional intelligence, and fairness and bias mitigation dimensions of RC after training. In addition, the survey results indicated a significant increase in the self-reported adoption of responsible computing in student design projects. The participants considered inclusivity and accessibility, as well as ethical design and behavior, to be highly relevant before and after training in their robot design projects. Additionally, dimensions such as user privacy and data protection, empathy and emotional intelligence, transparency, and accountability were increasingly considered relevant in students’ robot design projects

16:08-16:20, Paper TuDT3.5	Add to My Program
Robot Behavior Personalization from Sparse User Feedback

Patel, Maithili (Georgia Institute of Technology), Chernova, Sonia (Georgia Institute of Technology)
Keywords: Human-Centered Robotics, Domestic Robotics Abstract: As service robots become more general-purpose, they will need to adapt to their users' preferences over a large set of all possible tasks that they can perform. This includes preferences regarding which actions the users prefer to delegate to robots as opposed to doing themselves. Existing personalization approaches require task-specific data for each user. To handle diversity across all household tasks and users, and nuances in user preferences across tasks, we propose to learn a task adaptation function independently, which can be used in tandem with any universal robot policy to customize robot behavior. We create Task Adaptation using Abstract Concepts (TAACo) framework. TAACo can learn to predict the user's preferred manner of assistance with any given task, by mediating reasoning through a representation composed of abstract concepts built based on user feedback. TAACo can generalize to an open set of household tasks from small amount of user feedback and explain its inferences through intuitive concepts. We evaluate our model on a dataset we collected of 5 people's preferences, and show that TAACo outperforms GPT-4 by 16% and a rule-based system by 54%, on prediction accuracy, with 40 samples of user feedback.


TuDT4 Regular Session, Blauwe Zaal	Add to My Program
Applications of Social Robots IV

Chair: Watanabe, Tetsuyou	Kanazawa University
Co-Chair: Aliasghari, Pourya	University of Waterloo

15:20-15:32, Paper TuDT4.1	Add to My Program
Can Robots Take Over Security? a Brief Review and Critique of Security Robot vs. Human Security Agent

Ye, Xin (University of Michigan), Robert, Lionel (University of Michigan)
Keywords: Applications of Social Robots, User-centered Design of Robots, Cooperation and Collaboration in Human-Robot Teams Abstract: Security robots are becoming increasingly prevalent for maintaining law and order, offering cost efficiencies and safety benefits in hazardous environments. Despite these advantages, significant questions remain regarding the public acceptance of robots as replacements for human security agents. This paper presents a systematic literature review to explore whether there is a discernible public preference between human security personnel and their robotic counterparts. The review identifies a contextual pattern: individuals tend to prefer human agents in citizen-initiated interactions, and security robots in police-initiated ones. This paper offers valuable insights to guide the future design and deployment of security robots.

15:32-15:44, Paper TuDT4.2	Add to My Program
Spring Loaded Double Pantograph: A Robotic Mechanism for Safe Balance Training

Tejwani, Ravi (MIT), Bell, John (Massachusetts Institute of Technology), Elliott, Drake (MIT), Wright, Cameron (MIT), Wayne, Peter (Harvard Medical School), Bonato, Paolo (Harvard Medical School), Asada, Harry (MIT)
Keywords: Assistive Robotics, Robot Companions and Social Robots, Social Presence for Robots and Virtual Humans Abstract: A Spring Loaded Double Pantograph (SLDP) mechanism is presented for safe balance training in elderly individuals practicing Tai Chi exercises. As people age, maintaining balance becomes increasingly critical, yet fear of falling often prevents effective exercise, creating a counterproductive cycle that increases fall risk. This natural hesitation to push physical limits during solo practice highlights the need for reliable safety systems. This paper presents a mechanism that provides variable assistance through spring-loaded actuation, so that support and freedom of movement can be balanced in a way that is both effective and unobtrusive. Here, it will be shown that, although support and unrestricted movement are traditionally considered contradictory goals, the two can be achieved simultaneously through mechanical design and the level of assistance can be automatically regulated. In this system, the support mechanism can a) detect falls rapidly, b) provide up to 98.0% body weight support when needed, and c) remain imperceptible during normal exercise. First, the mechanical design principles and kinematic analysis of the double-pantograph structure are presented. Methods for experimental validation with 13 human subjects will be addressed, demonstrating the system's effectiveness through quantitative metrics of support forces, workspace utilization, and energy efficiency during simulated falls in Tai Chi movements.

15:44-15:56, Paper TuDT4.3	Add to My Program
Toward Shared Control for Mobile Bimanual Manipulation on a Robotic Wheelchair

Gandhi, Rohan (Imperial College London), Casado, Fernando E. (Imperial College London), Demiris, Yiannis (Imperial College London)
Keywords: Assistive Robotics, Novel Interfaces and Interaction Modalities, Degrees of Autonomy and Teleoperation Abstract: Assistance through wheelchair-mounted manipulators has the potential to enhance the independence of individuals with disabilities. However, existing approaches primarily focus on single-arm systems or require extensive user input and demonstrations to infer intentions. In this study, we present a shared control framework for intuitive dual-arm operation through a standard 2D joystick, focusing on pick-and-place tasks. Our approach infers user intent in real-time, eliminating the need for specifying prior goals or beliefs. To address the challenges of controlling two 7-DoF (Kinova Gen3) manipulators, we propose two distinct control methods: one for pre-grasp positioning and one for grasp execution. The first method employs a shared control policy to optimize the pre-grasp positioning of the wheelchair base, ensuring ergonomic alignment for front-grasping tasks while incorporating mobile manipulation to reduce task completion time. The second method allows users to maintain high-level goal control and fine-tuning through task-specific arbitration. Experimental results demonstrate high grasp quality and task efficiency across pick-and-place scenarios, establishing the feasibility of shared control for bimanual manipulation and wheelchair navigation. We believe this is the first unified framework for mobile bimanual manipulation using standard wheelchair controls.

15:56-16:08, Paper TuDT4.4	Add to My Program
Demonstration Sidetracks: Categorizing Systematic Non-Optimality in Human Demonstrations

Fang, Shijie (Tufts University), YU, HANG (Tufts University), Fang, Qidi (Tufts University), Aronson, Reuben (Tufts University), Short, Elaine Schaertl (Tufts University)
Keywords: Social Learning and Skill Acquisition Via Teaching and Imitation, Machine Learning and Adaptation Abstract: Learning from Demonstration (LfD) has become a popular approach for robots to learn new skills, while most LfD methods suffer from imperfections in human demonstrations. Prior work in LfD often characterizes the sub-optimalities in human demonstrations as random noise. In this paper, we explored non-optimal behaviors in non-expert demonstrations and showed that these behaviors are not random and have systematic patterns: they form systematic demonstration sidetracks. We used a public space study dataset from our previous work with 40 participants and a long-horizon robot task. We recreated the experimental setup in a simulation and annotated all the demonstrations. We identified four types of demonstration sidetracks: Exploration, Mistake, Alignment, and Pause, and one control pattern one-dimension control. We found that instead of being random and rare, demonstration sidetracks frequently appear in non-expert demonstrations across all participants, and the distribution of demonstration sidetracks is associated with the robot task temporarily and spatially. Moreover, we found that users’ control patterns are affected by the control interface. Our findings highlight the need for better models of sub-optimal demonstrations, offering insights to improve LfD algorithms and reduce gaps between lab-based training and real-world applications. All the demonstrations, infrastructures, and annotations are available at https://github.com/AABL-Lab/Human-Demonstration-Sidetracks.

16:08-16:20, Paper TuDT4.5	Add to My Program
Algorithms versus Experts: Scientists’ Preference for Information Gathering Decisions in Robot-Aided Field Science Missions

Lee, Zachary (Oregon State University), Buchmeier, Sean (Oregon State University), Wilson, Cristina (Oregon State University)
Keywords: Human Factors and Ergonomics, Evaluation Methods Abstract: Despite an increased reliance on robotic systems, the field science community remains resistant to adopting information gathering algorithms for generating data collection plans, with most data collection decisions still being made by expert field scientists. This is a missed opportunity, as coordinating low-level data collection decisions is time-intensive, cognitively taxing, and prevents field scientists from being able to handle high-level tasks that they are better suited for. In this paper, we present an initial effort to understand the reasons for the slow algorithm adoption through two web-based experiments. In the first study, expert planetary scientists (N = 82) blindly evaluated three expert-generated data collection plans and an algorithm-generated plan. In a follow-up study, scientists (N = 33) were provided with the data collected using the different plans and asked to update their evaluations. We found scientists perceived algorithm-generated plans as less-than-ideal relative to expert-generated plans and providing scientists the resulting data did not improve their perception of the plans. Scientists identified characteristics of expert-generated plans that were not present in algorithm-generated plans: more even measurement coverage, more measurements transecting gradients of interest, and control measurements that better capture extreme ends of a variable range. Future work should take inspiration from how expert scientists make data collection decisions to improve information gathering algorithms and their uptake for robot-aided field science missions.


TuDT5 Regular Session, Auditorium 5	Add to My Program
Linguistic Communication and Dialogue II

Chair: Camara, Fanta	University of York
Co-Chair: Roig Vilamala, Marc	Cardiff University

15:20-15:32, Paper TuDT5.1	Add to My Program
Bridging Divides through Empathetic Robot Dialogue in Remote Interpersonal Conflict

Nakagawa, Satoshi (The University of Tokyo), Nihei, Misato (The University of Tokyo)
Keywords: Affective Computing, Creating Human-Robot Relationships, Ethical Issues in Human-robot Interaction Research Abstract: In increasingly diverse and digitally connected societies, AI is being called upon not only to support communication, but to mediate it ethically—especially when conflicting values and psychological safety are at stake. While diversity is a cornerstone of well-being, it also presents challenges for mutual understanding—particularly in the absence of nonverbal cues and safeguards for mutual trust. In this study, we propose a novel Human–Robot Interaction (HRI) framework wherein a social robot mediates dialogue between individuals by transforming conflict-prone utterances into empathic expressions in real time. Our system employs the BOCCO emo robot and a large language model (LLM) to reinterpret user speech, removing confrontational tones while preserving intent, and delivering the rephrased message to the interlocutor. Experimental results demonstrate that this empathic mediation significantly enhances partner impressions, perceived warmth, and openness, while reducing communication-related stress, indicating that empathic responses can initiate a reciprocal shift in dialogue tone. Beyond technical effectiveness, this approach raises important ethical implications: it suggests a new form of interaction design that respects individual expression while facilitating constructive dialogue. Rather than suppressing conflict, the robot reframes it in ways that support psychological safety and deepen engagement. This work contributes to a growing body of research that positions robots not merely as tools or agents, but as ethical mediators—capable of fostering mutual respect and enhancing well-being in human communication.

15:32-15:44, Paper TuDT5.2	Add to My Program
BT-ACTION: A Test-Driven Approach for Modular Understanding of User Instruction Leveraging Behaviour Trees and LLMs

Leszczynski, Alexander (KTH Royal Institute of Technology), Gillet, Sarah (KTH Royal Institute of Technology), Leite, Iolanda (KTH Royal Institute of Technology), Dogan, Fethiye Irmak (KTH Royal Institute of Technology)
Keywords: Linguistic Communication and Dialogue, Multimodal Interaction and Conversational Skills Abstract: Natural language instructions are often abstract and complex, requiring robots to execute multiple subtasks even for seemingly simple queries. For example, when a user asks a robot to prepare avocado toast, the task involves several sequential steps. Moreover, such instructions can be ambiguous or infeasible for the robot or may exceed the robot's existing knowledge. While Large Language Models (LLMs) offer strong language reasoning capabilities to handle these challenges, effectively integrating them into robotic systems remains a key challenge. To address this, we propose BT-ACTION, a test-driven approach that combines the modular structure of Behavior Trees (BT) with LLMs to generate coherent sequences of robot actions for following complex user instructions, specifically in the context of preparing recipes in a kitchen-assistance setting. We evaluated BT-ACTION in a comprehensive user study with 45 participants, comparing its performance to direct LLM prompting. Results demonstrate that the modular design of BT-ACTION helped the robot make fewer mistakes and increased user trust, and participants showed a significant preference for the robot leveraging the modular approach. The code is publicly available at https://github.com/1Eggbert7/BT_LLM.

15:44-15:56, Paper TuDT5.3	Add to My Program
Why Robots Are Bad at Detecting Their Mistakes: Limitations of Miscommunication Detection in Human-Robot Dialogue

Janssens, Ruben (Ghent University - Imec), De Bock, Jens (Ghent University - Imec), Labat, Sofie (Ghent University), Verhelst, Eva (Ghent University), Hoste, Veronique (Ghent University), Belpaeme, Tony (Ghent University)
Keywords: Non-verbal Cues and Expressiveness, Multimodal Interaction and Conversational Skills, Linguistic Communication and Dialogue Abstract: Detecting miscommunication in human-robot interaction is a critical function for maintaining user engagement and trust. While humans effortlessly detect communication errors in conversations through both verbal and non-verbal cues, robots face significant challenges in interpreting non-verbal feedback, despite advances in computer vision for recognizing affective expressions. This research evaluates the effectiveness of machine learning models in detecting miscommunications in robot dialogue. Using a multi-modal dataset of 240 human-robot conversations, where four distinct types of conversational failures were systematically introduced, we assess the performance of state-of-the-art computer vision models. After each conversational turn, users provided feedback on whether they perceived an error, enabling an analysis of the models' ability to accurately detect robot mistakes. Despite using state-of-the-art models, the performance barely exceeds random chance in identifying miscommunication, while on a dataset with more expressive emotional content, they successfully identified confused states. To explore the underlying cause, we asked human raters to do the same. They could also only identify around half of the induced miscommunications, similarly to our model. These results uncover a fundamental limitation in identifying robot miscommunications in dialogue: even when users perceive the induced miscommunication as such, they often do not communicate this to their robotic conversation partner. This knowledge can shape expectations of the performance of computer vision models and can help researchers to design better human-robot conversations by deliberately eliciting feedback where needed.

15:56-16:08, Paper TuDT5.4	Add to My Program
Grounding Language with Numeric Quantities for Human-Robot Interaction with Collaborative Robots

Rahman, Tabib Wasit (University of Rochester), Shakir, Katelyn (University of Rochester), Howard, Thomas (University of Rochester)
Keywords: Linguistic Communication and Dialogue Abstract: For humans and robots to effectively communicate using language, robots must be capable of understanding numeric quantities prevalent in instructions, descriptions, and answers. In contrast to Large Language Model-based approaches that handle numeric quantities as arbitrary input tokens, discriminative approaches such as Distributed Correspondence Graphs use an a priori defined symbol space that must be sufficiently expressive to process any statement. For such general inputs, a large, predefined range of integer and/or rational values has significant impact on efficiency that can prohibit interaction at a natural cadence. This paper introduces a generative approach to symbolic representations for Distributed Correspondence Graphs that handle arbitrary numeric quantities expressed through language. This method dynamically constructs the symbolic representation by parsing the input for number values and/or metric units. Corpus-based experiments demonstrate improvements in runtime efficiency and accuracy versus a baseline with an a priori defined symbolic representation.

16:08-16:20, Paper TuDT5.5	Add to My Program
A Grammar-Based Communication Model for Conveying Design Intent in Human-Robot Teams

Tokac, Iremnur (RWTH Aachen University)
Keywords: Cooperation and Collaboration in Human-Robot Teams, HRI and Collaboration in Manufacturing Environments, Robots in art and entertainment Abstract: This paper investigates how to represent and communicate evolving design intent driven by dynamic conditions in human-robot teams engaged in unstructured design tasks. We introduce a grammar-based communication model that computationally encodes design intent and translates it into machine-readable language. The model uses rule-based formalisms to computationally encode contextual, transformational, and conditional relationships between sensing and actions, enabling purposeful interactions between human and robot agents. Robotic clay and sand forming studies validated the grammar-based communication model and demonstrated how the rules organised HRI-enabled robotic fabrication, where the iterations of design intent were communicated through the relationships between sensing and action parameters captured in real time. This study may ultimately contribute to more fluid teamwork in unstructured and dynamic environments while improving the transparency of data collection and interpretation in autonomous systems.


TuDT6 Regular Session, Auditorium 6	Add to My Program
LLM-Enhanced Social Robotics II

Chair: Ben Allouch, Somaya	University of Amsterdam
Co-Chair: Kasuga, Haruka	Hokkaido University

15:20-15:32, Paper TuDT6.1	Add to My Program
A Communication Robot for Elderly Persons Using LLM to Generate Responses According to Emotions Estimated from Physiological Signals

Honda, Rika (Shibaura Institute of Technology), Suzuki, Kaoru (Shibaura Institute of Technology), NAKAGAWA, YURI (Shibaura Institute of Technology), Sugaya, Midori (Shibaura Institute of Technology)
Keywords: Linguistic Communication and Dialogue, Motivations and Emotions in Robotics, Monitoring of Behaviour and Internal States of Humans Abstract: To prevent the onset of mental disorders and the deterioration in the health of the elderly, reducing stress and activating brain function has been identified as effective measures. Verbal interactive communication is considered a beneficial approach to achieve these goals and given the shortage of human caregivers, robots are expected to engage in dialogue with the elderly in place of human caregivers. Also, effective communication should take the emotions of the elderly into account. However, few robots have been proposed that achieve this functionality. This study aims to reduce stress and activate brain function of the elderly through communication that considers their emotions. To achieve this, we propose a robot equipped with an emotion estimation function based on physiological signals and a verbal interactive communication function that was not implemented in the previous robot. The robot estimates the emotions of the elderly using physiological signals while capturing their speech through a microphone and voice activity detection (VAD). It then generates a response based on the speech recognition result obtained through automated speech recognition (ASR), large language model (LLM) with prompts that take estimated emotions into consideration, and text to speech (TTS). By playing the generated speech, the robot engages in communication with the elderly. Evaluation measurements conducted with two elderly participants showed that, compared to the previous robot, the proposed robot contributed to stress reduction and brain function stimulation. Future work will involve increasing the number of participants to further investigate the generalizability of the proposed robot.

15:32-15:44, Paper TuDT6.2	Add to My Program
AI or Human? Understanding Perceptions of Embodied Robots with LLMs

Hriscu, Lavinia (CSIC-UPC), Sanfeliu, Alberto (Universitat Politècnica De Cataluyna), Garrell, Anais (UPC-CSIC)
Keywords: Linguistic Communication and Dialogue, Degrees of Autonomy and Teleoperation, Applications of Social Robots Abstract: The pursuit of artificial intelligence has long been associated to the the challenge of effectively measuring intelligence. Even if the Turing Test was introduced as a means of assessing a system's intelligence, its relevance and application within the field of human-robot interaction remain largely underexplored. This study investigates the perception of intelligence in embodied robots by performing a Turing Test within a robotic platform. A total of 34 participants were tasked with distinguishing between AI- and human-operated robots while engaging in two interactive tasks: an information retrieval and a package handover. These tasks assessed the robot’s perception and navigation abilities under both static and dynamic conditions. Results indicate that participants were unable to reliably differentiate between AI- and human-controlled robots beyond chance levels. Furthermore, analysis of participant responses reveals key factors influencing the perception of artificial versus human intelligence in embodied robotic systems. These findings provide insights into the design of future interactive robots and contribute to the ongoing discourse on intelligence assessment in AI-driven systems.

15:44-15:56, Paper TuDT6.3	Add to My Program
Alignment Strategies for Language-Model-Driven Robots in Human-Robot Collaboration: Effectiveness and Impact on Trust

Goubard, Cedric (Imperial College London), Demiris, Yiannis (Imperial College London)
Keywords: Cooperation and Collaboration in Human-Robot Teams, Assistive Robotics Abstract: This study examines the alignment of Large Language Models (LLMs)-controlled assistive robots with human intentions. We conducted a comparative analysis of various alignment strategies for action selection using 13 different locally-run LLMs, demonstrating that: (1) alignment strategies significantly influence robot action choices, and (2) off-the-shelf LLMs exhibit varying default alignments. Therefore, selecting an LLM for robot control should consider not only its performance but also the alignment strategy employed. Additionally, we conducted a user study (N=24, 17h53 of data collection), indicating that alignment strategies significantly impact human trust. Participants reported higher trust levels when the robot prioritised following their instructions over ensuring their wellbeing. Furthermore, we observed a wide range of participant behaviours, from complete delegation to the robot after a few successful actions to annoyance with the robot as soon as it displayed initiative. Exploring the role of personality in this context, we found that reported personality traits using the BFI model did not significantly explain reported trust, while observed personality traits, such as the number of requests made to the robot, were more predictive. This suggests that the robot’s alignment strategy should be tailored to individual users, taking into account their specific needs and preferences.

15:56-16:08, Paper TuDT6.4	Add to My Program
Agreeing to Interact in Human-Robot Interaction Using Large Language Models and Vision Language Models

Sasabuchi, Kazuhiro (Microsoft), Wake, Naoki (Microsoft), Kanehira, Atsushi (Microsoft), Takamatsu, Jun (Microsoft), Ikeuchi, Katsushi (Microsoft)
Keywords: Social Intelligence for Robots, Robot Companions and Social Robots Abstract: In human-robot interaction (HRI), the beginning of an interaction is often complex. Whether the robot should communicate with the human is dependent on several situational factors (e.g., the current human's activity, urgency of the interaction, etc.). We test whether large language models (LLM) and vision language models (VLM) can provide solutions to this problem. We compare four different system-design patterns using LLMs and VLMs, and test on a test set containing 84 human-robot situations. The test set mixes several publicly available datasets and also includes situations where the appropriate action to take is open-ended. Our results using the GPT-4o and Phi-3 Vision model indicate that LLMs and VLMs are capable of handling interaction beginnings when the desired actions are clear. The design using direct image input scored an 89% accuracy on the test set. Of the designs using indirect input, a combined text about human activity and gaze performed best with a 90% accuracy. However, challenges remain in the open-ended situations where the model must choose the priority between the human and robot situation. The design using direct image input mostly prioritized the robot situation, whereas the design with best performance using indirect input mostly prioritized the human situation. Such one-sided behavior could be crucial for practical HRI applications.

16:08-16:20, Paper TuDT6.5	Add to My Program
Plan-And-Act Using Large Language Models for Interactive Agreement

Sasabuchi, Kazuhiro (Microsoft), Wake, Naoki (Microsoft), Kanehira, Atsushi (Microsoft), Takamatsu, Jun (Microsoft), Ikeuchi, Katsushi (Microsoft)
Keywords: Social Intelligence for Robots, Robot Companions and Social Robots Abstract: Recent large language models (LLMs) are capable of planning robot actions. In this paper, we explore how LLMs can be used for planning actions with tasks involving situational human-robot interaction (HRI). A key problem of applying LLMs in situational HRI is balancing between ``respecting the current human's activity" and ``prioritizing the robot's task," as well as understanding the timing of when to use the LLM to generate an action plan. In this paper, we propose a necessary plan-and-act skill design to solve the above problems. We show that a critical factor for enabling a robot to switch between passive / active interaction behavior is to provide the LLM with an action text about the current robot's action. We also show that a second-stage question to the LLM (about the next timing to call the LLM) is necessary for planning actions at an appropriate timing. The skill design is applied to an Engage skill and is tested on four distinct interaction scenarios. We show that by using the skill design, LLMs can be leveraged to easily scale to different HRI scenarios with a reasonable success rate reaching 90% on the test scenarios.


TuDT7 Regular Session, Auditorium 7	Add to My Program
Robots in Families, Education, Therapeutic Contexts & Arts II

Chair: Li, Sheng	Institute of Science Tokyo
Co-Chair: Voges, Amelie	University of Glasgow

15:20-15:32, Paper TuDT7.1	Add to My Program
From Fidgeting to Focused: Developing Robot-Enhanced Social-Emotional Therapy (RESET) for School De-Escalation Rooms

Ramnauth, Rebecca (Yale University), Brscic, Drazen (Kyoto University), Scassellati, Brian (Yale)
Keywords: Applications of Social Robots, Assistive Robotics, Long-term Experience and Longitudinal HRI Studies Abstract: Many schools have built de-escalation and sensory rooms to support students who experience heightened emotional states, sensory overload, or difficulty self-regulating in traditional classroom settings. Yet, effective implementation remains challenging due to diverse student needs and resource constraints. Hence, we developed RESET (Robot-Enhanced Social-Emotional Therapy), a robot for facilitating students' self-regulation in their school's existing de-escalation space. We present our co-design process, iterative development, and final system components. Following a fully autonomous, month-long deployment in an elementary school, we assessed the robot's usability and impacts. Results indicate RESET integrated well into the school environment, promoting more efficient de-escalation, smoother transitions back to classroom learning, and lasting impacts beyond its deployment period.

15:32-15:44, Paper TuDT7.2	Add to My Program
The Multilingual Student Support Robot

Ashkenazi, Shaul (University of Glasgow), Skantze, Gabriel (KTH/Furhat Robotics), Stuart-Smith, Jane (University of Glasgow), Foster, Mary Ellen (University of Glasgow)
Keywords: Applications of Social Robots, Robots in Education, Therapy and Rehabilitation, Ethical Issues in Human-robot Interaction Research Abstract: International students in UK universities often struggle with interactions in English, particularly on their first days in the country. We have developed a multilingual support robot tailored to their needs. To evaluate the performance of the robot, 60 international students asked the robot, in either English or their native language (Modern Standard Arabic or Mandarin Chinese), for support on topics including campus directions, local tax exemption, financial aid, and official documents. Overall, users preferred to use their native language when interacting with the support robot, and using their native language in the robot interaction also had a positive effect on their perception of the interaction itself.

15:44-15:56, Paper TuDT7.3	Add to My Program
“Guide Me through the Unexpected”: Investigating How Deviation from Expectation Affects Human Teaching and Robot Learning

Mihhailov, Konstantin (Vrije Universiteit Amsterdam), Hou, Muhan (Vrije University Amsterdam), Baraka, Kim (Vrije Universiteit Amsterdam)
Keywords: Social Learning and Skill Acquisition Via Teaching and Imitation, Programming by Demonstration, Machine Learning and Adaptation Abstract: The increasing integration of robots into human environments necessitates efficient learning systems capable of adapting to complex scenarios while co-existing with humans. Traditional reinforcement learning (RL) is one of the most popular option, but often struggles with inefficiencies, such as sparse rewards and prolonged training. Learning from Demonstration (LfD), which leverages human expertise, offers a promising alternative. However, human teaching strategies and robot learning processes are inherently intertwined in LfD. Ineffective human teaching can diminish robot learning. To effectively provide demonstrations, human teachers require an understanding of the robot's internal processes and needs without being overwhelmed. We address this by visually showing the robot's deviation from expectation, a metric based on Temporal Difference (TD) error, which represents discrepancies between predicted and actual outcomes. We conducted a user study (n=12) comparing two conditions: one in which deviations from expectation were visually indicated, and one in which these deviations were not shown. Results indicate that visualising deviations shifts human teaching behavior from result oriented strategy (providing demonstrations in the areas where the robot fails) to an expectation oriented strategy (focusing on demonstrations where robot's deviation from expectation is high). We conducted a follow-up simulation study to investigate how these two teaching strategies may influence robot learning, showing that diverse and widespread demonstrations have a significant effect on robot learning performance. We conclude our work with actionable guidelines for designing human-robot interactions that better align human teaching behaviors with robot learning requirements.

15:56-16:08, Paper TuDT7.4	Add to My Program
Exploring the Use of Social Robots to Prepare Children for Radiological Procedures: A Focus Group Study

Nigro, Massimiliano (Politecnico Di Milano), RIGHINI, ANDREA (Children’s Hospital V.Buzzi), Spitale, Micol (Politecnico Di Milano)
Keywords: Child-Robot Interaction, Applications of Social Robots, User-centered Design of Robots Abstract: When children are anxious or scared, it can be hard for them to stay still or follow instructions during medical procedures, making the process more challenging and affecting procedure results. This is particularly true for radiological procedures, where long scan times, confined spaces, and loud noises can cause children to move, significantly impacting scan quality. To this end, sometimes children are sedated, but doctors are constantly seeking alternative non-pharmacological solutions. This work aims to explore how social robots could assist in preparing children for radiological procedures. We have conducted a focus group discussion with five hospital stakeholders, namely radiographers, paediatricians, and clinical engineers, to explore (i) the context regarding children’s preparation for radiological procedures, hence their needs and how children are currently prepared, and (ii) the potential role of social robots in this process. The discussion was transcribed and analysed using thematic analysis. Among our findings, we identified three potential roles for a social robot in this preparation process: offering infotainment in the waiting room, acting as a guide within the hospital, and assisting radiographers in preparing children for the procedure. We hope that insights from this study will inform the design of social robots for pediatric healthcare.

16:08-16:20, Paper TuDT7.5	Add to My Program
In-Home Social Robots Design for Cognitive Stimulation Therapy in Dementia Care

Akinrintoyo, Emmanuel (Imperial College London), Salomons, Nicole (Imperial College London)
Keywords: User-centered Design of Robots, Robots in Education, Therapy and Rehabilitation Abstract: Individual cognitive stimulation therapy (iCST) is a non-pharmacological intervention for improving the cognition and quality of life of persons with dementia (PwDs)~cite{aguirre2013cognitive}; however, its effectiveness is limited by low adherence to delivery by their family members. In this work, we present the user-centered design and evaluation of a novel socially assistive robotic system to provide iCST therapy to PwDs in the home for long-term use. We consulted with 16 dementia caregivers and professionals. Through these consultations, we gathered design guidelines and developed the prototype. The prototype was validated by testing it with three dementia professionals and five PwDs. The evaluation revealed PwDs enjoyed using the system and are willing to adopt its use over the long term. One shortcoming was the system's speech-to-text capabilities, where it frequently failed to understand the PwDs.


TuPM4_BR Voorhof/Hall	Add to My Program
Break (Tu2) 16: 20-16: 30


TuET1 Regular Session, Auditorium 1	Add to My Program
Design Methodologies in Social Robotics II

Chair: Cañete, Raquel	Universidad De Sevilla
Co-Chair: Zhang, Ruohan	Eindhoven University of Technology, Industrial Engineering and Innovation Science Department

16:30-16:42, Paper TuET1.1	Add to My Program
What Was I Made For? Evaluating the Effectiveness of Layperson-Designed Robots

Voges, Amelie (University of Glasgow), Foster, Mary Ellen (University of Glasgow), Cross, Emily S (ETH Zurich)
Keywords: User-centered Design of Robots, Robot Companions and Social Robots Abstract: The social robotics community is increasingly embracing human-centered techniques to design robots that align with users’ needs, preferences, and lived experiences. However, given the known challenges of incorporating laypeople into a design process, little empirical work has tested whether these techniques generate robotic concepts that are accepted and understood by a wider audience. In this mixed-methods online study, we examined how laypeople perceived and evaluated robots that were created through human-centered design. Fifty-two participants assessed a set of laypeople-created healthcare, education, entertainment, and telepresence robot designs according to how successfully each design signaled its intended use case. Our findings demonstrate that layperson-designed robots efficiently communicated their use context. Thus, low-level creative prototyping with end-users can be an effective way of eliciting strong initial design concepts as part of the human-centered design process, though this is influenced by the robot's application domain. Furthermore, we showcase that simplistic robotic designs are sufficient to cue diverse affordances, highlighting the importance of matching a robot’s appearance to its intended use case. Our findings contribute to the study of human-centered design within social robotics, assessing the tools and activities end-users need to be able to meaningfully contribute to robotic design.

16:42-16:54, Paper TuET1.2	Add to My Program
How Co-Design and Personas Can Inform Game Implementation for Robot-Assisted Speech Therapy in Clinical Settings

Shin, Soomin (University of Waterloo), Chandra, Shruti (University of Northern British Columbia), Dautenhahn, Kerstin (University of Waterloo), Rajan, Archana (Kick Start Therapy), Shah, Seema (Kick Start Therapy)
Keywords: Assistive Robotics, User-centered Design of Robots, Applications of Social Robots Abstract: This paper presents a co-designed robot system developed through an 22-month collaboration with Speech Language Pathologists (SLPs) for the use in real-world therapeutic setting. We created persona profiles of SLPs and children with speech and language challenges to inform the development of five game types for two age groups (0-4 and 5-9 years). The system integrates a robot platform with a web-based application that facilitates real-time interaction during therapy sessions. Each game addresses specific therapeutic needs, using the developed child personas as a reference point. Prototype testing with SLPs through role-playing sessions revealed usability insights that led to system refinements, including enhanced robot dialogue, age appropriate content adjustments, and additional interactive features. The resulting system demonstrates how human-centered design can create robotic system that addresses the practical challenges faced by SLPs and children in therapeutic settings.

16:54-17:06, Paper TuET1.3	Add to My Program
When the Robot Surrounds Us: Co-Designing a New Human-Robot Interaction in a Full-Scale, “Robot-Room” Rapid Prototype

Guo, Ge (Serena) (Cornell University), Cañete, Raquel (Universidad De Sevilla), Yu, Jenny (Cornell University), Leshed, Gilly (Cornell University), Walker, Ian (University of Wyoming), Green, Keith Evan (Cornell University)
Keywords: User-centered Design of Robots, Innovative Robot Designs, Narrative and Story-telling in Interaction Abstract: While robots are traditionally envisioned as physical entities that move through, sense, and act upon the environment, a new category is emerging: the inhabitable robot, or "robot-room," which redefines human-robot interaction by immersing us within the robot itself. As a first step in exploring this novel design space, we developed a full-scale, rapid-prototyped robot-room—not a simulation or scale model—and conducted a co-design study with 30 participants. Inside this immersive space, participants explored new forms of human-robot interaction, engaging their perceptual faculties for “knowing spaces.” Our findings inform our ongoing development of a fully operational robot-room and offer valuable insights into expanding the concept of human-robot interaction to one of human-robot cohabitation.

17:06-17:18, Paper TuET1.4	Add to My Program
Towards Engaging Teaching Interfaces for Mobile Robots: Preliminary System Design and Insights on Usability

Bhandari, Saloni (Vrije Universiteit Amsterdam), Sero, Oromia (Karolinska Institutet), Von Kentzinsky, Hendrik (Free University of Amsterdam), Preciado Vanegas, Daniel Fernando (Vrije Universiteit Amsterdam), Baraka, Kim (Vrije Universiteit Amsterdam)
Keywords: Programming by Demonstration, Social Learning and Skill Acquisition Via Teaching and Imitation, User-centered Design of Robots Abstract: As robots will be deployed around non-expert users in unstructured environments, they are bound to fail or behave suboptimally in some cases. Through interactive learning approaches, robots can leverage interaction with non-expert users to refine existing skills or even learn new tasks. If we will (at least partially) rely on these users to occasionally teach robots, we need to make sure the teaching tasks is seen as enjoyable rather than a burden. To this end, this paper presents a preliminary system design and evaluation for a multi-modal teaching system for non-expert users to teach simple navigation tasks to mobile robots. Our design, using human-dog interaction as a metaphor, focuses on creating engaging and embodied interactions that foster a sense of participation. Our preliminary results (N=20) suggest that participants had a positive perception of the learning system, and showed a clear preference for demonstrations over feedback as teaching signal. Participants with more affinity and experience with technology found the system more intuitive and engaging, and provided more accurate demonstrations, underscoring the importance of considering the users background and expertise when designing learning systems. This work paves the way towards designing usable and engaging teaching interfaces for robots that reduce barriers for non-expert users and foster a sense of teaching as an act of ``care''.


TuET2 Regular Session, Auditorium 2	Add to My Program
Social Intelligence of Robots I

Chair: Smart, William D.	Oregon State University

16:30-16:42, Paper TuET2.1	Add to My Program
Anthropomorphization of Robots – a Result of Robot Design or a Result of the Human Tendency to Anthropomorphize?

Büttner, Sebastian Thomas (Westfälische Hochschule - Westphalian University of Applied Scie), Söhngen, Yannic (University Duisburg-Essen), Prilla, Michael (University of Duisburg-Essen)
Keywords: Evaluation Methods, User-centered Design of Robots, Anthropomorphic Robots and Virtual Humans Abstract: Anthropomorphism and trust are two variables that are measured in a multitude of HRI studies. These two variables are often treated as dependent variables influenced by a particular robot design or behavior. However, the psychological literature suggests that people have individual tendencies to anthropomorphize, influencing these two variables. We ran an experiment in which 70 participants were confronted with two robots, a humanoid and a non-humanoid robot. Before exposing them to the robots, they were screened for their individual tendency to anthropomorphize. After the exposure, they rated the robots in terms of anthropomorphism and trust. With this experiment, we show that the individual tendency to anthropomorphize has a clear impact on measuring anthropomorphism and trust with frequently used questionnaires. Given our results, we emphasize that future studies should screen participants for their tendency to anthropomorphize to avoid unwanted biases.

16:42-16:54, Paper TuET2.2	Add to My Program
The Impact of External Human-Machine Interfaces on Pedestrian Crossing Intention

Cui, Hongzheng (Duke University), Oca, Siobhan (Duke University)
Keywords: Detecting and Understanding Human Activity, Cognitive Skills and Mental Models, Curiosity, Intentionality and Initiative in Interaction Abstract: As autonomous vehicles (AVs) become more common in real-world crossing scenarios and their automation levels continue to increase, implicit driver-pedestrian communication cannot be relied on for pedestrian safety. To address this, future autonomous vehicles must incorporate tools to explicitly convey their intentions to pedestrians. In this context, external human-machine interfaces (eHMIs) have been developed and studied in transportation settings. This study used video simulations to recreate typical real-world pedestrian crossing scenarios, where AVs approached either without eHMIs or with eHMIs displaying a yield signal. The responses were collected from 60 adult participants, including their intentions of crossing during the scenarios and other perceptions as pedestrians through a questionnaire. The findings indicated that respondents' willingness to cross varied across scenarios, with the presence or absence of the eHMI playing a crucial role, particularly when an AV approached a zebra crossing. In this scenario, subjective norms and attitudes were identified as key factors influencing the intention to cross the road when the AV was equipped with an eHMI. Enhancing these factors, such as fostering more positive attitudes through education, providing guides for safe road crossing and accompanying pedestrians across the road, could reduce the time pedestrians take to decide to cross, potentially improving overall traffic efficiency.

16:54-17:06, Paper TuET2.3	Add to My Program
Manners Matter: How Robot Politeness Influences Human Risk-Taking and Social Perception

Stinkeste, Charlotte (KTH Royal Institute of Technology), Wikström Kempe, Albin (KTH Royal Institute of Technology), Skantze, Gabriel (KTH/Furhat Robotics)
Keywords: Creating Human-Robot Relationships, Multimodal Interaction and Conversational Skills, Robot Companions and Social Robots Abstract: Robots are no longer confined to factories; they now collaborate, assist, and engage with people in social environments, making their communication style a crucial factor in human-robot interaction. This study examines whether robots’ (im)politeness affects human risk-taking behavior and social perception, key factors in fostering effective human-robot interaction. In a between-subject experiment, sixty participants interacted with either a polite robot, employing politeness strategies, or a rude robot, using face-threatening acts. Risk-taking behavior was assessed through a button-pressing task that allowed participants to accumulate monetary rewards while risking total loss, and social perception was assessed with the Human-Robot Interaction Evaluation Scale (HRIES). Although politeness did not alter risk-taking, it significantly influenced perception: the polite robot was rated as more sociable and agentic, whereas the rude robot was seen as more disturbing; perceptions of animacy remained unchanged. An exploratory factor analysis refined the Agency scale, raising questions about how users conceptualize robotic autonomy. These findings confirm that politeness enhances social acceptance, but may not universally alter behavior. In social domains like customer service and healthcare, politeness is beneficial, but in financial decision-making contexts where outcomes carry real-world consequences, decision support may require additional strategies, such as assertiveness or personalized feedback.

17:06-17:18, Paper TuET2.4	Add to My Program
Would Human-Robot Interaction Conferences Benefit from More Formal Reporting? Evaluating a Novel Study Reporting Form

Holthaus, Patrick (University of Hertfordshire), Rossi, Alessandra (University of Naples Federico II), Shrestha, Snehesh (University of Maryland College Park), Louie, Wing-Yue Geoffrey (Oakland University), ucar, aysegul (Firat University), Hernández García, Daniel (Heriot-Watt University), Foerster, Frank (University of Hertfordshire), Andriella, Antonio (Institut De Robòtica I Informàtica Industrial), Bagchi, Shelly (National Institute of Standards and Technology)
Keywords: Evaluation Methods, User-centered Design of Robots, Long-term Experience and Longitudinal HRI Studies Abstract: In an interdisciplinary and evolving research field like human-robot interaction, clear and precise results reporting is essential for study comparability and replicability. To address the lack of a standard for such reporting and, at the same time, provide guidance for novices in the field, we have developed a web-based reporting form to capture human-robot interaction studies, serving as a model for how conferences could adopt it into the submission pipeline. In this work, we present a formative evaluation of this form regarding its level of detail, format and clarity, and the perceived benefits for authors, reviewers, and the community as a whole. We report the expert review of nine researchers who highlight the substantial value of this tool. In addition, these experts also provide suggestions for improvements to its form and the addition of details surrounding qualitative reporting.

17:18-17:30, Paper TuET2.5	Add to My Program
Analyzing Reluctance to Ask for Help When Cooperating with Robots: Insights to Integrate Artificial Agents in HRC

San Martin, Ane (Tekniker), Hagenow, Michael (Massachusetts Institute of Technology), Shah, Julie A. (MIT), Kildal, Johan (TEKNIKER), Lazkano, Elena (University of Basque Country)
Keywords: User-centered Design of Robots, HRI and Collaboration in Manufacturing Environments, Non-verbal Cues and Expressiveness Abstract: As robot technology advances, collaboration between humans and robots will become more prevalent in industrial tasks. When humans run into issues in such scenarios, a likely future involves relying on artificial agents or robots for aid. This study identifies key aspects for the design of future user-assisting agents. We analyze quantitative and qualitative data from a user study examining the impact of on-demand assistance received from a remote human in a human-robot collaboration (HRC) assembly task. We study scenarios in which users require help and we assess their experiences in requesting and receiving assistance. Additionally, we investigate participants’ perceptions of future non-human assisting agents and whether assistance should be on demand or unsolicited. Through a user study, we analyze the impact that such design decisions (human or artificial assistant, on-demand or unsolicited help) can have on elicited emotional responses, productivity, and preferences of humans engaged in HRC tasks.


TuET3 Regular Session, Auditorium 3	Add to My Program
Affective Artificial Agents I

Chair: Feil-Seifer, David	University of Nevada, Reno
Co-Chair: Fedsi, Chouaib	IBISC Laboratory, University of Evry Paris-Saclay

16:30-16:42, Paper TuET3.1	Add to My Program
Fusion in Context: A Multimodal Approach to Affective State Recognition

Mohamed, Youssef (KTH Royal Institute of Technology), Lemaignan, Séverin (PAL Robotics), GUNEYSU, ARZU (Bogazici University), Jensfelt, Patric (KTH - Royal Institute of Technology), Smith, Claes Christian (KTH Royal Institute of Technology)
Keywords: Multi-modal Situation Awareness and Spatial Cognition, Machine Learning and Adaptation, Social Intelligence for Robots Abstract: Accurate recognition of human emotions is a crucial challenge in affective computing and human-robot interaction (HRI). Emotional states play a vital role in shaping behaviors, decisions, and social interactions. However, emotional expressions can be influenced by contextual factors, leading to misinterpretations if context is not considered. Multimodal fusion, combining modalities like facial expressions, speech, and physiological signals, has shown promise in improving affect recognition. This paper proposes a transformer-based multimodal fusion approach that leverages facial thermal data, facial action units, and textual context information for context-aware emotion recognition. We explore modality-specific encoders to learn tailored representations, which are then fused and processed by a shared transformer encoder to capture temporal dependencies and interactions. The proposed method is evaluated on a dataset collected from participants engaged in a tangible tabletop Pacman game designed to induce various affective states. Our results demonstrate improvements from incorporating contextual information and multimodal fusion, achieving 89% F1 score with our full model compared to 65% for action units alone and 30% for thermal data alone.

16:42-16:54, Paper TuET3.2	Add to My Program
A Control Point Based Facial Expression for Smooth Facial Display in Social Robot Expression Transitions

Kim, Hyojin (Ulsan National Institute of Science and Technology), Park, Haeun (Ulsan National Institute of Science and Technology), Hwang, Sun Jun (UNIST), Lee, Hui Sung (UNIST (Ulsan National Institute of Science and Technology))
Keywords: Non-verbal Cues and Expressiveness, Robot Companions and Social Robots, User-centered Design of Robots Abstract: Social robots utilize facial expressions to convey emotions, yet most rely on predefined animation sequences for each emotion, which can result in abrupt or unnatural transitions. To address this, we employ a Control Point (CP)-based approach that dynamically adjusts facial features, enabling seamless expression transitions without predefined animations. Our study explores whether a CP-based approach enables smoother and more natural facial expressions compared to commonly used facial expression methods. Furthermore, we validate its applicability by demonstrating its effectiveness on two distinct facial designs highlighting versatility across both tested designs.

16:54-17:06, Paper TuET3.3	Add to My Program
An Emotion Empathy Agent That Reinforces Positive Emotion by Backchanneling in Congruence with Its Facial Expressions

Sanji, Yuta (Nagoya Institute of Technology), Kawai, Yurika (Nagoya Institute of Technology), Sakuma, Takuto (Nagoya Institute of Technology), Kato, Shohei (Nagoya Institute of Technology)
Keywords: Interaction with Believable Characters, Embodiment, Empathy and Intersubjectivity, Robots in Education, Therapy and Rehabilitation Abstract: With the development of large-scale language models, research on human-AI interaction is gaining momentum. Our aim is to encourage users to disclose information that can lead to health benefits through interaction with AI. To achieve this, we focused on using various facial expressions with emojis, and explored methods to reduce discomfort when speech is expressed simultaneously with these facial expressions.

17:06-17:18, Paper TuET3.4	Add to My Program
When and How to Express Empathy in Human-Robot Interaction Scenarios

Arzate Cruz, Christian (Honda Research Institute Japan), Montiel-Vazquez, Edwin Carlos (Tecnologico De Monterrey), Maeda, Chikara (Honda Research Institute Japan), Gomez, Randy (Honda Research Institute Japan Co., Ltd)
Keywords: Robot Companions and Social Robots, Machine Learning and Adaptation, Affective Computing Abstract: Incorporating empathetic behavior into robots can improve their social effectiveness and interaction quality. In this paper, we present whEE (when and how to express empathy), a framework that enables social robots to detect when empathy is needed and generate appropriate responses. Using large language models, whEE identifies key behavioral empathy cues in human interactions. We evaluate it in human-robot interaction scenarios with our social robot, Haru. Results show that whEE effectively identifies and responds to empathy cues, providing valuable insights for designing social robots capable of adaptively modulating their empathy levels across various interaction contexts.


TuET4 Regular Session, Blauwe Zaal	Add to My Program
Applications of Social Robots V

Chair: Becker-Asano, Christian	Stuttgart Media University

16:30-16:42, Paper TuET4.1	Add to My Program
Human-Robot Interaction through REACH: Robotic Enhancement and Accessibility Via a Connected Handheld Device on the Lunar Surface

Ma, Lanssie Mingyue (AMES), Sewtz, Marco (Deutsches Zentrum Für Luft Und Raumfahrt E.V), Luo, Xiaozhou (German Aerospace Center (DLR)), Prinz, Nicolas Jakob (Hochschule München University), Theuss, Cynthia (German Aerospace Company (DLR)), salomao, mateus bonelli (German Aerospace Company (DLR)), Lii, Neal Y. (German Aerospace Center (DLR)), Fong, Terrence (NASA Ames Research Center (ARC))
Keywords: Cooperation and Collaboration in Human-Robot Teams, Multimodal Interaction and Conversational Skills, Human Factors and Ergonomics Abstract: As space agencies prepare for long-term lunar missions (ie. Lunar Base Camp), addressing the challenges of complex surface tasks is crucial. Integrating advanced robotic systems can alleviate physical and cognitive burdens on astronauts, enhancing mission success and safety. This work focuses on the design, development, and evaluation REACH, a handheld device and its UI for interacting with the modular DLR Lightweight Rover Unit 2 (LRU), to improve human-robot collaboration during lunar EVAs. A Human-In-The-Loop pilot study compares the effectiveness of the REACH device to traditional verbal commands within a modular robotic system. Our results show REACH improves user control, task efficiency, and satisfaction with its intuitive interface and real-time feedback. These findings suggest REACH, and other like systems, can streamline tasks operations and management, aligning with Artemis mission goals. This work is a crucial step toward optimizing Human-Robot Interaction, setting the stage for more effective space missions.

16:42-16:54, Paper TuET4.2	Add to My Program
Design and Preliminary Validation of a Tactile Feel-Through Display for Virtual Curvature Rendering Exploiting Haptic Invariants

Cei, Gianmarco (University of Pisa), Vena, Danilo (Università Degli Studi Di Pisa), Susini, Paolo (University of Pisa), Bianchi, Matteo (University of Pisa)
Keywords: Novel Interfaces and Interaction Modalities, Virtual and Augmented Tele-presence Environments Abstract: The tactile Augmented Reality (t-AR) paradigm allows the delivery of controlled skin stimuli to elicit perceptual responses related to virtual haptic properties, or to manipulate without blocking the natural perception of real items. It could foster human-robot interaction, both in collaborative robotics and telerobotics, in every-day life applications. Tactile displays for t-AR employ mechanically transparent, feel-through user interfaces. In this work, we present a tactile display that can elicit the perception of concave curved surfaces, exploiting a soft and highly underactuated fabric interface, together with tendon-driven actuation. Taking inspiration from soft continuous robotics, to control the shape of the device soft continuous interface we designed a Finite Element simulator, and used it to determine optimal control sequences to produce the sensation of the desired curvature. To this aim, we heavily leveraged the theory of haptic invariants, focusing on the growth of the contact area on the fingerpad as a cue for curvature perception. We validated the system with human experiments, by asking participants to match the virtual rendered curvatures with that of real objects. Results, although preliminary, are promising, with the worst-case accuracy well above the chance level, suggesting that the system could represent a viable solution to elicit the perception of different object curvatures. Our work represents first attempt at designing t-AR systems for curvature display, laying down the foundations of a new framework for t-AR system design that exploits haptic invariant theory and integrates control techniques inspired by soft robotics.

16:54-17:06, Paper TuET4.3	Add to My Program
"The Wooden Gripper Was Warmer and Made the Robot Less Threatening"– a Study on Perceived Safety Based on Robot Gripper’s Visual and Tactile Properties

Meijer, Frida (University of Oslo), S. Lindblom, Diana (University of Oslo), Baselizadeh, Adel (University of Oslo (UiO)), Torresen, Jim (University of Oslo)
Keywords: Social Touch in Human–Robot Interaction, Ethical Issues in Human-robot Interaction Research, Applications of Social Robots Abstract: An ageing population and the need of providing adequate care have led to developing robots to relieve healthcare workers and to assist individuals in their own homes. However, the successful integration of robots in such settings relies on more than just ensuring physical safety associated with physical risks (e.g., collisions): it also requires the user’s perceived safety – the users perception of the robot as not doing any harm. This paper explores the potential influence of a robot gripper’s visual and tactile properties, such as materials and texture, on the users’ perceived safety and comfort of human-robot interaction. An initial survey was distributed to 53 participants, exploring five (n=5) robot gripper designs focusing on the robots’ gripper shape. One design shape was thereafter selected to be constructed as a cover to be placed over the parallel grippers of the TIAGo robot, by using 1) wood filament and 2) plastic. The covers were then tested in an experimental setting with 11 participants. The covers were attached to the TIAGo mobile manipulator robot and participants interacted with both of the designed gripper covers within a controlled laboratory environment. A questionnaire was distributed to all 11 experiment participants, at different stages of the interactions. The findings indicate that the material of the gripper influenced participants’ sense of comfort, familiarity, and perceived capabilities of the robot. The study suggests that perceived safety in human-robot interaction (HRI) is shaped not only by physical factors but also by how materials are personally and contextually interpreted. To better support safe and comfortable interactions, further research is needed to understand how material choices shape users’ perceived safety.

17:06-17:18, Paper TuET4.4	Add to My Program
TIAGo Head: An AI Powered Platform for Social Robotics

Cooper, Sara (IIIA-CSIC), Lemaignan, Séverin (PAL Robotics), Ros, Raquel (IIIA-CSIC), Ferrini, Lorenzo (PAL Robotics), Gebellí, Ferran (PAL Robotics), Juricic, Luka (PAL Robotics), Miguel, Narcís (PAL Robotics), Marchionni, Luca (PAL Robotics), Ferro, Francesco (PAL Robotics)
Keywords: Robot Companions and Social Robots, Applications of Social Robots, Robots in Education, Therapy and Rehabilitation Abstract: This paper presents the TIAGo Head, a new tabletop social robot from PAL Robotics, focusing on its capabilities as an HRI platform. We detail the robots' hardware, highlighting its sensors/actuators and on-board computing; and its software architecture, including social perception, expressive face, a knowledge base, and integration with large language models (LLMs) for natural conversations. We also describe a use-case in a receptionist scenario where TIAGo Head dynamically interacts with travelers by displaying news and conversing.

17:18-17:30, Paper TuET4.5	Add to My Program
Online Human Action Detection During Escorting

Mondal, Siddhartha (TCS Research), Mitra, Avik (TCS Research), Sarkar, Chayan (TCS Research)
Keywords: Detecting and Understanding Human Activity, Machine Learning and Adaptation, Social Intelligence for Robots Abstract: The deployment of robot assistants in large indoor spaces has seen significant growth, with escorting tasks becoming a key application. However, most current escorting robots primarily rely on navigation-focused strategies, assuming that the person being escorted will follow without issue. In crowded environments, this assumption often falls short, as individuals may struggle to keep pace, become obstructed, get distracted, or need to stop unexpectedly. As a result, conventional robotic systems are often unable to provide effective escorting services due to their limited understanding of human movement dynamics. To address these challenges, an effective escorting robot must continuously detect and interpret human actions during the escorting process and adjust its movement accordingly. However, there is currently no existing dataset designed specifically for human action detection in the context of escorting. Given that escorting often occurs in crowded environments, where other individuals may enter the robot’s camera view, the robot also needs to identify the specific human it is escorting (the subject) before predicting their actions. Since no existing model performs both person re-identification and action prediction in real-time, we propose a novel neural network architecture that can accomplish both tasks. This enables the robot to adjust its speed dynamically based on the escortee's movements and seamlessly resume escorting after any disruption. In comparative evaluations against strong baselines, our system demonstrates superior efficiency and effectiveness, showcasing its potential to significantly improve robotic escorting services in complex, real-world scenarios.


TuET5 Regular Session, Auditorium 5	Add to My Program
Linguistic Communication and Dialogue III

Chair: Cao, Lu	Honda Research Institute Japan
Co-Chair: SHEN, Yifan	Hong Kong University of Science and Technology

16:30-16:42, Paper TuET5.1	Add to My Program
Improving the Efficiency of Grounding Language Instructions That Refer to the Future State of Objects for Human-Robot Interaction

Shakir, Katelyn (University of Rochester), Rahman, Tabib Wasit (University of Rochester), Howard, Thomas (University of Rochester)
Keywords: Linguistic Communication and Dialogue Abstract: Efficient grounding of spatiotemporal relationships in language-based interactions remains a significant challenge for human-robot teams. Such relationships are commonly used when objects cannot be uniquely identified by visual features. Recent methods have demonstrated effective symbol grounding using iterative solves of probabilistic graphical models with simulators to predict the future state of the world. A limitation of such approaches is the requirement that inference is performed at each timestep to determine if the meaning of the statement has changed. In this paper we exploit unique features of the Distributed Correspondence Graph that enable more efficient steps of this architecture that improves the efficiency of symbol grounding. In experiments on examples inspired by previous works involving the grounding of spatiotemporal relationships, we observed an improvement in efficiency between 23% and 26% without a loss in accuracy.

16:42-16:54, Paper TuET5.2	Add to My Program
Towards Online Sign Language Expression for Real-Time Human-Robot Interaction

Khan, Nabeela (Institute of Science Tokyo), Tan, Sihan (Institute of Science Tokyo), Nakadai, Kazuhiro (Institute of Science Tokyo)
Keywords: Assistive Robotics, Creating Human-Robot Relationships, Non-verbal Cues and Expressiveness Abstract: Sign language (SL) is the primary mode of communication for Deaf and Hard-of-Hearing (DHH) individuals and differs fundamentally from spoken languages. While Sign Language Expression (SLE) systems have made significant progress in generating gestures from text using deep learning, their integration into assistive Human-Robot Interaction (HRI) remains limited. This position paper introduces online SLE as a novel paradigm for enabling responsive, real-time SL communication on robotic platforms. We analyze the technical, dataset, and evaluation challenges in deploying SLE models on robots and present preliminary experiments illustrating the trade-offs between efficiency and expressiveness. We further propose design considerations for online model architectures, identify key gaps in current datasets, and call for interdisciplinary collaboration with the Deaf community. Our goal is to pave the way toward inclusive, socially-aware robotic agents capable of natural SL communication.

16:54-17:06, Paper TuET5.3	Add to My Program
The Need for (Robot) Speed: Offloading Heavy Computations Improves Response Time and User Experience in Spoken Interactions

Bartoli, Ermanno (KTH Royal Institute of Technology), Stower, Rebecca (KTH), Donyanavard, Bryan (San Diego State University), Werner, Hanna (KTH Royal Institute of Technology), Tumova, Jana (KTH Royal Institute of Technology), Leite, Iolanda (KTH Royal Institute of Technology)
Keywords: Applications of Social Robots, Computational Architectures, Linguistic Communication and Dialogue Abstract: In this work we present RoDgeR, a system that leverages edge computing to offload computationally demanding tasks for real-time human-robot interaction (HRI). We identify dialogue management as an example of a computationally intensive task and demonstrate that an edge-based Large Language Model (LLM) results in faster response times than both cloud-based and embedded LLMs. We further implement an edge-based LLM in RoDgeR to evaluate user experience with 63 participants in a simulated restaurant scenario. Our results confirm that RoDgeR outperforms embedded and cloud-based solutions, leading to improved user experience. These findings highlight the potential of edge computing for improving the quality of human-robot interactions.

17:06-17:18, Paper TuET5.4	Add to My Program
When to Say "Hi" - Learn to Open a Conversation with an In-The-Wild Dataset

Schiffmann, Michael (TH Köln University of Applied Sciences), Struth, Felix (TH Köln), Jeschke, Sabina (FAU - Friedrich-Alexander University of Erlangen-Nuremberg), Richert, Anja (University of Applied Sciences Cologne)
Keywords: Multimodal Interaction and Conversational Skills, Detecting and Understanding Human Activity, Machine Learning and Adaptation Abstract: The social capabilities of socially interactive agents (SIA) are a key to successful and smooth interactions between the user and the SIA. A successful start of the interaction is one of the essential factors for satisfying SIA interactions. For a service and information task in which the SIA helps with information, e.g. about the location, it is an important skill to master the opening of the conversation and to recognize which interlocutor opens the conversation and when. We are therefore investigating the extent to which the opening of the conversation can be trained using the user’s body language as an input for machine learning to ensure smooth conversation starts for the interaction. In this paper we propose the Interaction Initiation System (IIS) which we developed, trained and validated using an in-the-wild data set. In a field test at the Deutsches Museum Bonn, a Furhat robot from Furhat Robotics was used as a service and information point. Over the period of use we collected the data of N = 201 single user interactions for the training of the algorithms. We can show that the IIS, achieves a performance that allows the conclusion that this system is able to determine the greeting period and the opener of the interaction.

17:18-17:30, Paper TuET5.5	Add to My Program
Autonomous Dialogue Generation Based on Phase Boundary Detection within Continuous Motion for Domestic Robot

Li, Sixia (Japan Advanced Institute of Science and Technology), Miyake, Tamon (Waseda University), Ogata, Tetsuya (Waseda University), Sugano, Shigeki (Waseda University), Okada, Shogo (Japan Advanced Institute of Technology)
Keywords: Linguistic Communication and Dialogue, Machine Learning and Adaptation, Monitoring of Behaviour and Internal States of Humans Abstract: Dialogue generation plays a key role in responding to user and providing transparency in motion execution in human-robot interaction. As motion planning is generally performed in terms of discrete motions, previous studies have focused on dialogue generation at the boundaries between motions. Recently, continuous motion generation was proposed to enable adapting actions to unique characteristics of the objects for domestic robots. Since a continuous motion generally involves physical and nonphysical phases, providing dialogues when the phase changes is crucial for decreasing users' anxiety and guaranteeing safety. However, continuous motions lack clear phase boundaries, posing challenges for dialogue generation between phases. For this problem, we segmented continuous motions into discrete phases, and constructed a system to enable the robot to autonomously generate dialogues by detecting phase boundaries. To do so, we built phase estimation models using robot sensor data and designed system modules. Specifically, we collected data in the scenario of a robot assisting to lift the user up from bed. We segmented the continuous motion into three phases based on the user's posture and whether the robot applied force to the human. The best phase estimation model achieved a macro F1 score of 0.894, demonstrating that phases can be estimated from sensor data. The evaluation results of our system demonstrated that the system accurately detects phase boundaries and generates appropriate dialogues corresponding to phases. Furthermore, we conducted simulations with a user agent to investigate system behaviors when the phase estimation was incorrect. The results suggested that explicitly stating the phase is important for avoiding misunderstandings and safety issues.


TuET6 Regular Session, Auditorium 6	Add to My Program
LLM-Enhanced Social Robotics III

Chair: van den Brandt, Gijs	Eindhoven University of Technology
Co-Chair: Xiong, Songsong	University of Groningen

16:30-16:42, Paper TuET6.1	Add to My Program
Context-Aware Human Behavior Prediction Using Multimodal Large Language Models: Challenges and Insights

Liu, Yuchen (Robert Bosch GmbH), Lerch, Lino (Ulm University), Palmieri, Luigi (Robert Bosch GmbH), Rudenko, Andrey (Robert Bosch GmbH), Koch, Sebastian (Ulm University, Google), Ropinski, Timo (Ulm University), Aiello, Marco (University of Stuttgart)
Keywords: Detecting and Understanding Human Activity, Multi-modal Situation Awareness and Spatial Cognition, Monitoring of Behaviour and Internal States of Humans Abstract: Predicting human behavior in shared environments is crucial for safe and efficient human-robot interaction. Traditional data-driven methods to that end are pre-trained on domain-specific datasets, activity types, and prediction horizons. In contrast, the recent breakthroughs in Large Language Models (LLMs) promise open-ended cross-domain generalization to describe various human activities and make predictions in any context. In particular, Multimodal LLMs (MLLMs) are able to integrate information from various sources, achieving more contextual awareness and improved scene understanding. The difficulty in applying general-purpose MLLMs directly for prediction stems from their limited capacity for processing large input sequences, sensitivity to prompt design, and expensive fine-tuning. In this paper, we present a systematic analysis of applying pre-trained MLLMs for context-aware human behavior prediction. To this end, we introduce a modular multimodal human activity prediction framework that allows us to benchmark various MLLMs, input variations, In-Context Learning (ICL), and autoregressive techniques. Our evaluation indicates that the best-performing framework configuration is able to reach 92.8% semantic similarity and 66.1% exact label accuracy in predicting human behaviors in the target frame. Project webpage: https://cap-mllm.github.io/

16:42-16:54, Paper TuET6.2	Add to My Program
Inferring Human Fairness Judgments with Large Language Models in Human-Robot Interaction Scenarios

Claure, Houston (Yale University), Moosa, Aly (Yale University), Vázquez, Marynel (Yale University)
Keywords: Cognitive Skills and Mental Models, Ethical Issues in Human-robot Interaction Research, Affective Computing Abstract: Prior work has shown that humans are acutely aware of unfair treatment and that fairness is critical for sustaining collaboration. This makes determining whether a robot's behavior is perceived as fair or unfair important in Human-Robot Interaction (HRI). Traditionally, this is achieved via surveys through which people provide their opinion of a robot. However, with advancements in Large Language Models (LLMs) and significant efforts to support AI alignment, we hypothesized that LLMs could help determine human perceptions of fairness in a variety of HRI scenarios in a way that resembles human judgments. We first investigated how effective several LLMs were in inferring human fairness judgments in scenarios where unfair or fair outcomes came about due to a robot's behavior. Then, we compared the open-ended justifications that LLMs and humans provided. Our results suggest that LLM-generated fairness ratings align well with the directionality of human survey responses. However, the justifications that humans provided for their responses differed from LLM justifications. These findings highlight both the potential and limitations of using LLMs to approximate human fairness judgments. Ultimately, this work lays the foundation for enabling robots to autonomously reflect on their behavior through a fairness lens, paving the way for more ethically aligned human-robot collaborations.

16:54-17:06, Paper TuET6.3	Add to My Program
Integrating LLM into a Socially Assistive Robot for Social Dialogue: An Exploratory Study in a Nursing Home

Zhong, Vivienne Jia (FHNW University of Applied Sciences and Arts Northwestern Switze), Studerus, Erich (University of Applied Sciences and Arts Northwestern Switzerland), Vonschallen, Stephan (Zurich University of Applied Sciences (ZHAW), University of Appl)
Keywords: Linguistic Communication and Dialogue, Human Factors and Ergonomics, Robot Companions and Social Robots Abstract: Socially assistive robots (SARs) powered by Large Language Models (LLMs) can offer benefits for improving older adults’ well-being through social dialogues. However, empirical research on older adults’ experiences with LLM-enabled, open-domain conversational robots is limited. This qualitative field study investigated communication dynamics between six older adults and an LLM-powered SAR in a Swiss German nursing home. To conduct this investigation, we integrated ChatGPT-4o-realtime and Microsoft’s speech service into a SAR, specifically enabling real-time user-initiated interruptions and fast response time. Our findings reveal that participants generally found conversations pleasant, appreciating the robot’s personalized response generation. Further, we identify older adults’ interaction patterns during the conversation and uncover several technical barriers such as turn-taking difficulties, occasional misunderstandings, and repetitive response patterns. From these insights, we present design implications to guide the development of LLM-powered SARs that can sustain engaging, open-domain social dialogues with older adults.

17:06-17:18, Paper TuET6.4	Add to My Program
Engagement and Disclosures in LLM-Powered Cognitive Behavioral Therapy Exercises: A Factorial Design Comparing the Influence of a Robot vs. Chatbot Over Time

Kian, Mina (University of Southern California), Zong, Mingyu (University of Southern California), Fischer, Katrin (University of Southern California), Velentza, Anna Maria (Brest National School of Engineering (ENIB)), Singh, Abhyuday (University of Southern California), Shrestha, Kaleen (University of Southern California), Sang, Pau (University of Southern California), Upadhyay, Shriya (University of Southern California), Browning, Wallace (University of Southern California), Faruki, Misha (University of Southern California), arnold, sebastien (University of Southern California), Krishnamachari, Bhaskar (USC Viterbi School of Engineering), Mataric, Maja (University of Southern California)
Keywords: Assistive Robotics, Embodiment, Empathy and Intersubjectivity, Applications of Social Robots Abstract: Many researchers are working to address the worldwide mental health crisis by developing therapeutic technologies that increase the accessibility of care, including leveraging large language model (LLM) capabilities in chatbots and socially assistive robots (SARs) used for therapeutic applications. Yet, the effects of these technologies over time remain unexplored. In this study, we use a factorial design to assess the impact of embodiment and time spent engaging in therapeutic exercises on participant disclosures. We assessed transcripts gathered from a two-week study in which 26 university student participants completed daily interactive Cognitive Behavioral Therapy (CBT) exercises in their residences using either an LLM-powered SAR or a disembodied chatbot. We evaluated the levels of active engagement and high intimacy of their disclosures (opinions, judgments, and emotions) during each session and over time. Our findings show significant interactions between time and embodiment for both outcome measures: participant engagement and intimacy increased over time in the physical robot condition, while both measures decreased in the chatbot condition.


TuET7 Regular Session, Auditorium 7	Add to My Program
Robots in Families, Education, Therapeutic Contexts & Arts III

Chair: Jullens, Monique Schaule	University of Amsterdam

16:30-16:42, Paper TuET7.1	Add to My Program
Human Teaching Patterns in Interactive Robot Learning from Multiple Teaching Modalities

Christofi, Konstantinos (Vrije Universiteit Amsterdam), Tichelaar, Caroline (Vrije Universiteit Amsterdam), Preciado Vanegas, Daniel Fernando (Vrije Universiteit Amsterdam), Baraka, Kim (Vrije Universiteit Amsterdam)
Keywords: Social Learning and Skill Acquisition Via Teaching and Imitation, Motivations and Emotions in Robotics, Multimodal Interaction and Conversational Skills Abstract: Human-interactive robot learning allows a robot to learn tasks more effectively with the help of humans in the role of teacher. While there is a large body of work on algorithms that leverage human input for better robot learning, there has been little attention to understanding how humans teach robots. In this paper, we provide preliminary results on how users strategize the use of demonstrations and evaluative feedback under a budget, and how these choices are influenced by demographic variables such as gender. We implemented a learning algorithm that allows a simulated robot arm to learn three reaching tasks with the help of a human. We collected interaction data for a total of 58 participants, which shows that participants demonstrate a tendency to provide evaluative feedback earlier in their interactions compared to demonstrations, and that gender may have an influence on teaching strategy. This preliminary analysis lays the foundation for future research aimed at developing tuneable computational models of different human teachers.

16:42-16:54, Paper TuET7.2	Add to My Program
Interactive Robotic Painting: A Multi-Mode System for Human-Robot Art-Making

Brass, Emma (University of Liverpool)
Keywords: Robots in art and entertainment, Art pieces supported by robotics, Novel Interfaces and Interaction Modalities Abstract: The integration of robots into creative domains presents new opportunities for artistic expression. This work introduces a robotic painting system designed to facilitate intuitive, non-programmatic interaction through two distinct modes of engagement. In the first mode, a human user and a robot co-create an abstract painting by taking turns making marks, with the robot responding based on image analysis and predefined artistic rules. The second mode allows the robot to autonomously generate an outline portrait of the user based on image segmentation and facial feature extraction. The system employs ROS2 for task orchestration, animatronic eyes to enhance user engagement, and ChatGPT for conversational feedback.

16:54-17:06, Paper TuET7.3	Add to My Program
Social Robot-Led Yoga for Older Adults: A Feasibility Study

Mapoles, Sean (Northen Arizona University), Melgar-Donis, Stephanie (University of Denver), Ghiglieri, Jason (University of Denver), Siewierski, Jarid (University of Denver), Abdollahi, Hojjat (University of Denver), Gorgens, Kim (University of Denver), Mahoor, Mohammad (University of Denver)
Keywords: Robots in Education, Therapy and Rehabilitation, Applications of Social Robots, Assistive Robotics Abstract: Yoga and other forms of exercise have demonstrated protective health benefits for older adults including enhanced body flexibility, balance, joint mobility, and cognitive function. Social robots are increasingly being used as carers for older adults, including instruction to bolster mental and physical wellbeing. The primary objective of this study was to measure the impact of social robot-led yoga with six female older adult participants across twenty-four session of social robot-led yoga. This study used a pre-peri-post design to compare participants’ biopsychosocial measurements before, during, and after yoga participation. We analyzed the impact across measures of fall resilience, overhead flexion, mindfulness skills, and task engagement. Participants demonstrated statistically significant improvement in fall resilience and statistically non-significant improvements in overhead flexion and mindfulness skills. Task engagement increased throughout the study. As one of the first studies to investigate the impact of social robot-led yoga with older adults, the findings from this study provide statistically significant evidence that older adults may see improved non-dominant single-leg balance from social robot-led yoga. Furthermore, older adults may experience maintained if not improved, overhead shoulder range of motion (ROM) and mindfulness skills from social robot-led yoga. When considered with other research on the benefits of yoga with older adults, these findings suggest that social robot-led yoga instruction warrants additional investigation.

17:06-17:18, Paper TuET7.4	Add to My Program
Employing Laban Shape for Generating Emotionally and Functionally Expressive Trajectories in Robotic Manipulators

Bangalore Raghu, Srikrishna (University of Colorado Boulder), Lohrmann, Clare (University of Colorado Boulder), Bakshi, Akshay (University of Colorado Boulder), Kim, Jennifer (University of Colorado Boulder), Caraveo Herrera, Jose Alejandro (University of Colorado Boulder), Hayes, Bradley (University of Colorado Boulder), Roncone, Alessandro (University of Colorado Boulder)
Keywords: Non-verbal Cues and Expressiveness, Motivations and Emotions in Robotics, Cooperation and Collaboration in Human-Robot Teams Abstract: Successful human-robot collaboration depends on cohesive communication and a precise understanding of the robot’s abilities, goals, and constraints. While robotic manipulators offer high precision, versatility, and productivity, they exhibit expressionless and monotonous motions that conceal the robot’s intention, resulting in a lack of efficiency and transparency with humans. In this work, we use Laban notation, a dance annotation language, to enable robotic manipulators to generate trajectories with functional expressivity, where the robot uses nonverbal cues to communicate its abilities and the likelihood of succeeding at its task. We achieve this by introducing two novel variants of Hesitant expressive motion (Spoke-Like and Arc-Like). We also enhance the emotional expressivity of four existing emotive trajectories (Happy, Sad, Shy, and Angry) by augmenting Laban Effort usage with Laban Shape. The functionally expressive motions are validated via a human-subjects study, where participants equate both variants of Hesitant motion with reduced robot competency. The enhanced emotive trajectories are shown to be viewed as distinct emotions using the Valence-Arousal-Dominance (VAD) spectrum, corroborating the usage of Laban Shape.

Technical Program for Tuesday August 26, 2025