WRC SARA 2025 Program | Sunday August 10, 2025


SuPLPL	Room D
Plenary: The Challenges to Realize Embodied AI/Physical AI -- Wolfram Burgard, Professor of Computer Science at the University of Technology Nuremberg, Germany	Plenary Sessions


SuKNKN	Room D
Keynote: Social Robotics - from Technological Performance to Ecologic Social Uses: A Missing Link? -- Sophie Sakka, Professor, French National Higher Institute for Research and Training in Inclusive Education (INSEI), France	Plenary Sessions


SuS1A	Room D
Award Session	Regular Sessions
Chair: Wang, Zhidong	Chiba Institute of Technology

10:50-11:02, Paper SuS1A.1
Style-Aware and Robust Sim-To-Real Humanoid Locomotion with Contact-Rich Motion Priors and Online Dynamics Adaptation

Zhang, Jinlin	Zhejiang University
Gan, Chunbiao	Zhejiang University
Keywords: Humanoid Robots, Machine Learning, Path and Motion Planning Abstract: Achieving natural and expressive full-body motion remains a significant challenge for humanoid robots. This work proposes a sim-to-real control framework integrating an online dynamics inference and adaptation module with a Contact-Rich Adversarial Motion Prior (CRAMP) to enable robust and style-aware motion control for humanoid robots with high gear ratio actuation. The dynamics adaptation module relies solely on proprioceptive feedback and multi-point foot contact modeling to estimate and adapt critical dynamic parameters online, effectively narrowing the simulation-to-reality gap. Meanwhile, CRAMP extends conventional adversarial motion priors by explicitly incorporating the robot-environment contact structure into the discriminator input, enabling the policy to learn end-to-end expressive behaviors with consistent styles and coordinated contact transitions, without relying on separate upper-lower body models or multi-stage training processes. The control policy is trained via reinforcement learning in simulation and successfully transferred to real hardware without additional tuning. Experimental results from GTX-III demonstrate that the proposed method stably generates natural and stylistically diverse full-body motions, exhibiting clear capabilities in gait rhythm modulation, posture tension control, asymmetric adaptation, and contact strategy adjustment, and delivering multidimensional stylistic responses in context-rich human-like scenarios.

11:02-11:14, Paper SuS1A.2
Surgical Phase Detection with Deep Learning for Thoracoscopic Pulmonary Lobectomy Surgery

Yang, Qian	Beijing University of Posts and Telecommunications
Liu, Chang	Beijing University of Posts and Telecommunications
Zhou, Zhuojia	Beijing University of Technology
Feng, Yongqiang	Plastic Surgery Hospital, Chinese Academy of Medical Sciences And
Zheng, Heng	Chinese Academy of Medical Sciences and Peking Union Medical Col
Wang, Junchen	Beihang University
Tang, Jie	Beijing Tian Tan Hospital, Capital Medical University
Hou, Yuanzheng	Xuanwu Hospital, Capital Medical University
Su, Baiquan	Beijing University of Posts and Telecommunications
Zhang, Xiaoya	CIE
Keywords: Medical Robotics, Deep Learning Abstract: Accurate recognition of surgical phases is essential for clinical education and the advancement of robotic surgery. Although significant progress has been made in automated surgical phase recognition for various procedures, a notable research gap remains in the context of video-assisted thoracoscopic pulmonary lobectomy, a highly complex and technically demanding operation. To address this gap, this study is the first to comprehensively investigate automated phase recognition in thoracoscopic pulmonary lobectomy. Our principal contributions include: (1) the creation of the first expert-annotated private video dataset for pulmonary lobectomy, providing a critical resource for this understudied surgical domain; and (2) the optimization and adaptation of the SV-RCNet model, which achieves robust phase recognition by effectively capturing both visual and temporal features. Experimental results demonstrate that our approach achieves a surgical phase recognition accuracy of 92.4%, comparable to the state-of-the-art performance in other surgical phase recognition tasks. This work not only fills a critical research gap, but also lays a solid foundation for the future development of computer-assisted thoracoscopic surgery, offering significant clinical and educational value for surgical training and the advancement of intelligent robotic systems.

11:14-11:26, Paper SuS1A.3
A Human–Robot Collaborative System for Maxillofacial Osteotomy Assisted by Virtual Fixtures Based on Admittance Control

Deng, Yingyan	Beihang University
Lu, Chunheng	Beihang University
Liu, Xinyu	Peking University School and Hospital of Stomatology
He, Yang	Peking University School and Hospital of Stomatology
Wang, Junchen	Beihang University
Keywords: Medical Robotics, Human-Robot Interaction and Cooperation, Grasping and Manipulation Abstract: Precise and safe execution of maxillofacial osteotomy remains challenging due to complex anatomy and dynamic human–robot interaction. This study proposes a collaborative control system combining parametrized-surface virtual fixtures and admittance-based hybrid force–position control. The osteotomy surface is extracted from preoperative CT data, projected, and spline-fitted to generate a virtual constraint. An anisotropic admittance model ensures compliant motion in the cutting direction while enforcing geometric constraints in the normal and depth directions. The control scheme was implemented on a UR5e manipulator and validated through three experimental trials on identical anatomical mandible models. Postoperative CT scans were registered to preoperative plans, and cutting accuracy was evaluated using 13 anatomical landmarks per model. The overall mean error was 0.81 mm, with a maximum deviation of 1.48 mm and a standard deviation of 0.53 mm. A 3D error map revealed localized deviations. The robot demonstrated stable and responsive behavior across trials, with smooth tool interaction and no safety violations. These results confirm that the proposed system achieves millimeter-level precision while supporting intuitive human–robot cooperation, offering a promising basis for future clinical translation.
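
Illustrative note (not from the paper): the idea of an anisotropic admittance model can be sketched as a per-axis virtual mass-damper, M*dv/dt + D*v = f, with a low damping along the cutting axis and a much higher damping along the normal and depth axes. The axis ordering, the numeric gains, and the function names below are hypothetical, and using high damping as a stand-in for the geometric constraints is an assumption made only for illustration.

```python
def admittance_step(velocity, force, mass, damping, dt):
    """One explicit-Euler step of M*dv/dt + D*v = f, applied to each axis independently."""
    return [v + dt * (f - d * v) / m
            for v, f, m, d in zip(velocity, force, mass, damping)]

# Hypothetical tool-frame axes: [cutting, normal, depth].
MASS = [2.0, 2.0, 2.0]            # virtual mass in kg (assumed values)
DAMPING = [20.0, 2000.0, 2000.0]  # virtual damping in N*s/m (assumed values)

def steady_velocity(force, steps=5000, dt=0.001):
    """Integrate the admittance model from rest under a constant applied force."""
    velocity = [0.0, 0.0, 0.0]
    for _ in range(steps):
        velocity = admittance_step(velocity, force, MASS, DAMPING, dt)
    return velocity

# The same 5 N push on every axis yields very different commanded velocities.
print(steady_velocity([5.0, 5.0, 5.0]))
```

At steady state each axis settles at f/D, so with these assumed gains the commanded velocity along the cutting axis is 100 times larger than along the two stiff axes for the same applied force.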

11:26-11:38, Paper SuS1A.4
Phase Surface Electromyography (sEMG) Based Muscle Onset Detection

Yuan, Wenbo	The University of Hong Kong
Zhou, Changqiu	The University of Hong Kong
Zhao, Yafei	The University of Hong Kong
Ling, Zi-qin	Shenzhen University
Chen, Jiangcheng	The University of Hong Kong
Xi, Ning	The University of Hong Kong
Keywords: Human-Robot Interaction and Cooperation, Sensing, Haptic System, Soft Robotics Abstract: Surface electromyography (sEMG) signals, as physiological signals, have natural advantages in robot control and human-machine collaboration. Using electrical signals transmitted from the brain to the muscles through the central nervous system to control robots can improve the collaboration between robots and humans, making robot control highly real-time and mechanically matched. Using electromyographic signals to detect the activation state of robots helps to quickly switch between different robot states. Based on the propagation characteristics of neuromuscular junctions and electromyographic signals on the skin surface, we propose a muscle activation detection method based on opposite phase sEMG. The onset signal obtained with this method is about 60 ms earlier than the joint force output, which means that muscle activation information can be obtained within the electromechanical delay (EMD) range. Using this method, the information carried by electromyographic signals can be expanded from the degree of activation to whether activation has occurred, providing more reference information for human-machine collaboration, fitness training, and disease evaluation.

11:38-11:50, Paper SuS1A.5
Ultra-Sensitive Aorta Pressure Sensor Based on Graphene

Zhao, Jing	Beijing Institute of Technology
Yin, Yiqi	Beijing Institute of Technology
Li, Zhongyi	Beijing Institute of Technology
Feng, Zhejian	Beijing Institute of Technology
Zeng, Xueer	Beijing Institute of Technology
Keywords: Sensing, Haptic System, Smart Structures, Materials, Actuators, Emerging Technologies and Applications Abstract: Graphene-based strain sensors have attracted much attention recently. Usually, there is a tradeoff between the sensitivity and resistance of such devices, because devices with larger resistance consume more energy. An aorta pressure sensor must also be small so that it is less invasive to the human body. In this paper, we report an ultra-sensitive graphene-based pressure sensor whose sensitivity can be tuned through its original resistance under different growth conditions. For a typical pressure sensor device, a gauge factor of ~10^3 can be achieved with a sheet resistance of ~100 kΩ/□. The flexible pressure sensor, placed in vivo or in vitro, can sense both varying blood pressure and heart rate. Cardiac arrhythmias such as atrial fibrillation and ventricular premature contraction can be detected in real time. After a test of more than 10^4 cycles, the performance of the device remains basically unchanged, with a fast response time of less than 50 ms. The highly sensitive, small-volume graphene-based aorta pressure sensor shows great potential for the future healthcare industry.

11:50-12:02, Paper SuS1A.6
A Novel Dual-Modeling Framework for Precision Control of Servovalve Electrohydraulic Actuators

Oyeranmi, Sheriff	Paris Saclay University
Su, Hang	Paris Saclay University
Mammar, Saïd	IBISC
Alfayad, Samer	Paris-Saclay Université -Evry University
Keywords: Intelligent Control and Systems, Intelligent Control, Human-Robot Interaction and Cooperation Abstract: Electrohydraulic actuators (EHAs) are increasingly used in robotics because of their superior power density, bandwidth, and force control. However, accurately modeling their behavior is challenging due to the complex internal dynamics of the servovalve. This paper presents a dual-modeling methodology for a symmetric double-acting cylinder. The first model is a nonlinear mathematical model derived from fluid continuity, orifice flow, and Newton's motion equations (including a hard-stop spring-damper to emulate end caps). The second is a component-level Simscape implementation that mirrors the hardware architecture. The initial parameters were obtained from direct measurements of a dismantled servovalve and manufacturer datasheets. The remaining eleven unknowns were identified via nonlinear least-squares optimization using the Trust-Region Reflective algorithm against experimental sinusoidal displacement data. Beyond regular validation with piston displacement, multi-metric validation of both models' behavior verified their ability to track sinusoidal inputs in displacement and to reproduce the expected chamber pressure dynamics. This rigorously validated dual-model framework provides analytical clarity and high-fidelity simulation for designing precise position, force, and compliant controllers for EHAs, enabling safe human-robot interaction.


SuS1B	Room E
Vision	Regular Sessions
Chair: Xu, Jing	Tsinghua University

10:50-11:02, Paper SuS1B.1
An Adaptive Peak Detector for Frequency Shift Methods in Fringe Projection Profilometry

Hu, Changping	Tsinghua University
Chen, Rui	Tsinghua University
Xu, Jing	Tsinghua University
Keywords: Robot Vision and Computer Vision Abstract: 3D measurement of transparent objects is a challenging task for fringe projection profilometry, since each camera pixel captures an accumulation of information from multiple projector pixels. While existing frequency shift methods can decouple projector pixels, they cannot identify correct surface points from the decoupled results, due to the complexity of refraction and reflection. To address this challenge, an adaptive peak detector, robust to different reflected light intensities, is proposed for transparent object measurement. Experimental results demonstrate that our method achieves more accurate and complete measurements of transparent objects.

11:02-11:14, Paper SuS1B.2
A Multi-Stage Automatic Framework for Fracture Fragments Segmentation of Zygomatic Bone and Arch

Mo, Hao	Beihang University
Zhang, Runshi	Beihang University
Liu, Runqi	Peking University School of Stomatology
Tong, Yanhang	Peking University
Wang, Junchen	Beihang University
Keywords: Robot Vision and Computer Vision, Deep Learning, Artificial Intelligence Abstract: Zygomatic bone (ZB) and zygomatic arch (ZA) fractures are among the most common cranio-maxillofacial (CMF) injuries, posing significant challenges for accurate preoperative assessment and surgical planning due to the complex anatomy and high demand for facial symmetry restoration. To address these challenges, we propose a novel deep learning-based segmentation framework specifically designed for automated analysis of ZA and ZB fracture fragments. The framework consists of three major components: a 3D Faster R-CNN network to detect the ZA and ZB regions within cranial CT volumes, a 3D ConvNeXt encoder combined with a UPerNet decoder for coarse segmentation, and a lightweight 2D segmentation network for fine-grained fracture line identification. Experimental results on a dataset of 186 CT volumes demonstrate high accuracy in region detection (mAP: 90.66%) and effective ZA and ZB region segmentation (mDice: 93.52%). While the performance of fracture line segmentation was limited by the scarcity of annotated data, the proposed pipeline significantly enhances the automation and precision of preoperative planning for ZA and ZB fracture cases. This study facilitates CMF surgical planning and advances the development of autonomous and intelligent robotic surgical systems.

11:14-11:26, Paper SuS1B.3
OOPS: Occlusion-Aware Optimization-Based 6D PoSe Estimation of Unseen Transparent Objects

Pun, Chifai	Tsinghua University
Xu, Jing	Tsinghua University
Hu, Changping	Tsinghua University
Chen, Rui	Tsinghua University
Keywords: Robot Vision and Computer Vision Abstract: In this paper, we introduce OOPS, a CAD model-based pipeline that reframes the 6D pose estimation of novel transparent objects with known CAD models as an optimization-based alignment problem. The goal is to align the silhouettes and borders of observed transparent objects with those rendered from CAD models.

11:26-11:38, Paper SuS1B.4
Robust Monocular Distance Estimation in Degraded Visual Conditions Via a Wide-And-Deep Fusion Framework

Wang, Richard	BASIS Independent Silicon Valley Upper School
Han, Grant	VEX V5RC Team 1698V
Liu, Alexander	Basis Independent Silicon Valley Upper School
Ye, Zixuan	VEX V5RC Team 1698V
Deng, Alexander	Saratoga High School
Liu, Lexie	VEX V5RC Team 1698V
Wei, Xing	Basis Independent Silicon Valley Upper School
Keywords: Robot Vision and Computer Vision, Deep Learning, Machine Learning Abstract: Robust visual perception remains a central challenge for autonomous systems operating under degraded environmental conditions such as fog, low light, and artificial glare. In this study, we investigate multiple strategies to enhance the resilience of vision-based models in such scenarios, using monocular distance estimation as a representative task due to its sensitivity to visual quality and availability of ground-truth metrics. A Convolutional Neural Network (CNN) is first trained on clean visual data using a single-camera robotic platform, and its direct inference results serve as the baseline. We then explore three enhancement strategies: (1) image preprocessing with a transformer-based large model, (2) model fine-tuning on low-visibility data, and (3) a novel fusion framework that integrates temporal motion history with image features. Inspired by wide-and-deep learning structures, the proposed fusion model combines shallow inputs (e.g., recent speed and displacement) with deep CNN-based image embeddings for improved robustness. Experimental results across various lighting scenarios demonstrate that while multiple enhancement strategies improve estimation accuracy, the fusion model achieves the most consistent and efficient performance. The findings provide insight into designing more reliable vision systems for low-cost robotics and general perception tasks under real-world variability.

11:38-11:50, Paper SuS1B.5
Silhouette-Guided Diffusion Model for Transparent Object Reconstruction

Hu, Changping	Tsinghua University
Xu, Jing	Tsinghua University
Pun, Chifai	Tsinghua University
Chen, Rui	Tsinghua University
Keywords: Robot Vision and Computer Vision Abstract: Depth reconstruction for transparent objects is a challenging problem, where surface feature matching methods are hindered by complex refraction and reflection. We propose a novel transparent object reconstruction pipeline with a guided object-centric 3D diffusion model. Specifically, we train an unconditional 3D diffusion model with only 3D point cloud data. To control the output of the diffusion model, we design a silhouette-based guidance function for each step of the diffusion process. Experimental results show that our method can achieve state-of-the-art performance for transparent object depth reconstruction compared to existing depth regression and completion methods.

11:50-12:02, Paper SuS1B.6
Free-Viewpoint Multi-View Face Reconstruction Using Deep Learning Method

Li, Zhongtian	Beihang University
Zhang, Runshi	Beihang University
Jie, Bimeng	Peking University
He, Yang	Peking University School and Hospital of Stomatology
Wang, Junchen	Beihang University
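Keywords: Deep Learning, Robot Vision and Computer Vision Abstract: Accurately recovering the 3D shape of a face from 2D images is a challenging task with many applications. Although some progress has been made in research based on 3D morphable models, most current methods focus on reconstruction from a single image. The limited information contained in a single image inevitably limits the effectiveness of face reconstruction. This study proposes a deep learning-based multi-view reconstruction method using a 3D morphable model. It does not require real subjects; learning uses only weak supervision with multiple face images, obtaining an accurate face ... under free-viewpoint conditions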


SuS2A	Room D
Robotics and AI for Soft Materials	Invited Sessions
Chair: Wang, Zhidong	Chiba Institute of Technology

14:20-14:32, Paper SuS2A.1
Fabric Modelling for Robot Applications (I)

Bhattacharya, Dipankar	The University of Hong Kong
Kosuge, Kazuhiro	The University of Hong Kong
Keywords: Intelligent Control and Systems Abstract: Precise fabric manipulation—such as folding, hanging, and aligning textiles for sewing—is a core challenge in garment manufacturing automation. These manipulation tasks are especially difficult for robots due to the highly deformable nature of fabric, which exhibits infinite configurations, non-linear dynamics, and frequent self-occlusion. Achieving reliable perception and control for fabric manipulation remains an open problem in robotics. To address this, we combine physics-based models, like mass-spring systems and position-based dynamics, with data-driven approaches for accurate cloth state estimation and manipulation tracking. At the heart of our method is a Graph Attention Network (GAT), which predicts and reconstructs the mesh state of fabric from real-time sensory data, enabling robust perception for manipulation even in complex scenarios. These GAT-based mesh representations are then used to train an Action Chunking Transformers (ACT) network using real human demonstrations. The ACT network learns to generate effective action sequences that move fabric to desired configurations, enabling fine manipulation with limited training data. By integrating GAT-based perception with ACT’s sequential action modeling, our system enables the robot to align a fabric with a desired configuration with high reliability and efficiency, with only a small number of demonstrations.

14:32-14:44, Paper SuS2A.2
Fabric Handling with Passive Actuator Less Grippers (I)

Seino, Akira	Centre for Transformative Garment Production
Kosuge, Kazuhiro	The University of Hong Kong
Keywords: Intelligent Control and Systems, Grasping and Manipulation Abstract: The automation of fabric destacking and pick-and-place operations is critical for garment manufacturing. Conventional grasping solutions, such as pinching, suction, electro-adhesive, or needle-based grippers, are fundamentally limited by their reliance on active actuation and external power sources. We propose the Passive Actuator-Less Gripper (PALGRIP), a novel, fully mechanical end-effector for fabric manipulation. PALGRIP uses an internal mechanism that converts the relative movement between the gripper housing and the fingers into an open-and-close motion of the fingers, enabling it to grasp the single topmost piece of fabric from a fabric stack without requiring any onboard power or tethered supplies. Thanks to its passive, power-free design, PALGRIP facilitates low-cost, simplified system integration, presenting a significant advancement in accessible automation for the textile industry. In this talk, the concept and mechanical design of PALGRIP and its application will be presented.

14:44-14:56, Paper SuS2A.3
Fabric Handling System (I)

Kobayashi, Akinari	Centre for Transformative Garment Production
Kosuge, Kazuhiro	The University of Hong Kong
Keywords: Intelligent Control and Systems, Grasping and Manipulation Abstract: Robotic handling of thin, soft, and flexible materials such as fabric presents significant challenges because they are prone to wrinkling and losing their shape. Keeping fabric flat requires either controlled tension or complete support. To address this issue, we propose a novel robotic end-effector that handles a piece of fabric by rolling it up. The end-effector incorporates a roller equipped with suction, which first adheres an edge of the fabric to the roller using suction and then rolls it up smoothly to avoid wrinkling. We implemented this end-effector on a dual-arm robot equipped with force sensors, allowing coordinated dual-arm manipulation with dynamic tension control to maintain tautness. We evaluated the performance of the system through pick-and-place experiments involving stacked pieces of fabric. The results demonstrate that our approach enables precise and reliable fabric manipulation, highlighting its strong potential for automation in sectors such as garment production.

14:56-15:08, Paper SuS2A.4
Visual Servoing Enhanced with AI (I)

Tokuda, Fuyuki	Centre for Transformative Garment Production
Kosuge, Kazuhiro	The University of Hong Kong
Keywords: Robot Vision and Computer Vision, Artificial Intelligence, Intelligent Control and Systems Abstract: Visual servoing is a technique that enables robots to control their pose using visual feedback from cameras. It is widely used in applications such as object tracking, autonomous navigation, and robotic manipulation. However, most existing systems focus on handling rigid objects, making soft and flexible materials like fabrics a significant challenge for visual servoing. This talk presents a novel CNN-based visual servoing method for the simultaneous positioning and flattening of a soft, non-textured fabric part using a dual-arm manipulator system. The system utilizes multimodal sensory input, grayscale camera images and force/torque data from force sensors, to control the robot arms. To address the difficulty of recognizing the shape of a non-textured fabric surface, structured lighting is applied to enhance surface features. A convolutional neural network is trained using data collected by randomly manipulating the fabric with real robots. The proposed method successfully flattens fabric parts from various initial conditions, including unseen wrinkles.


SuS2B	Room E
Recent Advances and Research Frontiers in Marine and Aerial Robotics	Invited Sessions
Chair: Ji, Daxiong	Zhejiang University

14:20-14:32, Paper SuS2B.1
Mitigating Output Redefinition Error in Trajectory Planning for Underactuated Unmanned Surface Vehicles (I)

Zheng, Yongsheng	Zhejiang University
Jiang, Yuning	Zhejiang University
Xu, Chao	Zhejiang University
Xiang, Ji	Zhejiang University
Keywords: Mobile Robotics, Path and Motion Planning Abstract: Underactuated unmanned surface vehicles (USVs) lack surge-direction controllers, making it difficult to accomplish motion control tasks. To address the issue that available control inputs in the yaw direction cannot directly correspond to position error, an output redefinition method is used to shift the control object from the centroid to an offset point a small distance away from it. This approach directly takes the point's position as the control object to avoid underactuation issues, but it introduces tracking error. This paper proposes a trajectory planning method for underactuated USVs to mitigate output redefinition error. The method translates each point of a known desired trajectory by a certain distance along its tangent direction with sideslip angle compensation, which makes up for the inconsistency between the USV's moving direction and the tangential direction of the desired trajectory, and finally generates a new trajectory for the offset point. This ensures the USV centroid follows the original desired trajectory when the offset point tracks the new one. Compared with direct tracking, it mitigates error due to output redefinition, enabling more accurate trajectory following. Simulation results validate the method's effectiveness.
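
Illustrative note (not from the paper): the translation step described in the abstract can be sketched as shifting each desired point by a fixed distance along the vessel heading, taken here as the local tangent (course) angle minus the sideslip angle. The sign convention for sideslip, the finite-difference tangent estimate, and the function name `offset_trajectory` are assumptions made for this sketch.

```python
import math

def offset_trajectory(points, distance, sideslip=0.0):
    """Shift each (x, y) point by `distance` along the assumed vessel heading.

    The course angle is estimated from neighbouring points; the heading is
    taken as course minus sideslip (an assumed sign convention).
    """
    shifted = []
    last = len(points) - 1
    for i, (x, y) in enumerate(points):
        x0, y0 = points[max(i - 1, 0)]       # previous point (or itself at the start)
        x1, y1 = points[min(i + 1, last)]    # next point (or itself at the end)
        course = math.atan2(y1 - y0, x1 - x0)
        heading = course - sideslip
        shifted.append((x + distance * math.cos(heading),
                        y + distance * math.sin(heading)))
    return shifted

# A straight path along x with zero sideslip is simply shifted forward.
print(offset_trajectory([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], 0.5))
```

With a nonzero sideslip the shifted path rotates away from the tangent, which is the inconsistency between moving direction and tangential direction that the compensation is meant to absorb.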

14:32-14:44, Paper SuS2B.2
Proximal Policy Optimized Tube MPC Fault-Tolerant Control for Thrusters in AUV (I)

Lai, Zehua	Zhejiang University
Xu, Lie	Zhejiang University
Yang, Jinghe	The University of Melbourne
Pu, Ye	University of Melbourne
Ji, Daxiong	Zhejiang University
Keywords: Intelligent Control and Systems Abstract: This paper investigates a fault-tolerant control (FTC) method for underwater vehicles, specifically focusing on small autonomous underwater vehicles (S-AUVs) experiencing thruster faults. We propose a novel control strategy that combines Proximal Policy Optimization with Tube Model Predictive Control (PPO Tube MPC) to enhance system performance under fault conditions. By leveraging a PPO-trained policy to adaptively adjust the auxiliary feedback control law parameters in Tube MPC, this method reduces controller conservativeness while maintaining system stability during fault handling. Simulation results demonstrate that our PPO Tube MPC method significantly outperforms traditional control methods in reducing tracking errors. The PPO-trained policy exhibits strong generalization capability, maintaining effective control across diverse tasks and varying fault occurrence times.

14:44-14:56, Paper SuS2B.3
Performance Evaluation of Flight Control Strategies for a Pesticide-Carrying Unmanned Aerial Vehicle

Arshad, Syed Muhammad Nashit	Shenzhen Technology University, Shenzhen, China
Xu, Haoliang	Shenzhen Technology University
Hussain, Muntazir	Sustech
Khan, Rashid	Shenzhen Technology University, Shenzhen, 518188, China
Meng, Xiangdong	Shenyang Institute of Automation, Chinese Academy of Sciences
Li, Qiang	Shenzhen Technology University
Ming, Zhong	Shenzhen University
Keywords: Intelligent Control, Field Robotics Abstract: The efficient transportation of liquids by unmanned aerial vehicles (UAVs) is of utmost importance in various autonomous missions, including firefighting and field spraying. Nevertheless, liquid sloshing during transportation can lead to undesirable effects such as instability, unwanted forces, position error, and increased control effort, resulting in inefficient power utilization and payload constraints. To mitigate the effects of chlorpyrifos (pesticide) sloshing, a Lagrangian-based dynamic model of the UAV and the resulting slosh was developed. Using a spring-mass-damper (SMD) analogy, the sloshing and the dynamic equations of the quadcopter were modeled based on the geometry of the liquid container. An effective classical control algorithm for a liquid-carrying quadcopter is presented, which has been extensively investigated, validated, and compared. Simulations based on Coppelia V-REP are also presented to investigate the real-time implementation of the proposed system. The results demonstrate a decrease in chlorpyrifos slosh amplitude and, as expected, a reduction in the control effort. These findings have significant implications for improving the quality of quadcopter control in various real-world applications.

14:56-15:08, Paper SuS2B.4
Experimental Investigation of Sloshing Force Prediction Via Deep Learning and Sensor Fusion in Robotics

Arshad, Syed Muhammad Nashit	Shenzhen Technology University, Shenzhen, China
Chen, Mingqi	Shenzhen Technology University
Xu, Haoliang	Shenzhen Technology University
Li, Meng	Shenzhen Technology University
Li, Qiang	Shenzhen Technology University
Ming, Zhong	Shenzhen University
Keywords: Intelligent Control and Systems Abstract: Accurately predicting the force of water in a moving container remains a challenging task. This paper introduces a novel framework for estimating dynamic sloshing forces in liquid-carrying robotic systems, leveraging a CNN-LSTM model enhanced with an attention mechanism and multi-sensor fusion. A rectangular beaker was mounted on a robotic manipulator, which was equipped with a multi-level water height sensor, a 10-axis IMU to monitor beaker motion, and a 3-axis force sensor to capture sloshing-induced forces. The robotic manipulator executed both controlled and random 3D motions with varying velocities and accelerations to induce diverse sloshing dynamics without causing spillage. A sensor fusion algorithm prioritized laser sensor data when ultrasonic readings became unreliable due to high velocities or large sloshing angles. This approach enables real-time sloshing force estimation, laying the foundation for sensor-free systems where forces can be accurately predicted.


SuS3A	Room D
Robotic Learning, Intelligent Control and Design	Regular Sessions
Chair: Chen, Heping	Texas State University

15:10-15:22, Paper SuS3A.1
Online Domain Adaption for Sim2Real Transfer of High-Precision Manipulation with Visuotactile Sensing

Chen, Rui	Tsinghua University
Dang, Renjun	Tsinghua University
Xu, Jing	Tsinghua University
Keywords: Intelligent Control, Machine Learning Abstract: Visuotactile sensors provide high-resolution contact information for contact-rich manipulation tasks using reinforcement learning (RL). However, the simulation-to-real domain gap limits Sim2Real performance in high-precision tasks. We propose an online adaptation framework using a recurrent neural network (RNN) to learn correlations between robot proprioception and tactile signals, enabling policy adaptation. Experiments demonstrate a 96.7% success rate for zero-shot Sim2Real transfer of peg insertion with 40 μm clearance.

15:22-15:34, Paper SuS3A.2
TD3-Based Visual-Tactile Fusion for Dexterous Robotic Grasping

Wang, Xiaoyu	Shandong University
Li, Ke	School of Control Science and Engineering, Shandong University
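Keywords: Grasping and Manipulation, Intelligent Control, Machine Learning Abstract: Robotic grasping in structured environments presents significant challenges due to complex object geometries, diverse material properties, and dynamic operating conditions. Traditional vision-based grasping systems provide spatial awareness, but they often fail to address critical contact dynamics during manipulation. Conversely, tactile-based approaches, although sensitive to contact interactions, lack global situational understanding. Therefore, this paper proposes a novel visual-tactile fusion framework based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) framework, which exploits the synergistic strengths of the visual and tactile modalities to improve humanoid robots' ...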

15:34-15:46, Paper SuS3A.3
Leg Joint Trajectory Planning of a Cat-Inspired Falling Robot Driven by Pneumatic Muscles with Optimal Energy Consumption

Cao, Jian	Hefei University of Technology
Han, Shun	Hefei University of Technology
Zhu, Xiaocong	Zhejiang University
Song, Yunhe	Hefei University of Technology
Keywords: Biologically Inspired Robotics, Path and Motion Planning, Mobile Robotics Abstract: In this paper, leg joint trajectory planning with optimal energy consumption is proposed for a cat-inspired falling robot driven by pneumatic muscles (CIFRDPM) during its diagonal trotting gait. Firstly, the kinematics and energy consumption model of the leg joints of the CIFRDPM are analyzed. Subsequently, a MATLAB/Adams co-simulation model for the leg motion of the robot is established, utilizing the composite pendulum line as a reference trajectory at its foot end. Then, an optimized leg trajectory generation method combining a three-point Fourier series and cubic spline interpolation is developed, with parameters specifically tuned to minimize energy consumption of the leg joints during motion. The simulation results demonstrate that the CIFRDPM with the Fourier-optimized trajectory and with the Fourier plus cubic spline interpolation optimized trajectory achieves energy consumption reductions of 18.12% and 19.63%, respectively, during its diagonal trotting motion, compared to using only the composite pendulum reference trajectory at the foot end.

15:46-15:58, Paper SuS3A.4
Structure Design and Manufacturing Method of the Silicon-Based Elbow Exoskeleton

Hou, Xinyu	Shenyang Aerospace University
Zeng, Xinyu	Shenyang Aerospace University
Zhirui, Zhao	Shenyang Aerospace University
Gang, Liu	Shenyang Aerospace University
Dexing, Shan	Northeastern University
Xu, Jiqian	Shenyang Aerospace University
Keywords: Soft Robotics, Smart Structures, Materials, Actuators Abstract: This paper presents a silicone-based soft elbow exoskeleton designed for rehabilitation, particularly targeting individuals with weakened physical capabilities. The study delves into the working principle and the design process using CAD and CAE. During the preparation process, a 3D printer was used to manufacture the casting mold, and PVA material and carbon fiber weaving were employed to enhance the performance of the air chambers. The experimental setups and performance tests of the proposed exoskeleton are also discussed. The results indicate that the proposed device can assist a passive prosthetic arm weighing up to 1.5 kg in achieving a maximum bending angle of 63.09 degrees. These results also align well with the range of motion required for daily activities of the human elbow joint and demonstrate promising potential for rehabilitation applications.


SuS3B	Room E
Embodied Intelligence and Adaptive Systems	Invited Sessions
Chair: Zhao, Yuliang	Northeastern University at Qinhuangdao

15:10-15:22, Paper SuS3B.1
Gait Phase Recognition Based on a Multimodal Sensing-Driven Smart Shoe System (I)

Hanbing, Liu	Université Paris-Saclay, Université d'Evry Paris-Saclay
Zuo, Chuanlin	University Evry Val d'Essonne, University Paris-Saclay
Bencharif, Loqmane	Paris Saclay
Ibset, Abderahim	University Paris-Saclay
Qi, Wen	Politecnico Di Milano
Su, Hang	Paris Saclay University
Dychus, Eric	Sandyc
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Sensor Networks, Rehabilitation and Assistive Robotics, Medical Robotics Abstract: Plantar pressure distribution is a key biomechanical parameter reflecting gait stability and foot loading characteristics. It plays a crucial role in designing control strategies for exoskeleton-assisted rehabilitation. This study investigates plantar pressure variation, regional loading dynamics and the trajectory of the center of pressure (CoP) during normal walking, based on a previously developed multimodal smart shoe system. The system integrates three types of sensors: flexible film pressure sensors, bending sensors and inertial measurement units (IMUs). We analyze temporal and spatial variations in pressure over the heel, arch and forefoot, generate real-time heatmaps and extract CoP trajectories to assess gait stability. This work provides high-resolution data supporting biomechanical modeling of healthy gait and offers a basis for gait event detection and closed-loop control in future exoskeleton systems.
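As a rough illustration of the CoP extraction mentioned in the abstract above, the center of pressure is commonly computed as the pressure-weighted mean of the sensor positions. The sketch below assumes a hypothetical three-sensor layout (heel, arch, forefoot) with made-up coordinates in millimetres; the function name and layout are illustrative only and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def center_of_pressure(positions: List[Point],
                       pressures: List[float]) -> Optional[Point]:
    """Return the pressure-weighted mean of the sensor positions.

    Returns None when the total pressure is not positive (e.g. swing
    phase, foot off the ground), since the CoP is undefined there.
    """
    if len(positions) != len(pressures):
        raise ValueError("positions and pressures must have the same length")
    total = sum(pressures)
    if total <= 0:
        return None
    x = sum(px * p for (px, _), p in zip(positions, pressures)) / total
    y = sum(py * p for (_, py), p in zip(positions, pressures)) / total
    return (x, y)


# Hypothetical layout: heel, arch and forefoot sensors along the foot axis.
SENSOR_POSITIONS = [(0.0, 0.0), (0.0, 100.0), (0.0, 200.0)]

# Forefoot loaded twice as much as heel and arch: the CoP shifts forward.
cop = center_of_pressure(SENSOR_POSITIONS, [1.0, 1.0, 2.0])
```

Evaluating this for every sample of a stride would give a CoP trajectory of the kind the abstract describes.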

15:22-15:34, Paper SuS3B.2
RTK and IMU Fusion Positioning Technology for Orchard Robot

Wang, Hao	Yantai University
Wang, Fei	Yantai University
Keywords: Agricultural Robotics, Path and Motion Planning Abstract: Positioning is fundamental for enabling intelligent operations of orchard equipment. However, satellite-based positioning signals are prone to occlusion and multipath effects caused by tree canopies and surrounding obstacles, resulting in degraded localization accuracy. To address these challenges in orchard environments, this paper introduces a measurement-deviation-based extended Kalman filter fusion algorithm. The proposed method enhances state estimation accuracy by adaptively adjusting measurement noise covariances and mitigating bias accumulation due to sensor drift and intermittent signal outages. Consequently, it offers improved robustness and adaptability in real-time state estimation. Experimental results demonstrate that this fusion scheme significantly enhances both the accuracy and real-time responsiveness of Inertial Measurement Unit (IMU) and Real-Time Kinematic (RTK) data fusion, thereby satisfying the stringent positioning requirements of orchard applications.
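To make the idea of adaptively adjusting the measurement noise covariance concrete, here is a minimal scalar Kalman-filter sketch. It is not the paper's extended Kalman filter: the constant-position model, the `gate` threshold, and the inflation rule are all assumptions chosen for illustration. When a position fix deviates from the prediction by more than `gate` standard deviations, the measurement variance is inflated so that the outlier (for example, a multipath-corrupted fix under tree canopy) is down-weighted.

```python
def adaptive_kf_step(x, p, z, q, r, gate=3.0):
    """One scalar Kalman step with innovation-based inflation of R.

    x, p : prior state estimate and its variance
    z    : new measurement (e.g. a position fix)
    q, r : nominal process and measurement noise variances
    gate : assumed threshold, in standard deviations, for trusting z
    Returns (x_new, p_new, r_eff).
    """
    # Predict with a constant-position model: the state carries over
    # and the uncertainty grows by the process noise.
    p_pred = p + q

    innovation = z - x
    s_nominal = p_pred + r  # innovation variance under nominal R

    # If the innovation is larger than the gate allows, inflate R so the
    # normalized innovation sits exactly on the gate.
    if innovation ** 2 > gate ** 2 * s_nominal:
        r_eff = innovation ** 2 / gate ** 2 - p_pred
    else:
        r_eff = r

    k = p_pred / (p_pred + r_eff)  # Kalman gain
    x_new = x + k * innovation
    p_new = (1.0 - k) * p_pred
    return x_new, p_new, r_eff


# A consistent measurement is fused with the nominal gain ...
good = adaptive_kf_step(x=0.0, p=1.0, z=1.0, q=0.0, r=1.0)
# ... while a gross outlier is heavily down-weighted.
outlier = adaptive_kf_step(x=0.0, p=1.0, z=100.0, q=0.0, r=1.0)
```

In a full RTK/IMU fusion the same test would be applied to the vector innovation and its covariance matrix; the scalar form is used here only to keep the sketch short.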

15:34-15:46, Paper SuS3B.3
Real-Time Detection Performance of RTSP vs. USB Cameras for UGVs

Sadman, Raheeb	BRAC University
Sattar, Safwan	BRAC University
Alam, Jahedul	BRAC University
Maliha, Sabrina	BRAC University
Mashkura, Mahadia	BRAC University
Sikder, Sajid Ali Sikder	BRAC University
Akand, Anan	BRAC University
Abrar, Fahim	BRAC University
Keywords: Mobile Robotics, Robot Vision and Computer Vision, Sensor Networks Abstract: Real-time object detection is critical for ensuring the operational safety of unmanned ground vehicles (UGVs) in industrial inspection and last-mile delivery applications, yet the impact of camera interface selection remains poorly understood. This study presents the first systematic comparison of USB webcams versus network cameras for UGV perception systems, focusing on safety-critical scenarios including emergency braking and obstacle avoidance. Our experimental results demonstrate that USB-based systems consistently outperform networked alternatives, offering superior frame processing efficiency, enhanced navigation precision during avoidance maneuvers, and significantly improved operational stability in prolonged deployments. These advantages result from USB's direct hardware interfacing and predictable low-latency characteristics. The outcomes conclusively establish that for UGV applications, directly connected USB cameras provide more reliable vision performance, enabling robust collision avoidance in dynamic environments while maintaining exceptional tracking consistency and system robustness. Additionally, the study highlights the significant reduction in detection latency and jitter, which is important for real-time decision-making in fast-paced, safety-critical operations.


SuS4A	Room D
Medical Robots for Precision Surgery, AI-Driven Algorithms, and Autonomous Surgical Task	Invited Sessions
Chair: Wang, Junchen	Beihang University
Co-Chair: Su, Baiquan	Beijing University of Posts and Telecommunications

16:20-16:32, Paper SuS4A.1
Intra-Natural Orifice Locomotion Robot with Propulsion by Chemical Reaction (I)

Zhang, Dengbo	Beijing University of Posts and Telecommunications
Jiang, Zhangzhang	Beijing University of Posts and Telecommunications
Ba, Peng	Beijing University of Posts and Telecommunications
Wang, Junchen	Beihang University
Liu, Wenyong	Beihang University
Tang, Jie	Beijing Tian Tan Hospital, Capital Medical University
Su, Baiquan	Beijing University of Posts and Telecommunications
Keywords: Medical Robotics Abstract: There are multiple ways for robots to advance, such as peristalsis, rolling, magnetic drive, and rear push. However, the thrust generated by chemical reactions can also drive objects to move. How to design a robot that generates thrust through chemical reactions to drive itself is a challenging problem. This paper proposes a cavity motion robot driven by chemical reactions, multiple reactant delivery mechanisms, a method for controlling the direction of reaction forces, a reaction chamber of the robot, on-off valves, and a device for discharging reaction products. We analyze and derive the relationship between the kinetic energy of gases generated by chemical reactions, robot displacement, and reactant volume. The feasibility of the basic principle of the robot's movement was tested experimentally. This driving method provides a new movement mode for robots moving in pipelines and opens up a new field of driving research.

16:32-16:44, Paper SuS4A.2
Digital Twin-Based Stereo Dataset Generation Method for 3D Reconstruction of Teeth (I)

Su, Pengjiao	Beihang University
Liu, Yuchen	State Key Laboratory of Oral & Maxillofacial Reconstruction And
Zhang, Runshi	Beihang University
Bai, Shizhu	Digital Center, School of Stomatology, the Fourth Military Medic
Wang, Junchen	Beihang University
Keywords: Robot Vision and Computer Vision, Deep Learning, Medical Robotics Abstract: Producing stereo matching datasets in dental 3D reconstruction studies is expensive and difficult. In this study, we propose a method for procedurally generating high-precision virtual stereo matching datasets based on the Unity engine. By accurately modeling the binocular camera imaging principle and environmental parameters, the method can generate paired stereo images with known depth information (disparity maps), sampling depth values with an accuracy of up to 10^-4 m. This method breaks through the traditional paradigm of relying on expensive sensor acquisition or limited public datasets. It effectively addresses the long-standing difficulty of acquiring depth information of real scenes, the scarcity of high-quality datasets, and their high cost. The experiments were performed on the widely used open-source stereo matching model RAFT-Stereo, and the reconstructed results were aligned to the target tooth point cloud after segmentation. The results show that the performance of the model trained on this dataset is comparable to that obtained with a traditional real dataset. The average alignment error for a single tooth was 0.17-0.26 mm, with the maximum error below 0.93 mm. For multiple-tooth scenarios, the average error of multiple co-alignment was stabilized at 0.27-0.79 mm. Finally, experiments were performed on facial images taken by a real camera. The average error of the reconstructed point cloud is within 1 mm.

16:44-16:56, Paper SuS4A.3
Structural Design and Analysis of a Wire-Driven Robot with a Double-Wire Coaxial Drive Mode (I)

Ma, Xudong	Beijing University of Posts and Telecommunications
Chen, Anqi	Beijing University of Posts and Telecommunications
Yi, Yubo	Beijing University of Posts and Telecommunications
Hu, Yida	Harvard Medical School
Liu, Wenyong	Beihang University
Wang, Junchen	Beihang University
Kuang, Shaolong	Shenzhen Technology University
Tang, Jie	Beijing Tian Tan Hospital, Capital Medical University
Hou, Yuanzheng	Xuanwu Hospital, Capital Medical University
Li, Changsheng	Beijing Institute of Technology
Su, Baiquan	Beijing University of Posts and Telecommunications
Keywords: Medical Robotics Abstract: Wire-driven robots have wide applications. The structure of the driver is simple but bulky, with too many motors and no tensioning mechanism. A driver for a dual-wire coaxial wire-driven robot is proposed. The principle is that the extension and contraction amounts of the mirror-symmetrically distributed driving wires are equal. Based on this principle, a wire-driven hyper-redundant robot and a continuum robot with a dual-wire coaxial driving method are designed, and their kinematic models are established and analyzed. Based on the design method, a hyper-redundant robot and a continuum robot with a dual-wire coaxial driver are fabricated, and their motion performance is tested. The test results show that the wire-driven robot based on the dual-wire coaxial driver conforms to the kinematic analysis, proving that the dual-wire coaxial driving method is correct and efficient.

16:56-17:08, Paper SuS4A.4
Design and Simulation of a Robotic System for Sports Injury Treated with Extracorporeal Shock Wave Therapy (I)

Wang, Boyang	Beihang University
Wang, Yueyang	Beihang University
Liu, Wenyong	Beihang University
Keywords: Medical Robotics, Robot Design, Human-Robot Interaction and Cooperation Abstract: When applying robotic technology to extracorporeal shock wave therapy (ESWT) for sports injuries, challenges related to dexterity and dynamic response stability need to be addressed. To this end, this paper proposes an arm-hand coordinated robot-assisted ESWT scheme and completes its simulation. Firstly, by analyzing the operational requirements for the ESWT target area, a robot-assisted ESWT scheme and a two-degree-of-freedom (2-DOF) end-effector mechanism are designed. The inverted conical workspace of the end-effector enables precise treatment of localized target areas and controls the ESWT instrument to perform feed motion at specific orientations, overcoming the movement limitations of using a robotic arm alone in confined spaces and achieving a workspace of (1/3)π(75)^2 × 200 mm^3, which meets the small space occupation required for ESWT. The forward and inverse kinematics of the robot system are then calculated, and joint motion simulations are performed. Finally, a dynamic model of the end-effector is established, and the characteristics of its driving force profile, in which the maximum forces for the translational feed joint and the swinging joint do not exceed 2.5 N and 15 N, respectively, are analyzed. This research provides a highly dexterous and stable robot-assisted ESWT solution, laying the foundation for intelligent and precise treatment.
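The workspace expression quoted in the abstract above has the form of a cone volume. Assuming (the abstract does not state this explicitly) that 75 mm is the base radius r and 200 mm is the height h of the inverted conical workspace, the value works out as follows:

```latex
V = \frac{1}{3}\pi r^{2} h
  = \frac{1}{3}\pi\,(75\ \text{mm})^{2}\,(200\ \text{mm})
  = 375\,000\,\pi\ \text{mm}^{3}
  \approx 1.18 \times 10^{6}\ \text{mm}^{3}
```

That is roughly 1.18 litres of reachable volume under these assumed dimensions.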

17:08-17:20, Paper SuS4A.5
Organ Deformation and Contact Force Estimation of Surgical Instruments Based on 3D Vision

Qian, Hongyu	Tsinghua University
Wang, Yixuan	Tsinghua University
Chen, Rui	Tsinghua University
Xu, Jing	Tsinghua University
Keywords: Medical Robotics, Sensing, Haptic System Abstract: We propose a 3D vision-based tissue contact force estimation framework. It integrates image/point cloud acquisition, deformation reconstruction, multimodal deep learning, and physical verification to form a closed-loop system for visual-to-force estimation. The proposed model fuses RGB images, dense point clouds, and inter-point displacements, incorporates deformation modeling for interpretability, uses the Transformer architecture for unsupervised non-rigid point cloud registration, and adopts a spatial-temporal dual-pathway structure to handle static and dynamic features separately. Experiments involve building a renal tissue 3D deformation and force dataset, with extensive tests in simulated and physical environments.


SuS4B	Room E
Human-Robot Interaction	Regular Sessions
Chair: Chen, Fei	T-Stone Robotics Institute, the Chinese University of Hong Kong

16:20-16:32, Paper SuS4B.1
Development of Personalized Human Digital Twin for Exercising

Zou, Kehan	The University of Hong Kong
Ma, Xin	The University of Hong Kong
Chen, Yuetian	The University of Hong Kong
Yang, Ping	Southern University of Science and Technology
Wu, Xi	The Chinese University of Hong Kong
Li, Chenzui	The Chinese University of Hong Kong
Zhang, Yijian	The University of Hong Kong
Huang, Jialiang	ShenZhen Academy of Robotics
Chen, Jiangcheng	The University of Hong Kong
Chen, Fei	T-Stone Robotics Institute, the Chinese University of Hong Kong
Xi, Ning	The University of Hong Kong
Keywords: Human-Machine Interface, Medical Robotics, Rehabilitation and Assistive Robotics Abstract: Exercise requires quantitative planning and monitoring. Digital approaches offer a potential solution; however, developing subject-specific digital twin (DT) models remains challenging. Here, we propose a method to establish a personalized human digital twin for exercising. In contrast to conventional methods that rely on medical imaging, we identify personalized biomechanical parameters, i.e., maximum isometric muscle force, based on electromyographic (EMG) signals, kinetics data, and interaction forces. This allows for rapid and cost-effective creation of individualized musculoskeletal models. Application studies demonstrate that our approach can accurately model the back squat movement in five subjects, yielding a normalized RMSE of 0.2 for the ankle and hip joints and 0.3 for the knee joints. These results indicate strong potential for personalized exercise planning and monitoring.

16:32-16:44, Paper SuS4B.2
Human-Robot Interaction Behavior in Commercial Services: A Four-Stage Integrative Framework

Tu, Yangjun	Hunan University
Jiang, Simin	Hunan University
Xiao, Lijun	Hunan University
Niu, Ziqi	Hunan University
Yang, Zhi	Hunan University
Keywords: Human-Robot Interaction and Cooperation Abstract: With the increasing adoption of embodied service robots in commercial service settings, there is a critical need to understand the intricate interaction behavior process between humans (customers/employees) and these robots. Drawing upon a systematic review and qualitative integration of 44 core articles, this study proposes a four-stage dynamic model of human-robot interaction behavior: (1) Contact Initiation, (2) Interaction Execution, (3) Feedback Adaptation, and (4) Outcome Evaluation. This model not only delineates the goal-oriented activities within individual service encounters but also illuminates how repeated interactions, through learning and evolution, shape long-term human-robot service relationships and users' perception of the robot's social role (e.g., tool, assistant, partner). The proposed framework advances the theoretical understanding of the dynamic process of human-robot interaction behavior within commercial service settings and offers practical guidance for service firms seeking to effectively manage human-robot interactions and enhance service performance.

16:44-16:56, Paper SuS4B.3
"Accelerator" versus "Ceiling": Unpacking the Career Paradoxes of Employee-Robot Cowork and Their Impact on Career Sustainability

Tu, Yangjun	Hunan University
Chen, Jia yuan	Hunan University
Guo, Yaqian	Peking University
Jiang, Simin	Hunan University
Chen, Shaoxuan	Hunan Normal University in Changsha, China
Yang, Zhi	Hunan University
Keywords: Human-Robot Interaction and Cooperation Abstract: The complex paradoxical impact of integrating service robots into frontline service work on employee career sustainability is frequently overlooked. Employing an inductive qualitative method, this study draws on in-depth interviews with 11 frontline service employees in the Chinese hotel industry to uncover and theorize the core career paradoxes inherent in employee-robot cowork and the mechanisms through which they shape career sustainability. Findings reveal that robots simultaneously enact dual roles, acting as both career "accelerators" and "ceilings." This duality manifests in four interwoven core career paradoxes: (1) immediate convenience versus long-term capability constraints; (2) selective efficiency versus situational failure and the imposition of new burdens; (3) task relief versus potential career threats; and (4) instrumental dependence versus interactional absence. These paradoxes interact such that the pursuit of short-term convenience can inadvertently lead to long-term skill suppression, heightened job insecurity, and constrained potential for human-robot cowork, thereby exerting a complex influence on career sustainability. This study introduces the construction of the "employee-robot cowork career paradox," challenging simplistic assumptions about technology's impact and contributing paradox and long-term perspectives to research on career sustainability and human-robot interaction (HRI). Our findings suggest that managers must look beyond the "efficiency

16:56-17:08, Paper SuS4B.4
A Method for Constructing a Dual-Arm Robot Motion Retargeting Dataset

Yuan, Jiahui	Yanshan University
Yang, Haoxin	Beijing University of Posts and Telecommunications
Yao, Zhuofan	Yanshan University
Xu, Wenjing	Yanshan University
Qiao, Kai	Yanshan University
Zhang, Yahui	Yanshan University
Wen, Guilin	Yanshan University
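Keywords: Human-Robot Interaction and Cooperation, Humanoid Robots Abstract: Reduction is a delicate surgical task that inherently belongs to the domain of human-robot interaction. The human-likeness of arm postures directly affects reduction efficiency and patient acceptance. To address the high cost of constructing motion retargeting datasets and the difficulty of balancing end-effector accuracy against human-likeness, a method for constructing a humanoid robot motion retargeting dataset based on the Adam optimization algorithm is proposed. The method takes the arm configuration angle of the robotic arm as the key metric for quantifying human-likeness, introduces the end-effector position and orientation as accuracy constraints, and formulates a multi-objective optimization model, with human arm motion data ...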

17:08-17:20, Paper SuS4B.5
CL-RAG: A Closed-Loop Multimodal Retrieval-Augmented Generation Architecture for Robust Human-Robot Control Interaction

Zhang, Bowen	University of Trento
Jiang, Yuhang	University of Trento
Hu, Lingxiang	Paris Saclay University
Li, Dun	Tsinghua University
Hu, Qianqian	Nanjing Agricultural University
Keywords: Human-Robot Interaction and Cooperation, Humanoid Robots, ROS, Software System for Robotics Application Abstract: Recent advancements in Large Language Models (LLMs) such as Claude 3.5 and LLaMA 3, paired with Retrieval-Augmented Generation (RAG), offer promising opportunities for intelligent, context-aware robotic systems. This paper proposes a modular architecture that integrates enterprise-grade RAG technologies—originally developed for corporate knowledge management—into real-time control of bipedal and interactive robots. By adapting components such as ChromaDB, Docling, and AWS Bedrock, we demonstrate how unstructured sensor data, dynamic environment cues, and human language commands can be seamlessly processed to drive physically grounded robot behaviors. A proof-of-concept implementation shows significant improvements in instruction comprehension, semantic robustness, and safety, verified in industrial and inspection scenarios.


SuPP1	Hall
Poster Session	Poster Sessions

10:30-16:40, Paper SuPP1.1
Hybrid A*-Bézier Optimization for 3D Path Planning in Complex Environments (I)

Mingrui, Mou	University of Chinese Academy of Sciences
Gu, Haitao	Chinese Academy of Sciences
Keywords: Path and Motion Planning, Machine Learning, Deep Learning Abstract: To overcome limitations in traditional A* algorithms for 3D path planning, including path roughness, node redundancy, and safety concerns, we propose a hybrid strategy integrating Enhanced Heuristic A* with adaptive Bézier optimization. The approach refines heuristic functions via neural networks to dynamically guide path exploration, reducing redundant nodes and improving search efficiency. Concurrently, it employs segmented Bézier smoothing with local adaptive control points to ensure path continuity while mitigating collision risks. Simulations confirm that in medium-to-high density obstacle environments, the method reduces indexed nodes to 1/5–1/12 of benchmarks (traditional A* + global Bézier smoothing), significantly boosting computational efficiency and safety. This provides an effective solution for complex 3D path planning of UUVs.
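For readers unfamiliar with Bézier smoothing of a planned path, the sketch below evaluates a Bézier curve with De Casteljau's algorithm and samples it to replace a sharp corner between waypoints. It illustrates only the generic curve evaluation; the control-point choice and the function names are hypothetical, and the paper's adaptive control-point placement and collision checks are not reproduced here.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, ...]


def bezier_point(control_points: Sequence[Sequence[float]], t: float) -> Point3:
    """Evaluate a Bezier curve at parameter t in [0, 1] (De Casteljau)."""
    pts = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(pts) > 1:
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


def smooth_segment(control_points: Sequence[Sequence[float]],
                   samples: int = 5) -> List[Point3]:
    """Sample the curve at evenly spaced parameters, endpoints included."""
    if samples < 2:
        raise ValueError("samples must be at least 2")
    return [bezier_point(control_points, i / (samples - 1))
            for i in range(samples)]


# Three consecutive waypoints with a sharp corner at the middle one,
# used directly as the control points of a quadratic segment.
corner = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 0.0, 0.0)]
smoothed = smooth_segment(corner, samples=5)
```

The sampled curve starts and ends at the outer waypoints and passes below the corner point, which is what removes the kink from the raw grid path.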

10:30-16:40, Paper SuPP1.2
A Multimodal Sensing-Driven Smart Shoe System for Gait Phase Recognition in Exoskeleton Applications (I)

Zuo, Chuanlin	University Evry Val d'Essonne, University Paris-Saclay
Hanbing, Liu	Université Paris-Saclay, Université d'Evry Paris-Saclay
Bencharif, Loqmane	Paris Saclay
Ibset, Abderahim	University Paris-Saclay
Qi, Wen	Politecnico Di Milano
Su, Hang	Paris Saclay University
Dychus, Eric	Sandyc
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Medical Robotics, Human-Machine Interface, Sensing, Haptic System Abstract: To enhance the precision and naturalness of exoskeleton control, this paper proposes a novel multimodal smart shoe design for precise gait analysis. Unlike conventional gait models that rely solely on kinematic data, our system integrates foot pressure distribution and foot deformation sensing to construct a more comprehensive foot motion model. The smart shoe is equipped with inertial measurement units (IMUs), a plantar pressure sensor and bending sensors. Multimodal data was collected and analyzed to extract gait features including trajectory, pressure distribution, and foot deformation patterns. A multimodal gait analysis model was developed using sensor fusion techniques. Experimental results demonstrate that the proposed system provides a more accurate and holistic representation of foot motion, offering enhanced biomechanical and dynamic information for future exoskeleton control systems.

10:30-16:40, Paper SuPP1.3
Elastic Actuation and Sensor-Fusion-Driven Adaptive Control for Wearable Lower-Limb Exoskeletons (I)

Bencharif, Loqmane	Paris Saclay
Ibset, Abderahim	University Paris-Saclay
Zuo, Chuanlin	University Evry Val d'Essonne, University Paris-Saclay
Hanbing, Liu	Université Paris-Saclay, Université d'Evry Paris-Saclay
Qi, Wen	Politecnico Di Milano
Su, Hang	Paris Saclay University
Dychus, Eric	Sandyc
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Medical Robotics, Rehabilitation and Assistive Robotics, Intelligent Control and Systems Abstract: This paper presents an adaptive control framework for a wearable lower-limb exoskeleton that assists sagittal-plane motion at the hip and knee joints. The system integrates sensor fusion, virtual joint elasticity, and an adaptive fuzzy PID controller to enhance tracking accuracy under dynamic and nonlinear conditions. A simulation environment evaluates controller performance across varying scenarios, while a hardware prototype demonstrates real-time trajectory tracking using classical PID control. Results confirm the feasibility of combining adaptive control and elastic actuation for robust, user-responsive assistance in wearable robotics.

10:30-16:40, Paper SuPP1.4
Human-Inspired Pre-Design Optimization for Humanoid Robots in Dynamic Interaction Tasks (I)

Marshoud, Abd Alrahman	Université Évry Paris-Saclay
Sleiman, Maya	Paris Saclay
Ait Oufroukh, Naima	University of Paris-Saclay
Su, Hang	Paris Saclay University
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Humanoid Robots, Robot Design, Path and Motion Planning Abstract: This study presents a novel pre-design optimization framework for humanoid robots, enabling precise alignment of robotic kinematics with human motion patterns for dynamic tasks. The aim is to reduce prototyping costs and support task-specific designs. Leveraging mechatronics expertise, the framework features an interactive interface for real-time parameter tuning, guided by metrics such as workspace coverage and inverse kinematics (IK) success rate. Applied to a 3-DOF arm that is part of an upper-body humanoid robot under development, the pre-design tool achieves 100% workspace coverage and an 85.51% IK success rate across four experiments for jabs and hooks. Motion similarity metrics were used to validate human-like performance and smoothness.

10:30-16:40, Paper SuPP1.5
A Novel Hybrid Serial-Parallel Shoulder Mechanism for Humanoid Robots: Design and Workspace Analysis (I)

Soukarieh, Wael	University of Evry, IBISC Laboratory
Sleiman, Maya	Paris Saclay
Ait Oufroukh, Naima	University of Paris-Saclay
Su, Hang	Paris Saclay University
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Humanoid Robots, Robot Design, Human-Robot Interaction and Cooperation Abstract: This paper presents a novel 4-DOF hybrid serial–parallel shoulder mechanism for humanoid robots, designed to enhance workspace coverage while maintaining a compact, human-like form. The mechanism is composed of two integrated substructures: a serial chain and a fully parallel subsystem. This hybrid approach addresses the challenges of achieving large pitch, yaw, and roll ranges under size and anthropomorphic constraints. The design is tailored for the HYDROiD humanoid robot, with a focus on developing "slim and smart" joints suitable for human–robot interaction. To approximate anatomical constraints, the shoulder region is modeled as a conjoined conical volume representing the upper trunk and upper arm. The kinematic model is validated through numerical workspace analysis, demonstrating the hybrid architecture’s effectiveness in expanding motion capabilities within a restricted envelope.

10:30-16:40, Paper SuPP1.6
High-Fidelity Contrastive Language-State Pre-Training for Embodied Agent State Representation

Huang, Fuxian	Shanghai AI Laboratory
Zhang, Qi	Shanghai AI Lab
Zhang, Haoran	Shanghai AI Lab
Zhang, Tianyi	Shanghai AI Laboratory
Zhou, Ming	Shanghai AI Laboratory
Zhang, Jinouwen	Shanghai AI Lab
Zhai, Shaopeng	Shanghai AI Laboratory
Keywords: Artificial Intelligence, Deep Learning, Intelligent Control Abstract: With the rapid development of AI, multimodal learning has become crucial, especially with multimodal large language models and embodied agents. However, the representation of the state modality still lags behind other modalities like images, videos, and language. To this end, we propose a High-Fidelity Contrastive Language-State Pre-training method (CLSP), which can accurately encode state information into representations for both embodied agents and multimodal large language models. Extensive experiments demonstrate the superior precision and generalization capabilities of our representation, achieving outstanding results in text-state retrieval, navigation tasks, and multimodal large language model understanding.

10:30-16:40, Paper SuPP1.7
3DS-Plan: A Perception-Planning-Co-Design Framework to Facilitate Robot Task Planning with Open-Vocabulary 3D Scene

Xue, Min	Tsinghua University
Yu, Jincheng	Tsinghua University
Xiu, Lingkun	Tsinghua University
Tang, Jiahao	Tsinghua University
Cai, Xudong	Openmind Smart Robot Co.Ltd
Zhao, Yali	Beijing Novauto Technology Co., Ltd
Liang, Shuang	Novauto Tech
Wang, Yu	Tsinghua University
Keywords: Artificial Intelligence, Robot Vision and Computer Vision Abstract: To plan and execute robotics tasks, 3D scene perception and understanding are critical for robots to interact with complex environments effectively. Traditional systems often rely on closed vocabularies and are constrained by pre-defined object categories, which limit their flexibility. In this paper, we propose 3DS-Plan, an open-vocabulary 3D scene perception and generation model designed to facilitate long-horizon task planning. It leverages the pre-trained foundation models to support robotic scene understanding and further provides environmental details to infer actionable steps in various scenarios. We validate the effectiveness of our framework through comprehensive experiments. It demonstrates a 31.41% improvement in computational efficiency for class operations and at least a 14.52% increase in success rates for complex tasks without requiring extensive retraining or extra annotation. The corresponding project page is available at https://techpage.github.io/open3dsp/.

10:30-16:40, Paper SuPP1.8
Research on Traffic Police Gesture Recognition Algorithm Based on Improved YOLOv11

Lü, Chao	School of Electronics and Information Engineering, Changchun Uni
Sun, Zhaoying	School of Electronics and Information Engineering, Changchun Uni
Keywords: Artificial Intelligence, Deep Learning, Intelligent Transportation Systems Abstract: Traffic police gesture recognition is essential for intelligent traffic management, road safety, and autonomous driving. This paper proposes a novel recognition approach based on an improved YOLOv11 architecture to address challenges such as diverse gesture categories, large scale variations, and environmental interference. We propose a novel Global-Local Pooling Fusion Block to reduce model complexity while maintaining feature quality, introduce a Global Context (GC) attention mechanism to enhance focus on key gesture regions, and integrate an improved Lite-BiFPN structure for better multi-scale feature fusion. Experimental results show that our method achieves a mean average precision of 90.6%, which is 3.8% higher than the original YOLOv11. This work significantly improves detection accuracy, robustness, and real-time performance, providing strong support for intelligent driving systems under complex traffic conditions.

10:30-16:40, Paper SuPP1.9
A Control Method for Wheeled Bipedal Robots Using Improved PPO and Multidimensional Curriculum Learning

Fan, Shenglin	Shandong University
Zhou, Lelai	Shandong University
Sun, Jingyu	Shandong University
Zhang, Yi	Shandong University
Li, Guowei	Shandong University
Dai, Xiaomeng	China Railway Construction Corporation Bridge Engineering Bureau
Li, Yibin	Shandong University
Keywords: Field Robotics, Deep Learning, Intelligent Control Abstract: Wheeled bipedal robots offer enhanced agility, but as an underactuated system based on a wheeled inverted pendulum model, they exhibit strong nonlinearity and coupling, making traditional control methods ineffective on complex, unstructured terrain. This work proposes Adaptive Curriculum Proximal Optimization (ACPO), a deep reinforcement learning approach for locomotion control that combines an improved PPO algorithm with multidimensional curriculum learning (MCL). First, by incorporating an adaptive clipping coefficient based on KL divergence (Importance-Weighted Clipping, IWC) and a PI-KL self-adaptive learning rate (SALR) into PPO, ACPO achieves efficient exploration in early training and stable convergence in later stages. Second, we propose an MCL framework that dynamically increases and fine-tunes reward weights, action-command complexity, observation noise, terrain difficulty, and external disturbances. Finally, large-scale parallel training and comparative experiments are conducted in the NVIDIA Isaac Gym simulation environment. Ablation studies demonstrate that ACPO significantly outperforms baseline PPO in terms of cumulative reward, episode length, orientation stability, fall frequency, velocity-tracking error, and mean joint torque, exhibiting superior robustness, energy efficiency, and cross-terrain adaptability.

10:30-16:40, Paper SuPP1.10
STH-SynNet: Spatio-Temporal Heterogeneity-Aware Synergistic Network for Traffic Prediction

Li, Mingqiu	Changchun University of Science and Technology
Liu, Wanting	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Cong, Haifang	Changchun University of Science and Technology
Xing, Songqi	Changchun University of Science and Technology
Keywords: Intelligent Transportation Systems, Deep Learning, Artificial Intelligence Abstract: High-precision traffic flow prediction plays a crucial role in optimizing intelligent transportation systems and enhancing urban operational efficiency. However, existing methods still face limitations in modeling spatial heterogeneity and multi-scale temporal dependencies within traffic networks, which restrict the expressiveness and prediction accuracy of the models. To address these challenges, this paper proposes a Spatio-Temporal Heterogeneity-aware Synergistic Network for traffic prediction, termed STH-SynNet. Built upon an encoder-decoder architecture, the framework integrates an enhanced gating unit—ST-MS-Spectral GRU—within its core module. This unit incorporates dynamic graph modeling, time-frequency decoupling, and attention-enhanced mechanisms to improve the model’s capacity to capture structural diversity and complex temporal dynamics. By employing multi-scale convolutional receptive fields with a dynamic graph generation mechanism, the model adapts to heterogeneous interactions among nodes. Additionally, a frequency masking mechanism based on Fourier Transform is introduced to disentangle periodic patterns from short-term fluctuations, while a lightweight attention-enhanced path dynamically selects and models critical historical states. Experiments on three real-world datasets demonstrate significant improvements in predictive accuracy, validating the effectiveness and robustness of the proposed method in complex traffic forecasting tasks.

10:30-16:40, Paper SuPP1.11
Design and Experiment of a Distributed Tiltrotor UAV

Wang, Xiaobo	Zhejiang Lab
Keywords: Robot Design Abstract: With the development of unmanned aerial vehicle (UAV) technology, new UAV designs have become a research hotspot. To address the short flight time and low flight speed of multirotor UAVs and the more demanding takeoff and landing requirements of fixed-wing UAVs, this paper proposes a distributed tiltrotor UAV system. First, it reviews current research progress on vertical takeoff and landing (VTOL) tiltrotor UAVs. Next, it describes the overall design and components of the distributed tiltrotor UAV. The paper then details the avionics system design for this UAV and its redundant flight control system. Following that, it presents a detailed analysis of the UAV control methodology. Finally, a prototype is constructed and flight tests are conducted, validating the feasibility of this UAV as a novel rapid VTOL platform. The results indicate that this UAV can achieve stable and reliable flight, offering valuable insights for the design of future VTOL UAVs.

10:30-16:40, Paper SuPP1.12
LML-GAN: Latent Generative Adversarial Network for Time-Series Heart Rate Signal Prediction

Li, Mingqiu	Changchun University of Science and Technology
Li, Fengtian	Changchun University of Science and Technology
Yang, Yang	Changchun University of Architecture and Civil Engineering
Lan, Tianyu	Changchun University of Science and Technology
Liu, Wanting	Changchun University of Science and Technology
Li, Yifeng	Changchun University of Science and Technology
Keywords: Deep Learning, Artificial Intelligence Abstract: Arrhythmia, as an important cardiovascular disease, poses a serious threat to global human health and life. Electrocardiograms (ECGs) provide a wealth of information for the diagnosis and treatment of cardiovascular diseases, but traditional diagnostic methods are time-consuming, labor-intensive, and unable to provide early clinical warnings. Additionally, due to the complex information contained in ECG signals and their strong nonlinear characteristics, current ECG signal temporal prediction models often exhibit high oscillatory behavior. To address these challenges, this paper proposes an ECG signal prediction model based on generative adversarial networks in a latent low-dimensional space, LML-GAN (Latent MIX-LSTM Generative Adversarial Network). This model decomposes high-dimensional complex ECG signals into low-dimensional simple representations for adversarial training, then maps them back to the high-dimensional space to complete the prediction. It also employs a hybrid time-frequency domain stepwise supervised loss function to enhance the model's ability to extract features from time-series dynamic data. For feature extraction, this paper proposes a novel residual model, MIX-LSTM, based on the traditional LSTM. Experiments on the MIT-BIH dataset show that, compared to similar models, the proposed model eliminates the severe oscillation that those models exhibit in ECG signal prediction.

10:30-16:40, Paper SuPP1.13
A Decentralized Reinforcement Learning Approach for Modular Octoped Locomotion Control

Li, Guowei	Shandong University
Zhou, Lelai	Shandong University
Sun, Jingyu	Shandong University
Zhang, Yi	Shandong University
Fan, Shenglin	Shandong University
Dai, Xiaomeng	China Railway Construction Corporation Bridge Engineering Bureau
Li, Yibin	Shandong University
Keywords: Field Robotics, Multi-Robot Systems, Artificial Intelligence Abstract: With the increasing deployment of quadruped robots in complex environments, reinforcement learning (RL) has shown strong potential in tasks such as gait generation and dynamic balance. The modular octoped robot is composed of two quadruped subsystems connected front and rear. It is detachable and reconfigurable, enabling coordinated motion control. However, centralized reinforcement learning frameworks struggle with modular octoped robots due to limited coordination between subsystems, high-dimensional action spaces, and poor scalability. To address these issues, this paper proposes a decentralized reinforcement learning control method that decouples the system into two quadruped subsystems, each with an independent PPO policy for local decision-making. Through asynchronous control and lightweight communication, the system achieves efficient and coordinated locomotion. Experiments on the IsaacGym platform demonstrate that the decentralized approach improves motion performance and adaptability, offering a scalable and flexible solution for complex multi-robot systems.

10:30-16:40, Paper SuPP1.14
A Hierarchical Control Framework for Cooperative Adaptive Cruise Control Considering FDI Attacks and Loop Delay Analysis

Wang, Wenwei	Beijing Institute of Technology
Liu, Yushan	Beijing Institute of Technology
Cao, Wanke	Beijing Institute of Technology
Keywords: Intelligent Transportation Systems, Intelligent Control and Systems Abstract: Cooperative Adaptive Cruise Control (CACC) systems based on real-time vehicle-to-vehicle (V2V) communication are pivotal for enhancing traffic efficiency and safety in vehicular network environments. However, the adoption of open communication channels renders these systems more susceptible to attacks, particularly false data injection (FDI) attacks, which manipulate vehicle states and disrupt platoon stability. Firstly, the system delay under FDI attacks is meticulously analyzed based on the concept of multi-link loop delay, and its upper bound is derived. Then, a hierarchical control framework resilient to cyberattacks is proposed to address loop delays and implement FDI compensation control. The upper layer develops a model predictive controller (MPC) for decision-making and planning under uncertainties. The lower layer employs an H∞ controller combined with a linear quadratic regulator (LQR) to mitigate the effects of loop delays and provide reliable acceleration tracking control. Finally, the effectiveness of the proposed method is validated through comprehensive hardware-in-the-loop testing.

10:30-16:40, Paper SuPP1.15
AttSAM: Attention-Augmented Segment Anything Model for Accurate Polyp Segmentation

Lan, Lixiang	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Zhao, Guangyu	Changchun University of Science and Technology
Li, Yifeng	Changchun University of Science and Technology
Liu, Wanting	Changchun University of Science and Technology
Wang, Jikui	Changchun Shikai Technology Industry Co., Ltd
Keywords: Deep Learning, Medical Robotics, Robot Vision and Computer Vision Abstract: Polyp segmentation is crucial in the diagnosis of colorectal cancer. The introduction of the Segment Anything Model (SAM) provides powerful pretraining capabilities for polyp segmentation, but it faces two main challenges when applied to endoscopic images: first, its Transformer-based architecture tends to overlook local details, leading to feature bias; second, its performance on out-of-distribution (OOD) data is suboptimal, affecting prediction accuracy and confidence estimation. To address these issues, we propose an enhanced approach based on SAM that incorporates a Local Feature Enhancement Module (LFEM) and a Channel Attention Enhancement Module (CAEM). LFEM improves the capture of high-frequency information by enhancing local detail features, playing a crucial role in extracting polyp edges and textures. CAEM introduces a channel attention mechanism that dynamically adjusts the weight of feature channels, thereby enhancing the model’s sensitivity and generalization ability. Additionally, we draw on the design principles of the Cross-Branch Feature Enhancement module and the Uncertainty-Guided Prediction Regularization (UPR) module to further improve SAM’s performance in multi-scale feature extraction and on out-of-distribution data. Experimental results demonstrate that the inclusion of LFEM and CAEM significantly improves the model’s segmentation accuracy on multiple public polyp datasets, particularly excelling in complex background scenarios.

10:30-16:40, Paper SuPP1.16
FreezeAdaCRAFT Point Cloud Network

Lin, Siyu	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Li, Mingqiu	Changchun University of Science and Technology
Xing, Songqi	Changchun University of Science and Technology
Lan, Lixiang	Changchun University of Science and Technology
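Keywords: Robot Vision and Computer Vision, Deep Learning, SLAM and Navigation Abstract: 3D point cloud completion aims to reconstruct complete object shapes from partial observations, addressing challenges such as data sparsity, missing structures, and the recovery of fine details. In this paper, we propose the FreezeAdaCRAFT Point Cloud Network (FAC-PCN), a novel framework that combines adaptive freeze sampling with cross-scale Transformer-based feature enhancement. Specifically, the adaptive-freeze-based hybrid sampling (AFSM) module selects key geometric points to improve feature stability, while the multi-scale cross-resolution feature enhancement Transformer (MS-CRAFT) module strengthens semantic interaction across scales. A cascaded generation module then progressively operates on the seed points ...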

10:30-16:40, Paper SuPP1.17
Research on Robotic Arm Trajectory Planning Method Based on Dual Cameras Guidance without Common Field of View

Li, Mingyang	Changchun University of Technology
Chenyu, Liu	Electrical and Electronic Engineering, Changchun University of T
Wu, Hongying	Electrical and Electronic Engineering, Changchun University of T
Jiang, Changhong	School of Electrical and Electronic Engineering, Changchun Unive
Xie, Mujun	Changchun University of Technology
Keywords: Robot Vision and Computer Vision, Path and Motion Planning, Industrial Robotics and Factory Automation Abstract: To enable the automatic assembly of large workpieces and the automatic loading of large objects, the position of the target object is measured by two cooperating cameras. To address robotic arm trajectory planning under dual-camera vision guidance without a common field of view, a four-degree-of-freedom robotic arm system with dual-camera vision measurement is first constructed and the kinematic model of the arm is established. Image processing techniques are then used to recognize, segment, and ellipse-fit the cooperative target image to obtain the target's pixel coordinates, and the target's spatial position in the arm's base coordinate system is calculated from the transformations between the coordinate systems. Finally, based on the measured target position, the joint trajectories of the arm are planned by iterative optimization with the Constrained Optimization BY Linear Approximations (COBYLA) algorithm, allowing the arm to position itself accurately at the cooperative target.

10:30-16:40, Paper SuPP1.18
Research on Multi-Robot Collision Detection Strategy in Complex Environment

Xie, Shuxin	Soochow University
Cao, Zhimin	Suzhou City University
Keywords: Multi-Robot Systems, Path and Motion Planning Abstract: Collision detection is one of the key technologies for realizing multi-robot motion planning under environmental constraints. To reduce computation time and improve detection efficiency, this paper proposes a task-decomposition-based three-stage collision detection strategy, which comprises a collision detection rigid body matrix, a coarse collision detection stage, and a precise collision detection stage. To evaluate the proposed strategy, the widely used Bullet collision detection algorithm was selected as the comparison benchmark. The experimental results demonstrate that the proposed method streamlines the collision detection process, greatly reduces execution time, and improves overall detection efficiency.

10:30-16:40, Paper SuPP1.19
A Dexterous Manipulator Driven by Wire and Pulleys

Su, Baiquan	Beijing University of Posts and Telecommunications
Wang, Junchen	Beihang University
Chen, Anqi	Beijing University of Posts and Telecommunications
Hu, Yida	Harvard Medical School
Xie, Zhen	National University of Singapore

10:30-16:40, Paper SuPP1.20
Dynamic Obstacle Avoidance Control Method for Unmanned Surface Vessels Integrating Trajectory Prediction and Rule-Based Guidance (I)

Shen, Weilong	Shenyang University of Technology
Feng, Dongying	Guangzhou Institute of Industrial Intelligence
Zang, Chuanzhi	Shenyang University of Technology
Yuan, Mingzhe	Shenyang Institute of Automation, Chinese Academy of Sciences
Xiao, Jinchao	Guangzhou Institute of Industrial Intelligence
Keywords: Path and Motion Planning, Intelligent Transportation Systems, Mobile Robotics Abstract: To address the challenges posed by unpredictable dynamic obstacle trajectories, rule-inconsistent avoidance behaviors, and the lack of motion feasibility in traditional path planning, this paper proposes a path planning method that integrates trajectory prediction, rule-guided avoidance, and dynamic window control. First, a lightweight trajectory prediction framework based on polynomial curve fitting is designed. By leveraging a sliding window mechanism, the future motion trends of obstacle vessels are dynamically modeled, providing stable and reliable foresight information with low computational cost. Second, typical encounter scenarios defined in the International Regulations for Preventing Collisions at Sea (COLREGs) are incorporated to reconstruct the repulsive force direction mechanism in the Artificial Potential Field (APF) method, ensuring that avoidance behaviors comply with maritime rules and improving the rationality of avoidance directions. Finally, to address the limitations of the APF method in handling motion constraints, which often lead to excessively large avoidance angles, the Dynamic Window Approach (DWA) is introduced to ensure smoother and more feasible steering control. The predicted obstacle trajectories and the rule-compliant recommended heading are incorporated into the trajectory evaluation function, forming a closed-loop control strategy with both control feasibility and multi-objective guidance capability. Simulation and experimental results demons
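The sliding-window polynomial prediction step described in this abstract can be illustrated with a minimal sketch. It assumes each obstacle track is a list of `(t, x, y)` samples and fits an independent least-squares polynomial to x and y over the most recent window. The function names, window size, and polynomial degree are illustrative choices, not values taken from the paper.

```python
def fit_polynomial(ts, ys, degree=2):
    """Least-squares fit; returns coefficients c with y ~ c[0] + c[1]*t + c[2]*t**2 + ..."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(t**(i+j)) and b[i] = sum(y * t**i).
    A = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


def predict_position(track, horizon, window=5, degree=2):
    """Predict (x, y) at `horizon` time units after the last sample in `track`."""
    recent = track[-window:]              # sliding window over the newest samples
    t0 = recent[0][0]
    ts = [p[0] - t0 for p in recent]      # shift time to keep the fit well conditioned
    cx = fit_polynomial(ts, [p[1] for p in recent], degree)
    cy = fit_polynomial(ts, [p[2] for p in recent], degree)
    tq = ts[-1] + horizon

    def evaluate(coeffs):
        return sum(c * tq ** k for k, c in enumerate(coeffs))

    return evaluate(cx), evaluate(cy)
```

The window must hold at least `degree + 1` samples with distinct timestamps; otherwise the normal equations are singular. A short window tracks manoeuvres quickly, while a longer one smooths measurement noise.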

10:30-16:40, Paper SuPP1.21
Motion Control Methods for Dual Wheel-Legged Robot

Che, Yufei	Nanchang Hangkong University
Chen, Zhihua	Nanchang Hangkong University
Jiang, ZhiFan	Nanchang Hangkong University
Wang, Xiuwen	Beijing Institute of Mechanical Equipment
Huang, Xilong	Nanchang Hangkong University
Cheng, Shan	Nanchang Hangkong University
Keywords: Mobile Robotics Abstract: The dual-wheel-legged robot combines the efficiency of wheeled mobility with the obstacle-crossing capability of a legged structure, offering significant advantages in unstable terrain. However, dynamic balance control of the dual-wheel-legged robot on highly unstructured terrain remains a significant challenge. This paper analyzes a five-bar linkage mechanism and introduces virtual model control (VMC) to decouple the robot into two components: body balance motion and leg motion. A linear quadratic regulator (LQR) is designed for real-time balance control, with the robot's state space divided according to leg length. To handle external disturbances, linear active disturbance rejection control (ADRC) is used to control the robot's legs, and this approach is compared in simulation with PID control of the legs. The simulation results show that the control method based on LQR and ADRC enables the robot to successfully navigate obstacles, with smaller attitude-angle and leg-length errors than the control method based on LQR and PID.

10:30-16:40, Paper SuPP1.22
Seam-Cutting Image Stitching Method Based on DM-Net and SU-Net

Liu, Qi	Shenyang Jianzhu University
Guo, Song	Shenyang Jianzhu University
Keywords: Robot Vision and Computer Vision, Deep Learning, Artificial Intelligence Abstract: Traditional image stitching methods tend to rely on complex and salient geometric features to achieve better stitching results. However, these features are only effective in specific natural scenes with sufficient geometric structures. In contrast, deep learning-based stitching approaches overcome adverse conditions by adaptively learning robust semantic features. Nevertheless, in scenarios with low or highly complex feature content, these methods may fail due to insufficient feature extraction, leading to errors. Moreover, conventional seam-finding methods do not consider the optimal positioning of seams during the stitching process, which can result in misalignment caused by poor seam placement. To address these issues, we propose a novel unsupervised feature extraction method (DM-Net) combined with an optimal seam selection strategy (SU-Net) for image stitching. Our approach leverages a time-series algorithm to effectively fuse local and global features, thereby enhancing the correlation between them and ensuring a global perspective of the feature relationships. During stitching, the optimal seam is determined by sharing features between image pairs. Experimental results demonstrate that the proposed method outperforms other seam-cutting stitching techniques.

10:30-16:40, Paper SuPP1.23
An Adaptive Reinforcement Learning Path Planning Method for Indoor Complex Dynamic Environments

Xing, Songqi	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Liu, Wanting	Changchun University of Science and Technology
Lin, Siyu	Changchun University of Science and Technology
Keywords: Path and Motion Planning, Machine Learning, Artificial Intelligence Abstract: Path planning in indoor complex dynamic environments faces challenges such as frequent obstacle movements, high environmental uncertainty, and difficulty for agents to achieve stable convergence. Existing algorithms have limitations in adaptability, convergence speed, and path quality. To address these issues, this paper proposes an Adaptive Prioritized Experience Double Deep Q-Network Path Optimization method (APE-DPO), which integrates dynamic perturbation optimization, adaptive exploration rate adjustment, and a prioritized experience replay mechanism to enhance the agent's responsiveness to dynamic environments and improve policy stability. Experiments conducted in three representative dynamic scenarios—home, restaurant, and library—demonstrate that APE-DPO outperforms representative reinforcement learning models in the path planning domain, including DQN, DDQN, and their variants, in terms of path length, success rate, and convergence speed. In some scenarios, path length is reduced by 6.9% to 12.5%, showing strong robustness and practical application potential.
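Two standard building blocks named in this abstract, the Double DQN target and proportional prioritized experience replay, can be sketched as follows. This shows the textbook forms only. The paper's dynamic perturbation optimization and adaptive exploration schedule are not reproduced, and the function names and default hyperparameters are assumptions.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i**alpha / sum_k p_k**alpha, p_i = |delta_i| + eps."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of plain DQN. The exponent `alpha` controls how strongly sampling favours transitions with large temporal-difference error, with `alpha = 0` recovering uniform sampling.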

10:30-16:40, Paper SuPP1.24
A CBAM-ResNet Based PPO Framework for Safe Navigation in Dynamic Pedestrian Environments

Tian, Haoran	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Meng, Jin	Changchun University of Science and Technology
Wang, Shifeng	Changchun University of Science and Technology
Xing, Songqi	Changchun University of Science and Technology
Cong, Haifang	Changchun University of Science and Technology
Keywords: Path and Motion Planning, SLAM and Navigation, Sensor Networks Abstract: Safe and efficient navigation in dynamic pedestrian environments remains a significant challenge for mobile robots, particularly due to rapidly changing scenarios and complex multi-agent interactions. This paper proposes CBRN-PPO, a deep reinforcement learning-based navigation framework designed to address these challenges. The framework incorporates a ResNet-based feature extractor enhanced with the Convolutional Block Attention Module (CBAM) to improve both perception and decision-making. By fusing multimodal inputs—including LiDAR scans, pedestrian velocity maps, and target direction vectors—the model effectively captures rich semantic representations of dynamic obstacles. The CBAM module adaptively emphasizes critical spatial and channel-wise features, enhancing robustness in densely populated environments. Experimental results demonstrate that in scenarios with 35 pedestrians, CBRN-PPO achieves a 93% success rate in obstacle avoidance, outperforming the A1-RD baseline by 15%. Furthermore, it improves path efficiency by 38% and average navigation speed by 34%. These results highlight the effectiveness and robustness of CBRN-PPO as a navigation solution for autonomous mobile robots operating in complex dynamic environments.

10:30-16:40, Paper SuPP1.25
Geometry-Enhanced Multi-Level Dynamic Disparity Estimation and Virtual View Synthesis Method for Stereo Vision

Pan, Zeqian	Changchun University of Science and Technology
Piao, Yan	Changchun University of Science and Technology
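Keywords: Deep Learning, Robot Vision and Computer Vision Abstract: Accurate disparity map estimation is essential for computer vision tasks such as virtual view generation and 3D modeling. To address the low accuracy of disparity estimation in textureless, repetitive-texture, and occluded regions, as well as the artifacts observed in views generated by existing methods, this paper proposes a geometric-feature-enhanced multi-level dynamic disparity estimation and virtual view generation method. The method extracts multi-scale regional features from the source image using dilated convolutions with different dilation rates, and incorporates a feature pyramid network (FPN) together with a multi-scale geometry enhancement module from the reference image. In the fusion stage, gradient ...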

10:30-16:40, Paper SuPP1.26
Research on Multi-UAV Path Planning Based on Deep Reinforcement Learning

Wu, Wei	Changchun University of Science and Technology
Li, MingQiu	Changchun University of Science and Technology
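Keywords: Path and Motion Planning, Deep Learning, Multi-Robot Systems Abstract: The low cost and convenience of unmanned aerial vehicle (UAV) technology have led to its wide application and made it a strategic research focus for many countries. Previous UAV path planning methods suffer from low efficiency and slow speed. Although deep reinforcement learning has achieved considerable success in UAV path planning, problems such as long training time and low planning accuracy remain. To solve these problems, we propose an improved twin delayed deep deterministic policy gradient (TD3) algorithm (artificial potential field improved TD3: APF-TD3) for UAV path planning. First, this paper brings the combined value of the artificial potential field and the virtual potential field into the twin ...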

10:30-16:40, Paper SuPP1.27
Development of a Multi-Modal Control Architecture for a Cable-Actuated Ankle Exoskeleton

Largeteau, Etienne	University D'Évry
Conte, Bangaly	IBISC Laboratory, University of Paris Saclay
Su, Hang	Paris Saclay University
Bruneau, Olivier	ENS CACHAN
Alfayad, Samer	Paris-Saclay Université -Evry University
Keywords: Human-Robot Interaction and Cooperation, Medical Robotics, Home and Personal Robot Systems Abstract: This paper presents a lightweight, cable-actuated ankle exoskeleton featuring hip-mounted motors and a multimodal sensing platform for real-time gait-synchronized assistance. To minimize distal inertia and improve user comfort, the system integrates off-limb actuation with plantar-pressure and inertial sensing embedded in a smart insole. A dual-loop control framework—comprising a PID-based position controller and an inner PI current loop—ensures smooth, biologically inspired torque generation aligned with gait phases. Experimental validation demonstrates sub-degree joint-angle tracking, stable torque regulation, and rapid convergence without overshoot or oscillations. These results confirm the feasibility of the proposed system as a reliable and precise assistive device for rehabilitation and daily mobility, with potential for further enhancement through adaptive trajectory generation and energy-efficient control strategies.

10:30-16:40, Paper SuPP1.28
Self-Supervised Monocular Depth Estimation Using Temporal Convolution

Lin, Pengfei	Changchun University of Science and Technology
Wang, Yu	Changchun University of Science and Technology
Wang, Lu	Changchun University of Science and Technology
Keywords: Deep Learning Abstract: In recent years, self-supervised monocular depth estimation has attracted extensive attention from scholars because it does not rely on labeled data, is adaptable and low-cost, can be combined with pose estimation, and supports continuous learning. However, existing self-supervised monocular depth estimation methods often produce ambiguous depth predictions for moving targets in dynamic scenes because they ignore temporal continuity. Therefore, this paper proposes an improved architecture based on Lite-Mono. We introduce a channel-spatial attention mechanism into the consecutive dilated convolution module to enhance the feature response to the edges of dynamic objects and textureless regions, thereby reducing depth blurring. We design a multiscale feature storage unit to cache the multiscale feature maps of consecutive frames in a sliding window and generate temporally enhanced features through cross-frame fusion. We construct a hierarchical temporal convolution network, stacking causal convolutional layers with increasing dilation rates, to aggregate long- and short-term temporal contexts and capture the continuity of the motion trajectories of dynamic objects. The decoder fuses temporal features with the original spatial features to generate motion-consistent multiscale depth maps. Through extensive experiments, we compare the proposed method with the benchmark model Lite-Mono.

10:30-16:40, Paper SuPP1.29
A Novel Data-Driven Visualization and Analysis Framework for Embedded Robotic Firmware (I)

Yermakov, Elia	Paris Saclay
Ghandour, Maysoon	Université Paris Saclay
Su, Hang	Paris Saclay University
Alfayad, Samer	Paris-Saclay Université -Evry University
Keywords: Field Robotics, Humanoid Robots, ROS, Software System for Robotics Application Abstract: The growing complexity of embedded firmware in robotics and automation, including mobile robots, industrial actuators, and autonomous systems, presents significant challenges related to maintainability, debugging, and efficient developer onboarding. Traditional documentation methods frequently become outdated and insufficient, which complicates code comprehension and increases the likelihood of errors. At the same time, current tools often provide only static or fragmented documentation without the interactive, real-time insights required for effective development. To address these limitations, we present a unified visualization and documentation approach based on the open-source tools Doxygen and Graphviz. When applied to a sophisticated embedded control board designed for robotic applications, this method generates interactive documentation, call graphs, and module dependency diagrams that enhance code comprehension, simplify debugging, and accelerate onboarding. In addition, the integration supports targeted optimization by highlighting performance bottlenecks and areas of excessive code complexity, thereby guiding data-driven refactoring. Overall, our findings demonstrate that automated, interactive visualization significantly improves maintainability and development efficiency in embedded software.

Technical Program for Sunday August 10, 2025