| |
Last updated on August 7, 2025. This conference program is tentative and subject to change.
Technical Program for Sunday August 10, 2025
|
SuS1A |
Room D |
Award Session |
Regular Sessions |
Chair: Wang, Zhidong | Chiba Institute of Technology |
|
10:50-11:02, Paper SuS1A.1 | |
Style-Aware and Robust Sim-To-Real Humanoid Locomotion with Contact-Rich Motion Priors and Online Dynamics Adaptation |
|
Zhang, Jinlin | Zhejiang University |
Gan, Chunbiao | Zhejiang University |
Keywords: Humanoid Robots, Machine Learning, Path and Motion Planning
Abstract: Achieving natural and expressive full-body motion remains a significant challenge for humanoid robots. This work proposes a sim-to-real control framework integrating an online dynamics inference and adaptation module with a Contact-Rich Adversarial Motion Prior (CRAMP) to enable robust and style-aware motion control for humanoid robots with high gear ratio actuation. The dynamics adaptation module relies solely on proprioceptive feedback and multi-point foot contact modeling to estimate and adapt critical dynamic parameters online, effectively narrowing the simulation-to-reality gap. Meanwhile, CRAMP extends conventional adversarial motion priors by explicitly incorporating the robot-environment contact structure into the discriminator input, enabling the policy to learn end-to-end expressive behaviors with consistent styles and coordinated contact transitions, without relying on separate upper-lower body models or multi-stage training processes. The control policy is trained via reinforcement learning in simulation and successfully transferred to real hardware without additional tuning. Experimental results from GTX-III demonstrate that the proposed method stably generates natural and stylistically diverse full-body motions, exhibiting clear capabilities in gait rhythm modulation, posture tension control, asymmetric adaptation, and contact strategy adjustment, and delivering multidimensional stylistic responses in context-rich human-like scenarios.
|
|
11:02-11:14, Paper SuS1A.2 | |
Surgical Phase Detection with Deep Learning for Thoracoscopic Pulmonary Lobectomy Surgery |
|
Yang, Qian | Beijing University of Posts and Telecommunications |
Liu, Chang | Beijing University of Posts and Telecommunications |
Zhou, Zhuojia | Beijing University of Technology |
Feng, Yongqiang | Plastic Surgery Hospital, Chinese Academy of Medical Sciences And |
Zheng, Heng | Chinese Academy of Medical Sciences and Peking Union Medical Col |
Wang, Junchen | Beihang University |
Tang, Jie | Beijing Tian Tan Hospital, Capital Medical University |
Hou, Yuanzheng | Xuanwu Hospital, Capital Medical University |
Su, Baiquan | Beijing University of Posts and Telecommunications |
Zhang, Xiaoya | CIE |
Keywords: Medical Robotics, Deep Learning
Abstract: Accurate recognition of surgical phases is essential for clinical education and the advancement of robotic surgery. Although significant progress has been made in automated surgical phase recognition for various procedures, a notable research gap remains in the context of video-assisted thoracoscopic pulmonary lobectomy, a highly complex and technically demanding operation. To address this gap, this study is the first to comprehensively investigate automated phase recognition in thoracoscopic pulmonary lobectomy. Our principal contributions include: (1) the creation of the first expert-annotated private video dataset for pulmonary lobectomy, providing a critical resource for this understudied surgical domain; and (2) the optimization and adaptation of the SV-RCNet model, which achieves robust phase recognition by effectively capturing both visual and temporal features. Experimental results demonstrate that our approach achieves a surgical phase recognition accuracy of 92.4%, comparable to the state-of-the-art performance in other surgical phase recognition tasks. This work not only fills a critical research gap, but also lays a solid foundation for the future development of computer-assisted thoracoscopic surgery, offering significant clinical and educational value for surgical training and the advancement of intelligent robotic systems.
|
|
11:14-11:26, Paper SuS1A.3 | |
A Human–Robot Collaborative System for Maxillofacial Osteotomy Assisted by Virtual Fixtures Based on Admittance Control |
|
Deng, Yingyan | Beihang University |
Lu, Chunheng | Beihang University |
Liu, Xinyu | Peking University School and Hospital of Stomatology |
He, Yang | Peking University School and Hospital of Stomatology |
Wang, Junchen | Beihang University |
Keywords: Medical Robotics, Human-Robot Interaction and Cooperation, Grasping and Manipulation
Abstract: Precise and safe execution of maxillofacial osteotomy remains challenging due to complex anatomy and dynamic human–robot interaction. This study proposes a collaborative control system combining parametrized-surface virtual fixtures and admittance-based hybrid force–position control. The osteotomy surface is extracted from preoperative CT data, then projected and spline-fitted to generate the virtual constraint. An anisotropic admittance model ensures compliant motion in the cutting direction while enforcing geometric constraints in the normal and depth directions. The control scheme was implemented on a UR5e manipulator and validated through three experimental trials on identical anatomical mandible models. Postoperative CT scans were registered to preoperative plans, and cutting accuracy was evaluated using 13 anatomical landmarks per model. The overall mean error was 0.81 mm, with a maximum deviation of 1.48 mm and a standard deviation of 0.53 mm. A 3D error map revealed localized deviations. The robot demonstrated stable and responsive behavior across trials, with smooth tool interaction and no safety violations. These results confirm that the proposed system achieves millimeter-level precision while supporting intuitive human–robot cooperation, offering a promising basis for future clinical translation.
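The anisotropic admittance idea in this abstract can be illustrated with a minimal sketch. It is not the authors' controller: it assumes a per-axis law M*dv/dt + D*v = F in a hypothetical tool frame [cutting, normal, depth], uses high damping as a simple stand-in for the virtual-fixture constraints, and all names and gains (`admittance_step`, `MASS`, `DAMPING`) are illustrative assumptions.

```python
def admittance_step(velocity, force, mass, damping, dt):
    """One explicit-Euler step of M*dv/dt + D*v = F, applied independently per axis."""
    return [
        v + dt * (f - d * v) / m
        for v, f, m, d in zip(velocity, force, mass, damping)
    ]

# Hypothetical tool-frame axes: [cutting, normal, depth].
MASS = [2.0, 2.0, 2.0]              # virtual mass in kg (assumed values)
DAMPING = [20.0, 2000.0, 2000.0]    # N*s/m: low along the cut, high elsewhere (assumed)

def simulate(force, steps=2000, dt=0.001):
    """Integrate the admittance law from rest under a constant applied force."""
    v = [0.0, 0.0, 0.0]
    for _ in range(steps):
        v = admittance_step(v, force, MASS, DAMPING, dt)
    return v

# The same 5 N push on every axis yields very different commanded velocities.
steady = simulate([5.0, 5.0, 5.0])
```

With these assumed gains the commanded velocity settles near F/D on each axis, so the tool moves about 100 times more readily along the cutting direction than along the constrained directions.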
|
|
11:26-11:38, Paper SuS1A.4 | |
Phase Surface Electromyography (sEMG) Based Muscle Onset Detection |
|
Yuan, Wenbo | The University of Hong Kong |
Zhou, Changqiu | The University of Hong Kong |
Zhao, Yafei | The University of Hong Kong |
Ling, Zi-qin | Shenzhen University |
Chen, Jiangcheng | The University of Hong Kong |
Xi, Ning | The University of Hong Kong |
Keywords: Human-Robot Interaction and Cooperation, Sensing, Haptic System, Soft Robotics
Abstract: Surface electromyography (sEMG) signals, as physiological signals, have natural advantages in robot control and human-machine collaboration. Using electrical signals transmitted from the brain to the muscles through the central nervous system to control robots can improve the collaboration between robots and humans, making robot control highly real-time and mechanically matched. Using electromyographic signals to detect the activation state of robots helps to quickly switch between different robot states. Based on the propagation characteristics of neuromuscular junctions and electromyographic signals on the skin surface, we propose a muscle activation detection method based on opposite-phase sEMG. The onset signal obtained with this method is about 60 ms earlier than the joint force output, which means that muscle activation information can be obtained within the electromechanical delay (EMD) range. Using this method, the information carried by electromyographic signals can be expanded from the degree of activation to whether activation has occurred, providing more reference information for human-machine collaboration, fitness training, and disease evaluation.
|
|
11:38-11:50, Paper SuS1A.5 | |
Ultra-Sensitive Aorta Pressure Sensor Based on Graphene |
|
Zhao, Jing | Beijing Institute of Technology |
Yin, Yiqi | Beijing Institute of Technology |
Li, Zhongyi | Beijing Institute of Technology |
Feng, Zhejian | Beijing Institute of Technology |
Zeng, Xueer | Beijing Institute of Technology |
Keywords: Sensing, Haptic System, Smart Structures, Materials, Actuators, Emerging Technologies and Applications
Abstract: Graphene-based strain sensors have attracted much attention recently. Usually, there is a tradeoff between the sensitivity and the resistance of such devices, because higher-resistance devices consume more energy. For an aorta pressure sensor, a smaller device is needed to be less invasive to the human body. In this paper, we report an ultra-sensitive graphene-based pressure sensor whose sensitivity can be tuned through the original resistance obtained under different growth conditions. For a typical pressure sensor device, a gauge factor of ~10^3 can be achieved at a sheet resistance of ~100 kΩ/□. The flexible pressure sensor, placed in vivo or in vitro, can sense both blood pressure variations and heart rate. Cardiac arrhythmias such as atrial fibrillation and ventricular premature contraction can be detected in real time. After more than 10^4 test cycles, the performance of the device remains basically unchanged, with a fast response time of less than 50 ms. The highly sensitive, small-volume graphene-based aorta pressure sensor shows great potential for the future healthcare industry.
|
|
11:50-12:02, Paper SuS1A.6 | |
A Novel Dual-Modeling Framework for Precision Control of Servovalve Electrohydraulic Actuators |
|
Oyeranmi, Sheriff | Paris Saclay University |
Su, Hang | Paris Saclay University |
Mammar, Saïd | IBISC |
Alfayad, Samer | Paris-Saclay Université -Evry University |
Keywords: Intelligent Control and Systems, Intelligent Control, Human-Robot Interaction and Cooperation
Abstract: Electrohydraulic actuators (EHAs) are increasingly used in robotics because of their superior power density, bandwidth, and force control. However, accurately modeling their behavior is challenging due to the complex internal dynamics of the servovalve. This paper presents a dual-modeling methodology for a symmetric double-acting cylinder. The first is a nonlinear mathematical model derived from fluid continuity, orifice flow, and Newton's motion equations (including a hard-stop spring-damper to emulate end caps). The second is a component-level Simscape implementation that mirrors the hardware architecture. The initial parameters were obtained from direct measurements of a dismantled servovalve and manufacturer datasheets. The remaining eleven unknowns were identified via nonlinear least-squares optimization using the Trust-Region Reflective algorithm against experimental sinusoidal displacement data. Beyond regular validation with piston displacement, multi-metric validation of both models' behavior verified their ability to track sinusoidal inputs in displacement and to reproduce the expected chamber pressure dynamics. This rigorously validated dual-model framework provides analytical clarity and high-fidelity simulation for designing precise position, force, and compliant controllers for EHAs, enabling safe human-robot interaction.
|
|
SuS1B |
Room E |
Vision |
Regular Sessions |
Chair: Xu, Jing | Tsinghua University |
|
10:50-11:02, Paper SuS1B.1 | |
An Adaptive Peak Detector for Frequency Shift Methods in Fringe Projection Profilometry |
|
Hu, Changping | Tsinghua University |
Chen, Rui | Tsinghua University |
Xu, Jing | Tsinghua University |
Keywords: Robot Vision and Computer Vision
Abstract: 3D measurement of transparent objects is a challenging task for fringe projection profilometry, since each camera pixel captures an accumulation of information from multiple projector pixels. While existing frequency shift methods can decouple projector pixels, they cannot identify correct surface points from the decoupled results, due to the complexity of refraction and reflection. To address this challenge, an adaptive peak detector, robust to different reflected light intensities, is proposed for transparent object measurement. Experimental results demonstrate that our method achieves more accurate and complete measurements of transparent objects.
|
|
11:02-11:14, Paper SuS1B.2 | |
A Multi-Stage Automatic Framework for Fracture Fragments Segmentation of Zygomatic Bone and Arch |
|
Mo, Hao | Beihang University |
Zhang, Runshi | Beihang University |
Liu, Runqi | Peking University School of Stomatology |
Tong, Yanhang | Peking University |
Wang, Junchen | Beihang University |
Keywords: Robot Vision and Computer Vision, Deep Learning, Artificial Intelligence
Abstract: Zygomatic bone (ZB) and zygomatic arch (ZA) fractures are among the most common cranio-maxillofacial (CMF) injuries, posing significant challenges for accurate preoperative assessment and surgical planning due to the complex anatomy and high demand for facial symmetry restoration. To address these challenges, we propose a novel deep learning-based segmentation framework specifically designed for automated analysis of ZA and ZB fracture fragments. The framework consists of three major components: a 3D Faster R-CNN network to detect the ZA and ZB regions within cranial CT volumes, a 3D ConvNeXt encoder combined with a UPerNet decoder for coarse segmentation, and a lightweight 2D segmentation network for fine-grained fracture line identification. Experimental results on a dataset of 186 CT volumes demonstrate high accuracy in region detection (mAP: 90.66%) and effective ZA and ZB region segmentation (mDice: 93.52%). While the performance of fracture line segmentation was limited by the scarcity of annotated data, the proposed pipeline significantly enhances the automation and precision of preoperative planning for ZA and ZB fracture cases. This study facilitates CMF surgical planning and advances the development of autonomous and intelligent robotic surgical systems.
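For readers unfamiliar with the mDice figure quoted above, the following is a small sketch of the standard Dice overlap, 2|A∩B| / (|A| + |B|), averaged over cases. Representing each mask as a set of voxel indices and the helper names `dice` and `mean_dice` are assumptions made for illustration; the authors' evaluation code is not described in the abstract.

```python
def dice(pred, truth):
    """Dice coefficient between two masks given as collections of voxel indices."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # two empty masks are treated as a perfect match (a convention)
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

def mean_dice(pairs):
    """Average Dice over (prediction, ground-truth) mask pairs."""
    scores = [dice(p, t) for p, t in pairs]
    return sum(scores) / len(scores)

# Toy example: two voxels predicted, two annotated, one shared.
example = dice({(0, 0, 0), (0, 0, 1)}, {(0, 0, 1), (0, 1, 1)})
```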
|
|
11:14-11:26, Paper SuS1B.3 | |
OOPS: Occlusion-Aware Optimization-Based 6D PoSe Estimation of Unseen Transparent Objects |
|
Pun, Chifai | Tsinghua University |
Xu, Jing | Tsinghua University |
Hu, Changping | Tsinghua University |
Chen, Rui | Tsinghua University |
Keywords: Robot Vision and Computer Vision
Abstract: In this paper, we introduce OOPS, a CAD model-based pipeline that reframes the 6D pose estimation of novel transparent objects with known CAD models as an optimization-based alignment problem. The goal is to align the silhouettes and borders of observed transparent objects with those rendered from CAD models.
|
|
11:26-11:38, Paper SuS1B.4 | |
Robust Monocular Distance Estimation in Degraded Visual Conditions Via a Wide-And-Deep Fusion Framework |
|
Wang, Richard | BASIS Independent Silicon Valley Upper School |
Han, Grant | VEX V5RC Team 1698V |
Liu, Alexander | Basis Independent Silicon Valley Upper School |
Ye, Zixuan | VEX V5RC Team 1698V |
Deng, Alexander | Saratoga High School |
Liu, Lexie | VEX V5RC Team 1698V |
Wei, Xing | Basis Independent Silicon Valley Upper School |
Keywords: Robot Vision and Computer Vision, Deep Learning, Machine Learning
Abstract: Robust visual perception remains a central challenge for autonomous systems operating under degraded environmental conditions such as fog, low light, and artificial glare. In this study, we investigate multiple strategies to enhance the resilience of vision-based models in such scenarios, using monocular distance estimation as a representative task due to its sensitivity to visual quality and availability of ground-truth metrics. A Convolutional Neural Network (CNN) is first trained on clean visual data using a single-camera robotic platform, and its direct inference results serve as the baseline. We then explore three enhancement strategies: (1) image preprocessing with a transformer-based large model, (2) model fine-tuning on low-visibility data, and (3) a novel fusion framework that integrates temporal motion history with image features. Inspired by wide-and-deep learning structures, the proposed fusion model combines shallow inputs (e.g., recent speed and displacement) with deep CNN-based image embeddings for improved robustness. Experimental results across various lighting scenarios demonstrate that while multiple enhancement strategies improve estimation accuracy, the fusion model achieves the most consistent and efficient performance. The findings provide insight into designing more reliable vision systems for low-cost robotics and general perception tasks under real-world variability.
|
|
11:38-11:50, Paper SuS1B.5 | |
Silhouette-Guided Diffusion Model for Transparent Object Reconstruction |
|
Hu, Changping | Tsinghua University |
Xu, Jing | Tsinghua University |
Pun, Chifai | Tsinghua University |
Chen, Rui | Tsinghua University |
Keywords: Robot Vision and Computer Vision
Abstract: Depth reconstruction for transparent objects is a challenging problem, where surface feature matching methods are hindered by complex refraction and reflection. We propose a novel transparent object reconstruction pipeline with a guided object-centric 3D diffusion model. Specifically, we train an unconditional 3D diffusion model with only 3D point cloud data. To control the output of the diffusion model, we design a silhouette-based guidance function for each step of the diffusion process. Experimental results show that our method can achieve state-of-the-art performance for transparent object depth reconstruction compared to existing depth regression and completion methods.
|
|
11:50-12:02, Paper SuS1B.6 | |
Free-Viewpoint Multi-View Face Reconstruction Using Deep Learning Method |
|
Li, Zhongtian | Beihang University |
Zhang, Runshi | Beihang University |
Jie, Bimeng | Peking University |
He, Yang | Peking University School and Hospital of Stomatology |
Wang, Junchen | Beihang University |
Keywords: Deep Learning, Robot Vision and Computer Vision
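Abstract: Accurately recovering the 3D shape of a face from 2D images is a challenging task with many applications. Although progress has been made in research based on 3D morphable models, most current methods focus on using a single image for reconstruction. The limited information contained in a single image inevitably limits the effectiveness of face reconstruction. This study proposes a deep learning-based multi-view reconstruction method using a 3D morphable model. It does not require real subjects; only weakly supervised learning with multiple face images is used to obtain accurate faces under free-viewpoint conditions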
|
|
SuS2A |
Room D |
Robotics and AI for Soft Materials |
Invited Sessions |
Chair: Wang, Zhidong | Chiba Institute of Technology |
|
14:20-14:32, Paper SuS2A.1 | |
Fabric Modelling for Robot Applications (I) |
|
Bhattacharya, Dipankar | The University of Hong Kong |
Kosuge, Kazuhiro | The University of Hong Kong |
Keywords: Intelligent Control and Systems
Abstract: Precise fabric manipulation—such as folding, hanging, and aligning textiles for sewing—is a core challenge in garment manufacturing automation. These manipulation tasks are especially difficult for robots due to the highly deformable nature of fabric, which exhibits infinite configurations, non-linear dynamics, and frequent self-occlusion. Achieving reliable perception and control for fabric manipulation remains an open problem in robotics. To address this, we combine physics-based models, like mass-spring systems and position-based dynamics, with data-driven approaches for accurate cloth state estimation and manipulation tracking. At the heart of our method is a Graph Attention Network (GAT), which predicts and reconstructs the mesh state of fabric from real-time sensory data, enabling robust perception for manipulation even in complex scenarios. These GAT-based mesh representations are then used to train an Action Chunking Transformers (ACT) network using real human demonstrations. The ACT network learns to generate effective action sequences that move fabric to desired configurations, enabling fine manipulation with limited training data. By integrating GAT-based perception with ACT’s sequential action modeling, our system enables the robot to align a fabric with a desired configuration with high reliability and efficiency, with only a small number of demonstrations.
|
|
14:32-14:44, Paper SuS2A.2 | |
Fabric Handling with Passive Actuator Less Grippers (I) |
|
Seino, Akira | Centre for Transformative Garment Production |
Kosuge, Kazuhiro | The University of Hong Kong |
Keywords: Intelligent Control and Systems, Grasping and Manipulation
Abstract: The automation of fabric destacking and pick-and-place operations is critical for garment manufacturing. Conventional grasping solutions, such as pinching, suction, electro-adhesive, or needle-based grippers, are fundamentally limited by their reliance on active actuation and external power sources. We propose the Passive Actuator-Less Gripper (PALGRIP), a novel, fully mechanical end-effector for fabric manipulation. PALGRIP utilizes an internal mechanism that converts the relative movement between the gripper housing and the fingers into an open-and-close motion of the fingers, enabling it to grasp the single topmost piece of fabric from a stack without requiring any onboard power or tethered supplies. With its passive, power-free design, PALGRIP facilitates low-cost, simplified system integration, presenting a significant advancement in accessible automation for the textile industry. In this talk, the concept and mechanical design of PALGRIP and its application will be presented.
|
|
14:44-14:56, Paper SuS2A.3 | |
Fabric Handling System (I) |
|
Kobayashi, Akinari | Centre for Transformative Garment Production |
Kosuge, Kazuhiro | The University of Hong Kong |
Keywords: Intelligent Control and Systems, Grasping and Manipulation
Abstract: Robotic handling of thin, soft, and flexible materials such as fabric presents significant challenges because they are prone to wrinkling and losing their shape. Keeping fabric flat requires either controlled tension or complete support. To address this issue, we propose a novel robotic end-effector that handles a piece of fabric by rolling it up. The end-effector incorporates a roller equipped with suction, which first adheres an edge of the fabric to the roller using suction and then rolls it up smoothly to avoid wrinkling. We implemented this end-effector on a dual-arm robot equipped with force sensors, allowing coordinated dual-arm manipulation with dynamic tension control to maintain tautness. We evaluated the performance of the system through pick-and-place experiments involving stacked pieces of fabric. The results demonstrate that our approach enables precise and reliable fabric manipulation, highlighting its strong potential for automation in sectors such as garment production.
|
|
14:56-15:08, Paper SuS2A.4 | |
Visual Servoing Enhanced with AI (I) |
|
Tokuda, Fuyuki | Centre for Transformative Garment Production |
Kosuge, Kazuhiro | The University of Hong Kong |
Keywords: Robot Vision and Computer Vision, Artificial Intelligence, Intelligent Control and Systems
Abstract: Visual servoing is a technique that enables robots to control their pose using visual feedback from cameras. It is widely used in applications such as object tracking, autonomous navigation, and robotic manipulation. However, most existing systems focus on handling rigid objects, making soft and flexible materials like fabrics a significant challenge for visual servoing. This talk presents a novel CNN-based visual servoing method for the simultaneous positioning and flattening of a soft, non-textured fabric part using a dual-arm manipulator system. The system utilizes multimodal sensory input, grayscale camera images and force/torque data from force sensors, to control the robot arms. To address the difficulty of recognizing the shape of a non-textured fabric surface, structured lighting is applied to enhance surface features. A convolutional neural network is trained using data collected by randomly manipulating the fabric with real robots. The proposed method successfully flattens fabric parts from various initial conditions, including unseen wrinkles.
|
|
SuS2B |
Room E |
Recent Advances and Research Frontiers in Marine and Aerial Robotics |
Invited Sessions |
Chair: Ji, Daxiong | Zhejiang University |
|
14:20-14:32, Paper SuS2B.1 | |
Mitigating Output Redefinition Error in Trajectory Planning for Underactuated Unmanned Surface Vehicles (I) |
|
Zheng, Yongsheng | Zhejiang University |
Jiang, Yuning | Zhejiang University |
Xu, Chao | Zhejiang University |
Xiang, Ji | Zhejiang University |
Keywords: Mobile Robotics, Path and Motion Planning
Abstract: Underactuated unmanned surface vehicles (USVs) lack surge-direction controllers, making it difficult to accomplish motion control tasks. To address the issue that available control inputs in the yaw direction cannot directly correspond to position error, an output redefinition method is used to shift the control object from the centroid to an offset point a small distance away from it. This approach directly takes the point's position as the control object to avoid underactuation issues, but it introduces tracking error. This paper proposes a trajectory planning method for underactuated USVs to mitigate output redefinition error. The method translates each point of a known desired trajectory by a certain distance along its tangent direction with sideslip angle compensation, which makes up for the inconsistency between the USV's moving direction and the tangential direction of the desired trajectory, and finally generates a new trajectory for the offset point. This ensures the USV centroid follows the original desired trajectory when the offset point tracks the new one. Compared with direct tracking, it mitigates error due to output redefinition, enabling more accurate trajectory following. Simulation results validate the method's effectiveness.
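The tangent-shift step described in this abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' planner: the tangent is estimated by finite differences between neighbouring waypoints, the vehicle heading is taken as the tangent angle minus a constant sideslip angle, and the function name `offset_trajectory` and its parameters are hypothetical.

```python
import math

def offset_trajectory(points, offset, sideslip=0.0):
    """Shift each waypoint by `offset` along the assumed vehicle heading.

    points   -- list of (x, y) waypoints of the desired centroid path (at least two)
    offset   -- distance from the centroid to the redefined output point
    sideslip -- sideslip angle in radians; heading = tangent angle - sideslip (assumed)
    """
    shifted = []
    n = len(points)
    for i, (x, y) in enumerate(points):
        # Tangent direction from neighbouring waypoints (one-sided at the ends).
        x0, y0 = points[max(i - 1, 0)]
        x1, y1 = points[min(i + 1, n - 1)]
        heading = math.atan2(y1 - y0, x1 - x0) - sideslip
        shifted.append((x + offset * math.cos(heading),
                        y + offset * math.sin(heading)))
    return shifted

# Straight path along x with no sideslip: every waypoint moves 0.5 m forward.
new_path = offset_trajectory([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], offset=0.5)
```

If the offset point tracks `new_path`, the centroid, which sits `offset` behind it along the heading, falls back onto the original waypoints in this idealized setting.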
|
|
14:32-14:44, Paper SuS2B.2 | |
Proximal Policy Optimized Tube MPC Fault-Tolerant Control for Thrusters in AUV (I) |
|
Lai, Zehua | Zhejiang University |
Xu, Lie | Zhejiang University |
Yang, Jinghe | The University of Melbourne |
Pu, Ye | University of Melbourne |
Ji, Daxiong | Zhejiang University |
Keywords: Intelligent Control and Systems
Abstract: This paper investigates a fault-tolerant control (FTC) method for underwater vehicles, specifically focusing on small autonomous underwater vehicles (S-AUVs) experiencing thruster faults. We propose a novel control strategy that combines Proximal Policy Optimization with Tube Model Predictive Control (PPO Tube MPC) to enhance system performance under fault conditions. By leveraging PPO-trained policy to adaptively adjust the auxiliary feedback control law parameters in Tube MPC, this method reduces controller conservativeness while maintaining system stability during fault handling. Simulation results demonstrate that our PPO Tube MPC method significantly outperforms traditional control methods in reducing tracking errors. The PPO-trained policy exhibits strong generalization capability, maintaining effective control across diverse tasks and varying fault occurrence times.
|
|
14:44-14:56, Paper SuS2B.3 | |
Performance Evaluation of Flight Control Strategies for a Pesticide-Carrying Unmanned Aerial Vehicle |
|
Arshad, Syed Muhammad Nashit | Shenzhen Technology University, Shenzhen, China |
Xu, Haoliang | Shenzhen Technology University |
Hussain, Muntazir | Sustech |
Khan, Rashid | Shenzhen Technology University, Shenzhen, 518188, China |
Meng, Xiangdong | Shenyang Institute of Automation, Chinese Academy of Sciences |
Li, Qiang | Shenzhen Technology University |
Ming, Zhong | Shenzhen University |
Keywords: Intelligent Control, Field Robotics
Abstract: The efficient transportation of liquids by unmanned aerial vehicles (UAVs) is of utmost importance in various autonomous missions, including firefighting and field spraying. Nevertheless, liquid sloshing during transportation can lead to undesirable effects such as instability, unwanted forces, position error, and increased control effort, resulting in inefficient power utilization and payload constraints. To mitigate the effects of chlorpyrifos (pesticide) sloshing, a Lagrangian-based dynamic model of the UAV and the resulting slosh was developed. Using a spring-mass-damper (SMD) analogy, the sloshing and the dynamic equations of the quadcopter were modeled based on the geometry of the liquid container. An effective classical control algorithm for a liquid-carrying quadcopter is presented, which has been extensively investigated, validated, and compared. Simulations based on Coppelia V-REP are also presented to investigate the real-time implementation of the proposed system. The results demonstrate a decrease in chlorpyrifos slosh amplitude and, intuitively, a reduction in the control effort. These findings have significant implications for improving the quality of quadcopter control in various real-world applications.
|
|
14:56-15:08, Paper SuS2B.4 | |
Experimental Investigation of Sloshing Force Prediction Via Deep Learning and Sensor Fusion in Robotics |
|
Arshad, Syed Muhammad Nashit | Shenzhen Technology University, Shenzhen, China |
Chen, Mingqi | Shenzhen Technology University |
Xu, Haoliang | Shenzhen Technology University |
Li, Meng | Shenzhen Technology University |
Li, Qiang | Shenzhen Technology University |
Ming, Zhong | Shenzhen University |
Keywords: Intelligent Control and Systems
Abstract: Accurately predicting the force of water in a moving container remains a challenging task. This paper introduces a novel framework for estimating dynamic sloshing forces in liquid-carrying robotic systems, leveraging a CNN-LSTM model enhanced with an attention mechanism and multi-sensor fusion. A rectangular beaker was mounted on a robotic manipulator, which was equipped with a multi-level water height sensor, a 10-axis IMU to monitor beaker motion, and a 3-axis force sensor to capture sloshing-induced forces. The robotic manipulator executed both controlled and random 3D motions with varying velocities and accelerations to induce diverse sloshing dynamics without causing spillage. A sensor fusion algorithm prioritized laser sensor data when ultrasonic readings became unreliable due to high velocities or large sloshing angles. This approach enables real-time sloshing force estimation, laying the foundation for sensor-free systems where forces can be accurately predicted.
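The sensor-prioritization rule mentioned in this abstract could be sketched as a simple gate. Everything specific here is an assumption for illustration: that both sensors report a liquid height, the threshold values, the choice to average the two readings when the ultrasonic sensor is trusted, and the function name `fuse_height`. The abstract does not give the authors' actual fusion rule.

```python
def fuse_height(laser_mm, ultrasonic_mm, speed, tilt_deg,
                max_speed=0.5, max_tilt_deg=15.0):
    """Return a fused liquid-height estimate in millimetres.

    The ultrasonic reading is trusted only while the container moves slowly
    and the liquid surface is not strongly tilted (assumed thresholds).
    """
    ultrasonic_ok = speed <= max_speed and abs(tilt_deg) <= max_tilt_deg
    if not ultrasonic_ok:
        return laser_mm                      # fall back to the laser alone
    return 0.5 * (laser_mm + ultrasonic_mm)  # assumed: plain average when both are usable

fast_case = fuse_height(100.0, 140.0, speed=1.0, tilt_deg=0.0)   # laser only
calm_case = fuse_height(100.0, 140.0, speed=0.1, tilt_deg=5.0)   # averaged
```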
|
|
SuS3A |
Room D |
Robotic Learning, Intelligent Control and Design |
Regular Sessions |
Chair: Chen, Heping | Texas State University |
|
15:10-15:22, Paper SuS3A.1 | |
Online Domain Adaption for Sim2Real Transfer of High-Precision Manipulation with Visuotactile Sensing |
|
Chen, Rui | Tsinghua University |
Dang, Renjun | Tsinghua University |
Xu, Jing | Tsinghua University |
Keywords: Intelligent Control, Machine Learning
Abstract: Visuotactile sensors provide high-resolution contact information for contact-rich manipulation tasks using reinforcement learning (RL). However, the simulation-to-real domain gap limits Sim2Real performance in high-precision tasks. We propose an online adaptation framework using a recurrent neural network (RNN) to learn correlations between robot proprioception and tactile signals, enabling policy adaptation. Experiments demonstrate a 96.7% success rate for zero-shot Sim2Real transfer of peg insertion with 40 μm clearance.
|
|
15:22-15:34, Paper SuS3A.2 | |
TD3-Based Visual-Tactile Fusion for Dexterous Robotic Grasping |
|
Wang, Xiaoyu | Shandong University |
Li, Ke | School of Control Science and Engineering, Shandong University |
|
|
15:34-15:46, Paper SuS3A.3 | |
Leg Joint Trajectory Planning of a Cat-Inspired Falling Robot Driven by Pneumatic Muscles with Optimal Energy Consumption |
|
Cao, Jian | Hefei University of Technology |
Han, Shun | Hefei University of Technology |
Zhu, Xiaocong | Zhejiang University |
Song, Yunhe | Hefei University of Technology |
Keywords: Biologically Inspired Robotics, Path and Motion Planning, Mobile Robotics
Abstract: In this paper, leg joint trajectory planning with optimal energy consumption is proposed for a cat-inspired falling robot driven by pneumatic muscles (CIFRDPM) during its diagonal trotting gait. Firstly, the kinematics and energy consumption model of the leg joints of the CIFRDPM are analyzed. Subsequently, a MATLAB/Adams co-simulation model for the leg motion of the robot is established, utilizing the composite pendulum line as a reference trajectory at its foot end. Then, an optimized leg trajectory generation method combining a three-point Fourier series and cubic spline interpolation is developed, with parameters specifically tuned to minimize the energy consumption of the leg joints during motion. The simulation results demonstrate that the CIFRDPM with the Fourier optimized trajectory and with the Fourier plus cubic spline interpolation optimized trajectory achieves 18.12% and 19.63% reductions in energy consumption, respectively, during its diagonal trotting motion, compared with using only the composite pendulum reference trajectory at the foot end.
|
|
15:46-15:58, Paper SuS3A.4 | |
Structure Design and Manufacturing Method of the Silicon-Based Elbow Exoskeleton |
|
Hou, Xinyu | Shenyang Aerospace University |
Zeng, Xinyu | Shenyang Aerospace University |
Zhirui, Zhao | Shenyang Aerospace University |
Gang, Liu | Shenyang Aerospace University |
Dexing, Shan | Northeastern University |
Xu, Jiqian | Shenyang Aerospace University |
Keywords: Soft Robotics, Smart Structures, Materials, Actuators
Abstract: This paper presents a silicone-based soft elbow exoskeleton designed for rehabilitation, particularly targeting individuals with weakened physical capabilities. The study delves into the working principle and the design process using CAD and CAE. During the preparation process, a 3D printer was used to manufacture the casting mold, and PVA material and carbon fiber weaving were employed to enhance the performance of the air chambers. The experimental setups and performance tests of the proposed exoskeleton are also discussed. The results indicate that the proposed device can assist a passive prosthetic arm weighing up to 1.5 kg in achieving a maximum bending angle of 63.09 degrees. These results also align well with the range of motion required for daily activities of the human elbow joint and demonstrate promising potential for rehabilitation applications.
|
|
SuS3B |
Room E |
Embodied Intelligence and Adaptive Systems |
Invited Sessions |
Chair: Zhao, Yuliang | Northeastern University at Qinhuangdao |
|
15:10-15:22, Paper SuS3B.1 | |
Gait Phase Recognition Based on a Multimodal Sensing-Driven Smart Shoe System (I) |
|
Hanbing, Liu | Université Paris-Saclay, Université d'Evry Paris-Saclay |
Zuo, Chuanlin | University Evry Val d'Essonne, University Paris-Saclay |
Bencharif, Loqmane | Paris Saclay |
Ibset, Abderahim | University Paris-Saclay |
Qi, Wen | Politecnico Di Milano |
Su, Hang | Paris Saclay University |
Dychus, Eric | Sandyc |
Alfayad, Samer | Paris-Saclay Université-Evry University |
Keywords: Sensor Networks, Rehabilitation and Assistive Robotics, Medical Robotics
Abstract: Plantar pressure distribution is a key biomechanical parameter reflecting gait stability and foot loading characteristics. It plays a crucial role in designing control strategies for exoskeleton-assisted rehabilitation. This study investigates plantar pressure variation, regional loading dynamics and the trajectory of the center of pressure (CoP) during normal walking, based on a previously developed multimodal smart shoe system. The system integrates three types of sensors: flexible film pressure sensors, bending sensors and inertial measurement units (IMUs). We analyze temporal and spatial variations in pressure over the heel, arch and forefoot, generate real-time heatmaps and extract CoP trajectories to assess gait stability. This work provides high-resolution data supporting biomechanical modeling of healthy gait and offers a basis for gait event detection and closed-loop control in future exoskeleton systems.
|
|
15:22-15:34, Paper SuS3B.2 | |
RTK and IMU Fusion Positioning Technology for Orchard Robot |
|
Wang, Hao | Yantai University |
Wang, Fei | Yantai University |
Keywords: Agricultural Robotics, Path and Motion Planning
Abstract: Positioning is fundamental for enabling intelligent operations of orchard equipment. However, satellite-based positioning signals are prone to occlusion and multipath effects caused by tree canopies and surrounding obstacles, resulting in degraded localization accuracy. To address these challenges in orchard environments, this paper introduces a measurement-deviation-based extended Kalman filter fusion algorithm. The proposed method enhances state estimation accuracy by adaptively adjusting measurement noise covariances and mitigating bias accumulation due to sensor drift and intermittent signal outages. Consequently, it offers improved robustness and adaptability in real-time state estimation. Experimental results demonstrate that this fusion scheme significantly enhances both the accuracy and real-time responsiveness of Inertial Measurement Unit (IMU) and Real-Time Kinematic (RTK) data fusion, thereby satisfying the stringent positioning requirements of orchard applications.
|
|
15:34-15:46, Paper SuS3B.3 | |
Real-Time Detection Performance of RTSP vs. USB Cameras for UGVs |
|
Sadman, Raheeb | BRAC University |
Sattar, Safwan | BRAC University |
Alam, Jahedul | BRAC University |
Maliha, Sabrina | BRAC University |
Mashkura, Mahadia | BRAC University |
Sikder, Sajid Ali Sikder | BRAC University |
Akand, Anan | Brac University |
Abrar, Fahim | BRAC University |
Keywords: Mobile Robotics, Robot Vision and Computer Vision, Sensor Networks
Abstract: Real-time object detection is critical for ensuring the operational safety of unmanned ground vehicles (UGVs) in industrial inspection and last-mile delivery applications, yet the impact of camera interface selection remains poorly understood. This study presents the first systematic comparison of USB webcams versus network cameras for UGV perception systems, focusing on safety-critical scenarios including emergency braking and obstacle avoidance. Our experimental results demonstrate that USB-based systems consistently outperform networked alternatives, offering superior frame processing efficiency, enhanced navigation precision during avoidance maneuvers, and significantly improved operational stability in prolonged deployments. These advantages are a result of USB’s direct hardware interfacing and predictable low-latency characteristics. The outcomes conclusively establish that for UGV applications, directly connected USB cameras provide more reliable vision performance, enabling robust collision avoidance in dynamic environments while maintaining exceptional tracking consistency and system robustness. Additionally, the study highlights the significant reduction in detection latency and jitter, which is important for real-time decision-making in fast-paced, safety-critical operations.
|
|
SuS4A |
Room D |
Medical Robots for Precision Surgery, AI-Driven Algorithms, and Autonomous Surgical Task |
Invited Sessions |
Chair: Wang, Junchen | Beihang University |
Co-Chair: Su, Baiquan | Beijing University of Posts and Telecommunications |
|
16:20-16:32, Paper SuS4A.1 | |
Intra-Natural Orifice Locomotion Robot with Propulsion by Chemical Reaction (I) |
|
Zhang, Dengbo | Beijing University of Posts and Telecommunications |
Jiang, Zhangzhang | Beijing University of Posts and Telecommunications |
Ba, Peng | Beijing University of Posts and Telecommunications |
Wang, Junchen | Beihang University |
Liu, Wenyong | Beihang University |
Tang, Jie | Beijing Tian Tan Hospital, Capital Medical University |
Su, Baiquan | Beijing University of Posts and Telecommunications |
Keywords: Medical Robotics
Abstract: There are multiple ways for robots to advance, such as peristalsis, rolling, magnetic drive, and rear push. However, the thrust generated by chemical reactions can also drive objects to move. How to design a robot that generates thrust through chemical reactions to drive itself is a challenging problem. This paper proposes a cavity motion robot driven by chemical reactions, multiple reactant delivery mechanisms, a method for controlling the direction of reaction forces, a reaction chamber of the robot, on-off valves, and a device for discharging reaction products. We analyze and derive the relationship between the kinetic energy of gases generated by chemical reactions, robot displacement, and reactant volume. The feasibility of the basic principle of the robot's movement was tested experimentally. This driving method provides a new movement mode for motion robots in pipelines and opens up a new field of driving research.
|
|
16:32-16:44, Paper SuS4A.2 | |
Digital Twin-Based Stereo Dataset Generation Method for 3D Reconstruction of Teeth (I) |
|
Su, Pengjiao | Beihang University |
Liu, Yuchen | State Key Laboratory of Oral & Maxillofacial Reconstruction And |
Zhang, Runshi | Beihang University |
Bai, Shizhu | Digital Center, School of Stomatology, the Fourth Military Medic |
Wang, Junchen | Beihang University |
Keywords: Robot Vision and Computer Vision, Deep Learning, Medical Robotics
Abstract: Producing stereo matching datasets in dental 3D reconstruction studies is expensive and difficult. In this study, we propose a method for procedurally generating high-precision virtual stereo matching datasets based on the Unity engine. By accurately modeling the binocular camera imaging principle and environmental parameters, the method can generate paired stereo images with known depth information (disparity maps), sampling depth values with an accuracy of up to 10^-4 m. This method breaks from the traditional paradigm of relying on expensive sensor acquisition or limited public datasets. It addresses the long-standing problems of difficulty in acquiring depth information of real scenes, scarcity of high-quality datasets, and high cost. The experiments were performed on the widely used open-source stereo matching model RAFT-Stereo, with the output aligned to the target tooth point cloud after segmentation. The results show that the performance of the model trained on this dataset is comparable to that obtained with a traditional real dataset. The average alignment error for a single tooth was 0.17-0.26 mm, with the maximum error below 0.93 mm. For multi-tooth scenarios, the average co-alignment error stabilized at 0.27-0.79 mm. Finally, experiments were performed on facial images taken by a real camera. The average error of the reconstructed point cloud is within 1 mm.
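As a brief illustration of how a disparity map encodes depth for a rectified stereo pair, the standard pinhole relation Z = f * B / d can be sketched in Python. This is only a minimal sketch: the function name and the focal length, baseline, and disparity values are hypothetical and are not taken from the paper.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters of a point seen by a rectified stereo pair.

    Uses the standard relation Z = f * B / d, where f is the focal length
    in pixels, B the camera baseline in meters, and d the disparity in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical values, chosen only for illustration.
depth_m = disparity_to_depth(disparity_px=250.0, focal_px=1000.0, baseline_m=0.05)
```

Because depth is inversely proportional to disparity, a fixed disparity error translates into a larger depth error for distant points than for close ones.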
|
|
16:44-16:56, Paper SuS4A.3 | |
Structural Design and Analysis of a Wire-Driven Robot with a Double-Wire Coaxial Drive Mode (I) |
|
Ma, Xudong | Beijing University of Posts and Telecommunications |
Chen, Anqi | Beijing University of Posts and Telecommunications |
Yi, Yubo | Beijing University of Posts and Telecommunications |
Hu, Yida | Harvard Medical School |
Liu, Wenyong | Beihang University |
Wang, Junchen | Beihang University |
Kuang, Shaolong | Shenzhen Technology University |
Tang, Jie | Beijing Tian Tan Hospital, Capital Medical University |
Hou, Yuanzheng | Xuanwu Hospital, Capital Medical University |
Li, Changsheng | Beijing Institute of Technology |
Su, Baiquan | Beijing University of Posts and Telecommunications |
Keywords: Medical Robotics
Abstract: Wire-driven robots have wide applications. The structure of the driver is simple but bulky, with too many motors and no tensioning mechanism. A driver for a dual-wire coaxial wire-driven robot is proposed. The principle is that the extension and contraction amounts of the mirror-symmetrically distributed driving wires are equal. Based on this principle, a wire-driven hyper-redundant robot and a continuum robot with a dual-wire coaxial driving method are designed, and their kinematic models are established and analyzed. Based on the design method, a hyper-redundant robot and a continuum robot with a dual-wire coaxial driver are fabricated, and their motion performance is tested. The test results show that the wire-driven robot based on the dual-wire coaxial driver conforms to the kinematic analysis, proving that the dual-wire coaxial driving method is correct and efficient.
|
|
16:56-17:08, Paper SuS4A.4 | |
Design and Simulation of a Robotic System for Sports Injury Treated with Extracorporeal Shock Wave Therapy (I) |
|
Wang, Boyang | Beihang University |
Wang, Yueyang | Beihang University |
Liu, Wenyong | Beihang University |
Keywords: Medical Robotics, Robot Design, Human-Robot Interaction and Cooperation
Abstract: When applying robotic technology to extracorporeal shock wave therapy (ESWT) for sports injuries, challenges related to dexterity and dynamic response stability need to be addressed. To this end, this paper proposes an arm-hand coordinated robot-assisted ESWT scheme and completes its simulation. First, by analyzing the operational requirements for the ESWT target area, a robot-assisted ESWT scheme and a two-degree-of-freedom (2-DOF) end-effector mechanism are designed. The inverted conical workspace of the end-effector enables precise treatment of localized target areas and controls the ESWT instrument to perform feed motion at specific orientations, overcoming the movement limitations of using a robotic arm alone in confined spaces and achieving a workspace of 1/3*π*(75)^2*200 mm^3, which meets the small space occupation required for ESWT. The forward and inverse kinematics of the robot system are then calculated, and joint motion simulations are performed. Finally, a dynamic model of the end-effector is established, and the characteristics of its driving force profile, in which the maximum forces for the translational feed joint and the swinging joint do not exceed 2.5 N and 15 N, respectively, are analyzed. This research provides a highly dexterous and stable robot-assisted ESWT solution, laying the foundation for intelligent and precise treatment.
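The workspace figure quoted in the abstract has the form of a cone volume, V = (1/3) * π * r^2 * h. The short Python sketch below evaluates it, assuming (this reading is not stated explicitly in the abstract) that 75 is the base radius and 200 the height, both in millimeters. The function name is hypothetical.

```python
import math


def cone_volume_mm3(radius_mm, height_mm):
    """Volume of a right circular cone, V = (1/3) * pi * r^2 * h, in mm^3."""
    return math.pi * radius_mm ** 2 * height_mm / 3.0


# Assumed reading of the abstract's expression: r = 75 mm, h = 200 mm.
volume_mm3 = cone_volume_mm3(75.0, 200.0)  # equals 375000 * pi, about 1.18e6 mm^3
```

Under that reading, the workspace is roughly 1.18 liters.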
|
|
17:08-17:20, Paper SuS4A.5 | |
Organ Deformation and Contact Force Estimation of Surgical Instruments Based on 3D Vision |
|
Qian, Hongyu | Tsinghua University |
Wang, Yixuan | Tsinghua University |
Chen, Rui | Tsinghua University |
Xu, Jing | Tsinghua University |
Keywords: Medical Robotics, Sensing, Haptic System
Abstract: We propose a 3D vision-based tissue contact force estimation framework. It integrates image/point cloud acquisition, deformation reconstruction, multimodal deep learning, and physical verification to form a closed-loop system for visual-to-force estimation. The proposed model fuses RGB images, dense point clouds, and inter-point displacements, incorporates deformation modeling for interpretability, uses the Transformer architecture for unsupervised non-rigid point cloud registration, and adopts a spatial-temporal dual-pathway structure to handle static and dynamic features separately. Experiments involve building a renal tissue 3D deformation and force dataset, with extensive tests in simulated and physical environments.
|
|
SuS4B |
Room E |
Human-Robot Interaction |
Regular Sessions |
Chair: Chen, Fei | T-Stone Robotics Institute, the Chinese University of Hong Kong |
|
16:20-16:32, Paper SuS4B.1 | |
Development of Personalized Human Digital Twin for Exercising |
|
Zou, Kehan | The University of Hong Kong |
Ma, Xin | The University of Hong Kong |
Chen, Yuetian | The University of Hong Kong |
Yang, Ping | Southern University of Science and Technology |
Wu, Xi | The Chinese University of Hong Kong |
Li, Chenzui | The Chinese University of Hong Kong |
Zhang, Yijian | The University of Hong Kong |
Huang, Jialiang | ShenZhen Academy of Robotics |
Chen, Jiangcheng | The University of Hong Kong |
Chen, Fei | T-Stone Robotics Institute, the Chinese University of Hong Kong |
Xi, Ning | The University of Hong Kong |
Keywords: Human-Machine Interface, Medical Robotics, Rehabilitation and Assistive Robotics
Abstract: Exercise requires quantitative planning and monitoring. Digital approaches offer a potential solution; however, developing subject-specific digital twin (DT) models remains challenging. Here, we propose a method to establish a personalized human digital twin for exercising. In contrast to conventional methods that rely on medical imaging, we identify personalized biomechanical parameters, i.e., maximum isometric muscle force, based on electromyographic (EMG) signals, kinetics data, and interaction forces. This allows for rapid and cost-effective creation of individualized musculoskeletal models. Application studies demonstrate that our approach can accurately model the back squat movement in five subjects, yielding a normalized RMSE of 0.2 for the ankle and hip joints, and 0.3 for the knee joints. These results indicate strong potential for personalized exercise planning and monitoring.
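The abstract reports a normalized RMSE without stating the normalization. One common convention, assumed here purely for illustration, divides the RMSE by the range of the reference signal; other conventions (for example, dividing by the mean or the standard deviation) exist and would give different numbers. The function name and sample values are hypothetical.

```python
import math


def normalized_rmse(predicted, reference):
    """RMSE between two equal-length sequences, divided by the reference range.

    Range normalization is an assumption made for this sketch; the abstract
    does not say which normalization the authors used.
    """
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference range is zero; cannot normalize")
    return math.sqrt(mse) / span


# Hypothetical joint-moment samples: each prediction is off by 1 over a range of 2.
score = normalized_rmse([1.0, 1.0], [0.0, 2.0])  # 0.5
```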
|
|
16:32-16:44, Paper SuS4B.2 | |
Human-Robot Interaction Behavior in Commercial Services: A Four-Stage Integrative Framework |
|
Tu, Yangjun | Hunan University |
Jiang, Simin | Hunan University |
Xiao, Lijun | Hunan University |
Niu, Ziqi | Hunan University |
Yang, Zhi | Hunan University |
Keywords: Human-Robot Interaction and Cooperation
Abstract: With the increasing adoption of embodied service robots in commercial service settings, there is a critical need to understand the intricate interaction behavior process between humans (customers/employees) and these robots. Drawing upon a systematic review and qualitative integration of 44 core articles, this study proposes a four-stage dynamic model of human-robot interaction behavior: (1) Contact Initiation, (2) Interaction Execution, (3) Feedback Adaptation, and (4) Outcome Evaluation. This model not only delineates the goal-oriented activities within individual service encounters but also illuminates how repeated interactions, through learning and evolution, shape long-term human-robot service relationships and users' perception of the robot's social role (e.g., tool, assistant, partner). The proposed framework advances the theoretical understanding of the dynamic process of human-robot interaction behavior within commercial service settings and offers practical guidance for service firms seeking to effectively manage human-robot interactions and enhance service performance.
|
|
16:44-16:56, Paper SuS4B.3 | |
"Accelerator" versus "Ceiling": Unpacking the Career Paradoxes of Employee-Robot Cowork and Their Impact on Career Sustainability |
|
Tu, Yangjun | Hunan University |
Chen, Jia yuan | Hunan University |
Guo, Yaqian | Peking University |
Jiang, Simin | Hunan University |
Chen, Shaoxuan | Hunan Normal University in Changsha, China |
Yang, Zhi | Hunan University |
Keywords: Human-Robot Interaction and Cooperation
Abstract: The complex paradoxical impact of integrating service robots into frontline service work on employee career sustainability is frequently overlooked. Employing an inductive qualitative method, this study draws on in-depth interviews with 11 frontline service employees in the Chinese hotel industry to uncover and theorize the core career paradoxes inherent in employee-robot cowork and the mechanisms through which they shape career sustainability. Findings reveal that robots simultaneously enact dual roles, acting as both career "accelerators" and "ceilings." This duality manifests in four interwoven core career paradoxes: (1) immediate convenience versus long-term capability constraints; (2) selective efficiency versus situational failure and the imposition of new burdens; (3) task relief versus potential career threats; and (4) instrumental dependence versus interactional absence. These paradoxes interact such that the pursuit of short-term convenience can inadvertently lead to long-term skill suppression, heightened job insecurity, and constrained potential for human-robot cowork, thereby exerting a complex influence on career sustainability. This study introduces the construct of the "employee-robot cowork career paradox," challenging simplistic assumptions about technology's impact and contributing paradox and long-term perspectives to research on career sustainability and human-robot interaction (HRI). Our findings suggest that managers must look beyond the "efficiency
|
|
16:56-17:08, Paper SuS4B.4 | |
A Method for Constructing a Dual-Arm Robot Motion Retargeting Dataset |
|
Yuan, Jiahui | Yanshan University |
Yang, Haoxin | Beijing University of Posts and Telecommunications |
Yao, Zhuofan | Yanshan University |
Xu, Wenjing | Yanshan University |
Qiao, Kai | Yanshan University |
Zhang, Yahui | Yanshan University |
Wen, Guilin | Yanshan University |
|
|
17:08-17:20, Paper SuS4B.5 | |
CL-RAG: A Closed-Loop Multimodal Retrieval-Augmented Generation Architecture for Robust Human-Robot Control Interaction |
|
Zhang, Bowen | University of Trento |
Jiang, Yuhang | University of Trento |
Hu, Lingxiang | Paris Saclay University |
Li, Dun | Tsinghua University |
Hu, Qianqian | Nanjing Agricultural University |
Keywords: Human-Robot Interaction and Cooperation, Humanoid Robots, ROS, Software System for Robotics Application
Abstract: Recent advancements in Large Language Models (LLMs) such as Claude 3.5 and LLaMA 3, paired with Retrieval-Augmented Generation (RAG), offer promising opportunities for intelligent, context-aware robotic systems. This paper proposes a modular architecture that integrates enterprise-grade RAG technologies—originally developed for corporate knowledge management—into real-time control of bipedal and interactive robots. By adapting components such as ChromaDB, Docling, and AWS Bedrock, we demonstrate how unstructured sensor data, dynamic environment cues, and human language commands can be seamlessly processed to drive physically grounded robot behaviors. A proof-of-concept implementation shows significant improvements in instruction comprehension, semantic robustness, and safety, verified in industrial and inspection scenarios.
|
|
SuPP1 |
Hall |
Poster Session |
Poster Sessions |
|
10:30-16:40, Paper SuPP1.1 | |
Hybrid A*-Bézier Optimization for 3D Path Planning in Complex Environments (I) |
|
Mingrui, Mou | University of Chinese Academy of Sciences |
Gu, Haitao | Chinese Academy of Sciences |
Keywords: Path and Motion Planning, Machine Learning, Deep Learning
Abstract: To overcome limitations of traditional A* algorithms for 3D path planning, including path roughness, node redundancy, and safety concerns, we propose a hybrid strategy integrating Enhanced Heuristic A* with adaptive Bézier optimization. The approach refines heuristic functions via neural networks to dynamically guide path exploration, reducing redundant nodes and improving search efficiency. Concurrently, it employs segmented Bézier smoothing with local adaptive control points to ensure path continuity while mitigating collision risks. Simulations confirm that in medium-to-high density obstacle environments, the method reduces indexed nodes to 1/5–1/12 of the benchmark (traditional A* + global Bézier smoothing), significantly boosting computational efficiency and safety. This provides an effective solution for complex 3D path planning of UUVs.
|
|
10:30-16:40, Paper SuPP1.2 | |
A Multimodal Sensing-Driven Smart Shoe System for Gait Phase Recognition in Exoskeleton Applications (I) |
|
Zuo, Chuanlin | University Evry Val d'Essonne, University Paris-Saclay |
Hanbing, Liu | Université Paris-Saclay, Université d'Evry Paris-Saclay |
Bencharif, Loqmane | Paris Saclay |
Ibset, Abderahim | University Paris-Saclay |
Qi, Wen | Politecnico Di Milano |
Su, Hang | Paris Saclay University |
Dychus, Eric | Sandyc |
Alfayad, Samer | Paris-Saclay Université-Evry University |
Keywords: Medical Robotics, Human-Machine Interface, Sensing, Haptic System
Abstract: To enhance the precision and naturalness of exoskeleton control, this paper proposes a novel multimodal smart shoe design for precise gait analysis. Unlike conventional gait models that rely solely on kinematic data, our system integrates foot pressure distribution and foot deformation sensing to construct a more comprehensive foot motion model. The smart shoe is equipped with inertial measurement units (IMUs), a plantar pressure sensor and bending sensors. Multimodal data was collected and analyzed to extract gait features including trajectory, pressure distribution, and foot deformation patterns. A multimodal gait analysis model was developed using sensor fusion techniques. Experimental results demonstrate that the proposed system provides a more accurate and holistic representation of foot motion, offering enhanced biomechanical and dynamic information for future exoskeleton control systems.
|
|
10:30-16:40, Paper SuPP1.3 | |
Elastic Actuation and Sensor-Fusion-Driven Adaptive Control for Wearable Lower-Limb Exoskeletons (I) |
|
Bencharif, Loqmane | Paris Saclay |
Ibset, Abderahim | University Paris-Saclay |
Zuo, Chuanlin | University Evry Val d'Essonne, University Paris-Saclay |
Hanbing, Liu | Université Paris-Saclay, Université d'Evry Paris-Saclay |
Qi, Wen | Politecnico Di Milano |
Su, Hang | Paris Saclay University |
Dychus, Eric | Sandyc |
Alfayad, Samer | Paris-Saclay Université-Evry University |
Keywords: Medical Robotics, Rehabilitation and Assistive Robotics, Intelligent Control and Systems
Abstract: This paper presents an adaptive control framework for a wearable lower-limb exoskeleton that assists sagittal-plane motion at the hip and knee joints. The system integrates sensor fusion, virtual joint elasticity, and an adaptive fuzzy PID controller to enhance tracking accuracy under dynamic and nonlinear conditions. A simulation environment evaluates controller performance across varying scenarios, while a hardware prototype demonstrates real-time trajectory tracking using classical PID control. Results confirm the feasibility of combining adaptive control and elastic actuation for robust, user-responsive assistance in wearable robotics.
|
|
10:30-16:40, Paper SuPP1.4 | |
Human-Inspired Pre-Design Optimization for Humanoid Robots in Dynamic Interaction Tasks (I) |
|
Marshoud, Abd Alrahman | Université Évry Paris-Saclay |
Sleiman, Maya | Paris Saclay |
Ait Oufroukh, Naima | University of Paris-Saclay |
Su, Hang | Paris Saclay University |
Alfayad, Samer | Paris-Saclay Université-Evry University |
Keywords: Humanoid Robots, Robot Design, Path and Motion Planning
Abstract: This study presents a novel pre-design optimization framework for humanoid robots, enabling precise alignment of robotic kinematics with human motion patterns for dynamic tasks. The aim is to reduce prototyping costs and support task-specific designs. Leveraging mechatronics expertise, the framework features an interactive interface for real-time parameter tuning, guided by metrics such as workspace coverage and inverse kinematics (IK) success rate. Applied to a 3-DOF arm that is part of an upper-body humanoid robot under development, the pre-design tool achieves 100% workspace coverage and an 85.51% IK success rate across four experiments for jabs and hooks. Motion similarity metrics were used to validate human-like performance and smoothness.
|
|
10:30-16:40, Paper SuPP1.5 | |
A Novel Hybrid Serial-Parallel Shoulder Mechanism for Humanoid Robots: Design and Workspace Analysis (I) |
|
Soukarieh, Wael | University of Evry, IBISC Laboratory |
Sleiman, Maya | Paris Saclay |
Ait Oufroukh, Naima | University of Paris-Saclay |
Su, Hang | Paris Saclay University |
Alfayad, Samer | Paris-Saclay Université-Evry University |
Keywords: Humanoid Robots, Robot Design, Human-Robot Interaction and Cooperation
Abstract: This paper presents a novel 4-DOF hybrid serial–parallel shoulder mechanism for humanoid robots, designed to enhance workspace coverage while maintaining a compact, human-like form. The mechanism is composed of two integrated substructures: a serial chain and a fully parallel subsystem. This hybrid approach addresses the challenges of achieving large pitch, yaw, and roll ranges under size and anthropomorphic constraints. The design is tailored for the HYDROiD humanoid robot, with a focus on developing "slim and smart" joints suitable for human–robot interaction. To approximate anatomical constraints, the shoulder region is modeled as a conjoined conical volume representing the upper trunk and upper arm. The kinematic model is validated through numerical workspace analysis, demonstrating the hybrid architecture’s effectiveness in expanding motion capabilities within a restricted envelope.
|
|
10:30-16:40, Paper SuPP1.6 | |
High-Fidelity Contrastive Language-State Pre-Training for Embodied Agent State Representation |
|
Huang, Fuxian | Shanghai AI Laboratory |
Zhang, Qi | Shanghai AI Lab |
Zhang, Haoran | Shanghai AI Lab |
Zhang, Tianyi | Shanghai AI Laboratory |
Zhou, Ming | Shanghai AI Laboratory |
Zhang, Jinouwen | Shanghai AI Lab |
Zhai, Shaopeng | Shanghai AI Laboratory |
Keywords: Artificial Intelligence, Deep Learning, Intelligent Control
Abstract: With the rapid development of AI, multimodal learning has become crucial, especially with multimodal large language models and embodied agents. However, the representation of the state modality still lags behind other modalities like images, videos, and language. To this end, we propose a High-Fidelity Contrastive Language-State Pre-training method (CLSP), which can accurately encode state information into representations for both embodied agents and multimodal large language models. Extensive experiments demonstrate the superior precision and generalization capabilities of our representation, achieving outstanding results in text-state retrieval, navigation tasks, and multimodal large language model understanding.
|
|
10:30-16:40, Paper SuPP1.7 | |
3DS-Plan: A Perception-Planning-Co-Design Framework to Facilitate Robot Task Planning with Open-Vocabulary 3D Scene |
|
Xue, Min | Tsinghua University |
Yu, Jincheng | Tsinghua University |
Xiu, Lingkun | Tsinghua University |
Tang, Jiahao | Tsinghua University |
Cai, Xudong | Openmind Smart Robot Co.Ltd |
Zhao, Yali | Beijing Novauto Technology Co., Ltd |
Liang, Shuang | Novauto Tech |
Wang, Yu | Tsinghua University |
Keywords: Artificial Intelligence, Robot Vision and Computer Vision
Abstract: To plan and execute robotics tasks, 3D scene perception and understanding are critical for robots to interact with complex environments effectively. Traditional systems often rely on closed vocabularies and are constrained by pre-defined object categories, which limit their flexibility. In this paper, we propose 3DS-Plan, an open-vocabulary 3D scene perception and generation model designed to facilitate long-horizon task planning. It leverages the pre-trained foundation models to support robotic scene understanding and further provides environmental details to infer actionable steps in various scenarios. We validate the effectiveness of our framework through comprehensive experiments. It demonstrates a 31.41% improvement in computational efficiency for class operations and at least a 14.52% increase in success rates for complex tasks without requiring extensive retraining or extra annotation. The corresponding project page is available at https://techpage.github.io/open3dsp/.
|
|
10:30-16:40, Paper SuPP1.8 | |
Research on Traffic Police Gesture Recognition Algorithm Based on Improved YOLOv11 |
|
Lü, Chao | School of Electronics and Information Engineering, Changchun Uni |
Sun, Zhaoying | School of Electronics and Information Engineering, Changchun Uni |
Keywords: Artificial Intelligence, Deep Learning, Intelligent Transportation Systems
Abstract: Traffic police gesture recognition is essential for intelligent traffic management, road safety, and autonomous driving. This paper proposes a novel recognition approach based on an improved YOLOv11 architecture to address challenges such as diverse gesture categories, large scale variations, and environmental interference. We propose a novel Global-Local Pooling Fusion Block to reduce model complexity while maintaining feature quality, introduce a Global Context (GC) attention mechanism to enhance focus on key gesture regions, and integrate an improved Lite-BiFPN structure for better multi-scale feature fusion. Experimental results show that our method achieves a mean average precision of 90.6%, which is 3.8% higher than the original YOLOv11. This work significantly improves detection accuracy, robustness, and real-time performance, providing strong support for intelligent driving systems under complex traffic conditions.
|
|
10:30-16:40, Paper SuPP1.9 | |
A Control Method for Wheeled Bipedal Robots Using Improved PPO and Multidimensional Curriculum Learning |
|
Fan, Shenglin | Shandong University |
Zhou, Lelai | Shandong University |
Sun, Jingyu | Shandong University |
Zhang, Yi | Shandong University |
Li, Guowei | Shandong University |
Dai, Xiaomeng | China Railway Construction Corporation Bridge Engineering Bureau |
Li, Yibin | Shandong University |
Keywords: Field Robotics, Deep Learning, Intelligent Control
Abstract: Wheeled bipedal robots offer enhanced agility, but as an underactuated system based on a wheeled inverted pendulum model, they exhibit strong nonlinearity and coupling, making traditional control methods ineffective on complex, unstructured terrain. This work proposes Adaptive Curriculum Proximal Optimization (ACPO), a deep reinforcement learning approach for locomotion control that combines an improved PPO algorithm with multidimensional curriculum learning (MCL). First, by incorporating an adaptive clipping coefficient based on KL divergence (Importance-Weighted Clipping, IWC) and a PI-KL self-adaptive learning rate (SALR) into PPO, ACPO achieves efficient exploration in early training and stable convergence in later stages. Second, we propose an MCL framework that dynamically increases and fine-tunes reward weights, action-command complexity, observation noise, terrain difficulty, and external disturbances. Finally, large-scale parallel training and comparative experiments are conducted in the NVIDIA Isaac Gym simulation environment. Ablation studies demonstrate that ACPO significantly outperforms baseline PPO in terms of cumulative reward, episode length, orientation stability, fall frequency, velocity-tracking error, and mean joint torque, exhibiting superior robustness, energy efficiency, and cross-terrain adaptability.
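As a rough illustration of the idea of a KL-driven clipping coefficient, the sketch below pairs the standard PPO clipped surrogate with a simple threshold rule for adjusting the clip range. The abstract does not give the actual IWC update, so `adapt_clip`, its thresholds, and its bounds are hypothetical placeholders rather than the authors' method.

```python
def clipped_surrogate(ratio, advantage, eps):
    """Standard PPO clipped surrogate objective for a single sample."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def adapt_clip(eps, kl, kl_target, rate=1.5, eps_min=0.05, eps_max=0.3):
    """Hypothetical rule (not from the paper): tighten the clip range when
    the measured KL divergence overshoots the target, widen it when the
    policy is changing much less than allowed."""
    if kl > 2.0 * kl_target:
        eps /= rate
    elif kl < 0.5 * kl_target:
        eps *= rate
    # Keep the coefficient inside assumed safety bounds.
    return max(eps_min, min(eps_max, eps))
```

A wide clip range early in training permits larger policy updates, while a tighter range later limits them, which is one plausible reading of "efficient exploration in early training and stable convergence in later stages".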
|
|
10:30-16:40, Paper SuPP1.10 | |
STH-SynNet: Spatio-Temporal Heterogeneity-Aware Synergistic Network for Traffic Prediction |
|
Li, Mingqiu | Changchun University of Science and Technology |
Liu, Wanting | Changchun University of Science and Technology |
Yang, Yang | Changchun University of Science and Technology |
Cong, Haifang | Changchun University of Science and Technology |
Xing, Songqi | Changchun University of Science and Technology |
Keywords: Intelligent Transportation Systems, Deep Learning, Artificial Intelligence
Abstract: High-precision traffic flow prediction plays a crucial role in optimizing intelligent transportation systems and enhancing urban operational efficiency. However, existing methods still face limitations in modeling spatial heterogeneity and multi-scale temporal dependencies within traffic networks, which restrict the expressiveness and prediction accuracy of the models. To address these challenges, this paper proposes a Spatio-Temporal Heterogeneity-aware Synergistic Network for traffic prediction, termed STH-SynNet. Built upon an encoder-decoder architecture, the framework integrates an enhanced gating unit—ST-MS-Spectral GRU—within its core module. This unit incorporates dynamic graph modeling, time-frequency decoupling, and attention-enhanced mechanisms to improve the model’s capacity to capture structural diversity and complex temporal dynamics. By employing multi-scale convolutional receptive fields with a dynamic graph generation mechanism, the model adapts to heterogeneous interactions among nodes. Additionally, a frequency masking mechanism based on Fourier Transform is introduced to disentangle periodic patterns from short-term fluctuations, while a lightweight attention-enhanced path dynamically selects and models critical historical states. Experiments on three real-world datasets demonstrate significant improvements in predictive accuracy, validating the effectiveness and robustness of the proposed method in complex traffic forecasting tasks.
|
|
10:30-16:40, Paper SuPP1.11 | |
Design and Experiment of a Distributed Tiltrotor UAV |
|
Wang, Xiaobo | Zhejiang Lab |
Keywords: Robot Design
Abstract: With the development of unmanned aerial vehicle (UAV) technology, new UAV designs have become a research hotspot. To address the short flight time and low flight speed of multirotor UAVs and the demanding takeoff and landing requirements of fixed-wing UAVs, this paper proposes a distributed tiltrotor UAV system. First, it introduces the current research progress on vertical takeoff and landing (VTOL) tiltrotor UAVs. Next, it describes the overall design and components of the distributed tiltrotor UAV. The paper then provides a detailed introduction to the avionics system design for this UAV, and to the redundant flight control system. Following that, it presents a detailed analysis of the UAV control methodology. Finally, by constructing a prototype and conducting flight tests, the feasibility of this UAV as a novel rapid VTOL platform is validated. The results indicate that this UAV can achieve stable and reliable flight, offering valuable insights for the design of future VTOL UAVs.
|
|
10:30-16:40, Paper SuPP1.12 | |
LML-GAN: Latent Generative Adversarial Network for Time-Series Heart Rate Signal Prediction |
|
Li, Mingqiu | Changchun University of Science and Technology |
Li, Fengtian | Changchun University of Science and Technology |
Yang, Yang | Changchun University of Architecture and Civil Engineering |
Lan, Tianyu | Changchun University of Science and Technology |
Liu, Wanting | Changchun University of Science and Technology |
Li, Yifeng | Changchun University of Science and Technology |
Keywords: Deep Learning, Artificial Intelligence
Abstract: Arrhythmia, as an important cardiovascular disease, poses a serious threat to global human health and life. Electrocardiograms (ECGs) provide a wealth of information for the diagnosis and treatment of cardiovascular diseases, but traditional diagnostic methods are time-consuming, labor-intensive, and unable to provide early clinical warnings. Additionally, due to the complex information contained in ECG signals and their strong nonlinear characteristics, current ECG signal temporal prediction models often exhibit high oscillatory behavior. To address these challenges, this paper proposes an ECG signal prediction model based on latent low-dimensional space generative adversarial networks—LML-GAN (Latent MIX-LSTM Generative Adversarial Network). This model decomposes high-dimensional complex ECG signals into low-dimensional simple representations for adversarial training, ultimately mapping back to the high-dimensional space to complete the prediction. It also employs a hybrid time-frequency domain stepwise supervised loss function to enhance the model's ability to extract features from time-series dynamic data. This paper proposes a novel residual model, MIX-LSTM, based on traditional LSTM for feature extraction. Experiments were conducted using the MIT-BIH dataset, and compared to similar models, it eliminates the severe oscillation issues present in other models when applied to ECG signal prediction.
|
|
10:30-16:40, Paper SuPP1.13 | |
A Decentralized Reinforcement Learning Approach for Modular Octoped Locomotion Control |
|
Li, Guowei | Shandong University |
Zhou, Lelai | Shandong University |
Sun, Jingyu | Shandong University |
Zhang, Yi | Shandong University |
Fan, Shenglin | Shandong University |
Dai, Xiaomeng | China Railway Construction Corporation Bridge Engineering Bureau |
Li, Yibin | Shandong University |
Keywords: Field Robotics, Multi-Robot Systems, Artificial Intelligence
Abstract: With the increasing deployment of quadruped robots in complex environments, reinforcement learning (RL) has shown strong potential in tasks such as gait generation and dynamic balance. The modular octoped robot is composed of two quadruped subsystems connected front and rear. It features detachable and reconfigurable capabilities, enabling coordinated motion control. However, centralized reinforcement learning frameworks struggle with modular octoped robots due to limited coordination between subsystems, high-dimensional action spaces, and poor scalability. To address these issues, this paper proposes a decentralized reinforcement learning control method that decouples the system into two quadruped subsystems, each with an independent PPO policy for local decision-making. Through local decision-making, asynchronous control, and lightweight communication, the system achieves efficient and coordinated locomotion. Experiments on the IsaacGym platform demonstrate that the decentralized approach improves motion performance and adaptability, offering a scalable and flexible solution for complex multi-robot systems.
|
|
10:30-16:40, Paper SuPP1.14 | |
A Hierarchical Control Framework for Cooperative Adaptive Cruise Control Considering FDI Attacks and Loop Delay Analysis |
|
Wang, Wenwei | Beijing Institute of Technology |
Liu, Yushan | Beijing Institute of Technology |
Cao, Wanke | Beijing Institute of Technology |
Keywords: Intelligent Transportation Systems, Intelligent Control and Systems
Abstract: Cooperative Adaptive Cruise Control (CACC) systems based on real-time vehicle-to-vehicle (V2V) communication are pivotal for enhancing traffic efficiency and safety in vehicular network environments. However, the adoption of open communication channels renders these systems more susceptible to attacks, particularly false data injection (FDI) attacks, which manipulate vehicle states and disrupt platoon stability. Firstly, the system delay under FDI attacks is meticulously analyzed based on the concept of multi-link loop delay, and its upper bound is derived. Then, a hierarchical control framework resilient to cyberattacks is proposed to address loop delays and implement FDI compensation control. The upper layer develops a model predictive controller (MPC) for decision-making and planning under uncertainties. The lower layer employs an H∞ controller combined with a linear quadratic regulator (LQR) to mitigate the effects of loop delays and provide reliable acceleration tracking control. Finally, the effectiveness of the proposed method is validated through comprehensive hardware-in-the-loop testing.
|
|
10:30-16:40, Paper SuPP1.15 | |
AttSAM: Attention-Augmented Segment Anything Model for Accurate Polyp Segmentation |
|
Lan, Lixiang | Changchun University of Science and Technology |
Yang, Yang | Changchun University of Science and Technology |
Zhao, Guangyu | Changchun University of Science and Technology |
Li, Yifeng | Changchun University of Science and Technology |
Liu, Wanting | Changchun University of Science and Technology |
Wang, Jikui | Changchun Shikai Technology Industry Co., Ltd |
Keywords: Deep Learning, Medical Robotics, Robot Vision and Computer Vision
Abstract: Polyp segmentation is crucial in the diagnosis of colorectal cancer. The introduction of the Segment Anything Model (SAM) provides powerful pretraining capabilities for polyp segmentation, but it faces two main challenges when applied to endoscopic images: first, its Transformer-based architecture tends to overlook local details, leading to feature bias; second, its performance on out-of-distribution (OOD) data is suboptimal, affecting prediction accuracy and confidence estimation. To address these issues, we propose an enhanced approach based on SAM that incorporates a Local Feature Enhancement Module (LFEM) and a Channel Attention Enhancement Module (CAEM). LFEM improves the capture of high-frequency information by enhancing local detail features, playing a crucial role in extracting polyp edges and textures. CAEM introduces a channel attention mechanism that dynamically adjusts the weight of feature channels, thereby enhancing the model’s sensitivity and generalization ability. Additionally, we draw on the design principles of the Cross-Branch Feature Enhancement module and the Uncertainty-Guided Prediction Regularization (UPR) module to further improve SAM’s performance in multi-scale feature extraction and on out-of-distribution data. Experimental results demonstrate that the inclusion of LFEM and CAEM significantly improves the model’s segmentation accuracy on multiple public polyp datasets, particularly excelling in complex background scenarios.
|
|
10:30-16:40, Paper SuPP1.16 | |
FreezeAdaCRAFT Point Cloud Network* |
|
Lin, Siyu | Changchun University of Science and Technology |
Yang, Yang | Changchun University of Science and Technology |
Li, Mingqiu | Changchun University of Science and Technology |
Xing, Songqi | Changchun University of Science and Technology |
Lan, Lixiang | Changchun University of Science and Technology |
|
|
10:30-16:40, Paper SuPP1.17 | |
Research on Robotic Arm Trajectory Planning Method Based on Dual Cameras Guidance without Common Field of View |
|
Li, Mingyang | Changchun University of Technology |
Chenyu, Liu | Electrical and Electronic Engineering, Changchun University of T |
Wu, Hongying | Electrical and Electronic Engineering, Changchun University of T |
Jiang, Changhong | School of Electrical and Electronic Engineering, Changchun Unive |
Xie, Mujun | Changchun University of Technology |
Keywords: Robot Vision and Computer Vision, Path and Motion Planning, Industrial Robotics and Factory Automation
Abstract: To automate the assembly of large workpieces and the loading of large objects, the position of the target object is measured with two cooperating cameras. To address robotic arm trajectory planning under dual-camera vision guidance without a common field of view, a four-degree-of-freedom robotic arm system with dual-camera vision measurement is first constructed and the kinematic model of the robotic arm is established. Image processing techniques are then used to recognize, segment, and ellipse-fit the cooperative target image to obtain the pixel coordinates of the cooperative target, and the spatial position of the cooperative target in the robotic arm base coordinate system is calculated from the transformations between the coordinate systems. Finally, based on the measured position of the cooperative target, the joint trajectories of the robotic arm are planned by iterative optimization with the Constrained Optimization BY Linear Approximations (COBYLA) algorithm, enabling the robotic arm to position itself accurately at the cooperative target.
|
|
10:30-16:40, Paper SuPP1.18 | |
Research on Multi-Robot Collision Detection Strategy in Complex Environment |
|
Xie, Shuxin | Soochow University |
Cao, Zhimin | Suzhou City University |
Keywords: Multi-Robot Systems, Path and Motion Planning
Abstract: Collision detection is one of the key technologies for realizing multi-robot motion planning under environmental constraints. To reduce computation time and improve detection efficiency, this paper proposes a task-decomposition-based 3-stage collision detection strategy, which includes a collision detection rigid body matrix, a coarse collision detection stage, and a precise collision detection stage. To verify the superior performance of the proposed strategy, the widely used Bullet collision detection algorithm was selected as a comparison benchmark. The experimental results demonstrate that the proposed method significantly optimizes the collision detection process, greatly reduces execution time, and enhances overall detection efficiency.
|
|
10:30-16:40, Paper SuPP1.19 | |
A Dexterous Manipulator Driven by Wire and Pulleys |
|
Su, Baiquan | Beijing University of Posts and Telecommunications |
Wang, Junchen | Beihang University |
Chen, Anqi | Beijing University of Posts and Telecommunications |
Hu, Yida | Harvard Medical School |
Xie, Zhen | National University of Singapore |
|
10:30-16:40, Paper SuPP1.20 | |
Dynamic Obstacle Avoidance Control Method for Unmanned Surface Vessels Integrating Trajectory Prediction and Rule-Based Guidance (I) |
|
Shen, Weilong | Shenyang University of Technology |
Feng, Dongying | Guangzhou Institute of Industrial Intelligence |
Zang, Chuanzhi | Shenyang University of Technology |
Yuan, Mingzhe | Shenyang Institute of Automation, Chinese Academy of Sciences |
Xiao, Jinchao | Guangzhou Institute of Industrial Intelligence |
Keywords: Path and Motion Planning, Intelligent Transportation Systems, Mobile Robotics
Abstract: To address the challenges posed by unpredictable dynamic obstacle trajectories, rule-inconsistent avoidance behaviors, and the lack of motion feasibility in traditional path planning, this paper proposes a path planning method that integrates trajectory prediction, rule-guided avoidance, and dynamic window control. First, a lightweight trajectory prediction framework based on polynomial curve fitting is designed. By leveraging a sliding window mechanism, the future motion trends of obstacle vessels are dynamically modeled, providing stable and reliable foresight information with low computational cost. Second, typical encounter scenarios defined in the International Regulations for Preventing Collisions at Sea (COLREGs) are incorporated to reconstruct the repulsive force direction mechanism in the Artificial Potential Field (APF) method, ensuring that avoidance behaviors comply with maritime rules and improving the rationality of avoidance directions. Finally, to address the limitations of the APF method in handling motion constraints, which often lead to excessively large avoidance angles, the Dynamic Window Approach (DWA) is introduced to ensure smoother and more feasible steering control. The predicted obstacle trajectories and the rule-compliant recommended heading are incorporated into the trajectory evaluation function, forming a closed-loop control strategy with both control feasibility and multi-objective guidance capability. Simulation and experimental results demons
|
|
10:30-16:40, Paper SuPP1.21 | |
Motion Control Methods for Dual Wheel-Legged Robot |
|
Che, Yufei | Nanchang Hangkong University |
Chen, Zhihua | Nanchang Hangkong University |
Jiang, ZhiFan | Nanchang Hangkong University |
Wang, Xiuwen | Beijing Institute of Mechanical Equipment |
Huang, Xilong | Nanchang Hangkong University |
Cheng, Shan | Nanchang Hangkong University |
Keywords: Mobile Robotics
Abstract: The dual-wheel-legged robot combines the efficiency of wheeled mobility with the obstacle-crossing capability of a legged structure, offering significant advantages in unstable terrain. However, dynamic balance control of the dual-wheel-legged robot on highly unstructured terrain remains a significant challenge. This paper analyzes a five-bar linkage mechanism and introduces virtual model control (VMC) to decouple the robot into two components: body balance motion and leg motion. A linear quadratic regulator (LQR) is designed for real-time balance control, with the robot's state space divided according to leg length. Under external disturbances, Linear Active Disturbance Rejection Control (ADRC) is used to control the robot's legs, and this approach is compared in simulation with PID control of the legs. The simulation results show that the control method based on LQR and ADRC enables the robot to successfully navigate obstacles, with smaller attitude-angle and leg-length errors than the control method based on LQR and PID.
|
|
10:30-16:40, Paper SuPP1.22 | |
Seam-Cutting Image Stitching Method Based on DM-Net and SU-Net |
|
Liu, Qi | Shenyang Jianzhu University |
Guo, Song | Shenyang Jianzhu University |
Keywords: Robot Vision and Computer Vision, Deep Learning, Artificial Intelligence
Abstract: Traditional image stitching methods tend to rely on complex and salient geometric features to achieve better stitching results. However, these features are only effective in specific natural scenes with sufficient geometric structures. In contrast, deep learning-based stitching approaches overcome adverse conditions by adaptively learning robust semantic features. Nevertheless, in scenarios with low or highly complex feature content, these methods may fail due to insufficient feature extraction, leading to errors. Moreover, conventional seam-finding methods do not consider the optimal positioning of seams during the stitching process, which can result in misalignment caused by poor seam placement. To address these issues, we propose a novel unsupervised feature extraction method (DM-Net) combined with an optimal seam selection strategy (SU-Net) for image stitching. Our approach leverages a time-series algorithm to effectively fuse local and global features, thereby enhancing the correlation between them and ensuring a global perspective of the feature relationships. During stitching, the optimal seam is determined by sharing features between image pairs. Experimental results demonstrate that the proposed method outperforms other seam-cutting stitching techniques.
|
|
10:30-16:40, Paper SuPP1.23 | |
An Adaptive Reinforcement Learning Path Planning Method for Indoor Complex Dynamic Environments |
|
Xing, Songqi | Changchun University of Science and Technology |
Yang, Yang | Changchun University of Science and Technology |
Liu, Wanting | Changchun University of Science and Technology |
Lin, Siyu | Changchun University of Science and Technology |
Keywords: Path and Motion Planning, Machine Learning, Artificial Intelligence
Abstract: Path planning in indoor complex dynamic environments faces challenges such as frequent obstacle movements, high environmental uncertainty, and difficulty for agents to achieve stable convergence. Existing algorithms have limitations in adaptability, convergence speed, and path quality. To address these issues, this paper proposes an Adaptive Prioritized Experience Double Deep Q-Network Path Optimization method (APE-DPO), which integrates dynamic perturbation optimization, adaptive exploration rate adjustment, and a prioritized experience replay mechanism to enhance the agent's responsiveness to dynamic environments and improve policy stability. Experiments conducted in three representative dynamic scenarios—home, restaurant, and library—demonstrate that APE-DPO outperforms representative reinforcement learning models in the path planning domain, including DQN, DDQN, and their variants, in terms of path length, success rate, and convergence speed. In some scenarios, path length is reduced by 6.9% to 12.5%, showing strong robustness and practical application potential.
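The two standard ingredients named in the method, Double DQN targets and prioritized experience replay, can be sketched as follows. This is a minimal illustration of the textbook formulations only. The function names, the priority exponent `alpha`, and the small constant `eps` are assumptions, and the paper's dynamic perturbation and adaptive exploration components are not represented.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


def priority_probs(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: sampling probability grows with |TD error|.
    alpha and eps are assumed values, not taken from the paper."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]
```

Decoupling action selection from action evaluation reduces the overestimation bias of plain DQN, and prioritized sampling replays high-error transitions more often.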
|
|
10:30-16:40, Paper SuPP1.24 | |
A CBAM-ResNet Based PPO Framework for Safe Navigation in Dynamic Pedestrian Environments |
|
Tian, Haoran | Changchun University of Science and Technology |
Yang, Yang | Changchun University of Science and Technology |
Meng, Jin | Changchun University of Science and Technology |
Wang, Shifeng | Changchun University of Science and Technology |
Xing, Songqi | Changchun University of Science and Technology |
Cong, Haifang | Changchun University of Science and Technology |
Keywords: Path and Motion Planning, SLAM and Navigation, Sensor Networks
Abstract: Safe and efficient navigation in dynamic pedestrian environments remains a significant challenge for mobile robots, particularly due to rapidly changing scenarios and complex multi-agent interactions. This paper proposes CBRN-PPO, a deep reinforcement learning-based navigation framework designed to address these challenges. The framework incorporates a ResNet-based feature extractor enhanced with the Convolutional Block Attention Module (CBAM) to improve both perception and decision-making. By fusing multimodal inputs—including LiDAR scans, pedestrian velocity maps, and target direction vectors—the model effectively captures rich semantic representations of dynamic obstacles. The CBAM module adaptively emphasizes critical spatial and channel-wise features, enhancing robustness in densely populated environments. Experimental results demonstrate that in scenarios with 35 pedestrians, CBRN-PPO achieves a 93% success rate in obstacle avoidance, outperforming the A1-RD baseline by 15%. Furthermore, it improves path efficiency by 38% and average navigation speed by 34%. These results highlight the effectiveness and robustness of CBRN-PPO as a navigation solution for autonomous mobile robots operating in complex dynamic environments.
|
|
10:30-16:40, Paper SuPP1.25 | |
Geometry-Enhanced Multi-Level Dynamic Disparity Estimation and Virtual View Synthesis Method for Stereo Vision |
|
Pan, Zeqian | Changchun University of Science and Technology |
Piao, Yan | Changchun University of Science and Technology |
Keywords: Deep Learning, Robot Vision and Computer Vision
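Abstract: Accurate disparity map estimation is crucial for computer vision tasks such as virtual view generation and 3D modeling. To address the low accuracy of disparity estimation in textureless, repetitive-texture, and occluded regions, as well as the artifacts observed in views generated by existing methods, this paper proposes a geometric-feature-enhanced multi-level dynamic disparity estimation and virtual view generation method. The method extracts multi-scale regional features from the source image using dilated convolutions with different dilation rates, and includes a feature pyramid network (FPN) and a multi-scale geometric enhancement module from the reference image. In the fusion stage, gradient ...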
|
|
10:30-16:40, Paper SuPP1.26 | |
Research on Multi UAV Path Planning Based on Deep Reinforcement Learning |
|
Wu, Wei | Changchun University of Science and Technology |
Li, MingQiu | Changchun University of Science and Technology |
Keywords: Path and Motion Planning, Deep Learning, Multi-Robot Systems
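Abstract: The low cost and convenience of unmanned aerial vehicle (UAV) technology have led to its wide application and made it a key research focus in the strategic goals of many countries. Previous UAV path planning methods suffer from low efficiency and slow speed. Although deep reinforcement learning has achieved many results in UAV path planning, problems such as long training time and low planning accuracy remain. To solve these problems, we propose an improved Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm (improved artificial potential field TD3: APF-TD3) for UAV path planning. First, this paper takes the combined value of the artificial potential field and the virtual potential field to the twin ...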
|
|
10:30-16:40, Paper SuPP1.27 | |
Development of a Multi-Modal Control Architecture for a Cable-Actuated Ankle Exoskeleton |
|
Largeteau, Etienne | University D'Évry |
Conte, Bangaly | IBISC Laboratory, University of Paris Saclay |
Su, Hang | Paris Saclay University |
Bruneau, Olivier | ENS CACHAN |
Alfayad, Samer | Paris-Saclay Universit -Evry University |
Keywords: Human-Robot Interaction and Cooperation, Medical Robotics, Home and Personal Robot Systems
Abstract: This paper presents a lightweight, cable-actuated ankle exoskeleton featuring hip-mounted motors and a multimodal sensing platform for real-time gait-synchronized assistance. To minimize distal inertia and improve user comfort, the system integrates off-limb actuation with plantar-pressure and inertial sensing embedded in a smart insole. A dual-loop control framework—comprising a PID-based position controller and an inner PI current loop—ensures smooth, biologically inspired torque generation aligned with gait phases. Experimental validation demonstrates sub-degree joint-angle tracking, stable torque regulation, and rapid convergence without overshoot or oscillations. These results confirm the feasibility of the proposed system as a reliable and precise assistive device for rehabilitation and daily mobility, with potential for further enhancement through adaptive trajectory generation and energy-efficient control strategies.
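The dual-loop structure described here, an outer PID position controller feeding an inner PI current loop, can be sketched as a cascade in which the outer loop's output becomes the inner loop's setpoint. The class names, gains, and signal names below are illustrative assumptions, not the authors' implementation.

```python
class PI:
    """Proportional-integral controller."""

    def __init__(self, kp, ki):
        self.kp, self.ki, self.integral = kp, ki, 0.0

    def step(self, error, dt):
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


class PID(PI):
    """PI controller extended with a derivative term on the error."""

    def __init__(self, kp, ki, kd):
        super().__init__(kp, ki)
        self.kd, self.prev_error = kd, None

    def step(self, error, dt):
        # No derivative contribution on the first sample.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return super().step(error, dt) + self.kd * derivative


def cascade_step(position_loop, current_loop, angle_ref, angle, current, dt):
    """Outer loop turns angle error into a current setpoint; the inner loop
    turns current error into a voltage command (assumed actuator interface)."""
    current_ref = position_loop.step(angle_ref - angle, dt)
    voltage = current_loop.step(current_ref - current, dt)
    return current_ref, voltage
```

In practice the inner loop would typically run at a higher rate than the outer one, and both outputs would be saturated. Those details are omitted here.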
|
|
10:30-16:40, Paper SuPP1.28 | |
Self-Supervised Monocular Depth Estimation Using Temporal Convolution |
|
Lin, Pengfei | Changchun University of Science and Technology |
Wang, Yu | Changchun University of Science and Technology |
Wang, Lu | Changchun University of Science and Technology |
Keywords: Deep Learning
Abstract: In recent years, self-supervised monocular depth estimation has increasingly attracted extensive attention from scholars due to its lack of reliance on labeled data, adaptability, low cost, ability to combine with pose estimation, and continuous learning capability. However, existing self-supervised monocular depth estimation methods often produce ambiguous depth predictions for moving targets in dynamic scenes by ignoring temporal continuity. Therefore, this paper proposes an improved architecture based on Lite-Mono. We introduce a channel-space attention mechanism into the continuous expansion convolution module to enhance the feature response to the edges of dynamic objects and untextured regions, thereby reducing depth blurring. We design a multiscale feature storage unit to cache the multiscale feature maps of consecutive frames in a sliding window and generate timing-enhanced features through cross-frame fusion. We construct a hierarchical temporal convolution network, stacking causal convolutional layers with an increasing dilation rate, to aggregate long and short-term temporal contexts and capture the continuity of motion trajectories of dynamic objects. The decoder fuses temporal features with original spatial features to generate motion-consistent multiscale depth maps. Through extensive experiments, we compare the proposed method with the benchmark model Lite-Mono.
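To make the "stacked causal convolutional layers with an increasing dilation rate" concrete, here is a minimal one-dimensional sketch in plain Python. It shows only the causal, dilated indexing. The function names are hypothetical, and the nonlinearities, channels, and learned weights of a real temporal convolution network are left out.

```python
def causal_dilated_conv1d(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], with zeros before t = 0,
    so each output depends only on current and past samples."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def temporal_stack(x, kernels, dilations):
    """Apply causal layers in sequence; growing dilations widen the
    receptive field without adding parameters per layer."""
    for kernel, d in zip(kernels, dilations):
        x = causal_dilated_conv1d(x, kernel, d)
    return x
```

With two-tap kernels and dilations 1 and 2, an impulse at the first frame influences four consecutive outputs, which illustrates how stacking extends the temporal context.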
|
|
10:30-16:40, Paper SuPP1.29 | |
A Novel Data-Driven Visualization and Analysis Framework for Embedded Robotic Firmware (I) |
|
Yermakov, Elia | Paris Saclay |
Ghandour, Maysoon | Université Paris Saclay |
Su, Hang | Paris Saclay University |
Alfayad, Samer | Paris-Saclay Universit -Evry University |
Keywords: Field Robotics, Humanoid Robots, ROS, Software System for Robotics Application
Abstract: The growing complexity of embedded firmware in robotics and automation, including mobile robots, industrial actuators, and autonomous systems, presents significant challenges related to maintainability, debugging, and efficient developer onboarding. Traditional documentation methods frequently become outdated and insufficient, which complicates code comprehension and increases the likelihood of errors. At the same time, current tools often provide only static or fragmented documentation without the interactive, real-time insights required for effective development. To address these limitations, we present a unified visualization and documentation approach based on the open-source tools Doxygen and Graphviz. When applied to a sophisticated embedded control board designed for robotic applications, this method generates interactive documentation, call graphs, and module dependency diagrams that enhance code comprehension, simplify debugging, and accelerate onboarding. In addition, the integration supports targeted optimization by highlighting performance bottlenecks and areas of excessive code complexity, thereby guiding data-driven refactoring. Overall, our findings demonstrate that automated, interactive visualization significantly improves maintainability and development efficiency in embedded software.
|
| |