WRC SARA 2024 Program | Friday August 23, 2024


FrOPOP	Room B
Opening Ceremony	Plenary Sessions
Chair: Wang, Zhidong	Chiba Institute of Technology


FrPL1PL	Room B
Plenary Talk I: Remote Magnetic Navigation at Clinical Scales for Micro and Nano Robots by Prof. Bradley J. Nelson, ETH Zurich	Plenary Sessions
Chair: Kosuge, Kazuhiro	The University of Hong Kong


FrPL2PL	Room T1
Plenary Talk II: Vision-Based Tactile Sensing for Robot Manipulation and Learning by Prof. Michael Yu Wang, Great Bay University	Plenary Sessions
Chair: Wang, Zhidong	Chiba Institute of Technology


FrKNKN	Room B
Keynote Talk: The Importance of Policy Representation by Prof. Jens Kober, TU Delft	Plenary Sessions
Chair: Fatikow, Sergej	OFFIS e.V. - Research Institute


FrS1A	Room B
Award Session	Regular Sessions
Chair: Wang, Zhidong	Chiba Institute of Technology

11:05-11:17, Paper FrS1A.1
Enhancing the Adaptability of Hexapod Robots Via Multi-Agent Reinforcement Learning and Value Function Decomposition

Zhan, Weishu	Dartmouth College
Ray, Laura	Dartmouth College
Keywords: Intelligent Control, Intelligent Control and Systems, Multi-Robot Systems Abstract: This paper applies the value function decomposition algorithm, QPLEX, to hexapod robot motion control. The QPLEX architecture is designed for multi-agent reinforcement learning, and in the case of the hexapod robot, each leg is treated as a separate agent. The QPLEX algorithm is trained using reinforcement learning in a simulated environment to achieve stable and efficient locomotion. The architecture outperforms a fully-decentralized baseline architecture in achieving walking behavior with rapid learning convergence and results in well-coordinated, rhythmic gaits. The study examines the generalization of the trained controllers to uneven terrain conditions and shows that the QPLEX architecture is more robust and performs significantly better than the fully-decentralized approach on uneven terrain. The importance of terrain curriculum learning is also evaluated.

11:17-11:29, Paper FrS1A.2
Deep Reinforcement Learning-Based Path Planning with Dynamic Collision Probability for Mobile Robots

Tariq, Muhammad Taha	Nanjing University of Aeronautics and Astronautics
Wang, Congqing	Nanjing University of Aeronautics and Astronautics
Hussain, Yasir	Nanjing University of Aeronautics and Astronautics
Keywords: Mobile Robotics, Path and Motion Planning, Deep Learning Abstract: This study proposes a new approach for path planning and collision avoidance in mobile robots by considering the Collision Probability (CP) combined with the Soft Actor-Critic Lagrangian (SAC-L) framework to ensure safety and efficiency in both static and dynamic environments. Our approach is well suited for situations that vary from navigating static obstacles to dealing with environments filled with high-speed dynamic obstacles that closely mimic real-world unpredictable behavior. The proposed SAC-L(CP) aims to minimize the expected total cost, which is a combination of both the negative rewards and the collision costs model. The framework’s efficiency is confirmed through extensive simulations on the Gazebo platform involving three increasingly difficult scenarios. Our approach demonstrates enhanced adaptability and operational efficiency compared to a well-established baseline. Further, it exhibits enhanced performance compared to existing Deep reinforcement learning (DRL) approaches, achieving optimal performance in all environments without collisions. It also demonstrates significant improvements in social and ego safety scores. Our work contributes to developing autonomous navigation for mobile robots in complex and dynamic environments resembling real-world scenarios. Integrating CP estimation with the SAC-L framework marks a pivotal step toward safer and more precise autonomous navigation and opens new avenues for future research in mobile robotics.

11:29-11:41, Paper FrS1A.3
Continuous Motion Planning for Mobile Robots Using Fuzzy Deep Reinforcement Learning

Wu, Fenghua	Nanyang Technological University
Tang, Wenbing	East China Normal University
Zhou, Yuan	Zhejiang Sci-Tech University
Lin, ShangWei	Nanyang Technological University
Liu, Yang	Nanyang Technological University
Ding, Zuohua	Zhejiang Sci-Tech University
Keywords: Mobile Robotics, Path and Motion Planning, Machine Learning Abstract: Deep reinforcement learning (DRL) has emerged as an efficient approach to motion planning for mobile robots, as it does not require a large amount of labeled training data. Current DRL-based motion planning methods can be divided into two categories: DRL with discrete action spaces and DRL with continuous action spaces. The former aims to select the best action from a predefined action space but suffers from high computational costs. The latter computes actions directly but may encounter local optima, leading to motion failures. To mitigate these limitations, we propose a novel motion planning method for mobile robots that integrates Deep Deterministic Policy Gradient (DDPG) with fuzzy logic. The Actor and Critic networks in our DDPG framework feature an architecture incorporating a Long Short-Term Memory (LSTM) network to handle varying numbers of obstacles in the environment. To address local optima in continuous DRL, we utilize fuzzification to discretize continuous actions into a small number of fuzzy sets and defuzzification to generate continuous actions. Consequently, the Actor generates fuzzy membership degrees, and the defuzzification process maps these degrees into continuous actions. In this way, our method leverages low-dimensional discrete DRL to ensure task completion. The results show that our method can generate collision-free and smooth motion trajectories in real time and improve motion planning performance significantly.
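The defuzzification step described in this abstract can be illustrated with a minimal sketch. It assumes a weighted-average (height) defuzzifier over a small number of fuzzy sets. The names `triangular` and `defuzzify`, the triangular membership shape, and the set centers are hypothetical choices for illustration, not the paper's actual design.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def defuzzify(memberships, centers):
    """Map membership degrees to one continuous action.

    Uses a weighted average of the fuzzy-set centers; this is an
    assumed defuzzifier, not necessarily the one used in the paper.
    """
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("all membership degrees are zero")
    return sum(m * c for m, c in zip(memberships, centers)) / total


# Hypothetical usage: three fuzzy sets for angular velocity
# ("turn left", "straight", "turn right") centered at -1, 0 and 1.
action = defuzzify([0.2, 0.5, 0.3], [-1.0, 0.0, 1.0])
```

In this reading, the Actor only has to produce a few membership degrees, while the executed action remains continuous.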

11:41-11:53, Paper FrS1A.4
An Optimized Variational Bayesian Point Set Matching Approach Using Coordinate Ascent and Simulated Annealing

Luan, Fangjun	Shenyang Jianzhu University, School of Information and Control E
Liu, Shitong	Shenyang Jianzhu University
Yuan, Shuai	Shenyang Jianzhu University
Keywords: Robot Vision and Computer Vision, Artificial Intelligence, Machine Learning Abstract: To address outlier interference in point set matching, we propose an optimized variational Bayesian point set matching algorithm using coordinate ascent and simulated annealing. For the complex terms in the objective function, we introduce a coordinate ascent strategy and use a stochastic approximation based on Monte Carlo integration, together with the control variate method, to directly optimize the parameters. To enhance the generalization ability of the algorithm and avoid falling into local optima, we use a simulated annealing scheme to optimize the spatial mapping of the point set. Finally, affine and non-rigid point set matching experiments show that our algorithm achieves good matching performance and high accuracy.

11:53-12:05, Paper FrS1A.5
Optimal Viewpoints Selection for Monitoring Teleoperators Based on Event Artificial Potential Method

Liu, Xinyu	The University of Hong Kong
Ye, Jiajie	The University of Hong Kong
Xue, Yuxuan	The University of Hong Kong
Sheng, Yongji	The University of Hong Kong
Wang, Yichen	The University of Hong Kong
Lin, Jianfeng	The University of Hong Kong
Song, Zekun	The University of Hong Kong
Chen, Jiangcheng	The University of Hong Kong
Xi, Ning	The University of Hong Kong
Keywords: Tele-Robotics/Networked, Cloud Robotics, Mobile Robotics Abstract: The visual monitoring performance of teleoperated robots in complex environments has always been challenging, and existing viewpoint selection methods such as free and first-person viewpoints are inadequate. The free viewpoint provides a high degree of control freedom but increases the cognitive load of the operator, while the first-person viewpoint has a limited field of view, making it difficult to obtain a global perception. In this paper, we propose an automatic viewpoint selection method based on artificial potential field (APF). By setting gravity and repulsion fields around important points and obstacles, the camera robot can adaptively adjust its viewpoint to ensure that the operator always obtains the best view for task execution. Experiments were designed for a mobile robot navigating a sanitization task, and the method was compared with existing methods. The results show that the proposed method excels in objective indicators such as task completion time, while significantly reducing the operator's emotional fluctuation and cognitive load, providing technical support for realizing efficient, comfortable, and low-risk teleoperation.
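The artificial potential field idea in this abstract can be sketched as one gradient step on a 2D camera position. The sketch assumes the textbook quadratic attractive potential and the classical inverse-distance repulsive potential. The function name `apf_step`, the gains `k_att` and `k_rep`, the influence radius `d0`, and the step size are hypothetical, and the paper's event-based formulation is not reproduced here.

```python
import math


def apf_step(cam, targets, obstacles, k_att=1.0, k_rep=1.0, d0=2.0, step=0.1):
    """Move a 2D camera position one step along the total APF force.

    Attraction pulls toward each important point; repulsion pushes away
    from each obstacle closer than d0. All parameters are illustrative.
    """
    fx = fy = 0.0
    for tx, ty in targets:
        # Force from U_att = 0.5 * k_att * ||cam - target||^2
        fx += k_att * (tx - cam[0])
        fy += k_att * (ty - cam[1])
    for ox, oy in obstacles:
        dx, dy = cam[0] - ox, cam[1] - oy
        d = math.hypot(dx, dy)
        if 0.0 < d < d0:
            # Force from U_rep = 0.5 * k_rep * (1/d - 1/d0)^2
            mag = k_rep * (1.0 / d - 1.0 / d0) / d ** 2
            fx += mag * dx / d
            fy += mag * dy / d
    return (cam[0] + step * fx, cam[1] + step * fy)


# Hypothetical usage: one important point, one nearby obstacle.
new_cam = apf_step((0.0, 0.0), [(1.0, 0.0)], [(0.5, 0.0)])
```

Iterating this step moves the camera toward a position that balances proximity to the points of interest against clearance from obstacles.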

12:05-12:17, Paper FrS1A.6
Humanoid Grasping with Multi-Finger Dexterous Hands Based on DM-KMP

Wang, Jiashuai	Shandong University
Li, Ke	School of Control Science and Engineering, Shandong University
Keywords: Intelligent Control, Rehabilitation and Assistive Robotics Abstract: Humanoid grasping plays a crucial role in robotic manipulation. Currently, most humanoid grasping methods exhibit limitations in control and generalization when handling complex and diverse grasping tasks. In this study, we propose a novel humanoid grasping control method based on dynamic movement primitives (DMP) and kernelized movement primitives (KMP), namely DM-KMP. We conducted an experiment involving the grasping of 12 different objects using a multi-finger dexterous hand. The results showed that the DM-KMP method achieved a success rate of 90.8%, significantly higher than the 77.5% success rate of the traditional KMP method and the 81.7% success rate of the DMP method. These findings indicate that the proposed DM-KMP method not only improves the grasping performance of multi-finger robotic hands but also offers flexibility in modulating the reference trajectories, making it more suitable for complex and varied grasping tasks. This method may improve the performance of grasping with multi-finger dexterous hands and can be applied in a variety of applications, such as industrial or household manipulation tasks.


FrS1B	Room A
Brain-Machine Interface & Human-Robot Interaction and Cooperation	Regular Sessions
Chair: Xie, Ping	Yanshan University

11:05-11:17, Paper FrS1B.1
Bayesian Optimization Based Dempster-Shafer Fusion for Brain-Robot Cooperation to Navigate a Mobile Robot

Han, Jixin	Yanshan University
Gao, Chen	Yanshan University
Cheng, Jian	Yanshan University
Chen, Lin	Yanshan University
Zhao, Jing	Yanshan University
Keywords: Brain-Machine Interface, Human-Robot Interaction and Cooperation, Machine Learning Abstract: One critical problem in brain-robot cooperation is how to efficiently fuse human and machine decisions to enhance overall performance. In this paper, a Bayesian optimization based Dempster-Shafer (BO-DS) method was designed to integrate decisions from both the human brain and the machine. Decisions on every step of the robot’s movement were made by a brain-computer interface (BCI) model and an intelligent machine (IM) model. The BO-DS estimated basic probability assignments (BPAs) of the two models and assigned exponential weights to them. The optimal weights were selected using a Bayesian optimization approach. The BO-DS fused the weighted BPAs based on D-S theory to make the final decision. The experimental results showed that, compared with using only the BCI model (84.9%), the proposed BO-DS achieved a higher average accuracy of 89.01% in offline validation.
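The fusion step in this abstract can be sketched with Dempster's rule of combination. The sketch assumes every focal element is a single movement command, and that "exponential weights" means raising each mass to a power and renormalizing. Both are assumptions for illustration, as are the function names and the example masses. The Bayesian optimization of the weights is not shown.

```python
def weight_bpa(bpa, w):
    """Raise each mass to the power w and renormalize (assumed weighting form)."""
    powered = {h: m ** w for h, m in bpa.items()}
    total = sum(powered.values())
    return {h: v / total for h, v in powered.items()}


def dempster_combine(m1, m2):
    """Dempster's rule for two BPAs whose focal elements are all singletons."""
    agreement = {h: m1[h] * m2.get(h, 0.0) for h in m1}
    norm = sum(agreement.values())  # equals 1 - K, where K is the conflict mass
    if norm == 0.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {h: v / norm for h, v in agreement.items()}


def fuse(bci, im, w_bci, w_im):
    """Return the fused decision and the fused BPA."""
    fused = dempster_combine(weight_bpa(bci, w_bci), weight_bpa(im, w_im))
    return max(fused, key=fused.get), fused


# Hypothetical BPAs over three movement commands.
bci = {"left": 0.6, "right": 0.3, "forward": 0.1}
im = {"left": 0.2, "right": 0.7, "forward": 0.1}
decision, fused = fuse(bci, im, w_bci=1.0, w_im=1.0)
```

With this weighting form, a weight of 0 flattens a source to a uniform BPA so that it no longer influences the fused decision, while larger weights sharpen it.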

11:17-11:29, Paper FrS1B.2
A Pain Assessment Method Based on Decoding Local Field Potential Signals

Cui, Yunlong	Yanshan University
Zhang, Baojia	Yanshan University
Gao, Peijie	Yanshan University
Zhao, Jing	Yanshan University
Keywords: Brain-Machine Interface, Machine Learning Abstract: Pain is a highly subjective sensation. To achieve more accurate assessment of pain signals in clinical settings, this study proposes a pain assessment method based on time-frequency-space features (PA-TFS). Local field potential (LFP) signals were recorded from four rats during pain experiments. The method involves filter bank analysis, time window decomposition, extraction of spatiotemporal features, and machine learning classification to decode pain signals. Experimental results demonstrate that the PA-TFS method achieves the highest accuracy in decoding pain signals, showing significant advantages over other methods. Additionally, the study explores the impact of data collection time and number of trials on decoding performance, revealing that the PA-TFS method maintains high accuracy across different collection times and trial counts. This research provides a new method and perspective for the quantitative assessment of pain signals, with promising clinical applications.

11:29-11:41, Paper FrS1B.3
Classification of Five Control/Non-Control States to Enhance Asynchronous Brain-Computer Interfaces

Liu, Xueshuo	Yanshan University
Zhang, Qian	Yanshan University
Li, Jiaxin	Yanshan University
Wang, Hantao	Yanshan University
Zhao, Jing	Yanshan University
Keywords: Brain-Machine Interface, Machine Learning Abstract: Discriminating EEG signals between control and non-control states is a key issue in asynchronous brain-computer interfaces, and recent studies have shown that fusion-based approaches can effectively improve classification performance. In this paper, we propose a Dempster-Shafer theory-based decision fusion (DST-DF) method integrating decisions from two asynchronous algorithms for improved classification accuracy. An individualized space-frequency based complex network (ISF-OCN) algorithm and an Irregular-resampling auto-spectral analysis (IRASA) algorithm were used to extract control/non-control features for classification. The DST-DF method constructed the basic probability assignment (BPA) functions for them and assigned weights considering their varied importance in decision making. The weights were optimized using a single objective Bayesian optimization method, and were assigned to the decisions of ISF-OCN and IRASA for D-S fusion. The experimental results of 10 subjects showed that the proposed method achieved higher accuracy (93.92%) by integrating the ISF-OCN (78.87%) and IRASA (91.20%) algorithms.

11:41-11:53, Paper FrS1B.4
Effect of Decision Way on the EEG Signal Classification Performance of Motor Imagery

Lu, Yanzheng	Yanshan University
Wang, Hong	Northeastern University
Niu, Jianye	Yanshan University
Fu, Rongrong	Yanshan University
Keywords: Brain-Machine Interface, Machine Learning, Artificial Intelligence Abstract: Timely and accurate classification of motor imagery (MI) is crucial for its practical application. However, the effect of decision way has not been thoroughly studied in existing work. This paper investigates the effect of decision way on the classification of MI task electroencephalogram (EEG) signals. Time-frequency analysis of the pre-processed EEG signals is carried out to obtain the variation patterns of different EEG rhythms in MI tasks. The power changes of MI task EEG signals are analyzed. The time domain and frequency domain features of EEG signals are extracted, and MI task classification models with different decision ways are built. The average classification accuracy of MI in the logistic regression model with the continuous window decision way is 0.5889. The accuracies of different decision ways on the classification of EEG signals in MI tasks are compared.

11:53-12:05, Paper FrS1B.5
LSTM Based Reinforcement Learning for Electroencephalogram Music Generation

Wang, Hua	Nanchang Hangkong University
Cheng, Shan	Nanchang Hangkong University
Tong, Lijun	Nanchang Hangkong University
Dong, Hua	Nanchang Hangkong University
Keywords: Artificial Intelligence, Deep Learning, Machine Learning Abstract: EEG signals represent the activity of neurons in the brain and contain complex information that can effectively reflect a person's physiological state. Music stimulation acts on the brain and causes corresponding changes in EEG signals, so EEG signals and music are closely connected. This article studies the neural control of the brain by using EEG signals to generate music. It uses EEG as input, designs a new reward function using LSTM and music theory rules, and uses this reward function to design a music generation method based on Q-Learning. Experiments show that the proposed method can generate music with ornamental properties that has a similar distribution to real music.

12:05-12:17, Paper FrS1B.6
Physical Human-Robot Interaction Control with Human Behavior Comprehension

Hou, Zhimin	National University of Singapore
Li, Dongyu	Beihang University
Keywords: Human-Robot Interaction and Cooperation, Rehabilitation and Assistive Robotics, Intelligent Control and Systems Abstract: Physical human-robot interaction (pHRI) has attracted massive research interest, leading to advancements in robot hardware and control scheme development. Robots driven by series elastic actuators (SEAs) are popular for physical interactions owing to their natural compliance and safety. Existing control schemes for SEA-driven robots have been demonstrated in pHRI tasks with a specific objective and limited interaction forms. However, the pHRI tasks requiring flexible interactions for task completion have yet to be thoroughly investigated. To this end, this paper develops a unified controller to address flexible interactions based on comprehending unintentional and intentional behaviours. A biomimetic adaptation learning method estimates the human unintentional interaction force based on iteratively updating time-varying feedforward force, stiffness, and damping. Simultaneously, human intentional behaviour is recognized from interaction force, and the reference trajectory is accordingly deformed for task completion. Finally, several experiments are conducted on a two-DoF SEA-driven robot to validate the effectiveness of the proposed controller.


FrS2A	Room B
Innovations of Technology for Medical Robotics and Devices: Boosting Healthcare	Invited Sessions
Chair: Su, Baiquan	Beijing University of Posts and Telecommunications

14:40-14:52, Paper FrS2A.1
Single Segment Controllable Continuum Robot: A Pilot Study (I)

Ma, Xudong	Beijing University of Posts and Telecommunications
Yang, Qian	Beijing University of Posts and Telecommunications
Wu, Fengtong	Beijing University of Posts and Telecommunications
Zhang, Yinjia	Beijing University of Posts and Telecommunications
Hu, Yida	Harvard Medical School
Liu, Wenyong	Beihang University
Wang, Junchen	Beihang University
Li, Changsheng	Beijing Institute of Technology
Tang, Jie	Beijing Tian Tan Hospital, Capital Medical University
Kuang, Shaolong	Shenzhen Technology University
Su, Baiquan	Beijing University of Posts and Telecommunications
Keywords: Medical Robotics, Artificial Intelligence Abstract: In existing continuum robots, not every segment within a section is controlled, limiting the full utilization of their flexibility and making navigation through narrow and continuously curved spaces challenging. This paper introduces a single-section controllable continuum robot where each segment can be independently controlled via driving wires. Firstly, the structure and connection method of the continuum robot segments are designed. Secondly, the kinematic model of the single-section controllable continuum robot is established, enabling independent control of each segment through the driving wires. Finally, we compare and test the fitting accuracy of different configurations of continuum robots with controllable parts containing 1, 3, 5, and 7 segments in one section, regarding their ability to fit the same curve. The single-section controllable continuum robot can independently control the position of each joint within the continuum, demonstrating superior curve fitting capability and enabling navigation through narrow, continuously curved spaces.

14:52-15:04, Paper FrS2A.2
Clamping Tool Holder of Replaceable Tools for Medical Robot (I)

Zhang, Yinjia	Beijing University of Posts and Telecommunications
Yang, Qian	Beijing University of Posts and Telecommunications
Zhang, Dengbo	Beijing University of Posts and Telecommunications
Ma, Xudong	Beijing University of Posts and Telecommunications
Tang, Jie	Beijing Tian Tan Hospital, Capital Medical University
Hou, Yuanzheng	Xuanwu Hospital, Capital Medical University
Hu, Yida	Harvard Medical School
Wang, Junchen	Beihang University
Li, Changsheng	Beijing Institute of Technology
Kuang, Shaolong	Shenzhen Technology University
Liu, Wenyong	Beihang University
Su, Baiquan	Beijing University of Posts and Telecommunications
Keywords: Medical Robotics Abstract: The replacement of surgical tools is one of the basic operations in performing surgery. However, this step takes a lot of time, reduces the efficiency of the operation, and increases its risk. At present, medical robots still need manual operation to replace surgical tools. To enable the robot to change surgical tools, this study designed a clamping tool holder for changing surgical tools. The clamping tool holder is driven by a 6-degree-of-freedom robotic arm. In this paper, a CAD model of the clamping tool holder is presented, its motion is analyzed, and the tool pick-up operation is simulated. Finally, the prototype experiment verified that the clamping tool holder can hold and release surgical tools without manual assistance. The structure can save manpower and improve operation efficiency.

15:04-15:16, Paper FrS2A.3
Design of Medium Contact 3D Printed Guides in Robot-Assisted Laminectomy: Finite Element Analysis and Animal Experiment (I)

Ji, Xuquan	Beihang University
Zhang, Xiaojuan	Beijing Zhuzheng Robot Co., LTD
Zhu, Yuanyuan	Beihang University
Zhang, Yonghong	Beihang University
Hu, Lei	Beihang University
Liu, Wenyong	Beihang University
Keywords: Medical Robotics, Emerging Technologies and Applications Abstract: The accuracy of 3D-printed guide alignment during robot-assisted spinal surgery is significantly influenced by the size of the contact area for registration. An undersized contact area may result in unstable registration, whereas an oversized contact area can increase the dissection area of soft tissues. This paper introduces a novel medium contact (MC) guide design procedure based on a four-point alignment method, which is specifically tailored for posterior laminectomy procedures. A comparative finite element analysis against the widely utilized full contact (FC) guides demonstrated that the MC guide, which occupies only 36.28% of the area of the FC guide, shows lower stress concentration and displacement when subjected to external forces in the ABAQUS simulation environment. To verify the clinical efficacy of the MC guide, an in vivo animal experiment was conducted. The MC guide, post-printing, was employed for alignment and successfully completed the registration of a robot-assisted spinal decompression surgery. The MC guide proposed herein not only ensures registration stability but also minimizes patient injury and the surgeon’s workload, and offers considerable reference value in clinical applications.

15:16-15:28, Paper FrS2A.4
Compact Multiple Functional Integrated Medical Robotic End-Effector for Neurosurgery (I)

Zhang, Yinjia	Beijing University of Posts and Telecommunications
Li, Weihan	The University of Tokyo
Wang, Haotian	Beijing University of Posts and Telecommunications
Ma, Xudong	Beijing University of Posts and Telecommunications
Yang, Qian	Beijing University of Posts and Telecommunications
Tang, Jie	Beijing Tian Tan Hospital, Capital Medical University
Hou, Yuanzheng	Xuanwu Hospital, Capital Medical University
Hu, Yida	Harvard Medical School
Li, Changsheng	Beijing Institute of Technology
Liu, Wenyong	Beihang University
Wang, Junchen	Beihang University
Su, Baiquan	Beijing University of Posts and Telecommunications
Keywords: Medical Robotics Abstract: In neurosurgery, surgeons use single-function surgical tools such as aspirators and electric-coagulation cutters to deal with blood vessel bleeding. Compared with multifunctional surgical tools, the surgical efficiency of using single-function tools is lower. In this paper, a multiple functional integrated medical robotic end-effector which integrates CCD imaging, electric-coagulation and suction is designed. The end-effector can be installed at the end of a serial robotic arm. The CAD model of the end-effector was constructed, and motion analysis was carried out. The end-effector is shown to be effective by three experiments. The operator can use the design to suction blood, locate bleeding spots, and perform electric-coagulation of lesions. During operations, there is no need to change surgical tools, which improves the efficiency of the operation.

15:28-15:40, Paper FrS2A.5
A Bone Tumor Boundary Extraction Method Incorporating Envelope and Responsive Region for Robot-Assisted R0 Resection (I)

Liu, Yanwen	Beihang University
Fan, Daoyang	Beijing Jishuitan Hospital, Capital Medical University
Liu, Weifeng	Beijing Jishuitan Hospital, Capital Medical University，Pe
Liu, Wenyong	Beihang University
Keywords: Medical Robotics, Emerging Technologies and Applications, Artificial Intelligence Abstract: Accurate extraction of the bone tumor boundary is a prerequisite for surgical planning in robotic bone tumor resection. Current research mainly focuses on anatomy-based automatic or semi-automatic segmentation. We notice that besides the anatomically visible envelope of the bone tumor, the responsive region should also be considered for safe robot-assisted bone tumor resection. Based on this idea, this paper, taking the femoral bone tumor as an example, proposes a bone tumor boundary extraction method incorporating envelope and responsive region for robot-assisted R0 resection. First, we automatically extract the envelope model of the bone tumor using a neural network and simplify its mesh model using the quadric edge collapse decimation algorithm. Then, we extend the envelope model to get a preliminary bone tumor boundary model covering the responsive region, using an offset along the surface point normals of the envelope model. Finally, we calculate the intersection between the preliminary bone tumor boundary and the femoral condyle model to get the resulting bone tumor boundary. A case with clinical images is tested and assessed by a surgeon, which demonstrates the feasibility of the proposed method for accurate robot-assisted R0 resection.
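The envelope extension step in this abstract can be sketched as a per-vertex offset along the surface normal. The sketch assumes a uniform safety margin and a mesh given as plain vertex and normal lists. The function name `offset_vertices` and the margin value are hypothetical, and the segmentation, decimation, and intersection steps are not shown.

```python
import math


def offset_vertices(vertices, normals, margin):
    """Move each vertex by `margin` along its unit-normalized normal.

    `vertices` and `normals` are equal-length lists of (x, y, z) tuples.
    A vertex with a zero-length normal is left where it is.
    """
    moved = []
    for (x, y, z), (nx, ny, nz) in zip(vertices, normals):
        length = math.sqrt(nx * nx + ny * ny + nz * nz)
        if length == 0.0:
            moved.append((x, y, z))
            continue
        s = margin / length  # scale so the displacement has length `margin`
        moved.append((x + s * nx, y + s * ny, z + s * nz))
    return moved


# Hypothetical usage: push one envelope vertex outward by 0.5 units.
boundary = offset_vertices([(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], 0.5)
```

A per-vertex offset like this can produce self-intersections on concave surfaces, so a practical pipeline would likely need an additional cleanup step.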

15:40-15:52, Paper FrS2A.6
Hair Direction Detection (I)

Ba, Peng	Beijing University of Posts and Telecommunications
Wang, Pengyi	Beijing University of Posts and Telecommunications
Wu, Hongde	Beijing University of Posts and Telecommunications
Yang, Qian	Beijing University of Posts and Telecommunications
Feng, Yongqiang	Plastic Surgery Hospital, Chinese Academy of Medical Sciences And
Wang, Junchen	Beihang University
Hu, Yida	Harvard Medical School
Li, Changsheng	Beijing Institute of Technology
Liu, Wenyong	Beihang University
Kuang, Shaolong	Shenzhen Technology University
Su, Baiquan	Beijing University of Posts and Telecommunications
Keywords: Robot Vision and Computer Vision, Medical Robotics Abstract: Hair direction is an important external feature of hair, and recognising hair direction is a prerequisite for processing hair. In this paper, a new algorithm for recognising hair direction is proposed and systematically verified by experiment. The main goal of this paper is to develop an algorithm that can identify hair direction in a complex image environment. A curve segment analysis method based on image skeletonisation is adopted, which consists of skeleton extraction, intersection identification, curve segmentation and direction prediction. In addition, this paper combines non-maximum suppression and PCA to improve the accuracy and stability of the estimation. In the experimental design, a representative image dataset is chosen to verify the effectiveness of the algorithm. The experimental process includes the steps of image preprocessing, skeletonisation processing, intersection detection and merging, and direction prediction. The experimental results show that the method can accurately and effectively identify the hair direction. The main contribution of this paper is to propose a new hair direction recognition method and experimentally verify its effectiveness and accuracy in complex backgrounds.
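The PCA-based direction prediction in this abstract can be sketched for a single skeleton curve segment. The sketch uses the closed-form principal-axis angle of the 2x2 covariance matrix of the pixel coordinates. The function name `principal_direction` and the input format are assumptions, and the skeletonisation, intersection handling, and non-maximum suppression steps are not shown.

```python
import math


def principal_direction(points):
    """Return the principal-axis angle (radians) of a 2D point set.

    `points` is a list of (x, y) pixel coordinates from one curve segment.
    The result is an axis orientation in (-pi/2, pi/2], so it does not
    tell which end of the segment is the root and which is the tip.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Orientation of the dominant eigenvector of [[sxx, sxy], [sxy, syy]].
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


# Hypothetical usage: a segment lying along the image diagonal.
angle = principal_direction([(0, 0), (1, 1), (2, 2)])
```

In image coordinates the y axis usually points downward, so the sign of the angle depends on the coordinate convention in use.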


FrS2B	Room A
Robotic Learning and Vision	Regular Sessions
Chair: Chen, Fei	T-Stone Robotics Institute, the Chinese University of Hong Kong

14:40-14:55, Paper FrS2B.1
Intelligent Robotics and Machine Learning for Manufacturing Automation

Chen, Heping	Texas State University

14:55-15:07, Paper FrS2B.2
Methodology on Robot-Based Complex Surface Processing Using 2D and 3D Visual Combination

Wu, Haodong	Memorial University
Zou, Ting	Memorial University
Burke, Heather	Memorial University of Newfoundland
King, Stephen	Memorial University of Newfoundland
Burke, Brian	Nunavut Fisheries Association (NFA)
Keywords: Industrial Robotics and Factory Automation, Robot Vision and Computer Vision Abstract: Despite the significant development of automated equipment in the seafood industry, including automated processing equipment for fish and crab, tremendous challenges remain. Low processing capability for complex seafood surfaces is a typical example, leading to heavy reliance on manual labor. This research is driven by the urgent need for robotic processing of complex seafood shapes, which has promising potential for processing objects with random, flexible, and complex shapes. In this paper, we propose a novel approach to processing complex surfaces, designed for the seafood industry, that combines 2D and 3D visual information based on point cloud segmentation. Porcupine crabs, a species of king crab in the family Lithodidae living in the Canadian Atlantic Ocean, have complex surfaces with long, sharp spines and are chosen as the case study. This unique feature of porcupine crabs poses substantial challenges to conventional visual processing methods, which suffer from low accuracy and efficiency. Using our method, by contrast, the crab features are successfully recognized and processed by cutting the spines. The robot tool path for spine removal is generated based on the extracted spine features. Simulation results using Robot Operating System (ROS) and experimental tests have validated the robustness of the proposed method.

15:07-15:19, Paper FrS2B.3
MSD-YOLO: A Novel Method for Detecting Microscopic Surface Defects in Metal Spray-Painted Thermal Mugs

Yan, Zhibo	Zhejiang Normal University
Zhang, Teng	Xi’an Jiaotong University; Zhejiang Normal University
Liu, Yu	The 722nd Research Institute of China Shipbuilding Corporation,
Li, Rui	Xi'an Jiaotong University
Wang, Dongyun	Zhejiang Normal University
Keywords: Industrial Robotics and Factory Automation, Robot Vision and Computer Vision, Deep Learning Abstract: Metal spray-painted thermal mugs, as a typical industrial product, are produced on a large scale worldwide. However, the problem of detecting microscopic surface flaws, which leads to an increased rate of returns and a steep drop in customer satisfaction, represents a common challenge in the field of industrial product surface defect detection. This paper introduces a novel microscopic surface defect detection network named MSD-YOLO (Microscopic Surface Defects-YOLO). This model is specifically targeted at microscopic surface defect detection through the innovative MGC2f (Multiscale-Group Convolution to Feature) module and its integration with the Inner-IoU method. The method optimizes multi-scale convolutional kernels and designs a new loss function, enabling the detection of microscopic defects in complex industrial product images. Experimental results indicate that the MSD-YOLO model improves the mAP (mean Average Precision) and F1 score by 5.492% and 5% respectively, compared to the original YOLOv8 model, significantly enhancing the model's detection performance. Additionally, this study provides a dataset of surface defects on metal spray-painted thermal mugs, enriching the data in the field of industrial product surface defects.

15:19-15:31, Paper FrS2B.4
Res2Net Based Siamese Network for Object Tracking Via Efficient Multi-Scale Attention and Border Region Reppoints

Yuan, Shuai	Shenyang Jianzhu University
Geng, Jinyu	Shenyang Jianzhu University
Dou, Huize	Shenyang Jianzhu University
Keywords: Deep Learning, Robot Vision and Computer Vision Abstract: Object tracking is an important and challenging task. Changes in the object itself and complex backgrounds can degrade tracking performance. Siamese-network-based object trackers are widely adopted due to their advantages in efficient similarity measurement, end-to-end learning, and modular design. However, existing object tracking algorithms based on Siamese networks do not extract object features sufficiently, resulting in lower tracking accuracy. In addition, excessive parameter settings adversely affect real-time performance. Therefore, we propose a new algorithm, SiamBRR, which tracks objects more accurately and quickly. We first introduce the Res2Net residual network into the Siamese network framework as the backbone feature extraction network to fully extract features. Then, we use efficient multi-scale attention (EMA) to enhance feature representation. Furthermore, our proposed border region reppoints accurately locate the object border while avoiding the need for extensive parameter settings. Finally, we conducted experiments on three challenging public datasets: VOT2018, VOT2019, and OTB100. The experimental results show that the proposed SiamBRR outperforms other advanced trackers in tracking accuracy.

15:31-15:43, Paper FrS2B.5
Combining VLM and LLM for Enhanced Semantic Object Perception in Robotic Handover Tasks

Huang, Jiayang	Shenzhen Technology University
Limberg, Christian	Bielefeld University
Arshad, Syed Muhammad Nashit	National University of Sciences and Technology
Zhang, Qifeng	Shenyang Institute of Automation, CAS
Li, Qiang	Shenzhen Technology University
Keywords: Humanoid Robots, Robot Vision and Computer Vision, Grasping and Manipulation Abstract: We combine a Large Language Model (LLM) and a Vision Language Model (VLM) to perform a robot-to-human handover task with semantic object knowledge. Current object perception systems for this task often work with a fixed set of objects and primarily consider geometric properties, neglecting semantic knowledge about where to grasp an object and where not to. By applying the LLM and VLM in a zero-shot fashion, we demonstrate that our approach can identify optimal and semantically correct handover parts for both the robot and the human in this handover task. We validate our approach quantitatively across several object categories.

15:43-15:55, Paper FrS2B.6
Research on Enhanced YOLOv8 Gesture Recognition Method for Complex Environments

Yuan, Shuai	Shenyang Jianzhu University
Kong, Xiangjie	Shenyang Jianzhu University
Zhang, Shuai	Shenyang Jianzhu University
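Keywords: Robot Vision and Computer Vision, Deep Learning Abstract: To address the low recognition rate in complex environments, this study proposes a gesture recognition method, DF-YOLOv8s. The method first replaces the SPPF module with the AIFI module, using attention-based intra-scale feature interaction to enhance feature extraction. Then, a novel ZD feature fusion network is designed based on the ASF-YOLO network structure, which improves our ability to extract and fuse image features and thereby improves gesture recognition accuracy in complex environments. This improvement jointly accounts for features of varying size, occlusion, and uneven illumination, which can cause the loss of small targets. Finally, we use publicly available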


FrS3A	Room B
Revolutionizing Interaction II: Embodied Intelligence and the New Era of Human-Robot Collaboration	Invited Sessions
Chair: Li, Qiang	Shenzhen Technology University

16:15-16:27, Paper FrS3A.1
Mechatronic Development of Highly Integrated Electromagnetic Actuator (HIERA) for Scalable Adolescent Exoskeletons

Kardofaki, Mohamad	UVSQ
Fouz, Moustafa	UVSQ
Tabti, Nahla	Université Paris-Sud
Su, Hang	Politecnico Di Milano
Alfayad, Samer	Paris-Saclay Université-Evry University
Dychus, Eric	Sandyc
Keywords: Humanoid Robots, Robot Design, Human-Machine Interface Abstract: Exoskeletons are widely used in robotic rehabilitation to improve the quality of life for disabled patients. However, existing exoskeletons lack scalability for teenagers due to their rapid growth. This study presents the Highly Integrated Electromagnetic Actuator (HIERA), a novel compact electric joint actuator designed to address this need. HIERA's mechatronics development focuses on key design requirements, including size, angular speed, torque, and safety. We explore DC motor selection, transmission design, bearing and material choices, and sensor integration. Experiments such as load testing and adaptability assessment validate HIERA's performance. Results show that HIERA accommodates adolescents' anatomical changes, offering robust scalability and performance, thus providing a new solution for exoskeletons designed for growing teenagers.

16:27-16:39, Paper FrS3A.2
Mechanical Design of a Humanoid Robotic Head for Human Robot Interaction

Yahnian, Sevag	Kalysta
Sleiman, Maya	Paris Saclay
Ait Oufroukh, Naima	University of Paris-Saclay
Su, Hang	Politecnico Di Milano
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Robot Design, Humanoid Robots, Human-Robot Interaction and Cooperation Abstract: The emerging demand for robots that can interact naturally and effectively with humans in modern environments such as healthcare, service industries, and personal assistance highlights the need for developing expressive humanoid robotic heads. These heads should integrate advanced mechanical designs and control systems to achieve lifelike human interactions and movements. This paper details the development of a humanoid robotic head, with a primary focus on the structural design to achieve a humanlike appearance and the implementation of an advanced neck mechanism. Two conceptual designs of a neck mechanism based on Spherical Parallel Manipulators (SPM) were proposed and compared on the basis of compactness, the provided workspace, and the required motor torques. The 3-RRR SPM mechanism proved to be the more suitable for the application. A human face structure and a neck mechanism were hence proposed.

16:39-16:51, Paper FrS3A.3
HydROS 104: Interactive ROS Compatible Electro-Hydraulic Educational Platform

Sleiman, Maya	Paris Saclay
Ghandour, Maysoon	Université Paris Saclay
Kardofaki, Mohamad	UVSQ
Jleilaty, Subhi	Paris-Saclay University
Su, Hang	Politecnico Di Milano
Alfayad, Samer	Paris-Saclay Université-Evry University
Keywords: Education Robotics, ROS, Software System for Robotics Application, Robot Design Abstract: Actuation is the cornerstone on which each mechatronics device is built. Hydraulic actuation offers features that no other actuators can provide. In particular, it provides high power-weight and high power-volume ratios that are indispensable for portable devices. At the hardware level, recent trends have led to the development of electro-hydraulic actuators. Those actuators combine the advantages of hydraulic and electrical actuation while masking their disadvantages. At the software level, the Robot Operating System has emerged as a suitable solution for developing robotics applications in the past few years. To comply with these trends, KALYSTA Actuation, a French company, proposes an educational development kit, "HydROS 104", to train the workforce to use ROS-compatible electro-hydraulic actuators, notably the patented Servo Electro-Hydraulic Actuator SEHA.


FrS3B	Room A
Robot Design and Control	Regular Sessions
Chair: Liu, Wenyong	Beihang University

16:15-16:27, Paper FrS3B.1
System Design and Kinematic Analysis of a Dexterous Hand with Humanoid Characteristics

Liu, Linjie	Shandong University
Sun, Ning	Shandong Normal University
Li, Ke	School of Control Science and Engineering, Shandong University
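Keywords: Rehabilitation and Assistive Robotics, Biologically Inspired Robotics, Grasping and Manipulation Abstract: Most current prosthetic hands are expensive and suffer from few degrees of freedom (DoF), limited range of motion, low flexibility, and poor biomimetic performance. This study develops a low-cost, multi-DoF, highly flexible humanoid dexterous hand and presents its mechanical structure, circuit system, and kinematic analysis. The mechanical structure of the dexterous hand is fabricated by 3D printing with photosensitive resin. The circuit system is divided into two parts: a main board and a driver board. In addition, a kinematic analysis of the dexterous hand is performed, and the fingertip workspace is obtained using the Denavit-Hartenberg parameter method. The obtained digit flexion/extension and adduction/abduction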

16:27-16:39, Paper FrS3B.2
Structural Design of Multi-Functional Caregiving & Rehabilitation Robot and Its Motion Simulation Analysis of Multi-Posture Adjustment Mechanism

Zhang, Xiaodong	Xi'an Jiaotong University
Yang, Xinyu	Xinjiang University
Zha, Wenyu	Shaanxi Xiaodong Aid Robotics Technology Co
Mu, Tong	Xinjiang University
Keywords: Rehabilitation and Assistive Robotics, Robot Design, Medical Robotics Abstract: To address the problem that traditional care and rehabilitation robots offer only a single function rather than integrating life care and rehabilitation, a new multifunctional caregiving & rehabilitation robot is first designed in this paper. The design mainly includes the frame of the robot, the structure of its functional modules, and the multi-posture adjustment mechanism. Then, based on the complex vector method, forward and inverse kinematics models of the multi-posture adjustment mechanism are established for the composite motion of lifting and standing, so that its influence on the motion functions of the other modules can be analyzed. Finally, Adams was used to simulate the kinematics of the multi-posture adjustment mechanism. Comparison with the theoretical analysis showed a maximum error of 0.00489 rad/s, which is within the acceptable range, verifying the accuracy of the theoretical analysis and the feasibility of the design.

16:39-16:51, Paper FrS3B.3
Design and Implementation of a Quadrotor UAV with a Soft Gripper

Wang, Xiaobo	Zhejiang Lab
Keywords: Soft Robotics, Grasping and Manipulation Abstract: The integration of a mechanical gripper on a quadrotor UAV significantly broadens its application scope by enabling the UAV to interact with environmental objects. This paper presents the design and implementation of a quadrotor UAV system equipped with a soft gripper. The system primarily consists of the quadrotor UAV and the soft gripper, which is mounted on the UAV to enable active interaction with external objects. Compared to traditional rigid manipulators, the soft gripper offers higher compliance, stronger adaptability, simpler control, and is less likely to cause damage to objects. Additionally, compared to pneumatic soft manipulators, the soft gripper has a simpler structure without the need for air pumps or other accessories, thereby reducing the weight of the UAV and extending its flight time. Furthermore, the soft gripper is easy to control, flexible, and lightweight, combining with the UAV to achieve aerial manipulation, which makes it possible to help people pick up garbage in dangerous areas and assist fruit farmers in harvesting strawberries in the field. Experimental test results show that the designed quadrotor UAV and soft gripper have stable and reliable performance, and their combination can achieve aerial grasping and releasing.


FrS4A	Room B
Modern Control of Marine Robotics	Invited Sessions
Chair: Ji, Daxiong	Zhejiang University

16:55-17:10, Paper FrS4A.1
Marine Vehicle Model Predictive Control Using an Implicit Sliding Mode Feedback

Ji, Daxiong	Zhejiang University

17:10-17:22, Paper FrS4A.2
Composite Controller Design for AUV with Uncertainties and Noises Based on Combined Kalman Filter and RBFO (I)

Zhang, Qiang	Zhejiang University
Ji, Daxiong	Zhejiang University
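Keywords: Intelligent Control and Systems Abstract: This paper studies an autonomous underwater vehicle (AUV) subject to disturbances and measurement noise. First, the disturbances and nonlinear uncertainties are treated as a lumped disturbance. Second, a special combination of a Kalman filter (KF) and a radial basis function neural network observer (RBFO) is proposed to filter the noise and estimate the lumped disturbance. Then, on the premise that the multiple observers maintain an equivalent level of noise amplification, their abilities to reconstruct the system states are compared and evaluated in detail. In addition, the tracking capability of the composite KFRBFO-based velocity controller is verified. Through qualitative comparison, the advantages from the state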

17:22-17:34, Paper FrS4A.3
Enhanced PSO-Tuned Fractional-Order Adaptive Model Predictive Control for Robust UAV-Quadrotor Trajectory Tracking (I)

Sheharyar, Hussain	Zhejiang University
Ji, Daxiong	Zhejiang University
Xu, Lie	Zhejiang University
Keywords: Intelligent Control and Systems, Path and Motion Planning, Intelligent Control Abstract: This paper presents an innovative approach to UAV-Quadrotor (UAV-Q) trajectory tracking through an Enhanced Particle Swarm Optimization (ePSO)-Tuned Fractional-Order Adaptive Model Predictive Control (FOAMPC). The proposed methodology leverages fractional calculus to capture the inherent memory and hereditary properties of the UAV-Q dynamics, enhancing the system’s responsiveness and stability. An ePSO algorithm dynamically tunes the controller gain in real time, incorporating adaptive inertia weights and a convergence criterion based on particle position variance to prevent premature convergence and ensure a robust balance between exploration and exploitation. The performance index minimizes the tracking error and control effort over a prediction horizon, subject to system dynamics and constraints. By integrating this novel adaptive tuning mechanism, the controller adapts to varying conditions and disturbances, achieving superior trajectory tracking performance and robustness. Simulation results demonstrate the effectiveness of the proposed FOAMPC in handling complex UAV-Q maneuvers and disturbances, significantly outperforming conventional control strategies. This work provides a comprehensive framework for advanced UAV-Q control, paving the way for enhanced operational reliability and precision in dynamic environments.

17:34-17:46, Paper FrS4A.4
Model-Free Fault Diagnosis for Autonomous Underwater Vehicles Using Multi-Layer Convolution and Attention Mechanism (I)

Lu, Enhao	Zhejiang Ocean University
Ji, Daxiong	Zhejiang University
Wang, Xia	Zhejiang Ocean University
Zhang, Qiang	Zhejiang University
Keywords: Artificial Intelligence, Deep Learning, Intelligent Control and Systems Abstract: Autonomous Underwater Vehicles (AUVs) play a crucial role in marine oil and gas exploration, ocean environment surveys, deep-sea sampling, environmental assessment, and underwater infrastructure inspection due to their remote operation, high-precision navigation, and strong autonomy. To ensure mission success in complex and unpredictable deep-sea environments, AUVs require robust fault diagnosis capabilities. This paper introduces DCSNet (Deep Convolutional and Sequential Network), a novel fault diagnosis method leveraging Deep Convolutional Neural Networks (DCNN) and attention mechanisms. DCSNet excels in extracting key features from state data for predicting fault types, addressing both fault detection and isolation. Key innovations of DCSNet include the integration of multi-layer convolution and attention mechanisms; advanced data processing techniques such as data augmentation and sliding window methods; a model-free approach that reduces dependency on precise mathematical models; and enhanced adaptability and flexibility through the dynamic adjustment of convolution block sizes and the number of convolution layers based on changes in data volume and feature dimensions. DCSNet achieved a test accuracy of 98.54% after 100 training sessions, indicating that it is effective and reliable for AUV fault diagnosis in complex and dynamic underwater environments.

17:46-17:58, Paper FrS4A.5
SCKAN: A Lightweight Model-Free Fault Diagnosis Method for Autonomous Underwater Vehicles (I)

Xu, Lie	Zhejiang University
Ji, Daxiong	Zhejiang University
Keywords: Machine Learning, Field Robotics, Artificial Intelligence Abstract: This paper presents a novel approach to fault diagnosis in Autonomous Underwater Vehicles (AUVs) using a Sequence Convolutional Kolmogorov-Arnold Network (SCKAN). The proposed method addresses the critical challenge of achieving high-accuracy fault detection while maintaining a lightweight model suitable for power-limited AUV applications. SCKAN combines the strengths of Sequence Convolutional Neural Networks (SeqCNN) for processing sequential data and Kolmogorov-Arnold Networks (KAN) for efficient function representation. We evaluate our method using the "Haizhe" AUV dataset, considering five common fault types. The SCKAN achieves an average classification accuracy of 96.8%, comparable to state-of-the-art methods. Notably, it reduces the parameter count by 78.2% compared to the best-performing SeqCNN model, with only a 1.1% decrease in accuracy. This significant reduction in model complexity makes SCKAN particularly suitable for real-time, on-board fault diagnosis in AUVs. The proposed method opens up new possibilities for efficient, accurate, and real-time fault diagnosis in AUVs, potentially improving their safety and extending their operational capabilities in challenging underwater environments.


FrS4B	Room A
Advanced Robot Control	Regular Sessions
Chair: Yuan, Shuai	Shenyang Jianzhu University

16:55-17:07, Paper FrS4B.1
Fast Safe Rectangular Corridor-Based Online AGV Trajectory Optimization with Obstacle Avoidance

Liang, Shaoqiang	Huazhong University of Science and Technology
Fa, Songyuan	HUST
Li, Yiqun	Huazhong University of Science and Technology
Keywords: Path and Motion Planning, Mobile Robotics Abstract: Automated Guided Vehicles (AGVs) are essential in various industries for their efficiency and adaptability. However, planning trajectories for AGVs in obstacle-dense, unstructured environments presents significant challenges due to the nonholonomic kinematics, abundant obstacles, and the scenario's nonconvex and constrained nature. To address this, we propose an efficient trajectory planning framework for AGVs by formulating the problem as an optimal control problem. Our framework utilizes the fast safe rectangular corridor (FSRC) algorithm to construct rectangular convex corridors, representing avoidance constraints as box constraints. This eliminates redundant obstacle influences and accelerates the solution speed. Additionally, we employ the Modified Visibility Graph algorithm to speed up path planning and a boundary discretization strategy to expedite FSRC construction. Experimental results demonstrate the effectiveness and superiority of our framework, particularly in computational efficiency. Compared to advanced frameworks, our framework achieves computational efficiency gains of 1 to 2 orders of magnitude. Notably, FSRC significantly outperforms other safe convex corridor-based methods regarding computational efficiency.

17:07-17:19, Paper FrS4B.2
Variational Bayesian Based Adaptive CKF-SLAM Algorithm under Non-Steady Noise

Zhang, Feng	Shenyang Jianzhu University
Li, Aichun	Shenyang Jianzhu University
Yuan, Shuai	Shenyang Jianzhu University
Keywords: SLAM and Navigation, Mobile Robotics Abstract: A variational Bayesian based adaptive CKF-SLAM algorithm (VBACKF-SLAM) is proposed to address the challenge of non-steady noise caused by erratic wheel slippage and sensor performance variances in mobile robot systems. First, the inverse Wishart distribution is adopted to establish a prior model for the observation noise matrix. Then, according to the orthogonality principle, an adaptive factor is derived from the provided measurement value and used in the iterative filtering calculation. Furthermore, the posterior distribution is approximated precisely through variational inference. The optimal pose estimate is obtained by using the Cubature Kalman filter algorithm to iteratively update the relevant parameters. Finally, simulations under non-steady noise are performed to illustrate the superiority of VBACKF-SLAM over UKF-SLAM, CKF-SLAM and VBCKF-SLAM. The simulation results show that the proposed algorithm improves the accuracy of mobile robot state estimation and significantly enhances positioning and mapping precision, making it a valid solution for the real-time localization and map building of mobile robots.

17:19-17:31, Paper FrS4B.3
Model Reference Adaptive Control for a Manned eVTOL Aircraft

Wang, Xiaobo	Zhejiang Lab
Keywords: Intelligent Control Abstract: The manned electric Vertical Takeoff and Landing (eVTOL) aircraft has attracted considerable attention as an emerging option for urban transportation. However, its flight control system faces various challenges, including issues of stability and safety. This paper explores the application of Model Reference Adaptive Control (MRAC) in manned eVTOL aircraft. First, it introduces the current development status of manned eVTOL aircraft and the challenges faced by their flight control systems. Second, it examines the principles and methods of MRAC, analyzing its potential advantages in manned eVTOL aircraft. Then, it establishes the dynamics model of the aircraft and linearizes the nonlinear dynamic model, proposing a flight control system design based on MRAC. Subsequently, flight data obtained with MRAC and Proportional-Integral-Derivative (PID) controllers are compared through simulation and actual flight experiments, respectively. The results demonstrate that the control performance and disturbance rejection capabilities of the MRAC controller are superior to those of the PID controller. Finally, it summarizes the research findings and outlines future research directions. This paper provides new ideas and methods for the design of control laws for manned eVTOL aircraft, promoting the development of urban air transportation systems.

17:31-17:43, Paper FrS4B.4
Reorientation of a Bio-Cat Falling Legged Robot Driven by Pneumatic Muscles with Energy Optimization

Zhu, Xiaocong	Zhejiang University
Tan, Mingao	The University of Hong Kong
Cao, Jian	Hefei University of Technology
Huang, Lixi	The University of Hong Kong
Keywords: Biologically Inspired Robotics, Smart Structures, Materials, Actuators Abstract: Nowadays, robots face unstructured environments with non-planar surfaces and high complexity during their operations. Cats can autonomously flip to adjust their orientation when falling, showing strong adaptability to complex environments. However, existing bio-cat robots still have shortcomings in joint drive forms and control system design. Currently, most robots are driven by motors, and the rigid body cannot provide cushioning when landing, which easily leads to motor damage. As such, a bio-cat falling legged robot driven by pneumatic muscles (BCFLRDPM) is proposed, which uses series and parallel pneumatic muscle joints to form a flexible spine structure. In this paper, a complete dynamic mathematical model of the BCFLRDPM is established and its reorientation method is proposed, which employs a modified particle swarm optimization (PSO) algorithm and cubic spline trajectory fitting to minimize energy consumption during free fall. The simulation results show that energy-optimized reorientation of the BCFLRDPM can be realized through cooperative regulation of the pneumatic muscles.

17:43-17:55, Paper FrS4B.5
New GCC-TDOA Compensation Method Combined with the Wavelet Threshold Filtering

Kan, Yue	Henan Polytechnic University
Hu, Yapeng	Henan Polytechnic University
Zhang, Tengfei	Henan Polytechnic University
Shi, Chenzhe	Henan Polytechnic University
Gao, Wa	Nanjing Forestry University
Zha, Fusheng	Harbin Institute of Technology
Keywords: Human-Robot Interaction and Cooperation, Sensor Networks, Sensing, Haptic System Abstract: Sound source localization is a crucial technology for the robot auditory system. The Generalized Cross-Correlation time delay of arrival (GCC-TDOA) method commonly used in sound localization is easily affected by the sampling rate, resulting in time delay leakage. To address this issue, this article proposes a compensation method for the TDOA estimates based on signal sample differences. The method employs the first derivative of the signal itself and the sample difference between two signals to obtain the compensation value for the TDOA estimates. Additionally, an improved wavelet filtering method with a new threshold function is introduced to enhance the accuracy of the TDOA estimates in noisy environments. Simulation results show that, under the same conditions, the proposed GCC-TDOA method has higher real-time performance, accuracy, and robustness. Moreover, the denoising performance of the new wavelet threshold function is better than that of the soft and hard threshold functions, as well as the two currently better-performing threshold functions.


FrPP1	Hall
Poster Session	Poster Sessions

10:45-17:30, Paper FrPP1.1
An Efficient Human Activity Recognition Framework Based on Graph Convolutional Network for Human-Robot Collaboration (I)

Liu, Wenzhe	Yantai University
Liu, Zhaowei	School of Computer and Control Engineering, Yantai University
Su, Hang	Politecnico Di Milano
Keywords: Human-Robot Interaction and Cooperation, Robot Vision and Computer Vision, Deep Learning Abstract: In this paper, a collaborative human-robot object processing framework based on skeleton data and graph convolutional human action recognition is proposed, and network architecture search and adaptive graph mechanisms are introduced. The pose in which the human carries the object is recognized and passed to the collaborative robot, which performs a matching object pickup action. The method is driven by human skeleton data captured from RGB images, recognizes human actions by adaptively learning the graph structure, and builds a graph convolution model that is trained on the collected dataset. Human-robot collaboration experiments are conducted in a laboratory environment, and the experimental results show that the method achieves high accuracy and real-time performance and effectively accomplishes the human-robot collaboration task.

10:45-17:30, Paper FrPP1.2
NEHand: Enhancing Hand Pose Estimation in the Wild through Synthetic and Motion Capture Datasets

Jiao, Xiuzhen	Dianke Data (Beijing) Information Technology Services Co., Ltd
Li, Xiangnan	Yantai Science and Technology Innovation Promotion Center
Wen, Haonan	Yantai University
Keywords: Robot Vision and Computer Vision, Artificial Intelligence, Human-Robot Interaction and Cooperation Abstract: Recovering interacting hand poses in natural environments is a highly challenging task. Current datasets for interacting hands are relatively simple, lacking complex textures or background interference, allowing models to focus more on hand interaction behaviors. However, the diversity of backgrounds and textures in natural environments can significantly interfere with hand pose estimation, affecting the model’s generalization ability. Therefore, we propose an approach named NEHand, which combines synthetic datasets with publicly available datasets, including those based on motion capture, to achieve the recovery of interacting hand gestures in natural environments. To simplify the model, reduce complexity, and enhance generalization, we employ a strategy of flipping the left hand to the right hand to effectively achieve the recovery of both hands. Additionally, we introduce constraints from the MANO model, including adjustments to joint angles, finger bending, and overall hand shape. This ensures that the generated hand poses not only conform to anatomical structures but also represent common, spontaneously occurring postures in our daily lives. By restricting and adjusting these parameters within specified ranges, we consider the natural motion relationships between hand joints, ensuring that the generated hand poses are physiologically reasonable. The method undergoes quantitative and qualitative evaluations, showcasing state-of-the-art robustness, generalization,

10:45-17:30, Paper FrPP1.3
Optimized YOLO-Based Model for Real-Time Hand Keypoint Detection in Robotics (I)

Hu, Lingxiang	Paris Saclay University
Zhu, Xingfei	Jiangnan University
Li, Dun	Tsinghua University
Jiang, Zhinan	Qingdao University
Zhang, Fukai	Shandong University
Zhang, Chengqiu	Shandong University
Keywords: Robot Vision and Computer Vision, Deep Learning, Human-Robot Interaction and Cooperation Abstract: Human hand detection is crucial for robots as they learn human gestures for grasping tasks. However, due to the limited computational power of embedded devices used by robots, many existing approaches fail to meet real-time requirements. To address this challenge, we present a real-time model based on the YOLO framework for human hand keypoint detection. Our approach involves extracting relevant hand data from the COCO dataset to create a specialized dataset tailored for our task. We then design a neural network backbone and decoder grounded in the YOLO architecture, optimizing them for efficient and accurate hand keypoint detection. Our network not only meets real-time performance standards but also demonstrates scalability, allowing for the integration of additional tasks without significant reconfiguration. Through extensive experiments and comparative analysis, we validate the superiority of our model over existing methods. We conclude by discussing the implications of our findings and potential directions for future research.

10:45-17:30, Paper FrPP1.4
Integrating Wireless Sensor Networks in Construction Project Risk Management: A Cyber-Physical Approach to Enhancing Robustness and Security in Smart Building Developments (I)

Shuai, Li	Shenyang University of Technology, Shenyang 110870, Liaoning
Aimin, Zhu	Shenyang University of Technology, Shenyang 110870, Liaoning
Keywords: Sensor Networks, Artificial Intelligence, Deep Learning Abstract: In the realm of construction project risk management, the integration of technological advancements remains paramount to tackle contemporary challenges. This research delves into the pivotal role of Wireless Sensor Networks (WSNs) and Cyber-Physical Systems (CPS) in augmenting the robustness and security of smart building projects. We elucidate the inherent advantages of WSNs in real-time data acquisition and monitoring, and their synergistic potential with CPS for an enhanced risk analysis matrix. Drawing from practical case studies and comparative analysis, our findings underscore the profound implications of this integration in streamlining construction processes, bolstering security frameworks, and ushering in an era of data-driven decision-making in construction project management. This study not only bridges the technological chasm in traditional risk management approaches but also posits a forward-looking perspective for sustainable and resilient smart building developments.

10:45-17:30, Paper FrPP1.5
Real-Time Human Motion Intention Recognition for Powered Wearable Hip Exoskeleton Using LSTM Networks (I)

Ding, Fan	Capital University of Physical Education and Sports
Yu, Yang	Capital University of Physical Education and Sports
Dong, Lin	Capital University of Physical Education and Sports
Keywords: Rehabilitation and Assistive Robotics, Artificial Intelligence, Deep Learning Abstract: Lower limb exoskeletons are revolutionizing rehabilitation and assistive technologies. However, accurate and real-time recognition of user movement intent remains a challenge. This paper proposes a Long Short-Term Memory (LSTM) network-based approach for real-time motion intention recognition in a powered wearable hip exoskeleton. We leverage biomechanical signals from an embedded IMU sensor and the exoskeleton's joint angle and velocity data to classify various motion patterns, including standing, sitting, walking, stair ascending/descending, and activity transitions like sit-to-stand and stand-to-sit. The paper details the hardware architecture of the exoskeleton, data collection and processing procedures, and the LSTM model architecture. We evaluate the performance of the proposed method through experiments and the results indicate the effectiveness of the model. The results of this study may have significant implications for the development and application of lower limb exoskeletons, particularly in the field of rehabilitation.

10:45-17:30, Paper FrPP1.6
Design and Control of a Novel Powered Wearable Knee Exoskeleton for Lower Limb Rehabilitation (I)

Du, Kui	Beijing Institute of Technology
Yu, Yang	Capital University of Physical Education and Sports
Dong, Lin	Capital University of Physical Education and Sports
Keywords: Rehabilitation and Assistive Robotics, Robot Design, Smart Structures, Materials, Actuators Abstract: Lower limb rehabilitation is a crucial aspect of recovery following various injuries and musculoskeletal disorders. Traditional techniques, while effective, can be time-consuming and require intensive therapist involvement. Powered wearable knee exoskeletons offer a promising solution by providing targeted torque assistance during rehabilitation exercises. This paper presents the design and control of a novel powered wearable knee exoskeleton specifically tailored for lower limb rehabilitation applications. The exoskeleton prioritizes user comfort, safety, and ease of use. We delve into the mechanical design considerations for achieving lightweight construction, comfortable wearability, and suitable range of motion. Subsequently, we elaborate on the control algorithm employed on the knee exoskeleton, focusing on its ability to improve rehabilitation efficiency.

10:45-17:30, Paper FrPP1.7
A Hybrid Approach to Real-Time Robotic Visual Navigation: Integrating Detection and Scene Segmentation (I)

Hu, Lingxiang	Paris Saclay University
Zhu, Xingfei	Jiangnan University
Li, Dun	Tsinghua University
Zhang, Fukai	Shandong University
Zhang, Chengqiu	Shandong University
Keywords: Robot Vision and Computer Vision, Mobile Robotics, Deep Learning Abstract: Robotic vision-based navigation is crucial for enabling autonomous operation and task execution in various environments. However, most current research focuses heavily on achieving high accuracy, often at the expense of real-time performance and efficiency. In this study, we propose a novel approach that balances both efficiency and accuracy by integrating a semantic segmentation branch into the YOLOv5 architecture. We designed a semantic decoder dedicated to semantic segmentation that can better integrate the information of multiple layers of backbone. Our enhanced model is trained on a custom dataset extracted from COCO which is specifically designed for robotic navigation. This integration significantly improves the efficiency and speed of the system, making it suitable for real-time applications. Experimental results demonstrate the effectiveness of our approach, showcasing superior performance in real-world robotic navigation tasks.

10:45-17:30, Paper FrPP1.8
A Comprehensive Study of Deep Learning Visual Odometry for Mobile Robot Localization in Indoor Environments

Hu, Lingxiang	Paris Saclay University
Li, Xiangjun	Université Paris-Saclay
Bonardi, Fabien	Université D’Évry
Ait Oufroukh, Naima	University of Paris-Saclay
Ahmed Ali, Sofiane	IBISC, Evry-Val-d’Essonne University, Universite Paris-Saclay, E
Keywords: Mobile Robotics, SLAM and Navigation, ROS, Software System for Robotics Application Abstract: This paper investigates the characteristics and performance of deep learning-based visual odometry for the localization of mobile robots in 2D indoor small-scale environments. Our study begins by developing a highly accurate multi-sensor fusion localization method that integrates data from various sensors, including cameras, inertial measurement units (IMUs), Marvelmind, and wheel encoders. Using this method, we created a comprehensive dataset that captures the robot's movements in controlled indoor settings. We then conducted an extensive comparative study of several deep learning-based visual odometry methods by evaluating their strengths and weaknesses using public datasets. From this comparison, we identified the Deep Patch Visual Odometry (DPVO) method as the most effective approach. Then, we made several enhancements to the DPVO method to further improve its localization capabilities. Subsequently, we applied the improved DPVO method to our newly created dataset, which allowed us to perform rigorous testing and validation in realistic indoor scenarios. The results were compared with localization data obtained from other modalities.

10:45-17:30, Paper FrPP1.9
Design of Photovoltaic Panel Vehicle Cleaning Robot (I)

Xu, Feng	Beijing Information Science & Technology University
Liu, Quan	Beijing Information Science and Technology University
Li, Zhenfeng	Beijing Information Science and Technology University
Keywords: Robot Design, Emerging Technologies and Applications Abstract: Dust and snow readily accumulate on photovoltaic panels and reduce their power generation efficiency. To address this, a photovoltaic panel vehicle cleaning robot is designed according to the spacing of the photovoltaic panels. The vehicle lifting frame adjusts the working height, and the rotating arm flexibly changes its tilt angle to fit the photovoltaic panel. The robot can therefore be adjusted to the height and installation inclination of the photovoltaic panel to achieve complete cleaning and ensure the cleaning effect, improving the power generation efficiency and service life of the panel.

10:45-17:30, Paper FrPP1.10
Design of Dictyophora Rubrovolvata Picking Robot (I)

Li, Zhenfeng	Beijing Information Science and Technology University
Liu, Quan	Beijing Information Science and Technology University
Xu, Feng	Beijing Information Science & Technology University
Ye, Lingbo	Beijing Information Science and Technology University
Keywords: Agricultural Robotics, Robot Design Abstract: Manual picking of Dictyophora indusiata suffers from low work efficiency, high labor intensity, and high labor cost, and the Dictyophora is easily damaged. To address these problems, a Dictyophora picking robot is designed in this paper. The robot consists of two parts: the motion system and the picking system. The motion system performs the horizontal and vertical motion of the robot during picking. The picking system uses a rotating cylinder and a suction cup to hold and pick the Dictyophora. A kinematic analysis of the picking mechanism ensures the stability and continuity of the robot. In addition, the miniaturized design reduces the cost of the robot and effectively improves picking efficiency.

10:45-17:30, Paper FrPP1.11
A Control Method for Electric Cylinder Based on Three-Order ADRC (I)

Jiang, ZhiFan	Nanchang Hangkong University
Chen, Zhihua	Nanchang Hangkong University
Tong, Lijun	Nanchang Hangkong University
Huang, Jiale	School of Information Engineering, Nanchang Hangkong University
Che, Yufei	Nanchang Hangkong University
Keywords: Intelligent Control, Industrial Robotics and Factory Automation, Smart Structures, Materials, Actuators Abstract: As an actuating component of robots, the electric cylinder must be controlled precisely for the robot to be controlled accurately. This article proposes a position closed-loop electric cylinder control method based on three-order active disturbance rejection control (ADRC). First, the electric cylinder system is modeled and its transfer function is determined. Second, PID and ADRC controllers are designed separately, and their stability is verified through the Routh criterion and the Lyapunov method, respectively. Finally, the control effects of PID and ADRC are compared through three sets of experiments. The results show that, compared to PID, ADRC is less affected by increases in input signal frequency, has stronger anti-interference ability, and can effectively resist the influence of internal uncertainty.

10:45-17:30, Paper FrPP1.12
Research on the Strike Strategy of Quadcopter Unmanned Aerial Vehicles for Mobile Targets (I)

Zhang, Wencai	Norinco Group Liao Shen Industries Group Co., Ltd
Sun, Junjie	Norinco Group Liao Shen Industries Group Co., Ltd
Li, Yanying	Norinco Group Liao Shen Industries Group Co., Ltd
Yan, Shen	Norinco Group Liao Shen Industries Group Co., Ltd
Zhang, Ning	Norinco Group Liao Shen Industries Group Co., Ltd
Wang, Jiaqi	Norinco Group Liao Shen Industries Group Co., Ltd
Chen, Zhihua	Nanchang Hangkong University
Pan, Mingran	Norinco Group Liao Shen Industries Group Co., Ltd
Keywords: Path and Motion Planning, Space Robotics, Artificial Intelligence Abstract: This paper studies a strike path strategy for quadcopter drones targeting maneuvering ground targets. The strategy assigns two different strike speeds along the strike path, which improves the stability and robustness of quadcopter drones while shortening strike time and increasing hit rates. The attitude control of the quadcopter unmanned aerial vehicle during motion is considered, and the flight reliability of the mechanism is verified through simulation experiments. Strike paths with different slopes are set and the optimal strike path is obtained through experiments, which also demonstrate the effectiveness of this strike strategy in engineering applications.

10:45-17:30, Paper FrPP1.13
Heart Rate Generation Algorithm Using Generative Adversarial Networks Based on Time-Frequency Domain Composite Loss Values

Lianlian, Cai	Changchun University of Architecture and Civil Engineering
Ziyao, Xu	Changchun University of Architecture and Civil Engineering
Zhao, Wanrong	Changchun University of Science and Technology
Ma, Jiayue	Changchun University of Science and Technology
Gao, Ziyi	Changchun University of Science and Technology
Huang, XuPeng	Changchun University of Science and Technology
Keywords: Artificial Intelligence, Deep Learning, Machine Learning Abstract: With the proliferation of smart medical devices, there is a significant need for extensive data collection, which raises concerns regarding privacy protection and data collection costs. In recent years, the development of Generative Adversarial Network (GAN) technology for heart rate signal generation has demonstrated superior generative capabilities and data diversity. In response, we designed a frequency domain loss function tailored to the frequency component of time-series signals, combined with a frequency domain attention mechanism, and integrated the advantages of Transformers to further optimize the model fitting. We assessed the model's generative performance using RMSE and MAE metrics. The results indicate that our model, which utilizes a hybrid time-frequency domain training approach, performs better than similar models, thus proving its effectiveness.
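The RMSE and MAE metrics named in this abstract have standard definitions, which can be sketched as follows. This is a minimal illustration, not the authors' evaluation code; the function names and the sample heart-rate values are hypothetical.

```python
import math


def rmse(reference, generated):
    """Root-mean-square error between two equal-length sequences."""
    if not reference or len(reference) != len(generated):
        raise ValueError("sequences must be non-empty and equal in length")
    squared = sum((r - g) ** 2 for r, g in zip(reference, generated))
    return math.sqrt(squared / len(reference))


def mae(reference, generated):
    """Mean absolute error between two equal-length sequences."""
    if not reference or len(reference) != len(generated):
        raise ValueError("sequences must be non-empty and equal in length")
    return sum(abs(r - g) for r, g in zip(reference, generated)) / len(reference)


# Hypothetical heart-rate samples in beats per minute (illustrative only).
real_bpm = [72.0, 75.0, 78.0, 74.0]
synthetic_bpm = [70.0, 76.0, 80.0, 74.0]
print(rmse(real_bpm, synthetic_bpm), mae(real_bpm, synthetic_bpm))
```

RMSE penalizes large deviations more heavily than MAE, so reporting both gives a rough sense of whether errors are uniform or dominated by a few outliers.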

10:45-17:30, Paper FrPP1.14
Time Series Prediction of EEG Signals Based on a Multi-Scale Approach

Fei, Gao	Changchun University of Architecture and Civil Engineering
Yunfei, Teng	Changchun University of Architecture and Civil Engineering
Jing, Li	Jilin Xinzhou Intelligent Technology Co., Ltd
Pang, Weizhi	Changchun University of Science and Technology
Huang, Guanhua	Changchun University of Science and Technology
Tianyun, Luan	Changchun University of Science and Technology
Keywords: Artificial Intelligence, Deep Learning, Machine Learning Abstract: EEG signals are non-periodic, large in amplitude, and uncertain, which makes judgment and prediction difficult. To address this problem, an EEG signal classification method for epileptic seizure prediction based on multi-scale feature extraction is adopted. It uses dilated (atrous) convolution to extract multi-scale features of EEG signals, capturing both global and local features of the signals. The experiments were performed on the CHB-MIT public dataset. The results show that the method significantly outperforms the traditional method in accuracy, sensitivity, and specificity, with an accuracy of 98.10%, a sensitivity of 92.95%, and a specificity of 98.85%. This indicates that the method based on multi-scale feature extraction has good prospects for epileptic seizure prediction.
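Accuracy, sensitivity, and specificity, the three figures reported in this abstract, are conventionally derived from binary confusion-matrix counts. The sketch below shows those standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn count positive (e.g. pre-seizure) segments that were detected/missed;
    tn/fp count negative segments that were correctly/incorrectly classified.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only.
acc, sens, spec = classification_metrics(tp=90, fp=20, tn=880, fn=10)
print(f"accuracy={acc:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

When negative segments heavily outnumber positive ones, accuracy is dominated by specificity, which is why sensitivity is usually reported separately for seizure prediction.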

10:45-17:30, Paper FrPP1.15
Multi-Scale Heart Rate Anomaly Detection Based on Temporal Convolutional Networks

Lianlian, Cai	Changchun University of Architecture and Civil Engineering
Qiao, Wu	Changchun University of Architecture and Civil Engineering
Di, Zhang	Changchun University of Architecture and Civil Engineering
Liang, Qizhe	Changchun University of Science and Technology
Cong, Xinxin	Changchun University of Science and Technology
Huang, XuPeng	Changchun University of Science and Technology
Keywords: Artificial Intelligence, Machine Learning, Deep Learning Abstract: Addressing heart rate anomaly detection is crucial for preventing and managing cardiovascular diseases. This research leverages the Temporal Convolutional Network (TCN) to develop a multi-scale heart rate anomaly detection method, enhancing the accuracy and efficiency of anomaly recognition. Utilizing the MIT-BIH Arrhythmia Dataset, the ECG data is preprocessed with a Butterworth bandpass filter and combined with time-domain and frequency-domain analyses to extract signal features at key time points through PCA. A novel TCN model is designed to capture both high-frequency and low-frequency information of heart rate signals using varying convolutional kernel scales, integrated with ResNet technology to improve the learning of abstract features. Experimental results show that this model excels in multi-class heart rate anomaly detection with an accuracy of up to 99.41%, significantly outperforming comparable models. Additionally, by aggregating features through a multi-head attention mechanism, the model's capability to capture complex nonlinear characteristics of ECG signals is further enhanced, validating its feasibility and effectiveness.

10:45-17:30, Paper FrPP1.16
Research on Wearable Discharge Fault Detection Technology for Electrical Equipment within the Audible Range Using Deep Learning

Gao, Hongyi	Changchun University of Science and Technology
Bai, Xuemei	Changchun University of Science and Technology
Wang, Zhijun	High-Performance Computing Center of Changchun Normal University
Zhang, Chenjie	Changchun University of Science and Technology
Keywords: Deep Learning Abstract: Transformers and other power equipment form the solid foundation of the power grid, and their insulation performance is crucial for the safe and stable operation of the power system. Accurate assessment of the insulation performance of power equipment is essential to ensure the health status of the equipment and prevent potential failures. However, traditional partial discharge detection methods are often constrained by cumbersome shutdown detection processes, sensitivity to electromagnetic interference, and the limitations of manual detection. Inspired by the exquisite craftsmanship of the "Great Craftsman" in diagnosing faults through audition, this paper innovatively adopts a detection method based on audible sound analysis. We utilize advanced sound feature extraction techniques of spectrograms, combined with the powerful learning capabilities of convolutional neural networks, to achieve rapid detection and precise identification of discharge faults in power equipment. It has been successfully deployed on wearable devices such as smartwatches. This enables operators to perform real-time monitoring and intelligent diagnosis of power equipment anytime, anywhere, providing strong technical support for the stable operation of the power system.

10:45-17:30, Paper FrPP1.17
GHTPAN: A Graph Head-Tail Partner Attention Network for Traffic Flow Prediction

Yang, Yang	Changchun University of Architecture and Civil Engineering
Lin, Kai	Changchun University of Architecture and Civil Engineering
Yusong, Ke	Tongji
Xu, Cecheng	Changchun University of Science and Technology
Huang, Fuzhong	Changchun University of Science and Technology
Guo, Hongxi	Changchun University of Architecture and Civil Engineering
Huang, XuPeng	Changchun University of Science and Technology
Keywords: Deep Learning, Artificial Intelligence, Intelligent Control and Systems Abstract: Traffic prediction is the core of intelligent transportation systems (ITS). However, it is a challenging task to construct an accurate prediction model that can cope with the complexity of real-world road networks. In a traffic network, the traffic flow of different road sections or intersections changes dynamically with time, and the flows of spatially adjacent areas influence each other. Most existing traffic flow prediction methods lack the ability to model the dynamic spatio-temporal correlation in traffic data, so they cannot obtain satisfactory prediction results. To capture nonlinear traffic flow data and avoid the problem of vanishing or exploding gradients when processing time series data, we propose an innovative approach called Graph Head-Tail Partner Attention Network (GHTPAN). Specifically, the head-tail partner attention module is introduced. Through the dual channel and weighted attention mechanism, attention weights are automatically assigned to the traffic sensor nodes in the road network, and the spatial relationship between non-adjacent nodes is dynamically attended to. Combined with the graph convolutional neural network, the local spatial dynamic correlation characteristics of the road network are extracted. A series of comprehensive experiments using highway traffic datasets PEMS03, PEMS04 and PEMS08 not only verify the excellent performance of the GHTPAN model, but also highlight its cutting-edge capabiliti

10:45-17:30, Paper FrPP1.18
Deep Learning-Based Low-Light Image Enhancement Method for Driving Behavior Recognition

Zhang, Chenjie	Changchun University of Science and Technology
Li, Jialu	Changchun University of Science and Technology
Hu, Hanping	Changchun University of Science and Technology
Bai, Xuemei	Changchun University of Science and Technology
Keywords: Deep Learning, Intelligent Transportation Systems Abstract: When image acquisition is carried out in low-light scenes, there are usually problems such as insufficient light and too much noise, which brings difficulties to the subsequent detection of distracted driving behavior. In this paper, Zero-DCE is adopted as the low-light image enhancement model, and deconvolution layer is added to compensate the details and restore the fine structure in the image, thus improving the brightness and clarity of the image. The residual block is introduced to reduce noise and improve clarity, and the kernel selection module is added to solve the noise problem. The kernel selection module can adaptively select the appropriate kernel function to reduce the impact of noise. Finally, the image quality assessment methods PSNR and SSIM are used for testing and analysis. SSIM increased by 42.62%. The results show that the proposed improved method can effectively improve image brightness, reduce noise effects and recover key details, laying a foundation for subsequent distracted driving detection methods.
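Of the two image quality measures named in this abstract, PSNR has a compact closed form, 10·log10(MAX² / MSE), which is sketched below for grayscale images stored as nested lists. The function name and pixel values are hypothetical; SSIM is more involved and is not covered here.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized grayscale images.

    Images are 2D lists of pixel intensities in [0, max_value].
    """
    flat_ref = [p for row in reference for p in row]
    flat_enh = [p for row in enhanced for p in row]
    if not flat_ref or len(flat_ref) != len(flat_enh):
        raise ValueError("images must be non-empty and equal in size")
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_enh)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 patches, for illustration only.
ground_truth = [[100, 120], [140, 160]]
restored = [[102, 118], [141, 159]]
print(psnr(ground_truth, restored))
```

Higher PSNR means the enhanced image is closer to the reference; it requires a well-lit reference image for each low-light input.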

10:45-17:30, Paper FrPP1.19
Lightweight UAV Image Drowning Detection Method Based on Improved YOLOv7

Cui, Yuhao	Changchun University of Science and Technology
Li, Mingqiu	Changchun University of Science and Technology
Huang, XuPeng	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Keywords: Deep Learning, Robot Vision and Computer Vision Abstract: Using Unmanned Aerial Vehicles (UAVs) to search for drowning individuals can enhance the efficiency of search and rescue operations. To address the large number of parameters and high computational cost of the YOLOv7 model, as well as the difficulty of deploying it on UAVs, a lightweight UAV image drowning detection method based on improved YOLOv7 is proposed. We replace the backbone in YOLOv7 with MobileNetV4 and reconstruct the SPPCSPC structure to reduce the model's parameter count and computational complexity, while improving detection accuracy by improving convolutional layers and changing pooling structures. Additionally, we introduce a CA attention mechanism to address the issue of limited feature information due to the limited area of human bodies exposed on water surfaces. Finally, we use Wise-IoU as the model loss function to mitigate the negative impact of low-quality samples and improve overall detector performance. Experimental results show that, compared to YOLOv7, our improved algorithm sacrifices only 1.08% mAP while reducing the number of parameters by 46.7% and increasing detection speed by 97.8%, better balancing detection speed with accuracy.

10:45-17:30, Paper FrPP1.20
A Method for Generating Environmental Maps Based on the Improved YOLOv8

Yan, Mingtao	Qingdao University
Sun, Chuanzhu	Qingdao University
Fu, Chaoxing	Qingdao University
Keywords: Deep Learning, Robot Vision and Computer Vision, Artificial Intelligence Abstract: To improve environmental mapping accuracy in complex workshop environments with poor lighting and high ground reflectivity, this paper proposed enhancements to the traditional grid map generation algorithm using instance segmentation. A simulation environment was created to collect data, and several instance segmentation algorithms were compared, with YOLOv8 chosen as the best. Due to the slow inference speed and poor real-time performance on industrial computers, lightweight modifications were made to YOLOv8. These included enhancing the backbone network with the Rep Ghost module, introducing the MPDIoU loss function to improve accuracy without increasing computational load. Three improvement methods were tested, showing that the enhanced YOLOv8 algorithm reduced parameters by 10%, increased inference speed by 19%, and achieved 100% accuracy in grid map generation. This improved algorithm meets the needs for accuracy and speed on resource-limited industrial computers.

10:45-17:30, Paper FrPP1.21
Research on the Impact of Types of Knowledge Workers' Co-Creation with AI on Purchase Intention of Product

Tu, Yangjun	Hunan University
Jiang, Simin	Hunan University
Qian, Guanyu	Hunan University
Yang, Zhi	Hunan University
Keywords: Artificial Intelligence, Human-Robot Interaction and Cooperation Abstract: The co-creation between knowledge workers and AI has become a hotbed of innovation, with three typical types of co-creation emerging in practice (such as human-led, etc.). However, the impact of co-creation types on purchase intention of products has not been fully explored. This study uses an experimental method to explore the impact of the co-creation types between employees and AI on purchase intention in three production scenarios. The results show that the type of co-creation significantly affects purchase intention, with the temporal and content characteristics of the co-creation process playing a moderating role. The research enriches the theory of knowledge work and provides practical guidance for improving product purchase intention and promoting innovation.

10:45-17:30, Paper FrPP1.22
Defect Detection in Solar Panels Based on PV-Flow

Niu, Minghao	Yanshan University
Li, Jiaqi	University
Li, Mengyu	Yanshan University
Zhao, Yifan	Yanshan University
Liu, Ying	The Yingli Energy Development Co., Baoding
Wen, Shuhuan	Yanshan University
Keywords: Artificial Intelligence, Industrial Robotics and Factory Automation, Deep Learning Abstract: With technological advancements, resource utilization has increased, making renewable energy sources like solar power essential. Effective defect detection for solar panels is crucial. However, manual inspection is currently inefficient and machine vision lacks accuracy. To improve defect detection in high-resolution images and overcome the limitations of existing methods, we propose an unsupervised deep learning model called PV-Flow, which combines a hybrid attention mechanism and a multiscale feature extraction module of the flow model. Before defect detection, the defects must be categorized and a dataset built; after observing their features, we classify the defects into black spots, broken grids, hidden cracks, and fragmentations. The performance of the model in defect detection is evaluated by metrics of reconstructed images and visual comparisons. Experiments on real solar panel datasets show that our proposed PV-Flow model accurately identifies the location of broken grid defects and shows more details, with even the smallest defects clearly labeled and magnified. This suggests that our proposed model is able to detect potential defects earlier, thus reducing production risks and costs. We also conducted comparative experiments to further validate the reliability of the model.

10:45-17:30, Paper FrPP1.23
A Model-Based Control Framework for a Novel Combined Octopod Robot

Zhang, Yi	Shandong University
Zhou, Lelai	Shandong University
Sun, Jingyu	Shandong University
Fan, Shenglin	Shandong University
Li, Guowei	Shandong University
Sui, Mingjun	National Innovation Center of High-Speed Rail Technology (Qingd
Dai, Xiaomeng	China Railway Construction Corporation Bridge Engineering Bureau
Li, Yibin	Shandong University
Keywords: Biologically Inspired Robotics, Intelligent Control Abstract: The legged robot with a waist structure has a more flexible body structure than that with a rigid body. To address the complexity of the waist structure, this paper proposes a lightweight combined octopod robot consisting of two quadruped robots connected back and forth by a novel detachable docking mechanism. To address the issue of performance loss caused by model predictive control (MPC) in high-dimensional nonlinear systems of octopod robots, a centralized control framework for nonlinear model predictive control (NMPC) and whole body control (WBC) is proposed to optimize the contact forces and joint torques. Apart from that, an attitude adaptation control strategy is designed in order to improve the traversing capability in the steep slope terrain. The combined octopod robot platform and motion control framework have been evaluated through a series of experiments, and the results show that the robot has excellent omnidirectional motion capability, as well as strong terrain adaptability and traversing ability in sloping terrain.

10:45-17:30, Paper FrPP1.24
Regularization Continual Learning Based on Bayesian Uncertainty Modeling for P300 Brain-Computer Interface

Huang, Qianqi	South China University of Technology
Gu, Zhenghui	South China University of Technology
Keywords: Brain-Machine Interface, Deep Learning Abstract: The P300 is a specific component of the event-related potentials (ERPs) and has been extensively utilized in brain-computer interface (BCI) applications. Numerous neural network models have been implemented to detect P300 and achieve outstanding results in intra-subject scenarios. However, in real-world situations where data streams from different subjects arrive sequentially, intra-subject models are time-consuming and resource-intensive. This necessitates research into cross-subject scenarios. To address this issue, we propose a continual learning (CL) method, named Elastic Weight Consolidation with Bayesian Neural Network (EWC-BCNN), for cross-subject P300 decoding. Specifically, EWC-BCNN comprises two main modules: EWC and BCNN. EWC employs a regularization term to penalize significant changes in parameters deemed important. Additionally, BCNN quantifies parameter uncertainty, identifying and protecting crucial knowledge. By integrating BCNN, EWC can make more informed decisions about which weights to preserve and to what extent, thus enhancing its ability to balance the retention of past knowledge with the adaptation to new information. We evaluated our method on two public P300 datasets. Our experimental results demonstrate that EWC-BCNN achieves better P300 detection performance than point-estimate networks. Furthermore, EWC-BCNN outperforms other state-of-the-art CL methods.
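The regularization term that this abstract attributes to EWC is commonly written as a quadratic penalty, (λ/2)·Σ F_i (θ_i − θ*_i)², where F_i is an importance weight (typically a diagonal Fisher information estimate) and θ* are the parameters learned on earlier data. The sketch below shows only that standard form; the function names and values are hypothetical, and how the paper's BCNN uncertainty estimates enter the penalty is not modeled.

```python
def ewc_penalty(params, anchor_params, importance, strength):
    """Standard EWC penalty: (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    if not (len(params) == len(anchor_params) == len(importance)):
        raise ValueError("all parameter vectors must have the same length")
    weighted = sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, importance)
    )
    return 0.5 * strength * weighted


def total_loss(task_loss, params, anchor_params, importance, strength=1.0):
    """Loss on the new subject plus the penalty that protects earlier knowledge."""
    return task_loss + ewc_penalty(params, anchor_params, importance, strength)


# Hypothetical two-parameter model: only the first weight has moved.
print(total_loss(0.3, [1.0, 2.0], [0.0, 2.0], [4.0, 10.0], strength=2.0))
```

Parameters with large importance are held close to their earlier values, while unimportant ones remain free to adapt to data from a new subject.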

10:45-17:30, Paper FrPP1.25
A Collaborative Action Sequence Planning Method for Dual-Arm Piano Playing Robot Arm-Hand Based on Elite Retention Multi-Objective Optimization

Sun, Wei Dong	Nanjing University of Aeronautics and Astronautics
Jiang, Jincheng	Nanjing University of Aeronautics and Astronautics
Bin, YiMing	Nanjing University of Aeronautics and Astronautics
Yu, Yanzhao	Nanjing University of Aeronautics and Astronautics, Qinhuai Dist
Wang, Lingyu	Nanjing University of Aeronautics and Astronautics
Zhang, Siwen	Nanjing University of Aeronautics and Astronautics
Gui, Kai	Shanghai Jiaotong University
Wan, Minhong	Zhejiang Lab
Duan, Jinjun	Nanjing University of Aeronautics and Astronaut
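Keywords: Humanoid Robots, Path and Motion Planning, Intelligent Control Abstract: As a typical representative of humanoid robots, the piano-playing robot demonstrates dexterous dual-arm manipulation, and research on it can accelerate the development of humanoid robot applications. To address the insufficient rhythm and completeness of performance caused by the difficulty of hand-to-hand coordination during piano playing, a collaborative action sequence planning method for a dual-arm piano-playing robot based on elite-retention multi-objective optimization is proposed. First, by analyzing the arm-hand motion states during human piano playing and incorporating professional performance evaluation metrics, a performance-based arm-hand motion model of the piano-playing robot is established. Second, using the elite…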

10:45-17:30, Paper FrPP1.26
An Aviation Plug Regrasping Strategy Fused with Prior Knowledge Perception for Dual-Arm Collaborative Assembly Task

Jiang, Jincheng	Nanjing University of Aeronautics and Astronautics
Gui, Kai	Shanghai Jiaotong University
Sun, Wei Dong	Nanjing University of Aeronautics and Astronautics
Guo, An	Nanjing University of Aeronautics and Astronautics
Wang, Zhengwei	Nanjing University of Aeronautics and Astronautics
Wang, Lingyu	Nanjing University of Aeronautics and Astronautics
Duan, Jinjun	Nanjing University of Aeronautics and Astronaut
Keywords: Humanoid Robots, Robot Vision and Computer Vision, Grasping and Manipulation Abstract: The assembly quality of typical electrical interfaces, such as aviation plugs, determines the safety of the aircraft. It is difficult to perform dexterous manipulation and assembly of randomly placed objects with a single-robot assembly method. To solve these problems, an aviation plug regrasping strategy fused with prior knowledge perception for dual-arm collaborative assembly tasks is proposed in this paper. First, the prior knowledge base of the assembly object is constructed based on the 3D point cloud model, and the transfer of prior knowledge is completed through a transfer matrix. Second, the recognition and pose estimation of randomly placed assembly objects are realized based on prior knowledge that fuses SHOT features and the improved PCA algorithm. Then, based on the reachability and manipulability indices, the optimal transfer pose for the dual-arm robot is obtained, and a regrasping strategy based on the motion transition graph is proposed to realize the regrasping manipulation of randomly placed assembly objects. Finally, a dual-arm robotic system experimental platform is constructed to validate the proposed algorithms. Experimental results show that the proposed perception method is robust for randomly placed assembly objects, and the regrasping strategy allows the dual-arm robot to adjust the pose of the assembly object before the final assembly task.

10:45-17:30, Paper FrPP1.27
EM-SAM: Eye-Movement-Guided Segment Anything Model for Object Detection and Recognition in Complex Scenes

Li, Jinqi	National University of Defense Technology
Yu, Yang	National University of Defense Technology
Zhou, Junfan	National University of Defense Technology
Wang, Chinan	National University of Defense Technology
Zeng, Ling-Li	National University of Defense Technology
Keywords: Human-Robot Interaction and Cooperation, Deep Learning, Human-Machine Interface Abstract: Over the past few decades, object detection and recognition systems have made great strides relying on convolutional neural networks (CNNs). However, these methods perform poorly in small object detection and complicated natural scenes. There is still considerable room for improvement in enhancing small object feature extraction and eliminating the effects of complex scenes. Compared with computers, humans can automatically ignore redundant information in complex scenes and focus attention on suspected objects. To this end, we propose an Eye-Movement-Guided Segment Anything Model for Object Detection and Recognition in Complex Scenes. Specifically, the framework includes eye movement state classification for acquiring gaze points, object detection based on gaze point segmentation to remove the influence of complex scenes, and target recognition for lowering the confidence threshold. In the object detection using the Segment Anything Model (SAM), the object detection success rate was 97.6% ± 3.19% when the false detection rate was 9.62% ± 6.45%. Compared to the baseline, the recall for object recognition increased from 85.71% to 91.83%, and the mAP increased by 5.33%. Meanwhile, the average time for selecting targets with the framework was 1.35 s, improving the user's ability to interact with the environment. Keywords: object detection and recognition, eye movement, SAM, complex scenes.

10:45-17:30, Paper FrPP1.28
Research on Robotic Grinding Technology of Precision Forged Blade Edge for Aero Engine

Tao, Yong	Beijing University of Aeronautics and Astronautics
Yang, Lin	Beihang University
Xue, Jiao	Beihang University
Liu, Yazui	Beihang University
Li, Zhiyong	Changhe Aircraft Industries (GROUP) Ltd
Li, Wen	Changhe Aircraft Industries (GROUP) Ltd
Keywords: Industrial Robotics and Factory Automation, Intelligent Control, Artificial Intelligence Abstract: High-efficiency robotic edge grinding of precision forged blades for aero engines can significantly improve blade machining quality, enhance engine performance stability, reduce rework and additional maintenance, and lower overall operating and maintenance costs. Therefore, the robotic edge grinding technology of precision forged blades of aero engines is investigated in this paper. Firstly, based on the characteristics of the robot, the robot dynamics model is established by the D-H method. On this basis, precise control of the edge grinding trajectory is achieved using a polynomial-fitted data source and an inverse kinematics solution. Finally, an example with a certain type of engine blade is given to verify the process. The results show that the proposed method is reasonable and effective, and the efficiency of blade machining is improved.

10:45-17:30, Paper FrPP1.29
Stiffness Performance Optimization Method of Drilling Robot Based on QPSO Algorithm

Zhang, Yufan	Beihang University
Tao, Yong	Beijing University of Aeronautics and Astronautics
Wei, Hongxing	Beihang University
Liu, Haitao	BUAA
Wan, Jiahao	Beihang University
Guo, Ruijun	Beijing C.H.L. Robotics Co., Ltd
Keywords: Industrial Robotics and Factory Automation, Path and Motion Planning Abstract: Due to their high flexibility and intelligence, industrial robots are being increasingly used in aeronautical processing tasks. However, the inherently low stiffness of robots seriously affects the processing quality of their operations. Studying their stiffness characteristics and optimization methods is an effective way to improve the stiffness performance of robots. This paper proposes a robot posture optimization method based on the quantum particle swarm algorithm to solve the problem of vibration during drilling. Firstly, the kinematic model and static stiffness model of the robot are established, and then a stiffness index oriented to the machining plane is proposed. Next, a new performance evaluation index is obtained by normalization fusion with manipulability to quantitatively evaluate the stiffness performance of the robot. After this, the drilling posture is optimized by the quantum particle swarm optimization algorithm under the boundary constraints and property constraints of the robot. Finally, drilling experiments were carried out on an AUBO i5 robot to verify the accuracy of the performance evaluation coefficient and optimization method. Compared with non-optimization and particle swarm optimization, the robot performance was improved by 32.28% and 8.57% on average, indicating that this method can effectively improve the robot stiffness performance.

10:45-17:30, Paper FrPP1.30
Application of Fuzzy PID Algorithm in Path Control of Intelligent Tracking Vehicles

Zhou, Wenlong	University of Jinan
Jun, Wei	University of Jinan
Hu, Yan	Beihang University, Beijing, China. or University of Jinan, Jinan,
Liu, Lei	University of Jinan
Wang, Yiran	University of Jinan
Keywords: Intelligent Control and Systems, Path and Motion Planning Abstract: This paper addresses the path control problem of an intelligent self-tracking car in dynamic environments and proposes a control algorithm based on fuzzy PID. The algorithm controls the car's pose by adjusting the speed difference of the driving wheels, combining the high adaptability of fuzzy logic with the efficiency of PID control to handle the uncertainty of input signals. The algorithm detects position and angle deviations in real time, adjusts the control strategy accordingly, and regulates the driving wheel speed, thereby quickly and accurately correcting the car's travel direction and position and significantly reducing errors. By constructing a kinematic model of the AGV and using Simulink for simulation, the results show that fuzzy PID performs well in tracking step signals. When there is an offset input, the fuzzy PID controller makes the AGV converge to the predetermined trajectory more quickly, with smaller overshoot, less fluctuation, smoother changes, and stronger system stability. Additionally, this paper compares the fuzzy PID and traditional PID algorithms in depth under various experimental path conditions. The experimental results show that the fuzzy PID algorithm outperforms the traditional PID in various experimental environments, with smaller variance, especially in complex path tracking tasks. It achieves lower curve error, controls the car's state more accurately, and significantly enhances navigation performance.

10:45-17:30, Paper FrPP1.31
Time-Frequency Domain Transformation Space-Time Graph Convolution

Huang, XuPeng	Changchun University of Science and Technology
Yang, Yang	Changchun University of Science and Technology
Yang, Chao	Beijing Aerospace Times Laser Inertial Technology Company
Cui, Yuhao	Changchun University of Science and Technology
Keywords: Intelligent Transportation Systems, Deep Learning Abstract: Traffic forecasting is a crucial element of Intelligent Transportation Systems (ITS) and plays a significant role in traffic planning, management, and control. Graph Convolutional Network (GCN)-based models have become the standard approach for traffic prediction due to their capability to model spatial correlations through the mechanism of message passing to transmit traffic information. However, existing models often fall short in effectively capturing the frequency information of traffic flow and delivering accurate long-term predictions. To address the challenge of current model frameworks that struggle to learn the periodic nature of traffic information, this paper introduces a novel architecture that integrates bidirectional time-frequency transformations. This design utilizes the Fourier Transform to perform graph convolution in the frequency domain, followed by the Inverse Fourier Transform for time-domain prediction. Additionally, to enhance the model’s learning capacity, we incorporate an adaptive learning unit and a multi-scale recurrent neural network structure. The proposed model has been validated using two real-world datasets and compared with six other typical forecasting models, consistently outperforming them in predictive accuracy.

10:45-17:30, Paper FrPP1.32
Autonomous Prediction of UAV Launch Parameters Based on SSA-BP Model

Jia, Huayu	University of Chinese Academy of Sciences
Zheng, HuiLong	University of Chinese Academy of Sciences
Huo, Shunbo	University of Chinese Academy of Sciences
Zhou, Hong	University of Chinese Academy of Sciences
Zhang, Qian	University of Chinese Academy of Sciences
Keywords: Robot Design, Intelligent Control, Artificial Intelligence Abstract: The coupling of key parameters such as launch angle, booster pinch angle, and booster thrust affects the trajectory of a rocket-boosted zero-length launch, which makes the optimal selection of launch parameters difficult. In this paper, a UAV is taken as the research object, dynamics and kinematics models of its launch stage are built, and parametric simulation software for the UAV launch trajectory is designed based on QT/C++. Meanwhile, a UAV launch parameter prediction method based on the SSA-BP model is proposed, and the superiority of SSA-BP for launch parameter prediction is comprehensively evaluated using the MAE, MAPE, and RMSE metrics. The results show that the UAV launch trajectory simulation and launch parameter prediction model in this paper are effective and credible, and can provide a basis for the design of UAV launch systems.
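For readers unfamiliar with the three evaluation metrics named in this abstract, a minimal sketch of their standard definitions follows. The function names and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical launch-angle targets and predictions (degrees), for illustration only.
targets = [10.0, 20.0, 40.0]
predictions = [12.0, 18.0, 40.0]
print(mae(targets, predictions), mape(targets, predictions), rmse(targets, predictions))
```

Lower values indicate better predictions for all three metrics; MAPE is scale-free, which helps when the predicted parameters have different units.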

10:45-17:30, Paper FrPP1.33
Obstacle Avoidance Path Planning for Robotic Arm Based on EIT Tactile Sensing (I)

Meng, Kai	Hebei University of Technology
Zheng, Wendong	Tianjin University of Technology
Weng, Ling	Hebei University of Technology
Liu, Huaping	Tsinghua University
Keywords: Sensing, Haptic System, Path and Motion Planning, Intelligent Control Abstract: This study introduces a novel flexible tactile sensor using electrical impedance tomography (EIT) for tactile sensing and integrates it into a robotic arm to enable it to have tactile sensing capabilities. In order to enable the robot to autonomously perform interactive operations in an unstructured open environment, this paper proposes a path planning algorithm based on the artificial potential field method in combination with the above-mentioned tactile sensor to achieve autonomous obstacle avoidance of the robotic arm. Specifically, a repulsive potential field is proposed, which is adjusted according to the height of the robotic arm to guide it around detected obstacles. In order to verify the effectiveness of the proposed flexible tactile sensor and obstacle avoidance path planning strategy, the designed tactile sensor and obstacle avoidance algorithm were integrated into the UR5 robotic arm platform for experimental verification. The experimental results show that the robotic arm can accurately detect tactile targets in the environment, and can autonomously adjust the motion path in real time according to the environment, realizing autonomous and safe interaction of the robotic arm in actual scenarios. This study helps to improve the autonomy and operational safety of robots in complex environments, and has important theoretical and practical significance in the field of robot tactile perception and obstacle avoidance navigation.
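The artificial potential field idea mentioned in this abstract can be sketched with the classic textbook formulation: a goal attracts the end effector and each detected obstacle repels it within an influence radius. This is a generic illustration only; the height-dependent adjustment of the repulsive field described in the abstract is not reproduced, and all names and gains (`k_att`, `eta`, `d0`) are assumptions.

```python
import math

def attractive_force(pos, goal, k_att=1.0):
    """Force pulling the current position toward the goal (linear spring)."""
    return tuple(k_att * (g - p) for p, g in zip(pos, goal))

def repulsive_force(pos, obstacle, eta=1.0, d0=1.0):
    """Force pushing away from an obstacle; zero beyond influence radius d0.

    Negative gradient of U = 0.5 * eta * (1/d - 1/d0)**2 for d < d0.
    """
    diff = tuple(p - o for p, o in zip(pos, obstacle))
    d = math.sqrt(sum(c * c for c in diff))
    if d >= d0 or d == 0.0:
        return tuple(0.0 for _ in pos)
    magnitude = eta * (1.0 / d - 1.0 / d0) / (d * d)
    return tuple(magnitude * c / d for c in diff)

def total_force(pos, goal, obstacles):
    """Sum of the attractive force and all repulsive forces."""
    force = attractive_force(pos, goal)
    for obs in obstacles:
        rep = repulsive_force(pos, obs)
        force = tuple(f + r for f, r in zip(force, rep))
    return force

# Hypothetical planar example: an obstacle contact reported at the origin.
print(total_force((0.5, 0.0), (2.0, 0.0), [(0.0, 0.0)]))
```

In a planner, the position would be stepped a small distance along the total force at each iteration until the goal is reached.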

10:45-17:30, Paper FrPP1.34
AvTF: A Visual-Tactile Fingertip with Active Sensing and Manipulation

Zhao, Jie	Anhui University of Technology
Wang, Sicheng	New York University
Shan, Jianhua	Anhui University of Technology
Sun, Yuhao	Anhui University of Technology
Zhang, Shixin	China University of Geosciences (Beijing), Tsinghua Univer
Sun, Fuchun	Tsinghua University
Fang, Bin	Beijing University of Posts and Telecommunications / Tsinghua Un
Keywords: Sensing, Haptic System, Robot Vision and Computer Vision, Industrial Robotics and Factory Automation Abstract: In the interaction between dexterous hands and the environment, perception and operation are inseparable. Perception provides information and feedback for operations, and operations in turn affect perception. Current visual-tactile sensors (VTS) lack initiative and cannot sense the state of the operated object in the world coordinate system. The Active visual-tactile Fingertip (AvTF) introduced in this article is a vision-based fingertip tactile sensor (finger-shaped, with full-field sensing) that uses a camera to capture tactile images. Sensing and operation are integrated by adding an active degree of freedom (DoF). Camera modules in different coordinate systems provide the sensor with various perspectives. In a series of experiments, AvTF achieved real-time feedback on the grasping and manipulation of objects in the hand, providing a new way for dexterous hands to better fuse sensing and operation.

10:45-17:30, Paper FrPP1.35
Virtual Variable Stiffness Control Method for Quadruped Robot Leg Based on Contact Event

Sun, Jingyu	Shandong University
Zhou, Lelai	Shandong University
Zhang, Yi	Shandong University
Li, Guowei	Shandong University
Fan, Shenglin	Shandong University
Sui, Mingjun	National Innovation Center of High-Speed Rail Technology (Qingd
Dai, Xiaomeng	China Railway Construction Corporation Bridge Engineering Bureau
Li, Yibin	Shandong University
Keywords: Mobile Robotics, Sensing, Haptic System Abstract: To address the way a quadruped robot's task focus varies with different contact events, this paper proposes a virtual stiffness adaptive compensation strategy that introduces the perception of proprioceptive disturbance dynamics. The method first relies on floating-base dynamics and generalized momentum to estimate the disturbance force on the quadruped robot. Then, different contact events at the base and legs of the robot are recognized by building a band-pass filter and an isolation filter for the estimated force and by classifying the contact events into hard and soft contacts. Finally, by combining this method with the virtual model control of the quadruped robot and its swing leg, the recognized contact events help the robot change the virtual coefficients, realizing more accurate leg trajectory tracking.

10:45-17:30, Paper FrPP1.36
Design and Realization of a Multimodal Wheel-Leg Hybrid Mobile Robot

Xie, Zijian	Beijing University of Civil Engineering and Architecture
Qin, Jianjun	Beijing University of Civil Engineering and Architecture
Li, Haibo	Beijing University of Civil Engineering and Architecture
Fang, Shihao	Beijing University of Civil Engineering and Architecture
Keywords: Mobile Robotics, Robot Design Abstract: A wheel-leg hybrid mobile robot with multi-motion morphology is designed for the high mobility and terrain adaptability needs of robots in unstructured environments. A single actuator can be used to realize the free switching of the robot's wheel-leg morphology, and the deformation mode of the wheel-leg varies with the rotation direction of the actuator. The parameters of the deformation mechanism were designed by establishing a mathematical model, the motion capability in different morphologies was analyzed, and kinematic simulations were carried out in ADAMS software. Finally, a physical prototype was fabricated and tested for motion in an unstructured environment to verify the practicality of the robot structure.

10:45-17:30, Paper FrPP1.37
Indoor Mobile Robot Map Construction Based on Improved Cartographer Algorithm

Zheng, Zhong	School of Electrical and Electronic Engineering, Changchun Unive
Xie, Mujun	Changchun University of Technology
Jiang, Changhong	School of Electrical and Electronic Engineering, Changchun Unive
Bian, HeYu	School of Electrical and Electronic Engineering, Changchun Univ
Wang, Wei	School of Electrical and Electronic Engineering, Changchun Univ
Keywords: SLAM and Navigation, Mobile Robotics Abstract: Cartographer is a Simultaneous Localization and Mapping (SLAM) algorithm developed by Google that supports a variety of sensors such as odometers, IMUs, LIDAR, and GPS, and it is one of the most widely used laser SLAM algorithms at present. To address the problem that wheeled odometers accumulate errors when estimating the position of mobile robots, which leads to inaccurate map construction, the data from the IMU and odometer are fused by an Extended Kalman Filter (EKF) to improve the quality of map construction. Under the framework of the Robot Operating System (ROS), comparison experiments before and after fusion are conducted in a real scene to verify the effectiveness of the algorithm.

10:45-17:30, Paper FrPP1.38
Robot Navigation Based on 3D Scene Graphs with the LLM Tooling

Cheng, Yao	Shandong New Generation Information Industrial Technology Resear
Jiang, Fengyang	Shandong New Generation Information Industrial Technology Resear
Han, Zhe	Shandong New Generation Information Industrial Technology Resear
Wang, Huaizhen	Inspur Group
Zhou, Fengyu	Shandong University
Li, Zhaochuan	Inspur Software Technology Co., Ltd
Wang, Bin	Inspur Software Technology Co., Ltd
Huang, Yang	Shandong New Generation Information Industrial Technology Resear
Keywords: Mobile Robotics, SLAM and Navigation, Artificial Intelligence Abstract: This contribution addresses the integration of 3D Scene Graphs (3DSGs) and Large Language Models (LLMs) to advance robotic navigation. We explore how to best prepare 3DSGs for navigation and examine how key features of 3DSGs affect the way LLMs interpret the information encoded in the 3DSGs and create instructions from it. To improve the reliability and accuracy of the generated instructions for navigation, we introduce prompting methods and scoring mechanisms that reduce uncertainty in LLM responses and ensure the validity of the task planning results. In addition, we focus on difficult navigation task queries involving negation and numbers, analyzing how this approach combining 3DSGs with LLMs behaves and devising strategies that lead to an enhanced performance. These findings and results accomplish our goal of contributing to the development of more robust and intelligent robotic navigation systems that effectively leverage the power of 3DSGs and LLMs.

10:45-17:30, Paper FrPP1.39
Intelligent Spatial Perception by Building Hierarchical 3D Scene Graphs for Indoor Scenarios with the Help of LLMs

Cheng, Yao	Shandong New Generation Information Industrial Technology Resear
Han, Zhe	Shandong New Generation Information Industrial Technology Resear
Jiang, Fengyang	Shandong New Generation Information Industrial Technology Resear
Wang, Huaizhen	Inspur Group
Zhou, Fengyu	Shandong University
Yin, Qingshan	Inspur Intelligent Terminal Co., Ltd
Wei, Lei	Inspur Intelligent Terminal Co., Ltd
Keywords: Mobile Robotics, SLAM and Navigation, Artificial Intelligence Abstract: This paper addresses the high demand in advanced intelligent robot navigation for a more holistic understanding of spatial environments, by introducing a novel system that harnesses the capabilities of Large Language Models (LLMs) to construct hierarchical 3D Scene Graphs (3DSGs) for indoor scenarios. The proposed framework constructs 3DSGs consisting of a fundamental layer with rich metric-semantic information, an object layer featuring precise point-cloud representation of object nodes as well as visual descriptors, and higher layers of room, floor, and building nodes. Thanks to the innovative application of LLMs, not only object nodes but also nodes of higher layers, e.g., room nodes, are annotated in an intelligent and accurate manner. A polling mechanism for room classification using LLMs is proposed to enhance the accuracy and reliability of the room node annotation. Thorough numerical experiments demonstrate the system’s ability to integrate semantic descriptions with geometric data, creating an accurate and comprehensive representation of the environment instrumental for context-aware navigation and task planning.
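The abstract does not detail its polling mechanism. One plausible reading, sketched here purely as an assumption, is a majority vote over repeated LLM queries for the same room. `ask_llm` is a hypothetical callable standing in for any LLM client, and the prompt text is invented for illustration.

```python
from collections import Counter

def poll_room_label(ask_llm, prompt, n_votes=5):
    """Query the model n_votes times and return (majority label, agreement ratio)."""
    votes = Counter(ask_llm(prompt) for _ in range(n_votes))
    label, count = votes.most_common(1)[0]
    return label, count / n_votes

# Stand-in for a real LLM call: replays canned answers so the sketch runs offline.
canned = iter(["kitchen", "office", "kitchen", "kitchen", "office"])
label, agreement = poll_room_label(lambda _prompt: next(canned),
                                   "Objects in room: stove, sink, fridge. Room type?")
print(label, agreement)
```

The agreement ratio can serve as a simple confidence score; a low value might trigger a re-query or leave the room node unlabeled.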

10:45-17:30, Paper FrPP1.40
Combining A* and Dynamic Window Algorithm for Dynamic Path Planning of Visually Impaired People Using Auxiliary Travel Equipment

Guo, Hongxi	Changchun University of Architecture and Civil Engineering
Huang, Fuzhong	Changchun University of Science and Technology
Jie, Jingfeng	Changchun University of Science and Technology
Xu, Cecheng	Changchun University of Science and Technology
Yang, Yang	Changchun University of Architecture and Civil Engineering
Keywords: Path and Motion Planning, Rehabilitation and Assistive Robotics, SLAM and Navigation Abstract: To address the limitations of conventional Auxiliary Travel Equipment for Visually Impaired People (ATE-VIP) in dynamic path planning, a method combining the A* algorithm and the Dynamic Window Approach (DWA) algorithm is proposed. Initially, considering the walking ability of visually impaired people, a method to reduce the path turns and smooth routes is proposed to shorten the length and total turning angle of the global path generated by the A* algorithm. Subsequently, an Ideal Mobility Model for Visually Impaired People (IMM-VIP) is established. Finally, multiple intermediate target points are extracted from the global path to guide the DWA algorithm for local path planning to obtain the reference trajectory. Simulation results demonstrate a 58.8% reduction in total turning angle compared to traditional methods, validating the effectiveness and feasibility of our approach. The incorporation of IMM-VIP in generating reference trajectories provides valuable insights for advancing ATE-VIP systems.
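The total turning angle used as a metric in this abstract is not defined there. A common definition, assumed in this sketch, is the sum of absolute heading changes between consecutive segments of a waypoint path.

```python
import math

def total_turning_angle(path):
    """Sum of absolute heading changes (degrees) along a list of (x, y) waypoints."""
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(path, path[1:], path[2:]):
        h_in = math.atan2(y1 - y0, x1 - x0)   # heading of the incoming segment
        h_out = math.atan2(y2 - y1, x2 - x1)  # heading of the outgoing segment
        # Wrap the difference into [-pi, pi) so each turn takes the short way round.
        delta = (h_out - h_in + math.pi) % (2.0 * math.pi) - math.pi
        total += abs(delta)
    return math.degrees(total)

# Hypothetical grid path with two right-angle turns, as A* on a grid might produce.
staircase = [(0, 0), (1, 0), (1, 1), (2, 1)]
print(total_turning_angle(staircase))
```

Smoothing a grid path by removing redundant waypoints lowers this value, which is the kind of reduction the abstract reports.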

10:45-17:30, Paper FrPP1.41
Path Planning Using Multiple Random Trees Algorithm Based on Adaptive Step Adjustment

Li, Zhengzheng	Shandong University
Zhou, Lelai	Shandong University
Dai, Xiaomeng	China Railway Construction Corporation Bridge Engineering Bureau
Shen, Zhen	National Innovation Center of High-Speed Rail Technology (Qingd
Zhang, Wenhe	China Railway Construction Corporation Bridge Engineering Bureau
Li, Yibin	Shandong University
Keywords: Path and Motion Planning, Industrial Robotics and Factory Automation, Intelligent Control and Systems Abstract: This paper presents a path planning method using a multiple random trees algorithm based on adaptive step adjustment (ASM-RRT). The method introduces a strategy for generating random trees, enabling the placement of root nodes at optimal locations within the environment. An adaptive step adjustment approach is employed to accelerate the exploration process. Additionally, a local search strategy for adjacent points is proposed, defining the search range around the center of each newly generated node to identify more optimal paths. The effectiveness of the method is validated through simulation experiments in two-dimensional and manipulator path planning environments. In comparison to the RRT and RRT-Connect algorithms, the proposed method exhibits superior performance. A manipulator experiment is carried out in physical scenarios to assess the feasibility of the presented method.

Technical Program for Friday August 23, 2024