| |
Last updated on July 16, 2025. This conference program is tentative and subject to change.
Technical Program for Tuesday August 19, 2025
|
TuAT1 |
Room T1 |
AI/ML for Healthcare 1 |
Regular Session |
Chair: Yang, Hui | The Pennsylvania State University |
|
08:00-08:18, Paper TuAT1.1 | |
Adaptive Model Predictive Control for a Simulated Underwater Hydrofoil-Based Gait Assist System |
|
Bose, Rishiraj | University of Massachusetts Amherst |
Sup, Frank | University of Massachusetts - Amherst |
Keywords: Physically Assistive Devices, Optimization and Optimal Control
Abstract: This paper presents an Adaptive Model Predictive Controller for a hydrofoil-based underwater gait assistance system. The plant for the controller is a double pendulum in water that is kinematically similar to a human leg. External joint torques are applied to simulate the effect of muscle activations, and these torques are reduced to simulate muscle weakness. The paper describes the design of the controller and the process of generating the internal model, including the reasoning behind the controller architecture. The system is tested over a range of torque reductions and demonstrates the ability to compensate for them. Therefore, this approach is suitable to be adapted for actual hardware deployment with human subjects.
|
|
08:18-08:36, Paper TuAT1.2 | |
Task Allocation for Nursing Robots Using Explainable Machine Learning with Factorization Machines |
|
Acun, Cagla | University of Louisville |
Ashary, Ali | University of Louisville |
Popa, Dan | University of Louisville |
Nasraoui, Olfa | University of Louisville |
Keywords: AI and Machine Learning in Healthcare, Health Care Management, Machine learning
Abstract: We present a machine learning optimization approach for task allocation between teams of robots and nurses working collaboratively in a hospital unit. We address task allocation challenges through a recommender system that optimizes the collaboration between nursing staff and robotic nursing assistants through a two-phase methodology. First, we implement a simulated hospital environment featuring synthetic robots and nurses to collect 3D simulation data. Second, we leverage an explainable pre-hoc machine learning framework using Factorization Machines to learn and predict optimal task allocation patterns. Tasks include navigation, mobility, and item delivery for patients in a hospital unit. Our findings demonstrate that the explainable pre-hoc framework predicts efficient allocations with 94% accuracy, while offering the advantage of explainability, with task difficulty emerging as the most influential feature. Experimental results show that robots outperform nurses in completing easy tasks, while nurses perform more consistently across varying difficulty levels. This research contributes to healthcare automation by introducing a transparent task allocation system with the potential to mitigate nurse shortages, enhance workflow efficiency, and improve healthcare delivery. Our approach emphasizes the importance of explainable AI in healthcare settings and offers a novel solution to this critical optimization challenge.
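For readers unfamiliar with Factorization Machines, the standard second-order model scores a feature vector with a bias, a linear term, and pairwise interactions weighted by inner products of latent factors. The sketch below shows only that textbook scoring rule. It is not the authors' pre-hoc explainable framework, and the function name `fm_predict` and the toy parameters are assumptions made for illustration.

```python
def fm_predict(x, w0, w, V):
    """Second-order Factorization Machine score for one feature vector x.

    w0: bias; w: list of n linear weights; V: n x k list of latent factors.
    Pairwise term uses the identity
    sum_{i<j} <V[i], V[j]> x[i] x[j] = 0.5 * sum_f ((sum_i V[i][f] x[i])^2 - sum_i (V[i][f] x[i])^2).
    """
    n = len(x)
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical toy parameters: three features, two latent dimensions.
score = fm_predict([1, 1, 0], 0.5, [1, 2, 3], [[1, 0], [2, 1], [5, 5]])
```

In this toy call only the first two features are active, so the score is the bias plus their two linear weights plus the single interaction between them.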
|
|
08:36-08:54, Paper TuAT1.3 | |
Risk-Averse Autonomous Material Handling in Healthcare Systems |
|
Alomran, Omran | Pennsylvania State University |
Yang, Hui | The Pennsylvania State University |
Keywords: Automation in Life Science: Biotechnology, Pharmaceutical and Health Care, Modelling, Simulation and Optimization in Healthcare, Scheduling in Healthcare
Abstract: The safe internal transportation of hazardous materials within healthcare facilities is critical to mitigating risks to patients, staff, and visitors. This paper presents a risk-averse path planning framework for autonomously handling hazardous materials in healthcare systems. We model the indoor environment using grid-based obstacle and risk maps, where risk arises from pedestrian density flow and critical zones. Our novel risk-averse path planning approach integrates risk directly into each transition cost, thereby enabling more robust and secure path selection. We further improve efficiency by implementing a bidirectional search and refining the resulting path through a post-optimization procedure that minimizes unnecessary heading changes. We evaluated our approach on multiple simulated grid maps and compared it with established methods, measuring path length, average risk, and computational time. Our results show that the proposed framework consistently produces safe and efficient paths while reducing computational overhead.
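A minimal sketch of how risk can be folded directly into each transition cost on a grid map. This is not the authors' implementation: it uses plain Dijkstra search rather than their bidirectional search and post-optimization step, and the function name `risk_averse_path`, the weight `risk_weight`, and the 4-connected unit-step cost are assumptions made for illustration.

```python
import heapq


def risk_averse_path(obstacles, risk, start, goal, risk_weight=5.0):
    """Dijkstra on a 4-connected grid; step cost = 1 + risk_weight * risk of the cell entered.

    obstacles: 2D list of bools (True = blocked); risk: 2D list of floats.
    Returns the list of (row, col) cells from start to goal, or None if unreachable.
    """
    rows, cols = len(obstacles), len(obstacles[0])
    best = {start: 0.0}
    parent = {start: None}
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not obstacles[nr][nc]:
                new_cost = cost + 1.0 + risk_weight * risk[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    return None
```

With `risk_weight=0` the search reduces to shortest-path planning. Raising the weight makes the planner accept longer routes that bypass high-risk cells.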
|
|
08:54-09:12, Paper TuAT1.4 | |
IoT-Enabled Smartwatches for Secure Examinations and Healthcare Assessments |
|
Wang, Zhenkun | University of Southern California |
Bogdan, Ioana Corina | Transilvania University of Brasov |
Liu, Chonghao | University of Southern California |
Nazarian, Shahin | University of Southern California |
Bogdan, Paul | University of Southern California |
Keywords: AI and Machine Learning in Healthcare, Machine learning, Physically Assistive Devices
Abstract: Smart devices and wearable technologies have become indispensable equipment in our daily activities related to education, healthcare, and recreation, including augmented and virtual reality. In the educational field, to monitor and assess students' physiological and stress levels during exams, an experimental setup was developed based on an STM32 board that simulates a smartwatch worn by students. The system measures diverse biosignals, including heart rate (HR), electrodermal activity (EDA), blood volume pulse (BVP), and temperature (TEMP). Data are collected and transmitted via a Bluetooth module to a mobile app, which records and displays live data. Experimental results obtained using a publicly available dataset have shown that the system ensures low estimation errors, with a mean absolute error (MAE) of 0.512 and a mean squared error (MSE) of 0.287, enabling precise monitoring of students' physiological states. Moreover, a deep learning model is then used to evaluate each student's stress level from their physiological signals. Thus, the IoT device can be adopted in real-world educational settings, providing timely insights into student well-being and performance.
|
|
TuAT2 |
Room T2 |
RAL Paper Session 2 |
Special Session |
Chair: Hsieh, Yu-Ming | National Cheng Kung University |
|
08:00-08:18, Paper TuAT2.1 | |
Developing the Keep-Important-Samples Scheme for Training the Advanced CNN-Based Automatic Virtual Metrology Models |
|
Hsieh, Yu-Ming | National Cheng Kung University |
Liu, Chun-Ting | National Cheng Kung University |
Huang, Sheng-Yu | National Cheng Kung University |
Li, Chi | National Cheng Kung University |
Wilch, Jan | Technical University of Munich |
Vogel-Heuser, Birgit | Technical University Munich |
Cheng, Fan-Tien | National Cheng Kung University |
Chen, Chao-Chun | National Cheng Kung University |
Keywords: Factory Automation, Manufacturing, Maintenance and Supply Chains
Abstract: Virtual Metrology (VM) technology can convert offline sampling inspection into online and real-time total inspection. As the processes of high-tech industries (semiconductor or TFT-LCD) are getting more sophisticated, higher VM prediction accuracy is demanded. With regard to this requirement, the advanced Convolutional-Neural-Networks (CNN) based VM system (denoted as Advanced AVMCNN) was proposed and verified to significantly enhance the overall prediction accuracy. Nevertheless, two issues need to be addressed to enhance the accuracy of the Advanced AVMCNN System: 1) rare and imbalanced collected metrology values lead to poor prediction accuracy of the extreme values, and 2) the model can only be updated when sufficient metrology values are collected. To tackle these problems, the Keep-Important-Samples (KIS) Scheme for the Advanced AVMCNN System is proposed in this paper with consideration of data balance. The experiments reveal that the proposed KIS Scheme can effectively enhance the prediction performance of the Advanced AVMCNN System on the extreme values.
|
|
08:18-08:36, Paper TuAT2.2 | |
Development of an Alarm Pattern Detection Scheme for Managing Alarm Floods in Bumping Process |
|
Hsieh, Yu-Ming | National Cheng Kung University |
Chen, Po-Jui | National Cheng Kung University |
Wilch, Jan | Technical University of Munich |
Vogel-Heuser, Birgit | Technical University Munich |
Chen, Chun-Yen | National Cheng Kung University |
Ng, I-Son | National Cheng Kung University |
Keywords: Factory Automation, Intelligent and Flexible Manufacturing
Abstract: The pursuit of high yield in semiconductor packaging manufacturing is hindered by the increasing complexity of monitoring and alarm systems, leading to alarm floods which not only interfere with operations but also mask critical issues affecting yield. Therefore, this paper proposes an Alarm Pattern Detection (APD) Scheme to address the problem of alarm floods in semiconductor packaging manufacturing. The APD scheme intelligently identifies critical production machines using the random forest algorithm and applies PrefixSpan to find alarm patterns from these machines. Minwise Hashing and Locality Sensitive Hashing techniques are then adopted to retain significant alarm patterns. The effectiveness of the APD Scheme is first demonstrated through a simulation study, where it achieves superior precision and scalability compared to traditional methods such as FP-Growth and PrefixSpan. Further, it is validated with actual manufacturing process data that the APD Scheme can 1) identify critical alarm patterns affecting yield, and 2) reduce the burden of alarm floods on operators so as to achieve the goal of improving manufacturing yield through the monitoring and management of these patterns.
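As background on the Minwise Hashing step, the sketch below estimates the Jaccard similarity of two alarm patterns, treated as sets of alarm codes, from compact signatures. It illustrates the general technique only, not the APD Scheme itself. The alarm codes, the salted BLAKE2b hash family, and the signature length are assumptions made for illustration.

```python
import hashlib


def minhash_signature(items, num_hashes=256):
    """Return one minimum hash value per salted hash function (items must be non-empty)."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree, an estimate of |A & B| / |A | B|."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical alarm patterns sharing two of six distinct codes (true Jaccard = 1/3).
pattern_a = {"A101", "A102", "A103", "A104"}
pattern_b = {"A103", "A104", "A105", "A106"}
similarity = estimated_jaccard(minhash_signature(pattern_a), minhash_signature(pattern_b))
```

Locality Sensitive Hashing would then bucket signatures by bands so that only likely-similar patterns are compared. That step is omitted here.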
|
|
08:36-08:54, Paper TuAT2.3 | |
Anytime Multi-Task Multi-Agent Pickup and Delivery under Energy Constraint |
|
Kudo, Fumiya | Osaka Metropolitan University |
Cai, Kai | Osaka Metropolitan University |
Keywords: Discrete Event Dynamic Automation Systems, Path Planning for Multiple Mobile Robots or Agents, Collision Avoidance
Abstract: Various algorithms for Multi-Agent Path Finding (MAPF) and its extension, Multi-Agent Pickup and Delivery (MAPD), have been studied in academia. In industry, on the other hand, automatic safe control of teams of robots and AGVs on factory floors and in logistics warehouses for pickup and delivery operations has also been studied intensively. In this paper, we extend our previous work on the online multi-task MAPD problem to a setting where (i) a task can be allocated to any vacant agent independent of the location of that agent (called "anytime task allocation" in this paper), and (ii) each agent is subject to an energy constraint. The proposed anytime task allocation MAPD algorithm achieves 5-19% shorter makespan paths compared to the baseline multi-task MAPD over a wide range of agent numbers. We also examine the behavior of the proposed multi-task MAPD algorithm under various energy constraints, by changing the power limits and energy charge speeds of individual agents. We find that energy charge speed has a large impact on the makespan when the power limit is small. We also find that a small energy charge speed typically requires a large number of agents in order to achieve the same makespan. These results demonstrate that our proposed multi-task MAPD algorithm can be useful in choosing proper agent numbers in order to achieve prescribed makespans.
|
|
08:54-09:12, Paper TuAT2.4 | |
A Transformer-Based Thermal Surrogate Model for Cooling Control in Data Centers |
|
Zhou, Hanchen | Tsinghua University |
Mu, Ni | Tsinghua University |
Jia, Qing-Shan | Tsinghua University |
Keywords: Energy and Environment-Aware Automation, Reinforcement Learning
Abstract: With the rapid development of data centers in the big data era, the operation of their cooling systems has huge energy-saving potential, so their optimization and control are of great significance for research. The main challenge in this optimization problem is the prediction of the complicated temperature field. The most widely recognized approach, Computational Fluid Dynamics (CFD) simulation, consumes too much time to be applied in real-time optimization. To address this problem, a Transformer-based thermal surrogate model is proposed. Specifically, self-attention is used to capture the temporal and spatial characteristics of the temperature field, replacing CFD. Then, the optimization problem is formulated and a surrogate-model-based Soft Actor-Critic (SAC) solution framework is proposed. Finally, the control performance is verified on the CFD-based platform 6SigmaRoom, with a widely used Artificial Neural Network (ANN) selected as the baseline. Numerical experiments demonstrate that the proposed surrogate model makes predictions faster and more accurately, while the control based on it achieves lower energy consumption, improving energy efficiency and reducing safety risk at the same time.
|
|
TuAT3 |
Room T3 |
Additive Manufacturing 1 |
Regular Session |
Chair: Ruiz, Cesar | University of Oklahoma |
|
08:00-08:18, Paper TuAT3.1 | |
A Linear Arithmetic Model Reflecting Vertical Object Details in Object Packing and Scheduling for Sequential 3D Printing |
|
Surynek, Pavel | Czech Technical University in Prague |
Bubník, Vojtěch | Prusa Research |
Lukáš, Matěna | Prusa Research |
Kubiš, Petr | Prusa Research |
Keywords: Additive Manufacturing, Planning, Scheduling and Coordination, Collision Avoidance
Abstract: We address the problem of object arrangement and scheduling for sequential 3D printing. Unlike standard 3D printing, where all objects are printed slice by slice at once, in sequential 3D printing, objects are completed one after another. In the sequential case, it is necessary to ensure that the moving parts of the printer do not collide with previously printed objects. We look at the sequential printing problem from the perspective of combinatorial optimization. We propose to express the problem as a linear arithmetic formula, which is then solved by a solver for satisfiability modulo theories (SMT). The formula is designed to take vertical details of objects into account. To solve the model we propose a technique inspired by counterexample guided abstraction refinement (CEGAR), which turned out to be a key innovation for efficiency.
|
|
08:18-08:36, Paper TuAT3.2 | |
Functional Models with Spatially Varying Coefficients for Fast Prediction of Temperature History for Wire-Based Metal Additive Manufacturing |
|
Dashti, Ali | University of Oklahoma |
Ruiz, Cesar | University of Oklahoma |
Keywords: Additive Manufacturing, Manufacturing, Maintenance and Supply Chains
Abstract: Wire-based metal additive manufacturing is a promising technique for fabricating large-scale structural components across various industrial sectors. Using robot-assisted deposition, these technologies offer relatively low equipment cost for fast fabrication on large areas. However, high operating temperatures and heat accumulation create significant thermal stresses, causing distortion and roughness in deposited layers. Precise thermal history prediction is essential for achieving proper deposition process planning and control. Typically, finite element method (FEM) simulations are used to solve large PDEs for thermal analysis. However, these methods are computationally prohibitive for large parts, making them unsuited to the effective automation of the deposition process. In this paper, we develop a generalized functional regression model as a surrogate for thermal history prediction during single layer, single track deposition. The interaction between prediction features is modeled using a tensor-product basis. Thermal history heterogeneity along weld cross-sections is modeled using Gaussian Process-based varying coefficients. We study the effect of track geometry on the FEM results and the accuracy of the proposed surrogate methodology for predictions of the thermal history profile for each geometry, and discuss its applicability to help automate process planning and control in large-scale metal additive manufacturing.
|
|
08:36-08:54, Paper TuAT3.3 | |
Identification of Latent Invariant Surface Quality Patterns Via Spatial Stochastic Process Deformation |
|
Gu, Minghao | University of Southern California |
Huang, Qiang | University of Southern California |
Keywords: Additive Manufacturing, Probability and Statistical Methods, Machine learning
Abstract: Surface quality characterization is important for ensuring product functionality and guiding production. Although a wealth of literature has been developed for mass production, unique challenges arise in additive manufacturing (AM) because of the one-off fabrication of a wide range of complex geometries. Surface quality patterns can change with built geometries and covariates such as shape size and built location. Existing statistical characterization methods are therefore not adaptive to AM due to frequent design changes and insufficient samples for each design. Since a product can be represented by a finite type of surface patches, this initial study investigates the surface quality of products with spherical patches. We propose to identify latent and invariant patterns to characterize surface quality by separating the effect of covariates. The surface quality of different domes is first described by nonstationary spatial Gaussian processes. These heterogeneous spatial stochastic processes are then mapped to one underlying base process through a spatial stochastic process deformation method. With covariate effects fully captured by the mapping function, the underlying base process characterizes the latent invariant surface quality patterns of spherical patches and the working status of the machine. Actual printed samples from an AM process are utilized for methodology demonstration. The study brings the prospect of evaluating and comparing product surface quality and process condition in one-off AM.
|
|
08:54-09:12, Paper TuAT3.4 | |
Deep Learning and Big Data Framework for Real-Time Porosity Prediction in Laser Powder Bed Fusion |
|
Bendaouia, Ahmed | Institute for Advanced Manufacturing, University of Texas Rio Gr |
El Faqar, Abdessabour Chakir | Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakesh, |
Ramoni, Monsuru | University of Texas Rio Grande Valley |
Abdelwahed, El Hassan | Computer Systems Engineering Laboratory (LISI), Faculty of Scien |
Li, Jianzhi | University of Texas Rio Grande Valley |
Keywords: Process Control, Big-Data and Data Mining, Additive Manufacturing
Abstract: Laser-based powder-bed fusion (LPBF) is an additive manufacturing technique renowned for its ability to fabricate complex metallic components with intricate geometries. However, its broad industrial adoption is hindered by inherent defects, particularly porosity, which undermine the structural integrity of fabricated parts. Reducing porosity during fabrication through machine learning could significantly enhance the industrial viability of LPBF components, yet the scarcity of in situ data poses a major challenge for training models to detect pores in real time. This study proposes a novel Big Data framework that leverages ex situ porosity data to train machine learning and deep learning models for in situ pore detection and classification during LPBF. We investigate state-of-the-art models such as Support Vector Machines (SVR), eXtreme Gradient Boosting (XGBoost), LSTM and Transformer-based deep learning models. The framework includes a pipeline of data streaming, processing, and storage using Big Data tools. The results showed that transformers performed noticeably better than conventional models using a large dataset of LPBF parameters, including laser power, scan speed, construct direction, and laser density. Transformers outperformed XGBoost in the categorization of porosity types, with an accuracy of 82.0% and a precision of 79.0%. LSTM outperformed in EEF regression, showing a nearly flawless prediction ability with an RMSE of 0.24. These results highlight how Transformer models and LSTM can be used to optimize LPBF procedures, opening the door to workflows for additive manufacturing that are more effective and of greater quality.
|
|
TuAT4 |
Room T4 |
3D Point Cloud Modeling 2 |
Special Session |
Chair: Biehler, Michael | University of Wisconsin - Madison |
Co-Chair: Wang, Yinan | RPI |
Organizer: Biehler, Michael | University of Wisconsin - Madison |
Organizer: Wang, Yinan | RPI |
|
08:00-08:18, Paper TuAT4.1 | |
A Statistical Monitoring Approach for Convolution-Generated Space-Time Processes (I) |
|
Zhang, Yutong | Georgia Institute of Technology |
Xiao, Liu | Georgia Institute of Technology |
Keywords: Machine learning, Diagnosis and Prognostics, Process Control
Abstract: Monitoring of convolution-generated space-time processes is a problem of great importance for a range of applications, for example, the monitoring of a heat transfer process or an advection-diffusion pollutant transport process. In this research, we propose a statistical modeling and monitoring approach for space-time processes generated through convolution operations, in both space and time, of Gaussian input. In particular, a dynamical model representation is constructed for the convolution-generated spatio-temporal process, based on an infinite-dimensional Stochastic Differential Equation (SDE) whose solution has the same space-time covariance as the target process. Then, Galerkin’s method is used to obtain a finite-dimensional approximation of the dynamical model, as well as the space-time covariance of the approximated processes. The proposed process monitoring approach is established on top of the approximated dynamical model. Finally, we demonstrate the performance of the proposed approach by detecting unknown new source terms of a two-dimensional advection-diffusion process.
|
|
08:18-08:36, Paper TuAT4.2 | |
Improving Biosensor Accuracy and Speed Using Dynamic Signal Change and Theory-Guided Deep Learning (I) |
|
Boggavarapu, Purna Srivatsa | Virginia Tech |
Zhang, Junru | Virginia Tech |
Ahmadzai, Fazel Haq | Virginia Tech |
Liu, Yang | Virginia Tech |
Song, Xuerui | Virginia Tech |
Karpatne, Anuj | Virginia Tech |
Kong, Zhenyu | Virginia Tech |
Johnson, Blake | Virginia Tech |
|
|
08:36-08:54, Paper TuAT4.3 | |
Recent Advances in Dynamic 3D Point Cloud Modeling for 4D Printing (I) |
|
Biehler, Michael | University of Wisconsin - Madison |
Keywords: Deep Learning in Robotics and Automation, Model Learning for Control, Additive Manufacturing
Abstract: The rise of 4D printing - where printed structures transform, self-assemble, or adapt over time in response to environmental triggers - marks a radical leap toward truly programmable matter. Unlike traditional static objects, these dynamic parts demand sophisticated methods for real-time control, prediction, and optimization. This talk explores the unique challenges and opportunities in automating 4D printed systems, emphasizing the role of 3D point cloud time series modeling and multi-modal data fusion. By leveraging temporal spatial data and integrating diverse sensor modalities, we aim to achieve a more precise and adaptive understanding of evolving structures. The session will highlight recent advancements, practical implementations, and open research questions at the intersection of materials science, machine learning, and robotics.
|
|
08:54-09:12, Paper TuAT4.4 | |
Quantile-Based Thresholding for Automated Segmentation of Geometric Features in Wire-Based Metal Additive Manufactured Parts (I) |
|
Zihan, Tasbirul Alam | University of Oklahoma |
Ruiz, Cesar | University of Oklahoma |
Keywords: Additive Manufacturing, Probability and Statistical Methods
Abstract: Metal additive manufacturing (MAM) technologies have the potential for better resource efficiency and faster fabrication of large structural components compared to traditional manufacturing. Wire-based MAM (WMAM) processes such as Wire Arc Additive Manufacturing (WAAM) offer relatively cheap equipment and large fabrication area. However, these processes often suffer from surface irregularities caused by energy fluctuations during deposition and global geometric distortions due to severe thermal gradients. A key step for effective process planning and control is accurately modeling the effects of energy fluctuations. However, manual segmentation of large point clouds of geometric data is time-consuming and often inaccurate. This study proposes an unsupervised framework for automatic segmentation of geometric features caused by excessive or deficient material accumulation. Data points are filtered based on outliers of the quantile function of the shape deviation. Such outliers indicate the presence of material accumulation in local regions of the printed part. The outlier quantiles are used to generate levels for contour creation which are utilized for clustering points into geometric features. A case study involving four cylindrical WAAM components demonstrates the method's robustness and interpretability. The results illustrate the effectiveness of the proposed framework in the presence of complex geometries.
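To give a feel for quantile-based filtering of shape deviations, the sketch below flags points whose deviation falls outside fixed lower and upper percentiles. It is a simplified stand-in rather than the authors' framework, which detects outliers of the quantile function itself and then builds contour levels for clustering. The function name `flag_deviation_outliers` and the 5th/95th percentile cut-offs are assumptions made for illustration.

```python
from statistics import quantiles


def flag_deviation_outliers(deviations, lower_pct=5, upper_pct=95):
    """Split point indices into excess (above upper percentile) and deficit (below lower).

    deviations: signed shape deviation per point (positive = extra material).
    """
    cuts = quantiles(deviations, n=100, method="inclusive")  # 99 percentile cut points
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    excess = [i for i, d in enumerate(deviations) if d > high]
    deficit = [i for i, d in enumerate(deviations) if d < low]
    return excess, deficit


# Hypothetical deviations: a flat surface with one bump and one dip.
deviations = [0.0] * 98 + [3.0, -3.0]
excess, deficit = flag_deviation_outliers(deviations)
```

The flagged indices would then be grouped spatially, for example by contour level, to form candidate geometric features.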
|
|
TuAT5 |
Room T5 |
Human-Robot and HCA 4 |
Regular Session |
Chair: Ichnowski, Jeffrey | Carnegie Mellon University |
|
08:00-08:18, Paper TuAT5.1 | |
Hearing the Slide: Acoustic-Guided Constraint Learning for Fast Non-Prehensile Transport |
|
Mao, Yuemin | Carnegie Mellon University |
Duisterhof, Bardienus P | Carnegie Mellon University |
Lee, Moonyoung | Carnegie Mellon University |
Ichnowski, Jeffrey | Carnegie Mellon University |
Keywords: Motion and Path Planning, Model Learning for Control
Abstract: Object transport tasks are fundamental in robotic automation, emphasizing the importance of efficient and secure methods for moving objects. Non-prehensile transport can significantly improve transport efficiency, as it enables handling multiple objects simultaneously and accommodating objects unsuitable for parallel-jaw or suction grasps. Existing approaches incorporate constraints based on the Coulomb friction model, which is imprecise during fast motions where inherent mechanical vibrations occur. Imprecise constraints can cause transported objects to slide or even fall off the tray. To address this limitation, we propose a novel method to learn a friction model using acoustic sensing that maps a tray's motion profile to a dynamically conditioned friction coefficient. This learned model enables an optimization-based motion planner to adjust the friction constraint at each control step according to the planned motion at that step. In experiments, we generate time-optimized trajectories for a UR5e robot to transport various objects with constraints using both the standard Coulomb friction model and the learned friction model. Results suggest that the learned friction model reduces object displacement by up to 86.0% compared to the baseline, highlighting the effectiveness of acoustic sensing in learning real-world friction constraints.
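The Coulomb friction constraint mentioned above can be sketched for the simplest case of an object resting on a tray that stays level: the object does not slide while its tangential acceleration stays within the friction coefficient times the normal acceleration. The level-tray assumption, the function name, and the numeric coefficients below are illustrative only. In the paper, the coefficient is produced by a learned model conditioned on the tray's motion, which is represented here simply by passing a different `mu`.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def satisfies_friction_constraint(ax, ay, az, mu):
    """No-slip check for an object on a level tray accelerating at (ax, ay, az) in m/s^2."""
    normal_accel = G + az  # normal force per unit mass
    if normal_accel <= 0:
        return False  # object would lose contact with the tray
    return math.hypot(ax, ay) <= mu * normal_accel


# Hypothetical values: a nominal coefficient versus a lower effective one under fast motion.
ok_nominal = satisfies_friction_constraint(2.0, 0.0, 0.0, mu=0.3)
ok_reduced = satisfies_friction_constraint(2.0, 0.0, 0.0, mu=0.2)
```

The same planned acceleration passes with the nominal coefficient and fails with the reduced one, which is why a motion-dependent coefficient changes which trajectories a planner may accept.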
|
|
08:18-08:36, Paper TuAT5.2 | |
A Novel Approach for Leveraging Object Detection for 3D Human Pose Estimation in Complex Human-Robot Collaboration Environments |
|
Süme, Sinan | University of Applied Sciences Offenburg |
Kaithavalappil Ajay, Amal | Work-Life Robotics Institute, University of Applied Sciences Off |
Dr. Wendt, Thomas M. | University of Applied Sciences Offenburg |
Rupitsch, Stefan Johann | University of Freiburg |
Keywords: Human-Centered Automation, AI-Based Methods, Sensor Fusion
Abstract: Although 3D Human Pose Estimation has seen major breakthroughs in recent years, 3D pose estimation in complex scenarios remains difficult. One of the reasons is the lack of diverse 3D datasets for training and generalizing the models. This issue is counteracted by acquiring a dataset of Human-Robot Collaboration scenes featuring different objects, such as a cobot. We propose a novel two-step method, where first a 3D object detection task with VoteNet is performed to identify the human in the scenario and claim it as a region of interest for the pose estimation task. Second, this region of interest is cropped and passed into the 3D Human Pose Estimation algorithm SPiKE, which locates 15 keypoints of the human. Based on this procedure, our method improves detection in complex scenarios. Furthermore, this article compares the benefits of additionally training the algorithm on the obtained Human-Robot Collaboration dataset with those of training it on the standard ITOP dataset. While the SPiKE algorithm makes no correct prediction on the Human-Robot Collaboration scenario, the two-step SPiKE VN approach achieves an mAP of 41.17%, which is significantly lower than that of the benchmark model on the ITOP dataset. Nonetheless, the SPiKE VN model exhibits similar performance to SPiKE man, with a difference of 2.43% mAP, indicating that the method functions effectively.
|
|
08:36-08:54, Paper TuAT5.3 | |
Towards Human Motion Prediction for Collaborative Robotics with Text-To-Motion Data Generation |
|
Casarin, Marco | University of Padova |
Reggiani, Monica | University of Padua |
Michieletto, Stefano | University of Padua |
Keywords: Human-Centered Automation, AI-Based Methods, Collaborative Robots in Manufacturing
Abstract: Human Motion Prediction (HMP) in Collaborative Robotics enables proactive adaptation of robot behavior, improving efficiency and safety. However, the available data from human-robot interaction scenarios is scarce and collecting new motion samples is a challenging and expensive process. In this work, we propose a novel approach for HMP that exploits synthetic human motion sequences to mitigate the need for real-world data collection. Our method consists of generating a large and diverse collection of human motions that are unified into a synthetic dataset for model pre-training. The pre-trained model is then fine-tuned on the small available dataset of real motion sequences from a target context. For a comprehensive evaluation of the effectiveness of our approach, we generate three synthetic datasets of increasing size and evaluate their impact on three HMP models with different architectures. We then compare the prediction accuracy of our approach with the same models trained exclusively on the target dataset. The results showed a strong improvement in HMP given by pre-training on synthetic sequences. The increase in prediction accuracy was consistent across all the models, reaching up to 19.49% error reduction. This outcome highlights the applicability of synthetic motion data in HMP independent of the specific model architecture. By using synthetic data for pre-training, our approach enables more effective model adaptation to real-world applications, making it a promising solution for enhancing human-robot collaboration.
|
|
08:54-09:12, Paper TuAT5.4 | |
Towards High Precision: An Adaptive Self-Supervised Learning Framework for Force-Based Verification |
|
Duan, Zebin | University of Southern Denmark |
Hagelskjær, Frederik | University of Southern Denmark |
Kramberger, Aljaz | University of Southern Denmark |
Heredia, Juan | University of Southern Denmark |
Krüger, Norbert | University of Southern Denmark |
Keywords: Industrial and Service Robotics, Force and Tactile Sensing
Abstract: The automation of robotic tasks requires high precision and adaptability, particularly in force-based operations such as insertions. Traditional learning-based approaches either rely on static datasets, which limit their ability to generalize, or require frequent manual intervention to maintain good performance. As a result, ensuring long-term reliability without human supervision remains a significant challenge. To address this, we propose an adaptive self-supervised learning framework for insertion classification that continuously improves its precision over time. The framework operates in real-time, incrementally refining its classification decisions by integrating newly acquired force data. Unlike conventional methods, it does not rely on pre-collected datasets but instead evolves dynamically with each task execution. Through real-world experiments, we demonstrate how the system progressively reduces execution time while maintaining near-perfect precision as more samples are processed. This adaptability ensures long-term reliability in force-based robotic tasks while minimizing the need for manual intervention.
|
|
TuAT6 |
Room T6 |
Detection, Estimation and Prediction 3 |
Regular Session |
Chair: Yi, Jingang | Rutgers University |
|
08:00-08:18, Paper TuAT6.1 | |
Estimating Spatially-Dependent GPS Errors Using a Swarm of Robots |
|
Somisetty, Praneeth | Texas A&M University |
Griffin, Robert | University of Houston |
Baez, Victor | University of Houston |
Arevalo-Castiblanco, Miguel Felipe | University of Houston |
Becker, Aaron | University of Houston |
O'Kane, Jason | Texas A&M University |
Keywords: Calibration and Identification, Model Learning for Control, Planning, Scheduling and Coordination
Abstract: External factors, including urban canyons and adversarial interference, can lead to Global Positioning System (GPS) inaccuracies that vary as a function of the position in the environment. This study addresses the challenge of estimating a static, spatially-varying error function using a team of robots. We introduce a State Bias Estimation (SBE) algorithm whose purpose is to estimate the GPS biases. The central idea is to use sensed estimates of the range and bearing to the other robots in the team to estimate changes in bias across the environment. A set of drones moves in a 2D environment, each sampling data from GPS, range, and bearing sensors. The biases calculated by the SBE at estimated positions are used to train a Gaussian Process Regression (GPR) model. We use a sparse Gaussian process–based Informative Path Planning (IPP) algorithm that identifies high-value regions of the environment for data collection. The swarm plans paths that maximize information gain in each iteration, further refining their understanding of the environment’s positional bias landscape. We evaluated SBE and IPP in simulation and compared the IPP methodology to an open-loop strategy.
|
|
08:18-08:36, Paper TuAT6.2 | |
Data-Efficient Learning-Based Estimation of Region of Attractions for Nonlinear Dynamic Systems |
|
Huang, Yi | Rutgers University |
Han, Feng | New York Institute of Technology |
Yi, Jingang | Rutgers University |
Keywords: Optimization and Optimal Control, Machine learning, Model Learning for Control
Abstract: Estimation of the region of attraction (RoA) of dynamic systems is a challenging task due to complex nonlinear behaviors. Analytical approaches require accurate dynamic models and are commonly conservative. In this paper, we present a data-efficient, machine learning-based RoA estimation strategy for an unknown dynamic system with multiple attractors. Our approach constructs and maps the dynamics onto a Riemannian manifold with a potential function. The initial RoA is estimated based on regions with low potential values and high gradient magnitudes. To refine the RoA boundaries, further sampling is directed towards regions adjacent to the initial estimate, ensuring an adaptive and efficient data collection process. A Morse graph-based RoA expansion strategy is finally applied to identify the boundaries of the RoA. We demonstrate the proposed RoA estimation method using both simulation and experimental results for inverted pendulums with various controllers. A comparison with other RoA estimation methods is also presented to demonstrate the superior data efficiency and accuracy of the proposed approach.
|
|
08:36-08:54, Paper TuAT6.3 | |
Modal Identification of Mirror Vibrations at the VLT Using Accelerometer Data |
|
Jaufmann, Pascal | University of Stuttgart |
Buck, Aaron | University of Stuttgart |
Pott, Jörg-Uwe | Max Planck Institute for Astronomy |
Sawodny, Oliver | University of Stuttgart |
Keywords: Calibration and Identification, Model Learning for Control, Sensor-based Control
Abstract: Recent advances in ground-based astronomy have made it possible to create optical telescopes with primary mirrors up to 40 m in size. With growing mirror diameter, the suppression of non-atmospheric disturbances becomes increasingly important. Precise knowledge of the movement of telescope mirrors is essential for understanding and compensating for vibration-based perturbations. A model from VLT accelerometer data for each individual mirror is developed, while the influence of wind buffeting is accounted for by a von Karman wind model. To describe the relevant rigid body motion, we consider the piston, tip and tilt modes of the mirrors. The identification is validated by comparing the power spectral density of the measured and identified modes. Additionally, we assess the robustness of the approach by calculating the identification error over different sections of the data. The study indicates that the employed methods are adequate for the identification of modal telescope vibrations. It is anticipated that said findings will serve as a significant foundation for the development of advanced model-based AO controllers for large telescopes, such as linear quadratic Gaussian control.
|
|
08:54-09:12, Paper TuAT6.4 | |
Early Classification of Intentions for Maritime Domains Using Deep Learning Models |
|
Hakim, Md. Azizul | University of Nevada, Reno |
Sayed, Md Abu | University of Nevada, Reno |
Meepaganithage, Ayesh | University of Nevada, Reno |
Becker, Tyler J | University of Nevada, Reno |
Nicolescu, Monica | University of Nevada, Reno |
Nicolescu, Mircea | University of Nevada, Reno |
Keywords: Deep Learning in Robotics and Automation, Agent-Based Systems, Machine learning
Abstract: Effective detection of intentions in a maritime domain has become increasingly crucial for enhancing maritime security systems. Accurately predicting intentions from a set of predictive attributes in the naval domain can guide strategic decision-making, which reduces the likelihood of related incidents. In this paper, we focus on developing a deep learning-based framework for the immediate classification of both hostile and non-hostile behaviors in the maritime domain to minimize potential threats. For this process, first, we collected maritime datasets for different ship navigation behaviors. Second, we developed a set of features from the raw vessel data. Third, we applied different feature selection methods to identify the optimal subset of features, and finally we developed two deep learning models and evaluated their performance on these different feature subsets. The experimental results show that, by using the optimal feature subset, our proposed deep learning models can effectively identify seven distinct navigation behaviors in a maritime domain, with an overall accuracy of 97%. In addition, our proposed model can distinguish between hostile and non-hostile behaviors with an overall accuracy of 99%.
|
|
TuAT7 |
Room T7 |
Semiconductor Manufacturing |
Regular Session |
Chair: Kim, Seoung Bum | Korea University |
|
08:00-08:18, Paper TuAT7.1 | |
A Model Predictive Control-Based Scheduling for Optimizing Both Quality and Productivity in Multistage Manufacturing Process |
|
Lee, Sugyeong | Sungkyunkwan University |
Lee, Dong-Hee | Sungkyunkwan University |
Keywords: Intelligent and Flexible Manufacturing, Planning, Scheduling and Coordination, Semiconductor Manufacturing
Abstract: In this paper, we present a model predictive control (MPC) based approach for scheduling multiple lots in a multistage manufacturing process (MMP) to optimize both productivity and quality. We design a mixed integer linear programming (MILP) problem for a hybrid flow shop MMP environment and employ an MPC strategy that repeatedly solves a subproblem for the immediate timestep, rather than solving the scheduling problem over the entire range at once. Furthermore, we consider scenarios in which a machine is temporarily unavailable or urgent lots arrive with specific earliest starting times, both of which happen frequently in real industrial settings. Simulation results demonstrate that the proposed approach can optimize both makespan and yield at the same time, making it practically applicable in real-time decision-making environments such as semiconductor fabrication plants.
|
|
08:18-08:36, Paper TuAT7.2 | |
Stacking Ensemble Method for Wafer Yield Prediction in Semiconductor Manufacturing |
|
Song, Yuna | Sungkyunkwan University |
Lee, Sugyeong | Sungkyunkwan University |
Lee, Dong-Hee | Sungkyunkwan University |
Keywords: Semiconductor Manufacturing, Process Control, Machine learning
Abstract: Wafer yield prediction plays a significant role in detecting early defects and optimizing manufacturing efficiency. For this reason, methods for forecasting the yield have been actively studied for decades. Traditional statistical models, such as the Poisson model and the Seeds model, have been used to forecast yield, and as semiconductor manufacturing processes become more advanced, the trend has moved toward data-driven approaches. However, most studies focus on analyzing yield variation from defect or metrology data, overlooking process path information. The path information includes the list of machines that wafers have passed through during the process, which has a critical impact on yield decline. To address this, we propose a novel yield prediction model that accounts for process paths and their corresponding queue times. We utilized three types of machine learning algorithms: regression, tree-based, and neural network. Prediction accuracy was highest after applying a stacking ensemble, with an MSE of 0.1622 and an R^2 of 0.8434, which are considered reasonable.
|
|
08:36-08:54, Paper TuAT7.3 | |
Automated Classification and Captioning for Wafer Bin Map Using Attention-Based Image Captioning Approach |
|
Kim, Beomseok | Sungkyunkwan University |
Shin, Jinsu | Memory Division, Samsung Electronics Co, Ltd., Hwaseong, Republi |
Lee, Dong-Hee | Sungkyunkwan University |
Keywords: Semiconductor Manufacturing, Computer Vision for Manufacturing, Failure Detection and Recovery
Abstract: Analyzing defect patterns in Wafer Bin Maps (WBM) is an essential part of the semiconductor fabrication process. In recent years, as semiconductor products have diversified and the number of chips integrated on a wafer has increased, critical dimensions have been shrinking and fabrication processes have become more complicated. As a result, conventional defect pattern classification systems struggle to effectively analyze diverse defect variations within the same class or accurately identify the fabrication processes responsible for them. To address these issues, this study introduces image captioning techniques and proposes a model that utilizes a Convolutional Neural Network (CNN)-based model to classify WBM defect patterns in the MixedWM38 dataset and automatically generate captions for the corresponding defect patterns. For this purpose, we divided a total of 6,000 WBM images into 4,800 training data and 1,200 test data, extracted features using a CNN encoder, and calculated weights reflecting the importance of each feature by applying an attention mechanism. Afterward, we passed them to an LSTM (Long Short-Term Memory) model along with the existing feature map to generate captions for each WBM image. By automating the classification and caption generation of WBM defect patterns, the model proposed in this study is expected to provide more consistent and reliable defect analysis results compared to conventional manual methods. In addition to classifying WBM defect classes, the model can generate captions that describe the size, length, and shape of defect patterns in detail, even without explicit subclass labels. The proposed model is expected to contribute to the optimization of semiconductor fabrication processes and enhanced quality control.
|
|
08:54-09:12, Paper TuAT7.4 | |
Range-Aware Deep Learning Framework for Multi-Parameter Electrical Test Prediction in Semiconductor Manufacturing |
|
Kim, Jihyun | Korea University |
Hwang, Sunhyeok | Korea University |
Jeong, Jinyong | Korea University |
Jeong, Jaehoon | Samsung Electronics |
Chang, Kyu-Baik | Samsung Electronics |
Choi, Hanlim | Samsung Electronics |
Sohn, Suyeon | Samsung Electronics |
Kim, Seoung Bum | Korea University |
Keywords: Machine learning, AI-Based Methods, Semiconductor Manufacturing
Abstract: In semiconductor manufacturing, accurate prediction of electrical test (ET) parameters is essential for optimizing wafer quality and production efficiency. Traditional machine learning approaches rely on hard-to-obtain metrology data or handcrafted features, limiting scalability and practical applicability. In this work, we propose a deep learning framework that predicts multiple ET parameters using only readily available fabrication (FAB) process data. Our approach addresses two fundamental challenges: the categorical and sequential nature of FAB process data and the inherent range imbalance across ET parameters. We propose a feature extraction architecture that combines an input projection layer for handling categorical data with a one-dimensional convolutional neural network-based feature extractor designed to capture sequential patterns in FAB processes. To ensure balanced optimization across ET parameters with varying ranges, we introduce a range-aware loss function that assigns parameter-specific weights based on their value range. Experimental results on real-world semiconductor manufacturing data demonstrate that our proposed method achieves superior prediction accuracy compared to conventional methods, validating our framework's effectiveness in practical semiconductor manufacturing environments.
|
|
TuAT8 |
Room T8 |
Industrial Robot As a Service |
Special Session |
Chair: Tanz, Lukas | Technical University of Munich |
Organizer: Daub, Rüdiger | Technical University of Munich (TUM, Fraunhofer IGCV |
Organizer: Tanz, Lukas | Technical University of Munich |
|
08:00-08:18, Paper TuAT8.1 | |
Digital Workpiece Model Creation for the Automated Configuration and Commissioning of Industrial Robotic Applications (I) |
|
Geng, Paul | Technical University of Munich (TUM) |
Pressnig, Michael Jonas | Technical University of Munich (TUM), Institute for Machine Tool |
Bauer, Johannes C. | Technical University of Munich |
Trattnig, Stephan | Technical University of Munich (TUM), Institute for Machine Tool |
Tanz, Lukas | Technical University of Munich |
Daub, Rüdiger | Technical University of Munich (TUM, Fraunhofer IGCV |
Keywords: Intelligent and Flexible Manufacturing, Computer Vision for Manufacturing, Industrial and Service Robotics
Abstract: The configuration and commissioning of industrial robot applications is time-intensive and prone to errors since it involves manual labor. High development costs due to frequent iteration are acceptable in the context of mass production. However, many production environments have a high degree of variety in their products and, therefore, need to make changes repeatedly. Here, automation with industrial robots is not economically feasible. The concept of Industrial Robot as a Service offers a solution so that manufacturing companies can test robot systems risk-free by renting them temporarily, but the applications must be customized for the use cases at hand. For this reason, research has recently focused on the automation of engineering steps, entitled Automation of Automation. However, these approaches often rely on accurate digital models of workpieces. Unfortunately, digital workpiece models are regularly incomplete, outdated, or unavailable in industrial settings. While commercial 3D scanner systems for workpiece scanning can generate such models, their high costs and required manual efforts limit widespread adoption. This paper introduces a cost-effective method for creating workpiece models utilizing the robot itself and its peripherals. Furthermore, we explore the application of zero-shot models for digital workpiece segmentation to enhance the process and demonstrate the feasibility and effectiveness of our method through experimental results.
|
|
08:18-08:36, Paper TuAT8.2 | |
Multimodal Interaction for Human-Robot Collaboration in Assembly: An LLM-Enhanced Approach (I) |
|
Rekik, Khansa | ZeMA GGmbH |
da Silva Filho, José Grimaldo | SENAI CIMATEC |
Bashir, Attique | ZeMA GGmbH |
Müller, Rainer | ZeMA GGmbH |
Keywords: Industrial and Service Robotics, Intelligent and Flexible Manufacturing, Collaborative Robots in Manufacturing
Abstract: As Robot as a Service (RaaS) models gain interest in industrial automation, the need for intuitive and adaptive human-robot interaction (HRI) increases. This paper introduces a multimodal interaction framework for human-robot collaboration in assembly tasks, enhanced by Large Language Models (LLMs). The system combines explicit user inputs (such as speech commands, gestures, and graphical interfaces) with implicit intent recognition to generate and prioritize tasks in real-time. Leveraging LLMs for natural language understanding and task planning, the approach enables flexible and adaptive task execution, allowing the robot to respond to both direct requests and contextual cues. Through a pilot user study, the performance and user satisfaction of each modality are evaluated, revealing trade-offs between ease of use, response speed, and accuracy. The results demonstrate the promise of the approach in industrial applications, while also identifying opportunities for improvement for broader use.
|
|
08:36-08:54, Paper TuAT8.3 | |
Assembly Lines in Circulation – towards a Holistic Framework to Enable the Reuse of Assembly Resources (I) |
|
Bluvstein, German | Technical University of Munich (TUM) |
Kurscheid, Sebastian | Technical University of Munich, TUM School of Engineering and De |
Reinbold, Nora | Technical University of Munich |
Vorraber, Wolfgang | Graz University of Technology |
Url, Philipp | Graz University of Technology |
Orgler, Maximilian Johannes | Graz University of Technology |
Noori, Shiva | YAGHMA B.V |
Yaghmaei, Emad | YAGHMA B.V |
Larsen, Rie Brammer | Yaghma |
Thevenin, Simon | IMT Atlantique |
Rezaei, Hamidreza | IMT Atlantique, Nantes, France |
Woess, Shamaim | ECI-Mechatronics GmbH |
Daub, Rüdiger | Technical University of Munich (TUM, Fraunhofer IGCV |
Keywords: Sustainable Production and Service Automation, Software, Middleware and Programming Environments, Assembly
Abstract: The research project ALICIA – Assembly Lines in Circulation – will provide a marketplace platform that enables the reuse of second-hand assembly resources. Based on a formalized resource description and AI-based decision-making support, the platform suggests suitable second-hand resources for new assembly lines. For these resources, an Asset Administration Shell (AAS) is developed to integrate them into Industrial Internet of Things (IIoT) systems. In return, the legacy resources can be connected to their Digital Twins (DTs) while complying with data security standards. Further, ALICIA evaluates environmental sustainability aspects measured by Key Performance Indicators (KPIs) as well as worker-centric aspects, e.g., worker skills, when selecting second-hand equipment to reuse in new lines. ALICIA also identifies potential ethical impacts and risks caused by the technologies, stakeholders, or the ecosystem's processes to be addressed in designing digital solutions. Besides ALICIA’s core services, developed within the project, the platform will be open to partners to offer their services in the context of a second-hand equipment market, such as equipment health state assessment, repair, and Life Cycle Assessment (LCA), among others. To evaluate the platform’s business models, a value proposition analysis is conducted, showing how Robotics as a Service (RaaS) concepts can address potential stakeholder challenges in the project.
|
|
08:54-09:12, Paper TuAT8.4 | |
Automated Workspace Scanning with Local Areas of Interest in the Context of Industrial Robots As a Service (IRaaS) (I) |
|
Müller, Julian | Technical University of Munich (TUM), Institute for Machine Tool |
Trattnig, Stephan | Technical University of Munich (TUM), Institute for Machine Tool |
Heuss, Lisa | Technical University of Munich (TUM), Institute for Machine Tool |
Geng, Paul | Technical University of Munich (TUM) |
Tanz, Lukas | Technical University of Munich |
Daub, Rüdiger | Technical University of Munich (TUM, Fraunhofer IGCV |
Keywords: Computer Vision for Manufacturing, Intelligent and Flexible Manufacturing, Industrial and Service Robotics
Abstract: Recent advances in robotics provide an extensive toolkit for tasks such as collision-free path planning, trajectory optimization, and object manipulation, facilitating flexible and adaptive programming in industrial environments. However, these solutions heavily rely on accurate digital 3D models of the robot workspace, which are typically outdated or unavailable in brownfield environments, making their deployment challenging. While 3D scanning is well explored in mobile robotics and navigation, its application in stationary industrial robotics remains underutilized. To address this gap, this paper presents a methodology that integrates eye-in-hand 3D scanning, automated collision map generation, and object detection. The point cloud data acquired by the 3D camera is used to generate a voxel-based occupancy map that enables robot path planning in unknown environments. In addition, we place localization markers near objects of interest, thereby defining localized search areas. These are used for the detection and registration of relevant objects, such as workpieces, within the point cloud. The methodology is implemented in ROS and validated through an industrial use case study that resembles a typical production environment. Our scanning approach eliminates the need to manually create or retrofit digital models of the robot workspace, as is typically done with CAD tools. As a result, our automated workspace scanning solution enables advanced robotic path planning and manipulation as required within the context of Industrial Robots as a Service (IRaaS).
|
|
TuAT9 |
Room T9 |
Data-Driven Analysis and Control |
Special Session |
Chair: Fortino, Giancarlo | Università Della Calabria |
Organizer: Famularo, Domenico | Universita' Della Calabria |
Organizer: Fortino, Giancarlo | Università Della Calabria |
Organizer: Puig, Vicenç | UPC |
Organizer: Zhou, MengChu | New Jersey Institute of Technology |
|
08:00-08:18, Paper TuAT9.1 | |
A RHC Scheme for Constrained Nonlinear Systems Based on Data-Driven Robust Backward Reachable Sets (I) |
|
Gagliardi, Gianfranco | Università Della Calabria |
Famularo, Domenico | Universita' Della Calabria |
Tedesco, Francesco | Università Della Calabria |
Franzè, Giuseppe | University of Calabria |
Keywords: Machine learning, Model Learning for Control, Deep Learning in Robotics and Automation
Abstract: In this paper, the problem of computing backward reachable sets directly from noisy data, without requiring a known system model, is considered. Starting from a Lipschitz continuous nonlinear system, a procedure for deriving inner approximations of time-backward reachable sets using matrix zonotopes is developed. Theoretical results ensuring that the computed reachable sets properly inner-approximate the true reachable set are proposed, and a computable scheme is obtained. A numerical example based on Receding Horizon Control illustrates the effectiveness of the proposed approach.
|
|
08:18-08:36, Paper TuAT9.2 | |
Benchmarking Population-Based Reinforcement Learning across Robotic Tasks with GPU-Accelerated Simulation (I) |
|
Shahid, Asad Ali | IDSIA |
Narang, Yashraj | NVIDIA |
Petrone, Vincenzo | Università Degli Studi Di Salerno |
Ferrentino, Enrico | University of Salerno |
Handa, Ankur | NVIDIA |
Fox, Dieter | University of Washington |
Pavone, Marco | Stanford University |
Roveda, Loris | SUPSI-IDSIA |
Keywords: AI-Based Methods, Machine learning, Reinforcement
Abstract: In recent years, deep reinforcement learning (RL) has shown its effectiveness in solving complex continuous control tasks. However, this comes at the cost of an enormous amount of experience required for training, exacerbated by the sensitivity of learning efficiency and policy performance to hyperparameter selection, which often requires numerous time-consuming trial experiments. This work leverages a Population-Based Reinforcement Learning (PBRL) approach and a GPU-accelerated physics simulator to enhance the exploration capabilities of RL by training multiple policies in parallel. The PBRL framework is benchmarked against three state-of-the-art RL algorithms – PPO, SAC, and DDPG – dynamically adjusting hyperparameters based on the performance of learning agents. The experiments are performed on four challenging tasks in Isaac Gym – Anymal Terrain, Shadow Hand, Humanoid, Franka Nut Pick – by analyzing the effect of population size and mutation mechanisms for hyperparameters. The results show that PBRL agents achieve superior performance, in terms of cumulative reward, compared to non-evolutionary baseline agents. Moreover, the trained agents are finally deployed in the real world for a Franka Nut Pick task. To our knowledge, this is the first sim-to-real attempt at deploying PBRL agents on real hardware. Code and videos of the learned policies are available on our project website.
|
|
08:36-08:54, Paper TuAT9.3 | |
Data-Driven Forward Reachability Analysis (I) |
|
Franzè, Giuseppe | University of Calabria |
Puig, Vicenç | UPC |
Keywords: Behavior-Based Systems, Machine learning, Probability and Statistical Methods
Abstract: In this paper, reachability analysis for a class of nonlinear systems is addressed in a data-driven setting. The resulting approach combines linear time-invariant system behavior, data-driven modeling, and reinforcement learning algorithms into a single framework. This makes it possible to determine outer approximations of the exact successor sets, whose accuracy is evaluated by means of statistical tests. Finally, the validity of the proposed approach is tested on a benchmark example, with numerical comparisons against a well-reputed competitor.
|
|
08:54-09:12, Paper TuAT9.4 | |
Data-Driven Bayesian Maximum Entropy Multi-Objective Hyperparameter Optimization for PCNN Image Fusion (I) |
|
Chen, Shuaijie | Wuhan University of Technology |
Keywords: Optimization and Optimal Control, Swarms, Data fusion
Abstract: To address the challenges of hyperparameter tuning and reliance on manual experience in dual-channel pulse-coupled neural networks (DCPCNN) for image fusion tasks, the paper proposes a data-driven Bayesian maximum entropy (DBME) multi-objective optimization method to enhance image fusion performance. Firstly, the source images are transformed into the NSST domain, yielding low-frequency and high-frequency subbands. Secondly, we design a DCPCNN model based on DBME optimization to fuse the high-frequency subbands. For the low-frequency subbands, a fusion rule based on weighted local energy and multi-scale morphological gradient is proposed. Finally, the fused image is reconstructed by the inverse NSST. The results demonstrate that, compared to other fusion methods, this approach performs better in the fusion of infrared and visible light images, as well as multi-focus images, with significant advantages in metrics such as AG, SD, SCD, and VIFF.
|
|
TuAT10 |
Room T10 |
Medical Applications 1 |
Regular Session |
Chair: Zhang, Xiaotong | Massachusetts Institute of Technology |
|
08:00-08:18, Paper TuAT10.1 | |
Dentara 1.0: An Autonomous Dental Surgery Assistive Robotic Station |
|
Senarathna, Sanjaya | University of Moratuwa |
Kaluarachchi, Yasiru | University of Moratuwa |
Alahakoon Vidanelage, Pamuditha Lakshan | University of Moratuwa |
Amarasinghe, Ranjith | University of Moratuwa |
Jayathilaka, Wanasinghe Arachchige Dumith Madushanka | Department of Mechanical Engineering, University of Moratuwa |
Hanchapola Appuhamilage, Gihan Charith Premachandra | Singapore University of Technology and Design |
Tan, U-Xuan | Singapore University of Technology and Design |
Keywords: Medical Robots and Systems, Human-Centered Automation, Autonomous Agents
Abstract: The inclusion of robotics in surgical procedures enhances clinical outcomes by reducing operating duration, increasing accuracy, and mitigating risks. However, collaboration between surgeons and assistants can lead to communication gaps, compromising the sterile environment and causing delays. To address these challenges, we introduce Dentara, a novel assistive robotic station for dental surgery. Dentara features a five-degrees-of-freedom (DOF) articulated robot arm and a 2D Cartesian robot configuration mounted on the ceiling, resulting in zero footprint and eliminating disturbances. The design integrates a hybrid mechanical robot configuration, surpassing footprint limitations and enhancing accessibility and maneuverability within the surgical suite. Voice commands serve as the primary interaction method, with hand gestures as a secondary mode. Computer vision techniques detect surgical tools and the surgeon's hand position. The prototype Dentara 1.0 was developed, and experiments were conducted to evaluate its feasibility, focusing on robot motion, tool detection, and handling. Results indicate that Dentara 1.0 is feasible for automating the assistant's role, particularly in understaffed settings. The project demonstrates a cohesive integration of multiple subsystems, reflecting a strong systems engineering approach to surgical automation.
|
|
08:18-08:36, Paper TuAT10.2 | |
A Handheld, Dual-Arm Surgical Device for Semi-Automated Trans-Oral Endoscopic Resection |
|
Zhong, Xin | University of Electronic Science and Technology of China |
Zhu, Runfeng | South China University of Technology |
Zhao, Qing xiang | Sichuan University |
Zhong, Yong | South China University of Technology |
Li, Kang | West China Hospital, Sichuan University |
Keywords: Medical Robots and Systems, Robotics and Automation in Life Sciences, Tendon/Wire Mechanism
Abstract: While transoral surgery is popular for its minimally invasive nature and cosmetic benefits, the anatomy of the oropharynx and larynx presents narrow and irregularly shaped operative spaces, requiring slender and dexterous surgical instruments. For surgical operations in narrow spaces such as tonsillectomy, visualization and dexterous manipulation are generally indispensable. To address the limitations of conventional transoral tools (e.g., inconvenient eye-hand coordination, limited distal dexterity and large footprint), we propose a portable handheld dual-arm device consisting of a master arm for tissue manipulation and a slave arm carrying an endoscopic camera for monitoring the surgical scene. The master arm is a tendon-driven riveted continuum manipulator, and the slave endoscope is a dual-segment slender arm based on concentric push/pull robot (CPPR). Through delicate design, the actuation unit is miniaturized. In addition, the kinematics was built for eye-hand automatic coordination, and experiments have demonstrated its clinical potential in endoscopic tonsillectomy.
|
|
08:36-08:54, Paper TuAT10.3 | |
A Practical Micropipette-Image Calibration Method for Somatic Cell Microinjection |
|
Pan, Fei | Lingnan University |
Chen, Shuxun | City University of Hong Kong |
Zheng, Liushuai | City University of Hong Kong |
Zhi, Shaohua | The Hong Kong Polytechnic University |
Chen, Xi | Lingnan University |
Sun, Dong | City University of Hong Kong |
Keywords: Automation at Micro-Nano Scales, Biological Cell Manipulation, Manipulation Planning
Abstract: This paper proposes a practical micropipette-image (2-D pixel and 3-D spatial coordination) calibration method indispensable to somatic cell microinjection, leveraging advancements in modern motorized micromanipulators. The method determines the depth information of the micropipette in the microscope field by assessing the contact between the micropipette tip and the bottom of the culture dish. It uses recoverable deformation upon contact as a criterion for precise positioning, ensuring the tip is on the dish's bottom surface and in the microscope's focus plane. Additionally, the paper introduces an on-the-spot method for breaking a micropipette tip and a preprocessing technique for somatic cells. The proposed micropipette tip-breaking method, using a low-cost acrylic ring, overcomes previous drawbacks and proves quick and user-friendly. The preprocessing technique converts fully adherent somatic cells into semi-adherent cells, increasing cell thickness for easier puncturing. Combining these techniques, the study validated the approaches through over 900 injections on human dermal fibroblast (HDF) cells, achieving a success rate of 53.3% and a survival rate of 95.8%.
|
|
08:54-09:12, Paper TuAT10.4 | |
Evaluation and Optimization of Screening Strategies for Cancers with Age-Specific Incidence Rates (I) |
|
Ai, Yi | Xi'an Jiaotong University |
Zhang, Sheng | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Modelling, Simulation and Optimization in Healthcare, Scheduling in Healthcare, Health Care Management
Abstract: Cancer remains a major threat to human health, and early screening is crucial for its prevention and management. To enhance the efficiency of screening programs, it is essential to evaluate and optimize screening strategies. This paper focuses on cancers characterized by age-specific incidence rates and develops a stochastic model to evaluate the benefit of various screening strategies. The model calculates the expected benefit that a screened population can gain from any given screening schedule, quantified in quality-adjusted life years. Applying this approach, we assess a range of rule-based screening strategies for central nervous system cancer and identify the dominant strategies on the Pareto frontier, which achieve a balance between fewer screenings and greater benefit. Sensitivity analysis results demonstrate that the findings exhibit a certain level of robustness to parameter fluctuations.
|
|
TuAT11 |
Room T11 |
Best Healthcare Automation Paper Competition |
Special Session |
Chair: Lennartson, Bengt | Chalmers University of Technology |
|
08:00-08:25, Paper TuAT11.1 | |
Deep Learning-Enhanced Robotic Subretinal Injection with Real-Time Retinal Motion Compensation |
|
Wu, Tianle | Johns Hopkins University |
Esfandiari, Mojtaba | Johns Hopkins University |
Zhang, Peiyao | Johns Hopkins University |
Taylor, Russell H. | The Johns Hopkins University |
Gehlbach, Peter | Johns Hopkins Medical Institute |
Iordachita, Ioan Iulian | Johns Hopkins University |
Keywords: Medical Robots and Systems, Deep Learning in Robotics and Automation, Motion Control
Abstract: Subretinal injection is a critical procedure for delivering therapeutic agents to treat retinal diseases such as inherited retinal diseases (IRD) and age-related macular degeneration (AMD). However, retinal motion caused by physiological factors such as respiration and heartbeat significantly impacts precise needle positioning, increasing the risk of retinal pigment epithelium (RPE) damage. This paper presents a fully autonomous robotic subretinal injection system that integrates intraoperative optical coherence tomography (iOCT) imaging and deep learning-based motion prediction to synchronize needle and retinal motion. A Long Short-Term Memory (LSTM) neural network is used to predict internal limiting membrane (ILM) motion, outperforming a Fast Fourier Transform (FFT)-based baseline model. Additionally, a real-time registration framework aligns the needle tip position with the robot’s coordinate frame. Then, a dynamic proportional speed control strategy ensures smooth and adaptive needle insertion. Experimental validation in both simulation and ex vivo open-sky porcine eyes demonstrates precise motion synchronization and successful subretinal injections. The experiments achieve a mean tracking error below 16.4 µm in pre-insertion phases. These results show the potential of AI-driven robotic assistance to improve the safety and accuracy of retinal microsurgery.
|
|
08:25-08:50, Paper TuAT11.2 | |
Out-Of-Distribution Modular Hospital Fit-Out Scheduling Via Memory-Augmented Deep Reinforcement Learning (I) |
|
Han, Yujie | The Hong Kong Polytechnic University |
Sun, Kexin | The Hong Kong Polytechnic University |
Zhao, Zhiheng | The Hong Kong Polytechnic University |
Huang, George Q. | The Hong Kong Polytechnic University |
Keywords: Automation in Construction, Planning, Scheduling and Coordination, Agent-Based Systems
Abstract: The transition from Industry 4.0 to Industry 5.0 has highlighted the critical role of Modular Integrated Construction (MiC), particularly in rapidly deployable modular hospitals that address urgent healthcare demands. As the final stage before delivery, fit-out directly impacts both project speed and healthcare quality. However, scheduling in this phase faces challenges from dynamic labor allocation and worker fatigue, which traditional methods struggle to handle in out-of-distribution (OOD) settings. To tackle this, we reformulate the problem as a Flexible Job-shop Scheduling Problem with Workload Constraints (WL-FJSP) and propose a memory-augmented framework that models worker-task dynamics. By incorporating adaptive gating mechanisms, the model captures fatigue variations and jointly optimizes medical task fulfillment and fit-out efficiency. Experiments show improved performance over traditional and state-of-the-art methods, with strong generalization across varying instance scales.
|
|
08:50-09:15, Paper TuAT11.3 | |
Dynamic Allocation of Medical Examination Resources in Hospitals with Varying Appointment Preferences (I) |
|
Zong, Fangyu | Tsinghua University |
Zhang, Mirui | Tsinghua University |
Zhao, Yue | Beijing Tsinghua Changgung Hospital |
Fan, Zhenghao | Tsinghua University |
Wang, Feifan | Tsinghua University |
Keywords: Scheduling in Healthcare, Health Care Management, Modelling, Simulation and Optimization in Healthcare
Abstract: Appointment systems have been widely utilized across various domains, particularly in healthcare systems. However, the dynamic allocation of appointment resources has not yet been fully studied. Dynamic resource allocation refers to the process of continuously reallocating future resources based on the evolving appointment status. This study addresses the dynamic allocation of examination resources between outpatient and inpatient departments in large general hospitals, where imbalanced resource distribution often leads to prolonged appointment lead times and inefficiencies. By incorporating diverse patient preferences, this study formulates the problem as a Markov Decision Process (MDP) and solves it using a reinforcement learning approach, specifically the Proximal Policy Optimization (PPO) algorithm. The proposed strategy adapts resource allocation based on real-time patient appointment status, optimizing the appointment lead time. Numerical experiments demonstrate the superiority of the dynamic resource allocation strategy over static and random allocation strategies in different scenarios. Particularly under high patient volumes and complex decision spaces, the reinforcement learning-based dynamic resource allocation strategy exhibits excellent effectiveness and strong robustness.
|
|
TuBT1 |
Room T1 |
AI/ML for Healthcare 2 |
Regular Session |
Chair: Kalluri, Udaya | Oak Ridge National Lab |
|
10:45-11:03, Paper TuBT1.1 | |
Surg-SegFormer: A Dual Transformer-Based Model for Holistic Surgical Scene Segmentation |
|
Ahmed, Fatma | Hamad Medical Corporation |
Abdel-Ghani, Muraam | Hamad Medical Corporation |
Ali, Mahmoud | Qatar University |
Arsalan, Muhammad | Qatar University |
Al-Ali, Abdulaziz | Qatar University |
Balakrishnan, Shidin | Hamad Medical Corporation |
Keywords: AI and Machine Learning in Healthcare, Computer Vision in Automation, Machine learning
Abstract: Holistic surgical scene segmentation in robotic-assisted surgery (RAS) enables new residents to identify various tissues, articulated tools, and essential structures, including veins and vessels. Due to the disparity between the number of expert surgeons and trainees, the explanation of the scene and the delineation of go-no-go zones can be overwhelming. A high-performance semantic segmentation model can facilitate the understanding of the surgical scene as a post-operative analysis for residents. Nevertheless, advanced models necessitate prompts, rendering them impractical, as surgical videos usually exceed three hours. We present Surg-SegFormer, a prompt-free model that surpasses the state of the art. The model attained a mean Intersection over Union (mIoU) of 0.80 on the Endovis2018 dataset and 0.54 on the Endovis2017 dataset. This robust surgical segmentation model enables residents to understand the surgical scene and relieves experts of the tutoring role.
|
|
11:03-11:21, Paper TuBT1.2 | |
Automated Plant Tissue Excision and Manipulation Via Integrated Robotics and Machine Vision for Plant Transformation |
|
Walters, Alex | Oak Ridge National Laboratory |
Nycz, Andrzej | Oak Ridge National Laboratory |
Kalluri, Udaya | Oak Ridge National Lab |
Paquit, Vincent | Oak Ridge National Laboratory |
Leach, S Clay | Oak Ridge National Laboratory |
Keywords: Robotics and Automation in Life Sciences, Product Design, Development and Prototyping, Computer Vision in Automation
Abstract: Automation technologies in the biological plant sciences domain lack the ability to handle the soft, live, and solid biological materials necessary for lab-based plant genetic transformation testing. This limitation has created a research bottleneck hindering progress in agricultural, biomedical, and pharmaceutical industries. Traditionally, plant transformation at lab-scale testing has relied on slow and labor-intensive manual techniques, contributing to this challenge. Automation offers a promising solution to accelerate plant biosystems research, enabling higher throughput in gene functional validation and shortening crop improvement timelines by reducing the need for manual interaction and allowing for continuous sampling. To this end, an automated system that handles early steps in the plant transformation process is reported here. This report provides a comprehensive analysis of the development and evolution of tools for acquisition and manipulation of living plant tissue samples, known as explants, based on extensive testing with an automated plant transformation system.
|
|
11:21-11:39, Paper TuBT1.3 | |
Fusing Tool Segmentation Predictions from Pose-Informed Morphological Polar Transform of Endoscopic Images |
|
Wu, Xiaoyi | Smith College |
Sehnawi, Dina | Smith College |
Zhu, YiCheng | Rochester Institute of Technology |
Lee, Yangming | Rochester Institute of Technology |
Huang, Kevin | Smith College |
Keywords: Automation in Life Science: Biotechnology, Pharmaceutical and Health Care, Data fusion, Medical Robots and Systems
Abstract: This paper presents and evaluates methods of fusing semantic image segmentation predictions, and highlights a novel hybrid approach that combines spatial frequency and edge features. Tool-labeled endoscopy from sinus surgery served as the image dataset, while two methods of surgical tool segmentation via morphological polar transform provided distinct predictions. The morphological transform acted as an input pre-processing step prior to segmentation via the U-Net architecture. Two separate predictions were available for each image based on the transformation center: one at the surgical tool-tip (TT) and one at the surgical tool vanishing point (VP). The goal in this work was to systematically generate a superior segmentation by fusing information from the two aforementioned predictions. While methods for deep learning based segmentation fusion exist, such methods require extensive datasets and potentially obfuscate explainability. Thus, three approaches relying solely on low-level features to fuse gray-scale segmentation predictions were proposed in this work: (1) gradient estimation, (2) Laplacian pyramid and (3) a modified spatial frequency method. The latter two demonstrated enhanced segmentation compared to original predictions. This work also explores explainability towards identifying candidate prediction pairs for fusion via unsupervised clustering as well as a ResNet-18 model. Cursory investigations into properties of the fused predictions provide insight into the potential use of the proposed methods in domains other than surgical tool segmentation.
|
|
11:39-11:57, Paper TuBT1.4 | |
Addressing Class Imbalance in Diabetic Retinopathy Detection: An Enhanced Diffusion-Resampling Data Augmentation Approach |
|
Yangue, Emmanuel | Oklahoma State University |
O'Connor, Ethan | Oklahoma State University |
Liu, Chenang | Oklahoma State University |
Keywords: AI and Machine Learning in Healthcare, Machine learning, Big-Data and Data Mining
Abstract: Diabetic retinopathy (DR) is a severe complication of diabetes that can lead to vision impairment or blindness if not detected early. Machine learning (ML) models offer a promising approach for early DR detection by leveraging electronic health records (EHRs). However, class imbalance, where non-DR cases vastly outnumber DR cases, poses a significant challenge, often leading to inaccurate predictions. This study proposes a new data augmentation framework, termed DDPM-COFFRe, that integrates an enhanced denoising diffusion probabilistic model (DDPM) developed in this work with resampling to address the data imbalance issue more effectively. DDPM-COFFRe integrates a cosine noise schedule, optimized contrastive learning, fairness metrics, and outlier filtering to generate high-quality synthetic DR samples. It also applies ADASYN-based resampling to further refine data augmentation by targeting underrepresented regions of the minority class. The effectiveness of DDPM-COFFRe is evaluated using multiple classifiers across three test scenarios: an imbalanced test set, a weighted F1 evaluation, and a balanced test set. The results show that while augmentation improves recall and DR sensitivity, especially under balanced test conditions, its impact on overall classification performance remains limited in imbalanced, real-world scenarios. The findings suggest that while augmentation helps models identify DR cases, improvements are constrained by classifier limitations and the complexity of distinguishing DR from non-DR in imbalanced test sets.
|
|
TuBT2 |
Room T2 |
TASE Paper Session 3 |
Special Session |
Chair: Krüger, Marius | Technical University of Munich |
|
10:45-11:03, Paper TuBT2.1 | |
Digital Twin-Based Smart Manufacturing: Dynamic Line Reconfiguration for Disturbance Handling |
|
Fu, Bo | University of Michigan |
Bi, Mingjie | Beijing Institute for General Artificial Intelligence |
Umeda, Shota | Hitachi, Ltd |
Nakano, Takahiro | HITACHI |
Nonaka, Youichi | Hitachi |
Zhou, Quan | Hitachi America Ltd |
Matsui, Takaharu | Hitachi America, Ltd |
Tilbury, Dawn | University of Michigan |
Barton, Kira | University of Michigan at Ann Arbor |
Keywords: Intelligent and Flexible Manufacturing, Cyber-physical Production Systems and Industry 4.0, Planning, Scheduling and Coordination
Abstract: The increasing complexity of modern manufacturing, coupled with demand fluctuation, supply chain uncertainties, and product customization, underscores the need for manufacturing systems that can flexibly update their configurations and swiftly adapt to disturbances. However, current research falls short in providing a holistic reconfigurable manufacturing framework that seamlessly monitors system disturbances, optimizes alternative line configurations based on machine capabilities, and automates simulation evaluation for swift adaptations. This paper presents a dynamic manufacturing line reconfiguration framework to handle disturbances that result in operation time changes. The framework incorporates a system process digital twin for monitoring disturbances and triggering reconfigurations, a capability-based ontology model capturing available agent and resource options, a configuration optimizer generating optimal line configurations, and a simulation generation program initializing simulation setups and evaluating line configurations at approximately 400x real-time speed. A case study of a battery production line has been conducted to evaluate the proposed framework. In two implemented disturbance scenarios, the framework successfully recovers system throughput with limited resources, preventing the 26% and 63% throughput drops that would have occurred without a reconfiguration plan. The reconfiguration optimizer efficiently finds optimal solutions, taking an average of 0.03 seconds to find a reconfiguration plan for a manufacturing line with 51 operations and 40 available agents across 8 agent types.
|
|
11:03-11:21, Paper TuBT2.2 | |
Proactive Robust Hardening of Resilient Power Distribution Network: Decision-Dependent Uncertainty Modeling and Fast Solution Strategy |
|
Ma, Donglai | Xi'an Jiaotong University |
Cao, Xiaoyu | Xi'an Jiaotong University |
Zeng, Bo | University of Pittsburgh |
Jia, Qing-Shan | Tsinghua University |
Chen, Chen | Xi'an Jiaotong University |
Zhai, Qiaozhu | Xi'an Jiaotong University |
Guan, Xiaohong | Xi'an Jiaotong University |
Keywords: Planning, Scheduling and Coordination, Power and Energy Systems automation, Optimization and Optimal Control
Abstract: To address the power system hardening problem, traditional approaches often adopt robust optimization (RO) that considers a fixed set of concerned contingencies, regardless of the fact that hardening some components actually renders relevant contingencies impractical. In this paper, we directly adopt a dynamic uncertainty set that explicitly incorporates the impact of hardening decisions on the worst-case contingencies, which leads to a decision-dependent uncertainty (DDU) set. Then, a DDU-based robust-stochastic optimization (DDU-RSO) model is proposed to support the hardening decisions on distribution lines and distributed generators (DGs). Also, the randomness of load variations and available storage levels is considered through stochastic programming (SP) in the innermost level problem. Various corrective measures (e.g., the joint scheduling of DGs and energy storage) are included, coupling with a finite support of stochastic scenarios, for resilience enhancement. To relieve the computation burden of this new hardening formulation, an enhanced customization of parametric column-and-constraint generation (P-C&CG) algorithm is developed. By leveraging the network structural information, the enhancement strategies based on resilience importance indices are designed to improve the convergence performance. Numerical results on 33-bus and 118-bus test distribution networks have demonstrated the effectiveness of DDU-RSO aided hardening scheme. Furthermore, in comparison to existing solution methods, the enhanced P-C&CG has achieved a superior performance by reducing the solution time by a few orders of magnitude.
|
|
11:21-11:39, Paper TuBT2.3 | |
Surrogate-Assisted Scenario-Generation Method for Simulation-Based Stochastic Programming Problems |
|
Suemitsu, Issei | Hitachi, Ltd |
Izui, Kazuhiro | Department of Micro Engineering, Kyoto University |
Keywords: Planning, Scheduling and Coordination, Optimization and Optimal Control, Inventory Management
Abstract: Simulation is required in real-world industrial decision-making to model complex settings such as supply chain management with uncertain future demand. Scenario generation is critical for handling inherent uncertainties as deterministic scenarios to solve stochastic programming problems (SPPs). Generating a minimal yet representative scenario set is important to solve simulation-based SPPs (SBSPPs) since SBSPPs require thousands of simulation iterations proportional to the number of scenarios, leading to significant computational time. However, conventional methods, such as the Monte Carlo method and recent problem-driven approaches, are ineffective in solving SBSPPs due to time-consuming simulation evaluation. This paper proposes a surrogate-assisted scenario-generation method called Inferred Cost-Space Scenario Clustering (ICSSC) that is applicable to various SPPs including SBSPPs. ICSSC employs a scenario clustering based on a new cost-space scenario distance evaluated by the surrogate model trained with offline simulation data to quickly approximate simulation evaluations. We conducted three types of numerical experiments to validate the effectiveness: Markowitz portfolio optimization, stochastic server location, and inventory placement optimization. Empirical results revealed that ICSSC generates an effective scenario set based on the impact of uncertainties on decision outcomes for broader SPPs, and yields better solutions with 7.2 times shorter runtime than Monte Carlo methods.
|
|
11:39-11:57, Paper TuBT2.4 | |
Inferring Cable-Suspended End-Effector Oscillations from Hydraulic Actuators’ Responses in Diaphragm Wall Hydraulic Grabs |
|
Krüger, Marius | Technical University of Munich |
Vogel-Heuser, Birgit | Technical University Munich |
Waterman, Daniel | HAWE Hydraulik SE |
Cha, Suhyun | HAWE Hydraulik SE |
Hujo, Dominik | Technical University of Munich |
Prinz, Theresa | Technical University of Munich, TUM School of Engineering and De |
Pohl, Daniel | Sensor-Technik Wiedemann GmbH |
Semel, Matthias | BAUER Maschinen GmbH |
Keywords: Automation in Construction, Hydraulic/Pneumatic Actuators, Sensor Fusion
Abstract: A Diaphragm Wall Hydraulic Grab (DWHG), used in civil engineering for bulk excavation, features a cable-suspended end-effector (also named attachment tool). The end-effector begins to oscillate during DWHG operation. Essential insights into the DWHG’s performance or productivity in operation could be derived from end-effector oscillations. However, due to limited robustness, the end-effector oscillation curves from a DWHG in operation cannot be tracked directly by motion sensors attached to the end-effector. This article presents and evaluates a novel approach for DWHGs to infer cable-suspended end-effector oscillations from their hydraulic actuators’ responses. Hydraulic actuators’ responses depict the changes of hydraulic parameters in hydraulic actuators due to end-effector oscillations. The main contribution of this article is an investigation of how far oscillations of a cable-suspended end-effector in a DWHG can be inferred from their hydraulic actuators’ responses. For this purpose, a four-step workflow is followed. First, end-effector oscillations and their hydraulic actuators’ responses are collected for a typical DWHG steering sequence. Second, the collected end-effector oscillation and hydraulic actuators’ response curves are analyzed and processed. Third, a model to infer end-effector oscillations from hydraulic actuators’ responses is built. Fourth, an evaluation shows that the inferred oscillation curves (model output) are similar to the true end-effector oscillations.
|
|
TuBT3 |
Room T3 |
Additive Manufacturing 2 |
Regular Session |
Chair: Stiglmeier, Lukas | University of Applied Sciences at Offenburg |
|
10:45-11:03, Paper TuBT3.1 | |
An Innovative, Fully 3D-Printed, Inherent Actuatorless, Concentric Gripper for Robotic Applications |
|
Stiglmeier, Lukas | University of Applied Sciences at Offenburg |
Schröder, Steffen | University of Applied Sciences at Offenburg |
Waltersbacher, Robin | University of Applied Sciences at Offenburg |
Dr. Wendt, Thomas M. | University of Applied Sciences Offenburg |
Rupitsch, Stefan Johann | University of Freiburg |
Keywords: Additive Manufacturing, Force and Tactile Sensing, Industrial and Service Robotics
Abstract: Recent advances in 3D printing technology have made it possible to produce functional objects and structures applying a variety of 3D printing processes and materials. This opens up new possibilities for gripping technology, enabling the flexible design of grippers for specific tasks. In this contribution we present an innovative, fully 3D-printed concentric gripper for robotic applications. A special feature of the gripper is that it does not require an inherent actuator, since the rotary axis on the robot flange is employed for this purpose. The gripper’s central component is a 3D-printed torque sensor, fabricated employing a five-axis 3D printing system. The sensor enables the measurement of forces occurring during the gripping process. The gripper concept is validated in a series of experiments in which different objects are gripped. The sensor signals during the gripping process are analysed and the functionality and reproducibility are demonstrated.
|
|
11:03-11:21, Paper TuBT3.2 | |
Kolmogorov-Arnold Networks (KAN)-Enabled Incremental Learning for Online Process Monitoring of Additive Manufacturing |
|
Oskolkov, Boris | Oklahoma State University |
Yangue, Emmanuel | Oklahoma State University |
Tian, Wenmeng | Mississippi State University |
Kan, Chen | University of Texas at Arlington |
Liu, Chenang | Oklahoma State University |
Keywords: Data fusion, Machine learning, Additive Manufacturing
Abstract: This paper explores the advances of Kolmogorov-Arnold Networks (KANs) with potential applications in additive manufacturing (AM), such as online process monitoring with competitive cost efficiency. With their unique architecture and flexible activation functions, the emerging KANs provide a promising alternative to multi-layer perceptron (MLP)-based models, enabling better modeling capability of complex AM processes, especially with the need to involve incremental learning. The proposed knowledge distillation (KD) empowered CNN-KAN framework integrates a convolutional neural network (CNN) with KAN to model the complex AM processes while addressing challenges such as catastrophic forgetting in domain incremental learning. Utilizing the KD and flexible activation functions, the framework enables accurate anomaly detection and real-time process monitoring, even in dynamic and evolving data environments. The framework’s novelty lies in its first-time integration of KANs, CNN-based feature extraction, and KD-based incremental learning specifically tailored to dynamic AM processes. The experimental results highlight the effectiveness of the proposed method through a practical case study in AM, showcasing the promising future of the proposed KAN-enabled incremental learning framework.
|
|
11:21-11:39, Paper TuBT3.3 | |
From Flat to Form-Fitting: A Computational Geometry Approach to 3D Conformal Electronics Design and Rapid Prototyping |
|
Wang, Lehong | Carnegie Mellon University |
Dai, Zilin | Worcester Polytechnic Institute |
Wu, Yuchen | Carnegie Mellon University |
Jin, Ye | Carnegie Mellon University |
Barry, Colm | Carnegie Mellon University - Biorobotics Laboratory |
Tian, Yu | Carnegie Mellon University |
Ruan, Fujun | Carnegie Mellon University |
Shastry, Shricharan | Carnegie Mellon University |
Panat, Rahul | Carnegie Mellon University |
Fedder, Gary K. | Carnegie Mellon University |
Choset, Howie | Carnegie Mellon University |
Li, Lu | Carnegie Mellon University |
Keywords: Intelligent and Flexible Manufacturing, Product Design, Development and Prototyping, Additive Manufacturing
Abstract: The integration and fabrication of electrical circuits conformably onto 3D surfaces offer greater spatial efficiency, increased functionality, and improved performance in compact and tightly coupled electro-mechanical systems. However, existing 3D circuit prototyping workflows are often constrained by limited performance, insufficient generalizability or excessive manual effort and time requirements. In this paper, we present a framework that transforms 2D circuit design onto high-curvature 3D surfaces while preserving user-defined circuit characteristics and desired electrical parameters, such as trace length matching and resistance value target, allowing for the design of complex 3D circuitry using conventional 2D circuit design software that is intuitive for electrical engineers. The key contribution of this work is a two-stage processing algorithm that employs surface parameterization for 3D conformal circuit mapping followed by local distortion optimization for circuit parameter preservation. This method takes a 2D circuit design and a 3D CAD model of the target surface as input, and then generates 3D circuit fabrication and process plans. We demonstrate the efficacy of our framework with a comparative analysis of circuit property preservation against other mapping approaches, both in simulation and in physical experiments, showing an 85% reduction in circuit deformation. We also demonstrate the potential of our framework through test case applications in aerospace and medical devices.
|
|
11:39-11:57, Paper TuBT3.4 | |
Offline Platform Trajectory Planning for Print-While-Drive Additive Manufacturing Using Mobile Manipulators |
|
Lachmayer, Lukas | Leibniz University Hannover, Institute of Assembly Technolog |
Recker, Tobias | Leibniz University Hanover |
Heeren, Hauke | Leibniz Universität Hannover, Institute of Assembly Technology An |
Müller, Pitt | Leibniz Universität Hannover |
Raatz, Annika | Leibniz Universität Hannover |
Keywords: Automation in Construction, Motion and Path Planning, Additive Manufacturing
Abstract: The limited productivity growth within the construction industry in the last decades has increasingly driven the development of innovative manufacturing processes. Especially the expanding research field of robotic additive manufacturing in construction (AMC) is said to enhance the flexibility and efficiency. In particular, the usage of mobile manipulators as 3D printers enables the creation of manufacturing environments that are not constrained by the reach of the robotic arm. While initial approaches have implemented mobile manipulators that print from stationary positions before relocating, more sophisticated approaches focus on print-while-drive. Print-while-drive eliminates the risk of inducing weakening cold joints into the component during repositioning and further enhances the flexibility of the printing process. Existing approaches to print-while-drive rely exclusively on mobile manipulators with holonomic drives, such as Mecanum wheels. However, due to their design, Mecanum wheels are not suitable for use on uneven, contaminated, or loosely deposited surfaces. Such conditions, however, are common on construction sites. The application of alternative drive concepts, such as differential drive systems (commonly employed in track-driven platforms), necessitates the development of novel trajectory-planning concepts for mobile manipulators. To this end, this publication proposes a trajectory planning algorithm to derive a suitable mobile platform trajectory based on a given tool center point (TCP) / printing trajectory. The functionality of the developed algorithm is demonstrated and evaluated by simulating the trajectories for large-scale components.
|
|
TuBT4 |
Room T4 |
Reinforcement Learning 1 |
Regular Session |
Chair: Dionigi, Alberto | University of Perugia |
|
10:45-11:03, Paper TuBT4.1 | |
Enhancing Counterfactual Data Augmentation for Offline Reinforcement Learning in Vision-Based Control |
|
Brilli, Raffaele | University of Perugia |
Speziali, Paolo | Vrije Universiteit Brussel |
Dionigi, Alberto | University of Perugia |
Crocetti, Francesco | University of Perugia |
Costante, Gabriele | University of Perugia |
Keywords: Reinforcement, Causal Models, Deep Learning in Robotics and Automation
Abstract: Offline training of Deep Reinforcement Learning agents is a valuable solution for addressing autonomous control and robotics challenges, especially when acquiring real-world data is particularly difficult and simulators are not available. Despite its validity, this approach is burdened by the issue of sample inefficiency, particularly with high-dimensional data, such as images, making it less effective for vision-based tasks. Counterfactual data augmentation addresses this issue by expanding the training dataset with plausible samples consistent with stochastic environments. However, its application in vision-based control remains underexplored. This study introduces a novel counterfactual data augmentation technique for vision-based tasks, leveraging Deep Generative Models to estimate the Structural Causal Model (SCM) and the associated reward model. The learned SCM is then used to augment the training dataset and optimize the DRL agents. We show the effectiveness of our method compared to existing state-of-the-art approaches on a series of stochastic control problems, both with discrete and continuous action spaces.
|
|
11:03-11:21, Paper TuBT4.2 | |
Zero-Shot Sim-To-Real Reinforcement Learning for Fruit Harvesting |
|
Williams, Emlyn | University of Lincoln |
Polydoros, Athanasios | University of Lincoln |
Keywords: Agricultural Automation, Autonomous Agents, Robot Networks
Abstract: This paper presents a novel and comprehensive sim-to-real pipeline for autonomous strawberry harvesting from dense clusters using a Franka Panda robot. Our approach addresses the challenges of robotic manipulation in unstructured agricultural environments, particularly the difficulty of picking occluded and clustered fruit. We introduce "FruitGym", a custom open-source MuJoCo simulation environment designed to train a deep reinforcement learning agent. This environment leverages extensive domain randomization, varying lighting, object placement, and sensor noise to ensure the learned policy is robust and can be transferred directly to the real world. The agent is trained using the Dormant Ratio Minimization algorithm, which enhances sample efficiency and exploration. The proposed pipeline bridges low-level control with high-level perception and decision making, demonstrating promising performance in both simulation and in a real laboratory environment, laying the groundwork for successful transfer to real-world autonomous fruit harvesting.
|
|
11:21-11:39, Paper TuBT4.3 | |
Multi-Agent Reinforcement Learning for Robotized Coral Reef Sample Collection |
|
Correa, Daniel | Florida International University |
Kaarlela, Tero Heikki | Florida International University |
Fuentes, Jose | Florida International University |
Padrao, Paulo | Providence College |
Duran, Alain | Florida International University |
Bobadilla, Leonardo | Florida International University |
Keywords: Environment Monitoring and Management, Deep Learning in Robotics and Automation, Robotics and Automation in Life Sciences
Abstract: This paper presents a reinforcement learning (RL) environment for developing an autonomous underwater robotic coral sampling agent, a crucial coral reef conservation and research task. Using software-in-the-loop (SIL) and hardware-in-the-loop (HIL), an RL-trained controller is developed using a digital twin (DT) in simulation and subsequently verified in physical experiments. An underwater motion capture (MOCAP) system provides real-time 3D position and orientation feedback during verification testing for precise synchronization between the digital and physical domains. A key novelty of this approach is the combined use of a general-purpose game engine for simulation, deep RL, and real-time underwater motion capture for an effective zero-shot sim-to-real strategy.
|
|
11:39-11:57, Paper TuBT4.4 | |
Robust Reinforcement Learning for Autonomous Driving in Uncertain Environments |
|
Wang, Dejin | Northeastern University |
Ghoreishi, Seyede Fatemeh | Northeastern University |
Keywords: Learning and Adaptive Systems, Reinforcement, Intelligent Transportation Systems
Abstract: Autonomous vehicles typically operate in highly dynamic and uncertain environments where conventional reinforcement learning methods often fail to adapt due to fixed transition dynamics, static reward structures, and Markovian state assumptions. These limitations lead to environment-specific overfitting, reducing policy robustness in unseen conditions. To address this challenge, we propose a history-dependent reinforcement learning framework that integrates randomized training dynamics, context-aware reward functions, and memory-based policy optimization using Long Short-Term Memory (LSTM). Our method enables agents to implicitly infer latent environmental variations, dynamically adjust decision-making priorities, and improve robustness across diverse driving conditions without requiring explicit access to environment parameters. We employ Proximal Policy Optimization (PPO) with an LSTM-based policy architecture, leveraging historical interactions to encode environment-aware representations and enhance policy adaptability. The proposed method requires no fine-tuning and supports real-time decision-making, making it practical for deployment in real-world driving applications. Experiments performed in CARLA across various scenarios demonstrate the effectiveness of our approach in achieving reliable decision-making in adaptive cruise control systems.
|
|
TuBT5 |
Room T5 |
Simulations & Digital Twins 1 |
Regular Session |
Chair: Gao, Yixiang | Missouri University of Science and Technology |
|
10:45-11:03, Paper TuBT5.1 | |
Simulation System for Electrostatic Coating and Automated Robotic Coating Strategy Development for H-Beams of Various Scales |
|
Otsuka, Yutaro | Waseda University |
Jogo, Satoshi | Waseda University |
Tanaka, Genichiro | Waseda University |
Maruhashi, Akihiro | Komatsu Ltd |
Iwata, Hiroyasu | Waseda University |
Keywords: Factory Automation, Intelligent and Flexible Manufacturing, Sustainable Production and Service Automation
Abstract: The teaching process in robotic coating relies on operator expertise, resulting in repetitive cycles until quality standards are met, causing longer lead times and waste. By obtaining optimal coating path knowledge, smart manufacturing automation can be realized. We developed a predictive system for arbitrary geometries by integrating kinematic modeling of coating material before the nozzle with hydrodynamic behavior analysis after application, based on rheological properties. While previous research proposed coating paths for various geometric shapes, these were limited to non-electric coating applications. Our electrostatic coating model incorporates particle behavior dynamics based on the Navier-Stokes equation, considering voltage variations along with conventional parameters such as gun velocity, angle, discharge rate, and rheological properties. The digital twin methodology was validated through equipment tests, achieving coating film prediction accuracy within ±15 μm and providing insights into how parameters affect coating in electrostatic applications. This enables predictions without actual equipment, advancing smart manufacturing. With this simulation, we developed a scale-selectable coating strategy for H-beam components, structures that optimize electrostatic coating and are commonly coated using electric methods in industry. Our approach enables full automation of previously semi-manual coating processes. The consistency of the electrostatic coating strategy was validated through digital twin comparisons between simulations and physical experiments, resulting in a highly accurate coating strategy for H-beams. This fully automated strategy reduces worker health risks by minimizing coating spatter while advancing secure, reliable automation in manufacture
|
|
11:03-11:21, Paper TuBT5.2 | |
A Combined Nonlinear Simulation and Optimization Operation Framework for Hydrogen-Enable Zero-Carbon Energy System |
|
Guo, Wangyi | Xi’an Jiaotong University |
Xu, Zhanbo | Xi'an Jiaotong University |
Liu, Jinhui | Xi'an Jiaotong University |
Wu, Jiang | Xi'an Jiaotong University |
Liu, Kun | Xi'an Jiaotong University |
Yang, Huibiao | Ningxia Electric Power Research Institute |
Li, Hongqiang | Ningxia Electric Power Research Institute |
Keywords: Distributed Generation and Storage, Modelling, Simulation and Validation of Cyber-physical Energy Systems, Optimization and Optimal Control
Abstract: In recent years, the growth of distributed renewable energy has significantly reduced the carbon emissions of the power system. However, due to its volatility, it is difficult to integrate it into the power grid on a large scale. Therefore, the hydrogen-enable zero-carbon energy system (HZES) has been developed to address the complete consumption and comprehensive utilization of distributed renewable energy. Nevertheless, the physical mechanisms of the devices in the HZES are complex, and there are differences in time scales among various energy forms such as electricity, hydrogen, and heat. Limited by the solving ability of current commercial solvers, the existing scheduling methods struggle to accurately describe the nonlinear models of the devices. Therefore, this paper develops an optimization model and a simulation model for the HZES. Meanwhile, a combined nonlinear simulation and optimization (CSO) operation framework is put forward. Based on the Benders decomposition method, this framework divides the scheduling problem into four parts: energy-form scheduling, nonlinear simulation, sub-problem of safe operation, and generation of Benders cuts. The numerical results show that this framework can describe the nonlinearity of the devices at the temperature dynamics level, avoiding the temperature infeasibility problems that may be caused by the energy-form model.
|
|
11:21-11:39, Paper TuBT5.3 | |
Integration of the TIAGo Robot into Isaac Sim with Mecanum Drive Modeling and Learned S-Curve Velocity Profiles |
|
Schönbach, Vincent | University of Bonn |
Wiedemann, Marvin | Fraunhofer Institute for Material Flow and Logistics |
Memmesheimer, Raphael | University of Bonn |
Mosbach, Malte | University of Bonn |
Behnke, Sven | University of Bonn |
Keywords: Calibration and Identification, Industrial and Service Robotics, Simulation and Animation
Abstract: Efficient physics simulation has significantly accelerated research progress in robotics applications such as grasping and assembly. The advent of GPU-accelerated simulation frameworks like Isaac Sim has particularly empowered learning-based methods, enabling them to tackle increasingly complex tasks. The PAL Robotics TIAGo++ Omni is a versatile mobile manipulator equipped with a mecanum-wheeled base, allowing omnidirectional movement and a wide range of task capabilities. However, until now, no model of the robot has been available in Isaac Sim. In this paper, we introduce such a model, calibrated to approximate the behavior of the real robot, with a focus on its omnidirectional drive dynamics. We present two control models for the omnidirectional drive: a physically accurate model that replicates real-world wheel dynamics and a lightweight velocity-based model optimized for learning-based applications. In conjunction with these models, we introduce a learning-based calibration approach to approximate the real robot’s S-shaped velocity profile using minimal trajectory data recordings. This simulation should allow researchers to experiment with the robot and perform efficient learning-based control in diverse environments.
|
|
11:39-11:57, Paper TuBT5.4 | |
Navigation in Underground Mine Environments: A Simulation Framework for Quadruped Robots |
|
Gao, Yixiang | Missouri University of Science and Technology |
Awuah-Offei, Kwame | Missouri University of Science and Technology |
Keywords: Autonomous Agents, Simulation and Animation, Human-Centered Automation
Abstract: Quadruped robots have shown significant potential for navigating complex and hazardous environments, such as underground mines, where traditional wheeled or tracked systems have limitations. However, their development and deployment are hindered by the disparity between controlled laboratory testing and real-world conditions and the lack of tools (e.g., simulation testbeds) that expedite the required development and testing. This work develops a simulation testbed for expediting and advancing navigation algorithms, perception systems, and control strategies for quadruped robots in subterranean and hazardous environments. By utilizing high-fidelity 3D maps, ranging from intricate cave systems to real-world sites like the Edgar Mine, the simulation environment offers a safe, scalable platform for evaluating robot performance in unstructured terrains. Built upon ROS 2 and the latest Gazebo simulators, the simulation framework provides robust tools for testing and development. The codebase can be accessed at https://github.com/g1y5x3/spot_gazebo. The insights gained from these simulations help bridge the gap between laboratory research and practical field deployment, enhancing the capabilities of quadruped robots for applications such as mining safety and disaster response.
|
|
TuBT6 |
Room T6 |
Human-Centered Automation |
Regular Session |
Chair: Fantuzzi, Cesare | Università Di Modena E Reggio Emilia |
Co-Chair: Park, Daegil | Korea Research Institute of Ships & Ocean Engineering (KRISO) |
|
10:45-11:03, Paper TuBT6.1 | |
Human-In-The-Loop System for Enhancing Situational Awareness in Underwater Robot Operations: Development and Experimental Validation |
|
Park, Daegil | Korea Research Institute of Ships & Ocean Engineering (KRISO) |
Lee, Yeongjun | Korea Research Institute of Ships and Ocean Engineering |
Han, Jong-Boo | Korea Institute of Ships and Ocean Engineering |
Pyo, Seunghyun | UST KRISO School |
Koo, Bonhak | University of Science Technology |
Yeu, Tae-Kyeong | KRISO (Korea Research Institute of Ships & Ocean Engineering) |
Keywords: Telerobotics and Teleoperation, Virtual Reality and Interfaces, Haptics and Haptic Interfaces
Abstract: Remote underwater operations often experience reduced efficiency due to limited visibility and high turbidity. Suspended sediments generated during underwater tasks and actuator movements frequently obscure the workspace, hindering operators from accurately perceiving the environment. This limitation increases the risk of damage to both the robotic system and the target object. To overcome these challenges, we propose a Human-in-the-Loop System (HILS), a cognitive and control framework that establishes a real-time feedback loop between an underwater robot and a Cyber-Physical Operation System (CPOS) control station. The system integrates real-time sensory and perceptual data from the robot to construct a virtual reality (VR) interface. It also incorporates a physics engine to simulate object interactions and provides feedback through VR, motion, and haptic channels, enhancing the operator’s immersion and perception. We validate the proposed framework through remote underwater fracturing experiments. The system delivers multisensory feedback based on physical parameters collected during operation. Experimental results demonstrate that the proposed HILS significantly improves situational awareness and operational performance in underwater environments.
|
|
11:03-11:21, Paper TuBT6.2 | |
Co-Working with Cobots: A Digital Twin Approach to Smarter Logistics |
|
Nini, Matteo | University of Modena and Reggio Emilia (UNIMORE) |
Bertoli, Annalisa | Unimore |
Fantuzzi, Cesare | Università Di Modena E Reggio Emilia |
Keywords: Factory Automation, Human-Centered Automation, Collaborative Robots in Manufacturing
Abstract: This paper investigates the automation of a traditionally manual process in the industrial sector, incorporating a human-centered approach to minimize hazardous tasks and improve operator well-being. A motion capture system and digital human simulation software were employed to create a Digital Twin of a real-world industrial scenario. This methodology allowed for virtual evaluation of different automation solutions to determine the optimal configuration that met specific performance criteria. The study underscores the significance of integrating ergonomic factors into automation strategies. The main emphasis of this work is on the simulation and implementation of the collaborative robot to support operator well-being.
|
|
11:21-11:39, Paper TuBT6.3 | |
Adaptive Explanations for Resolving Error in a Human Robot Collaborative Task - an Exploratory Study |
|
Kumar, Shikhar | Ben-Gurion University of the Negev |
Edan, Yael | Ben-Gurion University of the Negev |
Parmet, Yisrael | Ben-Gurion University of the Negev |
Keywords: Human-Centered Automation, Human Factors and Human-in-the-Loop
Abstract: This study investigates the enhancement of robot understandability through the implementation of adaptive levels of explanation. An adaptive dialogue system was developed to generate explanations aimed at resolving error in a human-robot collaborative task. A user study, involving eighty participants, was conducted to compare the effectiveness of adaptive versus non-adaptive explanations. The results demonstrate that participants exhibited a significantly more positive perception of the adaptive dialogue system. Furthermore, a greater proportion of participants successfully resolved the task error when interacting with the adaptive system compared to the non-adaptive system.
|
|
11:39-11:57, Paper TuBT6.4 | |
Adaptive Architecture for Retrofitting Manual Systems |
|
Drechsel, Domenic | University of Applied Sciences Hamm Lippstadt |
Henkler, Stefan | University of Applied Sciences Hamm Lippstadt |
Brand, Julian | BeerBooster GmbH |
Keywords: Optimization and Optimal Control, Human Factors and Human-in-the-Loop, Cognitive Automation
Abstract: This paper explores the transformation of static, manually operated systems into self-adaptive, context-aware systems through retrofitting. By integrating sensors, actuators, local control loops, and asynchronous digital twin connections, the proposed architecture enables legacy equipment to become intelligent, resilient, and scalable. The concept is grounded in real-world constraints, emphasising offline operability and operator interaction through an Operator Controller Module (OCM). The retrofit system is evaluated using a beer dispensing use case, chosen for its industry relevance, human involvement, and sensitivity to environmental parameters. The prototype demonstrates key architectural features including local autonomy, reflective control, and gradual adaptation, offering early but promising insights into the practical realization of cyber-physical intelligence. Rather than offering conclusive results, this work proposes a structured approach to embedding adaptive behaviour in existing systems and identifies relevant dimensions for future research.
|
|
TuBT7 |
Room T7 |
Modeling and Control for Automation in Mfg 1 |
Special Session |
Chair: Barton, Kira | University of Michigan at Ann Arbor |
Organizer: Barton, Kira | University of Michigan at Ann Arbor |
Organizer: Balta, Efe | Inspire AG |
Organizer: Bristow, Douglas | Missouri University of Science and Technology |
Organizer: Kovalenko, Ilya | Pennsylvania State University |
Organizer: Wang, Zi | University of Nottingham |
|
10:45-11:03, Paper TuBT7.1 | |
Automated PLC Code Generation in the Connected Morphing Factory (I) |
|
Hellewell, Joseph Samuel | University of Nottingham |
Sanderson, David | The University of Nottingham |
Wang, Zi | University of Nottingham |
Ratchev, Svetan | The University of Nottingham |
Keywords: Intelligent and Flexible Manufacturing, Cyber-physical Production Systems and Industry 4.0
Abstract: Current manufacturing automation paradigms rely on economies of scale to justify the cost, time, and manual effort involved in developing, programming, testing, and validating automation control systems. Advanced manufacturing systems in the vein of “Industry 4.0” or similar are intended to offer much shorter production lifecycles, and/or much smaller batch sizes, even down to batch-size-of-one. This is a challenge for existing automation control solutions, which rely on production engineers, automation engineers, or programmers to customise and configure them for each production process. The cost, time, and effort required for this currently manual process can be significantly reduced through the use of automated code generation techniques. By automatically generating control code for PLCs, production systems can be made more flexible, reconfigurable, and adaptive and thereby more able to accommodate changes to the production process in shorter time. This paper presents a method for automated PLC code generation designed to enable the orchestration of production operations in an automated production system. This method is validated against a robotic inspection process running on real industrial hardware in the Omnifactory demonstrator at the University of Nottingham. It is shown to successfully generate correct PLC code in significantly less time than a manual programmer, using a variety of inputs corresponding to different process variables.
|
|
11:03-11:21, Paper TuBT7.2 | |
Anomaly Detection in Robotic Aerospace Drilling Using Data Driven Methods (I) |
|
Pamplin, Benjamin | University of Nottingham |
Edwards, Philip | Airbus |
Martínez-Arellano, Giovanna | University of Nottingham |
Piano, Samanta | University of Nottingham |
Ratchev, Svetan | The University of Nottingham |
Keywords: Failure Detection and Recovery, Assembly, Factory Automation
Abstract: Demand for automation of the repetitive drilling operations in aerospace assembly is growing. With millions of holes per airframe, critical skills shortages, and complicated products, the case for automation is strong. However, industrial robots are not equipped to detect the wide range of possible anomalies such as tool damage or poor process conditions during drilling. The need for trustworthy monitoring is a serious barrier to adoption – a key enabler is that anomalies are detected early, preventing damage to components, machinery, or operators. Existing research shows that machine learning can be highly capable in anomaly detection in industrial processes, and this work focuses on best applying it to the specific anomaly conditions of a robotic drilling cell. This paper investigates a method using clustering and proposes a second classification stage, on motor current data from a novel robotic drilling cell. Time series data is collected on 109 example cycles and 18 artificial anomalies including tool breakage, incorrect parameters, and workpiece defects. A comparison of methods is performed to select the best foundation for a solution. All implementations show parity or improvement over the current standard in drilling, with accuracies over 90% attainable with all methods given sufficient data. Comparison shows that the clustering methods performed the best, with an achievable accuracy of 100% on the tested anomalies and no false alarms. The limitations of the system are identified and possible methods to improve accuracy are discussed. The impact of training data size on performance indicators is investigated and compared.
|
|
11:21-11:39, Paper TuBT7.3 | |
Hierarchical Sensor-Robot Control for On-Demand Sensing in a Partially Known Environment (I) |
|
Toner, Tyler | University of Michigan |
Tilbury, Dawn | University of Michigan |
Barton, Kira | University of Michigan at Ann Arbor |
Keywords: Reactive and Sensor-Based Planning, Optimization and Optimal Control, Collaborative Robots in Manufacturing
Abstract: To enable industrial robot autonomy without traditional manual programming, current approaches involve a carefully modeled environment or dedicated sensor feedback. This paper explores a novel alternative regime: on-demand sensing, in which a fleet of sensorless robots operating in unmodeled environments adapt to frequently changing repetitive tasks by requesting temporary access to a shared mobile sensor. A Hierarchical Sensor-Robot Control scheme is developed to enable an ad hoc team to cooperatively solve a task, at which point the sensor is dismissed while the robot repeats the task safely in open loop. An outer loop simultaneously optimizes the sensor pose and the parameters of an inner loop robot controller, which is encoded with potential fields. Simulation results demonstrate the algorithm converging for a realistic problem after just three outer-loop iterations.
|
|
11:39-11:57, Paper TuBT7.4 | |
Combining Layer-To-Layer and In-Layer Feedback Control for Wire Arc Additive Manufacturing (I) |
|
Marcotte, John | Rensselaer Polytechnic Institute |
Mishra, Sandipan | RPI |
Wen, John | Rensselaer Polytechnic Institute |
Keywords: Additive Manufacturing, Process Control
Abstract: Wire Arc Additive Manufacturing is a metal additive manufacturing (AM) process that deposits material layer-by-layer by melting wire feedstock with an electrical arc. The ability to deposit layers of non-uniform (but controlled) height can be leveraged for building geometries with overhanging features without needing support material. However, this requires precise process planning of the torch speed and feedrate to generate the appropriate varying-height profile within a single layer. In this work, we propose augmenting layer-to-layer planning of the torch speed with in-layer feedback control to reduce the error in the height of deposited material. Using an infrared camera, the error in the deposition height is measured throughout the welding process. The torch speed for each layer is planned using a model-based approach to correct for errors in the previous layer by solving a constrained optimization problem. This plan is then used as a feedforward along with in-layer proportional control to further correct for errors in-process. This approach demonstrated an improvement of 15% in the average layer RMS error over pure layer-to-layer control.
|
|
TuBT8 |
Room T8 |
Intelligent Transportation Systems |
Regular Session |
Chair: Liu, Bochi | Rutgers University |
Co-Chair: Guo, Weihong | Rutgers University |
|
10:45-11:03, Paper TuBT8.1 | |
Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety |
|
Shriram, Shashank | University of California, Merced |
Perisetla, Srinivasa | University of California, Merced |
Keskar, Aryan | University of California, Merced |
Krishnaswamy, Harsha | University of California, Merced |
Emil Westerhof Bossen, Tonko | Aalborg Universitet |
Møgelmose, Andreas | University of Aalborg |
Greer, Ross | University of California, Merced |
Keywords: Intelligent Transportation Systems, Collision Avoidance, Autonomous Vehicle Navigation
Abstract: Detecting anomalous hazards in visual data, particularly in video streams, is a critical challenge in autonomous driving. Existing models often struggle with unpredictable, out-of-label hazards due to their reliance on predefined object categories. In this paper, we propose a multimodal approach that integrates vision-language reasoning with zero-shot object detection to improve hazard identification within traffic scene data. Our approach consists of a pipeline that integrates both Vision-Language Model (VLM) and Large-Language Model (LLM) capabilities to solve the problem of detecting out-of-label hazards within traffic scene data. We refine object detection by incorporating OpenAI's CLIP model to match predicted hazards with bounding box annotations, improving localization accuracy. To assess model performance, we extended the foundational COOOL (Challenge-of-Out-of-Label) anomaly detection benchmark dataset by adding full natural language hazard descriptions as ground truth annotations. Additionally, we evaluated our pipeline’s hazard labels through cosine similarity in order to consider the semantic similarity between the predicted hazard description and the annotated ground truth for each video. Our findings highlight the strengths and limitations of current vision-language-based approaches, offering insights into future improvements in autonomous hazard detection systems. Our models, scripts, and data can be found at https://github.com/mi3labucm/COOOLER.git
|
|
11:03-11:21, Paper TuBT8.2 | |
EvoDrive: An Evolutionary Testing Framework for the Carla Simulator |
|
Kumar, Adarash | Loughborough University |
Meng, Qinggang | Loughborough University |
Li, Baihua | Loughborough University |
Keywords: Autonomous Vehicle Navigation, Computer Vision for Transportation, Intelligent Transportation Systems
Abstract: The rapid development of Autonomous Driving Systems (ADSs) calls for advanced testing frameworks to ensure their safety, reliability, and robustness across diverse traffic scenarios. Traditional software testing techniques are insufficient for ADS validation due to the complexity and unpredictability of real-world environments. Although numerous approaches have been proposed in both academia and industry, a comprehensive and flexible benchmark for ADS testing is still lacking. To address this gap, we present EvoDrive, an automated scenario-based testing framework that introduces a novel search space representation, enabling search algorithms to more effectively explore and exploit the scenario space to generate failure-inducing test cases. EvoDrive is built on top of the open-source CARLA simulator, enabling high-fidelity, scalable testing in virtual environments. Through extensive experiments, we demonstrate the effectiveness of EvoDrive, using a Genetic Algorithm (GA) to identify diverse and challenging scenarios that expose ADS failures. Our framework finds more meaningful infractions than general testing with the CARLA Leaderboard Challenge framework and other state-of-the-art frameworks. Through testing, we also highlight EvoDrive’s capability to uncover critical infractions when comparing three different controllers.
|
|
11:21-11:39, Paper TuBT8.3 | |
A Variable Prediction Horizon MPC Approach for Leader-Follower Transportation in Presence of Obstacles |
|
Bertoni, Massimiliano | University of Padova |
Piccina, Alberto | University of Padova |
Michieletto, Giulia | University of Padova |
Keywords: Optimization and Optimal Control, Collision Avoidance, Autonomous Agents
Abstract: Cooperative transportation with multi-robot systems can be effectively managed using a leader-follower Model Predictive Control (MPC) approach, especially in cluttered environments. In particular, accurate modeling enhances robustness in obstacle avoidance but increases computational demand, limiting real-world applicability. Thus, this work proposes a nonlinear MPC-based leader-follower strategy for cooperative transportation in static and dynamic obstacle environments, leveraging a fully actuated robot model. To manage computational cost, a state machine dynamically switches between two (short and long) prediction horizons based on obstacle proximity. The effectiveness of the outlined approach is assessed through Monte Carlo analysis in a numerical simulation, comparing it to a state-of-the-art method across various scenarios.
|
|
11:39-11:57, Paper TuBT8.4 | |
A Data-Driven Framework for the Identification and Vulnerability Assessment of Critical Transfer Hubs in Multimodal Public Transportation Networks: A Case Study of New York City |
|
Liu, Bochi | Rutgers University |
Guo, Weihong | Rutgers University |
Keywords: Data fusion, Planning, Scheduling and Coordination, Intelligent Transportation Systems
Abstract: Multimodal public transportation networks (MPTNs) play an important role in urban mobility by enabling seamless connections across different transportation modes. Within these networks, transfer hubs are especially crucial, as their disruptions can significantly reduce network efficiency and passenger convenience. This paper presents a data-driven framework to identify critical transfer hubs and assess their vulnerability in an MPTN, illustrated by a case study of New York City. Separate network models for bus, subway, rail, and ferry systems were constructed based on Space P and Space L representations, then combined into a unified MPTN by identifying transfer points through an automated data-processing method. These transfer points were clustered into hubs using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. Simulation results from targeted and random failure scenarios indicate that targeted failures at critical transfer hubs substantially reduce network connectivity and efficiency compared to random failures. The analysis identifies hubs such as Pelham Bay Park, Jamaica–179 Street, and 59 Street–Lexington Avenue as particularly vulnerable, highlighting their crucial role in maintaining overall network performance. The findings provide practical insights to improve the resilience and efficiency of MPTNs.
|
|
TuBT9 |
Room T9 |
Neuro-Symbolic AI |
Special Session |
Chair: Kim, Sungil | Ulsan National Institute of Science and Technology |
Organizer: Chen, Ming | Shanghai Jiao Tong University |
Organizer: Wang, Zhigang | Intel Labs China |
Organizer: Liu, Chao | Aston University |
|
10:45-11:03, Paper TuBT9.1 | |
A Comparative Study between a Traditional Coarse Sense-Plan-Act System and a Two-Stage RoI Recommender and Eyes-In-Hand Stereo System for Robotized Unscrewing of Hard Disk Drives (HDDs) (I) |
|
Coopman, Anthonie | KU Leuven |
Piessens, Mathijs | KU Leuven |
Van den bosch, Joren | KU Leuven |
Vandersteegen, Maarten | KU Leuven, EAVISE |
Peeters, Jef | KU Leuven |
Keywords: Remanufacturing, Computer Vision for Manufacturing, Deep Learning in Robotics and Automation
Abstract: Robotic unscrewing plays a pivotal role in the automation of remanufacturing processes. Despite advancements in deep learning-based screw detection, which now achieves detection accuracies exceeding 98%, consistent unscrewing success remains challenging due to physical localization errors. In prior research, various vision-based methods have been developed to correct positional deviations that arise from neural network predictions, while force-based methods have been developed to compensate for all accumulated errors during physical contact. The vision-based approaches often lack reliability, and force-based methods, although highly effective, have to date only been validated on larger, self-centering fasteners and cause additional wear to the unscrewing end-effector. Therefore, this study presents a comparative analysis of two robotic unscrewing pipelines designed to enhance detection accuracy, localization precision, and unscrewing time efficiency. The first, baseline approach employs a single-stage, coarse-only sense–plan–act pipeline. The second, advanced two-stage approach integrates coarse localization with a refinement step using stereo detection, matching and triangulation. Experimental results highlight the advantages of the advanced two-stage approach, which incorporates a region-of-interest recommendation step and an eyes-in-hand stereo module for position refinement. With the pilot system developed in the presented research, vision-based precision and recall rates of 99.2% and 99.4% are obtained, with an unscrewing success rate of 99%. At the system level, these results correspond to unscrewing precision and recall of 98.2% and 98.4% with only 0.75 seconds needed to reclassify and compute the updated screw location.
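The triangulation step mentioned in this abstract can be illustrated with the textbook relation for a rectified stereo pair, where depth follows from disparity as Z = f·B/d. The sketch below is an assumption-laden illustration, not the system described in the paper: it assumes ideal rectified pinhole cameras with a shared focal length, and the function name and all parameter values are hypothetical.

```python
def triangulate_rectified(u_left, u_right, v, focal_px, baseline_m, cx, cy):
    """Recover a 3D point (metres, left-camera frame) from a matched pixel pair.

    Assumes a rectified stereo pair, so the match lies on the same image row v.
    u_left, u_right : horizontal pixel coordinates of the match in each image
    focal_px        : focal length in pixels (assumed equal for both cameras)
    baseline_m      : distance between the two camera centres in metres
    cx, cy          : principal point in pixels
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth from disparity
    x = (u_left - cx) * z / focal_px       # back-project the pixel to metric X
    y = (v - cy) * z / focal_px            # back-project the pixel to metric Y
    return (x, y, z)
```

Because depth is inversely proportional to disparity, a matching error of a single pixel matters more for distant points than for close ones, which is one reason a close-range eyes-in-hand view is attractive for refinement.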
|
|
11:03-11:21, Paper TuBT9.2 | |
Decoding RobKiNet: Insights into Efficient Training of Robotic Kinematics Informed Neural Network (I) |
|
Peng, Yanlong | Shanghai Jiao Tong University |
Wang, Zhigang | Intel Labs China |
He, Ziwen | Shanghai Jiao Tong University |
Chang, Pengxu | Shanghai Jiao Tong University (SJTU) |
Zhou, Chuangchuang | Henan Academy of Sciences |
Yu, Yan | Intel |
Chen, Ming | Shanghai Jiao Tong University |
Keywords: AI-Based Methods, Deep Learning in Robotics and Automation, Motion and Path Planning
Abstract: In robot task and motion planning (TAMP), it is crucial to sample within the robot's configuration space to meet task-level global constraints and enhance the efficiency of subsequent motion planning. Due to the complexity of joint configuration sampling under multi-level constraints, traditional methods often lack efficiency. This paper introduces the principle of RobKiNet, a kinematics-informed neural network, for end-to-end sampling within the Continuous Feasible Set (CFS) under multiple constraints in configuration space, establishing its Optimization Expectation Model. Comparisons with traditional sampling and learning-based approaches reveal that RobKiNet’s kinematic knowledge infusion enhances training efficiency by ensuring stable and accurate gradient optimization. Visualizations and quantitative analyses in a 2-DOF space validate its theoretical efficiency, while its application on a 9-DOF autonomous mobile manipulator robot (AMMR) demonstrates superior whole-body and decoupled control, excelling in battery disassembly tasks. RobKiNet outperforms deep reinforcement learning with a training speed 74.29 times faster and a sampling accuracy of up to 99.25%, achieving a 97.33% task completion rate in real-world scenarios.
|
|
11:21-11:39, Paper TuBT9.3 | |
MCCE-Net: Multi-Modal Fusion Component Condition Evaluation Network in Neurosymbolic-Guided Electronics Remanufacturing (I) |
|
Wu, Yifan | KU Leuven |
Zhou, Chuangchuang | Henan Academy of Sciences |
Peeters, Jef | KU Leuven |
Keywords: Remanufacturing, Computer Vision for Manufacturing, Industrial and Service Robotics
Abstract: In electronics remanufacturing, product component condition detection is an essential step, the results of which can be further transformed into symbolic information to guide the robot in executing, amongst others, disassembly tasks. However, the accuracy of vision-based condition evaluation today is limited by the presence of complex backgrounds with textures and colors similar to the component. To address this challenge, a Multi-modal fusion Component Condition Evaluation Network (MCCE-Net) based on YOLOv11 is proposed to fuse the RGB and depth images to provide more robust feature information for object detection. To improve the cross-modality interaction, Feature Enhanced Attention (FEA) modules are integrated into the network. The network's performance is validated on a collected multi-view dataset of laptops in different conditions with annotations for screw, screw_hole, missing_battery, and missing_cover. The presented results achieve an 85.0% mAP and outperform the vanilla YOLOv11 by 14.8%. Experimental results also demonstrate that the MCCE-Net can achieve state-of-the-art performance on the collected dataset, greatly boosting neurosymbolic-based electronics remanufacturing.
|
|
11:39-11:57, Paper TuBT9.4 | |
Embodied Intelligence in Disassembly: Multimodal Perception Cross-Validation and Continual Learning in Neuro-Symbolic TAMP (I) |
|
He, Ziwen | Shanghai Jiao Tong University |
Wang, Zhigang | Intel Labs China |
Peng, Yanlong | Shanghai Jiao Tong University |
Chang, Pengxu | Shanghai Jiao Tong University (SJTU) |
Yang, Harold | Intel Asia Pacific R&D Co., Ltd |
Chen, Ming | Shanghai Jiao Tong University |
Keywords: Hybrid Logical/Dynamical Planning and Verification, Learning and Adaptive Systems, Autonomous Agents
Abstract: With the rapid development of the new energy vehicle industry, the efficient disassembly and recycling of power batteries have become a critical challenge for the circular economy. In current unstructured disassembly scenarios, the dynamic nature of the environment severely limits the robustness of robotic perception, posing a significant barrier to autonomous disassembly in industrial applications. This paper proposes a continual learning framework based on Neuro-Symbolic task and motion planning (TAMP) to enhance the adaptability of embodied intelligence systems in dynamic environments. Our approach integrates a multimodal perception cross-validation mechanism into a bidirectional reasoning flow: the forward working flow dynamically refines and optimizes action strategies, while the backward learning flow autonomously collects effective data from historical task executions to facilitate continual system learning, enabling self-optimization. Experimental results show that the proposed framework improves the task success rate in dynamic disassembly scenarios from 81.9% to 100%, while reducing the average number of perception misjudgments from 3.389 to 1.128. This research provides a new paradigm for enhancing the robustness and adaptability of embodied intelligence in complex industrial environments.
|
|
TuBT10 |
Room T10 |
Large Language/Foundation Models 1 |
Regular Session |
Chair: Burns, Owen | University of Central Florida |
|
10:45-11:03, Paper TuBT10.1 | |
Learning Machine Tending from Demonstration with Multimodal LLMs |
|
Odabasi, Cagatay | Fraunhofer IPA |
Lindermayr, Jochen | Fraunhofer IPA |
Krieglstein, Jan | University of Stuttgart |
Kisa, Predrag | Intel Corporation |
Keywords: AI-Based Methods, Cognitive Automation, Deep Learning in Robotics and Automation
Abstract: Driven by the need for increased efficiency and flexibility in manufacturing, particularly in demanding sectors like semiconductor production, this paper presents a framework for intuitive robot programming using multimodal Large Language Models (mLLMs). Our methodology enables robot programming through human demonstration, using AR glasses to capture video, audio narration, and hand poses of an operator performing a machine tending task. This multimodal data is processed by an mLLM, which segments the demonstration temporally, transcribes narration, assigns low-level robot skills from a predefined library, provides reasoning for these assignments, and identifies interacted objects. Crucially, hand poses and the pose of the robot relative to the machine (via QR codes) are used to parameterize the selected skills, ensuring accurate translation to the robot's workspace. These parameterized skills are then automatically compiled into executable programs for a UR5e robot on a mobile base. Experimental evaluation in a machine tending scenario demonstrated high accuracy, with a median positional error of less than 4 cm between the robot's executed actions (e.g., Press Button) and the corresponding physical locations of those interaction points in the environment. Although the framework allows a high level of automation, its transparent, multi-stage design also allows for operator corrections, further enhancing precision.
|
|
11:03-11:21, Paper TuBT10.2 | |
Large Language Model Enabled Industrial Document Generation Method Based on Retrieval Enhanced Prompt Learning |
|
Shi, JiaYu | Shanghai Jiao Tong University |
Shi, Fangcheng | Shanghai Jiao Tong University |
Zhao, Yue | Shanghai Jiao Tong University |
Chen, Qunlong | Shanghai Jiao Tong University |
Jiang, Hongwei | Shanghai Jiao Tong University |
Chen, Liang | Shanghai Jiao Tong University |
Zheng, Yu | Shanghai Jiao Tong University |
Keywords: Big-Data and Data Mining, Intelligent and Flexible Manufacturing, Manufacturing, Maintenance and Supply Chains
Abstract: Industrial documents are a critical component in manufacturing applications, such as maintenance and quality control. They require extensive manual compilation, which is labor-intensive and fails to systematically encode complex relationships among large-scale documents. To address these challenges, this paper introduces a retrieval-enhanced prompt learning algorithm with a large language model (LLM), including soft prompt, hard prompt, and retrieval-enhanced few-shot prompt. The proposed method first employs a Large Vision-Language Model (LVLM) to convert images into structured textual descriptions. Subsequently, a Large Embedding Model (LEM) converts the documents into a vector-based database, enabling efficient retrieval of task-relevant few-shot prompts. These retrieved prompts, combined with a LoRA (Low-Rank Adaptation)-driven soft prompt and a manually designed hard prompt, are fed into the LLM to generate the final industrial document. Experiments are carried out on the dataset for the continuous casting equipment from the Bao Steel Company. The results demonstrate the framework’s superiority in automating document generation in terms of BLEU, ROUGE and Qwen-Score.
|
|
11:21-11:39, Paper TuBT10.3 | |
Aligning LLM+PDDL Symbolic Plans with Human Objective Specifications through Evolutionary Algorithm Guidance |
|
Burns, Owen | University of Central Florida |
Hughes, Dana | Carnegie Mellon University |
Sycara, Katia | Carnegie Mellon University |
Keywords: Human Performance Augmentation, Task Planning, Machine learning
Abstract: Automated planning using a symbolic planning language, such as PDDL, is a general approach to producing optimal plans to achieve a stated goal. However, creating suitable machine-understandable descriptions of the planning domain, problem, and goal requires expertise in the planning language, limiting the utility of these tools for non-expert humans. Recent efforts have explored utilizing a symbolic planner in conjunction with a large language model to generate plans from natural language descriptions given by a non-expert human (LLM+PDDL). Our approach performs initial translation of goal specifications to a set of PDDL goal constraints using an LLM; such translations often result in imprecise symbolic specifications, which are difficult to validate directly. We account for this using an evolutionary approach to generate a population of symbolic goal specifications with slight differences from the initial translation, and utilize a trained LSTM-based validation model to assess whether each induced plan in the population adheres to the natural language specifications. We evaluate our approach on a collection of prototypical specifications in a notional naval disaster recovery task, and demonstrate that our evolutionary approach improves adherence of generated plans to natural language specifications when compared to plans generated using only LLM translations. The code for our method can be found at https://github.com/owenonline/PlanCritic.
|
|
11:39-11:57, Paper TuBT10.4 | |
Augmenting Open Vocabulary Object Detection with Large Language Models for Home Service Robots |
|
Martinson, Eric | Lawrence Technological University |
Indurthi, Hemanth | Lawrence Technological University |
Keywords: Industrial and Service Robotics, Deep Learning in Robotics and Automation, Computer Vision in Automation
Abstract: Practical service robots struggle with traditional object detection methods. Extensive hand labeling requirements mean a limited set of objects that robots can detect, which in turn significantly reduces utility. New open-vocabulary based detection models like CLIP and YOLO-World can find objects associated with arbitrary search queries without requiring additional hand-labeled data. Unfortunately, these methods are significantly less accurate. Multimodal large language models have been suggested as viable alternatives, but their computational load generally requires uploading significant quantities of images to the cloud for processing – which is both a privacy risk for home deployments and expensive in general. This work proposes an alternative for integrating with LLMs. Without uploading any images to the cloud, we demonstrate how a text-only LLM can support open vocabulary object detection, integrating knowledge of object size, room type and alternative text queries to significantly improve F-score and specificity. The approach is verified against the ScanNet dataset and further demonstrated to work in real time on the Stretch 2 mobile robot.
|
|
TuBT11 |
Room T11 |
Best Student Papers Competition |
Special Session |
Chair: Lennartson, Bengt | Chalmers University of Technology |
|
10:45-11:10, Paper TuBT11.1 | |
MCCoder: Streamlining Motion Control with LLM-Assisted Code Generation and Rigorous Verification (I) |
|
Li, Yin | The Hong Kong University of Science and Technology (Guangzhou) |
Wang, Liangwei | The Hong Kong University of Science and Technology (Guangzhou) |
Piao, Shiyuan | The Hong Kong University of Science and Technology (Guangzhou) |
Yang, Boo-Ho | MOVENSYS Inc |
Li, Ziyue | University of Cologne |
Zeng, Wei | The Hong Kong University of Science and Technology (Guangzhou) |
Tsung, Fugee | HKUST |
Keywords: Motion Control, Domain-specific Software and Software Engineering, Factory Automation
Abstract: Large Language Models (LLMs) have demonstrated significant potential in code generation. However, in the factory automation sector (particularly motion control), manual programming, alongside inefficient and unsafe debugging practices, remains prevalent. This stems from the complex interplay of mechanical and electrical systems and stringent safety requirements. Moreover, most current AI-assisted motion control programming efforts focus on PLCs, with little attention given to high-level languages and function libraries. To address these challenges, we introduce MCCoder, an LLM-powered system tailored for generating motion control code, integrated with a soft-motion controller. MCCoder improves code generation through a structured workflow that combines multitask decomposition, hybrid retrieval-augmented generation (RAG), and iterative self-correction, utilizing a well-established motion library. Additionally, it integrates a 3D simulator for intuitive motion validation and logs of full motion trajectories for data verification, significantly enhancing accuracy and safety. In the absence of benchmark datasets and metrics tailored for evaluating motion control code generation, we propose MCEVAL, a dataset spanning motion tasks of varying complexity. Experiments show that MCCoder outperforms baseline models using Advanced RAG, achieving an overall performance gain of 33.09% and a 131.77% improvement on complex tasks in the MCEVAL dataset. MCCoder is publicly available at https://github.com/MCCodeAI/MCCoder.
|
|
11:10-11:35, Paper TuBT11.2 | |
Learning Optimal Baggage Routing from Expert Demonstrations |
|
Tian, Keyu | City University of Hong Kong |
Zeng, Li | City University of Hong Kong |
Keywords: Intelligent Transportation Systems, Collision Avoidance, Reinforcement
Abstract: Efficient baggage transportation within baggage handling systems (BHS) is crucial to modern airport logistics, impacting operational efficiency and passenger satisfaction. Optimizing baggage routing is vital to ensure timely delivery. Although reinforcement learning (RL) has shown promise in enhancing baggage routing, it faces significant limitations, including the difficulty of designing suitable reward functions and the risk of suboptimal exploration during training. To overcome these challenges, this study proposes an approach that employs variational Bayesian inverse reinforcement learning (VBIRL) to derive optimal baggage routing policies from expert demonstrations, thereby eliminating the need for manual reward function specification and inefficient exploration in training. A key innovation of our method is the generation of demonstrations using route planning techniques, effectively addressing the lack of expert demonstration data in BHS. Experimental results indicate that our proposed method outperforms conventional RL methods, leading to faster baggage delivery and a reduced collision rate, especially in complex BHS.
|
|
11:35-12:00, Paper TuBT11.3 | |
K-Wafer Cyclic Sequence for Scheduling of Dual-Armed Cluster Tool with Purge Operation (I) |
|
Joo, Sanghyun | Korea Advanced Institute of Science and Technology(KAIST) |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Semiconductor Manufacturing, Petri Nets for Automation Control
Abstract: Efficient scheduling of semiconductor manufacturing equipment, known as cluster tools, is essential for achieving high productivity. Cluster tools process semiconductor wafers using multiple chambers arranged around a central robot. To ensure wafer quality, chambers must be periodically cleaned, interrupting production and complicating the scheduling process. Previous scheduling methods typically process one wafer per robot sequence cycle (a 1-wafer cyclic schedule), leading to inefficiencies when multiple chambers operate in parallel. To overcome this limitation, we propose swap(A, z), an advanced scheduling approach extending swap(a, z), which flexibly combines swap and push-and-wait operations within a K-wafer cyclic framework, allowing varied robot task sequences for each wafer cycle over a total of K wafer cycles. We formally prove that swap(A, z) achieves optimality in regions where traditional 1-wafer cyclic methods exhibit inefficiencies in two-parallel-chamber configurations. Experiments based on realistic industry scenarios demonstrate that our method substantially reduces these inefficiencies and consistently delivers near-optimal performance.
|
|
TuCT1 |
Room T1 |
Planning, Scheduling and Control 4 |
Regular Session |
Chair: Ghosh, Mukulika | Missouri State University |
|
14:45-15:03, Paper TuCT1.1 | |
Energy-Aware Planning for Delivery Tasks Executed by Legged Robots |
|
Chen, Shengqiang | University of Southern California |
Chen, Yiyu | University of Southern California |
Wei, Zishen | University of Southern California |
Nguyen, Quan | University of Southern California |
Gupta, Satyandra K. | University of Southern California |
Keywords: Planning, Scheduling and Coordination, Energy and Environment-aware Automation, Motion and Path Planning
Abstract: Legged robots can significantly increase human productivity by performing delivery tasks. Legged robots cannot be tethered to power sources when operating in large outdoor environments. If a robot runs out of energy while executing a task, it will require human intervention, resulting in delays. On the other hand, frequent battery recharging or replacement could also lead to significant delays in task completion. This paper presents an energy-aware hierarchical planning approach that accounts for energy consumption and integrates appropriate battery replacement strategies to ensure that tasks are completed efficiently. Our algorithm generates graph search instances for varying battery replacement actions, the option to split the payload into smaller portions, and reducing speed in the first level, while running graph search to determine the optimal plan that minimizes the time to complete the delivery task. We illustrate the effectiveness of our planning approach on a terrain with varying slopes and delivery tasks with different requirements.
|
|
15:03-15:21, Paper TuCT1.2 | |
Efficient Social Navigation: Leveraging Discrete Morse Theory for Dynamic Agent Interaction |
|
Mursalin, S.M. Faiaz | Missouri State University |
Ekenna, Chinwe | University at Albany |
Ghosh, Mukulika | Missouri State University |
Keywords: Planning, Scheduling and Coordination, Collision Avoidance, Motion and Path Planning
Abstract: Robotic navigation in dynamic environments presents significant challenges, particularly in managing interactions with moving agents while ensuring efficient path planning. We introduce a novel integration of social navigation strategies with topological path planning, leveraging Discrete Morse Theory, the Vietoris-Rips complex, and a homotopical framework to enhance adaptability. Our method dynamically assesses path feasibility and optimizes trajectory selection through three key strategies: waiting, deflection, and diverse path selection. By incorporating Morse values into a sampling-based roadmap, our approach prioritizes critical configurations for efficient motion planning. Unlike existing methods that rely on static heuristics or extensive learning-based predictions, our framework offers a real-time, adaptive mechanism for congestion-aware navigation. Experimental evaluations demonstrate an efficient solution (< 100 s in computation) with improved path adaptability, resulting in a 30–60% increase in traversal time in environments containing robots with 3–9 degrees of freedom and 15–60 dynamic agents.
|
|
15:21-15:39, Paper TuCT1.3 | |
LASMP: Language Aided Subset Sampling Based Motion Planner |
|
Bhattacharjee, Saswati | University at Albany |
Sinha, Anirban | GE Aerospace Research |
Ghosh, Mukulika | Missouri State University |
Ekenna, Chinwe | University at Albany |
Keywords: Planning, Scheduling and Coordination, Autonomous Vehicle Navigation, Motion and Path Planning
Abstract: This paper presents the Language Aided Subset Sampling Based Motion Planner (LASMP), a framework that helps mobile robots plan their movements from natural language instructions. LASMP uses a modified version of the Rapidly Exploring Random Tree (RRT) method, which is guided by user-provided instructions processed through a language model. The planner improves efficiency by focusing on specific areas of the robot’s workspace based on these instructions, making it faster and less resource-intensive. Compared to traditional RRT and RRT* methods, LASMP reduces the number of nodes needed by 55% and cuts random sample queries by 80%, while still generating safe, collision-free paths. Tested in both simulated and real-world environments, LASMP has shown better performance in handling complex indoor scenarios. The results highlight the potential of combining language processing with motion planning to make robot navigation more efficient.
|
|
15:39-15:57, Paper TuCT1.4 | |
cHyRRT and cHySST: Motion Planning Tools for Hybrid Dynamical Systems in OMPL |
|
Xu, Beverly | University of California Santa Cruz |
Wang, Nan | University of California, Santa Cruz |
Sanfelice, Ricardo | University of California |
Keywords: Hybrid Logical/Dynamical Planning and Verification
Abstract: This paper presents two implementations of the recently developed motion planning algorithms HyRRT and HySST. Specifically, cHyRRT, an implementation of the HyRRT algorithm, generates solutions to motion planning problems for hybrid systems with a probabilistic completeness guarantee, while cHySST, an implementation of the asymptotically near-optimal HySST algorithm, finds near-optimal trajectories based on a user-defined cost function. The implementations align with the theoretical foundations of hybrid system theory and are designed based on OMPL, ensuring compatibility with ROS while prioritizing computational efficiency. The structure, components, and usage of both tools are detailed. A modified pinball game example is provided to illustrate the tools' key capabilities.
|
|
15:57-16:15, Paper TuCT1.5 | |
Uncertainty-Aware Planning for Heterogeneous Robot Teams Using Dynamic Topological Graphs and Mixed-Integer Programming |
|
Duggan, Cora A. | Johns Hopkins University Applied Physics Laboratory |
Wolfe, Kevin | Johns Hopkins University Applied Physics Laboratory |
Woosley, Bradley | US Army Research Laboratory |
Kobilarov, Marin | Johns Hopkins University |
Moore, Joseph | Johns Hopkins University |
Keywords: Agent-Based Systems, Planning, Scheduling and Coordination, Optimization and Optimal Control
Abstract: Multi-robot planning and coordination in uncertain environments is a fundamental computational challenge, since the belief space increases exponentially with the number of robots. In this paper, we address the problem of planning in uncertain environments with a heterogeneous robot team of fast scout vehicles for information gathering and more risk-averse carrier robots from which the scout vehicles are deployed. To overcome the computational challenges, we represent the environment and operational scenario using a topological graph, where the parameters of the edge weight distributions vary with the state of the robot team on the graph, and we formulate a computationally efficient mixed-integer program which removes the dependence on the number of robots from its decision space. Our formulation results in the capability to generate optimal multi-robot, long-horizon plans in seconds that could otherwise be computationally intractable. Ultimately, our approach enables real-time re-planning, since the computation time is significantly shorter than the time to execute one step. We evaluate our algorithm in a scenario where the robot team must traverse an environment while minimizing detection by observers in positions that are uncertain to the robot team. We demonstrate that our method is computationally tractable, can improve performance in the presence of imperfect information, and can be adjusted for different risk profiles.
|
|
TuCT2 |
Room T2 |
Autonomous and Software-Defined Factory 1 |
Special Session |
Chair: Lee, Chia-Yen | National Taiwan University |
Organizer: Lee, Chia-Yen | National Taiwan University |
Organizer: Hsu, Chia-Yu | National Tsing Hua University |
Organizer: Fan, Shu-Kai S. | National Taipei University of Technology |
Organizer: Ju, Feng | Arizona State University |
Organizer: Blue, Jakey | National Taiwan University |
Organizer: Jang, Young Jae | Korea Advanced Institute of Science and Technology |
Organizer: Skoogh, Anders | Chalmers University of Technology |
Organizer: Lugaresi, Giovanni | KU Leuven |
|
14:45-15:03, Paper TuCT2.1 | |
A Multi-Level Filtering Framework for Disassembly Sequence Planning Utilizing Various Detailed Information (I) |
|
Xiang, Zheng | Technical University of Munich |
Bernhard, Tim | Technical University of Munich |
Fottner, Johannes | Technical University of Munich |
Keywords: Remanufacturing, Sustainability and Green Automation, Intelligent and Flexible Manufacturing
Abstract: With a growing emphasis on circular economy, disassembling industrial products to retain their value after the end of their life cycle is increasingly seen as a promising approach. In modern industry, selective disassembly—focusing on specific components rather than full product disassembly—offers a more practical balance between economic and ecological considerations. In this work, we first present a general framework for addressing the Disassembly Sequence Planning (DSP) problem. Building on this framework, we propose two methods to solve a specific variant of the problem: selective DSP, which is particularly relevant to industrial applications. One of these methods demonstrates a significant ability to address this challenge, reducing the number of removal attempts by about 20 % to over 60 % in validation assemblies. This result validates the effectiveness of the proposed framework, highlighting its potential for practical implementation.
|
|
15:03-15:21, Paper TuCT2.2 | |
A New Deep Reinforcement Learning Run-To-Run Control Algorithm for Mixed-Product Production Mode in Semiconductor Manufacturing (I) |
|
Fan, Shu-Kai S. | National Taipei University of Technology |
Keywords: AI-Based Methods, Learning and Adaptive Systems, Reinforcement
Abstract: This paper proposes a new Run-to-Run (R2R) control framework based on deep deterministic policy gradient (DDPG) for the mixed-product production mode in semiconductor manufacturing. The DDPG algorithm is particularly developed to configure a deep reinforcement learning environment well suited to mixed-product production modes. To address the challenges posed in deep reinforcement learning, three enhanced mechanisms have been developed to improve the training of the proposed DDPG model for mixed-product R2R applications. These mechanisms include a piece-wise reward function, training with dynamic targets, and the new recall principle. It is demonstrated from the comprehensive simulation results that the proposed R2R control framework outperforms five noted mixed-product R2R control algorithms in the literature. The research outcome of this paper signifies a promising viability of deep reinforcement learning for highly complex and dynamic environments with continuous action spaces in the mixed-product R2R practice.
|
|
15:21-15:39, Paper TuCT2.3 | |
A Digital Twin Trinity for Adaptive Evaluation of Machining Process (I) |
|
Hsu, Chih-Hua | CJCU |
Su, Guan-Jhen | National Kaohsiung Univ. of Tech |
Li, Wan-Ling | Walsin Lihwa Company |
Yang, Haw-Ching | National Kaohsiung Univ. of Sci. and Tech |
Keywords: Cyber-physical Production Systems and Industry 4.0, Force and Tactile Sensing, Machine learning
Abstract: Digital Twins (DT) are crucial in cyber-physical manufacturing systems. However, achieving accuracy, real-time processing and precision in machining applications remains a significant challenge. This work introduces the Digital Twin Trinity (DTT) framework to address these challenges, integrating physical simulation, surrogate inference, and metrology certification for CNC machining. The physical simulation stage captures fundamental machining physics, including cutting force effects and thermal influences based on process parameters. A recurrent neural network serves as the surrogate inference model, enabling real-time predictions of cutting forces and tool wear. This enhances the accuracy of digital twin simulations. Finally, a Smart Tool Holder (STH) acts as a metrology certification tool, validating the physical simulation and ensuring precision in machining operations.
|
|
15:39-15:57, Paper TuCT2.4 | |
Advancing Multi-Label Melt Pool Defect Detection in Laser Powder Bed Fusion with Self-Supervised Learning (I) |
|
Ziad, Erfan | Arizona State University |
Yang, Zhuo | Georgetown University |
Lu, Yan | National Institute of Standards and Technology |
Ju, Feng | Arizona State University |
Keywords: Machine learning, Additive Manufacturing, Data fusion
Abstract: This paper presents an image-based deep learning model for analyzing melt pool images in laser powder bed fusion (LPBF) additive manufacturing. Melt pool imaging is essential for process monitoring and defect detection; however, challenges such as labeling accuracy, model robustness, and the scarcity of labeled high-frequency melt pool data remain. To address these issues, we implement a self-supervised learning approach using Bootstrap Your Own Latent (BYOL) to enhance label accuracy and improve feature extraction. Additionally, we propose an optimized pseudo-labeling strategy to further refine defect classification. Trained on a large dataset of melt pool images, our model significantly improves label reliability and predicts multiple defect types with high precision. Experimental results demonstrate robust predictive capabilities, enabling highly accurate and reliable defect detection. This advancement enhances image-based defect detection methods and contributes to the development of automated monitoring systems for LPBF, improving process stability and quality assurance.
|
|
15:57-16:15, Paper TuCT2.5 | |
Intelligent Retrieval and Knowledge Question-Answering System for Semi-Structured Operational Manuals (I) |
|
Cheng, You-Wei | National Taipei University of Technology |
Hsu, Chia-Yu | National Tsing Hua University |
Lan, Yu-Ying | Industrial Technology Research Institute |
Keywords: AI-Based Methods, Machine learning, Semiconductor Manufacturing
Abstract: In the field of industrial production and manufacturing, various management systems such as Manufacturing Execution Systems (MES), Enterprise Resource Planning (ERP), and Advanced Planning Systems (APS) play a crucial role in production monitoring, performance evaluation, and decision support. These systems enable managers to obtain real-time production data, analyze process anomalies, and formulate appropriate adjustment strategies to ensure production efficiency and quality consistency. However, when users are unfamiliar with system operations, they often need to search through extensive Standard Operating Procedure (SOP) manuals to find relevant operational guidance, which is both time-consuming and detrimental to operational efficiency. In recent years, the integration of Generative Artificial Intelligence (Generative AI) with Retrieval-Augmented Generation (RAG) technology has enabled systems to automatically retrieve and synthesize relevant knowledge content in response to user queries, providing real-time assistance. However, when handling unstructured or semi-structured data, traditional text segmentation methods may disrupt the original hierarchical structure and contextual relationships of documents, leading to retrieval results that fail to accurately match user queries. Moreover, if text segments lack complete source attribution, RAG models may struggle to clearly indicate the origin of the information in their responses, thereby affecting the credibility of the generated content. To address these challenges, this study proposes a method for extracting semi-structured data to enhance the quality of document parsing. By implementing this approach, the final output of the RAG system can generate high-quality responses, precisely annotate interface locations an
|
|
TuCT3 |
Room T3 |
Large Language/Foundation Models 2 |
Regular Session |
Chair: Muthusamy, Rajkumar | Dubai Future Foundation |
|
14:45-15:03, Paper TuCT3.1 | |
INPROVF: Leveraging Large Language Models to Repair High-Level Robot Controllers from Assumption Violations |
|
Meng, Qian | Cornell University |
Zhou, Jin Peng | Cornell University |
Weinberger, Kilian | Cornell University |
Kress-Gazit, Hadas | Cornell University |
Keywords: Formal Methods in Robotics and Automation, Deep Learning in Robotics and Automation, Task Planning
Abstract: This paper presents INPROVF, an automatic framework that combines large language models (LLMs) and formal methods to speed up the repair process of high-level robot controllers. Previous approaches based solely on formal methods are computationally expensive and cannot scale to large state spaces. In contrast, INPROVF uses LLMs to generate repair candidates, and formal methods to verify their correctness. To improve the quality of these candidates, our framework first translates the symbolic representations of the environment and controllers into natural language descriptions. If a candidate fails the verification, INPROVF provides feedback on potential unsafe behaviors or unsatisfied tasks, and iteratively prompts LLMs to generate improved solutions. We demonstrate the effectiveness of INPROVF through 12 violations with various workspaces, tasks, and state space sizes.
|
|
15:03-15:21, Paper TuCT3.2 | |
A Retrieval-Augmented Generation (RAG)-Based LLM for Modern Warehouse Automation and Management |
|
Choudhary, Kabita | BITS Pilani, Dubai Campus |
Uruj, Sheeba | BITS Pilani, Dubai Campus |
Pathak, Abhinav | Robotics Lab, Dubai Future Labs, Dubai, UAE |
Venkatesan, Kalaichelvi | BITS Pilani, Dubai Campus |
Shetty, Sujala D | BITS Pilani, Dubai Campus |
Ramanujam, Karthikeyan | BITS Pilani, Dubai Campus |
Taha, Tarek | Dubai Future Labs |
Muthusamy, Rajkumar | Dubai Future Foundation |
Keywords: Cyber-physical Production Systems and Industry 4.0, Factory Automation, Planning, Scheduling and Coordination
Abstract: Robotics automation and integration are quickly becoming a pillar of Industry 4.0 in modern warehouse management systems, but existing solutions fail to seamlessly integrate autonomous robotics into operational workflows, which leads to inefficiencies and scalability challenges. As the dependency on robotics grows in warehouses, there is a need for more reliable data and adaptable systems. The traditional system is not flexible enough to support dynamically evolving warehouse infrastructure, which leads to inefficiencies and operational challenges. To address these issues, an efficient and trustworthy automation system is needed. This paper proposes a Modern Warehouse Management System (MWMS) framework that integrates Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs). The framework enables robots to scan inventory on their own, sort packages in real time, plan their paths, assist people in picking and placing items, and work together on tasks. A mock-up warehouse environment is developed to test and demonstrate the system’s capabilities for robotic assistance in a dynamic environment. The framework is evaluated using three LLMs, demonstrating its ability to interact with real-time data and enhance operational efficiency, accuracy, and trustworthiness.
|
|
15:21-15:39, Paper TuCT3.3 | |
A Model-Agnostic Approach for Semantic Table Augmentation in Digital Twin Government Using LLMs and Domain-Specific Ontologies |
|
Lee, Yewon | Dong-A University |
Son, Wonseok | Dong-A University |
Kim, Jeongsu | Dong-A University |
Chun, Sejin | Dong-A University |
Keywords: Automation Technologies for Smart Cities, Calibration and Identification, Data fusion
Abstract: The Digital Twin Government (DTG) paradigm seeks to digitally replicate vast and heterogeneous governmental resources for decision-making and service delivery in real-world scenarios. In this paper, we introduce a Semantic Table Augmentation (STA) framework that automates the semantic enrichment of diverse tabular data using Large Language Models (LLMs). First, we propose the Digital Civil Complaint Ontology (DCCO), which expresses entities and their relationships in civil complaint management under DTG contexts. Second, our context-driven prompt templates enable the deployment of LLMs in a model-agnostic manner. Finally, we evaluate the performance of our proposed methods on synthetic datasets using cutting-edge LLMs against a state-of-the-art method.
|
|
15:39-15:57, Paper TuCT3.4 | |
Adaptive Domain Modeling with Language Models: A Multi-Agent Approach to Task Planning |
|
Babu, Harisankar | Bosch Center for Artificial Intelligence |
Schillinger, Philipp | Bosch Center for Artificial Intelligence |
Asfour, Tamim | Karlsruhe Institute of Technology (KIT) |
Keywords: Task Planning, Agent-Based Systems, AI-Based Methods
Abstract: We introduce TAPAS (Task-based Adaptation and Planning using AgentS), a multi-agent framework that integrates Large Language Models (LLMs) with symbolic planning to solve complex tasks without the need for manually defined environment models. TAPAS employs specialized LLM-based agents that collaboratively generate and adapt domain models, initial states, and goal specifications as needed using structured tool-calling mechanisms. Through this tool-based interaction, downstream agents can request modifications from upstream agents, enabling adaptation to novel attributes and constraints without manual domain redefinition. A ReAct (Reason+Act)-style execution agent, coupled with natural language plan translation, bridges the gap between dynamically generated plans and real-world robot capabilities. TAPAS demonstrates strong performance in benchmark planning domains and in the VirtualHome simulated real-world environment.
|
|
15:57-16:15, Paper TuCT3.5 | |
Intelligent Indoor Navigation for Home Robots Based on Large Language Models |
|
Chen, Zhanjie | Oklahoma State University |
Su, Zhidong | Oklahoma State University |
Chesser, Conlan | Oklahoma State University |
Bourlon, Craig | Oklahoma State University |
Sheng, Weihua | Oklahoma State University |
Keywords: Formal Methods in Robotics and Automation, AI-Based Methods
Abstract: Natural language understanding is crucial for home robots to help people in their daily lives. However, existing home robots mainly rely on keyword matching to understand explicit commands and struggle to understand human instructions expressed more implicitly and naturally. Recently, large language models (LLMs) have demonstrated great potential in human language understanding. In this paper, we explore the application of LLMs in home robots with a focus on navigation, one of the core capabilities of mobile robots. First, we designed and implemented an LLM-assisted robot navigation framework which adopts a modular architecture to integrate human-robot interaction, semantic mapping, and motion planning, thereby enhancing scalability and deployment flexibility. Second, we established a systematic method and benchmarks to evaluate the performance of LLMs in robot navigation applications, quantifying the performance differences between cloud-based and local LLMs. Finally, we conducted experiments on a custom-designed mobile robot in a real apartment setting. Our findings provide practical insight into selecting and optimizing LLMs for robotic applications.
|
|
TuCT4 |
Room T4 |
Reinforcement Learning 2 |
Regular Session |
Chair: Berrueta, Thomas | California Institute of Technology |
|
14:45-15:03, Paper TuCT4.1 | |
Physical State Exploration for Reinforcement Learning from Scratch |
|
Pinosky, Allison | Northwestern University |
Berrueta, Thomas | California Institute of Technology |
Li, Olivia | Northwestern University |
Murphey, Todd | Northwestern University |
Keywords: Deep Learning in Robotics and Automation, Reinforcement, Machine learning
Abstract: Although reinforcement learning (RL) algorithms have demonstrated impressive capabilities in simulation, their transition into the real world often reveals a performance gap. A key challenge of real-world deployment is ensuring robustness to complex or unmodeled physical phenomena, such as anisotropic friction during locomotion. This discrepancy between simulated and real-world performance underscores the need for hardware benchmarks to rigorously evaluate and improve RL algorithms for physical deployment. In this work, we present NoodleBot, a low-cost, untethered three-link swimmer robot, as a hardware benchmark for RL algorithms. This benchmark is intended to complement embodied learning approaches, where simulations provide confidence that algorithms are able to learn from scratch in challenging environments. We demonstrate three algorithms learning on the platform and compare the results to learning with a simulated swimmer, highlighting the importance of effective state exploration to agent performance. We also show the ability of one algorithm to learn in single-shot hardware deployments. Design, firmware, and software are available open source at https://github.com/MurpheyLab/NoodleBot.
|
|
15:03-15:21, Paper TuCT4.2 | |
Hierarchically Connecting Modularly-Learned Policies to Generate a Controller for a Combined Robot System |
|
Takeda, Sho | Kyoto University |
Yamamori, Satoshi | Kyoto University |
Yagi, Satoshi | Kyoto University |
Morimoto, Jun | Kyoto University |
Keywords: Deep Learning in Robotics and Automation, Reinforcement, Machine learning
Abstract: Deep reinforcement learning offers a promising approach for controlling robots with high degrees of freedom. However, its application is limited by substantial data requirements and difficulties in simulating complex physical interactions. This paper proposes a novel component-wise hierarchical policy learning approach that addresses these challenges by decomposing the robot into individual components and learning a separate policy for each. This strategy improves data efficiency and allows for more focused training of individual sub-systems. Upper-level policies then integrate these component policies through a separate learning process, enabling coordinated whole-body control of the robot. This hierarchical structure can be interpreted as a curriculum learning strategy, where the robot gradually learns more complex tasks by mastering individual component skills first. By learning modular policies and then combining them, the approach offers improved generalization and robustness compared to monolithic policy learning. We validate the approach on a modular robot, demonstrating that this hierarchical, component-wise policy learning framework enables efficient control of complex robots.
|
|
15:21-15:39, Paper TuCT4.3 | |
Re4MPC: Reactive Nonlinear MPC for Multi-Model Motion Planning Via Deep Reinforcement Learning |
|
Akmandor, Neset Unver | Motional AD Inc |
Prajapati, Sarvesh | Northeastern University |
Zolotas, Mark | Toyota Research Institute |
Padir, Taskin | Northeastern University |
Keywords: Motion and Path Planning, Deep Learning in Robotics and Automation, Reinforcement
Abstract: Traditional motion planning methods for robots with many degrees-of-freedom, such as mobile manipulators, are often computationally prohibitive for real-world settings. In this paper, we propose a novel multi-model motion planning pipeline, termed Re4MPC, which computes trajectories using Nonlinear Model Predictive Control (NMPC). Re4MPC generates trajectories in a computationally efficient manner by reactively selecting the model, cost, and constraints of the NMPC problem depending on the complexity of the task and robot state. The policy for this reactive decision-making is learned via a Deep Reinforcement Learning (DRL) framework. We introduce a mathematical formulation to integrate NMPC into this DRL framework. To validate our methodology and design choices, we evaluate DRL training and test outcomes in a physics-based simulation involving a mobile manipulator. Experimental results demonstrate that Re4MPC is more computationally efficient and achieves higher success rates in reaching end-effector goals than the NMPC baseline, which computes whole-body trajectories without our learning mechanism.
|
|
15:39-15:57, Paper TuCT4.4 | |
A Production Scheduling Framework for Reinforcement Learning under Real-World Constraints |
|
Hoss, Jonathan | Rosenheim University of Applied Sciences |
Schelling, Felix | Rosenheim University of Applied Sciences |
Klarmann, Noah | Rosenheim University of Applied Sciences |
Keywords: Intelligent and Flexible Manufacturing, Discrete Event Dynamic Automation Systems, Factory Automation
Abstract: The classical Job Shop Scheduling Problem (JSSP) focuses on optimizing makespan under deterministic constraints. Real-world production environments introduce additional complexities that cause traditional scheduling approaches to be less effective. Reinforcement learning (RL) holds potential in addressing these challenges, as it allows agents to learn adaptive scheduling strategies. However, there is a lack of comprehensive, general-purpose frameworks for effectively training and evaluating RL agents under real-world constraints. To address this gap, we propose a modular framework that extends classical JSSP formulations by incorporating key real-world constraints inherent to the shopfloor, including transport logistics, buffer management, machine breakdowns, setup times, and stochastic processing conditions, while also supporting multi-objective optimization. The framework is a customizable solution that offers flexibility in defining problem instances and configuring simulation parameters, enabling adaptation to diverse production scenarios. A standardized interface ensures compatibility with various RL approaches, providing a robust environment for training RL agents and facilitating the standardized comparison of different scheduling methods under dynamic and uncertain conditions. We release JobShopLab as an open-source tool for both research and industrial applications, accessible at: https://github.com/proto-lab-ro/jobshoplab
|
|
15:57-16:15, Paper TuCT4.5 | |
Ship Berthing Control Using Autonomous Tugboats and Reinforcement Learning |
|
Oh, Jaejin | Chungnam National University |
Koo, Japyeong | Chungnam National University |
Kim, Jaewoo | Samsung Heavy Industries |
Park, Kwang-Phil | Chungnam National University |
Jung, Jongdae | Chungnam National University |
Keywords: AI-Based Methods, Model Learning for Control, Simulation and Animation
Abstract: Berthing a large ship in a port is a challenging task that requires precise control from tugboats. This study proposes a cooperative control strategy for multiple tugboats to guide a target mother ship to its desired berthing position and orientation. By treating each tugboat as an independent agent, we designed a reward function that takes into account the thrust distribution and the interactions between the tugboats and the mother ship. A reinforcement learning algorithm based on Proximal Policy Optimization (PPO) was utilized to train the models for the thruster controller. Our simulations showed that ship berthing can be achieved with two tugboats, with a success rate of over 70%. For real-world verification, we developed three small-scale autonomous surface vehicles and implemented LiDAR-based object detection using a custom dataset and deep neural networks. Future research will focus on deploying the learned controller on actual model ships and verifying the berthing process.
|
|
TuCT5 |
Room T5 |
Simulations & Digital Twins 2 |
Regular Session |
Chair: Goldberg, Ken | UC Berkeley |
|
14:45-15:03, Paper TuCT5.1 | |
Rapid Modeling Architecture for Lightweight Simulator to Accelerate and Improve Decision Making for Industrial Systems |
|
Kato, Takumi | Hitachi America, Ltd |
Hu, Zhi | Hitachi America |
Keywords: Manufacturing, Maintenance and Supply Chains, Factory Automation, Domain-specific Software and Software Engineering
Abstract: Designing industrial systems, such as building, improving, and automating distribution centers and manufacturing plants, involves critical decision-making with limited information in the early phases. The lack of information leads to less accurate system designs, whose flaws are often difficult to resolve later. Using simulators to model the designed system is an effective way to find issues early. However, the modeling time required by conventional simulators is too long to allow for rapid model creation to meet decision-making demands. In this paper, we propose a Rapid Modeling Architecture (RMA) for a lightweight industrial simulator that mitigates the modeling burden while maintaining the essential details in order to accelerate and improve decision-making. We have prototyped a simulator based on the RMA and applied it to an actual factory layout design problem. We also compared the modeling time of our simulator to that of an existing simulator; our simulator achieved a 78.3% reduction in modeling time.
|
|
15:03-15:21, Paper TuCT5.2 | |
Digital Twin Framework for Real-Time Optimization of Hairpin Coil Straightening Process |
|
Kim, Jihun | Sungkyunkwan University |
Han, Sun Woo | Sungkyunkwan University |
Lee, Minsub | Hyundai Mobis |
Lee, Eun-Ho | Sungkyunkwan University |
Keywords: Zero-Defect Manufacturing, Intelligent and Flexible Manufacturing, Cyber-physical Production Systems and Industry 4.0
Abstract: As the demand for customized electric vehicles (EVs) continues to grow, enhancing the flexibility of manufacturing processes for key propulsion components, such as hairpin motor windings, has become increasingly critical. Conventional roller straightening methods, which rely on fixed equipment configurations, exhibit significant limitations in accommodating variations in wire geometry and material properties, resulting in inefficiencies and increased production costs. This study presents a flexible roller straightening approach that integrates real-time sensing, finite element method (FEM)-based process simulation, and surrogate modeling using a multilayer perceptron (MLP) neural network. Laser micrometers and eddy current sensors are employed to capture real-time measurements of wire thickness and plastic strain, enabling the development of a digital twin for process optimization. The proposed optimization framework aims to simultaneously minimize wire curvature and plastic strain, thereby mitigating spring-back effects and reducing electrical resistance, both of which are critical for ensuring the mechanical precision and electrical performance of the final product. The results demonstrate that the proposed methodology significantly enhances the adaptability and efficiency of hairpin winding manufacturing, providing a scalable solution for flexible EV motor production.
|
|
15:21-15:39, Paper TuCT5.3 | |
Digital Twin in Industrie 4.0 Implementation for Embedded Systems |
|
Soler Perez Olaya, Santiago | TU-Dresden |
Cavegn, Dennis | ZHAW |
Braunisch, Nico | TU Dresden |
Ristin, Marko | Nazarbayev University |
Sadurski, Marcin | ZHAW |
van de Venn, Hans Wernher | Zurich University of Applied Science |
Wollschlaeger, Martin | TU Dresden |
Keywords: Cyber-physical Production Systems and Industry 4.0, Product Design, Development and Prototyping, Control Architectures and Programming
Abstract: The realization of Industry 4.0 potential depends on the utilization of digital twins, commonly implemented as Asset Administration Shells (AAS). AAS provides a robust framework for describing assets and their functionalities, and is becoming a key component of digital twins for embedded systems. In the fields of Industrial Cyber Physical Systems and Industrial Internet of Things, AAS serves as a bridge between microcontroller and high-performance systems. Current Software Development Kits for AAS focus on cloud and PC applications. Building on the foundation of transpilation-based approaches, we explore novel ways to adapt SDKs for AAS to bring them to industrial embedded systems. Our method is designed to streamline the development process for microcontroller and Industrial Control System contexts. The evaluation shows the feasibility of AAS for embedded systems, current memory requirements, and the response time of AAS implemented in an embedded system. An implementation of Industry 4.0 AAS including parts of the standardized AAS REST API opens the door to further research and development of Industry 4.0 applications based on embedded systems.
|
|
15:39-15:57, Paper TuCT5.4 | |
GrowSplat: Constructing Temporal Digital Twins of Plants with Gaussian Splats |
|
Adebola, Simeon Oluwafunmilore | University of California, Berkeley |
Xie, Shuangyu | UC Berkeley |
Kim, Chung Min | University of California, Berkeley |
Kerr, Justin | University of California, Berkeley |
van Marrewijk, Bart M | Wageningen University and Research |
van Vlaardingen, Mieke | Wageningen University and Research |
van Daalen, Tim | Wageningen University and Research (Netherlands) |
van Loo, E.N. | Wageningen Research |
Solowjow, Eugen | Siemens Corporation |
van de Zedde, Rick | Wageningen University & Research |
Goldberg, Ken | UC Berkeley |
Susa Rincon, Jose Luis | Siemens Corporation |
Keywords: Agricultural Automation, Computer Vision in Automation, Autonomous Agents
Abstract: Accurate temporal reconstructions of plant growth can be valuable for plant phenotyping and breeding, yet remain challenging due to complex geometries, occlusions, and non-rigid deformations of plants. We present GrowSplat, a novel framework for building temporal digital twins of plants by combining 3D Gaussian Splatting with a robust spatial alignment pipeline. GrowSplat begins by constructing a temporal sequence of Gaussian Splats from multi-view camera data, then performs a two-stage spatial registration: coarse alignment through feature-based matching and Fast Global Registration, followed by fine alignment with Iterative Closest Point. This pipeline yields a consistent 4D model of plant development in discrete time steps. We evaluate the approach on data from the Netherlands Plant Eco-phenotyping Center, demonstrating detailed temporal reconstructions of Sequoia and Quinoa species. Videos and images are available at https://berkeleyautomation.github.io/GrowSplat/
|
|
TuCT6 |
Room T6 |
Manipulation 2 |
Regular Session |
Chair: Ni, Yun | Stanford University |
|
14:45-15:03, Paper TuCT6.1 | |
SSL-HWE: Semi-Supervised Learning-Based Hierarchical Waypoint Extraction for Imitation Learning in Robotic Manipulation |
|
Liu, Xinzi | South University of Science and Technology of China |
Zhai, Xinyu | Southern University of Science and Technology |
Liu, Junwei | Southern University of Science and Technology |
Zhang, Wei | Southern University of Science and Technology |
Keywords: Deep Learning in Robotics and Automation, Human Factors and Human-in-the-Loop, Machine Learning
Abstract: Imitation learning has proven effective for performing manipulation tasks when trained on high-quality human demonstrations. However, human demonstrations always contain imperfections that complicate the action data distribution, leading to compounding errors during policy deployment. In this paper, we propose a novel data processing approach to simplify action data for human demonstrations with a small set of labeled key intervals that capture critical actions. The approach integrates two components: semi-supervised key interval segmentation and hierarchical optimization for waypoint extraction. The semi-supervised learning model identifies key intervals in the unlabeled dataset, enabling the segmentation of each trajectory into key and non-key intervals. These interval labels are then used to solve optimization problems with interval-specific constraints to derive waypoints, which are subsequently applied to adjust the actions in the demonstration dataset. The proposed data processing approach was validated through a series of real-world experiments in conjunction with a state-of-the-art imitation learning method. The results demonstrate that our approach significantly enhances downstream imitation learning, and outperforms the compared data processing method by 30% to 50% in success rate. Videos are available at https://liuxinzi.github.io/SSL-HWE.
|
|
15:03-15:21, Paper TuCT6.2 | |
Heterogeneous Object Manipulation on Nonlinear Soft Surface through Linear Controller |
|
Ingle, Pratik | IT University of Copenhagen |
Stoy, Kasper | IT University of Copenhagen |
Faińa, Andres | IT University of Copenhagen |
Keywords: Robust/Adaptive Control, Factory Automation, Simulation and Animation
Abstract: Manipulation surfaces indirectly control and reposition objects by actively modifying their shape or properties rather than directly gripping objects. These surfaces, equipped with dense actuator arrays, generate dynamic deformations. However, a high-density actuator array introduces considerable complexity due to increased degrees of freedom (DOF), complicating control tasks. A high DOF count restricts the implementation and utilization of manipulation surfaces in real-world applications, as the maintenance and control burden of such systems increases exponentially with array/surface size. Learning-based control approaches may ease the control complexity, but they require extensive training samples and struggle to generalize to heterogeneous objects. In this study, we introduce a simple, precise and robust PID-based linear closed-loop feedback control strategy for heterogeneous object manipulation on MANTA-RAY (Manipulation with Adaptive Non-rigid Textile Actuation with Reduced Actuation density). Our approach employs a geometric transformation-driven PID controller, directly mapping tilt angle control outputs (1D/2D) to actuator commands to eliminate the need for extensive black-box training. We validate the proposed method through simulations and experiments on a physical system, successfully manipulating objects with diverse geometries, weights and textures, including fragile objects like eggs and apples. The outcomes demonstrate that our approach generalizes well and offers a practical and reliable solution for object manipulation on soft robotic manipulation surfaces, facilitating real-world implementation without prohibitive training demands.
|
|
15:21-15:39, Paper TuCT6.3 | |
Novel Sweeping Methods for Robotic Rearrangement of Object Piles |
|
Rathi, Abhijeet Sanjay | Worcester Polytechnic Institute |
Radil, Filip | Brno University of Technology |
Pawar, Hrishikesh Dhairyasheel | Worcester Polytechnic Institute |
Calli, Berk | Worcester Polytechnic Institute |
Keywords: Manipulation Planning, Industrial and Service Robotics, Environment Monitoring and Management
Abstract: In this paper, we investigate robotic pile rearrangement algorithms to redistribute objects in cluttered scenes for improving the performance of object recognition and picking systems. In particular, we focus on identifying the best pile sweeping actions in fast-paced industrial sorting applications where the robot would only have limited time to react, i.e., has one chance to sweep. The robot's sweep aims to minimize occlusions and improve visibility of the objects in the scene. We first study the performance of three methods that utilize dimensionality reduction techniques (i.e., clustering and PCA) as baselines, discussing their strengths and weaknesses. We then present a novel sampling-based approach for determining the best start and end points for the sweeping motions based on an action evaluation metric. Additionally, we adapted the best performing method to prioritize minimizing occlusion for specific objects. Our results show that the sampling-based method performs more consistently and successfully across different initial pile configurations.
|
|
15:39-15:57, Paper TuCT6.4 | |
Learning Bimanual Manipulation Via Action Chunking and Inter-Arm Coordination with Transformers |
|
Motoda, Tomohiro | National Institute of Advanced Industrial Science and Technology |
Hanai, Ryo | National Institute of Industrial Science and Technology(AIST) |
Nakajo, Ryoichi | National Institute of Advanced Industrial Science and Technology |
Murooka, Masaki | AIST |
Erich, Floris Marc Arden | National Institute of Advanced Industrial Science and Technology |
Domae, Yukiyasu | The National Institute of Advanced Industrial Science and Techno |
Keywords: AI-Based Methods, Machine learning, Motion Control
Abstract: Robots that operate autonomously in human living environments must be able to handle various tasks flexibly. One crucial element is coordinated bimanual movement, which enables functions that are difficult to perform with one hand alone. In recent years, learning-based models that focus on the possibilities of bimanual movements have been proposed. However, the high degree of freedom of the robot makes it challenging to reason about control, and the left and right robot arms need to adjust their actions depending on the situation, making it difficult to realize more dexterous tasks. To address this issue, we focus on coordination and efficiency between both arms, particularly for synchronized actions, and propose a novel imitation learning architecture that predicts cooperative actions. We differentiate the architecture for both arms and add an intermediate encoder layer, Inter-Arm Coordinated transformer Encoder (IACE), that facilitates synchronization and temporal alignment to ensure smooth and coordinated actions. To verify the effectiveness of our architectures, we perform distinctive bimanual tasks. The experimental results show that our model achieves a high success rate in the comparison and suggest a suitable architecture for the policy learning of bimanual manipulation.
|
|
15:57-16:15, Paper TuCT6.5 | |
Teleoperation of a Compliant Avian-Inspired Robotic Claw with Continuum Digits |
|
Mohrmann, John | Clemson ARL |
Schuver, Jack | Clemson University |
Yu, Miao | Clemson University |
Walker, Ian | University of Wyoming |
Lv, Ge | Clemson University |
Keywords: Telerobotics and Teleoperation, Physically Assistive Devices, Human-Centered Automation
Abstract: This paper explores the development of a bio-inspired, large-scale compliant CLAW with continuum digits designed for grasping of delicate objects via human teleoperation. Drawing inspiration from avian perching behaviors, the proposed CLAW utilizes minimal sensing, with input provided through a flexible glove equipped with a single bending sensor, while its shape is controlled via an inertial measurement unit mounted on one of its continuum digits. Unlike previous bird-inspired grippers focused on small-scale perching applications, our approach targets the grasping of delicate objects, addressing challenges associated with the compliance and shape-sensing of soft robotic components. The experiments demonstrate the CLAW’s ability to effectively grasp a variety of objects, as well as proof-of-concept results on "rescuing" objects from a confined environment, showcasing the potential of continuum robots for adaptive manipulation with minimal sensor input. These preliminary results underscore the potential of the proposed continuum-based grippers for applications involving human teleoperators in human-robot collaborative tasks.
|
|
TuCT7 |
Room T7 |
Modeling and Control for Automation in Mfg 2 |
Special Session |
Chair: Barton, Kira | University of Michigan at Ann Arbor |
Organizer: Barton, Kira | University of Michigan at Ann Arbor |
Organizer: Balta, Efe | Inspire AG |
Organizer: Bristow, Douglas | Missouri University of Science and Technology |
Organizer: Kovalenko, Ilya | Pennsylvania State University |
Organizer: Wang, Zi | University of Nottingham |
|
14:45-15:03, Paper TuCT7.1 | |
A Model Predictive Control Framework to Enhance Safety and Quality in Mobile Additive Manufacturing Systems (I) |
|
Li, Yifei | The Pennsylvania State University |
Robbins, Joshua | Pennsylvania State University |
Manogharan, Guha | The Pennsylvania State University |
Pangborn, Herschel | The Pennsylvania State University |
Kovalenko, Ilya | Pennsylvania State University |
Keywords: Additive Manufacturing, Motion Control, Motion and Path Planning
Abstract: In recent years, the demand for customized, on-demand production has grown in the manufacturing sector. Additive Manufacturing (AM) has emerged as a promising technology to enhance customization capabilities, enabling greater flexibility, reduced lead times, and more efficient material usage. However, traditional AM systems remain constrained by static setups and human worker dependencies, resulting in long lead times and limited scalability. Mobile robots can improve the flexibility of production systems by transporting products to designated locations in a dynamic environment. By integrating AM systems with mobile robots, manufacturers can optimize travel time for preparatory tasks and distributed printing operations. Mobile AM robots have been deployed for on-site production of large-scale structures, but often neglect critical print quality metrics like surface roughness. Additionally, these systems do not have the precision necessary for producing small, intricate components. We propose a model predictive control framework for a mobile AM platform that ensures safe navigation on the plant floor while maintaining high print quality in a dynamic environment. Three case studies are used to test the feasibility and reliability of the proposed systems.
|
|
15:03-15:21, Paper TuCT7.2 | |
Coverage Path Planning for Ultrasonic Non-Destructive Inspection (I) |
|
Beachy, Jonas | University of Washington |
Chen, Xu | University of Washington |
Keywords: Motion and Path Planning, Intelligent and Flexible Manufacturing
Abstract: Non-destructive inspection (NDI) of composite materials is a critical process in industries such as aerospace and construction, where structural integrity must be verified to prevent catastrophic failures. Ultrasonic testing is a widely used NDI method for detecting defects in composite parts, but its effectiveness depends on maintaining a water column between the probe and the part surface. Traditional automated coverage path planning (CPP) approaches fail to ensure full coverage, particularly along the edges of complex geometries where the loss of the water column can compromise inspection reliability. This paper presents a novel CPP method for robotic NDI that enhances inspection coverage by fitting the edges with parametric Bézier curves to ensure inspection around the edges before constructing the remainder of the path. The method then refines the path using a modified potential field algorithm and a redundant coverage elimination algorithm. The result of the proposed method is a largely automated, efficient NDI process that maintains a high quality of inspection.
|
|
15:21-15:39, Paper TuCT7.3 | |
Aligning Digital-Physical Twin Layout for Reconfigurable Manufacturing Systems Using Point Cloud Analysis (I) |
|
Wang, Zi | University of Nottingham |
Yang, Harvey Mingda | University of Nottingham |
Ratchev, Svetan | The University of Nottingham |
Keywords: Cyber-physical Production Systems and Industry 4.0, Intelligent and Flexible Manufacturing, Calibration and Identification
Abstract: The adoption of digital twins in modern manufacturing is becoming an inevitable trend, but there are still open issues in realising their full potential. Modern manufacturing is shifting from rigid production lines towards reconfigurable manufacturing systems (RMSs), responding to the global challenges of fluctuating market demand, supply chain disruptions and sustainability. Rapid reconfiguration is critical for RMSs, requiring seamless integration between current state analysis, new state design, deployment and validation. To shorten changeover time, digital twins are increasingly being adopted. The distinctive feature of a digital twin compared to any simulation environment is its closeness to its physical counterpart. Yet how to efficiently update a digital twin with data from a physical factory is still an open issue. A scanned point cloud can represent the geometry of a physical system, but a point cloud is not equivalent to a digital twin. A point cloud contains a large amount of discretised information, whereas a factory digital twin focuses on managing the functionalities of manufacturing processes. For RMSs, system layout is directly linked to functionality, and is often optimised for production capacity, material handling cost, production lead time, etc. Therefore, the ability to efficiently translate a point cloud into layout information usable by the digital twin is a key enabler for rapid reconfiguration in RMSs. The proposed research introduces a method to obtain layout data as transformation matrices from a scanned factory point cloud and demonstrates the capability with photogrammetry camera output.
|
|
15:39-15:57, Paper TuCT7.4 | |
Inspection State Placement to Reduce Fault Propagation in Manufacturing Systems |
|
Beal, Gregory | Pennsylvania State University |
Tavakkoli Anbarani, Mostafa | Pennsylvania State University |
Meira-Goes, Romulo | Pennsylvania State University |
Kovalenko, Ilya | Pennsylvania State University |
Keywords: Control Architectures and Programming, Discrete Event Dynamic Automation Systems, Factory Automation
Abstract: Manufacturers need to ensure continuous and efficient operation of their production facilities. However, with advances in edge computing and other manufacturing system technology, manufacturing systems have become more distributed. In these settings, machines in the system have an increased risk for faults to occur and propagate compared to traditional manufacturing plants. This behavior is due to the wide range of tasks and collaborative work done in the distributed architectures. In coupled systems, subsystems share events without communicating the fault status. This poses the risk that silent faults that occur in one subsystem propagate to another, further downgrading system performance. This paper defines both inspection states and resultant states and demonstrates that inspecting the resultant states between coupled systems ensures that faults and silent faults are detected between sub-automata. A case study using a physical manufacturing testbed is then analyzed using the proposed framework.
|
|
TuCT8 |
Room T8 |
Quality and Reliability for System Intelligence |
Special Session |
Chair: Yue, Xiaowei | Tsinghua University |
Organizer: Yue, Xiaowei | Tsinghua University |
Organizer: Wu, Jianguo | Peking University |
Organizer: Li, Yongxiang | Shanghai Jiao Tong University |
Organizer: Nabhan, Mohammad | Georgia Tech |
|
14:45-15:03, Paper TuCT8.1 | |
Reconceptualizing Active Learning: A Refined Framework for Methods, Frontiers, and Applications (I) |
|
Chen, Jinglei | Tsinghua University |
Ai, Yibo | University of Science and Technology Beijing |
Yue, Xiaowei | Tsinghua University |
Keywords: Machine learning, Model Learning for Control, Robust/Adaptive Control
Abstract: A refined methodological framework for active learning (AL) techniques is proposed through a systematic review of conventional categorization and recent developments. The proposed framework primarily integrates sample-aware methods and metric-based methods, to offer enhanced practical utility for real-world applications. Furthermore, we review and examine emerging research frontiers focusing on three key aspects: transferability, robustness, and computational efficiency. The study also investigates diverse application scenarios from both statistical and engineering perspectives. This work serves to (a) synthesize recent advancements in the field, and (b) provide exemplars and actionable guidelines to facilitate promising research directions.
|
|
15:03-15:21, Paper TuCT8.2 | |
STG: Seasonal-Trend Decomposition Using Gaussian Processes for Photovoltaic Performance Data (I) |
|
Li, Ruixian | The University of Hong Kong |
Li, Yongxiang | Shanghai Jiao Tong University |
Cheng, Yao | The University of Hong Kong |
Keywords: Renewable Energy Sources, Probability and Statistical Methods, Diagnosis and Prognostics
Abstract: Accurately assessing photovoltaic performance loss rates is crucial for predicting long-term energy yield and financial viability, yet this task is complicated by environmental factors and data quality issues. Although many traditional seasonal-trend decomposition methods are utilized to analyze photovoltaic systems, they typically offer limited flexibility and robustness when dealing with unknown non-integer seasonal periods and noisy high-frequency field-collected data. Moreover, they often overlook the long-range dependence in photovoltaic performance loss processes, which is induced by the multiple interactions between the natural environment and photovoltaic systems. To address these challenges, we propose STG, a novel seasonal-trend decomposition framework grounded in Gaussian Processes (GPs). STG combines a periodic GP and a fractional Wiener process to simultaneously capture intricate periodic patterns and long-range dependencies in photovoltaic performance data, while also providing probabilistic uncertainty quantification. By leveraging efficient computational techniques and scalable inference algorithms, STG can accommodate large, high-frequency photovoltaic datasets with improved accuracy and reduced uncertainty. Through extensive simulation studies and real-world case analyses involving multiple photovoltaic systems, we demonstrate that STG surpasses existing state-of-the-art decomposition methods in capturing subtle seasonal and trend behaviors. Additionally, STG could yield predictive distributions that offer more reliable uncertainty quantification, facilitating more informed decision-making in system operation.
|
|
15:21-15:39, Paper TuCT8.3 | |
Modeling and Monitoring Dynamic, Directional, Sparse, Attributed Networks Via a Dynamic Hurdle Regression Model with Latent Variables (I) |
|
Wu, Hao | Tsinghua University |
Wang, Kaibo | Tsinghua University |
Keywords: Model Learning for Control, Probability and Statistical Methods, Diagnosis and Prognostics
Abstract: Network data is commonly available across various domains, sparking a surge in research dedicated to modeling and monitoring network systems. In the realm of network analysis with node attributes, the majority of existing studies utilize generalized linear models (GLMs) to establish connections between network topology and node characteristics. However, these studies often overlook the incongruity between directional edges and directionless attributes within the context of directional networks, as well as the inadequacy of using observable attributes only to explain the network topology. In this paper, we introduce a novel Hurdle regression model with latent variables (HRML), which assigns four latent variables to each node to govern the directionality of interactions. By integrating observable attributes, our proposed model adeptly manages directional, sparse, and attributed networks. We further develop the HRML into its dynamic version (D-HRML) within the state space model framework to capture the temporal dynamics of network streams. An extended Kalman filter (EKF) is employed for optimal parameter estimation. Ultimately, we devise a monitoring scheme based on the generalized likelihood ratio test (GLRT) to detect abrupt changes across diverse scenarios. Extensive simulations demonstrate that our proposed method outperforms several competitive approaches, particularly in detecting shifts in interaction propensities. A case study utilizing the Enron E-mail corpus further substantiates the high efficiency of our methodology.
|
|
15:39-15:57, Paper TuCT8.4 | |
Remaining Useful Life Prediction with Spatial-Temporal Gate-Based Physics-Informed Neural Networks (I) |
|
Li, Yuan | Academy of Mathematics and Systems Science, CAS |
|
|
15:57-16:15, Paper TuCT8.5 | |
Nonlinear Spatio-Temporal Run-To-Run Control: Leveraging Diffusion Models for Disturbance Management (I) |
|
Zhang, Zihan | Georgia Institute of Technology |
Mao, Lingchao | Georgia Institute of Technology |
Paynabar, Kamran | Georgia Tech |
Shi, Jianjun | Georgia Institute of Technology |
Keywords: Process Control, Sensor-based Control, Big-Data and Data Mining
Abstract: Run-to-run (R2R) control is widely used in advanced manufacturing processes, particularly in environments with complex nonlinear dynamics and high-dimensional disturbances. Conventional R2R methods often assume that system disturbances exhibit linear structures or low-rank approximations in high-dimensional data, such as images or videos. While these assumptions facilitate implementation, they limit the effectiveness of R2R control in managing sophisticated manufacturing processes with intricate nonlinear disturbances. To address these challenges, we propose a novel nonlinear spatio-temporal R2R control framework that estimates system disturbances using a diffusion model and applies control actions to compensate for them. The proposed approach integrates an offline modeling phase to capture disturbance dynamics and an online control phase that dynamically adjusts actions to minimize deviations, ensuring system stability and precision. The effectiveness of this method is validated through simulation studies and a case study, demonstrating its adaptability in complex, high-dimensional environments.
|
|
TuCT9 |
Room T9 |
Detection, Estimation and Prediction 4 |
Regular Session |
Chair: Zhang, Chen | Tsinghua University |
|
14:45-15:03, Paper TuCT9.1 | |
ReMaskNet: Regenerate Mask Network for Industrial Anomaly Detection and Segmentation |
|
Gao, Wenbo | Hangzhou Dianzi University |
Tao, Xian | Institute of Automation, Chinese Academy of Sciences |
Gong, Xinyi | Hangzhou Dianzi University |
Zhu, Yihang | Hangzhou Dianzi University |
Song, Zhaohui | Hangzhou Dianzi University |
Keywords: Computer Vision in Automation, Machine learning, AI-Based Methods
Abstract: We propose ReMaskNet, an innovative unsupervised anomaly detection network designed to address the challenges of defect detection and localization. ReMaskNet leverages the strengths of three major methods: reconstruction-based, synthesis-based, and embedding-based methods. Specifically, it first employs a pre-trained network to extract multi-scale features from input images, which are then adapted to the target domain through feature adapters. Following this, a dual-level anomaly synthesis strategy, incorporating both image-level and feature-level anomaly synthesis, is applied to enhance the model's ability to learn from synthesized abnormal samples. Subsequently, a feature reconstruction network is employed to reconstruct the features. By computing the difference between the reconstructed and original features, an initial anomaly segmentation map is generated. To refine this segmentation, ReMaskNet introduces a feature inpainting module, where the generated anomaly segmentation map serves as a mask. Anomaly regions are masked out and subsequently restored using an inpainting network. Additionally, an iterative optimization mechanism is employed to progressively enhance segmentation accuracy. Experimental results demonstrate that ReMaskNet outperforms state-of-the-art (SOTA) methods on the MVTec AD dataset, validating its effectiveness in handling complex backgrounds and subtle defect detection. Code is available at: https://github.com/gwb55/ReMaskNet
|
|
15:03-15:21, Paper TuCT9.2 | |
WFDRNet: Wavelet-Embedded Feature Distillation and Refinement Network for Unsupervised Anomaly Segmentation |
|
Zhu, Yihang | Hangzhou Dianzi University |
Tao, Xian | Institute of Automation, Chinese Academy of Sciences |
Gong, Xinyi | Hangzhou Dianzi University |
Gao, Wenbo | Hangzhou Dianzi University |
Song, Zhaohui | Hangzhou Dianzi University |
Wang, Hongbo | Hangzhou Dianzi University |
Keywords: Computer Vision in Automation, Machine learning, AI-Based Methods
Abstract: We propose a novel Wavelet-Embedded Feature Distillation and Refinement Network to address the limitations of existing unsupervised anomaly detection methods in local anomaly localization. WFDRNet combines the advantages of Reverse Knowledge Distillation and Autoencoder while introducing Discrete Wavelet Transform to enhance the model’s sensitivity to anomalous regions, thus improving localization accuracy. The network employs a two-stage training strategy: in the first stage, the student network learns feature representations from the teacher network; in the second stage, the Anomaly Mask Generation Module uses multi-scale features for anomaly localization. Experimental evaluations on the MVTec AD dataset demonstrate that WFDRNet surpasses existing state-of-the-art methods in Image-AUROC, Pixel-AUROC and Pixel-AP, particularly in precise anomaly localization, validating its effectiveness in complex contexts. Code is available at: https://github.com/IvanZhu666/WFDRNet
|
|
15:21-15:39, Paper TuCT9.3 | |
FedCOP: Federated Contrastive Orthonormal Prototype Learning Framework for Multi-Wind Farm Collaborative Fault Detection |
|
Zhou, Rui | Shanghai Jiao Tong University |
Li, Yanting | Shanghai Jiao Tong University |
Keywords: Diagnosis and Prognostics, Big data Analytics for Large-scale Energy Systems, Big-Data and Data Mining
Abstract: Accurate and efficient fault detection for wind turbines is essential for wind farm operation and management. Federated learning supports collaboration across wind farms while preserving privacy and alleviating data silos. However, existing methods overlook the personalized characteristics of each wind farm and fail to mitigate the deterioration in model performance caused by data heterogeneity. Moreover, transferring parameters at a million-scale results in high communication costs. To address these challenges, a federated contrastive orthonormal prototype learning (FedCOP) framework designed for collaborative fault detection among multiple wind farms is proposed in this paper. First, local normality and fault prototypes are established from latent representations with identical semantics. Following local training, wind farms only upload local prototypes, greatly improving communication speed. Second, the server aggregates local prototypes into global prototypes that encapsulate semantic details of various turbine states, and broadcasts them to the wind farms for later communication rounds. Third, in local objectives, the regularization term, contrastive loss, and orthonormal loss are integrated to align local prototypes with global ones and enhance their separability and distinguishability. Finally, the server applies momentum updates on the global prototypes to maintain consistency. FedCOP is validated using data from four wind farms located in Jiangsu, Tianjin, Shanghai, and Hubei, China. Experimental results show that under heterogeneous conditions, FedCOP outperforms existing methods in detection accuracy, latent representation boundaries, and communication efficiency.
|
|
15:39-15:57, Paper TuCT9.4 | |
Data-Efficient Spectral Classification of Hyperspectral Data Using MiniROCKET and HDC-MiniROCKET |
|
Theisen, Nick | University Koblenz |
Schlegel, Kenny | Chemnitz University of Technology |
Paulus, Dietrich | Universität Koblenz-Landau |
Neubert, Peer | University of Koblenz |
Keywords: Machine learning, Computer Vision in Automation, Deep Learning in Robotics and Automation
Abstract: The classification of pixel spectra of hyperspectral images, i.e., spectral classification, is used in many fields ranging from agricultural and medical to remote sensing applications and is currently also expanding to areas such as autonomous driving. Even though for full hyperspectral images the best-performing methods exploit spatial-spectral information, performing classification solely on spectral information has its own advantages, e.g., smaller model size and thus less data required for training. Moreover, spectral information is complementary to spatial information, and improvements on either part can be used to improve spatial-spectral approaches in the future. Recently, 1D-Justo-LiuNet was proposed as a particularly efficient model with very few parameters, which currently defines the state-of-the-art in spectral classification. However, we show that with limited training data the model performance deteriorates. Therefore, we investigate MiniROCKET and HDC-MiniROCKET for spectral classification to mitigate that problem. The model extracts well-engineered features without trainable parameters in the feature extraction part and is therefore less vulnerable to limited training data. We show that even though MiniROCKET has more parameters, it outperforms 1D-Justo-LiuNet in limited data scenarios and is mostly on par with it in the general case.
|
|
15:57-16:15, Paper TuCT9.5 | |
Enhanced Sensor Fault Detection in Bridge Monitoring Using Siamese-Based Encoder-Decoder |
|
Felicioni, Simone | University of Perugia - Department of Engineering |
Castellini, Luca | WISEPOWER |
Tinti, Luca | Wisepower Srl |
Giorgetti, Folco | University of Perugia |
Fravolini, Mario Luca | University of Perugia |
Keywords: Failure Detection and Recovery, Automation Technologies for Smart Cities, Machine learning
Abstract: Bridge monitoring is crucial for ensuring the safety of these infrastructures, as they are constantly exposed to environmental stress, aging, harsh weather conditions, and intense traffic loads. Traditional inspection methods are often labor-intensive and prone to human error, motivating the development of automated and data-driven approaches based on sensor networks directly mounted on the bridges. A key challenge in such monitoring systems lies in the accurate detection of sensor faults, which can severely compromise the quality of collected data and, consequently, the reliability of the monitoring outcomes. This study presents a novel data-driven approach for sensor fault detection in bridge monitoring applications, based on a combination of an encoder-decoder architecture and a Siamese network. The former aims to reconstruct the spectrograms computed from the accelerometer signals, whereas the latter aims to learn a similarity metric to better discriminate faulty readings from healthy ones. The experimental results of this study demonstrate the potential of the proposed approach in enhancing sensor fault detection performance compared to several baselines, providing more accurate predictions even with small fault intensities.
|
|
TuCT10 |
Room T10 |
Sustainability for Production and Service Systems 1 |
Special Session |
Chair: Zhang, Liang | University of Connecticut |
Organizer: Yan, Chao-Bo | Xi'an Jiaotong University |
Organizer: Pei, Zhi | Zhejiang University of Technology |
Organizer: Ju, Feng | Arizona State University |
Organizer: Li, Congbo | Chongqing University |
Organizer: Wang, Junfeng | Huazhong University of Science and Technology |
Organizer: Yu, Chunlong | Tongji University |
|
14:45-15:03, Paper TuCT10.1 | |
Energy Consumption Optimization for Two-Machine Bernoulli Serial Lines with Time-Sensitive Constraint and General Bounds on Machine Efficiencies (I) |
|
Ma, Xu | Xi'an Jiaotong University |
Zhang, YongKang | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Manufacturing, Maintenance and Supply Chains, Sustainability and Green Automation
Abstract: In many production systems with high energy use, machines consume a large amount of energy during operation. To deal with this issue, it is important to find ways to reduce energy consumption. This paper studies how to minimize energy use in a two-machine Bernoulli serial production line with a time-sensitive constraint. It presents general lower and upper bounds on machine efficiencies. Specifically, the energy optimization problem with production rate and time-sensitive constraints is formulated as a nonlinear programming problem. Then, two relaxation problems and the structures of their feasible regions are analyzed. Based on these explorations, the optimal solution to the original energy optimization problem for the two-machine Bernoulli production line, which includes the time-sensitive constraint and general bounds on machine efficiencies, is constructed from the solutions of the relaxation problems (i.e., the problem without general bounds on machine efficiencies).
|
|
15:03-15:21, Paper TuCT10.2 | |
Energy Consumption Optimization for Two-Machine Bernoulli Batch Serial Lines (I) |
|
Chen, Zeyuan | Xi'an Jiaotong University |
Gou, Tongxin | Xi'an Jiaotong University |
Zhang, Sheng | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Manufacturing, Maintenance and Supply Chains, Sustainability and Green Automation
Abstract: Industrial production consumes substantial amounts of energy, making energy efficiency in production lines a critical concern. A class of high-energy-consumption production lines, commonly found in industries such as battery manufacturing and semiconductor fabrication, can be modeled as batch production systems capable of processing multiple parts simultaneously. Since existing energy consumption optimization research rarely addresses batch production lines, this paper formulates and solves the energy consumption optimization problem for a two-machine Bernoulli batch serial production line. Specifically, the problem is modeled as a nonlinear programming problem. Through in-depth analysis of its characteristics, we derive two nonlinear equations that the optimal solution must satisfy. Based on the properties of these equations, a binary search algorithm is designed to efficiently compute the optimal solution. Extensive numerical experiments demonstrate the effectiveness of the proposed method in solving this class of energy consumption optimization problems.
|
|
15:21-15:39, Paper TuCT10.3 | |
Energy Consumption Optimization for Three-Machine Bernoulli Serial Lines with Time-Sensitive Constraint (I) |
|
Ma, Xu | Xi'an Jiaotong University |
Zhang, YongKang | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Manufacturing, Maintenance and Supply Chains, Sustainability and Green Automation
Abstract: In energy-intensive production systems, machines consume a large amount of energy during the production process. In this paper, the problem of reducing energy consumption for a three-machine Bernoulli serial line with a time-sensitive constraint on the first buffer is studied. Specifically, first, based on the aggregation method, structural characteristics of the optimization model are analyzed and thus, the problem is transformed. Then, with the efficiency of the third machine and the production rate fixed, based on the results of the energy consumption optimization model in the two-machine line, the optimization model is further analyzed. Next, with only the efficiency of the third machine fixed, the impact of production rate is discussed. Finally, the behavior of the objective function with respect to the third machine efficiency is analyzed, and the unique optimal solution is obtained numerically. This work provides a practical approach to energy-efficient scheduling in multi-machine production systems.
|
|
15:39-15:57, Paper TuCT10.4 | |
Job Scheduling Optimization in Autonomous Vehicle Storage and Retrieval System with Multi-Level Rack Based on a Hybrid Heuristic Algorithm (I) |
|
Zhang, Xinyan | Tongji University |
Xiao, Xiao | Tongji University |
Keywords: Planning, Scheduling and Coordination, AI-Based Methods, Intelligent and Flexible Manufacturing
Abstract: The autonomous vehicle storage and retrieval system (AVS/RS) with multi-level rack has good application prospects due to its excellent throughput capacity and flexibility. This paper presents a mathematical model of the inbound and outbound job scheduling problem in AVS/RS with multi-level rack based on the parallel work of lifts, vehicles and telescopic arms, where the optimization objective is to minimize the makespan. This paper proposes a novel hybrid heuristic algorithm based on Harris Hawk Optimization (HHO) and Adaptive Large Neighborhood Search (ALNS) to solve the optimization problem. The study conducts numerical experiments to verify the effectiveness of the proposed algorithm. For small-scale problems, the mathematical model is solved by Gurobi, and the results show that the proposed algorithm can obtain a solution with an average deviation of less than 3% from the exact solution in a short time. For medium-scale and large-scale problems, the proposed algorithm is more efficient and stable compared to other heuristics. This research has practical implications for optimizing the sequence of operations and improving the work efficiency of AVS/RS with multi-level rack.
|
|
15:57-16:15, Paper TuCT10.5 | |
Data Analytics for Automated Feature Extraction and Anomaly Detection in a Class of Industrial Manufacturing Process (I) |
|
Zhu, Tianyu | University of Connecticut |
Bai, Yishu | University of Connecticut |
Tu, Jiachen | University of Connecticut |
Zhang, Liang | University of Connecticut |
Keywords: Failure Detection and Recovery, Probability and Statistical Methods, Cyber-physical Production Systems and Industry 4.0
Abstract: Industry 4.0/Smart Manufacturing is transforming the manufacturing industry through the integration of technologies such as the Internet of Things (IoT), big data, and cloud computing. These advancements have revolutionized the way production is carried out, leading to the generation of huge amounts of data in a more affordable and accessible manner, which makes it possible to detect anomalies. By identifying anomalies in this data, manufacturers can detect potential faults or issues in the production process early. This early detection can help prevent costly downtime, reduce waste, and improve product quality, ultimately leading to better customer satisfaction and profitability. Given the significant potential benefits, we have collected data from a local factory and launched a research initiative to explore how anomaly detection can help small and medium-sized manufacturers adopt Industry 4.0 technologies. This work addresses this challenge for a key production operation in a local manufacturing plant. In particular, sensors were deployed to continuously measure a critical process variable of the equipment for this operation. Using the sensor data, we develop robust algorithms that effectively extract the main features and detect anomalies that occurred during the production process. Numerical experiments demonstrate that the proposed method can detect the anomalies with high accuracy.
|
|
TuDT1 |
Room T1 |
Model Predictive Control |
Regular Session |
Chair: Mamaev, Ilshat | Proximity Robotics & Automation GmbH |
|
16:30-16:48, Paper TuDT1.1 | |
Improving Feasibility and Safety of Nonlinear MPC with Control Barrier Function Via Learning-Based Non-Convex Reachable Sets |
|
Tang, Yucheng | University of Applied Sciences Karlsruhe |
Chen, Tao | Karlsruher Institut Für Technologie |
Hein, Björn | Karlsruhe University of Applied Sciences |
Mamaev, Ilshat | Proximity Robotics & Automation GmbH |
Keywords: Optimization and Optimal Control, AI-Based Methods, Motion Control
Abstract: This paper proposes a learning-based reachability-aware (RA) nonlinear model predictive control (NMPC) framework with discrete control barrier functions (DCBF) to improve safety and feasibility for complex, nonlinear systems, specifically demonstrated on a tractor-trailer system. Traditional NMPC-CBF methods often encounter feasibility issues due to the potential empty intersection between the reachable set and the safe region confined by CBF constraints. Additionally, standard reachability analysis methods rely on oversimplified approximations of reachable sets. To address these challenges, we introduce a novel learning-based approach leveraging sampling-based reachability analysis combined with Alpha Shapes, accurately capturing non-convex reachable sets. A multilayer perceptron (MLP) neural network is trained offline to efficiently approximate the infeasible reachable region online. By integrating this learned approximation directly into the NMPC's objective function, the proposed method significantly enhances optimization feasibility without compromising safety guarantees. Numerical experiments validate the approach, demonstrating better constraint handling, and enhanced maneuverability and feasibility compared to conventional convex hull-based RA-NMPC-DCBF and slack variable-based NMPC.
|
|
16:48-17:06, Paper TuDT1.2 | |
Real-Time Optimal Energy Management of Dual-Motor Battery Electric Vehicles Using MPC on an Automotive-Grade Microcontroller |
|
Kim, Jiwon | Inha University |
Gwon, Minwoo | Inha University |
Heo, Jae | Inha University |
Kim, Kwangki | Inha University |
Keywords: Optimization and Optimal Control, Plug-in Electric Vehicles, Power and Energy Systems automation
Abstract: This paper presents a real-time energy management strategy for dual-motor Battery Electric Vehicles (BEVs) using Nonlinear Model Predictive Control (NMPC) and Linear Model Predictive Control (LMPC) frameworks. The energy management problem is formulated as Nonlinear Programming (NLP) and Quadratic Programming (QP), depending on the nonlinearity or linearity of the mechanical motor forces in the electric power model. The linear battery power model serves as a model reduction approach, enabling real-time model-based optimal control. To validate real-time feasibility in an embedded environment, a Processor-in-the-Loop Simulation (PiLS) is conducted by implementing LMPC on a 32-bit MCU-based system (Teensy 4.1 with an ARM Cortex-M7 processor). Simulation results demonstrate that NMPC achieves 99% of the performance of Dynamic Programming (DP), which serves as the global reference solution, while LMPC achieves 98%, confirming the effectiveness of the proposed approach. Furthermore, LMPC successfully meets real-time computational constraints, demonstrating its feasibility for practical vehicle implementation.
|
|
17:06-17:24, Paper TuDT1.3 | |
Parameter-Regression MPC (PaReMPC): Experiments and Extensions of a Broadly Adaptive Approach for SISO Robotic Applications |
|
Annem, Vivek | Florida State University |
Hubicki, Christian | Florida State University |
Keywords: Optimization and Optimal Control, Robust/Adaptive Control, Motion and Path Planning
Abstract: Model learning combined with model predictive control (MPC) can be an effective control approach for adapting to a wide range of unknown system dynamics or changes thereof. Specifically, we take a parameter regression approach to maximize the adaptation rate during control operation. Here we present this approach as Parameter Regression MPC (PaReMPC) with an N-integrator plant model and demonstrate the approach in two hardware experiments. First, in a loaded motor experiment, PaReMPC achieved high-fidelity tracking within 0.5 seconds; second, it stabilized a ball-balancing robot. Both experiments started with poor initial parameters. However, this learn-as-you-go approach is inherently greedy and creates an unstable feedback loop with the controller, particularly when nonminimum phase zeros are present. In numerical experiments, we benchmark the ability of PaReMPC to adapt to a range of SISO LTI systems against scenario tree dual control methods (i.e., ST-SMPC) which infer probabilities of candidate modes. In addition, we combine PaReMPC with scenario trees to create a novel ST-PaReMPC approach. Our numerical simulations found that PaReMPC attains the lowest-cost performance in cases of low-noise or non-changing dynamics, but the novel ST-PaReMPC performs best with high noise and changing system dynamics.
|
|
17:24-17:42, Paper TuDT1.4 | |
Multi-Agent Feedback Motion Planning Using Probably Approximately Correct Nonlinear Model Predictive Control |
|
Gonzales, Mark | Johns Hopkins University |
Polevoy, Adam | Johns Hopkins University Applied Physics Lab |
Kobilarov, Marin | Johns Hopkins University |
Moore, Joseph | Johns Hopkins University |
Keywords: Optimization and Optimal Control, Autonomous Agents, Agent-Based Systems
Abstract: For many tasks, multi-robot teams often provide greater efficiency, robustness, and resiliency. However, multi-robot collaboration in real-world scenarios poses a number of major challenges, especially when dynamic robots must balance competing objectives like formation control and obstacle avoidance in the presence of stochastic dynamics and sensor uncertainty. In this paper, we propose a distributed, multi-agent receding-horizon feedback motion planning approach using Probably Approximately Correct Nonlinear Model Predictive Control (PAC-NMPC) that is able to reason about both model and measurement uncertainty to achieve robust multi-agent formation control while navigating cluttered obstacle fields and avoiding inter-robot collisions. Our approach relies not only on the underlying PAC-NMPC algorithm but also on a terminal cost-function derived from gyroscopic obstacle avoidance. Through numerical simulation, we show that our distributed approach performs on par with a centralized formulation, that it offers improved performance in the case of significant measurement noise, and that it can scale to more complex dynamical systems.
|
|
17:42-18:00, Paper TuDT1.5 | |
Safety-Aware Robust Model Predictive Control for Robotic Arms in Dynamic Environments |
|
Nam, Sanghyeon | Korea Institute of Industrial Technology |
Kim, Dongmin | Daegu Mechatronics Materials Institute |
Choi, Seung-Hwan | Korea Institute of Industrial Technology |
Kim, Chang-Hyun | Korea Institute of Industrial Technology |
Kwon, Hyoeun | Daegu Gyeongbuk Institute of Science & Technology, Korea Institu |
Kawamoto, Hiroaki | University of Tsukuba |
Lee, Suwoong | Korea Institute of Industrial Technology |
Keywords: Collision Avoidance, Robust/Adaptive Control, Motion and Path Planning
Abstract: Robotic manipulators are essential for precise industrial pick-and-place operations, yet planning collision-free trajectories in dynamic environments remains challenging due to uncertainties such as sensor noise and time-varying delays. Conventional control methods often fail under these conditions, motivating the development of Robust MPC (RMPC) strategies with constraint tightening. In this paper, we propose a novel RMPC framework that integrates phase-based nominal control with a robust safety mode, allowing smooth transitions between safe and nominal operations. Our approach dynamically adjusts constraints based on real-time predictions of moving obstacles (whether human, robot, or other dynamic objects), thus ensuring continuous, collision-free operation. Simulation studies demonstrate that our controller improves both motion naturalness and safety, achieving faster task completion than conventional methods.
|
|
TuDT2 |
Room T2 |
Autonomous and Software-Defined Factory 2 |
Special Session |
Chair: Lee, Chia-Yen | National Taiwan University |
Organizer: Lee, Chia-Yen | National Taiwan University |
Organizer: Hsu, Chia-Yu | National Tsing Hua University |
Organizer: Fan, Shu-Kai S. | National Taipei University of Technology |
Organizer: Ju, Feng | Arizona State University |
Organizer: Blue, Jakey | National Taiwan University |
Organizer: Jang, Young Jae | Korea Advanced Institute of Science and Technology |
Organizer: Skoogh, Anders | Chalmers University of Technology |
Organizer: Lugaresi, Giovanni | KU Leuven |
|
16:30-16:48, Paper TuDT2.1 | |
Reinforcement Learning and Robust Optimization for T/C Balance Scheduling in TFT-LCD Cell Manufacturing (I) |
|
Hong, Tzu-Yen | National Taipei University of Technology |
Lu, Kuan-Chun | National Taiwan University |
Chu, Chia-Fan | National Taiwan University |
Lee, Chia-Yen | National Taiwan University |
Keywords: Planning, Scheduling and Coordination, Manufacturing, Maintenance and Supply Chains, Intelligent and Flexible Manufacturing
Abstract: This study aims to deal with the TFT-LCD cell process scheduling as a dynamic flexible job shop scheduling problem (DFJSS). Focusing on balancing production between Thin-Film Transistor (TFT) array and Color Filter (CF) substrates, new job arrivals, and uncertain processing times, a deep reinforcement learning (DRL) framework and robust optimization are proposed to optimize multiple objectives, including total weighted tardiness, makespan, over-queued time, and T/C balance. An empirical study of one leading TFT-LCD manufacturer conducted with several experiments shows the effectiveness and robustness of the proposed DRL framework in dynamic manufacturing scenarios.
|
|
16:48-17:06, Paper TuDT2.2 | |
Interpretable Domain-Adaptive Physics-Informed Neural Network for Correcting Model Misspecifications (I) |
|
Hong, Rui-Qian | National Taiwan University |
Hung, Yu-Hsin | National Taiwan University |
Lee, Chia-Yen | National Taiwan University |
Keywords: Deep Learning in Robotics and Automation, Manufacturing, Maintenance and Supply Chains, Factory Automation
Abstract: Dynamic systems describe environments in which variables evolve over time according to mathematical equations, typically formulated as ordinary differential equations (ODEs) or partial differential equations (PDEs). Traditional models require extensive domain expertise, whereas ML approaches like Sparse Identification of Nonlinear Dynamics (SINDy) and Physics-Informed Neural Networks (PINNs) offer a data-driven alternative. However, challenges like interpretability and scarce data remain. Previous studies have shown that integrating neural networks as correction terms can mitigate model misspecifications, though methods based solely on collocation struggle with complex derivative terms. This study proposes a Domain-Adaptive Physics-Informed Neural Network (DAPINN) framework that enhances PINNs by employing a correction model that learns solely from discrepancies between PINN outputs and governing equations. We pre-train PINNs on similar phenomena, fine-tune with an alternating update strategy, and employ symbolic regression to extract interpretable correction terms.
|
|
17:06-17:24, Paper TuDT2.3 | |
Unsupervised Domain Adaptation for Defect Detection Based on Test-Time Feature Caching Network and Empirical Study in Multilayer Ceramic Capacitors Manufacturing (I) |
|
Lu, Yi-Wei | National Taipei University of Technology |
Hsu, Chia-Yu | National Taiwan University of Science and Technology |
Keywords: Computer Vision in Automation, AI-Based Methods, Machine learning
Abstract: Defect detection in Multi-Layer Ceramic Capacitors (MLCCs) is critical for ensuring the reliability of electronic devices. However, traditional inspection methods and CNN-based models struggle with domain shifts caused by variations in image characteristics across production environments, leading to degraded detection accuracy. Addressing this gap, this study proposes a Test-Time Feature Caching Network (TTFCN), a novel unsupervised domain adaptation framework designed to enhance defect detection without requiring labeled target domain data or costly retraining. TTFCN introduces a feature caching mechanism and advanced augmentation strategies to learn domain-invariant features and generate accurate pseudo-labels during the adaptation phase. The framework was validated on real-world MLCC production data from different branches, demonstrating superior performance in defect detection accuracy compared to traditional and state-of-the-art methods. Empirical results confirm that TTFCN effectively mitigates domain shifts, reduces model adaptation time, and minimizes production downtime. This study contributes a scalable solution for improving MLCC inspection and offers insights for broader applications in dynamic manufacturing environments.
|
|
17:24-17:42, Paper TuDT2.4 | |
Deploying Vision Retrieval Augmented Generation As Assistant for xPPU Maintenance (I) |
|
Höfgen, Josua | Technical University of Munich |
Tran, Nguyen Thi Huyen | Program on Semiconductor Packaging and Testing And, Academy of I |
Vogel-Heuser, Birgit | Technical University Munich |
Chen, Po-Jui | National Cheng Kung University |
Wilch, Jan | Technical University of Munich |
Cheng, Fan-Tien | National Cheng Kung University |
Hsieh, Yu-Ming | National Cheng Kung University |
Keywords: Agent-Based Systems, AI-Based Methods, Factory Automation
Abstract: The maintenance and operator training of manufacturing systems face challenges due to the overwhelming volume of system-specific documents, which require extensive memorization. While large language models have shown the capability to support operators, their lack of knowledge of machine-specific information limits their out-of-the-box applicability. To address this, Retrieval-Augmented Generation has been introduced as a solution for large-scale document retrieval in various industries, including automation and manufacturing. However, processing visual elements such as images, graphs, and diagrams within these documents remains a significant hurdle. This paper proposes the integration of Visual Retrieval-Augmented Generation (VRAG) into a maintenance support system for the lab-sized manufacturing system extended Pick and Place Unit (xPPU). The deployed VRAG incorporates advanced Chain-of-Thought (CoT) reasoning. The system's evaluation framework is designed to assess performance across various stages, optimizing both indexing and retrieval. Results demonstrate that VRAG significantly improves maintenance efficiency, providing a robust solution for handling a wide range of queries that appear in typical maintenance tasks. The contributions of this work include the successful application of VRAG to the xPPU system, the development of comprehensive evaluation metrics involving experts familiar with the xPPU system, and the integration of agentic capabilities for improved query resolution.
|
|
17:42-18:00, Paper TuDT2.5 | |
Blockchain-Enabled Decentralized Carbon Trading Framework for Manufacturing: Enhancing Offset Efficiency and Fairness Via Emission Error Analysis (I) |
|
Hung, Min-Hsiung | Chinese Culture University |
Kuo, Wei-Chuan | Institute of Manufacturing Information and Systems, National Che |
Lin, Yu-Chuan | Chinese Culture University |
Chen, Chao-Chun | National Cheng Kung University |
Tieng, Hao | National University of Tainan |
Cheng, Fan-Tien | National Cheng Kung University |
Keywords: Sustainable Production and Service Automation, Sustainability and Green Automation, Intelligent and Flexible Manufacturing
Abstract: This paper presents an Emission Error-based Decentralized Carbon Trading (EEDCT) framework, leveraging blockchain technology to enhance the carbon offset efficiency and fairness of carbon trading for manufacturing enterprises. EEDCT integrates carbon trading reputation values, emission errors, and transaction credibility to optimize trade matching and pricing. The key contributions include: (1) reducing verification costs by 27% through an ISO 14064-based Automated Verification and Quantification Scheme (IAVQS), minimizing full-scale inspections; (2) accurately estimating emissions with only 50% of data via a Selective High-Accuracy Emission Estimation (SHAEE) mechanism, refining estimates with weighted averaging; and (3) boosting carbon neutrality to 80% and doubling fairness, prioritizing low-error, high-credibility enterprises. Our blockchain-based EEDCT framework can reduce carbon verification costs and enhance the carbon neutrality rate in carbon trading, and it shows promise for promoting fair and efficient carbon trading.
|
|
TuDT3 |
Room T3 |
Large Language/Foundation Models 3 |
Regular Session |
Chair: Clever, Debora | ABB Corporate Research Center |
|
16:30-16:48, Paper TuDT3.1 | |
FDR-Net: Foundation Model-Driven Depth Restoration for Transparent Object Grasping |
|
Kwon, Taeyong | Korea Institute of Science and Technology |
Baek, Seung | Korea University |
Kim, KangGeon | Korea Institute of Science and Technology |
Keywords: Deep Learning in Robotics and Automation, Computer Vision in Automation, Industrial and Service Robotics
Abstract: Transparent objects are commonly found in our surroundings and are frequently handled in automated systems such as smart factories and laboratories. Enabling robots to grasp and manipulate these objects opens up various automation possibilities. However, the unique reflection and refraction of light on the surface of transparent objects make it challenging to recognize depth information with a commercial depth camera, leading to the failure of most grasping algorithms that heavily rely on depth information. In this work, we address this challenge by transferring knowledge from the foundation model trained on a large-scale dataset to the depth restoration model. The foundation model extracts RGB features while keeping its pre-trained weights frozen. The extracted RGB features are then densely fused with the features extracted from the inaccurate depth image, and finally decoded to generate an accurate depth image. Comparative experiments with baseline methods demonstrate that our method has superior and more generalizable performance. Real robot experiments show that our method is also applicable in real environments and, when applied, improves the success rate of grasping algorithms for transparent objects.
|
|
16:48-17:06, Paper TuDT3.2 | |
Learning from SAM: Harnessing a Foundation Model for Sim2Real Adaptation by Regularization |
|
Bonani, Mayara | Autonomous Intelligent Systems Group of University of Bonn, Germ |
Schwarz, Max | University Bonn |
Behnke, Sven | University of Bonn |
Keywords: Deep Learning in Robotics and Automation, Computer Vision in Automation, Simulation and Animation
Abstract: Domain adaptation is especially important for robotics applications, where target domain training data is usually scarce and annotations are costly to obtain. We present a method for self-supervised domain adaptation for the scenario where annotated source domain data (e.g. from synthetic generation) is available, but the target domain data is completely unannotated. Our method targets the semantic segmentation task and leverages a segmentation foundation model (Segment Anything Model) to obtain segment information on unannotated data. We take inspiration from recent advances in unsupervised local feature learning and propose an invariance-variance loss over the detected segments for regularizing feature representations in the target domain. Crucially, this loss structure and network architecture can handle overlapping segments and oversegmentation as produced by Segment Anything. We demonstrate the advantage of our method on the challenging YCB-Video and HomebrewedDB datasets and show that it outperforms prior work and, on YCB-Video, even a network trained with real annotations. Additionally, we provide insight through model ablations and show applicability to a custom robotic application.
|
|
17:06-17:24, Paper TuDT3.3 | |
A Unified Framework for Multi-Stage Decision Optimization with Deep Reinforcement Learning and Foundation Models |
|
Wang, Qinghao | Peking University |
Jiang, Jinyang | Peking University |
Liu, Xiaotian | Georgia Institute of Technology |
Ren, Tao | Peking University |
Zheng, Yi | Peking University |
Zhang, Cheng | Peking University |
Yang, Yaodong | Peking University |
Peng, Yijie | Peking University |
Keywords: Inventory Management, Manufacturing, Maintenance and Supply Chains
Abstract: In today’s highly coordinated industries with rapid information sharing, multi-stage management is pivotal for enhancing adaptability and optimizing profitability in dynamic and uncertain business environments. However, conventional heuristic approaches often struggle to capture dynamic interdependencies and adjust to real-time fluctuations. To address these limitations, we first employ Deep Reinforcement Learning (DRL) to optimize individual management stages, demonstrating its effectiveness over heuristics in isolated scenarios. Building on this foundation, we propose a novel Foundation Model for Management Decision-making (FMMD), a transformer-based foundation model integrated with DRL, thereby enabling unified cross-domain decision-making across production, inventory management, dynamic pricing, and recommendation. Experimental evaluations show that our method substantially outperforms existing approaches, underscoring its transformative potential. Through an end-to-end generative mechanism, FMMD holistically coordinates actions among interconnected domains, significantly boosting efficiency and adaptability. By unifying multiple management stages under an AI-driven framework, FMMD lays the foundation for the construction of a large management model capable of accommodating massive data, actions, and scales. Our framework highlights a new paradigm in business management and paves the way for future research in automation science and decision-making processes.
|
|
17:24-17:42, Paper TuDT3.4 | |
GenAI for Robot System Engineering |
|
Clever, Debora | ABB Corporate Research Center |
Tan, Ruomu | ABB AG |
Stuhlenmiller, Florian | ABB AG Corporate Research Center Germany |
Maczey, Sylvia | ABB AG |
Dai, Fan | ABB AG, Corporate Research Germany |
Keywords: Agent-Based Systems, Industrial and Service Robotics, Machine learning
Abstract: While there are several approaches to using generative artificial intelligence (GenAI) for a simplified generation of robot programs, only limited knowledge exists on the use of GenAI for robot system configuration. However, to support non-experts in deploying robots, this initial setup of the system is crucial. Therefore, an agent-based tool is proposed, combining large language models (LLMs) with retrieval augmented generation (RAG), few-shot learning and a set of available actions to generate configuration files in a desired format based on user input. A key component of the approach is a feedback loop that provides error messages from the robot controller to the AI agent, which can be exploited in a subsequent trial. A prototypical implementation of the agent-based approach is used to demonstrate several use cases. Although evaluation and analysis with additional use cases are needed to reach scientifically rigorous conclusions, the prototype showcases the high potential of such an approach for real-world applications, and problems and ideas for further development can be identified.
|
|
TuDT4 |
Room T4 |
Learning and Control 1 |
Regular Session |
Chair: Yang, Jung-Min | Kyungpook National University |
|
16:30-16:48, Paper TuDT4.1 | |
Learning Human-Robot Interactions in Perturbed Teleoperations |
|
Kebria, Parham | The University of Queensland |
Nahavandi, Saeid | Swinburne University of Technology |
Howe, Robert D. | Harvard University |
Keywords: Telerobotics and Teleoperation, Learning and Adaptive Systems, Deep Learning in Robotics and Automation
Abstract: Remotely controlling a robotic platform can be challenging, specifically in the presence of uncertainties between the human and robot. A major problem is that humans’ inputs to the system may not meet all task-specific criteria. For example, in teleoperated medical robotic applications, disrupted haptic signals (force feedback) can make it difficult for the human operator to command precise maneuvers, preventing the robot from maintaining stable and safe task execution. This paper addresses this issue by proposing an imitation learning strategy through human-robot interactions. Learning from successful task executions, the proposed methodology can condition teleoperation signals and human operator’s inputs to determine the legibility of the concurrent execution of the task. Utilizing collections of structured demonstration data of successful scenarios, the current study incorporates a well-established deep learning technique, a recurrent neural network with long short-term memory units, to achieve this goal. The developed framework learns optimum policies from expert demonstrations through human interactions. Once properly learned, the framework mimics the accomplished executions to protect and guarantee the performance in the case of disrupted teleoperation signals. The results demonstrate the effectiveness of the developed framework in performing teleoperation tasks under uncertainty-perturbed scenarios, tested over a long distance between Australia and the USA. Additionally, the developed framework enables the human operator to achieve a zero failure ratio, a critical factor in safety-sensitive applications such as remote clinical diagnosis.
|
|
16:48-17:06, Paper TuDT4.2 | |
Output-Feedback Corrective Control for Accommodating Perpetual State Damage in Asynchronous Sequential Machines |
|
Yang, Jung-Min | Kyungpook National University |
Kwak, Seongwoo | Pukyong National University |
Keywords: Discrete Event Dynamic Automation Systems, Diagnosis and Prognostics, Foundations of Automation
Abstract: A control-theoretic policy is addressed for tolerating permanent state faults occurring to asynchronous sequential machines (ASMs). The considered ASM suffers from permanent state faults which invalidate a portion of the machine’s state space. A structure of a state observer and output-feedback controller is proposed to diagnose and tolerate any occurrence of permanent state faults, while compensating for the closed-loop operation of the machine in a desirable way. The proposed configuration of fault diagnosis and fault-tolerant control is implemented on the field programmable gate array (FPGA) circuitry to validate the applicability and transplantability of the scheme.
|
|
17:06-17:24, Paper TuDT4.3 | |
Touch-To-Touch Translation - Learning the Mapping between Heterogeneous Tactile Sensing Technologies |
|
Grella, Francesco | University of Genova |
Albini, Alessandro | University of Oxford |
Cannata, Giorgio | University of Genova |
Maiolino, Perla | University of Oxford |
Keywords: Force and Tactile Sensing, Deep Learning in Robotics and Automation, Learning and Adaptive Systems
Abstract: The use of data-driven techniques for tactile data processing and classification has recently increased. However, collecting tactile data is a time-consuming and sensor-specific procedure. Indeed, due to the lack of hardware standards in tactile sensing, data must be collected for each different sensor. This paper considers the problem of learning the mapping between two tactile sensor outputs with respect to the same physical stimulus; we refer to this problem as touch-to-touch translation. In this respect, we propose two data-driven approaches to address this task and compare their performance. The first one exploits a generative model developed for image-to-image translation and adapted for this context. The second one uses a ResNet model trained to perform a regression task. We validated both methods using two completely different tactile sensors: a camera-based sensor, Digit [1], and a capacitance-based sensor, CySkin [2]. In particular, we used Digit images to generate the corresponding CySkin data. We trained the models on a set of tactile features that can be found in common larger objects and we performed the testing on a previously unseen set of data. Experimental results show the possibility of translating Digit images into the CySkin output by preserving the contact shape and with an error of 15.18% in the magnitude of the sensor responses.
|
|
17:24-17:42, Paper TuDT4.4 | |
Modeling and Control of a Fan-Driven Ball Levitation System |
|
Pauyac Estrada, Claudia Rosa | Oklahoma State University |
Sheng, Weihua | Oklahoma State University |
Keywords: Control Architectures and Programming, Sensor-based Control
Abstract: This paper presents the design, modeling, and control of a fan-driven ball levitation system for engineering education in control and embedded computing. The closed-loop control system for ball levitation generates a vertical upward airflow from a DC fan to suspend a lightweight ball in the air at a specific height. A mathematical model of the system is developed. After examining the Magnus effect, we propose the implementation of a honeycomb structure upstream of the fan to promote laminar airflow, thereby reducing the influence of the Magnus effect. Simulation and experimental results show that this modification improved the efficiency and smoothness of the position control, mitigated the disturbances, and enhanced the overall stability of the system.
|
|
17:42-18:00, Paper TuDT4.5 | |
Enhancing Stability Via Adaptive Torque Control for a Four-Wheeled Omniwheel Ballbot |
|
Liu, Yixiao | University of Illinois at Urbana-Champaign |
Han, Tianyi | University of Illinois at Urbana-Champaign |
O'Reilly, Katherine | Columbia University, University of Illinois at Urbana Champaign |
Banks, Keona | University of Illinois Urbana-Champaign |
Ramos, Joao | University of Illinois at Urbana-Champaign |
Hsiao-Wecksler, Elizabeth T. | University of Illinois at Urbana-Champaign |
Keywords: Sensor-based Control, Optimization and Optimal Control, Robust/Adaptive Control
Abstract: Traditional omniwheel-based ballbots typically use three omniwheels for locomotion, offering high maneuverability but limited stability and load capacity. Introducing a fourth wheel can address these limitations; however, it also makes torque distribution underdetermined, requiring an additional constraint to distribute motor torques effectively. This paper presents a normal-force-based torque distribution strategy for a four-wheeled omniwheel ballbot. The proposed method leverages real-time normal force measurements to dynamically adjust torque limits, reducing wheel slippage and improving stability. The effectiveness of the proposed strategy is evaluated through recovery angle and braking experiments. Compared to the baseline symmetric torque distribution strategy, incorporating normal force feedback reduces wheel slippage by 43–74% during recovery and by 76% during braking. These improvements demonstrate the efficacy of the normal-force-based torque distribution strategy in enhancing stability.
|
|
TuDT5 |
Room T5 |
Sustainability for Production and Service Systems 2 |
Special Session |
Chair: Ju, Feng | Arizona State University |
Organizer: Yan, Chao-Bo | Xi'an Jiaotong University |
Organizer: Pei, Zhi | Zhejiang University of Technology |
Organizer: Ju, Feng | Arizona State University |
Organizer: Li, Congbo | Chongqing University |
Organizer: Wang, Junfeng | Huazhong University of Science and Technology |
Organizer: Yu, Chunlong | Tongji University |
|
16:30-16:48, Paper TuDT5.1 | |
Simulation Budget Allocation across Iterations in Iterative Local Search: A Stopping Rule-Based Approach (I) |
|
Sun, Yi | Tongji University |
Yu, Chunlong | Tongji University |
Keywords: Probability and Statistical Methods, Discrete Event Dynamic Automation Systems, Simulation and Animation
Abstract: Simulation optimization has a wide range of applications, with simulation budget allocation serving as a critical method for improving efficiency. In many studies on simulation budget allocation for iterative search, the primary focus is on budget allocation within iterations, i.e., splitting the budget among neighborhood solutions, while allocation across iterations often relies on simple strategies such as equal distribution or gradual increase. This study explores the potential for improving search performance by determining the budget allocated to each iteration in a more effective manner. To achieve this, we propose a stopping rule based on the posterior probability of correct move for the classical local search. Numerical experiments were conducted on a custom-designed grid-based search problem and a buffer allocation problem in an engine cylinder head assembly line. The results demonstrate that the proposed method enhances the performance of local search.
|
|
16:48-17:06, Paper TuDT5.2 | |
Fast Multi-Agent Path Planning with Turn Actions: A Priority Inheritance Approach (I) |
|
Tao, Zheyu | Tongji University |
Yu, Chunlong | Tongji University |
Keywords: Motion and Path Planning, Collision Avoidance, Agent-Based Systems
Abstract: Multi-Agent Path Finding (MAPF) has found wide application in real-world scenarios. However, the classical MAPF action models often overlook the kinematic constraints of real agents, which require in-place turns due to their orientations. Existing algorithms for MAPF with turn actions typically suffer from low computational efficiency and limited scalability, primarily due to the significantly larger state space. In this paper, we propose a priority inheritance approach that efficiently generates valid solutions in high-agent-density environments, significantly reducing computation time while maintaining acceptable solution quality. The proposed approach can solve instances with up to 1000 agents in seconds, making it well-suited for scenarios requiring rapid responses and high scalability.
|
|
17:06-17:24, Paper TuDT5.3 | |
Modeling and Analysis of Shared Buffers in Small Production Systems (I) |
|
Dong, Heng | Tsinghua University |
Fan, Zhenghao | Tsinghua University |
Li, Jingshan | Tsinghua University |
Keywords: Sustainable Production and Service Automation, Intelligent and Flexible Manufacturing, Factory Automation
Abstract: This abstract introduces ongoing research on modeling and analyzing production systems with shared buffers. For small systems with four machines, a system-theoretic method is introduced to analyze line performance. Specifically, a discrete Markov chain model is presented to evaluate the productivity of three-machine lines, serving as a building block for analyzing longer lines. Then a decomposition and iteration-based method is introduced to approximate the performance of four-machine lines. This work lays the groundwork for developing quantitative tools to study larger systems with shared buffers.
|
|
17:24-17:42, Paper TuDT5.4 | |
Challenges to Robotic Sorting at Materials Recovery Facilities (I) |
|
Haldankar, Tanmay Neelesh | University of Maryland College Park, National Institute of Stand |
Schumacher, Kelsea | NIST |
Harrison, William | National Institute of Standards and Technology |
Keywords: Sustainability and Green Automation, Sustainable Production and Service Automation, Industrial and Service Robotics
Abstract: In 2018, the United States generated 292 million tons of municipal solid waste (MSW), which is composed of everyday waste items discarded by consumers. The 2018 waste management statistics, reported by the U.S. Environmental Protection Agency (EPA), indicate a large loss of resources and call for increased material recovery and recycling to bolster domestic supply chains. Materials recovery facilities (MRFs) are processing facilities where recyclable material separated by consumers from MSW (source-separated material) is taken to be sorted for further processing. Feedstock contamination is a significant challenge faced at MRFs, and to minimize disruptions and remove contamination, MRFs rely on manual sorters. This reliance can slow down operations and lead to higher labor costs, as well as worker health and safety risks. These persistent challenges have made material sorting at MRFs an increasingly popular application space for robotic system development. Despite their potential, significant barriers exist to the integration of robotic systems in MRFs. The objective of our study is to assess the current state of robotic sorters deployed at MRFs and the challenges they face, through industry interviews and literature reviews. This assessment aims to inform future research and enhance material recovery and recycling in the U.S.
|
|
17:42-18:00, Paper TuDT5.5 | |
Optimal Scheduling in Aluminum Melting and Die-Casting Systems for On-Time Delivery and Energy Savings (I) |
|
Zhong, Hanqing | Tsinghua University |
Fan, Zhenghao | Tsinghua University |
He, Shenghua | Shuyi Link Company |
He, Xiaoheng | Shuyi Link Company |
Li, Jingshan | Tsinghua University |
|
|
TuDT6 |
Room T6 |
Medical Applications 2 |
Regular Session |
Chair: Ichnowski, Jeffrey | Carnegie Mellon University |
|
16:30-16:48, Paper TuDT6.1 | |
Exploiting Physical Human-Robot Interaction to Provide a Unique Rolling Experience with a Riding Ballbot |
|
Xiao, Chenzhang | University of Illinois at Urbana-Champaign |
Song, Seung Yun | University of Illinois at Urbana-Champaign |
Chen, Yu | University of Illinois at Urbana-Champaign |
Mansouri, Mahshid | University of Illinois at Urbana-Champaign |
Bleakney, Adam | University of Illinois |
Ramos, Joao | University of Illinois at Urbana-Champaign |
Norris, William | University of Illinois Urbana-Champaign |
Hsiao-Wecksler, Elizabeth T. | University of Illinois at Urbana-Champaign |
Keywords: Physically Assistive Devices, Human Factors and Human-in-the-Loop, Motion Control
Abstract: This study introduces the development of hands-free control schemes for a riding ballbot that was previously developed to allow riders, including manual wheelchair users, to control its movement through torso motion. The hardware platform, Personal Unique Rolling Experience (PURE), utilizes a ballbot drivetrain, a dynamically stable robot that uses a ball as its wheel to provide omnidirectional movement. To accommodate users with varying torso motion functions, the control scheme should be adjustable based on the rider's torso function and personal preferences. Therefore, concepts of impedance and admittance control were integrated into the existing control scheme. A duo-agent optimization-based simulation framework was utilized to assess the efficiency of this rider-ballbot system for a critical task: braking from 1.4 m/s. The candidate control schemes were further implemented in the physical robot hardware and validated with two experienced users, demonstrating the efficiency and robustness of the hands-free admittance control scheme (HACS). This interface, which utilized physical human-robot interaction (pHRI) as the input, resulted in lower braking effort and shorter braking distance and time. Twelve novice participants, six able-bodied individuals (ABI) and six manual wheelchair users (mWCU), with different levels of torso functions, were then recruited to benchmark the braking performance with HACS. They successfully finished the braking task and achieved similar performance compared with experienced users. By exploiting pHRI, the proposed admittance-style control scheme provided effective control of the ballbot via torso motions. This interface enables PURE to provide a personal unique rolling experience to ABI and mWCU for safe and agile indoor navigation.
|
|
16:48-17:06, Paper TuDT6.2 | |
Hospital Trial Results for the Acceptability of an Adaptive Robot Nursing Assistant |
|
SharafianArdakani, Payman | UofL |
Kondaurova, Irina | UofL |
Ashary, Ali | University of Louisville |
Zhang, Nancy | UofL |
Logsdon, M Cynthia | University of Louisville |
Walker, Mandi D. | UofL Health |
Popa, Dan | University of Louisville |
Keywords: Medical Robots and Systems, Telerobotics and Teleoperation, Robotics and Automation in Life Sciences
Abstract: The integration of robotic nursing assistants in healthcare settings has the potential to enhance patient care and alleviate staff workload. This study evaluates the feasibility and user acceptance of the Adaptive Robot Nursing Assistant (ARNA) in a hospital environment through two experimental scenarios: (1) Patient Sitter, where patients teleoperate the robot to retrieve and deliver objects using two interfaces, and (2) Nurse Walker, where ARNA assists with mobility through a shared control mechanism. A total of 10 patients and 5 nurses participated in the study, assessing ARNA’s usability based on the Technology Acceptance Model (TAM) framework. Results indicate that joystick control provided a more intuitive experience compared to the tablet interface, yielding higher ratings for perceived ease of use (EU) and behavioral intention (BI). Additionally, prior video-game experience (VGE) positively correlated with perceived usefulness (PU) and satisfaction (S). While TAM validation confirmed strong relationships between key acceptance factors, time to complete the tasks showed no significant differences across control methods. The findings emphasize the potential benefits of assistive robots in clinical settings, while also noting design challenges related to interface optimization and maneuverability, and providing insights into their effective integration.
|
|
17:06-17:24, Paper TuDT6.3 | |
ORB: Operating Room Bot, Automating Operating Room Logistics through Mobile Manipulation |
|
Qiu, Jinkai | Carnegie Mellon University |
Kim, Yungjun | Carnegie Mellon University |
Sethia, Gaurav | Carnegie Mellon University |
Agarwal, Tanmay | Carnegie Mellon University |
Ghodasara, Siddharth | Carnegie Mellon University |
Erickson, Zackory | Carnegie Mellon University |
Ichnowski, Jeffrey | Carnegie Mellon University |
Keywords: Medical Robots and Systems, Inventory Management, Health Care Management
Abstract: Efficiently delivering items to an ongoing surgery in a hospital operating room can be a matter of life or death. In modern hospital settings, delivery robots have successfully transported bulk items between rooms and floors. However, automating item-level operating room logistics presents unique challenges in perception, efficiency, and maintaining sterility. We propose the Operating Room Bot (ORB), a robot framework to automate logistics tasks in hospital operating rooms (OR). ORB leverages a robust, hierarchical behavior tree (BT) architecture to integrate diverse functionalities of object recognition, scene interpretation, and GPU-accelerated motion planning. The contributions of this paper include: (1) a modular software architecture facilitating robust mobile manipulation through behavior trees; (2) a novel real-time object recognition pipeline integrating YOLOv7, Segment Anything Model 2 (SAM2), and Grounded DINO; (3) the adaptation of the cuRobo parallelized trajectory optimization framework to real-time, collision-free mobile manipulation; and (4) empirical validation demonstrating an 80% success rate in OR supply retrieval and a 96% success rate in restocking operations. These contributions establish ORB as a reliable and adaptable system for autonomous OR logistics.
|
|
17:24-17:42, Paper TuDT6.4 | |
Improving Kinematic Accuracy of Laparoscopic Surgical Robots through Sub-Millimeter-Scale Kinematics |
|
Shkurti, Tom | Case Western Reserve University |
Cavusoglu, M. Cenk | Case Western Reserve University |
Keywords: Medical Robots and Systems, Calibration and Identification
Abstract: As automation in robotic surgery develops, the kinematics and kinematic accuracy of laparoscopic robots have become important prerequisites for development in the field. However, the kinematics of these robots remain inaccessible to purely analytic solutions, and manufacturing imprecision causes the kinematic parameters to vary from robot to robot, requiring calibration. In this paper, we present a novel iterative numerical kinematic algorithm which exploits the tool-and-arm structure of a typical surgical robot to provide arbitrarily precise inverse kinematic solutions with the use of an arbitrary number of calibration parameters. To our knowledge, this is the first inverse kinematic algorithm presented in the literature that specifically exploits the properties of laparoscopic surgical robots. We also present an improved calibration methodology to rapidly and reliably determine these parameters. Both components are evaluated on a da Vinci Surgical Unit in a direct accuracy measurement test involving precise contact with locations on a physical target. With a root-mean-squared accuracy of 2.54 mm and a maximum positioning error of 3.78 mm, our methods meet or exceed the state of the art in improving the accuracy of computer-controlled surgical robots.
|
|
17:42-18:00, Paper TuDT6.5 | |
Robotic Needle Steering by Concentric Beveled-Tip Tube and Wire |
|
Qi, Boshen | UC Riverside |
Vu, Matthew | University of California - Riverside |
Deuling, Ryan | University of California, Riverside |
Sheng, Jun | University of California Riverside |
Keywords: Medical Robots and Systems, Mechanism Design in Meso, Micro and Nano Scale
Abstract: This paper presents a novel steerable needle robot designed for minimally invasive surgery (MIS) in tissue environments. The robot consists of a concentric configuration of a tube and a wire, both of which are constructed from superelastic (SE) nitinol and have beveled tips. The key innovation lies in controlling the relative rotation angle between the beveled tips of the tube and wire to adjust the steering direction and curvature of the robot during insertion into soft tissues. Unlike classic beveled-tip needles requiring continuous rotation of the robot body, our design enables the control of the motion trajectory through occasional rotation of the nested tube and wire, enhancing safety and minimizing tissue damage. Through non-dimensional analytical modeling, we determined the robot's bending capability as a function of the design parameters, enabling us to select wires and tubes for a sufficiently large workspace and a minimal dead zone. Experimental evaluations with various combinations of beveled-tip angles validated the model predictions and design selections, demonstrating that the robot tip displacement significantly depends on the beveled tips and their relative rotation. The tested needle with 25° beveled tips at the wire and tube can achieve maximum tip displacement of 16.75 mm at an insertion of 70 mm when the beveled tips are aligned. The developed system demonstrated predictable and dexterous steering performance, highlighting its potential in MIS applications.
|
|
TuDT7 |
Room T7 |
Modeling and Control for Automation in Mfg 3 |
Special Session |
Chair: Barton, Kira | University of Michigan at Ann Arbor |
Organizer: Barton, Kira | University of Michigan at Ann Arbor |
Organizer: Balta, Efe | Inspire AG |
Organizer: Kovalenko, Ilya | Pennsylvania State University |
Organizer: Wang, Zi | University of Nottingham |
|
16:30-16:48, Paper TuDT7.1 | |
Leveraging Manufacturing Apps: A Modular Framework for Capturing & Packaging Manufacturing Process Logic (I) |
|
Ahmed, Mohammed Ismael | Institute for Advanced Manufacturing, University of Nottingham |
Chaplin, Jack | The University of Nottingham |
Sanderson, David | The University of Nottingham |
Ratchev, Svetan | The University of Nottingham |
Keywords: Software, Middleware and Programming Environments, Cyber-physical Production Systems and Industry 4.0, Intelligent and Flexible Manufacturing
Abstract: The manufacturing landscape is experiencing a paradigm shift, transitioning from an era of mass production to a more consumer-orientated mass customisation model. However, this transition and a broader shift in industry practices have introduced more sophisticated challenges. One significant challenge is the need for more modular and flexible software development methodologies to enable easily reconfigurable production lines. Compounding this is the lack of interoperability between different Original Equipment Manufacturers (OEMs) and the loss of shop-floor knowledge, which widens the skills gaps. To address these challenges, this paper presents Manufacturing Apps as a hardware-agnostic, digital solution capable of capturing manufacturing process logic and knowledge in a modular and reusable manner, enabling its effective reuse. By formalising the concept of Manufacturing Apps, this work provides general models and a methodology that enables developers to easily capture, reuse, and deploy process-specific knowledge. A robotic drilling process was selected as a case study, which formed the basis of the validation scenario presented.
|
|
16:48-17:06, Paper TuDT7.2 | |
A Knowledge-Graph Powered Recommender System for Large-Volume Metrology in Reconfigurable Manufacturing System Design (I) |
|
Wang, Zi | University of Nottingham |
Griffin, Joseph W | University of Nottingham |
Kendall, Peter | University of Nottingham |
Sanderson, David | The University of Nottingham |
Ratchev, Svetan | The University of Nottingham |
Keywords: Intelligent and Flexible Manufacturing, Cyber-physical Production Systems and Industry 4.0, Product Design, Development and Prototyping
Abstract: Concurrent design for modern flexible manufacturing systems demands multifaceted expertise, with early-stage consideration of metrology capabilities emerging as a critical area of research to maximize productivity and quality. While digital transformation has advanced other aspects of manufacturing system design, metrology capability remains underexplored, particularly in dynamic, reconfigurable environments. This paper introduces a knowledge graph-powered recommender system designed to represent metrology capabilities and the underlying knowledge for instrument selection. The knowledge graph database captures the metrology capability provided by three widely used large-volume metrology devices (laser trackers, laser radar, and photogrammetry systems), alongside their interfaces and measurement features. Selection queries based on product/part and process requirements are also proposed to guide system design decisions. The system's feasibility was tested across three industrial aerostructure assembly scenarios, with results compared against choices made by system engineers. The proposed recommender system demonstrated its effectiveness, with future development focusing on integration with Product Lifecycle Management (PLM) systems and enhanced simulation and optimization capabilities.
|
|
17:06-17:24, Paper TuDT7.3 | |
Determining Investment Opportunities and Directions for Artificial Intelligence and Machine Learning Applied to Digital Twin Technology in Semiconductor Manufacturing (I) |
|
Moyne, James | University of Michigan |
Jia, Xiaodong | University of Cincinnati |
Shi, Jing | University of Cincinnati |
Keywords: AI-Based Methods, Machine learning, Simulation and Animation
Abstract: The National Institute of Standards and Technology (NIST) established a United States (US)-based industrial Artificial Intelligence (AI) consortium to address the high-priority technical barriers in semiconductor manufacturing high-mix production and provide a roadmap for US manufacturers to improve overall competitiveness in a global marketplace. As part of this process, challenge categories were defined, with a key challenge category being “Digital Twin” (DT). In investigating this challenge category, it was determined that DTs are already being successfully leveraged in the industry for solutions such as predictive maintenance and model-based process control; however, significant opportunities exist if investment is focused on key AI and Machine Learning (ML) areas. This document is a position paper presenting challenges, potential solutions, and a roadmap for advancing DTs and associated AI in the manufacturing space. A key conclusion is the need for technical DT and DT framework standards, including a taxonomy and definition of interfaces. With these standards in place, opportunities exist for improvement of existing DT-driven capabilities, as well as realization of new capabilities, especially those resulting from the interoperability of DT classes throughout the manufacturing ecosystem.
|
|
17:24-17:42, Paper TuDT7.4 | |
A Large Language Model-Enabled Control Architecture for Dynamic Resource Capability Exploration in Multi-Agent Manufacturing Systems (I) |
|
Lim, Jonghan | Pennsylvania State University |
Kovalenko, Ilya | Pennsylvania State University |
Keywords: Agent-Based Systems, Factory Automation, AI-Based Methods
Abstract: Manufacturing environments are becoming more complex and unpredictable due to factors such as demand variations and shorter product lifespans. This complexity requires real-time decision-making and adaptation to disruptions. Traditional control approaches demonstrate limited responsiveness in dynamic industrial settings, highlighting the need for advanced control strategies capable of overcoming unforeseen challenges. Multi-agent systems address these challenges through decentralization of decision-making, enabling systems to respond dynamically to operational changes. However, current multi-agent systems encounter challenges related to real-time adaptation, context-aware decision-making, and the dynamic exploration of resource capabilities. Large language models provide the possibility to overcome these limitations through context-aware decision-making capabilities. This paper introduces a large language model-enabled control architecture for multi-agent manufacturing systems to dynamically explore resource capabilities in response to real-time disruptions. A simulation-based case study demonstrates that the proposed architecture improves system resilience and flexibility. The case study findings show improved throughput and efficient resource utilization compared to existing approaches.
|
|
17:42-18:00, Paper TuDT7.5 | |
Energy-Aware Model Predictive Control for Batch Manufacturing System Scheduling under Different Electricity Pricing Strategies (I) |
|
Li, Hongliang | The Pennsylvania State University |
Pangborn, Herschel | The Pennsylvania State University |
Kovalenko, Ilya | Pennsylvania State University |
Keywords: Planning, Scheduling and Coordination, Sustainability and Green Automation
Abstract: Manufacturing industries are among the highest energy-consuming sectors, facing increasing pressure to reduce energy costs. This paper presents an energy-aware Model Predictive Control (MPC) framework to dynamically schedule manufacturing processes in response to time-varying electricity prices without compromising production goals or violating production constraints. A network-based manufacturing system model is developed to capture complex material flows, batch processing, and capacities of buffers and machines. The scheduling problem is formulated as a Mixed-Integer Quadratic Program (MIQP) that balances energy costs, buffer levels, and production requirements. A case study evaluates the proposed MPC framework under four industrial electricity pricing schemes. Numerical results demonstrate that the approach reduces energy usage expenses while satisfying production goals and adhering to production constraints. The findings highlight the importance of considering the detailed electricity cost structure in manufacturing scheduling decisions and provide practical insights for manufacturers when selecting among different electricity pricing strategies.
|
|
TuDT8 |
Room T8 |
Autonomous Systems 1 |
Regular Session |
Chair: Oksanen, Timo | Technical University of Munich |
|
16:30-16:48, Paper TuDT8.1 | |
SCOUT: Spatiotemporal Coverage for Optimal Unmanned Tasking |
|
Peng, Xinghao | The Pennsylvania State University |
Liu, Runsang | Pennsylvania State University |
Yang, Hui | The Pennsylvania State University |
Keywords: Autonomous Agents, Logistics, Simulation and Animation
Abstract: Spatiotemporal heterogeneity in demand distribution poses a significant challenge for deployment and coverage control in unmanned aerial vehicle (UAV) tasking. Traditional methods typically assume uniform or static demand, overlooking spatial and temporal variations and ultimately leading to suboptimal deployment of UAVs. To address this shortcoming, this paper introduces Spatiotemporal Coverage for Optimal Unmanned Tasking (SCOUT), a method that begins by identifying high-demand areas and subsequently refines UAV locations through an iterative, gradient-based update. The resulting deployment and coverage control minimizes a weighted cost function that integrates spatial distances and demand density, thereby enhancing both resource accessibility and equity for the target areas. Simulated experiments show that SCOUT consistently outperforms 3D K-means and weighted Voronoi methods. Implementing a continuous deployment task further underscores the strong potential of the method for dynamic decision support in complex and rapidly changing environments.
|
|
16:48-17:06, Paper TuDT8.2 | |
Secure and Trustworthy Operation of a Differential Drive UGV Using Random Forest under Power Constraints |
|
Fraga Da Silva, Eduardo | Charles Sturt University and Amazon Web Services |
Santoso, Fendy | Charles Sturt University |
Zheng, Lihong | Charles Sturt University |
Keywords: Sensor-based Control, Failure Detection and Recovery, AI-Based Methods
Abstract: Command injection attacks represent a significant threat to autonomous unmanned ground vehicles (UGVs), enabling adversaries to manipulate vehicle behavior while evading traditional detection mechanisms. This paper introduces a novel physics-aware approach to detecting such attacks in differential-drive systems by leveraging the inherent electromechanical constraints that malicious commands typically violate. This research develops a framework that systematically integrates conventional command monitoring with physics-aware features derived from motor power relationships, including power balance, asymmetry, and current-velocity correlations. Experimental evaluation on a physical UGV demonstrates that the physics-enhanced detection achieves a 93% F1-score, representing a 45% reduction in missed attacks compared to conventional approaches. Feature importance analysis confirms that physics-aware features contribute 46% of the model's predictive power, with UGV power asymmetry and motor balance providing particularly strong signals for attack detection. The implementation requires minimal computational resources (3.57 ms processing time, 194 KB memory footprint), making it suitable for real-time deployment on resource-constrained platforms. By creating a security layer that reveals anomalous behavior through tracking UGV physical characteristics, this research advances the development of resilient autonomous systems that maintain operational integrity even under sophisticated attack scenarios. Therefore, this approach ultimately enhances the trustworthiness of automation in critical applications.
|
|
17:06-17:24, Paper TuDT8.3 | |
Modified Super-Twisting Sliding Mode Control for Trajectory Tracking of a Cargo Quadrotor |
|
Gomiero, Sara | Free University of Bolzano-Bozen |
von Ellenrieder, Karl Dietrich | Libera Universita Di Bolzano |
Keywords: Autonomous Agents, Motion Control, Robust/Adaptive Control
Abstract: Quadrotor Uncrewed Aerial Vehicles (UAVs) face several limitations, such as restricted flight time and vulnerability to disturbances, and require control algorithms able to handle complex and underactuated dynamics. Their control is even more challenging when a payload is carried. This paper addresses the trajectory tracking problem of a quadrotor UAV, which operates as a cargo drone transporting cable-suspended payloads. First, we derive a model of the cargo drone in matrix form. The payload is treated as a rigid body and the gyroscopic effects of the propellers are included in the formulation. Then, we design second order sliding mode controllers, using the Modified Super-Twisting Algorithm. The hierarchical control scheme consists of an outer position control loop and an inner attitude control loop. The coefficients of the sliding surfaces and the reaching laws are calculated via Lyapunov analysis, ensuring finite time convergence to zero of both sliding surfaces and their first order derivatives. The performance of the proposed controller is examined in simulations of a UAV carrying a payload while tracking a circular trajectory. In the simulation case studied, the integral of time multiplied by the absolute value of the tracking error is on average about 70% lower for the proposed second order sliding mode controller than for the first order sliding mode comparison controller.
|
|
17:24-17:42, Paper TuDT8.4 | |
Bringing Loop Closure to Event-Based SLAM: A Place Recognition Approach |
|
Mirhajianmoghadam, Hengameh | New Mexico State University |
Garcia Carrillo, Luis Rodolfo | New Mexico State University |
Keywords: Motion and Path Planning, Autonomous Agents, Computer Vision in Automation
Abstract: Event-based cameras excel at capturing visual information with high temporal resolution and low latency, making them well-suited for dynamic environments and low-light conditions where traditional cameras struggle. Several event-based visual odometry and SLAM methods for real-time motion estimation have been proposed to leverage these benefits. However, most of these approaches lack a loop closure detection mechanism, which is crucial for long-term localization and drift correction. To address this limitation, we propose an Event-Based Place Recognition Module that can integrate with different event-based SLAM frameworks to enhance loop closure detection. We evaluate our approach using event-based camera datasets in combination with a biologically-inspired RatSLAM framework adapted for event-based cameras, demonstrating its ability to detect loop closures effectively.
|
|
17:42-18:00, Paper TuDT8.5 | |
Advanced Path Tracking Controller for Follow Trails-Of-Tractor Guidance Objective without GNSS |
|
Hefele, Ruben | Technical University of Munich |
Oksanen, Timo | Technical University of Munich |
Keywords: Agricultural Automation, Autonomous Vehicle Navigation, Robotics and Automation in Life Sciences
Abstract: Automatic guidance systems are crucial in modern agriculture, enabling precise field operations such as planting, spraying, and harvesting. Current methods often rely on Global Navigation Satellite System (GNSS) technology, which can be limited by signal loss or inaccuracies. While GNSS-based tractor guidance is standard, combined tractor and trailer navigation is not widely adopted, and dependency on centimeter-level GNSS is a limitation. This paper introduces a novel guidance method that eliminates the need for GNSS, utilizing a limited set of sensors for state estimation, including trailer steer angle, tractor velocity, and tractor curvature feedback. The proposed approach calculates the tractor path based on past states, allowing the trailed implement to follow the tractor tracks with high accuracy. A feedback controller adjusts the implement steering based on computed lateral and angular deviation errors, using proportional gains. Additionally, a feed-forward component from the tractor curvature enhances system responsiveness. This paper presents the concept, controller design, simulation results, and real field experiments with full-size machinery.
|
|
TuDT9 |
Room T9 |
Detection, Estimation and Prediction 5 |
Regular Session |
Chair: Faieghi, Reza | Toronto Metropolitan University |
|
16:30-16:48, Paper TuDT9.1 | |
HR-PD&GCNI: A Highly Robust Elevator Button Detection and Recognition Method Based on Prior Distribution and GCN Inference for Autonomous Elevator Riding Robots |
|
Wang, Haoming | Xiangtan University |
Zhang, Dongbo | Xiangtan University |
Keywords: Computer Vision in Automation, Deep Learning in Robotics and Automation, Learning and Adaptive Systems
Abstract: Accurate detection and recognition of elevator buttons are essential for autonomous service robots. However, issues such as missed and false detections, caused by lighting, reflections, and damage, remain common in real-world environments. This paper presents a robust button detection and recognition method based on prior button layout characteristics and graph convolutional network (GCN) inference. Using an improved YOLOv8n+SPD model, we cluster row and column coordinates to construct a layout map, predicting missed buttons and improving recall. A GCN is then applied to model spatial relationships and correct optical character recognition (OCR) false recognitions, enhancing recognition accuracy. Experimental results on public datasets demonstrate that our method achieves state-of-the-art performance, showing strong robustness in practical applications.
|
|
16:48-17:06, Paper TuDT9.2 | |
An Unsupervised Time Series Anomaly Detection Approach for Efficient Online Process Monitoring of Additive Manufacturing |
|
Cantu, Frida | University of Texas Rio Grande Valley |
Ibarra, Salomon | University of Texas Rio Grande Valley |
Gonzalez, Arturo | UTRGV |
Barreda, Jesus | University of Texas Rio Grande Valley |
Liu, Chenang | Oklahoma State University |
Zhang, Li | University of Texas Rio Grande Valley |
Keywords: Additive Manufacturing
Abstract: Online sensing plays an important role in advancing modern manufacturing. The real-time sensor signals, which can be stored as high-resolution time series data, contain rich information about the operation status. One popular use of these signals is online process monitoring, which can be achieved by effective anomaly detection from the sensor signals. However, most existing studies rely heavily on label information for training supervised models. Although unsupervised approaches have been investigated, the use of unsupervised semantic segmentation on sensor data has mostly been limited to segmenting data to capture where new regimes or unexpected routines start. This can be useful in other applications, such as finding where a possible error might have occurred to change the flow of the data. To realize this, this work proposes a matrix profile-based unsupervised anomaly detection algorithm for additive manufacturing process monitoring, where little to no label information about the sensor data is given. We propose an effective algorithm for detecting anomalies in additive manufacturing processes. The effectiveness of the proposed method is demonstrated by experiments on real-world sensor data.
|
|
17:06-17:24, Paper TuDT9.3 | |
Computationally Efficient Moving Horizon Estimator for Linear Time-Invariant Systems |
|
Mobeen, Surrayya | Toronto Metropolitan University |
Yazdanshenas, Amin | Toronto Metropolitan University |
Faieghi, Reza | Toronto Metropolitan University |
Keywords: Optimization and Optimal Control
Abstract: This paper presents a computationally efficient formulation for cost computation in Moving Horizon Estimation (MHE) applied to Linear Time-Invariant (LTI) systems. MHE is a powerful method for state estimation in dynamic systems, but its performance can be hindered by the computational load associated with solving optimization problems at each time step. By leveraging the inherent properties of LTI systems, we derive an optimized cost function formulation that significantly reduces the computational complexity without compromising estimation accuracy. The proposed method is validated through multiple simulation scenarios, including quadrotor formation flight, demonstrating a 77% to 96% reduction in computational time compared to conventional MHE across various levels of system complexity. Our findings present promising results for enhancing the performance of MHE, especially in applications where computational resources are limited.
|
|
17:24-17:42, Paper TuDT9.4 | |
Stealthy False Data Injection (FDI) Attacks on Oil and Gas Smart Sensor Networks Using Wasserstein GAN with Gradient Penalty (WGAN-GP): Evaluation and Impact Analysis |
|
Sayghe, Ali | Yanbu Industrial College |
Keywords: Sensor Networks
Abstract: The oil and gas industry is increasingly reliant on smart sensor networks to monitor and control critical infrastructure. This dependence also increases vulnerability to cyberattacks, especially False Data Injection (FDI) attacks. This paper proposes a novel FDI attack model using Wasserstein GAN with Gradient Penalty (WGAN-GP) to generate hidden data modifications that evade traditional detection methods. We assess the effectiveness of the WGAN-GP-based attack in a simulated oil and gas smart sensor network, comparing it to conventional FDI attacks in terms of detection rate, impact score, and statistical similarity. The results show that WGAN-GP-based attacks are significantly harder to detect and cause more severe disruptions. This highlights the inadequacy of current detection mechanisms and underscores the need for enhanced protection methods to defend oil and gas infrastructure against advanced cyber threats.
|
|
17:42-18:00, Paper TuDT9.5 | |
Container Retrieval Sequence Uncertainty Quantification Using Calibrated Machine Learning Models |
|
Deng, Jianfeng | Shanghai Jiao Tong University |
Zhang, Zhanluo | Shanghai Jiao Tong University |
Qin, Wei | Shanghai Jiao Tong University |
Huang, Heng | Nezha Intelligent Technology (Shanghai) Co., Ltd |
Zhang, Chuanjie | Nezha Intelligent Technology (Shanghai) Co., Ltd |
Tian, Yu | Shanghai International Port (Group) Co., Ltd |
Xu, Dong | Shanghai International Port (Group) Co., Ltd |
Keywords: Calibration and Identification, Machine learning, Probability and Statistical Methods
Abstract: The Container Relocation Problem (CRP) has garnered significant attention from scholars. However, most studies addressing the CRP assume a deterministic container retrieval sequence, which deviates from real-world terminal operations and limits the practical applicability of such research. To address the uncertainty in container retrieval sequences, this study proposes a novel uncertainty quantification framework that integrates multi-class classification models with probabilistic calibration techniques. Experiments are conducted to compare two multi-class classification models and several probabilistic calibration strategies. In this study, we also propose a novel calibration strategy. The results demonstrate that the output of models calibrated with Isotonic Regression closely approximates true probabilities. Meanwhile, our newly proposed calibration strategy demonstrates potential as a complementary approach to Platt Scaling.
|
|
TuDT10 |
Room T10 |
Reinforcement Learning 3 |
Regular Session |
Chair: Zhang, Xiaotong | Massachusetts Institute of Technology |
|
16:30-16:48, Paper TuDT10.1 | |
Demonstration-Augmented Deep Reinforcement Learning for Robotic Task Optimization: A Framework for Enhanced Learning Efficiency and Precision |
|
Matour, Mohammad-Ehsan | Hochschule Mittweida, University of Applied Sciences |
Winkler, Alexander | Hochschule Mittweida, University of Applied Sciences |
Keywords: Reinforcement, Deep Learning in Robotics and Automation, Learning and Adaptive Systems
Abstract: Deep Reinforcement Learning (DRL) enables robots to learn complex tasks autonomously. However, its reliance on random exploration and sparse reward signals often leads to inefficient learning and suboptimal performance. This paper presents a framework that integrates expert-generated data into the model-free DRL training process to improve learning efficiency and task precision. The proposed method introduces expert demonstrations into the replay buffer during training and dynamically adjusts their sampling to facilitate a smooth transition from expert-guided learning to autonomous policy optimization. Unlike approaches that rely on pre-collected expert datasets, this framework provides expert data online during training, which allows adaptive guidance based on task requirements. The approach is evaluated in robotic tasks involving sequential decision-making and object interaction. The results demonstrate that integrating expert data enhances learning efficiency, making it an effective strategy for training DRL agents in structured environments.
|
|
16:48-17:06, Paper TuDT10.2 | |
Reinforcement Learning Enhanced Graph Learning for CTR Prediction in Online Advertising |
|
Zhang, Yiming | Singapore Management University |
Zhu, Feida | Singapore Management University |
Li, Jiayi | Singapore Management University |
Keywords: Big-Data and Data Mining, Reinforcement, Machine learning
Abstract: Click-through rate (CTR) prediction serves as a key functional module in various online advertising ecosystems. CTR prediction aims to estimate the probability that a user is interested in an advertisement (ad) and will click on it. Current methods typically treat this task as a binary classification problem, which often suffers from severe label noise issues. Furthermore, user-related features (i.e., diverse user interests and online behavior information) are heavily relied upon and may suffer from the problem of feature sparsity. To address these challenges, we propose a novel model called RL-EGL for CTR prediction in online advertising. In RL-EGL, we first introduce a heterogeneous graph to model the relationships among different types of entities in online advertising platforms. Subsequently, we build an attention-based heterogeneous graph convolution network to integrate both structural relations and semantic content information for learning ad representations. To identify effective samples from the training dataset, we further design a reinforcement learning framework to model the effective sample selection process. During the training of RL-EGL, the heterogeneous graph convolution network and prediction classifier are enhanced using the selected effective samples, while the noise-robust agent is strengthened by considering the refined node representations and the performance of the prediction classifier as feedback. Through reinforcement learning, the heterogeneous graph learning model, the agent, and the prediction classifier are trained jointly and improved together. Extensive experiments on three datasets demonstrate that RL-EGL exhibits satisfactory efficiency and outperforms state-of-the-art approaches.
|
|
17:06-17:24, Paper TuDT10.3 | |
Bounded Active Exploration for Model-Based Reinforcement Learning |
|
Qiao, Ting | The University of Auckland |
Williams, Henry | University of Auckland |
MacDonald, Bruce | University of Auckland |
Keywords: Reinforcement, Model Learning for Control, Deep Learning in Robotics and Automation
Abstract: A precise world model is imperative for the performance of Model-Based Reinforcement Learning (MBRL). Active exploration enhances world models by repeatedly visiting uncertain regions where the world model lacks proficiency. However, this strategy may introduce an objective mismatch between maximising rewards and developing an accurate world model. In response to this challenge, we propose a novel exploration strategy, termed bounded active exploration (BAE), that confines exploration behaviours within action candidates derived from a soft reward-exploitation policy. As the policy becomes 'confident', these candidates converge on a single decisive action. We evaluate BAE with algorithms from two disparate MBRL research streams on simulation and real-world tasks. Empirical results demonstrate the superiority of our novel exploration strategy in most simulation tasks. BAE not only elevates MBRL agents' data efficiency but also provides an alternative method for applying intrinsic motivations in Reinforcement Learning.
|
|
17:24-17:42, Paper TuDT10.4 | |
Optimal Multi-Agent Reinforcement Learning for Efficient Partially Observable Multi-Robot Collaboration in Warehousing |
|
Abbas, Muhammad Naveed | Technological University of the Shannon: Midlands Midwest |
Liston, Paul | Technological University of the Shannon: Midlands Midwest |
Lee, Brian | Technological University of the Shannon: Midlands Midwest |
Qiao, Yuansong | Technological University of the Shannon: Midlands Midwest |
Keywords: Autonomous Agents, Optimization and Optimal Control, Logistics
Abstract: Efficient collaboration among autonomous mobile robots in warehouse order fulfillment tasks enhances the delivery rate. However, limited visibility of the working environment and varied task configurations can affect their throughput, i.e., delivery rate. Against this backdrop, multi-agent reinforcement learning (MARL), together with a partially observable MARL environment that replicates real-world warehouse order fulfillment tasks, becomes a natural choice for optimizing autonomous multi-robot collaboration under partial observability. To maximize the delivery rate through efficient multi-robot collaboration, this research proposes Multi-Agent Self-attention Recurrent Q-Learning (MASRQL), which is enabled by a general-purpose Q-function approximator, the Deep Self-attention Recurrent Q-Network (DSRQN), and harnesses self-attention and recurrence mechanisms. The empirical analysis corroborates that the approach enhances efficiency, achieving 15.4%, 29%, and 26.2% higher average delivery rates than the closest-performing baselines in the respective task configurations and showcasing adaptability to varying task configurations.
|
|
17:42-18:00, Paper TuDT10.5 | |
Isaac Sim-To-Real: Reinforcement Learning Based Locomotion for Quadrupeds |
|
Dowdy, Jordan | University of Louisville |
Chagas Vaz, Jean | University of Louisville |
Keywords: Reinforcement, Deep Learning in Robotics and Automation, Learning and Adaptive Systems
Abstract: Learning-based approaches to locomotion have risen in popularity in recent years, showing the capability for complex legged locomotion and whole-body control. Reinforcement learning (RL), the primary learning-based approach for locomotion, often utilizes a high-performance simulation tool, providing a controlled and efficient training and development environment. However, policies that perform well in simulation frequently encounter unexpected challenges when deployed on a physical system, a problem known as the sim-to-real gap. This work presents a robust RL locomotion framework capable of whole-body control. The proposed RL framework utilizes Nvidia's new set of simulation tools, Isaac Sim, and its companion RL framework, Isaac Lab, for training, achieving a zero-shot sim-to-real policy. The performance of our policy is validated on physical hardware using the Unitree Go1. Experimental results show velocity tracking performance similar to that of the quadruped's integrated controller, a greater ability to recover from large disturbances, and achieved linear velocities of 2.0 m/s and angular velocities of 1.8 rad/s.
|
| |