| |
Last updated on July 16, 2025. This conference program is tentative and subject to change.
Technical Program for Monday August 18, 2025
|
MoAT1 |
Room T1 |
Planning, Scheduling and Control 1 |
Regular Session |
Chair: Wang, Yinan | RPI |
|
10:30-10:48, Paper MoAT1.1 | |
A Learning-Based Approach to Motion Planning with State Lattices in Off-Road Environments |
|
Rosser, Joshua | University of Rochester |
Warnell, Garrett | U.S. Army Research Laboratory |
Lancaster, Eli | Booz Allen Hamilton |
Sanchez, Felix | Booz Allen Hamilton |
Fahnestock, Ethan | MIT |
Damm, Eric | University of Rochester |
Gregory, Jason M. | US Army Research Laboratory |
Howard, Thomas | University of Rochester |
Keywords: Motion and Path Planning, Learning and Adaptive Systems, Autonomous Vehicle Navigation
Abstract: To safely navigate unmanned ground vehicles operating in unstructured, partially observed environments, intelligence architectures require efficient motion planning algorithms that generate near-optimal routes and satisfy motion constraints in real time. The recombinant nature of state lattice-based search spaces enables efficient search in motion planning graphs with precomputed trajectory libraries that satisfy nonholonomic constraints. The implementation of lattice planner-based search spaces, however, requires design choices that include the resolution, fidelity, and expressiveness of the graph. Parameters that are well tuned for some environments may prove suboptimal or ineffective in others. In this paper we propose a classification-based approach to parameter learning that optimizes state lattice planner performance online using context from the environment and the planning problem definition. Experimental results using data collected from a high-speed unmanned ground vehicle operating in off-road environments demonstrate a substantial improvement in the relative optimality of generated trajectories for regional motion planning.
|
|
10:48-11:06, Paper MoAT1.2 | |
ARMOR: Egocentric Perception for Bimanual Robot Collision Avoidance and Motion Planning |
|
Kim, Daehwa | Carnegie Mellon University |
Srouji, Mario | Apple Inc |
Chen, Chen | Apple |
Zhang, Jian | Apple |
Keywords: Motion and Path Planning, Sensor-based Control, Motion Control
Abstract: Robotic arms have significant gaps in their sensing and perception, making it hard to perform motion planning in dense environments. To address this, we introduce ARMOR, a novel egocentric perception system that integrates both hardware and software, specifically incorporating wearable-like depth sensors for bimanual robotic platforms with arms. Our distributed perception approach enhances the robot's spatial awareness and facilitates more agile motion planning. We also train a transformer-based imitation learning (IL) policy in simulation to perform dynamic collision avoidance, leveraging around 86 hours of realistic human motions from the AMASS dataset. We show that our ARMOR perception is superior to a setup with multiple dense head-mounted and externally mounted depth cameras, with a 63.7% reduction in collisions and a 78.7% improvement in success rate. We also compare our IL policy against a sampling-based motion planning expert, cuRobo, showing 31.6% fewer collisions, a 16.9% higher success rate, and a 26x reduction in computational latency. Lastly, we deploy our ARMOR perception on our real-world GR1 humanoid robot from Fourier Intelligence. The simulation environment, hardware description, and 3D CAD files are available at https://daehwakim.com/armor.
|
|
11:06-11:24, Paper MoAT1.3 | |
Interaction-Minimizing Roadmap Optimization for High-Density Multi-Agent Path Finding |
|
Weindel, Sören | Karlsruhe Institute of Technology |
Wilch, Jan | Technical University of Munich |
Kögel, Christoph | SOMIC Verpackungsmaschinen GmbH & Co. KG |
Xiao, Kevin | Planar Motor Incorporated |
Vogel-Heuser, Birgit | Technical University Munich |
Keywords: Motion and Path Planning, Intelligent Transportation Systems, Cyber-physical Production Systems and Industry 4.0
Abstract: Modern industry increasingly demands customizability from each element of its workflow and factories. A prominent example of this is the advent of Automated Guided Vehicles (AGVs) in intralogistics tasks, which autonomously navigate the manufacturing floor, reacting dynamically to variations in the workflow. One such application makes use of magnetically propelled planar drive systems to transport products between manufacturing stations, replacing traditional solutions that are limited in their ability to efficiently adapt to new requirements. This work presents an AGV control approach capable of offloading large parts of the computational expense into an offline preprocessing step: a unidirectional roadmap is generated using alternating position optimization and network modification operations, with the goal of reducing the number of interactions between agents to be resolved at runtime. This concept was successfully validated in simulation. An accompanying tech report and implementation further detail the presented approach, as well as the controller and simulator used.
|
|
11:24-11:42, Paper MoAT1.4 | |
Swarm Intelligence-Based Optimization of Matrix System Layout with Workstation Combination and Assignment |
|
Lee, Changha | Sungkyunkwan University |
Oh, Seog-Chan | General Motors |
Arinez, Jorge | General Motors Research & Development Center |
Noh, Sang Do | Sungkyunkwan University |
Keywords: Optimization and Optimal Control, AI-Based Methods, Intelligent and Flexible Manufacturing
Abstract: The layout of a manufacturing system is a critical factor that significantly influences system performance during operation. Achieving an optimal layout requires simultaneous consideration of equipment selection, such as determining the appropriate combination of workstations, and workstation assignment for a layout design. However, existing research often assumes a predetermined workstation combination, focusing solely on optimizing workstation assignments for a layout design. Since workstation combination is typically undecided in the early stages of layout design, particularly in greenfield systems, an integrated optimization approach is essential. This study proposes a layout design optimization methodology for Matrix Manufacturing Systems (MMS) that simultaneously considers workstation combination (i.e., equipment selection) and workstation assignment problems. The proposed approach integrates Particle Swarm Optimization (PSO) with production simulation, employing a dimensionally reduced and information-dense solution representation—a structured format that encapsulates candidate solutions—to optimize both aspects simultaneously. To validate the proposed methodology, we conducted a case study on an automotive assembly process, comparing our approach with an existing method that separately solves workstation combination and assignment problems.
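For readers unfamiliar with Particle Swarm Optimization, a minimal Python sketch of the canonical velocity and position update is given below; the inertia and acceleration coefficients, the solution encoding, and the fitness evaluation are illustrative assumptions, not the configuration used in the paper.

import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    # Canonical PSO update for positions x and velocities v, both of shape (n_particles, n_dims).
    # pbest: per-particle best positions; gbest: swarm-wide best position.
    # Coefficient values are illustrative defaults, not values from the paper.
    rng = np.random.default_rng() if rng is None else rng
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v, v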
|
|
11:42-12:00, Paper MoAT1.5 | |
Robust Temporal Logic Planning under Contingency Constraints |
|
Yuksel, Sadik Bera | Northeastern University |
Taheri, Azizollah | Northeastern University |
Yazicioglu, Yasin | Northeastern University |
Aksaray, Derya | Northeastern University |
Keywords: Formal Methods in Robotics and Automation, Motion and Path Planning, Autonomous Agents
Abstract: In dynamic and uncertain environments, robots are often required not only to complete their primary tasks but also to be able to switch to a contingency mode that facilitates a safe and effective response in the face of an unpredictable event. In this paper, we address the problem of synthesizing optimal high-level control policies for a robot 1) to satisfy a desired task and 2) to be able to respond safely in the face of an unexpected event. We model the dynamics of the robot as a Stochastic Transition System. We express the primary task as a Time Window Temporal Logic (TWTL) specification, and we consider the contingency behavior to be reaching a safe region in at most k time steps with a probability greater than a desired threshold. We propose an automata-theoretic framework to compute optimal policies for task satisfaction and contingency behavior. While the contingency behavior is always ensured, the task satisfaction probability is maintained by minimally extending the mission horizon when necessary. We demonstrate the performance of the proposed method through simulations and experiments.
|
|
MoAT2 |
Room T2 |
TASE Paper Session 1 |
Special Session |
Chair: Wang, Hui | Florida State University |
|
10:30-10:48, Paper MoAT2.1 | |
Efficient Constrained Motion Planning Using Direct Sampling of Screw-Constraint Manifolds |
|
Pettinger, Adam | Texas A&M University |
Panthi, Janak | The University of Texas at Austin |
Alambeigi, Farshid | University of Texas at Austin |
Pryor, Mitchell | University of Texas |
Keywords: Motion and Path Planning, Manipulation Planning, Industrial and Service Robotics
Abstract: Manipulating articulated objects is especially difficult if the robot is operating autonomously or far from any human operator. Object articulation imposes strict constraints on robot motion, making it a challenge to generate valid trajectories to complete the task. Problems compound when the robot is mobile and operates in an uncontrolled environment, where the location or articulation model is unknown a priori. In this work, we leverage screw theory to model the constraints imposed on a generic manipulator by simple articulated objects and present two novel, fast, and robust methods, Sequential Path Stepping (SPS) and Direct Screw Sampling (DSS), for planning trajectories by directly sampling these constraints. We show that these methods are hardware-agnostic and work in cluttered environments using long, complex paths modeled by multiple screw-axis constraints. We demonstrate that modeling constraints using multiple screw axes handles objects with multiple DoF, or multi-step tasks (e.g., turning a knob before opening the door). In addition, the direct sampling component of the proposed approaches is implemented as a module that can be used with existing well-known probabilistic planning methods, allowing customization across different hardware, domains, or planning problems. We validate our methods across many planning and inverse kinematic elements, with three different mobile and stationary manipulators, and on a set of challenging planning problems that include single- and multiple-screw constraints. Results demonstrate a 97.6% success rate when planning in cluttered environments in less than 0.2 seconds.
|
|
10:48-11:06, Paper MoAT2.2 | |
MINER-RRT*: A Hierarchical and Fast Trajectory Planning Framework in 3D Cluttered Environments |
|
Wang, Pengyu | Hong Kong University of Science and Technology |
Tang, Jiawei | Hong Kong University of Science and Technology |
Lin, Hin Wang | The Hong Kong University of Science and Technology |
Zhang, Fan | The Hong Kong University of Science and Technology |
Wang, Chaoqun | Shandong University |
Wang, Jiankun | Southern University of Science and Technology |
Shi, Ling | The Hong Kong University of Science and Technology |
Meng, Max Q.-H. | The Chinese University of Hong Kong |
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination
Abstract: Trajectory planning for quadrotors in cluttered environments has been challenging in recent years. While many trajectory planning frameworks have been successful, there still exists potential for improvement, particularly in enhancing the speed of generating efficient trajectories. In this paper, we present MINER-RRT*, a novel hierarchical trajectory planning framework that reduces computational time and memory usage and consists of two main components. First, we propose a sampling-based path planning method boosted by neural networks, where the predicted heuristic region accelerates the convergence of rapidly-exploring random trees. Second, we utilize the optimal conditions derived from the quadrotor's differential flatness properties to construct polynomial trajectories that minimize control effort in multiple stages. Extensive simulation and real-world experimental results demonstrate that, compared to several state-of-the-art (SOTA) approaches, our method can generate high-quality trajectories with better performance in 3D cluttered environments.
|
|
11:06-11:24, Paper MoAT2.3 | |
Automated Ontology Generation for Zero-Shot Defect Identification in Manufacturing |
|
Yhdego, Tsegai | Florida A&M University-Florida State University College of Engin |
Wang, Hui | Florida State University |
Keywords: Process Control, Machine learning, Hybrid Strategy of Intelligent Manufacturing
Abstract: A lack of labeled data presents a significant challenge to automatic defect identification in manufacturing, which is a crucial step in process control and certification during process development. State-of-the-art transfer learning is incapable of handling such zero-shot learning (ZSL) when defect labels are absent in training datasets. The latest research on ZSL leverages natural language processing (NLP) based on large language models (LLMs) and shows promise by supplementing information to generate labels. However, its performance is hampered by the supporting LLMs being pre-trained on generic vocabulary that fails to characterize manufacturing defects accurately. This paper establishes a methodology to automatically extract multi-level attributes from the literature to improve defect representation, thereby facilitating ZSL. The extracted attributes contribute to a hierarchical knowledge graph, called a defect ontology, that characterizes multiple aspects of manufacturing defects. The proposed algorithm takes defect images and associated text from the literature as input and develops an unsupervised method to identify the hierarchical relationships among the tokenized information extracted from the input text-feature corpora. The hierarchical graph is refined to retain the most relevant information by a pruning algorithm based on a minimum path search. A walk algorithm, along with NLP, parses the generated ontology to create embeddings of defects that enable zero-shot attribute learning to identify defects. The proposed method advances ZSL methodology by automatically creating a hierarchical knowledge representation from literature and images to replace the generic vocabulary in the LLMs adopted by ZSL algorithms, thus improving defect representation.
|
|
11:24-11:42, Paper MoAT2.4 | |
Predicting Vulnerable Road User Behavior with Transformer-Based Gumbel Distribution Networks |
|
Astuti, Lia | Feng Chia University |
Lin, Yu-Chen | Feng Chia University |
Chiu, Chui-Hong | Feng Chia University |
Chen, Wen-Hui | National Taipei University of Technology |
Keywords: Collision Avoidance, Motion and Path Planning, AI-Based Methods
Abstract: This study introduces the Crossing Intention and Trajectory with Transformer Networks (CITraNet) prediction model to process multimodal input data of vulnerable road users (VRUs), such as pedestrians and bicyclists, whose behavior is inherently unpredictable. Unlike traditional approaches that rely on sequence transduction and Gaussian distribution-based models, CITraNet employs Transformer networks and the Gumbel distribution. First, we utilize multi-head attention and feed-forward layers to extract hidden features from historically observed data, allowing effective parallelization and the handling of long-range dependencies. Second, CITraNet features an innovative Transformer-based Gumbel distribution network that significantly enhances the model's ability to accurately predict all possible trajectories using extreme value theory, replacing the conventional Gaussian distribution models that struggle with discrete and non-linear data. The effectiveness and accuracy of CITraNet are validated on the Taiwan pedestrian (TaPed) dataset, as well as the publicly available JAAD and PIE datasets. The model's deterministic and stochastic trajectory predictions are assessed over short (0.5 s), medium (1.0 s), and long (1.5 s) intervals, which is crucial for gauging predictive accuracy across varying durations. The results demonstrate that CITraNet outperforms previous benchmarks.
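For reference, the standard Gumbel (extreme value type I) distribution mentioned above has the cumulative distribution function and inverse-transform sampling rule

F(x; \mu, \beta) = \exp\!\big(-\exp\!\big(-(x-\mu)/\beta\big)\big), \qquad x = \mu - \beta \ln(-\ln u), \quad u \sim \mathcal{U}(0,1),

with location \mu and scale \beta > 0; how CITraNet parameterizes these quantities is not specified in the abstract and is not assumed here.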
|
|
11:42-12:00, Paper MoAT2.5 | |
Parallel Inspection Route Optimization with Priorities for 5G Base Station Networks |
|
Dai, Xiangqi | Tsinghua University |
Liang, Zhenglin | Tsinghua University |
Keywords: Machine learning, Motion and Path Planning
Abstract: 5G base station networks generate numerous alarms daily. With the increasing demand for digital services, it is vital to inspect and rectify anomalies to uphold user satisfaction. This study explores the potential of unmanned aerial vehicle (UAV)-empowered opportunistic inspection based on alarm data. We formulate the inspection routing problem as a prioritized traveling salesman problem (PTSP) encompassing two categories of base stations. Priority is assigned to stations generating more alarms, while others are subject to opportunistic inspection. To expedite large-scale opportunistic inspection routes, we introduce a novel transformer-based parallelizable routing algorithm (TPRA). TPRA is an intelligent optimization approach that orchestrates multiple parallelized constrained reinforcement learning algorithms. Through balancing spectral clustering, the large-scale graph is segmented into manageable subgraphs. For each subgraph, the prioritized inspection routing problem is formulated as a constrained Markov decision process and optimized by transformer-based reinforcement learning in parallel. The optimized subgraphs are then merged using an adaptive large neighborhood search approach. Through parallel computing, our approach achieves as much as a 75% reduction in computation time while concurrently generating shorter routes. The approach is implemented in real-world cases to validate its efficacy.
|
|
MoAT3 |
Room T3 |
Optimization for Energy Systems |
Special Session |
Chair: Robba, Michela | University of Genoa |
Organizer: Grammatico, Sergio | Delft University of Technology |
Organizer: Dotoli, Mariagrazia | Politecnico Di Bari |
Organizer: Carli, Raffaele | Politecnico Di Bari |
Organizer: Scarabaggio, Paolo | Politecnico Di Bari |
Organizer: Mignoni, Nicola | Politecnico Di Bari |
|
10:30-10:45, Paper MoAT3.1 | |
Distributed Model Predictive Control for Building Automation Systems: A Parallel ADMM Approach (I) |
|
Robba, Michela | University of Genoa |
Ferro, Giulio | University of Genoa |
Parodi, Luca | University of Genoa |
Keywords: Building Automation, Power and Energy Systems automation, Optimization and Optimal Control
Abstract: This paper proposes a distributed Model Predictive Control (MPC)-based approach for comfort temperature tracking and electric consumption minimization in building automation systems (BASs). The optimization model and the overall architecture have been developed taking into account real-world applications with in-field controllers and sensors. A distributed optimization algorithm is proposed that extends the well-known alternating direction method of multipliers (ADMM) to handle inequality constraints (which are necessary to model the typical local temperature sensors and actuators in smart buildings). The methodology is validated through testing on a real case study, i.e., the Smart Energy Building (SEB) at the Savona Campus of the University of Genova, characterized by a geothermal heat pump, photovoltaics, storage systems, and charging stations. The algorithm allows reaching the comfort temperature, limiting power variation for the heat pump, and minimizing costs. Comparison with state-of-the-art approaches shows a 25% reduction in the number of iterations needed for convergence.
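As background, the standard two-block ADMM iteration that the proposed algorithm extends solves \min_{x,z} f(x) + g(z) subject to Ax + Bz = c via the scaled-dual updates

x^{k+1} = \arg\min_x \; f(x) + \tfrac{\rho}{2}\|Ax + Bz^k - c + u^k\|_2^2,
z^{k+1} = \arg\min_z \; g(z) + \tfrac{\rho}{2}\|Ax^{k+1} + Bz - c + u^k\|_2^2,
u^{k+1} = u^k + Ax^{k+1} + Bz^{k+1} - c;

the paper's specific handling of inequality constraints on top of this textbook baseline is not reproduced here.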
|
|
10:45-11:00, Paper MoAT3.2 | |
An Energy Management System for Green Ports (I) |
|
Casella, Virginia | University of Genoa |
Gallo, Marco | University of Genova |
Graffione, Federico | University of Genova |
Robba, Michela | University of Genoa |
Silvestro, Federico | University of Genova |
Keywords: Power and Energy Systems automation, Renewable Energy Sources, Optimization and Optimal Control
Abstract: Port areas serve as hubs for energy production, storage, distribution, and high consumption, and can be key players in the energy transition and the reduction of emissions. In fact, the management of renewables, storage systems, electric vehicles, and boats can be integrated with logistics operations to satisfy energy demands and to provide support to the distribution system operator. The aim of this paper is to present a new Energy Management System (EMS) for port areas (with specific reference to touristic ports), which can integrate sensors and technologies in the field, simulation tools, and optimization models to minimize the overall costs and customers' dissatisfaction. Each component communicates with the EMS using the MQTT (Message Queuing Telemetry Transport) protocol. A real case study of a touristic port (Finale Ligure) in the Savona Municipality is considered.
|
|
11:00-11:15, Paper MoAT3.3 | |
Optimal Stochastic Management of Energy Storage Systems Based on Non-Linear Energy Reservoir Models (I) |
|
Mignoni, Nicola | Politecnico Di Bari |
Scarabaggio, Paolo | Politecnico Di Bari |
Carli, Raffaele | Politecnico Di Bari |
Dotoli, Mariagrazia | Politecnico Di Bari |
Keywords: Power and Energy Systems automation, Renewable Energy Sources, Optimization and Optimal Control
Abstract: This paper discusses and extends energy-reservoir models (ERMs) for energy storage systems (ESSs), recently proposed in the related literature, by introducing a non-unitary efficiency for the discharging process. This enhancement allows the resulting ESS model to more accurately represent losses during both charging and discharging cycles, albeit at the cost of introducing a non-linearity. We show that, while the lower ERM capacity bound preserves convexity, the upper bound does not. Hence, a mixed-integer reformulation is provided to tackle such a non-convexity. We focus on the perspective of a prosumer equipped with an ERM and served by an energy retailer characterized by a realistic energy pricing scheme. We also account for the inherent uncertainty in ESS management related to the prosumer's energy demand and generation curves: to accommodate this uncertainty, our approach accepts probabilistic forecasts as inputs, enabling objective function approximation through techniques such as sample average approximation. The proposed approach is numerically validated using real data, implementing the formulation within a model-predictive-control framework.
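A generic energy-reservoir state update with charging efficiency \eta_c and the non-unitary discharging efficiency \eta_d discussed above takes the form

s_{t+1} = s_t + \eta_c\, p^{ch}_t \,\Delta t - \tfrac{1}{\eta_d}\, p^{dis}_t \,\Delta t, \qquad \underline{s} \le s_t \le \bar{s}, \quad \eta_c, \eta_d \in (0,1];

this is a textbook formulation given only for illustration, and the exact notation and bound structure used in the paper may differ.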
|
|
11:15-11:30, Paper MoAT3.4 | |
Multi-Energy Demand-Side Management for Flexibility Service in Distribution Networks (I) |
|
Wang, Meichen | University of Manchester |
Xu, Yiqiao | University of Manchester |
Parisio, Alessandra | The University of Manchester |
Keywords: Demand Side Management, Distributed Generation and Storage, Planning, Scheduling and Coordination
Abstract: The high penetration of renewable energy has significantly increased the demand for flexibility services, particularly in distribution networks. At the same time, the electrification of the heating sector introduces interactions and interdependencies between energy vectors, presenting a promising yet unexplored source of flexibility. This paper proposes a multi-energy demand-side management (DSM) framework for flexibility service provision in distribution networks. The proposed framework includes multi-energy assets and demands, such as pumped thermal energy storage (PTES) and combined heat and power (CHP) systems. Among diverse energy storage technologies, PTES emerges as a cost-effective and grid-scale solution with environmentally friendly operation and extended lifetime. The proposed model is formulated as a mixed-integer linear optimization problem to schedule resources, optimizing utilization payment and energy costs while ensuring compliance with the flexibility service requirements. The performance of the proposed optimization framework is verified through numerical studies on a modified benchmark network, demonstrating the successful operation of flexibility services and the potential of PTES to enhance network flexibility.
|
|
11:30-11:45, Paper MoAT3.5 | |
User-Centric Vehicle-To-Grid Optimization with an Input Convex Neural Network-Based Battery Degradation Model (I) |
|
Mallick, Arghya | Delft University of Technology |
Pantazis, Georgios | TU Delft |
Khosravi, Mohammad | TU Delft |
Mohajerin Esfahani, Peyman | TU Delft |
Grammatico, Sergio | Delft University of Technology |
Keywords: Plug-in Electric Vehicles, Machine learning, Human-Centered Automation
Abstract: We propose a data-driven, user-centric vehicle-to-grid (V2G) methodology based on multi-objective optimization to balance battery degradation and V2G revenue according to EV user preference. Given the lack of accurate and generalizable battery degradation models, we leverage input convex neural networks (ICNNs) to develop a data-driven degradation model trained on extensive experimental datasets. This approach enables our model to capture nonconvex dependencies on battery temperature and time while maintaining convexity with respect to the charging rate. Such a partial convexity property ensures that the second stage of our methodology remains computationally efficient. In the second stage, we integrate our data-driven degradation model into a multi-objective optimization framework to generate an optimal smart charging profile for each EV. This profile effectively balances the trade-off between financial benefits from V2G participation and battery degradation, controlled by a hyperparameter reflecting the user's prioritization of battery health. Numerical simulations show the high accuracy of the ICNN model in predicting battery degradation for unseen data. Finally, we present a trade-off curve illustrating financial benefits from V2G versus losses from battery health degradation based on user preferences, and we showcase smart charging strategies under realistic scenarios.
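As background on input convex neural networks, the NumPy sketch below shows the fully input-convex variant: non-negative hidden-to-hidden weights combined with convex, non-decreasing activations make the output convex in the input. The partially convex architecture used in the paper (convex only in the charging rate) is more involved; the layer sizes and softplus activation here are illustrative assumptions.

import numpy as np

def softplus(x):
    # Numerically stable softplus: convex and non-decreasing, which preserves convexity.
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

class FICNN:
    # Fully input-convex network f(y): hidden-to-hidden weights W_z are kept non-negative,
    # while the input "passthrough" weights W_y are unconstrained. Shapes are illustrative.
    def __init__(self, dims, rng=np.random.default_rng(0)):
        self.W_z = [np.abs(rng.normal(size=(m, n))) for n, m in zip(dims[1:-1], dims[2:])]
        self.W_y = [rng.normal(size=(m, dims[0])) for m in dims[1:]]
        self.b = [np.zeros(m) for m in dims[1:]]

    def __call__(self, y):
        z = softplus(self.W_y[0] @ y + self.b[0])
        for W_z, W_y, b in zip(self.W_z, self.W_y[1:], self.b[1:]):
            z = softplus(W_z @ z + W_y @ y + b)  # W_z >= 0 keeps f convex in y
        return z

# Example: scalar convex output over a 2-D input, hidden widths chosen arbitrarily.
f = FICNN([2, 8, 8, 1])
print(f(np.array([0.5, -1.0])))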
|
|
11:45-12:00, Paper MoAT3.6 | |
Automatic Virtual Metrology for Long-Term Energy Baseline Forecasting (I) |
|
Tieng, Hao | National University of Tainan |
Chen, Pin-Jui | National Cheng Kung University |
Wu, Tung-Qing | National Cheng Kung University |
Wu, Chia-Hsi | National Cheng Kung University Intelligent Manufacturing Researc |
Cheng, Fan-Tien | National Cheng Kung University |
Keywords: Big data Analytics for Large-scale Energy Systems, Energy and Environment-aware Automation, Power and Energy Systems automation
Abstract: Global warming poses significant environmental and economic challenges, with energy consumption being a major contributor. Industry 4.2 for Green Intelligent Manufacturing (I4.2-GiM) integrates digitalization, intelligent manufacturing, and energy management systems (EMSs) to optimize energy efficiency and achieve net zero. Accurate long-term energy forecasting is crucial but challenging due to complex consumption patterns and external influences. Automatic Virtual Metrology (AVM) enhances real-time quality prediction, with the second generation AVM (AVMII) incorporating Convolutional Neural Networks (CNNs) for improving prediction accuracy. However, CNNs struggle with long-term dependencies. This study proposes the third generation AVM (AVMIII) by integrating Long Short-Term Memory (LSTM) networks to enhance long-term time-series predictions so as to enable smarter energy management in factories.
|
|
MoAT4 |
Room T4 |
Trajectory, Object, and Position 1 |
Regular Session |
Chair: Mikšík, Martin | Czech Technical University in Prague |
|
10:30-10:48, Paper MoAT4.1 | |
Synthetic Dataset for Vision-Based Air-To-Air Object Detection |
|
Rassas, Basaam | Toronto Metropolitan University |
Singoji, Shashank | Toronto Metropolitan University |
Waslander, Steven Lake | University of Toronto |
Faieghi, Reza | Toronto Metropolitan University |
Keywords: Computer Vision for Transportation, Deep Learning in Robotics and Automation, Autonomous Vehicle Navigation
Abstract: The increasing deployment of Uncrewed Aerial Vehicles (UAVs) across various industries has heightened the need for robust airborne object detection to ensure safe airspace operations. Vision-based deep learning models offer effective solutions but require large, diverse, and well-annotated datasets for optimal performance. Existing datasets are limited in diversity, expensive to acquire, and manually annotated, restricting their scalability. This work presents a scalable synthetic dataset for vision-based airborne object detection, generated using AirSim integrated with Unreal Engine. Our dataset includes nine UAV models, multiple airborne object classes (birds, helicopters, balloons, aircraft), and diverse environmental conditions. Unlike existing synthetic datasets, our framework eliminates the need for photogrammetry or chroma-keying, making it fully automated and cost-effective. We evaluate our dataset by training detection models and benchmarking them against existing real-world datasets, demonstrating that synthetic data can improve model generalization while significantly reducing acquisition costs. The dataset is available at: www.kaggle.com/datasets/avldevelopment/air-to-air-object-detection-dataset
|
|
10:48-11:06, Paper MoAT4.2 | |
OPTRObot: A Synthetic Training Paradigm for Robotic Grasping of Specular Objects in Cluttered Environments |
|
Mikšík, Martin | Czech Technical University in Prague |
Zeman, Vít | Czech Technical University in Prague, Faculty of Electrical Enge |
Moroz, Artem | Czech Institute of Informatics, Robotics and Cybernetics, CTU In |
Burget, Pavel | Czech Technical University in Prague |
Keywords: Computer Vision for Manufacturing, Machine learning, Factory Automation
Abstract: We present a novel and complete vision-based pipeline for 6DoF object pose estimation in challenging industrial bin-picking scenarios, characterized by significant clutter, occlusions, and reflective surfaces. Our approach addresses the limitations of both computationally expensive fine-tuning methods and the current immaturity of foundation models in handling such complex environments. The key contribution lies in a balanced approach leveraging synthetic data augmentation and a streamlined architecture to achieve robust performance without extensive per-object optimization. The pipeline integrates existing state-of-the-art object detection, coarse pose estimation, and a render-and-compare refinement strategy, enabling accurate pose estimation from monocular images. We introduce a new benchmark on the recently released dataset, establishing a baseline for future research. Unlike existing industrial approaches, our system minimizes reliance on multi-sensor configurations, offering a cost-effective and easily deployable solution. We demonstrate the impact of error propagation across a complete pipeline using datasets that mirror real-world industrial conditions, in contrast to commonly used but less representative datasets.
|
|
11:06-11:24, Paper MoAT4.3 | |
You Only Look Once, but the Parts Keep Moving: YOLO-Based Workpiece Pose Classification for Aerodynamic Part Feeding |
|
Shieff, Dasha | Leibniz University Hannover |
Akchi, Mohamed | Leibniz University Hanover, Institute of Assembly Technology And |
Raatz, Annika | Leibniz Universität Hannover |
Keywords: Computer Vision for Manufacturing, Factory Automation, Intelligent and Flexible Manufacturing
Abstract: Flexible part feeding is a key challenge in modern automated production, where increasing uncertainties, shorter product life cycles, and cost pressures require adaptable solutions. Aerodynamic part feeding systems, which use controlled air jets to manipulate workpieces, offer a retooling-free alternative to traditional vibratory bowl feeders. To ensure precise workpiece handling, reliable pose classification is essential. This paper presents a machine learning-based framework for classifying workpiece poses using a class of convolutional neural networks (CNNs) called YOLO and an industrial camera. Instead of relying on manually labeled real-world images—which would introduce machine downtimes and increased setup efforts—the proposed method trains CNNs exclusively on synthetic datasets. Artificial images of workpieces in various poses are generated from CAD models using the open-source rendering engine Blender. Multiple CNN architectures are trained and evaluated, achieving a classification precision exceeding 95 % for most workpieces when tested on real workpiece images. The results demonstrate that the approach enables accurate and efficient workpiece pose classification without the need for labor-intensive dataset creation. While developed for aerodynamic part feeding, the proposed method is applicable to a wide range of industrial scenarios requiring automated workpiece orientation classification.
|
|
11:24-11:42, Paper MoAT4.4 | |
Voxel-Based Hierarchical Approximate Convex Decomposition for Efficient 3D Representation of Objects in Robotic Applications |
|
Mastromarino, Fabio | Politecnico Di Bari |
Scarabaggio, Paolo | Politecnico Di Bari |
Carli, Raffaele | Politecnico Di Bari |
Dotoli, Mariagrazia | Politecnico Di Bari |
Keywords: Collision Avoidance, Formal Methods in Robotics and Automation, Industrial and Service Robotics
Abstract: Approximate Convex Decomposition (ACD) is essential for industrial robotics, enabling efficient collision detection, motion planning, and physics-based simulation of robotic manipulators. However, traditional ACD methods, such as Hierarchical ACD (HACD) and Volumetric HACD (V-HACD), often suffer from high computational costs and over-segmentation, making them unsuitable for real-time robotic applications. This paper presents a novel voxel-based HACD (VX-HACD) approach designed to enhance computational efficiency while preserving the geometric fidelity of robotic manipulator components. The proposed approach first converts the input mesh into a structured voxel grid, simplifying the convex decomposition process. A gap-filling algorithm ensures topological continuity, preventing segmentation artifacts caused by voxel discretization. Additionally, a hierarchical voxel aggregation strategy reduces the number of convex components while maintaining accuracy, optimizing the representation for robotic applications. The methodology is validated on high-complexity robotic manipulator components, demonstrating reduced processing times, lower volumetric error, and fewer convex components compared to state-of-the-art ACD techniques. The proposed approach, while validated in the context of industrial robotics for collision-aware motion planning, can be applied to a wide range of applications requiring efficient convex decomposition in high-performance simulation (e.g., precision or surgical robotics, video games, and physics simulation).
|
|
11:42-12:00, Paper MoAT4.5 | |
Learning Rapid Turning, Aerial Reorientation, and Balancing Using Manipulator As a Tail |
|
Yang, Insung | Korean Advanced Institute of Science and Technology |
Hwangbo, Jemin | Korean Advanced Institute of Science and Technology |
Keywords: Motion Control, Model Learning for Control
Abstract: In this research, we investigated the innovative use of a manipulator as a tail in quadruped robots to augment their physical capabilities. Previous studies have primarily focused on enhancing various abilities by attaching robotic tails that function solely as tails on quadruped robots. While these tails improve the performance of the robots, they come with several disadvantages, such as increased overall weight and higher costs. To mitigate these limitations, we propose the use of a 6-DoF manipulator as a tail, allowing it to serve both as a tail and as a manipulator. To control this highly complex robot, we developed a controller based on reinforcement learning for the robot equipped with the manipulator. Our experimental results demonstrate that robots equipped with a manipulator outperform those without a manipulator in tasks such as rapid turning, aerial reorientation, and balancing. These results indicate that the manipulator can improve the agility and stability of quadruped robots, similar to a tail, in addition to its manipulation capabilities.
|
|
MoAT5 |
Room T5 |
Drones and UAV |
Regular Session |
Chair: Fikri, Muhamad Rausyan | Tampere University |
|
10:30-10:48, Paper MoAT5.1 | |
Tracking Multiple Moving Assets with a Smaller Group of Drones |
|
Shahsavar, Mohammadreza | University of Houston |
Rajasekaran, Siddharth | University of Houston |
Kabin, Richard A | University of Houston |
Yannuzzi, Michael | University of Houston |
Becker, Aaron | University of Houston |
Keywords: Swarms, Surveillance Systems, Optimization and Optimal Control
Abstract: A limited number of drones must monitor a large number of assets whose future motions are unknown, ensuring that each asset is monitored by at least one drone. The ideal configuration places the drones' sensors as close as possible to their assets. This objective is achieved by minimizing the altitude of all drones, which in turn reduces the total area of their ground coverage footprints. There exists an optimal assignment of assets to drones. However, if the assets are moving, then the optimal assignment is not static; assets must instead be swapped between drones. We present centralized and decentralized methods to cluster the moving assets statically at each control iteration, and our cover algorithm then guarantees continuous 100% coverage of the moving assets while minimizing the total area covered by the drones.
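The altitude-footprint coupling that this objective exploits can be illustrated under a simple assumption not taken from the paper: each drone carries a downward-looking conical sensor with half-angle alpha and hovers above the centroid of its assigned cluster, so its footprint radius is h*tan(alpha). A minimal Python sketch:

import numpy as np

def min_altitude_for_cluster(assets_xy, half_angle_rad):
    # Smallest hover altitude covering every asset in the cluster, assuming the drone
    # hovers above the cluster centroid (illustrative; a minimum enclosing circle
    # would give a slightly lower altitude).
    center = assets_xy.mean(axis=0)
    r = np.linalg.norm(assets_xy - center, axis=1).max()
    return r / np.tan(half_angle_rad)

def total_coverage_cost(clusters, half_angle_rad=np.deg2rad(30.0)):
    # Sum of per-drone altitudes for a given asset-to-drone assignment (hypothetical cost).
    return sum(min_altitude_for_cluster(np.asarray(c), half_angle_rad) for c in clusters)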
|
|
10:48-11:06, Paper MoAT5.2 | |
Human-Drone Swarm Interaction System for Persistent Monitoring of Large Disperse Area |
|
Kosonen, Petri | Tampere University |
Fadaeian, Yeganeh | Tampere University |
Eura, Reeta | Tampere University |
Sulameri, Jussi Ilari | Tampere University |
Fikri, Muhamad Rausyan | Tampere University |
Gusrialdi, Azwirman | Tampere University |
Keywords: Robot Networks, Swarms, Surveillance Systems
Abstract: This paper presents a human-drone swarm interaction system that enables adaptive and prioritized monitoring through an ergodic coverage control algorithm. The system is designed to ensure that drone swarms effectively cover a user-defined probability density function while maintaining both safety and persistent monitoring. A robust set of safety features prevents collisions between neighboring drones and prevents the drones from exiting the monitoring area. In addition, a fault tolerance algorithm is implemented to ensure continuous monitoring even if individual drones fail. The proposed algorithm is validated through real-world experiments using Crazyflie 2.1 drones from Bitcraze. The experimental results demonstrate that the algorithm successfully optimizes coverage, as measured by the ergodic metric, while maintaining safe operation.
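For reference, the spectral ergodic metric commonly used in ergodic coverage control compares the time-averaged statistics of the N drone trajectories with the target density \phi through their Fourier coefficients,

\mathcal{E}(t) = \sum_{k} \Lambda_k \big| c_k(t) - \phi_k \big|^2, \qquad c_k(t) = \frac{1}{N t}\sum_{j=1}^{N} \int_0^t F_k\big(x_j(\tau)\big)\, d\tau,

where F_k are the Fourier basis functions on the monitoring area and \Lambda_k are decaying weights (e.g., \Lambda_k = (1+\|k\|^2)^{-(d+1)/2}); whether the paper uses exactly this weighting is not stated in the abstract.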
|
|
11:06-11:24, Paper MoAT5.3 | |
On Formation Control Strategies in a Failure-Prevention Scenario for Multi-UAV Payload Transportation |
|
Delbene, Andrea | Univerity of Genoa |
Baglietto, Marco | University of Genoa |
Keywords: Autonomous Vehicle Navigation, Autonomous Agents, Motion and Path Planning
Abstract: This study considers a system composed of multiple Unmanned Aerial Vehicles (UAVs) carrying a payload via flexible cables. A formation control problem is addressed in the course of a path-following procedure. In a scenario where one UAV could fail during the flight and is automatically detached from the system, the optimal formation to be kept for the whole mission is found, minimizing the maximum distance the remaining UAVs would have to travel to change formation if any of the UAVs were to fail. The focus is on a system with three UAVs, and the transition to a formation of two UAVs is studied after possible failures. The evolution of the system is described by a set of dynamic equations; the considered software architecture is provided, and software-in-the-loop tests are performed to validate the proposed algorithms. Results compare the performance of the proposed formation against a static configuration in a failure scenario of one UAV during path-following, highlighting the benefits of the former when the failure happens at critical points during the flight.
|
|
11:24-11:42, Paper MoAT5.4 | |
AeroPowerNet: Fixed-Wing UAV Power Consumption Estimation with an AI-Driven Hybrid Deep Learning Framework |
|
Wahid, Mirza Anas | École De Technologie Supérieure ÉTS |
Lahlou, Laaziz | École De Technologie Supérieure ÉTS |
Kara, Nadjia | ETS, University of Quebec |
Keywords: AI-Based Methods, Deep Learning in Robotics and Automation, Machine learning
Abstract: The instantaneous power consumption of electric-powered aerial vehicles in the aviation industry is crucial for optimizing flight activities. However, devising physics-based power consumption models requires deep insight into the dynamics of an unmanned aerial vehicle (UAV). This becomes challenging due to the variability and complexity of the parameters of airspeed, altitude, and motion of the flight controls. Therefore, a power consumption model is needed to map the influence of flight parameters on power utilization during varying flight phases. This model is crucial for mission planning, optimization, and extending the endurance of UAVs. This study introduces the AeroPowerNet framework, employing a data-driven approach based on deep learning to model UAV power consumption utilizing real-world flight data, which can serve as a foundation for future integration into fixed-wing UAV flight operations. We trained and compared four different models: RNN-LSTM, GRU, Transformer, and a hybrid model. Experimental results show that the hybrid model outperforms all other models, achieving the best performance with an MAE of 3.38 W, an R² of 99.31%, and an NRMSE of 0.21% for the first UAV flight, and an MAE of 7.52 W, an R² of 96.68%, and an NRMSE of 1.03% for the second flight.
|
|
11:42-12:00, Paper MoAT5.5 | |
Learning Based Approach towards AUV and Marine Life Interaction |
|
Kumar, Harshith | Drexel University |
P, Siri | Global Academy of Technology |
Keywords: Collision Avoidance, Motion and Path Planning, Reinforcement Learning
Abstract: Autonomous Underwater Vehicles (AUVs) have significantly advanced in their capabilities, enabling exploration and operations in diverse underwater environments. While navigation in sparse and obstacle-free terrains is relatively simple, navigating deeper waters introduces new challenges due to the presence of marine life, such as migrating shoals of fish and predators hunting for food. This paper explores a novel approach with the integration of reinforcement learning for motion planning in underwater robotics. The primary focus is on the implementation of the Proximal Policy Optimization (PPO) algorithm and Gumbel Social Transformer (GST), which enables the robot to learn how to navigate in an underwater environment with dynamic obstacles. The obstacles are modeled as a shoal of fish and a predator, with the robot tasked with avoiding collisions. The underwater system will be simulated using the Robot Operating System (ROS) framework, and onboard sonar sensors will be employed to detect and track any dynamic obstacles in the vicinity.
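For reference, the clipped surrogate objective that defines PPO is

L^{CLIP}(\theta) = \mathbb{E}_t\big[\min\big(r_t(\theta)\hat{A}_t,\ \mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\,\hat{A}_t\big)\big], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)},

with advantage estimates \hat{A}_t and clipping parameter \epsilon; the reward shaping and observation design used together with the Gumbel Social Transformer are specific to the paper and not reproduced here.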
|
|
MoAT6 |
Room T6 |
Human Robot Collaboration for Smart Manufacturing 1 |
Special Session |
Chair: Zhang, Yunbo | Rochester Institute of Technology |
Organizer: Wang, Weitian | Montclair State University |
Organizer: Zhou, MengChu | New Jersey Institute of Technology |
Organizer: Guo, Xiwang | Liaoning Petrochemical University |
Organizer: Qiao, Yan | Macau University of Science and Technology |
|
10:30-10:48, Paper MoAT6.1 | |
RoPESim: A Framework for Robot Manipulation Policy Evaluation Via Simulation (I) |
|
Wang, Xueting | Rochester Institute of Technology |
Dengxiong, Xiwen | Rochester Institute of Technology |
Bai, Shi | IServe Robotics |
Zhang, Yunbo | Rochester Institute of Technology |
Keywords: Human-Centered Automation, Collaborative Robots in Manufacturing, Intelligent and Flexible Manufacturing
Abstract: Predicting the robot manipulation plan prior to real-world execution is an important capability for robots to complete tasks in manufacturing environments. However, current AI-based manipulation planning methods lack this capability, making it difficult to deploy them in real-world manufacturing scenarios. In this work, we propose a simulation-based human-robot collaboration framework to evaluate predicted robot actions before real-world execution. The framework consists of a VLM-based scenario generator, a diffusion-based action simulator, and an evaluator. First, the scenario generator automatically creates a simulation scenario with objects and obstacles identified and placed. Then, the action simulator generates a series of manipulation action trajectories using a diffusion model in the simulation environment. Each action trajectory is assessed by the evaluator for collision failure, manipulation failure, and completion rate. The final evaluation results are returned to the user for verification and approval. In our experiment, we apply our framework to five chosen scenarios with highly potential collision failures. For each scenario, at least one feasible planned action trajectory is generated. It is then verified through real robot execution, demonstrating the effectiveness of the proposed framework.
|
|
10:48-11:06, Paper MoAT6.2 | |
Zero-Shot Robot Manipulation Via Action Decomposition and Composition (I) |
|
Dengxiong, Xiwen | Rochester Institute of Technology |
Wang, Xueting | Rochester Institute of Technology |
Li, Rui | Rochester Institute of Technology |
Zhang, Yunbo | Rochester Institute of Technology |
Keywords: Task Planning, Intelligent and Flexible Manufacturing, AI-Based Methods
Abstract: The ability to learn generalized skills from demonstrations and apply the acquired skills in various real-world scenarios is a key challenge for robot manipulation. Different from typical robot learning tasks that learn the action from multiple demonstrated samples in a single task, zero-shot robot manipulation requires the robot to efficiently leverage multiple learned robot skills to accomplish a new task. In this paper, we propose an action decomposition/composition framework that efficiently transfers key manipulation skills to various new derivative tasks. Specifically, we first decompose one demonstration that encompasses several foundation skills that do not contain the derivative task. Then we adopt an action prediction approach to generate possible manipulation poses and the end pose for each subtask in the derivative task based on the robot's action and the video frames from the robot cameras. Since the generated poses may be impacted by previous misleading actions, we denoise the action by selecting the most probable manipulation poses based on the task to guide the robot manipulation.
|
|
11:06-11:24, Paper MoAT6.3 | |
Cognitive Architecture for Adaptive Skill Learning towards Fluent Human-Robot Collaborative Assembly (I) |
|
Zhou, Rui | The University of Auckland |
Lu, Yuqian | The University of Auckland |
Keywords: Collaborative Robots in Manufacturing, Cognitive Automation
Abstract: This paper presents a new cognitive architecture to enable human-robot collaborative assembly in complex, unstructured environments. While existing human-robot collaboration technologies have demonstrated success in simple setups, they struggle with fluent interaction in complex scenarios characterized by unpredictable human intentions and flexible workspace configurations. Our approach addresses these limitations by developing a cognitive architecture built upon the SOAR architecture, emphasizing internal cognitive processes, including real-time learning, adaptive decision-making, and knowledge evolution. The proposed system integrates perception, learning, memory, and execution components into a unified architecture that enables robots to continuously acquire skills through human interaction. Through HRC experiments for assembling the Generic Assembly Box (GAB), we demonstrated a 6.74% improvement in task success rates, a 10.9% reduction in execution time, and a 15.6% decrease in human instruction needs over traditional methods. These results validate the system’s potential to bridge the gap between traditional cognitive architectures and practical robotic applications, contributing to more adaptive and intelligent human-robot collaboration.
|
|
11:24-11:42, Paper MoAT6.4 | |
Development and Evaluation of a Deep Q-Network-Based Robot Learning Paradigm in Real-World Human-Robot Collaborative Tasks (I) |
|
Modery, Garrett | Montclair State University |
Wang, Weitian | Montclair State University |
Li, Rui | Montclair State University |
Chen, Yi | ABB US Research Center |
Zhou, MengChu | New Jersey Institute of Technology |
Keywords: Human-Centered Automation, Collaborative Robots in Manufacturing
Abstract: As robot systems continue to advance and be implemented across industries, they typically follow one of two methodologies: standalone systems and collaborative ones. Standalone systems are typically set up in their own areas, away from human workers. Collaborative robots share a common workspace with their human counterparts and work with them to complete tasks together efficiently and safely. Within this category of robotics, there exists another subcategory that describes the method of implementation and usage rather than simply the type of system. This subcategory involves how the machine will interact with workers and understand its role in the interaction. This raises interest in the field of learning from demonstrations, where the robot may dynamically learn the behavior desired by the user rather than being explicitly hardcoded to perform its task. In this work, we develop a deep Q-network-based robot learning paradigm for human-robot partnerships in shared tasks. The proposed approach is validated in real-world human-robot collaborative contexts. In addition, to assess the performance of this approach and its acceptance by practitioners, we conduct a multi-metric user study. Implementation and evaluation results indicate that the developed solution works effectively for human-robot teamwork and receives high support from active users, who rate it very well on several key metrics. The future work of this study is also discussed.
|
|
11:42-12:00, Paper MoAT6.5 | |
Safe and Intuitive Human-Robot Collaborative Assembly with Potential Field-Based Dynamic Obstacle Avoidance and Gestured-Based Communication Interface (I) |
|
Patel, Dipesh | The University of Auckland |
Phu, Nathan | The University of Auckland |
Lu, Yuqian | The University of Auckland |
Keywords: Human-Centered Automation, Collision Avoidance, Motion and Path Planning
Abstract: We present a robotic system that enhances human-robot collaboration (HRC) in manufacturing through intuitive gesture recognition and dynamic obstacle avoidance. Despite the growing adoption of HRC systems, significant challenges persist in creating natural interfaces and maintaining safety without impeding workflow. Our contributions include: (1) a motion planning algorithm with dampening and softening components for safe and efficient obstacle avoidance; (2) an accurate gesture recognition and tool detection model; and (3) a finite state machine that translates human gestures into fitting assistance actions. We validate our approach through a manufacturing case study that shows improved collaboration efficiency while maintaining safety. Results show that our system reduces production delays by eliminating the need for workers to divert attention from assembly tasks to manage robot interactions, thereby creating more natural and productive human-robot partnerships in manufacturing settings.
|
|
MoAT7 |
Room T7 |
Learning and Computation |
Regular Session |
Chair: Kim, Heeyoung | KAIST |
|
10:30-10:48, Paper MoAT7.1 | |
Development of an Automatic Algorithm for Predicting Physics-Informed Mechanical Properties in Intelligent Manufacturing of Hairpin Motors |
|
Shim, Young-Dae | Georgia Institute of Technology |
Kim, Jihun | Sungkyunkwan University |
Kim, Changhyeon | Sungkyunkwan University |
Park, Jihyun | HYUNDAI MOBIS |
Yang, DongWook | Hyundai Mobis |
Lee, Eun-Ho | Sungkyunkwan University |
Keywords: Intelligent and Flexible Manufacturing, Zero-Defect Manufacturing, Process Control
Abstract: This paper introduces a physics-informed approach for the real-time prediction of mechanical properties in metallic materials, specifically those utilized in the intelligent manufacturing of hairpin coils for electric vehicle production. Inconsistencies in processing that result in variations in material properties frequently cause faults in metal forming operations, highlighting the necessity for continuous non-destructive monitoring. Although conventional tensile testing offers precise mechanical characterization, it is inadequate for inline monitoring. We create a prediction model utilizing eddy current testing (ECT), thermodynamic energy balance, and dislocation-based crystal plasticity. The method creates a theoretical connection between plastic deformation and electrical impedance, utilizing Matthiessen's Rule and circuit theory to measure the impact of dislocation density on conductivity and magnetic energy transfer efficiency. An algorithm based on physics was deployed on a production line and adjusted using environmental sensors. Experimental validation using 22 hairpin coil samples showed remarkable agreement with tensile test results, achieving a prediction error margin of 3.5% and a root mean square error (RMSE) of 1.56 MPa. This study illustrates the effectiveness of the suggested framework for non-destructive, real-time prediction of mechanical properties in intelligent manufacturing systems.
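For context, Matthiessen's Rule referenced above states that independent scattering mechanisms contribute additively to resistivity,

\rho_{total} = \rho_{thermal} + \rho_{impurity} + \rho_{dislocation},

so an increase in dislocation density during plastic deformation raises \rho_{dislocation} and thereby changes the eddy-current response; the specific coefficients linking dislocation density to resistivity belong to the paper's model and are not given here.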
|
|
10:48-11:06, Paper MoAT7.2 | |
Quantized Parallel Particle Filtering on FPGA for High-Speed Target Tracking in Edge Computing Devices |
|
Kim, Nayeon | Kumoh National Institute of Technology |
Lee, Heoncheol | Kumoh National Institute of Technology |
Lee, Jieun | LIG Nex1 |
Kim, Haerim | LIG Nex1 |
Choi, Wonseok | LIGnex1 |
Keywords: Optimization and Optimal Control
Abstract: This paper addresses the problem of applying particle filters to edge computing devices for tracking a high-speed target with nonlinear and non-Gaussian characteristics in real time. Conventional particle filters are inefficient here because of the long computation time caused by sequentially processing a large number of particles. This paper proposes a quantized parallel particle filtering method that can be efficiently executed on an FPGA (Field-Programmable Gate Array) to accelerate most of the computation processes. The proposed method employs INT16 quantization rather than double-precision arithmetic for better compatibility with FPGAs in edge computing devices. Pipelining and unrolling methods are then used for parallelization. Experimental results showed that the proposed method achieved a 3.32× overall speedup compared to conventional particle filters on a CPU. Also, despite measurement noise and quantization errors, the proposed method tracked the high-speed target accurately.
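The INT16 idea is standard symmetric fixed-point quantization: real-valued particle states and weights are mapped to 16-bit integers with a per-array scale factor before the parallel stages. A minimal NumPy sketch of that mapping (the paper's scale selection and FPGA pipelining details are not reproduced):

import numpy as np

def quantize_int16(x):
    # Symmetric INT16 quantization: returns (q, scale) with x approximately q * scale.
    scale = np.max(np.abs(x)) / 32767.0 if np.any(x) else 1.0
    q = np.clip(np.round(x / scale), -32768, 32767).astype(np.int16)
    return q, scale

def dequantize_int16(q, scale):
    return q.astype(np.float32) * scale

# Example: quantize particle weights before a parallelizable resampling step (illustrative).
weights = np.random.rand(4096)
q, s = quantize_int16(weights)
print(np.max(np.abs(dequantize_int16(q, s) - weights)))  # worst-case quantization error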
|
|
11:06-11:24, Paper MoAT7.3 | |
Performance Analysis for AIE-Based Matrix Multiplication Acceleration in Adaptive Compute Acceleration Algorithms |
|
Kim, Haerim | LIG Nex1 |
Lee, Jieun | LIG Nex1 |
Keywords: Optimization and Optimal Control, AI-Based Methods, Domain-specific Software and Software Engineering
Abstract: Real-time matrix computations, particularly the optimization of matrix multiplication using advanced technologies like the Versal AI Engine, are essential for modern guided weapon systems to enhance precision and responsiveness. This paper investigates the optimization of matrix multiplication (MMUL) using the Versal AI Engine (AIE) on the Xilinx VCK190 board, which is significant due to its advanced parallel processing capabilities that are ideal for enhancing computational efficiency in real-time applications. We analyze the performance impact of various block size configurations, such as 4×4×4 and 2×4×8, to exploit the board's parallel processing capabilities. Experimental results demonstrate that these smaller block sizes achieve optimal computational efficiency by effectively balancing execution speed with memory access demands, thereby maximizing the use of available resources. These findings offer a scalable and efficient approach to enhancing real-time processing in guidance systems, potentially leading to significant improvements in the precision and responsiveness of modern guided weapon systems.
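The trade-off being measured is the classic tiling choice: C = A·B is computed block by block, and the (M, K, N) tile shape (e.g., 4×4×4 or 2×4×8) determines how much data each kernel invocation touches. The NumPy reference below shows only the tiling loop, not the AIE kernel code, and the mapping of each inner product to one AIE MMUL call is a conceptual assumption.

import numpy as np

def blocked_matmul(A, B, mb=4, kb=4, nb=4):
    # Reference blocked MMUL; (mb, kb, nb) mirrors tile shapes such as 4x4x4 or 2x4x8.
    # Matrix dimensions are assumed to be multiples of the tile sizes for simplicity.
    M, K = A.shape
    K2, N = B.shape
    assert K == K2 and M % mb == 0 and K % kb == 0 and N % nb == 0
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, mb):
        for j in range(0, N, nb):
            for k in range(0, K, kb):
                # Innermost block product: the unit of work one accelerator call would handle.
                C[i:i+mb, j:j+nb] += A[i:i+mb, k:k+kb] @ B[k:k+kb, j:j+nb]
    return C

A = np.random.rand(8, 16); B = np.random.rand(16, 8)
assert np.allclose(blocked_matmul(A, B, 2, 4, 8), A @ B)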
|
|
11:24-11:42, Paper MoAT7.4 | |
Integrated Monitoring in Aquaponics - a Preliminary Study |
|
Edan, Yael | Ben-Gurion University of the Negev |
Turetzky, Dan | Ben Gurion University of the Negev |
Amir, Nadav | Ben Gurion University |
Aflalo, Eliahu | Ben Gurion University |
Eisa, Adam | Ben Gurion University of the Negev |
Keywords: Machine learning, Deep Learning in Robotics and Automation, Computer Vision in Automation
Abstract: Aquaponics is an innovative and sustainable food production system integrating aquaculture and hydroponics to create a symbiotic environment. In this preliminary study, the growth of both plants and fish raised in a nutrient film technique (NFT) system was monitored over one season. Computer vision techniques were developed to: 1) estimate lettuce plant size, 2) measure root system size (length and width), and 3) estimate fish weight. Statistical analyses were applied to determine the effect of location along the NFT system on root length and plant size. Fish length was used to estimate fish weight based on a regression model developed from fish length and weight measurements. By providing an integrated approach to automated monitoring systems, we aim to contribute to the development of improved aquaponic cultivation.
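A common way to realize the length-to-weight regression mentioned above is an allometric model W = a·L^b fitted on log-transformed data; the sketch below uses that standard form with synthetic numbers and is not the study's actual model or coefficients.

import numpy as np

# Synthetic calibration data (hypothetical): fish length in cm, weight in g
lengths = np.array([8.0, 10.5, 12.0, 14.5, 16.0, 18.5])
weights = np.array([6.1, 13.8, 20.5, 36.9, 49.0, 77.2])

# Fit log(W) = log(a) + b*log(L) with ordinary least squares
b, log_a = np.polyfit(np.log(lengths), np.log(weights), deg=1)
a = np.exp(log_a)

def estimate_weight(length_cm):
    """Predict fish weight (g) from measured length (cm) via W = a * L^b."""
    return a * length_cm ** b

print(f"a={a:.4f}, b={b:.2f}, predicted weight at 15 cm: {estimate_weight(15.0):.1f} g")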
|
|
11:42-12:00, Paper MoAT7.5 | |
Shape Feature-Informed Segmentation in the Rolling Process |
|
Lee, Doryun | POSCO |
Keywords: AI-Based Methods, Computer Vision in Automation, Computer Vision for Manufacturing
Abstract: In the steel industry, recognizing the shape of steel plates during the rolling process using AI-based cameras is crucial. However, due to the characteristics of industrial environments, issues such as steam and water often degrade the quality of images captured by the cameras. In such conditions, errors can occur during image segmentation, where non-steel areas are mistakenly classified as steel, or parts of the steel plate are incorrectly identified as non-steel. To address these issues, this study developed a segmentation method leveraging domain knowledge of steel plate characteristics. Specifically, we designed loss functions that account for object recognition along the thickness direction and the continuity of steel plates, integrating them into the segmentation process. As a result, compared to the existing YOLOv11 segmentation method, the proposed approach improved precision from 0.8393 to 0.8647, maintained recall at 1.0, and increased mAP50-95 from 0.8782 to 0.8991, indicating enhanced overall performance. This approach allows for more accurate recognition of steel plate shapes even when image quality is compromised, increasing its applicability in the rolling processes of the steel industry.
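One way to picture a continuity-aware loss of the kind described above (a generic sketch, not the paper's exact formulation) is a penalty on gaps along the rolling direction inside the predicted steel-plate mask, added to a standard pixel-wise loss:

import numpy as np

def continuity_penalty(pred_mask, axis=1):
    """Penalize 'holes' along the rolling direction of a predicted plate mask.

    pred_mask -- 2D array of per-pixel plate probabilities in [0, 1]
    axis      -- direction in which a physical steel plate should be continuous
    """
    diffs = np.abs(np.diff(pred_mask, axis=axis))   # large jumps = fragmented prediction
    return diffs.mean()

def total_loss(pred_mask, target_mask, lam=0.1):
    """Pixel-wise cross-entropy plus a continuity term (lam is a hypothetical weight)."""
    bce = -(target_mask * np.log(pred_mask + 1e-7)
            + (1 - target_mask) * np.log(1 - pred_mask + 1e-7)).mean()
    return bce + lam * continuity_penalty(pred_mask)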
|
|
MoAT8 |
Room T8 |
Federated and Distributed Learning |
Special Session |
Chair: Yue, Xubo | University of Michigan |
Co-Chair: Reisi Gahrooei, Mostafa | University of Florida |
Organizer: Reisi Gahrooei, Mostafa | University of Florida |
Organizer: Yue, Xubo | University of Michigan |
Organizer: Gaw, Nathan | Georgia Institute of Technology |
|
10:30-10:48, Paper MoAT8.1 | |
Federated Learning for Deep Anomaly Detection with Noisy and Heterogeneous Data (I) |
|
Li, Ao | The Hong Kong University of Science and Technology (HKUST) |
Li, Songze | Southeast University |
Tsung, Fugee | HKUST |
Keywords: Machine learning
Abstract: We consider the problem of unsupervised learning for deep anomaly detection (DAD) in a federated learning (FL) network, consisting of a central server and many distributed clients. The conventional FedAvg algorithm involves clients training local DAD models with their private datasets, and then uploading these models to the server for aggregation into a global model. In practical scenarios, this framework faces two major challenges: 1) unlabeled training datasets may contain unknown anomalies; 2) training datasets from different clients are typically non-independent and identically distributed (non-IID). To address these problems, we propose FedDAD, a robust FL framework for training DAD models with noisy and heterogeneous data. In FedDAD, a small public dataset at the server, containing only a few normal samples (e.g., one sample from each normal class), serves as a normal anchor in the latent space across all clients. This anchor significantly improves the accuracy of identifying unknown anomalies at clients, in the presence of data heterogeneity. Having identified anomalies, clients utilize contrastive learning to train local feature extractors that further enhance the separation between normal and abnormal data. Extensive experimental results demonstrate the uniform superiority of FedDAD over all FL baselines across various settings and datasets. Furthermore, the model trained with FedDAD even achieves comparable performance to the model trained in a centralized manner.
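For reference, the conventional FedAvg aggregation mentioned above can be sketched as a sample-size-weighted average of client model parameters; this is a generic sketch of the baseline, not of FedDAD itself.

import numpy as np

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of client parameter vectors (one FedAvg round, sketch only).

    client_weights -- list of 1D numpy arrays, flattened local model parameters
    client_sizes   -- list of local dataset sizes, used as aggregation weights
    """
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))

# Three clients with unequal data volumes (illustrative numbers only)
global_model = fedavg_aggregate(
    [np.random.randn(10), np.random.randn(10), np.random.randn(10)],
    client_sizes=[500, 1200, 300])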
|
|
10:48-11:06, Paper MoAT8.2 | |
Adaptive Asynchronous Federated Learning with Convolutional Autoencoders in Heterogeneous Systems (I) |
|
Kim, Jungmin | Rutgers University |
Guo, Weihong | Rutgers University |
Keywords: Machine learning
Abstract: Federated Learning (FL) has emerged as a promising solution for distributed machine learning in various systems, enabling collaborative model training while preserving data privacy. However, traditional synchronous FL approaches face challenges in heterogeneous environments where clients have varying computational capabilities and network conditions. This paper proposes a novel asynchronous FL framework with convolutional autoencoders for unsupervised representation learning (URL) in distributed systems. Unlike conventional FL methods that require synchronous updates from all clients, our approach enables flexible client participation based on their update history, making it particularly suitable for real-time scenarios where continuous model improvement is critical. The framework addresses key challenges in diverse systems: privacy preservation, unlabeled data handling, and system heterogeneity. Experimental results on the MNIST dataset demonstrate that our method achieves faster convergence and lower training time compared to centralized learning (CL), individual learning (IL), and traditional FL approaches, such as Federated Averaging (FedAvg) and FedProx, while maintaining comparable accuracy.
|
|
11:06-11:24, Paper MoAT8.3 | |
Federated Learning of Dynamic Bayesian Network Via Continuous Optimization from Time Series Data (I) |
|
Yue, Xubo | Northeastern University |
Keywords: Causal Models
Abstract: Traditionally, learning the structure of a Dynamic Bayesian Network has been centralized, requiring all data to be pooled in one location. However, in real-world scenarios, data are often distributed across multiple entities (e.g., companies, devices) that seek to collaboratively learn a Dynamic Bayesian Network while preserving data privacy and security. More importantly, due to the presence of diverse clients, the data may follow different distributions, resulting in data heterogeneity. This heterogeneity poses additional challenges for centralized approaches. In this study, we first introduce a federated learning approach for estimating the structure of a Dynamic Bayesian Network from homogeneous time series data that are horizontally distributed across different parties. We then extend this approach to heterogeneous time series data by incorporating a proximal operator as a regularization term in a personalized federated learning framework. To this end, we propose FDBNL and PFDBNL, which leverage continuous optimization, ensuring that only model parameters are exchanged during the optimization process. Experimental results on synthetic and real-world datasets demonstrate that our method outperforms state-of-the-art techniques, particularly in scenarios with many clients and limited individual sample sizes.
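The proximal-operator regularization used for the personalized (heterogeneous) variant can be illustrated with a FedProx-style penalty that pulls each client's parameters toward the shared model during local optimization; the code below is a generic sketch with a hypothetical gradient function, not the paper's FDBNL/PFDBNL implementation.

import numpy as np

def local_update_with_prox(w_local, w_global, grad_fn, mu=0.1, lr=0.01, steps=50):
    """Gradient steps on: local_loss(w) + (mu/2) * ||w - w_global||^2  (sketch only).

    grad_fn -- callable returning the gradient of the client's local loss (hypothetical)
    mu      -- proximal strength; larger values keep clients closer to the global model
    """
    w = w_local.copy()
    for _ in range(steps):
        g = grad_fn(w) + mu * (w - w_global)   # gradient of the proximal term added
        w -= lr * g
    return w

# Toy quadratic local objective, purely illustrative
grad_fn = lambda w: 2.0 * (w - np.array([1.0, -2.0, 0.5]))
w_new = local_update_with_prox(np.zeros(3), np.ones(3), grad_fn)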
|
|
11:24-11:42, Paper MoAT8.4 | |
A Federated Semi-Supervised Approach to Predicting Parkinson’s Disease Severity from Tabular Data (I) |
|
Allsop, Jennifer | Air Force Institute of Technology |
Gaw, Nathan | Georgia Institute of Technology |
Reisi Gahrooei, Mostafa | University of Florida |
Cox, Bruce | Air Force Institute of Technology |
Johnstone, Chancellor | Air Force Institute of Technology |
Keywords: AI and Machine Learning in Healthcare
Abstract: Data privacy is a growing concern in real-world machine learning (ML) applications, particularly in sensitive domains like healthcare. Federated learning (FL) offers a promising solution by enabling model training across decentralized, private data sources. However, both traditional ML and FL approaches typically assume access to fully labeled datasets, an assumption that rarely holds in practice. Users often lack the time, motivation, or expertise to label their data, making labeled examples scarce. This paper proposes a federated semi-supervised learning (FSSL) framework that learns from a small set of labeled data alongside a large volume of unlabeled data. Our approach combines FL with VIME, a leading semi-supervised learning (SSL) method for tabular data. Unlike image or text data, tabular data presents unique challenges for SSL due to the absence of transferable pretext tasks. We evaluate our method on predicting Parkinson’s disease severity and show that it significantly outperforms both supervised and SSL baselines across varying proportions of labeled data. The model achieves an RMSE of 7.74 and an MAE of 6.26 in the most challenging setting with only 10% labeled data, substantially outperforming both supervised FL and standalone SSL baselines and demonstrating the strength of our method under limited supervision. These results show that our method effectively leverages unlabeled data to enhance predictive performance in a privacy-preserving, real-world setting.
|
|
11:42-12:00, Paper MoAT8.5 | |
Federated Automatic Latent Variable Selection in Multi-Output Gaussian Processes (I) |
|
Gao, Jingyi | University of Virginia |
Chung, Seokhyun | University of Virginia |
Keywords: Probability and Statistical Methods, Learning and Adaptive Systems, Optimization and Optimal Control
Abstract: This paper explores a federated learning approach that automatically selects the number of latent processes in multi-output Gaussian processes (MGPs). The MGP has seen great success as a transfer learning tool when data is generated from multiple sources or units. A common approach in MGPs to transfer knowledge across units involves gathering all data from each unit to a central server and extracting common independent latent processes to express each unit as a linear combination of the shared latent patterns. However, this approach poses key challenges in (i) determining the adequate number of latent processes and (ii) relying on centralized learning, which leads to potential privacy risks and significant computational burdens on the central server. To address these issues, we propose a hierarchical model that places spike-and-slab priors on the coefficients of each latent process. These priors help automatically select only the needed latent processes by shrinking the coefficients of unnecessary ones to zero. To estimate the model while avoiding the drawbacks of centralized learning, we propose a variational inference-based approach that formulates model inference as an optimization problem compatible with federated settings. We then design a federated learning algorithm that allows units to jointly select and infer the common latent processes without sharing their data. Simulation and case studies on Li-ion battery degradation demonstrate the advantageous features of our proposed approach.
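The spike-and-slab prior on the latent-process coefficients can be illustrated with a tiny generative sketch (generic formulation with made-up hyperparameters, not the paper's exact model): each coefficient is zero with probability 1 - pi (the "spike") and drawn from a broad Gaussian "slab" otherwise, which is what shrinks unneeded latent processes away.

import numpy as np

rng = np.random.default_rng(0)

def sample_spike_and_slab(n_latent, pi=0.3, slab_std=1.0):
    """Draw mixing coefficients for n_latent latent processes under a spike-and-slab prior."""
    included = rng.random(n_latent) < pi          # Bernoulli(pi) inclusion indicators
    slab = rng.normal(0.0, slab_std, n_latent)    # 'slab': diffuse Gaussian when included
    return np.where(included, slab, 0.0)          # 'spike': exact zero when excluded

coeffs = sample_spike_and_slab(n_latent=8)
print("active latent processes:", np.nonzero(coeffs)[0])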
|
|
MoAT9 |
Room T9 |
Human-Robot and HCA 1 |
Regular Session |
Chair: Rahman, S M Mizanoor | Pennsylvania State University |
|
10:30-10:48, Paper MoAT9.1 | |
Emotion Recognition: Low-Rank Multimodal Shear and Splicing Fusion |
|
Wang, Jiaming | University of Chinese Academy of Sciences |
Yue, ZhiJian | University of Chinese Academy of Sciences |
Huang, Jiangpeng | Chongqing Aerospace Rocket Electronics Technology Co., Ltd |
Yang, Leiyu | Chongqing Aerospace Rocket Electronics Technology Co., Ltd |
Liu, Yujie | Chongqing Aerospace Rocket Electronics Technology Co., Ltd |
Peng, Yi | Chongqing Aerospace Rocket Electronics Technology Co., Ltd |
Xia, Chengzhu | Chongqing Aerospace Rocket Electronics Technology Co., Ltd |
Wang, Xupeng | Chongqing Aerospace Rocket Electronics Technology Co., Ltd |
Wang, Yong | University of Chinese Academy of Sciences |
Keywords: Computer Vision in Automation, Data fusion, Deep Learning in Robotics and Automation
Abstract: We propose a low-rank multi-modal shearing and splicing fusion (LMSSF) method for accurate emotion recognition by effectively integrating information from three modalities: text, image, and voice. Recognizing users' emotions accurately from social media information is challenging due to the diversity of user-generated data and the difficulty of accurately identifying and extracting features from multi-modal information. Our method fills a gap in the multi-modal field by leveraging feature extraction and fusion techniques to combine voice, image, and text modalities for emotion recognition. To address the interaction of multi-modal information, we introduce standard feature extraction and private feature retention methods to ensure the integrity of the multimodal information. Furthermore, we have developed a step-by-step discrimination approach that significantly reduces calculation and discrimination time by distinguishing standard features of three modalities, common features of two modalities, and private features. Our method effectively solves the problem of accurately recognizing emotions of social media users with diverse information modalities, achieving 97.3% accuracy with only 180K and an accuracy improvement of nearly 10% over other methods.
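As background for the low-rank fusion idea above, a generic low-rank multimodal fusion step (in the spirit of low-rank tensor fusion, not the authors' exact LMSSF shear-and-splice operators) projects each modality with rank-r factor matrices and combines the projections by elementwise products; all dimensions below are illustrative assumptions.

import numpy as np

def low_rank_fusion(text_h, image_h, audio_h, factors):
    """Rank-r multimodal fusion sketch: combine modality projections elementwise.

    factors -- dict of per-modality factor tensors with shape (rank, dim_m, out_dim)
    """
    fused = 0.0
    rank = factors["text"].shape[0]
    for r in range(rank):
        # Project each modality with its r-th factor, then take the elementwise product
        zt = text_h  @ factors["text"][r]
        zi = image_h @ factors["image"][r]
        za = audio_h @ factors["audio"][r]
        fused = fused + zt * zi * za
    return fused

rng = np.random.default_rng(1)
dims = {"text": 32, "image": 64, "audio": 16}
factors = {m: rng.normal(size=(4, d, 8)) * 0.1 for m, d in dims.items()}
z = low_rank_fusion(rng.normal(size=32), rng.normal(size=64), rng.normal(size=16), factors)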
|
|
10:48-11:06, Paper MoAT9.2 | |
Towards Human-Understandable Visual Recognition for Nonexperts in Industrial Inspection: A Case Study for Car Manufacturing Lines |
|
Sardari, Sarvenaz | Mercedes-Benz AG |
Fernandes, Freddy | Mercedes Benz |
Araya Martinez, Jose Moises | Mercedes-Benz AG, TU Berlin |
Zak, Jan Alexander | Mercedes-Benz AG |
Roitberg, Alina | University of Stuttgart |
Keywords: Computer Vision for Manufacturing, Human-Centered Automation, Computer Vision in Automation
Abstract: Despite growing interest in attribution-based eXplainable Artificial Intelligence (XAI) methods, existing tools designed by Artificial Intelligence (AI) experts often overlook the needs of end users such as Blue Collar Workers (BCW), increasing environmental and financial risks for companies relying on them for critical decisions in industrial inspection. In this work, we aim to understand BCW needs in car manufacturing lines and improve the transparency and trustworthiness of visual recognition tools. We evaluate existing XAI methods, such as Shapley values and saliency maps, and explore how enhancing them with Large Multimodal Models (LMMs) can help bridge the gap between model explanations and human intuition. Our goal is to provide explanations that better align with the needs of nonexpert users. Our hypothesis is validated through a custom-designed user study involving 20 nonexpert users, specifically addressing two key use cases: nut welding quality assurance in body-in-white assembly and part carrier inspection for assembly line readiness. The results demonstrate that, when humans were assisted by our XAI methods, satisfaction and efficiency increased in use cases involving object detection or classification. These findings suggest that augmenting attribution-based methods with LMM explanations offers great potential for human-understandable XAI interfaces in automotive manufacturing, reducing the risks of over- or under-trusting AI in critical production settings.
|
|
11:06-11:24, Paper MoAT9.3 | |
Trust-Triggered Cyber-Physical-Human System for Human-Robot Collaboration in Flexible Manufacturing |
|
Rahman, S M Mizanoor | Pennsylvania State University |
Keywords: Human-Centered Automation, Collaborative Robots in Manufacturing, Cyber-physical Production Systems and Industry 4.0
Abstract: We proposed and investigated a bidirectional trust-triggered cyber-physical-human (CPH) system framework for human-robot collaborative assembly in flexible manufacturing. For this purpose, we developed a one human-one robot hybrid cell where the human and the robot collaborated with each other to perform the assembly of different manufacturing components in a flexible manufacturing setup. In the proposed framework, we configured the human-robot collaborative system using three interconnected components of a CPH system: the cyber system (software system), the physical system (the robot, sensors and necessary hardware), and the human system (the human co-worker, supervisor and work environment). We divided the functions of the CPH framework into three interconnected modules: computing, communication and control. We proposed a model to compute the human’s and robot’s bidirectional trust in real-time to monitor and control the performance of the CPH framework. We evaluated the performance of the framework by implementing it on a human-robot collaborative assembly setup under different experimental conditions considering variations in: (i) computing complexity, (ii) communication complexity, (iii) control complexity, and (iv) human perceptual complexity. We compared the results among those experimental conditions and identified a condition that enabled the CPH framework to demonstrate the highest level of performance. The results revealed satisfactory performance of the CPH framework in terms of human-robot interaction, task efficiency and quality. The results can help incorporate modularity and objectivity into the design, development, analysis and control of human-robot collaborative systems by configuring them in the form of a CPH framework.
|
|
11:24-11:42, Paper MoAT9.4 | |
Efficient and Human Centered Industry 5.0 Data Propagation on the Operational Technology Level. Case Study with OPC UA Interfacing, Node-RED and Ignition |
|
Korodi, Adrian | University Politehnica Timisoara, Faculty of Automation and Comp |
Vesa, Vlad-Cristian | University Politehnica Timisoara, Faculty of Automation and Comp |
Dontu, Raul Andrei | University Politehnica Timisoara, Faculty of Automation and Comp |
Keywords: Human-Centered Automation, Cyber-physical Production Systems and Industry 4.0, Factory Automation
Abstract: Industry 5.0 focuses on human centricity, sustainability and resilience, while still building on Industry 4.0's core objective of increased efficiency. The operational technology (OT) level usually lacks data structuring and context, these steps being attributed to the middleware interfacing with the information technology (IT) level, or to higher SCADA levels. The higher the hierarchical level, the less knowledge of the technological process is available, and therefore representations may not be optimal. Also, developing multiple supervisory control and data acquisition (SCADA) solutions targeting the same controlled process leads to different perspectives, higher costs, longer development times, and more difficult maintenance. Considering both legacy systems and technological progress, the current paper proposes a solution that assures efficient and human-centered structured, contextualized, and graphically sustained data propagation at the OT level. The work approaches and builds upon industry-adopted environments and protocols in order to assure fast adoption and a large impact of the solution. A Node-RED and Ignition based case study is presented, representing the PLC-to-SCADA data integration levels. The structured and graphically depicted data transfer and representation using the Open Platform Communications Unified Architecture (OPC UA) protocol shows good results, without the need for additional developments on the SCADA levels.
|
|
11:42-12:00, Paper MoAT9.5 | |
Real-Time Social Presence Modulation of Embodied AI-Based Robots: An Audio-Centric Approach |
|
Wijesinghe, Nipuni | University of Canberra |
Jayasuriya, Maleen | University of Canberra |
Grant, Janie Busby | University of Canberra |
Herath, Damith | University of Canberra |
Keywords: Human-Centered Automation, Machine learning, Behavior-Based Systems
Abstract: Recent advancements in large language models have enabled robotic embodiment, yielding AI-driven robots with simulated personalities and social adeptness. However, modulating embodiment and presence remains overlooked. Unlike humans and animals, who instinctively adjust their presence (heightened in emergencies, subdued during focused work), robots operate rigidly and lack such adaptability, for example in modulating vocal tone contextually. This paper proposes a framework for real-time, context-based social presence modulation in embodied AI, which comprises three components: Sensor Integration, Context Identification, and Adaptive Presence Expressions. In our initial implementation, we use audio-based context detection, expandable to visual and physiological cues. The system combines CNN-based ambient sound detection, speech-to-text keyword analysis, and sentiment evaluation via a transformer pipeline; these signals are statistically fused and context is classified through a Naive Bayes model. We define three primary states: Alarmed (emergency scenario), Social ('everyday' functioning scenario), and Disengaged (no system presence scenario), plus two intermediate states to address uncertainty: Alert (between Alarmed and Social) and Passive (between Social and Disengaged). Real-world testing on a robot confirmed real-time modulation of actions and speech, validating the framework’s efficacy in adaptive social presence.
|
|
MoAT10 |
Room T10 |
Manipulation 1 |
Regular Session |
Chair: Komaee, Arash | Southern Illinois University, Carbondale |
|
10:30-10:48, Paper MoAT10.1 | |
Estimating Force/Torque Sensor Offsets and Gravity Parameters Using Only Wrench Measurements to Facilitate Human Demonstration of Robot Manipulation Tasks in Contact |
|
Mousavi Mohammadi, Ali | KU Leuven |
Vochten, Maxim | KU Leuven |
De Schutter, Joris | KU Leuven |
Aertbelien, Erwin | KU Leuven |
Keywords: Calibration and Identification, Model Learning for Control, Compliant Assembly
Abstract: In human demonstration of manipulation tasks involving contact, a tool equipped with a force/torque (FT) sensor is typically used to capture contact wrenches (i.e., forces and moments). For accurate measurements, the sensor must be properly calibrated to ensure that it only records the contact wrenches during tool-environment interactions. This estimation in our context refers to estimating the sensor offsets, as well as the gravity parameters, i.e. mass and center of mass (COM), of the rigid object mounted on the FT sensor. Proper estimation enables access to the true contact wrench, which facilitates construction of reliable task models from human demonstrations. We propose a method for estimating sensor offsets and gravity parameters using only wrench measurements. This is particularly beneficial in scenarios where orientation information is unavailable or unreliable. By relying solely on wrench data, the method mitigates motion-related noise, eliminates the need for sensor orientation calibration, and remains computationally efficient, making it well-suited for real-time applications. The method's effectiveness is evaluated against a baseline method that utilizes both wrench and accurate orientation measurements, whereas our method relies solely on wrench data. The results show that both methods perform well, particularly when the excitation range exceeds 10°. However, the proposed method consistently outperforms the baseline across all experiments within this excitation range.
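One plausible way to realize wrench-only estimation of the force offset and gravity magnitude is a linear least-squares sphere fit, under the assumption that static force readings at many orientations lie on a sphere of radius m·g centered at the sensor's force offset; this is a simplified illustration, not necessarily the authors' exact formulation.

import numpy as np

def fit_force_offset_and_mass(forces, g=9.81):
    """Least-squares sphere fit to static force readings (sketch only).

    forces -- (N, 3) array of measured forces at many unknown orientations
    Returns the estimated force offset (sphere center) and tool mass (radius / g).
    """
    A = np.hstack([2.0 * forces, np.ones((forces.shape[0], 1))])
    b = np.sum(forces**2, axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, d = sol[:3], sol[3]
    radius = np.sqrt(d + center @ center)
    return center, radius / g

# Synthetic check: offset [2, -1, 3] N, 0.8 kg tool at random orientations (illustrative)
rng = np.random.default_rng(0)
dirs = rng.normal(size=(200, 3)); dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
forces = np.array([2.0, -1.0, 3.0]) + 0.8 * 9.81 * dirs + rng.normal(0, 0.01, (200, 3))
print(fit_force_offset_and_mass(forces))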
|
|
10:48-11:06, Paper MoAT10.2 | |
HARMONI: Haptic-Guided Assistance for Unified Robotic Tele-Manipulation and Tele-Navigation |
|
Sripada, Venkatesh | University of Surrey |
Khan, Muhammad Arshad | University of Lincoln |
Foecker, Julia | University of Lincoln |
Parsa, Soran | University of Huddersfield |
Palavajjhala, Susmitha | Jaguar Land Rover |
Maior, Horia Alexandru | University of Nottingham |
Ghalamzan Esfahani, Amir Masoud | University of Surrey |
Keywords: Control Architectures and Programming, Behavior-Based Systems, Robust/Adaptive Control
Abstract: Shared control, which combines human expertise with autonomous assistance, is critical for effective teleoperation in complex environments. While recent advances in haptic-guided teleoperation have shown promise, they are often limited to simplified tasks involving 6- or 7-DoF manipulators and rely on separate control strategies for navigation and manipulation. This increases both cognitive load and operational overhead. In this paper, we present a unified tele-mobile manipulation framework that leverages haptic-guided shared control. The system integrates a 9-DoF follower mobile manipulator and a 7-DoF leader robotic arm, enabling seamless transitions between tele-navigation and tele-manipulation through real-time haptic feedback. A user study with 20 participants under real-world conditions demonstrates that our framework significantly improves task accuracy and efficiency without increasing cognitive load. These findings highlight the potential of haptic-guided shared control for enhancing operator performance in demanding teleoperation scenarios.
|
|
11:06-11:24, Paper MoAT10.3 | |
Immersive Teleoperation Framework for Locomanipulation Tasks |
|
Boehringer, Takuya | University College London |
Embley-Riches, Jonathan | University College London |
Hammoud, Karim | University College London |
Modugno, Valerio | University College London |
Kanoulas, Dimitrios | University College London |
Keywords: Virtual Reality and Interfaces, Telerobotics and Teleoperation
Abstract: Recent advancements in robotic loco-manipulation have leveraged Virtual Reality (VR) to enhance the precision and immersiveness of teleoperation systems, significantly outperforming traditional methods reliant on 2D camera feeds and joystick controls. Despite these advancements, challenges remain, particularly concerning user experience across different setups. This paper introduces a novel VR-based teleoperation framework designed for a robotic manipulator integrated onto a mobile platform. Central to our approach is the application of Gaussian splatting, a technique that abstracts the manipulable scene into a VR environment, thereby enabling more intuitive and immersive interactions. Users can navigate and manipulate within the virtual scene as if interacting with a real robot, enhancing both the engagement and efficacy of teleoperation tasks. An extensive user study validates our approach, demonstrating significant usability and efficiency improvements. Two-thirds (66%) of participants completed tasks faster, achieving an average time reduction of 43%. Additionally, 93% preferred the Gaussian Splat interface overall, with unanimous (100%) recommendations for future use, highlighting improvements in precision, responsiveness, and situational awareness. Finally, we demonstrate the effectiveness of our framework through real-world experiments in two distinct application scenarios, showcasing the practical capabilities and versatility of the Splat-based VR interface.
|
|
11:24-11:42, Paper MoAT10.4 | |
Robust Feedback Linearization for Noncontact Manipulation of Magnetic Particles by a Hexagonal Array of Electromagnets |
|
Hasan, MD Nazmul | Southern Illinois University Carbondale |
Komaee, Arash | Southern Illinois University, Carbondale |
Keywords: Robust/Adaptive Control, Medical Robots and Systems, Automation at Micro-Nano Scales
Abstract: This paper develops a robust feedback control law for planar steering of magnetic particles in a circular workspace using a hexagonal arrangement of electromagnets encircling the workspace. The electromagnets are actuated independently by a feedback loop to control their aggregate magnetic field flexibly, which is then exploited to exert magnetic force on the particles as needed for steering them along desired reference trajectories. This magnetic force is a highly nonlinear function of both the position of the magnetic particles and the control voltages actuating the electromagnets. The control strategy in this work is to cancel this nonlinearity using a nonlinear inverse map, which, in effect, realizes the concept of feedback linearization. Such an inverse map is developed in this paper in a specific way that drastically enhances its robustness against inherent modeling errors and uncertainty in the magnetic field. The resulting robust inverse map notably improves the performance of feedback control, as demonstrated by experiments in this paper.
|
|
11:42-12:00, Paper MoAT10.5 | |
Hierarchical Control Framework for Collision-Free Collaborative Loco-Manipulation of Large and Heavy Objects |
|
Rigo, Alberto | University of Southern California |
Ma, Junchao | University of Southern California |
Chun, Nathan | University of Southern California |
Gupta, Satyandra K. | University of Southern California |
Nguyen, Quan | University of Southern California |
Keywords: Optimization and Optimal Control, Planning, Scheduling and Coordination, Collision Avoidance
Abstract: Legged manipulators offer significant advantages over traditional mobile manipulators, particularly in navigating uneven terrain. However, they are limited by payload capacity and the dimensions of the objects they can manipulate. Collaboration between legged manipulators can mitigate these challenges, but it introduces complexities in coordinating the robots to manipulate and locomote as a unified team. This paper presents a novel hierarchical framework for collaborative loco-manipulation with quadruped manipulators, designed to address the challenges of coordinating a robot team and handling bulky, heavy objects. The framework starts with a manipulation planner that computes the desired trajectory of the object, incorporating obstacle avoidance. A subsequent mapping between the object's trajectory and the robot's states ensures that commands are compatible with each robot. Finally, a decentralized loco-manipulation controller tracks the reference for the end effector while steering the robot base to avoid obstacles and compensate for manipulation forces. Our approach has been validated through simulations and hardware experiments, demonstrating the framework's versatility across different robot team compositions and a variety of payload-carrying tasks. We highlight the critical role of obstacle avoidance when manipulating bulky objects in real-world scenarios, particularly the need to prevent contact between the ground and the object being manipulated.
|
|
MoAT11 |
Room T11 |
Best Conference Papers Competition |
Special Session |
Chair: Lennartson, Bengt | Chalmers University of Technology |
|
10:30-10:55, Paper MoAT11.1 | |
A Fast Solution Method for Unit Commitment with Renewable Power Via Ordinal Optimization (I) |
|
Xu, Zhibo | Tsinghua University |
Liu, Siwei | Tsinghua University |
Jia, Qing-Shan | Tsinghua University |
Keywords: Optimization and Optimal Control, AI-Based Methods, Smart Grids
Abstract: The high penetration of renewable power has led to an increasing demand for rapid solutions to large scale security-constrained unit commitment (SCUC) problems. In this paper, we develop a fast solution method for SCUC problems with a significant number of renewable power scenarios based on Ordinal Optimization (OO). First, we propose a Crude Feasibility Model to efficiently generate estimated feasible solutions. The Crude Feasibility Model uses machine learning (ML) techniques to generate high-quality initial solutions, and then evaluates the feasibility of these solutions based on feasibility conditions with negligible computational effort. Second, we propose an OO-based method to rapidly seek good enough solutions while still providing probabilistic performance guarantees. The computational burden is reduced from the original large-scale mixed-integer programming (MIP) problem to a few linear programming (LP) problems that can be solved in parallel, significantly accelerating the solution process. Numerical experiments on the IEEE 118-bus system demonstrate that our method achieves the near-optimal solution with a performance gap of no more than 0.4%, while achieving a speedup of 8.39 to 27.99 times across different scales compared to Gurobi. Our method exhibits high computational efficiency and scales effectively to larger problems.
|
|
10:55-11:20, Paper MoAT11.2 | |
Multi-Modal Generative Modeling of Event Sequences and Time Series for Solar PV Systems |
|
Huang, Jiayu | Arizona State University |
Xu, Boyang | Arizona State University |
Liu, Yongming | Arizona State University |
Yan, Hao | Arizona State University |
Keywords: Data fusion, Machine learning, Deep Learning in Robotics and Automation
Abstract: This paper presents a multimodal learning model designed to simulate a solar energy plant system for predictive maintenance and fault prediction. We propose a novel approach to enhance system generation through conditional generation by leveraging both event sequences and time-series data. We employ a Transformer Hawkes process to encode event data and an iTransformer to encode time-series data. We then introduce a co-attention mechanism to effectively combine these two modalities, capturing dependencies and interactions between events and time-series signals. The integrated representations enable a conditional generation framework that iteratively predicts future system states and events over an extended time horizon. The simulation is based on a comprehensive collection of data from the Red Rock Solar Site in Arizona. Experimental evaluations demonstrate that our approach offers promising applications in predictive maintenance, fault detection, and energy optimization in solar power systems.
|
|
11:20-11:45, Paper MoAT11.3 | |
Extended Invalid Action Mask: Training-Time Size-Agnostic Safe Reinforcement Learning for Flexible Job Shop Scheduling Problems |
|
Cai, Weilin | The Chinese University of Hong Kong, Shenzhen |
Zheng, Wenjun | The Chinese University of Hong Kong, Shenzhen |
Wang, Zhaoli | The Chinese University of Hong Kong, Shenzhen |
Chen, Yilan | Columbia University in the City of New York |
Mao, Jianfeng | The Chinese University of Hong Kong, Shenzhen |
Keywords: Planning, Scheduling and Coordination, Reinforcement, Intelligent and Flexible Manufacturing
Abstract: The flexible job shop scheduling problem (FJSP) is an NP-hard problem in which machine assignments are additionally considered compared to the classical job shop scheduling problem. It has received increased attention in recent years, as its global hard constraints highlight the importance of policy safety, especially in real production contexts. Deep reinforcement learning (DRL) has become a popular method for solving FJSP; however, ensuring both size-agnostic and safety-guaranteed properties during training remains challenging. Penalty-based approaches allow infeasible actions, degrading policy quality, while standard invalid action masking prevents training across different problem sizes. To address this, we propose an extended action-masking mechanism that maintains semantic consistency across varying scales while strictly enforcing action feasibility. Our framework integrates state and action embeddings to support adaptive scheduling and multi-scale training. Empirical results demonstrate improved performance over the latest DRL-based methods, indicating that our approach is scalable and robust for production scheduling.
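The core of invalid action masking, which the extended mechanism above builds on, can be sketched in a few lines: infeasible actions get their logits pushed toward negative infinity before the softmax, so the policy can never sample them. This is a generic sketch, not the paper's extended size-agnostic mask.

import numpy as np

def masked_policy(logits, valid_mask):
    """Zero out the probability of invalid actions by masking logits (sketch only).

    logits     -- raw policy scores, shape (num_actions,)
    valid_mask -- boolean array, True where the action is feasible under FJSP constraints
    """
    masked = np.where(valid_mask, logits, -1e9)   # effectively -inf for invalid actions
    exp = np.exp(masked - masked.max())           # numerically stable softmax
    return exp / exp.sum()

probs = masked_policy(np.array([1.2, 0.3, -0.5, 2.0]),
                      np.array([True, False, True, False]))
print(probs)   # invalid actions receive (near-)zero probability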
|
|
MoBT1 |
Room T1 |
AI-Powered Collaborative Manufacturing |
Special Session |
Chair: Wang, Junkai | Tongji University |
Co-Chair: Li, Rui | Montclair State University |
Organizer: Wang, Junkai | Tongji University |
Organizer: Chang, Qing | University of Virginia |
Organizer: Matta, Andrea | Politecnico Di Milano |
Organizer: Wang, Xi Vincent | KTH Royal Institute of Technology |
Organizer: Li, Xiaoou | Center of Research and Advanced Studies of National Polytechnic I |
Organizer: Tang, Ying | Rowan University |
Organizer: Yan, Chao-Bo | Xi'an Jiaotong University |
Organizer: Zhu, Haibin | Nipissing University |
|
14:45-15:03, Paper MoBT1.1 | |
Emotion-Based Robotic Action Optimization System for Human-Robot Collaboration (I) |
|
Murphy, Jordan | Montclair State University |
Parron, Jesse | Montclair State University |
Wang, Weitian | Montclair State University |
Li, Rui | Montclair State University |
Keywords: Collaborative Robots in Manufacturing, Human-Centered Automation, Assembly
Abstract: Although collaborative robots aim to boost productivity in manufacturing, misalignment between the robot’s actions and the human’s intentions for the collaboration can cause discomfort or frustration, potentially discouraging future collaborations. Inspired by human-to-human interactions, this paper aims to help solve this problem by enabling a collaborative robot to adjust how it moves and acts based on human emotions, thereby improving the overall collaboration process. To achieve this goal, an emotion-based robotic action optimization system was developed and integrated into a collaborative robot. The system utilizes hierarchical reinforcement learning (HRL) to train and guide the robot to adjust its actions according to detected human emotions. Specifically, this paper introduces (1) an HRL model that leverages a vision-audio-based emotion recognition model to determine and adjust robot actions (movement speed, drop-off distance, reaction time, and rate of success) according to human emotions, with the goal of avoiding negative emotions of the human user triggered by the robot’s actions; (2) a robot motion control method driven by recognized human intentions and actions from the HRL model, guiding the robot arm and gripper to adjust movements and deliver parts as desired; and (3) objective and subjective experiments to evaluate the effectiveness of the developed system. The results and analysis of the experiments demonstrated the effectiveness of our developed system in a human-robot collaboration setting.
|
|
15:03-15:21, Paper MoBT1.2 | |
Robust Electricity Forecasting in Smart Buildings with Missing Data: A Concept Echo State Network Approach (I) |
|
Zhu, Yingqin | CINVESTAV |
Li, Xiaoou | Center of Research and Advanced Studies of National Polytechnic I |
Yu, Wen | CINVESTAV-IPN |
Keywords: Smart Home and City, AI-Based Methods, Machine learning
Abstract: Accurate electricity forecasting is essential for smart building energy optimization, yet dynamic usage and missing data pose significant challenges. This paper introduces a Concept Echo State Network (CESN) approach for robust forecasting. CESNs extract semantic concepts, constructing a dynamic matrix via a recursive process. A context-aware multi-objective optimization minimizes errors and maximizes robustness against missing data. A hierarchical fusion enhances adaptability. Evaluations on real-world datasets, simulating data gaps, demonstrate that CESNs outperform existing methods. This approach delivers superior accuracy and actionable insights, even with incomplete data. This research advances smart building energy management through interpretable, robust electricity prediction, directly addressing missing data challenges.
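For orientation, the reservoir update at the heart of any echo state network follows the standard leaky-integrator form below; the concept-level extensions of CESN are not reproduced here, and the matrix sizes, spectral radius, and leak rate are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(42)

class EchoStateSketch:
    """Minimal echo state network reservoir (generic ESN, not the paper's CESN)."""
    def __init__(self, n_in, n_res=200, spectral_radius=0.9, leak=0.3):
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale so the reservoir approximately satisfies the echo state property
        self.W = W * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W))))
        self.leak = leak
        self.x = np.zeros(n_res)

    def step(self, u):
        pre = np.tanh(self.W_in @ u + self.W @ self.x)
        self.x = (1 - self.leak) * self.x + self.leak * pre   # leaky state update
        return self.x   # readout weights would be trained on these states, e.g. via ridge regression

esn = EchoStateSketch(n_in=3)
state = esn.step(np.array([0.2, -0.1, 0.5]))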
|
|
15:21-15:39, Paper MoBT1.3 | |
Multi-Products Production Control and Human-Autonomous Truck Distribution Planning with the Use of Collaborative Multi-Agent Reinforcement Learning (I) |
|
Deng, Yang | City University of Hong Kong |
Keywords: Inventory Management, AI-Based Methods, Optimization and Optimal Control
Abstract: In the era of Industry 5.0, the integration of human-centric values with advanced automation is pivotal. This paper addresses a complex distribution planning problem that necessitates the collaboration of human distributors and autonomous trucks within a manufacturing setting. We formulate the problem as a Partially Observable Markov Decision Process (POMDP) and propose a hierarchical multi-agent reinforcement learning (MARL) framework that unifies production control with distributor planning. At the upper level, two different agents determine production quantities for both preordered and on-time products, while concurrently planning appropriate distribution channels based on fluctuating market conditions. At the lower level, a decentralized execution strategy enables human and robotic distributors to dynamically fulfill customer demands. Key to this approach is a communication-based centralized learning scheme that leverages differentiable inter-agent learning to coordinate decisions, ensuring that autonomous operations do not undermine the human workforce, which is a core tenet of Industry 5.0. Computational experiments demonstrate that the proposed MARL framework outperforms traditional heuristic algorithms and conventional RL methods, achieving superior performance through improved coordination and adaptability. The results highlight not only the efficiency gains from automation but also the essential balance maintained between technological advancement and human involvement.
|
|
15:39-15:57, Paper MoBT1.4 | |
Manufacturing Task Scheduling Optimization with Buffer Zones Using Parallel Advantage Actor-Critic Algorithm (I) |
|
Wenjing, Zeng | New Jersey City University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Tang, Ying | Rowan University |
Wang, Weitian | Montclair State University |
Bin, Hu | Kean University |
Wang, Nan | William Paterson University |
Keywords: Intelligent and Flexible Manufacturing, Robust Manufacturing, Remanufacturing
Abstract: With the diversification of market demands and the limited availability of production resources, optimizing the allocation of manufacturing tasks within the constraints of limited workstation resources becomes increasingly important. This study explores the workstation buffer zone disassembly-assembly line balancing problem, aiming to improve the operational area of workstations and reduce component transportation costs. Based on the characteristics of the problem, a computational model is developed to maximize the recovery profit. To facilitate the search for an optimal solution, the parallel advantage actor-critic (PA2C) algorithm is used to address this problem, and the feasibility of the developed approach in disassembly and assembly lines is analyzed. Comparisons with AC and the original A2C suggest the competitive performance of the proposed solution.
|
|
15:57-16:15, Paper MoBT1.5 | |
Deterioration-Aware Collaborative Energy-Efficient Batch Scheduling and Maintenance for Unrelated Parallel Machines Based on Improved MOEA/D |
|
Wang, Haixuan | Tongji University |
Qiao, Fei | Tongji University |
Jiang, Shengxi | Tongji University |
Zhu, Haibin | Nipissing University |
Wang, Junkai | Tongji University |
Keywords: Manufacturing, Maintenance and Supply Chains, Planning, Scheduling and Coordination, Sustainable Production and Service Automation
Abstract: The deterioration phenomenon is common and persistent as machines' service time increases within energy-intensive manufacturing processes such as heat treatment, and it may lead to extended processing times or even machine breakdowns. It is crucial to collaboratively optimize batch scheduling and maintenance to ensure stable, efficient production and achieve energy efficiency. This study takes into account preventive maintenance, where a maintenance activity is carried out after a certain number of batches are processed. A novel multi-objective mixed-integer programming model for unrelated parallel batching machines is proposed to minimize the makespan, total completion time and total energy consumption. The entire problem is broken down into four sub-problems: job division, job dispatching, batch formation and batch sequencing. Given the NP-hard nature of the problem, three heuristic algorithms based on several structural properties are designed according to the features of the latter three parts. Meanwhile, an integrated methodology, a Multi-Objective Evolutionary Algorithm based on Decomposition combined with Variable Neighborhood Search (MOEA/D-VNS), is put forward to handle job division and the multi-dimensional collaborative optimization problem. The performance of the proposed algorithms is compared with that of other typical dominance-based evolutionary algorithms. Extensive numerical experiments are conducted to validate the effectiveness of the proposed model and algorithms.
|
|
MoBT2 |
Room T2 |
TASE Paper Session 2 |
Special Session |
Chair: Zhang, Chen | Tsinghua University |
|
14:45-15:03, Paper MoBT2.1 | |
Nonlinear Causal Discovery Via Dynamic Latent Variables |
|
Yang, Xing | Shenzhen University |
Lan, Tian | Tsinghua University |
Qiu, Hao | Sichuan Baicha Baidao Industrial Co., Ltd |
Zhang, Chen | Tsinghua University |
Keywords: Causal Models, Probability and Statistical Methods, Machine learning
Abstract: Distinguishing causality from mere correlation is a cornerstone in empirical research, as conflating the two can result in significant errors in decision-making, affecting policy formulation and the validity of scientific inferences. Traditional experimental designs, such as randomized trials, often fall short in complex systems where variables interact in a high-dimensional space with limited data. This paper aims to address these challenges by introducing an innovative causal discovery approach, extending beyond conventional methodologies by incorporating algorithmic advances in computational efficiency and design. We present a novel double Gaussian process state space causal model (GPSSCM) that contends with the multifaceted nature of causal inference, accounting for noisy observations and latent variables, which are commonly encountered in dynamic systems. Our methodological contribution includes the application of a Markov chain Monte Carlo technique for unraveling latent state dynamics and an expectation-maximization (EM) algorithm for robust parameter estimation. The acyclic nature of the causal graph is ensured through an integrated acyclic constraint within the EM framework, maintaining the integrity of the causal model. The efficacy of our proposed GPSSCM is evaluated through a series of tests on both synthetic data and empirical case studies from the industrial domain. The results highlight the model's capacity to accurately infer complex nonlinear causal relationships, demonstrating its superiority over traditional structural equation modeling, especially when dealing with time series data and latent variables. This paper not only contributes a sophisticated tool for researchers and practitioners but also enriches the literature on causal discovery by offeri
|
|
15:03-15:21, Paper MoBT2.2 | |
LeTO: Learning Constrained Visuomotor Policy with Differentiable Trajectory Optimization |
|
Xu, Zhengtong | Purdue University |
She, Yu | Purdue University |
Keywords: Deep Learning in Robotics and Automation, Machine learning, Optimization and Optimal Control
Abstract: This paper introduces LeTO, a method for learning constrained visuomotor policy with differentiable trajectory optimization. Our approach integrates a differentiable optimization layer into the neural network. By formulating the optimization layer as a trajectory optimization problem, we enable the model to end-to-end generate actions in a safe and constraint-controlled fashion without extra modules. Our method allows for the introduction of constraint information during the training process, thereby balancing the training objectives of satisfying constraints, smoothing the trajectories, and minimizing errors with demonstrations. This "gray box" method marries optimization-based safety and interpretability with powerful representational abilities of neural networks. We quantitatively evaluate LeTO in simulation and in the real robot. The results demonstrate that LeTO performs well in both simulated and real-world tasks. In addition, it is capable of generating trajectories that are less uncertain, higher quality, and smoother compared to existing imitation learning methods. Therefore, it is shown that LeTO provides a practical example of how to achieve the integration of neural networks with trajectory optimization. We release our code at https://github.com/ZhengtongXu/LeTO.
|
|
15:21-15:39, Paper MoBT2.3 | |
DartBot: Overhand Throwing of Deformable Objects with Tactile Sensing and Reinforcement Learning |
|
Aslam, Shoaib | The Hong Kong University of Science and Technology (HKUST), Clea |
Kumar, Krish | Purdue University |
Zhou, Pokuang | Purdue University |
Yu, Hongyu | The Hong Kong University of Science and Technology |
Wang, Michael Yu | Hong Kong University of Science and Technology |
She, Yu | Purdue University |
Keywords: Machine learning, Force and Tactile Sensing, Deep Learning in Robotics and Automation
Abstract: Object transfer through throwing is a classic dynamic manipulation task that necessitates precise control and perception capabilities. However, developing dynamic models for unstructured environments using analytical methods presents challenges. In this study, we present DartBot, a robot that integrates tactile exploration and reinforcement learning to achieve robust throwing skills for relatively small, non-rigid objects whose moment of inertia causes them to spin in the air. Unlike traditional sim-to-real transfer methods, our approach involves direct training of the agent on real robot hardware equipped with a high-resolution tactile sensor, enabling reinforcement learning in a realistic and dynamic environment. By leveraging tactile perception, we incorporate pseudo-embeddings of the physical properties of objects into the learning process through tilting actions at two distinct angles. This tactile information enables the agent to infer and adapt its throwing strategy, resulting in improved accuracy when handling various objects and targeting distant locations. Furthermore, we demonstrate that the quality of a grasp significantly impacts the success rate of the throwing task. We evaluate the effectiveness of our method through extensive experiments, demonstrating superior performance and generalization capabilities in real-world throwing scenarios. We achieved a success rate of 95% for unseen objects with a mean error of 3.15 cm from the goal. A high-resolution video demo of our work is available at https://youtu.be/KNFgDeLt-0g.
|
|
15:39-15:57, Paper MoBT2.4 | |
High-Quality Dataset-Sharing and Trade Based on a Performance-Oriented Directed Graph Neural Network |
|
Zeng, Yingyan | University of Cincinnati |
Zhou, Xiaona | University of Illinois Urbana-Champaign |
Chilukuri, Premith Kumar | Virginia Tech |
Lourentzou, Ismini | University of Illinois Urbana-Champaign |
Jin, Ran | Virginia Tech |
Keywords: AI-Based Methods, Big-Data and Data Mining, Data fusion
Abstract: The advancement of Artificial Intelligence (AI) models heavily relies on large high-quality datasets. However, in advanced manufacturing, collecting such data is time-consuming and labor-intensive for a single enterprise. Hence, it is important to establish a context-aware and privacy-preserving data sharing system to share small-but-high-quality datasets between trusted stakeholders. Existing data sharing approaches have explored privacy-preserving data distillation methods and focused on valuating individual samples tied to a specific AI model, limiting their flexibility across data modalities, AI tasks, and dataset ownership. In this work, we propose a performance-oriented representation learning (PORL) framework in a Directed Graph Neural Network (DiGNN). PORL distills raw datasets into privacy-preserving proxy datasets for sharing and learns compact meta data representations for each stakeholder locally. The meta data are then used in DiGNN to forecast AI model performance and guide the sharing via graph-level supervised learning. The effectiveness of PORL-DiGNN is validated by two case studies: data sharing in a semiconductor manufacturing network between similar processes to create similar quality defect models; and data sharing in the design and manufacturing network of Microbial Fuel Cell anodes between upstream (design) and downstream (Additive Manufacturing) stages to create distinct but related AI models.
|
|
15:57-16:15, Paper MoBT2.5 | |
Orchestrated Robust Controller for Precision Control of Heavy-Duty Hydraulic Manipulators |
|
Hejrati, Mahdi | Tampere University |
Mattila, Jouni | Tampere University |
Keywords: Robust/Adaptive Control, Neural and Fuzzy Control, Motion Control
Abstract: Vast industrial investment along with increased academic research on heavy-duty hydraulic manipulators has unavoidably paved the way for their automation, necessitating the design of robust and high-precision controllers. In this study, an orchestrated robust controller is designed to address this issue for generic manipulators with an anthropomorphic arm and spherical wrist. Thanks to virtual decomposition control (VDC), the entire robotic system is decomposed into subsystems, and a robust controller is designed for each local subsystem by considering unknown model uncertainties, unknown disturbances, and compound input nonlinearities. Radial basis function neural networks (RBFNNs) are incorporated into VDC to tackle unknown disturbances and uncertainties, resulting in novel decentralized RBFNNs. All robust local controllers designed at each local subsystem are then orchestrated to accomplish high-precision control. In the end, for the first time in the context of VDC, semi-global uniform ultimate boundedness is achieved under the designed controller. The validity of the theoretical results is verified by performing extensive simulations and experiments on a 6-degrees-of-freedom industrial manipulator with a nominal lifting capacity of 600 kg at 5 meters reach. Comparison of the simulation results with state-of-the-art controllers, along with the provided experimental results, demonstrates that the proposed method fulfills all promises and performs excellently.
|
|
MoBT3 |
Room T3 |
Automation for Enhanced Healthcare 1 |
Special Session |
Chair: Liu, Feng | Stevens Institute of Technology |
Organizer: Wen, Yuxin | Chapman University |
Organizer: Huang, Jiajing | Kennesaw State University |
Organizer: Liu, Feng | Stevens Institute of Technology |
Organizer: Chen, Jia | University of California Riverside |
Organizer: Wang, Chao | University of Maryland |
Organizer: Shen, Xin | University of California, Riverside |
|
14:45-15:03, Paper MoBT3.1 | |
Integrating Intracranial EEG and Scalp EEG for Whole Brain Network Inference (I) |
|
Yang, Shihao | Stevens Institute of Technology |
Liu, Feng | Stevens Institute of Technology |
Keywords: AI and Machine Learning in Healthcare, Modelling, Simulation and Optimization in Healthcare, Machine learning
Abstract: Over the past few decades, brain imaging research has shifted from mapping task-evoked brain regions of activation to identifying and characterizing dynamic brain networks involving multiple coordinated regions. Electrophysiological signals directly reflect brain activity, making the characterization of whole-brain electrophysiological networks (WBEN) a crucial tool for both neuroscience research and clinical applications. In this work, we introduce a novel framework for integrating scalp EEG and intracranial EEG (iEEG) to estimate WBEN, based on a principled state-space model estimation approach. An Expectation-Maximization (EM) algorithm is designed to simultaneously infer state variables and brain connectivity. We validated the proposed method using synthetic data, demonstrating improved performance over traditional two-step methods that rely solely on scalp EEG. This highlights the importance of incorporating iEEG signals for accurate WBEN estimation. For real data involving simultaneous EEG and iEEG recordings, we applied the developed framework to investigate the information flow during the encoding and maintenance phases of a working memory task. Our findings reveal distinct information flows between subcortical and cortical regions, with more significant flows from cortical to subcortical regions during the maintenance phase. These results align with previous studies, but provide a comprehensive view of the whole brain, underscoring the unique utility of the proposed framework.
|
|
15:03-15:21, Paper MoBT3.2 | |
An Optimization Model to Study the Impact of Digital Health on Regional Healthcare Accessibility (I) |
|
Weng, Leqi | Tsinghua University |
Wang, Qing | Tsinghua University |
Li, Jingshan | Tsinghua University |
Keywords: Health Care Management, Modelling, Simulation and Optimization in Healthcare, Scheduling in Healthcare
Abstract: This work proposes a quantitative model to study the impact of digital health on regional healthcare accessibility. An improved two-step floating catchment area (2SFCA) method is used to describe the accessibility of offline and online medical services. Using this model, the impact of the allocation of internet medical resources on regional healthcare accessibility is investigated, and the optimal allocation of online medical resources can be achieved.
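For orientation, the classic 2SFCA computation underlying such accessibility models can be sketched in a few lines of Python; the supply, demand, and catchment values below are illustrative assumptions, and the paper's improved variant covering both offline and online services is not reproduced here.

```python
import numpy as np

def two_step_fca(supply, demand, dist, d0):
    """Classic two-step floating catchment area (2SFCA) accessibility.

    supply: (J,) capacity of each service site (e.g., physicians, online slots)
    demand: (I,) population at each demand location
    dist:   (I, J) travel "distance" (time or cost) from demand i to site j
    d0:     catchment threshold
    Returns an (I,) array of accessibility scores.
    """
    within = dist <= d0                                   # boolean catchment mask
    # Step 1: supply-to-demand ratio of every site within its catchment
    pop_served = within.T @ demand                        # (J,)
    ratio = np.divide(supply, pop_served,
                      out=np.zeros_like(supply, dtype=float),
                      where=pop_served > 0)
    # Step 2: sum the ratios of all sites reachable from each demand location
    return within @ ratio                                 # (I,)

# toy example: 3 neighborhoods, 2 facilities
acc = two_step_fca(supply=np.array([10.0, 5.0]),
                   demand=np.array([1000.0, 500.0, 2000.0]),
                   dist=np.array([[10, 40], [25, 15], [60, 20]]),
                   d0=30)
print(acc)
```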
|
|
15:21-15:39, Paper MoBT3.3 | |
An Intelligent Wireless Capsule System for Early Detection and Precision Treatment of Gastrointestinal Disorders (I) |
|
Zheng, Jie-Ming | National Sun Yat-Sen University |
Tsai, Wen Chin | National Sun Yat-Sen University |
Lin, Jyun Ying | National Sun Yat-Sen University |
Liu, Hsiao-Chuan | University of Southern California |
Wu, Jian-Xing | National Sun Yat-Sen University |
Keywords: AI and Machine Learning in Healthcare, Physically Assistive Devices, Medical Robots and Systems
Abstract: At present, no wireless optical therapeutic capsule system is commercially available. This study presents the development of an integrated capsule system combining (1) wireless chip-based data transmission, (2) image compression techniques for efficient wireless communication, (3) high-resolution optical imaging, (4) a color feature extraction algorithm for bleeding detection, and (5) a controllable microneedle injection mechanism for targeted drug delivery. The system's performance is evaluated based on three key criteria: (1) real-time visualization of gastrointestinal bleeding, (2) accurate drug injection to target sites, and (3) successful capsule excretion following operation. The wireless communication module is based on the nRF52840 chipset, ensuring stable and reliable data transmission. Optical imaging employs a CMOS sensor with a 140-degree field of view, integrated with four white-light LEDs to provide uniform illumination. Image processing is performed using an ESP32 controller, which enhances visual data and applies advanced compression algorithms, achieving a data size reduction of up to 1/12. For hemorrhage detection, a combination of ResNet-based deep learning and a color feature extraction algorithm is implemented, achieving an accuracy of 95.75%. Power is supplied by four button batteries, providing an operational duration of up to 10.3 hours. This innovative capsule system demonstrates significant potential for wireless, minimally invasive gastrointestinal diagnostics and therapy, enabling real-time monitoring, intelligent bleeding detection, and precise on-demand drug delivery within the gastrointestinal tract. The proposed platform advances current capsule endoscopy technologies and paves the way for future smart therapeutic capsules.
|
|
15:39-15:57, Paper MoBT3.4 | |
An Autonomous Robotic System for Aorta Ultrasound Screening with Deep Learning Segmentation (I) |
|
Farsoni, Saverio | University of Ferrara |
Bertagnon, Alessandro | University of Ferrara |
D'Antona, Andrea | University of Ferrara |
Rizzi, Jacopo | University of Ferrara |
Roma, Marco | University of Ferrara |
Bonfe, Marcello | University of Ferrara |
Proto, Antonino | University of Ferrara |
Baldazzi, Giulia | University of Ferrara |
Pagani, Anselmo | University of Ferrara |
Zamboni, Paolo | University of Ferrara |
Keywords: Human-Centered Automation, Robotics and Automation in Life Sciences, AI and Machine Learning in Healthcare
Abstract: Abdominal aortic aneurysm is an enlargement of the abdominal aorta to a local diameter greater than 3 cm. Although most aneurysms are asymptomatic, the pathology becomes critical in the case of complications such as embolization, occlusion, and rupture. The diagnosis is commonly achieved by means of an ultrasound examination carried out by expert sonographers. We designed a robotic system that can autonomously perform the ultrasound screening of the abdominal aorta, measuring its diameter and therefore providing the early diagnosis of the aneurysm. We use an impedance-controlled collaborative robot to move the ultrasound probe on the patient's abdomen while a deep learning neural network segments the aorta in the ultrasound image and estimates the diameter. Our motion planning algorithm makes use of an artificial potential field that guides the robot to move the probe toward poses that yield a good view of the aorta. Finally, we conducted several experiments to validate the feasibility of the proposed approach.
|
|
15:57-16:15, Paper MoBT3.5 | |
Addressing the Allocation of Medical Examination Resources Using Simulation (I) |
|
Zhang, Mirui | Tsinghua University |
Zhao, Yue | Beijing Tsinghua Changgung Hospital |
Wang, Feifan | Tsinghua University |
Keywords: Health Care Management, Modelling, Simulation and Optimization in Healthcare, Scheduling in Healthcare
Abstract: The allocation of medical examination resources plays a critical role in hospital operations. Inefficient allocation can lead to delays in treatments and extend patient stays, affecting bed availability. Examination resources are shared between inpatients and outpatients, and a balance must be struck between the two. Prioritizing either inpatients or outpatients may disproportionately affect the other, leading to delays and reduced patient satisfaction. This study is motivated by a medical examination resource allocation problem in a general hospital. Currently, hospital managers rely on experience-based methods for resource allocation. However, long patient waiting times suggest that the intuitively determined allocation is not optimal. Determining optimal allocation strategies in real-world hospital operations is difficult due to uncertain demand and unpredictable long-term factors. This study develops a simulation program to help determine an appropriate proportion of an examination team allocated to inpatients and outpatients. It integrates medical examinations, hospitalization, and surgery and has a user-friendly interface. Experiments based on both a virtual scenario and a real-world case are conducted. It is shown that the simulation program can help hospital managers systematically evaluate different allocation strategies before implementation.
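As a rough illustration of the kind of question such a simulation answers, the toy Monte Carlo sketch below splits one examination team's daily slots between inpatients and outpatients and reports the resulting unmet demand; the slot counts and demand rates are invented for illustration and are unrelated to the hospital case in the paper.

```python
import random

def simulate_day(n_slots, inpatient_share, lam_in=30, lam_out=70, seed=0):
    """Toy single-day simulation of one examination team: a fraction of the
    daily slots is reserved for inpatients and the rest for outpatients;
    daily demand is approximately Poisson and unmet demand is counted as
    next-day waiting.  All numbers are illustrative, not from the paper."""
    rng = random.Random(seed)
    slots_in = int(n_slots * inpatient_share)
    slots_out = n_slots - slots_in
    demand_in = sum(rng.random() < lam_in / 1000 for _ in range(1000))   # ~Poisson(lam_in)
    demand_out = sum(rng.random() < lam_out / 1000 for _ in range(1000))
    wait_in = max(0, demand_in - slots_in)
    wait_out = max(0, demand_out - slots_out)
    return wait_in, wait_out

# compare a few allocation proportions before implementing one
for share in (0.2, 0.3, 0.4, 0.5):
    print(share, simulate_day(n_slots=90, inpatient_share=share))
```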
|
|
MoBT4 |
Room T4 |
Trajectory, Object, and Position 2 |
Regular Session |
Chair: Yamaguchi, Tomoya | Toyota Motor Corporation |
|
14:45-15:03, Paper MoBT4.1 | |
Robot Trajectory Optimization for Safe Transport of Deformable Packages |
|
Shukla, Rishabh | University of Southern California |
Moode, Samrudh | University of Southern California |
Wang, Fan | Amazon Robotics |
Mayya, Siddharth | Amazon Robotics |
Gupta, Satyandra K. | University of Southern California |
Keywords: Motion and Path Planning, Industrial and Service Robotics, Foundations of Automation
Abstract: Efficient and safe transport of deformable packages using suction cups is crucial in warehouse automation. Unlike rigid packages, deformable packages exhibit complex oscillatory behaviors and can detach under aggressive motions. Traditional motion planners typically overlook these oscillations, often resulting in either unsafe trajectories or overly conservative, slow motions. This paper addresses that gap by formulating package oscillation dynamics as constraints and incorporating them into a Cartesian trajectory optimization framework. These constraints are formulated to be state-dependent - i.e., they adapt according to the instantaneous conditions along the planned trajectory (such as acceleration and gripper orientation) - to ensure that oscillations remain within safe limits. We derive a pendulum-like model to characterize package swing, enforcing constraints on peak oscillation angles. Our approach then optimizes end-effector trajectories under these state-dependent constraints, ensuring safe transport when the end-effector follows constant-acceleration profiles. Real-world experiments demonstrate that our optimized trajectories reduce transport time by up to 18% compared to baseline motions while strictly adhering to safety limits on package swing.
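The swing model itself is not given in the abstract, but for a simple undamped pendulum whose suspension point accelerates at a constant rate, the peak swing admits a small-angle closed form that can serve as such a constraint. The sketch below is a hedged approximation of that idea, not the authors' exact model.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def peak_swing_angle(accel):
    """Predicted peak swing (rad) of an undamped pendulum-like package whose
    suspension point undergoes a constant horizontal acceleration `accel`,
    starting at rest.  In the small-angle regime the package oscillates about
    the shifted equilibrium arctan(a/g) with that same amplitude, so the peak
    is roughly twice the equilibrium angle (cable length only sets the period)."""
    return 2.0 * np.arctan2(accel, G)

def max_safe_acceleration(theta_max):
    """Largest constant acceleration whose predicted peak swing stays below
    theta_max (rad): invert peak = 2*arctan(a/g)."""
    return G * np.tan(theta_max / 2.0)

print(np.degrees(peak_swing_angle(2.0)))       # ~23 deg peak for a = 2 m/s^2
print(max_safe_acceleration(np.radians(20)))   # ~1.7 m/s^2 keeps swing under 20 deg
```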
|
|
15:03-15:21, Paper MoBT4.2 | |
A Mixed-Integer Conic Program for the Multi-Agent Moving-Target Traveling Salesman Problem |
|
George Philip, Allen | Texas A&M University |
Ren, Zhongqiang | Shanghai Jiao Tong University |
Rathinam, Sivakumar | TAMU |
Choset, Howie | Carnegie Mellon University |
Keywords: Motion and Path Planning, Planning, Scheduling and Coordination, Autonomous Agents
Abstract: The Moving-Target Traveling Salesman Problem (MT-TSP) seeks a shortest path for an agent that starts at a stationary depot, visits a set of moving targets exactly once, each within one of their respective time windows, and returns to the depot. In this paper, we introduce a new Mixed-Integer Conic Program (MICP) formulation for the Multi-Agent Moving-Target Traveling Salesman Problem (MA-MT-TSP), a generalization of the MT-TSP involving multiple agents. Our approach begins by restating the current state-of-the-art MICP formulation for MA-MT-TSP as a Nonconvex Mixed-Integer Nonlinear Program (MINLP), followed by a novel reformulation into a new MICP. We present computational results demonstrating that our formulation outperforms the state-of-the-art, achieving up to two orders of magnitude reduction in runtime, and over 90% improvement in optimality gap.
|
|
15:21-15:39, Paper MoBT4.3 | |
Heterogeneous Performance of Swarm Collision Avoidance Strategies |
|
Lewis, Ryan | University of Houston |
Becker, Aaron | University of Houston |
Bernardini, Francesco | University of Houston |
Julien, Leclerc | University of Houston |
Keywords: Collision Avoidance, Motion Control, Agent-Based Systems
Abstract: Four prominent collision avoidance methods are Artificial Potential Fields, Artificial Potential Fields expressed as Control Barrier Functions, Control Barrier Functions, and Reciprocal Velocity Obstacles. Prior work often assumes all agents use the same obstacle avoidance method. The methods differ in computational scalability, in how they react to different types of obstacles, and in how they react to agents with heterogeneous collision avoidance methods. This paper explores the scenario robustness and scalability of these methods through three key navigation scenarios: different structures of stationary obstacles, circle-crossing collision avoidance benchmarks, and defense against an antagonistic swarm.
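For reference, the first of the four compared methods, a plain artificial potential field, reduces to a few lines; the gains, influence radius, and toy scene below are arbitrary illustrative choices, not the paper's parameterization.

```python
import numpy as np

def apf_velocity(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.0, v_max=1.0):
    """Velocity command from a classic artificial potential field:
    attractive pull toward the goal plus a repulsive push from every
    obstacle closer than the influence radius d0."""
    v = k_att * (goal - pos)                       # attractive term
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 1e-6 < d < d0:
            # standard repulsive gradient, grows as the obstacle gets closer
            v += k_rep * (1.0 / d - 1.0 / d0) / d**2 * (diff / d)
    speed = np.linalg.norm(v)
    return v if speed <= v_max else v / speed * v_max   # saturate

pos = np.array([0.0, 0.0])
goal = np.array([5.0, 0.0])
obstacles = [np.array([0.6, 0.1])]
print(apf_velocity(pos, goal, obstacles))
```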
|
|
15:39-15:57, Paper MoBT4.4 | |
Scalable Multi-Agent Path Finding for Delivery Robot Systems with Temporal and Edge Capacity Constraints on Weighted Graphs |
|
Yamaguchi, Tomoya | Toyota Motor Corporation |
Nishitani, Ippei | Toyota Motor Corporation |
Ota, Yusuke | Toyota Motor Corporation |
Hoxha, Bardh | Toyota Research Institute of North America |
Fainekos, Georgios | Toyota NA-R&D |
Keywords: Planning, Scheduling and Coordination, Formal Methods in Robotics and Automation, Task Planning
Abstract: The Multi-Agent Path Finding (MAPF) problem on a graph is a fundamental research topic in robotics. This paper formulates a mathematical problem that not only captures path finding in traditional MAPF but also incorporates constraints related to weighted graphs, deadlines, and edge capacity considerations. The feasibility of this formulation is demonstrated and compared with Mixed Integer Linear Programming, Satisfiability Modulo Theories, and OR-Tools, focusing on scalability within a realistic delivery robot system.
|
|
15:57-16:15, Paper MoBT4.5 | |
Sampling-Based Near Time-Optimal Trajectory Generation for Pneumatic Drives |
|
Hoffmann, Kathrin | University of Stuttgart |
Baumgart, Michaela | University of Stuttgart |
Kanagalingam, Gajanan | University of Stuttgart |
Verl, Alexander | University of Stuttgart |
Sawodny, Oliver | University of Stuttgart |
Keywords: Motion and Path Planning, Hydraulic/Pneumatic Actuators
Abstract: When servo-pneumatic drives are applied in automation, their motion trajectories should be fast to maximize productivity. Nonlinear, state-dependent jerk constraints arise because the pressure dynamics are not negligibly fast, the air mass flow through the valves is subject to pressure-dependent constraints, and the mechanics and pneumatics are coupled. The goal of this work is to generate near time-optimal trajectories for pneumatic drives, taking the aforementioned effects into account in a model-based way. To this end, first, the system dynamics and constraints are formulated using differential flatness such that they can be incorporated into trajectory generation frameworks. Then, the class of sampling-based near time-optimal path parametrization approaches, which build a tree of samples in the path parameter space, is chosen and extended to the present type of constraints. Results for various scenarios are discussed, compared to our previous work where nonlinear programming was applied, and validated in real-world experiments. The experimental outcomes demonstrate the applicability of the sampling-based algorithm to the present system.
|
|
MoBT5 |
Room T5 |
Human-Robot and HCA 2 |
Regular Session |
Chair: Hu, Lianming | Massachusetts Institute of Technology |
|
14:45-15:03, Paper MoBT5.1 | |
Eye Movement Feature-Guided Signal De-Drifting in Electrooculography Systems |
|
Hu, Lianming | Massachusetts Institute of Technology |
Zhang, Xiaotong | Massachusetts Institute of Technology |
Youcef-Toumi, Kamal | Massachusetts Institute of Technology |
Keywords: Sensor Fusion, Human Performance Augmentation, Human-Centered Automation
Abstract: Electrooculography (EOG) is widely used for gaze tracking in Human-Robot Collaboration (HRC). However, baseline drift caused by low-frequency noise significantly impacts the accuracy of EOG signals, creating challenges for further sensor fusion. This paper presents an Eye Movement Feature-Guided De-drift (FGD) method for mitigating drift artifacts in EOG signals. The proposed approach leverages active eye-movement feature recognition to reconstruct the feature-extracted EOG baseline and adaptively correct signal drift while preserving the morphological integrity of the EOG waveform. The FGD is evaluated using both simulation data and real-world data, achieving a significant reduction in mean error. The average error is reduced to 0.896° in simulation, representing a 36.29% decrease, and to 1.033° in real-world data, corresponding to a 26.53% reduction. Despite additional and unpredictable noise in real-world data, the proposed method consistently outperforms conventional de-drifting techniques, demonstrating its effectiveness in practical applications such as enhancing human performance augmentation.
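The paper's FGD algorithm is not spelled out in the abstract; the sketch below only illustrates the general idea of estimating baseline drift from samples that contain no rapid eye-movement features and subtracting it. The threshold and synthetic signal are chosen arbitrarily, and level shifts between fixations still bias this crude polynomial baseline, which is exactly the limitation that feature-guided reconstruction aims to avoid.

```python
import numpy as np

def dedrift_eog(signal, fs, saccade_thresh=50.0, poly_order=3):
    """Remove slow baseline drift from an EOG trace while keeping saccade
    morphology.  Samples whose derivative exceeds `saccade_thresh` are treated
    as eye-movement features and excluded from the baseline fit; a low-order
    polynomial fitted to the remaining samples is taken as the drift estimate
    and subtracted.  This is a crude stand-in for feature-guided de-drifting."""
    t = np.arange(len(signal)) / fs
    velocity = np.gradient(signal, t)
    quiet = np.abs(velocity) < saccade_thresh          # fixation-like samples
    coeffs = np.polyfit(t[quiet], signal[quiet], poly_order)
    baseline = np.polyval(coeffs, t)
    return signal - baseline, baseline

# synthetic example: two saccade steps riding on a slow drift plus sensor noise
fs = 250.0
t = np.arange(0, 10, 1 / fs)
drift = 5.0 * t + 2.0 * np.sin(0.2 * np.pi * t)
steps = 30.0 * (t > 3) - 15.0 * (t > 7)
eog = drift + steps + np.random.normal(0, 0.05, t.size)
clean, baseline_est = dedrift_eog(eog, fs)
print(clean[:5])
```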
|
|
15:03-15:21, Paper MoBT5.2 | |
Enhancing Autonomous Manipulator Control with Human-In-Loop for Uncertain Assembly Environments |
|
Mishra, Ashutosh | Tohoku University |
Santra, Shreya | Tohoku University |
Gozbasi, Hazal | Tohoku University |
Uno, Kentaro | Tohoku University |
Yoshida, Kazuya | Tohoku University |
Keywords: Assembly, Human Factors and Human-in-the-Loop, Manipulation Planning
Abstract: This study presents an advanced approach to enhance robotic manipulation in uncertain and challenging environments, with a focus on autonomous operations augmented by human-in-the-loop (HITL) control for lunar missions. Emphasizing the critical role of HITL control, the research integrates human decision-making capabilities with autonomous robotic functions to improve task reliability and efficiency for space applications. The key task addressed is the autonomous deployment of flexible solar panels using an extendable ladder-like structure and a robotic manipulator with real-time feedback for precision. The manipulator continuously relays position and force-torque data, enabling dynamic error detection, correction, and adaptive control during deployment. To mitigate the effects of sinkage, variable payload, and low-lighting conditions, efficient motion planning strategies are employed, supplemented by human control that allows operators to intervene in ambiguous scenarios. Digital twin simulation enhances system robustness by enabling continuous feedback, iterative task refinement, and seamless integration with the deployment pipeline. The system has been tested to validate its performance in simulated lunar conditions and ensure reliability in extreme lighting, variable terrain, changing payloads, and sensor limitations.
|
|
15:21-15:39, Paper MoBT5.3 | |
Force Plates for Analyzing, Recording and Teaching Forces of Dis-/Assembly Processes for Robot Programming-By-Demonstration |
|
Bargmann, Daniel | Fraunhofer IPA |
Kraus, Werner | Fraunhofer IPA |
Huber, Marco F. | University of Stuttgart |
Keywords: Human-Centered Automation, Sensor Fusion, Assembly
Abstract: Assembly tasks remain a challenge for industrial robots, as they involve physical contact where small path deviations can cause irreversible damage. While force control can mitigate such issues, tuning control parameters requires expert knowledge. Learning-from-Demonstration (LfD) or Imitation Learning (IL) offers a more intuitive alternative, but most force-based approaches rely on hand-guiding, requiring expensive cobots or raising safety concerns. Observation-based methods, in contrast, often depend on cameras and are limited to position control, suffering from occlusions and inaccuracies. We propose a novel tool that estimates position from force signals, without cameras or direct robot interaction, to enable force-based programming of assembly tasks. Using a network of four force-torque sensors, our system detects contact position, direction, and magnitude, achieving sub-millimeter accuracy (< 1 mm) in relevant areas. Demonstrations include the force-based assembly of a snap-fit, occluded plug insertion, and terminal clamp assembly. A user study with 20 participants shows that our approach reduces teaching time for terminal clamps by 61% compared to hand-guided methods, and lowers both mental load and perceived frustration by over 40% each.
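A force-plate tool of this kind ultimately has to recover a contact point from measured wrenches. The snippet below shows the standard single-contact computation for a wrench expressed at the plate origin, assuming the contact lies on a known plane and the force has a component normal to it; it is a generic sketch, not the authors' four-sensor fusion.

```python
import numpy as np

def contact_from_wrench(force, moment, plate_z=0.0):
    """Estimate a single contact point from the net force/torque measured at
    the plate origin.  For a point contact, moment = r x force; the
    minimum-norm point on the line of action is (force x moment)/|force|^2,
    which is then slid along the force direction onto the plate surface
    z = plate_z.  Assumes force[2] != 0 (pressing into the plate)."""
    f2 = np.dot(force, force)
    if f2 < 1e-9:
        raise ValueError("no significant contact force")
    r0 = np.cross(force, moment) / f2          # closest point to the origin
    t = (plate_z - r0[2]) / force[2]           # move along the force line
    contact = r0 + t * force
    return contact, np.linalg.norm(force)

# toy check: 10 N pressing straight down at (0.05, 0.02, 0) m
p_true = np.array([0.05, 0.02, 0.0])
f = np.array([0.0, 0.0, -10.0])
m = np.cross(p_true, f)
print(contact_from_wrench(f, m))               # recovers (0.05, 0.02, 0.0)
```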
|
|
15:39-15:57, Paper MoBT5.4 | |
Enhanced Human-Robot Collaboration Using Constrained Probabilistic Human-Motion Prediction |
|
Kothari, Aadi | Massachusetts Institute of Technology |
Tohme, Tony | Massachusetts Institute of Technology |
Zhang, Xiaotong | Massachusetts Institute of Technology |
Youcef-Toumi, Kamal | Massachusetts Institute of Technology |
Keywords: Human-Centered Automation, Human Factors and Human-in-the-Loop
Abstract: Human motion prediction is an essential step for efficient and safe human-robot collaboration. Current methods either purely rely on representing the human joints in some form of neural network-based architecture or use regression models offline to fit hyper-parameters in the hope of capturing a model encompassing human motion. While these methods provide good initial results, they fail to leverage well-studied human body kinematic models as well as body and scene constraints, which can help boost the efficacy of these prediction frameworks. These methods also lack mechanisms to explicitly avoid implausible human joint configurations. We propose a novel human motion prediction framework that incorporates human joint constraints and scene constraints in a Gaussian Process Regression (GPR) model, while considering associated measurement uncertainty, to predict human motion. This formulation is combined with an online context-aware constraint model to leverage task-dependent motions. Our emphasis on explicit constraint modeling differentiates this work from prior studies. The proposed approach is validated on a human arm kinematic model and implemented in a human-robot collaborative setup with a UR5 robot arm, demonstrating its real-time feasibility. Simulations show that our framework dramatically improves overall mean per joint position error by as much as 66% on the HA4M dataset and 51% on the Andy dataset, while negative log-likelihood on the predicted probability distribution function is also improved by 32% on the HA4M dataset and 15% on the Andy dataset when compared to baseline methods.
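As a minimal sketch of the GPR backbone (not the paper's full constrained formulation), the snippet below fits a GP to a noisy joint-angle history and clamps the predicted mean to an assumed joint-limit range; the kernel, limits, and data are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# minimal sketch: predict one joint angle forward in time with a GP, then
# clamp the prediction to anatomical joint limits (the "constraint" step);
# the joint limits and noise level below are illustrative, not the paper's.
JOINT_LIMITS = (np.radians(0.0), np.radians(145.0))   # e.g. elbow flexion range

t_obs = np.linspace(0.0, 1.0, 20)[:, None]            # observed time stamps (s)
q_obs = 0.8 * np.sin(2.0 * t_obs[:, 0]) + 0.3         # observed joint angle (rad)
q_obs += np.random.normal(0.0, 0.02, q_obs.shape)     # measurement uncertainty

kernel = RBF(length_scale=0.3) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(t_obs, q_obs)

t_future = np.linspace(1.0, 1.5, 10)[:, None]         # prediction horizon
q_mean, q_std = gp.predict(t_future, return_std=True)

# simplified constraint handling: project the predicted mean back into the
# feasible joint range to reject implausible configurations
q_constrained = np.clip(q_mean, *JOINT_LIMITS)
print(np.degrees(q_constrained))
```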
|
|
15:57-16:15, Paper MoBT5.5 | |
Safe Human Robot Navigation in Warehouse Scenario |
|
Farrell, Seth | University of California San Diego |
Li, Chenghao | University of California, San Diego |
Yu, Hongzhan | University of California San Diego |
Yoshimitsu, Ryo | IHI Corporation |
Gao, Sicun | UCSD |
Christensen, Henrik Iskov | UC San Diego |
Keywords: Model Learning for Control, Collision Avoidance, Autonomous Agents
Abstract: The integration of autonomous mobile robots (AMRs) in industrial environments, particularly warehouses, has revolutionized logistics and operational efficiency. However, ensuring the safety of human workers in dynamic, shared spaces remains a critical challenge. This work proposes a novel methodology that leverages control barrier functions (CBFs) to enhance safety in warehouse navigation. By integrating learning-based CBFs with the Open Robotics Middleware Framework (Open-RMF), the system achieves adaptive and safety-enhanced controls in multi-robot, multi-agent scenarios. Experiments conducted using various robot platforms demonstrate the efficacy of the proposed approach in avoiding static and dynamic obstacles, including human pedestrians. Our experiments evaluate different scenarios in which the number of robots, robot platforms, speed, and number of obstacles are varied, from which we achieve promising performance.
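For context, the safety-filtering step that a CBF provides can be written in closed form for a toy single-integrator robot and one obstacle; the learned CBFs and Open-RMF integration described in the abstract are not reproduced here, and the gains below are arbitrary.

```python
import numpy as np

def cbf_filter(u_nom, pos, obs, r_safe, alpha=1.0):
    """Minimal control-barrier-function safety filter for a single-integrator
    robot (x_dot = u).  Barrier h(x) = |x - obs|^2 - r_safe^2 must satisfy
    h_dot >= -alpha * h.  With a single linear constraint the QP
      min |u - u_nom|^2  s.t.  grad_h . u >= -alpha * h
    has the closed-form projection used below."""
    diff = pos - obs
    h = diff @ diff - r_safe**2
    grad_h = 2.0 * diff
    slack = grad_h @ u_nom + alpha * h
    if slack >= 0.0:                   # nominal command already safe
        return u_nom
    return u_nom - slack / (grad_h @ grad_h) * grad_h

pos = np.array([0.0, 0.0])
pedestrian = np.array([1.0, 0.0])
u_nominal = np.array([1.0, 0.0])       # heading straight at the pedestrian
print(cbf_filter(u_nominal, pos, pedestrian, r_safe=0.8))
```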
|
|
MoBT6 |
Room T6 |
Detection, Estimation and Prediction 1 |
Regular Session |
Chair: Jin, Ran | Virginia Tech |
|
14:45-15:03, Paper MoBT6.1 | |
DCAF: Dynamic Cross-Attention Feature Fusion from Robotic Anomaly Detection to Position Accuracy Modeling |
|
Liu, Hui | Virginia Tech |
Qiao, Guixiu | National Institute of Standards and Technology |
Piliptchak, Pavel | National Institute of Standards and Technology |
Moore, James | University of Sheffield Advanced Manufacturing Research Centre |
Sawyer, Daniela | University of Sheffield - Advanced Manufacturing Research Centre |
Zeng, Yingyan | University of Cincinnati |
Jin, Ran | Virginia Tech |
Keywords: AI-Based Methods, Failure Detection and Recovery, Data fusion
Abstract: In robotic operations, heterogeneous computation tasks and sensor configurations pose significant challenges to analyzing different modalities of data for data sharing and collaborative learning in robotic Artificial Intelligence (AI) tasks. The lack of historical data in new scenarios or new computation tasks complicates model training and limits the applicability of existing AI methodologies. Current transfer learning approaches rely heavily on static feature extraction, which fails to dynamically adjust to specific feature relationships between different samples or modalities. In the literature, these methods struggle to capture inter-modal associations effectively, resulting in insufficient information sharing and poor modeling performance. Motivated by these challenges, this paper proposes a Dynamic Cross-Attention Feature Fusion (DCAF) approach to map the features from one robotic AI task to another. By calculating attention weights tailored to each target domain sample, DCAF extracts the most relevant source domain features and generates dynamic fused representations. The proposed approach enables sample-specific feature selection and fine-grained domain alignment, effectively enhancing the modeling performance compared with traditional transfer learning and model training based on the local data source. It is particularly suited for a new robotic AI training task with limited sample size and new data modalities. Experimental results for feature fusion from a robotics anomaly detection dataset to a position accuracy modeling dataset demonstrate the effectiveness of DCAF, providing an efficient solution for domain adaptation and multimodal fusion.
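A generic cross-attention fusion block of the kind the abstract describes can be sketched with standard PyTorch components; the dimensions, task head, and module layout below are assumptions for illustration, not the DCAF architecture itself.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Minimal sketch of cross-attention feature fusion (not the exact DCAF
    architecture): each target-domain sample attends over a bank of
    source-domain features, and the attended source context is concatenated
    with the target features for a downstream task head."""
    def __init__(self, dim_tgt, dim_src, dim_common=64, n_heads=4):
        super().__init__()
        self.proj_tgt = nn.Linear(dim_tgt, dim_common)
        self.proj_src = nn.Linear(dim_src, dim_common)
        self.attn = nn.MultiheadAttention(dim_common, n_heads, batch_first=True)
        self.head = nn.Linear(dim_common * 2, 1)       # e.g. position-error regressor

    def forward(self, x_tgt, x_src):
        q = self.proj_tgt(x_tgt).unsqueeze(1)          # (B, 1, D) queries
        kv = self.proj_src(x_src)                      # (B, N_src, D) keys/values
        fused, weights = self.attn(q, kv, kv)          # sample-specific attention
        out = torch.cat([q.squeeze(1), fused.squeeze(1)], dim=-1)
        return self.head(out), weights

model = CrossAttentionFusion(dim_tgt=16, dim_src=32)
x_tgt = torch.randn(8, 16)                             # 8 target-domain samples
x_src = torch.randn(8, 20, 32)                         # 20 source features per sample
y_hat, attn_w = model(x_tgt, x_src)
print(y_hat.shape, attn_w.shape)                       # (8, 1) and (8, 1, 20)
```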
|
|
15:03-15:21, Paper MoBT6.2 | |
Reflex-Plan: A Safety Monitoring Architecture for Thinking Fast and Slow |
|
Rizwan , Momina | Lund University |
Reichenbach, Christoph | Lund University |
Krueger, Volker | Lund University |
Keywords: Domain-specific Software and Software Engineering, Failure Detection and Recovery, Software, Middleware and Programming Environments
Abstract: Ensuring functional safety is crucial for the deployment of autonomous systems in real-life dynamic environments, as they must operate reliably and safely among humans. However, existing safety systems are designed with a closed-world assumption and can over-constrain the system by shutting down the robot at every safety violation, limiting the robot's ability to complete its tasks. To address this problem, we present a novel operational safety approach supported by our software architecture Reflex-Plan, where a safety monitor proactively influences high-level planning to enable safe and adaptive recovery behaviors thus preventing unnecessary stops. Unlike traditional safety monitors that primarily react to violations through predefined stop mechanisms, our software architecture follows a two-step process: the fast-thinking safety monitor provides immediate reflexive responses, while the slow-thinking high-level planner processes the safety monitor's feedback to plan recovery strategies. This allows the robot to respond quickly to safety-critical situations while maintaining adaptability for long-term autonomy. We validate the effectiveness of Reflex-Plan through real-world robot experiments in a mock hospital environment. Our experimental results confirm that keeping immediate safety responses within the safety monitor ensures fast reactivity while delegating recovery strategies to the reasoning layer enables efficient adaptation, reducing failures and ensuring more stable operation without reliance on external intervention.
|
|
15:21-15:39, Paper MoBT6.3 | |
Towards Trustworthy Degradation Prediction: An Interpretable Deep Learning Approach with Sparse Feature Extraction and Temporal Fusion |
|
Li, Dongpeng | The Hong Kong Polytechnic University |
Zheng, Pai | The Hong Kong Polytechnic University |
Li, Weihua | South China University of Technology |
Keywords: Diagnosis and Prognostics, AI-Based Methods, Machine learning
Abstract: Degradation prediction of industrial equipment is crucial for reducing downtime and optimizing maintenance strategies. Although existing Deep Learning (DL) based estimation methods provide accurate predictions with generalizability, the interpretable extraction of deep features related to degradation has not been discussed. It is also challenging to interpret and trace the temporal dynamics of deep features, including regular degradation accumulation and abnormal situations. To address these issues, this paper proposes an interpretable and traceable framework for degradation prediction. First, the Degradation-Informed Interpretable Encoder (DIIE) encodes the raw signal into sparse features, in which the parametric wavelet kernel and degradation constraint are designed to guide the automatic degradation feature extraction. Then the Interpretable Temporal Fusion Module (ITFM) with binarized gating values is used to directly process the multi-step features with more transparency. Finally, the temporal-enhanced features are fed into the predictor to make inferences. The proposed approach was validated on a bearing degradation dataset and achieved competitive predictive performance. Additionally, it provides interpretations for feature extraction and temporal fusion, which can improve the understanding and trustworthiness of mechanical degradation prediction.
|
|
15:39-15:57, Paper MoBT6.4 | |
ROCKET-LRP: Explainable Time Series Classification with Application to Anomaly Prediction in Manufacturing |
|
Ling, Zhijian | University of Toronto |
Aoyama, Takuya | Konica Minolta |
Yano, Keijiro | Konica Minolta |
Cohen, Eldan | University of Toronto |
Keywords: Machine learning, AI-Based Methods, Intelligent and Flexible Manufacturing
Abstract: Time Series Classification is a popular approach in machine learning with many applications. The Random Convolutional Kernel Transform (ROCKET) model has achieved state-of-the-art performance in various time-series classification tasks due to its ability to capture complex patterns and temporal relationships. However, its reliance on random convolutions hinders the explainability of the model, as the relationships between the transformed features and the original input data become obscured. To address these challenges, we propose a novel approach for computing explanations in ROCKET-based time-series classification models that integrates Layer-wise Relevance Propagation with either model-agnostic post-hoc or model-intrinsic local explanation techniques. We implement our approach for two widely used classification models and three local explanation techniques. We validate our approach on two simulated datasets, demonstrating its faithfulness and effectiveness. Additionally, we present an application of our approach to anomaly prediction in real-world manufacturing data and show that it provides superior local explanations compared to popular explanation techniques such as SHAP and LIME.
|
|
15:57-16:15, Paper MoBT6.5 | |
Using Style Transfer to Leverage Synthetic Data for Machine Learning-Based Quality Inspection in Forming Processes |
|
Benfer, Achim | Technical University of Munich |
Hujo, Dominik | Technical University of Munich |
Krüger, Marius | Technical University of Munich |
Land, Kathrin Sophie | Technical University of Munich |
Lechner, Michael | Friedrich-Alexander-Universität |
Merklein, Marion | Friedrich-Alexander-Universität |
Vogel-Heuser, Birgit | Technical University Munich |
Keywords: Computer Vision in Automation, Machine learning, Data fusion
Abstract: Camera-based measurement systems are increasingly used in manufacturing, with machine learning models outperforming traditional image recognition methods. However, industrial adoption remains limited, partly due to the effort required for data collection and model training, which typically relies on real manufacturing data. Many mid-sized companies that build the machines do not have the required personnel to set up and train these systems. In addition, the time it takes to implement these networks either delays the start of manufacturing or prevents them from being implemented until after manufacturing has started. A possible solution is training models on synthetic data, such as Computer Aided Design (CAD) renderings, instead of real manufacturing images. However, the difference in appearance between renderings and real images, known as the domain gap, leads to poor model performance. This paper proposes an AI-based visual quality inspection method using synthetic training data, bridging the domain gap with a style transfer applied only to the training set. The approach is evaluated on two use cases implemented on an industrial PC (IPC) connected to an industrial high-speed press, and compared to baselines on real manufacturing photos from this process. Results show that the domain gap between synthetic and real images can be closed through style transfer. When using the same product as a style reference, an Intersection over Union (IoU) of over 96% is achieved.
|
|
MoBT7 |
Room T7 |
Assembly Automation |
Regular Session |
Chair: Popa, Dan | University of Louisville |
|
14:45-15:03, Paper MoBT7.1 | |
Multimodal Sensing and Machine Learning to Compare Printed and Verbal Assembly Instructions Delivered by a Social Robot |
|
Mishra, Ruchik | University of Louisville |
Prasanna, Laksita | University of Louisville |
Adair, Adair | University of Louisville |
Popa, Dan | University of Louisville |
Keywords: Assembly, AI-Based Methods, Machine learning
Abstract: In this paper, we compare a manual assembly task communicated to workers using both printed and robot-delivered instructions. The comparison was made using physiological signals (blood volume pulse (BVP) and electrodermal activity (EDA)) collected from individuals during an experimental study. In addition, we also collected responses using the NASA Task Load Index (TLX) survey. Furthermore, we mapped the collected physiological signals to the participants' NASA TLX responses to predict their workload. For both classification problems, we compared the performance of Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) models. Results show that for our CNN-based approach, using multimodal data including both BVP and EDA gave better results than using just BVP (approx. 8.38% better) or just EDA (approx. 20.49% better). Our LSTM-based approach likewise performed better with multimodal data (approx. 8.38% better than just BVP and 6.70% better than just EDA). Overall, CNNs performed 7.72% better than LSTMs for classifying physiological signals for paper vs. robot-based instruction. The CNN-based model also provided better classification results (approximately 17.83% better on average across all responses of the NASA TLX) within a few minutes of training compared to the LSTM-based models.
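For illustration, a small 1D CNN over stacked BVP and EDA channels of the kind compared in the study might look as follows; the layer sizes, window length, and two-class output are assumptions, not the authors' exact model.

```python
import torch
import torch.nn as nn

class PhysioCNN(nn.Module):
    """Small 1D CNN over stacked BVP and EDA channels (a sketch of the kind of
    multimodal classifier compared in the paper, not the authors' model)."""
    def __init__(self, n_channels=2, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):                 # x: (batch, channels, samples)
        z = self.features(x).squeeze(-1)
        return self.classifier(z)

model = PhysioCNN()
window = torch.randn(8, 2, 512)           # 8 windows of stacked BVP + EDA samples
print(model(window).shape)                # torch.Size([8, 2])
```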
|
|
15:03-15:21, Paper MoBT7.2 | |
Design and Integration of a Robotic Gripper and Warehouse System for Automated Cable Assembly |
|
Govoni, Andrea | University of Bologna |
Cavuoto, Michela | Università Di Bologna |
Massini Alunni, Miriam | Alma Mater Studiorum |
Palli, Gianluca | University of Bologna |
Indovini, Maurizio | Iema |
Keywords: Collaborative Robots in Manufacturing, Product Design, Development and Prototyping, Manufacturing, Maintenance and Supply Chains
Abstract: Robotic automation can improve efficiency in switchgear manufacturing, but fully automating the wiring phase remains challenging. A key limitation is the unstructured organization of cables after production, which hinders robotic integration. This paper introduces a cost-effective, modular warehouse system that bridges the gap between automated cable production and robotic wiring. The proposed system combines a custom gripper, capable of handling cables of various diameters, with a structured storage solution using adaptive clips optimized via finite element analysis. By enabling deterministic cable placement, the system aims to eliminate the need for complex vision-based identification. Experimental results confirm its robustness and repeatability, paving the way toward fully automated cable assembly in industrial applications.
|
|
15:21-15:39, Paper MoBT7.3 | |
Cutaway View Learning for Visually-Guided Assembly with Crane |
|
Li, Pusong | University of Illinois at Urbana-Champaign |
Hauser, Kris | University of Illinois at Urbana-Champaign |
Nagi, Rakesh | University of Illinois, Urbana-Champaign |
Keywords: Reinforcement, Deep Learning in Robotics and Automation, Automation in Construction
Abstract: Object manipulation for construction assembly using a crane is a control problem with highly challenging dynamics, merging contact-rich manipulation, high dynamics uncertainty, and an underactuated system. Learning a vision-guided controller for such a system using reinforcement learning is a promising but challenging approach, as mating surfaces are occluded during the last stage of assembly, making feedback indirect. We present a novel form of cutaway-view privileged information for assembly tasks that is used within a student-teacher framework, making alignment information readily available during the initial stage of training. This is paired with a pretrained encoder and embedding buffer that leverages nonphysical manipulation within the simulator to collect its training data. We evaluate our method on four different assembly-type placement tasks, and find that our system significantly outperforms both kinodynamic planning and standard reinforcement-learning baselines. We also evaluate the ability of our trained controllers to transfer to a realistic simulation environment with different underlying dynamics, demonstrating continued superior performance under deployment with a significant dynamics gap.
|
|
15:39-15:57, Paper MoBT7.4 | |
Task-Context-Aware Diffusion Policy with Language Guidance for Multi-Task Disassembly |
|
Kang, Jeon Ho | University of Southern California |
Joshi, Sagar Jatin | University of Southern California |
Dhanaraj, Neel | University of Southern California |
Gupta, Satyandra K. | University of Southern California |
Keywords: Manipulation Planning, AI-Based Methods, Intelligent and Flexible Manufacturing
Abstract: Diffusion-based policy learning has shown strong performance across diverse robotic tasks, often achieving high success rates. However, real-world deployment requires more than task success—it demands efficient execution and the ability to handle complex environments. In many assembly and disassembly settings, a single scene contains multiple potential task goals. This can confuse learned policies, leading to ambiguous behavior. Enabling explicit task selection via natural language is thus crucial for robust and flexible operation. In this paper, we address two key challenges: (1) improving task execution efficiency by structuring tasks into distinct sub-task modes using language, and (2) resolving goal ambiguity by allowing human operators to specify desired tasks through natural language commands. We further introduce an adaptive parameter selection mechanism that adjusts reliance on different sensory modalities depending on the active sub-task. We evaluate our approach on the NIST Task Board, a representative benchmark with multiple co-located task goals. Our method improves execution speed by 57% and increases task success rate by 19% compared to baseline approaches. Demonstration videos are available at: https://rros-lab.github.io/task-aware-diffusion
|
|
15:57-16:15, Paper MoBT7.5 | |
Accurate Pose Estimation Using Contact Manifold Sampling for Safe Peg-In-Hole Insertion of Complex Geometries |
|
Negi, Abhay | University of Southern California |
Manyar, Omey Mohan | University of Southern California |
Penmetsa, Dhanush Kumar Varma | University of Southern California |
Gupta, Satyandra K. | University of Southern California |
Keywords: Assembly, Compliant Assembly, Intelligent and Flexible Manufacturing
Abstract: Robotic assembly of complex, non-convex geometries with tight clearances remains a challenging problem, demanding precise state estimation for successful insertion. In this work, we propose a novel framework that relies solely on contact states to estimate the full SE(3) pose of a peg relative to a hole. Our method constructs an online submanifold of contact states through primitive motions with just 6 seconds of online execution, subsequently mapping it to an offline contact manifold for precise pose estimation. We demonstrate that without such state estimation, robots risk jamming and excessive force application, potentially causing damage. We evaluate our approach on five industrially relevant, complex geometries with 0.1 to 1.0 mm clearances, achieving a 96.7% success rate, a 6x improvement over primitive-based insertion without state estimation. Additionally, we analyze insertion forces and overall insertion times, showing that our method significantly reduces the average wrench, enabling safer and more efficient assembly.
|
|
MoBT8 |
Room T8 |
Social and Intelligent Manufacturing 1 |
Special Session |
Chair: Wang, Di | South China University of Technology |
Co-Chair: Lin, Weizhi | University of Southern California |
Organizer: Wang, Feiyue | Institute of Automation, Chinese Academy of Sciences |
Organizer: Jiang, Pingyu | Xi’an Jiaotong University |
Organizer: Huang, Qiang | University of Southern California |
Organizer: Pian, Chunyuan | Xinxiang University |
Organizer: Wang, Di | South China University of Technology |
Organizer: Shen, Zhen | Institute of Automation, Chinese Academy of Sciences |
|
14:45-15:03, Paper MoBT8.1 | |
Accelerating Additive Manufacturing Slicing: A GPU-Based Parallel Algorithm for Large and Complex Mesh Models (I) |
|
Xiao, Yao | Xi'an Jiaotong University |
Qu, Zhi | Beijing Aerospace Propulsion Institute |
Wei, Chao | ZWSOFT CO., LTD.(Guangzhou) |
Yan, Chao-Bo | Xi'an Jiaotong University |
Cui, Bin | Xi'an Jiaotong University |
Keywords: Additive Manufacturing
Abstract: Efficient slicing of massive data remains a significant challenge in additive manufacturing. To address computer memory limitations and enhance slicing efficiency, this study presents a novel approach combining batch processing of large-scale mesh models with a GPU-accelerated parallel slicing algorithm. The proposed method partitions mesh model files, which are typically too large for single memory allocation, into multiple sub-models based on the slicing direction. During sub-model processing, an optimized edge-labeling algorithm is implemented to topologically mark all edges within each sub-model. The slicing operation is then executed in parallel across sub-models using GPU acceleration through OpenCL, significantly improving computational efficiency. The individual slicing results are subsequently integrated to generate the final output. Theoretically, this algorithm eliminates memory constraints on mesh model size while maintaining high slicing efficiency. Comparative experiments with industry-standard software Magics and Cura demonstrate the superiority of our method. The proposed algorithm successfully processes large-scale mesh models that exceed the capacity of both commercial solutions. Furthermore, it achieves a remarkable 80% reduction in slicing time for complex models compared to Magics and Cura, demonstrating both the feasibility and superior efficiency of our approach.
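The per-facet kernel that any slicer (CPU or GPU) evaluates is the triangle-plane intersection; the sketch below shows that kernel in plain NumPy, leaving out the paper's contributions (sub-model batching, edge labeling, and OpenCL parallelization).

```python
import numpy as np

def slice_triangle(tri, z):
    """Intersect one triangle (3x3 array of vertices) with the plane z=const.
    Returns a 2-point segment, or None when the plane misses the triangle.
    This is the per-facet kernel that a GPU slicer evaluates in parallel."""
    pts = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        da, db = a[2] - z, b[2] - z
        if da * db < 0:                         # edge crosses the plane
            t = da / (da - db)
            pts.append(a + t * (b - a))
        elif da == 0:                           # vertex lies exactly on the plane
            pts.append(a.copy())
    if len(pts) < 2:
        return None
    return np.array(pts[:2])

def slice_mesh(triangles, z):
    """Collect all intersection segments of one layer; contour stitching and
    batching across sub-models (as in the paper) are omitted here."""
    return [s for s in (slice_triangle(t, z) for t in triangles) if s is not None]

# toy mesh: one triangle spanning z in [0, 1]
tri = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
print(slice_mesh([tri], z=0.5))
```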
|
|
15:03-15:21, Paper MoBT8.2 | |
Dynamic Double-Sided Rolling-Horizon Auction Mechanisms for Additive Manufacturing Collaboration in Social Manufacturing (I) |
|
Sun, Mingyue | The Hong Kong Polytechnic University |
Li, Jinpeng | The Hong Kong Polytechnic University |
Chen, Qiqi | The Hong Kong Polytechnic University |
Zhang, Mengdi | The Hong Kong Polytechnic University |
Zhao, Zhiheng | The Hong Kong Polytechnic University |
Huang, George Q. | The Hong Kong Polytechnic University |
Keywords: Planning, Scheduling and Coordination, Task Planning, Additive Manufacturing
Abstract: The fusion of Social Manufacturing (SM) and Additive Manufacturing (AM) has led to the emergence of distributed and cooperative production models, where prosumers seamlessly transition between the roles of producers and consumers. However, existing manufacturing-sharing platforms often struggle to accommodate the bidirectional, dynamic, and long-term collaborative requirements inherent to AM. We first design a one-shot double-sided VCG auction mechanism, ensuring incentive compatibility, allocative efficiency, and individual rationality. To support long-term and adaptive AM collaboration, this study designs two rolling-horizon auction mechanisms: (1) a greedy algorithm-driven approach, which iteratively assigns AM orders by utilizing short-term price variations, and (2) a heuristic-based auction, which reformulates the collaborative manufacturing problem as a Maximum Weighted Independent Set (MWIS) problem, solved using a hybrid Iterated Local Search (ILS) heuristic. To assess the effectiveness of these mechanisms, we conduct a numerical experiment, demonstrating their capability to enhance AM resource efficiency and foster collaborative production.
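As a stand-in for the paper's rolling-horizon mechanisms, the snippet below runs a simple greedy weighted matching over one round of bids, which is one crude way to approximate the MWIS allocation; the bid tuples and surplus values are invented for illustration.

```python
def greedy_allocation(bids):
    """Greedy heuristic for one auction round.  Each bid is
    (order_id, provider_id, surplus); two bids conflict when they share an
    order or a provider, so accepted bids form an independent set of the
    conflict graph.  Bids are taken in order of surplus as long as they stay
    conflict-free; a simple stand-in for the paper's ILS-based MWIS solver."""
    chosen, used_orders, used_providers = [], set(), set()
    for order, provider, surplus in sorted(bids, key=lambda b: -b[2]):
        if order not in used_orders and provider not in used_providers:
            chosen.append((order, provider, surplus))
            used_orders.add(order)
            used_providers.add(provider)
    return chosen

# illustrative rolling horizon: re-run the greedy matcher as new bids arrive
round_1 = [("o1", "p1", 12.0), ("o1", "p2", 9.0), ("o2", "p1", 7.0), ("o2", "p3", 6.5)]
print(greedy_allocation(round_1))   # [('o1', 'p1', 12.0), ('o2', 'p3', 6.5)]
```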
|
|
15:21-15:39, Paper MoBT8.3 | |
Automated Qualification of 3D-Printed Products for Personalized Manufacturing (I) |
|
Lin, Weizhi | University of Southern California |
Huang, Qiang | University of Southern California |
Keywords: Additive Manufacturing, Machine learning
Abstract: Product qualification is typically performed by specifying features or regions of interest (ROIs) during design, conducting shape registration of the inspected product to establish correspondence with its design counterpart, and measuring discrepancies for compliance assessment. However, qualification of complex freeform products often requires human intervention to ensure accuracy, particularly in personalized manufacturing through 3D printing. Geometric variety and complexity can induce operator-to-operator variability due to heterogeneous spatial distributions of geometric distortions. To enable automated product qualification, we propose to identify and represent ROIs as surface patches using geometric descriptors indicative of their intrinsic deviation patterns. ROI specification via shape space dimension reduction, non-rigid intrinsic shape registration, and intrinsic deviation representation can therefore be conducted for product qualification. Finite types of ROIs or surface patches can be extracted based on their intrinsic deviation patterns, independent of covariates such as size and location. A MATLAB software suite has been developed to implement the entire process, demonstrating its effectiveness in the qualification of complex dental models.
|
|
15:39-15:57, Paper MoBT8.4 | |
EVT-CLIP: Enhancing Zero-Shot Anomaly Segmentation with Vision-Text Models (I) |
|
Yue, ZhiJian | University of Chinese Academy of Sciences |
Shen, Zhen | Institute of Automation, Chinese Academy of Sciences |
Fang, Qihang | Institute of Automation, Chinese Academy of Sciences |
Wang, Weixing | CASIA |
Xiong, Gang | Institute of Automation, Chinese Academy of Sciences |
Dong, Xisong | Institute of Automation, Chinese Academy of Sciences |
Wang, Feiyue | Institute of Automation, Chinese Academy of Sciences |
Keywords: Computer Vision for Manufacturing, Zero-Defect Manufacturing, Additive Manufacturing
Abstract: In recent years, zero-shot anomaly segmentation (ZSAS) has emerged as a cutting-edge technology, demonstrating significant potential in the field of anomaly detection. However, traditional methods often rely on manually designed fixed textual descriptions or anomaly prompts, which limits the model's adaptability to different types of anomalies. Additionally, existing methods exhibit shortcomings in the interaction and fusion of image and text features, resulting in suboptimal cross-modal understanding and insufficient information sharing. To address these challenges, in this paper we propose an innovative ZSAS method—EVT-CLIP—aimed at enhancing the performance of anomaly detection and localization tasks. The core idea of this method is to combine the Dynamic Attention-Enhanced Prompt (DAEP) module with the Cross-modal Interaction (CMI) module to improve the model's generalization capability and cross-modal information fusion. Specifically, the DAEP module reduces reliance on category-specific information by precisely fusing global image features with textual prompts, thereby enhancing the model's adaptability to various anomaly types. Meanwhile, the CMI module captures both local details and global contextual information in images through deep interaction between image and text features, optimizes text embeddings, and significantly enhances cross-modal understanding between images and text. Experimental validation on multiple benchmark datasets demonstrates that the EVT-CLIP framework achieves remarkable performance improvements in anomaly segmentation tasks, outperforming existing ZSAS methods and proving its effectiveness and advantages in practical applications.
|
|
15:57-16:15, Paper MoBT8.5 | |
Group-Based QMIX for Multi-Agent Reinforcement Learning (I) |
|
Hong, Weixin | Institute of Automation, Chinese Academy of Sciences |
Wu, Huaiyu | Institute of Automation, Chinese Academy of Sciences |
Fang, He | Institute of Automation, Chinese Academy of Sciences |
Shen, Zhen | Institute of Automation, Chinese Academy of Sciences |
Han, Yunjun | Institute of Automation, Chinese Academy of Sciences |
Lv, Yisheng | Chinese Academy of Sciences |
Xiong, Gang | Institute of Automation, Chinese Academy of Sciences |
Keywords: AI-Based Methods, Agent-Based Systems, Autonomous Agents
Abstract: In multi-agent reinforcement learning environments, value decomposition methods are popularly applied to address the cooperation issue among agents. However, in some multi-agent value decomposition methods, the global action-value is usually approximated using upper and lower bounds, which leads to a lack of fine-grained cooperative actions. Furthermore, current state-of-the-art value decomposition approaches are predominantly confined to addressing cooperative learning problems involving small-scale multi-agent systems. As the number of agents increases, these methods may lead to difficulties in the convergence of the Q value function, especially in more complex cooperative scenarios. To address the above two challenges, we propose a Group-based QMIX (GQMIX) method which learns to dynamically divide agents into multiple groups during exploration while applying a Graph Attention Network (GAT) to simultaneously learn value decomposition under both global observation and local observation. This enables the subdivision of agents into different groups in large-scale settings, allowing the learning of common subtasks in complex scenarios and improving the convergence efficiency of the value function. Experimental results demonstrate that our proposed algorithm is valid by providing better scheduling solutions for the extended flexible job shop scheduling problem, and it outperforms existing multi-agent reinforcement learning methods in terms of convergence and stability.
|
|
MoBT9 |
Room T9
Human Robot Collaboration for Smart Manufacturing 2 |
Special Session |
Chair: Zhang, Yunbo | Rochester Institute of Technology |
Organizer: Zheng, Pai | The Hong Kong Polytechnic University |
Organizer: Peng, Tao | Zhejiang University |
Organizer: Gu, Xi | Rutgers University |
Organizer: Wang, Yongjing | University of Birmingham |
Organizer: Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Organizer: Zhang, Yunbo | Rochester Institute of Technology |
Organizer: Bao, Jinsong | DongHua University |
Organizer: Huang, George Q. | The Hong Kong Polytechnic University |
Organizer: Wang, Lihui | KTH Royal Institute of Technology |
Organizer: Pham, Duc Truong | University of Birmingham |
|
14:45-15:03, Paper MoBT9.1 | |
More Attention for Human: A Multimodal Data-Driven Human Intention Identification Method (I) |
|
Liu, Zhixin | Zhejiang University |
Feng, Yixiong | Zhejiang University |
Lou, Shanhe | Nanyang Technological University |
Lu, Chengyu | Zhejiang University |
Tan, Jianrong | Zhejiang University |
Keywords: Human-Centered Automation, Data fusion, Machine learning
Abstract: “Human-Machine Symbiosis” is a defining characteristic of Industry 5.0, where human-centric manufacturing paradigms emphasize the integration of advanced production tools with intrinsically human problem-solving capabilities. Product manufacturing is inherently inseparable from product design, and conceptual design influences the cost of subsequent manufacturing stages. Current assistive manufacturing software is limited to passive command execution during interactions, lacking the ability to recognize human intent, which leads to barriers in human-machine collaboration. To address this limitation, a multimodal data-driven human intention identification method is proposed. This methodology employs 3D spatial modeling and verbal analysis to capture multimodal data generated during collaborative manufacturing processes. A novel Transformer architecture integrating T2T-ViT and BERT (TB-Multiformer) is developed to identify human intention. Multimodal features are extracted by an inter-modal attention module and an intra-modal self-attention module. A coordinated manufacturing case involving two types of mechanism structures is used to verify the feasibility and viability of the proposed method.
|
|
15:03-15:21, Paper MoBT9.2 | |
A Lightweight Human Posture Predictive Assessment in Human-Robot Collaboration Via Diffusion Mamba (I) |
|
Zhong, Ruirui | Zhejiang University |
Hu, Bingtao | Zhejiang University |
Zhang, Zhifeng | Zhejiang University |
Feng, Yixiong | Zhejiang University |
Yuan, Yixiu | Zhejiang University |
Tan, Jianrong | Zhejiang University |
Keywords: Deep Learning in Robotics and Automation, Human-Centered Automation, AI and Machine Learning in Healthcare
Abstract: Accurate human posture assessment is essential for ensuring safety and ergonomics in human-robot collaboration. Traditional assessment methods rely on real-time human posture estimation, which, while effective, lacks predictive capabilities and diversity in motion outcomes, limiting its ability to assess ergonomic risks proactively. To address this, we propose MambaFusion, a lightweight and diverse human posture predictive assessment framework based on Diffusion Mamba. MambaFusion integrates diffusion models and Mamba to generate diverse future posture predictions, enhancing the flexibility of ergonomic evaluation. A CrossMamba block is introduced for noise prediction, where cross attention mechanisms improve contextual understanding by refining conditional embeddings, leading to more accurate motion representation. Additionally, a REBA-based ergonomic assessment module evaluates predicted human postures, enabling more comprehensive ergonomic risk assessments. Extensive experiments demonstrate that MambaFusion exhibits strong performance in prediction accuracy, motion diversity, and the reliability of ergonomic evaluation, and outperforms other algorithms.
|
|
15:21-15:39, Paper MoBT9.3 | |
Dual-Replanning Tree: Fast Multi-Query Path Planning in Dynamic Environments (I) |
|
Li, Cheng | Xi'an Jiaotong University |
Huang, Ziang | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Hu, Jianchen | Xi'an Jiaotong University |
Keywords: Motion and Path Planning, Collision Avoidance, Industrial and Service Robotics
Abstract: This paper presents Dual-Replanning Tree (DRT), a real-time multi-query path planning algorithm that integrates local and global path generation, multi-query planning, and dynamic obstacle avoidance. Existing algorithms, such as FAT, which fix the tree root at the destination, exhibit high replanning efficiency but struggle with multi-query path planning. On the other hand, algorithms like RT-RRT*, which maintain the tree root near the robot, are advantageous for multi-query path planning but are significantly affected by newly detected obstacles, limiting their performance in dynamic environments. Our method innovatively improves the replanning strategy of algorithms that keep the tree root near the robot by introducing a reference path mechanism, enabling efficient node rewiring and expansion. This reference path is obtained through local small-scale rewiring, resulting in low computational overhead. Based on this reference path, the proposed algorithm can achieve higher replanning efficiency than algorithms that fix the tree root at the destination while retaining the dynamic adjustment of the tree root to adapt to multi-query path planning. Experimental results demonstrate that under various environmental conditions, the DRT algorithm outperforms the FAT algorithm in two key metrics: execution cost and arrival time.
|
|
15:39-15:57, Paper MoBT9.4 | |
Multi-Class Human/Object Detection on Robot Manipulators Using Proprioceptive Sensing (I) |
|
Hehli, Justin | University of Zurich |
Heiniger, Marco | University of Zurich |
Rezayati, Maryam | University of Zurich (UZH), Zurich University of Applied Sciences |
van de Venn, Hans Wernher | Zurich University of Applied Sciences |
Keywords: Collaborative Robots in Manufacturing, Deep Learning in Robotics and Automation, Human-Centered Automation
Abstract: In physical human-robot collaboration (pHRC) settings, humans and robots collaborate directly in shared environments. Robots must analyze interactions with objects to ensure safety and facilitate meaningful workflows. One critical aspect is human/object detection, where the contacted object is identified. Past research introduced binary machine learning classifiers to distinguish between soft and hard objects. This study improves upon those results by evaluating three-class human/object detection models, offering more detailed contact analysis. A dataset was collected using the Franka Emika Panda robot manipulator, exploring preprocessing strategies for time-series analysis. Models including LSTM, GRU, and Transformers were trained on these datasets. The best-performing model achieved 91.11% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models. Additionally, a comparison of preprocessing strategies suggests a sliding window approach is optimal for this task.
|
|
15:57-16:15, Paper MoBT9.5 | |
A Contrastive Learning Approach to Paraphrase Identification (I) |
|
Zhou, Jing | Xi'an Jiaotong University |
Hu, Min | China Mobile |
Li, Shuaipeng | Xi'an Jiaotong University |
Wang, Yuyang | Xi'an Jiaotong University |
Guo, Yifeng | China Mobile |
Dong, Xin | Xi'an Jiaotong University |
Cui, Jian | Xi'an Jiaotong University |
Li, Yibo | Xi'an Jiaotong University |
Song, Yunpeng | Xi'an Jiaotong University |
Cai, Zhongmin | Xi'an Jiaotong University |
Zhou, Wuai | China Mobile |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Machine learning, AI-Based Methods, Data fusion
Abstract: This paper focuses on the paraphrase identification (PI) task, a fundamental NLP task, which aims to determine whether a pair of sentences convey the same or similar meanings. Despite the significant progress of current pre-trained language models on the PI task, the inherent ambiguity of natural language stemming from the polysemous nature of words presents a challenge in assessing semantic similarity. Therefore, there is a necessity for further enhancement in capturing intricate relationships between sentences. In light of this challenge, we propose a method that utilizes contrastive learning to enhance sentence embeddings that are optimized for discriminating between sentences with similar or dissimilar semantic meanings. To be specific, the novel framework involves training a BERT model on modified Natural Language Inference (NLI) datasets using two-level contrastive learning to obtain a 2-Level-CLPI-BERT model, aiming to enhance sentence representations for the PI task. Experiments conducted on four PI datasets demonstrate that the proposed model outperforms state-of-the-art methods in intra-dataset evaluation. Furthermore, the cross-dataset performance evaluation substantiates the generalizability of 2-Level-CLPI-BERT embeddings.
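For readers unfamiliar with contrastive sentence-embedding objectives, the snippet below is a minimal PyTorch sketch of a standard in-batch InfoNCE loss; it is not the paper's two-level 2-Level-CLPI-BERT objective, and the embedding dimension and temperature are assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.05):
    """Standard InfoNCE loss over a batch: each anchor's positive is the
    same-index row of `positive`; all other rows act as in-batch negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature          # (B, B) cosine-similarity logits
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

# Placeholder sentence embeddings (e.g., pooled BERT outputs of paired sentences).
emb_a = torch.randn(16, 768)
emb_b = torch.randn(16, 768)
loss = info_nce(emb_a, emb_b)
```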
|
|
MoBT10 |
Room T10 |
Planning, Scheduling and Control 2 |
Regular Session |
Chair: Reisi Gahrooei, Mostafa | University of Florida |
|
14:45-15:03, Paper MoBT10.1 | |
GS-NBV: A Geometry-Based, Semantics-Aware Viewpoint Planning Algorithm for Avocado Harvesting under Occlusions |
|
Song, Xiaoao | University of California Riverside |
Karydis, Konstantinos | University of California, Riverside |
Keywords: Agricultural Automation, Manipulation Planning, Reactive and Sensor-Based Planning
Abstract: Efficient identification of picking points is critical for automated fruit harvesting. Avocados present unique challenges owing to their irregular shape, weight, and less-structured growing environments, which require specific viewpoints for successful harvesting. We propose a geometry-based, semantics-aware viewpoint-planning algorithm to address these challenges. The planning process involves three key steps: viewpoint sampling, evaluation, and execution. Starting from a partially occluded view, the system first detects the fruit, then leverages geometric information to constrain the viewpoint search space to a 1D circle, and uniformly samples four points to balance efficiency and exploration. A new picking score metric is introduced to evaluate the viewpoint suitability and guide the camera to the next-best view. We validate our method through simulation against two state-of-the-art algorithms. Results show a 100% success rate in two case studies with significant occlusions, demonstrating the efficiency and robustness of our approach. Our code is available at https://github.com/lineojcd/GSNBV.
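A minimal sketch of the circle-constrained viewpoint sampling described above, paired with a hypothetical picking score; the radius, score weights, and score inputs are illustrative assumptions, not the paper's metric.

```python
import numpy as np

def sample_viewpoints(fruit_center, radius, n=4):
    """Uniformly sample n candidate camera positions on a horizontal circle
    around the detected fruit (the 1D search space described in the abstract)."""
    angles = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    offsets = radius * np.stack([np.cos(angles), np.sin(angles), np.zeros(n)], axis=1)
    return fruit_center + offsets

def picking_score(visible_fruit_ratio, approach_alignment):
    """Hypothetical scalar score: the weights are illustrative, not the paper's."""
    return 0.7 * visible_fruit_ratio + 0.3 * approach_alignment

candidates = sample_viewpoints(np.array([0.5, 0.2, 1.1]), radius=0.3)
scores = [picking_score(np.random.rand(), np.random.rand()) for _ in candidates]
next_best_view = candidates[int(np.argmax(scores))]
```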
|
|
15:03-15:21, Paper MoBT10.2 | |
Passenger Simulation Models for Transit Scheduling and Facility Layout Planning in Airports |
|
Alfas, Muhammad | Indian Institute of Technology Delhi |
Negi, Apurv | Indian Institute of Technology Delhi |
Masiwal, Mohit | Indian Institute of Technology Delhi |
Koriya, Vishal Kumar | Wipro India |
Babre, Tirtharaj Purushottam | Wipro India |
Vepakomma, Navya | Wipro India |
Shriyam, Shaurya | IIT Delhi |
Keywords: Planning, Scheduling and Coordination, Simulation and Animation, Agent-Based Systems
Abstract: Simulating the movements of passengers on service systems has always been an interesting business problem. Unlike manufacturing systems, where processes and the movement of personnel follow predefined pathways, service systems like airports require a lot of information about customer movements and behaviors. In this work, we propose two simulation models to simulate passenger movement in airport systems. The first one is a discrete event simulation model, which models passengers and trains in an airport transit system. In the second model, we use the social force model, a microscopic agent-based model, to model the movement of passengers inside airport check-in facilities. Simulated annealing, along with the transit simulation model, is used to optimize the scheduling of trains to minimize waiting times and maximize occupancy. We consider the facility layout planning problem for the airport check-in area and use large language models to improve the layout. To evaluate the efficacy of the model, we use the social force model and space syntax analysis. Results indicate that the scheduling optimization can bring a 200% improvement, whereas the improved facility layout delivers a 2% improvement in the throughput.
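The following is a generic simulated annealing sketch of the kind used here for train scheduling; the headway decision variables, waiting-cost proxy, and cooling schedule are assumptions, not the paper's model.

```python
import math
import random

def waiting_cost(headways):
    """Placeholder objective: proxy for passenger waiting time given headways (minutes)."""
    return sum(h * h for h in headways) / len(headways)

def simulated_annealing(headways, iters=5000, t0=10.0, alpha=0.999):
    """Generic SA loop: perturb one headway, accept worse moves with Boltzmann probability."""
    best = cur = list(headways)
    best_c = cur_c = waiting_cost(cur)
    t = t0
    for _ in range(iters):
        cand = list(cur)
        i = random.randrange(len(cand))
        cand[i] = max(2.0, cand[i] + random.uniform(-1.0, 1.0))  # keep a 2-minute minimum headway
        cand_c = waiting_cost(cand)
        if cand_c < cur_c or random.random() < math.exp((cur_c - cand_c) / t):
            cur, cur_c = cand, cand_c
            if cur_c < best_c:
                best, best_c = cur, cur_c
        t *= alpha
    return best, best_c

schedule, cost = simulated_annealing([6.0] * 12)
```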
|
|
15:21-15:39, Paper MoBT10.3 | |
Superquadric Object Representation As a Control Barrier Function for Obstacle Avoidance |
|
Fernandez, Louis Ferdinand Nicodemus | University of Technology Sydney |
Hernandez Moreno, Victor | University of Technology Sydney |
Sutjipto, Sheila | University of Technology, Sydney |
Carmichael, Marc | Centre for Autonomous Systems |
Keywords: Collision Avoidance
Abstract: Ensuring successful robot task performance and safety in unstructured environments is a critical challenge in robotics. A key requirement in addressing this challenge is to accurately model the scene in which robots operate to effectively perform obstacle avoidance. State-of-the-art approaches for online obstacle avoidance generally rely on simplified representations (e.g. ellipsoids) that often result in highly conservative collision models that limit their effectiveness. To address this challenge, this paper proposes a controller which leverages superquadrics to reduce conservative behaviours. The parametric nature of superquadrics allows for accurate modelling of the robotic system and the environment. The calculated distance between the superquadrics informs the construction of a Control Barrier Function, which is integrated into a Quadratic Program to enable obstacle avoidance. Finally, by formulating the problem in the operating space and considering object volumes, the proposed controller is able to utilise rotational deviation to achieve safer behaviours. The proposed approach is evaluated through simulation and real-world experiments. The simulation results demonstrate the effectiveness of the proposed framework, while results from the real-world experiments highlight the advantages of the framework in different scenarios.
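As a sketch of why superquadrics are less conservative than ellipsoids, the snippet below evaluates the standard superquadric inside-outside function; a barrier could, for instance, be built from h = F - 1, but the paper's actual CBF-QP formulation is not reproduced here.

```python
import numpy as np

def superquadric_value(p, scale, eps1, eps2):
    """Inside-outside function of an axis-aligned superquadric centred at the origin:
    F < 1 inside, F = 1 on the surface, F > 1 outside. `scale` = (a1, a2, a3);
    eps1/eps2 control the shape (1 -> ellipsoid, small values -> box-like)."""
    x, y, z = np.abs(p) / np.asarray(scale)
    return ((x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
            + z ** (2.0 / eps1))

# A box-like obstacle model excludes far less free space than its bounding ellipsoid:
point = np.array([0.45, 0.45, 0.1])
print(superquadric_value(point, scale=(0.5, 0.5, 0.2), eps1=0.2, eps2=0.2))
```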
|
|
15:39-15:57, Paper MoBT10.4 | |
Lightweight Learning Algorithm for Lane Line Detection in Different Lighting Conditions |
|
Shi, Xiaolin | Xi'an University of Posts & Telecommunications |
Keywords: Intelligent Transportation Systems, Autonomous Vehicle Navigation
Abstract: Lane line detection is a critical task in autonomous driving and autonomous vehicle navigation that requires fast and accurate prediction. For the lane detection task, an improved lightweight lane detection method is proposed by combining multi-scale feature fusion technology and an attention mechanism to improve accuracy and efficiency. Firstly, the original ResNet lane feature extraction algorithm is replaced with the RepVGG algorithm, and to further utilize low-level features, a scheme of multi-scale lane feature extraction is designed. Then, to increase the capture ability on the lane regions and expand the network's receptive field, a lane dual attention (LDA) module dedicated to lane feature extraction is proposed. Finally, to address the problem of imbalance in lane samples, the focal loss function is selected to replace the cross-entropy loss function. Experimental results on the CULane test dataset show the overall superior performance on detection accuracy and speed in different lighting and traffic environments.
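A minimal PyTorch sketch of the focal loss mentioned above (the standard binary form of Lin et al., with the usual gamma/alpha defaults); the tensor shapes are placeholders, not the paper's network outputs.

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy examples so sparse lane pixels are
    not overwhelmed by background. gamma/alpha are the commonly used defaults."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

logits = torch.randn(4, 1, 288, 800)                     # placeholder lane-probability map
targets = (torch.rand_like(logits) > 0.97).float()       # sparse positives, as in lane masks
loss = binary_focal_loss(logits, targets)
```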
|
|
15:57-16:15, Paper MoBT10.5 | |
Trajectory Planning of a Curtain Wall Installation Robot Based on Biomimetic Mechanisms |
|
Liu, Xiao | Shenzhen Institute of Advanced Technology, Chinese Academy of Sc |
Wang, Weijun | Guangzhou Institute of Advanced Technology, Chinese Academy of Sc |
Huang, Tianlun | University of Chinese Academy of Sciences, Beijing 100049; Shenz |
Wang, Zhiyong | University of Chinese Academy of Sciences, Beijing 100049; Shenz |
Feng, Wei | Shenzhen Institutes of Advanced Technology, Chinese Academy of S |
Keywords: Automation in Construction, Biomimetics, Motion Control
Abstract: As the robotics market rapidly evolves, energy consumption has become a critical issue, particularly restricting the application of construction robots. To tackle this challenge, our study innovatively draws inspiration from the mechanics of human upper limb movements during weight lifting, proposing a bio-inspired trajectory planning framework that incorporates human energy conversion principles. By collecting motion trajectories and electromyography (EMG) signals during dumbbell curls, we construct an anthropomorphic trajectory planning scheme that integrates human force exertion and energy consumption patterns. Utilizing the Particle Swarm Optimization (PSO) algorithm, we achieve dynamic load distribution for robotic arm trajectory planning based on human-like movement features. In practical application, these bio-inspired movement characteristics are applied to curtain wall installation tasks, validating the correctness and superiority of our trajectory planning method. Simulation results demonstrate a 48.4% reduction in energy consumption through intelligent conversion between kinetic and potential energy. This approach provides new insights and theoretical support for optimizing energy use in curtain wall installation robots during actual handling tasks.
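The snippet below is a plain NumPy sketch of particle swarm optimization over a generic trajectory-parameter vector; the placeholder energy objective and bounds are assumptions, not the paper's human-derived cost model.

```python
import numpy as np

def pso(objective, dim, n_particles=30, iters=200, lb=-1.0, ub=1.0,
        w=0.7, c1=1.5, c2=1.5):
    """Plain particle swarm optimization over a box-constrained parameter vector
    (e.g., via-point heights or load-distribution weights of a lifting trajectory)."""
    rng = np.random.default_rng(0)
    x = rng.uniform(lb, ub, (n_particles, dim))
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)]
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lb, ub)
        f = np.array([objective(p) for p in x])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        gbest = pbest[np.argmin(pbest_f)]
    return gbest, pbest_f.min()

# Placeholder energy model: quadratic effort plus a penalty on end-point deviation.
energy = lambda p: float(np.sum(p ** 2) + abs(p.sum() - 1.0))
best_params, best_energy = pso(energy, dim=6)
```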
|
|
MoBT11 |
Room T11 |
Peter Luh Memorial Best Paper Award for Young Researcher |
Special Session |
Chair: Lennartson, Bengt | Chalmers University of Technology |
|
14:45-15:10, Paper MoBT11.1 | |
Modifying ABIT* for Tethered Rappelling Robot Motion Planning |
|
Goddu, Austen | Michigan Technological University |
Brown, Travis | NASA Jet Propulsion Laboratory, California Institute of Technolo |
Paton, Michael | Jet Propulsion Laboratory |
Motes, James | University of Illinois Urbana-Champaign |
Chen, Tan | Michigan Technological University |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation, Motion Control
Abstract: In this paper, we improve a path planner for a tethered rappelling robot to find initial solutions efficiently. Implemented on NASA-JPL's Axel rover, the new planner offers an increase in success rate as high as 20% while using fewer resources compared to the original. This higher performance is achieved by modifying the underlying random geometric graph configuration to use a k-nearest neighbor approach, and biasing the sampling portion of the algorithm to add more consideration to sloped regions. These improvements are tested on a number of sloped maps constructed to have specific features or model the real world, using a pipeline involving generated terrain models and a simulated depth camera. In addition to the comparison to the original planner, various configurations are found to improve the success and efficiency of the motion planner on large and noisy maps.
|
|
15:10-15:35, Paper MoBT11.2 | |
Tackling the ‘Small Data’ Challenge: A Versatile Melt Pool Encoder Via Reconstructive Unsupervised Pretraining for Few-Shot Learning (I) |
|
Chen, Zijue | CSIRO |
Wang, Heng | University of Sydney |
Gunasegaram, Dayalan | CSIRO |
Keywords: Computer Vision for Manufacturing, AI-Based Methods, Additive Manufacturing
Abstract: Research in metal additive manufacturing frequently encounters ‘small data’ challenges due to the high costs associated with data generation and the extensive categorization of the process. In addition, most of the available data is unlabeled, i.e., it is not data science-ready. In this paper, we demonstrate a method that overcomes this issue for analyzing melt pool (MP) signatures, a key real-time indicator of process health. Specifically, we propose a novel approach that utilizes reconstructive self-supervised pretraining to develop a pretrained MP encoder that is subsequently finetuned for multiple downstream tasks, including predictions of MP width, aspect ratio, size, and average temperature, using only a few labeled samples. By leveraging dense unlabeled visual monitoring data, our method significantly improves prediction accuracy compared to traditional end-to-end supervised learning, especially in few-shot scenarios. Comprehensive computational experiments demonstrate that our pretrained encoder, when finetuned with a simple regression head, achieves superior performance while reducing the overall reliance on labeled data. It thus offers a cost-effective solution for advanced manufacturing applications, e.g., real-time process control.
|
|
15:35-16:00, Paper MoBT11.3 | |
Cooking Task Planning Using LLM and Verified by Graph Network |
|
Takebayashi, Ryunosuke | Osaka University |
Isume, Vitor Hideyo | Osaka University |
Kiyokawa, Takuya | Osaka University |
Wan, Weiwei | Osaka University |
Harada, Kensuke | Osaka University |
Keywords: Task Planning, Planning, Scheduling and Coordination, AI-Based Methods
Abstract: Cooking tasks remain a challenging problem for robotics due to their complexity. Videos of people cooking are a valuable source of information for such tasks, but introduce considerable variability in terms of how to translate this data to a robotic environment. This research aims to streamline this process, focusing on the task plan generation step, by using a Large Language Model (LLM)-based Task and Motion Planning (TAMP) framework to autonomously generate cooking task plans from videos with subtitles, and execute them. Conventional LLM-based task planning methods are not well-suited for interpreting the cooking video data due to uncertainty in the videos, and the risk of hallucination in its output. To address both of these problems, we explore using LLMs in combination with Functional Object-Oriented Networks (FOON), to validate the plan and provide feedback in case of failure. This combination can generate task sequences with manipulation motions that are logically correct and executable by a robot. We compare the execution of the generated plans for 5 cooking recipes from our approach against the plans generated by a few-shot LLM-only approach for a dual-arm robot setup. It could successfully execute 4 of the plans generated by our approach, whereas only 1 of the plans generated by solely using the LLM could be executed.
|
|
MoCT1 |
Room T1 |
To Automate or to Augment? |
Special Session |
Chair: Moghaddam, Mohsen | Georgia Institute of Technology |
Organizer: Moghaddam, Mohsen | Georgia Institute of Technology |
Organizer: Davari Najafabadi, Shakiba | Georgia Tech |
Organizer: Andrist, Sean | Microsoft Research |
Organizer: Bohus, Dan | Microsoft Research |
Organizer: Marsella, Stacy | Northeastern University |
|
16:30-16:48, Paper MoCT1.1 | |
A Scalable Data-Driven Methodology for Human Intention Prediction in Diverse Collaborative Scenarios (I) |
|
Dell'Oca, Samuele | University of Applied Sciences and Arts of Southern Switzerland |
Montini, Elias | University of Applied Sciences of Southern Switzerland (SUPSI) |
Cutrona, Vincenzo | University of Applied Sciences of Southern Switzerland (SUPSI) |
Matteri, Davide | University of Applied Sciences and Arts of Southern Switzerland |
Landolfi, Giuseppe | SUPSI |
Bettoni, Andrea | University of Applied Sciences of Southern Switzerland (SUPSI) |
Keywords: Collaborative Robots in Manufacturing, Human-Centered Automation, Deep Learning in Robotics and Automation
Abstract: Collaborative robots were designed to work alongside humans and enable Human-Robot Collaboration in industry, but without intelligence, these robots cannot adapt, make decisions, or respond dynamically to human actions. They function more as programmable tools rather than true teammates. This study proposes a methodology to predict operators' short-term intentions by identifying execution patterns and contextual features. Its key strength lies in its task-agnostic nature, enabling adaptation across different scenarios through a structured formalization of use-case characteristics. This flexibility allows model reconfiguration to optimize performance in various industrial applications. The predictive capability is integrated into an orchestration system, allowing the cobot to complement human actions and optimize task execution. The Intention Prediction Model was first validated in individual scenarios, demonstrating its ability to interpret human intentions in different manufacturing contexts. It was then tested with 12 participants in three collaborative scenarios, showing effective task adaptation, improved role synchronization, and task variability management, despite lower intention prediction accuracy compared to individual setups. While no collisions occurred, collaboration smoothness varied, indicating the need for advanced coordination techniques and task assignment logics based on human intention interpretation.
|
|
16:48-17:06, Paper MoCT1.2 | |
6D Pose Tracking for Adaptive AR-Mediated Human-Robot Collaboration (I) |
|
Ajikumar, Akhil | Georgia Institute of Technology |
Wen, Bowen | NVIDIA |
Moghaddam, Mohsen | Georgia Institute of Technology |
Keywords: Collaborative Robots in Manufacturing, Human-Centered Automation, Human Performance Augmentation
Abstract: This paper presents a system framework for adaptive, augmented reality (AR)-mediated human-robot collaboration (HRC), enabling real-time multimodal interaction and adaptive robot behaviors during collaborative manipulation tasks. The system integrates egocentric 6D object pose tracking with real-time visual attention tracking (gaze), hand gestures, and speech, facilitating seamless two-way communication between the human and the robot. While leveraging an existing 6D pose tracker (FoundationPose), we present the first evaluation of its integration within a real-time, egocentric AR + robot framework for dynamic HRC. Our results highlight practical limitations (e.g., tracking drift) and provide design insights for building more robust, adaptive collaboration systems. The system was validated across four real-world scenarios, demonstrating promising performance and identifying key challenges for future research.
|
|
17:06-17:24, Paper MoCT1.3 | |
SIGMA: An Open-Source Platform for Mixed-Reality Task Assistance Research (I) |
|
Andrist, Sean | Microsoft Research |
Bohus, Dan | Microsoft Research |
Keywords: Virtual Reality and Interfaces, AI-Based Methods, Software, Middleware and Programming Environments
Abstract: In this presentation, we will introduce an open-source system called SIGMA (short for "Situated Interactive Guidance, Monitoring, and Assistance") as a platform for conducting research on task-assistive agents in mixed-reality scenarios. The system leverages the sensing and rendering affordances of a head-mounted mixed-reality device in conjunction with large language and multimodal models to guide users step by step through procedural tasks. We present the system's core capabilities, discuss its overall design and implementation, and outline directions for future research enabled by the system. We will also discuss the underlying features and affordances of the Platform for Situated Intelligence framework that enabled the development of SIGMA.
|
|
17:24-17:42, Paper MoCT1.4 | |
System As a Collaborator (SAAC): A Framework for Modeling, Capturing and Augmenting Collaborative Activities in Extended Reality (I) |
|
Léchappé, Aurélien | IMT Atlantique, DAPI, LS2N, LabSticc |
Milliat, Aurélien | IMT Atlantique, DAPI, LS2N |
Kabil, Alexandre | LISN, CNRS |
Chollet, Mathieu | University of Glasgow |
Dumas, Cédric | IMT Atlantique, DAPI, LS2N |
Keywords: Virtual Reality and Interfaces, Human Factors and Human-in-the-Loop, Human Factors in Healthcare
Abstract: When humans collaborate on a shared task, they use a myriad of verbal, para-verbal and non-verbal cues to achieve this end. Modern Artificial Intelligence sensing techniques such as Social Signal Processing (SSP) allow the characterization of users’ activities in Augmented or Virtual Environments through the analysis of heterogeneous multimodal data sources, such as interaction actions, gaze direction, avatars’ positions, and speech analysis. Therefore, this enables real-time assessment of team processes, including communication or team situation awareness. In return, it can provide context-specific feedback and information to augment team capabilities. However, implementing this vision remains a technical challenge, particularly in gathering real-time data sources in multi-user collaborative scenarios. This paper presents a framework for assessing and augmenting team collaboration with Extended Reality Environments (XRE). We propose a multimodal architecture to capture users’ activities and augment the team capabilities by displaying system-generated context-specific collaborative cues. Our proposed vision, System As A Collaborator (SAAC), frames the XRE as a direct actor embedded in the collaborative activity, improving team members’ capabilities with collaborative augmentations instead of being merely the space where collaboration occurs. We demonstrate the feasibility of our approach and system architecture through use cases of experimental XRE where our software architecture collects multimodal, heterogeneous, and multi-user interaction, behavioral and physiological data, and generates direct, reactive cues, and higher-level context-specific cues, augmenting team collaboration.
|
|
17:42-18:00, Paper MoCT1.5 | |
Assessing AI Roles in a Multi-Modal Human-AI Collaboration Framework for E-Scooters (I) |
|
Lo, Wei-Hsiang | San Jose State University |
Huang, Gaojian | San Jose State University |
Keywords: Human-Centered Automation, Human Factors and Human-in-the-Loop, Human Performance Augmentation
Abstract: Although incorporating AI-driven systems may help address the safety concerns caused by the growing use of e-scooters, the specific role of AI in micromobility has yet to be examined. This study explores user preferences for AI-human interaction in e-scooters across three AI roles (Advisor, Co-Pilot, Guardian) and eight human-machine interfaces (HMIs). A national survey (N=473) found no significant preference differences among the three AI roles. Auditory HMIs were preferred over Visual and Tactile HMIs. Among visual HMIs (i.e., AR glasses, control panel display, and road projection), the results indicated that AR glasses were the least satisfying, whereas control panels were the least useful. For auditory HMIs, informative voice assistance was favored over conversational types. Tactile feedback received the most positive response when delivered through handlebars compared to helmets and footpads. The findings of this study may guide the design of next-generation AI-driven systems for intelligent mobility.
|
|
MoCT2 |
Room T2 |
RAL Paper Session 1 |
Special Session |
Chair: Greer, Ross | University of California, Merced |
|
16:30-16:48, Paper MoCT2.1 | |
The Persistent Robot Charging Problem for Long-Duration Autonomy |
|
Kumar, Nitesh | Texas A&M University |
Lee, Jaekyung Jackie | Texas A&M University |
Rathinam, Sivakumar | TAMU |
Darbha, Swaroop | TAMU |
Pb, Sujit | IISER Bhopal |
Raman, Rajiv | IIIT Delhi |
Keywords: Surveillance Robotic Systems, Planning, Scheduling and Coordination, Task Planning
Abstract: This paper introduces a novel formulation aimed at determining the optimal schedule for recharging a fleet of n heterogeneous robots, with the primary objective of minimizing resource utilization. This study provides a foundational framework applicable to Multi-Robot Mission Planning, particularly in scenarios demanding Long-Duration Autonomy (LDA) or other contexts that necessitate periodic recharging of multiple robots. A novel Integer Linear Programming (ILP) model is proposed to calculate the optimal initial conditions (partial charge) for individual robots, leading to minimal utilization of charging stations. This formulation was further generalized to maximize the servicing time for robots when charging stations are limited. The efficacy of the proposed formulation is evaluated through a comparative analysis, measuring its performance against the thrift price scheduling algorithm documented in the existing literature. The findings not only validate the effectiveness of the proposed approach but also underscore its potential as a valuable tool in optimizing resource allocation for a range of robotic and engineering applications.
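As an illustration of the kind of integer program involved (a toy formulation, not the paper's ILP), the following PuLP sketch minimizes the number of charging stations needed to cover hypothetical per-robot charging-slot requirements.

```python
import pulp

robots = ["r1", "r2", "r3"]
slots = range(6)                      # discretized time slots
need = {"r1": 2, "r2": 1, "r3": 3}    # hypothetical charging slots each robot requires

prob = pulp.LpProblem("min_chargers", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(r, t) for r in robots for t in slots], cat="Binary")
k = pulp.LpVariable("stations", lowBound=0, cat="Integer")

prob += k                                              # minimize charging stations used
for r in robots:                                       # each robot gets its required slots
    prob += pulp.lpSum(x[(r, t)] for t in slots) == need[r]
for t in slots:                                        # concurrent charging limited by stations
    prob += pulp.lpSum(x[(r, t)] for r in robots) <= k

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(k))
```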
|
|
16:48-17:06, Paper MoCT2.2 | |
Lights As Points: Learning to Look at Vehicle Substructures with Anchor-Free Object Detection |
|
Keskar, Maitrayee | University of California, San Diego |
Greer, Ross | University of California, Merced |
Gopalkrishnan, Akshay | University of California San Diego |
Deo, Nachiket | UC San Diego |
Trivedi, Mohan | University of California San Diego (UCSD) |
Keywords: Autonomous Vehicle Navigation, Computer Vision for Transportation, Deep Learning for Visual Perception
Abstract: Vehicle detection is a paramount task for safe autonomous driving, as the ego-vehicle must localize other surrounding vehicles for safe navigation. Unlike other traffic agents, vehicles have necessary substructural components such as the headlights and tail lights, which can provide important cues about a vehicle’s future trajectory. However, previous object detection methods still treat vehicles as a single entity, ignoring these safety-critical vehicle substructures. Our research addresses the detection of substructural components of vehicles in conjunction with the detection of the vehicles themselves. Emphasizing the integral detection of cars and their substructures, our objective is to establish a coherent representation of the vehicle as an entity. Inspired by the CenterNet approach for human pose estimation, our model predicts object centers and subsequently regresses to bounding boxes and key points for the object. We evaluate multiple model configurations to regress to vehicle substructures on the ApolloCar3D dataset and achieve an average precision of 0.782 for the threshold of 0.5 using the direct regression approach.
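A rough NumPy sketch of CenterNet-style decoding: peaks of a center heatmap are selected and per-center keypoint offsets (e.g., light positions) are attached; the shapes, threshold, and offset encoding are assumptions, not the paper's exact output heads.

```python
import numpy as np

def decode_centers(heatmap, kp_offsets, k=10, thresh=0.3):
    """Pick the top-k peaks of a center heatmap and attach regressed keypoint
    offsets (e.g., headlight/tail-light positions relative to the vehicle center).
    Shapes: heatmap (H, W), kp_offsets (2*J, H, W) for J keypoints."""
    H, W = heatmap.shape
    flat = heatmap.ravel()
    top = np.argsort(flat)[::-1][:k]
    detections = []
    for idx in top:
        if flat[idx] < thresh:
            break
        cy, cx = divmod(idx, W)
        offs = kp_offsets[:, cy, cx].reshape(-1, 2)          # (J, 2) offsets in pixels
        keypoints = offs + np.array([cx, cy])
        detections.append({"center": (cx, cy), "score": float(flat[idx]),
                           "keypoints": keypoints})
    return detections

dets = decode_centers(np.random.rand(96, 320), np.random.randn(8, 96, 320) * 4)
```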
|
|
17:06-17:24, Paper MoCT2.3 | |
Maximum Next-State Entropy for Efficient Reinforcement Learning |
|
Zhong, Dianyu | Tsinghua University |
Yang, Yiqin | Institue of Automation, Chinese Academy of Sciences |
Zhang, Ziyou | Tsinghua University |
Jiang, Yuhua | Tsinghua University |
Xu, Bo | Institute of Automation, Chinese Academy of Sciences |
Zhao, Qianchuan | Tsinghua University |
Keywords: Reinforcement Learning, Machine Learning for Robot Control, Motion Control
Abstract: Entropy regularization is widely used to improve policy optimization and encourage exploration in reinforcement learning. By maximizing both the expected return and entropy, the agent aims to succeed at the task while acting as randomly as possible. However, current methods based on policy entropy encourage the agent to explore diverse actions, but they do not directly promote exploring diverse states. In this study, we theoretically reveal the challenge of optimizing the agent’s next-state entropy and the gap between maximum next-state entropy and current policy entropy regularization methods. To address this limitation, we introduce Maximum Next-State Entropy (MNSE), a novel method that maximizes next-state entropy through an action mapping layer following the inner policy. We provide a theoretical analysis demonstrating that MNSE can maximize next-state entropy by optimizing the entropy of the inner policy. We conduct extensive experiments on various continuous control tasks and demonstrate that MNSE can significantly improve the performance of RL algorithms.
|
|
17:24-17:42, Paper MoCT2.4 | |
Inverse Design of Snap-Actuated Jumping Robots Powered by Mechanics-Aided Machine Learning |
|
Tong, Dezhong | University of Michigan |
Hao, Zhuonan | University of California, Los Angeles |
Liu, Mingchao | Nanyang Technological University |
Huang, Weicheng | Newcastle University |
Keywords: Modeling, Control, and Learning for Soft Robots, Dynamics, Methods and Tools for Robot System Design
Abstract: Simulating soft robots offers a cost-effective approach to exploring their design and control strategies. While current models, such as finite element analysis, are effective in capturing soft robotic dynamics, the field still requires a broadly applicable and efficient numerical simulation method. In this paper, we introduce a discrete differential geometry-based framework for the model-based inverse design of a novel snap-actuated jumping robot. Our findings reveal that the snapping beam actuator exhibits both symmetric and asymmetric dynamic modes, enabling tunable robot trajectories (e.g., horizontal or vertical jumps). Leveraging this bistable beam as a robotic actuator, we propose a physics-data hybrid inverse design strategy to endow the snap-jump robot with a diverse range of jumping capabilities. By utilizing a physical engine to examine the effects of design parameters on jump dynamics, we then use extensive simulation data to establish a data-driven inverse design solution. This approach allows rapid exploration of parameter spaces to achieve targeted jump trajectories, providing a robust foundation for the robot’s fabrication. Our methodology offers a powerful framework for advancing the design and control of soft robots through integrated simulation and data-driven techniques.
|
|
17:42-18:00, Paper MoCT2.5 | |
Towards a Steerable Neurosurgical Robot for Debulking of Brain Mass Lesions |
|
Saini, Sarvesh | University of Miami |
Rezaeian, Saeed | University of California Riverside |
Akbari, Arshia | University of California Riverside |
Badie, Behnam | City of Hope National Medical Center |
Sheng, Jun | University of California Riverside |
Keywords: Surgical Robotics: Steerable Catheters/Needles, Medical Robots and Systems, Soft Robot Applications
Abstract: Minimally invasive surgery is regarded as a safer approach than open craniotomy for removing deep intracerebral mass lesions such as hematomas. It is usually performed by introducing a straight suction tool, sometimes combined with accessories for tissue debridement and irrigation, into the brain. Since the collateral trauma to healthy tissue is proportional to the diameter of the tools, slender tools with small diameters are desired. However, current minimally invasive tools are inadequate for removal of large, multi-focal, and fibrous mass lesions. In this work, we present a new robotic surgical device for removing intracerebral mass lesions. The device consists of four concentric tubes. From outermost to innermost, they include a straight rigid stainless steel tube, a precurved superelastic nitinol tube with asymmetric notches, a braid-reinforced composite tube with tissue cutting holes at the tip, and a suction tube connected with a suction machine. A Pebax sleeve covers the notched area of the outer tube except the two most distal notches. By rotating and translating the notched nitinol tube, the robot tip can be manipulated inside a mass lesion. By concurrently rotating the cutting tube and applying negative pressure, tissues can be cut and removed through the suction tube. In this paper, we present our design and fabrication of this robotic device, kinematic modeling of the robot in terms of the rotation and translation of the notched tube and rotation of the cutting tube, and the results of feasibility studies, which show a 540% improvement in mass lesion removal efficiency.
|
|
MoCT3 |
Room T3 |
Automation for Enhanced Healthcare 2 |
Special Session |
Chair: Wang, Feifan | Tsinghua University |
Organizer: Wang, Feifan | Tsinghua University |
Organizer: Zhong, Xiang | University of Florida |
|
16:30-16:48, Paper MoCT3.1 | |
Fractional Order Modeling and Control of Type 1 Diabetes with Genetic Algorithm Optimization (I) |
|
Caponetto, Riccardo | University of Messina |
Patanè, Luca | University of Messina |
Koledin, Nebojsa | University of Messina |
Wrona, Andrea | Sapienza University of Rome |
Baldisseri, Federico | Sapienza University of Rome |
Delli Priscoli, Francesco | Sapienza University of Rome |
Keywords: Modelling, Simulation and Optimization in Healthcare, Health Care Management, Clinical and Operational Decision Support
Abstract: Type 1 diabetes mellitus is a condition in which blood glucose levels rise to dangerously high levels due to insufficient or absent insulin production. To regulate blood glucose levels, diabetic patients require exogenous insulin infusion. This paper presents a comparison of four control strategies for autonomous regulation of glycemia: integer-order proportional-integral (IO-PI), fractional-order proportional-integral (FO-PI), integer-order sliding mode control (IO-SMC), and fractional-order sliding mode control (FO-SMC). Genetic algorithms are employed to optimize the controllers, which are validated through numerical simulations on the fractional Bergman glucoregulatory model in the presence of meal disturbances. It is found that FO-SMC exhibits superior performance in terms of time in the normal glycemic range and time spent in hypoglycemia, also showing robustness properties with respect to meal increments.
|
|
16:48-17:06, Paper MoCT3.2 | |
Optimizing Network Simulation of Cardiac Electrical Dynamics (I) |
|
Liu, Runsang | Pennsylvania State University |
Yang, Hui | The Pennsylvania State University |
Keywords: Modelling, Simulation and Optimization in Healthcare, Automation in Life Science: Biotechnology, Pharmaceutical and Health Care, AI and Machine Learning in Healthcare
Abstract: Our recent study has discovered that the structural geometry of a heart can be effectively encoded into a network, which provides an opportunity to simulate cardiac electrical dynamics on a sparse adjacency matrix. While simulation models often require optimization to effectively explore “what-if” scenarios, the calibration of heart models presents greater levels of complexity. These models not only exhibit chaotic and nonstationary dynamics but are also computationally expensive, which poses significant challenges to traditional calibration methodologies. This paper presents a new statistical metamodeling framework for optimizing network simulation of cardiac electrical dynamics. We first introduce a new control parameter to characterize the heart network and regulate cell-to-cell communications. Next, a Gaussian process (GP) surrogate is developed to predict the simulation response and guide the selection of the next best parameter setting that yields the maximum expected improvement. This calibration process is iteratively performed until convergence, and the performance is evaluated and validated through case studies on 2D cardiac tissues and 3D hearts. Experimental results show that the proposed statistical metamodeling approach efficiently tailors network simulation to complex spatiotemporal dynamics.
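A minimal sketch of the expected-improvement acquisition that drives the parameter selection described above (written for minimization, using SciPy); the posterior means, standard deviations, and incumbent value are placeholders.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_f, xi=0.01):
    """EI acquisition for minimization: mu/sigma are the GP posterior mean and
    standard deviation at candidate parameter settings, best_f the incumbent."""
    sigma = np.maximum(sigma, 1e-12)
    imp = best_f - mu - xi
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

# Placeholder posterior over 5 candidate control-parameter settings.
mu = np.array([0.42, 0.38, 0.55, 0.31, 0.47])
sigma = np.array([0.05, 0.12, 0.03, 0.20, 0.08])
next_idx = int(np.argmax(expected_improvement(mu, sigma, best_f=0.35)))
```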
|
|
17:06-17:24, Paper MoCT3.3 | |
An Analytical Framework for Image-Based Evaluation of Motor Function Rehabilitation (I) |
|
Zhao, Yishen | Tsinghua University |
Wang, Qing | Tsinghua University |
Ma, Lin | China Rehabilitation Research Center |
Li, Jingshan | Tsinghua University |
Keywords: Health Care Management, Modelling, Simulation and Optimization in Healthcare, Rehabilitation
Abstract: This abstract introduces an analytical framework for image-based automatic evaluation in motor rehabilitation. The framework utilizes a regular camera to capture and analyze a patient's motion, uses machine learning methods for status analysis, and generates an evaluation score for rehabilitation. Such a framework provides a quantitative tool to reduce the burden on physicians and improve system efficiency and patient outcomes in rehabilitation processes.
|
|
17:24-17:42, Paper MoCT3.4 | |
Leveraging Structured EHR Data and Machine Learning Methods for Improved Prediction and Interpretability in ASA-PS Classification (I) |
|
Zheng, Hanyi | Tsinghua University |
Wang, Qing | Tsinghua University |
Li, Jingshan | Tsinghua University |
Keywords: Health Care Management, AI and Machine Learning in Healthcare, AI-Based Methods
Abstract: This abstract explores classification prediction of the American Society of Anesthesiologists Physical Status (ASA-PS) using structured electronic health records (EHR) and machine learning techniques. The MOVER database, which contains patient demographic information, laboratory measurements, and diagnosis codes from 2017 to 2022, is used in the study. The diagnosis codes are transformed into sentence embeddings based on their official descriptions and integrated with other structured features to serve as inputs for machine learning models. XGBoost and an attention-based neural network (NN) are employed, offering varying levels of interpretability. This work provides a promising preoperative tool for anesthesiologists, offering early and interpretable predictions that reduce workload prior to patient evaluation.
|
|
17:42-18:00, Paper MoCT3.5 | |
Enhancing Prediction Accuracy of Surgery Duration Via Natural Language Models in Operating Rooms (I) |
|
Liu, Zhaoyang | Tsinghua University |
Wang, Qing | Tsinghua University |
Li, Jingshan | Tsinghua University |
Keywords: Health Care Management, AI and Machine Learning in Healthcare, AI-Based Methods
Abstract: This study develops a natural language processing (NLP) framework for surgery duration prediction in operating rooms using Mixture Density Networks (MDN). By incorporating Named Entity Recognition (NER) and word embeddings, with procedure names being the dominant predictive factor, the model achieves superior accuracy. This work demonstrates that significant improvement over traditional methods can be achieved using this approach, which enables more efficient operating room scheduling.
|
|
MoCT4 |
Room T4 |
3D Point Cloud Modeling 1 |
Special Session |
Chair: Biehler, Michael | University of Wisconsin - Madison |
Co-Chair: Wang, Yinan | RPI |
Organizer: Biehler, Michael | University of Wisconsin - Madison |
Organizer: Wang, Yinan | RPI |
|
16:30-16:48, Paper MoCT4.1 | |
Thickness Measurement Method for Panel Grids Based on Point Cloud Segmentation and Clustering (I) |
|
Zuo, Liling | Donghua University |
Zhang, Jie | Donghua University |
Ding, SiLong | Donghua University |
Cai, Hongyang | Donghua University |
Lyu, Youlong | Donghua University |
Keywords: AI-Based Methods, Big-Data and Data Mining, Computer Vision for Manufacturing
Abstract: The thickness measurement of panel grids plays an important role in ensuring their quality and load-bearing capacity. With the widespread application of 3D laser scanning technology, point cloud-based methods have provided an effective solution for accurate thickness measurement. This paper proposes an automated measurement method based on point cloud segmentation and clustering to estimate the thickness of multiple grids on the panel. First, an adaptive convolution-based point cloud segmentation network is designed to identify grid points from the input point cloud of the panel. This segmentation network features a position adaptive convolution module and a direction adaptive convolution module. The former dynamically encodes spatial positions to capture local geometric relationships, while the latter learns the direction of the normal vector to distinguish global geometric differences. Second, the extracted grid points are clustered using the DBSCAN algorithm to identify individual grid cells. Finally, the thickness of each grid cell is computed by measuring the distance between corresponding inner and outer surface points based on neighborhood normal vectors. Comparative experimental results demonstrate the effectiveness of the proposed thickness measurement method for panel grids.
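As an illustrative sketch of the clustering-plus-thickness step (a simplification of the paper's inner/outer surface pairing), the snippet below clusters segmented grid points with scikit-learn's DBSCAN and takes each cell's extent along its mean normal; eps and min_samples are assumed values that depend on scan resolution.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def grid_thicknesses(grid_points, normals, eps=0.005, min_samples=30):
    """Cluster segmented grid points into individual cells, then estimate each
    cell's thickness as the spread of its points along the mean normal direction."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(grid_points)
    thicknesses = {}
    for lbl in set(labels) - {-1}:                       # -1 marks noise points
        cell = grid_points[labels == lbl]
        n = normals[labels == lbl].mean(axis=0)
        n /= np.linalg.norm(n)
        proj = cell @ n                                  # signed distance along the normal
        thicknesses[lbl] = float(proj.max() - proj.min())
    return thicknesses

pts = np.random.rand(2000, 3) * 0.1                      # placeholder segmented grid points
nrm = np.tile([0.0, 0.0, 1.0], (2000, 1))                # placeholder per-point normals
print(grid_thicknesses(pts, nrm, eps=0.02))
```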
|
|
16:48-17:06, Paper MoCT4.2 | |
STGS: Spatio-Temporal Gaussian Splatting for Traffic Prediction (I) |
|
Cui, Songyi | The University of Hong Kong |
Yan, Yimo | The University of Hong Kong |
Kuo, Yong-Hong | The University of Hong Kong |
Keywords: Big-Data and Data Mining, Intelligent Transportation Systems, Data fusion
Abstract: Accurately predicting traffic speed is challenging due to the difficulty of integrating heterogeneous urban data sources and the data sparsity in certain regions. In this paper, we propose STGS, a novel framework that integrates three distinct modalities using a spatial Gaussian splatting mechanism. By diffusing latent “hotspots” across the urban grid, our approach effectively captures complex spatial correlations and fuses this enriched representation with gated recurrent networks for traffic prediction. Experimental results on real-world datasets demonstrate that STGS outperforms traditional unimodal and simple multimodal baselines, showcasing the effectiveness of our integrated spatial diffusion process for modeling spatiotemporal evolution. These findings underscore the potential of STGS to improve real-time decision-making and support sustainable urban traffic management.
|
|
17:06-17:24, Paper MoCT4.3 | |
Physics-Informed Attention-Enhanced Fourier Neural Operator for Solar Magnetic Field Extrapolations (I) |
|
Cao, Jinghao | New Jersey Institute of Technology |
Li, Qin | New Jersey Institute of Technology |
Du, Mengnan | New Jersey Institute of Technology |
Haimin, Wang | New Jersey Institute of Technology |
Shen, Bo | NJIT |
Keywords: AI-Based Methods, Big-Data and Data Mining
Abstract: We propose a Physics-informed Attention-enhanced Fourier Neural Operator (PIANO) for solving the Nonlinear Force-Free Field (NLFFF) problem in solar physics. PIANO leverages Efficient Channel Attention (ECA) with Dilated Convolutions (DC) to capture multimodal inputs by emphasizing critical channels for magnetic field variations. Moreover, physics-informed loss functions enforcing force-free and divergence-free conditions ensure that predictions adhere to the underlying physics. Experiments on the ISEE NLFFF dataset show that PIANO outperforms state-of-the-art neural operators in accuracy and consistently reproduces the physical characteristics of solar magnetic fields.
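A minimal NumPy sketch of the two physics-informed residuals named above, computed with finite differences on a gridded field; the grid spacing and loss weighting are assumptions, and the PIANO network itself is not reproduced.

```python
import numpy as np

def nlfff_residuals(B, dx=1.0):
    """Force-free and divergence-free residuals of a magnetic field sampled on a
    regular grid. B has shape (3, Nx, Ny, Nz); finite differences approximate curl/div."""
    def grad(f, axis):
        return np.gradient(f, dx, axis=axis)

    Bx, By, Bz = B
    curl = np.stack([grad(Bz, 1) - grad(By, 2),
                     grad(Bx, 2) - grad(Bz, 0),
                     grad(By, 0) - grad(Bx, 1)])        # J ~ curl(B) up to a constant
    force = np.cross(curl, B, axis=0)                   # force-free condition: J x B = 0
    div = grad(Bx, 0) + grad(By, 1) + grad(Bz, 2)       # solenoidal condition: div(B) = 0
    return np.mean(force ** 2), np.mean(div ** 2)

loss_ff, loss_div = nlfff_residuals(np.random.randn(3, 32, 32, 16))
total_physics_loss = loss_ff + loss_div                 # weights would be tuned in practice
```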
|
|
17:24-17:42, Paper MoCT4.4 | |
Multiscale Spatio-Temporal Changepoint Detection with Outliers (I) |
|
Yang, Kai | Medical College of Wisconsin |
Keywords: Surveillance Systems, Probability and Statistical Methods, Process Control
Abstract: Spatio-temporal changepoint detection is crucial for applications such as disease surveillance, environmental monitoring, and crime analysis. While most existing methods focus on analyzing data at a single geospatial scale (e.g., state, county, or census tract), they often struggle with detecting changepoints, particularly in the presence of outliers. In many applications, however, spatio-temporal data, such as incidence rates of infectious diseases, are typically managed across multiple geospatial scales to optimize data management, resulting in multiscale spatio-temporal datasets. Furthermore, large-scale spatio-temporal datasets are prone to outliers, which complicate changepoint detection. This talk introduces a robust method for sequential spatio-temporal process monitoring, designed to effectively handle both the multiscale structure of spatio-temporal data and the presence of outliers. The proposed method integrates a rank-based loss function with a spatial variable selection procedure to identify regions experiencing changes across different spatial scales. To enhance computational efficiency, a forward elimination strategy is proposed to leverage the hierarchical aggregation structure inherent in multiscale spatio-temporal data. The optimal spatial scale for process monitoring is determined based on the detected regions of change, using a novel scale information criterion. Finally, a charting statistic is constructed based on the detected changes identified at the optimal scale, enabling real-time detection of significant changes. The proposed method is not only robust to outliers but also highly scalable and computationally efficient, making it a reliable and effective tool for online monitoring of multiscale spatio-temporal data with outliers.
|
|
17:42-18:00, Paper MoCT4.5 | |
Sub-Surface Thermal Measurement in Additive Manufacturing Via Machine Learning-Enabled High-Resolution Fiber Optic Sensing (I) |
|
Wang, Rongxuan | Auburn University |
|
|
MoCT5 |
Room T5 |
Human-Robot and HCA 3 |
Regular Session |
Chair: Zhang, Zihan | Georgia Institute of Technology |
|
16:30-16:48, Paper MoCT5.1 | |
KoARob: Towards AI-Based Safety in Human-Robot-Collaborations |
|
Bermuth, Daniel | University of Augsburg |
Poeppel, Alexander | University of Augsburg |
Reif, Wolfgang | University of Augsburg |
Keywords: Collaborative Robots in Manufacturing, AI-Based Methods, Foundations of Automation
Abstract: As industries face changes in population and the need for better production efficiency arises, combining human workers with robots is becoming more common. In such collaborations, besides the user experience, the safety of the human worker is a critical aspect. This paper introduces an AI-based safety system designed to maintain a safe distance from human workers to prevent injuries. To circumvent the limitations of similar safety approaches, the system uses redundant methods to detect humans and their various joints. An evaluation in a real-world scenario shows that such an AI-based system can reliably detect humans and stop robots before a collision occurs. This work demonstrates that AI-based systems can be used for human detection in safety-related contexts and creates a basis for a new generation of safety systems that can enhance future human-robot collaborations.
|
|
16:48-17:06, Paper MoCT5.2 | |
Trustworthy Human-Robot Collaboration Programming Using a Gantt Chart-Based Domain-Specific Modeling Language |
|
Buchner, Lukas | University of Applied Sciences Upper Austria |
Zallinger, Philipp | University of Applied Sciences Upper Austria |
Nachbagauer, Karin | University of Applied Science Upper Austria |
Zoitl, Alois | Johannes Kepler University Linz |
Froschauer, Roman | University of Applied Sciences Upper Austria |
Keywords: Collaborative Robots in Manufacturing, Task Planning, Human-Centered Automation
Abstract: Programming human-robot collaboration (HRC) systems remains challenging for non-expert users. It often requires complex coding skills and provides limited visibility into robot behavior. We present the Robot Collaboration Language (RCL), a Gantt chart-based, domain-specific modeling language designed to facilitate intuitive and safe programming of collaborative tasks by non-experts. Using familiar Gantt chart representations, RCL aims to improve HRC programming transparency, predictability, and safety. We describe the language's meta-model and implementation and demonstrate its use with an example. Our approach contributes to developing trustworthy HRC systems by simplifying the complexity of programming.
|
|
17:06-17:24, Paper MoCT5.3 | |
Language-Guided Robust Navigation for Mobile Robots in Dynamically-Changing Environments |
|
Simons, Cody | University of California, Riverside |
Liu, Zhichao | University of California, Riverside |
Marcus, Brandon | University of California, Riverside |
Roy-Chowdhury, Amit | University of California, Riverside |
Karydis, Konstantinos | University of California, Riverside |
Keywords: Human-Centered Automation, AI-Based Methods, Autonomous Vehicle Navigation
Abstract: In this paper, we develop an embodied AI system for human-in-the-loop navigation using a wheeled mobile robot. We propose a direct yet effective method for monitoring the robot's current plan to detect changes in the environment that significantly impact the intended trajectory of the robot and then query a human for feedback. We also develop a means to parse human feedback expressed in natural language into local navigation waypoints and integrate it into a global planning system by leveraging a map of semantic features and an aligned obstacle map. Extensive testing in simulation and physical hardware experiments with a resource-constrained wheeled robot tasked to navigate in a real-world environment validate the efficacy and robustness of our method. This work can support applications such as precision agriculture and construction, where persistent monitoring of the environment provides users with information about the state of the environment.
|
|
17:24-17:42, Paper MoCT5.4 | |
Enabling Shared-Control for a Riding Ballbot System |
|
Chen, Yu | University of Illinois at Urbana-Champaign |
Mansouri, Mahshid | University of Illinois at Urbana-Champaign |
Xiao, Chenzhang | University of Illinois at Urbana-Champaign |
Wang, Ze | University of Illinois Urbana-Champaign |
Hsiao-Wecksler, Elizabeth T. | University of Illinois at Urbana-Champaign |
Norris, William | University of Illinois Urbana-Champaign |
Keywords: Human Performance Augmentation, Human Factors and Human-in-the-Loop, Human-Centered Automation
Abstract: This study introduces a shared-control approach for collision avoidance in the self-balancing riding ballbot, PURE, marked by its dynamic stability, omnidirectional movement, and hands-free interface. Integrating a sensor array with a novel Passive Artificial Potential Field (PAPF) method, PURE provides intuitive navigation with deceleration assistance and haptic/audio feedback, effectively mitigating collision risks. This approach addresses the limitations of traditional APF methods, such as control oscillations and unnecessary speed reduction in challenging scenarios. A human subject test, with 20 manual wheelchair users and able-bodied individuals, was conducted to evaluate the performance of indoor navigation and obstacle avoidance with the proposed shared-control algorithm. Results showed that shared-control significantly reduced collisions and cognitive load without affecting travel speed, offering intuitive and safe operation. These findings highlight the shared-control system’s suitability for enhancing collision avoidance in self-balancing mobility devices, a relatively unexplored area in assistive mobility research.
|
|
17:42-18:00, Paper MoCT5.5 | |
Enhancing Hand Palm Motion Gesture Recognition by Eliminating Reference Frame Bias Via Frame-Invariant Similarity Measures |
|
Verduyn, Arno | KU Leuven |
Vochten, Maxim | KU Leuven |
De Schutter, Joris | KU Leuven |
Keywords: Human-Centered Automation, Machine learning
Abstract: The ability of robots to recognize human gestures facilitates a natural and accessible human-robot collaboration. However, most work in gesture recognition remains rooted in reference frame-dependent representations. This poses a challenge when reference frames vary due to different work cell layouts, imprecise frame calibrations, or other environmental changes. This paper investigated the use of invariant trajectory descriptors for robust hand palm motion gesture recognition under reference frame changes. First, a novel dataset of recorded Hand Palm Motion (HPM) gestures is introduced. The motion gestures in this dataset were specifically designed to be distinguishable without dependence on specific reference frames or directional cues. Afterwards, multiple invariant trajectory descriptor approaches were benchmarked to assess how their performance generalizes to this novel HPM dataset. After this offline benchmarking, the best scoring approach is validated for online recognition by developing a real-time Proof of Concept (PoC). In this PoC, hand palm motion gestures were used to control the real-time movement of a manipulator arm. The PoC demonstrated a high recognition reliability in real-time operation, achieving an F1-score of 92.3%. This work demonstrates the effectiveness of the invariant descriptor approach as a standalone solution. Moreover, we believe that the invariant descriptor approach can also be utilized within other state-of-the-art pattern recognition and learning systems to improve their robustness against reference frame variations.
|
|
MoCT6 |
Room T6 |
Detection, Estimation and Prediction 2 |
Regular Session |
Chair: Wang, Yongjing | University of Birmingham |
|
16:30-16:48, Paper MoCT6.1 | |
Simulation-To-Reality Hyperparameter Optimization of MPPI Controllers Via Bayesian Optimization in NVIDIA Omniverse Isaac Sim |
|
Ruhe, Maximilian | Proximity Robotics & Automation GmbH |
Alba, Kathrin | Proximity Robotics & Automation GmbH |
Kipfmüller, Martin | Karlsruhe University of Applied Sciences |
Mamaev, Ilshat | Proximity Robotics & Automation GmbH |
Keywords: Simulation and Animation, Optimization and Optimal Control, Motion Control
Abstract: Autonomous mobile robots navigating in dynamic environments require robust and efficient control strategies. Model Predictive Path Integral (MPPI) control offers flexibility and computational efficiency but relies on the careful tuning of multiple hyperparameters, which is traditionally performed through heuristic approaches. In this paper, we present an automated and systematic method for MPPI hyperparameter tuning using Bayesian Optimization (BO) within a ROS 2 and Nav2-based framework. By leveraging a high-fidelity digital twin implemented in NVIDIA Omniverse Isaac Sim, our method automates and accelerates hyperparameter tuning, significantly enhancing trajectory smoothness, reducing control effort, and improving overall navigation efficiency. Experimental validation on a real differential-drive robot demonstrates strong consistency between simulation-optimized parameters and real-world performance, confirming the effectiveness of our simulation-to-reality approach. This work provides a practical and reproducible method for integrating BO with ROS 2 and Nav2, enabling streamlined deployment and adaptive tuning of MPPI controllers in real-world robotic applications.
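As a sketch of the Bayesian-optimization-over-simulation loop (using scikit-optimize rather than the authors' framework), the snippet below tunes three hypothetical MPPI hyperparameters against a placeholder episode-cost function standing in for the Isaac Sim digital twin; the parameter names, ranges, and cost are assumptions.

```python
from skopt import gp_minimize
from skopt.space import Real, Integer

def run_episode(temperature, noise_std, horizon):
    """Placeholder cost standing in for one simulated navigation episode's
    tracking error plus control effort; the real evaluation runs in the digital twin."""
    return (temperature - 0.3) ** 2 + (noise_std - 0.2) ** 2 + 0.001 * horizon

space = [Real(0.01, 1.0, name="temperature", prior="log-uniform"),
         Real(0.05, 1.0, name="noise_std"),
         Integer(20, 80, name="horizon")]

def objective(params):
    temperature, noise_std, horizon = params
    return run_episode(temperature, noise_std, int(horizon))

result = gp_minimize(objective, space, n_calls=40, random_state=0)
print(result.x, result.fun)                  # best hyperparameters and their cost
```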
|
|
16:48-17:06, Paper MoCT6.2 | |
Retrieval-Augmented Generation Using Knowledge Graphs for Manufacturing Problem-Solving |
|
Meister, Frederic | Fraunhofer Institute for Casting, Composite and Processing Techn |
Khanal, Parikshit | Fraunhofer Institute for Casting, Composite and Processing Techn |
Trauner, Ludwig | Fraunhofer Institute for Casting, Composite and Processing Techn |
Daub, Rüdiger | Technical University of Munich (TUM, Fraunhofer IGCV |
Keywords: Failure Detection and Recovery, Causal Models, AI-Based Methods
Abstract: The increasing variety of products has resulted in a rise in production errors, primarily due to more complexity in manufacturing processes. This paper proposes a data-driven inline problem-solving approach to mitigate the response times associated with these errors. Problem-solving is initiated by detecting anomalies within processes by an autoencoder model. Upon identifying these anomalies, the proposed approach employs causal inference using a Failure Mode and Effects Analysis (FMEA)-based Bayesian Network (BN) to determine potential root causes. The inferred causes, along with the user's problem description, are processed within a hybrid Retrieval-Augmented Generation (RAG) framework. The RAG produces two sets of retrievals: one by querying a Knowledge Graph (KG) containing historic eight discipline-based (8D) problem-solving data to extract failure information and relationships; the other through keyword similarity and vector search techniques. The combined retrievals, along with the results from the BN, are then input for a relatively small-scale Large Language Model (LLM) from Mistral. The findings indicate that this approach achieves accurate information retrieval and provides reliable outputs, even when problem descriptions are vague.
|
|
17:06-17:24, Paper MoCT6.3 | |
A Unified Framework for Real-Time Failure Handling in Robotics Using Vision-Language Models, Reactive Planner and Behavior Trees |
|
Ahmad, Faseeh | Lund University |
Ismail, Hashim | Lund University |
Styrud, Jonathan | ABB |
Stenmark, Maj | Lund University |
Krueger, Volker | Lund University |
Keywords: Failure Detection and Recovery, Behavior-Based Systems, Collaborative Robots in Manufacturing
Abstract: Robotic systems often face execution failures due to unexpected obstacles, sensor errors, or environmental changes. Traditional failure recovery methods rely on predefined strategies or human intervention, making them less adaptable. This paper presents a unified failure recovery framework that combines Vision-Language Models (VLMs), a reactive planner, and Behavior Trees (BTs) to enable real-time failure handling. Our approach includes proactive pre-execution verification, which checks for potential failures before execution, and reactive failure handling, which detects and corrects failures during execution by inferring missing preconditions and, when necessary, generating new skills. The framework uses a scene graph for structured environmental perception and an execution history for continuous monitoring, enabling context-aware and adaptive failure handling. We evaluate our framework through real-world experiments with an ABB YuMi robot on tasks like peg insertion, object sorting, and drawer placement, as well as simulation benchmarks in AI2-THOR. Compared to proactive-only, reactive-only, and post-execution recovery methods, our approach achieves higher task success rates and improves adaptability. Ablation studies highlight the importance of VLM-based reasoning, structured scene representation, and execution history tracking for effective failure recovery in robotics.
|
|
17:24-17:42, Paper MoCT6.4 | |
Leveraging CVAE for Joint Configuration Estimation of Multifingered Grippers from Point Cloud Data |
|
Mérand, Julien | Université Paris-Saclay, CEA, List |
Meden, Boris | Université Paris Saclay, CEA, LIST, F-91120 Palaiseau, France |
Grossard, Mathieu | Université Paris-Saclay, CEA, List |
Keywords: Model Learning for Control, AI-Based Methods, Deep Learning in Robotics and Automation
Abstract: This paper presents an efficient approach for determining the joint configuration of a multifingered gripper solely from the point cloud data of its poly-articulated chain, as generated by visual sensors, simulations or even generative neural networks. Well-known inverse kinematics (IK) techniques can provide mathematically exact solutions (when they exist) for joint configuration determination based solely on the fingertip pose, but often require post-hoc decision-making by considering the positions of all intermediate phalanges in the gripper's fingers, or rely on algorithms to numerically approximate solutions for more complex kinematics. In contrast, our method leverages machine learning to implicitly overcome these challenges. This is achieved through a Conditional Variational Auto-Encoder (CVAE), which takes point cloud data of key structural elements as input and reconstructs the corresponding joint configurations. We validate our approach on the MultiDex grasping dataset using the Allegro Hand, operating within 0.05 milliseconds and achieving accuracy comparable to state-of-the-art methods. This highlights the effectiveness of our pipeline for joint configuration estimation within the broader context of AI-driven techniques for grasp planning.
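A minimal CVAE sketch in PyTorch is given below, assuming the point cloud has already been pooled into a fixed-length embedding (e.g., by a PointNet-style encoder); layer sizes and the latent dimension are illustrative assumptions, while the 16-dimensional output matches the Allegro Hand's 16 actuated joints.
```python
# Illustrative CVAE sketch (PyTorch); the paper's exact architecture and
# dimensions are not reproduced here.
import torch
import torch.nn as nn

class JointConfigCVAE(nn.Module):
    def __init__(self, cond_dim=256, joint_dim=16, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(cond_dim + joint_dim, 128), nn.ReLU(),
            nn.Linear(128, 2 * latent_dim),            # -> (mu, log_var)
        )
        self.decoder = nn.Sequential(
            nn.Linear(cond_dim + latent_dim, 128), nn.ReLU(),
            nn.Linear(128, joint_dim),                  # reconstructed joint angles
        )
        self.latent_dim = latent_dim

    def forward(self, cloud_emb, joints):
        h = self.encoder(torch.cat([cloud_emb, joints], dim=-1))
        mu, log_var = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)   # reparameterize
        recon = self.decoder(torch.cat([cloud_emb, z], dim=-1))
        return recon, mu, log_var

    @torch.no_grad()
    def estimate(self, cloud_emb):
        # At inference, decode from the prior mean (z = 0) for a point estimate.
        z = torch.zeros(cloud_emb.shape[0], self.latent_dim)
        return self.decoder(torch.cat([cloud_emb, z], dim=-1))

model = JointConfigCVAE()
joints = torch.randn(4, 16)                              # toy training batch
recon, mu, log_var = model(torch.randn(4, 256), joints)
kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
loss = nn.functional.mse_loss(recon, joints) + kl        # reconstruction + KL (ELBO)
```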
|
|
17:42-18:00, Paper MoCT6.5 | |
Predicting Bifurcation Points of Evolving Model Uncertainty in Online Alarm Flood Classification |
|
Manca, Gianluca | Ruhr University Bochum |
Kunze, Franz Christopher | Ruhr University Bochum |
Fay, Alexander | Ruhr University Bochum |
Keywords: Machine learning, Diagnosis and Prognostics, Probability and Statistical Methods
Abstract: Online alarm flood classification (AFC) methods help to address the challenge of alarm floods in automated industrial systems by assigning observed alarm sequences to known alarm flood classes. However, AFC methods can encounter temporary class ambiguities, where limited data introduces substantial uncertainty. Recent research addressed this issue by integrating conformal prediction (CP) with AFC, dynamically producing sets of plausible alarm flood classes. Despite their effectiveness, these approaches lack predictive insights into how the model uncertainty might evolve, thereby leaving operators unsure about the optimal timing for decision-making. To resolve this limitation, we propose a novel method utilizing random forest regression models to predict bifurcation points, defined as future time steps at which at least one previously plausible class can be excluded. Additionally, we introduce a delay timer to stabilize prediction sets, significantly reducing chattering and unnecessary fluctuations in the predicted class sets. In a preliminary evaluation on a synthetic dataset using three AFC methods from the literature, we demonstrate effective estimation of upcoming bifurcation points and improved stability of the predictions.
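The sketch below shows the two ingredients in simplified form: a random-forest regressor predicting the bifurcation point from assumed sequence features, and a delay timer that only commits a changed prediction set after it has persisted for a fixed number of steps; the features, labels, and delay value are all illustrative assumptions.
```python
# Sketch under assumed feature/label definitions; the paper's exact features
# and timer logic are not reproduced.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                     # e.g., alarm-sequence features
y = rng.integers(1, 30, size=200).astype(float)   # steps until next class exclusion
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print("predicted bifurcation point:", model.predict(X[:1])[0])

class DelayedPredictionSet:
    """Commit a changed prediction set only after it persists for `delay` steps."""
    def __init__(self, delay=3):
        self.delay, self.current, self.candidate, self.count = delay, None, None, 0

    def update(self, new_set):
        new_set = frozenset(new_set)
        if self.current is None or new_set == self.current:
            self.current, self.candidate, self.count = new_set, None, 0
        elif new_set == self.candidate:
            self.count += 1
            if self.count >= self.delay:              # the change has persisted
                self.current, self.candidate, self.count = new_set, None, 0
        else:
            self.candidate, self.count = new_set, 1
        return set(self.current)

stabilizer = DelayedPredictionSet(delay=3)
for step_set in [{1, 2, 3}, {1, 2}, {1, 2, 3}, {1, 2}, {1, 2}, {1, 2}]:
    print(stabilizer.update(step_set))                # chattering is suppressed
```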
|
|
MoCT7 |
Room T7 |
Planning and Control for Semiconductor Mfg |
Special Session |
Chair: Moench, Lars | University of Hagen |
Organizer: Moench, Lars | University of Hagen |
Organizer: Yugma, Claude | Ecole Des Mines De Saint-Etienne |
|
16:30-16:48, Paper MoCT7.1 | |
Analysis for Steady Schedule Convergence against Random Time Disruptions in a Dual-Armed Cluster Tool (I) |
|
Kim, Min-Chan | Korea Advanced Institute of Science and Technology |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Semiconductor Manufacturing, Petri Nets for Automation Control, Discrete Event Dynamic Automation Systems
Abstract: Cluster tools are single-wafer processing systems widely used in semiconductor manufacturing. These tools typically operate under a steady regime, where the 1-cyclic schedule is the most desirable for maintaining stable and efficient production, as it ensures uniform quality and higher yield. However, time disruptions, such as process time delays, can destabilize the system, potentially leading to deviations from the 1-cyclic schedule. This study identifies the threshold on time disturbances below which the system naturally recovers to a 1-cyclic schedule without increasing cycle time. Furthermore, we show that dynamically adjusting the start times of robot tasks contributes to system stability, ensuring consistent productivity despite disturbances. Our work lays the theoretical groundwork for a dynamic wafer loading control method that adapts to varying disturbance levels, ensuring stable and efficient tool operation.
|
|
16:48-17:06, Paper MoCT7.2 | |
Forecasting Wafer Fab Outputs Using Lot Remaining Cycle Time Prediction in Semiconductor Manufacturing (I) |
|
Wartelle, Adrien | Mines Saint-Etienne |
Dauzere-Peres, Stephane | Mines Saint-Etienne |
Yugma, Claude | Ecole Des Mines De Saint-Etienne |
Christ, Quentin | STMicroelectronics Crolles |
Roussel, Renaud | STMicroelectronics Crolles |
Keywords: Semiconductor Manufacturing, Planning, Scheduling and Coordination, Probability and Statistical Methods
Abstract: Semiconductor manufacturing is a crucial component of modern supply chains, yet it faces significant challenges due to the complexity and variability of production processes. This study addresses the problem of forecasting wafer fab outputs by predicting the remaining cycle time (RCT) of lots in semiconductor manufacturing. We propose a statistical modeling approach that leverages historical data to estimate lot departures and output quantities for a given forecasting horizon. The study focuses on the 20 highest-volume products in a high-mix, low-volume production environment, using data from a major wafer fab in southeast France. A total of 26 linear regression models are evaluated, considering different contexts as well as input and output variables, to predict RCTs and forecast lot departures. The results indicate that the best-performing model achieves a forecasting accuracy of 41.55% for an 8-week forecasting horizon and 33.39% for a 12-week forecasting horizon, comparable to what is found in the literature. However, the study also highlights the challenges of extrapolating training data to future periods, particularly due to variability in product prioritization and system congestion. Future work will focus on integrating additional variables, such as congestion metrics and due dates, as well as exploring non-linear models to enhance forecasting accuracy and robustness. This research provides a foundational framework for wafer fab output forecasting, essential for effective production management and supply chain stability in the semiconductor industry.
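A minimal version of the forecasting pipeline, on synthetic data rather than the fab dataset, is sketched below: a linear model predicts each lot's RCT from assumed context features, and the predicted completion times are binned into weekly output counts.
```python
# Minimal sketch with synthetic data (not the fab dataset); features and scales
# are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
# Assumed features per lot: completed steps, remaining steps, WIP ahead, priority.
X = rng.normal(size=(500, 4))
true_w = np.array([-2.0, 5.0, 3.0, -1.0])
y = X @ true_w + 40 + rng.normal(scale=2.0, size=500)    # RCT in days
model = LinearRegression().fit(X, y)

current_lots = rng.normal(size=(60, 4))                  # lots currently in the fab
rct_days = np.clip(model.predict(current_lots), 0, None)
weeks_out = (rct_days // 7).astype(int)                  # bin completions by week
horizon = 12
weekly_outs = np.bincount(weeks_out, minlength=horizon)[:horizon]
for w, n in enumerate(weekly_outs, start=1):
    print(f"week {w}: {n} lots forecast to complete")
```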
|
|
17:06-17:24, Paper MoCT7.3 | |
A Metaheuristic Approach for a Flexible Flow Shop Scheduling Problem with Batch Processing Machines and Maximal Time Lags (I) |
|
Rocholl, Jens | University of Hagen |
Moench, Lars | University of Hagen |
Keywords: Planning, Scheduling and Coordination, Semiconductor Manufacturing
Abstract: A flexible flow shop scheduling problem motivated by process conditions in semiconductor wafer fabrication facilities (wafer fabs) is considered. Maximal time lags between consecutive operations are respected. A biased random-key genetic algorithm (BRKGA) is hybridized with list scheduling to minimize the total weighted tardiness of the jobs. In addition to this job-based decomposition approach, another decomposition approach using mixed integer linear programming (MILP) is applied for comparison. Results of computational experiments based on randomly generated problem instances are presented that show that the proposed metaheuristic outperforms the MILP-based decomposition approach.
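The random-key decoding idea at the core of a BRKGA is sketched below on a toy single-machine weighted-tardiness instance; the hybrid list scheduling, the flow-shop constraints, and the full BRKGA operators (elite copying, biased crossover, mutants) are not reproduced.
```python
# Sketch of random-key decoding only; instance data are synthetic assumptions.
import numpy as np

def decode(keys):
    return np.argsort(keys)                      # job sequence from random keys

def weighted_tardiness(seq, proc, due, weight):
    t, twt = 0.0, 0.0
    for j in seq:
        t += proc[j]
        twt += weight[j] * max(0.0, t - due[j])
    return twt

rng = np.random.default_rng(2)
n_jobs, pop_size = 8, 20
proc = rng.uniform(1, 5, n_jobs)
due = rng.uniform(5, 20, n_jobs)
weight = rng.uniform(1, 3, n_jobs)

population = rng.random((pop_size, n_jobs))      # random-key chromosomes in [0, 1)
fitness = [weighted_tardiness(decode(ch), proc, due, weight) for ch in population]
best = population[int(np.argmin(fitness))]
print("best sequence:", decode(best), "TWT:", min(fitness))
```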
|
|
17:24-17:42, Paper MoCT7.4 | |
Using Genetic Programming for Solving a Two-Stage Flexible Flow Shop Scheduling Problem with Maximal Time Lags (I) |
|
Schorn, Daniel | University of Hagen |
Moench, Lars | University of Hagen |
Keywords: Planning, Scheduling and Coordination, Semiconductor Manufacturing
Abstract: A scheduling problem for a two-stage flexible flow shop with maximal time lags between consecutive operations motivated by process conditions found in semiconductor wafer fabrication facilities (wafer fabs) is considered. The jobs have unequal ready times and both initial and inter-stage time lags. The performance measure is the total weighted tardiness (TWT). A heuristic scheduling framework using genetic programming (GP) to automatically discover priority indices is designed. Computational experiments based on randomly generated problem instances are carried out. The GP scheme is compared with a reference heuristic based on a biased random key genetic algorithm (BRKGA) combined with a backtracking procedure and a constraint programming (CP)-based decomposition approach. The results show that high-quality schedules are obtained in a short amount of computing time (CT) using the GP scheme.
|
|
17:42-18:00, Paper MoCT7.5 | |
Minimizing Total Weighted Tardiness for a Multiple-Orders-Per-Job Scheduling Problem with Unequal Release Times (I) |
|
Korde, Rohan | Arizona State University |
Fowler, John | Arizona State University |
Moench, Lars | University of Hagen |
Keywords: Semiconductor Manufacturing, Planning, Scheduling and Coordination
Abstract: We minimize total weighted tardiness of customer orders with unequal release times in a two-stage permutation flow shop. We solve this problem using a mixed integer linear programming (MILP) formulation, a hybrid heuristic, and a metaheuristic. Computational experiments based on randomly generated problem instances are used to compare the solution approaches. We find that the metaheuristic provides high-quality solutions in a short amount of computing time.
|
|
MoCT8 |
Room T8 |
Social and Intelligent Manufacturing 2 |
Special Session |
Chair: Wang, Di | South China University of Technology |
Co-Chair: Lin, Weizhi | University of Southern California |
Organizer: Jiang, Zhibin | Shanghai Jiao Tong University |
Organizer: Zhou, Liping | Shanghai Jiao Tong University |
|
16:30-16:48, Paper MoCT8.1 | |
Environment-Aware Continual Transfer Learning for Real-Time Defect Detection in 3D Printing (I) |
|
Li, Hongyu | CASIA |
Shen, Zhen | Institute of Automation, Chinese Academy of Sciences |
Fang, Qihang | Institute of Automation, Chinese Academy of Sciences |
Guo, Jinyuan | Institute of Automation, Chinese Academy of Sciences |
Dong, Xisong | Institute of Automation, Chinese Academy of Sciences |
Wang, Di | South China University of Technology |
Xiong, Gang | Institute of Automation, Chinese Academy of Sciences |
Wang, Feiyue | Institute of Automation, Chinese Academy of Sciences |
Keywords: Machine learning, Computer Vision in Automation, Additive Manufacturing
Abstract: In 3D printing material defect detection, environmental variations frequently induce false positives and missed detections by automated systems. Current research addresses dynamic environments by fine-tuning models during the detection phase. However, existing methods suffer from delayed adaptation: models require prolonged iterations to achieve stable performance in new environments. Furthermore, noise-contaminated pseudo-labels during domain adaptation exacerbate error accumulation and catastrophic forgetting, leading to severe performance degradation of the model. To address these challenges, we propose a robust and rapid adaptive detection framework tailored for dynamic environments. First, we innovatively employ the Gram matrix of model feature layers to quantify environmental shifts, endowing the model with real-time environmental awareness. Second, the dynamically maintained sample buffer ensures that stored samples satisfy three critical properties: pseudo-label reliability, class-balanced distribution, and diverse environmental representation. This mechanism selects samples most representative of new environments for fine-tuning, significantly accelerating adaptation. Experimental results demonstrate that our method achieves superior detection accuracy on real-world 3D printing material datasets under complex scenarios (e.g., sudden illumination changes and environmental shifts). Compared to baseline Test-Time Adaptation (TTA) methods, it exhibits enhanced adaptability and robustness.
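A simplified sketch of the Gram-matrix shift signal is shown below, with random tensors standing in for backbone activations and an assumed detection threshold; the paper's buffer-maintenance and fine-tuning logic is not reproduced.
```python
# Illustrative sketch (assumed interfaces): compare the Gram matrix of a feature
# map on the current frame against a reference Gram from the known environment.
import torch

def gram_matrix(features):
    # features: (C, H, W) feature map from some layer of the detector backbone.
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return (f @ f.T) / (c * h * w)

def shift_score(current_feat, reference_gram):
    return torch.norm(gram_matrix(current_feat) - reference_gram, p="fro").item()

# Toy usage with random tensors standing in for backbone activations.
reference_frames = [torch.randn(64, 32, 32) for _ in range(10)]
reference_gram = torch.stack([gram_matrix(f) for f in reference_frames]).mean(0)

new_frame_feat = torch.randn(64, 32, 32) * 1.5            # pretend illumination change
if shift_score(new_frame_feat, reference_gram) > 0.05:    # threshold is an assumption
    print("environment shift detected -> trigger buffer-based fine-tuning")
```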
|
|
16:48-17:06, Paper MoCT8.2 | |
Postpone or Not: Dynamic Scheduling for Single Additive Manufacturing Machine with Monte Carlo Tree Search (I) |
|
Wu, Hao | Tongji University |
Yu, Chunlong | Tongji University |
Keywords: Additive Manufacturing, Optimization and Optimal Control
Abstract: Additive manufacturing (AM) faces high costs, driving small enterprises toward social manufacturing by pooling distributed orders to spread out fixed costs. However, due to stochastic order arrivals in such systems, real-time scheduling must balance immediate processing to minimize delays with strategic postponement to facilitate cost-effective batch production. This study proposes a dynamic scheduling method for single-AM-machine operators. We first formulate the problem as a Markov Decision Process, then develop a direct lookahead policy based on Monte Carlo Tree Search to approximate optimal decisions, effectively managing the trade-off between operational costs and tardiness penalties.
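As a heavily simplified stand-in for the MCTS-based lookahead (a one-step Monte Carlo rollout comparison rather than a tree search), the sketch below estimates whether processing the current queue now or postponing it yields lower expected cost; the Poisson arrival model and all cost parameters are assumptions.
```python
# Simplified rollout-based postpone-or-not decision; not the paper's MCTS.
import numpy as np

rng = np.random.default_rng(3)
SETUP_COST, WAIT_PENALTY, ARRIVAL_RATE, BATCH_TARGET = 100.0, 2.0, 0.6, 6  # assumed

def rollout(first_action, queued, horizon=20):
    """Simulate one arrival scenario; first_action is 'now' or 'postpone'."""
    cost, q = 0.0, queued
    for t in range(horizon):
        start = (t == 0 and first_action == "now") or q >= BATCH_TARGET
        if start and q > 0:
            cost += SETUP_COST            # one machine setup per batch started
            q = 0
        cost += WAIT_PENALTY * q          # waiting/tardiness accrued this period
        q += rng.poisson(ARRIVAL_RATE)    # stochastic new orders arrive
    return cost

def expected_cost(first_action, queued, n_rollouts=500):
    return float(np.mean([rollout(first_action, queued) for _ in range(n_rollouts)]))

queued_orders = 3
c_now = expected_cost("now", queued_orders)
c_wait = expected_cost("postpone", queued_orders)
print(f"decision: {'process now' if c_now <= c_wait else 'postpone'} "
      f"(E[cost now]={c_now:.1f}, E[cost postpone]={c_wait:.1f})")
```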
|
|
17:06-17:24, Paper MoCT8.3 | |
Quantifying Overlay Printing Registration Accuracy with Object Keypoint Detection for Automated Process Control in FPE Printing (I) |
|
Kim, Juhuhn | Korea Advanced Institute of Science and Technology |
Jung, Younsu | Sungkyunkwan University |
Parajuli, Sajjan | Sungkyunkwan University |
Shrestha, Sagar | Sungkyunkwan University |
Park, Jinhwa | Sungkyunkwan University |
Cho, Gyoujin | Sungkyunkwan University |
Lee, Jong-Seok | Korea Advanced Institute of Science and Technology |
Keywords: Process Control, Computer Vision for Manufacturing, Deep Learning in Robotics and Automation
Abstract: Achieving high-precision overlay printing registration accuracy (OPRA) is a critical challenge in the flexible printed electronics (FPE) printing process, particularly in roll-to-roll (R2R) gravure printing. Conventional OPRA quantification methods, primarily based on template matching, suffer from instability under real-world conditions, such as poor contrast, severe noise, and morphological variations in printed register markers. In this study, we propose a deep learning-based framework for marker detection and OPRA quantification, addressing key limitations of traditional approaches. Our method enables accurate localization of marker centers, overcoming inaccuracies caused by ink translucency, occlusion, and motion-induced blurring. Furthermore, it facilitates automatic real-time OPRA assessment, enabling statistical process control in FPE printing. Experimental evaluations demonstrate the superior robustness and reliability of the proposed approach.
|
|
17:24-17:42, Paper MoCT8.4 | |
Heterogeneous Graph Neural Network with Dual-View Fusion for Machine-Robot Collaborative Scheduling (I) |
|
Xiao, Meng | Shanghai Jiao Tong University |
Chen, Nuo | Shanghai Jiao Tong University |
Chen, Lu | Shanghai Jiao Tong University |
Keywords: Planning, Scheduling and Coordination, AI-Based Methods, Intelligent and Flexible Manufacturing
Abstract: In recent years, deep reinforcement learning (DRL), by integrating neural network technologies, has made significant progress in the field of production scheduling and represents a promising approach. However, current DRL-based solutions lack sufficient analysis of the heterogeneity among production factors in scheduling problems, resulting in suboptimal representations of scheduling states received by DRL algorithms during training. To address this issue, this paper proposes a heterogeneous graph neural network with dual-view fusion for the machine-robot collaborative scheduling problem. It effectively resolves the heterogeneity among the three production factors (operations, machines, and logistics robots) through three neural network modules, thereby providing high-quality representations of scheduling states for DRL algorithms. Experimental results demonstrate that the proposed method outperforms a representative metaheuristic algorithm (genetic algorithm), a priority dispatching rule (first-in, first-out), and existing DRL-based methods, both on small-scale problems and in generalization to large-scale scenarios.
|
|
17:42-18:00, Paper MoCT8.5 | |
Transfer Learning-Driven Scalability for Indoor Positioning Systems in Industrial Environments (I) |
|
Li, Peisen | The Hong Kong Polytechnic University |
Liu, Haoran | Nanyang Technological University |
Guo, Wei | The Hong Kong Polytechnic University,Department of the In |
Yue, Pengjun | The Hong Kong Polytechnic University |
Shen, Leidi | Hong Kong Polytechnic University |
Zhao, Zhiheng | The Hong Kong Polytechnic University |
Huang, George Q. | The Hong Kong Polytechnic University |
Keywords: Cyber-physical Production Systems and Industry 4.0, Intelligent and Flexible Manufacturing, Logistics
Abstract: Accurate and timely spatial-temporal data enables organizations to improve operational efficiency by optimizing production processes, monitoring worker safety, and managing resources. Indoor positioning systems (IPS) are critical for acquiring such data in environments where GNSS does not work. However, traditional IPS implementations often involve substantial effort in signal collection and system calibration, which can be challenging when production layouts change frequently or expand. This research introduces a transfer learning-enabled indoor positioning system (TLIPS) to address these challenges. TLIPS reduces the need for extensive data collection and system calibration by leveraging operational knowledge from existing environments. By applying transfer learning (TL), TLIPS enables the system to adapt and calibrate automatically in new environments, using minimal new data. This significantly reduces manual intervention and calibration time. The effectiveness of TLIPS was tested and validated through deployments in both experimental testbed (source environment) and new environment (target environment). Results demonstrated a marked reduction in calibration time and costs, highlighting the efficiency and adaptability of TLIPS. This approach offers a scalable and efficient solution for IPS deployment across diverse industrial environments, making the system more flexible and less dependent on frequent manual adjustments.
|
|
MoCT9 |
Room T9 |
Trajectory, Object, and Position 3 |
Regular Session |
Chair: Khanal, Abhish | George Mason University |
|
16:30-16:48, Paper MoCT9.1 | |
Cross-Scale Clustering and Neighborhood-Weighted Motion Pattern Extraction for Radar Trajectory Analysis |
|
Wang, Ziqian | University of Chinese Academy of Sciences |
Zhang, Lifang | The Chinese People's Liberation Army |
Guo, Yuxin | University of Chinese Academy of Sciences |
Su, Hu | Institute of Automation, Chinese Academy of Science |
Zou, Wei | Chinese Academy of Sciences, University of Chinese Academy of Sci |
Ma, Hongxuan | Institute of Automation, Chinese Academy of Sciences |
Keywords: Big-Data and Data Mining, Probability and Statistical Methods, Optimization and Optimal Control
Abstract: Clustering and motion pattern extraction of radar trajectories with the same flight route are of great significance for analyzing the intent of flight targets. However, the subjectivity of flight intent and the maneuverability of targets result in large differences between trajectories and increased trajectory complexity, presenting challenges for trajectory clustering and motion pattern extraction. To address these issues, this paper proposes a novel trajectory analysis algorithm for radar trajectories, capable of clustering trajectories across different spatial scales and extracting their motion patterns. Specifically, in the clustering phase, a cross-scale density clustering method is proposed to efficiently cluster trajectories with similar shapes but varying spatial scales. Subsequently, in the motion pattern extraction phase, a neighborhood-weighted algorithm is designed to extract concise and ordered motion patterns from complex trajectories with multiple loops and repetitive movements. Finally, we constructed a large-scale dataset containing 2,754 trajectories collected from real radar data throughout the entire year of 2017. Based on this dataset, we conducted a series of experiments to verify the effectiveness of the proposed algorithm. Experimental results show that our approach effectively overcomes the impact of geographical location and spatial scale differences on clustering and successfully extracts typical motion patterns from complex trajectories.
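Only the cross-scale intuition is sketched below (the paper's density-clustering and neighborhood-weighting algorithms are not reproduced): trajectories are centered and rescaled so that routes with the same shape but different locations and extents become comparable, then clustered with DBSCAN on resampled coordinates.
```python
# Sketch of scale/location normalization before clustering; data are synthetic.
import numpy as np
from sklearn.cluster import DBSCAN

def normalize(traj, n_points=32):
    t = np.linspace(0, 1, len(traj))
    ts = np.linspace(0, 1, n_points)
    resampled = np.column_stack([np.interp(ts, t, traj[:, d]) for d in range(2)])
    resampled -= resampled.mean(axis=0)                  # remove location
    scale = np.linalg.norm(resampled, axis=1).max()
    return resampled / max(scale, 1e-9)                  # remove spatial scale

rng = np.random.default_rng(4)
base = np.column_stack([np.linspace(0, 1, 50), np.sin(np.linspace(0, 3, 50))])
trajs = [base * s + rng.normal(scale=0.02, size=base.shape) + off
         for s, off in [(1, 0), (5, 10), (0.5, -3), (8, 20)]]
trajs.append(np.column_stack([np.linspace(0, 1, 50), np.zeros(50)]) + 7)  # other shape

features = np.stack([normalize(t).ravel() for t in trajs])
labels = DBSCAN(eps=0.8, min_samples=2).fit_predict(features)
print("cluster labels:", labels)      # same-shape routes share a label despite scale
```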
|
|
16:48-17:06, Paper MoCT9.2 | |
SceneDM: Consistent Diffusion Models for Coherent Multi-Agent Trajectory Generation |
|
Guo, Zhiming | Huazhong University of Science & Technology |
Gao, Xing | Shanghai AI Lab |
Zhou, Jianlan | Huazhong University of Science & Technology |
Cai, Xinyu | Shanghai AI Laboratory |
Yang, Xuemeng | Shanghai Artificial Intelligence Laboratory |
Wen, Licheng | Shanghai AI Laboratory |
Sun, Xiao | Shanghai AI Laboratory, China |
Keywords: Deep Learning in Robotics and Automation, Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: Realistic multi-agent motion simulations are essential for the advancement of self-driving algorithms. However, the majority of existing works tend to overlook the kinematic realism of the simulated motions. In this paper, we present SceneDM, a novel consistent diffusion model designed to jointly generate consistent and realistic motions for all types of agents within a traffic scene. To employ temporal dependencies and improve the kinematic realism of the generated motions, we introduce an innovative constructive noise pattern alongside smoothing regularization techniques integrated into the framework of the diffusion model. Moreover, the inference procedure of this model is tailored to effectively ensure local temporal consistency. Furthermore, a scene-level scoring function is incorporated to evaluate the safety and road adherence of the generated agents’ motions, helping to filter out unrealistic simulations. Through empirical validation in the Waymo Sim Agents task, we substantiate the effectiveness of SceneDM in improving the smoothness and realism of generated agent trajectories. The project webpage is available at https://alperen-hub.github.io/SceneDM.
|
|
17:06-17:24, Paper MoCT9.3 | |
DRPA-MPPI: Dynamic Repulsive Potential Augmented MPPI for Reactive Navigation in Unstructured Environments |
|
Fuke, Takahiro | Keio University |
Endo, Masafumi | CyberAgent, Inc |
Honda, Kohei | Nagoya University |
Ishigami, Genya | Keio University |
Keywords: Motion and Path Planning, Reactive and Sensor-Based Planning, Optimization and Optimal Control
Abstract: Reactive mobile robot navigation in unstructured environments is challenging when robots encounter unexpected obstacles that invalidate previously planned trajectories. Model predictive path integral control (MPPI) enables reactive planning, but still suffers from limited prediction horizons that lead to local minima traps near obstacles. Current solutions rely on heuristic cost design or scenario-specific pre-training, which often limits their adaptability to new environments. We introduce dynamic repulsive potential augmented MPPI (DRPA-MPPI), which dynamically detects potential entrapments on the predicted trajectories. Upon detecting local minima, DRPA-MPPI automatically switches between standard goal-oriented optimization and a modified cost function that generates repulsive forces away from local minima. Comprehensive testing in simulated obstacle-rich environments confirms DRPA-MPPI's superior navigation performance and safety compared to conventional methods with less computational burden.
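The cost-switching idea is sketched below in schematic form; the entrapment test, the repulsive potential shape, and all parameter values are assumptions rather than the paper's formulation.
```python
# Conceptual sketch: when the predicted trajectory looks trapped, augment the
# stage cost with a repulsive potential centered on the trap location.
import numpy as np

GOAL = np.array([10.0, 0.0])

def goal_cost(state):
    return np.linalg.norm(state[:2] - GOAL)

def repulsive_cost(state, trap_center, gain=5.0, radius=2.0):
    d = np.linalg.norm(state[:2] - trap_center)
    return gain * max(0.0, radius - d) ** 2        # active only inside `radius`

def stage_cost(state, trapped, trap_center):
    c = goal_cost(state)
    if trapped:                                    # repulsion-augmented mode
        c += repulsive_cost(state, trap_center)
    return c

def looks_trapped(predicted_traj, progress_eps=0.1):
    # Crude entrapment test: the predicted trajectory barely reduces goal distance.
    return goal_cost(predicted_traj[0]) - goal_cost(predicted_traj[-1]) < progress_eps

traj = np.array([[3.0, 1.0], [3.1, 1.0], [3.1, 1.1]])   # stalled near an obstacle
trapped = looks_trapped(traj)
print("trapped:", trapped,
      "cost at [3,1]:", stage_cost(np.array([3.0, 1.0]), trapped, traj[-1]))
```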
|
|
17:24-17:42, Paper MoCT9.4 | |
Learning-Augmented Model-Based Multi-Robot Planning for Time-Critical Search and Inspection under Uncertainty |
|
Khanal, Abhish | George Mason University |
Prince Mathew, Joseph | George Mason University |
Nowzari, Cameron | George Mason University |
Stein, Gregory | George Mason University |
Keywords: Planning, Scheduling and Coordination, Motion and Path Planning, Autonomous Agents
Abstract: In disaster response or surveillance operations, quickly identifying areas needing urgent attention is critical, but deploying response teams to every location is inefficient or often impossible. Effective performance in this domain requires coordinating a multi-robot inspection team to prioritize inspecting locations more likely to need immediate response, while also minimizing travel time. This is particularly challenging because robots must directly observe the locations to determine which ones require additional attention. This work introduces a planning framework for coordinated, time-critical multi-robot search under uncertainty. Our approach uses a graph neural network to estimate the likelihood of points of interest (PoIs) needing attention from noisy sensor data and then uses those predictions to guide a multi-robot model-based planner to determine a cost-effective plan. Simulated experiments demonstrate that our planner improves performance by at least 16.3%, 26.7%, and 26.2% for 1, 3, and 5 robots, respectively, compared to non-learned and learned baselines. We also validate our approach on real-world platforms using quad-copters.
|
|
17:42-18:00, Paper MoCT9.5 | |
An Efficient and Unified Method for Extracting the Shortest Path from the Dubins Set |
|
Huang, Xuanhao | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: Path planning is crucial for the efficient operation of Autonomous Mobile Robots (AMRs) in factory environments. Many existing algorithms rely on Dubins paths, which have been adapted for various applications. However, an efficient method for directly determining the shortest Dubins path remains underdeveloped. This paper presents a comprehensive approach to efficiently identify the shortest path within the Dubins set. We classify the initial and final configurations into six equivalency groups based on the quadrants formed by their orientation angle pairs. Paths within each group exhibit shared topological properties, enabling a reduction in the number of candidate cases to analyze. This pre-classification step simplifies the problem and eliminates the need to explicitly compute and compare the lengths of all possible paths. As a result, the proposed method significantly lowers computational complexity. Extensive experiments confirm that our approach consistently outperforms existing methods in terms of computational efficiency.
|
|
MoCT10 |
Room T10 |
Planning, Scheduling and Control 3 |
Regular Session |
Chair: Ruiz, Cesar | University of Oklahoma |
|
16:30-16:48, Paper MoCT10.1 | |
Lexicographic Optimization-Based Model Predictive Control Framework for Rigid-Formation-Based Collaborative Transportation |
|
Huang, Xuanhao | Xi'an Jiaotong University |
Li, Yuanxiang | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Hu, Jianchen | Xi'an Jiaotong University |
Keywords: Collaborative Robots in Manufacturing, Motion and Path Planning, Optimization and Optimal Control
Abstract: As manufacturing continues to advance, autonomous mobile robots (AMRs) are becoming increasingly essential in logistics transportation. However, individual AMRs fall short in terms of size and payload capacity, restricting their ability to transport large or heavy objects. To address this limitation, collaborative systems composed of multiple AMRs have emerged as a promising solution for transporting oversized items. Despite their potential, existing control methods for collaborative transportation face several challenges, including formation maintenance, trajectory optimization, and obstacle avoidance, each with a different level of priority. To handle these challenges and manage task prioritization effectively, this study proposes a lexicographic optimization-based model predictive control (LO-MPC) framework. Through simulation experiments, the proposed framework demonstrates its ability to ensure a safe and reliable transportation process.
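The lexicographic ordering itself can be illustrated with a toy two-priority problem (no vehicle or formation dynamics): the priority-1 objective is optimized first, and its optimum then becomes a constraint, within a small tolerance, while the priority-2 objective is minimized, so lower-priority goals never degrade higher-priority ones.
```python
# Toy lexicographic optimization (not the paper's vehicle model or constraints).
import numpy as np
from scipy.optimize import minimize

desired_offset = np.array([1.0, 0.0])        # assumed desired relative position

def formation_error(u):                      # priority 1: keep formation
    return np.sum((u[:2] - desired_offset) ** 2)

def control_effort(u):                       # priority 2: minimize effort
    return np.sum(u ** 2)

x0 = np.zeros(4)
res1 = minimize(formation_error, x0, method="SLSQP")
eps = 1e-6                                   # tolerance on the priority-1 optimum

res2 = minimize(
    control_effort, res1.x, method="SLSQP",
    constraints=[{"type": "ineq",
                  "fun": lambda u: res1.fun + eps - formation_error(u)}],
)
print("priority-1 error:", formation_error(res2.x), "effort:", control_effort(res2.x))
```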
|
|
16:48-17:06, Paper MoCT10.2 | |
Virtual Fencing for Safer Cobots |
|
Pippera Badguna, Vineela Reddy | New York University |
Arab, Aliasghar | NYU |
Kodavalla, Durga Avinash | New York University |
Li, Rui | New York University |
Kurabayashi, Katsuo | New York University |
Keywords: Collaborative Robots in Manufacturing, Collision Avoidance, Optimization and Optimal Control
Abstract: Collaborative robots (cobots) increasingly operate alongside humans, demanding robust real-time safeguarding. Current safety standards (e.g., ISO 10218, ANSI/RIA 15.06, ISO/TS 15066) require risk assessments but offer limited guidance for real-time responses. We propose a virtual fencing approach that detects and predicts human motion, ensuring safe cobot operation. Safety and performance trade-offs are modeled as an optimization problem and solved via sequential quadratic programming. Experimental validation shows that our method minimizes operational pauses while maintaining safety, providing a modular solution for human-robot collaboration.
|
|
17:06-17:24, Paper MoCT10.3 | |
Group Confident Policy Optimization |
|
Li, Yao | Tsinghua University |
Liang, Zhenglin | Tsinghua University |
Keywords: Autonomous Agents, Collision Avoidance, Reinforcement
Abstract: Learning-based policy improvement methods rely on extensive data collection and iterative training, leading to credibility challenges during the early training stage when the agent is transferred to new, uncertain, and risky environments. To address this, this paper develops a novel reinforcement learning algorithm, termed Group Confident Policy Optimization (GCPO), which emphasizes enhancing the safety and confidence of exploration processes and policy updates. The algorithm proposes a confident advantage function that leverages group normalization to mitigate training variance induced by sampled data and introduces a confidence-based corrective factor. By employing confidence-augmented policy gradient updates, this method ensures safe agent behaviors and progressive performance improvement throughout the training cycle. Simulation experiments demonstrate superior performance of GCPO in risky tasks compared to conventional methods. These findings contribute to establishing trustworthy engineering paradigms for safety-critical automation in cross-environment transfer scenarios.
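A rough sketch of a group-normalized, confidence-scaled advantage is given below; the corrective factor shown is an assumed placeholder, not GCPO's exact formulation.
```python
# Sketch under assumed definitions; GCPO's actual corrective factor differs.
import numpy as np

def confident_advantages(group_rewards, kappa=1.0):
    r = np.asarray(group_rewards, dtype=float)
    mean, std = r.mean(), r.std() + 1e-8
    advantages = (r - mean) / std                   # group normalization
    confidence = r.size / (r.size + kappa * std)    # assumed corrective factor
    return confidence * advantages, confidence

rewards = [1.0, 0.2, 0.7, 1.4, 0.1]                 # returns of one sampled group
adv, conf = confident_advantages(rewards)
print("confidence:", round(conf, 3), "advantages:", np.round(adv, 3))
# A policy-gradient update would then weight log-prob gradients by `adv`.
```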
|
|
17:24-17:42, Paper MoCT10.4 | |
NN-PL-VIO: A Customizable Neural Network Based VIO Framework with a Lightweight Point-Line Joint Network |
|
Tang, Jiahao | Tsinghua University |
Yu, Jincheng | Tsinghua University |
Xiang, Yunfei | Tsinghua University |
Xue, Min | Tsinghua University |
Xu, Yuanfan | Tsinghua University |
Dong, Yuhan | Tsinghua University |
Wang, Yu | Tsinghua University |
Keywords: Motion and Path Planning, Deep Learning in Robotics and Automation, Learning and Adaptive Systems
Abstract: Harnessing the potential of line features to enhance the localization accuracy of point-based Visual-Inertial SLAM (VINS) has become a focus due to the additional constraints they provide on scene structure, especially in environments with low texture. However, the challenge of real-time performance when integrating line features into VINS remains unaddressed. This paper introduces NN-PL-VIO, a customizable real-time Visual-Inertial Odometry (VIO) system designed for embedded devices to achieve both accuracy and efficiency in point-line feature extraction. This framework facilitates the performance testing of various point, line extraction, and matching methods in a positioning system for multi-feature joint localization. In addition, to offer real-time methods, we propose SuperPLNet, a self-supervised fusion network for joint point-line detection and description. Experiments on the Euroc Dataset show that the accuracy surpasses the baseline by 20% in difficult scenes with a processing speed of up to 11.7fps on embedded systems. The source code of our method is available at: https://github.com/efc-robot/NN-PL-VIO
|
|
17:42-18:00, Paper MoCT10.5 | |
Risk-Aware Planner for Quadrotor in Cluttered and Dynamic Environments |
|
Li, Yongjian | The Hong Kong University of Science and Technology (Guangzhou) |
Zheng, Minzhe | The Hong Kong University of Science and Technology (Guangzhou) |
Chen, Kai | The Hong Kong University of Science and Technology |
Liu, Hongji | The Hong Kong University of Science and Technology |
Zhou, Jinni | Hong Kong University of Science and Technology (Guangzhou) |
Wang, Lujia | The Hong Kong University of Science and Technology (Guangzhou) |
Ma, Jun | The Hong Kong University of Science and Technology |
Keywords: Motion and Path Planning, Collision Avoidance
Abstract: Autonomous quadrotors face significant challenges in navigating through complex environments due to dynamic obstacles. Existing trajectory planning methods typically rely on simplified motion assumptions and struggle to account for the uncertainties introduced by dynamic agents, leading to overly conservative or unsafe paths. To address this issue, we propose a risk-aware planner that integrates probabilistic risk assessment into motion planning, which incorporates dynamic constraints to enable safer and more efficient navigation. Our approach first generates a risk map by combining a probabilistic representation of dynamic obstacles with a distance-based static risk model, offering a more comprehensive environmental risk assessment. Based on this map, we introduce a kinodynamic A* planner that generates an initial path utilizing risk-based heuristics, which is then optimized to minimize risk while ensuring smoothness and feasibility. Simulation experiments demonstrate that our method allows quadrotors to navigate dynamic environments more safely and efficiently.
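A simplified grid-world version of the idea is sketched below (the paper plans over quadrotor states with a kinodynamic A*): a risk map fuses a static obstacle-distance term with a Gaussian dynamic-obstacle term, and the A* stage cost adds a weighted risk penalty so that low-risk detours are preferred; all values are assumptions.
```python
# Simplified grid sketch of risk-map fusion and risk-aware A*; not the paper's planner.
import heapq
import numpy as np

N = 20
grid = np.zeros((N, N))
static_obs = [(8, j) for j in range(4, 16)]            # a wall of static cells
for c in static_obs:
    grid[c] = 1.0

# Static risk decays with distance to obstacles; dynamic risk is a Gaussian blob
# around a moving agent's predicted position.
ys, xs = np.mgrid[0:N, 0:N]
static_risk = np.zeros((N, N))
for (oy, ox) in static_obs:
    static_risk = np.maximum(static_risk, np.exp(-0.5 * ((ys - oy) ** 2 + (xs - ox) ** 2)))
dynamic_risk = np.exp(-0.5 * (((ys - 12) ** 2 + (xs - 10) ** 2) / 2.0))
risk_map = np.clip(static_risk + dynamic_risk, 0, 1)

def astar(start, goal, w_risk=5.0):
    open_set, g, parent = [(0.0, start)], {start: 0.0}, {}
    while open_set:
        _, cur = heapq.heappop(open_set)
        if cur == goal:
            path = [cur]
            while cur in parent:
                cur = parent[cur]
                path.append(cur)
            return path[::-1]
        for dy, dx in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            ny, nx = cur[0] + dy, cur[1] + dx
            if not (0 <= ny < N and 0 <= nx < N) or grid[ny, nx] == 1.0:
                continue
            cost = g[cur] + 1.0 + w_risk * risk_map[ny, nx]   # risk-weighted step
            if cost < g.get((ny, nx), np.inf):
                g[(ny, nx)] = cost
                parent[(ny, nx)] = cur
                h = abs(goal[0] - ny) + abs(goal[1] - nx)     # admissible heuristic
                heapq.heappush(open_set, (cost + h, (ny, nx)))
    return None

path = astar((2, 2), (18, 18))
print("path length:", len(path), "max risk along path:", max(risk_map[p] for p in path))
```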
|
|
MoCT11 |
Room T11 |
Best Application Paper Competition |
Special Session |
Chair: Lennartson, Bengt | Chalmers University of Technology |
|
16:30-16:55, Paper MoCT11.1 | |
Requirement-Driven Sharing of Manufacturing Digital Twins Along the Value Chain |
|
Gnadlinger, Michael | Technical University of Munich (TUM) |
Tilbury, Dawn | University of Michigan |
Barton, Kira | University of Michigan at Ann Arbor |
Wilch, Jan | Technical University of Munich |
Vogel-Heuser, Birgit | Technical University Munich |
Keywords: Cyber-physical Production Systems and Industry 4.0, Manufacturing, Maintenance and Supply Chains, Software, Middleware and Programming Environments
Abstract: Digital Twins (DTs) are key enablers of Smart Manufacturing, yet their adoption across the value chain is hindered by the lack of a standardized sharing framework. This paper addresses this challenge by identifying essential descriptive and qualitative elements of DTs based on standards and literature. Leveraging the Asset Administration Shell (AAS), it proposes a Submodel Template, which standardizes the packaging of DT models, interfaces, and computational and network requirements, thus going beyond, and combining, existing AAS Submodels (i.e., for simulation models) to encapsulate the full multidimensionality of DTs. A case study on a Quality Monitoring DT (QM-DT) demonstrates the template's ability to support seamless DT deployment, aggregation, and operation across heterogeneous manufacturing environments. Results show that the template enables structured transfer of subject matter expertise captured in DT models, real-time constraint support, and interoperability, laying the groundwork for improved DT integration and exchange.
|
|
16:55-17:20, Paper MoCT11.2 | |
Bayesian Intention for Enhanced Human Robot Collaboration |
|
Hernandez-Cruz, Vanessa | Massachusetts Institute of Technology |
Zhang, Xiaotong | Massachusetts Institute of Technology |
Youcef-Toumi, Kamal | Massachusetts Institute of Technology |
Keywords: Human-Centered Automation, Probability and Statistical Methods, Industrial and Service Robotics
Abstract: As robots increasingly assist humans in dynamic tasks, predicting human intent is essential to achieving seamless Human-Robot Collaboration (HRC). Many existing approaches for human intention prediction fail to fully exploit the inherent relationships between objects, tasks, and the human model. Current methods for predicting human intent, such as Gaussian Mixture Models (GMMs) and Conditional Random Fields (CRFs), often lack interpretability due to their failure to account for causal relationships between variables. To address these challenges, in this paper, we developed a novel Bayesian Intention (BI) framework to predict human intent within a multi-modality information framework in HRC scenarios. This framework captures the complexity of intent prediction by modeling the correlations between human behavior conventions and scene data. Our framework leverages these inferred intent predictions to optimize the robot's response in real-time, enabling smoother and more intuitive collaboration. We demonstrate the effectiveness of our approach through an HRC task involving a UR5 robot, highlighting BI's capability for real-time human intent prediction and collision avoidance using a unique dataset we created. Our evaluations show that the multi-modality BI model predicts human intent within 2.69ms, with a 36% increase in precision, a 60% increase in F1 Score, and an 85% increase in accuracy compared to its best baseline method. The results underscore BI's potential to advance real-time human intent prediction and collision avoidance, making a significant contribution to the field of HRC.
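The core Bayesian update can be illustrated with toy numbers (the paper's multi-modality likelihood models are not reproduced): a posterior over candidate intentions is updated with Bayes' rule as successive observation cues arrive.
```python
# Minimal Bayes-update sketch with assumed intentions and likelihoods.
import numpy as np

intentions = ["reach_part_A", "reach_part_B", "handover_to_robot"]
prior = np.array([1 / 3, 1 / 3, 1 / 3])

# Assumed likelihoods P(observation | intention) for two successive cues.
obs_likelihoods = [
    np.array([0.7, 0.2, 0.1]),    # hand moving toward part A
    np.array([0.6, 0.3, 0.1]),    # gaze fixated near part A
]

posterior = prior.copy()
for lik in obs_likelihoods:
    posterior = posterior * lik
    posterior /= posterior.sum()             # Bayes update and normalization
    print({i: round(p, 3) for i, p in zip(intentions, posterior)})
# The robot would then adapt its motion toward the most probable intention.
```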
|
|
17:20-17:45, Paper MoCT11.3 | |
Data-Driven Inventory Management for New Products: An Adjusted Dyna-Q Approach with Transfer Learning (I) |
|
Qu, Xinye | The University of Hong Kong |
Liu, Longxiao | The University of Hong Kong |
Huang, Wenjie | The University of Hong Kong |
Keywords: Inventory Management, Reinforcement, AI-Based Methods
Abstract: In this paper, we propose a novel reinforcement learning algorithm for inventory management of newly launched products with no historical demand information. The algorithm follows the classic Dyna-Q structure, balancing the model-free and model-based approaches, while accelerating the training process of Dyna-Q and mitigating the model discrepancy generated by the model-based feedback. Based on the idea of transfer learning, warm-start information from the demand data of existing similar products can be incorporated into the algorithm to further stabilize the early-stage training and reduce the variance of the estimated optimal policy. Our approach is validated through a case study of bakery inventory management with real data. The adjusted Dyna-Q shows up to a 23.7% reduction in average daily cost compared with Q-learning, and up to a 77.5% reduction in training time within the same horizon compared with classic Dyna-Q. With transfer learning, the adjusted Dyna-Q achieves the lowest total cost, the lowest variance in total cost, and relatively low shortage percentages among all the benchmarked algorithms over a 30-day testing horizon.
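A compact Dyna-Q skeleton with a transfer-learning warm start is sketched below; the state and action spaces, demand model, cost terms, and warm-start source are toy assumptions rather than the paper's bakery case study.
```python
# Compact Dyna-Q sketch with an assumed warm-started Q-table.
import numpy as np

n_inv, n_order = 20, 10                      # inventory levels / order quantities
rng = np.random.default_rng(5)

# Warm start: initialize Q from a policy learned on a similar existing product
# (here just noisy pessimistic values standing in for transferred knowledge).
Q = rng.normal(scale=0.1, size=(n_inv, n_order)) - 1.0
model = {}                                   # learned model: (s, a) -> (reward, next_state)

def step(s, a):
    demand = rng.poisson(3)                  # unknown true demand of the new product
    s_next = min(max(s + a - demand, 0), n_inv - 1)
    reward = -(0.5 * s_next + 4.0 * max(demand - s - a, 0) + 0.2 * a)  # holding/shortage/order
    return reward, s_next

alpha, gamma, eps, planning_steps = 0.1, 0.95, 0.1, 10
s = 5
for t in range(2000):
    a = rng.integers(n_order) if rng.random() < eps else int(np.argmax(Q[s]))
    r, s_next = step(s, a)                                    # real experience
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    model[(s, a)] = (r, s_next)                               # update the learned model
    for _ in range(planning_steps):                           # simulated (planning) updates
        (ps, pa), (pr, pns) = list(model.items())[rng.integers(len(model))]
        Q[ps, pa] += alpha * (pr + gamma * Q[pns].max() - Q[ps, pa])
    s = s_next

print("order quantity at inventory level 5:", int(np.argmax(Q[5])))
```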
|
| |