| |
Last updated on July 16, 2025. This conference program is tentative and subject to change.
Technical Program for Wednesday August 20, 2025
|
WeAT1 |
Room T1 |
Planning, Scheduling and Control 5 |
Regular Session |
Chair: Farzan, Siavash | California Polytechnic State University |
|
08:00-08:18, Paper WeAT1.1 | |
A MILP-Based Solution to Multi-Agent Motion Planning and Collision Avoidance in Constrained Environments |
|
Jaitly, Akshay | Worcester Polytechnic Institute |
Cline, Jack | California Polytechnic State University |
Farzan, Siavash | California Polytechnic State University |
Keywords: Motion and Path Planning, Collision Avoidance, Optimization and Optimal Control
Abstract: We propose a mixed-integer linear program (MILP) for multi-agent motion planning that embeds Polytopic Action-based Motion Planning (PAAMP) into a sequence-then-solve pipeline. Region sequences confine each agent to adjacent convex polytopes, while a big-M hyperplane model enforces inter-agent separation. Collision constraints are applied only to agents sharing or neighboring a region, which reduces binary variables exponentially compared with naive formulations. An L1 path-length-plus-acceleration cost yields smooth trajectories. We prove finite-time convergence and demonstrate on representative multi-agent scenarios with obstacles that our formulation produces collision-free trajectories an order of magnitude faster than an unstructured MILP baseline.
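For readers unfamiliar with big-M separation, the following is a generic textbook form of such a constraint, given only as an illustrative sketch and not as the formulation used in the paper. All symbols are assumptions introduced here: p_i and p_j are the positions of agents i and j, n_k are K fixed unit normals, d is the required clearance, M is a sufficiently large constant, and b_k are binary selectors.

```latex
% Illustrative big-M separation between agents i and j (assumed notation, not the paper's model).
% Setting b_k = 1 activates hyperplane k; b_k = 0 relaxes it by M.
\begin{aligned}
n_k^{\top}(p_i - p_j) &\ge d - M\,(1 - b_k), \quad k = 1,\dots,K,\\
\sum_{k=1}^{K} b_k &\ge 1, \qquad b_k \in \{0,1\}.
\end{aligned}
```

At least one hyperplane must be active, so the pair is separated by at least d along some direction n_k. Each agent pair contributes K binaries, which is why restricting the constraint to nearby agents, as the abstract describes, shrinks the model.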
|
|
08:18-08:36, Paper WeAT1.2 | |
CB-GCS: Conflict-Based Search on the Graph of Convex Sets for Multi-Agent Motion Planning |
|
Zhao, Shizhe | Shanghai Jiao Tong University |
George Philip, Allen | Texas A&M University |
Rathinam, Sivakumar | TAMU |
Choset, Howie | Carnegie Mellon University |
Ren, Zhongqiang | Shanghai Jiao Tong University |
Keywords: Motion and Path Planning
Abstract: Multi-Agent Motion Planning (MAMP) seeks collision-free trajectories for multiple agents from their respective start to goal locations among static obstacles, while minimizing some cost function over the trajectories. Existing approaches include Mixed-Integer Programming (MIP) models, graph-based and sampling-based methods, and trajectory optimization, each with its own limitations. This paper introduces CB-GCS, a new approach that develops a Conflict-Based Search on the Graph of Convex Sets to solve MAMP. CB-GCS plans trajectories for agents in continuous workspaces, represented by time-augmented graphs of convex sets (GCS-T), and resolves agent-agent conflicts by adding constraints to the agents in that GCS-T. We test CB-GCS against various baselines, including a graph-based method that combines search and sampling, as well as a Mixed-Integer Linear Program (MILP) formulation. The numerical results show that solutions found by our approach often have an optimality gap more than 10 times smaller than those found by the baseline when given the same runtime limit.
|
|
08:36-08:54, Paper WeAT1.3 | |
A Formal Definition of the Multi-Robot Multi-Task Time-Extended Assignment Problem Configuration |
|
Miloradovic, Branko | Mälardalen University |
Papadopoulos, Alessandro Vittorio | Mälardalen University |
Keywords: Planning, Scheduling and Coordination, Task Planning
Abstract: Multi-Robot Systems (MRSs) play a crucial role in several fields, including industrial automation, precision agriculture, and urban search and rescue, by enhancing efficiency and operational capabilities. One of the main challenges in these systems is the efficient allocation of tasks, known as Multi-Robot Task Allocation (MRTA). This paper focuses on a particularly complex configuration of MRTA that involves Multi-Task (MT) Robots capable of performing multiple tasks simultaneously, Multi-Robot (MR) Tasks that require coordinated efforts from several robots, and Time-extended Assignments (TA) that demand extended duration scheduling. Despite notable advancements in this area, the MT-MR-TA configuration remains underexplored. Existing research often fails to address the specific challenges associated with coordinating and scheduling these complex tasks. This paper aims to fill this gap by introducing new computational models based on Integer Linear Programming (ILP) and Constraint Programming (CP). These models are purposefully designed to formalize and tackle the intricate dynamics of MT-MR-TA, offering a structured approach to solve this multidimensional optimization problem. We rigorously evaluate these models for their effectiveness using advanced, general-purpose solvers across various instances, with a focus on model scalability, solver efficiency, and overall solution quality.
|
|
08:54-09:12, Paper WeAT1.4 | |
Improving Lower Bounds of the Shortest Path Problem in a 3D Environment of Axis Aligned Cuboids |
|
Battistini, Jarrett | Texas A&M University |
Rathinam, Sivakumar | TAMU |
Keywords: Motion and Path Planning
Abstract: This article addresses a 3D shortest path problem in the presence of obstacles that are axis-aligned cuboids. Our approach relaxes the collision constraint to compute a lower bound of the shortest path. The obstacles are decomposed into line segments that are treated as nodes in a graph. This method also allows for discontinuity between the entering and exiting points of the line segments. Numerical results demonstrate that the gap between our lower bound and the upper bound narrows and that the proposed method outperforms the traditional Euclidean distance as a lower bound.
|
|
WeAT2 |
Room T2 |
RAL Paper Session 3 |
Special Session |
Chair: Zheng, Minghui | Texas A&M University |
|
08:00-08:18, Paper WeAT2.1 | |
SIMPNet: Spatial-Informed Motion Planning Network |
|
Soleymanzadeh, Davood | Texas A&M University |
Liang, Xiao | Texas A&M University |
Zheng, Minghui | Texas A&M University |
Keywords: Motion and Path Planning, Task and Motion Planning, Integrated Planning and Learning
Abstract: Current robotic manipulators require fast and efficient motion-planning algorithms to operate in cluttered environments. State-of-the-art sampling-based motion planners struggle to scale to high-dimensional configuration spaces and are inefficient in complex environments. This inefficiency arises because these planners utilize either uniform or hand-crafted sampling heuristics within the configuration space. To address these challenges, we present the Spatial-informed Motion Planning Network (SIMPNet). SIMPNet consists of a stochastic graph neural network (GNN)-based sampling heuristic for informed sampling within the configuration space. The sampling heuristic of SIMPNet encodes the workspace embedding into the configuration space through a cross-attention mechanism. It encodes the manipulator's kinematic structure into a graph, which is used to generate informed samples within the framework of sampling-based motion planning algorithms. We have evaluated the performance of SIMPNet using a UR5e robotic manipulator operating within simple and complex workspaces, comparing it against baseline state-of-the-art motion planners. The evaluation results show the effectiveness and advantages of the proposed planner compared to the baseline planners.
|
|
08:18-08:36, Paper WeAT2.2 | |
Enhancing Real-Time Body Pose Estimation in Occluded Environments through Multimodal Musculoskeletal Modeling |
|
Guidolin, Mattia | University of Padova |
Vanuzzo, Michael | University of Padua |
Michieletto, Stefano | University of Padua |
Reggiani, Monica | University of Padua |
Keywords: Multi-Modal Perception for HRI, Human-Robot Collaboration, Human Detection and Tracking
Abstract: In recent years, there has been a growing interest in Human-Robot Collaboration (HRC). One of the main challenges in developing effective tools for HRC is accurately estimating human pose in real-time, ensuring both human safety and efficient collaboration. To address this, we propose a novel approach enabling accurate and robust full-body pose estimation in real-time, even in the presence of occlusions. Our system combines information from RGB-D cameras and inertial measurement units, leveraging it to control a musculoskeletal model of the human through a multimodal inverse kinematics optimization. This approach ensures improvements in the anatomical realism and accuracy of the tracked movement while allowing flexibility in accommodating various sensor configurations. The consideration of the underlying anatomical structure also enhances the ability to estimate body poses in occluded environments. We conducted several HRC experiments where the operator's view was obstructed by various types of occlusions. The outcomes demonstrate how our methodology significantly improves pose estimation accuracy, even with a limited set of sensors and in the presence of occlusions in the scene. Our work aims to facilitate advanced HRC applications that require a precise understanding of human movement.
|
|
08:36-08:54, Paper WeAT2.3 | |
Power Adaptation-Enabled Admittance Control for Stable and Safe Actuated Interaction in Unmodeled Environment |
|
Huang, Yizhou | Zhejiang University |
Yang, Liangjing | Zhejiang University |
Keywords: Compliance and Impedance Control, Robust/Adaptive Control, Physical Human-Robot Interaction
Abstract: Pin-based shape display is a type of interactive interface researched in the field of human-robot interaction (HRI) for physical shape rendering through a grid of linear motion actuators. Some researchers have extended it to manipulate objects and interact with humans. This important expansion of pin-based shape display to dynamic shape rendering makes it challenging to interact safely and stably with unmodeled environment dynamics involving humans or other objects. We have previously introduced admittance control to pin-based shape rendering in an attempt to regulate the relation between the motions of the pins and the external force applied by the unmodeled environment. Despite the functional realization, safe and stable interaction is still needed, as the admittance controller may lead to excessive power output. To overcome this, one approach is to apply an energy-based control method to the admittance controller. Bounded energy is allocated to the low-level controller, and bounded power is introduced to limit the power output of the pin-based shape rendering. However, these bounds are difficult to estimate, especially the power bound, which needs to be dynamically adjusted to balance safety against a quick interaction response of the system. To address these issues, we introduce a power adaptation method to the admittance controller in pin-based shape rendering. Experiments with a previously developed pin-based shape rendering system are used to validate the benefit of the power adaptation method. The results indicate that the proposed method performs better in terms of safety and interaction response, demonstrating the potential to introduce higher-level plannin
|
|
08:54-09:12, Paper WeAT2.4 | |
HAS-RRT: RRT-Based Motion Planning Using Topological Guidance |
|
Uwacu, Diane | Mount Holyoke College |
Yammanuru, Ananya | University of Illinois at Urbana-Champaign |
Nallamotu, Keerthana | University of Illinois, Urbana-Champaign |
Chalasani, Vasu | University of Illinois, Urbana-Champaign |
Morales, Marco | University of Illinois Urbana-Champaign & Instituto Tecnológico |
Amato, Nancy | University of Illinois Urbana-Champaign |
Keywords: Motion and Path Planning, Computational Geometry, Constrained Motion Planning
Abstract: We present a hierarchical RRT-based motion planning strategy, Hierarchical Annotated-Skeleton Guided RRT (HAS-RRT), guided by a workspace skeleton, to solve motion planning problems. HAS-RRT provides up to a 91% runtime reduction and builds a roadmap at least 30% smaller than competitors while still finding competitive-cost paths. This is because our strategy prioritizes paths indicated by the workspace guidance to efficiently find a valid motion plan for the robot. Existing methods either rely too heavily on workspace guidance or have difficulty finding narrow passages. By taking advantage of the assumptions that the workspace skeleton provides, HAS-RRT is able to build a smaller roadmap and find a path faster than its competitors. Additionally, we show that HAS-RRT is robust to the quality of workspace guidance provided and that, in a worst-case scenario where the workspace skeleton provides no additional insight, our method performs comparably to an unguided method.
|
|
WeAT3 |
Room T3 |
Mobile Robots 1 |
Regular Session |
Chair: Carpin, Stefano | University of California, Merced |
|
08:00-08:18, Paper WeAT3.1 | |
SeGuE: Semantic Guided Exploration for Mobile Robots |
|
Simons, Cody | University of California, Riverside |
Samanta, Aritra | University of California, Riverside |
Roy-Chowdhury, Amit | University of California, Riverside |
Karydis, Konstantinos | University of California, Riverside |
Keywords: Reactive and Sensor-Based Planning, Motion and Path Planning, Autonomous Vehicle Navigation
Abstract: The rise of embodied AI applications has enabled robots to perform complex tasks that require sophisticated understanding of their environment. To allow successful robot operation in such settings, maps must be constructed to include both semantic and geometric information. In this paper, we address the novel problem of semantic exploration, whereby a mobile robot must autonomously explore an environment to fully map its structure and features' semantic appearance. We develop a method based on next-best-view exploration, where potential poses are scored based on the semantic features visible from that pose. We explore two alternative methods for sampling potential views and demonstrate the effectiveness of our framework in both simulation and physical experiments. The automatic creation of high-quality semantic maps can enable robots to better understand and interact with their environments and facilitate the deployment of future embodied AI applications.
|
|
08:18-08:36, Paper WeAT3.2 | |
Moving Matter: Using a Single, Simple Robot to Reconfigure a Connected Set of Building Blocks |
|
Garcia Gonzalez, Javier | University of Houston |
Friemel, Jonas | Bochum University of Applied Sciences |
Kosfeld, Ramin | Technische Universität Braunschweig |
Yannuzzi, Michael | University of Houston |
Kramer, Peter | TU Braunschweig |
Rieck, Christian | Technische Universität Braunschweig |
Scheffer, Christian | Technische Universität Braunschweig |
Schmidt, Arne | TU Braunschweig |
Kube, Harm | Technische Universität Berlin |
Biediger, Daniel | University of Houston |
Fekete, Sándor | Technische Universität Braunschweig |
Becker, Aaron | University of Houston |
Keywords: Motion and Path Planning, Intelligent and Flexible Manufacturing, Automation in Construction
Abstract: We implement and evaluate different methods for the reconfiguration of a connected arrangement of tiles into a desired target shape, using a single active robot that can move along the tile structure. This robot can pick up, carry, or drop off one tile at a time, but it must maintain a single connected configuration at all times. Becker et al. [5] recently proposed an algorithm that uses histograms as canonical intermediate configurations, guaranteeing performance within a constant factor of the optimal solution if the start and target configurations are well-separated. We implement and evaluate this algorithm in both simulated and practical settings, using an inchworm-type robot, and compare it with two existing heuristic algorithms.
|
|
08:36-08:54, Paper WeAT3.3 | |
Approximation Algorithms for Cooperative Multi-Robot Patrolling in Core-Periphery Graph Settings |
|
Bassot, Alex | University of Milan |
Carpin, Stefano | University of California, Merced |
Basilico, Nicola | University of Milan |
Keywords: Planning, Scheduling and Coordination, Motion and Path Planning, Surveillance Systems
Abstract: The adoption of autonomous robots for patrolling introduces efficiency and resilience into automated surveillance systems. A fundamental challenge in this domain lies in optimal planning of patrol routes, a problem typically addressed through graph-based models of the environment. In this work, we consider core-periphery graph settings, where locations are divided into a high-priority core and a lower-priority periphery, a structure that naturally aligns with many real-world scenarios, such as urban surveillance and infrastructure security. We extend the Overlapping Partition Problem (OPP), a recently proposed formalization for core-periphery settings, providing novel theoretical insights by deriving approximation bounds under the assumption that the core is known, a realistic assumption in many surveillance contexts. Our theoretical contributions are complemented by empirical comparisons of our method with two state-of-the-art baselines, demonstrating our method's superior performance in computing effective patrolling strategies.
|
|
08:54-09:12, Paper WeAT3.4 | |
Integrating Bilevel Planning and Offline Skill Learning for Enhancing Mobile Manipulation |
|
Watanabe, Shin | University of Oslo |
Horn, Geir | University of Oslo |
Torresen, Jim | University of Oslo |
Ellefsen, Kai Olav | University of Oslo |
Keywords: Task Planning, Deep Learning in Robotics and Automation, Motion and Path Planning
Abstract: Solving complex robotic mobile manipulation tasks requires both planning the sequence of skills to execute and learning how to robustly execute each skill. Planning-based approaches such as task and motion planning (TAMP) can help train skills more efficiently through demonstrations, while learning-based approaches such as reinforcement learning (RL) can help plan tasks more quickly through heuristics. This paper presents a novel approach to generalizing mobile manipulation tasks by synergistically combining sampling-based TAMP and value-based RL. The TAMP solver first generates suboptimal demonstration trajectories of a particular skill, from which an offline RL algorithm distills a robust policy and a value function, the latter serving as a skill feasibility classifier. The policy and the classifier are both fed back into the TAMP workflow not only to improve the skill success rate but also to speed up the planner by sampling robot configurations for which the skill is likely to succeed. We evaluate the approach on a simulated block-pushing domain. Re-purposing a byproduct of an offline skill-learning process leads to an integrated planning and learning system that exploits the awareness of its own skill competence.
|
|
WeAT4 |
Room T4 |
Production and Logistics Automation |
Regular Session |
Chair: Nguyen, Quoc Hung | Hitachi, Ltd |
|
08:00-08:18, Paper WeAT4.1 | |
Low Cost Automation of Last Mile Delivery |
|
Amarnath Reddy, Kallam | Cognizant |
Edulakanti, Nikitha | Fresenius Medical Care |
Krivic, Senka | University of Sarajevo |
Keywords: Intelligent Transportation Systems, Logistics, Planning, Scheduling and Coordination
Abstract: Last-mile delivery remains a significant logistical challenge, requiring cost-effective automation to optimize routes, resource allocation, and package handling. This paper presents a low-cost automation framework for last-mile delivery that integrates a capacitated vehicle routing problem (CVRP) with dynamic pickups, 3D bin packing, and multi-modal optimization. We propose scalable algorithms that leverage computer vision for volume estimation, heuristic-based route planning, and dynamic scheduling to minimize operational costs while maximizing efficiency. Experimental results demonstrate reduced travel distances, improved load utilization, and enhanced delivery reliability. Our approach provides a practical and scalable solution for automating last-mile logistics, ensuring timely and cost-efficient deliveries.
|
|
08:18-08:36, Paper WeAT4.2 | |
Safety Stock Model Selection Optimization for Budget-Constrained Multi-Item Inventory Management: A Scalable Framework |
|
Nguyen, Quoc Hung | Hitachi, Ltd |
Suemitsu, Issei | Hitachi, Ltd |
Akutsu, Itoe | Hitachi, Ltd |
Aimi, Daisuke | Hitachi, Ltd |
Oka, Tsuyoshi | Hitachi Vantara, Ltd |
Keywords: Inventory Management, Optimization and Optimal Control
Abstract: Effective safety stock management is essential for balancing service levels and inventory holding costs in supply chains. Conventional analytical models often rely on unrealistic assumptions, such as normally distributed demand, while simulation-based methods, though more accurate and flexible, are computationally impractical for firms managing thousands of items. This paper proposes the Safety Stock Model Selection Optimization (SSMSO) framework, where a model refers to a specific approach or algorithm used to determine safety stock levels for a single item and is evaluated by the win rate metric—defined as the proportion of periods with no stockouts. SSMSO determines safety stock levels by optimally selecting a model for each item from a list of individual models, thereby reducing reliance on specific probabilistic assumptions. The optimization problem is formulated as a binary integer linear program (ILP) that maximizes the total win rate while ensuring the total safety stock cost remains within budget. By leveraging LP solvers with special ordered set constraints, SSMSO efficiently handles thousands of items, scaling significantly better than nonlinear programming-based methods, which struggle with convergence for large datasets. Experimental results on a real-world dataset demonstrate the effectiveness of SSMSO in reducing shortage rates while maintaining budget feasibility, making it a practical and scalable solution for multi-item safety stock optimization.
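The selection problem described in this abstract can be illustrated with a toy example. The sketch below is not the SSMSO implementation: it assumes a stockout occurs whenever demand in a period exceeds the safety stock, uses exhaustive search in place of an ILP solver, and all function names and numbers are hypothetical.

```python
from itertools import product

def win_rate(demand, safety_stock):
    """Share of periods in which demand does not exceed the safety stock (assumed stockout rule)."""
    return sum(d <= safety_stock for d in demand) / len(demand)

def select_models(win, cost, budget):
    """Pick one model per item to maximize total win rate within the budget.

    win[i][m] and cost[i][m] are the win rate and safety stock cost of model m on item i.
    Brute force stands in for an ILP solver and is only viable for tiny instances.
    """
    best_choice, best_total = None, -1.0
    for choice in product(*(range(len(row)) for row in win)):
        total_cost = sum(cost[i][m] for i, m in enumerate(choice))
        if total_cost > budget:
            continue  # violates the budget constraint
        total_win = sum(win[i][m] for i, m in enumerate(choice))
        if total_win > best_total:
            best_choice, best_total = choice, total_win
    return best_choice, best_total

# Hypothetical data: 2 items, 2 candidate models each.
win = [[0.80, 0.95], [0.70, 0.90]]
cost = [[10, 30], [5, 25]]
choice, total = select_models(win, cost, budget=40)
```

Here the budget rules out choosing the expensive model for both items, so the search keeps the cheap model for the first item and the expensive one for the second. The exhaustive loop grows exponentially with the number of items, which is the scaling problem an ILP formulation is meant to avoid.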
|
|
08:36-08:54, Paper WeAT4.3 | |
From Off-Line Programming to Cyber-Physical Systems: An Optimized Software Architecture |
|
Fantuzzi, Cesare | Università Di Modena E Reggio Emilia |
Battilani, Nicola | University of Modena and Reggio Emilia |
Costi, Silvia | Industria Tecnologica Italiana Srl |
Da Silva Araujo, Joao Marcos | Industria Tecnologica Italiana Srl |
Gaddoni, Giacomo | SACMI |
Gambazza, Mattia | Gaiotto.Sacmigroup |
Masotti, Gabriele | Sacmi |
Mattioli, Mirko | Sacmi S.c |
Morchia, Lorenzo | Gaiotto Automation (SACMI Group) |
Ragaglia, Matteo | Gaiotto Automation SpA |
Keywords: Industrial and Service Robotics, Cyber-physical Production Systems and Industry 4.0, Control Architectures and Programming
Abstract: The increasing role of industrial robots in modern manufacturing is crucial for enhancing competitiveness. However, small-to-medium-sized enterprises (SMEs) still face challenges in adopting robotic technologies, primarily due to the complexity and time-consuming nature of robot programming. Traditional programming methods rely on either teach pendant programming, which causes machine downtime, or offline programming (OLP), which reduces downtime but requires extensive calibration and validation. Despite its advantages, OLP remains a challenging approach due to the difficulty of accurately replicating real-world conditions in a virtual environment. A key challenge is ensuring proper calibration, as even minor discrepancies between the simulation and the actual robotic system can lead to positioning errors, inefficiencies, and operational failures. This paper presents SmartOffline NextGen (SmOffNG), SACMI’s advanced OLP solution designed to improve calibration efficiency and system performance. The proposed innovation introduces a dynamic recompilation functionality that minimizes computational overhead while maintaining high accuracy. The method leverages a precomputed lattice of offset configurations to predict optimal recalibrations without requiring full reprocessing. By applying a spatial locality principle, the system efficiently selects the closest precomputed configuration, significantly reducing recalibration times while preserving precision. The main contributions of this work include:
• An optimized architecture for program rebuilding.
• Virtualization of the rebuilding lattice to enhance flexibility and efficiency.
• Performance optimization of the cyber-physical system in terms of cycle time and duty cycle.
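The closest-configuration lookup mentioned in the abstract above can be pictured as a nearest-neighbor query over a grid of offsets. This is a minimal sketch under assumed conventions (three-dimensional translational offsets, Euclidean distance, a hypothetical 1 mm grid), not a description of how SmOffNG works.

```python
import math

def nearest_offset(lattice, measured):
    """Return the precomputed lattice offset closest (Euclidean) to the measured offset."""
    return min(lattice, key=lambda node: math.dist(node, measured))

# Hypothetical lattice of (dx, dy, dz) offsets in millimetres, spaced 1 mm apart.
lattice = [(x, y, z) for x in (-1, 0, 1) for y in (-1, 0, 1) for z in (-1, 0, 1)]
closest = nearest_offset(lattice, (0.4, -0.7, 0.1))
```

A linear scan is enough for a small lattice. On a regular grid the same result could be obtained by rounding each coordinate to the nearest grid step, which avoids the scan entirely.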
|
|
08:54-09:12, Paper WeAT4.4 | |
UNIQC: Unified I/Q Correction with Quality-Scored Filtering for BLE-Enabled Application Traceability for Industry 4.0 Applications |
|
Huang, Yijia | The Hong Kong Polytechnic University |
Lam, Hin Sang | The Hong Kong Polytechnic University |
Zhao, Zhiheng | The Hong Kong Polytechnic University |
Huang, George Q. | The Hong Kong Polytechnic University |
Keywords: Cyber-physical Production Systems and Industry 4.0
Abstract: Industry 4.0 has catalyzed resource allocation, merging high-precision traceability to streamline production and minimize wastage. Within the industrial context, real-time asset tracking enhances operational efficiency and diminishes energy expenditure by alleviating logistical inefficiencies. Although the Global Positioning System is robust for outdoor localization, indoor positioning systems (IPS) frequently exhibit suboptimal performance. The advent of Bluetooth Low Energy (BLE) v5.1, leveraging angle of arrival (AoA), delivers a cost-efficient, precise framework for traceability in industrial indoor environments. Prevailing research primarily emphasizes exploiting the collected signals to estimate angles for indoor localization. However, little attention is paid to signal quality, which is particularly affected by complex industrial environments and may markedly degrade positioning performance. Specifically, imbalances between in-phase (I) and quadrature (Q) signal components impair received signal quality, ultimately undermining positioning accuracy. To address this challenge, this article proposes UNIQC for BLE AoA-enabled IPS, incorporating unified I/Q correction with quality-scored filtering to elevate precision and reduce computational load. An onsite experiment validates its efficacy, evidencing enhanced filtering and superior angle estimation for resource traceability in Industry 4.0.
|
|
WeAT5 |
Room T5 |
Motion Control and Planning 1 |
Regular Session |
Chair: Heikkilä, Tapio | VTT Technical Research Centre of Finland |
|
08:00-08:18, Paper WeAT5.1 | |
Gaussian Path Model Library for Intuitive Robot Motion Programming by Demonstration |
|
Soutukorva, Samuli | VTT Technical Research Centre of Finland |
Suomalainen, Markku | VTT Technical Research Centre of Finland |
Kollingbaum, Martin | Unaffiliated |
Heikkilä, Tapio | VTT Technical Research Centre of Finland |
Keywords: Learning and Adaptive Systems, Machine learning, Intelligent and Flexible Manufacturing
Abstract: This paper presents a system for generating Gaussian path models from teaching data representing the path shape. In addition, methods for using these path models to classify human demonstrations of paths are introduced. By generating a library of multiple Gaussian path models of various shapes, human demonstrations can be used for intuitive robot motion programming. A method for modifying existing Gaussian path models by demonstration through geometric analysis is also presented.
|
|
08:18-08:36, Paper WeAT5.2 | |
Developing a Distributed Control Architecture for Legged Robot Locomotion |
|
Moreira, João P. | University of Porto |
Pinto, Vítor H | SYSTEC (DIGI2) |
Costa, Paulo | University of Porto |
Keywords: Control Architectures and Programming, Formal Methods in Robotics and Automation, Motion Control
Abstract: Control of legged robots is a complex task involving high-degree-of-freedom underactuated systems with contact constraints. This work presents a control architecture that leverages simplifications and model segregation in an attempt to develop a transparent solution to the locomotion problem. By splitting a robot's model into a set of legs and a main body, each of these lower-dimensional components becomes easier to analyze, and a controller is developed for each of them. The controllers interface through the wrenches applied by each leg, making the distribution of the leg controllers' inputs linear. The different controllers are tested in a 2D simulation.
|
|
08:36-08:54, Paper WeAT5.3 | |
2D Balancing Controller for a Unicycling Biped Via Legged Locomotion |
|
Ma, Dylan | Florida State University |
Hubicki, Christian | Florida State University |
Higgins, Taylor | Florida State University |
Keywords: Optimization and Optimal Control, Motion Control, Biomimetics
Abstract: Legged humanoid robots are growing in popularity in the commercial space, so determining how to control legs to achieve dynamic balancing is critical. Previous work has been done on creating robots that can unicycle, but these robots directly actuate the wheel or do not have fully actuated legs. The controller proposed in this work uses a multi-body model consisting of three branched pendulums connected to a wheel. Double pendulums are used to represent the legs while a single pendulum represents the body of the biped joined to the unicycling stand. The model uses anthropometric data to ensure human-like proportions of the leg segments. A linear-quadratic regulator (LQR) was implemented to drive the robotic hip and knee joints such that they would actuate the pedal mechanism of the unicycle. In simulation, we were able to drive the system to balance while pedaling two rotations of the wheel in under 5 seconds. When disturbance torques were applied to oppose the motion, the system was able to recover from a maximum torque of 1.3 Nm applied during its most robust configuration and 0.5 Nm during its least robust configuration, both within 10 seconds of the angular impulse.
|
|
08:54-09:12, Paper WeAT5.4 | |
Implementation and Validation of Obstacle Avoidance Algorithms on a Self-Balancing Robot |
|
Mansouri, Mahshid | University of Illinois at Urbana-Champaign |
Huang, Zhe | University of Illinois at Urbana-Champaign |
Chen, Yu | University of Illinois at Urbana-Champaign |
Driggs-Campbell, Katherine | University of Illinois at Urbana-Champaign |
Norris, William | University of Illinois Urbana-Champaign |
Ramos, Joao | University of Illinois at Urbana-Champaign |
Hsiao-Wecksler, Elizabeth T. | University of Illinois at Urbana-Champaign |
Keywords: Autonomous Agents, Collision Avoidance, Motion and Path Planning
Abstract: Driving dynamically stable robots such as ballbots through complex environments poses significant challenges due to their unique dynamics and underactuated nature. This study presents the evaluation of a state-of-the-art sampling-based path planning algorithm, called Neural Informed Rapidly-exploring Random Tree Star (NIRRT*), on a payload-carrying ballbot, namely MiaPURE, for autonomous navigation applications. Extensive testing was conducted to assess the algorithm's effectiveness in navigating challenging indoor environments with static and dynamic obstacles, under various speeds and payload conditions. Results demonstrated the overall effectiveness of the NIRRT* algorithm in avoiding collisions with static and dynamic obstacles. The performance of NIRRT* was also compared against the Dynamic Window Approach (DWA) and two non-autonomous human-controlled methods (i.e., on-board control using hands-free control, remote control using joystick teleoperation). NIRRT* demonstrated similar completion times and no collisions compared to DWA, while also achieving completion times and collision rates comparable to teleoperation. Finally, we demonstrated MiaPURE’s autonomous navigation in an indoor setting, successfully navigating about 20 m at 0.4 m/s and passing through areas as narrow as 90 cm, highlighting its potential for practical deployment in indoor payload-carrying and human-riding applications.
|
|
WeAT6 |
Room T6 |
Factory Automation 1 |
Regular Session |
Chair: Sarkar, Mrinmoy | North Carolina A&T State University |
|
08:00-08:18, Paper WeAT6.1 | |
Modelling and Solution to Production Planning in Cold Rolling |
|
Yang, Yang | Northeastern University |
Su, Peng | Northeastern University |
Tang, Lixin | Northeastern University |
Keywords: Planning, Scheduling and Coordination, Manufacturing, Maintenance and Supply Chains
Abstract: The cold rolling production process is characterized by the restriction that the starting time of an item on a downstream production line cannot be earlier than its completion time on the upstream production lines. To fully utilize production capacity and coordinate the multiple production lines, a production planning problem in cold rolling is derived, which is to decide which orders to produce and in what quantities on each production line on each day, taking into account the production capacity restrictions and the material distribution throughout the production process. To describe the problem, an integer programming model is formulated. To solve the problem, a branch-and-price algorithm is proposed. First, the primal model is reformulated so that the master problem coordinates the production plans and the sub-problem decides which orders to process and their production volumes on each production line on each day. A lower bound is obtained by column generation, and the optimal solution is then found by a branch-and-bound approach. Moreover, a differential evolution algorithm is developed to solve large-scale instances. Finally, the performance of the proposed approaches is illustrated by computational experiments based on randomly generated instances.
|
|
08:18-08:36, Paper WeAT6.2 | |
Approaches to Automatic Discovery and Modeling of Industrial Assets for IT/OT Integration |
|
Todkar, Anandrao Harishchandra | Siemens Corporation |
Sarkar, Mrinmoy | North Carolina A&T State University |
Solanki, Jitendra Singh | Siemens |
Tylka, Joseph | Siemens Technology |
Keywords: Cyber-physical Production Systems and Industry 4.0, AI-Based Methods, Factory Automation
Abstract: Data integration in brownfield industrial environments is an essential step to realize the benefits of Industry 4.0 and information technology (IT)/operational technology (OT) convergence, but requires significant effort to connect, discover, identify, organize, and model the runtime data available on the relevant assets. Two methods are implemented to automatically construct OPC UA (Open Platform Communications Unified Architecture) information models from discovered assets. The first is based on large language models (LLMs) and knowledge graph-based retrieval-augmented generation (GraphRAG), while the second employs text clustering to map data points (e.g., tags) to the appropriate OPC UA type model. The methods are compared by executing generation tasks for target models of differing complexity and assessing syntactical and semantic accuracy. The results suggest that, while GraphRAG-based approaches are capable of generating accurate and compliant results for simple target semantic models, they struggle to match the performance of more traditional domain-specific algorithms on more complex tasks. By providing a holistic system and methodology for asset discovery and modeling, this research aims to significantly reduce engineering effort and enable seamless IT/OT integration in industrial environments.
|
|
08:36-08:54, Paper WeAT6.3 | |
Modular Production Using Hierarchical Planning and a Generic OPC UA Skills Concept |
|
Koch, Philip | Fraunhofer Institute for Manufacturing Technology and Advanced M |
Töpfer, Nico | Fraunhofer IFAM |
Albrecht, Sebastian | Siemens Corporate Technology |
Thiele, Bernhard | German Aerospace Center |
Reiser, Robert | Institute of Robotics and Mechatronics, German Aerospace Center |
Keywords: Cyber-physical Production Systems and Industry 4.0, Planning, Scheduling and Coordination, Factory Automation
Abstract: In response to the growing global demand for customized and sustainable aircraft with shorter lead times, this research aims to enhance flexibility and automation in aircraft production. The study focuses on integrating hierarchical process planning with machine skills and simulation to streamline the final assembly phase, where numerous components and modules are combined. Existing research highlights challenges in coordinating module production across various locations, emphasizing the need for cost-effective and adaptable systems to manage demand fluctuations and supply chain disruptions. This study presents a novel approach by expanding the concept of machine skills to include prediction skills, enabling pre-execution, simulation-based outcome estimation for informed decision-making and resource optimisation. The proposed methodology involves a minimal demonstration scenario featuring a warehouse, preassembly stations, a final assembly station, and flexible intralogistics with automated guided vehicles (AGVs), all connected through standardized OPC UA interfaces. This setup serves as a use case to illustrate the application of a generic software architecture and standardised modeling for task-specific skill combinations. By addressing hierarchical planning tasks supported by simulation and standardized interfaces, the research provides a framework for adaptive process orchestration and flexible logistics. The results show a significant improvement in the flexibility and efficiency of production process planning and execution, which has practical implications for the further development of aircraft manufacturing processes.
|
|
08:54-09:12, Paper WeAT6.4 | |
Task and Grasp Constrained 3D Robotic Work-Cell Layout Planning |
|
Ali, Radwa | Osaka University |
Hu, Zhengtao | Shanghai University |
Kiyokawa, Takuya | Osaka University |
Wan, Weiwei | Osaka University |
Nishi, Tatsushi | Ritsumeikan University |
Harada, Kensuke | Osaka University |
Keywords: Factory Automation, Logistics, Task Planning
Abstract: Current robotic cellular layout planning predominantly focuses on 2D configurations, often neglecting the spatial constraints imposed by manipulated objects and robotic tasks, which can lead to impractical solutions. This research introduces a multi-objective optimization framework for 3D robotic work-cell layout planning, integrating task, grasp, and motion constraints to enhance feasibility and real-world applicability. The proposed approach formulates objectives by balancing spatial efficiency, task-aware motion planning, and grasp feasibility. To improve computational efficiency, the framework incorporates a hybrid NSGA-II and PSO optimization strategy alongside adaptive and parallel computing techniques. The methodology is validated through case studies, including randomized layout evaluations and a greenfield automotive robotic work-cell, demonstrating significant improvements in solution quality and computational efficiency. A multi-perspective evaluation, including a comparative analysis with learning-based algorithms and an assessment based on task-aware performance metrics, verifies the robustness and superiority of the proposed framework. Results confirm that incorporating task-aware constraints significantly influences layout design.
|
|
WeAT7 |
Room T7 |
Energy and Sustainability 1 |
Regular Session |
Chair: Jafari, Mohsen | Rutgers University |
|
08:00-08:18, Paper WeAT7.1 | |
Dynamic Occupancy Measurement for Smart Buildings: A Few-Shot Large Language Model Approach |
|
Qaisar, Irfan | Tsinghua University |
Sun, Kailai | Massachusetts Institute of Technology, Tsinghua University |
Zhao, Qianchuan | Tsinghua University |
Keywords: Building Automation, Sensor Fusion, Human Factors and Human-in-the-Loop
Abstract: Accurate indoor occupancy measurement is essential for energy-efficient smart buildings and optimized HVAC control. Traditional deep-learning models often struggle with dynamic occupancy patterns and limited data availability. Recently, Large Language Models (LLMs) have emerged as promising solutions due to their adaptability and contextual learning capabilities. This study proposes an LLM-based building occupancy detection and estimation framework with few-shot learning and in-context learning (ICL). It evaluates three advanced LLMs (Llama3.2, Gemini-Pro, Deepseek-R1) against traditional models (Logistic Regression, Random Forest, XGBoost) using real-world datasets collected in China and Singapore. Results indicate that LLMs consistently achieve superior performance, especially under limited training data conditions. For occupancy detection tasks, Gemini-Pro reached 95.83% accuracy with a 4-day training split and maintained 95.90% accuracy even with a reduced 3-day training period. Similarly, in occupancy estimation tasks, Gemini-Pro achieved 91.15% accuracy (4-day training) and maintained robust performance (94.14%) in a 1-day training scenario. In addition, this study simulates occupancy-centric control using real-world occupancy data in one office, verifying the potential of the proposed framework for saving building energy (10%-30%) and improving occupant comfort.
|
|
08:18-08:36, Paper WeAT7.2 | |
Active Inference for Energy Control and Planning in Smart Buildings and Communities |
|
Nazemi, Seyyed Danial | Rutgers University |
Jafari, Mohsen | Rutgers University |
Matta, Andrea | Politecnico Di Milano |
Keywords: Building Automation, Smart Home and City, Learning and Adaptive Systems
Abstract: Active Inference (AIF) is emerging as a powerful framework for decision-making under uncertainty, yet its potential in engineering applications remains largely unexplored. In this work, we propose a novel dual-layer AIF architecture that addresses both building-level and community-level energy management. By leveraging the free energy principle, each layer adapts to evolving conditions and handles partial observability without extensive sensor information while respecting data privacy. We validate the continuous AIF model against both a perfect optimization baseline and a reinforcement learning-based approach. We also test the community AIF framework under extreme pricing scenarios. The results highlight the model’s robustness in handling abrupt changes. This study is the first to show how a distributed AIF works in engineering. It also highlights new opportunities for privacy-preserving and uncertainty-aware control strategies in engineering applications.
|
|
08:36-08:54, Paper WeAT7.3 | |
BIM-LOD: The Envelope Interactive Design and Energy System Dynamic Scheduling Considering Carbon Emissions Throughout the Smart Building Life Cycle |
|
Tian, Ying | The MOE KLINNS Lab of Xi'an Jiaotong University |
Zhikun, Gao | Xi'an Jiaotong University |
Wu, Jiang | Xi'an Jiaotong University |
Xu, Zhanbo | Xi'an Jiaotong University |
Guan, Xiaohong | Xi'an Jiaotong University |
Keywords: Smart Home and City, Modelling, Simulation and Validation of Cyber-physical Energy Systems, Automation in Construction
Abstract: Nowadays, the interplay between building ontology and energy systems is progressively intensifying with the development of new energy technologies, creating new challenges in terms of economic and low-carbon requirements. To address this issue, a parametric modelling approach was first used to investigate the architecture of an interactive optimal structure for the building envelope and energy system. Secondly, BIM-LOD was employed to conduct modelling research on the evaluation and calculation of building energy systems based on information physical integration, multiple objectives, and constraint conditions such as energy balance and the enclosure structure. Thirdly, an actual case was used to provide decisions for the simulation object and to carry out the analysis. The results of the study show that the average carbon emission is reduced by 77.29% and the average cost is reduced by 75.63% compared to the corresponding reference model. In addition, the effect of the combined optimal solution was shown to increase by 19.21%, 15.43% and 20.35% respectively in the heating, transition and cooling seasons compared to the baseline scheme. The practical value of this study lies in its potential to manage envelope structures and energy systems in complex scenarios, thereby providing interactive design strategies and dynamic scheduling for optimising carbon emissions.
|
|
08:54-09:12, Paper WeAT7.4 | |
A Learning-Based Lagrangian Relaxation Algorithm for Integrated Scheduling of Oxygen Allocation and Steelmaking-Continuous Casting in Steel Enterprise |
|
Chang, Miao | Northeastern University |
Zhao, Shengnan | Northeastern University |
Tang, Lixin | Northeastern University |
Keywords: Energy and Environment-aware Automation, Intelligent and Flexible Manufacturing, Reinforcement
Abstract: Effective energy management is crucial for reducing production costs in industrial settings. This is particularly evident in the steel industry, where the steelmaking and continuous casting scheduling processes consume substantial amounts of energy. The intermittent nature of steelmaking leads to significant fluctuations in energy supply and demand, resulting in energy waste. Neglecting the scheduling of energy supply during the production process fails to effectively reduce the overall production costs. To address this issue, we propose a mixed-integer linear programming (MILP) model that integrates steelmaking-continuous casting scheduling with oxygen supply scheduling. Given complex constraints on the model and combinatorial characteristics, we employ the Lagrangian relaxation algorithm to solve it. To enhance the algorithm's efficiency, we introduce a reinforcement learning approach to dynamically update the step size coefficients of the Lagrangian multipliers. Through numerical experiments, we first validate the model's ability to reduce production costs and subsequently examine the impact of various reinforcement learning strategies on algorithm performance. The results indicate that the proposed model effectively reduces production costs and the algorithm improvement strategy significantly improves performance.
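For readers less familiar with the technique named in this abstract, a textbook subgradient update for the Lagrangian multipliers is shown below. It is the generic form for a minimization problem with relaxed inequality constraints, not the formulation from the paper. All symbols are illustrative assumptions: \lambda^{k} are the multipliers at iteration k, g^{k} is the subgradient (the relaxed-constraint violation at the current relaxed solution), L(\lambda^{k}) is the dual value, \bar{Z} is the best known feasible (upper-bound) cost, and \alpha^{k} is the step size coefficient, which is the kind of quantity a learned policy could adjust instead of a fixed decay rule.

```latex
\lambda^{k+1} = \max\!\bigl(0,\; \lambda^{k} + s^{k}\, g^{k}\bigr),
\qquad
s^{k} = \alpha^{k}\,\frac{\bar{Z} - L(\lambda^{k})}{\lVert g^{k} \rVert^{2}},
\qquad 0 < \alpha^{k} < 2
```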
|
|
WeAT8 |
Room T8 |
Autonomous Systems 2 |
Regular Session |
Chair: Oksanen, Timo | Technical University of Munich |
|
08:00-08:18, Paper WeAT8.1 | |
Continuous Integration and Delivery of ROS2 Projects |
|
von Zmuda, Johannes | University of Bremen |
Koch, Till | University of Bremen |
Ahmed, Sakib | Cognitive Neuroinformatics, University of Bremen |
Keywords: Software, Middleware and Programming Environments, Domain-specific Software and Software Engineering
Abstract: Autonomous systems are becoming increasingly capable and are used in many areas of everyday life. In domains ranging from aerospace, healthcare, autonomous driving, industry, and agriculture to home automation, these complex systems are embedded in sensitive, safety-critical and complex environments that demand a high degree of software quality. One important building block for software quality is continuous integration. However, previous work has shown a lack of sophisticated continuous integration in robotics projects, especially for projects using the popular ROS (Robot Operating System). This work proposes CIRDAN, a Continuous Integration and Delivery (CI/CD) framework that enables robotics developers to make use of a wide variety of CI/CD features offered by GitLab CI, such as testing, code coverage, code analysis and publishing releases. Since previous work has focused on a cloud-based approach, we developed CIRDAN as an on-premise solution. With CIRDAN we provide multiple Docker images for the CPU architectures amd64, arm64 and riscv64. The framework itself as well as the Docker images and experimental ROS2 binaries for RISC-V are provided as open source. The framework has been successfully applied in multiple autonomous systems research projects.
|
|
08:18-08:36, Paper WeAT8.2 | |
Kerberos-Based Secure Discovery Protocol for Software-Defined Robots Using ROS 2 |
|
Brodie, Samuel | TUM |
Oksanen, Timo | Technical University of Munich |
Keywords: Software, Middleware and Programming Environments
Abstract: The Robot Operating System 2 (ROS 2) utilises the Data Distribution Service (DDS) middleware for publish/subscribe communications between nodes. The interoperable DDS Simple Discovery Protocol and standardised built-in security plugins are used to enable secure, peer-to-peer discovery using public key infrastructure. The peer-to-peer nature of this protocol has many benefits. However, there can be a delay every time an application joins the network because it needs to perform computationally expensive asymmetric-key operations multiple times to authenticate each remote participant before exchanging information. The increasing adoption of software-defined robotics means that software modules cannot necessarily trust one another simply because they are running on the same hardware, which increases the number of nodes that must be authenticated and exacerbates the problem of long discovery times. We propose novel DDS Authentication and AccessControl security plugins based on the Kerberos protocol and a key distribution center. A discovery protocol is presented which enables applications to join the DDS domain using only symmetric cryptography, instead of asymmetric, without affecting the run-time functionality of ROS 2. This is beneficial in use cases where the DDS domain contains many participants and reliance on a trusted key distribution center is acceptable.
|
|
08:36-08:54, Paper WeAT8.3 | |
Strengthening Cyber Defenses for Networked Autonomous Robots |
|
Ajeigbe, Oluwafemi | Texas A&M |
Kim, Jaewon | Texas A&M University |
Ozelton, Ryan | Texas A&M University |
Tang, Jeremy | Texas A&M University |
Munoz, Anthony | Texas A&M |
Roy, Sandip | Texas A&M University |
Keywords: Autonomous Vehicle Navigation, Failure Detection and Recovery, Sensor-based Control
Abstract: A compact testbed for assessing cyber-attacks and defenses for autonomous mobile robots is developed, using the TurtleBot3 (TB3) platform. The TB3 is set up to perform tracking tasks in the testbed. We then implement three classes of attacks targeting camera, LiDAR, and position sensors through spoofing, temporal delay, and replay, respectively. Our experimental setup demonstrates that these attacks can compromise robot tracking performance. To address these vulnerabilities, we develop and validate three lightweight defense mechanisms: dynamic watermarking for camera defense, velocity consistency checking for LiDAR validation, and model-based variance monitoring for position attack detection. Experiments have been conducted to assess attack impacts and defense implications. For instance, experiments have shown that replay attacks can cause position errors exceeding 0.4 m within 15 seconds. Likewise, experiments indicate that our LiDAR defense achieves detection within 200 ms. The testbed provides a practical framework for evaluating cyber-physical vulnerabilities and defense strategies in autonomous robotic systems.
|
|
08:54-09:12, Paper WeAT8.4 | |
Comparison of CNN and LSTM Networks on Human Intention Prediction in Physical Human-Robot Interactions |
|
Ghorbani Zadeh, Khosro | Missouri University of Science and Technology |
Zendehdel, Niloofar | Missouri University of Science and Technology |
Holmes, George | Hire Henry |
Moreno Bonnett, Keyri | Hire Henry |
Costa, Amy | University of Missouri |
Burns, Devin | Missouri University of Science and Technology |
Leu, Ming | Missouri University of Science and Technology |
Song, Yun Seong | Missouri University of Science and Technology |
Keywords: Deep Learning in Robotics and Automation, AI and Machine Learning in Healthcare, Modelling, Simulation and Optimization in Healthcare
Abstract: Advancements in robotics and AI have increased the demand for interactive robots in healthcare and assistive applications. However, ensuring safe and effective physical human-robot interactions (pHRIs) remains challenging due to the sophistication of human motor communication and intent recognition. Traditional physics-based models struggle to capture the dynamic nature of human force interactions, limiting robot adaptability. To address these limitations, neural networks (NNs) have been explored for force-movement intention prediction. While multi-layer perceptron (MLP) networks show potential, they struggle with temporal dependencies and generalization. Long Short-Term Memory (LSTM) networks effectively model sequential dependencies, while Convolutional Neural Networks (CNNs) enhance spatial feature extraction from human force data. Building on these strengths, this study introduces a hybrid LSTM-CNN framework to improve force-movement intention prediction, increasing accuracy from 69% to 86% through effective denoising and advanced architectures. The combined CNN-LSTM network proved particularly effective in handling individualized force-velocity relationships and presents a generalizable model paving the way for more adaptive strategies in robot guidance. These findings highlight the importance of integrating spatial and temporal modeling to enhance robot precision, responsiveness, and human-robot collaboration.
|
|
WeAT9 |
Room T9 |
Robotics Solutions in Emerging Domains |
Special Session |
Chair: Luensch, Dennis | Fraunhofer Institute for Material Flow and Logistics |
Co-Chair: Menebröker, Fabian | Fraunhofer IML |
Organizer: Luensch, Dennis | Fraunhofer Institute for Material Flow and Logistics |
Organizer: Menebröker, Fabian | Fraunhofer IML |
|
08:00-08:18, Paper WeAT9.1 | |
A Comparative Analysis of Multi-Modal Semantic Perception Tasks and Datasets for Mobile Robotics (I) |
|
Ohnemus, Lars | Karlsruhe Institute of Technology |
Pang, Hao | Karlsruhe Institute of Technology |
Zhou, Lei | Karlsruhe Institute of Technology |
Müller, Lukas | Karlsruhe Institute of Technology |
Furmans, Kai | Institute for Material Handling and Logistics (IFL), Karlsruhe I |
Keywords: Sensor Fusion, Computer Vision for Transportation, Deep Learning in Robotics and Automation
Abstract: Data is key for semantic perception tasks, such as object detection and semantic segmentation. This is particularly true for multi-modal perception, where different sensors such as LiDAR and camera are fused to improve overall performance. While many public datasets for mobile robotics and adjacent domains exist, a comprehensive comparison of these datasets is still lacking. To address this, and to provide the research community with an overview of the available datasets, this paper conducts a large-scale meta-analysis to identify and compare the most relevant datasets for multi-modal perception tasks in mobile robotics. We provide an evaluation of 31 different datasets across six different tasks, and summarize key challenges, opportunities, and future directions associated with datasets.
|
|
08:18-08:36, Paper WeAT9.2 | |
Fast Rescheduling for Multi-Agent Plan Execution in Dynamic Urban Environments: A Machine Scheduling Perspective (I) |
|
Mogali, Jayanth Krishna | Carnegie Mellon University |
Ramesh, Sriram | Ottonomy.io |
Dammala, Aakash Shetty | Ottonomy IO Pvt Ltd |
Korupolu, Pradyot | Ottonomy Inc |
Keywords: Planning, Scheduling and Coordination, Task Planning, Industrial and Service Robotics
Abstract: The Multi-Agent Path Finding (MAPF) problem is crucial for sidewalk delivery robots operating in dynamic urban environments. These robots frequently encounter unforeseen obstacles and delays, causing deviations from precomputed routes and schedules, necessitating rapid real-time replanning to maintain efficient operations. However, traditional MAPF approaches can be computationally intensive, particularly when recalculating both routes and schedules. We present a novel replanning approach that addresses this challenge by retaining the original routes and only adjusting the schedules. This constrained MAPF variant is closely related to the Blocking Job Shop (BJS) problem, allowing us to leverage existing BJS local search techniques to develop an anytime replanning procedure. We evaluated our method on various maps, including a real-world deployment site, demonstrating its efficiency in computing high-quality schedules essential for seamless operations.
|
|
08:36-08:54, Paper WeAT9.3 | |
Multi Mobile Robot Collaboration in Industrial Applications: A Structured Survey (I) |
|
Menebröker, Fabian | Fraunhofer IML |
Stadtler, Jannik | Fraunhofer Institute for Material Flow and Logistics (IML) |
Böckenkamp, Adrian | Fraunhofer IML |
Luensch, Dennis | Fraunhofer Institute for Material Flow and Logistics |
Franke, Sven | TU Dortmund University |
Keywords: Collaborative Robots in Manufacturing, Planning, Scheduling and Coordination, Logistics
Abstract: Multi-robot collaboration has been proposed as a solution to various problems in logistics, manufacturing and other industrial applications due to its potential for increased efficiency and flexibility. This work analyzes existing literature on the topic in the form of a structured survey and provides a taxonomy and definitions. We identify four main task domains in which Multi Mobile Robot Collaboration has been employed: Collaborative Transport, Load Transfer, Collaborative Support, and Mobile Processing. For these we analyze how the subproblems of Task Planning, Motion Planning, Motion Control, and Interactive Operation are addressed and on which level the methods are evaluated. Based on this, we provide a map of existing literature and identify white spots for future research.
|
|
08:54-09:12, Paper WeAT9.4 | |
Automated Tuning of Non-Differentiable Rigid Body Simulation Models for Wheeled Mobile Robots |
|
Wiedemann, Marvin | Fraunhofer Institute for Material Flow and Logistics |
Ahmed, Ossama | Fraunhofer Institute for Material Flow and Logistics |
Hatwar, Mrunal | Fraunhofer Institute for Material Flow and Logistics |
Gasoto, Renato | NVIDIA |
Detzner, Peter | Fraunhofer Institute for Material Flow and Logistics |
Kerner, Sören | Fraunhofer Institute for Material Flow and Logistics |
Keywords: Simulation and Animation, Product Design, Development and Prototyping, Intelligent Transportation Systems
Abstract: Simulation plays a crucial role in robotics development, yet creating and tuning accurate models with a small sim-to-real gap remains challenging and limits their broader applicability. This work introduces a pipeline for the automated tuning of models of wheeled mobile robots for simulation tools based on non-differentiable physics engines. Leveraging real-world motion data, the pipeline computes a sim-to-real error and applies black-box optimization to refine model parameters. Three design goals are followed: flexibility in implementing different optimization algorithms, applicability to wheeled mobile robots with diverse kinematics, and the ability to tune multiple model parameters. Experiments involving four wheeled mobile robots with differential, Ackermann, and omnidirectional kinematics validate the approach across diverse trajectories and algorithms. Results indicate successful automated tuning, revealing insights into the relationships between robot complexity, trajectory dynamics, and optimization algorithms.
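The loop described in this abstract (compute a sim-to-real error, then refine model parameters with a black-box optimizer) can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: `simulate` is a toy one-parameter model rather than a physics engine, `friction` is an assumed parameter name, the "real" trajectory is generated synthetically, and plain random search is used only as the simplest black-box optimizer, not as the algorithm evaluated in the paper.

```python
import random
from typing import List, Sequence, Tuple


def rmse(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square error between two equally long trajectories."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def simulate(friction: float, steps: int = 20) -> List[float]:
    """Toy stand-in for a simulator: position of a robot whose speed decays."""
    position, velocity, trajectory = 0.0, 1.0, []
    for _ in range(steps):
        position += velocity * 0.1      # integrate with a 0.1 s time step
        velocity *= 1.0 - friction      # hypothetical friction model
        trajectory.append(position)
    return trajectory


def random_search(real: Sequence[float], low: float, high: float,
                  iters: int = 200, seed: int = 0) -> Tuple[float, float]:
    """Black-box tuning: sample parameters, keep the lowest sim-to-real error."""
    rng = random.Random(seed)
    best_param, best_err = low, rmse(simulate(low), real)
    for _ in range(iters):
        candidate = rng.uniform(low, high)
        err = rmse(simulate(candidate), real)
        if err < best_err:
            best_param, best_err = candidate, err
    return best_param, best_err


# Synthetic "real-world" data produced with a friction value of 0.05.
real_trajectory = simulate(0.05)
param, err = random_search(real_trajectory, 0.0, 0.2)
```

Because the optimizer only calls `simulate` and reads back a scalar error, the simulator does not need to be differentiable, which is the property the abstract emphasizes.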
|
|
WeAT10 |
Room T10 |
Simulation and Optimization in Automation 1 |
Special Session |
Chair: Yan, Bing | Rochester Institute of Technology |
Co-Chair: Feng, Shuo | Tsinghua University |
Organizer: Jia, Qing-Shan | Tsinghua University |
Organizer: Yan, Bing | Rochester Institute of Technology |
Organizer: Feng, Shuo | Tsinghua University |
Organizer: Lennartson, Bengt | Chalmers University of Technology |
Organizer: Fanti, Maria Pia | Politecnico Di Bari |
|
08:00-08:18, Paper WeAT10.1 | |
AI-Based Framework for Robust Model-Based Connector Mating in Robotic Wire Harness Installation (I) |
|
Kienle, Claudius | ArtiMinds Robotics GmbH |
Alt, Benjamin | ArtiMinds Robotics |
Schneider, Finn | Karlsruhe Institute of Technology |
Pertlwieser, Tobias | Karlsruhe Institute of Technology |
Jäkel, Rainer | Karlsruhe Institute of Technology |
Rayyes, Rania | Karlsruhe Institute for Technology (KIT) |
Keywords: Intelligent and Flexible Manufacturing, Cyber-physical Production Systems and Industry 4.0, Assembly
Abstract: Despite the widespread adoption of industrial robots in automotive assembly, wire harness installation remains a largely manual process, as it requires precise and flexible manipulation. To address this challenge, we design a novel AI-based framework that automates cable connector mating by integrating force control with deep visuotactile learning. Our system optimizes search-and-insertion strategies using first-order optimization over a multimodal transformer architecture trained on visual, tactile, and proprioceptive data. Additionally, we design a novel automated data collection and optimization pipeline that minimizes the need for machine learning expertise. The framework optimizes robot programs that run natively on standard industrial controllers, permitting human experts to audit and certify them. Experimental validations on a center console assembly task demonstrate significant improvements in cycle times and robustness compared to conventional robot programming approaches. Videos are available at https://claudius-kienle.github.io/AppMuTT
|
|
08:18-08:36, Paper WeAT10.2 | |
Safety-Guaranteed Policy Composition Via Generalized Policy Improvement for Autonomous Vehicles (I) |
|
Mu, Ni | Tsinghua University |
Luan, Yao | Tsinghua University |
Jia, Qing-Shan | Tsinghua University |
Keywords: Reinforcement, Machine learning, Autonomous Vehicle Navigation
Abstract: Autonomous driving has the potential to revolutionize transportation by improving safety, efficiency, and accessibility. However, existing methods, such as reinforcement learning (RL) approaches and rule-based strategies, struggle to achieve high sample efficiency, superior performance, and safety assurance, a challenge we refer to as the “impossible triangle”. In this paper, we propose Safe-GPI, a novel decision-making method designed to address this issue. Safe-GPI employs Generalized Policy Improvement (GPI) to effectively combine multiple policies, thereby improving safety without compromising performance. Our theoretical analysis demonstrates that Safe-GPI maintains formal guarantees for collision avoidance under specific reward structures, while also providing performance assurances. Experimental results demonstrate that the simple combination of a rule-based model and a planning-based policy can significantly reduce collision rates, while achieving performance comparable to deep reinforcement learning methods. These findings suggest that Safe-GPI offers a feasible solution to the “impossible triangle”, enhancing the safety and reliability of autonomous driving systems.
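As background on the Generalized Policy Improvement step mentioned in this abstract, the sketch below shows the standard GPI action rule: for a given state, take the best action-value across all base policies for each action, then act greedily on that maximum. It is a generic illustration, not the Safe-GPI method itself; the function name `gpi_action` and the example values are assumptions, and the safety handling described in the paper is not reproduced.

```python
from typing import Sequence


def gpi_action(q_values: Sequence[Sequence[float]]) -> int:
    """Return the GPI action for one state.

    q_values[i][a] is the estimated value of action a under base policy i.
    The rule is argmax over a of (max over i of q_values[i][a]).
    """
    num_actions = len(q_values[0])
    # Best value any base policy assigns to each action.
    best_per_action = [max(q[a] for q in q_values) for a in range(num_actions)]
    # Greedy choice over the combined values (first index wins on ties).
    return max(range(num_actions), key=lambda a: best_per_action[a])


# Two hypothetical base policies, three actions.
q_example = [
    [0.2, 0.5, 0.1],  # e.g. values under a rule-based policy
    [0.4, 0.3, 0.9],  # e.g. values under a planning-based policy
]
chosen = gpi_action(q_example)
```

Here the combined values are 0.4, 0.5 and 0.9, so the third action (index 2) is selected even though the first policy alone would have preferred index 1.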
|
|
08:36-08:54, Paper WeAT10.3 | |
Addressing Coupling in Restless Multi-Armed Bandits by Finetuning Whittle Index (I) |
|
Luan, Yao | Tsinghua University |
Mu, Ni | Tsinghua University |
Jia, Qing-Shan | Tsinghua University |
Keywords: Reinforcement, Machine learning
Abstract: The restless multi-armed bandit (RMAB) is a popular model for budget-constrained scheduling problems, which the Whittle index policy can solve effectively. Because the requirement of arm separability limits the application of RMAB in real-world scenarios involving coupled components, recent studies have extended the RMAB formulation to allow coupling between arms. However, existing methods in this field require search algorithms or domain knowledge, which sacrifices the effectiveness and universality of the Whittle index policy. This paper presents an extension of the Whittle index policy to coupled RMAB problems. Noticing that the pre-trained Whittle index is an informative arm feature and induces a suboptimal policy, we leverage it as a base policy to minimize the joint training process and thus mitigate the complexity of joint optimization. The proposed algorithm does not rely on specific problem properties or domain knowledge and has the potential to be applied in various domains. Experimental results demonstrate that the proposed algorithm is effective and efficient in learning a satisfactory policy.
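For context, the decision rule of the classical (uncoupled) Whittle index policy is simple once the indices are known: at each step, activate the arms whose current-state indices are largest, up to the budget. The sketch below shows only that selection step. Computing the indices and the finetuning for coupled arms proposed in the paper are not covered, and the function name and example values are assumptions.

```python
from typing import List, Sequence


def whittle_index_policy(indices: Sequence[float], budget: int) -> List[int]:
    """Pick which arms to activate in the current time step.

    indices[i] is the Whittle index of arm i evaluated at its current state.
    Returns the positions of the `budget` arms with the highest index.
    """
    # Rank arms from highest to lowest index (ties keep their original order).
    ranked = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    # Activate the top `budget` arms; sort the result for readability.
    return sorted(ranked[:budget])


# Four arms with hypothetical index values and a budget of two activations.
active = whittle_index_policy([0.1, 0.7, 0.3, 0.9], budget=2)
```

The rule treats every arm independently, which is why coupling between arms, as discussed in the abstract, requires something beyond this selection step.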
|
|
08:54-09:12, Paper WeAT10.4 | |
Few-Shot Knowledge Extraction for Manufacturing Domain Based on Large Language Models (I) |
|
Li, Shuaipeng | Xi'an Jiaotong University |
Wang, Pinghui | Xi'an Jiaotong University |
Gao, Huajie | Xi'an Jiaotong University |
Yan, Chao-Bo | Xi'an Jiaotong University |
Keywords: Big-Data and Data Mining, AI-Based Methods, Data fusion
Abstract: A knowledge graph is an effective tool for managing and representing multi-source heterogeneous data in the manufacturing domain. As the key to building a knowledge graph, knowledge extraction in the manufacturing domain usually faces the problem of insufficient labeled data; obtaining such data usually requires huge annotation costs, which further limits the application of knowledge graphs in the manufacturing domain. This paper proposes a novel entity-relation joint extraction method, which uses the powerful text generation capability of large language models to achieve data augmentation in the manufacturing domain and, ultimately, efficient extraction of manufacturing knowledge in the few-shot scenario. In addition, to further alleviate the hallucination problem caused by the large model generation process, this paper designs a discriminator to filter duplicate and irrelevant data. Experimental results show that our proposed model is effective and robust in the manufacturing domain compared with the state-of-the-art model, and ablation experiments prove the effectiveness of each proposed module.
|
|
WeBT1 |
Room T1 |
LiDAR-Based Applications |
Regular Session |
Chair: Girard, Alexandre | Université De Sherbrooke |
|
10:45-11:03, Paper WeBT1.1 | |
Model-Based Real-Time Pose and Sag Estimation of Overhead Power Lines Using LiDAR for Drone Inspection |
|
Girard, Alexandre | Université De Sherbrooke |
Parkison, Steven | Hydro-Québec (Research Institute) |
Hamelin, Philippe | Hydro-Quebec Research Institute |
Keywords: Computer Vision in Automation, Sensor Fusion, Smart Grids
Abstract: Drones can inspect overhead power lines while they remain energized, significantly simplifying the inspection process. However, localizing a drone relative to all conductors using an onboard LiDAR sensor presents several challenges: (1) conductors provide minimal surface for LiDAR beams, limiting the number of conductor points in a scan, (2) not all conductors are consistently detected, and (3) distinguishing LiDAR points corresponding to conductors from other objects, such as trees and pylons, is difficult. This paper proposes an estimation approach that minimizes the error between LiDAR measurements and a single geometric model representing the entire conductor array, rather than tracking individual conductors separately. Experimental results, using data from a power line drone inspection, demonstrate that this method achieves accurate tracking, with a solver converging in under 50 ms per frame, even in the presence of partial observations, noise, and outliers. A sensitivity analysis shows that the estimation approach can tolerate up to twice as many outlier points as valid conductor measurements.
|
|
11:03-11:21, Paper WeBT1.2 | |
Improving LiDAR Odometry with Hausdorff Distance-Based Variance Estimation |
|
Yilmaz, Onurcan | Hacettepe University |
Uyanik, Ismail | Hacettepe University |
Keywords: Probability and Statistical Methods, Sensor Fusion
Abstract: This paper introduces a novel variance estimation method for LiDAR Odometry and Mapping (LOAM) using the Hausdorff Distance (HD) metric to enhance sensor fusion accuracy. Traditional LOAM and Iterative Closest Point (ICP) algorithms either rely on optimization without providing variance estimation, which limits their reliability, or use the optimization error as the variance, which can be optimistic when the optimization converges to a wrong solution in sensor fusion tasks. Instead, by leveraging the HD as another performance metric to assess transformation accuracy and integrating the estimated variance into a Minimum Variance Unbiased Estimator (MVUE), the proposed method dynamically adjusts the confidence assigned to LOAM updates without the need for any parameter tuning. This approach improves pose estimation when fusing LiDAR data with IMU and GNSS sensors. Experimental results on the MULRAN dataset demonstrate that HD-based variance estimation significantly reduces localization errors compared to direct odometry outputs. These findings establish Hausdorff Distance as a reliable variance metric, contributing to more robust and adaptive localization in autonomous systems.
|
|
11:21-11:39, Paper WeBT1.3 | |
Enhancing LiDAR Odometry with Adaptive Integration of Wheel Odometry and IMU in Environments with Few Geometric Features |
|
Pereira da Cruz Júnior, Gilmar | Universidade Federal De Minas Gerais |
Oliveira, Gabriel | UFOP |
Cid, André | Instituto Tecnologico Vale |
Pessin, Gustavo | Instituto Tecnológico Vale |
Freitas, Gustavo | Federal University of Minas Gerais |
Keywords: Sensor Fusion, Autonomous Vehicle Navigation, Industrial and Service Robotics
Abstract: Autonomous robots have been employed to improve safety and efficiency in sectors such as mining and industry, particularly in inspection tasks within confined environments like tunnels and underground galleries. However, these scenarios pose significant challenges for robot localization and mapping, mainly due to the lack of geometrical references. Although LiDAR-based SLAM techniques are widely used, they tend to fail under such conditions. This paper proposes a sensor fusion approach that combines wheel odometry, LiDAR, and IMU data using an Extended Kalman Filter. The methodology is validated in a simulated environment with long corridors and asymmetric rooms, where traditional SLAM algorithms show limitations. The results demonstrate improved odometry accuracy compared to both a baseline strategy without the EKF and our previous EKF-LOAM approach, as well as enhanced navigation reliability in geometrically sparse environments.
|
|
11:39-11:57, Paper WeBT1.4 | |
Sediment Release Control in Excavator Bucket Using LiDAR |
|
Sugihara, Ryuma | Institute of Science Tokyo |
Yamakita, Masaki | Institute of Science Tokyo |
Keywords: Automation in Construction, Motion Control, Control Architectures and Programming
Abstract: Recently, the number of people working in the construction industry has been declining in Japan. One solution to this problem is automating heavy construction machines. In this paper, we focus on controlling sediment release within the bucket of an excavator. We introduce a geometrically approximated bucket model to estimate the sediment volume within the bucket using LiDAR point cloud data. Based on this model and sediment volume estimation, we propose a sediment release control method that utilizes a two-degree-of-freedom controller. Finally, we evaluate the precision of this control method using a small-scale experimental excavator.
|
|
WeBT2 |
Room T2 |
RAM-TRO Paper Session |
Special Session |
Chair: Freeman, Caitlin | University of Alabama |
|
10:45-11:03, Paper WeBT2.1 | |
Reinforcement Learning for High-Speed Quadrupedal Locomotion with Motor Operating Region Constraints |
|
Shin, Young-Ha | KAIST |
Song, Tae-Gyu | Korea Advanced Institute of Science and Technology, KAIST |
Ji, Gwanghyeon | Korea Advanced Institute of Science and Technology |
Park, Hae-Won | Korea Advanced Institute of Science and Technology |
Keywords: Legged Robots, Reinforcement Learning, Hardware-Software Integration in Robotics
Abstract: This paper presents a method for achieving high-speed running of a quadruped robot by considering the actuator torque-speed operating region in reinforcement learning. The physical properties and constraints of the actuator are included in the training process to reduce state transitions that are infeasible in the real world due to motor torque-speed limitations. The gait reward is designed to distribute motor torque evenly across all legs, contributing to more balanced power usage and mitigating performance bottlenecks due to single-motor saturation. Additionally, we designed a lightweight foot to enhance the robot's agility. We observed that applying the motor operating region as a constraint helps the policy network avoid infeasible areas during sampling. With the trained policy, KAIST Hound, a 45 kg quadruped robot, can run up to 6.5 m/s, which is the fastest speed among electric motor-based quadruped robots.
|
|
11:03-11:21, Paper WeBT2.2 | |
Autonomous UV-C Disinfection and Wiping Robot: Assessment in a Hospital Environment |
|
Byun, Jaewon | POSTECH |
Byun, Joonsub | POSTECH |
Kang, Junsu | Postech |
Yi, Inje | Samsung Electronics |
Lee, Jung-Woo | Korea Institute of Robot and Convergence |
Noh, Kyoungseok | Korea Institute of Robotics & Technology Convergence |
Kim, Jong Chan | Korea Institute of Robotics & Technology Convergenc |
Choi, Young-Ho | Korean Institute of Robot and Convergence |
Chung, Goobong | Korea Institute of Robot and Convergence |
Oh, Sang-Rok | KIST |
Kim, Keehoon | POSTECH, Pohang University of Science and Technology |
Keywords: Service Robotics, Autonomous Agents, Mobile Manipulation
Abstract: This article explores the advent of an innovative autonomous robot designed to enhance hospital disinfection through targeted wiping and UV-C irradiation methods. The urgency for such advancements has been underscored by the COVID-19 pandemic, which revealed challenges such as low compliance rates, physical fatigue, labor shortages, and heightened risk of pathogen exposure for disinfection workers. Traditional disinfection approaches, including UV-C irradiation mobile robots and hydrogen peroxide vapor methods, while effective, fall short in addressing obstacles like shaded areas and surface contaminants. To bridge this gap, we introduce a novel robot that combines physical wiping to remove contaminants with targeted UV-C irradiation for areas less amenable to wiping. Our development efforts have centered on optimizing disinfection efficacy and ensuring the robot's reliability in real-world hospital settings. The performance and usability of this autonomous disinfection robot were thoroughly assessed at Pohang St. Mary’s Hospital, demonstrating its potential to transform hospital disinfection practices by complementing traditional disinfection tasks with advanced robotic assistance.
|
|
11:21-11:39, Paper WeBT2.3 | |
Heterogeneous Collaborative Pursuit Via Coverage Control Driven by Fokker-Planck Equations |
|
Lin, Ruoyu | University of California, Irvine |
Kim, Soobum | Georgia Institute of Technology |
Egerstedt, Magnus | University of California, Irvine |
Keywords: Multi-Robot Systems, Cooperating Robots, Distributed Robot Systems, Networked Robots
Abstract: Inspired by common features found in collaborative behaviors in nature, we investigate a general collaborative pursuit framework enabling heterogeneous multi-robot systems to adapt to dynamic environments and diverse tasks. A class of augmented Fokker-Planck equations is formulated to characterize dynamic environmental conditions, and the resulting time-varying density functions drive a novel coverage-based controller, with provable stability properties, for the participating robots to perform tasks in real time. The developed framework is decentralized and incorporates heterogeneity among different robots in task suitability, relative performance in a specific task, and safe operating regions. To demonstrate its adaptivity and effectiveness, the framework is implemented across four experimental applications ranging from multi-robot coordination to collaboration, namely forest firefighting, pursuit-evasion, monitoring of various environmental phenomena, and phoretic interactions.
|
|
11:39-11:57, Paper WeBT2.4 | |
Environment-Centric Learning Approach for Gait Synthesis in Terrestrial Soft Robots |
|
Freeman, Caitlin | University of Alabama |
Mahendran, Arun Niddish | The University of Alabama, Tuscaloosa |
Vikas, Vishesh | University of Alabama |
Keywords: Modeling, Control, and Learning for Soft Robots, Soft Robot Applications, Motion Control, Gait Synthesis
Abstract: Locomotion gaits are fundamental for control of soft terrestrial robots. However, synthesis of these gaits is challenging due to modeling of robot-environment interaction and lack of a mathematical framework. This work presents an environment-centric, data-driven, and fault-tolerant probabilistic Model-Free Control (pMFC) framework that allows soft multi-limb robots to learn from their environment and synthesize diverse sets of locomotion gaits for open-loop control. Here, discretization of factors dominating robot-environment interactions enables an environment-specific graphical representation where the edges encode experimental robot motion primitive data. Locomotion gaits are defined as transformation-invariant simple cycles. Gait synthesis is formulated as a Binary Integer Linear Programming (BILP) problem. Experimentally, gaits are synthesized for varying robot morphology (three-limb and four-limb robots), substrate (rubber mat, whiteboard, and carpet), and actuator functionality. On average, gait synthesis improves the translation and rotation speeds by 82% and 97%, respectively. The results highlight that data-driven methods are vital to soft robot locomotion control due to complex robot-environment interactions and simulation-to-reality gaps, particularly when biological analogues are unavailable.
|
|
WeBT3 |
Room T3 |
Mobile Robots 2 |
Regular Session |
Chair: Quintero, David | San Francisco State University |
|
10:45-11:03, Paper WeBT3.1 | |
Onboard Sensing and Pushing of Unknown Payload for CoM Estimation with a Holonomic Mobile Robot |
|
Hyland, Steven | Worcester Polytechnic Institute |
Xiao, Jing | Worcester Polytechnic Institute (WPI) |
Onal, Cagdas | WPI |
Keywords: Autonomous Agents, Intelligent Transportation Systems, Reactive and Sensor-Based Planning
Abstract: This paper presents a novel approach for estimating the center of mass (CoM) direction of unknown payloads using only onboard sensing - namely, a force sensor and an RGB camera - on a mobile robot. Unlike methods requiring external infrastructure, extensive datasets, or machine learning, our technique employs an active perception framework to guide adaptive pushing of the payload and employs a robust search algorithm to find the CoM direction. By eliminating the need for prior inertial knowledge or global pose information, the proposed method converges on a translational line of action (LoA), indicated by zero rotational motion about the CoM. The method is validated on payloads of varying shape, size, and mass distribution, demonstrating consistent CoM estimation accuracy. Overall, this approach offers a reliable and adaptive solution for mobile robotic manipulation in environments where external global sensing is unavailable or impractical.
|
|
11:03-11:21, Paper WeBT3.2 | |
Scaling Cooperative Mobile Multi-Robot Systems for Object Handling |
|
Recker, Tobias | Leibniz University Hanover |
Lachmayer, Lukas | Leibniz University Hannover, Institute of Assembly Technolog |
Raatz, Annika | Leibniz Universität Hannover |
Keywords: Industrial and Service Robotics, Collaborative Robots in Manufacturing, Motion Control
Abstract: Cooperative Mobile Multi-Robot Systems (CMMRS) are supposed to enable more flexible handling systems but face challenges in scalability due to kinematic overdetermination. This paper presents a scalable control architecture using admittance control to mitigate said overdetermination. A Temporal Convolutional Network (TCN) for real-time force estimation serves to mitigate instabilities in the admittance controller that occur in rigid surface contact. Experimental validation with up to eight industrial robots demonstrates high tracking accuracy, with position errors below 2 mm and orientation errors around 10 mrad.
|
|
11:21-11:39, Paper WeBT3.3 | |
Lie-Algebra Learning for Mobile Robots Tracking Control with Model Uncertainty |
|
Tang, Jiawei | Hong Kong University of Science and Technology |
Yang, Nachuan | Hong Kong University of Science and Technology |
Wu, Shuang | Huawei |
Li, Shilei | Beijing Institute of Technology |
Shi, Dawei | Beijing Institute of Technology |
Shi, Ling | The Hong Kong University of Science and Technology |
Keywords: Learning and Adaptive Systems, Model Learning for Control, Motion and Path Planning
Abstract: This paper presents a novel Lie-algebra learning approach for differential wheeled robots (DWRs) trajectory tracking with uncertainty in the kinematic model. The approach is motivated by the fundamental property of group affine systems, which convert the state space from group space to vector space and derive a state-independent error kinematic model. After that, based on the controllability analysis of the Lie-algebra optimal control problem, we design a suitable tracking scenario for the data collection and learning process. The theoretical analysis of the optimal Lie-algebra tracking control facilitates the development of the learning control algorithm to handle different trajectory tracking scenarios. Simulation experiments validate the efficiency of the proposed method and demonstrate the advantages of our control method over existing approaches.
|
|
11:39-11:57, Paper WeBT3.4 | |
Design of a Modular Rotary Actuator Characterization Test Station for Wearable Robotics |
|
Hartnett, Ryan J. | San Francisco State University |
Berdal, Jarren | San Francisco State University |
Quintero, David | San Francisco State University |
Keywords: Prosthetics and Exoskeletons, Mechatronics in Meso, Micro and Nano Scale, Actuation and Joint Mechanisms
Abstract: This paper presents the development of a benchtop actuator test platform designed to advance actuator design and performance for wearable robotic systems (e.g., powered prostheses and exoskeletons). When designing actuators to assist human movement, it is critical to validate their performance to ensure safety and effectiveness. Custom-designed actuators require rigorous testing prior to clinical deployment to reduce long-term costs and safety risks. To address this need, we developed a Modular Rotary Actuator Characterization Test Station capable of performing static and dynamic tests while measuring real-time angular velocity and torque to evaluate high-performance wearable robot actuators. In this initial evaluation, we characterized a metal planetary gearbox transmission with an attached brushless DC motor and evaluated the test station's measurement capabilities. Testing included actuator output of a sinusoidal angular velocity at 0.25 Hz, which gave a root mean square error of 0.95 ± 0.03 rad/s. Further, actuation torque output was measured against command current to obtain a best-fit linear regression of the actuator control input-output relationship through the test station (r² = 0.997, n = 48, p << 0.001). These results demonstrate that the system produces reliable data consistent with other benchmark actuator test stations. In all, this platform establishes a practical and scalable framework for actuator validation, and enables characterization of actuator dynamics to optimize active wearable robotic actuators for improving human rehabilitation performance.
|
|
WeBT4 |
Room T4 |
Simulation and Optimization in Automation 2 |
Special Session |
Chair: Mou, Shancong | Georgia Tech |
Co-Chair: Feng, Shuo | Tsinghua University |
Organizer: Jia, Qing-Shan | Tsinghua University |
Organizer: Yan, Bing | Rochester Institute of Technology |
Organizer: Feng, Shuo | Tsinghua University |
Organizer: Lennartson, Bengt | Chalmers University of Technology |
Organizer: Fanti, Maria Pia | Politecnico Di Bari |
|
10:45-11:03, Paper WeBT4.1 | |
A Transformer-Embedded Reinforcement Learning for Computing Power Scheduling in Data Centers (I) |
|
Zhou, Hanchen | Tsinghua University |
Cui, Gaochen | Tsinghua University |
Jia, Qing-Shan | Tsinghua University |
Keywords: Cloud Computing For Automation, Modelling, Simulation and Validation of Cyber-physical Energy Systems, Reinforcement
Abstract: The growing demand for computing power driven by advancements in Cloud Computing has posed significant challenges to data centers. As a solution, distributed data centers have emerged, so the optimization of edge computing and resource allocation is of great significance for research. However, the increasing complexity of scheduling rules, the need for flexible load management, and the growing demands for real-time control and robustness present considerable challenges to traditional optimization methods. To address this problem, this study formulates the computing power scheduling optimization problem as a Markov Decision Process (MDP) and develops the solution using a Transformer-embedded Proximal Policy Optimization (PPO) framework. The Transformer component captures temporal relationships among tasks and processes variable-length states, which are integrated into PPO to optimize the policy. Numerical experiments on the Alibaba Cluster Trace V2017 dataset show that the proposed method is compatible with various lower-level control policies, effectively reducing energy consumption while keeping latency within predefined thresholds, thus achieving a controllable trade-off between energy costs and job latency.
|
|
11:03-11:21, Paper WeBT4.2 | |
Robust Linear Quadratic Reinforcement Learning by Filtering |
|
Svedlund, Ludvig | Chalmers University of Technology |
Lennartson, Bengt | Chalmers University of Technology |
Keywords: Robust/Adaptive Control, Reinforcement, Optimization and Optimal Control
Abstract: This paper investigates the robustness of linear quadratic reinforcement learning when unmodeled dynamics are included, and how the performance can be improved by applying filtering. We examine both model-free and model-based approaches, and it is shown that the model-free approach greatly suffers from the inclusion of unmodeled dynamics when no filtering is used. With the inclusion of filtering, similar performance is, however, achieved by both approaches. It is also concluded that model-based reinforcement learning has some notable other advantages over the model-free approach, thus making model-based the preferable approach when unmodeled dynamics are present.
|
|
11:21-11:39, Paper WeBT4.3 | |
A Deep-Ensemble Bayesian Optimization with Computation Budget Allocation for Design Space Exploration Problems (I) |
|
Zhu, Yuhang | Tsinghua University |
Lv, Xiaoliang | Xi'an Jiaotong University |
Jia, Qing-Shan | Tsinghua University |
Guan, Xiaohong | Xi'an Jiaotong University |
Keywords: Optimization and Optimal Control, AI-Based Methods
Abstract: The micro-architectures of processors are becoming increasingly complex, which introduces a large number of micro-architecture parameters. The design space exploration (DSE) problem, obtaining a set of micro-architecture parameters that make the processor perform well, is both crucial and challenging. This simulation-based optimization problem involves a vast search space with more than 50 dimensions. Evaluating the performance of a given parameter set requires expensive simulations using the Cycle Accurate Simulator (CAS). The CAS simulates each typical program (referred to as a slice in DSE) included in the benchmark to obtain their respective instructions per cycle (IPC) scores, which are then weighted and aggregated to calculate the overall performance score for the given parameter design. The optimization objective in the DSE problem is the final score. Traditional DSE algorithms use black-box optimization methods, such as Bayesian optimization (BO), to optimize the parameters. However, these approaches neither leverage the information contained in the individual slice scores nor allocate the simulation budget efficiently. In this paper, we propose a deep-ensemble Bayesian optimization with computation budget allocation (DEBO-CBA) algorithm for DSE problems. The numerical and empirical tests demonstrate that the proposed method outperforms state-of-the-art approaches, including the black-box optimization algorithm HEBO and the genetic algorithm (GA). In practical micro-architecture DSE problems for the BOOM architecture, our algorithm requires 27% fewer iterations than HEBO to achieve a set of good enough parameters.
|
|
11:39-11:57, Paper WeBT4.4 | |
A Trading-Integrated Reverse Logistics Network for Demolition Waste Using Digital Twins: A Comprehensive Cost-Benefit Analysis (I) |
|
Su, Shuaiming | The University of Hong Kong |
Yin, Li | The University of Hong Kong (HKU) |
Zhong, Ray Y. | The University of Hong Kong |
Keywords: Logistics, Sustainability and Green Automation, Automation in Construction
Abstract: The growing volume of building demolition waste has gradually become a worldwide concern. Existing reverse logistics network research mainly focuses on minimizing logistics costs. The impact of waste trading on reverse logistics networks and demolition projects is overlooked. In response, this paper proposes a trading-integrated reverse logistics network for demolition waste, leveraging the classic five-dimensional digital twin model. Subsequently, a mixed-integer linear programming model is formulated. From the demolisher's perspective, it aims to identify the optimal building demolition waste disposal route within the trading-integrated reverse logistics network. Moreover, through cost-benefit analysis, it explores how waste trading affects the net profit of the demolition project. A case study is carried out using data from a real estate site in Beijing, which comprises eight buildings. Based on this, several valuable implications are summarized. This research is among the first to discuss the trading-integrated reverse logistics network with a cost-benefit analysis.
|
|
WeBT5 |
Room T5 |
Optimization and Learning 1 |
Regular Session |
Chair: Carpin, Stefano | University of California, Merced |
|
10:45-11:03, Paper WeBT5.1 | |
Solving Stochastic Orienteering Problems with Chance Constraints Using a GNN Powered Monte Carlo Tree Search |
|
Zuzuarregui, Marcos | University of California, Merced |
Carpin, Stefano | University of California, Merced |
Keywords: Optimization and Optimal Control, Planning, Scheduling and Coordination, Agricultural Automation
Abstract: Leveraging the power of a graph neural network (GNN) with message passing, we present a Monte Carlo Tree Search (MCTS) method to solve stochastic orienteering problems with chance constraints. While adhering to an assigned travel budget, the algorithm seeks to maximize collected reward while incurring stochastic travel costs. In this context, the acceptable probability of exceeding the assigned budget is expressed as a chance constraint. Our MCTS solution is an online and anytime algorithm alternating planning and execution that determines the next vertex to visit by continuously monitoring the remaining travel budget. The novelty of our work is that the rollout phase in the MCTS framework is implemented using a message passing GNN, predicting both the utility and failure probability of each available action. This greatly expedites the search process. Our experimental evaluation shows that with the proposed method and architecture we manage to efficiently solve complex problem instances while incurring moderate losses in terms of collected reward. Moreover, we demonstrate how the approach is capable of generalizing beyond the characteristics of the training dataset. The paper’s website, open-source code, and supplementary documentation can be found at ucmercedrobotics.github.io/gnn-sop.
|
|
11:03-11:21, Paper WeBT5.2 | |
N(CO)^2: Neural Combinatorial Optimization with Chance Constraints to Solve Stochastic Orienteering |
|
Saeed, Anas | University of California, Merced |
Zuzuarregui, Marcos | University of California, Merced |
Carpin, Stefano | University of California, Merced |
Keywords: Planning, Scheduling and Coordination, Learning and Adaptive Systems
Abstract: Neural combinatorial optimization (NCO) offers a promising alternative to traditional heuristic-based methods for solving complex graph optimization problems by proposing to learn heuristics through data. This class of problems frequently arises in automation, as it can be used to model a variety of applications. While NCO has been extensively studied for deterministic combinatorial optimization problems, there are only a few works that aim to solve stochastic combinatorial optimization problems. In this work, we present N(CO)^2: Neural Combinatorial Optimization with Chance cOnstraints to solve the Stochastic Orienteering Problem (SOP) without the use of hand-crafted heuristics. By integrating a reinforcement learning (RL) framework, the model optimizes path selection under uncertainty, effectively balancing exploration and exploitation. Empirical results demonstrate that our method generalizes well across diverse SOP instances, achieving competitive performance compared to the state-of-the-art mixed-integer linear program (MILP) for the task. The proposed approach reduces human effort in heuristic design while enabling adaptive and efficient decision-making in uncertain environments.
|
|
11:21-11:39, Paper WeBT5.3 | |
An Improved Algorithm to Select Homotopy Candidates for the Constrained Optimization of Minimum-Power Networks |
|
Bernardini, Francesco | University of Houston |
Biediger, Daniel | University of Houston |
Pineda, Ileana | University of Houston |
Becker, Aaron | University of Houston |
Keywords: Swarms, Robot Networks, Optimization and Optimal Control
Abstract: We present a strategy to extract an optimal subset (i.e., a subset that contains the homotopy of the optimal solution) from the complete set of homotopies of a minimum-power network in the presence of obstacles. The strategy only requires knowing the optimal solution in the absence of obstacles (the “free” solution) and the details of the obstacle distribution. The homotopy candidates are obtained by overlaying the free solution on the obstacle distribution and analyzing the generated intersections.
|
|
11:39-11:57, Paper WeBT5.4 | |
Game-Theoretic Defense Policy for Network Security against Intelligent Adversary |
|
Kazeminajafabadi, Armita | Northeastern University |
Lan, Tian | George Washington University |
Imani, Mahdi | Northeastern University |
Keywords: Probability and Statistical Methods, Reinforcement, Task Planning
Abstract: The rapid evolution of IT infrastructure and networked systems has increased their susceptibility to sophisticated and intelligent cyber threats. Despite advancements in attack detection, adversaries continuously refine their strategies, exploiting vulnerabilities with growing complexity. In this paper, we model the dynamic interaction between a defender and an intelligent adversary as a two-player zero-sum game. The defender's partial observability of the adversary and network state is represented using a partially observable Markov decision process (POMDP). We develop a recursive method to compute the posterior distribution of network compromises based on incomplete observations of network states and no access to adversarial actions. An optimal minimum mean square error (MMSE) estimator leverages this posterior for the recursive estimation of network compromises. To ensure the defender follows the Nash equilibrium, where neither player has the incentive to deviate, our automated defense policy employs the Nash strategy based on the optimal MMSE estimate of the network state. Two evaluation metrics are introduced to assess the policy's effectiveness: expected mean square error and expected policy misalignment. Simulation results show improved defense effectiveness over static or non-strategic automated policies, demonstrating the advantages of strategic decision-making in network security.
|
|
WeBT6 |
Room T6 |
Factory Automation 2 |
Regular Session |
Chair: Muthusamy, Rajkumar | Dubai Future Foundation |
|
10:45-11:03, Paper WeBT6.1 | |
MOTORCYCLE 1.0: Automating Bimanual Cable Routing Around Fixtures on the NIST Task Board |
|
Azulay, Osher | University of California, Berkeley |
Kondap, Kavish | University of California, Berkeley |
Drake, Jaimyn | University of California, Berkeley |
Xie, Shuangyu | UC Berkeley |
Li, Hui | Autodesk Research |
Chitta, Sachin | Autodesk Inc |
Goldberg, Ken | UC Berkeley |
Keywords: Task Planning, Compliant Assembly, Factory Automation
Abstract: Automated cable routing requires deformable object manipulation in constrained and cluttered environments. However, achieving reliable routing is challenging due to fixture constraints, cable friction, and the need for slack management. In this work, we introduce MOTORCYCLE 1.0 (Multi-turn Optimized Trajectories for Ordered Routing of Cable Yoking and Cable Loop Execution), a bimanual cable routing method that integrates a learned cable tracer with sliding-based motion planning to achieve desired cable trajectories while ensuring slack control. Unlike previous methods that use single-arm manipulation, MOTORCYCLE 1.0 uses coordinated bimanual sliding motions to dynamically adjust slack to avoid tangling and misrouting. Physical experiments with a modified NIST Task Board 4 demonstrate an 84% average success rate across multiple tiers, significantly outperforming a single-arm version.
|
|
11:03-11:21, Paper WeBT6.2 | |
Visual Prompting for Robotic Manipulation with Annotation-Guided Pick-And-Place Using ACT |
|
Muttaqien, Muhammad Angga | University of Tsukuba |
Motoda, Tomohiro | National Institute of Advanced Industrial Science and Technology |
Hanai, Ryo | National Institute of Industrial Science and Technology(AIST) |
Domae, Yukiyasu | The National Institute of Advanced Industrial Science and Techno |
Keywords: AI-Based Methods, Deep Learning in Robotics and Automation, Learning and Adaptive Systems
Abstract: Robotic pick-and-place tasks in convenience stores pose challenges due to dense object arrangements, occlusions, and variations in object properties such as color, shape, size, and texture. These factors complicate trajectory planning and grasping. This paper introduces a perception-action pipeline leveraging annotation-guided visual prompting, where bounding box annotations identify both pickable objects and placement locations, providing structured spatial guidance. Instead of traditional step-by-step planning, we employ Action Chunking with Transformers (ACT) as an imitation learning algorithm, enabling the robotic arm to predict chunked action sequences from human demonstrations. This facilitates smooth, adaptive, and data-driven pick-and-place operations. We evaluate our system based on success rate and visual analysis of grasping behavior, demonstrating improved grasp accuracy and adaptability in retail environments.
|
|
11:21-11:39, Paper WeBT6.3 | |
Multi-Modal Sensorized Soft Gripper for Reliable Grasping |
|
Haddad, Karim | Dubai Future Labs |
Taha, Tarek | Dubai Future Labs |
Muthusamy, Rajkumar | Dubai Future Foundation |
Keywords: Collaborative Robots in Manufacturing, Industrial and Service Robotics, Product Design, Development and Prototyping
Abstract: Grasp uncertainty in robotic manipulation is a known issue where grippers, especially soft grippers, struggle to generate secure holds on objects. The problem arises from computational factors such as incomplete data on object properties, and is magnified by environmental factors such as occlusion or motion. It presents a critical concern, as unstable grasps may lead to dropped items, damaged goods, or safety hazards in industrial applications. This paper presents a human-inspired, sensorized soft gripper to address this issue. The design combines soft fingers with a multimodal vision system that monitors finger-object interactions in real time. Additionally, unique features are implemented through proprioception markers to build a deeper understanding of the gripper's interaction with objects. Experiments with the YCB object set demonstrate the gripper's robustness in dynamic manipulation, while external disturbance tests highlight the stability of various grasp configurations. A predictive framework is proposed that uses the validated features to predict grasp success. Gripper validation results show significant promise for safe and adaptive robotic grasping and manipulation across various industries.
|
|
11:39-11:57, Paper WeBT6.4 | |
Hierarchical Blockchain for Mapping Manufacturing Process Flow |
|
Kuo, Timothy | The Pennsylvania State University |
Yang, Hui | The Pennsylvania State University |
Keywords: Cyber-physical Production Systems and Industry 4.0, Factory Automation, Process Control
Abstract: As manufacturing processes become increasingly complex, maintaining quality and improving efficiency requires mapping of process flows. Mapping process flows, in turn, depends on comprehensive end-to-end data traceability. Such traceability relies on lifecycle data that capture every stage, from raw-material handling to final-product assembly, and provide indispensable insights for process refinement. However, conventional centralized database-based systems for managing these data introduce single points of failure and remain vulnerable to tampering and cyberattacks. As a result, data traceability and authenticity are compromised. Therefore, this research develops a novel blockchain architecture coupled with a digital twin (DT) model to secure end-to-end documentation of manufacturing process flows. First, a hierarchical blockchain framework is developed to record production events and ensure comprehensive, tamper-proof records of process activities. Second, the DT model, operating in collaboration with the blockchain tiers, enables real-time alignment between the manufacturing floor and its virtual twin. Third, a unified data representation is designed to transform diverse manufacturing datasets into a homogeneously structured format. Experimental results show that the proposed framework significantly enhances data authenticity while reducing the time required to map manufacturing process flows.
|
|
WeBT7 |
Room T7 |
Energy and Sustainability 2 |
Regular Session |
Chair: Yan, Bing | Rochester Institute of Technology |
|
10:45-11:03, Paper WeBT7.1 | |
Machine Learning Enhanced Formulation Tightening of Energy Storage Resource Constraints in Unit Commitment |
|
Hyder, Farhan | Rochester Institute of Technology |
Quang, Uyen Nhi | Rochester Institute of Technology |
Yan, Bing | Rochester Institute of Technology |
Keywords: Power and Energy Systems automation, Optimization and Optimal Control, Machine learning
Abstract: Recent years have seen an increased penetration of utility-scale energy storage resources (ESRs) into the power grid. To incorporate ESRs into the unit commitment (UC) problem, binary variables are required to prevent simultaneous charging and discharging. This further increases the problem's complexity, prompting computationally efficient modeling of ESRs. A systematic formulation tightening approach based on constraint-to-vertex conversion, developed in our previous work, has shown effectiveness in tightening formulations of conventional generators. However, it requires extensive manual analysis and expert knowledge during the parameterization step (expressing numerical coefficients as combinations of generator parameters). In this paper, the systematic formulation tightening approach is enhanced via machine learning for ESRs. To address the challenge caused by the manual analysis for parameterization, a novel machine learning model is developed by taking ESR parameters as inputs and numerical coefficients of the tight constraints as outputs to identify the mathematical relationship between them. Furthermore, the entire tightening process is automated for flexibility and applicability. Results based on the IEEE 118-bus system show a significant reduction in solving time (up to 64%) while maintaining the solution quality of UC with ESRs, demonstrating the effectiveness of the approach. The approach is general and has great potential for tightening complicated Mixed Binary Linear Programming (MBLP) problems in power systems and beyond.
|
|
11:03-11:21, Paper WeBT7.2 | |
Integrated Optimization and Control Method for Hydrogen-Based Zero-Carbon Isolated Microgrid |
|
Zhu, Chenghao | Xi'an Jiaotong University |
Liu, Jinhui | Xi'an Jiaotong University |
Xu, Zhanbo | Xi'an Jiaotong University |
Wu, Jiang | Xian Jiaotong University |
Guan, Xiaohong | Xi'an Jiaotong University |
Keywords: Optimization and Optimal Control, Cyber-physical Production Systems and Industry 4.0, Modelling, Simulation and Validation of Cyber-physical Energy Systems
Abstract: For isolated microgrids (IMGs) with fuel cell-based combined heating and power systems (FC-CHPSs), the ability to handle rapid load variations is particularly critical for ensuring stable zero-carbon operation. Under complex dynamic conditions, effectively maintaining system stability is essential for reliable operation in practical applications. In this paper, taking a typical hydrogen-based zero-carbon IMG containing an FC-CHPS as an example, the operation optimization model is first developed. At the same time, its control boundary is formulated and simulated in MATLAB/Simulink. To overcome the limitations of the standalone operation optimization and control strategies, an integrated optimization and control (IOC) method is developed. In the upper-level optimization problem, it optimizes the energy cost and realizes zero carbon emissions while satisfying the energy balance and complicated operating constraints of the IMG, which provides an optimal operation strategy to guide the power of the lower layer. In the lower-level control problem, it realizes the power coordination of FC-BES-electrical load within seconds to respond to rapid load variations. Compared with Proportional-Integral-Derivative (PID) control without upper-level guidance, the proposed IOC method can accommodate a wider range of rapid load variations while maintaining stable IMG output, and also considers FC degradation to improve economic efficiency.
|
|
11:21-11:39, Paper WeBT7.3 | |
Optimal Coordination of Solar-Assisted Geothermal-Based Integrated Energy Systems Considering Soil Thermal Dynamics |
|
Jian, Xiyan | Xi'an Jiaotong University |
Xu, Zhanbo | Xi'an Jiaotong University |
Dong, Xiangxiang | Xi'an Jiaotong University |
Liu, Kun | Xi'an Jiaotong University |
Wu, Jiang | Xian Jiaotong University |
Guan, Xiaohong | Xi'an Jiaotong University |
Keywords: Planning, Scheduling and Coordination, Renewable Energy Sources, Sustainability and Green Automation
Abstract: To address escalating carbon emissions from heating in cold regions and seasonal mismatches in renewable energy utilization, this paper proposes a solar-assisted geothermal-based integrated energy system that ensures sustainable geothermal utilization and resolves solar energy intermittency by replenishing soil thermal reserves with surplus solar energy during non-heating seasons. A mixed-integer linear programming framework is developed to model the spatiotemporal thermal dynamics of the soil thermal energy storage system, incorporating operational constraints of energy supply and storage devices. The framework incorporates thermal interference between boreholes and employs piecewise linearization techniques to address nonlinear energy conversion constraints inherent to ground source heat pump (GSHP) operations. Numerical results demonstrate that the proposed method reduces the soil thermal imbalance ratio from 42% to 3%, effectively mitigating thermal depletion of GSHP under cold climates. Simultaneously, the system achieves 100% solar energy utilization efficiency and reduces operation costs by 5% through seasonal thermal storage. This work provides a cost-effective and sustainable paradigm for optimizing clean heating systems in cold regions, balancing economic viability with long-term environmental benefits.
|
|
11:39-11:57, Paper WeBT7.4 | |
Robust Resilience Enhancement Considering EVs Emergency Response under Endogenous and Exogenous Uncertainties |
|
Ma, Donglai | Xi'an Jiaotong University |
Qiu, Luru | Xi'an Jiaotong University |
Cao, Xiaoyu | Xi'an Jiaotong University |
Sun, Xunhang | Xi'an Jiaotong University |
Keywords: Planning, Scheduling and Coordination, Optimization and Optimal Control, Smart Grids
Abstract: With the rapid advances in wireless communications and Internet-of-Things (IoT) technologies, a large number of electric vehicles (EVs) can serve as mobile energy resources based on the vehicle-to-grid (V2G) infrastructure. The emergency response of EVs has become an economically viable solution for enhancing the resilience of power distribution networks (PDNs) against extreme weather events. This paper presents a robust resilience enhancement approach for the PDN by fully exploiting EV aggregation as mobile emergency resources. Particularly, the influence of financial incentives (offered by grid operators) on the spatio-temporal distribution of EV fleets (i.e., a higher incentive may attract more EVs to support the on-emergency power supply) is considered and analytically modeled. To capture the endogenous uncertainty of the EV distribution associated with the incentive offering strategy, a decision-dependent uncertainty (DDU) set is developed. Also, the faults of distribution feeders under severe N-k contingencies are modeled as conventional exogenous uncertainties. A complex robust optimization problem with a mixture of endogenous and exogenous uncertainty models is developed and efficiently solved through a customized parametric column-and-constraint generation (C&CG) algorithm. Numerical results on a 33-bus test distribution system validate the resilience benefits and economic feasibility of the proposed method. The significance of DDU modeling for network resilience enhancement is demonstrated and highlighted.
|
|
WeBT8 |
Room T8 |
Autonomous Systems 3 |
Regular Session |
Chair: Vundurthy, Bhaskar | Carnegie Mellon University |
|
10:45-11:03, Paper WeBT8.1 | |
Hybrid Autonomy Framework for a Future Mars Science Helicopter |
|
Di Pierno, Luca | ETH Zurich |
Hewitt, Robert | Jet Propulsion Laboratory |
Weiss, Stephan | Universität Klagenfurt |
Brockers, Roland | California Institute of Technology |
Keywords: Control Architectures and Programming, Discrete Event Dynamic Automation Systems, Behavior-Based Systems
Abstract: Autonomous aerial vehicles, such as NASA’s Ingenuity, enable rapid planetary surface exploration beyond the reach of ground-based robots. Thus, NASA is working on a Mars Science Helicopter (MSH), an advanced concept capable of performing long-range science missions and autonomously navigating challenging Martian terrain. Given significant Earth-Mars communication delays and mission complexity, an advanced autonomy framework is required to ensure safe and efficient operation by continuously adapting behavior based on mission objectives and real-time conditions, without human intervention. This study presents a deterministic high-level control framework for aerial exploration, integrating a Finite State Machine (FSM) with Behavior Trees (BTs) to achieve a scalable, robust, and computationally efficient autonomy solution for critical scenarios like deep space exploration. In this paper we outline key capabilities of a possible MSH and detail the FSM-BT hybrid autonomy framework which orchestrates them to achieve the desired objectives. Monte Carlo simulations and real field tests validate the framework, demonstrating its robustness and adaptability to both discrete events and real-time system feedback. These inputs trigger state transitions or dynamically adjust behavior execution, enabling reactive and context-aware responses. The framework is middleware-agnostic, supporting integration with systems like F-Prime and extending beyond aerial robotics.
|
|
11:03-11:21, Paper WeBT8.2 | |
Game-Theoretic Autonomous Vehicle Path Coordination with Limited Communication |
|
Cruz, Nicole | Florida International University |
Fuentes, Jose | Florida International University |
Bobadilla, Leonardo | Florida International University |
Keywords: Planning, Scheduling and Coordination, Robot Networks, Motion and Path Planning
Abstract: The coordination of autonomous vehicles in communication-constrained environments is vital for tasks such as search and rescue, maritime surveillance, and disaster response. Traditional path planning approaches often rely on extensive communication to ensure collision avoidance and optimal trajectory selection, which is not feasible in adversarial or bandwidth-limited settings. This paper introduces a game-theoretic framework for multi-vehicle coordination, leveraging Nash Equilibria and approximation algorithms to enable effective decision-making with minimal communication overhead. We explore three limited communication strategies: a 2/3-Well-Supported Nash Equilibrium (WSNE) algorithm, a 3/4-WSNE algorithm, and a proposed two-way communication extension based on the 3/4-WSNE algorithm, dubbed “Two-Way 3/4-WSNE”. These methods are evaluated using both simulated Voronoi roadmaps and real-world experiments with Autonomous Surface Vehicles. Our results demonstrate that strategic path selection can be achieved with minimal information exchange while ensuring collision avoidance and near-optimal path efficiency. This work contributes to the development of scalable, decentralized coordination strategies for autonomous vehicles operating in communication-constrained environments.
|
|
11:21-11:39, Paper WeBT8.3 | |
A Max-Min Tree Approach to the Automated Construction of Ad Hoc Wireless Networks in Unknown Environments |
|
Noren, Charles | Carnegie Mellon University |
Vundurthy, Bhaskar | Carnegie Mellon University |
Bagree, Namya | Carnegie Mellon University |
Travers, Matthew | Carnegie Mellon University |
Keywords: Robot Networks, Autonomous Agents
Abstract: Reliable communication networks are essential for the remote operation of automated teams of robotic agents. For unknown (no prior map) communications-deprived (no existing communication infrastructure) environments, the robotic agents must construct the network as the robots move through the terrain. We present a novel method for automated network construction tailored for mobile robotic teams that require communication with a central base station. Our key innovation is the introduction of a maximin spanning tree structure, which guarantees a minimum level of communication performance between nodes. By directly optimizing node placement based on signal-based metrics, instead of relying on geometric surrogates like distance and visibility, we also achieve significant decreases in agent utilization while maintaining coverage for the traversed area. By using the robotic agents themselves as mobile repeaters in a communication network, each robotic agent can be individually assigned to prioritize network connectivity during critical operations. Numerical simulations on common Multi-Agent Path Finding benchmarks demonstrate up to a 36% reduction in the number of required nodes compared to existing techniques. Furthermore, this work guarantees robust network connectivity in dynamic environments, outperforming strongest-neighbor approaches that are vulnerable to link disruptions. Lastly, hardware tests confirm the robustness of our method in challenging scenarios encountered in real-world deployments.
|
|
11:39-11:57, Paper WeBT8.4 | |
Model Predictive Control for Speed Regulation of Autonomous Vehicles at Road Intersections and Performance Evaluation in a V2X Communication Scenario |
|
Fasciani, Angelo | Università Degli Studi Dell'Aquila |
Zacchia Lun, Yuriy | University of L'Aquila |
Smarra, Francesco | University of L'Aquila |
D'Innocenzo, Alessandro | University of L'Aquila |
Keywords: Collision Avoidance, Robust/Adaptive Control, Autonomous Vehicle Navigation
Abstract: The paper aims at evaluating the performance of a centralised control strategy, based on a scheduling procedure and MPC (model predictive control), that regulates the crossing of a four-way intersection by autonomous vehicles while avoiding collisions in a V2X (vehicle-to-everything) communication scenario. In particular, we evaluate the performance under different communication channel gain conditions, with the packet loss process implemented through a Bernoulli probabilistic model based on the V2X communication protocol, as well as the impact of transmitting to the vehicles aggregated control commands for multiple future time horizons.
|
|
WeBT9 |
Room T9 |
Smart Logistics, Manufacturing and Healthcare 1 |
Special Session |
Chair: Mehdi-Souzani, Charyar | Université Paris-Saclay ENS-PAris Saclay Université Sorbonne Paris Nord |
Co-Chair: Xie, Haoyang | Arizona State University |
Organizer: Fanti, Maria Pia | Politecnico Di Bari |
Organizer: Mahulea, Cristian | Universidad De Zaragoza |
Organizer: Mangini, Agostino Marcello | Politecnico Di Bari |
Organizer: Roccotelli, Michele | Polytechnic University of Bari |
Organizer: Vogel-Heuser, Birgit | Technical University Munich |
Organizer: Zhou, MengChu | New Jersey Institute of Technology |
|
10:45-11:03, Paper WeBT9.1 | |
Video Diffusion Based Digital Twin for Large Format Additive Manufacturing (I) |
|
Liu, Lu | Arizona State University |
Xie, Haoyang | Arizona State University |
Hoskins, Dylan | Haddy Inc |
Rowe, Kyle | Local Motors |
Ju, Feng | Arizona State University |
Keywords: AI-Based Methods, Computer Vision in Automation, Additive Manufacturing
Abstract: Large Format Additive Manufacturing (LFAM) enables the fabrication of large, complex structures but presents challenges in thermal management, particularly in determining the optimal layer time to ensure interlayer bonding and structural integrity. Digital Twin (DT) technology has emerged as a key solution for predicting temperature distributions and optimizing process parameters. However, existing Physics-Based and Data-Driven DT models provide static, one-time predictions, lacking the adaptability to dynamically update thermal profile predictions based on real-time parameter adjustments. To address this limitation, we propose an adaptive Digital Twin framework based on the Video Diffusion Transformer (VDT). Unlike traditional DT models, our approach leverages Generative AI to dynamically simulate future temperature distributions when layer time or other printing parameters change. This method ensures that adjustments in printing strategy are immediately reflected in updated temperature predictions, leading to enhanced efficiency, improved print quality, and greater adaptability in LFAM workflows. Experimental results demonstrate that our approach is highly effective, generating realistic future frames that accurately reflect the temperature distribution. This work represents a significant step forward in Digital Twin technology, highlighting the potential of Generative AI in manufacturing.
|
|
11:03-11:21, Paper WeBT9.2 | |
Data-Driven Optimization of EV Charging Station Placement Using Causal Discovery (I) |
|
Junker, Julius Stephan | The University of Cologne |
Hu, Rong | Hunan University |
Li, Ziyue | University of Cologne |
Ketter, Wolfgang | University of Cologne |
Keywords: Optimization and Optimal Control, Plug-in Electric Vehicles, Causal Models
Abstract: This paper addresses the critical challenge of optimizing electric vehicle charging station placement through a novel data-driven methodology employing causal discovery techniques. While traditional approaches prioritize economic factors or power grid constraints, they often neglect empirical charging patterns that ultimately determine station utilization. We analyze extensive charging data from Palo Alto and Boulder (337,344 events across 100 stations) to uncover latent relationships between station characteristics and utilization. Applying structural learning algorithms (NOTEARS and DAGMA) to this data reveals that charging demand is primarily determined by three factors: proximity to amenities, EV registration density, and adjacency to high-traffic routes. These findings, consistent across multiple algorithms and urban contexts, challenge conventional infrastructure distribution strategies. We develop an optimization framework that translates these insights into actionable placement recommendations, identifying locations likely to experience high utilization based on the discovered dependency structures. The resulting site selection model prioritizes strategic clustering in high-amenity areas with substantial EV populations rather than uniform spatial distribution. Our approach contributes a framework that integrates empirical charging behavior into infrastructure planning, potentially enhancing both station utilization and user convenience. By focusing on data-driven insights instead of theoretical distribution models, we provide a more effective strategy for expanding charging networks that can adjust to various stages of EV market development.
|
|
11:21-11:39, Paper WeBT9.3 | |
OffLight: An Offline Multi-Agent Reinforcement Learning Framework for Traffic Signal Control (I) |
|
Bokade, Rohit | Northeastern University |
Jin, Xiaoning | Northeastern University |
Keywords: Agent-Based Systems, Automation Technologies for Smart Cities, Reinforcement
Abstract: Efficient traffic signal control (TSC) is crucial for reducing congestion and improving urban mobility. While multi-agent reinforcement learning (MARL) has shown promise in adaptive traffic management, its reliance on real-time interactions makes online training costly and impractical. Offline MARL, which learns from historical data, offers a safer and more scalable alternative but struggles with heterogeneous behavior policies in real-world datasets. We introduce OffLight, a novel offline MARL framework designed to handle policy heterogeneity in traffic signal control datasets. OffLight integrates importance sampling (IS) to correct for distributional shifts and return-based prioritized sampling (RBPS) to emphasize high-quality experiences. To model diverse behavior policies, it leverages a Gaussian mixture model variational graph autoencoder (GMM-VGAE) which captures spatial and temporal traffic dynamics and improves policy estimation. We evaluate OffLight across real-world urban traffic scenarios varying from small to large traffic signal networks. Results show that OffLight outperforms existing offline RL methods by reducing average travel time by up to 7.8% and queue length by 11.2% compared to state-of-the-art approaches. Unlike prior offline RL methods, OffLight effectively adapts to complex and mixed-policy datasets, making it more reliable for real-world deployment without risky online training. These results highlight OffLight’s potential to improve urban traffic management at scale. Our implementation is available at https://github.com/rbokade/offlight.
|
|
11:39-11:57, Paper WeBT9.4 | |
Automated Knowledge Graph Construction for Supply Chain Datasets Assisted by LLMs (I) |
|
Wang, Luxuan | The Hong Kong University of Science and Technology |
Tsung, Fugee | HKUST |
Keywords: Logistics, AI-Based Methods
Abstract: In today’s data-driven supply chain management landscape, establishing connections among stakeholders from diverse resources is essential for effective analysis and decision-making. Knowledge graphs (KGs) provide a transformative solution by organizing fragmented inventory data into semantically rich, interconnected networks, enabling contextualized insights and robust reasoning. However, automating KG construction for supply chain datasets is challenging due to issues such as heterogeneous data integration (e.g., text documents, spreadsheets), domain-specific contextualization, and the need to model implicit operational dependencies. This paper introduces a novel framework that leverages large language models (LLMs) with a multi-step prompting workflow to address these challenges. Our AutoKG4SC approach automates the extraction of entities from various sources and constructs KGs to capture complex interdependencies among these entities. We utilize zero-shot prompting for ontology construction, Named Entity Recognition (NER), and Relation Extraction (RE) tasks, thereby eliminating the need for extensive domain-specific training and human prior knowledge. We validate the framework through a case study that demonstrates AutoKG4SC’s ability to construct high-quality KGs from supply chain datasets. This research presents an effective framework and prompt strategy for KG construction, which can be easily adapted to datasets with richer information and other application scenarios.
|
|
WeBT10 |
Room T10 |
Motion Control and Planning 2 |
Regular Session |
Chair: Zhang, Xi | College of Engineering, Peking University |
|
10:45-11:03, Paper WeBT10.1 | |
Fast UAV Trajectory Planning for Time-Optimal and Jerk-Minimal Paths: A Quadratic Programming Approach |
|
Hamdaan, Mohammad | Ubifly Technologies Private Limited |
Shyamsundar, Bakthakolahalan | Ubifly Technologies Private Limited |
Chakravarthy, Satyanarayanan R | Indian Institute of Technology Madras, Ubifly Technologies Priva |
Keywords: Motion and Path Planning, Motion Control, Planning, Scheduling and Coordination
Abstract: A widely adopted approach for trajectory planning in robotics is the decoupled method, in which a geometric path is formulated first, and a kino-dynamically feasible time parameterization is subsequently computed for it. This paper introduces a novel quadratic programming (QP)-based approach for generating time-optimal and jerk-minimal trajectories for planning in robotics. Our approach maintains the decoupled structure of path planning and time parameterization, ensuring flexibility and computational efficiency. By exploiting the unique structure of the QP formulation, we reduce the dimensionality of the optimization problem and eliminate equality constraints, making it computationally efficient and suitable for online replanning applications. Our algorithm offers comparable performance to traditional Time-Optimal Path Parameterization (TOPP)-based methods and other state-of-the-art techniques, while providing additional benefits such as improved jerk minimization, which is crucial for smoothness in motion and energy optimization. While preserving the robustness of Convex Optimization (CO)-based methods, experiments on dense real-world maps demonstrate that the proposed algorithm performs on par with TOPP-RA in terms of computation time and trajectory duration, and significantly outperforms TOPP-RA in terms of path jerk. Furthermore, real-world experiments on a multirotor UAV validate our approach and demonstrate the algorithm’s versatility and applicability to a wide range of motion planning problems in robotics and engineering.
|
|
11:03-11:21, Paper WeBT10.2 | |
Hybrid Motion Control for a Novel Wheeled Quadruped Robot |
|
Zhang, Chenyun | Fudan University |
Li, Ruijiao | Fudan University |
Zhang, Anzheng | Fudan University |
Khan, Rezwan Al Islam | Fudan University |
Pan, Yuzhen | Fudan University |
Zhao, Xuan | Yiwu Research Institute of Fudan University; Fudan University |
Li, Qiong | Fudan University |
Zhou, Chengrui | Columbia University |
Shang, Huiliang | Fudan University |
Keywords: Optimization and Optimal Control, Motion Control, Robust/Adaptive Control
Abstract: We present a modeling and hybrid locomotion control method for a novel wheeled quadruped robot, Pegasus, featuring a unique lightweight linear-actuator-driven ankle structure that enables four-wheel independent steering. A quadruped model, which only activates 12 joints (hip, thigh, and knee), and a vehicle model are implemented on Pegasus simultaneously, with a hybrid velocity allocation strategy that combines legged and wheeled movements by distributing the desired linear and angular velocity to both locomotion modes. Moreover, a model predictive control (MPC) controller is designed to generate optimal torque for the quadruped model in real time, and a "telescopic vehicle model" is derived to calculate the commands for ankles and wheels. Our experimental results demonstrate that the proposed modeling method and velocity allocation strategy with our controller enable Pegasus to track the desired commands and adapt to various terrains, such as ramps, curves, and gravel roads. Furthermore, the use of ankle joints reduces the cost of transport (CoT) by more than 20% compared to scenarios lacking ankle joints.
|
|
11:21-11:39, Paper WeBT10.3 | |
GPD: Guided Polynomial Diffusion for Motion Planning |
|
Srikanth, Ajit | International Institute of Information Technology, Hyderabad |
Mahajan, Parth | Delhi Technological University |
Saha, Kallol | Carnegie Mellon University |
Mandadi, Vishal Reddy | International Institute of Information Technology, Hyderabad |
Paul, Pranjal | International Institute of Information Technology |
Wadhwani, Pawan | RRC, IIIT Hyderabad |
Bhowmick, Brojeshwar | Tata Consultancy Services |
Singh, Arun Kumar | University of Tartu |
Krishna, Madhava | IIIT Hyderabad |
Keywords: Motion and Path Planning, Deep Learning in Robotics and Automation, Collision Avoidance
Abstract: Diffusion-based motion planners are becoming popular due to their well-established performance improvements, stemming from sample diversity and the ease of incorporating new constraints directly during inference. However, a primary limitation of the diffusion process is the requirement for a substantial number of denoising steps, especially when the denoising process is coupled with gradient-based guidance. In this paper, we introduce diffusion in the parametric space of trajectories, where the parameters are represented as Bernstein coefficients. We show that this representation greatly improves the effectiveness of the cost function guidance and the inference speed. We also introduce a novel stitching algorithm that leverages the diversity in diffusion-generated trajectories to produce collision-free trajectories with just a single cost function-guided model. We demonstrate that our approaches outperform current SOTA diffusion-based motion planners for manipulators and provide an ablation study on key components.
|
|
11:39-11:57, Paper WeBT10.4 | |
Transformer-Based World Interaction Modeling for Humanoid Locomotion Control |
|
Zheng, Han | Tsinghua University |
Cheng, Yi | Tsinghua University |
Liu, Hang | University of Michigan |
Li, Jiayi | Tsinghua University |
Li, Yizhe | Tsinghua University |
Ye, Linqi | Shanghai University |
Liu, Houde | Shenzhen Graduate School, Tsinghua University |
Keywords: Motion Control, Reinforcement
Abstract: Locomotion tasks for humanoid robots are challenging, especially in complex terrains. Understanding the physical processes of robot-environment interactions is key to achieving stable walking for humanoid robots. Since there is privileged information that the robot cannot directly access, the observation states are partially observable. Previous reinforcement learning (RL)-based methods either reconstruct environmental information from partial observations or reconstruct robotic dynamics information from partial observations, but they fail to fully model the physical processes of robot-environment interactions. In this work, we propose an end-to-end reinforcement learning control framework based on a physical interaction world model for humanoid robots. Our primary innovation is the introduction of a physical interaction world model to understand the dynamic interactions between the robot and the environment. Additionally, to address the temporal and dynamic nature of these interactions, we employ the hidden layers of Transformer-XL for implicit modeling. The proposed framework exhibits robust and flexible locomotion ability in complex environments such as slopes, stairs, and discontinuous surfaces. We validate the robustness of this method using a humanoid robot in simulations, and a quantitative comparison against the baselines shows better traversability and command tracking.
|
|
WeCT1 |
Room T1 |
Image, Video and Vision 1 |
Regular Session |
Chair: Kim, Seoung Bum | Korea University |
|
14:45-15:03, Paper WeCT1.1 | |
Pixels-To-Graph: Real-Time Integration of Building Information Models and Scene Graphs for Semantic-Geometric Human-Robot Understanding |
|
Longo, Antonello | Polytechnic University of Bari, NASA Jet Propulsion Laboratory |
Chung, Chanyoung | NASA Jet Propulsion Laboratory (JPL) |
Palieri, Matteo | NASA Jet Propulsion Laboratory |
Kim, Sung-Kyun | NASA Jet Propulsion Laboratory, Caltech |
Agha-mohammadi, Ali-akbar | NASA-JPL, Caltech |
Guaragnella, Cataldo | Politecnico Di Bari |
Khattak, Shehryar | NASA Jet Propulsion Laboratory |
Keywords: Deep Learning in Robotics and Automation, Human-Centered Automation, Autonomous Agents
Abstract: Autonomous robots are increasingly playing key roles as support platforms for human operators in high-risk, dangerous applications. To accomplish challenging tasks, efficient human-robot cooperation and understanding are required. While robotic planning typically leverages 3D geometric information, human operators are accustomed to a high-level compact representation of the environment, like top-down 2D maps representing the Building Information Model (BIM). 3D scene graphs have emerged as a powerful tool to bridge the gap between human-readable 2D BIM and the robot 3D maps. In this work, we introduce Pixels-to-Graph (Pix2G), a novel lightweight method to generate structured scene graphs from image pixels and LiDAR maps in real-time for the autonomous exploration of unknown environments on resource-constrained robot platforms. To satisfy onboard compute constraints, the framework is designed to perform all operations on CPU only. The method outputs are a de-noised 2D top-down environment map and a structure-segmented 3D pointcloud, which are seamlessly connected using a multi-layer graph abstracting information from object-level up to the building-level. The proposed method is quantitatively and qualitatively evaluated during real-world experiments performed using the NASA JPL NeBula-Spot legged robot to autonomously explore and map cluttered garage and urban office-like environments in real-time.
|
|
15:03-15:21, Paper WeCT1.2 | |
MULoc: Robust Monocular Visual Localization for Underwater Robots Via Multi-Modal Feature Fusion |
|
Gao, Yuer | The Hong Kong University of Science and Technology (Guangzhou) |
Cai, Yi | The Hong Kong University of Science and Technology (Guangzhou) |
Keywords: Computer Vision in Automation, Deep Learning in Robotics and Automation, Sensor-based Control
Abstract: Precise state estimation and localization for underwater robots are essential prerequisites for autonomous operation in complex aquatic environments. While traditional underwater localization methods primarily rely on specialized hardware such as acoustic sensors and inertial measurement units, these approaches often increase system complexity, cost, and potential failure points. This research presents MULoc (Multi-feature Underwater Localization), a real-time monocular visual localization framework for underwater robots that leverages complementary feature fusion to overcome the significant challenges presented by underwater environments, including non-uniform illumination, scattering effects, and varying turbidity levels. By utilizing a single camera approach, MULoc significantly reduces hardware complexity and cost while maintaining robust performance. The proposed framework implements an end-to-end optimization mechanism that integrates deep learning-based RGB feature extraction with geometric depth estimation derived from the same monocular source. Experimental validation in controlled underwater testing facilities demonstrates that MULoc achieves an average localization error below 1% of trajectory length across diverse underwater scenarios, with real-time processing capabilities at 25 FPS on embedded computing platforms. Performance comparisons reveal substantial improvements in localization performance compared to traditional methods and a tracking success rate of 92% in challenging conditions. The presented system offers a comprehensive and cost-effective automation solution for underwater robot navigation with significant applications in marine resource exploration, infrastructure inspection, and underwater archaeology.
|
|
15:21-15:39, Paper WeCT1.3 | |
Attention Distribution across Multiple Video Feeds in Unmanned Construction: The Impact of Free-Viewpoint Video on Visual Field Awareness |
|
Yamashita, Yuki | Waseda University |
Motohashi, Shutaro | Waseda University |
Hashimoto, Takeshi | Public Works Research Institute |
Endo, Daisuke | Public Works Research Institute |
Yamauchi, Genki | Public Works Research Institute |
Iwata, Hiroyasu | Waseda University |
Keywords: Human Factors and Human-in-the-Loop, Human-Centered Automation, Automation in Construction
Abstract: Although unmanned construction ensures worker safety during restoration work at disaster sites, its efficiency is approximately half that of manned construction. Herein, a semi-autonomous control system using a drone was constructed to address this issue, and a new video presentation method combining three video viewpoints (bird’s-eye camera, onboard camera, and side camera) was developed. The side-camera video was a free-viewpoint video allowing the operator to directly control the viewpoint. The experiments showed that some operators overlooked hazards in other videos while operating the free-viewpoint video, resulting in contact between the machine and obstacles. To address this problem, we defined a new “Visual Field Awareness” as an area where the operator can respond to sudden visual changes in the video, analyzed the ability of operators to distribute attention to multiple videos, and experimentally verified the effect of the introduction of free-viewpoint videos on the Visual Field Awareness of the operators. The methodology for the estimation of Visual Field Awareness established herein can be applied in multiple areas beyond construction sites, extending to other fields where remote operation is utilized, such as telemedicine. This approach may provide new insights into the evaluation and improvement of remote operation capabilities across various domains.
|
|
15:39-15:57, Paper WeCT1.4 | |
Image Denoising for Wafer Transmission Electron Microscopy Using Segment Anything-Guided Optimization |
|
Kim, Sungsu | Korea University |
Jang, Gunhui | Korea University |
Cho, Hansam | Korea University |
Roh, Heejoong | SK Hynix Inc |
Kim, Kyunghye | SK Hynix Inc |
Jo, Munki | SK Hynix Inc |
Tae, Jaeung | SK Hynix Inc |
Kim, Seoung Bum | Korea University |
Keywords: Computer Vision in Automation, Semiconductor Manufacturing, Computer Vision for Manufacturing
Abstract: Wafer transmission electron microscopy (TEM) images have gained significant attention for analyzing the internal structure of wafers. However, accurate measurements are often hindered by substantial errors because of unknown noise in these images. Therefore, image denoising is essential for reliable measurements of wafer TEM images. Denoising wafer TEM images, however, is particularly challenging because their noise characteristics differ significantly from those of typical images. Moreover, the absence of clean-noisy image pairs reduces the effectiveness of machine learning (ML)-based methods. To address this challenge, we propose SAMOD, a Segment Anything (SAM)-guided optimization-based wafer TEM image denoising framework that combines filter-based denoised images generated with various hyperparameter settings. By optimizing the combination weights using pseudo measurement points identified by the vision foundation model SAM, SAMOD reduces measurement errors across six different manufacturing process images. Notably, SAMOD achieved competitive performance without requiring prior knowledge of noise characteristics or measurement information, time-consuming hyperparameter searches, or model training, ensuring both practicality and robustness.
|
|
15:57-16:15, Paper WeCT1.5 | |
Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training |
|
Deogan, Aneesh | Eindhoven University of Technology |
Beks, Wout | Eindhoven University of Technology |
Teurlings, Peter | Eindhoven University of Technology |
de Vos, Koen | Eindhoven University of Technology |
van den Brand, Mark | Eindhoven University of Technology |
van de Molengraft, Marinus Jacobus Gerardus | University of Technology Eindhoven |
Keywords: Deep Learning in Robotics and Automation, Computer Vision in Automation, Autonomous Agents
Abstract: Annotated datasets are critical for training neural networks for object detection, yet their manual creation is time- and labour-intensive, subject to human error, and often limited in diversity. This challenge is particularly pronounced in the domain of robotics, where diverse and dynamic scenarios further complicate the creation of representative datasets. To address this, we propose a novel method for automatically generating annotated synthetic data in Unreal Engine. Our approach leverages photorealistic 3D Gaussian splats for rapid synthetic data generation. We demonstrate that synthetic datasets can achieve performance comparable to that of real-world datasets while significantly reducing the time required to generate and annotate data. Additionally, combining real-world and synthetic data significantly increases object detection performance by leveraging the quality of real-world images with the easier scalability of synthetic data. To our knowledge, this is the first application of synthetic data for training object detection algorithms in the highly dynamic and varied environment of robot soccer. Validation experiments reveal that a detector trained on synthetic images performs on par with one trained on manually annotated real-world images when tested on robot soccer match scenarios. Our method offers a scalable and comprehensive alternative to traditional dataset creation, eliminating the labour-intensive, error-prone manual annotation process. By generating datasets in a simulator where all elements are intrinsically known, we ensure accurate annotations while significantly reducing manual effort, which makes it particularly valuable for robotics applications requiring diverse and scalable training data.
|
|
WeCT2 |
Room T2 |
RAL Paper Session 4 |
Special Session |
Chair: Lee, Donggun | North Carolina State University |
|
14:45-15:03, Paper WeCT2.1 | |
Certifiable Reachability Learning Using a New Lipschitz Continuous Value Function |
|
Li, Jingqi | Berkeley |
Lee, Donggun | North Carolina State University |
Lee, Jaewon | Boson AI |
Dong, Kris Shengjun | UC Berkeley |
Sojoudi, Somayeh | UC Berkeley |
Tomlin, Claire | UC Berkeley |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Optimization and Optimal Control
Abstract: We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite any disturbance within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function and post-learning certification. Compared to prior works, our new value function is Lipschitz continuous and its associated Bellman operator is a contraction mapping, both of which improve the learning performance. To ensure deterministic guarantees of our learned reach-avoid set, we introduce two efficient post-learning certification methods. Both methods can be used online for real-time local certification or offline for comprehensive certification. We validate our framework in a 12-dimensional Crazyflie drone racing hardware experiment and a simulated 10-dimensional highway takeover example.
|
|
15:03-15:21, Paper WeCT2.2 | |
Multi-Finger Manipulation Via Trajectory Optimization with Differentiable Rolling and Geometric Constraints |
|
Yang, Fan | University of Michigan |
Power, Thomas | Robotics Institute, University of Michigan |
Aguilera, Sergio | Honda Research Institute - USA |
Iba, Soshi | Honda Research Institute USA |
Soltani Zarrin, Rana | Honda Research Institute - USA |
Berenson, Dmitry | University of Michigan |
Keywords: In-Hand Manipulation, Optimization and Optimal Control
Abstract: Parameterizing finger rolling and finger-object contacts in a differentiable manner is important for formulating dexterous manipulation as a trajectory optimization problem. In contrast to previous methods which often assume simplified geometries of the robot and object or do not explicitly model finger rolling, we propose a method to further extend the capabilities of dexterous manipulation by accounting for non-trivial geometries of both the robot and the object. By integrating the object's Signed Distance Field (SDF) with a sampling method, our method estimates contact and rolling-related variables in a differentiable manner and includes those in a trajectory optimization framework. This formulation naturally allows for the emergence of finger-rolling behaviors, enabling the robot to locally adjust the contact points. To evaluate our method, we introduce a benchmark featuring challenging multi-finger dexterous manipulation tasks, such as screwdriver turning and in-hand reorientation. Our method outperforms baselines in terms of achieving desired object configurations and avoiding dropping the object. We also successfully apply our method to a real-world screwdriver turning task and a cuboid alignment task, demonstrating its robustness to the sim2real gap.
|
|
15:21-15:39, Paper WeCT2.3 | |
Hybrid Theta*: Motion Planning for Dubins Vehicles with Integral Constraints |
|
Manyam, Satyanarayana Gupta | Infoscitex Corp |
Casbeer, David | AFRL |
Taylor, Colin | Parallax Advanced Research |
Keywords: Motion and Path Planning, Collision Avoidance, Constrained Motion Planning
Abstract: We consider a motion planning problem for vehicles with curvature constraints, such as minimum turn radius, and a secondary cost such as resource cost. Traditional motion planning problems address the secondary cost as a soft constraint within the cost function. In the current paper, we take a different approach and treat this as a constraint, separate from the primary objective cost. Specifically, the integrated resource cost along the vehicle’s path is constrained to be within a prespecified limit, which is separate from the main travel cost being optimized. This approach is suitable for applications such as fire fighting, where finding the paths of minimum cost or time is essential while limiting exposure to the high heat areas. To address the resource constraints, we introduce the Hybrid Theta∗ (HΘ∗) algorithm. This is an incremental sampling-based search algorithm that draws inspiration from labeling algorithms used in resource-constrained shortest path problems. We present two versions of the HΘ∗ algorithm, label-select and focal-select; these variants differ in how labels to be expanded are selected from the processing queue. The proposed algorithms significantly outperform baseline methods that use traditional motion planning algorithms in terms of both the success rate and the solution cost. We validate the algorithms through computational experiments and real-world flight testing with on-board computation.
|
|
15:39-15:57, Paper WeCT2.4 | |
Probabilistically Correct Language-Based Multi-Robot Planning Using Conformal Prediction |
|
Wang, Jun | Washington University in St. Louis |
He, Guocheng | Washington University in St. Louis |
Kantaros, Yiannis | Washington University in St. Louis |
Keywords: Task Planning, Multi-Robot Systems, AI-Enabled Robotics
Abstract: This paper addresses task planning problems for language-instructed robot teams. Tasks are expressed in natural language (NL), requiring the robots to apply their skills at various locations and semantic objects. Several recent works have addressed similar planning problems by leveraging pre-trained Large Language Models (LLMs) to design effective multi-robot plans. However, these approaches lack mission completion guarantees. To address this challenge, we introduce a new distributed LLM-based planner, called S-ATLAS for Safe plAnning for Teams of Language-instructed AgentS, that can achieve user-defined mission success rates. This is accomplished by leveraging conformal prediction (CP), a distribution-free uncertainty quantification tool. CP allows the proposed multi-robot planner to reason about its inherent uncertainty, due to imperfections of LLMs, in a distributed fashion, enabling robots to make individual decisions when they are sufficiently certain and seek help otherwise. We show, both theoretically and empirically, that the proposed planner can achieve user-specified task success rates while minimizing the average number of help requests. We provide comparative experiments against related works showing that our method is significantly more computationally efficient and achieves lower help rates.
|
|
WeCT3 |
Room T3 |
Intelligent Process Control |
Special Session |
Chair: Liu, Changhui | Tongji University |
Co-Chair: Mou, Shancong | Georgia Tech |
Organizer: Liu, Changhui | Tongji University |
Organizer: Dong, Yuanfa | China Three Gorges University |
Organizer: Du, Juan | The Hong Kong University of Science and Technology (Guangzhou) |
Organizer: Liu, Yinhua | University of Shanghai for Science and Technology |
Organizer: Wang, Yinan | RPI |
Organizer: Mou, Shancong | Georgia Tech |
Organizer: Zheng, Shuai | Xi’an Jiaotong University |
Organizer: Xiao, Jinhua | Polytechnical University of Milan |
Organizer: Ding, Siyi | Donghua University |
|
14:45-15:03, Paper WeCT3.1 | |
Optimizing Fixture Layout for Compliant Parts in Ship Assembly: A Metaheuristic Approach Enhanced by Surrogate Modeling (I) |
|
Liu, Changhui | Tongji University |
Yu, Chunlong | Tongji University |
Keywords: Assembly, Compliant Assembly, Intelligent and Flexible Manufacturing
Abstract: This paper addresses the critical challenge of optimizing fixture layouts for compliant part assembly in shipbuilding, focusing on minimizing the number of fixtures while satisfying engineering constraints such as nodal displacement and assembly gap requirements. Traditional approaches predominantly optimize fixture positions with a predetermined quantity, often leading to excessive fixtures or compromised quality. To overcome these limitations, a two-phase Kriging-based metaheuristic method is proposed. The first phase iteratively adds fixtures on compliant parts using a greedy strategy guided by surrogate modeling to achieve an initial feasible layout. The second phase employs a large neighborhood search (LNS) to systematically remove redundant fixtures without violating constraints, enhancing cost-efficiency. A Kriging surrogate model accelerates the evaluation of candidate layouts, reducing reliance on computationally intensive finite element simulations. Numerical experiments on ship panel assembly demonstrate the method’s ability to generate layouts with 45% fewer fixtures than conventional approaches, while maintaining compliance with deformation and gap tolerances (e.g., reducing maximum assembly gap from 0.66 mm to 0.39 mm). Comparative analyses against simulated annealing and genetic algorithm benchmarks highlight superior convergence speed and solution quality, attributed to the integration of constraint-aware surrogate modeling and adaptive fixture reallocation. The framework offers practical value in reducing setup costs and enhancing production efficiency for large-scale compliant structures. Future work may extend the approach to incorporate manufacturing uncertainties and dynamic fixture positioning.
|
|
15:03-15:21, Paper WeCT3.2 | |
DeepOFormer: Deep Operator Learning with Domain-Informed Features for Fatigue Life Prediction (I) |
|
Li, Chenyang | New Jersey Institute of Technology |
Kapure, Tanmay | New Jersey Institute of Technology |
Roy, Prokash Chandra | University of Texas at El Paso |
Gan, Zhengtao | Arizona State University |
Shen, Bo | NJIT |
Keywords: AI-Based Methods, Big-Data and Data Mining
Abstract: Fatigue life characterizes the duration a material can function before failure under specific environmental conditions, and is traditionally assessed using stress-life (S-N) curves. While machine learning and deep learning offer promising results for fatigue life prediction, they face the overfitting challenge because of the small size of fatigue experimental data in specific materials. To address this challenge, we propose DeepOFormer, which formulates S-N curve prediction as an operator learning problem. DeepOFormer improves the deep operator learning framework with a Transformer-based encoder and a mean L2 relative error loss function. We also consider Stussi, Weibull, and Pascual and Meeker (PM) features as domain-informed features. These features are motivated by empirical fatigue models. To evaluate the performance of our DeepOFormer, we compare it with different deep learning models and XGBoost on a dataset with 54 S-N curves of aluminum alloys. With seven different aluminum alloys selected for testing, our DeepOFormer achieves an R^2 of 0.9515, a mean absolute error of 0.2080, and a mean relative error of 0.5077, significantly outperforming state-of-the-art deep/machine learning methods including DeepONet, TabTransformer, and XGBoost. The results highlight that our DeepOFormer integrated with domain-informed features substantially improves prediction accuracy and generalization capabilities for fatigue life prediction in aluminum alloys.
|
|
15:21-15:39, Paper WeCT3.3 | |
SmartFixture: Physics-Guided Reinforcement Learning for Automatic Fixture Layout Design in Manufacturing Systems (I) |
|
Wang, Yinan | RPI |
Keywords: Compliant Assembly, Process Control, Agent-Based Systems
Abstract: Fixture layout design critically impacts the shape deformation of large-scale sheet parts and the quality of the final product in the assembly process. The existing works focus on developing mathematical-optimization (MO)-based methods to generate the optimal fixture layout via interaction with finite element analysis (FEA)-based simulations or their surrogate models. Their limitations can be summarized as memorylessness and lack of scalability. Memorylessness indicates that the experience in designing the fixture layout for one part is usually not transferable to others. Scalability becomes an issue for MO-based methods when the design space of fixtures is large. Furthermore, the surrogate models might have limited representation capacity when modeling high-fidelity simulations. To address these limitations, we propose a learning-based framework, SmartFixture, to design the fixture layout by training a reinforcement learning (RL) agent through direct interaction with the FEA-based simulations. The advantages of the proposed framework include: (1) it is generalizable to design fixture layouts for unseen scenarios after offline training; (2) it is capable of finding the optimal fixture layout over a massive search space. Experiments demonstrate that the proposed framework consistently generates the best fixture layouts, which yield the smallest shape deformations on the sheet parts with different initial shape variations.
|
|
15:39-15:57, Paper WeCT3.4 | |
Finite Element Analysis Based Methods for Fuselage Shape Control (I) |
|
Mou, Shancong | Georgia Tech |
Keywords: Assembly, Optimization and Optimal Control, AI-Based Methods
Abstract: In precision aircraft assembly, maintaining the desired fuselage shape is essential for ensuring structural integrity and minimizing assembly-induced defects. This talk presents a framework for fuselage shape control, where numerical methods are directly integrated into the optimization process to enable accurate, fast, and robust decision-making. By embedding finite element analysis (FEA) models into the optimization loop, we capture the nonlinear mechanical behavior of composite structures under external loads, supporting effective control strategies that adapt to individual shape deviations. First, a fuselage finite element analysis (FEA) model will be introduced as the foundation of the framework. Based on this model, three case studies are conducted. The first case presents a cautious fixture control scheme that enhances control performance under incoming fuselage uncertainty, where the uncertainty is modeled through variations in the FEA parameters. The second case develops an optimal fixture design strategy, where the optimal locations of supporting fixtures are analytically determined to minimize shape deviation under gravitational loading. The third case introduces an optimal sensor placement strategy that enables accurate reconstruction of measurement signals, thereby achieving shape control with a balanced trade-off between control accuracy and sensing budget. These three examples highlight the importance of integrating physical models into the optimal control problem in fuselage assembly, providing both physical fidelity and computational efficiency.
|
|
WeCT4 |
Room T4 |
Robotic Systems |
Regular Session |
Chair: Figat, Maksym | Warsaw University of Technology |
|
14:45-15:03, Paper WeCT4.1 | |
Towards Task-To-Model Transformation in Robotic System Design |
|
Figat, Maksym | Warsaw University of Technology |
Keywords: Formal Methods in Robotics and Automation, Petri Nets for Automation Control, Control Architectures and Programming
Abstract: This paper presents the Robotic System Task-to-Model Transformation Methodology (RSTM^2), an approach for transforming high-level robotic tasks into formal system specifications. Extending the Robotic System Specification Methodology (RSSM), RSTM^2 introduces a systematic parametrisation process that refines task descriptions, system architecture, and behaviour modelling. Validated by MuJoCo simulations, the approach demonstrates the impact of system parametrisation, affordance analysis, and architectural validation. Future work will focus on designing complex systems, obtaining and comparing multiple models for the same task, and exploring the integration of RSTM^2 with Large Language Models (LLMs) to automate and refine system parametrisation.
|
|
15:03-15:21, Paper WeCT4.2 | |
Towards Miniaturized Bending and Torsional Soft Actuators for Robotic Catheters |
|
Lee, Kyungjoon | University of California Riverside |
Vu, Steven | University of California, Riverside |
Manian, Vinesh | University of California Riverside |
Afsharinejad, Parmida | University of California Riverside |
Sevic, Sophia | UCI |
Sheng, Jun | University of California Riverside |
Keywords: Hydraulic/Pneumatic Actuators, Mechanism Design in Meso, Micro and Nano Scale, Medical Robots and Systems
Abstract: In this paper, we present our progress towards a soft robotic catheter consisting of a flexible shaft and a dexterous tip made of two soft hydraulic actuators capable of bending and torsion, respectively. By coordinating the motion of the two soft actuators with the axial translation of the catheter, the tip can be articulated in 3D space. Each actuator is a continuum body made of hyperelastic silicone with anisotropy induced by fiber and fabric reinforcement. By regulating the water flow into each soft actuator, the motion of each actuator can be controlled. This paper presents our methods for scaling down the design and fabrication of soft actuators to the size required by minimally invasive surgery. Experimental characterization shows that the motion range is about 60° to 381° for the bending actuator and about 437° for the torsion actuator. We are refining the fabrication method and combining the bending and torsional actuators to develop a steerable catheter. Experiments will be conducted to demonstrate its applicability in endovascular surgical procedures.
|
|
15:21-15:39, Paper WeCT4.3 | |
Fine Teleoperation of an Anthropomorphic Robotic Hand by Combining Joint-Cartesian Space |
|
Wang, Ruize | Zhejiang University |
Bao, Yangbin | Zhejiang University |
Su, Yuan | Northeastern University |
Xie, Yuwei | Zhejiang University |
Xu, Peisen | Zhejiang University |
Deng, Yongsheng | Northeastern University, China |
Ye, Qi | Zhejiang University |
Li, Gaofeng | Zhejiang University |
Chen, Jiming | Zhejiang University |
Keywords: Telerobotics and Teleoperation
Abstract: Mapping human hand motion to an anthropomorphic robotic hand is crucial for dexterous hand teleoperation. However, numerous challenges remain due to the high Degrees-of-Freedom (DoF) of human hands and mismatches in sizes and kinematic configurations between human and robotic hands. In recent years, various mapping methods have been proposed to address these challenges, particularly hybrid mapping, which combines the advantages of Cartesian and joint mapping. Although many studies have focused on hybrid mapping, they primarily conducted simulation experiments and did not consider the substantial gaps between simulated and real hands. In this paper, we design a hybrid mapping framework combining proposed mapping methods and extend it from simulation to reality. Specifically, to bridge the gap between human and robotic hands, we propose a novel Cartesian mapping method based on an auxiliary frame to compensate for errors. In the experiment, we utilize a highly under-actuated exoskeleton to capture the movement of the human hand and apply the hybrid mapping method in fine dual-arm teleoperation tasks. The experimental results demonstrate that the hybrid mapping method based on our proposed Cartesian mapping method exhibits outstanding performance in the fine and dexterous teleoperation tasks.
|
|
15:39-15:57, Paper WeCT4.4 | |
Novel Gaits for Snake Robot Navigation in Complex External Pipe Networks |
|
Karumanchi, Karthik | Carnegie Mellon University |
Pellegrini, Sylvain | EPFL |
Orekhov, Andrew | Carnegie Mellon University |
Gu, Yizhu | Carnegie Mellon University |
Boirum, Ralph | Carnegie Mellon University |
Vundurthy, Bhaskar | Carnegie Mellon University |
Choset, Howie | Carnegie Mellon University |
Keywords: Motion Control, Industrial and Service Robotics, Biomimetics
Abstract: This paper introduces novel locomotion strategies for snake robots navigating complex external pipe networks, addressing challenges beyond simple pipe traversal. Existing gait-based control methods struggle with obstacles like valves and complex junctions such as T-shaped intersections. To overcome these limitations, we propose two new gaits: the spiraling gait and the windowed rolling helix gait. The spiraling gait enables autonomous obstacle avoidance by rotating the robot in a helical motion around the pipe axis, effectively navigating around obstructions. The windowed rolling helix gait facilitates seamless traversal of T-shaped junctions by allowing the robot to selectively extend and shift segments, enabling transitions between pipe branches with minimal user intervention. Experimental results demonstrate the effectiveness of these gaits in navigating diverse external pipe features, showcasing improved adaptability and autonomy compared to existing methods. This work advances snake robot capabilities for robust inspection and maintenance in complex industrial environments.
|
|
15:57-16:15, Paper WeCT4.5 | |
Vibration Vanquished: Enhancing Grasping of Deformable Objects with Jet Gripper Technology |
|
Mykhailyshyn, Roman | National Institute of Advanced Industrial Science and Technology |
Romancik, Jaroslav | Technical University of Kosice, Faculty of Mechanical Engineerin |
Harada, Kensuke | Osaka University |
Majewicz Fey, Ann | University of Texas at Austin |
Keywords: Hydraulic/Pneumatic Actuators, Manipulation Planning, Robust Manufacturing
Abstract: Effective grasping and manipulation of deformable objects remains a challenge for most manufacturing applications. However, advances in gripper technology are moving the automation of this process closer to reality. Despite their potential, our previously developed state-of-the-art gripper and other jet grippers have the disadvantage of vibrating the deformable object during grasping. This results in a reduction of frictional force between the gripper and the deformable object, thereby impairing the ability to effectively perform manipulation. In this paper, we successfully eliminated vibrations during the grasping and manipulation of textile objects by introducing a novel anti-vibration grid that redirects airflow away from the object. Additionally, a comprehensive parametric study led to a significant increase in the gripper’s force characteristics. Future work will focus on advancing dexterous manipulation and integrating the gripper with other actuators for improved performance and efficiency.
|
|
WeCT5 |
Room T5 |
Optimization and Learning 2 |
Regular Session |
Chair: Yan, Hao | Arizona State University |
|
14:45-15:03, Paper WeCT5.1 | |
Unified Self-Supervised Representation Learning for Multi-Modal and Single-Modal 3D Perception |
|
Xu, Xiaohao | University of Michigan, Ann Arbor |
Li, Ye | University of Michigan |
Zhang, Tianyi | Carnegie Mellon University |
Yang, Jinrong | Huazhong University of Science and Technology |
Johnson-Roberson, Matthew | Carnegie Mellon University |
Huang, Xiaonan | University of Michigan |
Keywords: AI-Based Methods, Sensor Fusion, Machine learning
Abstract: Constructing large-scale labeled datasets for multi-modal perception in autonomous driving is challenging, driving the need for self-supervised pretraining. However, existing methods typically adopt separate pretraining strategies for each modality. Inspired by Neural Radiance Field (NeRF), which unifies multi-modal optimization, we propose NeRF-Supervised Masked Autoencoder (NS-MAE), a unified pretraining strategy for LiDAR-Camera 3D perception. NS-MAE encodes both appearance and geometry, enabling efficient masked reconstruction across modalities. Specifically, it extracts embeddings from corrupted LiDAR point clouds and images, conditioned on view direction and location. These embeddings are then rendered into multi-modal feature maps from two key perspectives for autonomous driving: perspective view and bird’s-eye view (BEV). The original uncorrupted data serve as reconstruction targets, facilitating self-supervised learning. Extensive experiments demonstrate NS-MAE’s superior transferability across various 3D perception tasks. Notably, NS-MAE outperforms some prior SOTA pretraining methods which rely on separate modality-specific strategies, particularly in BEV map segmentation under label-efficient fine-tuning. Our code is publicly available at https://github.com/Xiaohao-Xu/Unified-Pretrain-AD/.
|
|
15:03-15:21, Paper WeCT5.2 | |
Probabilistic Kolmogorov-Arnold Networks Via Sparsified Deep Gaussian Processes with Additive Kernels |
|
Zou, Qing | Xi'an Jiaotong University |
Yan, Hao | Arizona State University |
Keywords: Machine learning, Probability and Statistical Methods, AI-Based Methods
Abstract: In this paper, we propose a new probabilistic formulation of the Kolmogorov-Arnold network (KAN) model with a sparsity constraint. To achieve this, we first prove that replacing each edge of the KAN model with a Gaussian process is equivalent to a deep Gaussian process model with an additive kernel structure. To achieve a sparse network architecture similar to that of the KAN model, we incorporate a spike-and-slab prior distribution into the proposed model to learn the sparse network structure, aiming to reduce unnecessary network parameters while keeping the model accuracy unchanged. In addition, an efficient variational inference algorithm is proposed and validated in a simulation study and on several benchmark datasets.
|
|
15:21-15:39, Paper WeCT5.3 | |
A Gossip-Based Approach for Measurement Task Allocation and Routing in Multi-Robot Systems with Heterogeneous Sensing |
|
Chakraa, Hamza | Université Le Havre Normandie |
Deplano, Diego | University of Cagliari |
Seatzu, Carla | University of Cagliari |
Lefebvre, Dimitri | University LE HAVRE |
Franceschelli, Mauro | University of Cagliari, Italy |
Keywords: Task Planning, Optimization and Optimal Control, Agent-Based Systems
Abstract: This paper presents a decentralized task allocation strategy for heterogeneous multi-robot systems to minimize makespan during mission execution. The approach leverages a Gossip-based consensus mechanism, where robots communicate and exchange task information to optimize task distribution. The problem is modelled as a Multi-Robot Task Allocation (MRTA) challenge with the objective of minimizing task completion time (makespan). The proposed heuristic algorithm operates by iteratively improving task sequences via local exchanges between robots. Simulations demonstrate the algorithm’s effectiveness in assigning tasks while considering various robot capabilities and environmental constraints, resulting in improved mission performance and reduced overall task completion time.
|
|
15:39-15:57, Paper WeCT5.4 | |
Automated Generation of Diverse Courses of Actions for Multi-Agent Operations Using Binary Optimization and Graph Learning |
|
Poddar, Prithvi | University at Buffalo |
Esfahani, Ehsan | University at Buffalo, State University of New York |
Dantu, Karthik | University of Buffalo |
Chowdhury, Souma | University at Buffalo, State University of New York |
Keywords: Task Planning, Reinforcement, Optimization and Optimal Control
Abstract: Operations in disaster response, search & rescue, and military missions that involve multiple agents demand automated processes to support the planning of the courses of action (COA). Moreover, traverse-affecting changes in the environment (rain, snow, blockades, etc.) may impact the expected performance of a COA, making it desirable to have a pool of COAs that are diverse in task distributions across agents. Further, variations in agent capabilities, which could be human crews and/or autonomous systems, present practical opportunities and computational challenges to the planning process. This paper presents a new theoretical formulation and computational framework to generate such diverse pools of COAs for operations with soft variations in agent-task compatibility. Key to the problem formulation is a graph abstraction of the task space and the pool of COAs itself to quantify its diversity. Formulating the COAs as a centralized multi-robot task allocation, a genetic algorithm is used for (order-ignoring) allocations of tasks to each agent that jointly maximize diversity within the COA pool and overall compatibility of the agent-task mappings. A graph neural network is trained using a policy gradient approach to then perform single agent task sequencing in each COA, which maximizes completion rates adaptive to task features. Our tests of the COA generation process in a simulated environment demonstrate significant performance gain over a random walk baseline, small optimality gap in task sequencing, and execution time of about 50 minutes to plan up to 20 COAs for 5 agent/100 task operations.
|
|
15:57-16:15, Paper WeCT5.5 | |
Multi-Stage Planning for Multi-Target Surveillance Using Aircrafts Equipped with Synthetic Aperture Radars Aware of Target Visibility |
|
Fuertes, Daniel | Universidad Politécnica De Madrid |
del-Blanco, Carlos R. | Universidad Politécnica De Madrid |
Jaureguizar, Fernando | Universidad Politécnica De Madrid |
Navarro Corcuera, Juan Jose | Airbus |
García, Narciso | Universidad Politécnica De Madrid |
Keywords: Deep Learning in Robotics and Automation, Machine learning, Motion and Path Planning
Abstract: Generating trajectories for synthetic aperture radar (SAR)-equipped aircraft poses significant challenges due to terrain constraints, and the need for straight-flight segments to ensure high-quality imaging. Related works usually focus on trajectory optimization for predefined straight-flight segments that do not adapt to the target visibility, which depends on the 3D terrain and aircraft orientation. In addition, this assumption does not scale well for the multi-target problem, where multiple straight-flight segments that maximize target visibility must be defined for real-time operations. For this purpose, this paper presents a multi-stage planning system. First, the waypoint sequencing to visit all the targets is estimated. Second, straight-flight segments maximizing target visibility according to the 3D terrain are predicted using a novel neural network trained with deep reinforcement learning. Finally, the segments are connected to create a trajectory via optimization that imposes 3D Dubins curves. Evaluations demonstrate the robustness of the system for SAR missions since it ensures high-quality multi-target SAR image acquisition aware of 3D terrain and target visibility, and real-time performance.
|
|
WeCT6 |
Room T6 |
Smart Logistics, Manufacturing and Healthcare 2 |
Special Session |
Chair: Mahulea, Cristian | Universidad De Zaragoza |
Organizer: Fanti, Maria Pia | Politecnico Di Bari |
Organizer: Mahulea, Cristian | Universidad De Zaragoza |
Organizer: Mangini, Agostino Marcello | Politecnico Di Bari |
Organizer: Roccotelli, Michele | Polytechnic University of Bari |
Organizer: Vogel-Heuser, Birgit | Technical University Munich |
Organizer: Zhou, MengChu | New Jersey Institute of Technology |
|
14:45-15:03, Paper WeCT6.1 | |
A Deep Reinforcement Learning Control Framework for Medical Image Augmentation (I) |
|
Wrona, Andrea | Sapienza University of Rome |
Baldisseri, Federico | Sapienza University of Rome |
Liberati, Francesco | Sapienza University of Rome |
Menegatti, Danilo | Sapienza University of Rome |
Delli Priscoli, Francesco | Sapienza University of Rome |
Vendittelli, Marilena | Sapienza University of Rome |
Keywords: AI and Machine Learning in Healthcare, Clinical and Operational Decision Support, Modelling, Simulation and Optimization in Healthcare
Abstract: Data augmentation is essential for handling limited and unbalanced medical datasets in classification tasks. Traditional methods, such as noise injection, random transformations, and generative adversarial networks, enhance classifier accuracy but lack intrinsic optimization for classification metrics. This work proposes a Deep Reinforcement Learning-based framework that autonomously selects and optimizes geometric/color transformations for multi-class medical image classification. Experiments on three medical datasets show that the proposed approach outperforms state-of-the-art augmentation techniques, improving classification accuracy on unseen images by over 13%.
|
|
15:03-15:21, Paper WeCT6.2 | |
Skill Orchestration Agent: A Knowledge-Driven Orchestration Framework for Adaptive Manufacturing Control (I) |
|
Lober, Andreas | University of Applied Science Ulm |
Weber, Jakob | Technische Hochschule Ulm |
Baumgaertel, Hartwig | Technische Hochschule Ulm |
Ollinger, Lisa | Ulm University of Applied Sciences |
Keywords: Intelligent and Flexible Manufacturing, Agent-Based Systems, AI-Based Methods
Abstract: The Skill Orchestration Agent (SkillOA) introduces a modular, distributed approach to manufacturing control, enhancing flexibility beyond traditional programmable logic controllers (PLCs). It is capable of determining and executing ad-hoc orchestrations of skills—representing manufacturing functions—by combining two sub-areas of AI: semantic knowledge graphs and multi-agent systems. By decomposing production orders into executable skills, the SkillOA concept enables reconfiguration and efficient resource utilization during operative processes. A core component is its semantic knowledge graph, which dynamically determines optimal skill sequences, reducing engineering complexity and system downtime. The queue-based execution model prioritizes service requests, ensuring adaptability in high-mix, low-volume production. The integration of parallel and asynchronous execution strategies enhances process efficiency but also introduces system complexity, requiring robust synchronization mechanisms. Challenges include managing execution dependencies, ensuring interoperability across automation architectures, and refining error-handling mechanisms. The presented concept represents a significant step toward autonomous, reconfigurable manufacturing, aligning with Industry 4.0 principles.
|
|
15:21-15:39, Paper WeCT6.3 | |
An Artificial Intelligence Approach to Manage Vehicles Motorway Entries in Congested Traffic (I) |
|
Salcuni, Antonio | Polytechnic University of Bari |
Volpe, Gaetano | Politecnico Di Bari |
Mangini, Agostino Marcello | Politecnico Di Bari |
Fanti, Maria Pia | Politecnico Di Bari |
Keywords: Intelligent Transportation Systems, Deep Learning in Robotics and Automation, AI-Based Methods
Abstract: Efficient traffic management in high-density urban areas is a critical challenge, specifically in highway entry ramps and merging points. This paper deals with the problem of managing vehicles that have to enter highway ramps and intersections. The problem is formulated as an optimization task to minimize sudden braking, waiting times, and collision risks. A Deep Reinforcement Learning approach based on the Actor-Critic framework is proposed to train the intelligent traffic light agents managing the vehicle entries. The reward function is designed to dynamically balance safety and efficiency by reducing braking events and congestion. A simulation campaign is performed in a high-density merging scenario located in central-northern Italy at the intersection between the A1 Highway and the A14 Motorway. The results demonstrate the benefits obtained in reduced traffic flow and increased safety.
|
|
15:39-15:57, Paper WeCT6.4 | |
State Opacity Via Structural Observability in Continuous Petri Nets (I) |
|
Mahulea, Cristian | Universidad De Zaragoza |
Keywords: Petri Nets for Automation Control, Formal Methods in Robotics and Automation, Diagnosis and Prognostics
Abstract: This paper investigates state-opacity in discrete Petri nets through structural observability derived from continuous Petri nets. We provide structural conditions ensuring or violating initial-state and current-state opacity, leveraging partial measurements of markings. We illustrate our methodology through a healthcare application involving the confidentiality of patient decisions in anterior cruciate ligament (ACL) treatments.
|
|
15:57-16:15, Paper WeCT6.5 | |
Design Patterns for Multi-Agent Systems in Intelligent Mobile Construction Machines (I) |
|
Hujo-Lauer, Dominik | Technical University of Munich |
Krüger, Marius | Technical University of Munich |
Land, Kathrin Sophie | Technical University of Munich |
Sack, Richard | Technical University of Munich |
Prinz, Theresa | Technical University of Munich, TUM School of Engineering and De |
Cha, Suhyun | HAWE Hydraulik SE |
Waterman, Daniel | HAWE Hydraulik SE |
Kerausch, Cornelia | BAUER Maschinen GmbH |
Vogel-Heuser, Birgit | Technical University Munich |
Keywords: Agent-Based Systems, Control Architectures and Programming, Human-Centered Automation
Abstract: Mobile construction machines are often challenging to operate and require experienced operators to master their use effectively. Training new drivers demands significant time and financial resources. Operator expertise is particularly crucial in handling free-swinging tools, e.g., those used to create diaphragm walls or cranes. AI-based assistance systems have been developed for this application to reduce costs through efficient operation, even by novice operators. However, integrating such assistance systems into existing control architectures poses substantial challenges, as AI applications must be coupled with safety-critical systems. This paper proposes a design pattern for a Multi-Agent System to facilitate the seamless integration of novel technologies into traditional control architectures of mobile construction machines and partially automate this integration process. The Multi-Agent System approach enables standardized and efficient addition of assistance functions, ensuring timeliness and safety. These assistance functions were developed in a real industrial application, and the Multi-Agent System was evaluated on a Hardware-in-the-Loop test bench. One main contribution of this approach is that parts of the Agent System were tested on a real mobile hydraulic machine with an experienced operator.
|
|
WeCT7 |
Room T7 |
Automation Applications 1 |
Regular Session |
Chair: Gross, Jason | West Virginia University |
|
14:45-15:03, Paper WeCT7.1 | |
Robust Flower Cluster Matching Using the Unscented Transform |
|
Chu, Andy | West Virginia University |
Shrestha, Rashik | West Virginia University |
Gu, Yu | West Virginia University |
Gross, Jason | West Virginia University |
Keywords: Agricultural Automation, Computer Vision in Automation, Autonomous Agents
Abstract: Monitoring flowers over time is essential for precision robotic pollination in agriculture. To accomplish this, continuous spatial-temporal observation of plant growth can be performed using stationary RGB-D cameras. However, image registration becomes a serious challenge due to changes in the visual appearance of the plant caused by the pollination process and occlusions from growth and camera angles. Plants flower in a manner that produces distinct clusters on branches. This paper presents a method for matching flower clusters using descriptors generated from RGB-D data while allowing for spatial uncertainty within the cluster. The proposed approach leverages the Unscented Transform to efficiently estimate plant descriptor uncertainty tolerances, enabling a robust image-registration process despite temporal changes. The Unscented Transform is used to handle the nonlinear transformations by propagating the uncertainty of flower positions to determine the variations in the descriptor domain. A Monte Carlo simulation is used to validate the Unscented Transform results, confirming our method’s effectiveness for flower cluster matching. The method can therefore facilitate improved robotic pollination in dynamic environments.
|
|
15:03-15:21, Paper WeCT7.2 | |
Automated and Robust Phased Array Ultrasonic Testing (PAUT) for Weld Inspection with Seam Identification and Tracking in Large Storage Tanks |
|
Li, Minghui | University of Glasgow |
Keywords: Deep Learning in Robotics and Automation, Computer Vision in Automation, Computer Vision for Manufacturing
Abstract: Phased Array Ultrasonic Testing (PAUT) integrated with an autonomous robot crawler presents a promising solution for the non-destructive evaluation (NDE) of welds in large storage tanks. The crawler can be programmed to navigate along the tank surface, which automates the testing process and accelerates the scanning speed. However, most traditional PAUT crawlers lack the capability to automatically identify and track weld seams, which may result in path deviations and compromised inspection. This paper addresses this challenge by proposing a feasible and cost-effective solution that leverages a convolutional neural network (CNN) for weld seam detection, an angle differentiation algorithm for path tracking, and a controller for real-time alignment. A prototype robot car, equipped with a Raspberry Pi and a camera, is developed to validate the approach. Experimental results show that the system achieves over 80% accuracy in weld seam identification. The robot car successfully detects, tracks, and realigns itself to the weld seam even after displacement, validating the proof of concept of integrating automatic weld tracking capabilities into PAUT robot crawlers for robust NDE scanning.
|
|
15:21-15:39, Paper WeCT7.3 | |
Machine Learning for Robotic Accuracy Improvement in Drilling Operations |
|
Moore, James | University of Sheffield Advanced Manufacturing Research Centre |
Burkinshaw, Christopher | University of Sheffield AMRC |
Sawyer, Daniela | University of Sheffield - Advanced Manufacturing Research Centre |
Keywords: AI-Based Methods, Industrial and Service Robotics, Robust Manufacturing
Abstract: Drilling of rivet/fastener holes in aircraft presents a major manufacturing challenge where manual processes are heavily relied upon. It is estimated that modern aircraft can require upwards of 1.5 million holes to be drilled using methods that involve some form of manual input. This introduces concerns over both hole accuracy and worker wellbeing. Industrial robotic arms offer a potentially promising solution due to their reach and flexibility. However, limitations in their positional accuracy can be a barrier. This paper presents an open-loop methodology to address these limitations by improving the positional accuracy of a robotic drilling platform using Gaussian process regression (GPR) models, without the need for permanently installing costly metrology equipment, such as laser trackers or secondary encoders. The models demonstrated an average reduction in the positioning error of the platform from 0.993 mm down to 0.022 mm (97.7%) in x, and from 0.209 mm down to 0.055 mm (73.5%) in y in free air. This methodology is then demonstrated on physical drilling trials, where the average hole position error was reduced from 0.688 mm to 0.323 mm (53.0%) in x. However, due to limitations in the training of the models, the error in y increased from 0.261 mm to 0.378 mm (45.1%). Despite these results being less successful, it is intended that they serve as a baseline for future development of the methodology so that it can include the effects of process (drilling) forces.
|
|
15:39-15:57, Paper WeCT7.4 | |
Underwater Pipe Detection Using Passive Magnetic Sensing |
|
Lewis, Ryan | University of Houston |
Garcia Gonzalez, Javier | University of Houston |
Julien, Leclerc | University of Houston |
Keywords: Industrial and Service Robotics, Collision Avoidance, Sensor-based Control
Abstract: The detection and tracking of underwater assets can be challenging in high turbidity waters. Solutions based on magnetic sensing are promising for the detection of metallic objects. These typically rely on a magnetic field source (permanent magnet or electromagnet) to magnetize surrounding ferromagnetic objects and increase their magnetic presence so that these objects can be detected by the sensing electronics. Electromagnets have the disadvantage of requiring large amounts of power. Permanent magnets can disrupt and/or oversaturate onboard sensors and are attracted by ferromagnetic objects, increasing the risk of collision. To solve this issue, this paper presents a new device that uses a passive method of magnetic sensing for detecting ferrous objects underwater. The device measures the minute distortions of the Earth’s magnetic field (geomagnetic field) caused by ferromagnetic objects while filtering out the overall effects of the much stronger geomagnetic field itself. A prototype was built using a linear array of off-the-shelf 3-axis magnetometers, mounted on an underwater Remotely Operated Vehicle (ROV), then used to detect and map a vertically-mounted steel pipe in a pool.
|
|
15:57-16:15, Paper WeCT7.5 | |
Development-Agile and Accurate Pig Size Measurement with Edge-AI Devices for Livestock Industries |
|
Lin, Yu-Chuan | Chinese Culture University |
Chen, Yi-Han | National Cheng Kung University |
Hung, Min-Hsiung | Chinese Culture University |
Chen, Chao-Chun | National Cheng Kung University |
Keywords: Agricultural Automation, Deep Learning in Robotics and Automation
Abstract: Periodical nursing tasks, such as body length measurement, are critical in managing the health quality of livestock. In this paper, we propose a keypoint-based measurement-via-prior (MVP) method that accurately estimates pig size on popular camera-embedded edge-computing devices, with sufficient development agility to be promoted to livestock industries. To achieve development agility and avoid expensive dedicated imaging devices, our method leverages deepnets, whose complexity can be adjusted to the resources of the edge devices, to identify the pig keypoints that define the body length for the subsequent size estimation. The MVP method is a two-stage length estimation requiring only lightweight matrix computation: the first stage roughly estimates the body length by referencing a predefined mark (such as a fence); the second refines the size estimate with a calibration parameter obtained by solving a constrained optimization problem on the training dataset. Note that the keypoint-identifying deepnet and the calibration parameter can be obtained in advance on a cloud server, so that edge devices with only two easily maintainable modules perform inference within an acceptable time. We conduct real-world experiments to validate our proposed method. The visualization results show that the accurate length estimation comes from the precise keypoint identification. The numerical results obtained using the Raspberry Pi 4 device show that our proposed method indeed has higher accuracy than related methods within an acceptable execution time, supporting that the proposed method is suitable for most small and medium farms.
|
|
WeCT8 |
Room T8 |
AI-Driven Scheduling and Optimization 1 |
Special Session |
Chair: Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Organizer: Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Organizer: Lu, Yuqian | The University of Auckland |
Organizer: Shen, Weiming | Huazhong University of Science and Technology |
Organizer: Li, Xinyu | Huazhong University of Science and Technology |
|
14:45-15:03, Paper WeCT8.1 | |
Resource-Constrained Parallel Machine Scheduling for Tire Manufacturing (I) |
|
Kim, Hong-Yeon | KAIST(Korea Advanced Institute of Science and Technology) |
Shin, Woo-Jin | Korea Advanced Institute of Science and Technology |
Joo, Sanghyun | Korea Advanced Institute of Science and Technology(KAIST) |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Planning, Scheduling and Coordination
Abstract: This study aims to optimize the scheduling process in tire manufacturing, with a particular focus on the scheduling of tread, one of the key tire components. The problem is formulated as a resource-constrained parallel machine scheduling problem, with the objective of minimizing total tardiness and setup time. One of the main challenges in this problem is managing carts, a renewable resource required from the production of tread through to the completion of tire assembly. After the assembly process is completed, the carts are released for reuse in tread production, further complicating the scheduling process. To tackle this NP-hard problem, we propose a novel iterated greedy algorithm incorporating the block-split method in the construction phase. This effectively reduces the solution space and facilitates escaping from local optima, as the block size is dynamically adjusted during the construction phase. In the experiments, we validate the proposed methodology by comparing it with constraint programming, demonstrating that the proposed algorithm guarantees optimal performance for small-sized instances. Furthermore, the proposed method outperforms other heuristic approaches, particularly in solving large-scale problems. Currently, this algorithm is being implemented in a real-world tire manufacturing plant operated by one of the largest companies in South Korea.
|
|
15:03-15:21, Paper WeCT8.2 | |
Multiple Orders Remaining Completion Time Prediction Based on Digital Twin and Spatiotemporal Multi-Graph Network Cascade Algorithm (I) |
|
Yu, Haiwen | Shanghai Jiao Tong University |
Zhong, Jingshu | Shanghai Jiaotong University |
Chen, Liang | Shanghai Jiao Tong University |
Zheng, Yu | Shanghai Jiao Tong University |
Keywords: AI-Based Methods, Planning, Scheduling and Coordination, Big-Data and Data Mining
Abstract: In discrete manufacturing systems, the concurrent production of customized orders increases the complexity of multi-order remaining completion time (MORCT) prediction. Most existing methods rely on empirical estimates of production capacity, ignoring production planning constraints and failing to incorporate real-time manufacturing performance, resulting in limited prediction accuracy. To address this, a digital twin (DT)-based MORCT prediction framework is proposed. It constructs spatiotemporal datasets of production tasks and performance from the DT and introduces a spatiotemporal multi-graph cascade algorithm (STMG) for prediction. The algorithm first embeds data into two types of graph models, which contain manufacturing system information and order-task scheduling information, then extracts real-time performance features across multiple manufacturing units (MUs), and finally applies heterogeneous graph attention to model the embedded features of order nodes in the order-task-MU graph for the final MORCT prediction. The experimental results show that the STMG model can accurately predict MORCT and is suitable for parallel production scenarios with multiple orders. It outperforms existing methods in terms of loss and adaptability.
|
|
15:21-15:39, Paper WeCT8.3 | |
Deep Reinforcement Learning for the Container Retrieval Problem (I) |
|
Shin, Woo-Jin | Korea Advanced Institute of Science and Technology |
Jung, Ji-Kwang | Korea Advanced Institute of Science and Technology |
Cho, Sang-Hyun | Korea Advanced Institute of Science and Technology |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Planning, Scheduling and Coordination, AI-Based Methods, Intelligent Transportation Systems
Abstract: This study addresses the container retrieval problem (CRP) arising in automated container terminals. Containers are stacked in the storage yard, and to retrieve a container from below, the containers above it must be relocated. The CRP aims to generate an efficient retrieval plan for containers based on their given retrieval order, while minimizing the yard crane’s total working time. Due to the NP-hard nature of the problem, finding optimal solutions within practical time limits is challenging. To overcome this, we propose a deep reinforcement learning approach that generates near-optimal retrieval plans in a very short time. Our approach, based on an attention mechanism, is size-agnostic and adapts effectively to varying sizes of yard layouts. Through experiments on benchmark data, we have verified that our approach significantly outperforms existing methodologies in the literature. Furthermore, its strong performance across various problem scales demonstrates its high practicality for real-world applications.
|
|
15:39-15:57, Paper WeCT8.4 | |
Rule Induction for Minimizing Total Tardiness in Parallel Machine Scheduling with Sequence-Dependent Setup Times (I) |
|
Lee, Cheolwoo | Dongguk University |
Jun, Sungbum | Dongguk University |
Keywords: Manufacturing, Maintenance and Supply Chains, Machine learning
Abstract: The parallel machine scheduling problem involves assigning multiple jobs to identical machines without preemption. Traditional approaches struggle with increasing computational complexity as problem sizes grow. This study introduces a decision tree-based dispatching rule optimization method that leverages Time Bucketing, Rolling Horizon, and pairwise comparisons to enhance scheduling efficiency.
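The objective named in the paper title, total tardiness under sequence-dependent setup times, can be evaluated for a fixed schedule as in the minimal sketch below. The function name, the dictionary-based data layout, and the example values are assumptions made for illustration; the abstract does not describe the authors' data structures or their decision tree-based rule.

```python
from typing import Dict, List, Optional, Tuple


def total_tardiness(
    machine_sequences: List[List[str]],
    processing: Dict[str, float],
    due: Dict[str, float],
    setup: Dict[Tuple[str, str], float],
) -> float:
    """Sum of max(0, completion - due date) over all jobs.

    machine_sequences[m] is the ordered job list of machine m (hypothetical layout).
    setup[(a, b)] is the setup time incurred when job b directly follows job a;
    missing pairs are treated as zero setup (an assumption of this sketch).
    """
    total = 0.0
    for sequence in machine_sequences:
        clock = 0.0
        previous: Optional[str] = None
        for job in sequence:
            if previous is not None:
                clock += setup.get((previous, job), 0.0)  # sequence-dependent setup
            clock += processing[job]                      # no preemption
            total += max(0.0, clock - due[job])           # tardiness of this job
            previous = job
    return total


processing = {"A": 3.0, "B": 2.0, "C": 4.0}
due = {"A": 2.0, "B": 7.0, "C": 4.0}
setup = {("A", "B"): 1.0, ("B", "A"): 2.0}

schedule_1 = total_tardiness([["A", "B"], ["C"]], processing, due, setup)
schedule_2 = total_tardiness([["B", "A"], ["C"]], processing, due, setup)
```

A dispatching rule of any kind decides the job order on each machine; an evaluator like this one only scores the resulting schedule, so it can compare orders produced by different rules.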
|
|
15:57-16:15, Paper WeCT8.5 | |
Adaptive Self-Supervised Learning for Solving Flow Shop Scheduling Problems (I) |
|
Choi, Inguk | Korea Advanced Institute of Science and Technology |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Planning, Scheduling and Coordination, Autonomous Agents, AI-Based Methods
Abstract: We propose a novel adaptive self-supervised learning (SSL)-based autonomous scheduling agent to address flow shop scheduling problems (FSSP), a critical production scheduling challenge frequently encountered in diverse manufacturing environments. In contrast to conventional reinforcement learning (RL) methods that rely on sparse reward signals, our approach generates multiple candidate schedules for each instance and learns by imitating the best-performing schedule as a pseudo-label, without requiring external feedback. To enhance learning effectiveness, we modify the cross-entropy loss function to adaptively control imitation intensity. We evaluate our method on both permutation flow shop problems (PFSP) and flexible flow shop problems (FFSP). Experimental results demonstrate that our approach outperforms conventional heuristic algorithms and learning-based methods.
|
|
WeCT9 |
Room T9 |
Learning and Control 2 |
Regular Session |
Chair: Zhang, Qiong | Clemson University |
|
14:45-15:03, Paper WeCT9.1 | |
Gaussian Process-Driven Predictive Control for Robot Re-Orientation in Interaction Tasks |
|
Roveda, Loris | SUPSI-IDSIA |
Keywords: Model Learning for Control, Optimization and Optimal Control, Compliance and Impedance Control
Abstract: To address the issue of proper robot tool orientation in interaction tasks (e.g., for screwing and assembly tasks), this paper proposes a Gaussian Process-driven predictive controller (GPDPC). The GPDPC allows for online alignment of the robot tool orientation w.r.t. the main task direction. The proposed method uses the end-effector wrench measurements/estimations only. The GPDPC exploits a Gaussian Process (GP) to model a cost function providing information about the correct alignment of the robot tool w.r.t. the main task direction. The predictive controller then uses the GP online to compute the optimal rotational trajectory for the re-orientation of the robot tool. The GP estimated variance is used to guide the robot safely. Derivation of the controller is presented, together with simulation (considering a probing task) and experimental results showing the achieved performance. The GPDPC performance is compared with a state-of-the-art method to show its superiority.
|
|
15:03-15:21, Paper WeCT9.2 | |
Bayesian Optimization with Transformed Additive Gaussian Processes |
|
Kerfonta, Caroline | Clemson University |
Zhang, Qiong | Clemson University |
Keywords: Machine learning, Probability and Statistical Methods, Optimization and Optimal Control
Abstract: Bayesian optimization is commonly used to optimize black-box functions associated with simulations in ground vehicle systems. Bayesian optimization contains two essential components: the acquisition function and the statistical surrogate model. In each step of Bayesian optimization, an input point is selected for a new simulation run by optimizing the acquisition function computed under the assumptions of the surrogate statistical model. For higher-dimensional black-box functions, optimizing the acquisition function in each step can be challenging. To address this challenge, we propose to use transformed additive Gaussian processes as the surrogate statistical model. This allows us to simplify the optimization of the acquisition function by decomposing the objective function into additive one-dimensional problems. The acquisition function is then optimized for each dimension, and the new points selected from each dimension are joined together to create the new input point. We evaluate the performance of the proposed method through numerical examples by comparing with classical Bayesian optimization with expected improvement.
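The decomposition step described in the abstract (optimize one dimension at a time, then join the per-dimension optima into one input point) can be illustrated with the minimal sketch below. It assumes the function to be maximized is already available as a sum of one-dimensional components, and it uses a plain grid search; the function name, the grid search, and the toy components are illustrative assumptions, not the authors' transformed additive Gaussian process model or their acquisition function.

```python
from typing import Callable, List, Sequence, Tuple


def maximize_additive(
    components: Sequence[Callable[[float], float]],
    bounds: Sequence[Tuple[float, float]],
    grid_size: int = 201,
) -> List[float]:
    """Maximize sum_d components[d](x_d) by solving each 1-D problem separately.

    Because the sum is separable, the joint maximizer is the vector of
    per-dimension maximizers, so the search cost grows linearly with the
    number of dimensions instead of exponentially.
    """
    point = []
    for f, (lo, hi) in zip(components, bounds):
        step = (hi - lo) / (grid_size - 1)
        grid = [lo + i * step for i in range(grid_size)]
        point.append(max(grid, key=f))  # best grid value for this dimension
    return point


# Toy one-dimensional components standing in for per-dimension acquisition terms.
components = [lambda x: -(x - 0.5) ** 2, lambda x: -(x + 1.0) ** 2]
bounds = [(0.0, 1.0), (-2.0, 2.0)]
new_point = maximize_additive(components, bounds)
```

With 201 grid values per dimension, the separable search evaluates 402 candidates here, whereas a joint grid of the same resolution would need 201 × 201.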
|
|
15:21-15:39, Paper WeCT9.3 | |
Modified YOLOv8 for Real-Time Multi-Class Uncooperative Aerial Vehicle Detection and Classification |
|
Varma, Pavan | Indian Institute of Technology Kanpur |
Singh, Kripesh | Indian Institute of Technology Kanpur |
Das, Arya | Indian Institute of Technology, Kanpur |
Dipak, Giri | IIT Kanpur |
Dwivedi, Prasiddha Nath | Defence Research and Development Organisation |
Keywords: Surveillance Systems, Machine learning, Deep Learning in Robotics and Automation
Abstract: With the rapid increase in UAVs (fixed-wing or rotor-based) for commercial and recreational applications, serious concerns arise regarding their misuse. Moreover, their small size complicates detection, especially distinguishing them from other aerial vehicles like airplanes, helicopters, and birds. This paper introduces an efficient model for UAV detection and classification from a distance using visual sensors. We enhance the YOLOv8 model by integrating a novel attention module called Multi Group Dual Attention Module (MGDAM) aimed at small-object detection. Additionally, a synthetic dataset is generated for multiple classes (drones, airplanes, helicopters, and birds) that closely resembles real-world camera-captured images. Unlike many studies focusing only on mAP@50, this paper emphasizes mAP50–95 to evaluate performance in detecting very small objects. To identify an optimal configuration, thorough ablation experiments were conducted involving different attention modules: Convolutional Block Attention Module (CBAM), Coordinate Attention, Shuffle Attention, Squeeze-and-Excitation, Efficient Channel Attention (ECA), and the proposed MGDAM. Experimental results show that the MGDAM-enhanced YOLOv8 model achieves a 0.4% gain in mAP50 and a 1.7% boost in mAP50–95 over the YOLOv8 baseline. Finally, real-world testing confirms robust UAV detection under challenging conditions (e.g., extreme lighting and clutter), demonstrating the method’s efficacy for improving aerial surveillance and security measures.
|
|
15:39-15:57, Paper WeCT9.4 | |
Mutation Testing of Programs for Industrial Robots |
|
Ameena, Ameena K Ashraf | International Institute of Information Technology |
D'Souza, Meenakshi | IIIT-Bangalore |
Keywords: Formal Methods in Robotics and Automation, Industrial and Service Robotics
Abstract: In today’s manufacturing environment, industrial robots play a pivotal role in automating tasks within a tightly integrated real-time platform. Given their interaction with humans and other robots, ensuring their safety is important. This necessitates rigorous testing and verification processes throughout the development life-cycle. We consider a mutation testing approach for evaluating the robustness of industrial robot programs, using mutations, or changes, at the unit, system, and task levels. By injecting deliberate changes into the program code and creating mutated programs, mutation testing evaluates the effectiveness of a test suite by simulating alternate behaviors through the mutated programs. This process detects deviations from expected behavior, thereby strengthening overall test coverage and quality assurance. We design a set of custom-built mutation operators for programs that manipulate industrial robots, including operators for move instructions that involve 3-D geometry, for changes to the velocity of robot arm movements, and for low-level synchronization constructs in multi-task programs.
|
|
15:57-16:15, Paper WeCT9.5 | |
Gimballed Rotor Mechanism for Omnidirectional Quadrotors |
|
Jann, Cristobal | Toronto Metropolitan University |
Zain Aldeen, Ahmad Ziad | Toronto Metropolitan University |
Izadi, Mohammadreza | Toronto Metropolitan University |
Faieghi, Reza | Toronto Metropolitan University |
Keywords: Actuation and Joint Mechanisms, Mechanism Design in Meso, Micro and Nano Scale
Abstract: This paper presents the design of a gimballed rotor mechanism as a modular and efficient solution for constructing omnidirectional quadrotors. Unlike conventional quadrotors, which are underactuated, this class of quadrotors achieves full actuation, enabling independent motion in all six degrees of freedom. While existing omnidirectional quadrotor designs often require significant structural modifications, the proposed gimballed rotor system maintains a lightweight and easy-to-integrate design by incorporating servo motors within the rotor platforms, allowing independent tilting of each rotor without major alterations to the central structure of a quadrotor. To accommodate this unconventional design, we develop a new control allocation scheme in PX4 Autopilot and present successful flight tests, validating the effectiveness of the proposed approach.
|
|
WeCT10 |
Room T10 |
Energy and Sustainability 3 |
Regular Session |
Chair: Yue, Xiaowei | Tsinghua University |
|
14:45-15:03, Paper WeCT10.1 | |
Enabling the Use of Environmentally-Sustainable Energy-Harvesting Sensors in the Smart Building Automation Domain |
|
Ramanathan, Ganesh | Siemens AG, Switzerland |
Mayer, Simon | University of St.Gallen and ETH Zurich |
Hess, Simon | Netcloud AG |
Gomez, Andres | TU Braunschweig |
Keywords: Sensor Networks, Automation Technologies for Smart Cities, Agent-Based Systems
Abstract: Low-power wireless sensors that use harvested energy (e.g., from light) do not require batteries and contribute significantly to higher environmental sustainability. However, the constrained and non-deterministic availability of ambient energy means that sensors and automation agents deployed to control and monitor physical environments must coordinate and collaborate at run time. Current methods to achieve this require extensive upfront engineering and are not adaptive to changes in energy availability. We propose an approach where low-power wireless sensors, functioning as agents in a multi-agent system, use their local knowledge to autonomously evaluate their functional capabilities and energy availability to decide about role adoption, both at run time. Knowledge about system organization and functional profiles that they require is disseminated in the network. Our evaluation, conducted in a real-life setting, demonstrates that this approach simplifies system design and leads to higher efficacy in fulfilling measurement roles. Our results furthermore show improved utilization of the energy-harvesting sensors, leading to increased usage of sustainably harvested energy while minimizing the use of battery-powered sensors in the system, thereby prolonging their lifetime.
|
|
15:03-15:21, Paper WeCT10.2 | |
Integrating Semantic Web Ontologies towards Achieving Automated Engineering of Controls in Smart Buildings |
|
Ramanathan, Ganesh | Siemens AG, Switzerland |
Mayer, Simon | University of St.Gallen and ETH Zurich |
Keywords: Building Automation, Automation Technologies for Smart Cities, Cognitive Automation
Abstract: The operation of large electro-mechanical systems in a building needs to be automated to manage physical processes in a controlled and coordinated manner. Currently, automation systems need to be manually engineered during their commissioning and when changes to requirements and design occur during their operation. In this article, we show that the main challenge in achieving automated engineering is that a unified system knowledge consisting of interlinked machine-understandable descriptions of requirements, system design, and the principles of the underlying physical processes is missing. We have investigated the conceptual bridging required to bring together this hitherto fragmented knowledge and have created a high-level ontology that helps achieve our desired unified view of the system. We evaluated our approach in a real-life building automation system. We show that it enables the automated selection of a reusable control program that matches the system requirements.
|
|
15:21-15:39, Paper WeCT10.3 | |
High-Precision Land Surface Temperature (LST) Mapping Using Co-Kriging for Urban Centers |
|
Bhat, Adnan Ilahi | Indian Institute of Technology Delhi |
N V, Neelima | Indian Institute of Technology Delhi |
Shriyam, Shaurya | IIT Delhi |
Mishra, Saroj Kanta | Indian Institute of Technology Delhi |
Keywords: Sustainability and Green Automation, Environment Monitoring and Management, Probability and Statistical Methods
Abstract: With the ongoing expansion of urban areas worldwide, addressing the resulting imbalances has become an urgent concern for city sustainability, especially in changing climatic conditions. The Urban Heat Island (UHI) effect is a prominent phenomenon in this context; Land Surface Temperature (LST) is widely recognized as a vital indicator for quantifying UHI intensity, providing essential data for researchers and policymakers. The research employs a universal co-kriging approach for mapping UHI, using LST as the primary variable for the urban center of Ahmedabad, India. The study integrates the Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Building Index (NDBI) as covariates, derived from 2024 Landsat-8 satellite imagery at a 30-meter spatial resolution, to create more precise and robust LST maps of the study area. The methodology demonstrates effective temperature prediction capabilities, with a coefficient of determination (R²) of 0.67 on average and a maximum value of 0.7. Results reveal a pronounced UHI effect, with urban centers reaching temperatures of 40°C, while non-urban areas maintain lower temperatures of approximately 30°C. The strong positive correlation between LST and NDBI (R = 0.65) and the negative correlation with NDVI (R = −0.54) highlight the significant impact of built environments and vegetation on urban thermal conditions. These findings provide quantitative evidence to support targeted urban greening initiatives and heat-adaptive planning strategies in rapidly urbanizing regions facing climate challenges.
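For reference, the two covariates named in the abstract are conventionally defined as normalized band ratios, as written out below. Here ρ denotes reflectance in the near-infrared (NIR), red, and shortwave-infrared (SWIR) bands; on Landsat-8 OLI these are commonly taken as bands 5, 4, and 6 respectively. The abstract does not state which bands or preprocessing the authors used, so that mapping is an assumption.

```latex
\[
\mathrm{NDVI} = \frac{\rho_{\mathrm{NIR}} - \rho_{\mathrm{Red}}}{\rho_{\mathrm{NIR}} + \rho_{\mathrm{Red}}},
\qquad
\mathrm{NDBI} = \frac{\rho_{\mathrm{SWIR}} - \rho_{\mathrm{NIR}}}{\rho_{\mathrm{SWIR}} + \rho_{\mathrm{NIR}}}
\]
```

Both indices lie in the interval [−1, 1]. Dense vegetation pushes NDVI toward positive values and built-up surfaces push NDBI toward positive values, which is consistent with the signs of the correlations with LST reported in the abstract.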
|
|
15:39-15:57, Paper WeCT10.4 | |
Energy-Efficient Activation Control of Laguerre Neural Network |
|
Hou, Chen | Peking University |
Keywords: Optimization and Optimal Control, Energy and Environment-aware Automation, AI-Based Methods
Abstract: The Laguerre Neural Network (LaNN) employs Laguerre polynomials (LPs) as the activation functions of its hidden neurons (HNs) to approximate nonlinear functions (NFs). From a statistical perspective, the more LPs are activated, the more LP-based activation functions (LPAFs) of the HNs operate, and thus the smaller the approximation error (AE) of the LaNN, but the higher the energy consumption, since activating any LP consumes energy. How to control the activation of LPs to achieve the optimal tradeoff between approximation accuracy and energy consumption therefore arises as an interesting issue. To address this issue, this paper first establishes an energy-constraint probability-based approximation-accuracy optimization-theoretical (EPAO) framework, which takes the probability of obtaining the minimum AE (MAE) as the objective and the energy consumption of the LaNN as the constraint, and then proposes an algorithm for the LaNN to maximize the probability of obtaining the MAE at an acceptable level of energy consumption. Experimental comparisons with existing methods verify its performance.
|
|
15:57-16:15, Paper WeCT10.5 | |
Detect As You Fly: Using Attention-Based YOLOv10 for Automatic Fault Detection of Transmission Line Insulators |
|
Li, Nuo | Tsinghua University |
Liang, Zhenglin | Tsinghua University |
Keywords: Computer Vision in Automation, Machine learning
Abstract: Insulator fault detection plays a critical role in ensuring the reliability of power transmission lines. An innovative and efficacious inspection strategy combines Unmanned Aerial Vehicles (UAVs) with deep learning models, notably the YOLO series, to automatically detect insulator status while in flight. To achieve this, building upon the YOLOv10 framework, this study integrates the Expectation Maximization Attention (EMA) module and the Space-to-Depth (SPD) layer to improve its network structure, thereby enhancing its ability to extract key information from input data and improving its robustness to small objects and blurred images while preserving efficiency. Specifically, the EMA modules are introduced to connect each detection head module with the neck network to strengthen the model’s attention during the output process. Furthermore, the SPD layers are incorporated before several convolution modules to convert the spatial dimension of the input into the depth dimension, thereby mitigating information loss and augmenting the model’s detection capability for small and low-resolution objects. Subsequently, this attention-based model is verified on a real-world dataset of insulator images. The results demonstrate its ability to improve recognition accuracy, achieving a mean average precision of 97.3%, while incurring only a 0.1 millisecond increase in processing time per image. This indicates that applying the attention-based model to UAV-based patrol strategies holds immense potential for high-precision and real-time fault detection, thus enabling an automated fault detection process.
|
|
WeDT1 |
Room T1 |
Image, Video and Vision 2 |
Regular Session |
Chair: De Los Rios Alatorre, Gustavo | ITESM |
|
16:30-16:48, Paper WeDT1.1 | |
Image Robotic Auto-Photographing System for Remote Visual Train Bogie Inspection: A Multi-Environment Gaze-Speed and 3D Clustering Approach |
|
Aoki, Nobuaki | Hitachi, Ltd |
Sakai, Ryo | Hitachi, Ltd |
Keywords: Industrial and Service Robotics, Human-Centered Automation, Motion and Path Planning
Abstract: We propose a multi-environment motion-teaching approach for a robotic auto-photographing system designed to replicate an inspector’s viewpoint during remote visual inspection of train bogies. The system utilizes a head-mounted device (HMD) to record the inspector’s gaze points and head pose, and then determines the robot’s camera pose based on them. Previous works extracted camera poses based on dense clusters of gaze points recorded by the HMD, assuming that the inspector’s gaze points are concentrated. However, this approach fails to capture large components, where the gaze naturally moves along the surface. Additionally, these methods require the environments for HMD data collection and robot operation to be identical, limiting inspections in locations inaccessible to human inspectors. To address these limitations, we propose a method that leverages gaze speed information in addition to a 3D clustering method to identify both small and large components. The method also adjusts camera poses, enabling the robot to capture images in diverse environments. Experiments conducted in a mock-up environment replicating a real train bogie for HMD data collection, as well as simulations using robot and operational environment models, confirm that the proposed method allows the robot to successfully capture images of all the small and large components, even in environments different from the one where the HMD data was collected.
|
|
16:48-17:06, Paper WeDT1.2 | |
Adaptive Sensor-Image Fusion for Enhanced 3D Pose Estimation in Cardiac Intervention |
|
Benkadja, Abdallah | Concordia University |
Sayadi, Amir | McGill University |
Fevens, Thomas | Concordia University |
Hooshiar, Amir | McGill University |
Keywords: Sensor Fusion, AI and Machine Learning in Healthcare, Machine learning
Abstract: Precise tracking of surgical instruments and anatomical structures is vital for performing safe and effective interventions in minimally invasive cardiac surgery. However, conventional methods that rely on image frames for pose estimation are vulnerable to lighting fluctuations, occlusions, and motion blur, all of which are common in real surgical environments. To mitigate these challenges, we propose a multi-modal pose estimation architecture that fuses image data, which can be processed using models such as Swin Transformer, with position and orientation sensor data via a gated fusion mechanism and a layered multi-head attention module. By adaptively weighting visual and sensor-based spatial details, the framework improves both robustness against noise and uncertainty and pose estimation accuracy. We extensively evaluate the proposed method against image-only and sensor-only baselines and demonstrate its superior performance in estimating both position and orientation. The gated fusion mechanism and multi-head attention structure improve the robustness against missing data and sensor noise, thereby enhancing prediction reliability. Position estimation is close to perfect, with R² > 0.9997 along all spatial axes. Orientation prediction also shows strong linear correlations, with R² values exceeding 0.9975 for all components of the quaternion, which confirms the model’s capability of capturing the rotational behavior. The implications of these results are significant, as the proposed framework offers a robust and accurate approach to surgical pose estimation, enabling more reliable cardiac surgical navigation.
|
|
17:06-17:24, Paper WeDT1.3 | |
Topological Mapping and Navigation Using a Monocular Camera Based on AnyLoc |
|
Zhang, Wenzheng | Hosei University |
Hara, Yoshitaka | Chiba Institute of Technology |
Nakamura, Sousuke | Hosei University |
Keywords: Deep Learning in Robotics and Automation, Motion and Path Planning, AI-Based Methods
Abstract: This paper proposes a method for topological mapping and navigation using a monocular camera. Based on AnyLoc, keyframes are converted into descriptors to construct topological relationships, enabling loop detection and map building. Unlike metric maps, topological maps simplify path planning and navigation by representing environments with key nodes instead of precise coordinates. Actions for visual navigation are determined by comparing segmented images with the image associated with target nodes. The system relies solely on a monocular camera, ensuring fast map building and navigation using key nodes. Experiments show effective loop detection and navigation in real and simulation environments without pre-training. Compared to a ResNet-based method, this approach improves success rates by 63.8% on average while reducing time and space costs, offering a lightweight solution for robot and human navigation in various scenarios.
|
|
17:24-17:42, Paper WeDT1.4 | |
A Self-Supervised Miniature One-Shot Texture Segmentation (MOSTS) Model for Real-Time Indoor Drivable Area Segmentation |
|
Chen, Yu | University of Illinois at Urbana-Champaign |
Rastogi, Chirag | University of Illinois at Urbana Champaign |
Zhou, Zheyu | Johns Hopkins University |
Norris, William | University of Illinois Urbana-Champaign |
Keywords: Deep Learning in Robotics and Automation, Computer Vision in Automation, Sensor Fusion
Abstract: Determining the drivable area, or free space segmentation, is critical for mobile robots to navigate indoor environments safely. This paper explores the use of a self-supervised one-shot texture segmentation framework and an RGB-D camera to achieve robust drivable area segmentation. With a fast inference speed and compact size, the developed model, MOSTS, is ideal for real-time robot navigation and various embedded applications. A validation dataset was built to assess MOSTS's ability to perform texture segmentation in the real world, where it effectively identified small low-lying objects that were previously undetectable by depth measurements. The study also compared MOSTS's performance with two State-Of-The-Art (SOTA) indoor semantic segmentation models, both quantitatively and qualitatively. The results showed that MOSTS offers comparable accuracy with up to eight times faster inference speed in indoor drivable area segmentation.
|
|
17:42-18:00, Paper WeDT1.5 | |
A Classical Vision 3-D Registration Methodology for a Successful Pose Estimation of Semi-Deformable Objects |
|
De Los Rios Alatorre, Gustavo | ITESM |
Nieto Gutierrez, Nezih | Tecnológico De Monterrey |
Mendez Meraz, Armando Enrico | ITESM |
Murra López, Arturo José | Tecnológico De Monterrey |
Duran, Ian | ITESM |
Serna, Cesar | ITESM |
Escobedo-Cabello, Jesus-Arturo | INRIA Rhone-Alpes |
Munoz, Luis Alberto | Tec De Monterrey |
Keywords: Agricultural Automation, Cognitive Automation, Computer Vision in Automation
Abstract: This paper presents a robust computer vision-based method for the detection and 3D pose estimation of semi-deformable objects, a crucial challenge in robotics due to their non-standardized characteristics and high susceptibility to occlusions and viewpoint changes. Our solution aims to empower vision-based robotics systems by strengthening their understanding of these objects. Initially developed on a Windows 11 system, the method was adapted to Ubuntu, enabling deployment on a Jetson Nano 2GB and testing in ROS2 Gazebo simulations. The system demonstrated consistent performance across different platforms and sensors, including the Azure Kinect DK and a simulated Intel D435-i. We specifically tested the ability of the system to accurately estimate the 3D pose of bell peppers at different angles, showcasing its potential for practical applications, such as robotic harvesting.
|
|
WeDT2 |
Room T2 |
RAL Paper Session 5 |
Special Session |
Chair: Boiko, Igor | Khalifa University |
|
16:30-16:48, Paper WeDT2.1 | |
AREPO: Uncertainty-Aware Robot Ensemble Learning under Extreme Partial Observability |
|
Du, Yurui | KU Leuven |
Hanut, Louis | KU Leuven |
Bruyninckx, Herman | KU Leuven |
Detry, Renaud | KU Leuven |
Keywords: Reinforcement Learning, Transfer Learning, Sensor-based Control
Abstract: In real-world applications of vision-based robot learning, two major challenges emerge: learning under extreme partial observability and effective simulation-to-reality (sim-to-real) transfer. This paper introduces a robust robot learning framework that enhances uncertainty awareness to address these challenges. We reinterpret variational autoencoder-based visual reinforcement learning (RL) from an uncertainty-quantification perspective, accommodating high levels of sensory noise and severe visual occlusions, typical in industrial robotic tasks. To further improve sim-to-real transfer performance, we propose an uncertainty-aware ensemble RL algorithm. We validate our methods using a laboratory task designed as a proxy for a wide range of real-world industrial robotic applications characterized by harsh environments with low visibility and physical occlusions. Both simulation and real-world validation results demonstrate significant improvements in task accuracy and efficiency over various baselines, underscoring the potential of incorporating uncertainty-aware robot learning techniques to achieve more reliable and effective robotic systems in complex operational contexts.
|
|
16:48-17:06, Paper WeDT2.2 | |
Fuzzy Ensembles of Reinforcement Learning Policies for Systems with Variable Parameters |
|
Haddad, Abdel Gafoor | Khalifa University |
Mohiuddin, Mohammed | Khalifa University |
Boiko, Igor | Khalifa University |
Zweiri, Yahya | Khalifa University |
Keywords: Machine Learning for Robot Control, Reinforcement Learning, Robust/Adaptive Control
Abstract: This paper presents a novel approach to improving the generalization capabilities of reinforcement learning (RL) agents for robotic systems with varying physical parameters. We propose the Fuzzy Ensemble of RL policies (FERL), which enhances performance in environments where system parameters differ from those encountered during training. The FERL method selectively fuses aligned policies, determining their collective decision based on fuzzy memberships tailored to the current parameters of the system. Unlike traditional centralized training approaches that rely on shared experiences for policy updates, FERL allows for independent agent training, facilitating efficient parallelization. The effectiveness of FERL is demonstrated through extensive experiments, including a real-world trajectory tracking application in a quadrotor slung-load system. Our method improves the success rates by up to 15.6% across various simulated systems with variable parameters compared to the existing benchmarks of domain randomization and robust adaptive ensemble adversary RL. In the real-world experiments, our method achieves a 30% reduction in 3D position RMSE compared to individual RL policies. The results underscore FERL's robustness and applicability to real robotic systems. A supplementary video is available at https://youtu.be/I33i5xq5t7c.
|
|
17:06-17:24, Paper WeDT2.3 | |
Single Pump-Valve Pneumatic Actuation with Continuous Flow Rate Control for Soft Robots |
|
Liu, Sicong | Southern University of Science and Technology |
Wang, Lin | Wisson Technology (Shenzhen) Co. Ltd |
Qian, Zhongfeng | Wisson Technology Ltd |
Dihan, Liu | Southern University of Science and Technology |
Zhu, Wenpei | Southern University of Science and Technology |
Tang, Shaowu | Southern University of Science and Technology |
Zhao, Xuda | Southern University of Science and Technology |
Yang, Wenjian | Southern University of Science and Technology |
Lu, Ying | Southern University of Science and Technology |
Yi, Juan | Southern University of Science and Technology |
Dai, Jian | School of Natural and Mathematical Sciences, King's College Lond |
Wang, Zheng | Southern University of Science and Technology |
Keywords: Soft Robot Applications, Soft Sensors and Actuators
Abstract: Pneumatically actuated soft robots attract increasing research interest due to the availability and simplicity of their actuation. Soft robots driven by soft pneumatic actuators (SPAs) of various active volumes demand pneumatic systems with various ranges of flow rate. However, the usually bulky and hard-to-carry pneumatic actuation systems restrict portability, and air pumps provide a constant flow rate, which constrains applications such as soft wearable devices and scenarios requiring fine flow rate control. In this work, aiming for simplicity, high portability, and continuous, small flow rate regulation, a pneumatic actuation system consisting of identical integrated soft robotic driver (iSoRD) modules is proposed, obtaining positive and negative pressure output (-53 to 83 kPa) in each module using a one-pump-one-valve (4-way/2-position solenoid) design. With the check valves installed and the modular design, pressure holding and flow independence are achieved in each pneumatic branch. The heat generation (37.7 °C) and power consumption (2.95 W per channel) are measured to verify usability. Continuous and fine flow rate regulation (15 mL/s) is achieved by applying a PID controller to the pump motor, which shows superior performance in signal tracking in comparison with the non-continuous Bang-Bang and Varia-speed Bang-Bang algorithms. With the same control, the iSoRD system reduces the error by 37.5% in comparison to our previous two-pump system. The portability, versatility in wearing, practicality and adaptivity of the system are validated by driving three wearable soft robots, a small gripper and a pollination device. Compared with existing systems, the iSoRD is capable of fine flow rate regulation in both negative and pos
|
|
17:24-17:42, Paper WeDT2.4 | |
Characterization of a Quasi-Direct Drive Knee Perturbation System for Mechanical Impedance Estimation |
|
Nazon II, Yves | University of Michigan - Ann Arbor |
Thomas, Gray | Texas A&M University |
Rouse, Elliott | University of Michigan |
|
|
WeDT3 |
Room T3 |
Force Modeling |
Regular Session |
Chair: Yau, Her-Terng | National Chung Cheng University, Department of Mechanical Engineering |
|
16:30-16:48, Paper WeDT3.1 | |
Desired Contact Force Realization in Unknown Environments Via Multiple Virtual Dynamics-Based Control Framework |
|
Kanekiyo, Mikihiro | Kyushu University |
Arita, Hikaru | Kyushu University |
Nakashima, Kazuto | Kyushu University |
Tahara, Kenji | Kyushu University |
Keywords: Compliance and Impedance Control, Force and Tactile Sensing, Manipulation Planning
Abstract: Contact task execution in unknown environments is fundamental to robotic applications, requiring three essential capabilities: accurate position tracking, safe contact establishment, and achievement of desired contact force. While our previous study has demonstrated that combining admittance and impedance control in series enables accurate position tracking and safe contact, achieving desired contact force remains challenging due to environmental and robot dynamic uncertainties. This paper presents a novel force control methodology that integrates all three capabilities by introducing an additional admittance layer to the series admittance-impedance control framework. The key idea lies in the conversion of sensor information into continuous virtual object motion through virtual dynamics, enabling seamless transition from position to force tracking control without controller switching. This approach eliminates the need for direct feedback of raw sensor measurements to controllers while ensuring precise contact force achievement regardless of environmental or dynamic uncertainties. The effectiveness of the proposed method is validated through both numerical simulations using a 2-DOF manipulator model and experimental verification on a physical manipulator system.
|
|
16:48-17:06, Paper WeDT3.2 | |
Application and Comparison of a Friction Force Model by Using Physic Informed Neural Network |
|
Lee, Chi-Wei | National Chung Cheng University |
Chen, Yen-Wen | National Chung Cheng University |
Chen, Yu-Tsun | National Chung Cheng University |
Yau, Her-Terng | National Chung Cheng University, Department of Mechanical Engine |
Keywords: AI-Based Methods, Model Learning for Control, Motion Control
Abstract: This research explores the ability of a Physics-Informed Neural Network (PINN) to model friction force. The training data is generated using the Runge-Kutta method, and the physical laws incorporated into the PINN are based on the LuGre model. To evaluate the effectiveness of this model, this research discusses both the forward and inverse methods. The results show that, whether using the forward or inverse method, the prediction time is minimized while maintaining a high level of accuracy. Although the physics-based loss function proves valuable, it is crucial not to depend on it too heavily. Additionally, the number of data points and iterations significantly influence the outcomes. PINN is effective for constructing precise friction force models, but it is necessary to repeat the experiments to ensure the best performance of this method.
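As a minimal sketch of how LuGre training data could be generated with a Runge-Kutta integrator, the following integrates the standard LuGre bristle-state equation with classical RK4 at a constant sliding velocity. The parameter values, the function names, and the choice of a fixed-step fourth-order scheme are illustrative assumptions and are not taken from the paper.

```python
import math

# Illustrative LuGre parameters (assumed values, not from the paper).
SIGMA0 = 1e5    # bristle stiffness [N/m]
SIGMA1 = 316.0  # bristle damping [N*s/m]
SIGMA2 = 0.4    # viscous friction coefficient [N*s/m]
FC = 1.0        # Coulomb friction level [N]
FS = 1.5        # static (stiction) friction level [N]
VS = 0.001      # Stribeck velocity [m/s]


def g(v):
    """Stribeck curve: friction level the bristle state settles to at velocity v."""
    return FC + (FS - FC) * math.exp(-(v / VS) ** 2)


def z_dot(z, v):
    """LuGre bristle dynamics: dz/dt = v - sigma0 * |v| / g(v) * z."""
    return v - SIGMA0 * abs(v) / g(v) * z


def rk4_step(z, v, dt):
    """One classical fourth-order Runge-Kutta step for the bristle state."""
    k1 = z_dot(z, v)
    k2 = z_dot(z + 0.5 * dt * k1, v)
    k3 = z_dot(z + 0.5 * dt * k2, v)
    k4 = z_dot(z + dt * k3, v)
    return z + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate_friction(v, steps=2000, dt=1e-5):
    """Integrate from z = 0 at constant velocity v and return the friction force."""
    z = 0.0
    for _ in range(steps):
        z = rk4_step(z, v, dt)
    # LuGre force: F = sigma0 * z + sigma1 * dz/dt + sigma2 * v
    return SIGMA0 * z + SIGMA1 * z_dot(z, v) + SIGMA2 * v
```

At constant velocity the force should settle near g(v) * sign(v) + SIGMA2 * v, which gives a quick sanity check on the integrator before the samples are used for training.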
|
|
17:06-17:24, Paper WeDT3.3 | |
Preliminary Indoor Coefficient of Friction Estimation Using a Wheeled Robot |
|
Pearson, Nathaniel | West Virginia University |
Wang, Xiangrui | West Virginia University |
Komarraju, Kulbhushan | WVU |
Gross, Jason | West Virginia University |
Gu, Yu | West Virginia University |
Keywords: Human-Centered Automation, Force and Tactile Sensing, Industrial and Service Robotics
Abstract: Slips, Trips, and Falls (STFs) are a leading cause of injury, often resulting from inadequate walking surface friction. This paper presents a method for estimating the coefficient of friction (CoF) of indoor surfaces using a wheeled skid-steer mobile robot and explores how robotic CoF estimates can enhance environmental safety assessments. Relative differences in CoF are estimated using wheel slip and torque measurements across different surfaces through the force-slip relationship. The proposed method uses onboard motor current sensing and robot velocity measurements to derive relative CoF estimates during braking maneuvers on indoor surfaces. Results demonstrate the viability of employing a robot in home and retail spaces for STF mitigation through CoF estimation as an auxiliary task.
|
|
17:24-17:42, Paper WeDT3.4 | |
A Modular Soft Magnetic Sensor with 3D Tactile Force Feedback for Adaptive Robot Grasping |
|
Sharrma, Neehal | Worcester Polytechnic Institute |
Onal, Cagdas | WPI |
Keywords: Force and Tactile Sensing, Product Design, Development and Prototyping, Haptics and Haptic Interfaces
Abstract: Human hands are capable of very dexterous manipulation by relying solely on multimodal, directional feedback through physical touch. This level of tactile perception is difficult to reliably translate into sensors for robotic applications unless they are purpose-built for the task. Soft sensors are thus an appealing generalized solution owing to their naturally compliant structures closely mimicking the nuances of the human skin; however, minimal, extensible implementations remain an active challenge. We present a modular soft magnetic sensor closely mimicking the morphology of the human fingertip and capable of 3D force feedback, discussing standalone sensor elements and characteristics and demonstrating its potential use in adaptive robot grasping applications.
|
|
17:42-18:00, Paper WeDT3.5 | |
Contact Sensing Via Joint Torque Sensors and a Force/Torque Sensor for Legged Robots |
|
Grinberg, Jared | University of Michigan |
Ding, Yanran | University of Michigan |
Keywords: Force and Tactile Sensing, Sensor Fusion
Abstract: This paper presents a method for detecting and localizing contact along robot legs using distributed joint torque sensors and a single hip-mounted force-torque (FT) sensor within a generalized momentum-based observer framework. We designed a low-cost strain-gauge-based joint torque sensor that can be installed on every joint to provide direct torque measurements, eliminating the need for complex friction models and providing more accurate torque readings than estimation based on motor current. Simulation studies on a floating-base 2-DoF robot leg verified that the proposed framework accurately recovers contact force and location along the thigh and shin links. Through a calibration procedure, our torque sensor achieved an average 96.4 % accuracy relative to ground truth measurements. Building upon the torque sensor, we performed hardware experiments on a 2-DoF manipulator, which showed sub-centimeter contact localization accuracy and force errors below 0.2 N.
|
|
WeDT4 |
Room T4 |
Automation in Construction |
Regular Session |
Chair: Lu, Yuqian | The University of Auckland |
|
16:30-16:48, Paper WeDT4.1 | |
Hybrid Perception and Equivariant Diffusion for Robust Multi-Node Rebar Tying |
|
Wang, Zhitao | Tsinghua University |
Xiong, Yirong | Tsinghua University |
Horowitz, Roberto | Berkeley |
Wang, Yanke | The Hong Kong University of Science and Technology |
Han, Yuxing | Tsinghua University |
Keywords: Automation in Construction, Deep Learning in Robotics and Automation, Data fusion
Abstract: Rebar tying is a repetitive but critical task in reinforced concrete construction, typically performed manually at considerable ergonomic risk. Recent advances in robotic manipulation hold the potential to automate the tying process, yet face challenges in accurately estimating tying poses in congested rebar nodes. In this paper, we introduce a hybrid perception and motion planning approach that integrates geometry-based perception with Equivariant Denoising Diffusion on SE(3) (Diffusion-EDFs) to enable robust multi-node rebar tying with minimal training data. Our perception module utilizes density-based clustering (DBSCAN), geometry-based node feature extraction, and principal component analysis (PCA) to segment rebar bars, identify rebar nodes, and estimate orientation vectors for sequential ranking, even in complex, unstructured environments. The motion planner, based on Diffusion-EDFs, is trained on as few as 5–10 demonstrations to generate sequential end-effector poses that optimize collision avoidance and tying efficiency. The proposed system is validated on various rebar meshes, including single-layer, multi-layer, and cluttered configurations, demonstrating high success rates in node detection and accurate sequential tying. Compared with conventional approaches that rely on large datasets or extensive manual parameter tuning, our method achieves robust, efficient, and adaptable multi-node tying while significantly reducing data requirements. This result underscores the potential of hybrid perception and diffusion-driven planning to enhance automation in on-site construction tasks, improving both safety and labor efficiency.
|
|
16:48-17:06, Paper WeDT4.2 | |
AI-Enabled Automation for Material Selection and Construction Planning for Multi-Story Steel Buildings |
|
Xu, Li | University of Auckland |
Xing, Deao | University of Auckland |
Chang-Richards, Alice | University of Auckland |
Zou, Yang | The University of Auckland |
Lu, Yuqian | The University of Auckland |
Keywords: Automation in Construction
Abstract: Reliance on personal experience and the absence of digital and intelligent workflows have led to inefficiency and low productivity in the Architecture, Engineering, and Construction (AEC) industry. This study proposes an integrated approach that leverages artificial intelligence technologies to support requirements analysis, scenario simulation, and dynamic decision-making in multi-story steel building design and construction. The design automation module employs a structured methodology consisting of component configuration design, structural verification, and decision strategy formulation, supported by a product library and AI-assisted selection algorithms for efficient material assessment and selection in building design. The construction management module structures multimodal design data and applies reinforcement learning to incorporate supply chain, resource, and cost constraints, providing dynamic decision support for steel component erection. This approach addresses the complexity and coupling challenges in multi-story steel building design and construction management, with the potential to reduce manual effort, improve data quality, and enhance the efficiency and sustainability of the AEC industry.
|
|
17:06-17:24, Paper WeDT4.3 | |
Estimating the Parameters of Superposed Oscillations of Mobile Construction Machines Like Diaphragm Wall Hydraulic Grabs |
|
Alexander, Vieres | Technical University of Munich |
Krüger, Marius | Technical University of Munich |
Hujo, Dominik | Technical University of Munich |
Cha, Suhyun | HAWE Hydraulik SE |
Waterman, Daniel | HAWE Hydraulik SE |
Kerausch, Cornelia | BAUER Maschinen GmbH |
Prinz, Theresa | Technical University of Munich, TUM School of Engineering and De |
Pohl, Daniel | Sensor-Technik Wiedemann GmbH |
Vogel-Heuser, Birgit | Technical University Munich |
Keywords: Automation in Construction, Big-Data and Data Mining, Hydraulic/Pneumatic Actuators
Abstract: The operation of mobile construction machines like cranes or grabs is influenced by oscillations from their cable-suspended end-effector. For Diaphragm Wall Hydraulic Grabs, the focus of this article, end-effector oscillations can be inferred from pressure oscillations in the hydraulic actuators. The oscillating end-effector stimulates the hydraulic actuators, which respond with pressure oscillations. These pressure oscillations are noisy and are a superposition of multiple single oscillations from different sources like machine vibration and hydraulic circuit oscillation. The end-effector oscillations can be inferred from single pressure oscillations, which must be separated from the pressure signal to enable control in future work. Signal decomposition is numerically expensive and time-consuming, and it can be optimized by providing a ballpark estimate of the single oscillation parameters; such an estimation is introduced and validated in this article on a benchmark data set. Amplitude, frequency, phase, and decay rate are estimated from the superposed oscillation without providing prior information about the data set.
|
|
17:24-17:42, Paper WeDT4.4 | |
A Reference-Giving Device Prototype for Performance Assessment of Closed-Loop Robot Systems |
|
Finkbeiner, Martin Satoshi | Fraunhofer IPA |
Lehnertz, Christian Michael | Fraunhofer IPA |
Vrhar, Maria | Fraunhofer IPA |
Stoll, Johannes T. | Fraunhofer Institute for Manufacturing Engineering and Automatio |
Kraus, Werner | Fraunhofer IPA |
Verl, Alexander | University of Stuttgart |
Giftthaler, Markus | Google |
Keywords: Industrial and Service Robotics, Process Control, Intelligent and Flexible Manufacturing
Abstract: Industrial robot systems that determine or adapt their motion at runtime using sensor data are expected to gain popularity in manufacturing, due to higher flexibility, less programming effort and the possibility to enable new processes that have been difficult to automate with classical open-loop motion. For the performance assessment of such robot systems, a new benchmarking method for closed-loop systems, inspired by ISO 9283, was proposed in [1]. However, real-world experiments require a device that can shape the measurements recorded by external sensors and thus create well-defined (artificial) robot motion. In this paper, we propose a design for such a reference-giving device, show the build, and test whether it fulfills its requirements. Finally, we demonstrate the suitability of the reference-giving device in a closed-loop benchmark on two different state-of-the-art industrial robot systems using the performance assessment methodology from [1].
|
|
17:42-18:00, Paper WeDT4.5 | |
ToolNavigator: Dataset Generation for Small Tools Handling and Vision-Language Navigation in Construction Sites Via Simulation for Robots |
|
Bonyani, Mahdi | Louisiana State University |
Soleymani, Maryam | Louisiana State University |
Odugu, Obiora | Louisiana State University |
Wang, Chao | Louisiana State University |
Keywords: Automation in Construction, Environment Monitoring and Management, Building Automation
Abstract: Construction sites present complex and dynamic environments where small tools contribute to a significant proportion of accidents and hazards. No datasets exist for vision-language navigation (VLN) in construction sites, which restricts the development of AI models for autonomous navigation and tool handling. This paper introduces ToolNavigator, a simulation-driven dataset for small tool handling and VLN at construction sites. By integrating Blender with NVIDIA Isaac Sim, we create a scalable and customizable dataset that replicates real-world construction scenarios. ToolNavigator contains diverse construction site configurations, 543 small tools, 102 types of equipment, and 22 categories of heavy machinery. The dataset includes rich multimodal annotations such as 2D/3D bounding boxes, depth maps, semantic masks, and scene graphs to support enhanced spatial reasoning and object interaction. Experimental results demonstrate the usefulness of ToolNavigator for creating challenging navigation tasks for tools in construction sites. Previous state-of-the-art models trained on ToolNavigator did not achieve high performance in navigation tasks, with Maplm achieving only a 48.9% success rate in 5-shot learning and 53.2% in full training for unseen environments. These results highlight ToolNavigator’s potential to advance AI-driven automation in construction sites.
|
|
WeDT5 |
Room T5 |
AI-Driven Emerging Automation |
Special Session |
Chair: Zhang, Xi | College of Engineering, Peking University |
Organizer: An, Yu | National University of Singapore |
Organizer: Zhang, Xi | College of Engineering, Peking University |
|
16:30-16:48, Paper WeDT5.1 | |
Quantify Uncertainty Beyond Covariate Shift in RUL Estimation by Conformal Prediction (I) |
|
Piao, Shiyuan | The HongKong University of Science and Technology (Guangzhou) |
Wang, Ying | Shanghai Jiao Tong University |
Huang, Ruyi | Case Western Reserve University |
Wang, Di | Shanghai Jiao Tong University |
Tsung, Fugee | HKUST |
Keywords: Diagnosis and Prognostics, Probability and Statistical Methods, Big-Data and Data Mining
Abstract: Recent advancements in Automation Science and Engineering (ASE) have improved Prognostics and Health Management (PHM) methods for estimating the remaining useful life (RUL) of industrial equipment. While accurate RUL prediction is crucial for machinery reliability, deep learning (DL) models often neglect real-world uncertainty. Uncertainty quantification (UQ) provides essential confidence measures for risk-aware maintenance, yet faces challenges like covariate shift and reliance on optimization or distribution assumptions. To address these issues, we introduce an Adaptive Conformalized RUL Predictor (ACRP) that surpasses traditional methods. Our approach combines conformal prediction with a group-weighted strategy to achieve UQ in a model-free, distribution-free manner with a rigorous theoretical guarantee, ensuring reliable prediction intervals that meet efficiency and validity criteria under covariate shift. It is adaptable to any model, and extensive experiments demonstrate its robustness and versatility, enhancing the trustworthiness of AI-driven PHM solutions in industrial environments.
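For readers unfamiliar with conformal prediction, the following is a minimal sketch of the standard split conformal interval for a regression output such as RUL. It is not the group-weighted ACRP method described in the abstract: it assumes exchangeable calibration data (no covariate shift), and the function names and the use of absolute residuals as nonconformity scores are assumptions made for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    scores: nonconformity scores on a held-out calibration set,
            here assumed to be absolute residuals |y - y_hat|.
    alpha:  target miscoverage level (e.g. 0.1 for 90% intervals).
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial
        # (infinite) interval has the requested coverage.
        return math.inf
    return sorted(scores)[k - 1]


def prediction_interval(point_prediction, scores, alpha=0.1):
    """Symmetric interval around a point prediction from any model."""
    q = conformal_quantile(scores, alpha)
    return point_prediction - q, point_prediction + q
```

For example, `prediction_interval(100.0, calibration_residuals, alpha=0.25)` widens a point RUL estimate of 100 cycles by the calibrated residual quantile. Handling covariate shift would require reweighting the calibration scores, which this sketch deliberately omits.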
|
|
16:48-17:06, Paper WeDT5.2 | |
Predicting Gasoline Transaction Events in Price-Commitment Scenarios: An Automated Framework (I) |
|
Li, Boyang | Peking University |
Wang, Ziqi | University of Michigan |
Qiu, Yunzhe | Olin Business School, Washington University in St. Louis |
Zhang, Xi | College of Engineering, Peking University |
Keywords: Big-Data and Data Mining, Failure Detection and Recovery, Probability and Statistical Methods
Abstract: Automatic prediction of consumer transactions in the gasoline market is critical for enabling targeted advertising interventions and improving retailer profitability. However, the absence of daily gasoline consumption data at the consumer level poses significant challenges in accurately assessing consumer tank levels and predicting transaction events. This difficulty is further exacerbated under price commitment scenarios, where transaction prices are protected by policy, leading to distinct shifts in daily consumption patterns and transaction events in response to market price fluctuations. To address these challenges, we introduce a time-to-event prediction framework designed for gasoline markets under price commitment. Our model dynamically tracks individual gasoline tank levels by integrating transaction prices, price volatility, and account remaining volume, which enhances adaptability to market fluctuations and improves prediction accuracy. The framework employs a lightweight neural network to establish a hazard function, capturing nonlinear dynamics between time-varying gasoline levels, static consumer profiles, and transaction likelihood. Through validation using a North China retail dataset, our method demonstrates significantly improved transaction timing prediction compared to traditional survival analysis and machine learning benchmarks.
|
|
17:06-17:24, Paper WeDT5.3 | |
Field Testing Model Validation of Onshore OWC Wave Power Plants (I) |
|
Garrido, Aitor | University of the Basque Country (UPV/EHU) |
Garrido, Izaskun | UPV/EHU |
Villasante, Amparo | UPV/EHU |
Keywords: Renewable Energy Sources, Modelling, Simulation and Validation of Cyber-physical Energy Systems, Power and Energy Systems automation
Abstract: Ocean wave energy has the potential to satisfy 15% of EU energy demand, cutting 136 MT/MWh off the CO2 emissions by 2050, as stated by the EU Energy Road Map. Analogously, the Spanish Renewable Energies Plan specifically highlighted the Spanish marine energy potential, with special emphasis on wave energy. In this context, Oscillating Water Column (OWC) converters are perhaps the most promising wave energy converters today, with the potential to harness sea energy from diverse on-shore and floating structures. This paper presents an analytic model of the wave capture chamber for a fixed on-shore OWC wave power plant. The model is particularized and parameterized for the case of the Mutriku MOWC wave power plant, located on the Spanish Basque Country coast, and then validated using both measured real wave entry data and experimental generated output power from the plant.
|
|
17:24-17:42, Paper WeDT5.4 | |
A Robotic Actuation Approach for 6-DOF Force Control in Hydrodynamic Real-Time Hybrid Simulation (I) |
|
Ni, Yun | Stanford University |
Seki, Akiri | Stanford University |
Bosma, Bret | Oregon State University |
Brekken, Ted | Oregon State University |
Robertson, Bryson | Oregon State University |
Schellenberg, Andreas | Maffei Structural Engineering |
Lomonaco, Pedro | Oregon State University |
Simpson, Barbara | Stanford University |
Keywords: Modelling, Simulation and Validation of Cyber-physical Energy Systems, Cyber-physical Production Systems and Industry 4.0, Renewable Energy Sources
Abstract: Hydrodynamic real-time hybrid simulation (hydro-RTHS) is a testing approach that couples physical and numerical sub-assemblies through actuators and sensors in real time, offering an enhanced-fidelity alternative to small-scale experiments; e.g., testing of a floating offshore wind turbine (FOWT) in a hydrodynamic laboratory. To address multi-degree-of-freedom (DOF) actuation in a unique, floating environment, a Franka Emika Panda robotic arm was integrated into the actuation approach for hydro-RTHS. The actuation system emulates the 6-DOF small-scale aerodynamic forces computed from a numerical model acting on a small-scale FOWT specimen in a wave basin. A task-prioritized control strategy was designed to apply forces with secondary "pose-keeping" control to limit the robot’s configuration drift. The system was experimentally validated for a 1:50-scale FOWT specimen for operational wind-only cases in still water at the O.H. Hinsdale Wave Research Laboratory. Results demonstrate consistent force tracking across all six DOFs and agreement between measured platform motions and OpenFAST numerical simulations of the complete FOWT. These results illustrate the feasibility of using an off-the-shelf robotic arm for 6-DOF force control in hydro-RTHS, expanding actuation options and enhancing force control capabilities.
|
|
17:42-18:00, Paper WeDT5.5 | |
SusXAI: Evaluating Suspicious Machine Explanations for Anomaly Predictions (I) |
|
Cohen, Joseph | Rutgers University |
Huan, Xun | University of Michigan |
Keywords: AI-Based Methods, Diagnosis and Prognostics, Cyber-physical Production Systems and Industry 4.0
Abstract: Explainable artificial intelligence (XAI) is an emerging research area that aims to enhance the interpretability and transparency of complex machine learning (ML) models. Within manufacturing, XAI holds significant promise for improving anomaly detection systems, which are essential for identifying faults, process deviations, and potential security threats. Model explanations can reveal latent tendencies and bias within a trained model. As manufacturing systems become increasingly automated, the ability to detect anomalies and simultaneously understand why they occur is paramount. To address this challenge, we propose SusXAI, a novel methodology designed to assess the suspicion level of Shapley value explanations. SusXAI incorporates density-based clustering to isolate neighborhoods of similar explanations, allowing us to contextualize anomalies in the explanation space. To compute suspicion levels, the methodology leverages eigendecompositions of pairwise Shapley value correlations to characterize typical explanation patterns across data instances. By quantifying both local (within-cluster) and global suspicion through statistical deviation from these patterns, the method identifies explanations that diverge significantly from expected behavior. Furthermore, this framework enables ranking explanations based on how unusual or suspicious they are, promoting a new dimension of situational trustworthiness in XAI. Applied within manufacturing systems, the SusXAI method enhances operational awareness and strengthens the reliability of ML/AI-based anomaly detection systems, promoting secure, resilient, and explainable industrial AI.
|
|
WeDT6 |
Room T6 |
Robotics Ontologies for Collaboration |
Special Session |
Chair: Mosley, Jeffery | MoTech LLC |
Organizer: Habib, Maki Khalil | Saga University |
Organizer: Gonçalves, Paulo | Instituto Politecnico De Castelo Branco |
|
16:30-16:48, Paper WeDT6.1 | |
Knowledge Driven Robotics (KDR) (I) |
|
Freidank, William | Georgia Tech Research Institute |
Lindbeck, Christopher | The Georgia Institute of Technology |
Ahlin, Konrad | GTRI |
Balakirsky, Stephen | Georgia Tech |
Keywords: Task Planning, Control Architectures and Programming, Failure Detection and Recovery
Abstract: This paper presents a software architecture that enables practical deployment of robotic systems in roles that require the execution of tasks not well-defined at design time. The paradigm employed is a maximization of reuse, generalization, and interoperability of logical structures and robot tasks which can readily be composited, supporting a robust and flexible autonomy stack. The Knowledge Driven Robotics (KDR) software architecture achieves this by offering a unique set of interactions between symbolic planning, Behavior Trees, and database information. New composite tasks can readily be composed of highly parameterized atomic tasks to manipulate known classes of task objects. Key contributions include the ability of the behavior layer to query, transform, and utilize data in a flexible manner and to act as data management systems during execution, as well as surrounding infrastructure enabling the treatment of data as a first-class citizen.
|
|
16:48-17:06, Paper WeDT6.2 | |
Intuitive Real-Time Robot Teleoperation Via VR, ROS, and Externally Guided Motion for Human Skill Transfer (I) |
|
Sharp, Ryan | Central Connecticut State University |
Wang, Haoyu | Central Connecticut State University |
Melick, Anson | GKN Aerospace |
Perdomo, Oscar | Central Connecticut State University |
Keywords: Human-Centered Automation, Telerobotics and Teleoperation, Virtual Reality and Interfaces
Abstract: This research integrated virtual reality (VR) and robotic teleoperation through a user-friendly motion capture control system, aiming to broaden the accessibility of teleoperated robotic systems for heavily labor- and skill-intensive manufacturing processes, such as airfoil deburring and blending. VR technology enabled the incorporation of a digital twin into the program so that users could visualize the robot they were working with from any remote location. Additionally, by enabling the use of motion tracking gloves as an input device, teleoperation can be made more user-friendly than similar systems that rely solely on standard VR controllers. By combining these two technologies, we hope to capture complex human motions for manufacturing tasks more easily, so that industrial robots can be programmed to replicate highly skilled craftsmanship motion for automation.
|
|
17:06-17:24, Paper WeDT6.3 | |
Use Case for Human-Robotic Interaction, an Application of Robot Task Representation - IEEE Std 1872.1™-2024 (I) |
|
Mosley, Jeffery | MoTech LLC |
Keywords: Robotics and Automation in Life Sciences, Task Planning, Robot Networks
Abstract: The exposition of the human-robotic interaction (HRI) use case will attempt to prototype a general schema for HRI tasks, focusing on semi-autonomous agents (e.g., robotic) and the relationship between expertise and error in human-robotic interaction and its relevance to interaction design. We will further investigate what causes stress in complex task execution in a concurrent (i.e., dynamic) task execution environment, with an attempt to provide constructs for human performance evaluation. In short, it is proposed that, through the evaluation of data flows, interaction setup, and task execution, it will be possible to mediate stress via interaction design (e.g., HRI design). The use of IEEE Std 1872.1 and its ontology provides a common understanding that can be shared by all stakeholders involved through the definition of the holistic task where HRI is to be employed, not limited to a single or specific robotic or human action.
|
|
17:24-17:42, Paper WeDT6.4 | |
Detection and Management of Human-Cable Collision in Cable-Driven Parallel Robots |
|
Gao, Hanbang | Laboratoire Des Sciences Du Numérique De Nantes (LS2N) |
Chevallereau, Christine | CNRS |
Caro, Stéphane | CNRS/LS2N |
Keywords: Human-Robot Collaboration, Tendon/Wire Mechanism, Parallel Robots
Abstract: This letter discusses the challenges and innovations in collision detection and management strategies for Cable-Driven Parallel Robots (CDPRs), focusing on enhancing safety in human-robot collaborative environments. A comprehensive collision management method is introduced. It integrates a novel method to detect collisions and identify the collided cable, leveraging tension sensor data and algorithmic strategies to improve accuracy and response to collision events. Adaptive management strategies for different collision severities, including minor and severe contacts, are presented, along with procedures for post-collision management. The methodologies are validated through experimentation with the CRAFT prototype, demonstrating their practical effectiveness. The findings have significant implications for the design and implementation of safety protocols in CDPRs.
|
|
17:42-18:00, Paper WeDT6.5 | |
Personalized Speech Emotion Recognition in Human-Robot Interaction Using Vision Transformers |
|
Mishra, Ruchik | University of Louisville |
Frye, Andrew | University of Louisville |
Rayguru, Madan Mohan | University of Louisville |
Popa, Dan | University of Louisville |
Keywords: Emotional Robotics, Deep Learning Methods, AI-Based Methods
Abstract: Emotions are an essential element in human verbal communication; therefore, it is important to understand individuals' affect during human-robot interaction (HRI). This paper investigates the application of vision transformer models, namely ViT (Vision Transformers) and BEiT (Bidirectional Encoder Representations from Pre-Training of Image Transformers) pipelines, for Speech Emotion Recognition (SER) in HRI. The focus is to generalize the SER models for individual speech characteristics by fine-tuning these models on benchmark datasets and exploiting ensemble methods. For this purpose, we collected audio data from several human subjects having pseudo-naturalistic conversations with the NAO social robot. We then fine-tuned our ViT and BEiT-based models and tested these models on unseen speech samples from the participants in order to identify four primary emotions from speech: neutral, happy, sad, and angry. The results show that fine-tuning vision transformers on benchmark datasets and then using either these already fine-tuned models or ensembling ViT/BEiT models results in higher classification accuracies than fine-tuning vanilla-ViTs or BEiTs.
|
|
WeDT7 |
Room T7 |
Automation Applications 2 |
Regular Session |
Chair: Lee, Eun-Ho | Sungkyunkwan University |
|
16:30-16:48, Paper WeDT7.1 | |
Evaluation of Actuation Precision in Deboning Process for Pork Leg Using 6-DOF Industrial Robots |
|
Hattori, Kazuhiro | Mayekawa Mfg. Co., Ltd |
Yamashita, Tomoki | Mayekawa Mfg. Co., Ltd |
Kashiwazaki, Koshi | Mayekawa Mfg. Co., Ltd |
Keywords: Cognitive Automation, Computer Vision in Automation, Deep Learning in Robotics and Automation
Abstract: This paper evaluates the recognition accuracy, measured by intersection over union (IoU), for the exposed cross-section of the hip bone (pubic bone) on the external surface, and the success rate of bone-grasping actuation, in a pork leg deboning process using a 6-DOF industrial robot within a deboning system equipped with multiple robot arms in the “cell”. Information on bone shapes and positions is analysed by a recognition unit, which acquires an X-ray image and a height/surface image that includes the surface position of the pubic bone. Deep learning and image analysis are applied to the combined information acquired in the recognition unit to obtain the bone shapes and positions. The IoU is around 0.7 on average, and the success rate of grasping the pubic bone exceeds 99% among the evaluated workpieces that were processed up to the step of grasping the target position with the chuck mounted on a 6-DOF robot.
|
|
16:48-17:06, Paper WeDT7.2 | |
Robotic Automation in Apparel Manufacturing: A Novel Approach to Fabric Handling and Sewing |
|
Ajith, Abhiroop | Siemens Technology |
Sathya narayanan, Gokul narayanan | Worcester Polytechnic Institute |
Zornow, Jonathan | Sewbo |
Calle, Carlos | Levis |
Herrero Lugo, Auralis | Bluewater Defense |
Susa Rincon, Jose Luis | Siemens Corporation |
Wen, Chengtao | Siemens |
Solowjow, Eugen | Siemens Corporation |
Keywords: Collaborative Robots in Manufacturing, Product Design, Development and Prototyping
Abstract: Sewing garments using robots has consistently posed a research challenge due to the inherent complexities in fabric manipulation. In this paper, we introduce an intelligent robotic automation system designed to address this issue. By employing a patented technique that temporarily stiffens garments, we eliminate the traditional necessity for fabric modeling. Our methodological approach is rooted in a meticulously designed three-stage pipeline: first, an accurate pose estimation of the cut fabric pieces; second, a procedure to temporarily join fabric pieces; and third, a closed-loop visual servoing technique for the sewing process. Demonstrating versatility across various fabric types, our approach has been successfully validated in practical settings, notably with cotton material at the Bluewater Defense production line and denim material at Levi's research facility. The techniques described in this paper integrate robotic mechanisms with traditional sewing machines, devise a real-time sewing algorithm, and provide hands-on validation through a collaborative robot setup.
|
|
17:06-17:24, Paper WeDT7.3 | |
Proactive Maritime Threat Prediction: Vessel Intent Classification with LSTMs and Transformers Using a Sliding Window Approach |
|
Meepaganithage, Ayesh | University of Nevada Reno |
Sayed, Md Abu | University of Nevada, Reno |
Nicolescu, Mircea | University of Nevada, Reno |
Nicolescu, Monica | University of Nevada, Reno |
Keywords: Collision Avoidance, Machine learning, Motion and Path Planning
Abstract: Maritime safety has long been a major concern, with significant efforts dedicated to improving it. Early prediction of hostile behaviors from nearby ships can greatly enhance maritime security. Machine learning has proven valuable in related research areas, such as driving behavior prediction and naval trajectory forecasting. In our previous studies, we highlighted how deep learning could be applied to detect the behavior of nearby vessels. In this research, we take it a step further by predicting future behavior using a sliding window approach. We conduct binary classification to determine whether the behavior of external ships is hostile or benign, followed by multiclass classification to identify specific types of hostile or benign behaviors. We experiment with both a large, unbalanced dataset and a balanced subset to assess performance variations. Our results show that all RNN-based deep learning models perform significantly better on the balanced dataset, while the transformer model achieves superior performance on the full dataset. In binary classification, the transformer model achieves the best performance on the full dataset, with an F1-score of 89.48%, whereas the Bidirectional LSTM (Bi-LSTM) model performs best on the balanced dataset, with an F1-score of 90.08%. In multiclass classification, the Bi-LSTM model attains the highest performance, achieving an F1-score of 87.93% on the balanced dataset and 81.41% on the full dataset. This research demonstrates that deep learning models can effectively predict the behavior of nearby ships in advance, enhancing maritime safety by enabling timely preventive actions.
|
|
17:24-17:42, Paper WeDT7.4 | |
MoistureMapper: An Autonomous Mobile Robot for High-Resolution Soil Moisture Mapping at Scale |
|
Rose, Nathaniel | University of Nevada, Reno |
Chuang, Hannah | The University of Nevada, Reno |
Andrade-Rodriguez, Manuel Alejandro | University of Nevada, Reno |
Parashar, Rishi | Desert Research Institute |
Or, Dani | University of Nevada Reno |
Maini, Parikshit | University of Nevada Reno |
Keywords: Agricultural Automation, Autonomous Vehicle Navigation
Abstract: Soil moisture is a quantity of interest in many application areas including agriculture and climate modeling. Existing methods are not suitable for large-scale applications due to high deployment costs in high-resolution sensing applications such as variable irrigation. In this work, we design, build and field deploy an autonomous mobile robot, MoistureMapper, for soil moisture sensing. The robot is equipped with Time Domain Reflectometry (TDR) sensors and a direct push drill mechanism for deploying the sensor to measure volumetric water content in the soil. Additionally, we implement and evaluate multiple adaptive sampling strategies based on Gaussian Process modeling to build a spatial map of the moisture distribution in the soil. We present results from large scale computational simulations and a proof-of-concept deployment in the field. The adaptive sampling approach outperforms a greedy benchmark approach and results in up to 30% reduction in travel distance and 5% reduction in variance in the reconstructed moisture maps. Link to video showing field experiments: https://youtu.be/S4bJ4tRzObg Keywords: Robotic Sampling, Soil Moisture, Precision Agriculture, Adaptive Sampling, Resource Mapping
|
|
17:42-18:00, Paper WeDT7.5 | |
A Data-Driven Electromagnetic Field Method for Leakage Detection in Industrial Water Reservoirs |
|
Kim, Changhyeon | Sungkyunkwan University |
Shim, Young-Dae | Georgia Institute of Technology |
Kim, Jihun | Sungkyunkwan University |
Gu, Jauk | Samsung Electronics |
Lee, Eun-Ho | Sungkyunkwan University |
Keywords: Environment Monitoring and Management, Diagnosis and Prognostics, Sustainability and Green Automation
Abstract: This paper presents a novel electromagnetic sensor-based moisture prediction system designed to detect water leakage in industrial water storage reservoirs (concrete tanks) from the outside. By leveraging the low conductivity of concrete and its increasing dielectric constant under elevated moisture content, an electromagnetic field is employed to penetrate from the reservoir’s exterior to its interior. The system’s fundamental design is based on Maxwell’s equations, with an equivalent circuit model formulated to relate changes in permittivity to impedance variations. Laboratory-scale experiments confirm that impedance rises with increasing moisture content, and comparisons with finite-element (FE) simulations show strong agreement. Field validation was conducted in an actual semiconductor facility, where measured impedance signals closely matched visible leakage areas. While minor temperature-related fluctuations highlight the sensor’s thermal sensitivity, the proposed system offers a quantitative method for evaluating internal moisture states without necessitating reservoir drainage. Further miniaturization and data correction algorithms—such as those addressing lift-off errors—can broaden its applicability, making it a promising solution for leakage detection in industrial environments.
|
|
WeDT8 |
Room T8 |
AI-Driven Scheduling and Optimization 2 |
Special Session |
Chair: Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Organizer: Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Organizer: Lu, Yuqian | The University of Auckland |
Organizer: Shen, Weiming | Huazhong University of Science and Technology |
Organizer: Li, Xinyu | Huazhong University of Science and Technology |
|
16:30-16:48, Paper WeDT8.1 | |
Decomposition-Based Optimization Method for Parallel Machine Scheduling under Quality Uncertainty in Wafer Manufacturing (I) |
|
Lee, Sang-Wook | Korea Advanced Institute of Science and Technology (KAIST) |
Lee, Dongha | KAIST |
Kim, Minchan | KAIST |
Lee, So-Young | KAIST |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Planning, Scheduling and Coordination, Optimization and Optimal Control, Semiconductor Manufacturing
Abstract: We address a parallel machine scheduling problem with quality uncertainty and deadline constraints inspired by the challenges in the epitaxial growth (EPI) stage of the wafer manufacturing process. The final wafer quality depends on both the input wafer quality and the assigned machine, making it more challenging to account for in scheduling decisions. We define quality measures for wafer yield and risk to address quality uncertainty. Switching between wafer types requires setup operations that incur unnecessary resource consumption and setup time. In the field, operators aim to minimize the number of setups. We propose a decomposition-based Branch-and-Check (B&Ch) approach that separates job assignment and sequencing to find the optimal solution. The master problem assigns jobs to machines while minimizing setups and ensuring quality constraints, whereas the subproblem optimizes sequencing while satisfying deadline constraints. To enhance computational efficiency, we introduce a new cut generation method that eliminates unnecessary search space. Computational experiments demonstrate that the proposed algorithm outperforms the integer programming model.
|
|
16:48-17:06, Paper WeDT8.2 | |
In-Context Learning for User-Specified Constraint Enforcement in Discrete Event Simulation (I) |
|
Park, Jimin | Korea Advanced Institute of Science and Technology |
Faturrahman, Zuhdi | KAIST |
Kim, Hyun-Jung | Korea Advanced Institute of Science and Technology |
Keywords: Manufacturing, Maintenance and Supply Chains, Planning, Scheduling and Coordination, AI-Based Methods
Abstract: Effective real-world scheduling solutions must rapidly adapt to temporally changing user-defined constraints and preferences, traditionally requiring labor-intensive manual modifications. To address this challenge, we propose an in-context learning (ICL)-based method using large language models (LLMs) to automatically convert user-specified constraints in natural language into executable scheduling code modifications. Our approach leverages contextual examples, enabling fast and accurate adaptation without extensive retraining or domain expertise. Experiments demonstrate the method's accuracy in generating constraint-enforcing code. Furthermore, our approach directly produces modified Python code compatible with traditional and reinforcement learning-based schedulers.
|
|
17:06-17:24, Paper WeDT8.3 | |
Modeling for Digital Twin of Machining Cells with Mobile Manipulators (I) |
|
Lee, Chae-won | Seoul National University of Science and Technology |
Jang, Soojin | Seoul National University of Science and Technology |
Wang, Hyun-sik | Seoul National University of Science and Technology |
Nam, Eunseok | Korea Institute of Industrial Technology |
Park, Kyu-Tae | Seoul National University of Science and Technology |
Keywords: Cyber-physical Production Systems and Industry 4.0, Industrial and Service Robotics, Simulation and Animation
Abstract: Mobile manipulators (MMs) in manufacturing environments are sophisticated robotic systems that integrate material handling and processing capabilities, thereby enabling flexible and efficient operations across a wide range of manufacturing scenarios. This study formalizes the operational behaviors of MMs tasked with material transport and processing within machining cells. To accurately represent the states and transitions of MMs in such environments, a modeling framework based on Petri nets is adopted. The proposed approach systematically encompasses key scenarios that may occur within the machining cell, which explicitly specifies the operational logic and state transition conditions of the MMs. To assess the applicability of the proposed model within a digital twin (DT), a discrete event simulation was employed. During implementation, a comprehensive set of events and conditional behaviors pertinent to the machining cell was rigorously defined and integrated into the DT. Consequently, an event-driven simulation environment was established, effectively replicating the dynamic and complex operational context of MMs in real-world manufacturing systems. As a result, the machining cell was able to effectively utilize the event-driven simulation to address dynamic and unpredictable situations, which demonstrates the practical utility of the proposed modeling approach within a DT environment.
|
|
17:24-17:42, Paper WeDT8.4 | |
Dynamic Graph-Based Deep Reinforcement Learning Approach for Large-Size Flexible Job Shop Scheduling (I) |
|
Park, Jeongwon | Arizona State University |
Ju, Feng | Arizona State University |
Keywords: Planning, Scheduling and Coordination, Reinforcement, Deep Learning in Robotics and Automation
Abstract: The Flexible Job Shop Scheduling Problem (FJSP) is an NP-hard optimization challenge with significant industrial applications, especially for large-scale instances. Traditional approaches often struggle with time-intensive design processes and suboptimal performance as problem size increases. This paper introduces a scalable, dynamic graph-based reinforcement learning framework designed to address large-scale FJSP efficiently. At each scheduling step, the framework focuses on relevant information by pruning superfluous operation and machine nodes and arcs. By modeling FJSP as a dynamic graph, our approach reformulates the scheduling task as a Markov decision process (MDP). The experimental results show that our method outperforms prior works, including traditional heuristics and leading algorithms, within a feasible time.
|
|
WeDT9 |
Room T9 |
Intelligent Modeling and Optimization |
Special Session |
Chair: Kim, Minhee | University of Florida |
Organizer: Wang, Di | Shanghai Jiao Tong University |
Organizer: Wang, Yuanxiang | Tongji University |
Organizer: Huang, Jingsi | Peking University |
Organizer: Kim, Minhee | University of Florida |
|
16:30-16:48, Paper WeDT9.1 | |
Risk Assessment Method for Urban Distribution Systems under Extreme Heavy Rain Disasters (I) |
|
Zheng, Kang | Beijing Jiaotong University |
Zhao, Yuxin | Beijing Jiaotong University |
Huang, Jingsi | Peking University |
Xi, Yanna | State Grid Beijing Electric Power Company Limited |
Liang, Chen | State Grid Beijing Electric Power Company Limited |
Wu, Xiangyu | Beijing Jiaotong University |
Keywords: Probability and Statistical Methods, Power and Energy Systems automation, Smart Grids
Abstract: Extreme natural disasters often lead to large-scale power outages in urban distribution systems. Therefore, this paper models extreme torrential rain disasters and proposes a risk assessment method for urban distribution systems under such extreme conditions. The research includes: describing the impact of torrential rain on distribution systems; dividing the power supply area into grids and establishing a time-varying failure rate model for equipment; setting up power flow constraints; and proposing operational risk indicators. Using the IEEE-33 bus system as an example, the results show that this method can effectively assess the risk of distribution systems under extreme torrential rain conditions.
|
|
16:48-17:06, Paper WeDT9.2 | |
Iterative Durability Design of Products Via Degradation-Informed Bayesian Optimization |
|
Kim, Minhee | University of Florida |
Keywords: AI-Based Methods, Diagnosis and Prognostics, Product Design, Development and Prototyping
Abstract: Green manufacturing has become a pressing issue in recent years, driven by the acknowledgment of its long-term social and economic benefits, increasing demand for environmentally friendly products, and expanding regulations. One of the key approaches to attaining sustainability in green manufacturing is to design long-lasting products. Nevertheless, the traditional approaches have faced several unique challenges in designing a product with an extended lifetime, such as costly and time-consuming procedures, as well as the noisy, sparse, insufficient or incomplete data. This paper proposes a novel framework to tackle these issues from a data-driven perspective. Specifically, the proposed method employs Bayesian optimization with Monte-Carlo acquisition functions to take into account both product design factors and degradation signals and incorporate the inherent uncertainty in the modeling of underlying degradation processes and prediction of product lifetimes. A series of simulation studies are presented to assess the performance of the proposed method. A case study on the Lithium-ion battery dataset is further conducted, which demonstrates the advantages of the proposed method over existing benchmark approaches. Note to Practitioners—This research paper provides a novel data-driven approach to finding an optimal product design with a prolonged lifetime. The conventional methods for the product lifetime extension are often time-consuming and expensive, while the associated data is in general noisy or incomplete. The proposed framework addresses these challenges by encoding the general degradation path model into the Bayesian optimization.
|
|
17:06-17:24, Paper WeDT9.3 | |
Turbine Generator Condition Monitoring Using Bayesian-Optimized Multi-Output Gaussian Process (I) |
|
Wang, Ying | Shanghai Jiao Tong University |
Pan, Xinyu | Shanghai Jiao Tong University |
Wang, Di | Shanghai Jiao Tong University |
Ju, Zhenhao | Shanghai Electric Machinery Co., Ltd |
Zhang, Zhou | Shanghai Electric Machinery Co., Ltd |
Li, Mingyin | Shanghai Electric Machinery Co., Ltd |
Zhang, Yan | Shanghai Electric Machinery Co., Ltd |
Keywords: Big-Data and Data Mining, Probability and Statistical Methods, Data fusion
Abstract: Performing online condition monitoring for turbine generators is challenging but imperative to avoid unexpected failure and reduce maintenance costs. To achieve long-term signal prediction, this paper proposes a Bayesian-estimated multi-output Gaussian process (MOGP) framework. In the proposed framework, the mean function captures the global trend caused by degradation, and MOGP models the residual term between the mean function and the observed value which captures the local variation caused by measurement noise and environmental factors. The covariance function of MOGP characterizes the temporal and cross-correlation between observations of signals simultaneously to achieve precise prediction. Moreover, the parameters of the MOGP are learned via Bayesian estimation to avoid overfitting issues. Then, the signals are fused into a 1-D health index (HI) which assesses the health status of turbine generators. Finally, the proposed method is applied to two real turbine generators, and the result compared with both statistical and deep learning-based methods demonstrates its superiority.
|
|
17:24-17:42, Paper WeDT9.4 | |
Empowering PHM Applications with Time Series Foundation Models: A Unified Multi-Task Learning Approach (I) |
|
Yu, Yongzi | Hong Kong University of Science and Technology (Guangzhou) |
Zhu, Feng | The Hong Kong University of Science and Technology (Guangzhou) |
Wang, Di | Shanghai Jiao Tong University |
Tsung, Fugee | HKUST |
Keywords: Sensor Fusion, Big-Data and Data Mining, AI-Based Methods
Abstract: Currently, small, task-specific models dominate the development of Prognostics and Health Management (PHM) applications. However, these isolated models often struggle to address the diverse and fragmented requirements present in real industrial environments. The emergence of time-series foundation models has attracted considerable attention, providing a more flexible and effective approach for PHM applications. This study explores how time-series foundation models can enhance PHM applications. Using a typical aero-engine degradation dataset as the research context, we propose a novel unified multi-task learning approach that leverages pre-trained time-series foundation models. Specifically, we utilize these foundation models as the basis for time-series representation learning to tackle various PHM tasks. To accommodate the diverse requirements of these tasks, we design specialized output heads tailored for multi-task learning objectives. The pre-trained foundation model is then fine-tuned with specific datasets to develop localized task-specific models. We validate our approach through case studies using the C-MAPSS datasets. The experimental results demonstrate the feasibility and effectiveness of foundation models for the development of PHM applications.
|
|
17:42-18:00, Paper WeDT9.5 | |
A Physics-Informed GAN Framework for Modeling a Spatiotemporal Correlated Temperature Field During Grain Storage (I) |
|
Zhang, Peihan | Shanghai Jiao Tong University |
Wang, Di | Shanghai Jiao Tong University |
Keywords: AI-Based Methods, Big-Data and Data Mining, Sensor Networks
Abstract: Accurate modeling of the spatiotemporal temperature field in grain storage is crucial for maintaining grain quality and optimizing storage management. However, existing methods struggle to accurately capture the stochastic variations and high-dimensional correlation in spatiotemporal dynamics, thereby limiting their effectiveness in achieving higher predictive accuracy in real-world applications. This paper proposes an improved Physics-Informed Generative Adversarial Network (PI-GAN) framework for modeling complex spatiotemporal dynamics while quantifying stochastic uncertainties. Stochastic latent variables are introduced to capture high-dimensional uncertainties, while physics constraints are embedded into both the generator and discriminator to fully exploit adversarial learning. By integrating PDE constraints into the GAN structure, the proposed method enhances generalization performance and achieves higher predictive accuracy. To validate its effectiveness, the framework is applied to a grain storage case study, modeling the temperature field within a granary. Model results demonstrate that the proposed approach outperforms state-of-the-art methods in terms of predictive accuracy and robustness.
|
|
WeDT10 |
Room T10 |
Product Design and Manufacturing 1 |
Regular Session |
Chair: Sheng, Weihua | Oklahoma State University |
|
16:30-16:48, Paper WeDT10.1 | |
Design of an Automatic Robotic System for Zebrafish Larval Heart Microinjection |
|
Guo, Zhongyi | University of Macau |
Xu, Qingsong | University of Macau |
Keywords: Automation at Micro-Nano Scales, Robotics and Automation in Life Sciences, Manipulation Planning
Abstract: Zebrafish is an important model organism, and microinjection of zebrafish larvae is a common operation in biological laboratories. The widely used manual microinjection has the disadvantages of low efficiency, low success rate, and low consistency. This paper presents an automatic robotic system for microinjection of zebrafish larval hearts. A localization method based on modulo operation has been proposed for the larval heart. In addition, a new software system with a user-friendly interface has been developed, which integrates the designed heart localization algorithm. A prototype robotic system for automatic heart microinjection of zebrafish larvae has been developed by combining multiple hardware devices. Moreover, relevant experimental studies were conducted, and the results indicated that the robotic system can effectively replace human operators and accurately complete automated microinjection targeting zebrafish larval hearts. The developed system can be used in drug screening research for cardiovascular-related diseases of zebrafish larvae.
|
|
16:48-17:06, Paper WeDT10.2 | |
Application-Oriented Co-Design of Motors and Motions for a 6DOF Robot Manipulator |
|
Stein, Adrian | Louisiana State University |
Wang, Yebin | Mitsubishi Electric Research Laboratories |
Sakamoto, Yusuke | Mitsubishi Electric Corporation |
Wang, Bingnan | Mitsubishi Electric Research Laboratories |
Fang, Huazhen | University of Kansas |
Keywords: Optimization and Optimal Control, Motion Control, Industrial and Service Robotics
Abstract: This work investigates an application-driven co-design problem where the motion and motors of a six degrees of freedom robotic manipulator are optimized simultaneously, and the application is characterized by a set of tasks. Unlike the state-of-the-art which selects motors from a product catalogue and performs co-design for a single task, this work designs the motor geometry as well as motion for a specific application. Contributions are made towards solving the proposed co-design problem in a computationally-efficient manner. First, a two-step process is proposed, where multiple motor designs are identified by optimizing motions and motors for multiple tasks one by one, and then are reconciled to determine the final motor design. Second, magnetic equivalent circuit modeling is exploited to establish the analytic mapping from motor design parameters to dynamic models and objective functions to facilitate the subsequent differentiable simulation. Third, a direct-collocation-based differentiable simulator of motor and robotic arm dynamics is developed to balance the computational complexity and numerical stability. Simulation verifies that higher performance for a specific application can be achieved with the multi-task method, compared to several benchmark co-design methods.
|
|
17:06-17:24, Paper WeDT10.3 | |
Polarized Magnetic Hydrogel Microrobot Collectives for Efficient Cooperation and Self-Assembly |
|
Chen, Yuanhe | University of Macau |
Xu, Zichen | University of Macau |
Qiu, Xuanping | University of Macau |
Xu, Qingsong | University of Macau |
Keywords: Mechatronics in Meso, Micro and Nano Scale, Mechanism Design in Meso, Micro and Nano Scale, Automation at Micro-Nano Scales
Abstract: Microrobot collectives with good organizational manners provide a promising strategy to enable individual microrobots to tackle complex tasks. However, current microrobot collectives have difficulty forming robust connections among individuals. This paper proposes a novel design of the Polarized Magnetic Hydrogel Microrobot (PMHR) to strengthen the connections among individuals, thereby enhancing the efficiency of microrobot cooperation and self-assembly. The improved connections enable swift collective formation, dexterous manipulation, flexible locomotion for climbing obstacles, and resistance to harsh environmental disturbances. Experimental studies have been conducted on the PMHR for executing complex tasks such as swarm formation, resistance to fluid backflow, and object transportation, all controlled through external magnetic fields. The results show that PMHR can maintain stability and positioning under fluid flow, exert substantial forces to manipulate objects, and form dynamic and reconfigurable swarms. The results demonstrate the superior mobility and force exertion of PMHR compared to non-magnetized microspheres, emphasizing their potential for applications in micromanipulation, biomedical fields, and robotic systems.
|
|
17:24-17:42, Paper WeDT10.4 | |
AssemblyDepth: A Large-Scale Dataset and Domain-Bridging Method for Industrial Assembly Recognition |
|
Murray, Kevin | Overlab LLC |
Duric, Zoran | George Mason University |
Keywords: Computer Vision for Manufacturing, Assembly
Abstract: In industrial automation, reliably recognizing the state of partially assembled equipment is crucial for robotic assembly, maintenance, and quality control. However, progress in this area has been hampered by two major challenges: the absence of a comprehensive real-world dataset for complex industrial assemblies, and the persistent domain gap between synthetic training data and real operating conditions. In this paper, we present two key contributions. First, we introduce a large-scale dataset of real industrial assemblies—comprising 90 scenes from 6 diverse pieces of equipment with over 700 parts—providing detailed ground truth for assembly state. Second, we propose a novel two-stage recognition approach that integrates state-of-the-art monocular depth estimation as a preprocessing step, which effectively reduces the synthetic-to-real domain gap to improve recognition performance. Extensive experiments validate our approach, delivering robust 6D pose estimation and part classification in challenging industrial settings. Code, data, and pretrained weights are available at https://github.com/overlab-kevin/assembly_depth.
|
|
17:42-18:00, Paper WeDT10.5 | |
Automated SQL Query Generation for an Intelligent Sewer Management System Using Large Language Models |
|
Chitte, Dharmendra Reddy | Oklahoma State University |
Liang, Fei | Oklahoma State University |
Sheng, Weihua | Oklahoma State University |
Shan, Yongwei | Oklahoma State University |
Khaleghian, Hossein | InfraTie Solutions LLC |
Keywords: Logistics, AI-Based Methods, Inventory Management
Abstract: The aging U.S. wastewater infrastructure poses significant environmental and public health risks. While municipalities collect data for asset management, handling large datasets can overwhelm decision-makers. Natural language processing (NLP) simplifies data access by enabling users to retrieve information through voice or text commands, reducing reliance on IT support and improving management efficiency. This paper investigates automated SQL query generation from natural language, a critical step in developing an intelligent sewer management system. Large Language Models (LLMs) are employed for this purpose. The proposed method adopts several techniques to improve the performance of SQL query generation, including data augmentation, prompt engineering, and fine-tuning. Experimental results are presented to evaluate the proposed method with different settings and compare it with a baseline method.
|
| |