Last updated on January 28, 2025. This conference program is tentative and subject to change.
Technical Program for Thursday January 23, 2025
|
ThuAT1 |
Forum 1-2-3 |
Human-Robot Interaction I |
In-person Regular Session |
Chair: Hashimoto, Hideki | Chuo University |
Co-Chair: Sievers, Thomas | University of Lübeck |
|
08:30-08:45, Paper ThuAT1.1 | |
Silence Is Golden - Making Pauses in Human Utterances Comprehensible for Social Robots in Human-Robot Interaction |
|
Sievers, Thomas | University of Lübeck |
Keywords: Human-Robot/System Interaction, Human-Robot Cooperation/Collaboration, Modeling and Simulating Humans
Abstract: People pause when speaking for a variety of reasons - often in the middle of a sentence. It is not easy for a machine to tell the difference between a pause for thought and an intended turn, yet smooth turn-taking is essential for flawless communication. Pauses within a sentence reveal something about the current emotional state of the speaker, and a correct interpretation of emotions is crucial for mutual understanding between actors in human-robot interaction (HRI). How can we assess what the pauses a person makes in a dialogue with a social robot tell us about their emotional state? The speech-to-text tool Whisper from OpenAI enables robust speech recognition across different languages and the measurement of pauses between words. These pauses can be used to improve the assessment of the speaker's emotional state by having a Large Language Model (LLM) from OpenAI (ChatGPT) evaluate human utterances, including speech pauses, using sentiment analysis. The inclusion of pauses as a non-verbal cue provides a helpful component for such an analysis.
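As a rough illustration of the pause measurement described above, the sketch below extracts inter-word gaps from word-level timestamps (the dict format mirrors what recognizers such as Whisper can emit when word timestamps are enabled) and rewrites the utterance with explicit pause markers that a downstream sentiment prompt can "see". The threshold and the marker syntax are illustrative assumptions, not the authors' implementation.

```python
# Sketch: turning word-level timestamps into explicit pause annotations.
# The input format and the 0.3 s threshold are assumptions for illustration.

def extract_pauses(words, min_pause=0.3):
    """Return (preceding_word, following_word, gap_seconds) for every
    inter-word gap longer than min_pause."""
    pauses = []
    for prev, nxt in zip(words, words[1:]):
        gap = nxt["start"] - prev["end"]
        if gap >= min_pause:
            pauses.append((prev["word"], nxt["word"], round(gap, 2)))
    return pauses

def annotate_pauses(words, min_pause=0.3):
    """Rebuild the utterance with <pause Xs> markers so a sentiment-analysis
    prompt receives the non-verbal cue alongside the words."""
    parts = [words[0]["word"]]
    for prev, nxt in zip(words, words[1:]):
        gap = nxt["start"] - prev["end"]
        if gap >= min_pause:
            parts.append(f"<pause {gap:.1f}s>")
        parts.append(nxt["word"])
    return " ".join(parts)

words = [
    {"word": "I", "start": 0.0, "end": 0.2},
    {"word": "think", "start": 0.25, "end": 0.6},
    {"word": "maybe", "start": 1.8, "end": 2.2},  # a 1.2 s thinking pause
]
print(annotate_pauses(words))  # I think <pause 1.2s> maybe
```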
|
|
08:45-09:00, Paper ThuAT1.2 | |
Development of Cybernic Mirror System for Improvement of Hand Motor Functions for Patients with Hemiplegia |
|
Konishi, Sara | University of Tsukuba |
Uehara, Akira | University of Tsukuba |
Kawamoto, Hiroaki | University of Tsukuba |
Sankai, Yoshiyuki | University of Tsukuba |
Keywords: Rehabilitation Systems, Welfare systems, Human-Robot/System Interaction
Abstract: In cybernics treatment, the wearer performs voluntary movements in which the neuromuscular and central nervous systems work to produce the intended movement of the integrated wearable cyborg and the wearer's body. In this way, an interactive Bio-Feedback (iBF) loop is established between the wearer and the wearable cyborg. Using a wearable cyborg, motor function can be improved. Cybernics treatment has been covered by medical insurance in Japan following RCT clinical evaluations. The cybernic mirror system is a wearable cyborg that performs intended bimanual symmetrical movements based on motor commands from the unaffected side, and it has the potential to enable a new cybernics treatment specifically focused on improving finger functionality. The purpose of this study is to develop a cybernic mirror system for intended bimanual symmetrical movements, transferring the hand movements of the unaffected hand to the affected hand, and to confirm the feasibility of assist movement synchronized with the wearer's movement intention, based on the delay time calculated through basic experiments. The cybernic mirror system consists of a hand movement assist unit and a hand movement control unit. The hand movement control unit acquires the skeletal information of the unaffected hand through a camera. The hand movement assist unit, which utilizes a tendon-driven structure, assists flexion and extension of the affected hand. In basic experiments with three participants, the developed system provided sufficient response performance for bimanual symmetrical movements of the fingers between both sides. Additionally, the developed system enabled bimanual symmetrical movements for each of the five fingers individually. In conclusion, we confirmed the feasibility of assist movement synchronized with the wearer's movement intention, based on the delay time calculated through basic experiments.
|
|
09:00-09:15, Paper ThuAT1.3 | |
Study on the Lower Garments Dressing and Undressing Portable Support System for Independent Daily Toilet Activity |
|
Miyazaki, Nanami | University of Tsukuba |
Uehara, Akira | University of Tsukuba |
Kawamoto, Hiroaki | University of Tsukuba |
Sankai, Yoshiyuki | University of Tsukuba |
Keywords: Welfare systems, Human-Robot/System Interaction, Rehabilitation Systems
Abstract: Independent excretion is crucial for maintaining individuals' self-worth. Toilet care significantly reduces the quality of life of care-receivers and imposes physical and mental burdens on caregivers. Even if care-receivers can move to the toilet independently using a walker, they often require assistance with dressing and undressing their lower garments, including trousers and underwear. A walker that supports dressing and undressing lower garments could not only safely increase opportunities for independent walking to prevent disuse but also reduce the burden on caregivers. The purpose of this study is to develop a portable support system for dressing and undressing lower garments, attached to a walker, that enables care-receivers with independent gait function to perform their daily toilet activity with dignity. In addition, we confirmed the basic performance of independent dressing and undressing through a basic experiment. The system consists of three units: a weight support unit, a lower garment stretching unit, and a lower garment up-and-down unit. The weight support unit assists the user in walking and maintaining a standing posture. The lower garment stretching unit uses a Lazy Tong mechanism that stretches the waistband of the lower garments over hooks, holding the garments by friction while reducing friction between the skin and the garments during dressing and undressing. The lower garment up-and-down unit raises and lowers the lower garments held by the stretching unit. The stretching unit and up-and-down unit are attached to both sides of the weight support unit. To confirm the basic performance of the system, we conducted a basic experiment covering a series of dressing and undressing movements in the toilet with an able-bodied participant. As a result, the system could dress and undress the lower garments up to the target height without the garments coming off.
|
|
09:15-09:30, Paper ThuAT1.4 | |
Front Hair Styling Robot System Using Path Planning for Root-Centric Strand Adjustment |
|
Kim, Soonhyo | The University of Tokyo |
Kanazawa, Naoaki | The University of Tokyo |
Hasegawa, Shun | The University of Tokyo |
Kawaharazuka, Kento | The University of Tokyo |
Okada, Kei | The University of Tokyo |
Keywords: Motion and Path Planning
Abstract: Hair styling is a crucial aspect of personal grooming, significantly influenced by the appearance of front hair. While brushing is commonly used both to detangle hair and for styling purposes, existing research primarily focuses on robotic systems for detangling hair, with limited exploration into robotic hair styling. This research presents a novel robotic system designed to automatically adjust front hairstyles, with an emphasis on path planning for root-centric strand adjustment. The system utilizes images to compare the current hair state with the desired target state through an orientation map of hair strands. By concentrating on the differences in hair orientation and specifically targeting adjustments at the root of each strand, the system performs detailed styling tasks. The path planning approach ensures effective alignment of the hairstyle with the target, and a closed-loop mechanism refines these adjustments to accurately evolve the hairstyle towards the desired outcome. Experimental results demonstrate that the proposed system achieves a high degree of similarity and consistency in front hair styling, showing promising results for automated, precise hairstyle adjustments.
|
|
09:30-09:45, Paper ThuAT1.5 | |
Prototyping Support System for Co-Development of Performance Robots by Robot Engineers and Dance Performers |
|
Shido, Hiroki | Waseda University |
Zecca, Massimiliano | Loughborough University |
Nishi, Hiroko | Toyo Eiwa University |
Ishii, Hiroyuki | Waseda University |
Keywords: Human-Robot/System Interaction, Entertainment Systems, Human-Robot Cooperation/Collaboration
Abstract: Robots are used in many situations. However, there are fields that could benefit from robotic technology but do not yet use it; the art field is one of these. Compared with new media art, there are few art forms that use robots. Robots are media with a body and intelligence, and they facilitate various types of interaction that recent new media do not. Therefore, robotic media can provide novel experiences with artworks not only to the audience but also to the artist, just as new media provide interactive and immersive experiences with artworks. Our long-term goal is to create robotic media artworks and to reveal what occurs with these artworks. Our initial aim is to facilitate interactive discussions with dance performers, for which we developed a co-development tool named "Prototyping Support System." This system consists of an RGB-D camera, a PC, and a projector, and achieves touch interaction with a robot in VR. Furthermore, we investigated the impact of a robot's reaction time on its perceived emotion using this system. An experiment showed that a faster reaction time resulted in a greater perception that the robot was surprised, especially when the robot reacted by transforming. This paper discusses the system and its application in an experiment designed to determine the impact of the robot's reaction time and form in VR.
|
|
09:45-10:00, Paper ThuAT1.6 | |
Development of Admittance Control Considering Operability in Human-Robot Cooperative Motion |
|
Abe, Kota | University of Yamanashi |
Noda, Yoshiyuki | University of Yamanashi |
Keywords: Human-Robot/System Interaction, Human-Robot Cooperation/Collaboration, Human Factors and Human-in-the-Loop
Abstract: As production shifts from mass production to high-mix low-volume production, factory automation is evolving from simple repetitive tasks to increasingly complex assembly tasks that require nonroutine and skillful work. In addition to knowledge of mechatronics such as sensors, actuators, and control devices, introducing automation equipment in factories requires familiarity with the production site, such as the design of jigs and fixtures according to the manufacturing process. Moreover, although industrial robots, which are general-purpose machines, are used to robotize production processes, teaching robot movements requires specialized knowledge of coordinate input and motion programming. To solve these problems, cooperative robots whose motion can be taught easily, for example by direct teaching, have been in demand in recent years. One possible application of cooperative robots is the cooperative assembly of heavy parts by a worker and a robot: the worker directly manipulates the heavy part while the cooperative robot grasps it to reduce the workload. This kind of human performance improvement using cooperative robots is becoming common. However, operability is poor when the commonly used mass-damper system with virtual mass and virtual viscous resistance is applied as the virtual model of admittance control. Therefore, in this study, we propose an admittance control method that considers operability in cooperative human-robot transfer of heavy objects. In the proposed method, a mass-damper system with bounded viscous resistance is applied as the virtual model of admittance control. Frequency analysis reveals that the mass-damper system with bounded viscous resistance improves operability. The usefulness of the proposed method is demonstrated by experiments in which a cooperative robot transfers heavy objects under direct manipulation.
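The bounded-viscous-resistance idea in the abstract above can be sketched in a few lines of admittance control. The mass, damping, and saturation values below are illustrative assumptions, not the paper's parameters; the point is only that once the viscous force saturates, a constant human push keeps producing acceleration, so less effort is needed at speed than with an unbounded damper.

```python
# Sketch: virtual mass-damper admittance with a bounded (saturated)
# viscous term.  All numerical values are illustrative assumptions.

def bounded_viscous_force(v, d=40.0, f_max=15.0):
    # Viscous resistance d*v, saturated at +/- f_max: resistance stops
    # growing at higher speeds, the 'bounded' behaviour discussed above.
    return max(-f_max, min(f_max, d * v))

def admittance_step(v, f_human, m=10.0, dt=0.01):
    # Virtual dynamics m * dv/dt = f_human - f_viscous(v), Euler-integrated.
    return v + (f_human - bounded_viscous_force(v)) / m * dt

v = 0.0
for _ in range(300):          # push with a constant 20 N for 3 s
    v = admittance_step(v, 20.0)
# An unbounded damper (force d*v everywhere) would settle at
# 20 N / 40 Ns/m = 0.5 m/s; the bounded damper lets the operator go faster.
print(v > 0.5)  # True
```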
|
|
10:00-10:15, Paper ThuAT1.7 | |
Body Motion Noise Reduction of Silent Speech Recognition Using Facial Surface EMG |
|
Kimoto, Ryosuke | Chuo University |
Ohhira, Takashi | Chuo University |
Hashimoto, Hideki | Chuo University |
Keywords: Machine Learning, Haptics and tactile sensors
Abstract: The demand for voice recognition systems has increased over the years, and many products now employ this technology. Users can issue voice commands to search the web, play music, and operate home appliances. Voice recognition systems are being implemented in a variety of applications and are becoming part of a new way of life. However, voice recognition has several problems, for example, leakage of personal and confidential information, inaccurate recognition due to ambient noise, and difficulties faced by people with speech impediments. To solve these problems, Silent Speech Recognition (SSR), which reads the content of speech without vocalization, has been attracting attention in recent years. One application of SSR is to enable communication by converting silent speech into voice and text, even in crowded or difficult-to-speak environments; miniaturization further improves its convenience. This technology can also assist the speech impaired, and it benefits security, as users can exchange confidential and personal information without fear that the contents of conversations and messages will be leaked to others. It can also be used as a pointing device, like a remote control that responds to simple words and to information such as direction and distance. Current SSR systems need improvement in recognizing speech content, especially when the user's body is moving: previous studies limited measurements to steady-state conditions and tended to overlook the effects of body motion. In this paper, we describe the measurement method in detail and propose an algorithm, designed with optimal parameters, to reduce the effect of body motion noise. The proposed method aims to minimize body motion noise and increase the potential of silent speech technology.
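As a minimal illustration of body-motion noise reduction (not the authors' algorithm), one can subtract a slow moving-average baseline from the sEMG envelope, on the assumption that body-motion artifacts are low-frequency compared with facial muscle activity. Window length and the baseline model are illustrative choices.

```python
# Sketch: removing a slow baseline (assumed body-motion artifact) from an
# sEMG envelope by moving-average subtraction.  Illustrative only.

def moving_average(x, win):
    """Centered moving average; the window shrinks near the edges."""
    half = win // 2
    out = []
    for i in range(len(x)):
        seg = x[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out

def remove_motion_artifact(emg, win=25):
    """Subtract the slow moving-average baseline from the envelope,
    keeping only the faster components attributed to facial sEMG."""
    baseline = moving_average(emg, win)
    return [e - b for e, b in zip(emg, baseline)]

# A constant 'drift' is removed entirely; fast components survive.
print(max(abs(v) for v in remove_motion_artifact([5.0] * 100)))  # 0.0
```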
|
|
10:15-10:30, Paper ThuAT1.8 | |
Communication Method Based on Pupil Diameter Changes for Amyotrophic Lateral Sclerosis (ALS) Patients: A System Designed for Adaptation to Symptom Progression |
|
Sawada, Nonoka | University of Tsukuba |
Uehara, Akira | University of Tsukuba |
Kawamoto, Hiroaki | University of Tsukuba |
Sankai, Yoshiyuki | University of Tsukuba |
Keywords: Human Interface, Welfare systems, Virtual Reality and Interfaces
Abstract: Patients with severe physical disabilities owing to amyotrophic lateral sclerosis (ALS) typically have normal cognitive function; however, they often experience difficulties with speech, writing, and oculomotor functions as their symptoms progress. There is a need for a consistent communication system that can adapt to these progressing symptoms. The ability to modulate pupil diameter remains intact in patients who experience difficulties with eye movements, speech, and writing owing to ALS. This study focuses on the behavior of changes in pupil diameter when attention is directed to a target with periodic changes in light intensity. The purpose of this study is to develop a communication system that consistently adapts to the progression of symptoms, based not on eye movement but on changes in pupil diameter, enabling patients with severe disabilities due to ALS to communicate daily, even when their eye movement symptoms worsen. Additionally, the study aims to confirm the feasibility of the developed system for communication that reflects the intentions of patients with ALS. We developed a wearable communication system with a head-mounted display (HMD) that can be used in any situation. Users can express “Yes” or “No” to their surroundings without moving their gaze, directing attention solely to the target. Targets are displayed on the HMD and flicker periodically in different phases. The intention-estimation algorithm determines which target the user selects based on the pattern of changes in pupil diameter. To evaluate the system’s performance, an experiment was conducted with one patient with ALS who has difficulty with speech and writing. The results showed that the accuracy of intention estimation for yes/no responses was 100%, and the average time to estimate an intention was 6.88 s. In conclusion, we confirmed the feasibility of a daily communication system based on pupil diameter for patients with severe disabilities due to ALS.
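The selection principle described above (targets flickering in different phases, attention modulating pupil diameter) can be sketched by correlating the pupil-diameter trace with phase-shifted reference sinusoids and picking the best-matching target. The frequencies, amplitudes, and scoring rule are illustrative assumptions, not the paper's algorithm.

```python
import math

# Sketch: phase-based intention estimation from a pupil-diameter trace.
# All signal parameters are illustrative assumptions.

def phase_score(signal, freq, phase, dt):
    # Correlate the trace with a reference sinusoid at the target's
    # flicker frequency and phase.
    return sum(x * math.cos(2 * math.pi * freq * i * dt - phase)
               for i, x in enumerate(signal))

def estimate_choice(signal, freq, phases, dt=0.05):
    # Pick the target (e.g. index 0 = "Yes", 1 = "No") whose flicker phase
    # best explains the periodic component of the pupil response.
    scores = [phase_score(signal, freq, p, dt) for p in phases]
    return scores.index(max(scores))

# Synthetic pupil trace locked to phase 0 (the "Yes" target) over a
# 3 mm baseline, sampled for 10 s.
freq, dt = 1.0, 0.05
trace = [3.0 + 0.2 * math.cos(2 * math.pi * freq * i * dt) for i in range(200)]
print(estimate_choice(trace, freq, phases=[0.0, math.pi]))  # 0
```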
|
|
ThuAT2 |
Forum 9-10-11 |
Control and Planning I |
In-person Regular Session |
Chair: Akai, Naoki | Nagoya University |
Co-Chair: Yoneyama, Jun | Aoyama Gakuin University |
|
08:30-08:45, Paper ThuAT2.1 | |
Kinodynamic Modular Approach Local Trajectory Planner for Straightforward Motions of Differential Wheeled Mobile Robots |
|
Byeon, Yong Jin | Twinny |
Jang, Mingyung | Twinny |
Kim, Yunjeong | Twinny |
Keywords: Motion and Path Planning, Automation Systems, Control Theory and Technology
Abstract: This paper introduces a Kinodynamic Modular Approach (KMA) local planning algorithm designed to generate obstacle avoidance trajectories with both feasible and straightforward motions for differential wheeled mobile robots (DWMRs). To enhance the smoothness and feasibility of the trajectory, the proposed planner incorporates a kinodynamic system that considers non-holonomic constraints with velocity and acceleration limits. Motion that is not only consistent but also straightforward is obtained by defining basic motions and selecting the appropriate one according to the given situation. Simulation and real-world experiment results demonstrate the intuitiveness and applicability of the proposed algorithm compared to the time elastic band (TEB) algorithm.
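The kinodynamic constraints mentioned above can be sketched as a command clamp for a differential-drive robot: a desired (v, w) is clipped to velocity, acceleration, and per-wheel speed limits. This is an illustrative stand-in for the constraints a planner like KMA would enforce; all limit values and the wheel model are assumptions.

```python
# Sketch: clamping a desired (v, w) to differential-drive kinodynamic
# limits.  Limit values and wheel geometry are illustrative assumptions.

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def feasible_command(v_des, w_des, v_prev, w_prev, dt,
                     v_max=1.0, w_max=1.5, a_max=0.5, aw_max=1.0,
                     wheel_base=0.4, wheel_v_max=1.2):
    """Clamp (v_des, w_des) to velocity, acceleration, and per-wheel
    speed limits of a differential-drive robot."""
    v = clip(v_des, -v_max, v_max)
    w = clip(w_des, -w_max, w_max)
    # Acceleration limits relative to the previous command.
    v = clip(v, v_prev - a_max * dt, v_prev + a_max * dt)
    w = clip(w, w_prev - aw_max * dt, w_prev + aw_max * dt)
    # Per-wheel speed limits of the non-holonomic drive.
    left = v - w * wheel_base / 2
    right = v + w * wheel_base / 2
    scale = max(abs(left), abs(right)) / wheel_v_max
    if scale > 1.0:
        v, w = v / scale, w / scale
    return v, w

v, w = feasible_command(2.0, 0.0, v_prev=0.9, w_prev=0.0, dt=0.1)
print(round(v, 2), w)  # 0.95 0.0  (acceleration-limited, not speed-limited)
```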
|
|
08:45-09:00, Paper ThuAT2.2 | |
Dropout MPC: An Ensemble Neural MPC Approach for Systems with Learned Dynamics |
|
Syntakas, Spyridon | University of Ioannina |
Vlachos, Kostas | University of Ioannina |
Keywords: Machine Learning, Autonomous Vehicle Navigation, Control Theory and Technology
Abstract: Neural networks are increasingly used in data-driven control as approximate models of the true system dynamics. Model Predictive Control (MPC) adopts this practice, leading to neural MPC strategies. This raises the question of whether the trained neural network has converged and generalized such that the learned model encapsulates an accurate approximation of the true dynamics, making it a reliable choice for model-based control, especially for disturbed and uncertain systems. To tackle this, we propose Dropout MPC, a novel sampling-based ensemble neural MPC algorithm that employs the Monte Carlo dropout technique on the learned system model. The closed loop is based on an ensemble of predictive controllers that are used simultaneously at each time step for trajectory optimization. Each member of the ensemble influences the control input through a weighted voting scheme; by employing different realizations of the learned system dynamics, neural control becomes more reliable by design. An additional strength of the method is that it offers, by design, a way to estimate future uncertainty, leading to cautious control. While the method targets uncertain systems with complex dynamics, where models derived from first principles are hard to infer, we showcase the application using data gathered in the laboratory from a real mobile manipulator and employ the proposed algorithm for the navigation of the robot in simulation.
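A toy version of the ensemble idea can be sketched in pure Python: each "dropout realization" of a learned model proposes a control input, the proposals are combined by voting (uniform weights here, where the paper uses a weighted scheme), and the spread of the proposals serves as a cheap uncertainty estimate. The dynamics and the dropout stand-in below are assumptions for illustration, not the authors' models.

```python
import random
import statistics

# Sketch: Monte Carlo dropout ensemble for one-step 'MPC'.
# The scalar dynamics x' = a*x + u and the dropout stand-in are toys.

def dropout_model(x, u, mask_seed):
    # One dropout realization of a learned model: each seed 'drops'
    # different weights, giving a slightly different dynamics prediction.
    rng = random.Random(mask_seed)
    a = 0.9 * (1.0 if rng.random() > 0.2 else 0.8)  # occasionally dropped
    return a * x + u

def member_control(x, seed, candidates):
    # Each ensemble member picks the input minimising its own
    # one-step quadratic cost on state and input.
    return min(candidates,
               key=lambda u: dropout_model(x, u, seed) ** 2 + 0.1 * u ** 2)

def dropout_mpc_step(x, n_members=20, candidates=None):
    candidates = candidates or [u / 10 for u in range(-10, 11)]
    proposals = [member_control(x, seed, candidates)
                 for seed in range(n_members)]
    u = statistics.mean(proposals)         # uniform-weight vote
    spread = statistics.pstdev(proposals)  # disagreement = uncertainty
    return u, spread

u, spread = dropout_mpc_step(1.0)
# u is a negative, stabilising input; spread reflects model disagreement.
```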
|
|
09:00-09:15, Paper ThuAT2.3 | |
Tightly Coupled Vector Field Inertial Localization |
|
Akai, Naoki | Nagoya University |
Keywords: Automation Systems, Sensor Fusion, Intelligent Transportation Systems
Abstract: This paper presents a tightly coupled Vector Field Inertial Localization (VFIL) method that integrates the capabilities of Vector Field Sensors (VFSs) and an Inertial Measurement Unit (IMU). While vector fields, such as magnetic or Wi-Fi signal intensities, provide vector data at each point, leveraging VFSs for accurate localization is challenging due to their limited measurement range. Within the VFIL framework, the IMU predicts the pose, and the VFS compensates for prediction errors. To strengthen the constraints provided by the VFS, it is essential to store vector data along the predicted trajectory. While this trajectory can be deduced from accumulated IMU readings, these accumulations often include significant error. Our solution is to register the vector data to a vector field map and simultaneously correct the accumulation error. We achieve this by adopting a factor-graph-based optimization method that concurrently estimates pose, velocity, and biases in both IMU and VFS measurements. To demonstrate the effectiveness of VFIL, we conduct simulations and real-world experiments, comparing it against particle-filtering and pose-graph-based optimization methods. Results reveal that VFIL consistently offers superior pose estimation accuracy over these baselines owing to its tightly coupled estimation.
|
|
09:15-09:30, Paper ThuAT2.4 | |
Enhancing End-Point Accuracy for Path-Following Motion of Articulated Redundant Arm |
|
Hasegawa, Koki | Institute of Science Tokyo |
Shizume, Yuki | Institute of Science Tokyo |
Nabae, Hiroyuki | Institute of Science Tokyo |
Endo, Gen | Institute of Science Tokyo |
Keywords: Mechanism Design, Systems for Search and Rescue Applications, Mechatronics Systems
Abstract: To enable decommissioning work at the Fukushima Daiichi Nuclear Power Plant, the development of long-reach articulated arms is underway. In decommissioning work, by setting a target path in three-dimensional space and guiding the entire arm along this path, the end effector can reach the target position while minimizing the volume traversed. Path-following motion is therefore effective for decommissioning work, as it is suitable for passing through narrow spaces and avoiding obstacles. However, conventional methods for calculating target joint angles in path following suffer from insufficient end-point accuracy. To improve this accuracy, we propose a target joint angle calculation method that achieves higher tip-positioning accuracy than conventional methods. Furthermore, we conducted hardware verifications using the proposed method and demonstrated its effectiveness.
|
|
09:30-09:45, Paper ThuAT2.5 | |
Motion Planning Method with Pushing Posture Selection for Arranging Office Chairs Using Dual-Arm Mobile Robot |
|
Suwa, Sotaro | Shinshu University |
Iwasaki, Takuya | Shinshu University |
Yamazaki, Kimitoshi | Shinshu University |
Keywords: Motion and Path Planning, Automation Systems, Control Theory and Technology
Abstract: In this study, a motion generation method is proposed to enable a dual-arm mobile manipulator to move a chair to a designated position. The method selects a robot pose for grasping the chair from a set of candidates, and we propose a way to evaluate the appropriateness of two-handed versus one-handed manipulation. After grasping the chair, the robot must transport it to the desired position; to this end, we develop a motion generation method using quadratic programming optimization, which calculates the displacement of the robot's joints while satisfying constraints such as joint limits and obstacle avoidance. Through simulation, we confirmed that an upper-body humanoid robot can perform a chair arrangement task in three common layouts using our methods.
|
|
09:45-10:00, Paper ThuAT2.6 | |
Unknown Input Observer for Takagi-Sugeno Fuzzy Bilinear System with Input and Output Disturbances |
|
Yoneyama, Jun | Aoyama Gakuin University |
Keywords: Control Theory and Technology, Automation Systems, System Simulation
Abstract: Estimating the state variables of systems with disturbances is an important problem. In practice, not all state variables are measurable, and disturbance noise enters the system; it is especially difficult to estimate the state of complicated nonlinear systems with input and output disturbances. In this paper, observer design methods are proposed for a discrete-time Takagi-Sugeno fuzzy bilinear system with unknown inputs and input/output disturbances. Our observer filters out the unknown input and estimates the state and the input/output disturbances of the system. Since the Takagi-Sugeno fuzzy bilinear system represents a very large class of nonlinear systems, such an unknown input observer is essential in many engineering fields. Observer design started with the parallel distributed observer (PDO), which is constructed from local linear observers and the appropriate grades of the membership functions. However, the design conditions for PDO are very conservative. To overcome this disadvantage, a non-PDO with a multiple-Lyapunov-matrix technique is proposed to design our observer. This design method is based on a multiple Lyapunov function with a sum of the membership functions and drastically reduces the conservatism. To demonstrate the validity of the proposed observer design approach, an illustrative numerical example is provided, and we end with concluding remarks.
|
|
10:00-10:15, Paper ThuAT2.7 | |
Given Data of System Requirements for Canonical Controller Design |
|
Ochiai, Yuki | University of Tsukuba |
Nguyen-Van, Triet | University of Tsukuba |
Kawai, Shin | University of Tsukuba |
Keywords: Control Theory and Technology
Abstract: This study aims to clarify the requirements for designing a canonical controller using data obtained from unknown systems. Based on Willems' fundamental lemma, this study discusses the conditions under which the system's data can achieve the desired behavior, and then clarifies the data requirements for a canonical controller. The clarified requirements provide sufficient conditions for designing a canonical controller using finite-length data. Furthermore, simulations were conducted to examine the practicality of the clarified requirements. The simulation results demonstrate that adding disturbances to the reference input can extend the finite interval that satisfies the design requirements for a canonical controller.
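Willems' fundamental lemma, on which the data requirements above rest, asks the input data to be persistently exciting, which is certified by the rank of a block Hankel matrix built from the data. A minimal scalar sketch follows (the paper's multi-input, canonical-controller setting is more involved; the sequences and depth here are illustrative).

```python
# Sketch: persistency-of-excitation check via the rank of a Hankel matrix
# built from scalar input data.  Illustrative of Willems' fundamental lemma.

def hankel(u, depth):
    """Hankel matrix of the given depth from a scalar sequence u;
    persistency of excitation of order `depth` requires full row rank."""
    cols = len(u) - depth + 1
    return [[u[i + j] for j in range(cols)] for i in range(depth)]

def rank(m, tol=1e-9):
    # Plain Gaussian-elimination rank, sufficient for this small sketch.
    m = [row[:] for row in m]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if abs(m[i][c]) > tol), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and abs(m[i][c]) > tol:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

u = [1, 0, 0, 1, 1, 0, 1, 0]        # a 'rich' input sequence
print(rank(hankel(u, 3)))            # 3 -> persistently exciting of order 3
```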
|
|
10:15-10:30, Paper ThuAT2.8 | |
A Lightweight Approach to Efficient Multimodal 2D Navigation and Mapping: Unified Laser-Scans As an Alternative to 3D Methods |
|
Noel, Ocean | CNRS-AIST Joint Robotic Laboratory |
Cisneros Limon, Rafael | National Institute of Advanced Industrial Science and Technology |
Kaneko, Kenji | National Inst. of AIST |
Kanehiro, Fumio | National Inst. of AIST |
Keywords: Sensor Fusion, Multi-Modal Perception, Motion and Path Planning
Abstract: In this paper, we propose a novel approach for efficient 2D navigation using a multimodal sensor fusion technique. Our method merges data from multiple sensors, such as LiDARs, cameras, and ultrasonic sensors, into a unified Laser-Scan, which serves as a foundation for faster and more lightweight navigation. By fusing sensor data at the Laser-Scan level, our approach enables the use of basic 2D Simultaneous Localization and Mapping (SLAM) algorithms for mapping tasks, or any other Laser-Scan-based features, while still benefiting from the rich information provided by multimodal 3D inputs. This results in a more computationally efficient solution than traditional 3D methods that rely on depth points or full multimodal SLAM systems. Our experimental results demonstrate that the proposed approach achieves comparable accuracy in mapping and localization while significantly reducing computational complexity and processing time. This research offers a promising alternative for real-time 2D navigation in resource-constrained autonomous systems, such as drones or other small unmanned vehicles.
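One plausible reading of Laser-Scan-level fusion is to project every sensor's returns into a common robot frame and keep the closest return per angular bin. The sketch below does exactly that for 2D points; the bin count, maximum range, and min-rule are assumptions for illustration, not the paper's specification.

```python
import math

# Sketch: fusing returns from several sensors into one virtual laser scan
# by keeping the closest return per angle.  Parameters are illustrative.

def unify_scans(points_per_sensor, n_bins=360, max_range=10.0):
    """Fuse 2D points (already in the robot frame) from several sensors
    into one scan of n_bins ranges, closest return winning per bin."""
    ranges = [max_range] * n_bins
    for points in points_per_sensor:
        for x, y in points:
            r = math.hypot(x, y)
            if r == 0 or r > max_range:
                continue
            b = int((math.atan2(y, x) + math.pi) / (2 * math.pi) * n_bins)
            ranges[b % n_bins] = min(ranges[b % n_bins], r)
    return ranges

lidar = [(2.0, 0.0)]        # obstacle seen 2 m ahead by the LiDAR
ultrasonic = [(1.0, 0.0)]   # a closer return in the same direction
scan = unify_scans([lidar, ultrasonic])
print(scan[180])  # 1.0 -> the nearest return wins
```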
|
|
ThuAT3 |
Forum 12 |
Perception and Sensing I |
In-person Regular Session |
Chair: Ando, Noriaki | National Institute of Advanced Industrial Science and Technology |
Co-Chair: Graabæk, Søren | University of Southern Denmark |
|
08:30-08:45, Paper ThuAT3.1 | |
Accurate Estimation of Fiducial Marker Positions Using Motion Capture System |
|
Tanonwong, Matus | Tohoku University |
Chiba, Naoya | Osaka University |
Hashimoto, Koichi | Tohoku University |
Keywords: Sensor Fusion, Vision Systems, Surveillance Systems
Abstract: In this paper, we present a method for aligning the coordinates of multiple cameras and sensors into a unified coordinate system using a motion capture system. Our simulated convenience store environment includes cameras and sensors with distinct coordinate systems, necessitating coordinate alignment. The motion capture system identifies retroreflective markers, while the other cameras detect fiducial markers to determine position and orientation. Three optimization algorithms are evaluated for computing a transformation matrix that aligns camera coordinates to motion capture coordinates, with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm achieving the best results (average errors of 1.13 centimeters and 3.90 degrees). Comparisons with fiducial marker pose estimation using the open-source Pupil Core software indicate that our method is more robust and consistent, with lower repeatability errors. Additionally, we examine the estimation errors in relation to the distances of the fiducial markers from the camera in order to minimize these errors, enhancing the installation accuracy of cameras and sensors in our simulated environment. This approach enables precise determination of positions and orientations across integrated cameras, consistent with the motion capture system. The findings contribute to our ongoing project, which requires accurate system integration for customer behavior analysis.
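The camera-to-mocap alignment amounts to fitting a rigid transform between corresponding marker positions. As a self-contained 2D analogue (closed-form least squares, rather than the 3D BFGS optimization the paper uses), the sketch below recovers a rotation and translation from point correspondences; all data are synthetic.

```python
import math

# Sketch: least-squares 2D rigid transform between corresponding points,
# a planar analogue of the camera-to-mocap alignment described above.

def fit_rigid_2d(src, dst):
    """Return (theta, tx, ty) such that R(theta) @ p + t maps src onto
    dst in the least-squares sense."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    # Cross-covariance terms of the centered point sets.
    sxx = sum((p[0] - csx) * (q[0] - cdx) + (p[1] - csy) * (q[1] - cdy)
              for p, q in zip(src, dst))
    sxy = sum((p[0] - csx) * (q[1] - cdy) - (p[1] - csy) * (q[0] - cdx)
              for p, q in zip(src, dst))
    theta = math.atan2(sxy, sxx)
    tx = cdx - (csx * math.cos(theta) - csy * math.sin(theta))
    ty = cdy - (csx * math.sin(theta) + csy * math.cos(theta))
    return theta, tx, ty

# Markers rotated by 90 degrees and shifted by (2, 3).
theta, tx, ty = fit_rigid_2d([(0, 0), (1, 0), (0, 1)],
                             [(2, 3), (2, 4), (1, 3)])
print(round(math.degrees(theta)), round(tx, 6), round(ty, 6))  # 90 2.0 3.0
```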
|
|
08:45-09:00, Paper ThuAT3.2 | |
Full Range Torque Estimation on Drivetrains with Strain Wave Gears |
|
Graabæk, Søren | University of Southern Denmark |
Sloth, Christoffer | University of Southern Denmark |
Keywords: Sensor Fusion, Human-Robot Cooperation/Collaboration
Abstract: This paper considers sensorless joint torque estimation for kinesthetic teaching on industrial robots with strain wave gears. We show that a simple torsional compliance model of the gear can be used to augment the joint torque estimated by a generalized momentum observer, yielding accurate joint torque estimates even when the system has zero momentum. This is accomplished by providing a framework for combining the two joint torque estimates and showing that, compared to the momentum observer alone, the combined method improves the Root Mean Square Error (RMSE) by 39% when the parameters of the torsional compliance model are computed from the gear's datasheet, and by 49% when the parameters are estimated from joint data.
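The two estimates being combined can be sketched as follows. The stiffness value, momentum threshold, and linear blending rule are illustrative assumptions standing in for the paper's framework: near zero momentum the observer estimate degrades, so the blend leans on the gear-compliance estimate there.

```python
# Sketch: blending a momentum-observer torque estimate with a torsional
# compliance estimate.  Stiffness, threshold, and the linear blend are
# illustrative assumptions, not the paper's framework.

def compliance_torque(theta_motor, theta_joint, k=1200.0):
    # Torsional spring model of the strain wave gear: wind-up between
    # motor-side and joint-side angles is proportional to torque.
    return k * (theta_motor - theta_joint)

def fused_torque(tau_observer, tau_compliance, momentum, p_thresh=0.05):
    # Weight the observer estimate by |momentum|: at zero momentum the
    # compliance estimate dominates, above p_thresh the observer does.
    w = min(abs(momentum) / p_thresh, 1.0)
    return w * tau_observer + (1 - w) * tau_compliance

print(fused_torque(tau_observer=2.0, tau_compliance=4.0, momentum=0.0))  # 4.0
```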
|
|
09:00-09:15, Paper ThuAT3.3 | |
Theoretical Research of Tactile Shape Sensor for Complex Surfaces Based on Fiber-Optic Distributed Sensors |
|
Long, Zeyu | Osaka University |
Wakamatsu, Hidefumi | Grad. School of Eng., Osaka Univ |
Iwata, Yoshiharu | Osaka University |
Keywords: Soft Robotics, Haptics and tactile sensors, System Simulation
Abstract: Tactile shape sensing is a crucial research focus in the field of soft robotics, and many researchers have developed intelligent tactile shape sensors. However, current research predominantly focuses on simple shapes such as bending and twisting, neglecting a common but significant class of shapes: concave-convex shapes. This study introduces a tactile shape sensor based on fiber-optic distributed sensors that uses traditional optimization algorithms, avoiding the dependence on training data associated with machine learning. We propose a method to predict shapes from the strain data of the fibers. The validation of this prediction method is primarily theoretical, conducted through simulations. The research goes beyond validating basic shapes such as bending, twisting, and stretching, and also covers various complex concave-convex cases: single-point pressing, multi-point pressing (3 points), and reverse pressing (2 points). Under ideal conditions, the predicted shapes match the set sample shapes perfectly.
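A planar toy version of shape reconstruction from distributed strain: treat the measured strain as proportional to local curvature and integrate it along the fiber into a 2D centerline. The fiber offset, step size, and planar restriction are illustrative simplifications of the 3D problem the paper addresses.

```python
import math

# Sketch: integrating distributed strain into a planar centerline.
# Curvature kappa = strain / r, with r the fiber's offset from the
# neutral axis.  All parameters are illustrative.

def shape_from_strain(strains, ds=1.0, r=0.5):
    """Return centerline points (x, y) reconstructed by integrating the
    heading angle theta, where d(theta) = (strain / r) * ds per segment."""
    x, y, theta = 0.0, 0.0, 0.0
    pts = [(x, y)]
    for eps in strains:
        theta += (eps / r) * ds
        x += math.cos(theta) * ds
        y += math.sin(theta) * ds
        pts.append((x, y))
    return pts

print(shape_from_strain([0.0] * 5)[-1])  # (5.0, 0.0) -> zero strain is straight
```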
|
|
09:15-09:30, Paper ThuAT3.4 | |
Online Object Localization in a Robotic Hand by Tactile Sensing |
|
Hammoud, Ali | Sorbonne University |
Khoramshahi, Mahdi | Sorbonne University |
Huet, Quentin | Sorbonne ISIR |
Perdereau, Véronique | Sorbonne University |
Keywords: Robotic hands and grasping, Haptics and tactile sensors, Mechatronics Systems
Abstract: Robotic grasping and manipulation mainly rely on vision and tactile sensing. While tactile sensors are frequently proposed for grasp control in the literature, object localization and recognition are achieved solely through vision. Vision-based approaches perform satisfactorily in clear and structured surroundings by providing reliable sensory inputs about the object. However, their performance deteriorates when faced with object occlusion, which is typical of manipulation tasks; for instance, robotic fingers occlude the object during in-hand manipulation. This work presents an online object pose estimation method based on tactile sensing. More specifically, the proposed method finds the wrist-object transformation from the contact-point positions between the fingertips and the object, representing the minimal tactile sensing requirement. We validate our method experimentally on different object geometries during in-hand manipulation tasks. The experimental results demonstrate that our method outperforms vision-based approaches during in-hand object manipulation due to its inherent robustness to object occlusion.
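One standard way to recover a wrist-object transformation from matched fingertip contact points is a least-squares rigid registration (Kabsch/Umeyama); the authors' exact formulation may differ. A self-contained sketch:

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares rigid transform (R, t) mapping points P to Q,
    i.e. Q ~ R @ P + t (Kabsch/Umeyama). P, Q: (N, 3) arrays of matched
    contact points, e.g. fingertip contacts expressed in the wrist frame
    and in the object frame."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # reflection correction
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # proper rotation, det = +1
    t = cQ - R @ cP
    return R, t
```

With at least three non-collinear contact points the transform is uniquely determined, which matches the abstract's "minimal tactile sensor requirement" of contact-point positions only.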
|
|
09:30-09:45, Paper ThuAT3.5 | |
A Pipeline for Transparency Estimation of Glass and Plastic Bottle Images for Neural Scanning |
|
Erich, Floris Marc Arden | National Institute of Advanced Industrial Science and Technology |
Susgin, Jerome | National Institute of Advanced Industrial Science and Technology |
Ando, Noriaki | National Institute of Advanced Industrial Science and Technology |
Yoshiyasu, Yusuke | CNRS-AIST JRL |
Keywords: Vision Systems, Machine Learning
Abstract: Reduction of the gap between simulation and reality (sim2real gap) is essential for robots to learn how to manipulate objects in real scenarios. Estimation of an alpha value for transparent pixels is necessary to render novel views of common objects such as bottles and cups. While it is straightforward to estimate an alpha value for transparent objects, many practical objects have a mixture of transparent areas and opaque areas. In this paper we present a pipeline for automatically segmenting bottles into label, cap and body and estimating an alpha value for the body. We train a segmentation network (Detectron2) for the task of transparent object segmentation, based on a bottle dataset distilled from the PACO dataset. We combine the segmentation masks into a trimap, which is then used as input for an off-the-shelf matting deep neural network (ViTMatte). In our experiments, we show that the per-pixel error for transparent pixels can be reduced by 44% using our pipeline, compared to the baseline of not applying transparency estimation.
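A minimal sketch of how segmentation masks might be combined into a trimap for a matting network such as ViTMatte (the mask semantics and pixel encoding here are our assumptions, not necessarily the paper's):

```python
import numpy as np

def build_trimap(body_mask, opaque_mask):
    """Combine segmentation masks into a trimap for a matting network:
    255 = definite foreground (opaque label/cap), 128 = unknown region
    (transparent bottle body, alpha to be estimated), 0 = background.
    body_mask, opaque_mask: boolean arrays of the same shape."""
    trimap = np.zeros(body_mask.shape, dtype=np.uint8)
    trimap[body_mask] = 128      # transparent body -> unknown region
    trimap[opaque_mask] = 255    # label and cap -> definite foreground
    return trimap
```

The matting network then only needs to estimate alpha inside the 128-valued region, which is where the per-pixel transparency of the bottle body lives.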
|
|
09:45-10:00, Paper ThuAT3.6 | |
Gait Sequence Upsampling Using Diffusion Models for Single LiDAR Sensors |
|
Ahn, Jeongho | Kyushu University |
Nakashima, Kazuto | Kyushu University |
Yoshino, Koki | Kyushu University |
Iwashita, Yumi | NASA / Caltech Jet Propulsion Laboratory |
Kurazume, Ryo | Kyushu University |
Keywords: Surveillance Systems, Vision Systems, Machine Learning
Abstract: Recently, 3D LiDAR has emerged as a promising technology in the field of gait-based person identification, serving as an alternative to traditional RGB cameras due to its robustness under varying lighting conditions and its ability to capture 3D geometric information. However, long capture distances or the use of low-cost LiDAR sensors often result in sparse human point clouds, leading to a significant decline in identification performance. To address these challenges, we propose a sparse-to-dense upsampling model for pedestrian point clouds in gait recognition using 3D LiDAR, named LidarGSU, which is designed to enhance the generalization capability of existing identification models. Our method utilizes diffusion probabilistic models (DPMs), which have shown high fidelity in generative tasks such as image completion. In this work, we leverage DPMs on sparse sequential pedestrian point clouds as conditional masks in a video-to-video translation approach, applied in an inpainting manner. We conducted extensive experiments on the SUSTeck1K dataset to evaluate the generative quality and recognition performance of the proposed method. Furthermore, we demonstrate the applicability of our upsampling model using a real-world dataset, captured with a low-resolution sensor across varying measurement distances.
|
|
10:00-10:15, Paper ThuAT3.7 | |
Marker-Free Human Gait Analysis Using a Smart Edge Sensor System |
|
Bauer, Eva Katharina | Hochschule Koblenz - RheinAhrCampus |
Bultmann, Simon | University of Bonn |
Behnke, Sven | University of Bonn |
Keywords: Modeling and Simulating Humans, Machine Learning, Sensor Networks
Abstract: The human gait is a complex interplay between the neuronal and the muscular systems, reflecting an individual's neurological and physiological condition. This makes gait analysis a valuable tool for biomechanics and medical experts. Traditional observational gait analysis is cost-effective but lacks reliability and accuracy, while instrumented gait analysis, particularly using marker-based optical systems, provides accurate data but is expensive and time-consuming. In this paper, we introduce a novel markerless approach for gait analysis using a multi-camera setup with smart edge sensors to estimate 3D body poses without fiducial markers. We propose a Siamese embedding network with triplet loss calculation to identify individuals by their gait pattern. This network effectively maps gait sequences to an embedding space that enables clustering sequences from the same individual or activity closely together while separating those of different ones. Our results demonstrate the potential of the proposed system for efficient automated gait analysis in diverse real-world environments, facilitating a wide range of applications.
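The triplet loss mentioned in this abstract can be sketched as follows (the margin value and embedding dimensionality are illustrative, not the paper's):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on gait-sequence embeddings: pull sequences from the
    same individual (anchor, positive) together and push a different
    individual (negative) at least `margin` further away."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

When the negative is already more than `margin` farther from the anchor than the positive, the loss is zero, which is what produces the tight same-individual clusters described in the abstract.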
|
|
10:15-10:30, Paper ThuAT3.8 | |
Multifingered Object Recognition with Tactile Sensors and Graph Convolutional Networks Using Topological Graph Segmentation |
|
Kulkarni, Shardul | Waseda University |
Funabashi, Satoshi | Waseda University |
Schmitz, Alexander | Waseda University |
Ogata, Tetsuya | Waseda University |
Sugano, Shigeki | Waseda University |
Keywords: Machine Learning, Haptics and tactile sensors, Robotic hands and grasping
Abstract: This study investigates the application of topological segmentation to Graph Convolutional Networks (GCNs) for object recognition using tactile data from a multi-fingered robotic hand. While GCNs have shown promise in processing tactile information, the large volume of data from distributed tactile sensors poses challenges. Inspired by neurological research indicating intra-digit segmentation in human hand topology, we propose two methods of topological segmentation for GCNs: segmenting by digits and palm, and segmenting by individual skin patches. We evaluate these methods against a non-segmented GCN baseline using various input modalities including tactile features, taxel positions, and joint angles. Data was collected from an Allegro Hand equipped with uSkin tactile sensors, manipulating eight everyday objects. Our results demonstrate that topological segmentation enhances object recognition performance, with the best model achieving a 92.92% recognition rate using patch-level segmentation and tactile features with joint angles as input. UMAP analysis of GCN features reveals that segmentation methods produce distinct representations for each hand segment. Additionally, topological segmentation significantly reduces computational resource requirements compared to non-segmented GCNs. This study contributes the first application of topological segmentation to GCNs for tactile processing in robotic hands, achieving high object recognition rates and providing insights into feature extraction capabilities. The proposed method shows potential for improving efficiency and performance in tactile-based robotic manipulation tasks.
|
|
ThuAT4 |
Forum 13-14 |
SS2&7 Challenges in the Integration of Soft Robotics / Robotics and
Automation in the Commercial Aviation Industry |
In-person Special Session |
Chair: Hosoda, Koh | Kyoto University |
Co-Chair: Nassour, John | Technical University of Munich |
Organizer: Nassour, John | Technical University of Munich |
Organizer: Suzumori, Koichi | Tokyo Institute of Technology |
Organizer: Hosoda, Koh | Kyoto University |
Organizer: Renda, Federico | Khalifa University of Science and Technology |
Organizer: Cheng, Gordon | Technical University of Munich |
|
08:30-08:45, Paper ThuAT4.1 | |
Time-Lag Generation Mechanical Valve for Enhancing Time Response of Back-Stretchable McKibben Muscles (I) |
|
Tanaka, Shoma | Institute of Science Tokyo |
Kobayashi, Ryota | Institute of Science Tokyo |
Nabae, Hiroyuki | Institute of Science Tokyo |
Suzumori, Koichi | Institute of Science Tokyo |
Keywords: Soft Robotics, Mechanism Design
Abstract: McKibben artificial muscles contract when pneumatic pressure is applied. However, when not pressurized, they cannot easily elongate passively from their natural length under external forces. This limitation poses challenges in systems where artificial muscles interact, such as antagonistic drive configurations. To address this issue, we previously developed a novel type of McKibben artificial muscle called the back-stretchable McKibben muscle (BSM). The BSM consists of two primary sections: a contraction section and an elongation section. An inlet tube is inserted between these two sections to restrict airflow. This design enables the elongation section to activate before the contraction motion of the contraction section. While this sequential operation allows the BSM to be used in antagonistic drive systems, a new challenge emerged: the restricted airflow slowed the BSM's response. To address this issue, this paper proposes a mechanical valve called the time-lag generation mechanical valve (TLV), which generates a time lag in air inflow to the sections without using an inlet tube. Experimental results demonstrate that incorporating the TLV into the BSM significantly enhances its time response: by approximately 300 times during contraction and approximately 230 times during pressure release. Furthermore, the integration of TLV-equipped BSMs enabled the successful implementation of object throwing with an antagonistic drive robotic arm, a feat previously unattainable with conventional BSMs.
|
|
08:45-09:00, Paper ThuAT4.2 | |
Material-Driven Mechanical Programming of Soft Robotic Tentacles (I) |
|
Lu, Yao | Technical University of Munich
Amraouza, Asmaa | Technical University of Munich |
Peng, Yifei | Technical University of Munich |
Li, Dan | Technical University of Munich |
Nassour, John | Technical University of Munich |
Cheng, Gordon | Technical University of Munich |
Keywords: Soft Robotics
Abstract: Soft robotic grippers with stochastic and topological grasping capabilities can be highly desirable for gentle contact and interaction with fragile objects of various shapes. In this study, we developed soft tentacles using silicone elastomeric material with an embedded pneumatic channel that generates desirable three-dimensional (3D) curling deformation under pneumatic pressure. Additionally, various ratios and geometrical distributions of two silicone materials, which differ primarily in stiffness, were investigated to determine their effect on grasping efficiency. The tentacle's grasping performance was systematically tested across multiple tentacle designs and grasping strategies. Results showed that optimizing the combination of softer elastomers with stiffer materials significantly improved the tentacle's ability to securely grip and carry heavier loads while maintaining gentle contact with objects. The pneumatic tentacle's simple control mechanism and versatility in handling objects of various shapes and sizes offer a low-cost, adaptable solution for future applications in soft robotics.
|
|
09:00-09:15, Paper ThuAT4.3 | |
Soft Sensory Socks Measure Contact Forces During Locomotion (I) |
|
Nassour, John | Technical University of Munich |
Huehne, Jan-Erik | Technical University of Munich |
Lee, Jin-Ho | Technical University of Munich |
Cheng, Gordon | Technical University of Munich |
Keywords: Soft Robotics, Haptics and tactile sensors
Abstract: Gait monitoring systems play a crucial role in guiding rehabilitation, improving athletic performance, preventing injuries, and ensuring overall mobility and quality of life. Most of these systems are restricted to specialized setups and require calibration. We propose a soft sensor design using silicone tubes mounted on the underside of socks to measure ground reaction force (GRF) during locomotion. Three sensors were mounted on the anterior, lateral, and posterior areas of the left and right socks. The sensory socks enable gait analysis while walking and running at different speeds, both in shoes and barefoot, opening the door to analyzing locomotion in different environments. Preliminary gait analyses confirm the distinct gait parameters observed in the biomechanics of barefoot locomotion.
|
|
09:15-09:30, Paper ThuAT4.4 | |
Hand Tracking System Utilizing Learning Based on Vision Sensing and Ionic Gel Sensor Glove (I) |
|
Tokunaga, Kazuki | Kyoto University |
Ozaki, Ryu | Kyoto University |
Kamihoriuchi, Yuki | Kyoto University |
Kawasetsu, Takumi | Kyoto University |
Hosoda, Koh | Kyoto University |
Keywords: Sensor Fusion, Multi-Modal Perception, Machine Learning
Abstract: Hand tracking has attracted considerable interest in fields such as virtual reality and human-robot interaction. However, single-sensor approaches to hand tracking face challenges, particularly data loss due to occlusion or finger overlap. This paper proposes a multi-sensor system for 3D hand pose estimation that fuses a vision sensor and an ionic gel sensor. We use the Intel RealSense Depth Camera D435 to capture finger joint angles, supplemented by an ionic gel sensor glove that provides continuous measurements. By combining these data streams in a machine learning framework with an autoencoder and an LSTM network, we can accurately estimate finger joint angles even in the presence of missing visual data. Experiments compared the method using both the vision sensor and the glove with the method using only the vision sensor, confirming that the accuracy of finger joint angle estimation improved significantly, especially when data was missing. The method also demonstrated consistent accuracy improvements across different users and gloves.
|
|
09:30-09:45, Paper ThuAT4.5 | |
Verification of the Effect of Design Parameters on the Radius of Curvature of Vine-Like, Power Soft Gripper (I) |
|
Kodama, Hiroto | Institute of Science Tokyo |
Nabae, Hiroyuki | Institute of Science Tokyo |
Endo, Gen | Tokyo Institute of Technology |
Suzumori, Koichi | Tokyo Institute of Technology |
Keywords: Soft Robotics, Robotic hands and grasping
Abstract: We have been developing a vine-like, power soft gripper based on Euler's belt theory to achieve the high load capacity needed for grasping irregularly shaped heavy objects at disaster sites. The gripper consists of a hose with rubber sheets adhered to both sides and a spiral constant-force spring inserted inside. Initially coiled in a helical shape, it extends while increasing its radius of curvature when air pressure is applied. In this state, it approaches the target object and wraps around it when depressurized. Therefore, the inner diameter of the gripper in its initial state determines the minimum diameter of a graspable object. In addition, the radius of curvature is currently small when pressurized, limiting the gripper's range of motion and restricting both the objects it can grasp and its use in confined spaces. In this study, we therefore fabricated grippers with varying design parameters and experimentally verified the inner diameter in the initial state and the radius of curvature when pressurized. Specifically, we fabricated grippers with varying stiffness of the rubber sheets and constant-force springs, which are components of the gripper, and experimentally verified their shapes. The results confirmed that increasing the stiffness of the inner rubber sheet and the constant-force spring increases both the inner diameter and the radius of curvature, while the effect of the outer rubber sheet thickness on them is small.
|
|
09:45-10:00, Paper ThuAT4.6 | |
Normative Safety Regulations for Collaborative Robots (I) |
|
Geiger, Laura | Technical University of Munich |
Guadarrama-Olvera, J. Rogelio | Technical University of Munich |
Cheng, Gordon | Technical University of Munich |
Keywords: Human-Robot Cooperation/Collaboration
Abstract: Collaborative robots (cobots) are designed to operate alongside humans in shared workspaces, enhancing efficiency and flexibility. Like all machines and robots, they are regulated by the EU Machinery Regulation and must bear the CE marking before market introduction. While the regulation provides broad and general safety requirements, harmonized standards such as ISO 12100, ISO 13849, and IEC 62061 offer more specific guidance for achieving safety in machinery design. For industrial robots, ISO 10218 outlines safety requirements, with the technical specification ISO/TS 15066 further addressing unique considerations of collaborative industrial robots. By providing a comprehensive overview of these regulations and standards, this paper aims to guide manufacturers through the critical steps required for achieving legal compliance and ensuring safe market introduction of cobots in Europe.
|
|
ThuP1T1 |
Forum 1-2-3 |
Human-Robot Interaction II |
In-person Regular Session |
Chair: Matsumoto, Mitsuharu | The University of Electro-Communications |
Co-Chair: Imamura, Yumeko | National Institute of Advanced Industrial Science and Technology (AIST)
|
13:30-13:45, Paper ThuP1T1.1 | |
Development of a Computational Thinking Learning Tool Using a Railway Toy and Music |
|
Ozawa, Tsubasa | The University of Electro-Communications |
Matsumoto, Mitsuharu | The University of Electro-Communications |
Keywords: Software, Middleware and Programming Environments, Entertainment Systems, Software Platform
Abstract: In Japan, programming education in elementary schools began in 2020, making it necessary to foster programming thinking effectively. Various learning tools have been developed to introduce programming concepts, and many studies have developed tools that use music to foster computational thinking, which includes programming thinking. In this study, we focused on the affinity between the basic movements of a railway toy and melodies and chord progressions, and developed a learning tool that fosters computational thinking by combining music with the railway toy. We designed an activity using this tool and asked subjects to experience it. Based on the subjects' progress in the activity and their questionnaire responses, improvements to the activity are discussed.
|
|
13:45-14:00, Paper ThuP1T1.2 | |
Torque-Based Balancing Control for Teleoperated Bipedal Robots |
|
Yamamoto, Hanae | Hiroshima University |
Kanaoka, Katsuya | Ritsumeikan University |
Kikuuwe, Ryo | Hiroshima University |
Keywords: Human-Robot/System Interaction, Human-Robot Cooperation/Collaboration
Abstract: This paper proposes a torque-based leg controller to realize teleoperated walking of biped robots. The controller is an extension of Kanaoka's balancing controller and allows the robot to switch legs and walk in response to leg force commands from the operator. This is made possible by switching the controller between the double- and single-support phases with appropriate mode transitions. The proposed controller was validated using a real-time simulator with haptic devices, in which the robot performed static and dynamic walking on a flat surface and on slopes, following the operator's commands through the haptic devices.
|
|
14:00-14:15, Paper ThuP1T1.3 | |
Evaluation of Biomimetic Assist Suits on Cross-Slope Walking |
|
Iguchi, Shotaro | Oita University |
Omori, Kanta | Oita University |
Ono, Toma | Graduate School of Engineering, Oita University |
Todaka, Takeru | Graduate School of Engineering, Oita University |
Abe, Isao | Oita University |
Kikuchi, Takehito | Oita University |
Keywords: Biomimetics, Mechatronics Systems, Mechanism Design
Abstract: In cross-slope walking, gait differs between the uphill and downhill sides. Furthermore, because ankle sprains and other injuries are more likely to occur during cross-slope walking, assistance is required to prevent falls and ankle joint sprains and to reduce the load on the muscles involved in knee joint motion; this assistance differs between the uphill and downhill sides. We developed biomimetic assist suits (BAS) by combining a semi-active knee joint, which assists knee joint motion, with a dual-axial ankle supporter, which assists ankle motion, to provide knee and ankle support during cross-slope walking. A walking experiment with 14 participants was conducted to verify the effectiveness of the BAS. The experimental results showed that the BAS could reduce the difference between the maximum and minimum inversion/eversion and plantar/dorsiflexion angles during the swing phase, protecting the ankle joint by limiting the ankle joint angle through the elastic connector and the assisting-direction adjustment mechanism.
|
|
14:15-14:30, Paper ThuP1T1.4 | |
Comparative Study of Muscle Utilization in a Hybrid Controller Integrating Finite-State Impedance and Electromyography-Driven Musculoskeletal Model for Robotic Ankle Prostheses: A Case Study |
|
Abdelbar, Mohamed | Mechanical Engineering, University of Iceland, Reykjavík, Iceland |
Sverrisson, Atli Örn | Össur Hf., Reykjavik, Iceland |
Briem, Kristin | School of Health Sciences, University of Iceland, Reykjavík, Iceland |
Lecomte, Christophe | Össur Hf., Reykjavik, Iceland |
Brynjólfsson, Sigurður | Mechanical Engineering, University of Iceland, Reykjavík, Iceland |
Keywords: Biologically-Inspired Robotic Systems, Control Theory and Technology, Software Platform
Abstract: The optimal neuroprosthetic control paradigm seeks to minimize the differences between artificial and natural physiology regarding subjective embodiment and dynamic performance. This study introduces a volitional control system integrating an electromyography (EMG)-driven musculoskeletal model with a finite state machine (FSM) impedance controller. A Hill-type muscle model is used to simulate the Gastrocnemius (GAS) and Tibialis Anterior (TA) muscles around the ankle joint. The system activates these muscles using input from ankle sensors and EMG data from antagonist muscles. To improve functionality and responsiveness, muscle parameters are optimized using a surrogate-based optimization approach. The impact of EMG control on a hybrid controller was analyzed, focusing on muscle activation during the stance phase. Two controllers were tested on a transtibial amputee for level-ground walking and stair ascent. Insights were gained into optimizing muscle activation for better gait dynamics. The model-based design technique was employed to automate validation, verification, and coding, reducing manual steps and errors. For level-ground walking, both hybrid controllers behaved more like an impedance controller than an EMG control. Disabling EMG control during plantarflexion with controller 1 prevented excessive plantarflexion, leading to a more natural gait. Controller 1, which used the TA muscle exclusively during controlled dorsiflexion, demonstrated greater repeatability. During stair ascent, both controllers allowed the user to place their toe first at each step, closely mimicking a natural gait pattern. Controller 2 exhibited better repeatability and slightly higher torque at the start of the controlled dorsiflexion phase, simplifying the control strategy and reducing computational effort.
|
|
14:30-14:45, Paper ThuP1T1.5 | |
Designing Sensor Wear for Posture Estimation with Strain Sensors Using Digital Model |
|
Imamura, Yumeko | AIST, Japan |
Ogata, Kunihiro | AIST, Japan |
Kurata, Takeshi | AIST, Japan |
Keywords: Modeling and Simulating Humans, Rehabilitation Systems, Machine Learning
Abstract: In this research, we have been developing a wearable motion measurement system for use in remote rehabilitation. A sensor wear with strain sensors is used to measure upper-limb motion during rehabilitation movements. Although the measurement target has only four degrees of freedom, we aim for highly robust measurements by using 15 strain sensors. However, finding an optimal sensor arrangement by trial and error on physical hardware is difficult. This study therefore proposes a method for designing the sensor placement by analysis, using a digital model of the human body and the sensor system. We designed the sensor wear using measured motions and constructed a joint angle estimation model by machine learning using long short-term memory (LSTM) networks. We verified the accuracy of the designed sensor wear by simulation.
|
|
14:45-15:00, Paper ThuP1T1.6 | |
Improving Low-Cost Teleoperation: Augmenting GELLO with Force |
|
Sujit, Shivakanth | Araya Inc |
Nunziante, Luca | Araya Inc |
Ogawa Lillrank, Dan | Araya Inc |
Dossa, Rousslan Fernand Julien | Araya Inc |
Arulkumaran, Kai | Araya Inc |
Keywords: Human-Robot/System Interaction, Human-Robot Cooperation/Collaboration
Abstract: In this work we extend the low-cost GELLO teleoperation system, initially designed for joint position control, with additional force information. Our first extension is to implement force feedback, allowing users to feel resistance when interacting with the environment. Our second extension is to add force information into the data collection process and training of imitation learning models. We validate our additions by implementing these on a GELLO system with a Franka Panda arm as the follower robot, performing a user study, and comparing the performance of policies trained with and without force information on a range of simulated and real dexterous manipulation tasks. Qualitatively, users with robotics experience preferred our controller, and the addition of force inputs improved task success on the majority of tasks.
|
|
15:00-15:15, Paper ThuP1T1.7 | |
Effect of Contralateral Cane Use on Whole-Body Angular Momentum in the Frontal Plane During Walking |
|
Kawase, Kodai | Ritsumeikan University |
Sato, Takahiko | Biwako Professional University of Rehabilitation |
Kudo, Shoma | National Institute of Advanced Industrial Science and Technology |
Horiuchi, Gen | Ritsumeikan University |
Nagano, Akinori | Ritsumeikan University |
Keywords: Systems for Service/Assistive Applications, Rehabilitation Systems, Human Factors and Human-in-the-Loop
Abstract: This study aims to clarify how using a cane affects the amplitude of whole-body angular momentum (WBAM) about the body’s center of mass in the frontal plane during walking in older adults. Twenty older adults participated in the study, walking under two conditions: (1) without a cane and (2) with a cane. WBAM range (WBAMR) in the frontal plane was calculated as the difference between the minimum and maximum values over one gait cycle. To identify biomechanical factors affecting WBAM, we examined peak moment arms from the body’s center of mass to the center of pressure in the mediolateral direction and peak vertical ground reaction force (GRF). Variables were assessed using paired t-tests for normally distributed data and the Wilcoxon signed-rank test for non-normally distributed data. WBAMR was significantly smaller when walking with a cane compared to walking without one. The peak mediolateral moment arms during the second steps and peak vertical GRF during the first steps were significantly larger without a cane than with one. WBAM amplitude decreased throughout the gait cycle when using a cane. Our findings demonstrate that using a cane reduces WBAM amplitude in the frontal plane across the gait cycle, thereby enhancing mediolateral dynamic balance in older adults during walking.
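The WBAM range (WBAMR) defined in this abstract is simply the spread of the signal over one gait cycle:

```python
def wbam_range(wbam_series):
    """Whole-body angular momentum range (WBAMR): the difference between
    the maximum and minimum WBAM values over one gait cycle, as defined
    in the abstract above. wbam_series: sequence of frontal-plane WBAM
    samples for one cycle."""
    return max(wbam_series) - min(wbam_series)
```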
|
|
15:15-15:30, Paper ThuP1T1.8 | |
Dynamic Bimanual Human-To-Robot Object Handovers Using Motion Prediction Deep Neural Networks |
|
Mavsar, Matija | Jozef Stefan Institute |
Uchibe, Eiji | ATR Computational Neuroscience Labs |
Morimoto, Jun | Kyoto University |
Ude, Ales | Jozef Stefan Institute |
Keywords: Human-Robot Cooperation/Collaboration, Machine Learning, Vision Systems
Abstract: Facilitating dynamic bimanual handovers between humans and robots is a complex endeavor that requires integrating human pose estimation, motion prediction, and the generation of appropriate trajectories for the receiving robot. This study introduces a method for predicting the required bimanual receiver robot motion during handover tasks, leveraging pose estimation and motion prediction networks. Additionally, we propose a real-time control approach for a dual-arm humanoid robot to dynamically adjust its receiving trajectories. We evaluate the ability of neural networks to accurately predict receiver trajectories and thus improve the handover process. We compare long short-term memory (LSTM) and transformer architectures for motion prediction, assess the prediction accuracy of absolute and relative receiving trajectories as well as trajectories for each robot arm separately, and show that the use of absolute and relative coordinates is beneficial for generating more accurate receiver motions.
|
|
ThuP1T2 |
Forum 9-10-11 |
Control and Planning II |
In-person Regular Session |
Chair: Okuda, Hiroyuki | Nagoya University |
Co-Chair: Sakamoto, Kosuke | Chuo University |
|
13:30-13:45, Paper ThuP1T2.1 | |
One Stack, Diverse Vehicles: Checking Safe Portability of Automated Driving Software |
|
Nenchev, Vladislav | University of the Bundeswehr Munich |
Keywords: Formal Methods in System Integration, Autonomous Vehicle Navigation, Motion and Path Planning
Abstract: Integrating an automated driving software stack into vehicles with variable configuration is challenging, especially due to different hardware characteristics. Further, to provide software updates to a vehicle fleet in the field, the functional safety of every affected configuration has to be ensured. These additional demands for dependability and the increasing hardware diversity in automated driving make rigorous automatic analysis essential. This paper addresses this challenge by using formal portability checking of adaptive cruise controller code for different vehicle configurations. Given a formal specification of the safe behavior, models of target configurations are derived, which capture relevant effects of sensors, actuators and computing platforms. A corresponding safe set is obtained and used to check if the desired behavior is achievable on all targets. In a case study, portability checking of a traditional and a neural network controller are performed automatically within minutes for each vehicle hardware configuration. The check provides feedback for necessary adaptations of the controllers, thus, allowing rapid integration and testing of software or parameter changes.
|
|
13:45-14:00, Paper ThuP1T2.2 | |
Probabilistic VFH-Based Obstacle Avoidance Algorithm for Unknown Environment Exploration Using Swarm Robots |
|
Sakamoto, Kosuke | Chuo University |
Kunii, Yasuharu | Chuo University |
Keywords: Multi-Robot Systems, Automation Systems, Decision Making Systems
Abstract: This paper presents a novel probabilistic Vector Field Histogram (p-VFH) obstacle avoidance algorithm for swarm robot exploration in unknown environments. Conventional path planning algorithms, such as A* and RRT*, are not suitable for unknown environments, and existing obstacle avoidance methods for swarm robots have limitations in terms of computational cost, sensor requirements, and local minima issues. The proposed p-VFH algorithm addresses these challenges by probabilistically selecting the robot's movement direction based on a continuously updated polar histogram of obstacle densities. The algorithm initializes the histogram values, updates them when encountering unknown obstacles, and generates a probability distribution for selecting the next movement direction. Simulation studies compare the performance of p-VFH with a simple "Back step" obstacle avoidance method and a stress-based obstacle avoidance algorithm. The results demonstrate that p-VFH improves exploration efficiency and successfully guides robots to target points while reliably avoiding obstacles, outperforming the other methods in terms of success rate and adaptability to the environment.
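The initialize/update/sample loop described in the abstract might look roughly like the following (a hypothetical sketch, not the authors' code; the sector count and the exponential density weighting are assumptions):

```python
import numpy as np

N_SECTORS = 36  # 10-degree sectors around the robot (assumed resolution)

def init_histogram(n_sectors=N_SECTORS):
    """Start with zero obstacle density in every sector."""
    return np.zeros(n_sectors)

def update_histogram(hist, sector, weight=1.0):
    """Increase a sector's density when an unknown obstacle is sensed there."""
    hist[sector] += weight
    return hist

def sample_direction(hist, rng, beta=1.0):
    """Turn densities into a probability distribution and sample a heading.

    Lower density -> higher probability; beta (assumed) controls how
    strongly obstacle-laden directions are avoided.
    """
    weights = np.exp(-beta * hist)
    probs = weights / weights.sum()
    sector = rng.choice(len(hist), p=probs)
    return sector, probs

rng = np.random.default_rng(0)
hist = init_histogram()
update_histogram(hist, sector=5, weight=3.0)   # obstacle detected ahead-left
sector, probs = sample_direction(hist, rng)
# The blocked sector is now the least likely heading choice.
```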
|
|
14:00-14:15, Paper ThuP1T2.3 | |
Toward Personalized Merging Behaviors: Enhancing Automated Vehicle Trust by Adapting to the Driving Style of Surrounding Vehicles |
|
Goto, Akinobu | Nissan Motor Co., Ltd |
Eder, Kerstin | University of Bristol |
Keywords: Intelligent Transportation Systems, Human Factors and Human-in-the-Loop, Human-Robot Cooperation/Collaboration
Abstract: For the successful real-world implementation of automated vehicles, earning trust from other road users is crucial, especially in scenarios that require interaction and cooperation, such as merging under congestion. While previous research has developed a communication model to improve explicit and implicit intent communication for merging vehicles, this study focuses on the execution phase of the merging maneuver, where the merging vehicle interprets the host vehicle's intent and must decide whether, and if so when, to merge. Since different drivers perceive risk differently, our participant experiments revealed that risk-averse drivers, classified by how they follow preceding vehicles, prefer merging over longer distances, whereas risk-tolerant drivers expect merging over shorter distances. The correlation between the driving style of the host vehicle and the driver's expectations towards merging vehicles indicates that trust in the merging vehicle can be maximized by tailoring the merging vehicle's decision-making threshold to the host vehicle's driving style. This study enhances our understanding of the differences in human drivers' expectations and trust within interactive driving scenarios, offering new insights for improving cooperative behavior in automated driving systems.
|
|
14:15-14:30, Paper ThuP1T2.4 | |
Analysis, Modeling, and Control of Merging Behavior at Low Speeds |
|
Ishiguro, Tatsuya | Nagoya University |
Okuda, Hiroyuki | Nagoya University |
Tominaga, Kenta | Mitsubishi Electric |
Suzuki, Tatsuya | Nagoya University |
|
|
14:30-14:45, Paper ThuP1T2.5 | |
Modeling of Cyclists' Decision for Left-Turn Vehicle at Unsignalized Intersection Using Logistic Regression Model and Gaussian Mixture Model |
|
Wakisaka, Ryo | Toyota Technical Development Corporation |
Yamaguchi, Takuma | Toyota Technical Development Corporation |
Ban, Kazunori | Toyota Technical Development Corporation |
Okuda, Hiroyuki | Nagoya University |
Suzuki, Tatsuya | Nagoya University |
Keywords: Modeling and Simulating Humans, Decision Making Systems, Human Factors and Human-in-the-Loop
Abstract: In this paper, cyclists' decision-making behavior when interacting with a car at an unsignalized intersection is measured and analyzed. Based on the measured data, a cyclist decision-making model is identified using logistic regression. Since collecting such data in the real world is difficult, we used an interactive simulator in which a cycling simulator and a driving simulator are connected via a network and share the same virtual traffic environment. The cyclist's decision is defined by three states of operation: pedaling-on, pedaling-off, and brake-on. Models to estimate these three states were constructed using a logistic regression model and a Gaussian mixture model, respectively. Finally, the accuracy of the constructed models is verified and compared.
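As an illustration of the modeling approach (not the authors' model or data; the features, labels, and training setup below are assumptions), a multinomial logistic regression over the three decision states can be fitted to synthetic interaction features:

```python
import numpy as np

rng = np.random.default_rng(1)
STATES = ["brake-on", "pedaling-off", "pedaling-on"]  # label order assumed

# Synthetic training data: [gap distance, speed] normalized to [0, 1];
# small gaps -> brake, large gaps -> keep pedaling.
X = rng.uniform(0, 1, size=(300, 2))
y = np.digitize(X[:, 0], [0.33, 0.66])   # 0: brake, 1: off, 2: on

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Fit by plain gradient descent on the cross-entropy loss.
W = np.zeros((2, 3))
b = np.zeros(3)
onehot = np.eye(3)[y]
for _ in range(2000):
    p = softmax(X @ W + b)
    grad = p - onehot
    W -= 0.5 * X.T @ grad / len(X)
    b -= 0.5 * grad.mean(axis=0)

pred = softmax(X @ W + b).argmax(axis=1)
accuracy = (pred == y).mean()   # should comfortably beat chance (1/3)
```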
|
|
14:45-15:00, Paper ThuP1T2.6 | |
Stepping Ahead with Electrified, Connected and Automated Shuttles in the Test Area Autonomous Driving BW |
|
Ochs, Sven | FZI Research Center for Information Technology |
Grimm, Daniel | FZI Research Center for Information Technology |
Doll, Jens | FZI Research Center for Information Technology |
Heinrich, Marc | FZI Research Center for Information Technology |
Orf, Stefan | FZI Research Center for Information Technology |
Fleck, Tobias | FZI Research Center for Information Technology |
Nienhüser, Dennis | FZI Forschungszentrum Informatik |
Nienhüser, Miriam | Robert Bosch GmbH |
Koch, Artur | Robert Bosch GmbH |
Schamm, Thomas | Robert Bosch GmbH |
Kohlhaas, Ralf | Robert Bosch GmbH |
Knoop, Steffen | Bosch |
Biber, Peter | Robert Bosch GmbH |
Fratzke, Dirk | TÜV SÜD Auto Service GmbH |
Kammerer, Jakob | Bosch Home Comfort Group |
Jethani, Ravi Shekhar | Ioki GmbH |
Dewein, Christian | Ioki GmbH |
Kuhnt, Florian | FZI Forschungszentrum Informatik |
Schörner, Philip | FZI Research Center for Information Technology |
Zofka, Marc René | FZI Research Center for Information Technology |
Viehl, Alexander | FZI Research Center for Information Technology |
Zöllner, Johann Marius | FZI Forschungszentrum Informatik |
Keywords: Intelligent Transportation Systems, Sensor Fusion, Decision Making Systems
Abstract: Traditional automated shuttle buses plan their trajectories along a fixed path known as a virtual rail. Avoiding obstacles on the road, e.g., parked vehicles, imposes challenges on such systems and requires a safety operator to be an active part of the Highly Automated Driving Function (HAD) to maneuver the shuttle around obstacles. In the context of the "EVA Shuttle" project (EVA), we present the HAD we developed, which breaks free from the virtual rail. Because of this, and the absence of a steering wheel inside the shuttle, a new safety concept had to be developed. We evaluated our approach in the form of a public transport service that bridged the first and last mile using our shuttles in a suburban district of Karlsruhe, Germany.
|
|
15:00-15:15, Paper ThuP1T2.7 | |
Location Optimization of Manipulator to Minimize Energy Considering the Path Direction |
|
Hibino, Kaho | Tokyo Institute of Technology |
Endo, Mitsuru | Tokyo Institute of Technology |
Shan, Zexin | Tokyo Institute of Technology |
Tsutsui, Yukio | Tokyo Institute of Technology |
Keywords: Mechanism Design, Motion and Path Planning, Factory Automation
Abstract: This study introduces a novel Directional Energy Index (DEI) to optimize the energy efficiency of manipulators operating within constrained environments. The DEI evaluates the manipulator's energy consumption along a specified path, with the advantage of not requiring consideration of the system's kinematics and dynamics. Through simulations involving a two-link manipulator and the Yaskawa HC10, the DEI effectively identifies the optimal manipulator location that minimizes energy usage. This index supports the development of energy-efficient robotic systems, aligning with the goals of resource conservation and carbon neutrality.
|
|
ThuP1T3 |
Forum 12 |
Perception and Sensing II |
In-person Regular Session |
Chair: Sasaki, Takeshi | Shibaura Institute of Technology |
Co-Chair: Miyashita, Leo | Tokyo University of Science |
|
13:30-13:45, Paper ThuP1T3.1 | |
Saccade Argos: Hierarchical Robust Tracking System for High Spatio-Temporal Resolution Vision |
|
Miyashita, Leo | Tokyo University of Science |
Ishikawa, Masatoshi | Tokyo University of Science |
Keywords: Vision Systems, Surveillance Systems, Sensor Networks
Abstract: Target tracking is one of the most important tasks in computer vision for continuously capturing an object with high spatio-temporal resolution. However, when the field of view (FOV) is narrowed to capture a target in detail, it is easy to lose sight of the target due to fast movement or occlusion. In this paper, we propose a system integration method that hierarchically combines multiple vision systems with different FOVs, from wide-angle to telephoto, in which each system independently tracks the target while sharing the target position to achieve robust tracking. The proposed method was applied to Saccade Argos, a three-tier tracking system that combines a fixed stereo camera and active tracking systems using Galvano scanners, and achieved robust target tracking against occlusion with high temporal resolution (1,000 fps) and high spatial resolution (248.3 px/deg.).
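The position-sharing idea can be sketched as follows (class and variable names are illustrative, not the Saccade Argos implementation): each tier tracks independently, and a tier that loses its narrow-FOV target reacquires it from the wider tier's shared estimate.

```python
class Tier:
    """One tier of a hierarchical tracker (hypothetical sketch)."""

    def __init__(self, name, fov_deg):
        self.name, self.fov_deg = name, fov_deg
        self.estimate = None        # last known target position (x, y)

    def update(self, measurement, shared_estimate=None):
        if measurement is not None:
            self.estimate = measurement       # own detection wins
        elif shared_estimate is not None:
            self.estimate = shared_estimate   # fall back to the wider tier
        return self.estimate

wide = Tier("stereo", fov_deg=90)   # wide-angle fixed stereo camera
tele = Tier("galvano", fov_deg=5)   # narrow-FOV active tracking system

wide.update((0.40, 0.10))                           # wide tier sees the target
tele.update(None, shared_estimate=wide.estimate)    # occluded in telephoto view
# tele now points at the shared estimate instead of losing the target.
```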
|
|
13:45-14:00, Paper ThuP1T3.2 | |
Person Segmentation and Action Classification for Multi-Channel Hemisphere Field of View LiDAR Sensors |
|
Seliunina, Svetlana | University of Bonn |
Otelepko, Artem | University of Bonn |
Memmesheimer, Raphael | University of Bonn |
Behnke, Sven | University of Bonn |
Keywords: Vision Systems, Surveillance Systems, Human-Robot/System Interaction
Abstract: Robots need to perceive persons in their surroundings for safety and to interact with them. In this paper, we present a person segmentation and action classification approach that operates on 3D scans of hemisphere field of view LiDAR sensors. We recorded a data set with an Ouster OSDome-64 sensor consisting of scenes where persons perform three different actions and annotated it. We propose a method based on a MaskDINO model to detect and segment persons and to recognize their actions from combined spherical projected multi-channel representations of the LiDAR data with an additional positional encoding. Our approach demonstrates good performance for the person segmentation task and further performs well for the estimation of the person action states walking, waving, and sitting. An ablation study provides insights about the individual channel contributions for the person segmentation task. The trained models, code and dataset are made publicly available.
|
|
14:00-14:15, Paper ThuP1T3.3 | |
Dynamic Knitting Simulation for Predicting Defects of Knitted Fabrics |
|
Hayano, Kazuki | Osaka University |
Wakamatsu, Hidefumi | Grad. School of Eng., Osaka Univ |
Iwata, Yoshiharu | Osaka University |
Yamada, Yuya | Shima Seiki Mfg., Ltd |
Keywords: System Simulation
Abstract: Knit products are widely used in daily life because of their high functionality. Most knit products are made of flat knitted fabric, the basic structure of knitted fabrics, and this unique structure gives knit products their high functionality. However, a defect called Yotari often occurs during flat knitting. Since Yotari impairs the appearance and functionality of knit products, they must be knitted in such a way that it does not occur. However, the mechanism of Yotari has not been theoretically elucidated, and there is a need to predict its occurrence and to investigate its causes. The purpose of this research is to predict the occurrence and determine the cause of Yotari by dynamically analyzing yarns knitted on a knitting machine. In this paper, a hypothesis about the cause of Yotari is verified. First, the yarn is modeled based on its mechanical properties. Next, the contact and friction that occur during the knitting process are formulated. Finally, the knitting process on a knitting machine is dynamically simulated to test the hypothesis.
|
|
14:15-14:30, Paper ThuP1T3.4 | |
Development of an Image Recognition Model Using an Image Search Function Based on Multiple Pre-Trained Models |
|
Kuragane, Hirotada | Shibaura Institute of Technology, the Robotics Systems Design La |
Sasaki, Takeshi | Shibaura Institute of Technology |
Keywords: Machine Learning, Software, Middleware and Programming Environments
Abstract: Machine learning is widely utilized for data analysis and decision-making, with supervised and unsupervised learning being the primary approaches. However, models can suffer from overfitting, becoming overly adapted to the training data. To address this issue, semi-supervised learning has been employed. Semi-supervised learning is an effective technique for dealing with large datasets that are difficult to label, but it faces limitations in fields where ensuring data diversity and quantity is challenging. This paper proposes a robust image recognition model utilizing image search functions from Google. The proposed model improves accuracy by utilizing the order of search results to collect a variety of data and evaluating their reliability. In this paper, the order of search results is defined as "image search depth" to measure the correlation between reliability and accuracy. While it is easy to collect large amounts of data from Google through automated methods, there is a risk that unrelated data could be included, potentially affecting the model's accuracy. To address this issue, the model is trained with automated preprocessing. As part of this preprocessing, inference is performed on all images in the dataset using multiple pre-trained models that were trained on randomly selected images from the dataset. Images with predictions above a certain threshold are selected as training data to enhance the final model's accuracy. To assess the contribution of preprocessing to accuracy improvement, we calculate accuracy while varying the number of parallel pre-trained models and the threshold values. Furthermore, the final model is evaluated on CIFAR-100 to objectively demonstrate its performance. The results indicate that image search depth does not contribute to model accuracy, while the number of parallel pre-trained models and the threshold significantly impact accuracy.
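The confidence-based filtering step can be sketched as follows (function name, scores, and threshold are assumptions, not the paper's code): several pre-trained models score every collected image, and only images whose averaged score for the query class exceeds a threshold are kept as training data.

```python
import numpy as np

def filter_by_ensemble(probabilities, threshold=0.8):
    """probabilities: (n_models, n_images) array of per-model class scores.

    Returns indices of images whose ensemble-averaged score passes
    the threshold.
    """
    mean_scores = probabilities.mean(axis=0)
    return np.flatnonzero(mean_scores >= threshold)

# Three hypothetical pre-trained models scoring five downloaded images.
scores = np.array([
    [0.95, 0.40, 0.85, 0.10, 0.90],
    [0.90, 0.55, 0.80, 0.20, 0.88],
    [0.92, 0.35, 0.90, 0.15, 0.70],
])
kept = filter_by_ensemble(scores, threshold=0.8)
# Images 0, 2, and 4 survive; the likely-unrelated ones are discarded.
```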
|
|
14:30-14:45, Paper ThuP1T3.5 | |
Extraction of Color Information Array from RGB-NIR Images Enhanced by Multispectral Illumination and Image Classification by LLGMN |
|
Eguchi, Taiga | Saga University |
Yeoh, Wen Liang | Saga University |
Okumura, Hiroshi | Saga University |
Fukuda, Osamu | Saga University |
Keywords: Vision Systems, Machine Learning
Abstract: In recent years, advancements in image classification technology have led to significant improvements in classification accuracy. However, challenges remain, such as difficulty classifying images that contain redundant information, even with state-of-the-art deep learning methods, and the need for large amounts of training samples for deep learning models. To address these issues, we propose a method that enhances critical color information for image classification by combining multispectral illumination and multispectral cameras, and utilizes a log-linearized Gaussian mixture neural network (LLGMN) that can classify images with a small number of training samples. The proposed system uses a multispectral camera capable of capturing Red (R), Green (G), Blue (B), and Near-Infrared (NIR) images (RGB-NIR), along with corresponding multispectral illumination. By emphasizing color information and clarifying differences in object features, this approach enables high-accuracy color classification from an unprecedentedly small dataset when input into the LLGMN. Experimental results demonstrated the effectiveness of the proposed system, achieving 100% color classification accuracy on green tea samples with similar color features. This achievement is expected to contribute to fields such as manufacturing, healthcare, chemistry, and agriculture, where multispectral imaging is increasingly utilized.
|
|
14:45-15:00, Paper ThuP1T3.6 | |
Enhanced Calibration of a Laser Profiler Sensor for 3D Inspection and Reconstruction |
|
Olleik, Houssein | Cesi Lineact |
Vauchey, Vincent | CESI LINEACT |
Nait Chabane, Ahmed | CESI LINEACT |
Dupuis, Yohan | CESI |
Keywords: Vision Systems, Sensor Networks
Abstract: In recent decades, inspection and three-dimensional reconstruction of gas and water pipes have required high-precision sensors capable of operating in confined and low-texture environments, which presents challenges to traditional sensing technologies. This paper presents the design of both hardware and software for a laser profiler sensor, introducing a novel approach to calibrating the sensor to enhance its accuracy and functionality in industrial inspections. We introduce a new calibration method suitable for both conic and planar lasers, with calibration achieved through a two-step process: initial multiposition binocular-structured light calibration and subsequent refinement using a standard ring gauge. The calibration method is uniquely designed to be compatible with both cone- and plane-type calibrations, providing a robust and versatile solution for industrial laser profiling. We demonstrate the efficacy and accuracy of our proposed calibration method, with errors consistently remaining below 1 mm, thus validating the reliability of the reconstruction process.
|
|
15:00-15:15, Paper ThuP1T3.7 | |
Locating Survivors' Voices in Disaster Sites Using Quadcopters Based on Modeling Complicated Environments by PyRoomAcoustics and SSL by MUSIC-Based Algorithms |
|
Kamada, Masachika | Waseda University |
Yamato, Junji | Kogakuin University |
Oikawa, Yasuhiro | Waseda University |
Okuno, Hiroshi G. | Kyoto University/Waseda University |
Ohya, Jun | Waseda University |
Keywords: Systems for Search and Rescue Applications, System Simulation, Sensor Networks
Abstract: To enable the practical use of quadcopters equipped with microphone arrays in disaster sites for locating survivors' voices, this paper proposes a comprehensive method for modeling and simulating complex acoustic environments using PyRoomAcoustics, and for locating sound sources using variants of the MUSIC algorithms. By comparing impulse responses in PyRoomAcoustics simulations with those in real environments, we observed a high degree of correlation, indicating the simulations' suitability for real-world applications. Utilizing these simulations, we identified the optimal microphone array configuration for sound source localization (SSL) and examined the relationship between flight altitude and SSL performance. Key insights include minimizing ground reflection impacts at higher altitudes and enhancing SSL performance at lower altitudes with reduced ground reflectivity. Additionally, we found that power variations among multiple sound sources significantly affect the SSL performance of weaker sources. Among the MUSIC algorithm variants, iGEVD-MUSIC achieved the highest SSL performance, successfully locating multiple sound sources, including human voices. These findings demonstrate that the proposed simulation method is a valuable tool for developing and optimizing SSL techniques and parameters for real-world quadcopter applications. Furthermore, the insights gained from these simulations can be directly applied to the practical deployment of microphone array-equipped quadcopters in disaster response scenarios, aiding in the precise localization of survivors. This research significantly advances SSL methods and the practical realization of quadcopters for disaster site applications, ultimately enhancing the effectiveness and reliability of search and rescue operations.
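As background for the MUSIC-based localization discussed above, a minimal narrowband MUSIC implementation for a uniform linear microphone array can be sketched as follows (a textbook baseline, not the iGEVD-MUSIC variant or the quadcopter setup from the paper; the array geometry, source angle, and noise level are assumptions):

```python
import numpy as np

def music_spectrum(snapshots, n_sources, mic_spacing, wavelength, angles):
    """snapshots: (n_mics, n_snapshots) complex array of sensor data."""
    n_mics = snapshots.shape[0]
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]   # covariance
    eigvals, eigvecs = np.linalg.eigh(R)                      # ascending order
    En = eigvecs[:, : n_mics - n_sources]                     # noise subspace
    spectrum = []
    for theta in angles:
        # Steering vector of a plane wave arriving from angle theta.
        phase = (2j * np.pi * mic_spacing * np.arange(n_mics)
                 * np.sin(theta) / wavelength)
        a = np.exp(phase)
        # MUSIC pseudospectrum: large where a is orthogonal to the noise space.
        spectrum.append(1.0 / np.abs(a.conj() @ En @ En.conj().T @ a))
    return np.array(spectrum)

# Simulate one source at 20 degrees on an 8-mic, half-wavelength-spaced array.
rng = np.random.default_rng(0)
n_mics, true_angle = 8, np.deg2rad(20.0)
steering = np.exp(2j * np.pi * 0.5 * np.arange(n_mics) * np.sin(true_angle))
signal = rng.standard_normal(200) + 1j * rng.standard_normal(200)
data = np.outer(steering, signal) + 0.05 * (
    rng.standard_normal((n_mics, 200)) + 1j * rng.standard_normal((n_mics, 200)))

angles = np.deg2rad(np.linspace(-90, 90, 361))
P = music_spectrum(data, n_sources=1, mic_spacing=0.5, wavelength=1.0,
                   angles=angles)
estimated = np.rad2deg(angles[P.argmax()])   # peaks near 20 degrees
```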
|
|
15:15-15:30, Paper ThuP1T3.8 | |
Dynamic Threshold Spatial-Temporal Filter on FPGAs for Event-Based Vision Sensors |
|
Toyoda, Ryuta | Kyushu Institute of Technology |
Yoshioka, Kanta | Kyushu Institute of Technology |
Tamukoh, Hakaru | Kyushu Institute of Technology |
Keywords: Vision Systems, Hardware Platform, Systems for Field Applications
Abstract: Event-based vision sensors are high-speed, wide-dynamic-range image sensors with potential applications in domains such as robotics and visual navigation. However, these sensors are sensitive to noise, particularly under low-light conditions, which degrades data quality. Therefore, developing a filter capable of detecting and removing noise from different sources with high accuracy is crucial. Moreover, removing near-edge noise in high-density areas is particularly challenging for conventional methods because of its high spatial-temporal correlation with actual events. We propose a dynamic threshold spatial-temporal filter that detects high- or low-density areas and removes noise accordingly. Detection is achieved by counting the number of events occurring within a certain period in the area surrounding each event. Applying an appropriate threshold for each density significantly enhanced noise processing accuracy, as reflected by the mean square error and peak signal-to-noise ratio metrics. Moreover, we synthesized the digital circuits on a field-programmable gate array and demonstrated a notable reduction in processing time compared to a central-processing-unit-based approach, achieving up to 74-fold faster processing. These findings suggest that the proposed filter can significantly enhance real-time event-based vision systems, particularly in environments with varying noise conditions.
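The density-dependent thresholding idea can be sketched in software as follows (all parameter values, the 3x3 neighbourhood, and the function name are assumptions; the paper's filter is a digital circuit on an FPGA): an event is kept if enough neighbouring pixels fired recently, and the required support is raised in high-density regions to suppress near-edge noise.

```python
import numpy as np

def filter_events(events, width, height, window_us=10_000,
                  low_thresh=1, high_thresh=3, density_cut=5):
    """events: iterable of (x, y, timestamp_us). Returns the kept events."""
    last_ts = np.full((height, width), -10**12, dtype=np.int64)
    kept = []
    for x, y, t in events:
        # Count recent events in the 3x3 neighbourhood of this pixel.
        y0, y1 = max(0, y - 1), min(height, y + 2)
        x0, x1 = max(0, x - 1), min(width, x + 2)
        support = int((t - last_ts[y0:y1, x0:x1] < window_us).sum())
        # Dynamic threshold: demand more support where activity is dense.
        thresh = high_thresh if support >= density_cut else low_thresh
        if support >= thresh:
            kept.append((x, y, t))
        last_ts[y, x] = t
    return kept

# A tight burst on one pixel passes the filter; an isolated event does not.
burst = [(10, 10, t) for t in (0, 10, 20, 30)]
lone = [(50, 50, 1_000_000)]
kept = filter_events(burst + lone, width=64, height=64)
```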
|
|
ThuP1T4 |
Forum 13-14 |
SS4-1 Real Space Service System |
In-person Special Session |
Chair: Wada, Kazuyoshi | Tokyo Metropolitan University |
Co-Chair: Niitsuma, Mihoko | Chuo University |
Organizer: Wada, Kazuyoshi | Tokyo Metropolitan University |
Organizer: Niitsuma, Mihoko | Chuo University |
Organizer: Nakamura, Sousuke | Hosei University |
Organizer: Ohara, Kenichi | Meijo University |
|
13:30-13:45, Paper ThuP1T4.1 | |
Multimodal Deep Q-Network for Environmental Adaptation of Robotized Plants (I) |
|
Miwaura, Ryo | Tokyo Metropolitan University |
Sato-Shimokawara, Eri | Tokyo Metropolitan University |
Keywords: Machine Learning, Human-Robot/System Interaction, Multi-Modal Perception
Abstract: Though keeping pets increases communication among family and colleagues, it has challenges, such as allergies and the need for environmental management. As an alternative, we propose the robotized plant, designed to enhance group communication through nurturing activities. We posit that environmental adaptation is essential for the robotized plant to coexist with its caretakers over extended periods. To achieve this, we aim to optimize its vocalization behavior by considering both internal and external states using a Multimodal Deep Q-Network (DQN). This paper evaluates the feasibility of environmental adaptation by analyzing the learning outcomes of the proposed system under various simulated conditions.
|
|
13:45-14:00, Paper ThuP1T4.2 | |
Comparison of Communication Characteristics for Local 5G Semi-Synchronous Mode for Multiple Robot Control (I) |
|
Kawanishi, Souma | Tokyo Metropolitan University |
Miyamoto, Nobuhiko | National Institute of Advanced Industrial Science and Technology |
Yoshida, Koji | Tokyo Metropolitan University |
Wada, Kazuyoshi | Tokyo Metropolitan University |
Keywords: Network Systems, Sensor Networks, Multi-Robot Systems
Abstract: 5G communications have evolved dramatically and are used for various purposes. Among them, local 5G private networks are expected to be suitable for the remote control of multiple robots thanks to features such as eMBB and mMTC. In robot interconnection, the emphasis is on uplink communication from each node to the server, whereas general 5G communication allocates most of the time resources to the downlink, leaving little uplink capacity and making it difficult to use for multiple-robot control. In this paper, experiments were conducted to improve uplink eMBB and URLLC by allocating more time resources to the uplink, taking advantage of the semi-synchronous mode of the TDD communication scheme. The experimental environment was validated by comparing the experimental data with the base-station logs. The results showed that not only did the throughput of the semi-synchronous uplink improve, but so did the throughput of the TCP downlink.
|
|
14:00-14:15, Paper ThuP1T4.3 | |
Verification of the Usefulness of SRDM through a Case Study of Smart Robot Development (I) |
|
Shimada, Tamaki | Tokyo Metropolitan University |
Wada, Kazuyoshi | Tokyo Metropolitan University |
Ohara, Kenichi | Meijo University |
Ando, Noriaki | National Institute of Advanced Industrial Science and Technology |
Kimita, Koji | The University of Tokyo |
Keywords: Systems for Service/Assistive Applications, Human-Robot/System Interaction
Abstract: In recent years, demand for service robots has increased rapidly due to the aging population and labor shortages. However, service robots have not been widely adopted because they require design improvements even after their introduction. Therefore, a tool named the Service Robot Design Matrix (SRDM) has been developed to support the design of service robots. However, the usefulness of SRDM has not been verified, as it has so far been applied to only a few cases. The purpose of this study is to verify the role and usefulness of SRDM by using it to analyze the vital measurement service, one of the services provided by the AIREC smart robot currently in development. After clarifying the flow of the vital-measurement service and organizing and extracting the parameters of the service system as well as those of the robotic system, the dependencies between the parameters were defined and the system was modularized. As a result, the impact of changes in one parameter on the other parameters can now be understood to some extent. In addition, we were able to understand how the actual implementation of the robot interacts with the service system. This example showed that SRDM can serve as a link between service design and robot design.
|
|
14:15-14:30, Paper ThuP1T4.4 | |
GPTAlly: A Safety-Oriented System for Human-Robot Collaboration Based on Foundation Models (I) |
|
Bastin, Brieuc | Université Catholique De Louvain |
Hasegawa, Shoichi | Ritsumeikan University |
Solis, Jorge | Karlstad University / Waseda University |
Ronsse, Renaud | Université Catholique De Louvain |
Macq, Benoit | UCLouvain |
El Hafi, Lotfi | Ritsumeikan University |
Garcia Ricardez, Gustavo Alfonso | Ritsumeikan University |
Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Human-Robot Cooperation/Collaboration, Human Factors and Human-in-the-Loop, Decision Making Systems
Abstract: As robots increasingly integrate into the workplace, Human-Robot Collaboration (HRC) has become ever more important. However, most HRC solutions are based on pre-programmed tasks and use fixed safety parameters, which keeps humans out of the loop. To overcome this, HRC solutions are needed that can easily adapt to human preferences during operation, as well as adjust their safety precautions according to the user's familiarity with robots. In this paper, we introduce GPTAlly, a novel safety-oriented system for HRC that leverages the emerging capabilities of Large Language Models (LLMs). GPTAlly uses LLMs to 1) infer users' subjective safety perceptions to modify the parameters of a Safety Index algorithm; 2) decide on subsequent actions when the robot stops to prevent unwanted collisions; and 3) re-shape the robot arm trajectories based on user instructions. We subjectively evaluate the robot's behavior by comparing the safety perception of GPT-4 to that of the participants. We also evaluate the accuracy of natural language-based robot programming of decision-making requests. The results show that GPTAlly infers safety perception similarly to humans and achieves an average accuracy of 80% in decision-making, with few instances under 50%. Code available at: https://axtiop.github.io/GPTAlly/
|
|
14:30-14:45, Paper ThuP1T4.5 | |
Conditional NewtonianVAE to Generate Pre-Grasping Actions in Physical Latent Spaces (I) |
|
Ito, Masaki | Ritsumeikan |
Garcia Ricardez, Gustavo Alfonso | Ritsumeikan University |
Okumura, Ryo | Panasonic Holdings Corporation |
Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Robotic hands and grasping, Vision Systems
Abstract: To make robotic grasping scalable, vision-based control with high data efficiency and accuracy is needed. World models are capable of creating representations of physical environments from sensory information. In particular, NewtonianVAE is a world model that can control targets in physical environments by using proportional control in its latent space from input images. However, NewtonianVAE entangles the information of each object across separate state subspaces, making control infeasible when trained with multiple objects. In this paper, we introduce Conditional NewtonianVAE, a novel framework designed to generate pre-grasping actions by disentangling object-type information from the state space in physical latent spaces. Our method incorporates a conditioning variable to achieve disentanglement, facilitating the use of the learned state space for control tasks. Through simulation and real-robot experiments, we demonstrate the effectiveness of Conditional NewtonianVAE in accurately positioning the end effector into a pre-grasping pose, thereby enhancing the success rate of robotic grasping. Conditional NewtonianVAE achieves a grasping success rate of 83% for known objects and 78% for unseen objects in the real-robot experiments.
|
|
14:45-15:00, Paper ThuP1T4.6 | |
Feedback-Driven Adaptive Task Estimation and Human Error Handling in Human-Robot Collaboration (I) |
|
Tahara, Kota | Chuo University |
Niitsuma, Mihoko | Chuo University |
Keywords: Human-Robot Cooperation/Collaboration, Human-Robot/System Interaction, Systems for Service/Assistive Applications
Abstract: A human-robot collaboration work system has been developed in which the robot operates in accordance with human tasks, based on the recognition and prediction of those tasks using spatial intelligence. However, the robot's task selection depends on human task recognition, and the robot cannot respond to human error or to incorrect recognition of human tasks. In this study, we aim to develop a system that does not depend on the results of human task recognition. The system incorporates human task estimation that accounts for human error, and robot motion estimation using reinforcement learning based on human feedback. Its usefulness was evaluated through experiments with human subjects.
|
|
ThuP2T1 |
Forum 1-2-3 |
Human-Robot Interaction III |
In-person Regular Session |
Chair: Inamura, Tetsunari | Tamagawa University |
Co-Chair: Ikeda, Atsutoshi | Kindai University |
|
16:00-16:15, Paper ThuP2T1.1 | |
GestEarrings: Developing Gesture-Based Input Techniques for Earrings |
|
Furuuchi, Takehisa | Keio University |
Yamamoto, Takumi | Keio University |
Masai, Katsutoshi | Kyushu University |
Amesaka, Takashi | Keio University |
Sugiura, Yuta | Keio University |
Keywords: Human Interface
Abstract: In recent years, wearable computing has become popular in society, and demand for devices that look comfortable when worn or operated has been increasing. Until now, various items such as hats, hair extensions, and masks have been developed as interfaces. In this study, we propose a method to turn earrings into a gesture input interface. Earrings are widely used as a fashion item, and we believe that the appearance and wearability of earring devices are socially acceptable. First, we explored user-defined gestures using three types of earrings with different shapes and characteristics. Next, we implemented earring devices capable of identifying each of the determined gesture sets. The gesture recognition rate for each earring was 83.6% (hanging earring/11 types), 96.6% (surface earring/8 types), and 87.0% (hoop earring/12 types).
|
|
16:15-16:30, Paper ThuP2T1.2 | |
Analysis of Brushing Techniques for Quantitative Evaluation of Hairdressers' Skills |
|
Mori, Katsuya | Kindai University |
Ikeda, Atsutoshi | Kindai University |
Keywords: Modeling and Simulating Humans, Human Factors and Human-in-the-Loop
Abstract: In some developed countries, such as Japan, transferring the skills of skilled workers is a challenge. In particular, it is unclear how skilled technicians use tools skillfully. We focus on the skills of Japanese hairdressers with the aim of analyzing the skills of expert hairdressers and applying them to the efficient training of novice hairdressers. In this study, we focus on brushing, which is one of the most important beauty techniques in diagnosing hair quality and setting up the finishing process. The brushing skills of an expert hairdresser are carefully measured and compared with those of a beginner to analyze the characteristics of brushing skills.
|
|
16:30-16:45, Paper ThuP2T1.3 | |
Evaluation of Emotions Related to the Benefits of Inconvenience Using PANAS and Tourism Engineering |
|
Itatsu, Kotaro | Kyoto University of Advanced Science |
Kawakami, Hiroshi | Kyoto University of Advanced Science |
Keywords: Human-Robot/System Interaction, Human Interface, Human Factors and Human-in-the-Loop
Abstract: This study investigates the effects of incorporating inconvenient features into tourist support tools on emotional experiences during travel. Using the Positive and Negative Affect Schedule (PANAS), we evaluated how participants' emotions are influenced by using a "Blur Navigation" and an "Unfriendly Camera" during tourism in some sight-seeing areas of Kyoto. The research involved measuring participants' emotional states before and after their experience to assess the impact of these tools on their feelings. The goal is to explore whether adding elements of inconvenience to tourist support tools can contribute to a more engaging and meaningful travel experience. By examining the emotional responses associated with these tools, this study aims to shed light on the concept of "benefit of inconvenience" in tourism and provide insights for future developments in tourist support systems. The results revealed that such emotional factors as the positive emotions "excited" and "alert," and the negative emotions "distressed," "irritable," and "nervous" were significantly correlated with spatial awareness.
|
|
16:45-17:00, Paper ThuP2T1.4 | |
PuzMaty: Supporting Puzzle Mats Design Creation |
|
Yamamoto, Sarii | Keio University |
Wang, Jia Jun | National Yang Ming Chiao Tung University |
Chan, Liwei | National Yang Ming Chiao Tung University |
Sugiura, Yuta | Keio University |
Keywords: Systems for Service/Assistive Applications, System Simulation, Human Interface
Abstract: Puzzle mats made of cushioned material are widely used in environments like homes and playrooms to prevent injuries from infants and toddlers falling. The mats feature puzzle-like edges, allowing users to freely adjust their size and shape to fit the space where the mat is placed. This study proposes ``PuzMaty,'' a design interface for drawing patterns of animals, numbers, letters, etc. using puzzle mats of different colors. This interface is expected to make the puzzle mat function not only as a safety measure but also as an interior decoration. It is also likely to reduce the burden of childcare when placing mats because it makes the number of mats required for a particular space more intuitive.
|
|
17:00-17:15, Paper ThuP2T1.5 | |
Development of Pouring Work Training Simulator with Indicating Operation Score for Efficient Skill Acquisition |
|
Miura, Takumi | University of Yamanashi |
Noda, Yoshiyuki | University of Yamanashi |
Keywords: Virtual Reality and Interfaces, System Simulation
Abstract: Pouring work in the casting industry is dangerous for workers, who pour high-temperature molten metal into molds while exposed to dust. In recent years, automatic pouring machines have been developed and applied in casting production lines for mass production. However, it is difficult to apply such pouring machines in casting factories with high-mix, low-volume production, where the molten metal is poured manually by skilled workers. Therefore, knowledge and skill succession in pouring work is key to the continuing performance of these casting factories. Traditionally, on-the-job training has been applied to practical pouring work for acquiring the skill. However, pouring work by a novice worker increases the risk of accidents. Moreover, it is difficult to acquire the skill through on-the-job training without a quantitative assessment of the pouring work. In this study, we develop a training simulator for pouring work that indicates an operation score, so that the trainee's skill can be grasped quantitatively during training. Referring to the operation score makes it easier to improve pouring skill. Moreover, a ranking of trainees based on the operation score can be indicated after the pouring work to increase training motivation. We propose an operation score derived from the error between the target level and the liquid level in the sprue cup during pouring, and from the variation in the liquid level. The efficacy of the developed training simulator is verified through training experiments.
|
|
17:15-17:30, Paper ThuP2T1.6 | |
Influence of Longterm Duration and Damping Shapes to Perceived Intensity for Vibrotactile Stimulation |
|
Kuhara, Takumi | Nagoya Institute of Technology |
Yukawa, Hikari | Nagoya Institute of Technology |
Tanaka, Yoshihiro | Nagoya Institute of Technology |
Keywords: Haptics and tactile sensors
Abstract: Haptic information has been shown to improve various experiences and the accuracy of movement. However, the perceived intensity is known to be influenced by various parameters of vibrotactile stimuli. Not only the amplitude but also the length of a short-term duration, the presence of a decaying factor, and the shape of the waveform affect the subjective intensity. In this paper, we investigated the influence of different damping shapes and durations on perceived intensity. We prepared four differently shaped waves as stimuli: a constant sinusoidal wave, an exponentially decaying wave, a linearly decaying wave, and a logarithmically decaying wave, with six different long-term durations of up to 3 seconds for each waveform. Ten participants took part in an experiment measuring how strongly each stimulus was perceived. The results indicated that the damping shape affects perceived intensity in the order of the constant sinusoidal wave, the logarithmically decaying wave, the linearly decaying wave, and the exponentially decaying wave. The results also suggest that time-averaged energy is representative of perceived intensity.
|
|
17:30-17:45, Paper ThuP2T1.7 | |
Extraction of Latent Variables for Modeling Subjective Quality in Time-Series Human-Robot Interaction |
|
Mizuchi, Yoshiaki | Tamagawa University |
Kobayashi, Taisuke | National Institute of Informatics |
Inamura, Tetsunari | Tamagawa University |
Keywords: Systems for Service/Assistive Applications, Human-Robot/System Interaction
Abstract: This study presents a novel method for modeling subjective evaluation of the quality of interaction (QoI) by extracting explanatory variables that are not explicitly quantifiable by humans from human-robot behavior. The proposed method extracts latent variables that account for both explicit and tacit knowledge by performing maximum likelihood estimation to predict manually selected explanatory variables, alongside QoI score prediction, from time-series interaction data. In this study, we address three key questions: (i) whether the extraction of latent variables improves accuracy compared to conventional regression analysis, (ii) whether implicit variables, beyond those selected by humans, play a significant role, and (iii) whether human-selected explanatory variables are necessary in explaining subjective assessment scores. The results of comparisons across several learning conditions demonstrate that incorporating tacit knowledge variables, uncorrelated with traditional explanatory variables, enhances the accuracy of QoI estimation. This study contributes by enabling data-driven extraction of explanatory variables, revealing the influence of tacit knowledge on QoI estimation, and highlighting the importance of both top-down and bottom-up approaches in accurately estimating subjective evaluations of QoI.
|
|
17:45-18:00, Paper ThuP2T1.8 | |
Developing a Framework for Natural Human Movement Mimicry of Low-Dynamic Motions in Mobile-Based Humanoids |
|
Gormuzov, Simon | Waseda University |
Wang, Yushi | Waseda University |
Yang, Pin-Chu | Waseda University, Cutieroid Project, HatsuMuv Corporation |
Miyake, Tamon | Waseda University |
Ogata, Tetsuya | Waseda University |
Sugano, Shigeki | Waseda University |
Keywords: Human-Robot/System Interaction
Abstract: In this work, we propose a framework that facilitates natural whole-body movements for a mobile-based humanoid. The framework takes bipedal human motion animation as input. After re-targeting the motion to the robot rig, the joint space data is applied to a physics-enabled robot model for balance confirmation and then deployed one-shot to the real robot. This method is beneficial for: 1) mapping expressive and low-dynamic whole-body motions, such as walking, to mobile-based robots; and 2) serving as a basis for training more complex control policies for more dynamic motions. Experiments were conducted by deploying human walking animations to the robot, assessing its ability to mirror the movements, and evaluating the subjective feelings of humans observing the robot performing the motions generated by both the proposed method and the traditional method. The results indicated that the proposed method is effective for mimicking human movements and consistently delivered a better overall impression in the natural appearance of the motion, the human-like factor, and friendliness.
|
|
ThuP2T2 |
Forum 9-10-11 |
Grasping and Manipulation |
In-person Regular Session |
Chair: Kiyokawa, Takuya | Osaka University |
Co-Chair: Tsuji, Tokuo | Kanazawa University |
|
16:00-16:15, Paper ThuP2T2.1 | |
Tight Clearance Peg-In-Hole Motion Planner Using Gripper with Flexible Joint and Differential Infinity Rotatable Function of Palm |
|
Ueda, Masanori | Kanazawa University |
Tsuji, Tokuo | Kanazawa University |
Hiramitsu, Tatsuhiro | Kanazawa University |
Seki, Hiroaki | Kanazawa University |
Nishimura, Toshihiro | Kanazawa University |
Suzuki, Yosuke | Kanazawa University |
Watanabe, Tetsuyou | Kanazawa University |
Keywords: Motion and Path Planning, Intelligent and Flexible Manufacturing, Automation Systems
Abstract: We propose a method for a peg-in-hole task with tight clearance using a gripper developed in this study. The compact gripper is equipped with a flexible joint and a differential mechanism that allows infinite palm rotation. The flexible joint allows fingertip movement in any direction in precise assembly tasks, even under positional errors. The differential mechanism enables unlimited rotation of the gripper palm to accommodate various assembly tasks. The compact design of the gripper enables effective operation in narrow spaces. We successfully achieved a peg-in-hole configuration by aligning a peg with a hole by pressing and rotating it, without complex control or hole searching. Subsequently, the alignment is corrected via state recognition using a force sensor. The effectiveness of the developed gripper and control method is confirmed experimentally.
|
|
16:15-16:30, Paper ThuP2T2.2 | |
Good Grasps Only: A Data Engine for Self-Supervised Fine-Tuning of Pose Estimation Using Grasp Poses for Verification |
|
Hagelskjær, Frederik | University of Southern Denmark |
Keywords: Software Platform, Hardware Platform, Factory Automation
Abstract: In this paper, we present a novel method for self-supervised fine-tuning of pose estimation. Leveraging zero-shot pose estimation, our approach enables the robot to automatically obtain training data without manual labeling. After pose estimation the object is grasped, and in-hand pose estimation is used for data validation. Our pipeline allows the system to fine-tune while the process is running, removing the need for a learning phase. The motivation behind our work lies in the need for rapid setup of pose estimation solutions. Specifically, we address the challenging task of bin picking, which plays a pivotal role in flexible robotic setups. Our method is implemented on a robotics work-cell, and tested with four different objects. For all objects, our method increases the performance and outperforms a state-of-the-art method trained on the CAD model of the objects.
|
|
16:30-16:45, Paper ThuP2T2.3 | |
Self-Supervised Learning of Grasping Arbitrary Objects On-The-Move |
|
Kiyokawa, Takuya | Osaka University |
Nagata, Eiki | Nara Institute of Science and Technology |
Tsurumine, Yoshihisa | Nara Institute of Science and Technology |
Kwon, Yuhwan | Nara Institute of Science and Technology |
Matsubara, Takamitsu | Nara Institute of Science and Technology |
Keywords: Systems for Service/Assistive Applications, Machine Learning, Robotic hands and grasping
Abstract: Mobile grasping enhances manipulation efficiency by utilizing robots' mobility. This study aims to enable a commercial off-the-shelf robot for mobile grasping, requiring precise timing and pose adjustments. Self-supervised learning can develop a generalizable policy to adjust the robot's velocity and determine grasp position and orientation based on the target object's shape and pose. Due to mobile grasping's complexity, action primitivization and step-by-step learning are crucial to avoid data sparsity in learning from trial and error. This study simplifies mobile grasping into two grasp action primitives and a moving action primitive, which can be operated with limited degrees of freedom for the manipulator. This study introduces three fully convolutional neural network (FCN) models to predict static grasp primitive, dynamic grasp primitive, and residual moving velocity error from visual inputs. A two-stage grasp learning approach facilitates seamless FCN model learning. The ablation study demonstrated that the proposed method achieved the highest grasping accuracy and pick-and-place efficiency. Furthermore, randomizing object shapes and environments in the simulation effectively achieved generalizable mobile grasping.
|
|
16:45-17:00, Paper ThuP2T2.4 | |
Cooperative Grasping and Transportation Using Multi-Agent Reinforcement Learning with Ternary Force Representation |
|
Bernard-Tiong, Ing-Sheng | Nara Institute of Science and Technology |
Tsurumine, Yoshihisa | Nara Institute of Science and Technology |
Sota, Ryosuke | Nara Institute of Science and Technology |
Shibata, Kazuki | Nara Institute of Science and Technology |
Matsubara, Takamitsu | Nara Institute of Science and Technology |
Keywords: Multi-Robot Systems, Machine Learning
Abstract: Cooperative grasping and transportation require effective coordination to complete the task. This study focuses on an approach leveraging force-sensing feedback, where robots use sensors to detect the forces applied by others on an object to achieve coordination. Unlike explicit communication, this avoids delays and interruptions; however, force sensing is highly sensitive and prone to interference from variations in the grasping environment, such as changes in grasping force, grasping pose, and object size and geometry, which can distort force signals and subsequently undermine coordination. We propose multi-agent reinforcement learning (MARL) with a ternary force representation, which maintains a consistent representation against variations in the grasping environment. Simulation and real-world experiments demonstrate the robustness of the proposed method to changes in grasping force, object size, and geometry, as well as to the inherent sim2real gap.
|
|
17:00-17:15, Paper ThuP2T2.5 | |
Development of Dish-Attached Microchip for Autonomous Cell Manipulation System |
|
Iizuka, Gakuto | University of Yamanashi |
Tamura, Kenji | Sinfonia Technology Company |
Abe, Takaaki | Osaka University |
Ukita, Yoshiaki | University of Yamanashi |
Keywords: Micro/Nano Systems, Automation at Micro-Nano Scales
Abstract: In this paper, we report on the fabrication of a dish-attached microchip applicable to cell culture dishes and the autonomous control of microvalves based on deep reinforcement learning for manipulation of living cells. We have achieved autonomous position control of microparticles in a two-dimensional plane, and the aim of this study is to apply this technology to living cells. To this end, we first fabricated the contact surfaces of the cell manipulation channels of a microfluidic chip in a structure that can be mounted in a dish used for general cell culture. Deep reinforcement learning is applied to the pumping control of this device in order to acquire autonomous cell manipulation behavior. For efficient learning, a simulator representing the behavior of particles in the manipulation area of the fabricated microfluidic chip was constructed using a neural network, and a behavior decision model was trained in this environment. In this task, rewards are given according to the distance between the particle position and a randomly placed target in the virtual environment in the simulator. As a result, it was observed that the model learnt to manipulate the particles to an arbitrary target position, demonstrating the manipulation of live cells in a real environment by this behavioral decision-making model. This technology is promising as a new platform for the realization of automatic cell array techniques.
|
|
17:15-17:30, Paper ThuP2T2.6 | |
A Technical Integration Framework for Human-Like Motion Generation in Symmetric Dual Arm Picking of Large Objects |
|
Baracca, Marco | University of Pisa |
Morello, Luca | KU Leuven |
Bianchi, Matteo | University of Pisa |
Keywords: Motion and Path Planning, Multi-Robot Systems, Integration Platform
Abstract: Dual-arm robot picking of large objects is a common task in industrial settings, often accomplished beside a human operator as part of a more complex execution pipeline. This not only requires the simultaneous control of multiple arms to achieve the desired motion of the object and the maintenance of the right amount of force to ensure a stable grasp, but must also guarantee a safe and trustworthy human-robot interaction. One way to achieve the latter requirement is to ensure the execution of human-like robot motions, which can be easily understood and predicted by humans. In this paper, we present a technical framework that builds upon a passivity-based adaptive force-impedance control for modular multi-manual object manipulation, integrating it with a vision-based system to increase the effectiveness and generalizability of the manipulative action, as well as with a human-like Cartesian motion planning algorithm, to enable dual-arm picking of large objects. We tested our approach in experiments with real manipulators on different types of large-object picking.
|
|
17:30-17:45, Paper ThuP2T2.7 | |
Look Ahead Optimization for Managing Nullspace in Cartesian Impedance Control of Dual-Arm Robots |
|
Origanti, Vamsi Krishna | Deutsches Forschungszentrum Für Künstliche Intelligenz GmbH (DFKI) |
Danzglock, Adrian | Deutsches Forschungszentrum Für Künstliche Intelligenz GmbH (DFKI) |
Kirchner, Frank | University of Bremen |
Keywords: Multi-Robot Systems, Control Theory and Technology, Robotic hands and grasping
Abstract: This paper presents a method for handling nullspace challenges in Cartesian impedance control of a dual-arm KUKA IIWA robot by employing a Look-Ahead Controller (LAC) in nullspace. Ambidexterity is crucial for dual-arm robots to perform complex tasks that require coordinated use of both arms. Cartesian impedance control provides significant advantages in dual-arm manipulation tasks, especially in imitation learning for reproducing learned compliant interactions and precise control of end-effector poses. This approach enables the learned tasks to be robot-agnostic, facilitating transfer to other robotic systems. However, nullspace handling in Cartesian impedance control is very challenging. In this paper, we address this issue to handle kinematic constraints and to avoid singularities, joint limits, and collisions of the dual arms with each other in the nullspace (redundant space), with the help of the LAC. The proposed approach utilizes sequential quadratic programming in the optimization loop of the LAC to estimate optimal joint configurations over a horizon in redundant space, providing safe and efficient operation. Results are provided for two trajectories and compared with and without optimization; they demonstrate the method's effectiveness in maintaining desired end-effector poses while avoiding kinematic constraints and nullspace collisions.
|
|
ThuP2T3 |
Forum 12 |
Perception and Sensing III |
In-person Regular Session |
Chair: André, Antoine N. | AIST |
|
16:00-16:15, Paper ThuP2T3.1 | |
Object Positions Interpretation System for Service Robots through Targeted Object Marking |
|
Yamao, Kosei | Kyushu Institute of Technology |
Kanaoka, Daiju | Kyushu Institute of Technology |
Isomoto, Kosei | Kyushu Institute of Technology |
Tamukoh, Hakaru | Kyushu Institute of Technology |
Keywords: Vision Systems, Machine Learning, Human-Robot/System Interaction
Abstract: Service robots are typically required to interpret and execute various complex tasks in home environments. Recognizing the environment, such as furniture, and understanding the relationships between object positions is critical for executing various tasks. Set of mark (SoM) is a visual prompting method that focuses on interpreting the relationships between semantic regions by overlaying marks on each region. However, SoM also marks segmented regions that are not objects, such as walls and floors, and this marking creates noise when interpreting object positions. To address this problem, we propose a novel object-position interpretation system that combines an object detection model and a vision-language model (VLM). The proposed system incorporates an object detection model to mark only objects, allowing the VLM to efficiently interpret object positions. Furthermore, the proposed system improves accuracy by including the original image and the labels output by the object detection model in the input to the VLM. The experimental results show that the proposed system outperforms SoM in terms of interpreting object positions.
|
|
16:15-16:30, Paper ThuP2T3.2 | |
Camera-LiDAR Jaywalking Detection in Traffic Surveillance System |
|
Kim, TaekLim | Chungbuk National University, Dept. of Robot Control Eng, Roboti |
Jang, ByungJin | Chungbuk National University |
Yeon, Jooyeon | Chungbuk University |
Kim, Tae-Hyeong | Korea Intelligent Automotive Parts Promotion Institute |
Park, Tae-Hyoung | Chungbuk National University |
Keywords: Intelligent Transportation Systems, Sensor Fusion
Abstract: Roadside sensors like cameras and LiDAR enhance pedestrian safety by providing comprehensive traffic data. While traditional traffic surveillance systems primarily focus on vehicle-related violations, such as signal violations and speeding, pedestrian jaywalking remains a significant cause of accidents. This paper presents a jaywalking detection method that fuses camera-based image segmentation with LiDAR ground segmentation to handle various conditions, including day and night. Our system addresses challenges such as poor lighting and vehicle occlusion by leveraging LiDAR's robustness in unlearned environments. Road segmentation is enhanced by combining camera outputs with LiDAR ground data, refining road boundary detection for more accurate road area analysis. By integrating road segmentation and object tracking, the system reduces false negatives and improves jaywalking detection. Experimental results from real-road data validate its effectiveness, showing significant potential to enhance traffic surveillance.
|
|
16:30-16:45, Paper ThuP2T3.3 | |
Behavior Monitoring System Leveraging Human Pose Estimation |
|
Mitoma, Ryo | Yokohama National University |
Mukaeda, Takayuki | Yokohama National University |
Shima, Keisuke | Yokohama National University |
Kai, Haruto | Yokohama National University |
Suzuki, Masayuki | Yokohama National University |
Kato, Keiji | Yokohama National University |
Keywords: Machine Learning, Vision Systems, Automation Systems
Abstract: In recent years, human-centric computer vision technologies, such as human pose estimation and action recognition, have garnered significant attention. This study, therefore, proposes the use of a classroom behavior monitoring system that integrates off-the-shelf human pose estimators to compute behavioral features from the video data. The extracted features include continuous values (body movements, head angles, etc.) and discrete features (timing of hand-raising), which were not leveraged in prior studies. The system can operate in real-time on consumer-grade GPUs. This makes it useful for real-time feedback during classes and post-class analysis of video recordings. The proposed system was experimentally validated using a motion capture dataset featuring scenes with occlusions. We demonstrated its accuracy and real-time performance, along with the challenges of capturing full-body poses in complex environments. Additionally, we applied this system to a novel learning analytics task: estimating cognitive and non-cognitive abilities from videos using real-world data. As part of our analysis, we processed a video from an elementary school classroom with our system and linked the computed features to self-reported skills. These experiments demonstrated promising results, highlighting the potential of evaluating student skills solely based on visual cues.
|
|
16:45-17:00, Paper ThuP2T3.4 | |
Image-Based Response Measurement of Liquid Lens and Iterative Calibration of Scanning Focus Tracking for Dynamic Iris Authentication |
|
Sueishi, Tomohiro | Tokyo University of Science |
Yokoyama, Keiko | NEC |
Yachida, Shoji | NEC |
Ishikawa, Masatoshi | Tokyo University of Science |
Keywords: Surveillance Systems, Vision Systems
Abstract: There is an increasing demand for iris authentication in dynamic environments, such as when people are walking. High-speed focus control using a liquid lens is one solution for such dynamic iris authentication, but it is necessary to increase the frequency of capturing in-focus iris images considering the blinking of the eye. In this paper, we propose a simple response measurement method of the liquid lens using a large and tilted checker pattern, and an iterative calibration method of sinusoidal parameters in scanning focus tracking control using the liquid lens. We have experimentally confirmed step and sinusoidal responses of the liquid lens, and demonstrated sufficient and high-frequency AR marker (alternative to iris) recognition performance of the scanning focus tracking control for manual marker movement and zooming adjustments with a motorized lens.
|
|
17:00-17:15, Paper ThuP2T3.5 | |
On the Impact of the Camera Field-Of-View to Direct Visual Servoing Robot Trajectories When Using the Photometric Gaussian Mixtures As Dense Feature |
|
Schulte, Sinta Natalie | AIST Japan |
André, Antoine N. | AIST |
Crombez, Nathan | Université De Technologie De Belfort-Montbéliard |
Caron, Guillaume | CNRS |
Keywords: Vision Systems
Abstract: This paper studies the impact of cameras with different fields of view (FoV) on Direct Visual Servoing to control robot motions from pixel intensities. Focusing on the Photometric Gaussian Mixture Visual Servoing that showed great convergence domains, this paper investigates two types of FoV: the seminal perspective case and the novel full omnidirectional case. Implemented with our open-source generic software framework libPeR for a fair comparison, the Visual Servoing experiments on a 6 degrees-of-freedom robot arm provide an in-depth evaluation of the impact of each FoV on the convergence domain, straightness of the trajectory and time to reach convergence.
|
|
17:15-17:30, Paper ThuP2T3.6 | |
Development of a Versatile Structure for Mounting Drone Audition-Purposed Measurement Equipment |
|
Tsukamoto, Yuta | Institute of Science Tokyo |
Hoshiba, Kotaro | Institute of Science Tokyo |
Iwatsuki, Nobuyuki | Institute of Science Tokyo |
Keywords: Mechanism Design, Hardware Platform, Systems for Search and Rescue Applications
Abstract: Drone audition, an audio-based search technology, is expected to be utilized for search and rescue operations in disaster areas. In this technology, measurement equipment must be mounted on drones. The previous mounting structure for this equipment was designed for each individual drone and is costly to implement. In this paper, we propose a structure that can be universally installed on drones of a general shape. We fabricated the structure and confirmed its versatility by installing it on drones. Additionally, we measured the weight of the structure and measurement equipment and verified its applicability by comparison with a drone's payload. We performed static load analysis and eigenvalue analysis of the structure using the mechanical properties of 3D-printed parts measured by a three-point bending test. As a result, it was confirmed that the proposed structure has sufficient strength and suitable vibration characteristics to replace the previous structure.
|
|
17:30-17:45, Paper ThuP2T3.7 | |
Lightweight Hand-Waving Action Recognition Using Reservoir Computing in a Cafeteria Environment |
|
Isomoto, Kosei | Kyushu Institute of Technology |
Fumoto, Soma | University of Kitakyushu |
Kobayashi, Ryohei | Kyushu Institute of Technology |
Tanaka, Yuichiro | Kyushu Institute of Technology |
Tamukoh, Hakaru | Kyushu Institute of Technology |
Keywords: Human-Robot/System Interaction, Environment Monitoring and Management, Vision Systems
Abstract: Owing to the global labor shortage and increasing need for operational efficiency, the adoption of service robots is advancing rapidly. These robots must recognize human action to understand human intention and respond appropriately based on that understanding. The action recognition systems embedded in robots need to be lightweight to operate efficiently with limited computational resources. Reservoir computing (RC) is one of the solutions for lightweight action recognition systems. Yamaguchi et al. proposed an RC-based hand-waving recognition system; however, the system cannot process multiple persons simultaneously and works only when one person is in the image. Therefore, this study proposes a lightweight hand-waving recognition system that integrates OpenPose, StrongSORT, and RC to work in complex environments with multiple individuals. Experimental results demonstrated the effectiveness of the proposed system in processing multiple people simultaneously in a crowded environment and accurately recognizing hand-waving actions with 90.75% accuracy. We also confirmed that the proposed system can process data at 24-26 FPS. We demonstrated that the proposed system can perform real-time processing. In addition, the robot with the proposed system recognized hand-waving actions in the “Restaurant” task of RoboCup@Home 2024 and obtained the second-place score.
|
|
17:45-18:00, Paper ThuP2T3.8 | |
Predicting Human Behavior Using Knowledge Information in Jig Operation and Robot Collaborative Action Generation |
|
Tamaki, Mone | Waseda University |
Nakajo, Ryoichi | National Institute of Advanced Industrial Science and Technology |
Yamanobe, Natsuki | Advanced Industrial Science and Technology |
Domae, Yukiyasu | The National Institute of Advanced Industrial Science and Technology |
Ogata, Tetsuya | Waseda University |
Keywords: Human-Robot Cooperation/Collaboration, Machine Learning, Automation Systems
Abstract: In human-robot collaborative tasks, learning-based models that can handle behavior beyond the scope of explicit human description are progressing rapidly. Deep learning is effective in capturing complex nonlinear relationships, making it valuable in scenarios with intricate interactions between the environment and tasks, such as collaborative tasks, and its performance improves when multiple information sources are incorporated. Human knowledge, regarded as supplemental information obtained from the environment, has been shown to enhance the generalization ability of task execution when appropriately incorporated into the learning process for robot motion generation. Among the various models, those that utilize action labels subjectively defined by humans for robot behavior enable the robot to better comprehend its own actions, leading to higher generalization. This approach also suggests that estimating human actions contributes to predicting robot movements in human-robot collaboration (HRC). However, the performance of learning-based methods is significantly influenced by the quality of the training data. Therefore, capturing appropriate human information and integrating it into the learning process are critical for improving the robot's ability to learn collaborative tasks. In this study, we propose a learning model that not only provides a robot with action labels for its own behavior but also includes human action labels, encouraging the robot to respond to human actions. The optimal amount of human information to be used in learning is evaluated by adjusting the methods for defining human action labels and the quantity of human data utilized. Experiments were conducted on a task in which the robot manipulated jigs in an assembly operation involving both humans and robots. The results of the learning process suggest that estimating human behavior can assist in generating collaborative robot actions.
|
|
ThuP2T4 |
Forum 13-14 |
SS4-2 Real Space Service System |
In-person Special Session |
Chair: Wada, Kazuyoshi | Tokyo Metropolitan University |
Co-Chair: Niitsuma, Mihoko | Chuo University |
Organizer: Wada, Kazuyoshi | Tokyo Metropolitan University |
Organizer: Niitsuma, Mihoko | Chuo University |
Organizer: Nakamura, Sousuke | Hosei University |
Organizer: Ohara, Kenichi | Meijo University |
|
16:00-16:15, Paper ThuP2T4.1 | |
Efficient Navigation in Crowded Environments for Autonomous Electric Wheelchairs Using Human Flow Activity Trend and Most Frequent Direction (I) |
|
Kojima, Takuya | Chuo University |
Niitsuma, Mihoko | Chuo University |
Keywords: Human-Robot/System Interaction, Systems for Service/Assistive Applications, Human-Robot Cooperation/Collaboration
Abstract: Human flow data is being acquired and utilized in various situations. In this paper, we propose a navigation system for autonomous electric wheelchairs that makes use of usage trends in the driving environment obtained from human flow data. This system is expected to generate globally optimal paths and plan highly effective driving strategies. As a result, the navigation system may reduce the physical and mental burden on the user and harmonize with the surrounding pedestrians. In particular, this paper verifies through simulation experiments the usefulness of the system that takes into account environmental usage information, such as stay history, transit history, and most frequent travel directions.
|
|
16:15-16:30, Paper ThuP2T4.2 | |
Integrating Multimodal Communication and Comprehension Evaluation During Human-Robot Collaboration for Increased Reliability of Foundation Model-Based Task Planning Systems (I) |
|
Martin, Eden | UCLouvain |
Hasegawa, Shoichi | Ritsumeikan University |
Solis, Jorge | Karlstad University / Waseda University |
Macq, Benoit | UCLouvain |
Ronsse, Renaud | Université Catholique De Louvain |
Garcia Ricardez, Gustavo Alfonso | Ritsumeikan University |
El Hafi, Lotfi | Ritsumeikan University |
Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Human-Robot Cooperation/Collaboration, Decision Making Systems, Multi-Modal Perception
Abstract: Foundation models provide the adaptability needed in robotics but often require explicit tasks or human verification due to potential unreliability in their responses, complicating human-robot collaboration (HRC). To enhance the reliability of such task-planning systems, we propose 1) an adaptive task-planning system for HRC that reliably performs non-predefined tasks implicitly instructed through HRC, and 2) an integrated system combining multimodal large language model (LLM)-based task planning with multimodal communication of human intention to increase the HRC success rate and comfort. The proposed system integrates GPT-4V for adaptive task planning and comprehension evaluation during HRC with multimodal communication of human intention through speech and deictic gestures. Four pick-and-place tasks of gradually increasing difficulty were used in three experiments, each evaluating a key aspect of the proposed system: task planning, comprehension evaluation, and multimodal communication. The quantitative results show that the proposed system can interpret implicitly instructed tabletop pick-and-place tasks through HRC, providing the next object to pick and the correct position to place it, achieving a mean success rate of 0.80. Additionally, the system can evaluate its comprehension of three of the four tasks with an average precision of 0.87. The qualitative results show that multimodal communication not only significantly enhances the success rate but also the feelings of trust and control, willingness to use again, and sense of collaboration during HRC.
|
|
16:30-16:45, Paper ThuP2T4.3 | |
Development of Upper-Body Humanoid Robot Using 3-US Parallel Link Mechanism (I) |
|
Shimizu, Takafumi | Tokyo Metropolitan University |
Obo, Takenori | Tokyo Metropolitan University |
Takesue, Naoyuki | Tokyo Metropolitan University |
Keywords: Mechanism Design, Mechatronics Systems, Hardware Platform
Abstract: Humanoids that can mimic human movements are effective as avatar robots remotely controlled by humans. Conventional serial link manipulators can be used for the arms, but the waist of a humanoid affects the range of motion of the head and hands, and it must support the weight of the humanoid's upper body while moving. Therefore, we focused on the parallel link mechanism, which offers higher torque and stiffness than the serial link mechanism. In this study, the 3-US parallel link mechanism with 3-DOF posture control, which the authors have studied, is applied to the waist joint of an upper-body humanoid robot that has a range of motion similar to that of a human. We also develop a 7-DOF serial link arm robot and combine it with the waist joint to form an upper-body humanoid robot.
|
|
16:45-17:00, Paper ThuP2T4.4 | |
A Wearable System for Walking Cognitive Assistance Using People Flow Estimation and Vibrotactile Feedback in Crowded Sidewalks (I) |
|
Okai, Ayumu | Chuo University |
Niitsuma, Mihoko | Chuo University |
Keywords: Systems for Service/Assistive Applications, Welfare systems, Human Interface
Abstract: In this paper, we develop a walking cognitive assistance system that enables visually impaired people to walk independently. Focusing on estimating the direction of people flow, we developed a wearable system with wireless, miniaturized vibrotactile presentation devices based on two Spresense units and a standalone image processing unit based on the Jetson Orin Nano. Additionally, by employing YOLOv8 for people detection, we improved the processing speed and expanded the range of people detection. In accuracy evaluation experiments, the accuracy of estimating the direction of approaching people’s flow improved, enabling safer walking.
|
|
17:00-17:15, Paper ThuP2T4.5 | |
Servo-Driven Flapping-Wing Aerial Vehicle (FWAV): Payload Capacity and Navigation Performance (I) |
|
Afakh, Muhammad Labiyb | Tokyo Metropolitan University |
Saputra, Azhar Aulia | Tokyo Metropolitan University |
Sato, Hidaka | Tokyo Metropolitan University |
Wada, Kazuyoshi | Tokyo Metropolitan University |
Takesue, Naoyuki | Tokyo Metropolitan University |
Keywords: Mechatronics Systems, Mechanism Design, Hardware Platform
Abstract: Environmental monitoring and disaster mitigation have become growing concerns. Many types of robots have been developed for environmental applications, but some are limited in the area they can cover. To address this problem, several researchers have developed FWAVs, which have potential for environmental monitoring applications. This study aims to enhance the potential of Flapping-Wing Aerial Vehicles (FWAVs) for wider implementation in these fields. While most FWAVs traditionally use a single actuator, leading to manufacturing and design complexity, recent trends show a shift towards servo-driven mechanisms due to their enhanced capabilities. We achieve modularity through a design that allows for easy assembly and disassembly of the main wings, tail, and hardware components. Performance is enhanced by using a square wave pattern as the input signal for the flapping motion, which outperforms the commonly used sinusoidal wave pattern. By adjusting the center of gravity and the flapping parameters, we demonstrate the FWAV's ability to carry payloads of up to 100 grams, almost 28% of its own weight, suitable for small-scale environmental monitoring missions. Additionally, we implement a collision avoidance system to enhance the FWAV's ability to navigate safely in complex environments.
|
| |